Sample records for highest sequence conservation

  1. Comparison of ZP3 protein sequences among vertebrate species: to obtain a consensus sequence for immunocontraception.

    PubMed

    Zhu, X; Naz, R K

    1999-03-01

    The deduced ZP3 amino acid (aa) sequences of 13 vertebrate species, namely mouse, hamster, rabbit, pig, porcine, cow, dog, cat, human, bonnet, marmoset, carp, and frog, were compared using the PILEUP and PRETTY alignment programs (GCG, Wisconsin, USA). The published aa sequences obtained from the 13 vertebrate species indicated overall evolutionary conservation in the N-terminus, central region, and C-terminus of the ZP3 polypeptide. More variations of ZP3 polypeptide sequences were seen in the alignments of carp and frog from the 11 mammalian species making the leader sequence more prominent. The canonical furin proteolytic processing signal at the C-terminus was found in all the ZP3 polypeptide sequences except those of carp and frog. In the central region, the ZP3 deduced aa sequences of all 13 vertebrate species aligned well, and six relatively conserved sequences were found. There are 11 conserved cysteine residues in the central region across all species including carp and frog, indicating that these residues have a longer evolutionary history. The ZP3 aa sequence similarities were examined using the GAP program (GCG). The highest aa similarities are observed between members of the same order within the class Mammalia, and also (95.4%) between pig (ungulata) and rabbit (lagomorpha). The deduced ZP3 aa sequences per se may not be enough to build a phylogenetic tree.
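
    The idea of deriving a consensus from aligned sequences can be sketched as follows. This is a generic majority-rule consensus written for illustration, not the PILEUP/PRETTY procedure used in the study; the function name, the 0.5 threshold, the "x" placeholder, and the toy alignment are all assumptions.

```python
from collections import Counter


def consensus(aligned, threshold=0.5, unknown="x"):
    """Majority-rule consensus for equal-length aligned sequences.

    A column contributes its most common residue only if that residue
    is not a gap ("-") and its frequency reaches `threshold`;
    otherwise the `unknown` placeholder is emitted.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    out = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / len(column) >= threshold:
            out.append(residue)
        else:
            out.append(unknown)
    return "".join(out)


# Toy alignment (hypothetical; not real ZP3 sequence data).
toy = ["MCQAT-L", "MCQST-L", "MCHATKL", "MCQATRL"]
print(consensus(toy))
```

    Raising the threshold makes the consensus stricter, so more positions fall back to the placeholder.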

  2. Functional region prediction with a set of appropriate homologous sequences: an index for sequence selection by integrating structure and sequence information with spatial statistics

    PubMed Central

    2012-01-01

    Background The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. 
The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence-based methods. Conclusions Appropriate homologous sequences are selected automatically and objectively by the index. Such sequence selection improved the performance of functional region prediction. As far as we know, this is the first approach in which spatial statistics have been applied to protein analyses. Such integration of structure and sequence information would be useful for other bioinformatics problems. PMID:22643026

  3. Gene family size conservation is a good indicator of evolutionary rates.

    PubMed

    Chen, Feng-Chi; Chen, Chiuan-Jung; Li, Wen-Hsiung; Chuang, Trees-Juen

    2010-08-01

    The evolution of duplicate genes has been a topic of broad interest. Here, we propose that the conservation of gene family size is a good indicator of the rate of sequence evolution and some other biological properties. By comparing the human-chimpanzee-macaque orthologous gene families with and without family size conservation, we demonstrate that genes with family size conservation evolve more slowly than those without family size conservation. Our results further demonstrate that both family expansion and contraction events may accelerate gene evolution, resulting in elevated evolutionary rates in the genes without family size conservation. In addition, we show that the duplicate genes with family size conservation evolve significantly more slowly than those without family size conservation. Interestingly, the median evolutionary rate of singletons falls in between those of the above two types of duplicate gene families. Our results thus suggest that the controversy on whether duplicate genes evolve more slowly than singletons can be resolved when family size conservation is taken into consideration. Furthermore, we also observe that duplicate genes with family size conservation have the highest level of gene expression/expression breadth, the highest proportion of essential genes, and the lowest gene compactness, followed by singletons and then by duplicate genes without family size conservation. Such a trend accords well with our observations of evolutionary rates. Our results thus point to the importance of family size conservation in the evolution of duplicate genes.

  4. Genetic diversity of the captive Asian tapir population in Thailand, based on mitochondrial control region sequence data and the comparison of its nucleotide structure with Brazilian tapir.

    PubMed

    Muangkram, Yuttamol; Amano, Akira; Wajjwalku, Worawidh; Pinyopummintr, Tanu; Thongtip, Nikorn; Kaolim, Nongnid; Sukmak, Manakorn; Kamolnorranath, Sumate; Siriaroonrat, Boripat; Tipkantha, Wanlaya; Maikaew, Umaporn; Thomas, Warisara; Polsrila, Kanda; Dongsaard, Kwanreaun; Sanannu, Saowaphang; Wattananorrasate, Anuwat

    2017-07-01

    The Asian tapir (Tapirus indicus) has been classified as Endangered on the IUCN Red List of Threatened Species (2008). Genetic diversity data provide important information for the management of captive breeding and conservation of this species. We analyzed mitochondrial control region (CR) sequences from 37 captive Asian tapirs in Thailand. Multiple alignments of the full-length CR sequences (1268 bp) comprised three domains, as described in other mammal species. Analysis of 16 parsimony-informative variable sites revealed 11 haplotypes. Furthermore, the phylogenetic analysis using a median-joining network clearly showed three clades that correlated with our earlier cytochrome b gene study in this endangered species. The repetitive motif is located between the first and second conserved sequence blocks, similar to the Brazilian tapir. The most polymorphic site was located in the extended termination-associated sequences domain. The results could inform future genetic management, both in captivity and in the wild, aimed at maintaining stable populations.

  5. AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

    PubMed Central

    2010-01-01

    Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". 
Conclusions AlignMiner can be used to reliably detect divergent regions via several scoring methods that provide different levels of selectivity. Its predictions have been verified by experimental means. Hence, it is expected that its usage will save researchers' time and ensure an objective selection of the best-possible divergent region when closely related sequences are analysed. AlignMiner is freely available at http://www.scbi.uma.es/alignminer. PMID:20525162
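
    To illustrate what an entropy-based score over alignment columns can look like, here is a minimal sketch using plain Shannon entropy. The abstract does not specify how AlignMiner's Entropy method is computed, so this should not be read as its implementation; the function names, the cutoff of 1.0 bit, the treatment of gaps as ordinary symbols, and the toy alignment are assumptions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def divergent_columns(aligned, cutoff=1.0):
    """Indices of columns whose entropy is at or above `cutoff`."""
    return [
        i for i, column in enumerate(zip(*aligned))
        if column_entropy(column) >= cutoff
    ]


# Toy nucleotide alignment (hypothetical data).
toy = ["ACGT", "ACGA", "ACTC", "ACCG"]
print(divergent_columns(toy))
```

    A fully conserved column scores 0 bits; a column in which all four sequences differ scores 2 bits, so lowering the cutoff flags more columns as divergent.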

  6. Cloning and bacterial expression of adenosine-5'-triphosphate sulfurylase from the enteric protozoan parasite Entamoeba histolytica.

    PubMed

    Nozaki, T; Arase, T; Shigeta, Y; Asai, T; Leustek, T; Takeuchi, T

    1998-12-08

    A gene encoding adenosine-5'-triphosphate sulfurylase (AS) was cloned from the enteric protozoan parasite Entamoeba histolytica by polymerase chain reaction using degenerate oligonucleotide primers corresponding to conserved regions of the protein from a variety of organisms. The deduced amino acid sequence of E. histolytica AS revealed a calculated molecular mass of 47925 Da and an unusual basic pI of 9.38. The amebic protein sequence showed 23-48% identities with AS from bacteria, yeasts, fungi, plants, and animals with the highest identities being to Synechocystis sp. and Bacillus subtilis (48 and 44%, respectively). Four conserved blocks including putative sulfate-binding and phosphate-binding regions were highly conserved in the E. histolytica AS. The upstream region of the AS gene contained three conserved elements reported for other E. histolytica genes. A recombinant E. histolytica AS revealed enzymatic activity, measured in both the forward and reverse directions. Expression of the E. histolytica AS complemented cysteine auxotrophy of the AS-deficient Escherichia coli strains. Genomic hybridization revealed that the AS gene exists as a single copy gene. In the literature, this is the first description of an AS gene in Protozoa.

  7. The glycoprotein genes and gene junctions of the fish rhabdoviruses spring viremia of carp virus and hirame rhabdovirus: Analysis of relationships with other rhabdoviruses

    USGS Publications Warehouse

    Bjorklund, H.V.; Higman, K.H.; Kurath, G.

    1996-01-01

    The nucleotide sequences of the glycoprotein genes and all of the internal gene junctions of the fish pathogenic rhabdoviruses spring viremia of carp virus (SVCV) and hirame rhabdovirus (HIRRV) have been determined from cDNA clones generated from viral genomic RNA. The SVCV glycoprotein gene sequence is 1588 nucleotides (nt) long and encodes a 509 amino acid (aa) protein. The HIRRV glycoprotein gene sequence comprises 1612 nt, coding for a 508 aa protein. In sequence comparisons of 15 rhabdovirus glycoproteins, the SVCV glycoprotein gene showed the highest amino acid sequence identity (31.2–33.2%) with vesicular stomatitis New Jersey virus (VSNJV), Chandipura virus (CHPV) and vesicular stomatitis Indiana virus (VSIV). The HIRRV glycoprotein gene showed a very high amino acid sequence identity (74.3%) with the glycoprotein gene of another fish pathogenic rhabdovirus, infectious hematopoietic necrosis virus (IHNV), but no significant similarity with glycoproteins of VSIV or rabies virus (RABV). In phylogenetic analyses SVCV was grouped consistently with VSIV, VSNJV and CHPV in the Vesiculovirus genus of Rhabdoviridae. The fish rhabdoviruses HIRRV, IHNV and viral hemorrhagic septicemia virus (VHSV) showed close relationships with each other, but only very distant relationships with mammalian rhabdoviruses. The gene junctions are highly conserved between SVCV and VSIV, well conserved between IHNV and HIRRV, but not conserved between HIRRV/IHNV and RABV. Based on the combined results we suggest that the fish lyssa-type rhabdoviruses HIRRV, IHNV and VHSV may be grouped in their own genus within the family Rhabdoviridae. Aquarhabdovirus has been proposed for the name of this new genus.
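
    Percent identity figures such as those quoted above depend on how the two sequences are aligned and on which positions are counted. A minimal sketch of one common convention follows (identical residues divided by aligned positions where neither sequence has a gap); the abstract does not state which convention or program was used, and the function name and toy sequences are assumptions.

```python
def percent_identity(a, b):
    """Percent identity between two pre-aligned, equal-length sequences.

    Positions where either sequence has a gap ("-") are excluded from
    both the numerator and the denominator. Other tools may use a
    different denominator (for example, the full alignment length).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned pair (hypothetical; not rhabdovirus glycoprotein data).
print(round(percent_identity("MKT-LLA", "MRTALLA"), 1))
```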

  9. Cloning and analysis of DnaJ family members in the silkworm, Bombyx mori.

    PubMed

    Li, Yinü; Bu, Cuiyu; Li, Tiantian; Wang, Shibao; Jiang, Feng; Yi, Yongzhu; Yang, Huipeng; Zhang, Zhifang

    2016-01-15

    Heat shock proteins (Hsps) are involved in a variety of critical biological functions, including protein folding, degradation, translocation, and macromolecule assembly, and act as molecular chaperones during periods of stress by binding to other proteins. Using expressed sequence tag (EST) and silkworm (Bombyx mori) transcriptome databases, we identified 27 cDNA sequences encoding the conserved J domain, which is found in DnaJ-type Hsps. Of the 27 J domain-containing sequences, 25 were complete cDNA sequences. We divided them into three types according to the number and presence of conserved domains. By analyzing the gene structures, intron numbers, and conserved domains and constructing a phylogenetic tree, we found that the DnaJ family had undergone convergent evolution, obtaining new domains to expand the diversity of its family members. The acquisition of the new DnaJ domains most likely occurred prior to the evolutionary divergence of prokaryotes and eukaryotes. The expression of DnaJ genes in the silkworm was generally higher in the fat body. The tissue distribution of DnaJ1 proteins was detected by western blotting, demonstrating that in the fifth-instar larvae, the DnaJ1 proteins were expressed at their highest levels in hemocytes, followed by the fat body and head. We also found that the DnaJ1 transcripts were likely differentially translated in different tissues. Using immunofluorescence cytochemistry, we revealed that in the blood cells, DnaJ1 was mainly localized in the cytoplasm. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. Mitogen-activated protein kinase 1 from disk abalone (Haliotis discus discus): Roles in early development and immunity-related transcriptional responses.

    PubMed

    Perera, N C N; Godahewa, G I; Lee, Jehee

    2016-12-01

    Mitogen-activated protein kinase (MAPK) is involved in the regulation of cellular events by mediating signal transduction pathways. MAPK1 is a member of the extracellular-signal regulated kinases (ERKs), playing roles in cell proliferation, differentiation, and development, mainly in response to growth factors, mitogens, and many environmental stresses. In the current study, we have characterized the structural features of a homolog of MAPK1 from disk abalone (AbMAPK1). Further, we have unraveled its expressional kinetics against different experimental pathogenic infections or related chemical stimulants. AbMAPK1 harbors a 5' untranslated region (UTR) of 23 bp, a coding sequence of 1104 bp, and a 3' UTR of 448 bp. The putative peptide has a predicted molecular mass of 42.2 kDa, with a theoretical pI of 6.28. Based on the in silico analysis, AbMAPK1 possesses two N-glycosylation sites, one S_TK catalytic domain, and a conserved His-Arg-Asp domain (HRD). In addition, a conserved glycine-rich ATP-phosphate-binding loop and a threonine-x-tyrosine motif (TEY) important for autophosphorylation were also identified in the protein. Homology assessment of AbMAPK1 showed several conserved regions, and ark clam (Aplysia californica) showed the highest sequence identity (87.9%). The phylogenetic analysis supported close evolutionary kinship with molluscan orthologs. Constitutive expression of AbMAPK1 was observed in six different tissues of disk abalone, with the highest expression in the digestive tract, followed by the gills and hemocytes. The highest AbMAPK1 mRNA expression level was detected at the trochophore developmental stage, suggesting its role in abalone cell differentiation and proliferation. Significant modulation of AbMAPK1 expression under pathogenic stress suggested its putative involvement in the immune defense mechanism. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Characterization of Developmental- and Stress-Mediated Expression of Cinnamoyl-CoA Reductase in Kenaf (Hibiscus cannabinus L.)

    PubMed Central

    Lim, Hyoun-Sub; Park, Sang-Un; Bae, Hyeun-Jong; Natarajan, Savithiry

    2014-01-01

    Cinnamoyl-CoA reductase (CCR) is an important enzyme for lignin biosynthesis as it catalyzes the first specific committed step in monolignol biosynthesis. We have cloned a full length coding sequence of CCR from kenaf (Hibiscus cannabinus L.), which contains a 1,020-bp open reading frame (ORF), encoding 339 amino acids of 37.37 kDa, with an isoelectric point (pI) of 6.27 (JX524276, HcCCR2). BLAST result found that it has high homology with other plant CCR orthologs. Multiple alignment with other plant CCR sequences showed that it contains two highly conserved motifs: NAD(P) binding domain (VTGAGGFIASWMVKLLLEKGY) at N-terminal and probable catalytic domain (NWYCYGK). According to phylogenetic analysis, it was closely related to CCR sequences of Gossypium hirsutum (ACQ59094) and Populus trichocarpa (CAC07424). HcCCR2 showed ubiquitous expression in various kenaf tissues and the highest expression was detected in mature flower. HcCCR2 was expressed differentially in response to various stresses, and the highest expression was observed by drought and NaCl treatments. PMID:24723816
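
    The reported ORF length and protein length can be checked against each other with simple arithmetic: a 1,020-bp ORF holds 340 codons, and if the last one is a stop codon the protein has 339 residues. The sketch below assumes the quoted ORF length includes the stop codon, which the abstract does not state explicitly; the function name is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


residues = orf_protein_length(1020)          # figures from the abstract
mean_residue_mass = 37.37 * 1000 / residues  # Da per residue
print(residues, round(mean_residue_mass, 1))
```

    The resulting average of roughly 110 Da per residue is in the range typically expected for proteins, so the three figures are mutually consistent.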

  12. Genetic diversity and gene differentiation among ten species of Zingiberaceae from Eastern India.

    PubMed

    Mohanty, Sujata; Panda, Manoj Kumar; Acharya, Laxmikanta; Nayak, Sanghamitra

    2014-08-01

    In the present study, genetic fingerprints of ten species of Zingiberaceae from eastern India were developed using PCR-based markers. A total of 19 RAPD (Random Amplified Polymorphic DNA), 8 ISSR (Inter Simple Sequence Repeats) and 8 SSR (Simple Sequence Repeats) primers were used to elucidate genetic diversity important for utilization, management and conservation. These primers produced 789 loci, of which 773 were polymorphic (including 220 unique loci) and 16 were monomorphic. The highest number of bands (263) was amplified in Curcuma caesia and the lowest (209) in Zingiber cassumunar. Though all the markers discriminated the species effectively, analysis of combined data of all markers resulted in better distinction of individual species. The highest number of loci was amplified with SSR primers, with resolving power in a range of 17.4-39. A dendrogram based on the three molecular data sets, built using the unweighted pair group method with arithmetic mean, classified all the species into two clusters. The Mantel matrix correspondence test revealed high matrix correlation in all the cases. Correlation values for RAPD, ISSR and SSR were 0.797, 0.84 and 0.8, respectively, with combined data. In both genera, wild and cultivated species were completely separated from each other at the genomic level. The analysis also revealed distinct genetic identity between species of Curcuma and Zingiber. The high genetic diversity documented in the present study provides baseline data for optimization of conservation and breeding programmes for the studied zingiberaceous species.
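
    The locus counts quoted in this abstract can be turned into percentages with a short calculation. This is only a worked check of the reported numbers (789 total, 773 polymorphic, 220 unique, 16 monomorphic); the variable names are illustrative.

```python
total_loci = 789
polymorphic = 773
unique = 220
monomorphic = 16

# The polymorphic and monomorphic counts should add up to the total.
counts_consistent = polymorphic + monomorphic == total_loci

pct_polymorphic = 100.0 * polymorphic / total_loci
pct_unique = 100.0 * unique / total_loci

print(counts_consistent, round(pct_polymorphic, 2), round(pct_unique, 2))
```

    On these figures, close to 98% of the scored loci are polymorphic, which matches the abstract's description of high genetic diversity.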

  13. Evolutionary dynamics of Newcastle disease virus

    USGS Publications Warehouse

    Miller, P.J.; Kim, L.M.; Ip, Hon S.; Afonso, C.L.

    2009-01-01

    A comprehensive dataset of NDV genome sequences was evaluated using bioinformatics to characterize the evolutionary forces affecting NDV genomes. Despite evidence of recombination in most genes, only one event in the fusion gene of genotype V viruses produced evolutionarily viable progenies. The codon-associated rate of change for the six NDV proteins revealed that the highest rate of change occurred at the fusion protein. All proteins were under strong purifying (negative) selection; the fusion protein displayed the highest number of amino acids under positive selection. Regardless of the phylogenetic grouping or the level of virulence, the cleavage site motif was highly conserved implying that mutations at this site that result in changes of virulence may not be favored. The coding sequence of the fusion gene and the genomes of viruses from wild birds displayed higher yearly rates of change in virulent viruses than in viruses of low virulence, suggesting that an increase in virulence may accelerate the rate of NDV evolution. © 2009 Elsevier Inc.

  14. Molecular characterization of Myf5 and comparative expression patterns of myogenic regulatory factors in Siniperca chuatsi.

    PubMed

    Zhu, Xin; Li, Yu-Long; Liu, Li; Wang, Jian-Hua; Li, Hong-Hui; Wu, Ping; Chu, Wu-Ying; Zhang, Jian-She

    2016-01-01

    Myogenic regulatory factors (MRFs) are muscle-specific basic helix-loop-helix (bHLH) transcription factors that play an essential role in regulating skeletal muscle development and growth. To characterize Myf5 at the molecular level and compare the expression patterns of the four MRFs, we cloned the Myf5 cDNA sequence and analyzed the MRF expression patterns using quantitative real-time polymerase chain reaction in Chinese perch (Siniperca chuatsi). Sequence analysis indicated that Chinese perch Myf5 and other MRFs shared a highly conserved bHLH domain with those of other vertebrates. Sequence alignment and a phylogenetic tree showed that Chinese perch MRFs had the highest identity with the MRFs of Epinephelus coioides. Spatio-temporal expression patterns revealed that the MRFs were primarily expressed in muscle, especially in white muscle. During the embryonic development period, Myf5, MyoD and MyoG mRNAs had a steep increase at the neurula stage, and their highest expression level was predominantly observed at the hatching period, whereas the highest expression level of MRF4 was observed at the muscular effect stage. The expression patterns during post-embryonic development showed that the Myf5, MyoD and MyoG mRNAs were highest at 90 days post-hatching (dph). Furthermore, starvation and refeeding results showed that the transcription of the MRFs in the fast skeletal muscle of Chinese perch responded quickly to a single meal after 7 days of fasting. This indicated that the MRFs might contribute to muscle recovery after refeeding in Chinese perch. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. [Sequencing and analysis of the complete genome of a rabies virus isolate from Sika deer].

    PubMed

    Zhao, Yun-Jiao; Guo, Li; Huang, Ying; Zhang, Li-Shi; Qian, Ai-Dong

    2008-05-01

    One DRV strain was isolated from Sika deer brain and sequenced. Nine overlapping gene fragments were amplified by RT-PCR using the 3'-RACE and 5'-RACE methods, and the complete DRV genome sequence was assembled. The length of the complete genome is 11863 bp. The DRV genome organization was similar to that of other rabies viruses: it comprised five genes, and the initiation and termination sites were highly conserved. There were mutated amino acids in important antigenic sites of the nucleoprotein and glycoprotein. The nucleotide and amino acid homologies of the N, P, M, G and L genes were compared among strains with completed genomic sequencing. A phylogenetic tree was constructed by comparison with the N gene sequences of other typical rabies viruses. These results indicated that DRV belonged to genotype 1. The highest homology, 94%, was with the Chinese vaccine strain 3aG, and the lowest, 71%, was with WCBV. These findings provide a theoretical reference for further research on rabies virus.

  16. Evolution and Diversity of the Human Hepatitis D Virus Genome

    PubMed Central

    Huang, Chi-Ruei; Lo, Szecheng J.

    2010-01-01

    Human hepatitis delta virus (HDV) is the RNA virus with the smallest genome. The HDV genome is divided into a viroid-like sequence and a protein-coding sequence, which could have originated from different sources, with the HDV genome eventually constituted through RNA recombination. The genome subsequently diversified through the accumulation of mutations, selected by interactions of the mutated RNA and proteins with host factors that allow infectious virions to form successfully. Therefore, we propose that the conservation of the HDV nucleotide sequence is closely related to its functionality. Genome analysis of known HDV isolates shows that the C-terminal coding sequences of large delta antigen (LDAg) are more diverse than other regions of the protein-coding sequence, yet variants that retain the biological functionality of interacting with the heavy chain of clathrin can be selected and maintained. Since viruses interact with many host factors, including those involved in escaping the host immune response, designing a program to predict RNA genome evolution is a very challenging task. PMID:20204073

  17. Molecular Characterization of the Complete Genome of Three Basal-BR Isolates of Turnip mosaic virus Infecting Raphanus sativus in China.

    PubMed

    Zhu, Fuxiang; Sun, Ying; Wang, Yan; Pan, Hongyu; Wang, Fengting; Zhang, Xianghui; Zhang, Yanhua; Liu, Jinliang

    2016-06-04

    Turnip mosaic virus (TuMV) infects crops of plant species in the family Brassicaceae worldwide. TuMV isolates were clustered to five lineages corresponding to basal-B, basal-BR, Asian-BR, world-B and OMs. Here, we determined the complete genome sequences of three TuMV basal-BR isolates infecting radish from Shandong and Jilin Provinces in China. Their genomes were all composed of 9833 nucleotides, excluding the 3'-terminal poly(A) tail. They contained two open reading frames (ORFs), with the large one encoding a polyprotein of 3164 amino acids and the small overlapping ORF encoding a PIPO protein of 61 amino acids, which contained the typically conserved motifs found in members of the genus Potyvirus. In pairwise comparison with 30 other TuMV genome sequences, these three isolates shared their highest identities with isolates from Eurasian countries (Germany, Italy, Turkey and China). Recombination analysis showed that the three isolates in this study had no "clear" recombination. The analyses of conserved amino acids changed between groups showed that the codons in the TuMV out group (OGp) and OMs group were the same at three codon sites (852, 1006, 1548), and the other TuMV groups (basal-B, basal-BR, Asian-BR, world-B) were different. This pattern suggests that the codon in the OMs progenitor did not change but that in the other TuMV groups the progenitor sequence did change at divergence. Genetic diversity analyses indicate that the PIPO gene was under the highest selection pressure and the selection pressure on P3N-PIPO and P3 was almost the same. It suggests that most of the selection pressure on P3 was probably imposed through P3N-PIPO.

  18. The genome sequence and effector complement of the flax rust pathogen Melampsora lini.

    PubMed

    Nemri, Adnane; Saunders, Diane G O; Anderson, Claire; Upadhyaya, Narayana M; Win, Joe; Lawrence, Gregory J; Jones, David A; Kamoun, Sophien; Ellis, Jeffrey G; Dodds, Peter N

    2014-01-01

    Rust fungi cause serious yield reductions on crops, including wheat, barley, soybean, coffee, and represent real threats to global food security. Of these fungi, the flax rust pathogen Melampsora lini has been developed most extensively over the past 80 years as a model to understand the molecular mechanisms that underpin pathogenesis. During infection, M. lini secretes virulence effectors to promote disease. The number of these effectors, their function and their degree of conservation across rust fungal species is unknown. To assess this, we sequenced and assembled de novo the genome of M. lini isolate CH5 into 21,130 scaffolds spanning 189 Mbp (scaffold N50 of 31 kbp). Global analysis of the DNA sequence revealed that repetitive elements, primarily retrotransposons, make up at least 45% of the genome. Using ab initio predictions, transcriptome data and homology searches, we identified 16,271 putative protein-coding genes. An analysis pipeline was then implemented to predict the effector complement of M. lini and compare it to that of the poplar rust, wheat stem rust and wheat stripe rust pathogens to identify conserved and species-specific effector candidates. Previous knowledge of four cloned M. lini avirulence effector proteins and two basidiomycete effectors was used to optimize parameters of the effector prediction pipeline. Markov clustering based on sequence similarity was performed to group effector candidates from all four rust pathogens. Clusters containing at least one member from M. lini were further analyzed and prioritized based on features including expression in isolated haustoria and infected leaf tissue and conservation across rust species. Herein, we describe 200 of 940 clusters that ranked highest on our priority list, representing 725 flax rust candidate effectors. 
Our findings on this important model rust species provide insight into how effectors of rust fungi are conserved across species and how they may act to promote infection on their hosts.
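
    The scaffold N50 quoted in this abstract is a standard assembly statistic: the length of the scaffold at which the running total, taken from the longest scaffold downward, first reaches half of the total assembly length. A minimal sketch follows; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the M. lini assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)   # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # first scaffold to reach half the assembly
            return length
    raise ValueError("n50() needs at least one length")


# Toy example (made-up lengths in kbp, total 300)
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

    For the toy list, the two longest scaffolds (80 and 70) already sum to half of the total, so the N50 is 70.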

  19. Promoter analysis of the rabbit POU5F1 gene and its expression in preimplantation stage embryos.

    PubMed

    Kobolak, Julianna; Kiss, Katalin; Polgar, Zsuzsanna; Mamo, Solomon; Rogel-Gaillard, Claire; Tancos, Zsuzsanna; Bock, Istvan; Baji, Arpad G; Tar, Krisztina; Pirity, Melinda K; Dinnyes, Andras

    2009-09-04

    The POU5F1 gene encodes the octamer-binding transcription factor-4 (Oct4). It is crucial in the regulation of pluripotency during embryonic development and widely used as a molecular marker of embryonic stem cells (ESCs). The objective of this study was to identify and analyse the promoter region of the rabbit POU5F1 gene and, furthermore, to examine its expression pattern in preimplantation stage rabbit embryos. The upstream region of rabbit POU5F1 was subcloned and sequenced, and four highly conserved promoter regions (CR1-4) were identified. The highest degree of similarity on sequence level was found among the conserved domains between rabbit and human. Among the enhancers the proximal enhancer region (PE-1A) exhibited the highest degree of homology (96.4%). Furthermore, the CR4 regulator domain containing the distal enhancer (DE-2A) was responsible for stem cell-specific expression. Also, BAC library screen revealed the existence of a processed pseudogene of rabbit POU5F1. The results of quantitative real-time PCR experiments showed that POU5F1 mRNA was abundantly present in oocytes and zygotes, but it was gradually reduced until the activation of the embryonic genome; thereafter a continuous increase in POU5F1 mRNA level was observed until blastocyst stage. By using the XYClone laser system the inner cell mass (ICM) and trophoblast portions of embryos were microdissected and examined separately, and POU5F1 mRNA was detected in both cell types. In this study we provide a comparative sequence analysis of the regulatory region of the rabbit POU5F1 gene. Our data suggest that the POU5F1 gene is strictly regulated during early mammalian development. We propose that the well conserved CR4 region containing the DE-2A enhancer is responsible for the highly conserved ESC specific gene expression. Notably, we are the first to report that rabbit POU5F1 expression is not restricted to ICM cells only, but is found in trophoblast cells as well. 
This information may be well applicable to investigate further the possible phylogenetic role and the regulation of POU5F1 gene.

  20. Complete sequence determination of a novel reptile iridovirus isolated from soft-shelled turtle and evolutionary analysis of Iridoviridae

    PubMed Central

    Huang, Youhua; Huang, Xiaohong; Liu, Hong; Gong, Jie; Ouyang, Zhengliang; Cui, Huachun; Cao, Jianhao; Zhao, Yingtao; Wang, Xiujie; Jiang, Yulin; Qin, Qiwei

    2009-01-01

    Background Soft-shelled turtle iridovirus (STIV) is the causative agent of severe systemic diseases in cultured soft-shelled turtles (Trionyx sinensis). To our knowledge, the only molecular information available on STIV mainly concerns the highly conserved STIV major capsid protein. The complete sequence of the STIV genome is not yet available. Therefore, determining the genome sequence of STIV and providing a detailed bioinformatic analysis of its genome content and evolution status will facilitate further understanding of the taxonomic elements of STIV and the molecular mechanisms of reptile iridovirus pathogenesis. Results We determined the complete nucleotide sequence of the STIV genome using 454 Life Science sequencing technology. The STIV genome is 105 890 bp in length with a base composition of 55.1% G+C. Computer assisted analysis revealed that the STIV genome contains 105 potential open reading frames (ORFs), which encode polypeptides ranging from 40 to 1,294 amino acids and 20 microRNA candidates. Among the putative proteins, 20 share homology with the ancestral proteins of the nuclear and cytoplasmic large DNA viruses (NCLDVs). Comparative genomic analysis showed that STIV has the highest degree of sequence conservation and a colinear arrangement of genes with frog virus 3 (FV3), followed by Tiger frog virus (TFV), Ambystoma tigrinum virus (ATV), Singapore grouper iridovirus (SGIV), Grouper iridovirus (GIV) and other iridovirus isolates. Phylogenetic analysis based on conserved core genes and complete genome sequence of STIV with other virus genomes was performed. Moreover, analysis of the gene gain-and-loss events in the family Iridoviridae suggested that the genes encoded by iridoviruses have evolved for favoring adaptation to different natural host species. Conclusion This study has provided the complete genome sequence of STIV. 
Phylogenetic analysis suggested that STIV and FV3 are strains of the same viral species belonging to the Ranavirus genus in the Iridoviridae family. Given virus-host co-evolution and the phylogenetic relationship among vertebrates from fish to reptiles, we propose that iridovirus might transmit between reptiles and amphibians and that STIV and FV3 are strains of the same viral species in the Ranavirus genus. PMID:19439104

  1. High-utility conserved avian microsatellite markers enable parentage and population studies across a wide range of species

    PubMed Central

    2013-01-01

    Background Microsatellites are widely used for many genetic studies. In contrast to single nucleotide polymorphism (SNP) and genotyping-by-sequencing methods, they are readily typed in samples of low DNA quality/concentration (e.g. museum/non-invasive samples), and enable the quick, cheap identification of species, hybrids, clones and ploidy. Microsatellites also have the highest cross-species utility of all types of markers used for genotyping, but, despite this, when isolated from a single species, only a relatively small proportion will be of utility. Marker development of any type requires skill and time. The availability of sufficient “off-the-shelf” markers that are suitable for genotyping a wide range of species would not only save resources but also uniquely enable new comparisons of diversity among taxa at the same set of loci. No other marker types are capable of enabling this. We therefore developed a set of avian microsatellite markers with enhanced cross-species utility. Results We selected highly-conserved sequences with a high number of repeat units in both of two genetically distant species. Twenty-four primer sets were designed from homologous sequences that possessed at least eight repeat units in both the zebra finch (Taeniopygia guttata) and chicken (Gallus gallus). Each primer sequence was a complete match to zebra finch and, after accounting for degenerate bases, at least 86% similar to chicken. We assessed primer-set utility by genotyping individuals belonging to eight passerine and four non-passerine species. The majority of the new Conserved Avian Microsatellite (CAM) markers amplified in all 12 species tested (on average, 94% in passerines and 95% in non-passerines). This new marker set is of especially high utility in passerines, with a mean 68% of loci polymorphic per species, compared with 42% in non-passerine species. 
Conclusions When combined with previously described conserved loci, this new set of conserved markers will not only reduce the necessity and expense of microsatellite isolation for a wide range of genetic studies, including avian parentage and population analyses, but will also now enable comparisons of genetic diversity among different species (and populations) at the same set of loci, with no or reduced bias. Finally, the approach used here can be applied to other taxa in which appropriate genome sequences are available. PMID:23497230
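
    The abstract reports primer similarity to chicken "after accounting for degenerate bases" but does not say how the score was computed. One plausible reading, sketched below purely as an assumption, is an ungapped position-by-position comparison in which an IUPAC degenerate code in the primer counts as a match when the target base is one of the bases it stands for. The function name `primer_similarity` and the example sequences are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_similarity(primer, target):
    """Fraction of primer positions compatible with the target sequence.

    Assumes an ungapped comparison of two equal-length strings; a degenerate
    primer base matches if the target base is in its IUPAC set.
    """
    primer, target = primer.upper(), target.upper()
    if len(primer) != len(target) or not primer:
        raise ValueError("primer and target must be non-empty and equal in length")
    matches = sum(1 for p, t in zip(primer, target) if t in IUPAC[p])
    return matches / len(primer)


# Hypothetical 5-base example: R (A or G) matches the target A, so all 5 positions match
print(primer_similarity("ACGTR", "ACGTA"))
```

    Under this reading, a threshold such as the 86% quoted above would correspond to `primer_similarity(...) >= 0.86`.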

  2. Sequencing Conservation Actions Through Threat Assessments in the Southeastern United States

    Treesearch

    Robert D. Sutter; Christopher C. Szell

    2006-01-01

    The identification of conservation priorities is one of the leading issues in conservation biology. We present a project of The Nature Conservancy, called Sequencing Conservation Actions, which prioritizes conservation areas and identifies foci for crosscutting strategies at various geographic scales. We use the term “Sequencing” to mean an ordering of actions over...

  3. Genetic Diversity of Arabica Coffee (Coffea arabica L.) in Nicaragua as Estimated by Simple Sequence Repeat Markers

    PubMed Central

    Geleta, Mulatu; Herrera, Isabel; Monzón, Arnulfo; Bryngelsson, Tomas

    2012-01-01

    Coffea arabica L. (arabica coffee), the only tetraploid species in the genus Coffea, represents the majority of the world's coffee production and makes a significant contribution to Nicaragua's economy. The present study was conducted to determine the genetic diversity of arabica coffee in Nicaragua for its conservation and breeding values. Twenty-six populations that represent eight varieties in Nicaragua were investigated using simple sequence repeat (SSR) markers. A total of 24 alleles were obtained from the 12 loci investigated across 260 individual plants. The total Nei's gene diversity (HT) and the within-population gene diversity (HS) were 0.35 and 0.29, respectively, which is comparable with that previously reported from other countries and regions. Among the varieties, the highest diversity was recorded in the variety Catimor. Analysis of molecular variance (AMOVA) revealed that about 87% of the total genetic variation was found within populations and the remaining 13% differentiated the populations (FST = 0.13; P < 0.001). The variation among the varieties was also significant. The genetic variation in Nicaraguan coffee is significant enough to be used in the breeding programs, and most of this variation can be conserved through ex situ conservation of a low number of populations from each variety. PMID:22701376

  4. On the relationship between residue structural environment and sequence conservation in proteins.

    PubMed

    Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao

    2017-09-01

    Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated based only on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, here we compared Cα atoms with other substructures in their contributions to the sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure-conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone could reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
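
    The abstract does not spell out the WCN formula. A commonly used form sums the inverse squared distances from one residue's Cα atom to every other Cα atom, so that densely packed residues score higher. The sketch below assumes that form; the function name and the toy coordinates are illustrative and do not come from the paper.

```python
def weighted_contact_number(ca_coords):
    """Per-residue WCN from C-alpha coordinates.

    Assumed definition: WCN_i = sum over j != i of 1 / r_ij**2,
    where r_ij is the distance between C-alpha atoms i and j.
    """
    wcn = []
    for i, (xi, yi, zi) in enumerate(ca_coords):
        total = 0.0
        for j, (xj, yj, zj) in enumerate(ca_coords):
            if i == j:
                continue                      # a residue is not its own contact
            dist_sq = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            total += 1.0 / dist_sq
        wcn.append(total)
    return wcn


# Toy "structure": three atoms on a line, one distance unit apart
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(weighted_contact_number(toy_coords))
```

    In the toy case the middle atom has two neighbours at unit distance and therefore the largest WCN, which is the packing-density behaviour the abstract relates to conservation.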

  5. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    PubMed Central

    Fauteux, François; Strömvik, Martina V

    2009-01-01

    Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. 
Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs. The majority of discovered motifs match experimentally characterized cis-regulatory elements. These results provide a good starting point for further experimental analysis of plant seed-specific promoters and our methodology can be used to unravel more transcriptional regulatory mechanisms in plants and other eukaryotes. PMID:19843335

  6. Conservation and diversification of Msx protein in metazoan evolution.

    PubMed

    Takahashi, Hirokazu; Kamiya, Akiko; Ishiguro, Akira; Suzuki, Atsushi C; Saitou, Naruya; Toyoda, Atsushi; Aruga, Jun

    2008-01-01

    Msx (/msh) family genes encode homeodomain (HD) proteins that control ontogeny in many animal species. We compared the structures of Msx genes from a wide range of Metazoa (Porifera, Cnidaria, Nematoda, Arthropoda, Tardigrada, Platyhelminthes, Mollusca, Brachiopoda, Annelida, Echiura, Echinodermata, Hemichordata, and Chordata) to gain an understanding of the role of these genes in phylogeny. Exon-intron boundary analysis suggested that the position of the intron located N-terminally to the HDs was widely conserved in all the genes examined, including those of cnidarians. Amino acid (aa) sequence comparison revealed 3 new evolutionarily conserved domains, as well as very strong conservation of the HDs. Two of the three domains were associated with Groucho-like protein binding in both a vertebrate and a cnidarian Msx homolog, suggesting that the interaction between Groucho-like proteins and Msx proteins was established in eumetazoan ancestors. Pairwise comparison among the collected HDs and their C-flanking aa sequences revealed that the degree of sequence conservation varied depending on the animal taxa from which the sequences were derived. Highly conserved Msx genes were identified in the Vertebrata, Cephalochordata, Hemichordata, Echinodermata, Mollusca, Brachiopoda, and Anthozoa. The wide distribution of the conserved sequences in the animal phylogenetic tree suggested that metazoan ancestors had already acquired a set of conserved domains of the current Msx family genes. Interestingly, although strongly conserved sequences were recovered from the Vertebrata, Cephalochordata, and Anthozoa, the sequences from the Urochordata and Hydrozoa showed weak conservation. Because the Vertebrata-Cephalochordata-Urochordata and Anthozoa-Hydrozoa represent sister groups in the Chordata and Cnidaria, respectively, Msx sequence diversification may have occurred differentially in the course of evolution. 
We speculate that selective loss of the conserved domains in Msx family proteins contributed to the diversification of animal body organization.

  7. Functionally conserved enhancers with divergent sequences in distant vertebrates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Song; Oksenberg, Nir; Takayama, Sachiko

    To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.

  8. Functionally conserved enhancers with divergent sequences in distant vertebrates

    DOE PAGES

    Yang, Song; Oksenberg, Nir; Takayama, Sachiko; ...

    2015-10-30

    To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.

  9. Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences.

    PubMed

    Bergman, C M; Kreitman, M

    2001-08-01

    Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
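
    The block-based measurements described here (fraction of sequence conserved and the length distribution of ungapped conserved blocks) can be illustrated on a pairwise alignment. The sketch below assumes a conserved block is a run of identical, gap-free columns of at least `min_len` positions; that criterion, the function names and the toy alignment are assumptions, since the abstract does not state the authors' exact definition.

```python
def conserved_blocks(aln_a, aln_b, min_len=3):
    """Return (start, end) half-open column ranges of ungapped identical runs."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    blocks, start = [], None
    for i, (a, b) in enumerate(zip(aln_a, aln_b)):
        identical = a == b and a != "-"       # gap columns never count as conserved
        if identical and start is None:
            start = i
        elif not identical and start is not None:
            if i - start >= min_len:
                blocks.append((start, i))
            start = None
    if start is not None and len(aln_a) - start >= min_len:
        blocks.append((start, len(aln_a)))    # run that extends to the alignment end
    return blocks


def conserved_fraction(aln_a, aln_b, min_len=3):
    """Fraction of alignment columns that fall inside conserved blocks."""
    covered = sum(end - start for start, end in conserved_blocks(aln_a, aln_b, min_len))
    return covered / len(aln_a)


# Toy alignment: a 4-column block, a 2-column run (too short), and a 3-column block
a = "ACGT-ACGTTA"
b = "ACGTTACCTTA"
blocks = conserved_blocks(a, b)
print(blocks, [end - start for start, end in blocks], conserved_fraction(a, b))
```

    The list of block lengths is the raw material for a length distribution such as the median reported in the abstract.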

  10. Biological function in the twilight zone of sequence conservation.

    PubMed

    Ponting, Chris P

    2017-08-16

    Strong DNA conservation among divergent species is an indicator of enduring functionality. With weaker sequence conservation we enter a vast 'twilight zone' in which sequence subject to transient or lower constraint cannot be distinguished easily from neutrally evolving, non-functional sequence. Twilight zone functional sequence is illuminated instead by principles of selective constraint and positive selection using genomic data acquired from within a species' population. Application of these principles reveals that despite being biochemically active, most twilight zone sequence is not functional.

  11. A strategy for detecting the conservation of folding-nucleus residues in protein superfamilies.

    PubMed

    Michnick, S W; Shakhnovich, E

    1998-01-01

    Nucleation-growth theory predicts that fast-folding peptide sequences fold to their native structure via structures in a transition-state ensemble that share a small number of native contacts (the folding nucleus). Experimental and theoretical studies of proteins suggest that residues participating in folding nuclei are conserved among homologs. We attempted to determine if this is true in proteins with highly diverged sequences but identical folds (superfamilies). We describe a strategy based on comparisons of residue conservation in natural superfamily sequences with simulated sequences (generated with a Monte-Carlo sequence design strategy) for the same proteins. The basic assumptions of the strategy were that natural sequences will conserve residues needed for folding and stability plus function, the simulated sequences contain no functional conservation, and nucleus residues make native contacts with each other. Based on these assumptions, we identified seven potential nucleus residues in ubiquitin superfamily members. Non-nucleus conserved residues were also identified; these are proposed to be involved in stabilizing native interactions. We found that all superfamily members conserved the same potential nucleus residue positions, except those for which the structural topology is significantly different. Our results suggest that the conservation of the nucleus of a specific fold can be predicted by comparing designed simulated sequences with natural highly diverged sequences that fold to the same structure. We suggest that such a strategy could be used to help plan protein folding and design experiments, to identify new superfamily members, and to subdivide superfamilies further into classes having a similar folding mechanism.

  12. Fine-tuning structural RNA alignments in the twilight zone.

    PubMed

    Bremges, Andreas; Schirmer, Stefanie; Giegerich, Robert

    2010-04-30

    A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually, but consistent with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, etc. This cycle may be iterated as long as structural conservation improves, but normally, one step suffices. Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index.
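
    The structure conservation index (SCI) used as the yardstick here is usually defined as the consensus folding energy of the alignment divided by the mean minimum free energy of the sequences folded individually. The sketch below assumes that definition and takes the energies as plain numbers; in practice they would come from tools such as RNAalifold and RNAfold, which are not called here. The function name and the energy values are made up for illustration.

```python
def structure_conservation_index(consensus_energy, single_energies):
    """SCI = consensus folding energy / mean of single-sequence folding energies.

    Energies are in kcal/mol and normally negative. Values near 0 suggest no
    shared structure; values near 1 suggest the sequences fold alike.
    """
    if not single_energies:
        raise ValueError("need at least one single-sequence energy")
    mean_single = sum(single_energies) / len(single_energies)
    if mean_single == 0:
        raise ValueError("mean single-sequence energy is zero; SCI is undefined")
    return consensus_energy / mean_single


# Hypothetical energies before and after one realignment cycle
before = structure_conservation_index(-24.0, [-40.0, -35.0, -45.0])
after = structure_conservation_index(-30.0, [-40.0, -35.0, -45.0])
print(before, after, after > before)
```

    Comparing the index before and after a realignment cycle is one way to express the stopping rule described above, where the cycle is repeated only while structural conservation improves.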

  13. Isolation and expression analysis of EcbZIP17 from different finger millet genotypes shows conserved nature of the gene.

    PubMed

    Chopperla, Ramakrishna; Singh, Sonam; Mohanty, Sasmita; Reddy, Nanja; Padaria, Jasdeep C; Solanke, Amolkumar U

    2017-10-01

    Basic leucine zipper (bZIP) transcription factors comprise one of the largest gene families in plants. They play a key role in almost every aspect of plant growth and development and also in biotic and abiotic stress tolerance. In this study, we report isolation and characterization of EcbZIP17, a group B bZIP transcription factor from a climate smart cereal, finger millet (Eleusine coracana L.). The genomic sequence of EcbZIP17 is 2662 bp long encompassing two exons and one intron with ORF of 1722 bp and peptide length of 573 aa. This gene is homologous to AtbZIP17 (Arabidopsis), ZmbZIP17 (maize) and OsbZIP60 (rice) which play a key role in endoplasmic reticulum (ER) stress pathway. In silico analysis confirmed the presence of basic leucine zipper (bZIP) and transmembrane (TM) domains in the EcbZIP17 protein. Allele mining of this gene in 16 different genotypes by Sanger sequencing revealed no variation in nucleotide sequence, including the 618 bp long intron. Expression analysis of EcbZIP17 under heat stress exhibited similar pattern of expression in all the genotypes across time intervals with highest upregulation after 4 h. The present study established the conserved nature of EcbZIP17 at nucleotide and expression level.

  14. Chalcone synthase genes from milk thistle (Silybum marianum): isolation and expression analysis.

    PubMed

    Sanjari, Sepideh; Shobbar, Zahra Sadat; Ebrahimi, Mohsen; Hasanloo, Tahereh; Sadat-Noori, Seyed-Ahmad; Tirnaz, Soodeh

    2015-12-01

    Silymarin is a flavonoid compound derived from milk thistle (Silybum marianum) seeds which has several pharmacological applications. Chalcone synthase (CHS) is a key enzyme in the biosynthesis of flavonoids; thereby, the identification of CHS encoding genes in milk thistle plant can be of great importance. In the current research, fragments of CHS genes were amplified using degenerate primers based on the conserved parts of Asteraceae CHS genes, and then cloned and sequenced. Analysis of the resultant nucleotide and deduced amino acid sequences led to the identification of two different members of the CHS gene family, SmCHS1 and SmCHS2. A third member, the full-length cDNA SmCHS3, was isolated by rapid amplification of cDNA ends (RACE); its open reading frame contained 1239 bp including exon 1 (190 bp) and exon 2 (1049 bp), encoding 63 and 349 amino acids, respectively. In silico analysis showed that the SmCHS3 sequence contains all the conserved CHS sites and shares high homology with CHS proteins from other plants. Real-time PCR analysis indicated that SmCHS1 and SmCHS3 had the highest transcript level in petals in the early flowering stage and in the stem of five upper leaves, followed by five upper leaves in the mid-flowering stage, and are most probably involved in anthocyanin and silymarin biosynthesis.

  15. Divergent evolutionary rates in vertebrate and mammalian specific conserved non-coding elements (CNEs) in echolocating mammals.

    PubMed

    Davies, Kalina T J; Tsagkogeorga, Georgia; Rossiter, Stephen J

    2014-12-19

    The majority of DNA contained within vertebrate genomes is non-coding, with a certain proportion of this thought to play regulatory roles during development. Conserved Non-coding Elements (CNEs) are an abundant group of putative regulatory sequences that are highly conserved across divergent groups and thus assumed to be under strong selective constraint. Many CNEs may contain regulatory factor binding sites, and their frequent spatial association with key developmental genes - such as those regulating sensory system development - suggests crucial roles in regulating gene expression and cellular patterning. Yet surprisingly little is known about the molecular evolution of CNEs across diverse mammalian taxa or their role in specific phenotypic adaptations. We examined 3,110 vertebrate-specific and ~82,000 mammalian-specific CNEs across 19 and 9 mammalian orders respectively, and tested for changes in the rate of evolution of CNEs located in the proximity of genes underlying the development or functioning of auditory systems. As we focused on CNEs putatively associated with genes underlying the development/functioning of auditory systems, we incorporated echolocating taxa in our dataset because of their highly specialised and derived auditory systems. Phylogenetic reconstructions of concatenated CNEs broadly recovered accepted mammal relationships despite high levels of sequence conservation. We found that CNE substitution rates were highest in rodents and lowest in primates, consistent with previous findings. Comparisons of CNE substitution rates from several genomic regions containing genes linked to auditory system development and hearing revealed differences between echolocating and non-echolocating taxa. Wider taxonomic sampling of four CNEs associated with the homeobox genes Hmx2 and Hmx3 - which are required for inner ear development - revealed family-wise variation across diverse bat species. 
Specifically, within one family of echolocating bats that utilise frequency-modulated echolocation calls varying widely in frequency and intensity, high levels of sequence divergence were found. Levels of selective constraint acting on CNEs differed both across genomic locations and taxa, with observed variation in substitution rates of CNEs among bat species. More work is needed to determine whether this variation can be linked to echolocation, and wider taxonomic sampling is necessary to fully document levels of conservation in CNEs across diverse taxa.

  16. The Pekin duck programmed death-ligand 1: cDNA cloning, genomic structure, molecular characterization and mRNA expression analysis.

    PubMed

    Yao, Q; Fischer, K P; Tyrrell, D L; Gutfreund, K S

    2015-04-01

    Programmed death ligand-1 (PD-L1) plays an important role in the attenuation of adaptive immune responses in higher vertebrates. Here, we describe the identification of the Pekin duck PD-L1 orthologue (duPD-L1) and its gene structure. The duPD-L1 cDNA encodes a 311-amino acid protein that has an amino acid identity of 78% and 42% with chicken and human PD-L1, respectively. Mapping of the duPD-L1 cDNA with duck genomic sequences revealed an exonic structure of its coding sequence similar to those of other vertebrates but lacked a noncoding exon 1. Homology modelling of the duPD-L1 extracellular domain was compatible with the tandem IgV-like and IgC-like IgSF domain structure of human PD-L1 (PDB ID: 3BIS). Residues known to be important for receptor binding of human PD-L1 were mostly conserved in duPD-L1 within the N-terminus and the G sheet, and partially conserved within the F sheet but not within sheets C and C'. DuPD-L1 mRNA was constitutively expressed in all tissues examined with highest expression levels in lung and spleen and very low levels of expression in muscle, kidney and brain. Mitogen stimulation of duck peripheral blood mononuclear cells transiently increased duPD-L1 mRNA expression. Our observations demonstrate evolutionary conservation of the exonic structure of its coding sequence, the extracellular domain structure and residues implicated in receptor binding, but the role of the longer cytoplasmic tail in avian PD-L1 proteins remains to be determined. © 2014 John Wiley & Sons Ltd.

  17. Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics.

    PubMed

    Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G

    2018-03-01

    Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.

  18. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment

    PubMed Central

    2013-01-01

    Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
    Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200

  19. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

    PubMed

    Nagar, Anurag; Hahsler, Michael

    2013-01-01

    Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
    Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
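
    The quasi-alignment abstract above compares sequence segments by their p-mer frequency counts instead of aligning them. The following is a minimal sketch of that one step only, assuming overlapping p-mers and a Manhattan distance between count profiles; the function names, the choice of p = 3, and the distance measure are illustrative assumptions, and the data stream clustering step is not shown.

```python
from collections import Counter


def pmer_profile(segment: str, p: int = 3) -> Counter:
    """Count every overlapping p-mer in a sequence segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))


def profile_distance(a: Counter, b: Counter) -> int:
    """Manhattan distance between two p-mer count profiles.

    Used here as a cheap stand-in for edit distance (an assumption of
    this sketch): similar segments share most of their p-mers.
    """
    return sum(abs(a[k] - b[k]) for k in set(a) | set(b))


def segment_profiles(seq: str, segment_len: int = 100, p: int = 3) -> list:
    """Split a sequence into fixed-length segments and profile each one."""
    return [
        pmer_profile(seq[start:start + segment_len], p)
        for start in range(0, len(seq), segment_len)
    ]


if __name__ == "__main__":
    x = pmer_profile("ACGTACGT")
    y = pmer_profile("ACGTACGA")
    print(profile_distance(x, y))
```

    Each profile is computed in a single pass over its segment, which is what keeps the overall cost linear in sequence length.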

  20. Coastal bacterioplankton community diversity along a latitudinal gradient in Latin America by means of V6 tag pyrosequencing.

    PubMed

    Thompson, Fabiano L; Bruce, Thiago; Gonzalez, Alessandra; Cardoso, Alexander; Clementino, Maysa; Costagliola, Marcela; Hozbor, Constanza; Otero, Ernesto; Piccini, Claudia; Peressutti, Silvia; Schmieder, Robert; Edwards, Robert; Smith, Mathew; Takiyama, Luis Roberto; Vieira, Ricardo; Paranhos, Rodolfo; Artigas, Luis Felipe

    2011-02-01

    The bacterioplankton diversity of coastal waters along a latitudinal gradient between Puerto Rico and Argentina was analyzed using a total of 134,197 high-quality sequences from the V6 hypervariable region of the small-subunit ribosomal RNA gene (16S rRNA) (mean length of 60 nt). Most of the OTUs were assigned to Proteobacteria, Bacteroidetes, Cyanobacteria, and Actinobacteria, corresponding to approx. 80% of the total number of sequences. The number of OTUs corresponding to species varied between 937 and 1946 in the seven locations. Proteobacteria appeared at high frequency in the seven locations. An enrichment of Cyanobacteria was observed in Puerto Rico, whereas an enrichment of Bacteroidetes was detected in the Argentinian shelf and Uruguayan coastal lagoons. The highest number of sequences of Actinobacteria and Acidobacteria were obtained in the Amazon estuary mouth. The rarefaction curves and Good's coverage estimator for species diversity suggested a significant coverage, with values ranging between 92 and 97% for Good's coverage. Conserved taxa corresponded to approx. 52% of all sequences. This study suggests that human-contaminated environments may influence bacterioplankton diversity.
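
    The coverage values quoted above come from Good's estimator, which is commonly written as C = 1 - F1/N, where F1 is the number of OTUs seen exactly once and N is the total number of sequences. A minimal sketch, assuming OTU abundances are available as a plain list of counts (the function name and the example counts are illustrative, not data from the study):

```python
def goods_coverage(otu_counts):
    """Good's coverage: 1 - (number of singleton OTUs / total sequences)."""
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("no sequences to estimate coverage from")
    singletons = sum(1 for count in otu_counts if count == 1)
    return 1.0 - singletons / total


if __name__ == "__main__":
    # Hypothetical abundances for five OTUs in one sample.
    print(goods_coverage([10, 5, 1, 1, 3]))
```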

  1. Organization of nif gene cluster in Frankia sp. EuIK1 strain, a symbiont of Elaeagnus umbellata.

    PubMed

    Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun

    2012-01-01

    The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.

  2. Significant variance in genetic diversity among populations of Schistosoma haematobium detected using microsatellite DNA loci from a genome-wide database.

    PubMed

    Glenn, Travis C; Lance, Stacey L; McKee, Anna M; Webster, Bonnie L; Emery, Aidan M; Zerlotini, Adhemar; Oliveira, Guilherme; Rollinson, David; Faircloth, Brant C

    2013-10-17

    Urogenital schistosomiasis caused by Schistosoma haematobium is widely distributed across Africa and is increasingly being targeted for control. Genome sequences and population genetic parameters can give insight into the potential for population- or species-level drug resistance. Microsatellite DNA loci are genetic markers in wide use by Schistosoma researchers, but there are few primers available for S. haematobium. We sequenced 1,058,114 random DNA fragments from clonal cercariae collected from a snail infected with a single Schistosoma haematobium miracidium. We assembled and aligned the S. haematobium sequences to the genomes of S. mansoni and S. japonicum, identifying microsatellite DNA loci across all three species and designing primers to amplify the loci in S. haematobium. To validate our primers, we screened 32 randomly selected primer pairs with population samples of S. haematobium. We designed >13,790 primer pairs to amplify unique microsatellite loci in S. haematobium (available at http://www.cebio.org/projetos/schistosoma-haematobium-genome). The three Schistosoma genomes contained similar overall frequencies of microsatellites, but the frequency and length distributions of specific motifs differed among species. We identified 15 primer pairs that amplified consistently and were easily scored. We genotyped these 15 loci in S. haematobium individuals from six locations: Zanzibar had the highest levels of diversity; Malawi, Mauritius, Nigeria, and Senegal were nearly as diverse; but the sample from South Africa was much less diverse. About half of the primers in the database of Schistosoma haematobium microsatellite DNA loci should yield amplifiable and easily scored polymorphic markers, thus providing thousands of potential markers. Sequence conservation among S. haematobium, S. japonicum, and S. mansoni is relatively high, thus it should now be possible to identify markers that are universal among Schistosoma species (i.e., using DNA sequences conserved among species), as well as other markers that are specific to species or species-groups (i.e., using DNA sequences that differ among species). Full genome-sequencing of additional species and specimens of S. haematobium, S. japonicum, and S. mansoni is desirable to better characterize differences within and among these species, to develop additional genetic markers, and to examine genes as well as conserved non-coding elements associated with drug resistance.
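
    The abstract above does not say how microsatellite loci were detected. As a rough illustration of what such a scan involves, the sketch below finds perfect tandem repeats with a regular expression; the unit sizes (2 to 6 bp), the minimum of five repeats, and the function name are all assumptions of this example rather than parameters from the study.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, repeat_unit, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex prefer the shortest repeat unit,
    so ACACAC... is reported as (AC)n rather than (ACAC)n.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


if __name__ == "__main__":
    print(find_microsatellites("GGACACACACACACTT"))
```

    Imperfect repeats and homopolymer runs would need extra handling that is left out here.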

  3. The Most Deeply Conserved Noncoding Sequences in Plants Serve Similar Functions to Those in Vertebrates Despite Large Differences in Evolutionary Rates

    PubMed Central

    Burgess, Diane; Freeling, Michael

    2014-01-01

    In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ∼170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing–associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates. PMID:24681619

  4. PUTATIVE GENE PROMOTER SEQUENCES IN THE CHLORELLA VIRUSES

    PubMed Central

    Fitzgerald, Lisa A.; Boucher, Philip T.; Yanai-Balser, Giane; Suhre, Karsten; Graves, Michael V.; Van Etten, James L.

    2008-01-01

    Three short (7 to 9 nucleotides) highly conserved nucleotide sequences were identified in the putative promoter regions (150 bp upstream and 50 bp downstream of the ATG translation start site) of three members of the genus Chlorovirus, family Phycodnaviridae. Most of these sequences occurred in similar locations within the defined promoter regions. The sequence and location of the motifs were often conserved among homologous ORFs within the Chlorovirus family. One of these conserved sequences (AATGACA) is predominately associated with genes expressed early in virus replication. PMID:18768195
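
    The promoter window described above (150 bp upstream and 50 bp downstream of the ATG) can be scanned for a motif such as AATGACA with a few lines of code. This is a minimal sketch under stated assumptions: the window is measured from the A of the ATG, only the forward strand is searched, and the function names are hypothetical.

```python
def promoter_window(genome, atg_pos, upstream=150, downstream=50):
    """Return (window_start, window_sequence) around a start codon.

    atg_pos is the 0-based index of the A in the ATG (an assumption of
    this sketch); the window is clipped at the ends of the sequence.
    """
    start = max(0, atg_pos - upstream)
    end = min(len(genome), atg_pos + downstream)
    return start, genome[start:end]


def motif_offsets(genome, atg_pos, motif="AATGACA", upstream=150, downstream=50):
    """Offsets of motif occurrences relative to the ATG (negative = upstream)."""
    start, window = promoter_window(genome, atg_pos, upstream, downstream)
    offsets = []
    i = window.find(motif)
    while i != -1:
        offsets.append(start + i - atg_pos)
        i = window.find(motif, i + 1)  # step by one to allow overlapping hits
    return offsets


if __name__ == "__main__":
    toy = "C" * 100 + "AATGACA" + "C" * 43 + "ATG" + "C" * 60
    print(motif_offsets(toy, atg_pos=150))
```

    Collecting these offsets across homologous ORFs is one way to check whether a motif sits at a similar location in each promoter.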

  5. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results.

    PubMed

    Worley, K C; Wiese, B A; Smith, R F

    1995-09-01

    BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. 
    BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL is <http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html>).

  6. Metagenomic analysis of Sichuan takin fecal sample viromes reveals novel enterovirus and astrovirus.

    PubMed

    Guan, Tian-Pei; Teng, Jade L L; Yeong, Kai-Yan; You, Zhang-Qiang; Liu, Hao; Wong, Samson S Y; Lau, Susanna K P; Woo, Patrick C Y

    2018-06-07

    The Sichuan takin inhabits the bamboo forests in the Eastern Himalayas and is considered a national treasure of China, with the highest level of legal protection and a conservation status of vulnerable according to The IUCN Red List of Threatened Species. In this study, fecal samples of 71 Sichuan takins were pooled and deep sequenced. Among the 103,553 viral sequences, 21,961 were assigned to mammalian viruses. De novo assembly revealed genomes of an enterovirus and an astrovirus and contigs of circoviruses and genogroup I picobirnaviruses. Complete genome sequencing and phylogenetic analysis showed that Sichuan takin enterovirus is a novel serotype/genotype of the species Enterovirus G, with evidence of recombination. Sichuan takin astrovirus is a new subtype of bovine astrovirus, probably belonging to a new genogroup in the genus Mamastrovirus. Further studies will reveal whether these viruses can also be found in Mishmi takin and Shaanxi takin and their pathogenic potentials. Copyright © 2018 Elsevier Inc. All rights reserved.

  7. A conserved mechanism for replication origin recognition and binding in archaea.

    PubMed

    Majerník, Alan I; Chong, James P J

    2008-01-15

    To date, methanogens are the only group within the archaea where firing DNA replication origins have not been demonstrated in vivo. In the present study we show that a previously identified cluster of ORB (origin recognition box) sequences do indeed function as an origin of replication in vivo in the archaeon Methanothermobacter thermautotrophicus. Although the consensus sequence of ORBs in M. thermautotrophicus is somewhat conserved when compared with ORB sequences in other archaea, the Cdc6-1 protein from M. thermautotrophicus (termed MthCdc6-1) displays sequence-specific binding that is selective for the MthORB sequence and does not recognize ORBs from other archaeal species. Stabilization of in vitro MthORB DNA binding by MthCdc6-1 requires additional conserved sequences 3' to those originally described for M. thermautotrophicus. By testing synthetic sequences bearing mutations in the MthORB consensus sequence, we show that Cdc6/ORB binding is critically dependent on the presence of an invariant guanine found in all archaeal ORB sequences. Mutation of a universally conserved arginine residue in the recognition helix of the winged helix domain of archaeal Cdc6-1 shows that specific origin sequence recognition is dependent on the interaction of this arginine residue with the invariant guanine. Recognition of a mutated origin sequence can be achieved by mutation of the conserved arginine residue to a lysine or glutamine residue. Thus despite a number of differences in protein and DNA sequences between species, the mechanism of origin recognition and binding appears to be conserved throughout the archaea.

  8. Fine-tuning structural RNA alignments in the twilight zone

    PubMed Central

    2010-01-01

    Background A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Results Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually, but consistent with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, etc.. This circle may be iterated as long as structural conservation improves, but normally, one step suffices. Conclusions Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index. PMID:20433706

  9. Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

    PubMed

    Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

    2017-03-27

    Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. 
    We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.

  10. DNA sequence analysis of ARS elements from chromosome III of Saccharomyces cerevisiae: identification of a new conserved sequence.

    PubMed Central

    Palzkill, T G; Oliver, S G; Newlon, C S

    1986-01-01

    Four fragments of Saccharomyces cerevisiae chromosome III DNA which carry ARS elements have been sequenced. Each fragment contains multiple copies of sequences that have at least 10 out of 11 bases of homology to a previously reported 11 bp core consensus sequence. A survey of these new ARS sequences and previously reported sequences revealed the presence of an additional 11 bp conserved element located on the 3' side of the T-rich strand of the core consensus. Subcloning analysis as well as deletion and transposon insertion mutagenesis of ARS fragments support a role for 3' conserved sequence in promoting ARS activity. PMID:3529036
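
    The "at least 10 out of 11 bases" criterion above amounts to sliding an 11-bp window along the sequence and counting positional matches to the consensus. A minimal sketch follows; the consensus string used in the example is a placeholder chosen for illustration and is not taken from the abstract, and only one strand is scanned.

```python
def near_consensus_sites(seq, consensus, min_matches):
    """Return (position, match_count) for windows close to the consensus."""
    k = len(consensus)
    sites = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        matches = sum(1 for a, b in zip(window, consensus) if a == b)
        if matches >= min_matches:
            sites.append((i, matches))
    return sites


if __name__ == "__main__":
    placeholder_consensus = "ATTTATGTTTA"  # hypothetical 11-mer, for illustration only
    fragment = "GGGATTTATGTTTTCCC"
    print(near_consensus_sites(fragment, placeholder_consensus, min_matches=10))
```

    A real consensus with degenerate positions would need a per-position set of allowed bases rather than a plain string comparison.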

  11. Identification and functional activity of a staphylocoagulase type XI variant originating from staphylococcal food poisoning isolates.

    PubMed

    Suzuki, Y; Matsushita, S; Kubota, H; Kobayashi, M; Murauchi, K; Higuchi, Y; Kato, R; Hirai, A; Sadamasu, K

    2016-09-01

    Staphylocoagulase, an extracellular protein secreted by Staphylococcus aureus, has been used as an epidemiological marker. At least 12 serotypes and 24 genotypes subdivided on the basis of nucleotide sequence have been reported to date. In this study, we identified a novel staphylocoagulase nucleotide sequence, coa310, from staphylococcal food poisoning isolates that had the ability to coagulate plasma, but could not be typed using the conventional method. The protein encoded by coa310 contained the six fundamental conserved domains of staphylocoagulase. The full-length nucleotide sequence of coa310 shared the highest similarity (77·5%) with that of staphylocoagulase-type (SCT) XIa. The sequence of the D1 region, which would be responsible for the determination of SCT, shared the highest similarity (91·8%) with that of SCT XIa. These results suggest that coa310 is a novel variant of SCT XI. Moreover, we demonstrated that coa310 encodes a functioning coagulase, by confirming the coagulating activity of the recombinant protein expressed from coa310. This is the first study to directly demonstrate that Coa310, a putative SCT XI, has coagulating activity. These findings may be useful for the improvement of the staphylocoagulase-typing method, including serotyping and genotyping. This is the first study to identify a novel variant of staphylocoagulase type XI based on its nucleotide sequence and to demonstrate coagulating activity in the variant using a recombinant protein. Elucidation of the variety of staphylocoagulases will provide suggestions for further improvement of the staphylocoagulase-typing method and contribute to our understanding of the epidemiologic characterization of Staphylococcus aureus. © 2016 The Society for Applied Microbiology.

  12. Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation.

    PubMed

    Schneider, T D

    2001-12-01

    The sequence logo for DNA binding sites of the bacteriophage P1 replication protein RepA shows unusually high sequence conservation (approximately 2 bits) at a minor groove that faces RepA. However, B-form DNA can support only 1 bit of sequence conservation via contacts into the minor groove. The high conservation in RepA sites therefore implies a distorted DNA helix with direct or indirect contacts to the protein. Here I show that a high minor groove conservation signature also appears in sequence logos of sites for other replication origin binding proteins (Rts1, DnaA, P4 alpha, EBNA1, ORC) and promoter binding proteins (sigma(70), sigma(D) factors). This finding implies that DNA binding proteins generally use non-B-form DNA distortion such as base flipping to initiate replication and transcription.
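
    The bit values quoted above are per-position information content, which for DNA is commonly computed as 2 minus the Shannon entropy of the base frequencies at that position. The sketch below shows that calculation on aligned binding sites; it omits the small-sample correction normally applied in sequence logos, and the function name and toy sites are illustrative assumptions.

```python
import math
from collections import Counter


def information_content(sites):
    """Bits of conservation at each column of equal-length aligned DNA sites.

    A fully conserved column gives 2 bits, a column split evenly between
    two bases gives 1 bit, and a uniform column gives 0 bits.
    """
    bits = []
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        bits.append(2.0 - entropy)
    return bits


if __name__ == "__main__":
    toy_sites = ["AAC", "AAG", "ATT", "ATA"]
    print(information_content(toy_sites))
```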

  13. 78 FR 51463 - Energy Conservation Program: Energy Conservation Standards for Metal Halide Lamp Fixtures

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-20

    ... Efficiency at all Possible Voltages b. Posting the Highest and Lowest Efficiencies c. Test at Single Manufacturer-Declared Voltage d. Test at Highest-Rated Voltage e. Test on Input Voltage Based on Wattage and... at the highest voltage for which the ballast is designed to operate. [Dagger] P is defined as the...

  14. Comparative analysis of the L, M, and S RNA segments of Crimean-Congo haemorrhagic fever virus isolates from southern Africa.

    PubMed

    Goedhals, Dominique; Bester, Phillip A; Paweska, Janusz T; Swanepoel, Robert; Burt, Felicity J

    2015-05-01

    Crimean-Congo haemorrhagic fever virus (CCHFV) is a member of the Bunyaviridae family with a tripartite, negative sense RNA genome. This study used predictive software to analyse the L (large), M (medium), and S (small) segments of 14 southern African CCHFV isolates. The OTU-like cysteine protease domain and the RdRp domain of the L segment are highly conserved among southern African CCHFV isolates. The M segment encodes the structural glycoproteins, GN and GC, and the non-structural glycoproteins which are post-translationally cleaved at highly conserved furin and subtilase SKI-1 cleavage sites. All of the sites previously identified were shown to be conserved among southern African CCHFV isolates. The heavily O-glycosylated N-terminal variable mucin-like domain of the M segment shows the highest sequence variability of the CCHFV proteins. Five transmembrane domains are predicted in the M segment polyprotein resulting in three regions internal to and three regions external to the membrane across the G(N), NS(M) and G(C) glycoproteins. The corroboration of conserved genome domains and sequence identity among geographically diverse isolates may assist in the identification of protein function and pathogenic mechanisms, as well as the identification of potential targets for antiviral therapy and vaccine design. As detailed functional studies are lacking for many of the CCHFV proteins, identification of functional domains by prediction of protein structure, and identification of amino acid level similarity to functionally characterised proteins of related viruses or viruses with similar pathogenic mechanisms are a necessary step for selection of areas for further study. © 2015 Wiley Periodicals, Inc.

  15. G-quadruplex prediction in E. coli genome reveals a conserved putative G-quadruplex-Hairpin-Duplex switch.

    PubMed

    Kaplan, Oktay I; Berber, Burak; Hekim, Nezih; Doluca, Osman

    2016-11-02

    Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with little established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
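
    The abstract above refers to a "generally accepted algorithm" for G-quadruplex prediction without naming it. A widely used sequence pattern is four runs of at least three guanines separated by loops of 1 to 7 nucleotides; the sketch below assumes that pattern is what is meant. It scans one strand only and, as the abstract notes of such approaches, ignores the loop sequences themselves.

```python
import re

# Four G-runs (>= 3 G each) separated by three loops of 1-7 nucleotides.
# Assumption: this common pattern stands in for the unnamed algorithm.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_candidates(seq):
    """Return (start, matched_sequence) for non-overlapping candidate motifs."""
    return [(m.start(), m.group(0)) for m in G4_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_g4_candidates("TTGGGAGGGTGGGAGGGTT"))
```

    Candidates found this way could then be grouped by loop sequence, which is the step the abstract proposes to add.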

  16. Conservation and variability of West Nile virus proteins.

    PubMed

    Koo, Qi Ying; Khan, Asif M; Jung, Keun-Ok; Ramdas, Shweta; Miotto, Olivo; Tan, Tin Wee; Brusic, Vladimir; Salmon, Jerome; August, J Thomas

    2009-01-01

    West Nile virus (WNV) has emerged globally as an increasingly important pathogen for humans and domestic animals. Studies of the evolutionary diversity of the virus over its known history will help to elucidate conserved sites, and characterize their correspondence to other pathogens and their relevance to the immune system. We describe a large-scale analysis of the entire WNV proteome, aimed at identifying and characterizing evolutionarily conserved amino acid sequences. This study, which used 2,746 WNV protein sequences collected from the NCBI GenPept database, focused on analysis of peptides of length 9 amino acids or more, which are immunologically relevant as potential T-cell epitopes. Entropy-based analysis of the diversity of WNV sequences, revealed the presence of numerous evolutionarily stable nonamer positions across the proteome (entropy value of < or = 1). The representation (frequency) of nonamers variant to the predominant peptide at these stable positions was, generally, low (< or = 10% of the WNV sequences analyzed). Eighty-eight fragments of length 9-29 amino acids, representing approximately 34% of the WNV polyprotein length, were identified to be identical and evolutionarily stable in all analyzed WNV sequences. Of the 88 completely conserved sequences, 67 are also present in other flaviviruses, and several have been associated with the functional and structural properties of viral proteins. Immunoinformatic analysis revealed that the majority (78/88) of conserved sequences are potentially immunogenic, while 44 contained experimentally confirmed human T-cell epitopes. This study identified a comprehensive catalogue of completely conserved WNV sequences, many of which are shared by other flaviviruses, and majority are potential epitopes. 
The complete conservation of these immunologically relevant sequences through the entire recorded WNV history suggests they will be valuable as components of peptide-specific vaccines or other therapeutic applications, for sequence-specific diagnosis of a wide-range of Flavivirus infections, and for studies of homologous sequences among other flaviviruses.
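
    The entropy-based screen described in this abstract can be illustrated with a minimal sketch. It assumes the sequences are already aligned and computes plain Shannon entropy (in bits) over the nonamers that start at a given position, plus the fraction of sequences that differ from the predominant nonamer. The function names and the toy sequences are hypothetical, and any sample-size correction the authors may have applied is not modelled.

```python
import math
from collections import Counter


def nonamers_at(aligned_seqs, start, k=9):
    """Collect the k-mer starting at `start` from every sequence long enough to hold it."""
    return [s[start:start + k] for s in aligned_seqs if len(s) >= start + k]


def nonamer_entropy(aligned_seqs, start, k=9):
    """Shannon entropy (bits) of the k-mers observed at one alignment position."""
    peptides = nonamers_at(aligned_seqs, start, k)
    total = len(peptides)
    counts = Counter(peptides)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def variant_fraction(aligned_seqs, start, k=9):
    """Fraction of sequences whose k-mer differs from the predominant one."""
    peptides = nonamers_at(aligned_seqs, start, k)
    predominant_count = Counter(peptides).most_common(1)[0][1]
    return 1 - predominant_count / len(peptides)


# Toy alignment (made-up sequences, not WNV data).
toy = ["ABCDEFGHIJ", "ABCDEFGHIJ", "XBCDEFGHIJ", "XBCDEFGHIJ"]
for pos in range(2):
    print(pos, nonamer_entropy(toy, pos), variant_fraction(toy, pos))
```

    Under the thresholds quoted in the abstract, a position would count as stable when the entropy is at most 1 and the variant fraction is at most 0.10; the sketch only computes the two quantities and leaves the thresholding to the caller.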

  17. Fifteen new earthworm mitogenomes shed new light on phylogeny within the Pheretima complex

    PubMed Central

    Zhang, Liangliang; Sechi, Pierfrancesco; Yuan, Minglong; Jiang, Jibao; Dong, Yan; Qiu, Jiangping

    2016-01-01

    The Pheretima complex within the Megascolecidae family is a major earthworm group. Recently, the systematic status of the Pheretima complex based on morphology was challenged by molecular studies. In this study, we carry out the first comparative mitogenomic study in oligochaetes. The mitogenomes of 15 earthworm species were sequenced and compared with 9 other available earthworm mitogenomes, with the main aim of exploring their phylogenetic relationships and testing different analytical approaches to phylogeny reconstruction. The general earthworm mitogenomic features were found to be conserved: all genes were encoded on the same strand, all the protein coding loci shared the same initiation codon (ATG), and tRNA genes showed conserved structures. The Drawida japonica mitogenome displayed the highest A + T content, reversed AT/GC-skews and the highest genetic diversity. Genetic distances among protein coding genes displayed their maximum and minimum interspecific values in the ATP8 and CO1 genes, respectively. The 22 tRNAs showed variable substitution patterns between the considered earthworm mitogenomes. The inclusion of rRNAs increased phylogenetic support. Furthermore, we tested different trimming tools for alignment improvement. Our analyses rejected reciprocal monophyly among Amynthas and Metaphire and indicated that the two genera should be systematically classified into one. PMID:26833286

  18. A Fast Alignment-Free Approach for De Novo Detection of Protein Conserved Regions

    PubMed Central

    Abnousi, Armen; Broschat, Shira L.; Kalyanaraman, Ananth

    2016-01-01

    Background Identifying conserved regions in protein sequences is a fundamental operation, occurring in numerous sequence-driven analysis pipelines. It is used as a way to decode domain-rich regions within proteins, to compute protein clusters, to annotate sequence function, and to compute evolutionary relationships among protein sequences. A number of approaches exist for identifying and characterizing protein families based on their domains, and because domains represent conserved portions of a protein sequence, the primary computation involved in protein family characterization is identification of such conserved regions. However, identifying conserved regions from large collections (millions) of protein sequences presents significant challenges. Methods In this paper we present a new, alignment-free method for detecting conserved regions in protein sequences called NADDA (No-Alignment Domain Detection Algorithm). Our method exploits the abundance of exact matching short subsequences (k-mers) to quickly detect conserved regions, and the power of machine learning is used to improve the prediction accuracy of detection. We present a parallel implementation of NADDA using the MapReduce framework and show that our method is highly scalable. Results We have compared NADDA with Pfam and InterPro databases. For known domains annotated by Pfam, accuracy is 83%, sensitivity 96%, and specificity 44%. For sequences with new domains not present in the training set an average accuracy of 63% is achieved when compared to Pfam. A boost in results in comparison with InterPro demonstrates the ability of NADDA to capture conserved regions beyond those present in Pfam. We have also compared NADDA with ADDA and MKDOM2, assuming Pfam as ground-truth. On average NADDA shows comparable accuracy, more balanced sensitivity and specificity, and being alignment-free, is significantly faster. 
Excluding the one-time cost of training, runtimes on a single processor were 49s, 10,566s, and 456s for NADDA, ADDA, and MKDOM2, respectively, for a data set comprised of approximately 2500 sequences. PMID:27552220
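
    The idea of using exact matching k-mers as evidence of conservation can be sketched as follows. This is not the NADDA implementation: it omits the machine-learning step and the MapReduce parallelization, and the function names, the value of k, and the toy sequences are assumptions chosen for illustration. For each k-mer start position in a target sequence, it counts how many other sequences contain the same k-mer.

```python
from collections import defaultdict


def kmer_index(seqs, k):
    """Map each k-mer to the set of indices of the sequences that contain it."""
    index = defaultdict(set)
    for i, s in enumerate(seqs):
        for j in range(len(s) - k + 1):
            index[s[j:j + k]].add(i)
    return index


def conservation_profile(seqs, target, k):
    """For every k-mer start in seqs[target], count the other sequences sharing that k-mer."""
    index = kmer_index(seqs, k)
    s = seqs[target]
    return [len(index[s[j:j + k]] - {target}) for j in range(len(s) - k + 1)]


# Toy protein fragments (made up); the shared core is "TAYIA".
toy = ["MKTAYIAKQR", "GGTAYIAKWW", "PPTAYIAPPP"]
print(conservation_profile(toy, target=0, k=4))
```

    Runs of high counts in the profile suggest a candidate conserved region without any alignment step; how such profiles are turned into domain calls is where a trained classifier, as the abstract describes, would come in.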

  19. Two rapidly evolving genes contribute to male fitness in Drosophila

    PubMed Central

    Reinhardt, Josephine A; Jones, Corbin D

    2013-01-01

    Purifying selection often results in conservation of gene sequence and function. The most functionally conserved genes are also thought to be among the most biologically essential. These observations have led to the use of sequence conservation as a proxy for functional conservation. Here we describe two genes that are exceptions to this pattern. We show that lack of sequence conservation among orthologs of CG15460 and CG15323 – herein named jean-baptiste (jb) and karr respectively – does not necessarily predict lack of functional conservation. These two Drosophila melanogaster genes are among the most rapidly evolving protein-coding genes in this species, being nearly as diverged from their D. yakuba orthologs as random sequences are. jb and karr are both expressed at an elevated level in larval males and adult testes, but they are not accessory gland proteins and their loss does not affect male fertility. Instead, knockdown of these genes in D. melanogaster via RNA interference caused male-biased viability defects. These viability effects occur prior to the third instar for jb and during late pupation for karr. We show that putative orthologs to jb and karr are also expressed strongly in the testes of other Drosophila species and have similar gene structure across species despite low levels of sequence conservation. While standard molecular evolution tests could not reject neutrality, other data hint at a role for natural selection. Together these data provide a clear case where a lack of sequence conservation does not imply a lack of conservation of expression or function. PMID:24221639

  20. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison

    PubMed Central

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-01-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features. PMID:15215464

  1. Inversions and Gene Order Shuffling in Anopheles gambiae and A. funestus

    NASA Astrophysics Data System (ADS)

    Sharakhov, Igor V.; Serazin, Andrew C.; Grushko, Olga G.; Dana, Ali; Lobo, Neil; Hillenmeyer, Maureen E.; Westerman, Richard; Romero-Severson, Jeanne; Costantini, Carlo; Sagnon, N'Fale; Collins, Frank H.; Besansky, Nora J.

    2002-10-01

    In tropical Africa, Anopheles funestus is one of the three most important malaria vectors. We physically mapped 157 A. funestus complementary DNAs (cDNAs) to the polytene chromosomes of this species. Sequences of the cDNAs were mapped in silico to the A. gambiae genome as part of a comparative genomic study of synteny, gene order, and sequence conservation between A. funestus and A. gambiae. These species are in the same subgenus and diverged about as recently as humans and chimpanzees. Despite nearly perfect preservation of synteny, we found substantial shuffling of gene order along corresponding chromosome arms. Since the divergence of these species, at least 70 chromosomal inversions have been fixed, the highest rate of rearrangement of any eukaryote studied to date. The high incidence of paracentric inversions and limited colinearity suggests that locating genes in one anopheline species based on gene order in another may be limited to closely related taxa.

  2. Expression of a polyubiquitin promoter isolated from Gladiolus.

    PubMed

    Joung, Young Hee; Kamo, Kathryn

    2006-10-01

    A polyubiquitin promoter (GUBQ1) including its 5'UTR and intron was isolated from the floral monocot Gladiolus because high levels of expression could not be obtained using publicly available promoters isolated from either cereals or dicots. Sequencing of the promoter revealed highly conserved 5' and 3' intron splicing sites for the 1.234 kb intron. The coding sequence of the first two ubiquitin genes showed the highest homology (87 and 86%, respectively) to the ubiquitin genes of Nicotiana tabacum and Oryza sativa RUBQ2. Transient expression following gene gun bombardment showed that relative levels of GUS activity with the GUBQ1 promoter were comparable to the CaMV 35S promoter in gladiolus, tobacco, rose, rice, and the floral monocot freesia. The highest levels of GUS expression with GUBQ1 were attained with Gladiolus. The full-length GUBQ1 promoter including 5'UTR and intron were necessary for maximum GUS expression in Gladiolus. The relative GUS activity for the promoter only was 9%, and the activity for the promoter with 5'UTR and 399 bp of the full-length 1.234 kb intron was 41%. Arabidopsis plants transformed with uidA under GUBQ1 showed moderate GUS expression throughout young leaves and in the vasculature of older leaves. The highest levels of transient GUS expression in Gladiolus have been achieved using the GUBQ1 promoter. This promoter should be useful for genetic engineering of disease resistance in Gladiolus, rose, and freesia, where high levels of gene expression are important.

  3. Functional Characterization of the Vitamin K2 Biosynthetic Enzyme UBIAD1

    PubMed Central

    Hirota, Yoshihisa; Nakagawa, Kimie; Sawada, Natsumi; Okuda, Naoko; Suhara, Yoshitomo; Uchino, Yuri; Kimoto, Takashi; Funahashi, Nobuaki; Kamao, Maya; Tsugawa, Naoko; Okano, Toshio

    2015-01-01

    UbiA prenyltransferase domain-containing protein 1 (UBIAD1) plays a significant role in vitamin K2 (MK-4) synthesis. We investigated the enzymological properties of UBIAD1 using microsomal fractions from Sf9 cells expressing UBIAD1 by analysing MK-4 biosynthetic activity. With regard to UBIAD1 enzyme reaction conditions, highest MK-4 synthetic activity was demonstrated under basic conditions at a pH between 8.5 and 9.0, with a DTT ≥0.1 mM. In addition, we found that geranyl pyrophosphate and farnesyl pyrophosphate were also recognized as a side-chain source and served as a substrate for prenylation. Furthermore, lipophilic statins were found to directly inhibit the enzymatic activity of UBIAD1. We analysed the amino acid sequence homologies across the menA and UbiA families to identify conserved structural features of UBIAD1 proteins and focused on four highly conserved domains. We prepared protein mutants deficient in the four conserved domains to evaluate enzyme activity. Because no enzyme activity was detected in the mutants deficient in the UBIAD1 conserved domains, these four domains were considered to play an essential role in enzymatic activity. We also measured enzyme activities using point mutants of the highly conserved amino acids in these domains to elucidate their respective functions. We found that the conserved domain I is a substrate recognition site that undergoes a structural change after substrate binding. The conserved domain II is a redox domain site containing a CxxC motif. The conserved domain III is a hinge region important as a catalytic site for the UBIAD1 enzyme. The conserved domain IV is a binding site for Mg2+/isoprenyl side-chain. In this study, we provide a molecular mapping of the enzymological properties of UBIAD1. PMID:25874989

  4. CodonLogo: a sequence logo-based viewer for codon patterns.

    PubMed

    Sharma, Virag; Murphy, David P; Provan, Gregory; Baranov, Pavel V

    2012-07-15

    Conserved patterns across a multiple sequence alignment can be visualized by generating sequence logos. Sequence logos show each column in the alignment as stacks of symbol(s) where the height of a stack is proportional to its informational content, whereas the height of each symbol within the stack is proportional to its frequency in the column. Sequence logos use symbols of either nucleotide or amino acid alphabets. However, certain regulatory signals in messenger RNA (mRNA) act as combinations of codons. Yet no tool is available for visualization of conserved codon patterns. We present the first application which allows visualization of conserved regions in a multiple sequence alignment in the context of codons. CodonLogo is based on WebLogo3 and uses the same heuristics but treats codons as inseparable units of a 64-letter alphabet. CodonLogo can discriminate patterns of codon conservation from patterns of nucleotide conservation that appear indistinguishable in standard sequence logos. The CodonLogo source code and its implementation (in a local version of the Galaxy Browser) are available at http://recode.ucc.ie/CodonLogo and through the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/.
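
    The stack and symbol heights described in this abstract can be sketched for codon columns as below. The sketch assumes in-frame, gap-free aligned nucleotide sequences and a 64-letter codon alphabet, so the maximum stack height is log2(64) = 6 bits. It omits any small-sample correction that logo tools may apply, and the function names and toy sequences are hypothetical rather than taken from CodonLogo or WebLogo3.

```python
import math
from collections import Counter


def codon_columns(aligned_seqs):
    """Split in-frame aligned nucleotide sequences into columns of codons."""
    n_codons = min(len(s) for s in aligned_seqs) // 3
    return [[s[3 * i:3 * i + 3] for s in aligned_seqs] for i in range(n_codons)]


def column_information(column, alphabet_size=64):
    """Information content (bits) of one column: log2(alphabet size) minus Shannon entropy."""
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in Counter(column).values())
    return math.log2(alphabet_size) - entropy


def symbol_heights(column, alphabet_size=64):
    """Height of each codon in the stack: its frequency times the column's information content."""
    info = column_information(column, alphabet_size)
    total = len(column)
    return {codon: info * n / total for codon, n in Counter(column).items()}


# Toy alignment (made up): a fixed ATG followed by four synonymous alanine codons.
toy = ["ATGGCT", "ATGGCC", "ATGGCA", "ATGGCG"]
for col in codon_columns(toy):
    print(column_information(col), symbol_heights(col))
```

    Treating each codon as a single symbol is what lets a fully conserved codon score 6 bits, whereas a nucleotide logo of the same column would report three separate stacks of at most 2 bits each.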

  5. New families of site-specific repetitive DNA sequences that comprise constitutive heterochromatin of the Syrian hamster (Mesocricetus auratus, Cricetinae, Rodentia).

    PubMed

    Yamada, Kazuhiko; Kamimura, Eikichi; Kondo, Mariko; Tsuchiya, Kimiyuki; Nishida-Umehara, Chizuko; Matsuda, Yoichi

    2006-02-01

    We molecularly cloned new families of site-specific repetitive DNA sequences from BglII- and EcoRI-digested genomic DNA of the Syrian hamster (Mesocricetus auratus, Cricetinae, Rodentia) and characterized them by chromosome in situ hybridization and filter hybridization. They were classified into six different types of repetitive DNA sequence families according to chromosomal distribution and genome organization. The hybridization patterns of the sequences were consistent with the distribution of C-positive bands and/or Hoechst-stained heterochromatin. The centromeric major satellite DNA and sex chromosome-specific and telomeric region-specific repetitive sequences were conserved in the same genus (Mesocricetus) but divergent in different genera. The chromosome-2-specific sequence was conserved in two genera, Mesocricetus and Cricetulus, and a low copy number of repetitive sequences on the heterochromatic chromosome arms were conserved in the subfamily Cricetinae but not in the subfamily Calomyscinae. By contrast, the other type of repetitive sequences on the heterochromatic chromosome arms, which had sequence similarities to a LINE sequence of rodents, was conserved through the three subfamilies, Cricetinae, Calomyscinae and Murinae. The nucleotide divergence of the repetitive sequences of heterochromatin was well correlated with the phylogenetic relationships of the Cricetinae species, and each sequence has been independently amplified and diverged in the same genome.

  6. SEPT9 Mutations and a Conserved 17q25 Sequence in Sporadic and Hereditary Brachial Plexus Neuropathy

    PubMed Central

    Klein, Christopher J.; Wu, Yanhong; Cunningham, Julie M.; Windebank, Anthony J.; Dyck, P. James B.; Friedenberg, Scott M.; Klein, Diane M.; Dyck, Peter J.

    2009-01-01

    Background The clinical characteristics of sporadic brachial plexus neuropathy (S-BPN) and hereditary brachial plexus neuropathy (H-BPN) are similar. At times of attack inflammation in brachial plexus nerves has been identified in both conditions. SEPT-9 mutations (Arg88Trp, Ser93Phe, 5UTR-131G to C) occur in some families with H-BPN. These mutations were not found in American H-BPN kindreds with a conserved 500 Kb sequence of DNA at 17q25 (the location of SEPT-9) where a founder mutation has been suggested. Objective To study 17q25 and SEPT-9 in S-BPN (56 patients) and H-BPN (13 kindreds). Methods Allele analysis at 17q25, SEPT-9 DNA sequencing and mRNA analysis from lymphoblast cultures. Results A conserved 17q25 sequence was found in 5 of 13 H-BPN kindreds and one S-BPN patient. This conserved sequence was not found in the family with a SEPT-9 mutation (Arg88Trp) or controls (182). SEPT-9 mRNA expression did not differ between forms of H-BPN and controls. No known mutations of SEPT-9 were found in S-BPN. Conclusions/Relevance Rare S-BPN patients have the same conserved 17q25 sequence found in many American H-BPN kindreds. BPN patients with this conserved sequence do not appear to have SEPT-9 mutations or alterations of its mRNA expression levels in lymphoblast cultures. BPN patients with this conserved sequence may have the most common genetic cause in the Americas by a founder effect mutation. PMID:19204161

  7. Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krauthammer, Michael; Kong, Yong; Ha, Byung Hak

    We characterized the mutational landscape of melanoma, the form of skin cancer with the highest mortality rate, by sequencing the exomes of 147 melanomas. Sun-exposed melanomas had markedly more ultraviolet (UV)-like C>T somatic mutations compared to sun-shielded acral, mucosal and uveal melanomas. Among the newly identified cancer genes was PPP6C, encoding a serine/threonine phosphatase, which harbored mutations that clustered in the active site in 12% of sun-exposed melanomas, exclusively in tumors with mutations in BRAF or NRAS. Notably, we identified a recurrent UV-signature, an activating mutation in RAC1 in 9.2% of sun-exposed melanomas. This activating mutation, the third most frequent in our cohort of sun-exposed melanoma after those of BRAF and NRAS, changes Pro29 to serine (RAC1(P29S)) in the highly conserved switch I domain. Crystal structures, and biochemical and functional studies of RAC1(P29S) showed that the alteration releases the conformational restraint conferred by the conserved proline, causes an increased binding of the protein to downstream effectors, and promotes melanocyte proliferation and migration. These findings raise the possibility that pharmacological inhibition of downstream effectors of RAC1 signaling could be of therapeutic benefit.

  8. Heterologous Array Analysis in Pinaceae: Hybridization of Pinus Taeda cDNA Arrays With cDNA From Needles and Embryogenic Cultures of P. Taeda, P. Sylvestris or Picea Abies

    PubMed Central

    van Zyl, Leonel; von Arnold, Sara; Bozhkov, Peter; Chen, Yongzhong; Egertsdotter, Ulrika; MacKay, John; Sederoff, Ronald R.; Shen, Jing; Zelena, Lyubov

    2002-01-01

    Hybridization of labelled cDNA from various cell types with high-density arrays of expressed sequence tags is a powerful technique for investigating gene expression. Few conifer cDNA libraries have been sequenced. Because of the high level of sequence conservation between Pinus and Picea we have investigated the use of arrays from one genus for studies of gene expression in the other. The partial cDNAs from 384 identifiable genes expressed in differentiating xylem of Pinus taeda were printed on nylon membranes in randomized replicates. These were hybridized with labelled cDNA from needles or embryogenic cultures of Pinus taeda, P. sylvestris and Picea abies, and with labelled cDNA from leaves of Nicotiana tabacum. The Spearman correlation of gene expression for pairs of conifer species was high for needles (r2 = 0.78 − 0.86), and somewhat lower for embryogenic cultures (r2 = 0.68 − 0.83). The correlation of gene expression for tobacco leaves and needles of each of the three conifer species was lower but sufficiently high (r2 = 0.52 − 0.63) to suggest that many partial gene sequences are conserved in angiosperms and gymnosperms. Heterologous probing was further used to identify tissue-specific gene expression over species boundaries. To evaluate the significance of differences in gene expression, conventional parametric tests were compared with permutation tests after four methods of normalization. Permutation tests after Z-normalization provide the highest degree of discrimination but may enhance the probability of type I errors. It is concluded that arrays of cDNA from loblolly pine are useful for studies of gene expression in other pines or spruces. PMID:18629264
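
    The rank-based comparison of expression profiles mentioned in this abstract can be sketched with the standard library alone. The sketch shows Z-normalization of one array's signals and a Spearman correlation computed as the Pearson correlation of average ranks. The function names and the example intensities are made up for illustration, and the permutation tests the authors describe are not reproduced here.

```python
import math
import statistics


def z_normalize(values):
    """Centre values on their mean and scale by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up signal intensities for the same five genes on two arrays.
array_a = [120.0, 340.0, 90.0, 560.0, 210.0]
array_b = [100.0, 300.0, 95.0, 480.0, 260.0]
print(spearman(z_normalize(array_a), z_normalize(array_b)))
```

    Because Spearman correlation depends only on ranks, Z-normalizing each array beforehand does not change its value; normalization matters for the parametric and permutation tests on expression differences rather than for the rank correlation itself.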

  9. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly-related genomes permit the identification of functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either A or A clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  10. A conserved post-transcriptional BMP2 switch in lung cells.

    PubMed

    Jiang, Shan; Fritz, David T; Rogers, Melissa B

    2010-05-15

    An ultra-conserved sequence in the bone morphogenetic protein 2 (BMP2) 3' untranslated region (UTR) markedly represses BMP2 expression in non-transformed lung cells. In contrast, the ultra-conserved sequence stimulates BMP2 expression in transformed lung cells. The ultra-conserved sequence functions as a post-transcriptional cis-regulatory switch. A common single-nucleotide polymorphism (SNP, rs15705, +A1123C), which has been shown to influence human morphology, disrupts a conserved element within the ultra-conserved sequence and altered reporter gene activity in non-transformed lung cells. This polymorphism changed the affinity of the BMP2 RNA for several proteins including nucleolin, which has an increased affinity for the C allele. Elevated BMP2 synthesis is associated with increased malignancy in mouse models of lung cancer and poor lung cancer patient prognosis. Understanding the cis- and trans-regulatory factors that control BMP2 synthesis is relevant to the initiation or progression of pathologies associated with abnormal BMP2 levels. (c) 2010 Wiley-Liss, Inc.

  11. Function-based classification of carbohydrate-active enzymes by recognition of short, conserved peptide motifs.

    PubMed

    Busk, Peter Kamp; Lange, Lene

    2013-06-01

    Functional prediction of carbohydrate-active enzymes is difficult due to low sequence identity. However, similar enzymes often share a few short motifs, e.g., around the active site, even when the overall sequences are very different. To exploit this notion for functional prediction of carbohydrate-active enzymes, we developed a simple algorithm, peptide pattern recognition (PPR), that can divide proteins into groups of sequences that share a set of short conserved sequences. When this method was used on 118 glycoside hydrolase 5 proteins with 9% average pairwise identity and representing four characterized enzymatic functions, 97% of the proteins were sorted into groups correlating with their enzymatic activity. Furthermore, we analyzed 8,138 glycoside hydrolase 13 proteins including 204 experimentally characterized enzymes with 28 different functions. There was a 91% correlation between group and enzyme activity. These results indicate that the function of carbohydrate-active enzymes can be predicted with high precision by finding short, conserved motifs in their sequences. The glycoside hydrolase 61 family is important for fungal biomass conversion, but only a few proteins of this family have been functionally characterized. Interestingly, PPR divided 743 glycoside hydrolase 61 proteins into 16 subfamilies useful for targeted investigation of the function of these proteins and pinpointed three conserved motifs with putative importance for enzyme activity. Furthermore, the conserved sequences were useful for cloning of new, subfamily-specific glycoside hydrolase 61 proteins from 14 fungi. In conclusion, identification of conserved sequence motifs is a new approach to sequence analysis that can predict carbohydrate-active enzyme functions with high precision.

  12. Molecular characterization of the full-length L and M RNAs of Tomato yellow ring virus, a member of the genus Tospovirus.

    PubMed

    Chen, Tsung-Chi; Li, Ju-Ting; Fan, Ya-Shu; Yeh, Yi-Chun; Yeh, Shyi-Dong; Kormelink, Richard

    2013-06-01

    Tomato yellow ring virus (TYRV), first isolated from tomato in Iran, was classified as a non-approved species of the genus Tospovirus based on the characterization of its genomic S RNA. In the current study, the complete sequences of the genomic L and M RNAs of TYRV were determined and analyzed. The L RNA has 8,877 nucleotides (nt) and codes in the viral complementary (vc) strand for the putative RNA-dependent RNA polymerase (RdRp) of 2,873 amino acids (aa) (331 kDa). The RdRp of TYRV shares the highest aa sequence identity (88.7 %) with that of Iris yellow spot virus (IYSV), and contains conserved motifs shared with those of the animal-infecting bunyaviruses. The M RNA contains 4,786 nt and codes in ambisense arrangement for the NSm protein of 308 aa (34.5 kDa) in viral sense, and the Gn/Gc glycoprotein precursor (GP) of 1,310 aa (128 kDa) in vc-sense. Phylogenetic analyses indicated that TYRV is closely clustered with IYSV and Polygonum ringspot virus (PolRSV). The NSm and GP of TYRV share the highest aa sequence identity with those of IYSV and PolRSV (89.9 and 80.2-86.5 %, respectively). Moreover, the GPs of TYRV, IYSV, and PolRSV share highly similar characteristics, among which an identical deduced N-terminal protease cleavage site that is distinct from all tospoviral GPs analyzed thus far. Taken together, the elucidation of the complete genome sequence and biological features of TYRV support a close ancestral relationship with IYSV and PolRSV.

  13. Transcriptional Activation Signals Found in the Epstein-Barr Virus (EBV) Latency C Promoter Are Conserved in the Latency C Promoter Sequences from Baboon and Rhesus Monkey EBV-Like Lymphocryptoviruses (Cercopithicine Herpesviruses 12 and 15)

    PubMed Central

    Fuentes-Pananá, Ezequiel M.; Swaminathan, Sankar; Ling, Paul D.

    1999-01-01

    The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (−1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates. PMID:9847397
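
    Figures such as "64% conserved" between two promoter sequences, or "approximately 50%" among all three, can be read as the share of alignment columns in which every sequence carries the same residue. The sketch below computes that share under the assumption that the sequences are already aligned to equal length and that columns containing a gap character count as not conserved. The authors' exact definition is not given in the abstract, and the function name and toy sequences are hypothetical.

```python
def percent_conserved(aligned_seqs, gap="-"):
    """Percentage of alignment columns where all sequences share the same non-gap residue."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    conserved = sum(
        1 for column in zip(*aligned_seqs)
        if gap not in column and len(set(column)) == 1
    )
    return 100 * conserved / length


# Toy aligned fragments (made up, not Cp sequences).
seq_a = "ACGTAC-T"
seq_b = "ACGAACGT"
seq_c = "ACTTACGT"
print(percent_conserved([seq_a, seq_b]))         # pairwise
print(percent_conserved([seq_a, seq_b, seq_c]))  # across all three
```

    Adding a sequence can only keep or lower the percentage, which is consistent with the three-way value in the abstract being smaller than each pairwise value.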

  14. Transcriptional activation signals found in the Epstein-Barr virus (EBV) latency C promoter are conserved in the latency C promoter sequences from baboon and Rhesus monkey EBV-like lymphocryptoviruses (cercopithicine herpesviruses 12 and 15).

    PubMed

    Fuentes-Pananá, E M; Swaminathan, S; Ling, P D

    1999-01-01

    The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (-1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates.

  15. An ectromelia virus profilin homolog interacts with cellular tropomyosin and viral A-type inclusion protein.

    PubMed

    Butler-Cole, Christine; Wagner, Mary J; Da Silva, Melissa; Brown, Gordon D; Burke, Robert D; Upton, Chris

    2007-07-24

    Profilins are critical to cytoskeletal dynamics in eukaryotes; however, little is known about their viral counterparts. In this study, a poxviral profilin homolog, ectromelia virus strain Moscow gene 141 (ECTV-PH), was investigated by a variety of experimental and bioinformatics techniques to characterize its interactions with cellular and viral proteins. Profilin-like proteins are encoded by all orthopoxviruses sequenced to date, and share over 90% amino acid (aa) identity. Sequence comparisons show highest similarity to mammalian type 1 profilins; however, a conserved 3 aa deletion in mammalian type 3 and poxviral profilins suggests that these homologs may be more closely related. Structural analysis shows that ECTV-PH can be successfully modelled onto both the profilin 1 crystal structure and profilin 3 homology model, though few of the surface residues thought to be required for binding actin, poly(L-proline), and PIP2 are conserved. Immunoprecipitation and mass spectrometry identified two proteins that interact with ECTV-PH within infected cells: alpha-tropomyosin, a 38 kDa cellular actin-binding protein, and the 84 kDa product of vaccinia virus strain Western Reserve (VACV-WR) 148, which is the truncated VACV counterpart of the orthopoxvirus A-type inclusion (ATI) protein. Western and far-western blots demonstrated that the interaction with alpha-tropomyosin is direct, and immunofluorescence experiments suggest that ECTV-PH and alpha-tropomyosin may colocalize to structures that resemble actin tails and cellular protrusions. Sequence comparisons of the poxviral ATI proteins show that although full-length orthologs are only present in cowpox and ectromelia viruses, an ~ 700 aa truncated ATI protein is conserved in over 90% of sequenced orthopoxviruses. Immunofluorescence studies indicate that ECTV-PH localizes to cytoplasmic inclusion bodies formed by both truncated and full-length versions of the viral ATI protein. 
    Furthermore, colocalization of ECTV-PH and truncated ATI protein to protrusions from the cell surface was observed. These results suggest a role for ECTV-PH in intracellular transport of viral proteins or intercellular spread of the virus. Broader implications include better understanding of the virus-host relationship and mechanisms by which cells organize and control the actin cytoskeleton.

  16. High throughput deep degradome sequencing reveals microRNAs and their targets in response to drought stress in mulberry (Morus alba).

    PubMed

    Li, Ruixue; Chen, Dandan; Wang, Taichu; Wan, Yizhen; Li, Rongfang; Fang, Rongjun; Wang, Yuting; Hu, Fei; Zhou, Hong; Li, Long; Zhao, Weiguo

    2017-01-01

    MicroRNAs (miRNAs) play important regulatory roles by targeting mRNAs for cleavage or translational repression. Identification of miRNA targets is essential to better understanding the roles of miRNAs. miRNA targets have not been well characterized in mulberry (Morus alba). To anatomize miRNA guided gene regulation under drought stress, transcriptome-wide high throughput degradome sequencing was used in this study to directly detect drought stress responsive miRNA targets in mulberry. A drought library (DL) and a contrast library (CL) were constructed to capture the cleaved mRNAs for sequencing. In CL, 409 target genes of 30 conserved miRNA families and 990 target genes of 199 novel miRNAs were identified. In DL, 373 target genes of 30 conserved miRNA families and 950 target genes of 195 novel miRNAs were identified. Of the conserved miRNA families in DL, mno-miR156, mno-miR172, and mno-miR396 had the highest number of targets with 54, 52 and 41 transcripts, respectively, indicating that these three miRNA families and their target genes might play important functions in response to drought stress in mulberry. Additionally, we found that many of the target genes were transcription factors. By analyzing the miRNA-target molecular network, we found that the DL independent networks consisted of 838 miRNA-mRNA pairs (63.34%). The expression patterns of 11 target genes and 12 correspondent miRNAs were detected using qRT-PCR. Six miRNA targets were further verified by RNA ligase-mediated 5' rapid amplification of cDNA ends (RLM-5' RACE). Gene Ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that these target transcripts were implicated in a broad range of biological processes and various metabolic pathways. This is the first study to comprehensively characterize target genes and their associated miRNAs in response to drought stress by degradome sequencing in mulberry. 
    This study provides a framework for understanding the molecular mechanisms of drought resistance in mulberry.

  17. Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

    PubMed

    Dong, Zheng; Zhou, Hongyu; Tao, Peng

    2018-02-01

    PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for members of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.

  18. Quantifying the relationship between sequence and three-dimensional structure conservation in RNA

    PubMed Central

    2010-01-01

    Background In recent years, the number of available RNA structures has grown rapidly, reflecting the increased interest in RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which has allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. Discussion The computational analysis presented here quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction. PMID:20550657

  19. Differential sequence diversity at merozoite surface protein-1 locus of Plasmodium knowlesi from humans and macaques in Thailand.

    PubMed

    Putaporntip, Chaturong; Thongaree, Siriporn; Jongwutiwes, Somchai

    2013-08-01

    To determine the genetic diversity and potential transmission routes of Plasmodium knowlesi, we analyzed the complete nucleotide sequence of the gene encoding the merozoite surface protein-1 of this simian malaria (Pkmsp-1), an asexual blood-stage vaccine candidate, from naturally infected humans and macaques in Thailand. Analysis of Pkmsp-1 sequences from humans (n=12) and monkeys (n=12) reveals five conserved and four variable domains. Most nucleotide substitutions in conserved domains were dimorphic whereas three of four variable domains contained complex repeats with extensive sequence and size variation. Besides purifying selection in conserved domains, evidence of intragenic recombination scattered across Pkmsp-1 was detected. The number of haplotypes, haplotype diversity, nucleotide diversity and recombination sites of human-derived sequences exceeded those of monkey-derived sequences. Phylogenetic networks based on concatenated conserved sequences of Pkmsp-1 displayed a character pattern that could have arisen from the sampling process or the presence of two independent routes of P. knowlesi transmission, i.e. from macaques to humans and from human to human in Thailand. Copyright © 2013 Elsevier B.V. All rights reserved.
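
    The summary statistics named in this record (number of haplotypes, haplotype diversity, nucleotide diversity) can be sketched as follows. This is a minimal illustration using Nei's standard estimators on gap-free aligned sequences; the abstract does not say which software or estimator variants the authors used, and the toy sequences and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs: list[str]) -> float:
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs: list[str]) -> float:
    """Average number of pairwise differences per site (pi)."""
    length = len(seqs[0])
    if len(seqs) < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need at least two aligned sequences of equal length")
    pairs = list(combinations(seqs, 2))
    total_diff = sum(sum(1 for x, y in zip(a, b) if x != y) for a, b in pairs)
    return total_diff / (len(pairs) * length)


# Toy aligned sequences (hypothetical, not real Pkmsp-1 data).
sample = ["ACGT", "ACGT", "ACGA", "TCGA"]

print("haplotypes:", len(set(sample)))
print("haplotype diversity:", round(haplotype_diversity(sample), 4))
print("nucleotide diversity:", round(nucleotide_diversity(sample), 4))
```

    Computing the same three values separately for human-derived and monkey-derived sequence sets is the kind of comparison the record describes.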

  20. Airway and Feeding Outcomes of Mandibular Distraction, Tongue-Lip Adhesion, and Conservative Management in Pierre Robin Sequence: A Prospective Study.

    PubMed

    Khansa, Ibrahim; Hall, Courtney; Madhoun, Lauren L; Splaingard, Mark; Baylis, Adriane; Kirschner, Richard E; Pearson, Gregory D

    2017-04-01

    Pierre Robin sequence is characterized by mandibular retrognathia and glossoptosis resulting in airway obstruction and feeding difficulties. When conservative management fails, mandibular distraction osteogenesis or tongue-lip adhesion may be required to avoid tracheostomy. The authors' goal was to prospectively evaluate the airway and feeding outcomes of their comprehensive approach to Pierre Robin sequence, which includes conservative management, mandibular distraction osteogenesis, and tongue-lip adhesion. A longitudinal study of newborns with Pierre Robin sequence treated at a pediatric academic medical center between 2010 and 2015 was performed. Baseline feeding and respiratory data were collected. Patients underwent conservative management if they demonstrated sustainable weight gain without tube feeds, and if their airway was stable with positioning alone. Patients who required surgery underwent tongue-lip adhesion or mandibular distraction osteogenesis based on family and surgeon preference. Postoperative airway and feeding data were collected. Twenty-eight patients with Pierre Robin sequence were followed prospectively. Thirty-two percent had a syndrome. Ten underwent mandibular distraction osteogenesis, eight underwent tongue-lip adhesion, and 10 were treated conservatively. There were no differences in days to extubation or discharge, change in weight percentile, requirement for gastrostomy tube, or residual obstructive sleep apnea between the three groups. No patients required tracheostomy. The greatest reduction in apnea-hypopnea index occurred with mandibular distraction osteogenesis, followed by tongue-lip adhesion and conservative management. Careful selection of which patients with Pierre Robin sequence need surgery, and of the most appropriate surgical procedure for each patient, can minimize the need for postprocedure tracheostomy. 
    A comprehensive approach to Pierre Robin sequence that includes conservative management, mandibular distraction osteogenesis, and tongue-lip adhesion can result in excellent airway and feeding outcomes. Therapeutic, II.

  1. The Number, Organization, and Size of Polymorphic Membrane Protein Coding Sequences as well as the Most Conserved Pmp Protein Differ within and across Chlamydia Species.

    PubMed

    Van Lent, Sarah; Creasy, Heather Huot; Myers, Garry S A; Vanrompay, Daisy

    2016-01-01

    Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis. As different Pmp proteins might be indispensable for the pathogenesis of different Chlamydia species, this study investigated the conservation of Pmp proteins both within and across C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci. The pmp coding sequences were annotated in 16 C. trachomatis, 6 C. pneumoniae, 2 C. abortus, and 16 C. psittaci genomes. The number and organization of polymorphic membrane coding sequences differed within and across the analyzed Chlamydia species. The length of coding sequences of pmpA, pmpB, and pmpH was conserved among all analyzed genomes, while the length of pmpE/F and pmpG, and remarkably also of the subtype pmpD, differed among the analyzed genomes. PmpD, PmpA, PmpH, and PmpA were the most conserved Pmp in C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci, respectively. PmpB was the most conserved Pmp across the 4 analyzed Chlamydia species. © 2016 S. Karger AG, Basel.

  2. Identification and Characterization of miRNA Transcriptome in Asiatic Cotton (Gossypium arboreum) Using High Throughput Sequencing

    PubMed Central

    Farooq, Muhammad; Mansoor, Shahid; Guo, Hui; Amin, Imran; Chee, Peng W.; Azim, M. Kamran; Paterson, Andrew H.

    2017-01-01

    MicroRNAs (miRNAs) are small 20–24 nt molecules that have been well studied over the past decade due to their important regulatory roles in different cellular processes. The mature sequences are more conserved across vast phylogenetic scales than their precursors and some are conserved within entire kingdoms; hence, their loci and function can be predicted by homology searches. Different studies have been performed to elucidate miRNAs using de novo prediction methods, but due to complex regulatory mechanisms or false positive in silico predictions, not all of them are expressed in reality, and sometimes computationally predicted mature transcripts differ from the actually expressed ones. With the availability of a complete genome sequence of Gossypium arboreum, it is important to annotate the genome for both coding and non-coding regions using high confidence transcript evidence, for this cotton species that is highly resistant to various biotic and abiotic stresses. Here we have analyzed the small RNA transcriptome of G. arboreum leaves and provided genome annotation of miRNAs with evidence from miRNA/miRNA∗ transcripts. A total of 446 miRNAs clustered into 224 miRNA families were found, among which 48 families are conserved in other plants and 176 are novel. Four short RNA libraries were used to shortlist the best predictions based on high reads per million. The size, origin, copy numbers and transcript depth of all miRNAs along with their isoforms and targets have been reported. The highest gene copy number was observed for gar-miR7504 followed by gar-miR166, gar-miR8771, gar-miR156, and gar-miR7484. Altogether, 1274 target genes were found in G. arboreum that are enriched for 216 KEGG pathways. The resultant genomic annotations are provided in UCSC BED format. PMID:28663752
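
    The "reads per million" shortlisting mentioned in this record can be sketched as below. This is only an illustration of the usual normalization (raw count scaled by library size); the read counts, the library size, the cutoff value, and the candidate name "novel_candidate_1" are hypothetical and are not taken from the study.

```python
def reads_per_million(raw_count: int, total_mapped_reads: int) -> float:
    """Normalize a raw read count by library size, expressed per million reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return raw_count * 1_000_000 / total_mapped_reads


def shortlist(counts: dict[str, int], total_mapped_reads: int, min_rpm: float) -> dict[str, float]:
    """Keep only candidates whose normalized abundance reaches the cutoff."""
    rpm = {name: reads_per_million(c, total_mapped_reads) for name, c in counts.items()}
    return {name: value for name, value in rpm.items() if value >= min_rpm}


# Hypothetical raw counts from one small RNA library of 2,000,000 mapped reads.
library_counts = {"gar-miR166": 500, "gar-miR156": 300, "novel_candidate_1": 2}
kept = shortlist(library_counts, total_mapped_reads=2_000_000, min_rpm=10.0)

for name, value in sorted(kept.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.1f} RPM")
```

    Normalizing by library size makes abundances comparable across libraries of different sequencing depth, which is presumably why a per-million measure was used across the four libraries.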

  3. Evolutionarily conserved regions and hydrophobic contacts at the superfamily level: The case of the fold-type I, pyridoxal-5′-phosphate-dependent enzymes

    PubMed Central

    Paiardini, Alessandro; Bossa, Francesco; Pascarella, Stefano

    2004-01-01

    The wealth of biological information provided by structural and genomic projects opens new prospects of understanding life and evolution at the molecular level. In this work, it is shown how computational approaches can be exploited to pinpoint protein structural features that remain invariant upon long evolutionary periods in the fold-type I, PLP-dependent enzymes. A nonredundant set of 23 superposed crystallographic structures belonging to this superfamily was built. Members of this family typically display high-structural conservation despite low-sequence identity. For each structure, a multiple-sequence alignment of orthologous sequences was obtained, and the 23 alignments were merged using the structural information to obtain a comprehensive multiple alignment of 921 sequences of fold-type I enzymes. The structurally conserved regions (SCRs), the evolutionarily conserved residues, and the conserved hydrophobic contacts (CHCs) were extracted from this data set, using both sequence and structural information. The results of this study identified a structural pattern of hydrophobic contacts shared by all of the superfamily members of fold-type I enzymes and involved in native interactions. This profile highlights the presence of a nucleus for this fold, in which residues participating in the most conserved native interactions exhibit preferential evolutionary conservation, that correlates significantly (r = 0.70) with the extent of mean hydrophobic contact value of their apolar fraction. PMID:15498941

  4. Polymerase Chain Reaction (PCR)-based methods for detection and identification of mycotoxigenic Penicillium species using conserved genes

    USDA-ARS's Scientific Manuscript database

    Polymerase chain reaction amplification of conserved genes and sequence analysis provides a very powerful tool for the identification of toxigenic as well as non-toxigenic Penicillium species. Sequences are obtained by amplification of the gene fragment, sequencing via capillary electrophoresis of d...

  5. RNA expression in a cartilaginous fish cell line reveals ancient 3′ noncoding regions highly conserved in vertebrates

    PubMed Central

    Forest, David; Nishikawa, Ryuhei; Kobayashi, Hiroshi; Parton, Angela; Bayne, Christopher J.; Barnes, David W.

    2007-01-01

    We have established a cartilaginous fish cell line [Squalus acanthias embryo cell line (SAE)], a mesenchymal stem cell line derived from the embryo of an elasmobranch, the spiny dogfish shark S. acanthias. Elasmobranchs (sharks and rays) first appeared >400 million years ago, and existing species provide useful models for comparative vertebrate cell biology, physiology, and genomics. Comparative vertebrate genomics among evolutionarily distant organisms can provide sequence conservation information that facilitates identification of critical coding and noncoding regions. Although these genomic analyses are informative, experimental verification of functions of genomic sequences depends heavily on cell culture approaches. Using ESTs defining mRNAs derived from the SAE cell line, we identified lengthy and highly conserved gene-specific nucleotide sequences in the noncoding 3′ UTRs of eight genes involved in the regulation of cell growth and proliferation. Conserved noncoding 3′ mRNA regions detected by using the shark nucleotide sequences as a starting point were found in a range of other vertebrate orders, including bony fish, birds, amphibians, and mammals. Nucleotide identity of shark and human in these regions was remarkably well conserved. Our results indicate that highly conserved gene sequences dating from the appearance of jawed vertebrates and representing potential cis-regulatory elements can be identified through the use of cartilaginous fish as a baseline. Because the expression of genes in the SAE cell line was prerequisite for their identification, this cartilaginous fish culture system also provides a physiologically valid tool to test functional hypotheses on the role of these ancient conserved sequences in comparative cell biology. PMID:17227856

  6. Sequence conservation on the Y chromosome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gibson, L.H.; Yang-Feng, L.; Lau, C.

    The Y chromosome is present in all mammals and is considered to be essential to sex determination. Despite intense genomic research, only a few genes have been identified and mapped to this chromosome in humans. Several of them, such as SRY and ZFY, have been demonstrated to be conserved and Y-located in other mammals. In order to address the issue of sequence conservation on the Y chromosome, we performed fluorescence in situ hybridization (FISH) with DNA from a human Y cosmid library as a probe to study the Y chromosomes from other mammalian species. Total DNA from 3,000-4,500 cosmid pools were labeled with biotinylated-dUTP and hybridized to metaphase chromosomes. For human and primate preparations, human cot1 DNA was included in the hybridization mixture to suppress the hybridization from repeat sequences. FISH signals were detected on the Y chromosomes of human, gorilla, orangutan and baboon (Old World monkey) and were absent on those of squirrel monkey (New World monkey), Indian muntjac, wood lemming, Chinese hamster, rat and mouse. Since sequence analysis suggested that specific genes, e.g. SRY and ZFY, are conserved between these two groups, the lack of detectable hybridization in the latter group implies either that conservation of the human Y sequences is limited to the Y chromosomes of the great apes and Old World monkeys, or that the size of the syntenic segment is too small to be detected under the resolution of FISH, or that homologous sequences have undergone considerable divergence. Further studies with reduced hybridization stringency are currently being conducted. Our results provide some clues as to Y-sequence conservation across species and demonstrate the limitations of FISH across species with total DNA sequences from a particular chromosome.

  7. Evolutionary and biophysical relationships among the papillomavirus E2 proteins.

    PubMed

    Blakaj, Dukagjin M; Fernandez-Fuentes, Narcis; Chen, Zigui; Hegde, Rashmi; Fiser, Andras; Burk, Robert D; Brenowitz, Michael

    2009-01-01

    Infection by human papillomavirus (HPV) may result in clinical conditions ranging from benign warts to invasive cancer. The HPV E2 protein represses oncoprotein transcription and is required for viral replication. HPV E2 binds to palindromic DNA sequences of highly conserved four base pair sequences flanking an identical length variable 'spacer'. E2 proteins directly contact the conserved but not the spacer DNA. Variation in naturally occurring spacer sequences results in differential protein affinity that is dependent on their sensitivity to the spacer DNA's unique conformational and/or dynamic properties. This article explores the biophysical character of this core viral protein with the goal of identifying characteristics that are associated with the risk of virally caused malignancy. The amino acid sequence, 3D structure and electrostatic features of the E2 protein DNA binding domain are highly conserved; specific interactions with DNA binding sites have also been conserved. In contrast, the E2 protein's transactivation domain does not have extensive surfaces of highly conserved residues. Rather, regions of high conservation are localized to small surface patches. Implications for cancer biology are discussed.

  8. HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.

    PubMed

    Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine

    2011-03-10

    Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
    The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de.

  9. HMMerThread: Detecting Remote, Functional Conserved Domains in Entire Genomes by Combining Relaxed Sequence-Database Searches with Fold Recognition

    PubMed Central

    Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine

    2011-01-01

    Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
    The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de. PMID:21423752

  10. Prediction and Identification of Krüppel-Like Transcription Factors by Machine Learning Method.

    PubMed

    Liao, Zhijun; Wang, Xinrui; Chen, Xingyong; Zou, Quan

    2017-01-01

    The Krüppel-like factors (KLFs) are a family of zinc finger (ZF) motif-containing transcription factors with 18 members in the human genome; among them, KLF18 is predicted by bioinformatics. KLFs have various physiological functions and are involved in a number of cancers and other diseases. Here we perform a binary classification of KLFs and non-KLFs by machine learning methods. The protein sequences of KLFs and non-KLFs were retrieved from UniProt and randomly separated into a training dataset (containing positive and negative sequences) and a test dataset (containing only negative sequences). After extracting the 188-dimensional (188D) feature vectors, we carried out classification with four classifiers (GBDT, libSVM, RF, and k-NN). For the human KLFs, we further examined the evolutionary relationships and motif distribution, and finally analyzed the conserved amino acid residues of the three zinc fingers. The classifier models from the training dataset were well constructed; the highest specificity (Sp) was 99.83%, from a library for support vector machines (libSVM), and all the correct classification rates were over 70% for 10-fold cross-validation on the test dataset. The 18 human KLFs can be further divided into 7 groups, the zinc finger domains were located at the carboxyl terminus, and many conserved amino acid residues, including cysteine and histidine, and the span and interval between them were consistent in the three ZF domains. Two classification models for KLF prediction have been built by novel machine learning methods. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
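
    The specificity (Sp) and correct classification rate reported in this record follow from a confusion matrix. The sketch below shows the standard definitions only; the labels and predictions are hypothetical toy data, not the KLF dataset, and it does not reproduce the authors' classifiers or 188D features.

```python
def confusion_counts(y_true: list[int], y_pred: list[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = KLF and 0 = non-KLF."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction lists must have equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives among all actual negatives: TN / (TN + FP)."""
    return tn / (tn + fp)


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


# Hypothetical labels and predictions for eight proteins.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
predictions = [1, 1, 0, 0, 0, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(labels, predictions)
print(f"specificity = {specificity(tn, fp):.2f}")
print(f"accuracy    = {accuracy(tp, fp, tn, fn):.2f}")
```

    On a test set that contains only negative sequences, as described in the record, there are no true positives or false negatives, so the correct classification rate reduces to the specificity.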

  11. Production, Purification, and Gene Cloning of a β-Fructofuranosidase with a High Inulin-hydrolyzing Activity Produced by a Novel Yeast Aureobasidium sp. P6 Isolated from a Mangrove Ecosystem.

    PubMed

    Jiang, Hong; Ma, Yan; Chi, Zhe; Liu, Guang-Lei; Chi, Zhen-Ming

    2016-08-01

    After screening of over 300 yeast strains isolated from mangrove ecosystems, it was found that the Aureobasidium sp. P6 strain had the highest inulin-hydrolyzing activity. Under the optimal conditions, this yeast strain produced an inulin-hydrolyzing activity of 30.98 ± 0.8 U/ml after 108 h of a 10-l fermentation. After purification, the molecular weight of the enzyme that had the inulin-hydrolyzing activity was estimated to be 47.6 kDa, and the purified enzyme could actively hydrolyze both sucrose and inulin and exhibited a transfructosylating activity at 30.0% sucrose, converting sucrose into fructooligosaccharides (FOS), indicating that the purified enzyme was a β-D-fructofuranosidase. After the full length of a β-D-fructofuranosidase gene (accession number KU308553) was cloned from the Aureobasidium sp. P6 strain, the protein deduced from the cloned gene contained the conserved sequences MNDPNGL, RDP, ECP, FS, and Q of the glycoside hydrolase GH32 family, but did not contain the conserved sequence SVEVF, and the amino acid sequence of the protein from the Aureobasidium sp. P6 strain had a high similarity to that of the β-fructofuranosidases from other fungal strains. After deletion of the β-D-fructofuranosidase gene, the disruptant still had low inulin-hydrolyzing and invertase activities and a trace amount of the transfructosylating activity, indicating that a gene encoding an inulinase may exist in the Aureobasidium sp. P6 strain.

  12. Myrciaria dubia, an Amazonian fruit: population structure and its implications for germplasm conservation and genetic improvement.

    PubMed

    Nunes, C F; Setotaw, T A; Pasqual, M; Chagas, E A; Santos, E G; Santos, D N; Lima, C G B; Cançado, G M A

    2017-03-22

    Myrciaria dubia (camu-camu) is an Amazon tree that produces a tart fruit with high vitamin C content. It is probably the fruit with the highest vitamin C content among all Brazilian fruit crops and it can be used to supplement daily vitamin C dose. This property has attracted the attention of consumers and, consequently, encouraged fruit farmers to produce it. In order to identify and select potential accessions for commercial exploitation and breeding programs, M. dubia has received considerable research attention. The identification and characterization of genetic diversity, as well as identification of the population structure of accessions preserved in germplasm banks are fundamental for the success of any breeding program. The objective of this study was to evaluate the genetic variability of 10 M. dubia populations obtained from the shores of Reis Lake, located in the municipality of Caracaraí, Roraima, Brazil. Fourteen polymorphic inter simple sequence repeat (ISSR) markers were used to study the population genetic diversity, which resulted in 108 identified alleles. Among the 14 primers, GCV, UBC810, and UBC827 produced the highest number of alleles. The study illustrated the suitability and efficiency of ISSR markers to study the genetic diversity of M. dubia accessions. We also revealed the existence of high genetic variability among both accessions and populations that can be exploited in future breeding programs and conservation activities of this species.

  13. A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

    PubMed

    Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

    2006-04-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.

  14. A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

    PubMed Central

    Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

    2006-01-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031

  15. Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone

    PubMed Central

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-01-01

    Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach. PMID:27965389

  16. Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone.

    PubMed

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-12-27

    Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.

  17. Targeting Conserved Genes in Penicillium Species.

    PubMed

    Peterson, Stephen W

    2017-01-01

    Polymerase chain reaction amplification of conserved genes and sequence analysis provides a very powerful tool for the identification of toxigenic as well as non-toxigenic Penicillium species. Sequences are obtained by amplification of the gene fragment, sequencing via capillary electrophoresis of dideoxynucleotide-labeled fragments or NGS. The sequences are compared to a database of validated isolates. Identification of species indicates the potential of the fungus to make particular mycotoxins.

  18. Nucleotide sequence of the ribosomal RNA gene of Physarum polycephalum: intron 2 and its flanking regions of the 26S rRNA gene.

    PubMed Central

    Nomiyama, H; Kuhara, S; Kukita, T; Otsuka, T; Sakaki, Y

    1981-01-01

    The 26S ribosomal RNA gene of Physarum polycephalum is interrupted by two introns, and we have previously determined the sequence of one of them (intron 1) (Nomiyama et al. Proc.Natl.Acad.Sci.USA 78, 1376-1380, 1981). In this study we sequenced the second intron (intron 2) of about 0.5 kb length and its flanking regions, and found that one nucleotide at each junction is identical in intron 1 and intron 2, though the junction regions share no other sequence homology. Comparison of the flanking exon sequences to E. coli 23S rRNA sequences shows that conserved sequences are interspersed with tracts having little homology. In particular, the region encompassing the intron 2 interruption site is highly conserved. The E. coli ribosomal protein L1 binding region is also conserved. PMID:6171776

  19. Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA.

    PubMed Central

    Hall, L; Laird, J E; Craig, R K

    1984-01-01

    Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375

  20. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  1. Isolation and Expression Analysis of CYP9A11 and Cytochrome P450 Reductase Gene in the Beet Armyworm (Lepidoptera: Noctuidae)

    PubMed Central

    Zhao, Chunqing; Feng, Xiaoyun; Tang, Tao; Qiu, Lihong

    2015-01-01

    Cytochrome P450 monooxygenases (CYPs), an enzyme superfamily, are widely distributed in organisms and play a vital role in the metabolism of exogenous and endogenous compounds by interacting with their obligatory redox partner, CYP reductase (CPR). A novel CYP gene (CYP9A11) and a CPR gene from the agricultural pest insect Spodoptera exigua were cloned and characterized. The complete cDNA sequences of SeCYP9A11 and SeCPR are 1,931 and 3,919 bp in length, respectively, and contain open reading frames of 1,593 and 2,070 nucleotides, respectively. Analysis of the putative protein sequences indicated that SeCYP9A11 contains a heme-binding domain and the unique characteristic sequence (SRFALCE) of the CYP9 family, in addition to a signal peptide and transmembrane segment at the N-terminus. Alignment analysis revealed that SeCYP9A11 shares the highest sequence similarity (66.54%) with CYP9A13 from Mamestra brassicae. The putative protein sequence of SeCPR has all of the classical CPR features, such as an N-terminal membrane anchor; three conserved domains, namely the flavin adenine dinucleotide (FAD), flavin mononucleotide (FMN), and nicotinamide adenine dinucleotide phosphate (NADPH) domains; and characteristic binding motifs. Phylogenetic analysis revealed that SeCPR shares the highest identity (95.21%) with HaCPR. The SeCYP9A11 and SeCPR genes were detected in the midgut, fat body, and cuticle tissues, and throughout all of the developmental stages of S. exigua. The mRNA levels of SeCYP9A11 and SeCPR decreased remarkably after exposure to the plant secondary metabolites quercetin and tannin. The results regarding the SeCYP9A11 and SeCPR genes in the current study provide a foundation for further study of the S. exigua P450 system. PMID:26320261

  2. Scop3D: three-dimensional visualization of sequence conservation.

    PubMed

    Vermeire, Tessa; Vermaere, Stijn; Schepens, Bert; Saelens, Xavier; Van Gucht, Steven; Martens, Lennart; Vandermarliere, Elien

    2015-04-01

    The integration of a protein's structure with its known sequence variation provides insight on how that protein evolves, for instance in terms of (changing) function or immunogenicity. Yet, collating the corresponding sequence variants into a multiple sequence alignment, calculating each position's conservation, and mapping this information back onto a relevant structure is not straightforward. We therefore built the Sequence Conservation on Protein 3D structure (scop3D) tool to perform these tasks automatically. The output consists of two modified PDB files in which the B-values for each position are replaced by the percentage sequence conservation, or the information entropy for each position, respectively. Furthermore, text files with absolute and relative amino acid occurrences for each position are also provided, along with snapshots of the protein from six distinct directions in space. The visualization provided by scop3D can for instance be used as an aid in vaccine development or to identify antigenic hotspots, which we here demonstrate based on an analysis of the fusion proteins of human respiratory syncytial virus and mumps virus. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
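
    The two per-position measures named in this abstract (percentage sequence conservation and information entropy) can be illustrated with a minimal sketch. This is not scop3D's code: the function name `column_stats`, the choice to take conservation as the share of the most frequent residue, and the treatment of a gap as an ordinary symbol are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_stats(alignment):
    """Return a (percent_conservation, entropy_bits) tuple per alignment column.

    Assumptions (not taken from scop3D): conservation is the share of the most
    frequent symbol in the column, and gap characters count as normal symbols.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    stats = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        # Share of the most common residue, as a percentage.
        conservation = 100.0 * counts.most_common(1)[0][1] / total
        # Shannon entropy in bits: 0 for an invariant column.
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        stats.append((conservation, entropy))
    return stats


# Toy alignment of four sequences, four columns.
toy = ["ACDE", "ACDF", "ACEG", "ATEH"]
for position, (cons, ent) in enumerate(column_stats(toy), start=1):
    print(position, round(cons, 1), round(ent, 3))
```

    In a pipeline like the one described, values of this kind would then be written into the B-value field of a structure file; that step is omitted here.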

  3. Sequencing Needs for Viral Diagnostics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, S N; Lam, M; Mulakken, N J

    2004-01-26

    We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives (''near neighbors'') that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.
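
    The selection criterion described above (conserved among all target strains, absent from near neighbors) can be sketched with exact k-mer sets. This is a deliberately simplified illustration, not the authors' pipeline: the function names, the use of exact fixed-length matches, and the omission of reverse complements and mismatch tolerance are all assumptions.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_signatures(targets, neighbors, k):
    """Return k-mers present in every target genome and in no neighbor genome.

    Hypothetical helper: exact matching only, forward strand only.
    """
    if not targets:
        return set()
    # Conserved: shared by all target strains (guards against false negatives).
    conserved = set.intersection(*(kmers(t, k) for t in targets))
    # Background: anything seen in a near neighbor (guards against false positives).
    background = set().union(*(kmers(n, k) for n in neighbors))
    return conserved - background


# Toy sequences, not real genomes.
targets = ["AACGTT", "GACGTA"]
print(candidate_signatures(targets, ["TTACGA"], 4))  # neighbor lacks ACGT
print(candidate_signatures(targets, ["CACGTC"], 4))  # neighbor contains ACGT
```

    Adding target genomes can only shrink the conserved set, and adding neighbor genomes can only shrink the unique set, which is why the number of each kind of genome matters in the simulations the abstract describes.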

  4. Cloning and sequence analysis of a cDNA encoding the alpha-subunit of mouse beta-N-acetylhexosaminidase and comparison with the human enzyme.

    PubMed Central

    Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L

    1992-01-01

    cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. PMID:1379046

  5. Genome-wide identification of conserved microRNA and their response to drought stress in Dongxiang wild rice (Oryza rufipogon Griff.).

    PubMed

    Zhang, Fantao; Luo, Xiangdong; Zhou, Yi; Xie, Jiankun

    2016-04-01

    To identify drought stress-responsive conserved microRNA (miRNA) from Dongxiang wild rice (Oryza rufipogon Griff., DXWR) on a genome-wide scale, high-throughput sequencing technology was used to sequence libraries of DXWR samples, treated with and without drought stress. 505 conserved miRNAs corresponding to 215 families were identified. 17 were significantly down-regulated and 16 were up-regulated under drought stress. Stem-loop qRT-PCR revealed the same expression patterns as high-throughput sequencing, suggesting the accuracy of the sequencing result was high. Potential target genes of the drought-responsive miRNA were predicted to be involved in diverse biological processes. Furthermore, 16 miRNA families were first identified to be involved in drought stress response from plants. These results present a comprehensive view of the conserved miRNA and their expression patterns under drought stress for DXWR, which will provide valuable information and sequence resources for future basis studies.

  6. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

    NASA Astrophysics Data System (ADS)

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-03-01

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.

  7. Molecular cloning and functional characterization of a short peptidoglycan recognition protein (HcPGRPS1) from the freshwater mussel, Hyriopsis cumingi.

    PubMed

    Yang, Ziyan; Li, Junhua; Li, Ying; Wu, Hongjuan; Wang, Xiaoyan

    2013-12-01

    Peptidoglycan recognition proteins (PGRPs), which are evolutionarily conserved from invertebrates to vertebrates, function as pattern-recognition and effector molecules in innate immunity. In the present study, a short-form PGRP, designated as HcPGRPS1 was identified from freshwater mussel Hyriopsis cumingi. The deduced amino acid sequence of HcPGRPS1 is composed of 235 residues which contains a conserved PGRP domain at the C-terminus. Sequence analysis showed that HcPGRPS1 shared high identities with other known PGRPs. The mRNA of HcPGRPS1 is constitutively expressed in a wide range of all tested tissues, with highest expression level in hepatopancreas, and its expression in tissues (gonad, nephridium, gill and foot) was up-regulated significantly after LPS or PGN stimulation (P<0.05). The recombinant protein of HcPGRPS1 exhibited binding activity and peptidoglycan-lytic amidase activity toward Lys-PGN from Staphylococcus aureus and DAP-PGN from Bacillus subtilis. Furthermore, recombinant HcPGRPS1 displayed strong antibacterial activity to both Gram-negative bacteria Escherichia coli, Aeromonas hydrophila, Aeromonas sobria and Gram-positive bacteria S. aureus in the presence of Zn(2+). These results suggested that HcPGRPS1 plays a multifunctional role in the defense and protection mechanisms of mussel innate immunity against infections. Copyright © 2013 Elsevier Ltd. All rights reserved.

  8. Molecular cloning and expression profile of an ATP-binding cassette (ABC) transporter gene from the hemipteran insect Nilaparvata lugens.

    PubMed

    Zha, W J; Li, S H; Zhou, L; Chen, Z J; Liu, K; Yang, G C; Hu, G; He, G C; You, A Q

    2015-03-30

    The ATP-binding cassette (ABC) transporters belong to a large superfamily of proteins that have important physiological functions in all living organisms. In insects, ABC transporters have important functions in the transport of molecules, and are also involved in insecticide resistance, metabolism, and development. In this study, the Nilaparvata lugens Stal (Hemiptera: Delphacidae) ABCG (NlABCG) gene was identified and characterized. The complete mRNA sequence of NlABCG was 2608-bp long, with an open reading frame of 2064 bp encoding a protein comprised of 687 amino acids. The conserved regions include three N-glycosylation and 34 phosphorylation sites, as well as seven transmembrane domains. The amino acid identity with the closely related species Acyrthosiphon pisum was 42.8%. Developmental expression analysis using quantitative real-time reverse transcriptase PCR suggested that the NlABCG transcript was expressed at all developmental stages of N. lugens. The lowest expression of NlABCG was in the 1st instar, and levels increased with larval growth. The transcript profiles of NlABCG were analyzed in various tissues from a 5th instar nymph, and the highest expression was observed in the midgut. These results suggest that the sequence, characteristics, and expression of NlABCG are highly conserved, and basic information is provided for its functional analysis.

  9. A conserved role for L1 as a transmembrane link between neuronal adhesion and membrane cytoskeleton assembly.

    PubMed

    Hortsch, M; O'Shea, K S; Zhao, G; Kim, F; Vallejo, Y; Dubreuil, R R

    1998-01-01

    The L1-family of cell adhesion molecules is involved in many important aspects of nervous system development. Mutations in the human L1-CAM gene cause a complicated array of neurological phenotypes; however, the molecular basis of these effects cannot be explained by a simple loss of adhesive function. Human L1-CAM and its Drosophila homolog neuroglian are rather divergent in sequence, with the highest degree of amino acid sequence conservation between segments of their cytoplasmic domains. In an attempt to elucidate the fundamental functions shared between these distantly related members of the L1-family, we demonstrate here that the extracellular domains of mammalian L1-CAMs and Drosophila neuroglian are both able to induce the aggregation of transfected Drosophila S2 cells in vitro. To a limited degree they even interact with each other in cell adhesion and neurite outgrowth assays. The cytoplasmic domains of human L1-CAM and neuroglian are both able to interact with the Drosophila homolog of the cytoskeletal linker protein ankyrin. Moreover, the recruitment of ankyrin to cell-cell contacts is completely dependent on L1-mediated cell adhesion. These findings support a model of L1 function in which the phenotypes of human L1-CAM mutations result from a disruption of the link between the extracellular environment and the neuronal cytoskeleton.

  10. Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

    PubMed

    Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

    2010-01-15

    Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short. Copyright 2009 Elsevier Inc. All rights reserved.

  11. Molecular cloning and characterization of Aspergillus nidulans cyclophilin B.

    PubMed

    Joseph, J D; Heitman, J; Means, A R

    1999-06-01

    Cyclophilins are an evolutionarily conserved family of proteins which serve as the intracellular receptors for the immunosuppressive drug cyclosporin A. Here we report the characterization of the first cyclophilin cloned from the filamentous fungus Aspergillus nidulans (CYPB). Sequence analysis of the cypB gene predicts an encoded protein with highest homology to the murine cyclophilin B protein. The sequence similarity includes an N-terminal sequence predicted to target the protein to the endoplasmic reticulum (ER) as well as a C-terminal sequence predicted to retain the mature protein in the ER. The bacterially expressed hexa-histidine tagged protein displays peptidyl-prolyl isomerase activity which is inhibited by cyclosporin A. In the presence of cyclosporin A, the expressed protein also inhibits purified calcineurin. When the endogenous cypB gene was disrupted and placed under the control of the regulatable alcohol dehydrogenase promoter, the strain demonstrated no detectable growth phenotype under conditions which induce or repress cypB transcription. Induction or repression of the cypB gene also did not affect sensitivity of A. nidulans to cyclosporin A. cypB mRNA levels were significantly elevated under severe heat shock conditions, indicating a possible role for the A. nidulans cyclophilin B protein during growth in high stress environments. Copyright 1999 Academic Press.

  12. Identification, sequence analysis, and characterization of serine/threonine protein kinase 17A from Clonorchis sinensis.

    PubMed

    Huang, Lisi; Lv, Xiaoli; Huang, Yan; Hu, Yue; Yan, Haiyan; Zheng, Minghui; Zeng, Hua; Li, Xuerong; Liang, Chi; Wu, Zhongdao; Yu, Xinbing

    2014-05-01

    This is the first report of a novel protein from Clonorchis sinensis (C. sinensis), serine/threonine protein kinase 17A (CsSTK17A), which belongs to a member of the death-associated protein kinase (DAPK) family known to regulate diverse biological processes. The full-length sequence encoding CsSTK17A was isolated from C. sinensis adult cDNA plasmid library. Two transcribed isoforms of the gene were identified from the genome of C. sinensis. CsSTK17A contains a kinase domain at the N-terminus that shares a degree of conservation with the DAPK families. Besides, the catalytic domain contains 11 subdomains conserved among STKs and shares the highest identity with STK from Schistosoma mansoni (55.9%). Three-dimensional structure of CsSTK17A displays the canonical STK fold, including the helix C, P-loop, and the activation loop. We obtained recombinant CsSTK17A (rCsSTK17A) and anti-rCsSTK17A IgG. The rCsSTK17A could be probed by anti-rCsSTK17A rat serum, C. sinensis-infected rat serum and the sera from rats immunized with C. sinensis excretory-secretory products, indicating that it is a circulating antigen possessing a strong immunocompetence. Moreover, quantitative RT-PCR and western blotting analyses revealed that CsSTK17A exhibited the highest mRNA and protein expression level in eggs, followed by metacercariae and adult worms. Intriguingly, in the immunolocalization assay, CsSTK17A was intensively localized to the operculum region of eggs in uterus, as well as the vitelline gland of both adult worm and metacercaria, implying that the protein was associated with the reproduction and development of C. sinensis. Overall, these fundamental studies might contribute to further researches on signaling systems of the parasite.

  13. A comparison of complete mitochondrial genomes of silver carp Hypophthalmichthys molitrix and bighead carp Hypophthalmichthys nobilis: Implications for their taxonomic relationship and phylogeny

    USGS Publications Warehouse

    Li, S.-F.; Xu, J.-W.; Yang, Q.-L.; Wang, C.H.; Chen, Q.; Chapman, D.C.; Lu, G.

    2009-01-01

    Based upon morphological characters, silver carp Hypophthalmichthys molitrix and bighead carp Hypophthalmichthys nobilis (or Aristichthys nobilis) have been classified into either the same genus or two distinct genera. Consequently, the taxonomic relationship of the two species at the generic level remains equivocal. This issue is addressed by sequencing complete mitochondrial genomes of H. molitrix and H. nobilis, comparing their mitogenome organization, structure and sequence similarity, and conducting a comprehensive phylogenetic analysis of cyprinid species. As with other cyprinid fishes, the mitogenomes of the two species were structurally conserved, containing 37 genes including 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA (tRNA) genes and a putative control region (D-loop). Sequence similarity between the two mitogenomes varied in different genes or regions, being highest in the tRNA genes (98.8%), lowest in the control region (89.4%) and intermediate in the protein-coding genes (94.2%). Analyses of the sequence comparison and phylogeny using concatenated protein sequences support the view that the two species belong to the genus Hypophthalmichthys. Further studies using nuclear markers and involving more closely related species, and the systematic combination of traditional biology and molecular biology are needed in order to confirm this conclusion. © 2009 The Fisheries Society of the British Isles.
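
    The abstract above reports per-region similarity percentages but does not say how they were computed. As a minimal sketch, assuming similarity means percent identity over the ungapped columns of a pairwise alignment (one common definition, not necessarily the authors'), the calculation could look like this in Python. The function name percent_identity is hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes `a` and `b` are rows of the same pairwise alignment, with '-'
    marking gaps. Columns containing a gap are skipped rather than counted
    as mismatches; other conventions exist and give different numbers.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 5 ungapped columns, 4 of them identical.
    print(percent_identity("ACGT-A", "ACGA-A"))
```

    Applied separately to the tRNA genes, the control region and the protein-coding genes, a function like this would give one figure per region, which is the shape of the comparison the abstract describes.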

  14. RT-PCR and sequence analysis of the full-length fusion protein of Canine Distemper Virus from domestic dogs.

    PubMed

    Romanutti, Carina; Gallo Calderón, Marina; Keller, Leticia; Mattion, Nora; La Torre, José

    2016-02-01

    During 2007-2014, 84 out of 236 (35.6%) samples from domestic dogs submitted to our laboratory for diagnostic purposes were positive for Canine Distemper Virus (CDV), as analyzed by RT-PCR amplification of a fragment of the nucleoprotein gene. Fifty-nine of them (70.2%) were from dogs that had been vaccinated against CDV. The full-length gene encoding the Fusion (F) protein of fifteen isolates was sequenced and compared with those of other CDVs, including wild-type and vaccine strains. Phylogenetic analysis using the F gene full-length sequences grouped all the Argentinean CDV strains in the SA2 clade. Sequence identity with the Onderstepoort vaccine strain was 89.0-90.6%, and the highest divergence was found in the 135 amino acids corresponding to the F protein signal-peptide, Fsp (64.4-66.7% identity). In contrast, this region was highly conserved among the local strains (94.1-100% identity). One extra putative N-glycosylation site was identified in the F gene of CDV Argentinean strains with respect to the vaccine strain. The present report is the first to analyze full-length F protein sequences of CDV strains circulating in Argentina, and contributes to the knowledge of molecular epidemiology of CDV, which may help in understanding future disease outbreaks. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. Listeria costaricensis sp. nov.

    PubMed

    Núñez-Montero, Kattia; Leclercq, Alexandre; Moura, Alexandra; Vales, Guillaume; Peraza, Johnny; Pizarro-Cerdá, Javier; Lecuit, Marc

    2018-03-01

    A bacterial strain isolated from a food processing drainage system in Costa Rica fulfilled the criteria as belonging to the genus Listeria, but could not be assigned to any of the known species. Phylogenetic analysis based on the 16S rRNA gene revealed highest sequence similarity with the type strain of Listeria floridensis (98.7 %). Phylogenetic analysis based on Listeria core genomes placed the novel taxon within the Listeria fleischmannii, L. floridensis and Listeria aquatica clade (Listeria sensu lato). Whole-genome sequence analyses based on the average nucleotide blast identity (ANI<80 %) indicated that this isolate belonged to a novel species. Results of pairwise amino acid identity (AAI>70 %) and percentage of conserved proteins (POCP>68 %) with currently known Listeria species, as well as of biochemical characterization, confirmed that the strain constituted a novel species within the genus Listeria. The name Listeria costaricensis sp. nov. is proposed for the novel species, and is represented by the type strain CLIP 2016/00682T (=CIP 111400T=DSM 105474T).

  16. Coiled-coil length: Size does matter.

    PubMed

    Surkont, Jaroslaw; Diekmann, Yoan; Ryder, Pearl V; Pereira-Leal, Jose B

    2015-12-01

    Protein evolution is governed by processes that alter primary sequence but also the length of proteins. Protein length may change in different ways, but insertions, deletions and duplications are the most common. An optimal protein size is a trade-off between sequence extension, which may change protein stability or lead to acquisition of a new function, and shrinkage that decreases metabolic cost of protein synthesis. Despite the general tendency for length conservation across orthologous proteins, the propensity to accept insertions and deletions is heterogeneous along the sequence. For example, protein regions rich in repetitive peptide motifs are well known to extensively vary their length across species. Here, we analyze length conservation of coiled-coils, domains formed by a ubiquitous, repetitive peptide motif present in all domains of life, that frequently plays a structural role in the cell. We observed that, despite the repetitive nature, the length of coiled-coil domains is generally highly conserved throughout the tree of life, even when the remaining parts of the protein change, including globular domains. Length conservation is independent of primary amino acid sequence variation, and represents a conservation of domain physical size. This suggests that the conservation of domain size is due to functional constraints. © 2015 Wiley Periodicals, Inc.

  17. Protein Sectors: Statistical Coupling Analysis versus Conservation

    PubMed Central

    Teşileanu, Tiberiu; Colwell, Lucy J.; Leibler, Stanislas

    2015-01-01

    Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. PMID:25723535

  18. Functional Organization of hsp70 Cluster in Camel (Camelus dromedarius) and Other Mammals

    PubMed Central

    Garbuz, David G.; Astakhova, Lubov N.; Zatsepina, Olga G.; Arkhipova, Irina R.; Nudler, Eugene; Evgen'ev, Michael B.

    2011-01-01

    Heat shock protein 70 (Hsp70) is a molecular chaperone providing tolerance to heat and other challenges at the cellular and organismal levels. We sequenced a genomic cluster containing three hsp70 family genes linked with major histocompatibility complex (MHC) class III region from an extremely heat tolerant animal, camel (Camelus dromedarius). Two hsp70 family genes comprising the cluster contain heat shock elements (HSEs), while the third gene lacks HSEs and should not be induced by heat shock. Comparison of the camel hsp70 cluster with the corresponding regions from several mammalian species revealed similar organization of genes forming the cluster. Specifically, the two heat inducible hsp70 genes are arranged in tandem, while the third constitutively expressed hsp70 family member is present in inverted orientation. Comparison of regulatory regions of hsp70 genes from camel and other mammals demonstrates that transcription factor matches with highest significance are located in the highly conserved 250-bp upstream region and correspond to HSEs followed by NF-Y and Sp1 binding sites. The high degree of sequence conservation leaves little room for putative camel-specific regulatory elements. Surprisingly, RT-PCR and 5′/3′-RACE analysis demonstrated that all three hsp70 genes are expressed in camel's muscle and blood cells not only after heat shock, but under normal physiological conditions as well, and may account for tolerance of camel cells to extreme environmental conditions. A high degree of evolutionary conservation observed for the hsp70 cluster always linked with MHC locus in mammals suggests an important role of such organization for coordinated functioning of these vital genes. PMID:22096537

  19. Mapping the transcription start points of the Staphylococcus aureus eap, emp, and vwb promoters reveals a conserved octanucleotide sequence that is essential for expression of these genes.

    PubMed

    Harraghy, Niamh; Homerova, Dagmar; Herrmann, Mathias; Kormanec, Jan

    2008-01-01

    Mapping the transcription start points of the eap, emp, and vwb promoters revealed a conserved octanucleotide sequence (COS). Deleting this sequence abolished the expression of eap, emp, and vwb. However, electrophoretic mobility shift assays gave no evidence that this sequence was a binding site for SarA or SaeR, known regulators of eap and emp.

  20. Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

    PubMed

    Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

    2012-08-01

    Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.

  1. Evolutionary growth process of highly conserved sequences in vertebrate genomes.

    PubMed

    Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

    2012-08-01

    Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bp. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfectly conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in a discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary history, each having been frozen since a different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.
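
    The criterion quoted above (100% identity over 100 bp) can be illustrated with a short Python sketch. It assumes the two genomes have already been aligned and that the two alignment rows are available as equal-length strings with '-' for gaps; the abstract does not describe the authors' actual search pipeline, and the function name identical_runs is hypothetical.

```python
def identical_runs(a: str, b: str, min_len: int = 100):
    """Return half-open (start, end) column intervals of perfect identity.

    A run is a maximal stretch of alignment columns where both rows carry
    the same base and neither has a gap. Only runs of at least `min_len`
    columns are reported.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    runs = []
    start = None
    # Iterate one step past the end so a run touching the last column closes.
    for i in range(len(a) + 1):
        same = i < len(a) and a[i] == b[i] and a[i] != "-"
        if same:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


if __name__ == "__main__":
    # Toy example with a short threshold so the output is easy to follow.
    print(identical_runs("ACGTACGTAA", "ACGTTCGTAA", min_len=4))
```

    With the default min_len of 100, only stretches matching the abstract's threshold would be kept; the toy call lowers the threshold purely for readability.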

  2. Assessing universality of DNA barcoding in geographically isolated selected desert medicinal species of Fabaceae and Poaceae

    PubMed Central

    Hussain, Fatma; Ahmed, Nisar; Ghorbani, Abdolbaset

    2018-01-01

    In pursuit of developing fast and accurate species-level molecular identification methods, we tested six DNA barcodes, namely ITS2, matK, rbcLa, ITS2+matK, ITS2+rbcLa, matK+rbcLa and ITS2+matK+rbcLa, for their capacity to identify frequently consumed but geographically isolated medicinal species of Fabaceae and Poaceae indigenous to the desert of Cholistan. Data were analysed by BLASTn sequence similarity, pairwise sequence divergence in TAXONDNA, and phylogenetic (neighbour-joining and maximum-likelihood trees) methods. Comparison of six barcode regions showed that ITS2 has the highest number of variable sites (209/360) for tested Fabaceae and (106/365) Poaceae species, the highest species-level identification (40%) in BLASTn procedure, distinct DNA barcoding gap, 100% correct species identification in BM and BCM functions of TAXONDNA, and clear cladding pattern with high nodal support in phylogenetic trees in both families. ITS2+matK+rbcLa followed ITS2 in its species-level identification capacity. The study was concluded with advocating the DNA barcoding as an effective tool for species identification and ITS2 as the best barcode region in identifying medicinal species of Fabaceae and Poaceae. Current research has practical implementation potential in the fields of pharmaco-vigilance, trade of medicinal plants and biodiversity conservation. PMID:29576968

  3. Identification of a new genotype H wild-type mumps virus strain and its molecular relatedness to other virulent and attenuated strains.

    PubMed

    Amexis, Georgios; Rubin, Steven; Chatterjee, Nando; Carbone, Kathryn; Chumakov, Kostantin

    2003-06-01

    A single clinical isolate of mumps virus designated 88-1961 was obtained from a patient hospitalized with a clinical history of upper respiratory tract infection, parotitis, severe headache, fever and lymphadenopathy. We have sequenced the full-length genome of 88-1961 and compared it against all available full-length sequences of mumps virus. Based upon its nucleotide sequence of the SH gene 88-1961 was identified as a genotype H mumps strain. The overall extent of nucleotide and amino acid differences between each individual gene and protein of 88-1961 and the full-length mumps samples showed that the missense to silent ratios were unevenly distributed. Upon evaluation of the consensus sequence of 88-1961, four positions were found to be clearly heterogeneous at the nucleotide level (NP 315C/T, NP 318C/T, F 271A/C, and HN 855C/T). Sequence analysis revealed that the amino acid sequences for the NP, M, and the L protein were the most conserved, whereas the SH protein exhibited the highest variability among the compared mumps genotypes A, B, and G. No identifying molecular patterns in the non-coding (intergenic) or coding regions of 88-1961 were found when we compared it against relatively virulent (Urabe AM9 B, Glouc1/UK96, 87-1004 and 87-1005) and non-virulent mumps strains (Jeryl Lynn and all Urabe Am9 A substrains). Copyright 2003 Wiley-Liss, Inc.

  4. cisprimertool: software to implement a comparative genomics strategy for the development of conserved intron scanning (CIS) markers.

    PubMed

    Jayashree, B; Jagadeesh, V T; Hoisington, D

    2008-05-01

    The availability of complete, annotated genomic sequence information in model organisms is a rich resource that can be extended to understudied orphan crops through comparative genomic approaches. We report here a software tool (cisprimertool) for the identification of conserved intron scanning regions using expressed sequence tag alignments to a completely sequenced model crop genome. The method used is based on earlier studies reporting the assessment of conserved intron scanning primers (called CISP) within relatively conserved exons located near exon-intron boundaries from onion, banana, sorghum and pearl millet alignments with rice. The tool is freely available to academic users at http://www.icrisat.org/gt-bt/CISPTool.htm. © 2007 ICRISAT.

  5. Conserved structures formed by heterogeneous RNA sequences drive silencing of an inflammation responsive post-transcriptional operon

    PubMed Central

    Basu, Abhijit; Jain, Niyati; Tolbert, Blanton S.; Komar, Anton A.

    2017-01-01

    RNA–protein interactions with physiological outcomes usually rely on conserved sequences within the RNA element. By contrast, activity of the diverse gamma-interferon-activated inhibitor of translation (GAIT)-elements relies on the conserved RNA folding motifs rather than the conserved sequence motifs. These elements drive the translational silencing of a group of chemokine (CC/CXC) and chemokine receptor (CCR) mRNAs, thereby helping to resolve physiological inflammation. Despite sequence dissimilarity, these RNA elements adopt common secondary structures (as revealed by 2D-1H NMR spectroscopy), providing a basis for their interaction with the RNA-binding GAIT complex. However, many of these elements (e.g. those derived from CCL22, CXCL13, CCR4 and ceruloplasmin (Cp) mRNAs) have substantially different affinities for GAIT complex binding. Toeprinting analysis shows that different positions within the overall conserved GAIT element structure contribute to differential affinities of the GAIT protein complex towards the elements. Thus, heterogeneity of GAIT elements may provide hierarchical fine-tuning of the resolution of inflammation. PMID:29069516

  6. Early Evolution of Conserved Regulatory Sequences Associated with Development in Vertebrates

    PubMed Central

    McEwen, Gayle K.; Goode, Debbie K.; Parker, Hugo J.; Woolfe, Adam; Callaway, Heather; Elgar, Greg

    2009-01-01

    Comparisons between diverse vertebrate genomes have uncovered thousands of highly conserved non-coding sequences, an increasing number of which have been shown to function as enhancers during early development. Despite their extreme conservation over 500 million years from humans to cartilaginous fish, these elements appear to be largely absent in invertebrates, and, to date, there has been little understanding of their mode of action or the evolutionary processes that have modelled them. We have now exploited emerging genomic sequence data for the sea lamprey, Petromyzon marinus, to explore the depth of conservation of this type of element in the earliest diverging extant vertebrate lineage, the jawless fish (agnathans). We searched for conserved non-coding elements (CNEs) at 13 human gene loci and identified lamprey elements associated with all but two of these gene regions. Although markedly shorter and less well conserved than within jawed vertebrates, identified lamprey CNEs are able to drive specific patterns of expression in zebrafish embryos, which are almost identical to those driven by the equivalent human elements. These CNEs are therefore a unique and defining characteristic of all vertebrates. Furthermore, alignment of lamprey and other vertebrate CNEs should permit the identification of persistent sequence signatures that are responsible for common patterns of expression and contribute to the elucidation of the regulatory language in CNEs. Identifying the core regulatory code for development, common to all vertebrates, provides a foundation upon which regulatory networks can be constructed and might also illuminate how large conserved regulatory sequence blocks evolve and become fixed in genomic DNA. PMID:20011110

  7. Phylogenetic analysis reveals conservation and diversification of micro RNA166 genes among diverse plant species.

    PubMed

    Barik, Suvakanta; SarkarDas, Shabari; Singh, Archita; Gautam, Vibhav; Kumar, Pramod; Majee, Manoj; Sarkar, Ananda K

    2014-01-01

    Similar to the majority of the microRNAs, mature miR166s are derived from multiple members of MIR166 genes (precursors) and regulate various aspects of plant development by negatively regulating their target genes (Class III HD-ZIP). The evolutionary conservation or functional diversification of miRNA166 family members remains elusive. Here, we show the phylogenetic relationships among MIR166 precursor and mature sequences from three diverse model plant species. Despite strong conservation, some mature miR166 sequences, such as ppt-miR166m, have undergone sequence variation. Critical sequence variation in ppt-miR166m has led to functional diversification, as it targets non-HD-ZIPIII gene transcript (s). MIR166 precursor sequences have diverged in a lineage specific manner, and both precursors and mature osa-miR166i/j are highly conserved. Interestingly, polycistronic MIR166s were present in Physcomitrella and Oryza but not in Arabidopsis. The nature of cis-regulatory motifs on the upstream promoter sequences of MIR166 genes indicates their possible contribution to the functional variation observed among miR166 species. Copyright © 2013 Elsevier Inc. All rights reserved.

  8. SeqFIRE: a web application for automated extraction of indel regions and conserved blocks from protein multiple sequence alignments.

    PubMed

    Ajawatanawong, Pravech; Atkinson, Gemma C; Watson-Haigh, Nathan S; Mackenzie, Bryony; Baldauf, Sandra L

    2012-07-01

    Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple alignments, particularly insertions/deletions (indels) in protein sequences. We have developed Sequence Feature and Indel Region Extractor (SeqFIRE) to enable the automated identification and extraction of indels from protein sequence alignments. The program can also extract conserved blocks and identify fast evolving sites using a combination of conservation and entropy. All major variables can be adjusted by the user, allowing them to identify the sets of variables most suited to a particular analysis or dataset. Thus, all major tasks in preparing an alignment for further analysis are combined in a single flexible and user-friendly program. The output includes a numbered list of indels, alignments in NEXUS format with indels annotated or removed and indel-only matrices. SeqFIRE is a user-friendly web application, freely available online at www.seqfire.org/.
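
    The abstract above mentions two operations: extracting indel regions and scoring sites with conservation and entropy. The Python sketch below illustrates generic versions of both on a toy protein alignment. It is not SeqFIRE's algorithm, whose parameters and scoring are not given here; the function names column_entropy and indel_regions are hypothetical, and treating any gap-containing column as part of an indel is a simplifying assumption.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one alignment column.

    Gap characters are ignored. A fully conserved column scores 0.0.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def indel_regions(alignment):
    """Return half-open (start, end) column intervals containing any gap."""
    ncols = len(alignment[0])
    if any(len(row) != ncols for row in alignment):
        raise ValueError("all rows must have the same length")
    regions = []
    start = None
    # Iterate one step past the end so a trailing gapped run is closed.
    for i in range(ncols + 1):
        gapped = i < ncols and any(row[i] == "-" for row in alignment)
        if gapped:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i))
            start = None
    return regions


if __name__ == "__main__":
    aln = ["MKT-AL", "MKTGAL", "MRT-AL"]
    print(indel_regions(aln))
    for i in range(len(aln[0])):
        col = "".join(row[i] for row in aln)
        print(i, col, round(column_entropy(col), 3))
```

    Low-entropy columns outside the reported intervals would correspond to candidate conserved blocks, and high-entropy columns to fast-evolving sites, under the assumptions stated above.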

  9. Principles of regulatory information conservation between mouse and human.

    PubMed

    Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; Wu, Weisheng; Cayting, Philip; Boyle, Alan P; Sundaram, Vasavi; Xing, Xiaoyun; Dogan, Nergiz; Li, Jingjing; Euskirchen, Ghia; Lin, Shin; Lin, Yiing; Visel, Axel; Kawli, Trupti; Yang, Xinqiong; Patacsil, Dorrelyn; Keller, Cheryl A; Giardine, Belinda; Kundaje, Anshul; Wang, Ting; Pennacchio, Len A; Weng, Zhiping; Hardison, Ross C; Snyder, Michael P

    2014-11-20

    To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.

  10. Two DNA-binding factors recognize specific sequences at silencers, upstream activating sequences, autonomously replicating sequences, and telomeres in Saccharomyces cerevisiae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buchman, A.R.; Kimmerly, W.J.; Rine, J.

    1988-01-01

    Two DNA-binding factors from Saccharomyces cerevisiae have been characterized, GRFI (general regulatory factor I) and ABFI (ARS-binding factor I), that recognize specific sequences within diverse genetic elements. GRFI bound to sequences at the negative regulatory elements (silencers) of the silent mating type loci HML E and HMR E and to the upstream activating sequence (UAS) required for transcription of the MATα genes. A putative conserved UAS located at genes involved in translation (RPG box) was also recognized by GRFI. In addition, GRFI bound with high affinity to sequences within the (C1-3A)-repeat region at yeast telomeres. Binding sites for GRFI with the highest affinity appeared to be of the form 5'-(A/G)(A/C)ACCCANNCA(T/C)(T/C)-3', where N is any nucleotide. ABFI-binding sites were located next to autonomously replicating sequences (ARSs) at controlling elements of the silent mating type loci HMR E, HMR I, and HML I and were associated with ARS1, ARS2, and the 2µm plasmid ARS. Two tandem ABFI binding sites were found between the HIS3 and DED1 genes, several kilobase pairs from any ARS, indicating that ABFI-binding sites are not restricted to ARSs. The sequences recognized by ABFI showed partial dyad-symmetry and appeared to be variations of the consensus 5'-TATCATTNNNNACGA-3'. GRFI and ABFI were both abundant DNA-binding factors and did not appear to be encoded by the SIR genes, whose products are required for repression of the silent mating type loci. Together, these results indicate that both GRFI and ABFI play multiple roles within the cell.
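
    The two consensus sequences quoted above translate directly into regular expressions. The Python sketch below scans both strands of a DNA string for exact consensus matches. It is an illustration only: real binding sites are described in the abstract as variations of the consensus, which exact matching will miss, and the names GRFI_SITE, ABFI_SITE, revcomp and find_sites are hypothetical.

```python
import re

# Consensus 5'-(A/G)(A/C)ACCCANNCA(T/C)(T/C)-3', N = any nucleotide.
# A lookahead is used so that overlapping matches are also reported.
GRFI_SITE = re.compile(r"(?=([AG][AC]ACCCA[ACGT]{2}CA[TC][TC]))")
# Consensus 5'-TATCATTNNNNACGA-3'.
ABFI_SITE = re.compile(r"(?=(TATCATT[ACGT]{4}ACGA))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(pattern, seq: str):
    """Return sorted (start, strand, site) tuples for both strands.

    `start` is the 0-based leftmost position of the site on the forward
    strand; `site` is written 5'->3' on the strand where it matched.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(revcomp(seq)):
        site = m.group(1)
        hits.append((n - (m.start() + len(site)), "-", site))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up test sequence containing one exact GRFI consensus match.
    print(find_sites(GRFI_SITE, "TTTAAACCCATGCATCGGG"))
```

    A position weight matrix would be a more natural model for degenerate sites, but building one needs the individual binding-site sequences, which the abstract does not list.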

  11. Functionally essential, invariant glutamate near the C-terminus of strand beta 5 in various (alpha/beta)8-barrel enzymes as a possible indicator of their evolutionary relatedness.

    PubMed

    Janecek, S; Baláz, S

    1995-08-01

    Twelve different (alpha/beta)8-barrel enzymes belonging to three structurally distinct families were found to contain, near the C-terminus of their strand beta 5, a conserved invariant glutamic acid residue that plays an important functional role in each of these enzymes. The search was based on the idea that a conserved sequence region of an (alpha/beta)8-barrel enzyme should be more or less conserved also in the equivalent part of the structure of the other enzymes with this folding motif owing to their mutual evolutionary relatedness. For this purpose, the sequence region around the well conserved fifth beta-strand of alpha-amylase containing catalytic glutamate (Glu230, Aspergillus oryzae alpha-amylase numbering), was used as the sequence-structural template. The isolated sequence stretches of the 12 (alpha/beta)8-barrels are discussed from both the sequence-structural and the evolutionary point of view, the invariant glutamate residue being proposed to be a joining feature of the studied group of enzymes remaining from their ancestral (alpha/beta)8-barrel.

  12. Comparative analysis of the small RNA transcriptomes of Pinus contorta and Oryza sativa

    PubMed Central

    Morin, Ryan D.; Aksay, Gozde; Dolgosheina, Elena; Ebhardt, H. Alexander; Magrini, Vincent; Mardis, Elaine R.; Sahinalp, S. Cenk; Unrau, Peter J.

    2008-01-01

    The diversity of microRNAs and small-interfering RNAs has been extensively explored within angiosperms by focusing on a few key organisms such as Oryza sativa and Arabidopsis thaliana. A deeper division of the plants is defined by the radiation of the angiosperms and gymnosperms, with the latter comprising the commercially important conifers. The conifers are expected to provide important information regarding the evolution of highly conserved small regulatory RNAs. Deep sequencing provides the means to characterize and quantitatively profile small RNAs in understudied organisms such as these. Pyrosequencing of small RNAs from O. sativa revealed, as expected, ∼21- and ∼24-nt RNAs. The former contained known microRNAs, and the latter largely comprised intergenic-derived sequences likely representing heterochromatin siRNAs. In contrast, sequences from Pinus contorta were dominated by 21-nt small RNAs. Using a novel sequence-based clustering algorithm, we identified sequences belonging to 18 highly conserved microRNA families in P. contorta as well as numerous clusters of conserved small RNAs of unknown function. Using multiple methods, including expressed sequence folding and machine learning algorithms, we found a further 53 candidate novel microRNA families, 51 appearing specific to the P. contorta library. In addition, alignment of small RNA sequences to the O. sativa genome revealed six perfectly conserved classes of small RNA that included chloroplast transcripts and specific types of genomic repeats. The conservation of microRNAs and other small RNAs between the conifers and the angiosperms indicates that important RNA silencing processes were highly developed in the earliest spermatophytes. Genomic mapping of all sequences to the O. sativa genome can be viewed at http://microrna.bcgsc.ca/cgi-bin/gbrowse/rice_build_3/. PMID:18323537

  13. Ectomycorrhizal fungal communities in endangered Pinus amamiana forests

    PubMed Central

    Kanetani, Seiichi; Nara, Kazuhide

    2017-01-01

    Interactions between trees and ectomycorrhizal (ECM) fungi are critical for the growth and survival of both partners. However, ECM symbiosis in endangered trees has hardly been explored, complicating conservation efforts. Here, we evaluated resident ECM roots and soil spore banks of ECM fungi from endangered Pinus amamiana forests on Yakushima and Tanegashima Islands, Kagoshima Prefecture, Japan. Soil samples were collected from the four remaining forests on the two islands. The resident ECM roots in soil samples were subjected to molecular identification. Soil spore banks of ECM fungi were analyzed via bioassays using a range of host seedlings (P. amamiana, P. parviflora, P. densiflora and Castanopsis sieboldii) for 6–8 months. In all remaining P. amamiana forests, we discovered a new Rhizopogon species (Rhizopogon sp.1), the sequence of which has no match among numerous Rhizopogon sequences deposited in the international sequence database. Host identification of the resident ECM roots confirmed that Rhizopogon sp.1 was associated only with P. amamiana. Rhizopogon sp.1 was far more dominant in soil spore banks than in resident ECM roots, and its presence was confirmed in nearly all soil samples examined across the major remaining populations. While Rhizopogon sp.1 did not completely lose compatibility with other pine species, its infection rate in the bioassays was highest in the original host, P. amamiana, the performance of which was improved by the infection. These results indicate that Rhizopogon sp.1 is very likely to have a close ecological relationship with endangered P. amamiana, probably due to a long co-evolutionary period on isolated islands, and to play the key role in seedling establishment after disturbance. We may need to identify and utilize such key ECM fungi to conserve endangered trees practically. PMID:29261780

  14. Multidrug Resistance-Associated Protein 3 (Mrp3/Abcc3/Moat-D) Is Expressed in the SAE Squalus acanthias Shark Embryo–Derived Cell Line

    PubMed Central

    Kobayashi, Hiroshi; Parton, Angela; Czechanski, Anne; Durkin, Christopher; Kong, Chi-Chon; Barnes, David

    2008-01-01

    The multidrug resistance-associated protein 3 (MRP3/Mrp3) is a member of the ATP-binding cassette (ABC) protein family of membrane transporters and related proteins that act on a variety of xenobiotic and anionic molecules to transfer these substrates in an ATP-dependent manner. In recent years, useful comparative information regarding evolutionarily conserved structure and transport functions of these proteins has accrued through the use of primitive marine animals such as cartilaginous fish. Until recently, one missing tool in comparative studies with cartilaginous fish was cell culture. We have derived from the embryo of Squalus acanthias, the spiny dogfish shark, the S. acanthias embryo (SAE) mesenchymal stem cell line. This is the first continuously proliferating cell line from a cartilaginous fish. We identified expression of Mrp3 in this cell line, cloned the molecule, and examined molecular and cellular physiological aspects of the protein. Shark Mrp3 is characterized by three membrane-spanning domains and two nucleotide-binding domains. Multiple alignments with other species showed that the shark Mrp3 amino acid sequence was well conserved. The shark sequence was overall 64% identical to human MRP3, 72% identical to chicken Mrp3, and 71% identical to frog and stickleback Mrp3. Highest identity between shark and human amino acid sequence (82%) was seen in the carboxyl-terminal nucleotide-binding domain of the proteins. Cell culture experiments showed that mRNA for the protein was induced as much as 25-fold by peptide growth factors, fetal bovine serum, and lipid nutritional components, with the largest effect mediated by a combination of lipids including unsaturated and saturated fatty acids, cholesterol, and vitamin E. PMID:18284333
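
    The percent identity figures quoted above (for example, 64% overall between shark and human) are typically derived from a pairwise alignment. The abstract does not say which convention was used, so the following is only a minimal sketch under one common assumption: identity is counted over alignment columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumes the two strings are already aligned (equal length); this is one
    of several conventions for the denominator, chosen here for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns with a residue in both sequences.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment: 6 ungapped columns, 4 of them identical.
toy_identity = percent_identity("MKT-LLV", "MRTALLI")
print(f"{toy_identity:.1f}% identical")
```

    Restricting the same calculation to the columns of a single domain would give a per-domain figure of the kind reported for the carboxyl-terminal nucleotide-binding domain.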

  15. Multidrug resistance-associated protein 3 (Mrp3/Abcc3/Moat-D) is expressed in the SAE Squalus acanthias shark embryo-derived cell line.

    PubMed

    Kobayashi, Hiroshi; Parton, Angela; Czechanski, Anne; Durkin, Christopher; Kong, Chi-Chon; Barnes, David

    2007-01-01

    The multidrug resistance-associated protein 3 (MRP3/Mrp3) is a member of the ATP-binding cassette (ABC) protein family of membrane transporters and related proteins that act on a variety of xenobiotic and anionic molecules to transfer these substrates in an ATP-dependent manner. In recent years, useful comparative information regarding evolutionarily conserved structure and transport functions of these proteins has accrued through the use of primitive marine animals such as cartilaginous fish. Until recently, one missing tool in comparative studies with cartilaginous fish was cell culture. We have derived from the embryo of Squalus acanthias, the spiny dogfish shark, the S. acanthias embryo (SAE) mesenchymal stem cell line. This is the first continuously proliferating cell line from a cartilaginous fish. We identified expression of Mrp3 in this cell line, cloned the molecule, and examined molecular and cellular physiological aspects of the protein. Shark Mrp3 is characterized by three membrane-spanning domains and two nucleotide-binding domains. Multiple alignments with other species showed that the shark Mrp3 amino acid sequence was well conserved. The shark sequence was overall 64% identical to human MRP3, 72% identical to chicken Mrp3, and 71% identical to frog and stickleback Mrp3. Highest identity between shark and human amino acid sequence (82%) was seen in the carboxyl-terminal nucleotide-binding domain of the proteins. Cell culture experiments showed that mRNA for the protein was induced as much as 25-fold by peptide growth factors, fetal bovine serum, and lipid nutritional components, with the largest effect mediated by a combination of lipids including unsaturated and saturated fatty acids, cholesterol, and vitamin E.

  16. Contribution of TyrB26 to the Function and Stability of Insulin

    PubMed Central

    Pandyarajan, Vijay; Phillips, Nelson B.; Rege, Nischay; Lawrence, Michael C.; Whittaker, Jonathan; Weiss, Michael A.

    2016-01-01

    Crystallographic studies of insulin bound to receptor domains have defined the primary hormone-receptor interface. We investigated the role of TyrB26, a conserved aromatic residue at this interface. To probe the evolutionary basis for such conservation, we constructed 18 variants at B26. Surprisingly, non-aromatic polar or charged side chains (such as Glu, Ser, or ornithine (Orn)) conferred high activity, whereas the weakest-binding analogs contained Val, Ile, and Leu substitutions. Modeling of variant complexes suggested that the B26 side chains pack within a shallow depression at the solvent-exposed periphery of the interface. This interface would disfavor large aliphatic side chains. The analogs with highest activity exhibited reduced thermodynamic stability and heightened susceptibility to fibrillation. Perturbed self-assembly was also demonstrated in studies of the charged variants (Orn and Glu); indeed, the GluB26 analog exhibited aberrant aggregation in either the presence or absence of zinc ions. Thus, although TyrB26 is part of insulin's receptor-binding surface, our results suggest that its conservation has been enjoined by the aromatic ring's contributions to native stability and self-assembly. We envisage that such classical structural relationships reflect the implicit threat of toxic misfolding (rather than hormonal function at the receptor level) as a general evolutionary determinant of extant protein sequences. PMID:27129279

  17. CORAL: aligning conserved core regions across domain families.

    PubMed

    Fong, Jessica H; Marchler-Bauer, Aron

    2009-08-01

    Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile-profile method CORAL that aligns individual core regions as gap-free units. CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved 'readability' that facilitate manual refinement. CORAL will be included in future versions of the NCBI Cn3D/CDTree software, which can be downloaded at http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml. Supplementary data are available at Bioinformatics online.

  18. Identification and characterization of the reptilian GnRH-II gene in the leopard gecko, Eublepharis macularius, and its evolutionary considerations.

    PubMed

    Ikemoto, Tadahiro; Park, Min Kyun

    2003-10-16

    To elucidate the molecular phylogeny and evolution of a particular peptide, one must analyze not the limited primary amino acid sequences of the low molecular weight mature polypeptide, but rather the sequences of the corresponding precursors from various species. Of all the structural variants of gonadotropin-releasing hormone (GnRH), GnRH-II (chicken GnRH-II, or cGnRH-II) is remarkably conserved without any sequence substitutions among vertebrates, but its precursor sequences vary considerably. We have identified and characterized the full-length complementary DNA (cDNA) encoding the GnRH-II precursor and determined its genomic structure, consisting of four exons and three introns, in a reptilian species, the leopard gecko Eublepharis macularius. This is the first report about the GnRH-II precursor cDNA/gene from reptiles. The deduced leopard gecko prepro-GnRH-II polypeptide had the highest identities with the corresponding polypeptides of amphibians. The GnRH-II precursor mRNA was detected in more than half of the tissues and organs examined. This widespread expression is consistent with the previous findings in several species, though the roles of GnRH outside the hypothalamus-pituitary-gonadal axis remain largely unknown. Molecular phylogenetic analysis combined with sequence comparison showed that the leopard gecko is more similar to fishes and amphibians than to eutherian mammals with respect to the GnRH-II precursor sequence. These results strongly suggest that the divergence of the GnRH-II precursor sequences seen in eutherian mammals may have occurred along with amniote evolution.

  19. Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

    2004-08-06

    The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function--such as binding-site clustering--makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
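
    The idea of searching for regions with a high density of predicted binding sites can be illustrated with a simple window scan. The abstract gives no window size or threshold, so the values below are placeholders, and the function is a hypothetical sketch rather than the authors' method: for each predicted site, it counts how many sites fall within a fixed-width window starting there and reports windows that reach a minimum count.

```python
from bisect import bisect_right


def dense_windows(positions, window, min_sites):
    """Return (start, last_site, count) for site-anchored windows with
    at least `min_sites` predicted sites within `window` base pairs.

    `window` and `min_sites` are illustrative parameters, not values
    taken from the study.
    """
    positions = sorted(positions)
    hits = []
    for i, start in enumerate(positions):
        # Index just past the last site inside [start, start + window - 1].
        j = bisect_right(positions, start + window - 1)
        count = j - i
        if count >= min_sites:
            hits.append((start, positions[j - 1], count))
    return hits


# Hypothetical predicted site coordinates on one sequence.
toy_sites = [100, 150, 220, 400, 2000, 2050, 5000]
clusters = dense_windows(toy_sites, window=700, min_sites=4)
print(clusters)
```

    Running the same scan on the orthologous region of a second genome and comparing whether a cluster is still present is one simple way to express conservation of clustering as opposed to conservation of sequence identity.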

  20. Cloning and characterization of aquaglyceroporin genes from rainbow smelt (Osmerus mordax) and transcript expression in response to cold temperature.

    PubMed

    Hall, Jennifer R; Clow, Kathy A; Rise, Matthew L; Driedzic, William R

    2015-09-01

    Aquaglyceroporins (GLPs) are integral membrane proteins that facilitate passive movement of water, glycerol and urea across cellular membranes. In this study, GLP-encoding genes were characterized in rainbow smelt (Osmerus mordax mordax), an anadromous teleost that accumulates high glycerol and modest urea levels in plasma and tissues as an adaptive cryoprotectant mechanism in sub-zero temperatures. We report the gene and promoter sequences for two aqp10b paralogs (aqp10ba, aqp10bb) that are 82% identical at the predicted amino acid level, and aqp9b. Aqp10bb and aqp9b have the 6 exon structure common to vertebrate GLPs. Aqp10ba has 8 exons; there are two additional exons at the 5' end, and the promoter sequence is different from aqp10bb. Molecular phylogenetic analysis suggests that the aqp10b paralogs arose from a gene duplication event specific to the smelt lineage. Smelt GLP transcripts are ubiquitously expressed; however, aqp10ba transcripts were highest in kidney, aqp10bb transcripts were highest in kidney, intestine, pyloric caeca and brain, and aqp9b transcripts were highest in spleen, liver, red blood cells and kidney. In cold-temperature challenge experiments, plasma glycerol and urea levels were significantly higher in cold- compared to warm-acclimated smelt; however, GLP transcript levels were generally either significantly lower or remained constant. The exception was significantly higher aqp10ba transcript levels in kidney. High aqp10ba transcripts in smelt kidney that increase significantly in response to cold temperature in congruence with plasma urea suggest that this gene duplicate may have evolved to allow the re-absorption of urea to concomitantly conserve nitrogen and prevent freezing. Copyright © 2015 Elsevier Inc. All rights reserved.

  1. Browning in Annona cherimola fruit: role of polyphenol oxidase and characterization of a coding sequence of the enzyme.

    PubMed

    Prieto, Humberto; Utz, Daniella; Castro, Alvaro; Aguirre, Carlos; González-Agüero, Mauricio; Valdés, Héctor; Cifuentes, Nicolas; Defilippi, Bruno G; Zamora, Pablo; Zúñiga, Gustavo; Campos-Vargas, Reinaldo

    2007-10-31

    Cherimoya (Annona cherimola Mill.) fruit is an attractive candidate for food processing applications as fresh cut. However, along with its desirable delicate taste, cherimoya shows a marked susceptibility to browning. This condition is mainly attributed to polyphenol oxidase (PPO) activity. A general lack of knowledge regarding PPO and its role in the oxidative loss of quality in processed cherimoya fruit requires a better understanding of the mechanisms involved. The work carried out included the cloning of a full-length cDNA, an analysis of its properties in the deduced amino acid sequence, and linkage of its mRNA levels with enzyme activity in mature and ripe fruits after wounding. The results showed one gene different at the nucleotide level when compared with previously reported genes, but a well-conserved protein, both in functional and in structural terms. The cherimoya PPO gene (Ac-ppo, GenBank DQ990911) appeared to be present as a single copy in the genome, and its transcripts could be significantly detected in leaves and less abundantly in flowers and fruits. Analysis of wounded mature and ripe fruits revealed an inductive behavior for mRNA levels in the flesh of mature cherimoya after 16 h. Although the highest enzymatic activity was observed in rind, a consistent PPO activity was detected in flesh samples. A lack of correlation between PPO mRNA level and PPO activity was observed, especially in flesh tissue. This is probably due to the presence of monophenolic substrates inducing a lag period, enzyme inhibitors and/or diphenolic substrates causing suicide inactivation, and proenzyme or latent isoforms of PPO. To our knowledge this is the first report of a complete PPO sequence in cherimoya. Furthermore, the gene is highly divergent from known nucleotide sequences but shows a well conserved protein in terms of its function, deduced structure, and physiological role.

  2. Molecular characterization and expression profiling of ryanodine receptor gene in the pink stem borer, Sesamia inferens (Walker).

    PubMed

    Wu, Shun-Fan; Zhao, Dan-Dan; Huang, Jing-Mei; Zhao, Si-Qi; Zhou, Li-Qi; Gao, Cong-Fen

    2018-04-01

    The susceptibilities of three field populations of pink stem borer (PSB), Sesamia inferens (Walker), to the diamide insecticides chlorantraniliprole and flubendiamide were evaluated in this study. The results showed that these PSB field populations were still sensitive to the two diamide insecticides after many years of exposure. To further understand PSB and diamide insecticides, the full-length ryanodine receptor (RyR) cDNA (named SiRyR), the molecular target of diamide insecticides, was cloned from PSB and characterized. The SiRyR gene contains an open reading frame of 15,420 nucleotides, encoding 5140 amino acid residues, which shares 77% to 98% sequence identity with RyR homologs of other insects. All hallmarks of RyR proteins are conserved in the SiRyR protein, including the conserved C-terminal domain with the consensus calcium-binding EF-hands (calcium-binding motif), the six transmembrane domains, as well as mannosyltransferase, IP3R and RyR (pfam02815) (MIR) domains. Real-time qPCR analysis revealed that the highest mRNA expression levels of SiRyR were observed in pupae and adults, especially in males. SiRyR was expressed at the highest level in thorax, and the lowest level in wing. The full genetic characterization of SiRyR could provide useful information for future functional expression studies and for discovery of new insecticides with selective insecticidal activity. Copyright © 2018 Elsevier Inc. All rights reserved.

  3. Conserved noncoding sequences (CNSs) in higher plants.

    PubMed

    Freeling, Michael; Subramaniam, Shabarinath

    2009-04-01

    Plant conserved noncoding sequences (CNSs)--a specific category of phylogenetic footprint--have been shown experimentally to function. No plant CNS is conserved to the extent that ultraconserved noncoding sequences are conserved in vertebrates. Plant CNSs are enriched in known transcription factor or other cis-acting binding sites, and are usually clustered around genes. Genes that encode transcription factors and/or those that respond to stimuli are particularly CNS-rich. Only rarely could this function involve small RNA binding. Some transcribed CNSs encode short translation products as a form of negative control. Approximately 4% of Arabidopsis gene content is estimated to be both CNS-rich and occupies a relatively long stretch of chromosome: Bigfoot genes (long phylogenetic footprints). We discuss a 'DNA-templated protein assembly' idea that might help explain Bigfoot gene CNSs.

  4. Genetic differences between blood- and brain-derived viral sequences from human immunodeficiency virus type 1-infected patients: evidence of conserved elements in the V3 region of the envelope protein of brain-derived sequences.

    PubMed Central

    Korber, B T; Kunstman, K J; Patterson, B K; Furtado, M; McEvilly, M M; Levy, R; Wolinsky, S M

    1994-01-01

    Human immunodeficiency virus type 1 (HIV-1) sequences were generated from blood and from brain tissue obtained by stereotactic biopsy from six patients undergoing a diagnostic neurosurgical procedure. Proviral DNA was directly amplified by nested PCR, and 8 to 36 clones from each sample were sequenced. Phylogenetic analysis of intrapatient envelope V3-V5 region HIV-1 DNA sequence sets revealed that brain viral sequences were clustered relative to the blood viral sequences, suggestive of tissue-specific compartmentalization of the virus in four of the six cases. In the other two cases, the blood and brain virus sequences were intermingled in the phylogenetic analyses, suggesting trafficking of virus between the two tissues. Slide-based PCR-driven in situ hybridization of two of the patients' brain biopsy samples confirmed our interpretation of the intrapatient phylogenetic analyses. Interpatient V3 region brain-derived sequence distances were significantly less than blood-derived sequence distances. Relative to the tip of the loop, the set of brain-derived viral sequences had a tendency towards negative or neutral charge compared with the set of blood-derived viral sequences. Entropy calculations were used as a measure of the variability at each position in alignments of blood and brain viral sequences. A relatively conserved set of positions was found, with a significantly lower entropy in the brain- than in the blood-derived viral sequences. These sites constitute a brain "signature pattern," or a noncontiguous set of amino acids in the V3 region conserved in viral sequences derived from brain tissue. This brain-derived signature pattern was also well preserved among isolates previously characterized in vitro as macrophage tropic. Macrophage-monocyte tropism may be the biological constraint that results in the conservation of the viral brain signature pattern. PMID:7933130
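
    The per-position entropy measure mentioned above can be sketched as follows. This assumes the standard Shannon entropy over the residues observed in each alignment column, which is the usual reading of "entropy" in this context, although the abstract does not spell out the formula. The function names and the two toy alignments are hypothetical and do not reproduce data from the study.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column.

    Gap characters, if present, are simply treated as another symbol.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(sequences):
    """Entropy at each position of a set of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical toy alignments standing in for two sequence sets.
brain = ["CTRPN", "CTRPN", "CTRPN", "CTRPS"]
blood = ["CTRPN", "CIRPG", "CTKPS", "CVRPN"]
brain_H = alignment_entropies(brain)
blood_H = alignment_entropies(blood)

# Positions where the first set is less variable than the second.
lower_in_brain = [i for i, (a, b) in enumerate(zip(brain_H, blood_H)) if a < b]
print(lower_in_brain)
```

    An entropy of zero marks a fully conserved column; positions whose entropy is consistently lower in one set than in the other are candidates for a signature pattern of the kind described, subject to a proper significance test that this sketch does not include.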

  5. The kinetoplast DNA of the Australian trypanosome, Trypanosoma copemani, shares features with Trypanosoma cruzi and Trypanosoma lewisi.

    PubMed

    Botero, Adriana; Kapeller, Irit; Cooper, Crystal; Clode, Peta L; Shlomai, Joseph; Thompson, R C Andrew

    2018-05-17

    Kinetoplast DNA (kDNA) is the mitochondrial genome of trypanosomatids. It consists of a few dozen maxicircles and several thousand minicircles, all catenated topologically to form a two-dimensional DNA network. Minicircles are heterogeneous in size and sequence among species. They present one or several conserved regions that contain three highly conserved sequence blocks. CSB-1 (10 bp sequence) and CSB-2 (8 bp sequence) present lower interspecies homology, while CSB-3 (12 bp sequence) or the Universal Minicircle Sequence is conserved within most trypanosomatids. The Universal Minicircle Sequence is located at the replication origin of the minicircles, and is the binding site for the UMS binding protein, a protein involved in trypanosomatid survival and virulence. Here, we describe the structure and organisation of the kDNA of Trypanosoma copemani, a parasite that has been shown to infect mammalian cells and has been associated with the drastic decline of the endangered Australian marsupial, the woylie (Bettongia penicillata). Deep genomic sequencing showed that T. copemani presents two classes of minicircles that share sequence identity and organisation in the conserved sequence blocks with those of Trypanosoma cruzi and Trypanosoma lewisi. A 19,257 bp partial region of the maxicircle of T. copemani that contained the entire coding region was obtained. Comparative analysis of the T. copemani entire maxicircle coding region with the coding regions of T. cruzi and T. lewisi showed they share 71.05% and 71.28% identity, respectively. The shared features in the maxicircle/minicircle organisation and sequence between T. copemani and T. cruzi/T. lewisi suggest similarities in their process of kDNA replication, and are of significance in understanding the evolution of Australian trypanosomes. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  6. Conserved Sequence Preferences Contribute to Substrate Recognition by the Proteasome*

    PubMed Central

    Yu, Houqing; Singh Gautam, Amit K.; Wilmington, Shameika R.; Wylie, Dennis; Martinez-Fonts, Kirby; Kago, Grace; Warburton, Marie; Chavali, Sreenivas; Inobe, Tomonao; Finkelstein, Ilya J.; Babu, M. Madan

    2016-01-01

    The proteasome has pronounced preferences for the amino acid sequence of its substrates at the site where it initiates degradation. Here, we report that modulating these sequences can tune the steady-state abundance of proteins over 2 orders of magnitude in cells. This is the same dynamic range as seen for inducing ubiquitination through a classic N-end rule degron. The stability and abundance of His3 constructs dictated by the initiation site affect survival of yeast cells and show that variation in proteasomal initiation can affect fitness. The proteasome's sequence preferences are linked directly to the affinity of the initiation sites to their receptor on the proteasome and are conserved between Saccharomyces cerevisiae, Schizosaccharomyces pombe, and human cells. These findings establish that the sequence composition of unstructured initiation sites influences protein abundance in vivo in an evolutionarily conserved manner and can affect phenotype and fitness. PMID:27226608

  7. Genome-wide discovery of novel and conserved microRNAs in white shrimp (Litopenaeus vannamei).

    PubMed

    Xi, Qian-Yun; Xiong, Yuan-Yan; Wang, Yuan-Mei; Cheng, Xiao; Qi, Qi-En; Shu, Gang; Wang, Song-Bo; Wang, Li-Na; Gao, Ping; Zhu, Xiao-Tong; Jiang, Qing-Yan; Zhang, Yong-Liang; Liu, Li

    2015-01-01

    In recent years, a large number of conserved and species-specific microRNAs (miRNAs) have been identified in species that are economically important but lack a full genome sequence. In this study, Solexa deep sequencing and cross-species miRNA microarray were used to detect miRNAs in white shrimp. We identified 239 conserved miRNAs, 14 miRNA* sequences and 20 novel miRNAs by bioinformatics analysis from 7,561,406 high-quality reads representing 325,370 distinct sequences. All 20 novel miRNAs were species-specific to white shrimp and had no homologs in other species. Using the conserved miRNAs from the miRBase database as a query set to search for homologs from shrimp expressed sequence tags (ESTs), 32 conserved computationally predicted miRNAs were discovered in shrimp. In addition, using microarray analysis in shrimp fed with Panax ginseng polysaccharide complex, 151 conserved miRNAs were identified, 18 of which were significantly up-regulated, while 49 were significantly down-regulated. In particular, qRT-PCR analysis was also performed for nine miRNAs in three shrimp tissues: muscle, gill and hepatopancreas. Results showed that expression of these miRNAs is tissue specific. Combining results of the three methods, we detected 20 novel and 394 conserved miRNAs. Verification with quantitative reverse transcription PCR (qRT-PCR) and Northern blot showed high reliability of the data. The study provides the first comprehensive specific miRNA profile of white shrimp, which includes useful information for future investigations into the function of miRNAs in regulation of shrimp development and immunology.

  8. Identification and characterization of a chitin deacetylase from a metagenomic library of deep-sea sediments of the Arctic Ocean.

    PubMed

    Liu, Jinlin; Jia, Zhijuan; Li, Sha; Li, Yan; You, Qiang; Zhang, Chunyan; Zheng, Xiaotong; Xiong, Guomei; Zhao, Jin; Qi, Chao; Yang, Jihong

    2016-09-15

    The chemical and biological compositions of deep-sea sediments are interesting because of the underexplored diversity when it comes to bioprospecting. The special geographical location and climate make the Arctic Ocean a unique ocean area containing an abundance of microbial resources. A metagenomic library was constructed based on the deep-sea sediments of the Arctic Ocean. Part of the insertion fragments of this library were sequenced. A chitin deacetylase gene, cdaYJ, was identified and characterized. A metagenomic library with 2750 clones was obtained and ten clones were sequenced. Results revealed several interesting genes, including a chitin deacetylase coding sequence, cdaYJ. CdaYJ is homologous to some known chitin deacetylases and contains conserved chitin deacetylase active sites. The CdaYJ protein exhibits a long N-terminus and a relatively short C-terminus. Phylogenetic analysis revealed that CdaYJ showed highest homology to CDAs from Alphaproteobacteria. The cdaYJ gene was subcloned into the pET-28a vector and the recombinant CdaYJ (rCdaYJ) was expressed in Escherichia coli BL21 (DE3). rCdaYJ showed a molecular weight of 43 kDa, and exhibited deacetylation activity using p-nitroacetanilide as substrate. The optimal pH and temperature of rCdaYJ were determined to be pH 7.4 and 28°C, respectively. The construction of a metagenomic library of the Arctic deep-sea sediments provides an opportunity to look into the microbial communities and exploit valuable gene resources. A chitin deacetylase, CdaYJ, was identified from the library. It showed highest deacetylation activity under slightly alkaline and low temperature conditions. CdaYJ might be a candidate chitin deacetylase that possesses industrial and pharmaceutical potential. Copyright © 2016 Elsevier B.V. All rights reserved.

  9. Comprehensive mutation screening in 55 probands with type 1 primary hyperoxaluria shows feasibility of a gene-based diagnosis.

    PubMed

    Monico, Carla G; Rossetti, Sandro; Schwanz, Heidi A; Olson, Julie B; Lundquist, Patrick A; Dawson, D Brian; Harris, Peter C; Milliner, Dawn S

    2007-06-01

    Mutations in AGXT, a locus mapped to 2q37.3, cause deficiency of liver-specific alanine:glyoxylate aminotransferase (AGT), the metabolic error in type 1 primary hyperoxaluria (PH1). Genetic analysis of 55 unrelated probands with PH1 from the Mayo Clinic Hyperoxaluria Center, to date the largest with availability of complete sequencing across the entire AGXT coding region and documented hepatic AGT deficiency, suggests that a molecular diagnosis (identification of two disease alleles) is feasible in 96% of patients. Unique to this PH1 population was the higher frequency of G170R, the most common AGXT mutation, accounting for 37% of alleles, and detection of a new 3' end deletion (Ex 11_3'UTR del). A described frameshift mutation (c.33_34insC) occurred with the next highest frequency (11%), followed by F152I and G156R (frequencies of 6.3 and 4.5%, respectively), both surpassing the frequency (2.7%) of I244T, the previously reported third most common pathogenic change. These sequencing data indicate that AGXT is even more variable than formerly believed, with 28 new variants (21 mutations and seven polymorphisms) detected, with highest frequencies on exons 1, 4, and 7. When limited to these three exons, molecular analysis sensitivity was 77%, compared with 98% for whole-gene sequencing. These are the first data in support of comprehensive AGXT analysis for the diagnosis of PH1, obviating a liver biopsy in most well-characterized patients. Also reported here is previously unavailable evidence for the pathogenic basis of all AGXT missense variants, including evolutionary conservation data in a multisequence alignment and use of a normal control population.

  10. Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

    PubMed Central

    2012-01-01

    Background Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678

  11. Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lan, Yemin; Rosen, Gail; Hershberg, Ruth

    The 16s rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16s rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16s rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16s rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16s rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.

  12. Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

    DOE PAGES

    Lan, Yemin; Rosen, Gail; Hershberg, Ruth

    2016-05-03

    The 16s rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16s rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16s rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16s rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16s rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.

  13. Conservation of Three-Dimensional Helix-Loop-Helix Structure through the Vertebrate Lineage Reopens the Cold Case of Gonadotropin-Releasing Hormone-Associated Peptide.

    PubMed

    Pérez Sirkin, Daniela I; Lafont, Anne-Gaëlle; Kamech, Nédia; Somoza, Gustavo M; Vissio, Paula G; Dufour, Sylvie

    2017-01-01

    GnRH-associated peptide (GAP) is the C-terminal portion of the gonadotropin-releasing hormone (GnRH) preprohormone. Although it was reported in mammals that GAP may act as a prolactin-inhibiting factor and can be co-secreted with GnRH into the hypophyseal portal blood, GAP has been practically out of the research circuit for about 20 years. Comparative studies highlighted the low conservation of GAP primary amino acid sequences among vertebrates, contributing to consider that this peptide only participates in the folding or carrying process of GnRH. Considering that the three-dimensional (3D) structure of a protein may define its function, the aim of this study was to evaluate if GAP sequences and 3D structures are conserved in the vertebrate lineage. GAP sequences from various vertebrates were retrieved from databases. Analysis of primary amino acid sequence identity and similarity, molecular phylogeny, and prediction of 3D structures were performed. Amino acid sequence comparison and phylogeny analyses confirmed the large variation of GAP sequences throughout vertebrate radiation. In contrast, prediction of the 3D structure revealed a striking conservation of the 3D structure of GAP1 (GAP associated with the hypophysiotropic type 1 GnRH), despite low amino acid sequence conservation. This GAP1 peptide presented a typical helix-loop-helix (HLH) structure in all the vertebrate species analyzed. This HLH structure could also be predicted for GAP2 in some but not all vertebrate species and in none of the GAP3 analyzed. These results allowed us to infer that selective pressures have maintained GAP1 HLH structure throughout the vertebrate lineage. The conservation of the HLH motif, known to confer biological activity to various proteins, suggests that GAP1 peptides may exert some hypophysiotropic biological functions across vertebrate radiation.

  14. Conservation of Three-Dimensional Helix-Loop-Helix Structure through the Vertebrate Lineage Reopens the Cold Case of Gonadotropin-Releasing Hormone-Associated Peptide

    PubMed Central

    Pérez Sirkin, Daniela I.; Lafont, Anne-Gaëlle; Kamech, Nédia; Somoza, Gustavo M.; Vissio, Paula G.; Dufour, Sylvie

    2017-01-01

    GnRH-associated peptide (GAP) is the C-terminal portion of the gonadotropin-releasing hormone (GnRH) preprohormone. Although it was reported in mammals that GAP may act as a prolactin-inhibiting factor and can be co-secreted with GnRH into the hypophyseal portal blood, GAP has been practically out of the research circuit for about 20 years. Comparative studies highlighted the low conservation of GAP primary amino acid sequences among vertebrates, contributing to consider that this peptide only participates in the folding or carrying process of GnRH. Considering that the three-dimensional (3D) structure of a protein may define its function, the aim of this study was to evaluate if GAP sequences and 3D structures are conserved in the vertebrate lineage. GAP sequences from various vertebrates were retrieved from databases. Analysis of primary amino acid sequence identity and similarity, molecular phylogeny, and prediction of 3D structures were performed. Amino acid sequence comparison and phylogeny analyses confirmed the large variation of GAP sequences throughout vertebrate radiation. In contrast, prediction of the 3D structure revealed a striking conservation of the 3D structure of GAP1 (GAP associated with the hypophysiotropic type 1 GnRH), despite low amino acid sequence conservation. This GAP1 peptide presented a typical helix-loop-helix (HLH) structure in all the vertebrate species analyzed. This HLH structure could also be predicted for GAP2 in some but not all vertebrate species and in none of the GAP3 analyzed. These results allowed us to infer that selective pressures have maintained GAP1 HLH structure throughout the vertebrate lineage. The conservation of the HLH motif, known to confer biological activity to various proteins, suggests that GAP1 peptides may exert some hypophysiotropic biological functions across vertebrate radiation. PMID:28878737

  15. Homologs of CD83 from elasmobranch and teleost fish.

    PubMed

    Ohta, Yuko; Landis, Eric; Boulay, Thomas; Phillips, Ruth B; Collet, Bertrand; Secombes, Chris J; Flajnik, Martin F; Hansen, John D

    2004-10-01

    Dendritic cells are one of the most important cell types connecting innate and adaptive immunity, but very little is known about their evolutionary origins. To begin to study dendritic cells from lower vertebrates, we isolated and characterized CD83 from the nurse shark (Ginglymostoma cirratum (Gici)) and rainbow trout (Oncorhynchus mykiss (Onmy)). The open reading frames for Gici-CD83 (194 aa) and Onmy-CD83 (218 aa) display approximately 28-32% identity to mammalian CD83 with the presence of two conserved N-linked glycosylation sites. Identical with mammalian CD83 genes, Gici-CD83 is composed of five exons including conservation of phase for the splice sites. Mammalian CD83 genes contain a split Ig superfamily V domain that represents a unique sequence feature for CD83 genes, a feature conserved in both Gici- and Onmy-CD83. Gici-CD83 and Onmy-CD83 are not linked to the MHC, an attribute shared with mouse but not human CD83. Gici-CD83 is expressed rather ubiquitously with highest levels in the epigonal tissue, a primary site for lymphopoiesis in the nurse shark, whereas Onmy-CD83 mRNA expression largely paralleled that of MHC class II but at lower levels. Finally, Onmy-CD83 gene expression is up-regulated in virus-infected trout, and the promoter is responsive to trout IFN regulatory factor-1. These results suggest that the role of CD83, an adhesion molecule for cell-mediated immunity, has been conserved over 450 million years of vertebrate evolution.

  16. DECIPHER, a Search-Based Approach to Chimera Identification for 16S rRNA Sequences

    PubMed Central

    Wright, Erik S.; Yilmaz, L. Safak

    2012-01-01

    DECIPHER is a new method for finding 16S rRNA chimeric sequences by the use of a search-based approach. The method is based upon detecting short fragments that are uncommon in the phylogenetic group where a query sequence is classified but frequently found in another phylogenetic group. The algorithm was calibrated for full sequences (fs_DECIPHER) and short sequences (ss_DECIPHER) and benchmarked against WigeoN (Pintail), ChimeraSlayer, and Uchime using artificially generated chimeras. Overall, ss_DECIPHER and Uchime provided the highest chimera detection for sequences 100 to 600 nucleotides long (79% and 81%, respectively), but Uchime's performance deteriorated for longer sequences, while ss_DECIPHER maintained a high detection rate (89%). Both methods had low false-positive rates (1.3% and 1.6%). The more conservative fs_DECIPHER, benchmarked only for sequences longer than 600 nucleotides, had an overall detection rate lower than that of ss_DECIPHER (75%) but higher than those of the other programs. In addition, fs_DECIPHER had the lowest false-positive rate among all the benchmarked programs (<0.20%). DECIPHER was outperformed only by ChimeraSlayer and Uchime when chimeras were formed from closely related parents (less than 10% divergence). Given the differences in the programs, it was possible to detect over 89% of all chimeras with just the combination of ss_DECIPHER and Uchime. Using fs_DECIPHER, we detected between 1% and 2% additional chimeras in the RDP, SILVA, and Greengenes databases from which chimeras had already been removed with Pintail or Bellerophon. DECIPHER was implemented in the R programming language and is directly accessible through a webpage or by downloading the program as an R package (http://DECIPHER.cee.wisc.edu). PMID:22101057

  17. Evidence for the Concerted Evolution between Short Linear Protein Motifs and Their Flanking Regions

    PubMed Central

    Chica, Claudia; Diella, Francesca; Gibson, Toby J.

    2009-01-01

    Background Linear motifs are short modules of protein sequences that play a crucial role in mediating and regulating many protein–protein interactions. The function of linear motifs strongly depends on the context, e.g. functional instances mainly occur inside flexible regions that are accessible for interaction. Sometimes linear motifs appear as isolated islands of conservation in multiple sequence alignments. However, they also occur in larger blocks of sequence conservation, suggesting an active role for the neighbouring amino acids. Results The evolution of regions flanking 116 functional linear motif instances was studied. The conservation of the amino acid sequence and order/disorder tendency of those regions was related to presence/absence of the instance. For the majority of the analysed instances, the pairs of sequences conserving the linear motif were also observed to maintain a similar local structural tendency and/or to have higher local sequence conservation when compared to pairs of sequences where one is missing the linear motif. Furthermore, those instances have a higher chance to co–evolve with the neighbouring residues in comparison to the distant ones. Those findings are supported by examples where the regulation of the linear motif–mediated interaction has been shown to depend on the modifications (e.g. phosphorylation) at neighbouring positions or is thought to benefit from the binding versatility of disordered regions. Conclusion The results suggest that flanking regions are relevant for linear motif–mediated interactions, both at the structural and sequence level. More interestingly, they indicate that the prediction of linear motif instances can be enriched with contextual information by performing a sequence analysis similar to the one presented here. This can facilitate the understanding of the role of these predicted instances in determining the protein function inside the broader context of the cellular network where they arise. PMID:19584925

  18. CoSMoS: Conserved Sequence Motif Search in the proteome

    PubMed Central

    Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I

    2006-01-01

    Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915
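
    As a rough illustration of what per-residue conservation in a multiple sequence alignment can look like, the sketch below scores each alignment column by the fraction of non-gap residues that match the most common residue. The scoring rule, the function name `column_conservation`, and the toy alignment are assumptions made for illustration; the abstract does not say which conservation measure CoSMoS uses.

```python
from collections import Counter


def column_conservation(alignment):
    """Score each column as the share of non-gap residues equal to the most common one.

    `alignment` is a list of equal-length aligned sequences; '-' marks a gap.
    This is an illustrative measure, not the one used by CoSMoS.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for i in range(length):
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            # A column made only of gaps carries no conservation signal.
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores


# Hypothetical four-sequence alignment, for demonstration only.
toy_alignment = ["MKTAY", "MKSAY", "MRTA-", "MKTAF"]
print(column_conservation(toy_alignment))
```

    Columns scoring 1.0 are invariant in the toy alignment; a real analysis would usually weight sequences and use a substitution-aware score.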

  19. Virulence Gene Sequencing Highlights Similarities and Differences in Sequences in Listeria monocytogenes Serotype 1/2a and 4b Strains of Clinical and Food Origin From 3 Different Geographic Locations.

    PubMed

    Poimenidou, Sofia V; Dalmasso, Marion; Papadimitriou, Konstantinos; Fox, Edward M; Skandamis, Panagiotis N; Jordan, Kieran

    2018-01-01

    The prfA-virulence gene cluster (pVGC) is the main pathogenicity island in Listeria monocytogenes, comprising the prfA, plcA, hly, mpl, actA, and plcB genes. In this study, the pVGC of 36 L. monocytogenes isolates with respect to different serotypes (1/2a or 4b), geographical origin (Australia, Greece or Ireland) and isolation source (food-associated or clinical) was characterized. The most conserved genes were prfA and hly, with the lowest nucleotide diversity (π) among all genes (P < 0.05), and the lowest number of alleles, substitutions and non-synonymous substitutions for prfA. Conversely, the most diverse gene was actA, which presented the highest number of alleles (n = 20) and showed the highest nucleotide diversity. Grouping by serotype had a significantly lower π value (P < 0.0001) compared to isolation source or geographical origin, suggesting a distinct and well-defined unit compared to other groupings. Among all tested genes, only hly and mpl were those with lower nucleotide diversity in 1/2a serotype than 4b serotype, reflecting a high within-1/2a serotype divergence compared to 4b serotype. Geographical divergence was noted with respect to the hly gene, where serotype 4b Irish strains were distinct from Greek and Australian strains. Australian strains showed less diversity in plcB and mpl relative to Irish or Greek strains. Notable differences regarding sequence mutations were identified between food-associated and clinical isolates in prfA, actA, and plcB sequences. Overall, these results indicate that virulence genes follow different evolutionary pathways, which are affected by a strain's origin and serotype and may influence virulence and/or epidemiological dominance of certain subgroups.
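
    The nucleotide diversity (π) reported above is commonly defined as the average number of nucleotide differences per site between all pairs of sequences in a sample. The sketch below computes that quantity for gap-free aligned sequences; the function name `nucleotide_diversity` and the toy sequences are assumptions for illustration, and the abstract does not state which software or estimator the authors used.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across all sequence pairs.

    Assumes gap-free aligned sequences of equal length; this is a plain
    pairwise estimator and may differ from the one used in the study.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    # Count mismatching sites for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


# Hypothetical aligned alleles, for demonstration only.
toy_alleles = ["ATGCATGC", "ATGCATGA", "ATGTATGC"]
print(nucleotide_diversity(toy_alleles))
```

    In the toy data the three pairs differ at 1, 1 and 2 sites over 8 positions, so π is 4 / (3 × 8).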

  20. Prunus persica crop management as step toward AMF diversity conservation for the sustainable soil management

    NASA Astrophysics Data System (ADS)

    Alguacil, M. M.; Torrecillas, E.; Lozano, Z.; Garcia-Orenes, F.; Roldan, A.

    2012-04-01

    We investigated the diversity of arbuscular mycorrhizal fungi (AMF) in roots of Prunus persica under two fertilization treatments (CF: consisted of application of chicken manure (1400 kg.ha-1), urea (140 kg.ha-1), complex fertilizer 12-12-17/2 (280 kg.ha-1), and potassium sulfate (40 kg.ha-1) and IF: consisted of application of urea (140 kg.ha-1), complex fertilizer 12-12-17/2 (400 kg.ha-1) and potassium sulfate (70 kg.ha-1)) combined with integrated pest management (IM) or chemical pest management (CM), in a tropical agroecosystem in the north of Venezuela. Our goal was to ascertain how different fertilizers/pest management can modify the AMF diversity colonizing P. persica roots as an important step towards sustainable soil use and therefore protection of biodiversity. The AM fungal small-subunit (SSU) rRNA genes were subjected to PCR, cloning, sequencing and phylogenetic analyses. Twenty-one different phylotypes were identified, which were grouped in five families: Glomeraceae, Paraglomeraceae, Acaulosporaceae, Gigasporaceae and Archaeosporaceae. Sixteen of these sequence groups belonged to the genus Glomus, two to Paraglomus, one to Acaulospora, one to Scutellospora and one to Archaeospora. A different distribution of the AMF phylotypes as a consequence of the difference between treatments was observed. Thus, the AMF communities of tree roots in the (IF+CM) treatment had the lowest diversity (H'=1.78) with the lowest total number of AMF sequence types (9). The trees from both (CF+IM) and (IF+IM) treatments had similar AMF diversity (H'≈2.00), while the treatment (CF+CM) yielded the highest number of different AMF sequence types (17) and showed the highest diversity index (H'=2.69). In conclusion, crop management that combines organic and inorganic fertilization with chemical pest control appears to be the most suitable strategy for reactivating the AMF diversity in the roots of this crop and thus the agricultural and environmental sustainability of the agroecosystem.
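
    The diversity index H' quoted above is conventionally the Shannon index, H' = -Σ p_i ln p_i, where p_i is the proportion of sequences assigned to type i. The sketch below computes it from per-type counts. The function name `shannon_index` and the counts are hypothetical, and the use of the natural logarithm is an assumption, since the abstract does not state the log base.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) from per-type abundance counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Types with zero count contribute nothing and are skipped.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical clone counts per AMF sequence type, for demonstration only.
hypothetical_counts = [30, 20, 10, 5, 5]
print(round(shannon_index(hypothetical_counts), 2))

# Upper bounds for perfectly even communities of 9 and 17 types.
print(round(math.log(9), 2), round(math.log(17), 2))
```

    Under the natural-log assumption, perfectly even communities of 9 and 17 types would give ln 9 ≈ 2.20 and ln 17 ≈ 2.83, so the reported values of 1.78 and 2.69 sit below those ceilings.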

  1. Identification of a Conserved Non-Protein-Coding Genomic Element that Plays an Essential Role in Alphabaculovirus Pathogenesis

    PubMed Central

    Kikhno, Irina

    2014-01-01

    Highly homologous sequences 154–157 bp in length grouped under the name of “conserved non-protein-coding element” (CNE) were revealed in all of the sequenced genomes of baculoviruses belonging to the genus Alphabaculovirus. A CNE alignment led to the detection of a set of highly conserved nucleotide clusters that occupy strictly conserved positions in the CNE sequence. The significant length of the CNE and conservation of both its length and cluster architecture were identified as a combination of characteristics that make this CNE different from known viral non-coding functional sequences. The essential role of the CNE in the Alphabaculovirus life cycle was demonstrated through the use of a CNE-knockout Autographa californica multiple nucleopolyhedrovirus (AcMNPV) bacmid. It was shown that the essential function of the CNE was not mediated by the presumed expression activities of the protein- and non-protein-coding genes that overlap the AcMNPV CNE. On the basis of the presented data, the AcMNPV CNE was categorized as a complex-structured, polyfunctional genomic element involved in an essential DNA transaction that is associated with an undefined function of the baculovirus genome. PMID:24740153

  2. Comparative Sequence and X-Inactivation Analyses of a Domain of Escape in Human Xp11.2 and the Conserved Segment in Mouse

    PubMed Central

    Tsuchiya, Karen D.; Greally, John M.; Yi, Yajun; Noel, Kevin P.; Truong, Jean-Pierre; Disteche, Christine M.

    2004-01-01

    We have performed X-inactivation and sequence analyses on 350 kb of sequence from human Xp11.2, a region shown previously to contain a cluster of genes that escape X inactivation, and we compared this region with the region of conserved synteny in mouse. We identified several new transcripts from this region in human and in mouse, which defined the full extent of the domain escaping X inactivation in both species. In human, escape from X inactivation involves an uninterrupted 235-kb domain of multiple genes. Despite highly conserved gene content and order between the two species, Smcx is the only mouse gene from the conserved segment that escapes inactivation. As repetitive sequences are believed to facilitate spreading of X inactivation along the chromosome, we compared the repetitive sequence composition of this region between the two species. We found that long terminal repeats (LTRs) were decreased in the human domain of escape, but not in the majority of the conserved mouse region adjacent to Smcx in which genes were subject to X inactivation, suggesting that these repeats might be excluded from escape domains to prevent spreading of silencing. Our findings indicate that genomic context, as well as gene-specific regulatory elements, interact to determine expression of a gene from the inactive X-chromosome. PMID:15197169

  3. Effects of a Non-Conservative Sequence on the Properties of β-glucuronidase from Aspergillus terreus Li-20

    PubMed Central

    Liu, Yanli; Huangfu, Jie; Qi, Feng; Kaleem, Imdad; E, Wenwen; Li, Chun

    2012-01-01

    We cloned the β-glucuronidase gene (AtGUS) from Aspergillus terreus Li-20 encoding 657 amino acids (aa), which can transform glycyrrhizin into glycyrrhetinic acid monoglucuronide (GAMG) and glycyrrhetinic acid (GA). Based on sequence alignment, the C-terminal non-conservative sequence showed low identity with those of other species; thus, the partial sequence AtGUS(-3t) (1–592 aa) was amplified to determine the effects of the non-conservative sequence on the enzymatic properties. AtGUS and AtGUS(-3t) were expressed in E. coli BL21, producing AtGUS-E and AtGUS(-3t)-E, respectively. At the similar optimum temperature (55°C) and pH (AtGUS-E, 6.6; AtGUS(-3t)-E, 7.0) conditions, the thermal stability of AtGUS(-3t)-E was enhanced at 65°C, and the metal ions Co2+, Ca2+ and Ni2+ showed opposite effects on AtGUS-E and AtGUS(-3t)-E, respectively. Furthermore, Km of AtGUS(-3t)-E (1.95 mM) was just nearly one-seventh that of AtGUS-E (12.9 mM), whereas the catalytic efficiency of AtGUS(-3t)-E was 3.2 fold higher than that of AtGUS-E (7.16 vs. 2.24 mM s−1), revealing that the truncation of non-conservative sequence can significantly improve the catalytic efficiency of AtGUS. Conformational analysis illustrated significant difference in the secondary structure between AtGUS-E and AtGUS(-3t)-E by circular dichroism (CD). The results showed that the truncation of the non-conservative sequence could preferably alter and influence the stability and catalytic efficiency of enzyme. PMID:22347419
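
    The fold changes quoted in this abstract can be checked directly from the reported values. The short sketch below recomputes the two ratios; the variable names are illustrative, and the numbers are taken as given in the abstract without unit conversion.

```python
# Values as reported in the abstract (units as given there).
km_full = 12.9          # Km of AtGUS-E, mM
km_truncated = 1.95     # Km of AtGUS(-3t)-E, mM
eff_full = 2.24         # catalytic efficiency of AtGUS-E
eff_truncated = 7.16    # catalytic efficiency of AtGUS(-3t)-E

# Ratio of truncated to full-length enzyme for each quantity.
efficiency_fold = eff_truncated / eff_full
km_ratio = km_truncated / km_full

print(f"catalytic efficiency fold change: {efficiency_fold:.2f}")
print(f"Km ratio (truncated / full length): {km_ratio:.3f}")
```

    The efficiency ratio rounds to the stated 3.2-fold increase, and the Km ratio of about 0.15 (roughly 1/6.6) matches the description "nearly one-seventh".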

  4. Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.

    2004-08-06

    Background The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. Results We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Conclusions Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.

  5. Phylogeography and origin of Chinese domestic chicken.

    PubMed

    Wu, Y P; Huo, J H; Xie, J F; Liu, L X; Wei, Q P; Xie, M G; Kang, Z F; Ji, H Y; Ma, Y H

    2014-04-01

    The loss of local chicken breeds as result of replacement with cosmopolitan breeds indicates the need for conservation measures to protect the future of local genetic stocks. The aim of this study is to describe the patterns of polymorphism of the hypervariable control region of mitochondrial DNA (HVR1) in domestic chicken in China's Jiangxi province to investigate genetic diversity, genetic structure and phylo-dynamics. To this end, we sequenced the mtDNA HVR1 in 231 chickens including 22 individuals which belonged to previously published sequences. A neighbor-joining tree revealed that these samples clustered into five lineages (Lineages A, B, C, E and G). The highest haplotype diversity and nucleotide diversity were both found in Anyi tile-liked gray breed. We estimated that the most recent common ancestor of the local chicken existed approximately 16 million years ago. The mismatch distribution analysis showed two major peaks at positions 4 and 9, while the neutrality test (Tajima's D = -2.19, p < 0.05) and Fu's F-statistics (-8.59, p < 0.05) revealed a significant departure from the neutrality assumption. These results support the idea that domestication of chickens facilitated population increases. Results of a global AMOVA indicated that there was no obvious geographic structure among the local chicken breeds analyzed in this study. The data obtained in this study will assist future conservation management of local breeds and also reveals intriguing implications for the history of human population movements and commerce.

  6. Mitochondrial Divergence between Western and Eastern Great Bustards: Implications for Conservation and Species Status.

    PubMed

    Kessler, Aimee Elizabeth; Santos, Malia A; Flatz, Ramona; Batbayar, Nyambayar; Natsagdorj, Tseveenmyadag; Batsuur, Dashnyam; Bidashko, Fyodor G; Galbadrakh, Natsag; Goroshko, Oleg; Khrokov, Valery V; Unenbat, Tuvshin; Vagner, Ivan I; Wang, Muyang; Smith, Christopher Irwin

    2018-06-02

    The Great Bustard is the heaviest bird capable of flight and an iconic species of the Eurasian steppe. Populations of both currently recognized subspecies are highly fragmented and critically small in Asia. We used DNA sequence data from the mitochondrial cytochrome b gene and the mitochondrial control region to estimate the degree of mitochondrial differentiation and rates of female gene flow between the subspecies. We obtained genetic samples from 51 individuals of Otis tarda dybowskii representing multiple populations, including the first samples from Kazakhstan and Mongolia and samples from near the Altai Mountains, the proposed geographic divide between the subspecies, allowing for better characterization of the boundary between the two subspecies. We compared these with existing sequence data (n=66) from O. t. tarda. Our results suggest, though do not conclusively prove, that O. t. dybowskii and O. t. tarda may be distinct species. The geographic distribution of haplotypes, phylogenetic analysis, analyses of molecular variance, and coalescent estimation of divergence time and female migration rates indicate that O. t. tarda and O. t. dybowskii are highly differentiated in the mitochondrial genome, have been isolated for approximately 1.4 million years, and exchange much less than one female migrant per generation. Our findings indicate that the two forms should at least be recognized and managed as separate evolutionary units. Populations in Xinjiang, China and Khövsgöl and Bulgan, Mongolia exhibited the highest levels of genetic diversity and should be prioritized in conservation planning.

  7. Delineating slowly and rapidly evolving fractions of the Drosophila genome.

    PubMed

    Keith, Jonathan M; Adams, Peter; Stephen, Stuart; Mattick, John S

    2008-05-01

    Evolutionary conservation is an important indicator of function and a major component of bioinformatic methods to identify non-protein-coding genes. We present a new Bayesian method for segmenting pairwise alignments of eukaryotic genomes while simultaneously classifying segments into slowly and rapidly evolving fractions. We also describe an information criterion similar to the Akaike Information Criterion (AIC) for determining the number of classes. Working with pairwise alignments enables detection of differences in conservation patterns among closely related species. We analyzed three whole-genome and three partial-genome pairwise alignments among eight Drosophila species. Three distinct classes of conservation level were detected. Sequences comprising the most slowly evolving component were consistent across a range of species pairs, and constituted approximately 62-66% of the D. melanogaster genome. Almost all (>90%) of the aligned protein-coding sequence is in this fraction, suggesting much of it (comprising the majority of the Drosophila genome, including approximately 56% of non-protein-coding sequences) is functional. The size and content of the most rapidly evolving component was species dependent, and varied from 1.6% to 4.8%. This fraction is also enriched for protein-coding sequence (while containing significant amounts of non-protein-coding sequence), suggesting it is under positive selection. We also classified segments according to conservation and GC content simultaneously. This analysis identified numerous sub-classes of those identified on the basis of conservation alone, but was nevertheless consistent with that classification. Software, data, and results available at www.maths.qut.edu.au/-keithj/. Genomic segments comprising the conservation classes available in BED format.

  8. COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures

    DOE PAGES

    Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.; ...

    2016-09-20

    There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.

  9. Conservation of hot regions in protein-protein interaction in evolution.

    PubMed

    Hu, Jing; Li, Jiarui; Chen, Nansheng; Zhang, Xiaolong

    2016-11-01

    The hot regions of protein-protein interactions are the active areas formed by the residues most important to the protein binding process. As research on protein interactions has developed, many predicted hot regions can be discovered efficiently by intelligent computing methods, but verifying every prediction by biological experiment is rarely feasible because of the time cost and complexity of the experiments. Building on research into the conservation of hot spot residues, this study proposes a method to verify the authenticity of hot regions predicted by a machine learning algorithm that combines a protein's biological features with sequence conservation. The method uses multiple sequence alignment, a substitution matrix and sequence similarity to create a conservation scoring algorithm, and then uses a threshold module to verify the conservation tendency of hot regions in evolution. This work gives an effective method for verifying predicted hot regions in protein-protein interactions, and also provides a useful way to investigate the functional activities of protein hot regions in depth. Copyright © 2016. Published by Elsevier Inc.
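
    The general idea of scoring conservation per alignment column and then applying a threshold can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it swaps in a simple Shannon-entropy score in place of the substitution-matrix scoring the abstract mentions, and the function names, the toy alignment and the 0.8 threshold are all assumptions made for the example.

```python
import math
from collections import Counter


def column_conservation(column: str) -> float:
    """Score one alignment column as 1 - (Shannon entropy / log2(20)).

    Gaps ('-') are ignored. A column with a single residue type scores 1.0;
    more varied columns score lower. This entropy score is an illustrative
    stand-in, not the scoring scheme of the cited study.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)  # 20 standard amino acids


def conserved_positions(alignment: list[str], threshold: float = 0.8) -> list[int]:
    """Return 0-based column indices whose score meets the (assumed) threshold."""
    columns = ("".join(col) for col in zip(*alignment))
    return [i for i, col in enumerate(columns)
            if column_conservation(col) >= threshold]


# Toy alignment of four equal-length sequences (hypothetical data).
toy_alignment = ["ACDE", "ACDF", "ACEG", "AC-H"]
hits = conserved_positions(toy_alignment)
```

    In this toy case only the first two columns are fully conserved, so only they pass the threshold; a real analysis would use a curated alignment and a calibrated cut-off.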

  10. COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.

    There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.

  11. Principles of regulatory information conservation between mouse and human

    DOE PAGES

    Cheng, Yong; Ma, Zhihai; Kim, Bong-Hyun; ...

    2014-11-19

    To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human–mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Lastly, single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.

  12. THE GRK4 SUBFAMILY OF G PROTEIN-COUPLED RECEPTOR KINASES: ALTERNATIVE SPLICING, GENE ORGANIZATION, AND SEQUENCE CONSERVATION

    EPA Science Inventory

    The GRK4 subfamily of G protein-coupled receptor kinases. Alternative splicing, gene organization, and sequence conservation.

    Premont RT, Macrae AD, Aparicio SA, Kendall HE, Welch JE, Lefkowitz RJ.

    Department of Medicine, Howard Hughes Medical Institute, Duke Univer...

  13. 18 CFR 401.37 - Sequence of approval.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 18 Conservation of Power and Water Resources 2 2011-04-01 2011-04-01 false Sequence of approval. 401.37 Section 401.37 Conservation of Power and Water Resources DELAWARE RIVER BASIN COMMISSION ADMINISTRATIVE MANUAL RULES OF PRACTICE AND PROCEDURE Project Review Under Section 3.8 of the Compact § 401.37...

  14. Genomic organization, expression, and chromosome localization of a third aurora-related kinase gene, Aie1.

    PubMed

    Hu, H M; Chuang, C K; Lee, M J; Tseng, T C; Tang, T K

    2000-11-01

    We previously reported two novel testis-specific serine/threonine kinases, Aie1 (mouse) and AIE2 (human), that share high amino acid identities with the kinase domains of fly aurora and yeast Ipl1. Here, we report the entire intron-exon organization of the Aie1 gene and analyze the expression patterns of Aie1 mRNA during testis development. The mouse Aie1 gene spans approximately 14 kb and contains seven exons. The sequences of the exon-intron boundaries of the Aie1 gene conform to the consensus sequences (GT/AG) of the splicing donor and acceptor sites of most eukaryotic genes. Comparative genomic sequencing revealed that the gene structure is highly conserved between mouse Aie1 and human AIE2. However, much less homology was found in the sequence outside the kinase-coding domains. The Aie1 locus was mapped to mouse chromosome 7A2-A3 by fluorescent in situ hybridization. Northern blot analysis indicates that Aie1 mRNA likely is expressed at a low level on day 14 and reaches its plateau on day 21 in the developing postnatal testis. RNA in situ hybridization indicated that the expression of the Aie1 transcript was restricted to meiotically active germ cells, with the highest levels detected in spermatocytes at the late pachytene stage. These findings suggest that Aie1 plays a role in spermatogenesis.

  15. Molecular Cloning, Characterization, and Differential Expression of a Glucoamylase Gene from the Basidiomycetous Fungus Lentinula edodes

    PubMed Central

    Zhao, J.; Chen, Y. H.; Kwan, H. S.

    2000-01-01

    The complete nucleotide sequence of putative glucoamylase gene gla1 from the basidiomycetous fungus Lentinula edodes strain L54 is reported. The coding region of the genomic glucoamylase sequence, which is preceded by eukaryotic promoter elements CAAT and TATA, spans 2,076 bp. The gla1 gene sequence codes for a putative polypeptide of 571 amino acids and is interrupted by seven introns. The open reading frame sequence of the gla1 gene shows strong homology with those of other fungal glucoamylase genes and encodes a protein with an N-terminal catalytic domain and a C-terminal starch-binding domain. The similarity between the Gla1 protein and other fungal glucoamylases is from 45 to 61%, with the region of highest conservation found in catalytic domains and starch-binding domains. We compared the kinetics of glucoamylase activity and levels of gene expression in L. edodes strain L54 grown on different carbon sources (glucose, starch, cellulose, and potato extract) and in various developmental stages (mycelium growth, primordium appearance, and fruiting body formation). Quantitative reverse transcription PCR utilizing pairs of primers specific for gla1 gene expression shows that expression of gla1 was induced by starch and increased during the process of fruiting body formation, which indicates that glucoamylases may play an important role in the morphogenesis of the basidiomycetous fungus. PMID:10831434

  16. Expressed sequence tags from Atta laevigata and identification of candidate genes for the control of pest leaf-cutting ants.

    PubMed

    Rodovalho, Cynara M; Ferro, Milene; Fonseca, Fernando Pp; Antonio, Erik A; Guilherme, Ivan R; Henrique-Silva, Flávio; Bacci, Maurício

    2011-06-17

    Leafcutters are the most highly evolved of the Neotropical ants in the tribe Attini and are model systems for studying caste formation, labor division and symbiosis with microorganisms. Some species of leafcutters are agricultural pests controlled by chemicals which affect other animals and accumulate in the environment. Aiming to provide a genetic basis for the study of leafcutters and for the development of more specific and environmentally friendly methods for the control of pest leafcutters, we generated expressed sequence tag data from Atta laevigata, one of the pest ants with broad geographic distribution in South America. The analysis of the expressed sequence tags allowed us to characterize 2,006 unique sequences in Atta laevigata. Sixteen of these genes had a high number of transcripts and are likely positively selected for a high level of gene expression, being responsible for three basic biological functions: energy conservation through redox reactions in mitochondria; cytoskeleton and muscle structuring; regulation of gene expression and metabolism. Based on the leafcutter lifestyle and reports of genes involved in key processes of other social insects, we identified 146 sequences as potential targets for controlling pest leafcutters. The targets are responsible for antixenobiosis, development and longevity, immunity, resistance to pathogens, pheromone function, cell signaling, behavior, polysaccharide metabolism and arginine kinase activity. The generation and analysis of expressed sequence tags from Atta laevigata have provided an important genetic basis for future studies on the biology of leaf-cutting ants and may contribute to the development of a more specific and environmentally friendly method for the control of agricultural pest leafcutters.

  17. Expressed sequence tags from Atta laevigata and identification of candidate genes for the control of pest leaf-cutting ants

    PubMed Central

    2011-01-01

    Background Leafcutters are the most highly evolved of the Neotropical ants in the tribe Attini and are model systems for studying caste formation, labor division and symbiosis with microorganisms. Some species of leafcutters are agricultural pests controlled by chemicals which affect other animals and accumulate in the environment. Aiming to provide a genetic basis for the study of leafcutters and for the development of more specific and environmentally friendly methods for the control of pest leafcutters, we generated expressed sequence tag data from Atta laevigata, one of the pest ants with broad geographic distribution in South America. Results The analysis of the expressed sequence tags allowed us to characterize 2,006 unique sequences in Atta laevigata. Sixteen of these genes had a high number of transcripts and are likely positively selected for a high level of gene expression, being responsible for three basic biological functions: energy conservation through redox reactions in mitochondria; cytoskeleton and muscle structuring; regulation of gene expression and metabolism. Based on the leafcutter lifestyle and reports of genes involved in key processes of other social insects, we identified 146 sequences as potential targets for controlling pest leafcutters. The targets are responsible for antixenobiosis, development and longevity, immunity, resistance to pathogens, pheromone function, cell signaling, behavior, polysaccharide metabolism and arginine kinase activity. Conclusion The generation and analysis of expressed sequence tags from Atta laevigata have provided an important genetic basis for future studies on the biology of leaf-cutting ants and may contribute to the development of a more specific and environmentally friendly method for the control of agricultural pest leafcutters. PMID:21682882

  18. Functions of the 3′ and 5′ genome RNA regions of members of the genus Flavivirus

    PubMed Central

    Brinton, Margo A.; Basu, Mausumi

    2015-01-01

    The positive sense genomes of members of the genus Flavivirus in the family Flaviviridae are ~11 kb in length and have a 5′ type I cap but no 3′ poly A. The 5′ and 3′ terminal regions contain short conserved sequences that are proposed to be repeated remnants of an ancient sequence. However, the functions of most of these conserved sequences have not yet been determined. The terminal regions of the genome also contain multiple conserved RNA structures. Functional data for many of these structures have been obtained. Three sets of complementary 3′ and 5′ terminal region sequences, some of which are located in conserved RNA structures, interact to form a panhandle structure that is required for initiation of minus strand RNA synthesis, with the 5′ terminal structure functioning as the promoter. How the switch from the terminal RNA structure base pairing to the long distance RNA-RNA interaction is triggered and regulated is not well understood, but evidence suggests involvement of a cell protein binding to three sites on the 3′ terminal RNA structures and a cis-acting metastable 3′ RNA element in the 3′ terminal structure. Cell proteins may also be involved in facilitating exponential replication of nascent genomic RNA within replication vesicles at later times in the infection cycle. Other conserved RNA structures and/or sequences in the 5′ and 3′ terminal regions have been proposed to regulate genome translation. Additional functions of the 5′ and 3′ terminal sequences have also been reported. PMID:25683510

  19. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  20. Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

    PubMed Central

    Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

    2013-01-01

    Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343

  1. The wheat cytochrome oxidase subunit II gene has an intron insert and three radical amino acid changes relative to maize

    PubMed Central

    Bonen, Linda; Boer, Poppo H.; Gray, Michael W.

    1984-01-01

    We have determined the sequence of the wheat mitochondrial gene for cytochrome oxidase subunit II (COII) and find that its derived protein sequence differs from that of maize at only three amino acid positions. Unexpectedly, all three replacements are non-conservative ones. The wheat COII gene has a highly-conserved intron at the same position as in maize, but the wheat intron is 1.5 times longer because of an insert relative to its maize counterpart. Hybridization analysis of mitochondrial DNA from rye, pea, broad bean and cucumber indicates strong sequence conservation of COII coding sequences among all these higher plants. However, only rye and maize mitochondrial DNA show homology with wheat COII intron sequences and rye alone with intron-insert sequences. We find that a sequence identical to the region of the 5' exon corresponding to the transmembrane domain of the COII protein is present at a second genomic location in wheat mitochondria. These variations in COII gene structure and size, as well as the presence of repeated COII sequences, illustrate at the DNA sequence level, factors which contribute to higher plant mitochondrial DNA diversity and complexity. PMID:16453565

  2. Conservation of Transcription Start Sites within Genes across a Bacterial Genus

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shao, Wenjun; Price, Morgan N.; Deutschbauer, Adam M.

    Transcription start sites (TSSs) lying inside annotated genes, on the same or opposite strand, have been observed in diverse bacteria, but the function of these unexpected transcripts is unclear. Here, we use the metal-reducing bacterium Shewanella oneidensis MR-1 and its relatives to study the evolutionary conservation of unexpected TSSs. Using high-resolution tiling microarrays and 5'-end RNA sequencing, we identified 2,531 TSSs in S. oneidensis MR-1, of which 18% were located inside coding sequences (CDSs). Comparative transcriptome analysis with seven additional Shewanella species revealed that the majority (76%) of the TSSs within the upstream regions of annotated genes (gTSSs) were conserved. Thirty percent of the TSSs that were inside genes and on the sense strand (iTSSs) were also conserved. Sequence analysis around these iTSSs showed conserved promoter motifs, suggesting that many iTSS are under purifying selection. Furthermore, conserved iTSSs are enriched for regulatory motifs, suggesting that they are regulated, and they tend to eliminate polar effects, which confirms that they are functional. In contrast, the transcription of antisense TSSs located inside CDSs (aTSSs) was significantly less likely to be conserved (22%). However, aTSSs whose transcription was conserved often have conserved promoter motifs and drive the expression of nearby genes. Overall, our findings demonstrate that some internal TSSs are conserved and drive protein expression despite their unusual locations, but the majority are not conserved and may reflect noisy initiation of transcription rather than a biological function.

  3. Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies

    PubMed Central

    Dong, Chengliang; Wei, Peng; Jian, Xueqiu; Gibbs, Richard; Boerwinkle, Eric; Wang, Kai; Liu, Xiaoming

    2015-01-01

    Accurate deleteriousness prediction for nonsynonymous variants is crucial for distinguishing pathogenic mutations from background polymorphisms in whole exome sequencing (WES) studies. Although many deleteriousness prediction methods have been developed, their prediction results are sometimes inconsistent with each other and their relative merits are still unclear in practical applications. To address these issues, we comprehensively evaluated the predictive performance of 18 current deleteriousness-scoring methods, including 11 function prediction scores (PolyPhen-2, SIFT, MutationTaster, Mutation Assessor, FATHMM, LRT, PANTHER, PhD-SNP, SNAP, SNPs&GO and MutPred), 3 conservation scores (GERP++, SiPhy and PhyloP) and 4 ensemble scores (CADD, PON-P, KGGSeq and CONDEL). We found that FATHMM and KGGSeq had the highest discriminative power among independent scores and ensemble scores, respectively. Moreover, to ensure unbiased performance evaluation of these prediction scores, we manually collected three distinct testing datasets, on which no current prediction scores were tuned. In addition, we developed two new ensemble scores that integrate nine independent scores and allele frequency. Our scores achieved the highest discriminative power compared with all the deleteriousness prediction scores tested and showed low false-positive prediction rate for benign yet rare nonsynonymous variants, which demonstrated the value of combining information from multiple orthologous approaches. Finally, to facilitate variant prioritization in WES studies, we have pre-computed our ensemble scores for 87 347 044 possible variants in the whole-exome and made them publicly available through the ANNOVAR software and the dbNSFP database. PMID:25552646

  4. Domain architecture conservation in orthologs

    PubMed Central

    2011-01-01

    Background As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of such events between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extents of different types of domain swapping events across pairs of orthologs and paralogs. Results The analysis shows that orthologs exhibit greater domain architecture conservation than paralogous homologs, even when differences in average sequence divergence are compensated for, for homologs that have diverged beyond a certain threshold. We interpret this as an indication of a stronger selective pressure on orthologs than paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent. Conclusions On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than other types of homologs, relative to primary sequence conservation. This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance. PMID:21819573

  5. The first genetic map of the American cranberry: exploration of synteny conservation and quantitative trait loci.

    PubMed

    Georgi, Laura; Johnson-Cicalese, Jennifer; Honig, Josh; Das, Sushma Parankush; Rajah, Veeran D; Bhattacharya, Debashish; Bassil, Nahla; Rowland, Lisa J; Polashock, James; Vorsa, Nicholi

    2013-03-01

    The first genetic map of cranberry (Vaccinium macrocarpon) has been constructed, comprising 14 linkage groups totaling 879.9 cM with an estimated coverage of 82.2 %. This map, based on four mapping populations segregating for field fruit-rot resistance, contains 136 distinct loci. Mapped markers include blueberry-derived simple sequence repeat (SSR) and cranberry-derived sequence-characterized amplified region markers previously used for fingerprinting cranberry cultivars. In addition, SSR markers were developed near cranberry sequences resembling genes involved in flavonoid biosynthesis or defense against necrotrophic pathogens, or conserved orthologous set (COS) sequences. The cranberry SSRs were developed from next-generation cranberry genomic sequence assemblies; thus, the positions of these SSRs on the genomic map provide information about the genomic location of the sequence scaffold from which they were derived. The use of SSR markers near COS and other functional sequences, plus 33 SSR markers from blueberry, facilitates comparisons of this map with maps of other plant species. Regions of the cranberry map were identified that showed conservation of synteny with Vitis vinifera and Arabidopsis thaliana. Positioned on this map are quantitative trait loci (QTL) for field fruit-rot resistance (FFRR), fruit weight, titratable acidity, and sound fruit yield (SFY). The SFY QTL is adjacent to one of the fruit weight QTL and may reflect pleiotropy. Two of the FFRR QTL are in regions of conserved synteny with grape and span defense gene markers, and the third FFRR QTL spans a flavonoid biosynthetic gene.

  6. An atypical topoisomerase II sequence from the slime mold Physarum polycephalum.

    PubMed

    Hugodot, Yannick; Dutertre, Murielle; Duguet, Michel

    2004-01-21

    We have determined the complete nucleotide sequence of the cDNA encoding DNA topoisomerase II from Physarum polycephalum. Using degenerate primers, based on the conserved amino acid sequences of other eukaryotic enzymes, a 250-bp fragment was polymerase chain reaction (PCR) amplified. This fragment was used as a probe to screen a Physarum cDNA library. A partial cDNA clone was isolated that was truncated at the 3' end. Rapid amplification of cDNA ends (RACE)-PCR was employed to isolate the remaining portion of the gene. The complete sequence of 4613 bp contains an open reading frame of 4494 bp that codes for 1498 amino acid residues with a theoretical molecular weight of 167 kDa. The predicted amino acid sequence shares similarity with those of other eukaryotes and shows the highest degree of identity with the enzyme of Dictyostelium discoideum. However, the enzyme of P. polycephalum contains an atypical amino-terminal domain very rich in serine and proline, whose function is unknown. Remarkably, both a mitochondrial targeting sequence and a nuclear localization signal were predicted respectively in the amino and carboxy-terminus of the protein, as in the case of human topoisomerase III alpha. At the Physarum genomic level, the topoisomerase II gene encompasses a region of about 16 kbp suggesting a large proportion of intronic sequences, an unusual situation for a gene of a lower eukaryote, often free of introns. Finally, expression of topoisomerase II mRNA does not appear significantly dependent on the plasmodium cycle stage, possibly due to the lack of G1 phase or (and) to a mitochondrial localization of the enzyme.
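
    The figures reported in this abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the stated numbers (4613 bp cDNA, 4494 bp open reading frame, 1498 residues, 167 kDa); the function name is hypothetical, and whether the reported reading frame length counts a stop codon is not stated in the abstract, so the code simply divides by three.

```python
def orf_summary(cdna_len_bp: int, orf_len_bp: int, mass_da: float):
    """Derive codon count, non-ORF length and mean residue mass from reported sizes.

    Purely arithmetic; makes no assumption about where the ORF sits in the cDNA.
    """
    if orf_len_bp % 3 != 0:
        raise ValueError("an open reading frame length should be a multiple of 3")
    codons = orf_len_bp // 3                  # triplets in the reading frame
    non_orf_bp = cdna_len_bp - orf_len_bp     # untranslated sequence, both ends combined
    mean_residue_mass = mass_da / codons      # average mass per residue, in Da
    return codons, non_orf_bp, mean_residue_mass


# Values as reported in the abstract above.
codons, non_orf_bp, mean_mass = orf_summary(4613, 4494, 167_000)
```

    With these inputs the codon count matches the 1498 residues reported, 119 bp of the cDNA lie outside the reading frame, and the mean residue mass comes out near 111 Da, which is in the usual range for proteins.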

  7. Genome Fragmentation Is Not Confined to the Peridinin Plastid in Dinoflagellates

    PubMed Central

    Espelund, Mari; Minge, Marianne A.; Gabrielsen, Tove M.; Nederbragt, Alexander J.; Shalchian-Tabrizi, Kamran; Otis, Christian; Turmel, Monique; Lemieux, Claude; Jakobsen, Kjetill S.

    2012-01-01

    When plastids are transferred between eukaryote lineages through series of endosymbiosis, their environment changes dramatically. Comparison of dinoflagellate plastids that originated from different algal groups has revealed convergent evolution, suggesting that the host environment mainly influences the evolution of the newly acquired organelle. Recently the genome from the anomalously pigmented dinoflagellate Karlodinium veneficum plastid was uncovered as a conventional chromosome. To determine if this haptophyte-derived plastid contains additional chromosomal fragments that resemble the mini-circles of the peridinin-containing plastids, we have investigated its genome by in-depth sequencing using 454 pyrosequencing technology, PCR and clone library analysis. Sequence analyses show several genes with significantly higher copy numbers than present in the chromosome. These genes are most likely extrachromosomal fragments, and the ones with highest copy numbers include genes encoding the chaperone DnaK(Hsp70), the rubisco large subunit (rbcL), and two tRNAs (trnE and trnM). In addition, some photosystem genes such as psaB, psaA, psbB and psbD are overrepresented. Most of the dnaK and rbcL sequences are found as shortened or fragmented gene sequences, typically missing the 3′-terminal portion. Both dnaK and rbcL are associated with a common sequence element consisting of about 120 bp of highly conserved AT-rich sequence followed by a trnE gene, possibly serving as a control region. Decatenation assays and Southern blot analysis indicate that the extrachromosomal plastid sequences do not have the same organization or lengths as the minicircles of the peridinin dinoflagellates. The fragmentation of the haptophyte-derived plastid genome of K. veneficum suggests that it is likely a sign of a host-driven process shaping the plastid genomes of dinoflagellates. PMID:22719952

  8. [Comparative analysis of clustered regularly interspaced short palindromic repeats (CRISPRs) loci in the genomes of halophilic archaea].

    PubMed

    Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian

    2009-11-01

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to perform a genome-wide analysis of CRISPRs in the extremely halophilic archaea whose whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were highly conserved, and that two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and that the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.

  9. A comprehensive analysis of three Asiatic black bear mitochondrial genomes (subspecies ussuricus, formosanus and mupinensis), with emphasis on the complete mtDNA sequence of Ursus thibetanus ussuricus (Ursidae).

    PubMed

    Hwang, Dae-Sik; Ki, Jang-Seu; Jeong, Dong-Hyuk; Kim, Bo-Hyun; Lee, Bae-Keun; Han, Sang-Hoon; Lee, Jae-Seong

    2008-08-01

    In the present paper, we describe the mitochondrial genome sequence of the Asiatic black bear (Ursus thibetanus ussuricus) with particular emphasis on the control region (CR), and compare it with other mitochondrial genomes to examine molecular relationships among the bears. The mitochondrial genome sequence of U. thibetanus ussuricus was 16,700 bp in size with mostly conserved structures (e.g. 13 protein-coding genes, two rRNA genes, 22 tRNA genes). The CR consisted of several typical conserved domains such as F, E, D, and C boxes, and a conserved sequence block. Nucleotide sequences and the repeated motifs in the CR were different among the bear species, and their copy numbers were also variable according to populations, even within F1 generations of U. thibetanus ussuricus. Comparative analyses showed that the CR D1 region was highly informative for the discrimination of the bear family. These findings suggest that nucleotide sequences of both repeated motifs and CR D1 in the bear family are good markers for species discrimination.

  10. Conserved sequence-specific lincRNA-steroid receptor interactions drive transcriptional repression and direct cell fate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hudson, William H.; Pickard, Mark R.; de Vera, Ian Mitchelle S.

    2014-12-23

    The majority of the eukaryotic genome is transcribed, generating a significant number of long intergenic noncoding RNAs (lincRNAs). Although lincRNAs represent the most poorly understood product of transcription, recent work has shown lincRNAs fulfill important cellular functions. In addition to low sequence conservation, poor understanding of structural mechanisms driving lincRNA biology hinders systematic prediction of their function. Here we report the molecular requirements for the recognition of steroid receptors (SRs) by the lincRNA growth arrest-specific 5 (Gas5), which regulates steroid-mediated transcriptional regulation, growth arrest and apoptosis. We identify the functional Gas5-SR interface and generate point mutations that ablate the SR-Gas5 lincRNA interaction, altering Gas5-driven apoptosis in cancer cell lines. Further, we find that the Gas5 SR-recognition sequence is conserved among haplorhines, with its evolutionary origin as a splice acceptor site. This study demonstrates that lincRNAs can recognize protein targets in a conserved, sequence-specific manner in order to affect critical cell functions.

  11. Sequence, structure and function relationships in flaviviruses as assessed by evolutive aspects of its conserved non-structural protein domains.

    PubMed

    da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas

    2017-10-28

    Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis on all flaviviruses non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. Also, we provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as a supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Creation of a data base for sequences of ribosomal nucleic acids and detection of conserved restriction endonucleases sites through computerized processing.

    PubMed Central

    Patarca, R; Dorta, B; Ramirez, J L

    1982-01-01

    As part of a project pertaining to the organization of ribosomal genes in Kinetoplastidae, we have created a data base for published sequences of ribosomal nucleic acids, with information in Spanish. As a first step in their processing, we have written a computer program which introduces the new feature of determining the length of the fragments produced after single or multiple digestion with any of the known restriction enzymes. With this information we have detected conserved SAU 3A sites: (i) at the 5' end of the 5.8S rRNA and at the 3' end of the small subunit rRNA, both included in similar larger sequences; (ii) in the 5.8S rRNA of vertebrates (a second one), which is not present in lower eukaryotes, showing a clear evolutionary divergence; and (iii) at the 5' terminal of the small subunit rRNA, included in a larger conserved sequence. The possible biological importance of these sequences is discussed. PMID:6278402
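
    The fragment-length computation described in this abstract can be illustrated with a minimal sketch. It is not the authors' program: the function name fragment_lengths is hypothetical, the sequence is treated as linear, and every enzyme is assumed to cut immediately before its recognition site, as Sau3AI does at GATC.

```python
def fragment_lengths(seq: str, sites: list[str]) -> list[int]:
    """Return fragment lengths after digesting a linear sequence.

    Assumption: each enzyme cuts immediately before its recognition
    site (true for Sau3AI at ^GATC, not for enzymes in general).
    """
    seq = seq.upper()
    cuts = set()
    for site in sites:
        start = seq.find(site.upper())
        while start != -1:
            if start > 0:  # a cut at position 0 yields no new fragment
                cuts.add(start)
            start = seq.find(site.upper(), start + 1)
    # Fragment boundaries: sequence start, every cut position, sequence end.
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Single digestion with a Sau3AI-like site; pass several sites
    # for a multiple digestion.
    print(fragment_lengths("AAGATCTTTTGATCC", ["GATC"]))
```

    Passing more than one recognition site models a multiple digestion, since the cut positions of all enzymes are pooled before the fragments are measured.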

  13. Full-length genome sequence analysis of an avian leukosis virus subgroup J (ALV-J) as contaminant in live poultry vaccine: The commercial live vaccines might be a potential route for ALV-J transmission.

    PubMed

    Wang, P; Lin, L; Li, H; Shi, M; Gu, Z; Wei, P

    2018-02-25

    One avian leukosis virus subgroup J (ALV-J) strain was isolated from 67 commercial live poultry vaccines produced by various manufacturers during 2013-2016 in China. The complete genome of the isolate was sequenced, and the gag and pol genes of the strain were found to be relatively conserved, while the gp85 gene of the strain GX14YYA1 had the highest similarity with a field strain, GX14ZS14, which was isolated from the chickens of a farm that had once used the same vaccine as the one found to be contaminated with GX14YYA1. This is the first report of an ALV-J contaminant in live poultry vaccine in China. Our finding demonstrates that vaccination with commercial live vaccines might be a potential new route for ALV-J transmission in chickens and highlights the need for more extensive monitoring of the commercial live vaccines in China. © 2018 Blackwell Verlag GmbH.

  14. WNL Stars - the Most Massive Stars in the Universe?

    NASA Astrophysics Data System (ADS)

    Schnurr, Olivier; Moffat, Anthony F. J.; St-Louis, Nicole; Skalkowski, Gwenael; Niemela, Virpi; Shara, Michael M.

    2001-08-01

    We propose to carry out an intensive and complete time-dependent spectroscopic study of all 47 known WNL stars in the LMC, an ideal laboratory to study the effect of lower ambient metallicity, Z, on stellar evolution. WNL stars are luminous, cooler WR stars of the nitrogen sequence. This will allow us to: 1) determine the binary frequency. The Roche-lobe overflow (RLOF) mechanism in close binaries is predicted to be responsible for the formation of a significant fraction of WR stars in low Z environments such as the LMC. 2) determine the masses. Since some of these stars (denoted WNL(h) or WNLh) are supposed to be hydrogen-burning and thus main-sequence stellar objects of the highest luminosity, they may be the most massive stars known. 3) study wind-wind collision (WWC) effects in WR+O binaries involving very luminous WNL stars with strong winds. Interesting in itself as a high-energy phenomenon, WWC is in competition with conservative RLOF (i.e. mass transfer to the secondary star), and therefore has to be taken into account in this context.

  15. WNLh Stars - The Most Massive Stars in the Universe?

    NASA Astrophysics Data System (ADS)

    Schnurr, Olivier; St-Louis, Nicole; Moffat, Anthony F. J.; Foellmi, Cedric

    2002-08-01

    We propose to conclude our intensive and complete time-dependent spectroscopic study of all 47 known WNL stars in the LMC, an ideal laboratory to study the effect of lower ambient metallicity, Z, on stellar evolution. WNL stars are luminous, cooler WR stars of the nitrogen sequence. This will allow us to: 1) determine the binary frequency. The Roche-lobe overflow (RLOF) mechanism in close binaries is predicted to be responsible for the formation of a significant fraction of WR stars in low Z environments such as the LMC. 2) determine the masses. Since some of these stars (denoted WNL(h) or WNLh) are supposed to be hydrogen-burning and thus main-sequence stellar objects of the highest luminosity, they may be the most massive stars known. 3) study wind-wind collision (WWC) effects in WR+O binaries involving very luminous WNL stars with strong winds. Interesting in itself as a high-energy phenomenon, WWC is in competition with conservative RLOF (i.e. mass transfer to the secondary star), and therefore has to be taken into account in this context.

  16. Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis.

    PubMed

    Sharmin, Refat; Islam, Abul B M M K

    2016-01-01

    MERS-CoV is a newly emerged human coronavirus reported to be closely related to the HKU4 and HKU5 bat coronaviruses. Bat and MERS coronaviruses are structurally related. Therefore, it is of interest to estimate the degree of conserved antigenic sites among them. It is of importance to elucidate the shared antigenic sites and extent of conservation between them to understand the evolutionary dynamics of MERS-CoV. Multiple sequence alignment of the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins was employed to identify the sequence conservation among MERS and bat (HKU4, HKU5) coronaviruses. We used various in silico tools to predict the conserved antigenic sites. We found that MERS-CoV shared 30% of its S protein antigenic sites with HKU4 and 70% with HKU5 bat-CoV, whereas 100% of its E, M and N protein antigenic sites were found to be conserved with those in HKU4 and HKU5. This sharing suggests that, in terms of pathogenicity, MERS-CoV is more closely related to HKU5 bat-CoV than to HKU4 bat-CoV. The conserved epitopes indicate their evolutionary relationship and ancestry of pathogenicity.

  17. T box transcription antitermination riboswitch: Influence of nucleotide sequence and orientation on tRNA binding by the antiterminator element

    PubMed Central

    Fauzi, Hamid; Agyeman, Akwasi; Hines, Jennifer V.

    2008-01-01

    Many bacteria utilize riboswitch transcription regulation to monitor and appropriately respond to cellular levels of important metabolites or effector molecules. The T box transcription antitermination riboswitch responds to cognate uncharged tRNA by specifically stabilizing an antiterminator element in the 5′-untranslated mRNA leader region and precluding formation of a thermodynamically more stable terminator element. Stabilization occurs when the tRNA acceptor end base pairs with the first four nucleotides in the seven nucleotide bulge of the highly conserved antiterminator element. The significance of the conservation of the antiterminator bulge nucleotides that do not base pair with the tRNA is unknown, but they are required for optimal function. In vitro selection was used to determine if the isolated antiterminator bulge context alone dictates the mode in which the tRNA acceptor end binds the bulge nucleotides. No sequence conservation beyond complementarity was observed and the location was not constrained to the first four bases of the bulge. The results indicate that formation of a structure that recognizes the tRNA acceptor end in isolation is not the determinant driving force for the high phylogenetic sequence conservation observed within the antiterminator bulge. Additional factors or T box leader features more likely influenced the phylogenetic sequence conservation. PMID:19152843

  18. Sequence conservation, HLA-E-Restricted peptide, and best-defined CTL/CD8+ epitopes in gag P24 (capsid) of HIV-1 subtype B

    NASA Astrophysics Data System (ADS)

    Prasetyo, Afiono Agung; Dharmawan, Ruben; Sari, Yulia; Sariyatun, Ratna

    2017-02-01

    Human immunodeficiency virus type 1 (HIV-1) remains a global health problem. Continuous studies of HIV-1 genetic and immunological profiles are important to find strategies against the virus. This study aimed to analyse sequence conservation, the HLA-E-restricted peptide, and best-defined CTL/CD8+ epitopes in p24 (capsid) of HIV-1 subtype B worldwide. The p24-coding sequences from 3,557 HIV subtype B isolates were aligned using MUSCLE and analysed. Some highly conserved regions (sequence conservation ≥95%) were observed. Two considerably long stretches with 100% conservation were observed at bases 349-356 and 550-557 of p24 (HXB2 numbering). The consensus from all aligned isolates was precisely the same as consensus B in the Los Alamos HIV Database. The HLA-E-restricted peptide in amino acids (aa) 14-22 of HIV-1 p24 (AISPRTLNA) was found in 55.9% (1,987/3,557) of HIV-1 subtype B worldwide. Forty-four best-defined CTL/CD8+ epitopes were observed, of which the VKNWMTETL epitope (aa 181-189 of p24) restricted by B*4801 was the most frequent, found in 94.9% of isolates. The results of this study contribute information about HIV-1 subtype B and may benefit further work to develop diagnostic and therapeutic strategies against the virus.
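
    A per-position conservation threshold such as the ≥95% figure above can be sketched as follows. The abstract does not state how conservation was scored, so this sketch assumes the simplest definition, the frequency of the most common character in each alignment column, with gaps counted as ordinary characters. The names column_conservation and conserved_runs are hypothetical.

```python
from collections import Counter


def column_conservation(aligned: list[str]) -> list[float]:
    """Frequency of the most common character in each alignment column.

    Assumption: all sequences are already aligned to equal length and
    gap characters are counted like any other symbol.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*aligned):
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def conserved_runs(scores: list[float], threshold: float = 0.95) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs with score >= threshold."""
    runs = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start + 1, i))
            start = None
    if start is not None:
        runs.append((start + 1, len(scores)))
    return runs


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT", "ACTT"]  # illustrative data only
    scores = column_conservation(toy_alignment)
    print(scores, conserved_runs(scores, threshold=1.0))
```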

  19. Huntingtin-interacting protein 1 (Hip1) and Hip1-related protein (Hip1R) bind the conserved sequence of clathrin light chains and thereby influence clathrin assembly in vitro and actin distribution in vivo.

    PubMed

    Chen, Chih-Ying; Brodsky, Frances M

    2005-02-18

    Clathrin heavy and light chains form triskelia, which assemble into polyhedral coats of membrane vesicles that mediate transport for endocytosis and organelle biogenesis. Light chain subunits regulate clathrin assembly in vitro by suppressing spontaneous self-assembly of the heavy chains. The residues that play this regulatory role are at the N terminus of a conserved 22-amino acid sequence that is shared by all vertebrate light chains. Here we show that these regulatory residues and others in the conserved sequence mediate light chain interaction with Hip1 and Hip1R. These related proteins were previously found to be enriched in clathrin-coated vesicles and to promote clathrin assembly in vitro. We demonstrate Hip1R binding preference for light chains associated with clathrin heavy chain and show that Hip1R stimulation of clathrin assembly in vitro is blocked by mutations in the conserved sequence of light chains that abolish interaction with Hip1 and Hip1R. In vivo overexpression of a fragment of clathrin light chain comprising the Hip1R-binding region affected cellular actin distribution. Together these results suggest that the roles of Hip1 and Hip1R in affecting clathrin assembly and actin distribution are mediated by their interaction with the conserved sequence of clathrin light chains.

  20. A conserved predicted pseudoknot in the NS2A-encoding sequence of West Nile and Japanese encephalitis flaviviruses suggests NS1' may derive from ribosomal frameshifting

    PubMed Central

    Firth, Andrew E; Atkins, John F

    2009-01-01

    Japanese encephalitis, West Nile, Usutu and Murray Valley encephalitis viruses form a tight subgroup within the larger Flavivirus genus. These viruses utilize a single-polyprotein expression strategy, resulting in ~10 mature proteins. Plotting the conservation at synonymous sites along the polyprotein coding sequence reveals strong conservation peaks at the very 5' end of the coding sequence, and also at the 5' end of the sequence encoding the NS2A protein. Such peaks are generally indicative of functionally important non-coding sequence elements. The second peak corresponds to a predicted stable pseudoknot structure whose biological importance is supported by compensatory mutations that preserve the structure. The pseudoknot is preceded by a conserved slippery heptanucleotide (Y CCU UUU), thus forming a classical stimulatory motif for -1 ribosomal frameshifting. We hypothesize, therefore, that the functional importance of the pseudoknot is to stimulate a portion of ribosomes to shift -1 nt into a short (45 codon), conserved, overlapping open reading frame, termed foo. Since cleavage at the NS1-NS2A boundary is known to require synthesis of NS2A in cis, the resulting transframe fusion protein is predicted to be NS1-NS2AN-term-FOO. We hypothesize that this may explain the origin of the previously identified NS1 'extension' protein in JEV-group flaviviruses, known as NS1'. PMID:19196463
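
    Locating candidate slippery heptanucleotides of the form Y CCU UUU (Y being a pyrimidine, C or U) can be sketched with a regular expression. This is only an illustration of the motif search, not the authors' pipeline: the function name find_slippery_sites is hypothetical, DNA input is converted to RNA, and the sketch ignores reading frame and the downstream pseudoknot, both of which matter for actual frameshifting.

```python
import re

# Y CCU UUU, where Y is C or U; the lookahead also reports overlapping hits.
SLIPPERY_PATTERN = re.compile(r"(?=([CU]CCUUUU))")


def find_slippery_sites(seq: str) -> list[int]:
    """Return 0-based start positions of YCCUUUU motifs in a sequence.

    Assumption: reading frame is not checked, so hits are candidates only.
    """
    rna = seq.upper().replace("T", "U")
    return [match.start() for match in SLIPPERY_PATTERN.finditer(rna)]


if __name__ == "__main__":
    # Illustrative sequence only, not taken from any viral genome.
    print(find_slippery_sites("AAUCCUUUUGGCCCUUUUA"))
```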

  1. Application of cytochrome b DNA sequences for the authentication of endangered snake species.

    PubMed

    Wong, Ka-Lok; Wang, Jun; But, Paul Pui-Hay; Shaw, Pang-Chui

    2004-01-06

    In order to enforce the conservation program and curb the illegal trading and consumption of endangered snake species, the value of cytochrome b sequence in the authentication of snake species was evaluated. As an illustration, DNA was extracted, and selected cytochrome b DNA sequences were amplified and sequenced, from six snakes commonly consumed in Hong Kong. Together with publicly available sequences, a cytochrome b database containing 90 species of snakes was constructed. In this database, sequence homology between snakes ranged from 70.68 to 95.11%. On the other hand, intraspecific variation of three tested snakes was 0-0.98%. Using the database, we were able to determine the identity of six meat samples confiscated by the Agriculture, Fisheries and Conservation Department, HKSAR.
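
    The percentage figures above compare pairs of sequences. The abstract does not say how homology was calculated, so the following is only a sketch of one common measure, percent identity over two pre-aligned sequences, with any column containing a gap skipped. The function name percent_identity is hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are
    excluded from both the numerator and the denominator.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Illustrative sequences only.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

    Under this measure, intraspecific variation would be 100 minus the identity between two individuals of the same species.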

  2. Insights into the fold organization of TIM barrel from interaction energy based structure networks.

    PubMed

    Vijayabaskar, M S; Vishveshwara, Saraswathi

    2012-01-01

    There are many well-known examples of proteins with low sequence similarity, adopting the same structural fold. This aspect of sequence-structure relationship has been extensively studied both experimentally and theoretically, however with limited success. Most of the studies consider remote homology or "sequence conservation" as the basis for their understanding. Recently "interaction energy" based network formalism (Protein Energy Networks (PENs)) was developed to understand the determinants of protein structures. In this paper we have used these PENs to investigate the common non-covalent interactions and their collective features which stabilize the TIM barrel fold. We have also developed a method of aligning PENs in order to understand the spatial conservation of interactions in the fold. We have identified key common interactions responsible for the conservation of the TIM fold, despite high sequence dissimilarity. For instance, the central beta barrel of the TIM fold is stabilized by long-range high energy electrostatic interactions and low-energy contiguous vdW interactions in certain families. The other interfaces like the helix-sheet or the helix-helix seem to be devoid of any high energy conserved interactions. Conserved interactions in the loop regions around the catalytic site of the TIM fold have also been identified, pointing out their significance in both structural and functional evolution. Based on these investigations, we have developed a novel network based phylogenetic analysis for remote homologues, which can perform better than sequence based phylogeny. Such an analysis is more meaningful from both structural and functional evolutionary perspective. We believe that the information obtained through the "interaction conservation" viewpoint and the subsequently developed method of structure network alignment, can shed new light in the fields of fold organization and de novo computational protein design.

  3. Conservation of tubulin-binding sequences in TRPV1 throughout evolution.

    PubMed

    Sardar, Puspendu; Kumar, Abhishek; Bhandari, Anita; Goswami, Chandan

    2012-01-01

    Transient Receptor Potential Vanilloid sub type 1 (TRPV1), commonly known as the capsaicin receptor, can detect multiple stimuli, including noxious compounds, low pH, temperature and electromagnetic waves at different ranges. In addition, this receptor is involved in multiple physiological and sensory processes. Therefore, functions of TRPV1 have direct influences on adaptation and further evolution. Availability of various eukaryotic genomic sequences in the public domain facilitates the study of the molecular evolution of TRPV1 protein and the respective conservation of certain domains, motifs and interacting regions that are functionally important. Using statistical and bioinformatics tools, our analysis reveals that TRPV1 evolved about 420 million years ago (MYA). Our analysis reveals that specific regions, domains and motifs of TRPV1 have gone through different selection pressures and thus have different levels of conservation. We found that, among all of these, the TRP box is the most conserved and thus has functional significance. Our results also indicate that the tubulin binding sequences (TBS) have evolutionary significance, as these sequence stretches are more conserved than many other essential regions of TRPV1. The overall distribution of positively charged residues within the TBS motifs is conserved throughout evolution. In silico analysis reveals that the TBS-1 and TBS-2 of TRPV1 can form helical structures and may play an important role in TRPV1 function. Our analysis identifies the regions of TRPV1 that are important for the structure-function relationship. This analysis indicates that tubulin binding sequence-1 (TBS-1) near the TRP box forms a potential helix and that the tubulin interactions with TRPV1 via TBS-1 have evolutionary significance. This interaction may be required for proper channel function and regulation and may also have significance in the context of Taxol®-induced neuropathy.

  4. Genetic and structural analyses of cytochrome P450 hydroxylases in sex hormone biosynthesis: Sequential origin and subsequent coevolution.

    PubMed

    Goldstone, Jared V; Sundaramoorthy, Munirathinam; Zhao, Bin; Waterman, Michael R; Stegeman, John J; Lamb, David C

    2016-01-01

    Biosynthesis of steroid hormones in vertebrates involves three cytochrome P450 hydroxylases, CYP11A1, CYP17A1 and CYP19A1, which catalyze sequential steps in steroidogenesis. These enzymes are conserved in the vertebrates, but their origin and existence in other chordate subphyla (Tunicata and Cephalochordata) have not been clearly established. In this study, selected protein sequences of CYP11A1, CYP17A1 and CYP19A1 were compiled and analyzed using multiple sequence alignment and phylogenetic analysis. Our analyses show that cephalochordates have sequences orthologous to vertebrate CYP11A1, CYP17A1 or CYP19A1, and that echinoderms and hemichordates possess CYP11-like but not CYP19 genes. While the cephalochordate sequences have low identity with the vertebrate sequences, reflecting evolutionary distance, the data show apparent origin of CYP11 prior to the evolution of CYP19 and possibly CYP17, thus indicating a sequential origin of these functionally related steroidogenic CYPs. Co-occurrence of the three CYPs in early chordates suggests that the three genes may have coevolved thereafter, and that functional conservation should be reflected in functionally important residues in the proteins. CYP19A1 has the largest number of conserved residues while CYP11A1 sequences are less conserved. Structural analyses of human CYP11A1, CYP17A1 and CYP19A1 show that critical substrate binding site residues are highly conserved in each enzyme family. The results emphasize that the steroidogenic pathways producing glucocorticoids and reproductive steroids are several hundred million years old and that the catalytic structural elements of the enzymes have been conserved over the same period of time. Analysis of these elements may help to identify when precursor functions linked to these enzymes first arose. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. Effects of private-land use, livestock management, and human tolerance on diversity, distribution, and abundance of large African mammals.

    PubMed

    Kinnaird, Margaret F; O'brien, Timothy G

    2012-12-01

    Successful conservation of large terrestrial mammals (wildlife) on private lands requires that landowners be empowered to manage wildlife so that benefits outweigh the costs. Laikipia County, Kenya, is predominantly unfenced, and the land uses in the area allow wide-ranging wildlife to move freely between different management systems on private land. We used camera traps to sample large mammals associated with 4 different management systems (rhinoceros sanctuaries, no livestock; conservancies, intermediate stocking level; fenced ranches, high stocking level; and group ranches, high stocking level, no fencing, pastoralist clan ownership) to examine whether management and stocking levels affect wildlife. We deployed cameras at 522 locations across 8 properties from January 2008 through October 2010 and used the photographs taken during this period to estimate richness, occupancy, and relative abundance of species. Species richness was highest in conservancies and sanctuaries and lowest on fenced and group ranches. Occupancy estimates were, on average, 2 and 5 times higher in sanctuaries and conservancies than on fenced and group ranches, respectively. Nineteen species on fenced ranches and 25 species on group ranches were considered uncommon (occupancy < 0.1). The relative abundance of most species was highest or second highest in sanctuaries and conservancies. Lack of rights to manage and utilize wildlife and uncertain land tenure dampen many owners' incentives to tolerate wildlife. We suggest national conservation strategies consider landscape-level approaches to land-use planning that aim to increase conserved areas by providing landowners with incentives to tolerate wildlife. Possible incentives include improving access to ecotourism benefits, forging agreements to maintain wildlife habitat and corridors, resolving land-ownership conflicts, restoring degraded rangelands, expanding opportunities for grazing leases, and allowing direct benefits to landowners through wildlife harvesting. © 2012 Society for Conservation Biology.

  6. Modeling soil conservation, water conservation and their tradeoffs: a case study in Beijing.

    PubMed

    Bai, Yang; Ouyang, Zhiyun; Zheng, Hua; Li, Xiaoma; Zhuang, Changwei; Jiang, Bo

    2012-01-01

    Natural ecosystems provide society with important goods and services. With the rapid increase in human populations and excessive utilization of natural resources, humans frequently enhance the production of some services at the expense of the others. Although the need for tradeoffs between conservation and development is urgent, the lack of efficient methods to assess such tradeoffs has impeded progress. Three land use strategy scenarios (development scenario, plan trend scenario and conservation scenario) were created to forecast potential changes in ecosystem services from 2007 to 2050 in Beijing, China. GIS-based techniques were used to map spatial and temporal distribution and changes in ecosystem services for each scenario. The provision of ecosystem services differed spatially, with significant changes being associated with different scenarios. Scenario analysis of water yield (as average annual yield) and soil retention (as retention rate per unit area) for the period 2007 to 2050 indicated that the highest values for these parameters were predicted for the forest habitat under all three scenarios. Annual yield/retention of forest, shrub, and grassland ranked the highest in the conservation scenario. Total water yield and soil retention increased in the conservation scenario and declined dramatically in the other two scenarios, especially the development scenario. The conservation scenario was the optimal land use strategy, resulting in the highest soil retention and water yield. Our study suggests that the evaluation and visualization of ecosystem services can effectively assist in understanding the tradeoffs between conservation and development. Results of this study have implications for planning and monitoring future management of natural capital and ecosystem services, which can be integrated into land use decision-making.

  7. Local ecological knowledge and its relationship with biodiversity conservation among two Quilombola groups living in the Atlantic Rainforest, Brazil

    PubMed Central

    Conde, Bruno Esteves; Ticktin, Tamara; Fonseca, Amanda Surerus; Macedo, Arthur Ladeira; Orsi, Timothy Ongaro; Chedier, Luciana Moreira; Rodrigues, Eliana; Pimenta, Daniel Sales

    2017-01-01

    Information on the knowledge, uses, and abundance of natural resources in local communities can provide insight on conservation status and conservation strategies in these locations. The aim of this research was to evaluate the uses, knowledge and conservation status of plants in two Quilombolas (descendants of slaves of African origin) communities in the Atlantic rainforest of Brazil, São Sebastião da Boa Vista (SSBV) and São Bento (SB). We used a combination of ethnobotanical and ecological survey methods to ask: 1) What ethnobotanical knowledge do the communities hold? 2) What native species are most valuable to them? 3) What is the conservation status of the native species used? Thirteen local experts described the names and uses of 212 species in SSBV (105 native species) and 221 in SB (96 native species). Shannon Wiener diversity and Pielou’s Equitability indices of ethnobotanical knowledge of species were very high (5.27/0.96 and 5.28/0.96, respectively). Species with the highest cultural significance and use-value indexes in SSBV were Dalbergia hortensis (26/2.14), Eremanthus erythropappus (6.88/1), and Tibouchina granulosa (6.02/1); while Piptadenia gonoacantha (3.32/1), Sparattosperma leucanthum (3.32/1) and Cecropia glaziovii (3.32/0.67) were the highest in SB. Thirty-three native species ranked in the highest conservation priority category at SSBV and 31 at SB. D. hortensis was noteworthy because of its extremely high cultural importance at SSBV, and its categorization as a conservation priority in both communities. This information can be used towards generating sustainable use and conservation plans that are appropriate for the local communities. PMID:29182637
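
    The two indices named in this abstract have standard definitions: Shannon-Wiener diversity H' = -Σ p_i ln p_i and Pielou's equitability J' = H' / ln S, where p_i is the proportion of observations for species i and S is the number of species. The sketch below computes both from a list of counts. It assumes natural logarithms and uses made-up counts; the abstract states neither the log base nor the underlying data, and the function names are hypothetical.

```python
import math


def shannon_wiener(counts: list[int]) -> float:
    """Shannon-Wiener diversity H' = -sum(p_i * ln p_i) over nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_equitability(counts: list[int]) -> float:
    """Pielou's J' = H' / ln(S), with S the number of species observed."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        raise ValueError("equitability needs at least two species")
    return shannon_wiener(counts) / math.log(richness)


if __name__ == "__main__":
    citations = [10, 10, 10, 10]  # illustrative counts, not study data
    print(shannon_wiener(citations), pielou_equitability(citations))
```

    Perfectly even counts give J' = 1, so values near 0.96 indicate that knowledge is spread almost evenly across the species cited.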

  8. Local ecological knowledge and its relationship with biodiversity conservation among two Quilombola groups living in the Atlantic Rainforest, Brazil.

    PubMed

    Conde, Bruno Esteves; Ticktin, Tamara; Fonseca, Amanda Surerus; Macedo, Arthur Ladeira; Orsi, Timothy Ongaro; Chedier, Luciana Moreira; Rodrigues, Eliana; Pimenta, Daniel Sales

    2017-01-01

    Information on the knowledge, uses, and abundance of natural resources in local communities can provide insight into conservation status and conservation strategies in these locations. The aim of this research was to evaluate the uses, knowledge and conservation status of plants in two Quilombola communities (descendants of slaves of African origin) in the Atlantic rainforest of Brazil, São Sebastião da Boa Vista (SSBV) and São Bento (SB). We used a combination of ethnobotanical and ecological survey methods to ask: 1) What ethnobotanical knowledge do the communities hold? 2) What native species are most valuable to them? 3) What is the conservation status of the native species used? Thirteen local experts described the names and uses of 212 species in SSBV (105 native species) and 221 in SB (96 native species). Shannon-Wiener diversity and Pielou's Equitability indices of ethnobotanical knowledge of species were very high (5.27/0.96 and 5.28/0.96, respectively). Species with the highest cultural significance and use-value indexes in SSBV were Dalbergia hortensis (26/2.14), Eremanthus erythropappus (6.88/1), and Tibouchina granulosa (6.02/1); while Piptadenia gonoacantha (3.32/1), Sparattosperma leucanthum (3.32/1) and Cecropia glaziovii (3.32/0.67) were the highest in SB. Thirty-three native species ranked in the highest conservation priority category at SSBV and 31 at SB. D. hortensis was noteworthy because of its extremely high cultural importance at SSBV, and its categorization as a conservation priority in both communities. This information can be used towards generating sustainable use and conservation plans that are appropriate for the local communities.

  9. A new earthworm cellulase and its possible role in the innate immunity.

    PubMed

    Park, In Yong; Cha, Ju Roung; Ok, Suk-Mi; Shin, Chuog; Kim, Jin-Se; Kwak, Hee-Jin; Yu, Yun-Sang; Kim, Yu-Kyung; Medina, Brenda; Cho, Sung-Jin; Park, Soon Cheol

    2017-02-01

    A new endogenous cellulase (Ean-EG) from the earthworm Eisenia andrei and its expression pattern are described. Based on the deduced amino acid sequence, the open reading frame (ORF) of Ean-EG consists of 1368 bp corresponding to a polypeptide of 456 amino acid residues, which contains the conserved region specific to GHF9 with the amino acid residues essential for enzyme activity. In multiple alignments and phylogenetic analysis, the deduced amino acid sequence of Ean-EG showed the highest sequence similarity (about 79%) to that of an annelid (Pheretima hilgendorfi) and clustered together with other GHF9 cellulases, indicating that Ean-EG can be categorized as a member of GHF9, to which most animal cellulases belong. In situ hybridization of Ean-EG mRNA revealed that the most distinct expression was in epithelial cells, with positive hybridization signals in the epidermis, chloragogen tissue cells, coelomic cell aggregates, and even blood vessels. This strongly suggests that, at least in the earthworm Eisenia andrei, cellulase function is not limited to the digestive process but may extend to innate immunity. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. Isolation and Characterization of the PKAr Gene From a Plant Pathogen, Curvularia lunata.

    PubMed

    Liu, T; Ma, B C; Hou, J M; Zuo, Y H

    2014-09-01

    By using an EST database from a full-length cDNA library of Curvularia lunata, we isolated a 2.9 kb cDNA, termed PKAr. An ORF of 1,383 bp encoding a polypeptide of 460 amino acids with a molecular weight of 50.1 kDa (GenBank Acc. No. KF675744) was cloned. The deduced amino acid sequence of PKAr shows 90% and 88% identity with the cAMP-dependent protein kinase A regulatory subunit from Alternaria alternata and Pyrenophora tritici-repentis Pt-1C-BFP, respectively. Database analysis revealed that the deduced amino acid sequence of PKAr shares considerable similarity with that of PKA regulatory subunits in other organisms, particularly in the conserved regions. No introns were identified within the 1,383 bp ORF when compared with the PKAr genomic DNA sequence. Southern blotting indicated that PKAr exists as a single copy per genome. The mRNA expression levels of PKAr at different developmental stages were measured using real-time quantitative PCR. The results showed that PKAr expression was highest in vegetatively growing mycelium, which indicates that it might play an important role in the vegetative growth of C. lunata. These results provide a foundation for further research on the function of PKAr in the plant pathogen C. lunata.
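
    The ORF and polypeptide lengths quoted above can be cross-checked with simple codon arithmetic. The sketch below assumes the 1,383 bp figure includes the stop codon, which the abstract does not state; the function name is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and does not contribute a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

# 1,383 bp = 461 codons = 460 residues plus one stop codon
print(orf_protein_length(1383))  # 460
```

    Under that assumption the two figures in the abstract agree with each other.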

  11. DNA barcoding reveal patterns of species diversity among northwestern Pacific molluscs

    PubMed Central

    Sun, Shao’e; Li, Qi; Kong, Lingfeng; Yu, Hong; Zheng, Xiaodong; Yu, Ruihai; Dai, Lina; Sun, Yan; Chen, Jun; Liu, Jun; Ni, Lehai; Feng, Yanwei; Yu, Zhenzhen; Zou, Shanmei; Lin, Jiping

    2016-01-01

    This study represents the first comprehensive molecular assessment of northwestern Pacific molluscs. In total, 2801 DNA barcodes belonging to 569 species from China, Japan and Korea were analyzed. An overlap between intra- and interspecific genetic distances was present in 71 species. We tested the efficacy of this library by simulating a sequence-based specimen identification scenario using Best Match (BM), Best Close Match (BCM) and All Species Barcode (ASB) criteria with three threshold values. The BM approach returned 89.15% true identifications (95.27% when excluding singletons). The highest success rate of congruent identifications was obtained with BCM at a threshold of 0.053. The analysis of our barcode library together with public data resulted in 582 Barcode Index Numbers (BINs), 72.2% of which were found to be concordant with morphology-based identifications. The discrepancies fell into two groups: sequences from different species clustered in a single BIN, and conspecific sequences divided into more than one BIN. In the Neighbour-Joining phenogram, 2,320 (83.0%) queries formed 355 (62.4%) species-specific barcode clusters, allowing their successful identification. Thirty-three species showed paraphyly and haplotype sharing. Sixty-two cases are represented by deeply diverged lineages. This study suggests an increased species diversity in this region, highlighting the need for taxonomic revision and a conservation strategy for the cryptic complexes. PMID:27640675
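
    The Best Close Match criterion mentioned above can be illustrated with a short sketch. It follows the usual reading of the criterion (take the nearest reference barcode, accept it only if its distance is within the threshold, and report ties between species as ambiguous); it is not the authors' implementation, and the function name, input layout and return labels are assumptions.

```python
def best_close_match(hits, threshold):
    """Classify one query barcode from its distances to reference barcodes.

    hits: list of (species_name, genetic_distance) pairs (hypothetical layout).
    Returns a species name, "ambiguous", or "no match".
    """
    if not hits:
        return "no match"
    best = min(distance for _, distance in hits)
    if best > threshold:
        # Nearest reference is too distant to trust the identification.
        return "no match"
    nearest_species = {species for species, distance in hits if distance == best}
    if len(nearest_species) > 1:
        # Equally close references from different species.
        return "ambiguous"
    return nearest_species.pop()

# 0.053 is the threshold reported in the abstract; the hits are toy values.
print(best_close_match([("species A", 0.010), ("species B", 0.200)], 0.053))
```

    Dropping the threshold test turns this into the plain Best Match criterion.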

  12. The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.

    PubMed

    Khoe, Clairine V; Chung, Long H; Murray, Vincent

    2018-06-01

    The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
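
    The consensus tetranucleotides quoted above use the degenerate base Y (T or C). A minimal sketch of scanning a sequence for such motifs is given below; the asterisk that marks the damaged base in the abstract's notation is dropped, and the function name and the partial IUPAC table are illustrative assumptions.

```python
import re

# Partial IUPAC nucleotide code table (only the symbols needed here).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}

def find_motif(sequence, motif):
    """Return (0-based start, matched text) for every occurrence of motif.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = "".join(IUPAC[base] for base in motif.upper())
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]

# The two highest-intensity sites reported in the abstract, asterisks removed.
print(find_motif("GTTCCC", "YTCY"))  # [(1, 'TTCC')]
print(find_motif("TTTTCG", "YTTC"))  # [(1, 'TTTC')]
```

    Each reported site contains exactly one occurrence of its proposed consensus.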

  13. DoOP: Databases of Orthologous Promoters, collections of clusters of orthologous upstream sequences from chordates and plants

    PubMed Central

    Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B.; Tóth, Gábor; Ortutay, Csaba P.; Patthy, László

    2005-01-01

    DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21 061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically. PMID:15608291

  14. DoOP: Databases of Orthologous Promoters, collections of clusters of orthologous upstream sequences from chordates and plants.

    PubMed

    Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B; Tóth, Gábor; Ortutay, Csaba P; Patthy, László

    2005-01-01

    DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21,061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically.

  15. A synthetic phylogeny of freshwater crayfish: insights for conservation.

    PubMed

    Owen, Christopher L; Bracken-Grissom, Heather; Stern, David; Crandall, Keith A

    2015-02-19

    Phylogenetic systematics is heading for a renaissance where we shift from considering our phylogenetic estimates as a static image in a published paper and taxonomies as a hardcopy checklist to treating both the phylogenetic estimate and dynamic taxonomies as metadata for further analyses. The Open Tree of Life project (opentreeoflife.org) is developing synthesis tools for harnessing the power of phylogenetic inference and robust taxonomy to develop a synthetic tree of life. We capitalize on this approach to estimate a synthesis tree for the freshwater crayfish. The crayfish make an exceptional group to demonstrate the utility of the synthesis approach, as there recently have been a number of phylogenetic studies on the crayfishes along with a robust underlying taxonomic framework. Importantly, the crayfish have also been extensively assessed by an IUCN Red List team and therefore have accurate and up-to-date area and conservation status data available for analysis within a phylogenetic context. Here, we develop a synthesis phylogeny for the world's freshwater crayfish and examine the phylogenetic distribution of threat. We also estimate a molecular phylogeny based on all available GenBank crayfish sequences and use this tree to estimate divergence times and test for divergence rate variation. Finally, we conduct EDGE and HEDGE analyses and identify a number of species of freshwater crayfish of highest priority in conservation efforts. © 2015 The Author(s) Published by the Royal Society. All rights reserved.

  16. A synthetic phylogeny of freshwater crayfish: insights for conservation

    PubMed Central

    Owen, Christopher L.; Bracken-Grissom, Heather; Stern, David; Crandall, Keith A.

    2015-01-01

    Phylogenetic systematics is heading for a renaissance where we shift from considering our phylogenetic estimates as a static image in a published paper and taxonomies as a hardcopy checklist to treating both the phylogenetic estimate and dynamic taxonomies as metadata for further analyses. The Open Tree of Life project (opentreeoflife.org) is developing synthesis tools for harnessing the power of phylogenetic inference and robust taxonomy to develop a synthetic tree of life. We capitalize on this approach to estimate a synthesis tree for the freshwater crayfish. The crayfish make an exceptional group to demonstrate the utility of the synthesis approach, as there recently have been a number of phylogenetic studies on the crayfishes along with a robust underlying taxonomic framework. Importantly, the crayfish have also been extensively assessed by an IUCN Red List team and therefore have accurate and up-to-date area and conservation status data available for analysis within a phylogenetic context. Here, we develop a synthesis phylogeny for the world's freshwater crayfish and examine the phylogenetic distribution of threat. We also estimate a molecular phylogeny based on all available GenBank crayfish sequences and use this tree to estimate divergence times and test for divergence rate variation. Finally, we conduct EDGE and HEDGE analyses and identify a number of species of freshwater crayfish of highest priority in conservation efforts. PMID:25561670

  17. Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence.

    PubMed

    Akkuratov, Evgeny E; Walters, Lorraine; Saha-Mandal, Arnab; Khandekar, Sushant; Crawford, Erin; Zirbel, Craig L; Leisner, Scott; Prakash, Ashwin; Fedorova, Larisa; Fedorov, Alexei

    2014-09-10

    Orthologous introns have identical positions relative to the coding sequence in orthologous genes of different species. By analyzing the complete genomes of five plants we generated a database of 40,512 orthologous intron groups of dicotyledonous plants, 28,519 orthologous intron groups of angiosperms, and 15,726 of land plants (moss and angiosperms). Multiple sequence alignments of each orthologous intron group were obtained using the Mafft algorithm. The number of conserved regions in plant introns appeared to be hundreds of times fewer than that in mammals or vertebrates. Approximately three quarters of conserved intronic regions among angiosperms and dicots, in particular, correspond to alternatively-spliced exonic sequences. We registered only a handful of conserved intronic ncRNAs of flowering plants. However, the most evolutionarily conserved intronic region, which is ubiquitous for all plants examined in this study, including moss, possessed multiple structural features of tRNAs, which caused us to classify it as a putative tRNA-like ncRNA. Intronic sequences encoding tRNA-like structures are not unique to plants. Bioinformatics examination of the presence of tRNA inside introns revealed an unusually long-term association of four glycine tRNAs inside the Vac14 gene of fish, amniotes, and mammals. Copyright © 2014 Elsevier B.V. All rights reserved.

  18. High-Throughput Sequencing of Arabidopsis microRNAs: Evidence for Frequent Birth and Death of MIRNA Genes

    PubMed Central

    Fahlgren, Noah; Howell, Miya D.; Kasschau, Kristin D.; Chapman, Elisabeth J.; Sullivan, Christopher M.; Cumbie, Jason S.; Givan, Scott A.; Law, Theresa F.; Grant, Sarah R.; Dangl, Jeffery L.; Carrington, James C.

    2007-01-01

    In plants, microRNAs (miRNAs) comprise one of two classes of small RNAs that function primarily as negative regulators at the posttranscriptional level. Several MIRNA genes in the plant kingdom are ancient, with conservation extending between angiosperms and the mosses, whereas many others are more recently evolved. Here, we use deep sequencing and computational methods to identify, profile and analyze non-conserved MIRNA genes in Arabidopsis thaliana. 48 non-conserved MIRNA families, nearly all of which were represented by single genes, were identified. Sequence similarity analyses of miRNA precursor foldback arms revealed evidence for recent evolutionary origin of 16 MIRNA loci through inverted duplication events from protein-coding gene sequences. Interestingly, these recently evolved MIRNA genes have taken distinct paths. Whereas some non-conserved miRNAs interact with and regulate target transcripts from gene families that donated parental sequences, others have drifted to the point of non-interaction with parental gene family transcripts. Some young MIRNA loci clearly originated from one gene family but form miRNAs that target transcripts in another family. We suggest that MIRNA genes are undergoing relatively frequent birth and death, with only a subset being stabilized by integration into regulatory networks. PMID:17299599

  19. Conserved noncoding sequences conserve biological networks and influence genome evolution.

    PubMed

    Xie, Jianbo; Qian, Kecheng; Si, Jingna; Xiao, Liang; Ci, Dong; Zhang, Deqiang

    2018-05-01

    Comparative genomics approaches have identified numerous conserved cis-regulatory sequences near genes in plant genomes. Despite the identification of these conserved noncoding sequences (CNSs), our knowledge of their functional importance and selection remains limited. Here, we used a combination of DNA methylome analysis, microarray expression analyses, and functional annotation to study these sequences in the model tree Populus trichocarpa. Methylation in CG contexts and non-CG contexts was lower in CNSs, particularly CNSs in the 5'-upstream regions of genes, compared with other sites in the genome. We observed that CNSs are enriched in genes with transcription and binding functions, and this also associated with syntenic genes and those from whole-genome duplications, suggesting that cis-regulatory sequences play a key role in genome evolution. We detected a significant positive correlation between CNS number and protein interactions, suggesting that CNSs may have roles in the evolution and maintenance of biological networks. The divergence of CNSs indicates that duplication-degeneration-complementation drives the subfunctionalization of a proportion of duplicated genes from whole-genome duplication. Furthermore, population genomics confirmed that most CNSs are under strong purifying selection and only a small subset of CNSs shows evidence of adaptive evolution. These findings provide a foundation for future studies exploring these key genomic features in the maintenance of biological networks, local adaptation, and transcription.

  20. Chromosome ends: different sequences may provide conserved functions.

    PubMed

    Louis, Edward J; Vershinin, Alexander V

    2005-07-01

    The structures of specific chromosome regions, centromeres and telomeres, present a number of puzzles. As functions performed by these regions are ubiquitous and essential, their DNA, proteins and chromatin structure are expected to be conserved. Recent studies of centromeric DNA from human, Drosophila and plant species have demonstrated that a hidden universal centromere-specific sequence is highly unlikely. The DNA of telomeres is more conserved consisting of a tandemly repeated 6-8 bp Arabidopsis-like sequence in a majority of organisms as diverse as protozoan, fungi, mammals and plants. However, there are alternatives to short DNA repeats at the ends of chromosomes and for telomere elongation by telomerase. Here we focus on the similarities and diversity that exist among the structural elements, DNA sequences and proteins, that make up terminal domains (telomeres and subtelomeres), and how organisms use these in different ways to fulfil the functions of end-replication and end-protection. Copyright (c) 2005 Wiley Periodicals, Inc.

  1. Comparative genomic analysis of the false killer whale (Pseudorca crassidens) LMBR1 locus.

    PubMed

    Kim, Dae-Won; Choi, Sang-Haeng; Kim, Ryong Nam; Kim, Sun-Hong; Paik, Sang-Gi; Nam, Seong-Hyeuk; Kim, Dong-Wook; Kim, Aeri; Kang, Aram; Park, Hong-Seog

    2010-09-01

    The sequencing and comparative genomic analysis of LMBR1 loci in mammals or other species, including human, would be very important in understanding evolutionary genetic changes underlying the evolution of limb development. In this regard, comparative genomic annotation of the false killer whale LMBR1 locus could shed new light on the evolution of limb development. We sequenced two false killer whale BAC clones, corresponding to 156 kb and 144 kb, respectively, harboring the tightly linked RNF32, LMBR1, and NOM1 genes. Our annotation of the false killer whale LMBR1 gene showed that it consists of 17 exons (1473 bp), in contrast to 18 exons (1596 bp) in human, and it displays 93.1% and 95.6% nucleotide and amino acid sequence similarity, respectively, compared with the human gene. In particular, we discovered that exon 10, deleted in the false killer whale LMBR1 gene, is present only in primates, and this fact strongly implies that exon 10 might be crucial in determining primate-specific limb development. ZRS and TFBS sequences have been well conserved across 11 species, suggesting that these regions could be involved in an important function of limb development and limb patterning. The neighboring gene RNF32 showed several lineage-conserved exons, such as exons 2 through 9 conserved in eutherian mammals, exons 3 through 9 conserved in mammals, and exons 5 through 9 conserved in vertebrates. The other neighboring gene, NOM1, had undergone a substitution (ATG→GTA) at the start codon, giving rise to a 36 bp shorter N-terminal sequence compared with the human sequence. Our comparative analysis of the false killer whale LMBR1 genomic locus provides important clues regarding the genetic regions that may play crucial roles in limb development and patterning.

  2. Defining and predicting structurally conserved regions in protein superfamilies

    PubMed Central

    Huang, Ivan K.; Grishin, Nick V.

    2013-01-01

    Motivation: The structures of homologous proteins are generally better conserved than their sequences. This phenomenon is demonstrated by the prevalence of structurally conserved regions (SCRs) even in highly divergent protein families. Defining SCRs requires the comparison of two or more homologous structures and is affected by their availability and divergence, and our ability to deduce structurally equivalent positions among them. In the absence of multiple homologous structures, it is necessary to predict SCRs of a protein using information from only a set of homologous sequences and (if available) a single structure. Accurate SCR predictions can benefit homology modelling and sequence alignment. Results: Using pairwise DaliLite alignments among a set of homologous structures, we devised a simple measure of structural conservation, termed structural conservation index (SCI). SCI was used to distinguish SCRs from non-SCRs. A database of SCRs was compiled from 386 SCOP superfamilies containing 6489 protein domains. Artificial neural networks were then trained to predict SCRs with various features deduced from a single structure and homologous sequences. Assessment of the predictions via a 5-fold cross-validation method revealed that predictions based on features derived from a single structure perform similarly to ones based on homologous sequences, while combining sequence and structural features was optimal in terms of accuracy (0.755) and Matthews correlation coefficient (0.476). These results suggest that even without information from multiple structures, it is still possible to effectively predict SCRs for a protein. Finally, inspection of the structures with the worst predictions pinpoints difficulties in SCR definitions. Availability: The SCR database and the prediction server can be found at http://prodata.swmed.edu/SCR. Contact: 91huangi@gmail.com or grishin@chop.swmed.edu Supplementary information: Supplementary data are available at Bioinformatics Online PMID:23193223
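
    The two evaluation measures reported above, accuracy and the Matthews correlation coefficient (MCC), can be computed from a binary confusion matrix as sketched below. The counts are toy values, not the paper's data, and the function names are hypothetical.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of positions classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Toy confusion matrix: SCR positions are positives, non-SCR positions negatives.
print(accuracy(40, 40, 10, 10))
print(matthews_corrcoef(40, 40, 10, 10))
```

    MCC ranges from -1 to 1 and stays informative when the two classes are unbalanced, which is one reason it is often reported alongside accuracy.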

  3. Insights into the phylogenetic positions of photosynthetic bacteria obtained from 5S rRNA and 16S rRNA sequence data

    NASA Technical Reports Server (NTRS)

    Fox, G. E.

    1985-01-01

    Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.

  4. Characteristics of the Lotus japonicus gene repertoire deduced from large-scale expressed sequence tag (EST) analysis.

    PubMed

    Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

    2004-02-01

    To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.

  5. Determination of the promoter region of mouse ribosomal RNA gene by an in vitro transcription system.

    PubMed Central

    Yamamoto, O; Takakusa, N; Mishima, Y; Kominami, R; Muramatsu, M

    1984-01-01

    Sequences required for a faithful and efficient transcription of a cloned mouse ribosomal RNA gene (rDNA) are determined by testing a series of deletion mutants in an in vitro transcription system utilizing two kinds of mouse cellular extract. Deletion of sequences upstream of -40 or downstream of +52 causes only slight reduction in promoter activity as compared with the "wild-type" template. For upstream deletion mutants, the removal of a sequence between -40 and -35 causes a significant decrease in the capacity to direct efficient initiation. This decrease becomes more pronounced when the deletion reaches -32 and the sequence A-T-C-T-T-T, conserved among mouse, rat, and human rDNAs, is lost. Residual template activity is further reduced as more upstream sequence is deleted and finally becomes undetectable when the deletion is extended from -22 down to -17, corresponding to the loss of the conserved sequence T-A-T-T-G. As for downstream deletion mutants, the removal of the sequence downstream of +23 causes some decrease in template activity in vitro, and further deletions up to +11 cause a more serious decrease. These deletions involve other conserved sequences downstream of the transcription start site. However, the removal of the original transcription start site does not abolish the transcription initiation completely, provided that the whole upstream sequence is intact. PMID:6320178

  6. Myxobolus cerebralis internal transcribed spacer 1 (ITS-1) sequences support recent spread of the parasite to North America and within Europe

    USGS Publications Warehouse

    Whipps, Christopher M.; El-Matbouli, M.; Hedrick, R.P.; Blazer, V.; Kent, M.L.

    2004-01-01

    Molecular approaches for resolving relationships among the Myxozoa have relied mainly on small subunit (SSU) ribosomal DNA (rDNA) sequence analysis. This region of the gene is generally used for higher phylogenetic studies, and the conservative nature of this gene may make it inadequate for intraspecific comparisons. Previous intraspecific studies of Myxobolus cerebralis based on molecular analyses reported that the sequence of SSU rDNA and the internal transcribed spacer (ITS) were highly conserved in representatives of the parasite from North America and Europe. Considering that the ITS is usually a more variable region than the SSU, we reanalyzed available sequences on GenBank and obtained sequences from other M. cerebralis representatives from the states of California and West Virginia in the USA and from Germany and Russia. With the exception of 7 base pairs, most of the sequence designated as ITS-1 in GenBank was a highly conserved portion of the rDNA near the 3-prime end of the SSU region. Nonetheless, the additional ITS-1 sequences obtained from the available geographic representatives were well conserved. It is unlikely that we would have observed virtually identical ITS-1 sequences between European and American M. cerebralis samples had it spread naturally over time, particularly when compared to the variation seen between isolates of another myxozoan (Kudoa thyrsites) that has most likely spread naturally. These data further support the hypothesis that the current distribution of M. cerebralis in North America is a result of recent introductions followed by dispersal via anthropogenic means, largely through the stocking of infected trout for sport fishing.

  7. DsaV methyltransferase and its isoschizomers contain a conserved segment that is similar to the segment in HhaI methyltransferase that is in contact with DNA bases.

    PubMed Central

    Gopal, J; Yebra, M J; Bhagwat, A S

    1994-01-01

    The methyltransferase (MTase) in the DsaV restriction-modification system methylates within 5'-CCNGG sequences. We have cloned the gene for this MTase and determined its sequence. The predicted sequence of the MTase protein contains sequence motifs conserved among all cytosine-5 MTases and is most similar to other MTases that methylate CCNGG sequences, namely M.ScrFI and M.SsoII. All three MTases methylate the internal cytosine within their recognition sequence. The 'variable' region within the three enzymes that methylate CCNGG can be aligned with the sequences of two enzymes that methylate CCWGG sequences. Remarkably, two segments within this region contain significant similarity with the region of M.HhaI that is known to contact DNA bases. These alignments suggest that many cytosine-5 MTases are likely to interact with DNA using a similar structural framework. PMID:7971279
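
    The recognition sequences named in this abstract (CCNGG and CCWGG) use IUPAC ambiguity codes, where N stands for any base and W for A or T. The following is a minimal Python sketch of how such sites could be located in a DNA string. The function name `find_sites` and the example sequence are made up for illustration and are not taken from the paper.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes (only the subset needed here).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def find_sites(dna, motif):
    """Return 0-based start positions of every match of an IUPAC motif (hypothetical helper)."""
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # The lookahead allows overlapping sites to be reported as well.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", dna.upper())]


example = "TTCCAGGAACCGGGTT"  # made-up sequence, for illustration only
ccngg_sites = find_sites(example, "CCNGG")  # sites with any central base
ccwgg_sites = find_sites(example, "CCWGG")  # sites with A or T in the centre
print(ccngg_sites, ccwgg_sites)
```

    In this toy sequence the site CCAGG matches both motifs, whereas CCGGG matches only CCNGG, which mirrors the difference in specificity between the two enzyme groups.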

  8. Spatial heterogeneity in the Mediterranean Biodiversity Hotspot affects barcoding accuracy of its freshwater fishes.

    PubMed

    Geiger, M F; Herder, F; Monaghan, M T; Almada, V; Barbieri, R; Bariche, M; Berrebi, P; Bohlen, J; Casal-Lopez, M; Delmastro, G B; Denys, G P J; Dettai, A; Doadrio, I; Kalogianni, E; Kärst, H; Kottelat, M; Kovačić, M; Laporte, M; Lorenzoni, M; Marčić, Z; Özuluğ, M; Perdices, A; Perea, S; Persat, H; Porcelotti, S; Puzzi, C; Robalo, J; Šanda, R; Schneider, M; Šlechtová, V; Stoumboudi, M; Walter, S; Freyhof, J

    2014-11-01

    Incomplete knowledge of biodiversity remains a stumbling block for conservation planning and even occurs within globally important Biodiversity Hotspots (BH). Although technical advances have boosted the power of molecular biodiversity assessments, the link between DNA sequences and species and the analytics to discriminate entities remain crucial. Here, we present an analysis of the first DNA barcode library for the freshwater fish fauna of the Mediterranean BH (526 spp.), with virtually complete species coverage (498 spp., 98% extant species). In order to build an identification system supporting conservation, we compared species determination by taxonomists to multiple clustering analyses of DNA barcodes for 3165 specimens. The congruence of barcode clusters with morphological determination was strongly dependent on the method of cluster delineation, but was highest with the general mixed Yule-coalescent (GMYC) model-based approach (83% of all species recovered as GMYC entity). Overall, genetic morphological discontinuities suggest the existence of up to 64 previously unrecognized candidate species. We found reduced identification accuracy when using the entire DNA-barcode database, compared with analyses on databases for individual river catchments. This scale effect has important implications for barcoding assessments and suggests that fairly simple identification pipelines provide sufficient resolution in local applications. We calculated Evolutionarily Distinct and Globally Endangered scores in order to identify candidate species for conservation priority and argue that the evolutionary content of barcode data can be used to detect priority species for future IUCN assessments. We show that large-scale barcoding inventories of complex biotas are feasible and contribute directly to the evaluation of conservation priorities. © 2014 John Wiley & Sons Ltd.

  9. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni

    PubMed Central

    2012-01-01

    Background: MicroRNAs (miRNAs) constitute a family of small RNA (sRNA) population that regulates the gene expression and plays an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana, Oryza sativa etc. and volume of the miRNA database, miRBase, has been increasing on day to day basis. Stevia rebaudiana Bertoni is an important perennial herb which accumulates high concentrations of diterpene steviol glycosides which contributes to its high indexed sweetening property with no calorific value. Several studies have been carried out for understanding molecular mechanism involved in biosynthesis of these glycosides, however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of availability of genome sequence data. Results: To identify miRNAs in S. rebaudiana, sRNA library was constructed and sequenced using Illumina genome analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. Also, we identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. All novel sequences have not been earlier described in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNA encoding enzymes regulating essential plant metabolic and signaling pathways. Conclusions: This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs indicating that specific miRNAs exist in stevia species. Our results provided information on stevia miRNAs and their targets building a foundation for future studies to understand their roles in key stevia traits. PMID:23116282

  10. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni.

    PubMed

    Mandhan, Vibha; Kaur, Jagdeep; Singh, Kashmir

    2012-11-01

    MicroRNAs (miRNAs) constitute a family of small RNA (sRNA) population that regulates the gene expression and plays an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana, Oryza sativa etc. and volume of the miRNA database, miRBase, has been increasing on day to day basis. Stevia rebaudiana Bertoni is an important perennial herb which accumulates high concentrations of diterpene steviol glycosides which contributes to its high indexed sweetening property with no calorific value. Several studies have been carried out for understanding molecular mechanism involved in biosynthesis of these glycosides, however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of availability of genome sequence data. To identify miRNAs in S. rebaudiana, sRNA library was constructed and sequenced using Illumina genome analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. Also, we identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. All novel sequences have not been earlier described in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNA encoding enzymes regulating essential plant metabolic and signaling pathways. This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs indicating that specific miRNAs exist in stevia species. Our results provided information on stevia miRNAs and their targets building a foundation for future studies to understand their roles in key stevia traits.

  11. Sequence analysis of the L protein of the Ebola 2014 outbreak: Insight into conserved regions and mutations.

    PubMed

    Ayub, Gohar; Waheed, Yasir

    2016-06-01

    The 2014 Ebola outbreak was one of the largest that have occurred; it started in Guinea and spread to Nigeria, Liberia and Sierra Leone. Phylogenetic analysis of the current virus species indicated that this outbreak is the result of a divergent lineage of the Zaire ebolavirus. The L protein of Ebola virus (EBOV) is the catalytic subunit of the RNA‑dependent RNA polymerase complex, which, with VP35, is key for the replication and transcription of viral RNA. Earlier sequence analysis demonstrated that the L protein of all non‑segmented negative‑sense (NNS) RNA viruses consists of six domains containing conserved functional motifs. The aim of the present study was to analyze the presence of these motifs in 2014 EBOV isolates, highlight their function and how they may contribute to the overall pathogenicity of the isolates. For this purpose, 81 2014 EBOV L protein sequences were aligned with 475 other NNS RNA viruses, including Paramyxoviridae and Rhabdoviridae viruses. Phylogenetic analysis of all EBOV outbreak L protein sequences was also performed. Analysis of the amino acid substitutions in the 2014 EBOV outbreak was conducted using sequence analysis. The alignment demonstrated the presence of previously conserved motifs in the 2014 EBOV isolates and novel residues. Notably, all the mutations identified in the 2014 EBOV isolates were tolerant, they were pathogenic with certain examples occurring within previously determined functional conserved motifs, possibly altering viral pathogenicity, replication and virulence. The phylogenetic analysis demonstrated that all sequences with the exception of the 2014 EBOV sequences were clustered together. The 2014 EBOV outbreak has acquired a great number of mutations, which may explain the reasons behind this unprecedented outbreak. Certain residues critical to the function of the polymerase remain conserved and may be targets for the development of antiviral therapeutic agents.

  12. Transposon Mutagenesis of the Zika Virus Genome Highlights Regions Essential for RNA Replication and Restricted for Immune Evasion.

    PubMed

    Fulton, Benjamin O; Sachs, David; Schwarz, Megan C; Palese, Peter; Evans, Matthew J

    2017-08-01

    The molecular constraints affecting Zika virus (ZIKV) evolution are not well understood. To investigate ZIKV genetic flexibility, we used transposon mutagenesis to add 15-nucleotide insertions throughout the ZIKV MR766 genome and subsequently deep sequenced the viable mutants. Few ZIKV insertion mutants replicated, which likely reflects a high degree of functional constraints on the genome. The NS1 gene exhibited distinct mutational tolerances at different stages of the screen. This result may define regions of the NS1 protein that are required for the different stages of the viral life cycle. The ZIKV structural genes showed the highest degree of insertional tolerance. Although the envelope (E) protein exhibited particular flexibility, the highly conserved envelope domain II (EDII) fusion loop of the E protein was intolerant of transposon insertions. The fusion loop is also a target of pan-flavivirus antibodies that are generated against other flaviviruses and neutralize a broad range of dengue virus and ZIKV isolates. The genetic restrictions identified within the epitopes in the EDII fusion loop likely explain the sequence and antigenic conservation of these regions in ZIKV and among multiple flaviviruses. Thus, our results provide insights into the genetic restrictions on ZIKV that may affect the evolution of this virus. IMPORTANCE: Zika virus recently emerged as a significant human pathogen. Determining the genetic constraints on Zika virus is important for understanding the factors affecting viral evolution. We used a genome-wide transposon mutagenesis screen to identify where mutations were tolerated in replicating viruses. We found that the genetic regions involved in RNA replication were mostly intolerant of mutations. The genes coding for structural proteins were more permissive to mutations. Despite the flexibility observed in these regions, we found that epitopes bound by broadly reactive antibodies were genetically constrained. This finding may explain the genetic conservation of these epitopes among flaviviruses. Copyright © 2017 American Society for Microbiology.

  13. Evolution of DUF1313 family members across plant species and their association with maize photoperiod sensitivity.

    PubMed

    Li, Jia; Hu, Erliang; Chen, Xueying; Xu, Jie; Lan, Hai; Li, Chuan; Hu, Yaodong; Lu, Yanli

    2016-05-01

    Proteins of the DUF1313 family contain a highly conserved domain and are only found in plants; they play important roles in most plant functions. In this study, 269 DUF1313 genes from 81 photoautotrophic species were identified; they were classified into three major types based on the amino acid substitutions in the conserved region: IARV, I(S/T/F)(K/R)V, and IRRV. Phylogenic tree constructed from 51 DUF1313 genes from graminoids revealed three clades: A, B1, and B2. Clade B1 was found to have undergone episodic positive selection after a gene duplication event and included four amino acid sites under positive selection. The association between DUF1313 family members and traits investigated in maize indicated that three of four genes (GRMZM2G025646, GRMZM5G877647, GRMZM2G359322, and GRMZM2G382774) were associated with the target traits such as days to silking, days to tasselling, and plant height. The nucleotide diversity of the most primitive and highly conserved DUF1313 gene, ELF4-like4, was the highest in Tripsacum and the lowest in maize. Tajima's D and Fu and Li's D tests revealed that significant purifying selection had occurred in the coding sequence region of this DUF1313 gene in teosinte and maize. No significant signal was detected in the 5'-untranslated region of this gene in each of the three species (maize, teosinte, and Tripsacum) or in any gene regions of Tripsacum. Phylogenetic analyses revealed that the 103 accessions of maize, teosinte, and Tripsacum can be grouped into four clades based on the ELF4-like4 gene sequence similarity. Thus, this gene can be used to determine the relationships between maize and its relatives, and the DUF1313 family members and alleles identified in this study might be valuable genetic resources for molecular marker-assisted breeding in maize. Copyright © 2016 Elsevier Inc. All rights reserved.
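
    The abstract reports Tajima's D among its neutrality tests. As a rough illustration of what that statistic measures, the sketch below computes Tajima's D from a set of aligned, equal-length sequences using the standard published formula. The toy alignments are invented, the function name `tajimas_d` is an assumption, and gaps, missing data, and significance testing are not handled. It is not the software used in the study.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length sequences (illustrative sketch only)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("the variance term is zero for fewer than four sequences")
    # S: number of alignment columns that contain more than one state.
    seg_sites = sum(len(set(col)) > 1 for col in zip(*sequences))
    if seg_sites == 0:
        return None  # D is undefined without segregating sites
    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)
    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


rare_variant = ["ACG", "ACG", "ACG", "ACT"]      # one singleton: excess of rare alleles
balanced_variant = ["ACG", "ACG", "ACT", "ACT"]  # intermediate-frequency alleles
print(tajimas_d(rare_variant), tajimas_d(balanced_variant))
```

    A negative value reflects an excess of rare variants, a pattern that is consistent with purifying selection but can also arise from population expansion, so the sign alone does not identify the cause.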

  14. Detection of hyper-conserved regions in hepatitis B virus X gene potentially useful for gene therapy.

    PubMed

    González, Carolina; Tabernero, David; Cortese, Maria Francesca; Gregori, Josep; Casillas, Rosario; Riveiro-Barciela, Mar; Godoy, Cristina; Sopena, Sara; Rando, Ariadna; Yll, Marçal; Lopez-Martinez, Rosa; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco

    2018-05-21

    To detect hyper-conserved regions in the hepatitis B virus (HBV) X gene (HBX) 5' region that could be candidates for gene therapy. The study included 27 chronic hepatitis B treatment-naive patients in various clinical stages (from chronic infection to cirrhosis and hepatocellular carcinoma, both HBeAg-negative and HBeAg-positive), and infected with HBV genotypes A-F and H. In a serum sample from each patient with viremia > 3.5 log IU/mL, the HBX 5' end region [nucleotide (nt) 1255-1611] was PCR-amplified and submitted to next-generation sequencing (NGS). We assessed genotype variants by phylogenetic analysis, and evaluated conservation of this region by calculating the information content of each nucleotide position in a multiple alignment of all unique sequences (haplotypes) obtained by NGS. Conservation at the HBx protein amino acid (aa) level was also analyzed. NGS yielded 1333069 sequences from the 27 samples, with a median of 4578 sequences/sample (2487-9279, IQR 2817). In 14/27 patients (51.8%), phylogenetic analysis of viral nucleotide haplotypes showed a complex mixture of genotypic variants. Analysis of the information content in the haplotype multiple alignments detected 2 hyper-conserved nucleotide regions, one in the HBX upstream non-coding region (nt 1255-1286) and the other in the 5' end coding region (nt 1519-1603). This last region coded for a conserved amino acid region (aa 63-76) that partially overlaps a Kunitz-like domain. Two hyper-conserved regions detected in the HBX 5' end may be of value for targeted gene therapy, regardless of the patients' clinical stage or HBV genotype.
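
    The conservation measure mentioned above, the information content of each position in a multiple alignment, can be sketched as follows. This is a minimal Python illustration that assumes the common definition (log2 of the alphabet size minus the Shannon entropy of the column, giving 0 to 2 bits for nucleotides). The abstract does not state the exact variant the authors used, and the toy haplotypes are invented. Gaps, haplotype frequency weighting, and small-sample corrections are ignored.

```python
import math
from collections import Counter


def information_content(column, alphabet_size=4):
    """Bits of information in one alignment column: log2(alphabet size) minus Shannon entropy."""
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def per_position_ic(alignment):
    """Information content for every column of an equal-length alignment."""
    return [information_content(col) for col in zip(*alignment)]


haplotypes = ["ACGT", "ACGA", "ACTA", "ACTC"]  # toy alignment, not real HBV data
ic = per_position_ic(haplotypes)
print(ic)  # fully conserved columns reach 2 bits; variable columns score lower
```

    Runs of consecutive positions at or near the 2-bit maximum would be candidates for the kind of hyper-conserved region the abstract describes, subject to whatever threshold and window length an analysis adopts.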

  15. Genomic structure of the human D-site binding protein (DBP) gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shutler, G.; Glassco, T.; Kang, Xiaolin

    1996-06-15

    The human gene for the D-Site Binding Protein (DBP) has been sequenced and characterized. This gene is a member of the b/ZIP family of transcription factors and is one of three genes forming the PAR sub-family. DBP has been implicated in the diurnal regulation of a variety of liver-specific genes. Examination of the genomic structure of DBP reveals that the gene is divided into four exons and is contained within a relatively compact region of approximately 6 kb. These exons appear to correspond to functional divisions of the DBP protein. Exon 1 contains a long 5' UTR, and conservation between the rat and the human genes of the presence of small open reading frames within this region suggests that it may play a role in translational control. Exon 2 contains a limited region of similarity to the other PAR domain genes, which may be part of a potential activation domain. Exon 3 contains the PAR domain and differs by only 1 of 71 amino acids between rat and human. Exon 4, containing both the basic and the leucine zipper domains, is likewise highly conserved. The overall degree of homology between the rat and the human cDNA sequences is 82% for the nucleic acid sequence and 92% for the protein sequence. Comparison of the rat and human proximal promoters reveals extensive sequence conservation, with two previously characterized DNA binding sites being conserved at the functional and sequence levels. 31 refs., 4 figs.
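
    The percentages quoted in this abstract are pairwise identity figures. As a small worked illustration, the Python sketch below computes percent identity for two already-aligned sequences and applies the same arithmetic to the exon 3 figure (1 difference in 71 amino acids). The function name and the short test strings are hypothetical, and alignment gaps are not treated specially.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Exon 3 figure from the abstract: 1 differing residue out of 71.
exon3_identity = 100.0 * (71 - 1) / 71
print(round(exon3_identity, 1))

# Made-up strings, only to show the function call.
print(percent_identity("ACGT", "ACGA"))
```

    On these numbers the PAR domain is roughly 98.6% identical between rat and human, noticeably higher than the 92% reported for the protein as a whole.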

  16. Natural selection of the major histocompatibility complex (Mhc) in Hawaiian honeycreepers (Drepanidinae)

    USGS Publications Warehouse

    Jarvi, S.I.; Tarr, C.L.; Mcintosh, C.E.; Atkinson, C.T.; Fleischer, R.C.

    2004-01-01

    The native Hawaiian honeycreepers represent a classic example of adaptive radiation and speciation, but currently face one of the highest extinction rates in the world. Although multiple factors have likely influenced the fate of Hawaiian birds, the relatively recent introduction of avian malaria is thought to be a major factor limiting honeycreeper distribution and abundance. We have initiated genetic analyses of class II β chain Mhc genes in four species of honeycreepers using methods that eliminate the possibility of sequencing mosaic variants formed by cloning heteroduplexed polymerase chain reaction products. Phylogenetic analyses group the honeycreeper Mhc sequences into two distinct clusters. Variation within one cluster is high, with dN > dS and levels of diversity similar to other studies of Mhc (B system) genes in birds. The second cluster is nearly invariant and includes sequences from honeycreepers (Fringillidae), a sparrow (Emberizidae) and a blackbird (Emberizidae). This highly conserved cluster appears reminiscent of the independently segregating Rfp-Y system of genes defined in chickens. The notion that balancing selection operates at the Mhc in the honeycreepers is supported by trans-species polymorphism and strikingly high dN/dS ratios at codons putatively involved in peptide interaction. Mitochondrial DNA control region sequences were invariant in the i'iwi, but were highly variable in the 'amakihi. By contrast, levels of variability of class II β chain Mhc sequence codons that are hypothesized to be directly involved in peptide interactions appear comparable between i'iwi and 'amakihi. In the i'iwi, natural selection may have maintained variation within the Mhc, even in the face of what appears to be a genetic bottleneck.

  17. Whole-genome sequencing approaches for conservation biology: Advantages, limitations and practical recommendations.

    PubMed

    Fuentes-Pardo, Angela P; Ruzzante, Daniel E

    2017-10-01

    Whole-genome resequencing (WGR) is a powerful method for addressing fundamental evolutionary biology questions that have not been fully resolved using traditional methods. WGR includes four approaches: the sequencing of individuals to a high depth of coverage with either unresolved or resolved haplotypes, the sequencing of population genomes to a high depth by mixing equimolar amounts of unlabelled-individual DNA (Pool-seq) and the sequencing of multiple individuals from a population to a low depth (lcWGR). These techniques require the availability of a reference genome. This, along with the still high cost of shotgun sequencing and the large demand for computing resources and storage, has limited their implementation in nonmodel species with scarce genomic resources and in fields such as conservation biology. Our goal here is to describe the various WGR methods, their pros and cons and potential applications in conservation biology. WGR offers an unprecedented marker density and surveys a wide diversity of genetic variations not limited to single nucleotide polymorphisms (e.g., structural variants and mutations in regulatory elements), increasing their power for the detection of signatures of selection and local adaptation as well as for the identification of the genetic basis of phenotypic traits and diseases. Currently, though, no single WGR approach fulfils all requirements of conservation genetics, and each method has its own limitations and sources of potential bias. We discuss proposed ways to minimize such biases. We envision a not distant future where the analysis of whole genomes becomes a routine task in many nonmodel species and fields including conservation biology. © 2017 John Wiley & Sons Ltd.

  18. The complete mitochondrial genome of Lota lota (Gadiformes: Gadidae) from the Burqin River in China.

    PubMed

    Lu, Zhichuang; Zhang, Nan; Song, Na; Gao, Tianxiang

    2016-05-01

    In this study, the complete mitochondrial genome (mitogenome) sequence of Lota lota has been determined by long polymerase chain reaction and primer walking methods. The mitogenome is a circular molecule of 16,519 bp in length and contains 37 mitochondrial genes including 13 protein-coding genes, 2 ribosomal RNA (rRNA), 22 transfer RNA (tRNA) and a control region as other bony fishes. Within the control region, we identified the termination-associated sequence domain (TAS), the central conserved sequence block domains (CSB-F and CSB-D), and the conserved sequence block domains (CSB-1, CSB-2 and CSB-3).

  19. Hepatitis C virus genotypes in Singapore and Indonesia.

    PubMed

    Ng, W C; Guan, R; Tan, M F; Seet, B L; Lim, C A; Ngiam, C M; Sjaifoellah Noer, H M; Lesmana, L

    1995-01-01

    5' untranslated and partial core (C) region sequence of hepatitis C virus (HCV) in 21 Singaporean and 15 Indonesian isolates were amplified by reverse-transcription polymerase chain reaction and sequenced with the use of conserved primer sequences deduced from HCV genomes identified in other geographical regions. The HCV genotypes are predominantly that of Simmonds type 1 and less of type 2 and 3 with the latter genotype currently not detected in Indonesia. The 5' untranslated sequences are related to HCV-1, DK-7 (Denmark), US-11 (United States of America), HCV-J4, SA-10 (South Africa), T-3 (Taiwan), HCV-J6, HCV-J8, Eb-1 and Eb-8. When compared with the prototype HCV-1, insertions are found within the 5' untranslated region of Singaporean isolates and not in the Indonesians. There are Singaporean and Indonesian isolates that have sequences within the 5' untranslated region that differ slightly from each other. Microheterogeneity is observed in the core region of two Singaporeans and one Indonesian isolate. Finally, not all HCV isolates can be amplified with the conserved core sequence primers when compared with the ease with which these isolates can be amplified with 5' untranslated region conserved primers.

  20. Tissue-specific DNA methylation is conserved across human, mouse, and rat, and driven by primary sequence conservation.

    PubMed

    Zhou, Jia; Sears, Renee L; Xing, Xiaoyun; Zhang, Bo; Li, Daofeng; Rockweiler, Nicole B; Jang, Hyo Sik; Choudhary, Mayank N K; Lee, Hyung Joo; Lowdon, Rebecca F; Arand, Jason; Tabers, Brianne; Gu, C Charles; Cicero, Theodore J; Wang, Ting

    2017-09-12

    Uncovering mechanisms of epigenome evolution is an essential step towards understanding the evolution of different cellular phenotypes. While studies have confirmed DNA methylation as a conserved epigenetic mechanism in mammalian development, little is known about the conservation of tissue-specific genome-wide DNA methylation patterns. Using a comparative epigenomics approach, we identified and compared the tissue-specific DNA methylation patterns of rat against those of mouse and human across three shared tissue types. We confirmed that tissue-specific differentially methylated regions are strongly associated with tissue-specific regulatory elements. Comparisons between species revealed that at a minimum 11-37% of tissue-specific DNA methylation patterns are conserved, a phenomenon that we define as epigenetic conservation. Conserved DNA methylation is accompanied by conservation of other epigenetic marks including histone modifications. Although a significant amount of locus-specific methylation is epigenetically conserved, the majority of tissue-specific DNA methylation is not conserved across the species and tissue types that we investigated. Examination of the genetic underpinning of epigenetic conservation suggests that primary sequence conservation is a driving force behind epigenetic conservation. In contrast, evolutionary dynamics of tissue-specific DNA methylation are best explained by the maintenance or turnover of binding sites for important transcription factors. Our study extends the limited literature of comparative epigenomics and suggests a new paradigm for epigenetic conservation without genetic conservation through analysis of transcription factor binding sites.

  1. The CD8α gene in duck (Anatidae): cloning, characterization, and expression during viral infection.

    PubMed

    Xu, Qi; Chen, Yang; Zhao, Wen Ming; Huang, Zheng Yang; Duan, Xiu Jun; Tong, Yi Yu; Zhang, Yang; Li, Xiu; Chang, Guo Bin; Chen, Guo Hong

    2015-02-01

    Cluster of differentiation 8 alpha (CD8α) is critical for cell-mediated immune defense and T-cell development. Although CD8α sequences have been reported for several species, very little is known about CD8α in ducks. To elucidate the mechanisms involved in the innate and adaptive immune responses of ducks, we cloned CD8α coding sequences from domestic, Muscovy, Mallard, and Spotbill ducks using reverse transcription polymerase chain reaction (RT-PCR). Each sequence consisted of 714 nucleotides and encoded a signal peptide, an IgV-like domain, a stalk region, a transmembrane region, and a cytoplasmic tail. We identified 58 nucleotide differences and 37 amino acid differences among the four types of duck; of these, 53 nucleotide and 33 amino acid differences were between Muscovy ducks and the other duck species. The CD8α cDNA sequence from domestic duck consisted of a 61-nucleotide 5' untranslated region (UTR), a 714-nucleotide open reading frame, and an 849-nucleotide 3' UTR. Multiple sequence alignments showed that the amino acid sequence of CD8α is conserved in vertebrates. RT-PCR revealed that expression of CD8α mRNA of domestic ducks was highest in the thymus and very low in the kidney, cerebrum, cerebellum, and muscle. Immunohistochemical analyses detected CD8α on the splenic corpuscle and periarterial lymphatic sheath of the spleen. CD8α mRNA in domestic ducklings was initially up-regulated, and then down-regulated, in the thymus, spleen, and liver after treatment with duck hepatitis virus type I (DHV-1) or the immunostimulant polyriboinosinic polyribocytidylic acid (poly I:C).

  2. Cloning and characterization of the SERK1 gene in triploid Pingyi Tiancha [Malus hupehensis (Pamp.) Rehd. var. pingyiensis Jiang] and a tetraploid hybrid strain.

    PubMed

    Zhang, L J; Dong, W X; Guo, S M; Wang, Y X; Wang, A D; Lu, X J

    2015-11-19

    This study aims to explore the roles of somatic embryogenesis receptor-like kinase (SERK) in Malus hupehensis (Pingyi Tiancha). The full-length sequences of SERK1 in triploid Pingyi Tiancha (3n) and a tetraploid hybrid strain 33# (4n) were cloned, sequenced, and designated as MhSERK1 and MhdSERK1, respectively. Multiple alignments of amino acid sequences were conducted to identify similarity between MhSERK1 and MhdSERK1 and SERK sequences in other species, and a neighbor-joining phylogenetic tree was constructed to elucidate their phylogenetic relations. Expression levels of MhSERK1 and MhdSERK1 in different tissues and developmental stages were investigated using quantitative real-time PCR. The coding sequence lengths of MhSERK1 and MhdSERK1 were 1899 bp (encoding 632 amino acids) and 1881 bp (encoding 626 amino acids), respectively. Sequence analysis demonstrated that MhSERK1 and MhdSERK1 display high similarity to SERKs in other species, with a conserved intron/exon structure that is unique to members of the SERK family. Additionally, the phylogenetic tree showed that MhSERK1 and MhdSERK1 clustered with orange CitSERK (93%). Furthermore, MhSERK1 and MhdSERK1 were mainly expressed in the reproductive organs, in particular the ovary. Their expression levels were highest in young flowers and they differed among different tissues and organs. Our results suggest that MhSERK1 and MhdSERK1 are related to plant reproduction, and that MhSERK1 is related to apomixis in triploid Pingyi Tiancha.
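
    The coding sequence lengths and protein lengths quoted above are related by simple codon arithmetic. The sketch below checks that relationship in Python, on the assumption (not stated in the abstract) that each reported coding sequence length includes a single terminal stop codon. The function name is hypothetical.

```python
def residues_from_cds(cds_length_bp):
    """Amino acids encoded by a coding sequence whose length includes one stop codon."""
    if cds_length_bp % 3 != 0:
        raise ValueError("coding sequence length should be a multiple of three")
    return cds_length_bp // 3 - 1  # subtract the stop codon, which encodes no residue


mhserk1_residues = residues_from_cds(1899)   # MhSERK1
mhdserk1_residues = residues_from_cds(1881)  # MhdSERK1
print(mhserk1_residues, mhdserk1_residues)
```

    Under that assumption the two lengths give 632 and 626 residues, which agrees with the figures in the abstract, and the 18 bp difference between the two coding sequences corresponds to six codons.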

  3. Molecular characterization and epidemic history of hepatitis C virus using core sequences of isolates from Central Province, Saudi Arabia.

    PubMed

    Shier, Medhat K; Iles, James C; El-Wetidy, Mohammad S; Ali, Hebatallah H; Al Qattan, Mohammad M

    2017-01-01

    The source of HCV transmission in Saudi Arabia is unknown. This study aimed to determine HCV genotypes in a representative sample of chronically infected patients in Saudi Arabia. All HCV isolates were genotyped and subtyped by sequencing of the HCV core region and 54 new HCV isolates were identified. Three sets of primers targeting the core region were used for both amplification and sequencing of all isolates resulting in a 326 bp fragment. Most HCV isolates were genotype 4 (85%), whereas only a few isolates were recognized as genotype 1 (15%). With the assistance of GenBank database and BLAST, subtyping results showed that most of genotype 4 isolates were 4d whereas most of genotype 1 isolates were 1b. Nucleotide conservation and variation rates of HCV core sequences showed that 4a and 1b have the highest levels of variation. Phylogenetic analysis of sequences by Maximum Likelihood and Bayesian Coalescent methods was used to explore the source of HCV transmission by investigating the relationship between Saudi Arabia and other countries in the Middle East and Africa. Coalescent analysis showed that transmissions of HCV from Egypt to Saudi Arabia are estimated to have occurred in three major clusters: 4d was introduced into the country before 1900, the major 4a clade's MRCA was introduced between 1900 and 1920, and the remaining lineages were introduced between 1940 and 1960 from Egypt and Middle Africa. Results showed that no lineages seem to have crossed from Egypt to Saudi Arabia in the last 15 years. Finally, sequencing and characterization of new HCV isolates from Saudi Arabia will enrich the HCV database and help further studies related to treatment and management of the virus.

  4. Molecular characterization and epidemic history of hepatitis C virus using core sequences of isolates from Central Province, Saudi Arabia

    PubMed Central

    Shier, Medhat K.; Iles, James C.; El-Wetidy, Mohammad S.; Ali, Hebatallah H.; Al Qattan, Mohammad M.

    2017-01-01

    The source of HCV transmission in Saudi Arabia is unknown. This study aimed to determine HCV genotypes in a representative sample of chronically infected patients in Saudi Arabia. All HCV isolates were genotyped and subtyped by sequencing of the HCV core region and 54 new HCV isolates were identified. Three sets of primers targeting the core region were used for both amplification and sequencing of all isolates resulting in a 326 bp fragment. Most HCV isolates were genotype 4 (85%), whereas only a few isolates were recognized as genotype 1 (15%). With the assistance of Genbank database and BLAST, subtyping results showed that most of genotype 4 isolates were 4d whereas most of genotype 1 isolates were 1b. Nucleotide conservation and variation rates of HCV core sequences showed that 4a and 1b have the highest levels of variation. Phylogenetic analysis of sequences by Maximum Likelihood and Bayesian Coalescent methods was used to explore the source of HCV transmission by investigating the relationship between Saudi Arabia and other countries in the Middle East and Africa. Coalescent analysis showed that transmissions of HCV from Egypt to Saudi Arabia are estimated to have occurred in three major clusters: 4d was introduced into the country before 1900, the major 4a clade’s MRCA was introduced between 1900 and 1920, and the remaining lineages were introduced between 1940 and 1960 from Egypt and Middle Africa. Results showed that no lineages seem to have crossed from Egypt to Saudi Arabia in the last 15 years. Finally, sequencing and characterization of new HCV isolates from Saudi Arabia will enrich the HCV database and help further studies related to treatment and management of the virus. PMID:28863156

  5. Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation | Center for Cancer Research

    Cancer.gov

    Dubbed "Tom's T" by Dhruba Chattoraj, the unusually conserved thymine at position +7 in bacteriophage P1 plasmid RepA DNA binding sites rises above repressor and acceptor sequence logos. The T appears to represent base flipping prior to helix opening in this DNA replication initiation protein.

  6. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

    PubMed Central

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-01-01

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363

  7. Comparative analysis on the structural features of the 5' flanking region of κ-casein genes from six different species

    PubMed Central

    Gerencsér, Ákos; Barta, Endre; Boa, Simon; Kastanis, Petros; Bösze, Zsuzsanna; Whitelaw, C Bruce A

    2002-01-01

    κ-casein plays an essential role in the formation, stabilisation and aggregation of milk micelles. Control of κ-casein expression reflects this essential role, although an understanding of the mechanisms involved lags behind that of the other milk protein genes. We determined the 5'-flanking sequences for the murine, rabbit and human κ-casein genes and compared them to the published ruminant sequences. The most conserved region was not the proximal promoter region but an approximately 400 bp long region centred 800 bp upstream of the TATA box. This region contained two highly conserved MGF/STAT5 sites with common spacing relative to each other. In this region, six conserved short stretches of similarity were also found which did not correspond to known transcription factor consensus sites. In contrast to the ruminant and human 5' regulatory sequences, the rabbit and murine 5'-flanking regions did not harbour any kind of repetitive elements. We generated a phylogenetic tree of the six species based on multiple alignment of the κ-casein sequences. This study identified conserved candidate transcriptional regulatory elements within the κ-casein gene promoter. PMID:11929628

  8. Vaccination against Neisseria meningitidis using three variants of the lipoprotein GNA1870.

    PubMed

    Masignani, Vega; Comanducci, Maurizio; Giuliani, Marzia Monica; Bambini, Stefania; Adu-Bobie, Jeannette; Arico, Beatrice; Brunelli, Brunella; Pieri, Alessandro; Santini, Laura; Savino, Silvana; Serruto, Davide; Litt, David; Kroll, Simon; Welsch, Jo Anne; Granoff, Dan M; Rappuoli, Rino; Pizza, Mariagrazia

    2003-03-17

    Sepsis and meningitis caused by serogroup B meningococcus are devastating diseases of infants and young adults, which cannot yet be prevented by vaccination. By genome mining, we discovered GNA1870, a new surface-exposed lipoprotein of Neisseria meningitidis that induces high levels of bactericidal antibodies. The antigen is expressed by all strains of N. meningitidis tested. Sequencing of the gene in 71 strains representative of the genetic and geographic diversity of the N. meningitidis population showed that the protein can be divided into three variants. Conservation within each variant ranges from 91.6 to 100%, while between the variants the conservation can be as low as 62.8%. The level of expression varies between strains, which can be classified as high, intermediate, and low expressors. Antibodies against a recombinant form of the protein elicit complement-mediated killing of the strains that carry the same variant and induce passive protection in the infant rat model. Bactericidal titers are highest against those strains expressing high yields of the protein; however, even the very low expressors are efficiently killed. The novel antigen is a top candidate for the development of a new vaccine against meningococcus.

  9. CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design

    PubMed Central

    Rose, Timothy M.; Henikoff, Jorja G.; Henikoff, Steven

    2003-01-01

    We have developed a new primer design strategy for PCR amplification of distantly related gene sequences based on consensus-degenerate hybrid oligonucleotide primers (CODEHOPs). An interactive program has been written to design CODEHOP PCR primers from conserved blocks of amino acids within multiply-aligned protein sequences. Each CODEHOP consists of a pool of related primers containing all possible nucleotide sequences encoding 3–4 highly conserved amino acids within a 3′ degenerate core. A longer 5′ non-degenerate clamp region contains the most probable nucleotide predicted for each flanking codon. CODEHOPs are used in PCR amplification to isolate distantly related sequences encoding the conserved amino acid sequence. The primer design software and the CODEHOP PCR strategy have been utilized for the identification and characterization of new gene orthologs and paralogs in different plant, animal and bacterial species. In addition, this approach has been successful in identifying new pathogen species. The CODEHOP designer (http://blocks.fhcrc.org/codehop.html) is linked to BlockMaker and the Multiple Alignment Processor within the Blocks Database World Wide Web (http://blocks.fhcrc.org). PMID:12824413
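
    The 3′ degenerate core described above can be illustrated with a minimal Python sketch. It enumerates every nucleotide sequence that encodes a short conserved peptide using the standard genetic code. This is not the CODEHOP program itself: the function names and the example peptide "WDM" are hypothetical, and the 5′ consensus clamp, which depends on codon-usage data, is not modeled.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(aa):
    """Return every codon that encodes the one-letter amino acid `aa`."""
    return [codon for codon, residue in CODON_TABLE.items() if residue == aa]


def degenerate_core_pool(peptide):
    """Enumerate all nucleotide sequences encoding a short conserved peptide.

    The size of the returned pool is the degeneracy of the core.
    """
    choices = [codons_for(aa) for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    pool = degenerate_core_pool("WDM")  # hypothetical conserved block
    print(len(pool), pool)
```

    Peptides rich in Trp or Met give small pools, while Leu, Ser and Arg (six codons each) inflate the pool quickly, which is one reason to keep the degenerate core to 3–4 residues.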

  10. Genome of turbot rhabdovirus exhibits unusual non-coding regions and an additional ORF that could be expressed in fish cell.

    PubMed

    Zhu, Ruo-Lin; Lei, Xiao-Ying; Ke, Fei; Yuan, Xiu-Ping; Zhang, Qi-Ya

    2011-02-01

    The genomic sequence of Scophthalmus maximus rhabdovirus (SMRV) isolated from diseased turbot has been characterized. The complete genome of SMRV comprises 11,492 nucleotides and encodes five typical rhabdovirus genes N, P, M, G and L. In addition, two open reading frames (ORFs) are predicted to overlap with the P gene, one upstream of P and smaller than P (temporarily called Ps), and another in the P gene which may encode a protein similar to the vesicular stomatitis virus C protein. The C ORF is contained within the P ORF. The five typical proteins share the highest sequence identities (48.9%) with the corresponding proteins of rhabdoviruses in genus Vesiculovirus. Phylogenetic analysis of partial L protein sequence indicates that SMRV is close to genus Vesiculovirus. The first 13 nucleotides at the ends of the SMRV genome show perfect inverse complementarity. The gene junctions between the five genes show conserved polyadenylation signal (CATGA(7)) and intergenic dinucleotide (CT) followed by putative transcription initiation sequence A(A/G)(C/G)A(A/G/T), which are different from known rhabdoviruses. The entire Ps ORF was cloned and expressed, and used to generate polyclonal antibody in mice. One obvious band could be detected in SMRV-infected carp leucocyte cells (CLCs) by anti-Ps/C serum via Western blot, and the subcellular localization of Ps-GFP fusion protein exhibited cytoplasmic distribution as multiple punctate or doughnut-shaped foci of uneven size. Copyright © 2010 Elsevier B.V. All rights reserved.
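
    The two sequence features reported above can be checked with a short Python sketch. It rests on assumptions not confirmed by the abstract: that "CATGA(7)" denotes CATG followed by seven A residues, that the signal, the CT dinucleotide and the initiation sequence are directly adjacent, and that all sequences are written as DNA on one strand. The toy genome and all function names are hypothetical.

```python
import re

# Assumed reading of the junction: CATG + A x7, then CT, then A(A/G)(C/G)A(A/G/T).
JUNCTION = re.compile(r"CATGA{7}CTA[AG][CG]A[AGT]")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def termini_are_inverse_complementary(genome, n=13):
    """True if the first n bases equal the reverse complement of the last n."""
    return genome[:n] == reverse_complement(genome[-n:])


def find_junctions(genome):
    """Return 0-based start offsets of every junction-like match."""
    return [m.start() for m in JUNCTION.finditer(genome)]


if __name__ == "__main__":
    # Toy sequence built for illustration only; it is not SMRV data.
    leader = "ACGAAGACAAACA"
    junction = "CATG" + "A" * 7 + "CT" + "AACAG"
    toy = leader + "GG" + junction + "GG" + reverse_complement(leader)
    print(termini_are_inverse_complementary(toy), find_junctions(toy))
```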

  11. Characterization of a Novel Polerovirus Infecting Maize in China

    PubMed Central

    Chen, Sha; Jiang, Guangzhuang; Wu, Jianxiang; Liu, Yong; Qian, Yajuan; Zhou, Xueping

    2016-01-01

    A novel virus, tentatively named Maize Yellow Mosaic Virus (MaYMV), was identified from the field-grown maize plants showing yellow mosaic symptoms on the leaves collected from the Yunnan Province of China by the deep sequencing of small RNAs. The complete 5642 nucleotide (nt)-long genome of the MaYMV shared the highest nucleotide sequence identity (73%) to Maize Yellow Dwarf Virus-RMV. Sequence comparisons and phylogenetic analyses suggested that MaYMV represents a new member of the genus Polerovirus in the family Luteoviridae. Furthermore, the P0 protein encoded by MaYMV was demonstrated to inhibit both local and systemic RNA silencing by co-infiltration assays using transgenic Nicotiana benthamiana line 16c carrying the GFP reporter gene, which further supported the identification of a new polerovirus. The biologically-active cDNA clone of MaYMV was generated by inserting the full-length cDNA of MaYMV into the binary vector pCB301. RT-PCR and Northern blot analyses showed that this clone was systemically infectious upon agro-inoculation into N. benthamiana. Subsequently, 13 different isolates of MaYMV from field-grown maize plants in different geographical locations of Yunnan and Guizhou provinces of China were sequenced. Analyses of their molecular variation indicate that the 3′ half of P3–P5 read-through protein coding region was the most variable, whereas the coat protein- (CP-) and movement protein- (MP-)coding regions were the most conserved. PMID:27136578

  12. Characterization of a Novel Polerovirus Infecting Maize in China.

    PubMed

    Chen, Sha; Jiang, Guangzhuang; Wu, Jianxiang; Liu, Yong; Qian, Yajuan; Zhou, Xueping

    2016-04-28

    A novel virus, tentatively named Maize Yellow Mosaic Virus (MaYMV), was identified from the field-grown maize plants showing yellow mosaic symptoms on the leaves collected from the Yunnan Province of China by the deep sequencing of small RNAs. The complete 5642 nucleotide (nt)-long genome of the MaYMV shared the highest nucleotide sequence identity (73%) to Maize Yellow Dwarf Virus-RMV. Sequence comparisons and phylogenetic analyses suggested that MaYMV represents a new member of the genus Polerovirus in the family Luteoviridae. Furthermore, the P0 protein encoded by MaYMV was demonstrated to inhibit both local and systemic RNA silencing by co-infiltration assays using transgenic Nicotiana benthamiana line 16c carrying the GFP reporter gene, which further supported the identification of a new polerovirus. The biologically-active cDNA clone of MaYMV was generated by inserting the full-length cDNA of MaYMV into the binary vector pCB301. RT-PCR and Northern blot analyses showed that this clone was systemically infectious upon agro-inoculation into N. benthamiana. Subsequently, 13 different isolates of MaYMV from field-grown maize plants in different geographical locations of Yunnan and Guizhou provinces of China were sequenced. Analyses of their molecular variation indicate that the 3' half of P3-P5 read-through protein coding region was the most variable, whereas the coat protein- (CP-) and movement protein- (MP-)coding regions were the most conserved.

  13. Sequence conservation from human to prokaryotes of Surf1, a protein involved in cytochrome c oxidase assembly, deficient in Leigh syndrome.

    PubMed

    Poyau, A; Buchet, K; Godinot, C

    1999-12-03

    The human SURF1 gene, which encodes a protein involved in cytochrome c oxidase (COX) assembly, is mutated in most patients presenting with Leigh syndrome associated with COX deficiency. Proteins homologous to the human Surf1 have been identified in nine eukaryotes and six prokaryotes using database alignment tools, structure prediction and/or cDNA sequencing. Their sequence comparison revealed a remarkable Surf1 conservation during evolution and put forward at least four highly conserved domains that should be essential for Surf1 function. In Paracoccus denitrificans, the Surf1 homologue is found in the quinol oxidase operon, suggesting that Surf1 is associated with a primitive quinol oxidase which belongs to the same superfamily as cytochrome oxidase.

  14. Phylogenetic distribution of plant snoRNA families.

    PubMed

    Patra Bhattacharya, Deblina; Canzler, Sebastian; Kehr, Stephanie; Hertel, Jana; Grosse, Ivo; Stadler, Peter F

    2016-11-24

    Small nucleolar RNAs (snoRNAs) are one of the most ancient families amongst non-protein-coding RNAs. They are ubiquitous in Archaea and Eukarya but absent in bacteria. Their main function is to target chemical modifications of ribosomal RNAs. They fall into two classes, box C/D snoRNAs and box H/ACA snoRNAs, which are clearly distinguished by conserved sequence motifs and the type of chemical modification that they govern. Similarly to microRNAs, snoRNAs appear in distinct families of homologs that affect homologous targets. In animals, snoRNAs and their evolution have been studied in much detail. In plants, however, their evolution has attracted comparably little attention. In order to chart the phylogenetic distribution of individual snoRNA families in plants, we applied a sophisticated approach for identifying homologs of known plant snoRNAs across the plant kingdom. In response to the relatively fast evolution of snoRNAs, information on conserved sequence boxes, target sequences, and secondary structure is combined to identify additional snoRNAs. We identified 296 families of snoRNAs in 24 species and traced their evolution throughout the plant kingdom. Many of the plant snoRNA families comprise paralogs. We also found that targets are well-conserved for most snoRNA families. The sequence conservation of snoRNAs is sufficient to establish homologies between phyla. The degree of this conservation tapers off, however, between land plants and algae. Plant snoRNAs are frequently organized in highly conserved spatial clusters. As a resource for further investigations we provide carefully curated and annotated alignments for each snoRNA family under investigation.

  15. Complete genomic sequence of Powassan virus: evaluation of genetic elements in tick-borne versus mosquito-borne flaviviruses.

    PubMed

    Mandl, C W; Holzmann, H; Kunz, C; Heinz, F X

    1993-05-01

    The complete nucleotide sequence of the positive-stranded RNA genome of the tick-borne flavivirus Powassan (10,839 nucleotides) was elucidated and the amino acid sequence of all viral proteins was derived. Based on this sequence as well as serological data, Powassan virus represents the most divergent member of the tick-borne serocomplex within the genus flaviviruses, family Flaviviridae. The primary nucleotide sequence and potential RNA secondary structures of the Powassan virus genome as well as the protein sequences and the reactivities of the virion with a panel of monoclonal antibodies were compared to other tick-borne and mosquito-borne flaviviruses. These analyses corroborated significant differences between tick-borne and mosquito-borne flaviviruses, but also emphasized structural elements that are conserved among both vector groups. The comparisons among tick-borne flaviviruses revealed conserved sequence elements that might represent important determinants of the tick-borne flavivirus phenotype.

  16. Nucleotide sequence of a complementary DNA encoding pea cytosolic copper/zinc superoxide dismutase. [Pisum sativum L

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, D.A.; Zilinskas, B.A.

    1991-08-01

    The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a λgt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted M_r of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).

  17. Remarkable sequence conservation of the last intron in the PKD1 gene.

    PubMed

    Rodova, Marianna; Islam, M Rafiq; Peterson, Kenneth R; Calvet, James P

    2003-10-01

    The last intron of the PKD1 gene (intron 45) was found to have exceptionally high sequence conservation across four mammalian species: human, mouse, rat, and dog. This conservation did not extend to the comparable intron in pufferfish. Pairwise comparisons for intron 45 showed 91% identity (human vs. dog) to 100% identity (mouse vs. rat) for an average for all four species of 94% identity. In contrast, introns 43 and 44 of the PKD1 gene had average pairwise identities of 57% and 54%, and exons 43, 44, and 45 and the coding region of exon 46 had average pairwise identities of 80%, 84%, 82%, and 80%. Intron 45 is 90 to 95 bp in length, with the major region of sequence divergence being in a central 4-bp to 9-bp variable region. RNA secondary structure analysis of intron 45 predicts a branching stem-loop structure in which the central variable region lies in one loop and the putative branch point sequence lies in another loop, suggesting that the intron adopts a specific stem-loop structure that may be important for its removal. Although intron 45 appears to conform to the class of small, G-triplet-containing introns that are spliced by a mechanism utilizing intron definition, its high sequence conservation may be a reflection of constraints imposed by a unique mechanism that coordinates splicing of this last PKD1 intron with polyadenylation.
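
    The averaging over species pairs used above (four species give six pairwise comparisons) can be sketched in Python as follows. This is only an illustration: the sequences are short made-up strings, not PKD1 data, the species labels are placeholders, and identity is computed over pre-aligned sequences, ignoring columns where both sequences carry a gap. The abstract does not say how gaps were treated.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identical columns between two aligned, equal-length sequences.

    Columns in which both sequences have a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def average_pairwise_identity(aligned):
    """Mean percent identity over all unordered pairs in a name -> sequence dict."""
    pairs = list(combinations(aligned, 2))
    total = sum(percent_identity(aligned[p], aligned[q]) for p, q in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    toy = {
        "species_a": "GTAAGTCCGG",
        "species_b": "GTAAGTCCGG",
        "species_c": "GTAAGTCTGG",
        "species_d": "GTAAGACTGG",
    }
    print(round(average_pairwise_identity(toy), 1))
```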

  18. Close evolutionary relatedness among functionally distantly related members of the (alpha/beta)8-barrel glycosyl hydrolases suggested by the similarity of their fifth conserved sequence region.

    PubMed

    Janecek, S

    1995-12-11

    A short conserved sequence equivalent to the fifth conserved sequence region of alpha-amylases (173_LPDLD, Aspergillus oryzae alpha-amylase) comprising the calcium-ligand aspartate, Asp-175, was identified in the amino acid sequences of several members of the family of (alpha/beta)8-barrel glycosyl hydrolases. Despite the fact that the aspartate is not invariantly conserved, the stretch can be easily recognised in all sequences to be positioned 26-28 amino acid residues in front of the well-known catalytic aspartate (Asp-206, A. oryzae alpha-amylase) located in the beta 4-strand of the barrel. The identification of this region revealed remarkable similarities between some alpha-amylases (those from Bacillus megaterium, Bacillus subtilis and Dictyoglomus thermophilum) on the one hand and several different enzyme specificities (such as oligo-1,6-glucosidase, amylomaltase and neopullulanase, respectively) on the other hand. The most interesting example was offered by B. subtilis alpha-amylase and potato amylomaltase with the regions LYDWN and LYDWK, respectively. These observations support the idea that all members of the family of glycosyl hydrolases adopting the structure of the alpha-amylase-type (alpha/beta)8-barrel are mutually closely related and the strict evolutionary borders separating the individual enzyme specificities can be hardly defined.

  19. Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.

    PubMed

    Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M

    2010-12-15

    Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

  20. Unusual Intron Conservation near Tissue-Regulated Exons Found by Splicing Microarrays

    PubMed Central

    Sugnet, Charles W; Srinivasan, Karpagam; Clark, Tyson A; O'Brien, Georgeann; Cline, Melissa S; Wang, Hui; Williams, Alan; Kulp, David; Blume, John E; Haussler, David; Ares, Manuel

    2006-01-01

    Alternative splicing contributes to both gene regulation and protein diversity. To discover broad relationships between regulation of alternative splicing and sequence conservation, we applied a systems approach, using oligonucleotide microarrays designed to capture splicing information across the mouse genome. In a set of 22 adult tissues, we observe differential expression of RNA containing at least two alternative splice junctions for about 40% of the 6,216 alternative events we could detect. Statistical comparisons identify 171 cassette exons whose inclusion or skipping is different in brain relative to other tissues and another 28 exons whose splicing is different in muscle. A subset of these exons is associated with unusual blocks of intron sequence whose conservation in vertebrates rivals that of protein-coding exons. By focusing on sets of exons with similar regulatory patterns, we have identified new sequence motifs implicated in brain and muscle splicing regulation. Of note is a motif that is strikingly similar to the branchpoint consensus but is located downstream of the 5′ splice site of exons included in muscle. Analysis of three paralogous membrane-associated guanylate kinase genes reveals that each contains a paralogous tissue-regulated exon with a similar tissue inclusion pattern. While the intron sequences flanking these exons remain highly conserved among mammalian orthologs, the paralogous flanking intron sequences have diverged considerably, suggesting unusually complex evolution of the regulation of alternative splicing in multigene families. PMID:16424921

  1. Structural characterization of a family of cytochromes c7 involved in Fe(III) respiration by Geobacter sulfurreducens.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pokkuluri, P. R.; Londer, Y. Y.; Yang, X.

    2010-02-01

    Periplasmic cytochromes c7 are important in electron transfer pathway(s) in Fe(III) respiration by Geobacter sulfurreducens. The genome of G. sulfurreducens encodes a family of five 10-kDa, three-heme cytochromes c7. The sequence identity between the five proteins (designated PpcA, PpcB, PpcC, PpcD, and PpcE) varies between 45% and 77%. Here, we report the high-resolution structures of PpcC, PpcD, and PpcE determined by X-ray diffraction. This new information made it possible to compare the sequences and structures of the entire family. The triheme cores are largely conserved but are not identical. We observed changes, due to different crystal packing, in the relative positions of the hemes between two molecules in the crystal. The overall protein fold of the cytochromes is similar. The structure of PpcD differs most from that of the other homologs, which is not obvious from the sequence comparisons of the family. Interestingly, PpcD is the only cytochrome c7 within the family that has higher abundance when G. sulfurreducens is grown on insoluble Fe(III) oxide compared to ferric citrate. The structures have the highest degree of conservation around 'heme IV'; the protein surface around this heme is positively charged in all of the proteins, and therefore all cytochromes c7 could interact with similar molecules involving this region. The structures and surface characteristics of the proteins near the other two hemes, 'heme I' and 'heme III', differ within the family. The above observations suggest that each of the five cytochromes c7 could interact with its own redox partner via an interface involving the regions of heme I and/or heme III; this provides a possible rationalization for the existence of five similar proteins in G. sulfurreducens.

  2. Improving effectiveness of systematic conservation planning with density data.

    PubMed

    Veloz, Samuel; Salas, Leonardo; Altman, Bob; Alexander, John; Jongsomjit, Dennis; Elliott, Nathan; Ballard, Grant

    2015-08-01

    Systematic conservation planning aims to design networks of protected areas that meet conservation goals across large landscapes. The optimal design of these conservation networks is most frequently based on the modeled habitat suitability or probability of occurrence of species, despite evidence that model predictions may not be highly correlated with species density. We hypothesized that conservation networks designed using species density distributions more efficiently conserve populations of all species considered than networks designed using probability of occurrence models. To test this hypothesis, we used the Zonation conservation prioritization algorithm to evaluate conservation network designs based on probability of occurrence versus density models for 26 land bird species in the U.S. Pacific Northwest. We assessed the efficacy of each conservation network based on predicted species densities and predicted species diversity. High-density model Zonation rankings protected more individuals per species when networks protected the highest priority 10-40% of the landscape. Compared with density-based models, the occurrence-based models protected more individuals in the lowest 50% priority areas of the landscape. The 2 approaches conserved species diversity in similar ways: predicted diversity was higher in higher priority locations in both conservation networks. We conclude that both density and probability of occurrence models can be useful for setting conservation priorities but that density-based models are best suited for identifying the highest priority areas. Developing methods to aggregate species count data from unrelated monitoring efforts and making these data widely available through ecoinformatics portals such as the Avian Knowledge Network will enable species count data to be more widely incorporated into systematic conservation planning efforts. © 2015, Society for Conservation Biology.

  3. The LINE-1 DNA sequences in four mammalian orders predict proteins that conserve homologies to retrovirus proteins.

    PubMed Central

    Fanning, T; Singer, M

    1987-01-01

    Recent work suggests that one or more members of the highly repeated LINE-1 (L1) DNA family found in all mammals may encode one or more proteins. Here we report the sequence of a portion of an L1 cloned from the domestic cat (Felis catus). These data permit comparison of the L1 sequences in four mammalian orders (Carnivore, Lagomorph, Rodent and Primate) and the comparison supports the suggested coding potential. In two separate, noncontiguous regions in the carboxy terminal half of the proteins predicted from the DNA sequences, there are several strongly conserved segments. In one region, these share homology with known or suspected reverse transcriptases, as described by others in rodents and primates. In the second region, closer to the carboxy terminus, the strongly conserved segments are over 90% homologous among the four orders. One of the latter segments is cysteine rich and resembles the putative metal binding domains of nucleic acid binding proteins, including those of TFIIIA and retroviruses. PMID:3562227

  4. PknB remains an essential and a conserved target for drug development in susceptible and MDR strains of M. Tuberculosis.

    PubMed

    Gupta, Anamika; Pal, Sudhir K; Pandey, Divya; Fakir, Najneen A; Rathod, Sunita; Sinha, Dhiraj; SivaKumar, S; Sinha, Pallavi; Periera, Mycal; Balgam, Shilpa; Sekar, Gomathi; UmaDevi, K R; Anupurba, Shampa; Nema, Vijay

    2017-08-18

    The Mycobacterium tuberculosis (M.tb) protein kinase B (PknB), which has now been shown to be essential for the growth and survival of M.tb, is a transmembrane protein with the potential to be a good drug target. However, it is not known whether this target remains conserved in otherwise resistant isolates of clinical origin. The present study describes the conservation analysis of sequences covering the inhibitor binding domain of PknB to assess whether it remains conserved in susceptible and resistant clinical strains of mycobacteria picked from three different geographical areas of India. A total of 116 isolates from North, South and West India were used in the study, with variable profiles of susceptibility towards streptomycin, isoniazid, rifampicin, ethambutol and ofloxacin. Isolates were also spoligotyped in order to find out whether the conservation pattern of the pknB gene remains consistent or differs with different spoligotypes. The impact of variation as found in the study was analyzed using molecular dynamics simulations. The sequencing results with 115/116 isolates revealed the conserved nature of pknB sequences irrespective of their susceptibility status and spoligotypes. The only variation found was in one strain, in which the pknB sequence had a G to A mutation at position 664, translating into an amino acid change from valine to isoleucine. After analyzing the impact of this sequence variation using molecular dynamics simulations, it was observed that the variation causes no significant change in protein structure or inhibitor binding. Hence, the study endorses that PknB is an ideal target for drug development and that there is no pre-existing or induced resistance with respect to the sequences involved in inhibitor binding. Also, if the mutation that we are reporting for the first time is found again in subsequent work, it should be checked against the phenotypic profile before drawing the conclusion that it would affect the activity in any way. Bioinformatics analysis in our study indicates that it has no significant effect on the binding and hence the activity of the protein.
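
    The reported substitution can be checked with a little codon arithmetic. The sketch below is illustrative only: it assumes that position 664 is counted from the first nucleotide of the pknB coding sequence and that the standard codon assignments apply. The abstract does not state which valine codon is affected, so `GTC` is a hypothetical placeholder, and the helper names are invented for this example.

```python
# Subset of the standard genetic code: only the codons needed for this check.
CODON_TABLE = {
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
    "ATT": "Ile", "ATC": "Ile", "ATA": "Ile", "ATG": "Met",
}


def locate_codon(nt_position):
    """Map a 1-based coding-sequence position to (codon number, position in codon), both 1-based."""
    codon_number = (nt_position - 1) // 3 + 1
    position_in_codon = (nt_position - 1) % 3 + 1
    return codon_number, position_in_codon


def mutate_codon(codon, position_in_codon, new_base):
    """Return the codon with the base at the given 1-based position replaced."""
    i = position_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


# Assumption: numbering starts at the first base of the coding sequence.
codon_number, position_in_codon = locate_codon(664)

# Hypothetical reference codon; the abstract does not say which Val codon is present.
reference_codon = "GTC"
variant_codon = mutate_codon(reference_codon, position_in_codon, "A")

print(codon_number, position_in_codon)
print(CODON_TABLE[reference_codon], "->", CODON_TABLE[variant_codon])
```

    Under these assumptions the change falls on the first base of codon 222. A G to A change at the first base turns GTT, GTC or GTA into an isoleucine codon, whereas GTG would become ATG (methionine), so the reported isoleucine is consistent with any valine codon except GTG.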

  5. Analysis of evolutionary conservation patterns and their influence on identifying protein functional sites.

    PubMed

    Fang, Chun; Noguchi, Tamotsu; Yamana, Hayato

    2014-10-01

    Evolutionary conservation information included in the position-specific scoring matrix (PSSM) has been widely adopted by sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved to some extent. However, different functional sites have different conservation patterns: some are linear contextual, some are mingled with highly variable residues, and others seem to be conserved independently. Every value in a PSSM is calculated independently of the others, without carrying the contextual information of residues in the sequence. Therefore, adopting the direct output of the PSSM for prediction fails to consider the relationship between conservation patterns of residues and the distribution of conservation scores in PSSMs. In order to demonstrate the importance of combining PSSMs with the specific conservation patterns of functional sites for prediction, three different PSSM-based methods for identifying three kinds of functional sites have been analyzed. Results suggest that different PSSM-based methods differ in their capability to identify different patterns of functional sites, and that better combining PSSMs with the specific conservation patterns of residues would largely facilitate the prediction.
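
    One simple way to picture the missing contextual information is a sliding-window average over per-position conservation scores. The sketch below is not the method analyzed in the paper. It is a generic illustration with made-up scores, and the function name and window size are assumptions chosen for the example.

```python
def window_average(scores, half_width=1):
    """Average each position's score with its neighbours within half_width residues."""
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half_width)                # clip the window at the sequence start
        hi = min(len(scores), i + half_width + 1)  # clip the window at the sequence end
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Made-up per-position conservation scores (not taken from any real PSSM):
# index 1 is conserved but isolated; indices 3-5 form a contiguous conserved run.
scores = [0.0, 3.0, 0.0, 3.0, 3.0, 3.0]
print(window_average(scores))
```

    With these toy values the isolated conserved position drops to 1.0 after averaging, while the middle of the contiguous run stays at 3.0. The same raw score can therefore mean different things depending on its neighbours, which is the kind of pattern-dependence the abstract describes.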

  6. Conservation of an Intact vif Gene of Human Immunodeficiency Virus Type 1 during Maternal-Fetal Transmission

    PubMed Central

    Yedavalli, Venkat R. K.; Chappey, Colombe; Matala, Erik; Ahmad, Nafees

    1998-01-01

    The human immunodeficiency virus type 1 (HIV-1) vif gene is conserved among most lentiviruses, suggesting that vif is important for natural infection. To determine whether an intact vif gene is positively selected during mother-to-infant transmission, we analyzed vif sequences from five infected mother-infant pairs following perinatal transmission. The coding potential of the vif open reading frame directly derived from uncultured peripheral blood mononuclear cell DNA was maintained in most of the 78,912 bp sequenced. We found that 123 of the 137 clones analyzed (89.8%) carried intact vif open reading frames. There was a low degree of heterogeneity of vif genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vif sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Furthermore, the epidemiologically linked mother-infant pair vif sequences displayed similar patterns that were not seen in vif sequences from epidemiologically unlinked individuals. The functional domains, including the two cysteines at positions 114 and 133, a serine phosphorylation site at position 144, and the C-terminal basic amino acids essential for vif protein function, were highly conserved in most of the sequences. Phylogenetic analyses of 137 mother-infant pair vif sequences and 187 other available vif sequences from HIV-1 databases revealed distinct clusters for vif sequences from each mother-infant pair and for other vif sequences. Taken together, these findings suggest that vif plays an important role in HIV-1 infection and replication in mothers and their perinatally infected infants. PMID:9445004

  7. DNA-DNA hybridization values and their relationship to whole-genome sequence similarities.

    PubMed

    Goris, Johan; Konstantinidis, Konstantinos T; Klappenbach, Joel A; Coenye, Tom; Vandamme, Peter; Tiedje, James M

    2007-01-01

    DNA-DNA hybridization (DDH) values have been used by bacterial taxonomists since the 1960s to determine relatedness between strains and are still the most important criterion in the delineation of bacterial species. Since the extent of hybridization between a pair of strains is ultimately governed by their respective genomic sequences, we examined the quantitative relationship between DDH values and genome sequence-derived parameters, such as the average nucleotide identity (ANI) of common genes and the percentage of conserved DNA. A total of 124 DDH values were determined for 28 strains for which genome sequences were available. The strains belong to six important and diverse groups of bacteria for which the intra-group 16S rRNA gene sequence identity was greater than 94 %. The results revealed a close relationship between DDH values and ANI and between DNA-DNA hybridization and the percentage of conserved DNA for each pair of strains. The recommended cut-off point of 70 % DDH for species delineation corresponded to 95 % ANI and 69 % conserved DNA. When the analysis was restricted to the protein-coding portion of the genome, 70 % DDH corresponded to 85 % conserved genes for a pair of strains. These results reveal extensive gene diversity within the current concept of "species". Examination of reciprocal values indicated that the level of experimental error associated with the DDH method is too high to reveal the subtle differences in genome size among the strains sampled. It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available.
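
    The 95 % ANI cut-off can be illustrated with a toy calculation. The sketch below is a deliberate simplification and not the procedure used in the study: it assumes pre-aligned, gap-free fragment pairs of equal length and takes an unweighted mean, and the sequences are invented. Treating an ANI exactly at the cut-off as "same species" is also an assumption of this example.

```python
# Cut-off from the abstract: 70 % DDH corresponded to 95 % ANI.
ANI_SPECIES_CUTOFF = 95.0


def percent_identity(a, b):
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def average_nucleotide_identity(fragment_pairs):
    """Unweighted mean identity over fragment pairs (simplifying assumption)."""
    identities = [percent_identity(a, b) for a, b in fragment_pairs]
    return sum(identities) / len(identities)


def same_species_by_ani(ani):
    """Apply the 95 % cut-off; counting the boundary as 'same species' is an assumption."""
    return ani >= ANI_SPECIES_CUTOFF


# Invented toy fragments: one identical pair and one pair with 1 mismatch in 20 bases.
close_pairs = [
    ("ACGTACGTAC", "ACGTACGTAC"),
    ("ACGTACGTACACGTACGTAC", "ACGTACGTACACGTACGTAT"),
]
# Invented divergent pair: 2 mismatches in 10 bases.
distant_pairs = [("ACGTACGTAC", "ACGTACGTTT")]

print(average_nucleotide_identity(close_pairs), same_species_by_ani(average_nucleotide_identity(close_pairs)))
print(average_nucleotide_identity(distant_pairs), same_species_by_ani(average_nucleotide_identity(distant_pairs)))
```

    Real ANI estimates are computed over many genome fragments matched by a similarity search, so this toy version only shows how the threshold is applied once identities are available.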

  8. Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

    PubMed Central

    Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

    1985-01-01

    The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815
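
    A spaced pair of consensus sequences like this can be scanned for with a regular expression. The sketch below is only an illustration: the abstract does not say how the 20-24 bp spacing is measured or in which order the motifs occur, so the code assumes TATA comes first and that the spacing is the gap between the end of TATA and the start of AATAA. The test sequences are invented.

```python
import re

# Assumption: TATA, then a gap of 20-24 bases, then AATAA.
# The lookahead lets candidates that overlap each other all be reported.
PROMOTER_PATTERN = re.compile(r"(?=(TATA[ACGT]{20,24}AATAA))")


def find_candidate_promoters(sequence):
    """Return (0-based start, matched text) for every candidate in the sequence."""
    return [(m.start(), m.group(1)) for m in PROMOTER_PATTERN.finditer(sequence.upper())]


# Invented test sequences, not vaccinia data.
spaced_22 = "GG" + "TATA" + "C" * 22 + "AATAA" + "GG"  # gap inside the assumed range
spaced_10 = "GG" + "TATA" + "C" * 10 + "AATAA" + "GG"  # gap too short

print(find_candidate_promoters(spaced_22))
print(find_candidate_promoters(spaced_10))
```

    The first sequence yields one candidate starting at index 2, and the second yields none. A match of this kind is only a candidate, since the abstract describes these regions as presumptive promoters.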

  9. Genome-wide discovery and differential regulation of conserved and novel microRNAs in chickpea via deep sequencing.

    PubMed

    Jain, Mukesh; Chevala, V V S Narayana; Garg, Rohini

    2014-11-01

    MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  10. Identification and characterization of microRNAs in Phaseolus vulgaris by high-throughput sequencing

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are endogenously encoded small RNAs that post-transcriptionally regulate gene expression. MiRNAs play essential roles in almost all plant biological processes. Currently, few miRNAs have been identified in the model food legume Phaseolus vulgaris (common bean). Recent advances in next generation sequencing technologies have allowed the identification of conserved and novel miRNAs in many plant species. Here, we used Illumina's sequencing by synthesis (SBS) technology to identify and characterize the miRNA population of Phaseolus vulgaris. Results Small RNA libraries were generated from roots, flowers, leaves, and seedlings of P. vulgaris. Based on similarity to previously reported plant miRNAs, 114 miRNAs belonging to 33 conserved miRNA families were identified. Stem-loop precursors and target gene sequences for several conserved common bean miRNAs were determined from publicly available databases. Less conserved miRNA families and species-specific common bean miRNA isoforms were also characterized. Moreover, novel miRNAs based on the small RNAs were found and their potential precursors were predicted. In addition, new target candidates for novel and conserved miRNAs were proposed. Finally, we studied organ-specific miRNA family expression levels through miRNA read frequencies. Conclusions This work represents the first massive-scale RNA sequencing study performed in Phaseolus vulgaris to identify and characterize its miRNA population. It significantly increases the number of miRNAs, precursors, and targets identified in this agronomically important species. The miRNA expression analysis provides a foundation for understanding common bean miRNA organ-specific expression patterns. The present study offers an expanded picture of P. vulgaris miRNAs in relation to those of other legumes. PMID:22394504

  11. Genomic dissection of conserved transcriptional regulation in intestinal epithelial cells

    PubMed Central

    Camp, J. Gray; Weiser, Matthew; Cocchiaro, Jordan L.; Kingsley, David M.; Furey, Terrence S.; Sheikh, Shehzad Z.; Rawls, John F.

    2017-01-01

    The intestinal epithelium serves critical physiologic functions that are shared among all vertebrates. However, it is unknown how the transcriptional regulatory mechanisms underlying these functions have changed over the course of vertebrate evolution. We generated genome-wide mRNA and accessible chromatin data from adult intestinal epithelial cells (IECs) in zebrafish, stickleback, mouse, and human species to determine if conserved IEC functions are achieved through common transcriptional regulation. We found evidence for substantial common regulation and conservation of gene expression regionally along the length of the intestine from fish to mammals and identified a core set of genes comprising a vertebrate IEC signature. We also identified transcriptional start sites and other putative regulatory regions that are differentially accessible in IECs in all 4 species. Although these sites rarely showed sequence conservation from fish to mammals, surprisingly, they drove highly conserved IEC expression in a zebrafish reporter assay. Common putative transcription factor binding sites (TFBS) found at these sites in multiple species indicate that sequence conservation alone is insufficient to identify much of the functionally conserved IEC regulatory information. Among the rare, highly sequence-conserved, IEC-specific regulatory regions, we discovered an ancient enhancer upstream from her6/HES1 that is active in a distinct population of Notch-positive cells in the intestinal epithelium. Together, these results show how combining accessible chromatin and mRNA datasets with TFBS prediction and in vivo reporter assays can reveal tissue-specific regulatory information conserved across 420 million years of vertebrate evolution. We define an IEC transcriptional regulatory network that is shared between fish and mammals and establish an experimental platform for studying how evolutionarily distilled regulatory information commonly controls IEC development and physiology. 
    PMID:28850571

  12. DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes

    PubMed Central

    Sebestyén, Endre; Nagy, Tibor; Suhai, Sándor; Barta, Endre

    2009-01-01

    Background The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). Results We have developed a new tool called DoOPSearch for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows the users to search these motif collections or the promoter regions of DoOP with user supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program. Conclusion We present here a comparative genomics based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. 
    The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue to the function of the motifs and genes. PMID:19534755

  13. Synteny conservation between the Prunus genome and both the present and ancestral Arabidopsis genomes

    PubMed Central

    Jung, Sook; Main, Dorrie; Staton, Margaret; Cho, Ilhyung; Zhebentyayeva, Tatyana; Arús, Pere; Abbott, Albert

    2006-01-01

    Background Due to the lack of availability of large genomic sequences for peach or other Prunus species, the degree of synteny conservation between the Prunus species and Arabidopsis has not been systematically assessed. Using the recently available peach EST sequences that are anchored to Prunus genetic maps and to the peach physical map, we analyzed the extent of conserved synteny between the Prunus and the Arabidopsis genomes. The reconstructed pseudo-ancestral Arabidopsis genome, which existed prior to the proposed recent polyploidy event, was also utilized in our analysis to further elucidate the evolutionary relationship. Results We analyzed the synteny conservation between the Prunus and the Arabidopsis genomes by comparing 475 peach ESTs that are anchored to Prunus genetic maps and their Arabidopsis homologs detected by sequence similarity. Microsyntenic regions were detected between all five Arabidopsis chromosomes and seven of the eight linkage groups of the Prunus reference map. An additional 1097 peach ESTs that are anchored to 431 BAC contigs of the peach physical map and their Arabidopsis homologs were also analyzed. Microsyntenic regions were detected in 77 BAC contigs. The syntenic regions from both data sets were short and contained only a couple of conserved gene pairs. The synteny between peach and Arabidopsis was fragmentary; all the Prunus linkage groups containing syntenic regions matched to more than two different Arabidopsis chromosomes, and most BAC contigs with multiple conserved syntenic regions corresponded to multiple Arabidopsis chromosomes. Using the same peach EST datasets and their Arabidopsis homologs, we also detected conserved syntenic regions in the pseudo-ancestral Arabidopsis genome. In many cases, the gene order and content of peach regions was more conserved in the ancestral genome than in the present Arabidopsis region. Statistical significance of each syntenic group was calculated using a simulated Arabidopsis genome. Conclusion We report here the result of the first extensive analysis of the conserved microsynteny using DNA sequences across the Prunus genome and their Arabidopsis homologs. Our study also illustrates that both the ancestral and present Arabidopsis genomes can provide a useful resource for marker saturation and candidate gene search, as well as elucidating evolutionary relationships between species. PMID:16615871

  14. A Multireader Exploratory Evaluation of Individual Pulse Sequence Cancer Detection on Prostate Multiparametric Magnetic Resonance Imaging (MRI).

    PubMed

    Gaur, Sonia; Harmon, Stephanie; Gupta, Rajan T; Margolis, Daniel J; Lay, Nathan; Mehralivand, Sherif; Merino, Maria J; Wood, Bradford J; Pinto, Peter A; Shih, Joanna H; Choyke, Peter L; Turkbey, Baris

    2018-04-25

    To determine the independent contribution of each prostate multiparametric magnetic resonance imaging (mpMRI) sequence to cancer detection when read in isolation. Prostate mpMRI at 3-Tesla with endorectal coil from 45 patients (n = 30 prostatectomy cases, n = 15 controls with negative magnetic resonance imaging [MRI] or biopsy) were retrospectively interpreted. Sequences (T2-weighted [T2W] MRI, diffusion-weighted imaging [DWI], and dynamic contrast-enhanced [DCE] MRI; N = 135) were separately distributed to three radiologists at different institutions. Readers evaluated each sequence blinded to other mpMRI sequences. Findings were correlated to whole-mount pathology. Cancer detection sensitivity, positive predictive value for whole prostate (WP), transition zone, and peripheral zone were evaluated per sequence by reader, with reader concordance measured by index of specific agreement. Cancer detection rates (CDRs) were calculated for combinations of independently read sequences. 44 patients were evaluable (cases median prostate-specific antigen 6.83 [range 1.95-51.13] ng/mL, age 62 [45-71] years; controls prostate-specific antigen 6.85 [2.4-10.87] ng/mL, age 65.5 [47-71] years). Readers had highest sensitivity on DWI (59%) vs T2W MRI (48%) and DCE (23%) in WP. DWI-only positivity (DWI+/T2W-/DCE-) achieved highest CDR in WP (38%), compared to T2W-only (CDR 24%) and DCE-only (CDR 8%). DWI+/T2W+/DCE- achieved CDR 80%, an added benefit of 56.4% from T2W-only and of 42% from DWI-only (P < .0001). All three sequences interpreted independently positive gave highest CDR of 90%. Reader agreement was moderate (index of specific agreement: T2W = 54%, DWI = 58%, DCE = 33%). When prostate mpMRI sequences are interpreted independently by multiple observers, DWI achieves highest sensitivity and CDR in transition zone and peripheral zone. T2W and DCE MRI both add value to detection; mpMRI achieves highest detection sensitivity when all three mpMRI sequences are positive. Published by Elsevier Inc.

  15. Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures

    PubMed Central

    2012-01-01

    Background The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches. Results Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related—a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day. Conclusions This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. 
    At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups. PMID:22726767

  16. JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures.

    PubMed

    Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2012-02-15

    We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. Among other things, the program allows users to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.

  17. Rapid functional diversification in the structurally conserved ELAV family of neuronal RNA binding proteins

    PubMed Central

    Samson, Marie-Laure

    2008-01-01

    Background The Drosophila gene embryonic lethal abnormal visual system (elav) is the prototype of a gene family present in all metazoans. Its members encode structurally conserved neuronal proteins with three RNA Recognition Motifs (RRM) but they paradoxically act at diverse levels of post-transcriptional regulation. In an attempt to understand the history of this family, we searched for orthologs in eleven completely sequenced genomes, including those of humans, D. melanogaster and C. elegans, for which cDNAs are available. Results We analyzed 23 orthologs/paralogs of elav, and found evidence of gain/loss of gene copy number. For one set of genes, including elav itself, the coding sequences are free of introns and their products most resemble ELAV. The remaining genes show remarkable conservation of their exon organization, and their products most resemble FNE and RBP9, proteins encoded by the two elav paralogs of Drosophila. Remarkably, three of the conserved exon junctions are both close to structural elements, involved respectively in protein-RNA interactions and in the regulation of sub-cellular localization, and in the vicinity of diverse sequence variations. Conclusion The data indicate that the essential elav gene of Drosophila is newly emerged, restricted to dipterans and of retrotransposed origin. We propose that the conserved exon junctions constitute potential sites for sequence/function modifications, and that RRM binding proteins, whose function relies upon plastic RNA-protein interactions, may have played an important role in brain evolution. PMID:18715504

  18. Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

    PubMed

    Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

    2012-01-01

    Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations, mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up of three repeated domains (about 600 residues), as extensively characterized for human SA. In contrast, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar to higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

  19. A field ornithologist’s guide to genomics: Practical considerations for ecology and conservation

    USGS Publications Warehouse

    Oyler-McCance, Sara J.; Oh, Kevin; Langin, Kathryn; Aldridge, Cameron L.

    2016-01-01

    Vast improvements in sequencing technology have made it practical to simultaneously sequence millions of nucleotides distributed across the genome, opening the door for genomic studies in virtually any species. Ornithological research stands to benefit in three substantial ways. First, genomic methods enhance our ability to parse and simultaneously analyze both neutral and non-neutral genomic regions, thus providing insight into adaptive evolution and divergence. Second, the sheer quantity of sequence data generated by current sequencing platforms allows increased precision and resolution in analyses. Third, high-throughput sequencing can benefit applications that focus on a small number of loci that are otherwise prohibitively expensive, time-consuming, and technically difficult using traditional sequencing methods. These advances have improved our ability to understand evolutionary processes like speciation and local adaptation, but they also offer many practical applications in the fields of population ecology, migration tracking, conservation planning, diet analyses, and disease ecology. This review provides a guide for field ornithologists interested in incorporating genomic approaches into their research program, with an emphasis on techniques related to ecology and conservation. We present a general overview of contemporary genomic approaches and methods, as well as important considerations when selecting a genomic technique. We also discuss research questions that are likely to benefit from utilizing high-throughput sequencing instruments, highlighting select examples from recent avian studies.

  20. DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

    PubMed

    de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

    2015-11-16

    Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Molecular characterisation of Atlantic salmon paramyxovirus (ASPV): A novel paramyxovirus associated with proliferative gill inflammation

    USGS Publications Warehouse

    Falk, K.; Batts, W.N.; Kvellestad, A.; Kurath, G.; Wiik-Nielsen, J.; Winton, J.R.

    2008-01-01

    Atlantic salmon paramyxovirus (ASPV) was isolated in 1995 from gills of farmed Atlantic salmon suffering from proliferative gill inflammation. The complete genome sequence of ASPV was determined, revealing a genome 16,968 nucleotides in length consisting of six non-overlapping genes coding for the nucleo- (N), phospho- (P), matrix- (M), fusion- (F), haemagglutinin-neuraminidase- (HN) and large polymerase (L) proteins in the order 3′-N-P-M-F-HN-L-5′. The various conserved features related to virus replication found in most paramyxoviruses were also found in ASPV. These include: conserved and complementary leader and trailer sequences, tri-nucleotide intergenic regions and highly conserved transcription start and stop signal sequences. The P gene expression strategy of ASPV was like that of the respiro-, morbilli- and henipaviruses, which express the P and C proteins from the primary transcript and edit a portion of the mRNA to encode V and W proteins. Sequence similarities among various features related to virus replication, pairwise comparisons of all deduced ASPV protein sequences with homologous regions from other members of the family Paramyxoviridae, and phylogenetic analyses of these amino acid sequences suggested that ASPV was a novel member of the sub-family Paramyxovirinae, most closely related to the respiroviruses. © 2008 Elsevier B.V. All rights reserved.

  2. A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology.

    PubMed

    Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai

    2017-11-23

    The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.
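    The idea of assessing target conservation from deep sequencing reads can be illustrated with a minimal sketch. This is not the VirMut algorithm, whose details are not given in the abstract; it assumes reads that are already aligned to the target without gaps, and the function name and toy data are hypothetical.

```python
from collections import Counter


def position_conservation(target, reads):
    """For each target position, return the fraction of covering reads
    whose base matches the target (None where no read covers it).

    `reads` is a list of (offset, sequence) pairs. Each read is assumed
    to be aligned without gaps, starting at `offset` in the target.
    """
    counts = [Counter() for _ in target]
    for offset, seq in reads:
        for i, base in enumerate(seq):
            pos = offset + i
            if 0 <= pos < len(target):
                counts[pos][base] += 1

    conservation = []
    for ref_base, counter in zip(target, counts):
        depth = sum(counter.values())
        conservation.append(counter[ref_base] / depth if depth else None)
    return conservation


# Hypothetical toy data: one read carries a T->A mismatch at position 3.
target = "ACGTACGT"
reads = [(0, "ACGTA"), (2, "GTACGT"), (1, "CGAACG"), (3, "TACGT")]
profile = position_conservation(target, reads)
```

    A real pipeline would additionally need read mapping, quality filtering, and a statistical threshold separating true mutations from sequencing errors, as the abstract notes.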

  3. Characterization and Evolution of Conserved MicroRNA through Duplication Events in Date Palm (Phoenix dactylifera)

    PubMed Central

    Yang, Yaodong; Mason, Annaliese S.; Lei, Xintao; Ma, Zilong

    2013-01-01

    MicroRNAs (miRNAs) are important regulators of gene expression at the post-transcriptional level in a wide range of species. Highly conserved miRNAs regulate ancestral transcription factors common to all plants, and control important basic processes such as cell division and meristem function. We selected 21 conserved miRNA families to analyze the distribution and maintenance of miRNAs. Recently, the first genome sequence in Palmaceae was released: date palm (Phoenix dactylifera). We conducted a systematic miRNA analysis in date palm, computationally identifying and characterizing the distribution and duplication of conserved miRNAs in this species compared to other published plant genomes. A total of 81 miRNAs belonging to 18 miRNA families were identified in date palm. The majority of miRNAs in date palm and seven other well-studied plant species were located in intergenic regions and located 4 to 5 kb away from the nearest protein-coding genes. Sequence comparison showed that 67% of date palm miRNA members were present in duplicated segments, and that 135 pairs of miRNA-containing segments were duplicated in Arabidopsis, tomato, orange, rice, apple, poplar and soybean with a high similarity of non coding sequences between duplicated segments, indicating genomic duplication was a major force for expansion of conserved miRNAs. Duplicated miRNA pairs in date palm showed divergence in pre-miRNA sequence and in number of promoters, implying that these duplicated pairs may have undergone divergent evolution. Comparisons between date palm and the seven other plant species for the gain/loss of miR167 loci in an ancient segment shared between monocots and dicots suggested that these conserved miRNAs were highly influenced by and diverged as a result of genomic duplication events. PMID:23951162

  4. Characterization and evolution of conserved MicroRNA through duplication events in date palm (Phoenix dactylifera).

    PubMed

    Xiao, Yong; Xia, Wei; Yang, Yaodong; Mason, Annaliese S; Lei, Xintao; Ma, Zilong

    2013-01-01

    MicroRNAs (miRNAs) are important regulators of gene expression at the post-transcriptional level in a wide range of species. Highly conserved miRNAs regulate ancestral transcription factors common to all plants, and control important basic processes such as cell division and meristem function. We selected 21 conserved miRNA families to analyze the distribution and maintenance of miRNAs. Recently, the first genome sequence in Palmaceae was released: date palm (Phoenix dactylifera). We conducted a systematic miRNA analysis in date palm, computationally identifying and characterizing the distribution and duplication of conserved miRNAs in this species compared to other published plant genomes. A total of 81 miRNAs belonging to 18 miRNA families were identified in date palm. The majority of miRNAs in date palm and seven other well-studied plant species were located in intergenic regions and located 4 to 5 kb away from the nearest protein-coding genes. Sequence comparison showed that 67% of date palm miRNA members were present in duplicated segments, and that 135 pairs of miRNA-containing segments were duplicated in Arabidopsis, tomato, orange, rice, apple, poplar and soybean with a high similarity of non coding sequences between duplicated segments, indicating genomic duplication was a major force for expansion of conserved miRNAs. Duplicated miRNA pairs in date palm showed divergence in pre-miRNA sequence and in number of promoters, implying that these duplicated pairs may have undergone divergent evolution. Comparisons between date palm and the seven other plant species for the gain/loss of miR167 loci in an ancient segment shared between monocots and dicots suggested that these conserved miRNAs were highly influenced by and diverged as a result of genomic duplication events.

  5. The crystal structure of Erwinia amylovora AmyR, a member of the YbjN protein family, shows similarity to type III secretion chaperones but suggests different cellular functions

    PubMed Central

    Bartho, Joseph D.; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O.; Zhao, Youfu; Walsh, Martin A.

    2017-01-01

    AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family. PMID:28426806

  6. The crystal structure of Erwinia amylovora AmyR, a member of the YbjN protein family, shows similarity to type III secretion chaperones but suggests different cellular functions.

    PubMed

    Bartho, Joseph D; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O; Zhao, Youfu; Walsh, Martin A; Benini, Stefano

    2017-01-01

    AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family.

  7. Molecular cloning, sequence analysis and homology modeling of the first caudata amphibian antifreeze-like protein in axolotl (Ambystoma mexicanum).

    PubMed

    Zhang, Songyan; Gao, Jiuxiang; Lu, Yiling; Cai, Shasha; Qiao, Xue; Wang, Yipeng; Yu, Haining

    2013-08-01

    Antifreeze proteins (AFPs) refer to a class of polypeptides that are produced by certain vertebrates, plants, fungi, and bacteria and which permit their survival in subzero environments. In this study, we report the molecular cloning, sequence analysis and three-dimensional structure of the axolotl antifreeze-like protein (AFLP) by homology modeling of the first caudate amphibian AFLP. We constructed a full-length spleen cDNA library of axolotl (Ambystoma mexicanum). An EST having highest similarity (∼42%) with freeze-responsive liver protein Li16 from Rana sylvatica was identified, and the full-length cDNA was subsequently obtained by RACE-PCR. The axolotl antifreeze-like protein sequence represents an open reading frame for a putative signal peptide and the mature protein composed of 93 amino acids. The calculated molecular mass and the theoretical isoelectric point (pI) of this mature protein were 10128.6 Da and 8.97, respectively. The molecular characterization of this gene and its deduced protein were further performed by detailed bioinformatics analysis. The three-dimensional structure of the current AFLP was predicted by homology modeling, and the conserved residues required for functionality were identified. The homology model constructed could be of use for effective drug design. This is the first report of an antifreeze-like protein identified from a caudate amphibian.

  8. Identification and analysis of proton-translocating pyrophosphatases in the methanogenic archaeon Methanosarcina mazei.

    PubMed

    Bäumer, Sebastian; Lentes, Sabine; Gottschalk, Gerhard; Deppenmeier, Uwe

    2002-03-01

    Analysis of genome sequence data from the methanogenic archaeon Methanosarcina mazei Gö1 revealed the existence of two open reading frames encoding proton-translocating pyrophosphatases (PPases). These open reading frames are linked by a 750-bp intergenic region containing TC-rich stretches and are transcribed in opposite directions. The corresponding polypeptides are referred to as Mvp1 and Mvp2 and consist of 671 and 676 amino acids, respectively. Both enzymes represent extremely hydrophobic, integral membrane proteins with 15 predicted transmembrane segments and an overall amino acid sequence similarity of 50.1%. Multiple sequence alignments revealed that Mvp1 is closely related to eukaryotic PPases, whereas Mvp2 shows highest homologies to bacterial PPases. Northern blot experiments with RNA from methanol-grown cells harvested in the mid-log growth phase indicated that only Mvp2 was produced under these conditions. Analysis of washed membranes showed that Mvp2 had a specific activity of 0.34 U mg (protein)(-1). Proton translocation experiments with inverted membrane vesicles prepared from methanol-grown cells showed that hydrolysis of 1 mol of pyrophosphate was coupled to the translocation of about 1 mol of protons across the cytoplasmic membrane. Appropriate conditions for mvp1 expression could not be determined yet. The pyrophosphatases of M. mazei Gö1 represent the first examples of this enzyme class in methanogenic archaea and may be part of their energy-conserving system.

  9. The Small-RNA Profiles of Almond (Prunus dulcis Mill.) Reproductive Tissues in Response to Cold Stress.

    PubMed

    Karimi, Marzieh; Ghazanfari, Farahnaz; Fadaei, Adeleh; Ahmadi, Laleh; Shiran, Behrouz; Rabei, Mohammad; Fallahi, Hossein

    2016-01-01

    Spring frost is an important environmental stress that threatens the production of Prunus trees. However, little information is available regarding molecular response of these plants to the frost stress. Using high throughput sequencing, this study was conducted to identify differentially expressed miRNAs, both the conserved and the non-conserved ones, in the reproductive tissues of almond tolerant H genotype under cold stress. Analysis of 50 to 58 million raw reads led to identification of 174 unique conserved and 59 novel microRNAs (miRNAs). Differential expression pattern analysis showed that 50 miRNA families were expressed differentially in one or both of almond reproductive tissues (anther and ovary). Out of these 50 miRNA families, 12 and 15 displayed up-regulation and down-regulation, respectively. The distribution of conserved miRNA families indicated that miR482f harbors the highest number of members. Confirmation of miRNAs expression patterns by quantitative real-time PCR (qPCR) was performed in cold tolerant (H genotype) alongside a sensitive variety (Sh12 genotype). Our analysis revealed differential expression for 9 miRNAs in anther and 3 miRNAs in ovary between these two varieties. Target prediction of miRNAs followed by differential expression analysis resulted in identification of 83 target genes, mostly transcription factors. This study comprehensively catalogued expressed miRNAs under different temperatures in two reproductive tissues (anther and ovary). Results of current study and the previous RNA-seq study, which was conducted in the same tissues by our group, provide a unique opportunity to understand the molecular basis of responses of almond to cold stress. The results can also enhance the possibility for gene manipulation to develop cold tolerant plants.

  10. Endemicity and evolutionary value: a study of Chilean endemic vascular plant genera

    PubMed Central

    Scherson, Rosa A; Albornoz, Abraham A; Moreira-Muñoz, Andrés S; Urbina-Casanova, Rafael

    2014-01-01

    This study uses phylogeny-based measures of evolutionary potential (phylogenetic diversity and community structure) to evaluate the evolutionary value of vascular plant genera endemic to Chile. Endemicity is regarded as a very important consideration for conservation purposes. Taxa that are endemic to a single country are valuable conservation targets, as their protection depends upon a single government policy. This is especially relevant in developing countries in which conservation is not always a high resource allocation priority. Phylogeny-based measures of evolutionary potential such as phylogenetic diversity (PD) have been regarded as meaningful measures of the “value” of taxa and ecosystems, as they are able to account for the attributes that could allow taxa to recover from environmental changes. Chile is an area of remarkable endemism, harboring a flora that shows the highest number of endemic genera in South America. We studied PD and community structure of this flora using a previously available supertree at the genus level, to which we added DNA sequences of 53 genera endemic to Chile. Using discrepancy values and a null model approach, we decoupled PD from taxon richness, in order to compare their geographic distribution over a one-degree grid. An interesting pattern was observed in which areas to the southwest appear to harbor more PD than expected by their generic richness than those areas to the north of the country. In addition, some southern areas showed more PD than expected by chance, as calculated with the null model approach. Geological history as documented by the study of ancient floras as well as glacial refuges in the coastal range of southern Chile during the quaternary seem to be consistent with the observed pattern, highlighting the importance of this area for conservation purposes. PMID:24683462

  11. The Small-RNA Profiles of Almond (Prunus dulcis Mill.) Reproductive Tissues in Response to Cold Stress

    PubMed Central

    Shiran, Behrouz; Rabei, Mohammad; Fallahi, Hossein

    2016-01-01

    Spring frost is an important environmental stress that threatens the production of Prunus trees. However, little information is available regarding molecular response of these plants to the frost stress. Using high throughput sequencing, this study was conducted to identify differentially expressed miRNAs, both the conserved and the non-conserved ones, in the reproductive tissues of almond tolerant H genotype under cold stress. Analysis of 50 to 58 million raw reads led to identification of 174 unique conserved and 59 novel microRNAs (miRNAs). Differential expression pattern analysis showed that 50 miRNA families were expressed differentially in one or both of almond reproductive tissues (anther and ovary). Out of these 50 miRNA families, 12 and 15 displayed up-regulation and down-regulation, respectively. The distribution of conserved miRNA families indicated that miR482f harbors the highest number of members. Confirmation of miRNAs expression patterns by quantitative real-time PCR (qPCR) was performed in cold tolerant (H genotype) alongside a sensitive variety (Sh12 genotype). Our analysis revealed differential expression for 9 miRNAs in anther and 3 miRNAs in ovary between these two varieties. Target prediction of miRNAs followed by differential expression analysis resulted in identification of 83 target genes, mostly transcription factors. This study comprehensively catalogued expressed miRNAs under different temperatures in two reproductive tissues (anther and ovary). Results of current study and the previous RNA-seq study, which was conducted in the same tissues by our group, provide a unique opportunity to understand the molecular basis of responses of almond to cold stress. The results can also enhance the possibility for gene manipulation to develop cold tolerant plants. PMID:27253370

  12. The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence.

    PubMed

    Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J

    2015-09-18

    La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Analysis of the intergenic region of tomato spotted wilt Tospovirus medium RNA segment.

    PubMed

    Bhat, A I; Pappu, S S; Pappu, H R; Deom, C M; Culbreath, A K

    1999-06-01

    The intergenic region (IGR) of the medium (M) RNA of tomato spotted wilt Tospovirus (TSWV) isolates naturally infecting peanut (groundnut), pepper, potato, stokesia, tobacco and watermelon in Georgia (GA) and a peanut isolate from Florida (FL) was cloned and sequenced. The IGR sequences were compared with one another and with respective M RNA IGRs of TSWV isolates from Brazil and Japan and other tospoviruses. The length of M IGR of GA and FL isolates varied from 271 to 277 nucleotides. The M IGRs of TSWV from potato and stokesia, and tobacco and watermelon were identical with each other in their length and sequence. IGR sequences were more conserved (95-100%) among the populations of TSWV from GA and FL, than when compared with those of TSWV isolates from other countries (83-94%). The conserved motif (CAAACTTTGG) present in the IGRs of both M and small (S) RNAs of a Brazilian isolate of TSWV was also conserved in the isolates studied. Cluster analysis of the IGR sequences showed that all GA and FL isolates are closely clustered and are distinct from the TSWV isolates from other countries as well as from other tospoviruses.
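    Conservation figures such as the 95-100% and 83-94% ranges above are typically pairwise identities over aligned sequences. The following is a minimal sketch of that calculation together with a check for the conserved motif. The two toy sequences are hypothetical and are not TSWV data; only the motif string comes from the abstract, and the convention of counting a gap against a base as a mismatch is an assumption.

```python
MOTIF = "CAAACTTTGG"  # conserved motif quoted in the abstract


def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Columns where both sequences have a gap ('-') are skipped; a gap
    opposite a base is counted as a mismatch (an assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned toy sequences differing at one of 14 columns.
seq_a = "ATCAAACTTTGGTA"
seq_b = "ATCAAACTTTGGCA"

identity = percent_identity(seq_a, seq_b)
motif_in_both = MOTIF in seq_a and MOTIF in seq_b
```

    Published identity values depend on the alignment program and its gap treatment, so numbers computed this way are comparable only when the same convention is used throughout.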

  14. The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence

    DOE PAGES

    Lahr, Roni M.; Mack, Seshat M.; Heroux, Annie; ...

    2015-07-22

    La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. Ultimately, these studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis.

  15. Insilico profiling of microRNAs in Korean ginseng (Panax ginseng Meyer)

    PubMed Central

    Mathiyalagan, Ramya; Subramaniyam, Sathiyamoorthy; Natarajan, Sathishkumar; Kim, Yeon Ju; Sun, Myung Suk; Kim, Se Young; Kim, Yu-Jin; Yang, Deok Chun

    2013-01-01

    MicroRNAs (miRNAs) are a class of recently discovered non-coding small RNA molecules, on average approximately 21 nucleotides in length, which underlie numerous important biological roles in gene regulation in various organisms. The miRNA database (release 18) has 18,226 miRNAs, which have been deposited from different species. Although miRNAs have been identified and validated in many plant species, no studies have been reported on discovering miRNAs in Panax ginseng Meyer, which is a traditionally known medicinal plant in oriental medicine, also known as Korean ginseng. It has triterpene ginseng saponins called ginsenosides, which are responsible for its various pharmacological activities. Predicting conserved miRNAs by homology-based analysis with available expressed sequence tag (EST) sequences can be powerful, if the species lacks whole genome sequence information. In this study by using the EST based computational approach, 69 conserved miRNAs belonging to 44 miRNA families were identified in Korean ginseng. The digital gene expression patterns of predicted conserved miRNAs were analyzed by deep sequencing using small RNA sequences of flower buds, leaves, and lateral roots. We have found that many of the identified miRNAs showed tissue specific expressions. Using the insilico method, 346 potential targets were identified for the predicted 69 conserved miRNAs by searching the ginseng EST database, and the predicted targets were mainly involved in secondary metabolic processes, responses to biotic and abiotic stress, and transcription regulator activities, as well as a variety of other metabolic processes. PMID:23717176

  16. [Analysis of genotype and phenotype correlation of MYH7-V878A mutation among ethnic Han Chinese pedigrees affected with hypertrophic cardiomyopathy].

    PubMed

    Wang, Bo; Guo, Ruiqi; Zuo, Lei; Shao, Hong; Liu, Ying; Wang, Yu; Ju, Yan; Sun, Chao; Wang, Lifeng; Zhang, Yanmin; Liu, Liwen

    2017-08-10

    To analyze the phenotype-genotype correlation of MYH7-V878A mutation. Exonic amplification and high-throughput sequencing of 96 cardiovascular disease-related genes were carried out on probands from 210 pedigrees affected with hypertrophic cardiomyopathy (HCM). For the probands, their family members, and 300 healthy volunteers, the identified MYH7-V878A mutation was verified by Sanger sequencing. Information of the HCM patients and their family members, including clinical data, physical examination, echocardiography (UCG), electrocardiography (ECG), and conserved sequence of the mutation among various species were analyzed. A MYH7-V878A mutation was detected in five HCM pedigrees containing 31 family members. Fourteen members carried the mutation, among whom 11 were diagnosed with HCM, while 3 did not meet the diagnostic criteria. Some of the fourteen members also carried other mutations. Family members not carrying the mutation had normal UCG and ECG. No MYH7-V878A mutation was found among the 300 healthy volunteers. Analysis of sequence conservation showed that the amino acid is located in highly conserved regions among various species. MYH7-V878A is a hot spot among ethnic Han Chinese with a high penetrance. Functional analysis of the conserved sequences suggested that the mutation may cause significant alteration of the function. MYH7-V878A has a significant value for the early diagnosis of HCM.

  17. [Cloning and functional characterization of phytoene desaturase in Andrographis paniculata].

    PubMed

    Shen, Qin-qin; Li, Li-xia; Zhan, Peng-lin; Wang, Qiang

    2015-10-01

    A full-length cDNA of phytoene desaturase (PDS) gene from Andrographis paniculata was obtained through RACE-PCR. The cDNA sequence consists of 2 224 bp with an intact ORF of 1 752 bp (GenBank: KP982892), encoding a polypeptide of 584 amino acids. Homology analysis showed that the deduced protein has extensive sequence similarities to PDS from other plants, and contains a conserved NAD(H)-binding domain of the plant dehydrase cofactor-binding domain at the N-terminus. Phylogenetic analysis demonstrated that ApPDS was more related to PDS of Sesamum indicum and Pogostemon cablin. The semi-quantitative RT-PCR analysis revealed that ApPDS was expressed in all aboveground tissues, with the highest expression in leaves. Virus induced gene silencing (VIGS) was performed to characterize the function of ApPDS in planta. Significant photobleaching was not observed in infiltrated leaves, while the PDS gene was down-regulated significantly in the yellowish area. To the best of our knowledge, this represents the first report of PDS gene cloning and functional characterization from A. paniculata, which lays the foundation for further investigation of new genes, especially those related to the andrographolide biosynthetic pathway.

  18. Missense polymorphisms in the MC1R gene of the dog, red fox, arctic fox and Chinese raccoon dog.

    PubMed

    Nowacka-Woszuk, J; Salamon, S; Gorna, A; Switonski, M

    2013-04-01

    Coat colour variation is determined by many genes, one of which is the melanocortin receptor type 1 (MC1R) gene. In this study, we examined the whole coding sequence of this gene in four species belonging to the Canidae family (dog, red fox, arctic fox and Chinese raccoon dog). Although the comparative analysis of the obtained nucleotide sequences revealed a high conservation, which varied between 97.9 and 99.1%, we altogether identified 22 SNPs (10 in dogs, six in farmed red foxes, two in wild red foxes, three in arctic foxes and one in Chinese raccoon dog). Among them, seven appeared to be novel: one silent in the dog, three missense and one silent in the red fox, one in the 3'-flanking region in the arctic fox and one silent in the Chinese raccoon dog. In dogs and red foxes, the SNPs segregated as 10 and four haplotypes, respectively. Taking into consideration the published reports and results of this study, the highest number of missense polymorphisms was until now found in the dog (9) and red fox (7). © 2012 Blackwell Verlag GmbH.

  19. Use of armored RNA as a standard to construct a calibration curve for real-time RT-PCR.

    PubMed

    Donia, D; Divizia, M; Pana', A

    2005-06-01

    Armored Enterovirus RNA was used to standardize a real-time reverse transcription (RT)-PCR for environmental testing. Armored technology is a system to produce a robust and stable RNA standard, trapped into phage proteins, to be used as internal control. The Armored Enterovirus RNA protected sequence includes 263 bp of highly conserved sequences in the 5' UTR region. During these tests, Armored RNA has been used to produce a calibration curve, comparing three different fluorogenic chemistries: TaqMan system, SYBR Green I and Lux-primers. The effective evaluation of three amplifying commercial reagent kits, in use to carry out real-time RT-PCR, and several extraction procedures of protected viral RNA have been carried out. The highest Armored RNA recovery was obtained by heat treatment while chemical extraction may decrease the quantity of RNA. The best sensitivity and specificity was obtained using the SYBR Green I technique since it is a reproducible test, easy to use and the cheapest one. TaqMan and Lux-primer assays provide good RT-PCR efficiency in relation to the several extraction methods used, although the labelled probe or primer required by these chemistries increases the cost of testing.
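    A calibration (standard) curve for real-time RT-PCR is usually a straight-line fit of threshold cycle (Ct) against log10 of the standard's copy number. The sketch below shows that general calculation and is not the authors' protocol. The dilution series and Ct values are synthetic numbers chosen for illustration, and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency from the curve slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies in an unknown sample."""
    return 10 ** ((ct - intercept) / slope)


# Synthetic ten-fold dilution series of a standard (not measured data).
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_ct = [36.0, 32.7, 29.4, 26.1, 22.8]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
efficiency = amplification_efficiency(slope)
estimate = copies_from_ct(29.4, slope, intercept)
```

    A slope near -3.32 corresponds to an efficiency near 100%. Comparing slopes and intercepts obtained with different chemistries or extraction methods is one way such assays can be set side by side.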

  20. Arsenic bioremediation potential of a new arsenite-oxidizing bacterium Stenotrophomonas sp. MM-7 isolated from soil.

    PubMed

    Bahar, Md Mezbaul; Megharaj, Mallavarapu; Naidu, Ravi

    2012-11-01

    A new arsenite-oxidizing bacterium was isolated from a low arsenic-containing (8.8 mg kg(-1)) soil. Phylogenetic analysis based on 16S rRNA gene sequencing indicated that the strain was closely related to Stenotrophomonas panacihumi. Batch experiment results showed that the strain completely oxidized 500 μM of arsenite to arsenate within 12 h of incubation in a minimal salts medium. The optimum initial pH range for arsenite oxidation was 5-7. The strain was found to tolerate as high as 60 mM arsenite in culture media. The arsenite oxidase gene was amplified by PCR with degenerate primers. The deduced amino acid sequence showed the highest identity (69.1 %) with the molybdenum containing large subunit of arsenite oxidase derived from Bosea sp. Furthermore, the amino acids involved in binding the substrate arsenite were conserved with the arsenite oxidases of other arsenite oxidizing bacteria such as Alcaligenes faecalis and Herminiimonas arsenicoxydans. To our knowledge, this study constitutes the first report on arsenite oxidation using Stenotrophomonas sp. and the strain has great potential for application in arsenic remediation of contaminated water.

  1. Dominant Sequences of Human Major Histocompatibility Complex Conserved Extended Haplotypes from HLA-DQA2 to DAXX

    PubMed Central

    Larsen, Charles E.; Alford, Dennis R.; Trautwein, Michael R.; Jalloh, Yanoh K.; Tarnacki, Jennifer L.; Kunnenkeri, Sushruta K.; Fici, Dolores A.; Yunis, Edmond J.; Awdeh, Zuheir L.; Alper, Chester A.

    2014-01-01

    We resequenced and phased 27 kb of DNA within 580 kb of the MHC class II region in 158 population chromosomes, most of which were conserved extended haplotypes (CEHs) of European descent or contained their centromeric fragments. We determined the single nucleotide polymorphism and deletion-insertion polymorphism alleles of the dominant sequences from HLA-DQA2 to DAXX for these CEHs. Nine of 13 CEHs remained sufficiently intact to possess a dominant sequence extending at least to DAXX, 230 kb centromeric to HLA-DPB1. We identified the regions centromeric to HLA-DQB1 within which single instances of eight “common” European MHC haplotypes previously sequenced by the MHC Haplotype Project (MHP) were representative of those dominant CEH sequences. Only two MHP haplotypes had a dominant CEH sequence throughout the centromeric and extended class II region and one MHP haplotype did not represent a known European CEH anywhere in the region. We identified the centromeric recombination transition points of other MHP sequences from CEH representation to non-representation. Several CEH pairs or groups shared sequence identity in small blocks but had significantly different (although still conserved for each separate CEH) sequences in surrounding regions. These patterns partly explain strong calculated linkage disequilibrium over only short (tens to hundreds of kilobases) distances in the context of a finite number of observed megabase-length CEHs comprising half a population's haplotypes. Our results provide a clearer picture of European CEH class II allelic structure and population haplotype architecture, improved regional CEH markers, and raise questions concerning regional recombination hotspots. PMID:25299700

  2. Sequence Diversity Diagram for comparative analysis of multiple sequence alignments.

    PubMed

    Sakai, Ryo; Aerts, Jan

    2014-01-01

    The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks.

  3. T cell receptor repertoires of mice and humans are clustered in similarity networks around conserved public CDR3 sequences

    PubMed Central

    Madi, Asaf; Poran, Asaf; Shifrut, Eric; Reich-Zeliger, Shlomit; Greenstein, Erez; Zaretsky, Irena; Arnon, Tomer; Laethem, Francois Van; Singer, Alfred; Lu, Jinghua; Sun, Peter D; Cohen, Irun R; Friedman, Nir

    2017-01-01

    Diversity of T cell receptor (TCR) repertoires, generated by somatic DNA rearrangements, is central to immune system function. However, the level of sequence similarity of TCR repertoires within and between species has not been characterized. Using network analysis of high-throughput TCR sequencing data, we found that abundant CDR3-TCRβ sequences were clustered within networks generated by sequence similarity. We discovered a substantial number of public CDR3-TCRβ segments that were identical in mice and humans. These conserved public sequences were central within TCR sequence-similarity networks. Annotated TCR sequences, previously associated with self-specificities such as autoimmunity and cancer, were linked to network clusters. Mechanistically, CDR3 networks were promoted by MHC-mediated selection, and were reduced following immunization, immune checkpoint blockade or aging. Our findings provide a new view of T cell repertoire organization and physiology, and suggest that the immune system distributes its TCR sequences unevenly, attending to specific foci of reactivity. DOI: http://dx.doi.org/10.7554/eLife.22057.001 PMID:28731407

  4. Beta-globin locus activation regions: conservation of organization, structure, and function.

    PubMed Central

    Li, Q L; Zhou, B; Powers, P; Enver, T; Stamatoyannopoulos, G

    1990-01-01

    The human beta-globin locus activation region (LAR) comprises four erythroid-specific DNase I hypersensitive sites (I-IV) thought to be largely responsible for activating the beta-globin domain and facilitating high-level erythroid-specific globin gene expression. We identified the goat beta-globin LAR, determined 10.2 kilobases of its sequence, and demonstrated its function in transgenic mice. The human and goat LARs share 6.5 kilobases of homologous sequences that are as highly conserved as the epsilon-globin gene promoters. Furthermore, the overall spatial organization of the two LARs has been conserved. These results suggest that the functionally relevant regions of the LAR are large and that in addition to their primary structure, the spatial relationship of the conserved elements is important for LAR function. PMID:2236034

  5. Physical mapping of repetitive DNA suggests 2n reduction in Amazon turtles Podocnemis (Testudines: Podocnemididae)

    PubMed Central

    Cavalcante, Manoella Gemaque; Bastos, Carlos Eduardo Matos Carvalho; Nagamachi, Cleusa Yoshiko; Pieczarka, Julio Cesar; Vicari, Marcelo Ricardo; Noronha, Renata Coelho Rodrigues

    2018-01-01

    Cytogenetic studies show that there is great karyotypic diversity in order Testudines (2n = 26–68), and that this may be mainly attributed to the presence/absence of microchromosomes. Members of the Podocnemididae family have the smallest diploid numbers of this order (2n = 26–28), which may be a derived condition of the group. Diverse studies suggest that repetitive-DNA-rich sites generally act as hotspots for double-strand breaks and chromosomal reorganization. In this context, we used fluorescent in situ hybridization (FISH) to map telomeric sequences (TTAGGG)n, 45S rDNA, and the genes encoding histones H1 and H3 in two species of genus Podocnemis. We also observed conservation of the 45S rDNA and H1 histone sequences (probable case of conserved synteny), but multiple conserved and non-conserved clusters of H3 genes, which colocalized with the interstitial telomeric sequences in the Podocnemis genome. Our results suggest that fusions have occurred between macro and microchromosomes or between microchromosomes, leading to the observed reduction in diploid number in the family Podocnemididae. PMID:29813087

  6. Physical mapping of repetitive DNA suggests 2n reduction in Amazon turtles Podocnemis (Testudines: Podocnemididae).

    PubMed

    Cavalcante, Manoella Gemaque; Bastos, Carlos Eduardo Matos Carvalho; Nagamachi, Cleusa Yoshiko; Pieczarka, Julio Cesar; Vicari, Marcelo Ricardo; Noronha, Renata Coelho Rodrigues

    2018-01-01

    Cytogenetic studies show that there is great karyotypic diversity in order Testudines (2n = 26-68), and that this may be mainly attributed to the presence/absence of microchromosomes. Members of the Podocnemididae family have the smallest diploid numbers of this order (2n = 26-28), which may be a derived condition of the group. Diverse studies suggest that repetitive-DNA-rich sites generally act as hotspots for double-strand breaks and chromosomal reorganization. In this context, we used fluorescent in situ hybridization (FISH) to map telomeric sequences (TTAGGG)n, 45S rDNA, and the genes encoding histones H1 and H3 in two species of genus Podocnemis. We also observed conservation of the 45S rDNA and H1 histone sequences (probable case of conserved synteny), but multiple conserved and non-conserved clusters of H3 genes, which colocalized with the interstitial telomeric sequences in the Podocnemis genome. Our results suggest that fusions have occurred between macro and microchromosomes or between microchromosomes, leading to the observed reduction in diploid number in the family Podocnemididae.

  7. Conservation of the glycoprotein B homologs of the Kaposi’s sarcoma-associated herpesvirus (KSHV/HHV8) and Old World primate rhadinoviruses of chimpanzees and macaques

    PubMed Central

    Bruce, A. Gregory; Horst, Jeremy A.; Rose, Timothy M.

    2016-01-01

    The envelope-associated glycoprotein B (gB) is highly conserved within the Herpesviridae and plays a critical role in viral entry. We analyzed the evolutionary conservation of sequence and structural motifs within the Kaposi’s sarcoma-associated herpesvirus (KSHV) gB and homologs of Old World primate rhadinoviruses belonging to the distinct RV1 and RV2 rhadinovirus lineages. In addition to gB homologs of rhadinoviruses infecting the pig-tailed and rhesus macaques, we cloned and sequenced gB homologs of RV1 and RV2 rhadinoviruses infecting chimpanzees. A structural model of the KSHV gB was determined, and functional motifs and sequence variants were mapped to the model structure. Conserved domains and motifs were identified, including an “RGD” motif that plays a critical role in KSHV binding and entry through the cellular integrin αVβ3. The RGD motif was only detected in RV1 rhadinoviruses suggesting an important difference in cell tropism between the two rhadinovirus lineages. PMID:27070755

  8. Sample sequencing of vascular plants demonstrates widespread conservation and divergence of microRNAs.

    PubMed

    Chávez Montes, Ricardo A; de Fátima Rosas-Cárdenas, Flor; De Paoli, Emanuele; Accerbi, Monica; Rymarquis, Linda A; Mahalingam, Gayathri; Marsch-Martínez, Nayelli; Meyers, Blake C; Green, Pamela J; de Folter, Stefan

    2014-04-23

    Small RNAs are pivotal regulators of gene expression that guide transcriptional and post-transcriptional silencing mechanisms in eukaryotes, including plants. Here we report a comprehensive atlas of sRNA and miRNA from 3 species of algae and 31 representative species across vascular plants, including non-model plants. We sequence and quantify sRNAs from 99 different tissues or treatments across species, resulting in a data set of over 132 million distinct sequences. Using miRBase mature sequences as a reference, we identify the miRNA sequences present in these libraries. We apply diverse profiling methods to examine critical sRNA and miRNA features, such as size distribution, tissue-specific regulation and sequence conservation between species, as well as to predict putative new miRNA sequences. We also develop database resources, computational analysis tools and a dedicated website, http://smallrna.udel.edu/. This study provides new insights on plant sRNAs and miRNAs, and a foundation for future studies.

  9. Diatom centromeres suggest a mechanism for nuclear DNA acquisition

    DOE PAGES

    Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.; ...

    2017-07-18

    Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequence features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.

  10. Diatom centromeres suggest a mechanism for nuclear DNA acquisition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.

    Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequence features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.

  11. Validation of Skeletal Muscle cis-Regulatory Module Predictions Reveals Nucleotide Composition Bias in Functional Enhancers

    PubMed Central

    Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.

    2011-01-01

    We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences indicates that nucleotide sequence composition can be an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875

  12. Comparative RNA sequencing reveals substantial genetic variation in endangered primates

    PubMed Central

    Perry, George H.; Melsted, Páll; Marioni, John C.; Wang, Ying; Bainer, Russell; Pickrell, Joseph K.; Michelini, Katelyn; Zehr, Sarah; Yoder, Anne D.; Stephens, Matthew; Pritchard, Jonathan K.; Gilad, Yoav

    2012-01-01

    Comparative genomic studies in primates have yielded important insights into the evolutionary forces that shape genetic diversity and revealed the likely genetic basis for certain species-specific adaptations. To date, however, these studies have focused on only a small number of species. For the majority of nonhuman primates, including some of the most critically endangered, genome-level data are not yet available. In this study, we have taken the first steps toward addressing this gap by sequencing RNA from the livers of multiple individuals from each of 16 mammalian species, including humans and 11 nonhuman primates. Of the nonhuman primate species, five are lemurs and two are lorisoids, for which little or no genomic data were previously available. To analyze these data, we developed a method for de novo assembly and alignment of orthologous gene sequences across species. We assembled an average of 5721 gene sequences per species and characterized diversity and divergence of both gene sequences and gene expression levels. We identified patterns of variation that are consistent with the action of positive or directional selection, including an 18-fold enrichment of peroxisomal genes among genes whose regulation likely evolved under directional selection in the ancestral primate lineage. Importantly, we found no relationship between genetic diversity and endangered status, with the two most endangered species in our study, the black and white ruffed lemur and the Coquerel's sifaka, having the highest genetic diversity among all primates. Our observations imply that many endangered lemur populations still harbor considerable genetic variation. Timely efforts to conserve these species alongside their habitats have, therefore, strong potential to achieve long-term success. PMID:22207615

  13. Hop stunt viroid: molecular cloning and nucleotide sequence of the complete cDNA copy.

    PubMed Central

    Ohno, T; Takamatsu, N; Meshi, T; Okada, Y

    1983-01-01

    The complete cDNA of hop stunt viroid (HSV) has been cloned by the method of Okayama and Berg (Mol. Cell. Biol. 2, 161-170 (1982)) and the complete nucleotide sequence has been established. The covalently closed circular single-stranded HSV RNA consists of 297 nucleotides. The secondary structure predicted for HSV contains 67% of its residues base-paired. The native HSV can possess an extended rod-like structure characteristic of viroids previously established. The central region of the native HSV has a similar structure to the conserved region found in all viroids sequenced so far except for avocado sunblotch viroid. The sequence homologous to the 5'-end of U1a RNA is also found in the sequence of HSV but not in the central conserved region. PMID:6312412

  14. Genomewide Function Conservation and Phylogeny in the Herpesviridae

    PubMed Central

    Albà, M. Mar; Das, Rhiju; Orengo, Christine A.; Kellam, Paul

    2001-01-01

    The Herpesviridae are a large group of well-characterized double-stranded DNA viruses for which many complete genome sequences have been determined. We have extracted protein sequences from all predicted open reading frames of 19 herpesvirus genomes. Sequence comparison and protein sequence clustering methods have been used to construct herpesvirus protein homologous families. This resulted in 1692 proteins being clustered into 243 multiprotein families and 196 singleton proteins. Predicted functions were assigned to each homologous family based on genome annotation and published data and each family classified into seven broad functional groups. Phylogenetic profiles were constructed for each herpesvirus from the homologous protein families and used to determine conserved functions and genomewide phylogenetic trees. These trees agreed with molecular-sequence-derived trees and allowed greater insight into the phylogeny of ungulate and murine gammaherpesviruses. PMID:11156614

  15. A highly conserved N-terminal sequence for teleost vitellogenin with potential value to the biochemistry, molecular biology and pathology of vitellogenesis

    USGS Publications Warehouse

    Folmar, L.D.; Denslow, N.D.; Wallace, R.A.; LaFleur, G.; Gross, T.S.; Bonomelli, S.; Sullivan, C.V.

    1995-01-01

    N-terminal amino acid sequences for vitellogenin (Vtg) from six species of teleost fish (striped bass, mummichog, pinfish, brown bullhead, medaka, yellow perch and the sturgeon) are compared with published N-terminal Vtg sequences for the lamprey, clawed frog and domestic chicken. Striped bass and mummichog had 100% identical amino acids between positions 7 and 21, while pinfish, brown bullhead, sturgeon, lamprey, Xenopus and chicken were 87%, 93%, 60%, 47%, 47-60% (for four transcripts) and 40% identical, respectively, with striped bass for the same positions. Partial sequences obtained for medaka and yellow perch were 100% identical between positions 5 and 10. The potential utility of this conserved sequence for studies on the biochemistry, molecular biology and pathology of vitellogenesis is discussed.

  16. Analysis of Variability in HIV-1 Subtype A Strains in Russia Suggests a Combination of Deep Sequencing and Multitarget RNA Interference for Silencing of the Virus.

    PubMed

    Kretova, Olga V; Chechetkin, Vladimir R; Fedoseeva, Daria M; Kravatsky, Yuri V; Sosin, Dmitri V; Alembekov, Ildar R; Gorbacheva, Maria A; Gashnikova, Natalya M; Tchurikov, Nickolai A

    2017-02-01

    Any method for silencing the activity of the HIV-1 retrovirus should tackle the extremely high variability of HIV-1 sequences and mutational escape. We studied sequence variability in the vicinity of selected RNA interference (RNAi) targets from isolates of HIV-1 subtype A in Russia, and we propose that using artificial RNAi is a potential alternative to traditional antiretroviral therapy. We prove that using multiple RNAi targets overcomes the variability in HIV-1 isolates. The optimal number of targets critically depends on the conservation of the target sequences. The total number of targets that are conserved with a probability of 0.7-0.8 should exceed at least 2. Combining deep sequencing and multitarget RNAi may provide an efficient approach to cure HIV/AIDS.

  17. Sequence analysis of dolphin ferritin H and L subunits and possible iron-dependent translational control of dolphin ferritin gene

    PubMed Central

    Takaesu, Azusa; Watanabe, Kiyotaka; Takai, Shinji; Sasaki, Yukako; Orino, Koichi

    2008-01-01

    Background The iron-storage protein ferritin plays a central role in iron metabolism. Ferritin has a dual function: it stores iron and sequesters it to protect against iron-catalyzed reactive oxygen species. Tissue ferritin is composed of two kinds of subunits (H: heavy chain or heart-type subunit; L: light chain or liver-type subunit). Ferritin gene expression is controlled at the translational level in an iron-dependent manner or at the transcriptional level in an iron-independent manner. However, sequencing analysis of marine mammalian ferritin subunits has not yet been performed fully. The purpose of this study is to reveal cDNA-derived amino acid sequences of cetacean ferritin H and L subunits, and to demonstrate the possibility of iron-dependent expression of these subunits, especially the H subunit. Methods Sequence analyses of cetacean ferritin H and L subunits were performed by direct sequencing of polymerase chain reaction (PCR) fragments from cDNAs generated via reverse transcription-PCR of leukocyte total RNA prepared from blood samples of six different dolphin species (Pseudorca crassidens, Lagenorhynchus obliquidens, Grampus griseus, Globicephala macrorhynchus, Tursiops truncatus, and Delphinapterus leucas). The putative iron-responsive element sequence in the 5'-untranslated region of the six different dolphin species was revealed by direct sequencing of PCR fragments obtained using leukocyte genomic DNA. Results Dolphin H and L subunits consist of 182 and 174 amino acids, respectively, and amino acid sequence identities of ferritin subunits among these dolphins are highly conserved (H: 99–100%, (99→98) ; L: 98–100%). The conserved 28 bp IRE sequence was located -144 bp upstream from the initiation codon in the six different dolphin species. Conclusion These results indicate that six different dolphin species have conserved ferritin sequences, and suggest that these genes are iron-dependently expressed. PMID:18954429

  18. Biosystematics and Conservation: A Case Study with Two Enigmatic and Uncommon Species of Crassula from New Zealand

    PubMed Central

    De Lange, P. J.; Heenan, P. B.; Keeling, D. J.; Murray, B. G.; Smissen, R.; Sykes, W. R.

    2008-01-01

    Background and Aims Crassula hunua and C. ruamahanga have been taxonomically controversial. Here their distinctiveness is assessed so that their taxonomic and conservation status can be clarified. Methods Populations of these two species were analysed using morphological, chromosomal and DNA sequence data. Key Results It proved impossible to differentiate between these two species using 12 key morphological characters. Populations were found to be chromosomally variable with 11 different chromosome numbers ranging from 2n = 42 to 2n = 100. Meiotic behaviour and levels of pollen stainability were both variable. Phylogenetic analyses showed that differences exist in both nuclear and plastid DNA sequences between individual plants, sometimes from the same population. Conclusions The results suggest that these plants are a species complex that has evolved through interspecific hybridization and polyploidy. Their high levels of chromosomal and DNA sequence variation present a problem for their conservation. PMID:18055560

  19. Streptomyces griseus streptomycin phosphotransferase: expression of its gene in Escherichia coli and sequence homology with other antibiotic phosphotransferases and with eukaryotic protein kinases.

    PubMed

    Lim, C K; Smith, M C; Petty, J; Baumberg, S; Wootton, J C

    1989-12-01

    The aphD gene of Streptomyces griseus, encoding a streptomycin 6-phosphotransferase (SPH), was sub-cloned in the pBR322-based expression vector pRK9 (which contains the Serratia marcescens trp promoter) with selection for expression of streptomycin resistance in Escherichia coli. Two hybrid plasmids, pCKL631 and pCKL711, were isolated which conferred resistance. Both contained an approximately 2 kbp fragment already suspected to include aphD. The properties of in vitro deletion derivatives of these plasmids were consistent with the presumed location of aphD. In vitro deletion of a sequence including most of the trp promoter largely, but not quite completely, abolished the ability of the plasmid to confer streptomycin resistance, confirming that expression was indeed principally from the trp promoter. A polypeptide of approximately 34.5 kDa was present in minicells containing plasmids that conferred streptomycin resistance, but was absent when the plasmids contained in vitro deletions removing streptomycin resistance. Part of the fragment was sequenced and an open reading frame corresponding to aphD identified. A computer-assisted comparison of the deduced SPH sequence with those of other antibiotic phosphotransferases suggested a common structure A-B-C-D-E, where B and D were conserved between all sequences compared while A, C and E divided between the streptomycin and hygromycin B phosphotransferases on one hand and kanamycin/neomycin ones on the other. A composite sequence data base was searched for homologues to consensus matrices constructed from five approximately 12-residue subsequences within blocks B and D. For one subsequence, corresponding to the N-terminal portion of block D, those sequences from the database that yielded the highest homology scores comprised almost entirely either antibiotic phosphotransferases or eukaryotic protein kinases. Possible evolutionary implications of this homology, previously described by other groups, are discussed.

  20. Phylogenetic relationships of chrysanthemums in Korea based on novel SSR markers.

    PubMed

    Khaing, A A; Moe, K T; Hong, W J; Park, C S; Yeon, K H; Park, H S; Kim, D C; Choi, B J; Jung, J Y; Chae, S C; Lee, K M; Park, Y J

    2013-11-07

    Chrysanthemums are well known for their esthetic and medicinal values. Characterization of chrysanthemums is vital for their conservation and management as well as for understanding their genetic relationships. We found 12 simple sequence repeat markers (SSRs) of 100 designed primers to be polymorphic. These novel SSR markers were used to evaluate 95 accessions of chrysanthemums (3 indigenous and 92 cultivated accessions). Two hundred alleles were identified, with an average of 16.7 alleles per locus. KNUCRY-77 gave the highest polymorphic information content value (0.879), while KNUCRY-10 gave the lowest (0.218). Similar patterns of grouping were observed with a distance-based dendrogram developed using PowerMarker and model-based clustering with Structure. Three clusters with some admixtures were identified by model-based clustering. These newly developed SSR markers will be useful for further studies of chrysanthemums, such as taxonomy and marker-assisted selection breeding.

  1. Microsatellites for Lindera species

    Treesearch

    Craig S. Echt; D. Deemer; T.L. Kubisiak; C.D. Nelson

    2006-01-01

    Microsatellite markers were developed for conservation genetic studies of Lindera melissifolia (pondberry), a federally endangered shrub of southern bottomland ecosystems. Microsatellite sequences were obtained from DNA libraries that were enriched for the (AC)n simple sequence repeat motif. From 35 clone sequences, 20 primer...

  2. Analysis of the primary structure of the long terminal repeat and the gag and pol genes of the human spumaretrovirus.

    PubMed Central

    Maurer, B; Bannert, H; Darai, G; Flügel, R M

    1988-01-01

    The nucleotide sequence of the human spumaretrovirus (HSRV) genome was determined. The 5' long terminal repeat region was analyzed by strong stop cDNA synthesis and S1 nuclease mapping. The length of the RU5 region was determined and found to be 346 nucleotides long. The 5' long terminal repeat is 1,123 base pairs long and is bound by an 18-base-pair primer-binding site complementary to the 3' end of mammalian lysine-1,2-specific tRNA. Open reading frames for gag and pol genes were identified. Surprisingly, the HSRV gag protein does not contain the cysteine motif of the nucleic acid-binding proteins found in and typical of all other retroviral gag proteins; instead the HSRV gag gene encodes a strongly basic protein reminiscent of those of hepatitis B virus and retrotransposons. The carboxy-terminal part of the HSRV gag gene products encodes a protease domain. The pol gene overlaps the gag gene and is postulated to be synthesized as a gag/pol precursor via translational frameshifting analogous to that of Rous sarcoma virus, with 7 nucleotides immediately upstream of the termination codons of gag conserved between the two viral genomes. The HSRV pol gene is 2,730 nucleotides long, and its deduced protein sequence is readily subdivided into three well-conserved domains, the reverse transcriptase, the RNase H, and the integrase. Although the degree of homology of the HSRV reverse transcriptase domain is highest to that of murine leukemia virus, the HSRV genomic organization is more similar to that of human and simian immunodeficiency viruses. The data justify classifying the spumaretroviruses as a third subfamily of Retroviridae. Images PMID:2451755

  3. Ras-like family small GTPases genes in Nilaparvata lugens: Identification, phylogenetic analysis, gene expression and function in nymphal development

    PubMed Central

    Wang, Weixia; Li, Kailong; Wan, Pinjun; Lai, Fengxiang; Fu, Qiang; Zhu, Tingheng

    2017-01-01

    Twenty-nine cDNAs encoding Ras-like family small GTPases (RSGs) were cloned and sequenced from Nilaparvata lugens. Twenty-eight proteins are described here: 3 from Rho, 2 from Ras, 9 from Arf and 14 from Rabs. These RSGs from N. lugens have five conserved G-loop motifs and displayed a higher degree of sequence conservation with orthologues from insects. RT-qPCR analysis revealed that NlRSGs were expressed at all life stages and that the highest expression was observed in hemolymph, gut or wing for most NlRSGs. RNAi demonstrated that eighteen NlRSGs play a crucial role in nymphal development. Nymphs with silenced NlRSGs failed to molt or eclose, or underwent developmental arrest. The qRT-PCR analysis verified the correlation between mortality and the down-regulation of the target genes. The expression level of the nuclear receptors Kr-h1, Hr3, FTZ-F1 and E93, involved in the 20E and JH signal pathways, was affected in nymphs in which each of twelve NlRSGs was silenced individually. The expression of two Halloween genes, Cyp314a1 and Cyp315a1, involved in ecdysone synthesis, decreased in nymphs with silenced NlSar1 or NlArf1. Cyp307a1 increased in nymphs with silenced NlArf6. In N. lugens with NlSRβ, NlSar1 or NlRab2 silenced individually, a 0.0% eclosion rate and almost 100.0% mortality were observed at the 9th day. Further analysis showed that NlSRβ could serve as a candidate target for dsRNA-based pesticides for N. lugens control. PMID:28241066

  4. Location of a major antigenic site involved in Ross River virus neutralization.

    PubMed

    Vrati, S; Fernon, C A; Dalgarno, L; Weir, R C

    1988-02-01

    The location of a major antigenic domain involved in the neutralization of an alphavirus, Ross River virus, has been defined in terms of its position in the amino acid sequence of the E2 glycoprotein. The domain encompasses three topographically close epitopes which were identified using three E2-specific neutralizing monoclonal antibodies in competitive binding assays. Nucleotide sequencing of the structural protein genes of monoclonal antibody-selected antigenic variants showed that for each variant there was a single nucleotide change in the E2 gene leading to a nonconservative amino acid substitution in E2. Changes were at positions 216, 234, and 246-251 in the amino acid sequence. The epitopes are in a region of E2 which, though not strongly conserved as to sequence among Ross River virus, Semliki Forest virus, and Sindbis virus, is conserved in its hydropathy profile among the three alphaviruses. The epitopes lie between two asparagine-linked glycosylation sites (residues 200 and 262) in E2. They are conserved as to position between the mouse virulent T48 strain and the mouse avirulent NB5092 strain.

  5. Conformational Preference of ‘CαNN’ Short Peptide Motif towards Recognition of Anions

    PubMed Central

    Banerjee, Raja

    2013-01-01

    Among several ‘anion binding motifs’, the recently described ‘CαNN’ motif occurring in the loop regions preceding a helix, is conserved through evolution both in sequence and its conformation. To establish the significance of the conserved sequence and their intrinsic affinity for anions, a series of peptides containing the naturally occurring ‘CαNN’ motif at the N-terminus of a designed helix, have been modeled and studied in a context free system using computational techniques. Appearance of a single interacting site with negative binding free-energy for both the sulfate and phosphate ions, as evidenced in docking experiments, establishes that the ‘CαNN’ segment has an intrinsic affinity for anions. Molecular Dynamics (MD) simulation studies reveal that interaction with anion triggers a conformational switch from non-helical to helical state at the ‘CαNN’ segment, which extends the length of the anchoring-helix by one turn at the N-terminus. Computational experiments substantiate the significance of sequence/structural context and justify the conserved nature of the ‘CαNN’ sequence for anion recognition through “local” interaction. PMID:23516403

  6. A new begomovirus associated with alpha- and betasatellite molecules isolated from Vernonia cinerea in China.

    PubMed

    Zulfiqar, Awais; Zhang, Jie; Cui, Xiaofeng; Qian, Yajuan; Zhou, Xueping; Xie, Yan

    2012-01-01

    A begomovirus disease complex associated with Vernonia cinerea showing yellow vein symptoms was studied. The full-length genomic DNA comprised 2739 nucleotides (nt) and had the typical genome structure of begomoviruses. Comparison analysis showed that it shared the highest (78.9%) nucleotide sequence identity with the recently characterized Vernonia yellow vein virus (VeYVV) from India. For the associated satellites, the betasatellite showed the highest nucleotide sequence identity (52.1%) with Vernonia yellow vein virus betasatellite (VeYVVB), and the alphasatellite shared the highest sequence identity (70.7%) with Gossypium mustelinum symptomless alphasatellite (GMusSLA). It is a member of a distinct species with cognate alpha- and betasatellites for which the name Vernonia yellow vein Fujian virus (VeYVFjV) is proposed.

  7. Geographic Structuring of the Plasmodium falciparum Sarco(endo)plasmic Reticulum Ca2+ ATPase (PfSERCA) Gene Diversity

    PubMed Central

    Pinto, João; Gribaldo, Simonetta; Legrand, Eric; Niang, Makhtar; Kim, Nimol; Pharath, Lim; Volnay, Béatrice; Ekala, Marie Therese; Bouchier, Christiane; Fandeur, Thierry; Berzosa, Pedro; Benito, Agustin; Ferreira, Isabel Dinis; Ferreira, Cynthia; Vieira, Pedro Paulo; Alecrim, Maria das Graças; Mercereau-Puijalon, Odile; Cravo, Pedro

    2010-01-01

    Artemisinin, a thapsigargin-like sesquiterpene, has been shown to inhibit the Plasmodium falciparum sarco/endoplasmic reticulum calcium-ATPase PfSERCA. To collect baseline pfserca sequence information before field deployment of artemisinin-based combination therapies that may select mutant parasites, we conducted a sequence analysis of 100 isolates from multiple sites in Africa, Asia and South America. Coding sequence diversity was large, with 29 mutated codons, including 32 SNPs (average of one SNP/115 bp), of which 19 were novel mutations. Most SNPs detected in this study were clustered within a region in the cytosolic head of the protein. The PfSERCA functional domains were very well conserved, with non-synonymous mutations located outside the functional domains, except for the S769N mutation associated in French Guiana with elevated IC50 for artemether. The S769N mutation is located close to the hinge of the headpiece, which in other species modulates calcium affinity and in consequence efficacy of inhibitors, possibly linking calcium homeostasis to drug resistance. Genetic diversity was highest in Senegal, Brazil and French Guiana, and few mutations were identified in Asia. Population genetic analysis was conducted for a partial fragment of the gene encompassing nucleotide coordinates 87-2862 (unambiguous sequence available for 96 isolates). This supported a geographic clustering, with a separation between Old and New World samples and one dominant ancestral haplotype. Genetic drift alone cannot explain the observed polymorphism, suggesting that other evolutionary mechanisms are operating. One possible contributor could be the frequency of haemoglobinopathies that are associated with calcium dysregulation in the erythrocyte. PMID:20195531

  8. Complete plastid genome sequences suggest strong selection for retention of photosynthetic genes in the parasitic plant genus Cuscuta.

    PubMed

    McNeal, Joel R; Kuehl, Jennifer V; Boore, Jeffrey L; de Pamphilis, Claude W

    2007-10-24

    Plastid genome content and protein sequence are highly conserved across land plants and their closest algal relatives. Parasitic plants, which obtain some or all of their nutrition through an attachment to a host plant, are often a striking exception. Heterotrophy can lead to relaxed constraint on some plastid genes or even total gene loss. We sequenced plastid genomes of two species in the parasitic genus Cuscuta along with a non-parasitic relative, Ipomoea purpurea, to investigate changes in the plastid genome that may result from transition to the parasitic lifestyle. Aside from loss of all ndh genes, Cuscuta exaltata retains photosynthetic and photorespiratory genes that evolve under strong selective constraint. Cuscuta obtusiflora has incurred substantially more change to its plastid genome, including loss of all genes for the plastid-encoded RNA polymerase. Despite extensive change in gene content and greatly increased rate of overall nucleotide substitution, C. obtusiflora also retains all photosynthetic and photorespiratory genes with only one minor exception. Although Epifagus virginiana, the only other parasitic plant with its plastid genome sequenced to date, has lost a largely overlapping set of transfer-RNA and ribosomal genes as Cuscuta, it has lost all genes related to photosynthesis and maintains a set of genes which are among the most divergent in Cuscuta. Analyses demonstrate photosynthetic genes are under the highest constraint of any genes within the plastid genomes of Cuscuta, indicating a function involving RuBisCo and electron transport through photosystems is still the primary reason for retention of the plastid genome in these species.

  9. Complete plastid genome sequences suggest strong selection for retention of photosynthetic genes in the parasitic plant genus Cuscuta

    PubMed Central

    McNeal, Joel R; Kuehl, Jennifer V; Boore, Jeffrey L; de Pamphilis, Claude W

    2007-01-01

    Background Plastid genome content and protein sequence are highly conserved across land plants and their closest algal relatives. Parasitic plants, which obtain some or all of their nutrition through an attachment to a host plant, are often a striking exception. Heterotrophy can lead to relaxed constraint on some plastid genes or even total gene loss. We sequenced plastid genomes of two species in the parasitic genus Cuscuta along with a non-parasitic relative, Ipomoea purpurea, to investigate changes in the plastid genome that may result from transition to the parasitic lifestyle. Results Aside from loss of all ndh genes, Cuscuta exaltata retains photosynthetic and photorespiratory genes that evolve under strong selective constraint. Cuscuta obtusiflora has incurred substantially more change to its plastid genome, including loss of all genes for the plastid-encoded RNA polymerase. Despite extensive change in gene content and greatly increased rate of overall nucleotide substitution, C. obtusiflora also retains all photosynthetic and photorespiratory genes with only one minor exception. Conclusion Although Epifagus virginiana, the only other parasitic plant with its plastid genome sequenced to date, has lost a largely overlapping set of transfer-RNA and ribosomal genes as Cuscuta, it has lost all genes related to photosynthesis and maintains a set of genes which are among the most divergent in Cuscuta. Analyses demonstrate photosynthetic genes are under the highest constraint of any genes within the plastid genomes of Cuscuta, indicating a function involving RuBisCo and electron transport through photosystems is still the primary reason for retention of the plastid genome in these species. PMID:17956636

  10. The most conserved genome segments for life detection on Earth and other planets.

    PubMed

    Isenbarger, Thomas A; Carr, Christopher E; Johnson, Sarah Stewart; Finney, Michael; Church, George M; Gilbert, Walter; Zuber, Maria T; Ruvkun, Gary

    2008-12-01

    On Earth, very simple but powerful methods to detect and classify broad taxa of life by the polymerase chain reaction (PCR) are now standard practice. Using DNA primers corresponding to the 16S ribosomal RNA gene, one can survey a sample from any environment for its microbial inhabitants. Due to massive meteoritic exchange between Earth and Mars (as well as other planets), a reasonable case can be made for life on Mars or other planets to be related to life on Earth. In this case, the supremely sensitive technologies used to study life on Earth, including in extreme environments, can be applied to the search for life on other planets. Though the 16S gene has become the standard for life detection on Earth, no genome comparisons have established that the ribosomal genes are, in fact, the most conserved DNA segments across the kingdoms of life. We present here a computational comparison of full genomes from 13 diverse organisms from the Archaea, Bacteria, and Eucarya to identify genetic sequences conserved across the widest divisions of life. Our results identify the 16S and 23S ribosomal RNA genes as well as other universally conserved nucleotide sequences in genes encoding particular classes of transfer RNAs and within the nucleotide binding domains of ABC transporters as the most conserved DNA sequence segments across phylogeny. This set of sequences defines a core set of DNA regions that have changed the least over billions of years of evolution and provides a means to identify and classify divergent life, including ancestrally related life on other planets.

  11. Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology

    Treesearch

    Richard Cronn; Aaron Liston; Matthew Parks; David S. Gernandt; Rongkun Shen; Todd Mockler

    2008-01-01

    Organellar DNA sequences are widely used in evolutionary and population genetic studies; however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to...

  12. In-silico and in-vivo analyses of EST databases unveil conserved miRNAs from Carthamus tinctorius and Cynara cardunculus

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are small RNAs (21-24 bp) providing an RNA-based system of gene regulation highly conserved in plants and animals. In plants, miRNAs control mRNA degradation or restrain translation, affecting development and responses to stresses. Plant miRNAs show imperfect but extensive complementarity to mRNA targets, making their computational prediction possible, useful when data mining is applied on different species. In this study we used a comparative approach to identify both miRNAs and their targets, in artichoke and safflower. Results Two complete expressed sequence tags (ESTs) datasets from artichoke (3.6·10^4 entries) and safflower (4.2·10^4), were analysed with a bioinformatic pipeline and in vitro experiments, identifying 17 potential miRNAs. For each EST, using the RNAhybrid program and 953 non-redundant miRNA mature sequences, available in miRBase as reference, we searched matching putative targets. 8730 out of 42011 ESTs from safflower and 7145 of 36323 ESTs from artichoke showed at least one predicted miRNA target. BLAST analysis showed that 75% of all ESTs shared at least a common homologous region (E-value < 10^-4) and about 50% of these displayed 400 bp or longer aligned sequences as conserved homologous/orthologous (COS) regions. 960 and 890 ESTs of safflower and artichoke organized in COS shared 79 different miRNA targets, considered functionally conserved, and statistically significant when compared with random sequences (signal to noise ratio > 2 and specificity ≥ 0.85). Four highly significant miRNAs selected from in silico data were experimentally validated in globe artichoke leaves. Conclusions Mature miRNAs and targets were predicted within EST sequences of safflower and artichoke. Most of the miRNA targets appeared highly/moderately conserved, highlighting an important and conserved function.
In this study we introduce a stringent parameter for the comparative sequence analysis, represented by the identification of the same target in the COS region. After statistical analysis 79 targets, found on the COS regions and belonging to 60 miRNA families, have a signal to noise ratio > 2, with ≥ 0.85 specificity. The putative miRNAs identified belong to 55 dicotyledon plants and to 24 families only in monocotyledon. PMID:22536958

  13. Evolutionary conservation analysis increases the colocalization of predicted exonic splicing enhancers in the BRCA1 gene with missense sequence changes and in-frame deletions, but not polymorphisms

    PubMed Central

    Pettigrew, Christopher; Wayte, Nicola; Lovelock, Paul K; Tavtigian, Sean V; Chenevix-Trench, Georgia; Spurdle, Amanda B; Brown, Melissa A

    2005-01-01

    Introduction Aberrant pre-mRNA splicing can be more detrimental to the function of a gene than changes in the length or nature of the encoded amino acid sequence. Although predicting the effects of changes in consensus 5' and 3' splice sites near intron:exon boundaries is relatively straightforward, predicting the possible effects of changes in exonic splicing enhancers (ESEs) remains a challenge. Methods As an initial step toward determining which ESEs predicted by the web-based tool ESEfinder in the breast cancer susceptibility gene BRCA1 are likely to be functional, we have determined their evolutionary conservation and compared their location with known BRCA1 sequence variants. Results Using the default settings of ESEfinder, we initially detected 669 potential ESEs in the coding region of the BRCA1 gene. Increasing the threshold score reduced the total number to 464, while taking into consideration the proximity to splice donor and acceptor sites reduced the number to 211. Approximately 11% of these ESEs (23/211) either are identical at the nucleotide level in human, primates, mouse, cow, dog and opossum Brca1 (conserved) or are detectable by ESEfinder in the same position in the Brca1 sequence (shared). The frequency of conserved and shared predicted ESEs between human and mouse is higher in BRCA1 exons (2.8 per 100 nucleotides) than in introns (0.6 per 100 nucleotides). Of conserved or shared putative ESEs, 61% (14/23) were predicted to be affected by sequence variants reported in the Breast Cancer Information Core database. Applying the filters described above increased the colocalization of predicted ESEs with missense changes, in-frame deletions and unclassified variants predicted to be deleterious to protein function, whereas they decreased the colocalization with known polymorphisms or unclassified variants predicted to be neutral. 
Conclusion In this report we show that evolutionary conservation analysis may be used to improve the specificity of an ESE prediction tool. This is the first report on the prediction of the frequency and distribution of ESEs in the BRCA1 gene, and it is the first reported attempt to predict which ESEs are most likely to be functional and therefore which sequence variants in ESEs are most likely to be pathogenic. PMID:16280041

  14. Analysis of drug binding pockets and repurposing opportunities for twelve essential enzymes of ESKAPE pathogens

    PubMed Central

    Naz, Sadia; Ngo, Tony; Farooq, Umar

    2017-01-01

    Background The rapid increase in antibiotic resistance by various bacterial pathogens underlies the significance of developing new therapies and exploring different drug targets. A fraction of bacterial pathogens abbreviated as ESKAPE by the European Center for Disease Prevention and Control have been considered a major threat due to the rise in nosocomial infections. Here, we compared putative drug binding pockets of twelve essential and mostly conserved metabolic enzymes in numerous bacterial pathogens including those of the ESKAPE group and Mycobacterium tuberculosis. The comparative analysis will provide guidelines for the likelihood of transferability of the inhibitors from one species to another. Methods Nine bacterial species including six ESKAPE pathogens, Mycobacterium tuberculosis along with Mycobacterium smegmatis and Escherichia coli, two non-pathogenic bacteria, have been selected for drug binding pocket analysis of twelve essential enzymes. The amino acid sequences were obtained from UniProt, aligned using ICM v3.8-4a and matched against the Pocketome encyclopedia. We used known co-crystal structures of selected target enzyme orthologs to evaluate the location of their active sites and binding pockets and to calculate a matrix of pairwise sequence identities for each target enzyme across the different species. This was used to generate sequence maps. Results High sequence identity of enzyme binding pockets, derived from experimentally determined co-crystallized structures, was observed among various species. Comparison at both full sequence level and for drug binding pockets of key metabolic enzymes showed that binding pockets are highly conserved (sequence similarity up to 100%) among various ESKAPE pathogens as well as Mycobacterium tuberculosis. Enzyme orthologs having conserved binding sites may have the potential to interact with inhibitors in a similar way and might be helpful for the design of a similar class of inhibitors for a particular species.
The derived pocket alignments and distance-based maps provide guidelines for drug discovery and repurposing. In addition they also provide recommendations for the relevant model bacteria that may be used for initial drug testing. Discussion Comparing ligand binding sites through sequence identity calculation could be an effective approach to identify conserved orthologs as drug binding pockets have shown higher level of conservation among various species. By using this approach we could avoid the problems associated with full sequence comparison. We identified essential metabolic enzymes among ESKAPE pathogens that share high sequence identity in their putative drug binding pockets (up to 100%), of which known inhibitors can potentially antagonize these identical pockets in the various species in a similar manner. PMID:28948099
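
    The matrix of pairwise sequence identities described in this record can be illustrated with a minimal Python sketch. It is not the authors' ICM/Pocketome workflow: the function names, the toy sequences, and the choice to ignore gapped columns are all assumptions made for illustration. The optional `positions` argument restricts the comparison to chosen alignment columns, which is one simple way to compare binding-pocket residues rather than full sequences.

```python
def percent_identity(a, b, positions=None):
    """Percent identity between two aligned sequences of equal length.

    positions: optional alignment column indices (for example, pocket
    residues) to restrict the comparison to. Columns in which either
    sequence has a gap ('-') are ignored (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    cols = range(len(a)) if positions is None else positions
    pairs = [(a[i], b[i]) for i in cols if a[i] != "-" and b[i] != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def identity_matrix(aligned, positions=None):
    """All-against-all identity matrix as a {(name1, name2): percent} dict."""
    names = list(aligned)
    return {
        (p, q): percent_identity(aligned[p], aligned[q], positions)
        for p in names
        for q in names
    }


# Hypothetical aligned orthologs and hypothetical pocket columns.
toy_alignment = {"sp1": "MKVLAG", "sp2": "MKILSG", "sp3": "MRVL-G"}
toy_pocket = [0, 1, 3, 5]

full = identity_matrix(toy_alignment)
pocket = identity_matrix(toy_alignment, toy_pocket)
print(round(full[("sp1", "sp2")], 1), pocket[("sp1", "sp2")])
```

    In this toy case the pocket columns are identical between sp1 and sp2 even though the full sequences differ, which mirrors the record's point that pocket-level identity can exceed full-sequence identity.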

  15. Analysis of drug binding pockets and repurposing opportunities for twelve essential enzymes of ESKAPE pathogens.

    PubMed

    Naz, Sadia; Ngo, Tony; Farooq, Umar; Abagyan, Ruben

    2017-01-01

    The rapid increase in antibiotic resistance by various bacterial pathogens underlies the significance of developing new therapies and exploring different drug targets. A fraction of bacterial pathogens abbreviated as ESKAPE by the European Center for Disease Prevention and Control have been considered a major threat due to the rise in nosocomial infections. Here, we compared putative drug binding pockets of twelve essential and mostly conserved metabolic enzymes in numerous bacterial pathogens including those of the ESKAPE group and Mycobacterium tuberculosis. The comparative analysis will provide guidelines for the likelihood of transferability of the inhibitors from one species to another. Nine bacterial species including six ESKAPE pathogens, Mycobacterium tuberculosis along with Mycobacterium smegmatis and Escherichia coli, two non-pathogenic bacteria, have been selected for drug binding pocket analysis of twelve essential enzymes. The amino acid sequences were obtained from UniProt, aligned using ICM v3.8-4a and matched against the Pocketome encyclopedia. We used known co-crystal structures of selected target enzyme orthologs to evaluate the location of their active sites and binding pockets and to calculate a matrix of pairwise sequence identities for each target enzyme across the different species. This was used to generate sequence maps. High sequence identity of enzyme binding pockets, derived from experimentally determined co-crystallized structures, was observed among various species. Comparison at both full sequence level and for drug binding pockets of key metabolic enzymes showed that binding pockets are highly conserved (sequence similarity up to 100%) among various ESKAPE pathogens as well as Mycobacterium tuberculosis. Enzyme orthologs having conserved binding sites may have the potential to interact with inhibitors in a similar way and might be helpful for the design of a similar class of inhibitors for a particular species.
The derived pocket alignments and distance-based maps provide guidelines for drug discovery and repurposing. In addition they also provide recommendations for the relevant model bacteria that may be used for initial drug testing. Comparing ligand binding sites through sequence identity calculation could be an effective approach to identify conserved orthologs as drug binding pockets have shown higher level of conservation among various species. By using this approach we could avoid the problems associated with full sequence comparison. We identified essential metabolic enzymes among ESKAPE pathogens that share high sequence identity in their putative drug binding pockets (up to 100%), of which known inhibitors can potentially antagonize these identical pockets in the various species in a similar manner.

  16. Identification of Cis-Acting Promoter Elements in Cold- and Dehydration-Induced Transcriptional Pathways in Arabidopsis, Rice, and Soybean

    PubMed Central

    Maruyama, Kyonoshin; Todaka, Daisuke; Mizoi, Junya; Yoshida, Takuya; Kidokoro, Satoshi; Matsukura, Satoko; Takasaki, Hironori; Sakurai, Tetsuya; Yamamoto, Yoshiharu Y.; Yoshiwara, Kyouko; Kojima, Mikiko; Sakakibara, Hitoshi; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko

    2012-01-01

    The genomes of three plants, Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and soybean (Glycine max), have been sequenced, and their many genes and promoters have been predicted. In Arabidopsis, cis-acting promoter elements involved in cold- and dehydration-responsive gene expression have been extensively analysed; however, the characteristics of such cis-acting promoter sequences in cold- and dehydration-inducible genes of rice and soybean remain to be clarified. In this study, we performed microarray analyses using the three species, and compared characteristics of identified cold- and dehydration-inducible genes. Transcription profiles of the cold- and dehydration-responsive genes were similar among these three species, showing representative upregulated (dehydrin/LEA) and downregulated (photosynthesis-related) genes. All (4^6 = 4096) hexamer sequences in the promoters of the three species were investigated, revealing the frequency of conserved sequences in cold- and dehydration-inducible promoters. A core sequence of the abscisic acid-responsive element (ABRE) was the most conserved in dehydration-inducible promoters of all three species, suggesting that transcriptional regulation for dehydration-inducible genes is similar among these three species, with the ABRE-dependent transcriptional pathway. In contrast, for cold-inducible promoters, the conserved hexamer sequences were diversified among these three species, suggesting the existence of diverse transcriptional regulatory pathways for cold-inducible genes among the species. PMID:22184637
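
    The exhaustive hexamer survey mentioned in this record can be sketched in a few lines of Python. This is only an illustration of enumerating all 4^6 = 4096 hexamers and measuring how often each occurs in a set of promoters; it is not the statistical procedure used in the study, and the function names and toy promoter strings are invented for the example.

```python
from collections import Counter
from itertools import product


def all_hexamers():
    """Return every DNA hexamer: 4**6 = 4096 strings over A, C, G, T."""
    return ["".join(p) for p in product("ACGT", repeat=6)]


def hexamer_promoter_fraction(promoters):
    """For each hexamer, the fraction of promoters containing it at least once."""
    counts = Counter()
    for seq in promoters:
        seq = seq.upper()
        # Use a set so a hexamer is counted once per promoter.
        present = {seq[i:i + 6] for i in range(len(seq) - 5)}
        counts.update(present)
    n = len(promoters)
    return {h: counts[h] / n for h in all_hexamers()}


# Toy promoter fragments, invented for illustration only.
toy_promoters = ["TTACGTGTCC", "GGACGTGTAA", "CCCCCCCCCC"]
fractions = hexamer_promoter_fraction(toy_promoters)
top = max(fractions, key=fractions.get)
print(len(fractions), top, fractions[top])
```

    Comparing such fractions between stress-inducible promoters and a background promoter set would be the natural next step; the sketch stops at the counting stage.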

  17. Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models

    PubMed Central

    2014-01-01

    Background Logos are commonly used in molecular biology to provide a compact graphical representation of the conservation pattern of a set of sequences. They render the information contained in sequence alignments or profile hidden Markov models by drawing a stack of letters for each position, where the height of the stack corresponds to the conservation at that position, and the height of each letter within a stack depends on the frequency of that letter at that position. Results We present a new tool and web server, called Skylign, which provides a unified framework for creating logos for both sequence alignments and profile hidden Markov models. In addition to static image files, Skylign creates a novel interactive logo plot for inclusion in web pages. These interactive logos enable scrolling, zooming, and inspection of underlying values. Skylign can avoid sampling bias in sequence alignments by down-weighting redundant sequences and by combining observed counts with informed priors. It also simplifies the representation of gap parameters, and can optionally scale letter heights based on alternate calculations of the conservation of a position. Conclusion Skylign is available as a website, a scriptable web service with a RESTful interface, and as a software package for download. Skylign’s interactive logos are easily incorporated into a web page with just a few lines of HTML markup. Skylign may be found at http://skylign.org. PMID:24410852
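
    The logo construction described in this record (stack height reflecting conservation, letter height reflecting frequency) can be sketched with the classic information-content formula, where stack height is log2(alphabet size) minus the Shannon entropy of the column. This is a simplified assumption for illustration: it omits the sequence weighting, priors, and alternate height calculations that the record attributes to Skylign, and the function name and toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(alignment, alphabet="ACGT"):
    """Per alignment column: (stack height in bits, {letter: letter height}).

    Stack height = log2(len(alphabet)) - Shannon entropy of the column.
    Each letter's height is its frequency times the stack height.
    Characters outside the alphabet (such as gaps) are skipped.
    """
    max_bits = math.log2(len(alphabet))
    logo = []
    for column in zip(*alignment):
        residues = [c for c in column if c in alphabet]
        total = len(residues)
        if total == 0:
            logo.append((0.0, {}))
            continue
        counts = Counter(residues)
        freqs = {a: counts[a] / total for a in alphabet if counts[a]}
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        info = max_bits - entropy
        logo.append((info, {a: f * info for a, f in freqs.items()}))
    return logo


# Toy DNA alignment, invented for illustration only.
toy_alignment = ["ACGT", "ACGA", "ACCA", "ATCA"]
for position, (bits, letters) in enumerate(column_information(toy_alignment)):
    print(position, round(bits, 3), letters)
```

    A fully conserved column reaches the maximum of 2 bits for DNA, while a column split evenly between two letters yields 1 bit, with each letter drawn at half the stack height.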

  18. The Energy Conservation Program for Schools and Hospitals Can Be More Effective. Report to the Congress of the United States by the Comptroller General.

    ERIC Educational Resources Information Center

    Comptroller General of the U.S., Washington, DC.

    The Schools and Hospital Program, funded through the National Energy Conservation Policy Act, is not an effective use of federal monies when compared to other Department of Energy (DOE) conservation programs. It is among the highest in cost, yet among the lowest in yielding energy savings. This report identifies changes which could increase…

  19. The lytic origin of herpesvirus papio is highly homologous to Epstein-Barr virus ori-Lyt: evolutionary conservation of transcriptional activation and replication signals.

    PubMed Central

    Ryon, J J; Fixman, E D; Houchens, C; Zong, J; Lieberman, P M; Chang, Y N; Hayward, G S; Hayward, S D

    1993-01-01

    Herpesvirus papio (HVP) is a B-lymphotropic baboon virus with an estimated 40% homology to Epstein-Barr virus (EBV). We have cloned and sequenced ori-Lyt of herpesvirus papio and found a striking degree of nucleotide homology (89%) with ori-Lyt of EBV. Transcriptional elements form an integral part of EBV ori-Lyt. The promoter and enhancer domains of EBV ori-Lyt are conserved in herpesvirus papio. The EBV ori-Lyt promoter contains four binding sites for the EBV lytic cycle transactivator Zta, and the enhancer includes one Zta and two Rta response elements. All five of the Zta response elements and one of the Rta motifs are conserved in HVP ori-Lyt, and the HVP DS-L leftward promoter and the enhancer were activated in transient transfection assays by the EBV Zta and Rta transactivators. The EBV ori-Lyt enhancer contains a palindromic sequence, GGTCAGCTGACC, centered on a PvuII restriction site. This sequence, with a single base change, is also present in the HVP ori-Lyt enhancer. DNase I footprinting demonstrated that the PvuII sequence was bound by a protein present in a Raji nuclear extract. Mobility shift and competition assays using oligonucleotide probes identified this sequence as a binding site for the cellular transcription factor MLTF. Mutagenesis of the binding site indicated that MLTF contributes significantly to the constitutive activity of the ori-Lyt enhancer. The high degree of conservation of cis-acting signal sequences in HVP ori-Lyt was further emphasized by the finding that an HVP ori-Lyt-containing plasmid was replicated in Vero cells by a set of cotransfected EBV replication genes. The central domain of EBV ori-Lyt contains two related AT-rich palindromes, one of which is partially duplicated in the HVP sequence. The AT-rich palindromes are functionally important cis-acting motifs. Deletion of these palindromes severely diminished replication of an ori-Lyt target plasmid. PMID:8389916

  20. BayesMotif: de novo protein sorting motif discovery from impure datasets.

    PubMed

    Hu, Jianjun; Zhang, Fan

    2010-01-18

    Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals are amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motif in which a highly conserved anchor is present along with less conserved motif regions. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence datasets. They also show that the false positive removal procedure can help to identify true motifs even when only 20% of the input sequences contain true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors.
Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which may help to overcome the limitations of PWM (position weight matrix) motif model.

  1. An RNAi in silico approach to find an optimal shRNA cocktail against HIV-1

    PubMed Central

    2010-01-01

    Background HIV-1 can be inhibited by RNA interference in vitro through the expression of short hairpin RNAs (shRNAs) that target conserved genome sequences. In silico shRNA design for HIV has lacked a detailed study of virus variability constituting a possible breaking point in a clinical setting. We designed shRNAs against HIV-1 considering the variability observed in naïve and drug-resistant isolates available at public databases. Methods A Bioperl-based algorithm was developed to automatically scan multiple sequence alignments of HIV, while evaluating the possibility of identifying dominant and subdominant viral variants that could be used as efficient silencing molecules. Student's t-test and the Bonferroni-Dunn correction were used to assess statistical significance of our findings. Results Our in silico approach identified the most common viral variants within highly conserved genome regions, with a calculated free energy of ≥ -6.6 kcal/mol. This is crucial for strand loading into the RISC complex and for a predicted silencing efficiency score, which could be used in combination for achieving over 90% silencing. Resistant and naïve isolate variability revealed that the most frequent shRNA per region targets a maximum of 85% of viral sequences. Adding more divergent sequences maintained this percentage. Specific sequence features that have been found to be related to higher silencing efficiency were rarely met in conserved regions, even when lower entropy values correlated with better scores. We identified a conserved region among most HIV-1 genomes, which meets as many sequence features for efficient silencing. Conclusions HIV-1 variability is an obstacle to achieving absolute silencing using shRNAs designed against a consensus sequence, mainly because there are many functional viral variants. Our shRNA cocktail could be truly effective at silencing dominant and subdominant naïve viral variants. Additionally, resistant isolates might be targeted under specific antiretroviral selective pressure, but in both cases these should be tested exhaustively prior to clinical use. PMID:21172023
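
    The abstract above refers to entropy values and to the share of viral sequences matched by the most frequent variant, without giving formulas. As a rough sketch only, and assuming that "entropy" means per-column Shannon entropy (the abstract does not say so), both quantities could be computed as follows. The authors used a Bioperl-based tool; this Python version, its function names and its toy data are purely hypothetical.

```python
# Illustrative sketch under stated assumptions; not the authors' Bioperl algorithm.
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def dominant_variant_coverage(variants):
    """Return the most frequent variant and the fraction of sequences carrying it."""
    counts = Counter(variants)
    variant, n = counts.most_common(1)[0]
    return variant, n / len(variants)


if __name__ == "__main__":
    # Hypothetical target-site variants from 20 isolates (made-up data).
    site_variants = ["ACGT"] * 17 + ["ACGA"] * 3
    print(dominant_variant_coverage(site_variants))
    print(column_entropy("AAAA"), column_entropy("AACC"))
```

    A fully conserved column has zero entropy, and a column split evenly between two bases has one bit, so lower values indicate stronger conservation.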

  2. Conservation of streptococcal CRISPRs on human skin and saliva.

    PubMed

    Robles-Sikisaka, Refugio; Naidu, Mayuri; Ly, Melissa; Salzman, Julia; Abeles, Shira R; Boehm, Tobias K; Pride, David T

    2014-06-06

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) are utilized by bacteria to resist encounters with their viruses. Human body surfaces have numerous bacteria that harbor CRISPRs, and their content can provide clues as to the types and features of viruses they may have encountered. We investigated the conservation of CRISPR content from streptococci on skin and saliva of human subjects over 8 weeks to determine whether similarities existed in the CRISPR spacer profiles and whether CRISPR spacers were a stable component of each biogeographic site. Most of the CRISPR sequences identified were unique, but a small proportion of spacers from the skin and saliva of each subject matched spacers derived from previously sequenced loci of S. thermophilus and other streptococci. There were significant proportions of CRISPR spacers conserved over the entire 8-week study period for all subjects, and salivary CRISPR spacers sampled in the mornings showed significantly higher levels of conservation than any other time of day. We also found substantial similarities in the spacer repertoires of the skin and saliva of each subject. Many skin-derived spacers matched salivary viruses, supporting the idea that bacteria of the skin may encounter viruses with similar sequences to those found in the mouth. Despite the similarities between skin and salivary spacer repertoires, the variation present was distinct based on each subject and body site. The conservation of CRISPR spacers in the saliva and the skin of human subjects over the time period studied suggests a relative conservation of the bacteria harboring them.

  3. Conservation of streptococcal CRISPRs on human skin and saliva

    PubMed Central

    2014-01-01

    Background Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) are utilized by bacteria to resist encounters with their viruses. Human body surfaces have numerous bacteria that harbor CRISPRs, and their content can provide clues as to the types and features of viruses they may have encountered. Results We investigated the conservation of CRISPR content from streptococci on skin and saliva of human subjects over 8 weeks to determine whether similarities existed in the CRISPR spacer profiles and whether CRISPR spacers were a stable component of each biogeographic site. Most of the CRISPR sequences identified were unique, but a small proportion of spacers from the skin and saliva of each subject matched spacers derived from previously sequenced loci of S. thermophilus and other streptococci. There were significant proportions of CRISPR spacers conserved over the entire 8-week study period for all subjects, and salivary CRISPR spacers sampled in the mornings showed significantly higher levels of conservation than any other time of day. We also found substantial similarities in the spacer repertoires of the skin and saliva of each subject. Many skin-derived spacers matched salivary viruses, supporting the idea that bacteria of the skin may encounter viruses with similar sequences to those found in the mouth. Despite the similarities between skin and salivary spacer repertoires, the variation present was distinct based on each subject and body site. Conclusions The conservation of CRISPR spacers in the saliva and the skin of human subjects over the time period studied suggests a relative conservation of the bacteria harboring them. PMID:24903519

  4. Conservation of a pH-sensitive structure in the C-terminal region of spider silk extends across the entire silk gene family.

    PubMed

    Strickland, Michelle; Tudorica, Victor; Řezáč, Milan; Thomas, Neil R; Goodacre, Sara L

    2018-06-01

    Spiders produce multiple silks with different physical properties that allow them to occupy a diverse range of ecological niches, including the underwater environment. Despite this functional diversity, past molecular analyses show a high degree of amino acid sequence similarity between C-terminal regions of silk genes that appear to be independent of the physical properties of the resulting silks; instead, this domain is crucial to the formation of silk fibers. Here, we present an analysis of the C-terminal domain of all known types of spider silk and include silk sequences from the spider Argyroneta aquatica, which spins the majority of its silk underwater. Our work indicates that spiders have retained a highly conserved mechanism of silk assembly, despite the extraordinary diversification of species, silk types and applications of silk over 350 million years. Sequence analysis of the silk C-terminal domain across the entire gene family shows the conservation of two uncommon amino acids that are implicated in the formation of a salt bridge, a functional bond essential to protein assembly. This conservation extends to the novel sequences isolated from A. aquatica. This finding is relevant to research regarding the artificial synthesis of spider silk, suggesting that synthesis of all silk types will be possible using a single process.

  5. The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

    PubMed Central

    Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

    2013-01-01

    Background Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes. Methodology/Principal Findings We sequenced the Gossypium hirsutum mt genome using 454 technology combined with fosmid library screening and sequencing of positive clones, and conducted a series of evolutionary analyses on Cycas taitungensis and 24 angiosperm mt genomes. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. Conclusion The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520

  6. The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

    PubMed

    Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

    2013-01-01

    Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes. We sequenced the Gossypium hirsutum mt genome using 454 technology combined with fosmid library screening and sequencing of positive clones, and conducted a series of evolutionary analyses on Cycas taitungensis and 24 angiosperm mt genomes. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.

  7. Nucleotide sequence of the gene for the Mr 32,000 thylakoid membrane protein from Spinacia oleracea and Nicotiana debneyi predicts a totally conserved primary translation product of Mr 38,950

    PubMed Central

    Zurawski, Gerard; Bohnert, Hans J.; Whitfeld, Paul R.; Bottomley, Warwick

    1982-01-01

    The gene for the so-called Mr 32,000 rapidly labeled photosystem II thylakoid membrane protein (here designated psbA) of spinach (Spinacia oleracea) chloroplasts is located on the chloroplast DNA in the large single-copy region immediately adjacent to one of the inverted repeat sequences. In this paper we show that the size of the mRNA for this protein is ≈ 1.25 kilobases and that the direction of transcription is towards the inverted repeat unit. The nucleotide sequence of the gene and its flanking regions is presented. The only large open reading frame in the sequence codes for a protein of Mr 38,950. The nucleotide sequence of psbA from Nicotiana debneyi also has been determined, and comparison of the sequences from the two species shows them to be highly conserved (>95% homology) throughout the entire reading frame. Conservation of the amino acid sequence is absolute, there being no changes in a total of 353 residues. This leads us to conclude that the primary translation product of psbA must be a protein of Mr 38,950. The protein is characterized by the complete absence of lysine residues and is relatively rich in hydrophobic amino acids, which tend to be clustered. Transcription of spinach psbA starts about 86 base pairs before the first ATG codon. Immediately upstream from this point there is a sequence typical of that found in E. coli promoters. An almost identical sequence occurs in the equivalent region of N. debneyi DNA. PMID:16593262

  8. Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

    PubMed

    Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

    2013-09-01

    Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was assessed on the BAliBASE benchmark, with significance established by the Kruskal-Wallis test (P < 0.01). This algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P < 0.05), whereas it shows results not significantly different from 3D-COFFEE (P > 0.05) with the advantage of being able to use fewer structures. Structural information is included within the objective function to evaluate the obtained alignments more accurately. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.
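
    Two of the three objectives named in the abstract above, non-gaps percentage and totally conserved columns, are simple to compute for a given alignment. The sketch below shows one plausible reading of them; the exact definitions used in the published tool may differ, the STRIKE score is not implemented here, and the function names and toy alignment are hypothetical.

```python
# Illustrative sketch: one plausible definition of two alignment objectives.
# The published implementation may define them differently.


def _check_rectangular(alignment):
    """All aligned rows must have the same length."""
    if not alignment or len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment rows must be non-empty and of equal length")


def non_gap_percentage(alignment, gap="-"):
    """Percentage of alignment cells that are residues rather than gaps."""
    _check_rectangular(alignment)
    total = sum(len(row) for row in alignment)
    non_gaps = sum(ch != gap for row in alignment for ch in row)
    return 100.0 * non_gaps / total


def totally_conserved_columns(alignment, gap="-"):
    """Number of gap-free columns in which every sequence has the same residue."""
    _check_rectangular(alignment)
    return sum(
        1
        for column in zip(*alignment)
        if gap not in column and len(set(column)) == 1
    )


if __name__ == "__main__":
    toy = ["MKV-LA", "MKVQLA", "MRV-LA"]  # made-up alignment
    print(non_gap_percentage(toy))
    print(totally_conserved_columns(toy))
```

    In a multiobjective setting such as the one described, these values would be maximized jointly with a structure-based score rather than combined into a single number.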

  9. Sequence diversity within the reovirus S2 gene: reovirus genes reassort in nature, and their termini are predicted to form a panhandle motif.

    PubMed Central

    Chapell, J D; Goral, M I; Rodgers, S E; dePamphilis, C W; Dermody, T S

    1994-01-01

    To better understand genetic diversity within mammalian reoviruses, we determined S2 nucleotide and deduced sigma 2 amino acid sequences of nine reovirus strains and compared these sequences with those of prototype strains of the three reovirus serotypes. The S2 gene and sigma 2 protein are highly conserved among the four type 1, one type 2, and seven type 3 strains studied. Phylogenetic analyses based on S2 nucleotide sequences of the 12 reovirus strains indicate that diversity within the S2 gene is independent of viral serotype. Additionally, we found marked topological differences between phylogenetic trees generated from S1 and S2 gene nucleotide sequences of the seven type 3 strains. These results demonstrate that reovirus S1 and S2 genes have distinct evolutionary histories, thus providing phylogenetic evidence for lateral transfer of reovirus genes in nature. When variability among the 12 sigma 2-encoding S2 nucleotide sequences was analyzed at synonymous positions, we found that approximately 60 nucleotides at the 5' terminus and 30 nucleotides at the 3' terminus were markedly conserved in comparison with other sigma 2-encoding regions of S2. Predictions of RNA secondary structures indicate that the more conserved S2 sequences participate in the formation of an extended region of duplex RNA interrupted by a pair of stem-loops. Among the 12 deduced sigma 2 amino acid sequences examined, substitutions were observed at only 11% of amino acid positions. This finding suggests that constraints on the structure or function of sigma 2, perhaps in part because of its location in the virion core, have limited sequence diversity within this protein. PMID:8289378

  10. On the Role of Aggregation Prone Regions in Protein Evolution, Stability, and Enzymatic Catalysis: Insights from Diverse Analyses

    PubMed Central

    Buck, Patrick M.; Kumar, Sandeep; Singh, Satish K.

    2013-01-01

    The various roles that aggregation prone regions (APRs) are capable of playing in proteins are investigated here via comprehensive analyses of multiple non-redundant datasets containing randomly generated amino acid sequences, monomeric proteins, intrinsically disordered proteins (IDPs) and catalytic residues. Results from this study indicate that the aggregation propensities of monomeric protein sequences have been minimized compared to random sequences with uniform and natural amino acid compositions, as observed by a lower average aggregation propensity and fewer APRs that are shorter in length and more often punctuated by gate-keeper residues. However, evidence for evolutionary selective pressure to disrupt these sequence regions among homologous proteins is inconsistent. APRs are less conserved than average sequence identity among closely related homologues (≥80% sequence identity with a parent) but APRs are more conserved than average sequence identity among homologues that have at least 50% sequence identity with a parent. Structural analyses of APRs indicate that APRs are three times more likely to contain ordered versus disordered residues and that APRs frequently contribute more towards stabilizing proteins than equal length segments from the same protein. Catalytic residues and APRs were also found to be in structural contact significantly more often than expected by random chance. Our findings suggest that proteins have evolved by optimizing their risk of aggregation for cellular environments by both minimizing aggregation prone regions and by conserving those that are important for folding and function. In many cases, these sequence optimizations are insufficient to develop recombinant proteins into commercial products. Rational design strategies aimed at improving protein solubility for biotechnological purposes should carefully evaluate the contributions made by candidate APRs, targeted for disruption, towards protein structure and activity. PMID:24146608

  11. Mitochondrial genome sequences illuminate maternal lineages of conservation concern in a rare carnivore

    Treesearch

    Brian J. Knaus; Richard Cronn; Aaron Liston; Kristine Pilgrim; Michael K. Schwartz

    2011-01-01

    Science-based wildlife management relies on genetic information to infer population connectivity and identify conservation units. The most commonly used genetic marker for characterizing animal biodiversity and identifying maternal lineages is the mitochondrial genome. Mitochondrial genotyping figures prominently in conservation and management plans, with much of the...

  12. A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities.

    PubMed

    Bastien, Olivier; Ortet, Philippe; Roy, Sylvaine; Maréchal, Eric

    2005-03-10

    Popular methods to reconstruct molecular phylogenies are based on multiple sequence alignments, in which addition or removal of data may change the resulting tree topology. We have sought a representation of homologous proteins that would conserve the information of pair-wise sequence alignments, respect probabilistic properties of Z-scores (Monte Carlo methods applied to pair-wise comparisons) and be the basis for a novel method of consistent and stable phylogenetic reconstruction. We have built up a spatial representation of protein sequences using concepts from particle physics (configuration space) and respecting a frame of constraints deduced from pair-wise alignment score properties in information theory. The obtained configuration space of homologous proteins (CSHP) allows the representation of real and shuffled sequences, and thereupon an expression of the TULIP theorem for Z-score probabilities. Based on the CSHP, we propose a phylogeny reconstruction using Z-scores. Deduced trees, called TULIP trees, are consistent with multiple-alignment based trees. Furthermore, the TULIP tree reconstruction method provides a solution for some previously reported incongruent results, such as the apicomplexan enolase phylogeny. The CSHP is a unified model that conserves mutual information between proteins in the way physical models conserve energy. Applications include the reconstruction of evolutionary consistent and robust trees, the topology of which is based on a spatial representation that is not reordered after addition or removal of sequences. The CSHP and its assigned phylogenetic topology, provide a powerful and easily updated representation for massive pair-wise genome comparisons based on Z-score computations.

  13. Characterization of microRNAs Expressed during Secondary Wall Biosynthesis in Acacia mangium

    PubMed Central

    Ong, Seong Siang; Wickneswari, Ratnam

    2012-01-01

    MicroRNAs (miRNAs) play critical regulatory roles by acting as sequence-specific guides during secondary wall formation in woody and non-woody species. Although thousands of plant miRNAs have been sequenced, there is no comprehensive view of the miRNA-mediated gene regulatory network to provide profound biological insights into the regulation of xylem development. Herein, we report the involvement of six highly conserved amg-miRNA families (amg-miR166, amg-miR172, amg-miR168, amg-miR159, amg-miR394, and amg-miR156) as the potential regulatory sequences of secondary cell wall biosynthesis. Among these highly conserved amg-miRNA families, only amg-miR166 exhibited strong differences in expression between phloem and xylem tissue. The functional characterization of amg-miR166 targets in various tissues revealed three groups of HD-ZIP III: ATHB8, ATHB15, and REVOLUTA, which play pivotal roles in xylem development. Although these three groups vary in their functions, psRNATarget analysis indicated that miRNA target sequences of the nine different members of HD-ZIP III are always conserved. We found that precursor structures of amg-miR166 undergo extensive sequence variation even within members of the same family. Gene expression analysis showed that three key lignin pathway genes, C4H, CAD, and CCoAOMT, were upregulated in compression wood where a cascade of miRNAs was downregulated. This study offers a comprehensive analysis of the involvement of highly conserved miRNAs implicated in the secondary wall formation of woody plants. PMID:23251324

  14. Synchronous detection of ebolavirus conserved RNA sequences and ebolavirus-encoded miRNA-like fragment based on a zwitterionic copper (II) metal-organic framework.

    PubMed

    Qiu, Gui-Hua; Weng, Zi-Hua; Hu, Pei-Pei; Duan, Wen-Jun; Xie, Bao-Ping; Sun, Bin; Tang, Xiao-Yan; Chen, Jin-Xiang

    2018-04-01

    From a three-dimensional (3D) metal-organic framework (MOF) of {[Cu(Cmdcp)(phen)(H2O)]2·9H2O}n (1, H3CmdcpBr = N-carboxymethyl-(3,5-dicarboxyl)pyridinium bromide, phen = phenanthroline), a sensitive and selective fluorescence sensor has been developed for the simultaneous detection of ebolavirus conserved RNA sequences and ebolavirus-encoded microRNA-like (miRNA-like) fragment. The results from molecular dynamics simulation confirmed that MOF 1 absorbs carboxyfluorescein (FAM)-tagged and 5(6)-carboxyrhodamine, triethylammonium salt (ROX)-tagged probe ss-DNA (probe DNA, P-DNA) by π…π stacking and hydrogen bonding, as well as additional electrostatic interactions to form a sensing platform of P-DNAs@1 with quenched FAM and ROX fluorescence. In the presence of targeted ebolavirus conserved RNA sequences or ebolavirus-encoded miRNA-like fragment, the fluorophore-labeled P-DNA hybridizes with the analyte to give a P-DNA@RNA duplex and is released from MOF 1, triggering a fluorescence recovery. Simultaneous detection of two target RNAs has also been realized by single and synchronous fluorescence analysis. The formed sensing platform shows high sensitivity for ebolavirus conserved RNA sequences and ebolavirus-encoded miRNA-like fragment with detection limits at the picomolar level and high selectivity without cross-reaction between the two probes. MOF 1 thus shows potential as an effective fluorescent sensing platform for the synchronous detection of two ebolavirus-related sequences, and offers improved diagnostic accuracy of Ebola virus disease. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Reptiles and mammals have differentially retained long conserved noncoding sequences from the amniote ancestor.

    PubMed

    Janes, D E; Chapus, C; Gondo, Y; Clayton, D F; Sinha, S; Blatti, C A; Organ, C L; Fujita, M K; Balakrishnan, C N; Edwards, S V

    2011-01-01

    Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation.
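
    The abstract above defines long conserved noncoding sequences (LCNS) by two thresholds: length ≥500 bp and ≥95% similarity between species. As a small illustration only, the filter below applies those two stated cutoffs to pre-computed alignment blocks. The record type, field names and example values are hypothetical, and the study's actual pipeline (genome alignment, exclusion of coding sequence) is not reproduced here.

```python
# Illustrative sketch: only the two thresholds come from the abstract;
# the data structure and example values are assumptions.
from dataclasses import dataclass

MIN_LENGTH_BP = 500          # length cutoff stated in the abstract
MIN_SIMILARITY_PCT = 95.0    # similarity cutoff stated in the abstract


@dataclass(frozen=True)
class AlignedBlock:
    name: str
    length_bp: int
    percent_similarity: float
    is_coding: bool = False


def is_lcns(block: AlignedBlock) -> bool:
    """Noncoding block meeting both the length and the similarity cutoff."""
    return (
        not block.is_coding
        and block.length_bp >= MIN_LENGTH_BP
        and block.percent_similarity >= MIN_SIMILARITY_PCT
    )


def select_lcns(blocks):
    """Return the blocks that qualify as LCNS, preserving input order."""
    return [b for b in blocks if is_lcns(b)]


if __name__ == "__main__":
    candidates = [
        AlignedBlock("block_a", 812, 97.3),
        AlignedBlock("block_b", 499, 99.0),        # too short
        AlignedBlock("block_c", 1200, 94.9),       # not similar enough
        AlignedBlock("block_d", 650, 96.0, True),  # coding, excluded
    ]
    print([b.name for b in select_lcns(candidates)])
```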

  16. Reptiles and Mammals Have Differentially Retained Long Conserved Noncoding Sequences from the Amniote Ancestor

    PubMed Central

    Janes, D.E.; Chapus, C.; Gondo, Y.; Clayton, D.F.; Sinha, S.; Blatti, C.A.; Organ, C.L.; Fujita, M.K.; Balakrishnan, C.N.; Edwards, S.V.

    2010-01-01

    Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation. PMID:21183607

  17. Distantly related lipocalins share two conserved clusters of hydrophobic residues: use in homology modeling

    PubMed Central

    Adam, Benoit; Charloteaux, Benoit; Beaufays, Jerome; Vanhamme, Luc; Godfroid, Edmond; Brasseur, Robert; Lins, Laurence

    2008-01-01

    Background Lipocalins are widely distributed in nature and are found in bacteria, plants, arthropods and vertebrates. In hematophagous arthropods, they are implicated in the successful accomplishment of the blood meal, interfering with platelet aggregation, blood coagulation and inflammation, and in the transmission of disease parasites such as Trypanosoma cruzi and Borrelia burgdorferi. The pairwise sequence identity is low among this family, often below 30%, despite a well conserved tertiary structure. Under the 30% identity threshold, alignment methods do not correctly assign and align proteins. The only safe way to assign a sequence to that family is by experimental determination. However, these procedures are long and costly and cannot always be applied. A way to circumvent the experimental approach is sequence and structure analysis. To further help in that task, the residues implicated in the stabilisation of the lipocalin fold were determined. This was done by analyzing the conserved interactions for ten lipocalins having a maximum pairwise identity of 28% and various functions. Results By analysing the ten lipocalin structures and sequences, it was determined that two hydrophobic clusters of residues are conserved. One cluster is internal to the barrel, involving all strands and the 3-10 helix. The other is external, involving four strands and the helix lying parallel to the barrel surface. These clusters are also present in RaHBP2, an unusual "outlier" lipocalin from the tick Rhipicephalus appendiculatus. This information was used to assess the assignment of LIR2, a protein from Ixodes ricinus, and to build a 3D model that helps to predict function. FTIR data support the lipocalin fold for this protein. Conclusion By sequence and structural analyses, two conserved clusters of interacting hydrophobic residues have been identified in lipocalins. Since the residues implicated are not conserved for function, they should provide the minimal subset necessary to confer the lipocalin fold. This information has been used to assign LIR2 to lipocalins and to investigate its structure/function relationship. This study could be applied to other protein families with low pairwise similarity, such as the structurally related fatty acid binding proteins or avidins. PMID:18190694
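
    The "maximum pairwise identity" criterion mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and counts identity only over columns where neither sequence has a gap, which is one of several common conventions and not necessarily the one the authors used. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def pairwise_identity(a, b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def max_pairwise_identity(aligned):
    """Largest percent identity over all pairs in a {name: aligned_seq} dict."""
    return max(
        pairwise_identity(aligned[p], aligned[q])
        for p, q in combinations(aligned, 2)
    )


# Toy alignment (invented data, for illustration only).
toy = {"seqA": "MKV-LLA", "seqB": "MRVALLG", "seqC": "MKIAL-A"}
print(max_pairwise_identity(toy))
```

    A set of homologues would pass a 28% cutoff of this kind only if `max_pairwise_identity` returned a value at or below 28.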

  18. Identification of a new Apscaviroid from Japanese persimmon.

    PubMed

    Nakaune, Ryoji; Nakano, Masaaki

    2008-01-01

    Three viroid-like sequences were detected from Japanese persimmon (Diospyros kaki Thunb.) by RT-PCR using primers specific for members of the genus Apscaviroid. Based on the sequences, we determined the complete genomic sequences. Two had 92.1-94.3% sequence identity with citrus viroid OS (CVd-OS) and 91.4-96.3% identity with apple fruit crinkle viroid (AFCVd), respectively. Another one, tentatively named persimmon viroid (PVd), had 396 nucleotides and less than 70% sequence identity with known viroids. The secondary structure of PVd is proposed to be rod-like with extensive base pairing and contains the terminal conserved region and the central conserved region characteristic of the genus Apscaviroid. Moreover, we confirmed that the viroids, including PVd, are graft transmissible from persimmon to persimmon and that persimmon is a natural host of these viroids. According to its molecular and biological properties, PVd should be considered a member of a new species in the genus Apscaviroid.

  19. Characterization, genetic diversity, and evolutionary link of Cucumber mosaic virus strain New Delhi from India.

    PubMed

    Koundal, Vikas; Haq, Qazi Mohd Rizwanul; Praveen, Shelly

    2011-02-01

    The genome of Cucumber mosaic virus New Delhi strain (CMV-ND) from India, obtained from tomato, was completely sequenced and compared with full genome sequences of 14 known CMV strains from subgroups I and II, for their genetic diversity. Sequence analysis suggests CMV-ND shares maximum sequence identity at the nucleotide level with a CMV strain from Taiwan. Among all 15 strains of CMV, the encoded protein 2b is least conserved, whereas the coat protein (CP) is most conserved. Sequence identity values and phylogram results indicate that CMV-ND belongs to subgroup I. Based on the recombination detection program result, it appears that CMV is prone to recombination, and different RNA components of CMV-ND have evolved differently. Recombinational analysis of all 15 CMV strains detected maximum recombination breakpoints in RNA2; CP showed the least recombination sites.

  20. Cloning and biochemical characterization of a novel lipolytic gene from activated sludge metagenome, and its gene product

    PubMed Central

    2010-01-01

    In this study, a putative esterase, designated EstMY, was isolated from an activated sludge metagenomic library. The lipolytic gene was subcloned and expressed in Escherichia coli BL21 using the pET expression system. The gene estMY contained a 1,083 bp open reading frame (ORF) encoding a polypeptide of 360 amino acids with a molecular mass of 38 kDa. Sequence analysis indicated that it showed 71% and 52% amino acid identity to esterase/lipase from marine metagenome (ACL67845) and Burkholderia ubonensis Bu (ZP_02382719), respectively; and several conserved regions were identified, including the putative active site, GDSAG, a catalytic triad (Ser203, Asp301, and His327) and a HGGG conserved motif (starting from His133). The EstMY was determined to hydrolyse p-nitrophenyl (NP) esters of fatty acids with short chain lengths (≤C8). This EstMY exhibited the highest activity at 35°C and pH 8.5 respectively, by hydrolysis of p-NP caprylate. It also exhibited the same level of activity over wide temperature and pH spectra and in the presence of metal ions or detergents. The high level of stability of esterase EstMY with unique substrate specificities makes it highly valuable for downstream biotechnological applications. PMID:21054894

  1. Phylogeography of the endangered Cathaya argyrophylla (Pinaceae) inferred from sequence variation of mitochondrial and nuclear DNA.

    PubMed

    Wang, Hong-Wei; Ge, Song

    2006-11-01

    Cathaya argyrophylla is an endangered conifer restricted to subtropical mountains of China. To study the phylogeographical pattern and demographic history of C. argyrophylla, species-wide genetic variation was investigated using sequences of maternally inherited mtDNA and biparentally inherited nuclear DNA. Of 15 populations sampled from all four distinct regions, only three mitotypes were detected at two loci, with no single region having a mixed composition (G(ST) = 1). Average nucleotide diversity (theta(ws) = 0.0024; pi(s) = 0.0029) across eight nuclear loci is significantly lower than that found for other conifers (theta(ws) = 0.003-0.015; pi(s) = 0.002-0.012) based on estimates of multiple loci. Because it had the highest diversity among the eight nuclear loci and evolved neutrally, one locus (2009) was further used for phylogeographical studies, and eight haplotypes resulting from 12 polymorphic sites were obtained from 98 individuals. All four distinct regions had at least four haplotypes, with the Dalou region (DL) having the highest diversity and the Bamian region (BM) the lowest, paralleling the result of the eight nuclear loci. An AMOVA revealed a significant proportion of diversity attributable to differences among regions (13.4%) and among populations within regions (8.9%). F(ST) analysis also indicated significantly high differentiation among populations (F(ST) = 0.22) and between regions (F(ST) = 0.12-0.38). Non-overlapping distribution of mitotypes and high genetic differentiation among the distinct geographical groups suggest the existence of at least four separate glacial refugia. Based on network and mismatch distribution analyses, we do not find evidence of long distance dispersal and population expansion in C. argyrophylla. Ex situ conservation and artificial crossing are recommended for the management of this endangered species.

  2. Molecular cloning, ontogeny and tissue distribution of zebrafish (Danio rerio) prohormone convertases: pcsk1 and pcsk2.

    PubMed

    Morash, Michael G; MacDonald, Angela B; Croll, Roger P; Anini, Younes

    2009-06-01

    Prohormone convertase subtilisin/kexin (PCSK) enzymes are a family of nine related serine proteases, found in a multitude of tissues, and responsible for the maturation of a variety of protein and peptide precursors. Pcsk1 and Pcsk2 are found within dense core secretory granules in endocrine and neuroendocrine cells and are responsible for cleaving several hormones and neuropeptide precursors. In this work, we cloned and sequenced the cDNA of pcsk1 and pcsk2 from zebrafish (Danio rerio). pcsk1 is a 2268bp ORF, whose 755 amino acid protein product is identical to that predicted from the genome sequence. pcsk2 is a 1941bp ORF, encoding a 646 amino acid peptide. Both Pcsk1 and Pcsk2 display high degrees of similarity to their counterparts in other species, including the conservation of the catalytic triad and other essential residues. The brain contained the highest expression levels of both pcsk1 (1.49+/-0.21) (displayed as ratio to EF-1a), and pcsk2 (0.23+/-0.04). Both transcripts were also detectable in the fore, mid and distal gut. pcsk1 and 2 were detectable at 4.5h post-fertilization, and while pcsk1 expression increased throughout development (0.12+/-0.01 maximum at 3 days post-fertilization), pcsk2 expression was highest at day 5 post-fertilization (0.03+/-0.01), and decreased prior. For the first time, we have identified and characterized a pcsk1 transcript in fish. We have also identified and characterized the pcsk2 transcript in zebrafish, and have assessed the tissue distribution and ontogeny of both.
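
    The ORF and protein lengths quoted in this abstract are mutually consistent if the reported ORF length includes the stop codon, which is an assumption here and is not stated in the abstract. A short Python check of that arithmetic follows; the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp):
    """Amino acid count for an ORF whose length includes the stop codon."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    # One codon per residue, minus the terminal stop codon.
    return orf_bp // 3 - 1


# Figures quoted in the abstract: pcsk1 (2268 bp, 755 aa), pcsk2 (1941 bp, 646 aa).
for name, orf_bp, reported_aa in [("pcsk1", 2268, 755), ("pcsk2", 1941, 646)]:
    print(name, protein_length_from_orf(orf_bp), reported_aa)
```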

  3. Abundant RNA editing sites of chloroplast protein-coding genes in Ginkgo biloba and an evolutionary pattern analysis.

    PubMed

    He, Peng; Huang, Sheng; Xiao, Guanghui; Zhang, Yuzhou; Yu, Jianing

    2016-12-01

    RNA editing is a posttranscriptional modification process that alters the RNA sequence so that it deviates from the genomic DNA sequence. RNA editing mainly occurs in chloroplast and mitochondrial genomes, and the number of editing sites varies in terrestrial plants. Why and how RNA editing systems evolved remains a mystery. Ginkgo biloba is one of the oldest seed plants and has an important evolutionary position. Determining the patterns and distribution of RNA editing in this ancient plant provides insights into the evolutionary trend of RNA editing and helps us to further understand its biological significance. In this paper, we investigated 82 protein-coding genes in the chloroplast genome of G. biloba and identified 255 editing sites, which is the highest number of RNA editing events reported in a gymnosperm. All of the editing sites were C-to-U conversions, which mainly occurred in the second codon position, were biased towards the U_A context, and caused an increase in hydrophobic amino acids. RNA editing could change the secondary structures of 82 proteins, and create or eliminate a transmembrane region in five proteins as determined in silico. Finally, the evolutionary tendencies of RNA editing in different gene groups were estimated using the nonsynonymous-synonymous substitution rate selection mode. The G. biloba chloroplast genome possesses the highest number of RNA editing events reported so far in a seed plant. Most of the RNA editing sites can restore amino acid conservation, increase hydrophobicity, and even influence protein structures. Similar purifying selections constitute the dominant evolutionary force at the editing sites of essential genes, such as the psa, some psb and pet groups, and a positive selection occurred in the editing sites of nonessential genes, such as most ndh and a few psb genes.

  4. Genome-wide identification and characterization of aquaporin gene family in common bean (Phaseolus vulgaris L.).

    PubMed

    Ariani, Andrea; Gepts, Paul

    2015-10-01

    Plant aquaporins are a large and diverse family of water channel proteins that are essential for several physiological processes in living organisms. Numerous studies have linked plant aquaporins with a plethora of processes, such as nutrient acquisition, CO2 transport, plant growth and development, and response to abiotic stresses. However, little is known about this protein family in common bean. Here, we present a genome-wide identification of the aquaporin gene family in common bean (Phaseolus vulgaris L.), a legume crop essential for human nutrition. We identified 41 full-length coding aquaporin sequences in the common bean genome, divided by phylogenetic analysis into five sub-families (PIPs, TIPs, NIPs, SIPs and XIPs). Residues determining substrate specificity of aquaporins (i.e., NPA motifs and ar/R selectivity filter) seem conserved between common bean and other plant species, allowing inference of substrate specificity for these proteins. Thanks to the availability of RNA-sequencing datasets, expression levels in different organs and in leaves of wild and domesticated bean accessions were evaluated. Three aquaporins (PvTIP1;1, PvPIP2;4 and PvPIP1;2) have the overall highest mean expressions, with PvTIP1;1 having the highest expression among all aquaporins. We performed an EST database mining to identify drought-responsive aquaporins in common bean. This analysis showed a significant increase in expression for PvTIP1;1 in drought stress conditions compared to well-watered environments. The pivotal role suggested for PvTIP1;1 in regulating water homeostasis and drought stress response in the common bean should be verified by further field experimentation under drought stress.

  5. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures.

    PubMed

    Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

    2009-01-01

    ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/
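
    To make the idea of a per-position conservation score concrete, here is a minimal Python sketch that scores each alignment column by one minus its normalized Shannon entropy. This is a deliberately simple stand-in and is not the Rate4Site algorithm described in the abstract: it ignores the phylogenetic tree and uses no Bayesian inference. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Score each column as 1 - H/log2(20); 1.0 means a fully invariant column."""
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != gap]
        if not residues:
            scores.append(0.0)  # all-gap column: treated as unconserved
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Toy alignment (invented data): column 0 invariant, column 2 fully variable.
toy_alignment = ["ACD", "ACE", "AGF", "A-G"]
print(column_conservation(toy_alignment))
```

    Entropy-based scores of this kind treat all sequences as independent, which is exactly the limitation that tree-aware methods such as Rate4Site are designed to address.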

  6. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures

    PubMed Central

    Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir

    2009-01-01

    ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/ PMID:18971256

  7. Comparative sequence analysis of a region on human chromosome 13q14, frequently deleted in B-cell chronic lymphocytic leukemia, and its homologous region on mouse chromosome 14.

    PubMed

    Kapanadze, B; Makeeva, N; Corcoran, M; Jareborg, N; Hammarsund, M; Baranova, A; Zabarovsky, E; Vorontsova, O; Merup, M; Gahrton, G; Jansson, M; Yankovsky, N; Einhorn, S; Oscier, D; Grandér, D; Sangfelt, O

    2000-12-15

    Previous studies have indicated the presence of a putative tumor suppressor gene on human chromosome 13q14, commonly deleted in patients with B-cell chronic lymphocytic leukemia (B-CLL). We have recently identified a minimally deleted region encompassing parts of two adjacent genes, termed LEU1 and LEU2 (leukemia-associated genes 1 and 2), and several additional transcripts. In addition, 50 kb centromeric to this region we have identified another gene, LEU5/RFP2. To elucidate further the complex genomic organization of this region, we have identified, mapped, and sequenced the homologous region in the mouse. Fluorescence in situ hybridization analysis demonstrated that the region maps to mouse chromosome 14. The overall organization and gene order in this region were found to be highly conserved in the mouse. Sequence comparison between the human deletion hotspot region and its homologous mouse region revealed a high degree of sequence conservation with an overall score of 74%. However, our data also show that in terms of transcribed sequences, only two of those, human LEU2 and LEU5/RFP2, are clearly conserved, strengthening the case for these genes as putative candidate B-CLL tumor suppressor genes.

  8. Energy Conservation Curriculum for Secondary and Post-Secondary Students. Module 9: Human Comfort and Energy Conservation.

    ERIC Educational Resources Information Center

    Navarro Coll., Corsicana, TX.

    This module is the ninth in a series of eleven modules in an energy conservation curriculum for secondary and postsecondary vocational students. It is designed for use by itself or as part of a sequence of four modules on energy conservation in building construction and operation (see also modules 8, 10, and 11). The objective of this module is to…

  9. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution.

    PubMed

    2004-12-09

    We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.

  10. Conservation: evolutionary values for all 10,000 birds.

    PubMed

    Lovette, Irby J

    2014-05-19

    Many biologists and conservation practitioners believe that preserving evolutionary diversity should be a priority. An innovative new study measures the evolutionary distinctness of all the world's birds and identifies the species and locations that capture the highest fraction of avian evolutionary history. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. The Silkworm (Bombyx mori) microRNAs and Their Expressions in Multiple Developmental Stages

    PubMed Central

    Luo, Qibin; Cai, Yimei; Lin, Wen-chang; Chen, Huan; Yang, Yue; Hu, Songnian; Yu, Jun

    2008-01-01

    Background MicroRNAs (miRNAs) play crucial roles in various physiological processes through post-transcriptional regulation of gene expressions and are involved in development, metabolism, and many other important molecular mechanisms and cellular processes. The Bombyx mori genome sequence provides opportunities for a thorough survey for miRNAs as well as comparative analyses with other sequenced insect species. Methodology/Principal Findings We identified 114 non-redundant conserved miRNAs and 148 novel putative miRNAs from the B. mori genome with an elaborate computational protocol. We also sequenced 6,720 clones from 14 developmental stage-specific small RNA libraries in which we identified 35 unique miRNAs containing 21 conserved miRNAs (including 17 predicted miRNAs) and 14 novel miRNAs (including 11 predicted novel miRNAs). Among the 114 conserved miRNAs, we found six pairs of clusters evolutionarily conserved cross insect lineages. Our observations on length heterogeneity at 5′ and/or 3′ ends of nine miRNAs between cloned and predicted sequences, and three mature forms deriving from the same arm of putative pre-miRNAs suggest a mechanism by which miRNAs gain new functions. Analyzing development-related miRNAs expression at 14 developmental stages based on clone-sampling and stem-loop RT PCR, we discovered an unusual abundance of 33 sequences representing 12 different miRNAs and sharply fluctuated expression of miRNAs at larva-molting stage. The potential functions of several stage-biased miRNAs were also analyzed in combination with predicted target genes and silkworm's phenotypic traits; our results indicated that miRNAs may play key regulatory roles in specific developmental stages in the silkworm, such as ecdysis. Conclusions/Significance Taking a combined approach, we identified 118 conserved miRNAs and 151 novel miRNA candidates from the B. mori genome sequence. 
Our expression analyses by sampling miRNAs and real-time PCR over multiple developmental stages allowed us to pinpoint molting stages as hotspots of miRNA expression both in sorts and quantities. Based on the analysis of target genes, we hypothesized that miRNAs regulate development through a particular emphasis on complex stages rather than general regulatory mechanisms. PMID:18714353

  12. Lariat sequencing in a unicellular yeast identifies regulated alternative splicing of exons that are evolutionarily conserved with humans.

    PubMed

    Awan, Ali R; Manfredo, Amanda; Pleiss, Jeffrey A

    2013-07-30

    Alternative splicing is a potent regulator of gene expression that vastly increases proteomic diversity in multicellular eukaryotes and is associated with organismal complexity. Although alternative splicing is widespread in vertebrates, little is known about the evolutionary origins of this process, in part because of the absence of phylogenetically conserved events that cross major eukaryotic clades. Here we describe a lariat-sequencing approach, which offers high sensitivity for detecting splicing events, and its application to the unicellular fungus, Schizosaccharomyces pombe, an organism that shares many of the hallmarks of alternative splicing in mammalian systems but for which no previous examples of exon-skipping had been demonstrated. Over 200 previously unannotated splicing events were identified, including examples of regulated alternative splicing. Remarkably, an evolutionary analysis of four of the exons identified here as subject to skipping in S. pombe reveals high sequence conservation and perfect length conservation with their homologs in scores of plants, animals, and fungi. Moreover, alternative splicing of two of these exons has been documented in multiple vertebrate organisms, making these the first demonstrations of identical alternative-splicing patterns in species that are separated by over 1 billion years of evolution.

  13. Evolution of the cytoskeleton

    PubMed Central

    Erickson, Harold P.

    2009-01-01

    Summary The eukaryotic cytoskeleton appears to have evolved from ancestral precursors related to prokaryotic FtsZ and MreB. FtsZ and MreB show 40−50% sequence identity across different bacterial and archaeal species. Here I suggest that this represents the limit of divergence that is consistent with maintaining their functions for cytokinesis and cell shape. Previous analyses have noted that tubulin and actin are highly conserved across eukaryotic species, but so divergent from their prokaryotic relatives as to be hardly recognizable from sequence comparisons. One suggestion for this extreme divergence of tubulin and actin is that it occurred as they evolved very different functions from FtsZ and MreB. I will present new arguments favoring this suggestion, and speculate on pathways. Moreover, the extreme conservation of tubulin and actin across eukaryotic species is not due to an intrinsic lack of variability, but is attributed to their acquisition of elaborate mechanisms for assembly dynamics and their interactions with multiple motor and binding proteins. A new structure-based sequence alignment identifies amino acids that are conserved from FtsZ to tubulins. The highly conserved amino acids are not those forming the subunit core or protofilament interface, but those involved in binding and hydrolysis of GTP. PMID:17563102

  14. P53 Gene Mutagenesis in Breast Cancer

    DTIC Science & Technology

    2005-03-01

    the wild type T peak. 12 Table 1. Sonic ntations dected by SINtA Individual Cell Sequence Amino Acid Species Conservation 3 ID’ ID Change2 Change... differences in the content of toxic substances in the diet (Biggs et al., 1993; Blaszyk et al., 1996). The development of this p53 mutation load...Changes in the P53 Gene in Single Cells Individual Sequence Amino acid Species conservation ’ ID’ Cell ID change’ change Monkey Mouse Rat Chicken

  15. Analysis of SSR information in EST resources of sugarcane

    USDA-ARS's Scientific Manuscript database

    Expressed sequence tags (ESTs) offer the opportunity to exploit single, low-copy, conserved sequence motifs for the development of simple sequence repeats (SSRs). The total of 262,113 ESTs of sugarcane (Saccharum officinarum) in the database of NCBI were downloaded and analyzed, which resulted in...

  16. FA-SAT Is an Old Satellite DNA Frozen in Several Bilateria Genomes

    PubMed Central

    Chaves, Raquel; Ferreira, Daniela; Mendes-da-Silva, Ana; Meles, Susana; Adega, Filomena

    2017-01-01

    Abstract In recent years, a growing body of evidence has recognized the tandem repeat sequences, and specifically satellite DNA, as a functional class of sequences in the genomic “dark matter.” Using an original, complementary, and thus an eclectic experimental design, we show that the cat archetypal satellite DNA sequence, FA-SAT, is “frozen” conservatively in several Bilateria genomes. We found different genomic FA-SAT architectures, and the interspersion pattern was conserved. In Carnivora genomes, the FA-SAT-related sequences are also amplified, with the predominance of a specific FA-SAT variant, at the heterochromatic regions. We inspected the cat genome project to locate FA-SAT array flanking regions and revealed an intensive intermingling with transposable elements. Our results also show that FA-SAT-related sequences are transcribed and that the most abundant FA-SAT variant is not always the most transcribed. We thus conclude that the DNA sequences of FA-SAT and their transcripts are “frozen” in these genomes. Future work is needed to disclose any putative function that these sequences may play in these genomes. PMID:29608678

  17. Mutations in the newly identified RAX regulatory sequence are not a frequent cause of micro/anophthalmia.

    PubMed

    Chassaing, Nicolas; Vigouroux, Adeline; Calvas, Patrick

    2009-06-01

    Microphthalmia and anophthalmia are at the severe end of the spectrum of abnormalities in ocular development. A few genes (SOX2, OTX2, RAX, and CHX10) have been implicated in isolated micro/anophthalmia, but causative mutations of these genes explain less than a quarter of these developmental defects. A specifically conserved SOX2/OTX2-mediated RAX expression regulatory sequence has recently been identified. We postulated that mutations in this sequence could lead to micro/anophthalmia, and thus we performed molecular screening of this regulatory element in patients suffering from micro/anophthalmia. Fifty-one patients suffering from nonsyndromic microphthalmia (n = 40) or anophthalmia (n = 11) were included in this study after negative molecular screening for SOX2, OTX2, RAX, and CHX10 mutations. Mutation screening of the RAX regulatory sequence was performed by direct sequencing for these patients. No mutations were identified in the highly conserved RAX regulatory sequence in any of the 51 patients. Mutations in the newly identified RAX regulatory sequence do not represent a frequent cause of nonsyndromic micro/anophthalmia.

  18. Highly conserved D-loop-like nuclear mitochondrial sequences (Numts) in tiger (Panthera tigris).

    PubMed

    Zhang, Wenping; Zhang, Zhihe; Shen, Fujun; Hou, Rong; Lv, Xiaoping; Yue, Bisong

    2006-08-01

    Using oligonucleotide primers designed to match hypervariable segments I (HVS-1) of Panthera tigris mitochondrial DNA (mtDNA), we amplified two different PCR products (500 bp and 287 bp) in the tiger (Panthera tigris), but got only one PCR product (287 bp) in the leopard (Panthera pardus). Sequence analyses indicated that the sequence of 287 bp was a D-loop-like nuclear mitochondrial sequence (Numts), indicating a nuclear transfer that occurred approximately 4.8-17 million years ago in the tiger and 4.6-16 million years ago in the leopard. Although the mtDNA D-loop sequence has a rapid rate of evolution, the 287-bp Numts are highly conserved; they are nearly identical in tiger subspecies and only 1.742% different between tiger and leopard. Thus, such sequences represent molecular 'fossils' that can shed light on evolution of the mitochondrial genome and may be the most appropriate outgroup for phylogenetic analysis. This is also proved by comparing the phylogenetic trees reconstructed using the D-loop sequence of snow leopard and the 287-bp Numts as outgroup.

  19. A Partial Least Squares Based Procedure for Upstream Sequence Classification in Prokaryotes.

    PubMed

    Mehmood, Tahir; Bohlin, Jon; Snipen, Lars

    2015-01-01

    The upstream region of coding genes is important for several reasons, for instance for locating transcription factor binding sites and start site initiation in genomic DNA. Motivated by a recently conducted study, in which a multivariate approach was successfully applied to coding sequence modeling, we have introduced a partial least squares (PLS) based procedure for the classification of true upstream prokaryotic sequences from background upstream sequences. The upstream sequences of conserved coding genes across genomes were considered in the analysis, where conserved coding genes were found by using the pan-genomics concept for each considered prokaryotic species. PLS uses a position-specific scoring matrix (PSSM) to study the characteristics of the upstream region. Results obtained by the PLS-based method were compared with the Gini importance of random forest (RF) and with support vector machine (SVM), which is a widely used method for sequence classification. The upstream sequence classification performance was evaluated by using cross validation, and the suggested approach identifies the prokaryotic upstream region significantly better than RF (p-value < 0.01) and SVM (p-value < 0.01). Further, the proposed method also produced results that concurred with known biological characteristics of the upstream region.
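
    For readers unfamiliar with the PSSM mentioned in this abstract, the following Python sketch builds a log-odds position-specific scoring matrix from equal-length DNA sites and scores a candidate sequence against it. It illustrates the PSSM only and does not implement the PLS procedure. The pseudocount, the uniform background, the function names, and the example sites are all assumptions made for illustration.

```python
import math


def build_pssm(sequences, alphabet="ACGT", pseudocount=1.0):
    """Log2-odds PSSM against a uniform background (an assumed simplification)."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    background = 1.0 / len(alphabet)
    denom = len(sequences) + pseudocount * len(alphabet)
    pssm = []
    for pos in range(length):
        column = [s[pos] for s in sequences]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / denom) / background)
            for base in alphabet
        })
    return pssm


def score_sequence(pssm, seq):
    """Sum the per-position log-odds for a sequence of matching length."""
    if len(seq) != len(pssm):
        raise ValueError("sequence length must match the PSSM length")
    return sum(row[base] for row, base in zip(pssm, seq))


# Invented example sites, loosely resembling a -10 promoter element.
sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
pssm = build_pssm(sites)
print(score_sequence(pssm, "TATAAT"), score_sequence(pssm, "GCGCGC"))
```

    Per-position scores of this kind could serve as numeric features for a downstream classifier, though how the authors encode their PSSM for PLS is not described in the abstract.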

  20. Parallel tagged next-generation sequencing on pooled samples - a new approach for population genetics in ecology and conservation.

    PubMed

    Zavodna, Monika; Grueber, Catherine E; Gemmell, Neil J

    2013-01-01

    Next-generation sequencing (NGS) on pooled samples has already been broadly applied in human medical diagnostics and plant and animal breeding. However, thus far it has been only sparingly employed in ecology and conservation, where it may serve as a useful diagnostic tool for rapid assessment of species genetic diversity and structure at the population level. Here we undertake a comprehensive evaluation of the accuracy, practicality and limitations of parallel tagged amplicon NGS on pooled population samples for estimating species population diversity and structure. We obtained 16S and Cyt b data from 20 populations of Leiopelma hochstetteri, a frog species of conservation concern in New Zealand, using two approaches - parallel tagged NGS on pooled population samples and individual Sanger sequenced samples. Data from each approach were then used to estimate two standard population genetic parameters, nucleotide diversity (π) and population differentiation (FST), that enable population genetic inference in a species conservation context. We found a positive correlation between our two approaches for population genetic estimates, showing that the pooled population NGS approach is a reliable, rapid and appropriate method for population genetic inference in an ecological and conservation context. Our experimental design also allowed us to identify both the strengths and weaknesses of the pooled population NGS approach and outline some guidelines and suggestions that might be considered when planning future projects.
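
    Nucleotide diversity (π), one of the two parameters estimated in this study, is the average number of differences per site between all pairs of sequences in a sample. The Python sketch below computes it for individually sequenced, aligned, gap-free sequences. It does not cover estimation from pooled NGS read counts, and the function name and toy data are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site over all pairs of aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diff = sum(
        sum(a != b for a, b in zip(s, t))
        for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diff / (n_pairs * length)


# Toy sample (invented data): three 8-bp sequences, 4 differences over 3 pairs.
sample = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(nucleotide_diversity(sample))
```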

  1. Conservation for Children [Levels 1-6 and an All-Levels Supplement].

    ERIC Educational Resources Information Center

    Cupertino Union School District, CA.

    Developed to promote conservation awareness in elementary students, each of the six grade-level-sequenced activity guides provides: (1) a list of conservation concepts; (2) a criterion-referenced test; (3) a class record sheet; (4) a content guide; and (5) 90 student worksheets (40 for language arts, 20 for mathematics, 20 for social studies and…

  2. Cloning and expression of hepatic synaptotagmin 1 in mouse.

    PubMed

    Sancho-Knapik, Sara; Guillén, Natalia; Osada, Jesús

    2015-05-15

    Mouse hepatic synaptotagmin 1 (SYT1) cDNA was cloned, characterized and compared to the brain one. The hepatic transcript was 1807 bp in length, shorter than the brain transcript, and was encoded by only 9 of the 11 gene exons. In this regard, the 5'- and 3'-untranslated regions were 66 and 476 bp, respectively; the open reading frame of 1266 bp coded for a protein of 421 amino acids, identical to that of the brain, with a predicted molecular mass of 47.4 kDa and highly conserved across different species. Immunoblotting of the protein showed two isoforms of higher molecular masses than the theoretical prediction based on amino acid sequence, suggesting posttranslational modifications. Subcellular distribution of protein isoforms corresponded to plasma membrane, lysosomes and microsomes and was identical between the brain and liver. Nonetheless, the highest molecular weight isoform was smaller in the liver, irrespective of subcellular location. Quantitative mRNA tissue distribution showed that it was widely expressed and that the highest values corresponded to the brain, followed by the liver, spleen, abdominal fat, intestine and skeletal muscle. These findings indicate tissue-specific splicing of the gene and posttranslational modification, and the variation in expression in the different tissues might suggest a different requirement of SYT1 for the specific function in each organ. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Structure-Based Sequence Alignment of the Transmembrane Domains of All Human GPCRs: Phylogenetic, Structural and Functional Implications

    PubMed Central

    Cvicek, Vaclav; Goddard, William A.; Abrol, Ravinder

    2016-01-01

    The understanding of G-protein coupled receptors (GPCRs) is undergoing a revolution due to increased information about their signaling and the experimental determination of structures for more than 25 receptors. The availability of at least one receptor structure for each of the GPCR classes, well separated in sequence space, enables an integrated superfamily-wide analysis to identify signatures involving the role of conserved residues, conserved contacts, and downstream signaling in the context of receptor structures. In this study, we align the transmembrane (TM) domains of all experimental GPCR structures to maximize the conserved inter-helical contacts. The resulting superfamily-wide GpcR Sequence-Structure (GRoSS) alignment of the TM domains for all human GPCR sequences is sufficient to generate a phylogenetic tree that correctly distinguishes all different GPCR classes, suggesting that the class-level differences in the GPCR superfamily are encoded at least partly in the TM domains. The inter-helical contacts conserved across all GPCR classes describe the evolutionarily conserved GPCR structural fold. The corresponding structural alignment of the inactive and active conformations, available for a few GPCRs, identifies activation hot-spot residues in the TM domains that get rewired upon activation. Many GPCR mutations, known to alter receptor signaling and cause disease, are located at these conserved contact and activation hot-spot residue positions. The GRoSS alignment places the chemosensory receptor subfamilies for bitter taste (TAS2R) and pheromones (Vomeronasal, VN1R) in the rhodopsin family, known to contain the chemosensory olfactory receptor subfamily. The GRoSS alignment also enables the quantification of the structural variability in the TM regions of experimental structures, useful for homology modeling and structure prediction of receptors. 
Furthermore, this alignment identifies structurally and functionally important residues in all human GPCRs. These residues can be used to make testable hypotheses about the structural basis of receptor function and about the molecular basis of disease-associated single nucleotide polymorphisms. PMID:27028541

  4. Comparative transgenic analysis of enhancers from the human SHOX and mouse Shox2 genomic regions.

    PubMed

    Rosin, Jessica M; Abassah-Oppong, Samuel; Cobb, John

    2013-08-01

    Disruption of presumptive enhancers downstream of the human SHOX gene (hSHOX) is a frequent cause of the zeugopodal limb defects characteristic of Léri-Weill dyschondrosteosis (LWD). The closely related mouse Shox2 gene (mShox2) is also required for limb development, but in the more proximal stylopodium. In this study, we used transgenic mice in a comparative approach to characterize enhancer sequences in the hSHOX and mShox2 genomic regions. Among conserved noncoding elements (CNEs) that function as enhancers in vertebrate genomes, those that are maintained near paralogous genes are of particular interest given their ancient origins. Therefore, we first analyzed the regulatory potential of a genomic region containing one such duplicated CNE (dCNE) downstream of mShox2 and hSHOX. We identified a strong limb enhancer directly adjacent to the mShox2 dCNE that recapitulates the expression pattern of the endogenous gene. Interestingly, this enhancer requires sequences only conserved in the mammalian lineage in order to drive strong limb expression, whereas the more deeply conserved sequences of the dCNE function as a neural enhancer. Similarly, we found that a conserved element downstream of hSHOX (CNE9) also functions as a neural enhancer in transgenic mice. However, when the CNE9 transgenic construct was enlarged to include adjacent, non-conserved sequences frequently deleted in LWD patients, the transgene drove expression in the zeugopodium of the limbs. Therefore, both hSHOX and mShox2 limb enhancers are coupled to distinct neural enhancers. This is the first report demonstrating the activity of cis-regulatory elements from the hSHOX and mShox2 genomic regions in mammalian embryos.

  5. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

    PubMed

    Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren

    2016-11-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. 
Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.

  6. Kinetoplast DNA minicircles of phloem-restricted Phytomonas associated with wilt diseases of coconut and oil palms have a two-domain structure.

    PubMed

    Dollet, M; Sturm, N R; Ahomadegbe, J C; Campbell, D A

    2001-11-27

    We report the cloning and sequencing of the first minicircle from a phloem-restricted, pathogenic Phytomonas sp. (Hart 1) isolated from a coconut palm with hartrot disease. The minicircle possessed a two-domain structure of two conserved regions, each containing three conserved sequence blocks (CSB). Based on the sequence around CSB 3 from Hart 1, PCR primers were designed to allow specific amplification of Phytomonas minicircles. This primer pair demonstrated specificity for at least six groups of plant trypanosomatids and did not amplify from insect trypanosomatids. The PCR results were consistent with a two-domain structure for other plant trypanosomatids.

  7. Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs.

    PubMed

    Powell, Bradford C; Hutchison, Clyde A

    2006-01-19

    Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.
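
    The first step described above, collecting every ORF of at least 30 codons, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: an ORF is taken to run from an ATG to the next in-frame stop codon, both strands are scanned in all three frames, and coordinates refer to the strand being scanned. The names `find_orfs` and `min_codons` are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, frame, start, end) for ATG..stop ORFs.

    Length is counted in codons before the stop; `end` is exclusive and
    includes the stop codon. Coordinates are on the scanned strand.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        hits.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return hits


# Toy sequence: ATG + 30 alanine codons + stop (made up for illustration).
toy_gene = "ATG" + "GCT" * 30 + "TAA"
orfs = find_orfs(toy_gene)
```

    The resulting candidate ORFs would then be translated and clustered; the COG clustering itself is beyond this sketch.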

  8. Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs

    PubMed Central

    Powell, Bradford C; Hutchison, Clyde A

    2006-01-01

    Background Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes. PMID:16423288

  9. Structure and Biochemestry of Laccases from the Lignin-Degrading Basidiomycete, Ganoderma lucidum

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    C.A.Reddy, PI

    2005-06-30

    G. lucidum is one of the most important and widely distributed ligninolytic white rot fungi from habitats such as forest soils, agricultural soils, and tropical mangrove ecosystems, and produces laccases as an important family of lignin-modifying enzymes. Biochemically, laccases are blue multicopper oxidases that couple four-electron reduction of molecular oxygen to water. There is a growing interest in the use of laccases for a variety of industrial applications such as bio-pulping and biobleaching, as well as in their ability to detoxify a wide variety of toxic environmental pollutants. These key oxidative enzymes are found in all three domains of life: Eukaryota, Prokarya, and Archaea. Ganoderma lucidum (strain no. 103561) produces laccase with some of the highest activity (17,000 µkatals per mg of protein) reported for any laccase to date. Our results showed that this organism produces at least 11 different isoforms of laccase based on variation in molecular weight and/or pI. Our studies showed that the presence of copper in the medium yields 15- to 20-fold greater levels of enzyme by G. lucidum. Dialysis of the extracellular fluid of G. lucidum against 10 mM sodium tartrate (pH 5.5) gave an additional 15- to 17-fold stimulation of activity, with an observed specific activity of 17,000 µkatals/mg protein. Dialysis against acetate buffer gave a five-fold increase in activity, while dialysis against glycine showed inhibition of activity. Purification by FPLC and preparative gel electrophoresis gave purified fractions that resolved into eleven isoforms as separated by isoelectric focusing, and the pIs were 4.7, 4.6, 4.5, 4.3, 4.2, 4.1, 3.8, 3.7, 3.5, 3.4 and 3.3. Genomic clones of laccase were isolated using G. lucidum DNA as a template and using inverse PCR and forward/reverse primers corresponding to the sequences of the conserved copper binding region in the N-terminal domain of one of the laccases of this organism.
Inverse PCR amplification of HindIII-digested and ligated G. lucidum DNA was done using the ABI Geneamp XL PCR kit in a Ribocycler. The 5' conserved copper binding region of laccase was used for designing the forward primer (5'-TCGACAATTCTTTCCTGTACG-3') and reverse primer (5'-TGGAGATGGGACACTGGCTTATC-3'). The PCR profile was 95 °C for 3 min, then 94 °C for 1 min, 57 °C for 30 sec and 68 °C for 5 min for 30 cycles, and the final extension was at 72 °C for 10 min. The resulting approximately 2.7 kb inverse PCR fragment was cloned into ZERO TOPOII blunt ligation vector (INVITROGEN) and screened on kanamycin plates. Selected putative clones containing inserts were digested with a battery of restriction enzymes and analyzed on 1% agarose gels. Restriction digestion of these clones with BamHI, PstI, SalI, PvuII, EcoRI, and XhoI revealed 8 distinct patterns, suggesting gene diversity. Two clones were sequenced using overlapping primers on an ABI system. The sequences were aligned using the BioEdit program. The aa sequences of the clones were deduced by the GeneWise2 program using Aspergillus as the reference organism. Eukaryotic gene regulatory sequences were identified using the GeneWise2 program. Laccase sequence alignments and similarity indexes were calculated using the ClustalW and BioEdit programs. BLAST analysis of two distinct BamHI clones, lac1 and lac4, showed that the proteins encoded by these clones are fungal laccase sequences. The coding sequence of the lac1 gene is interrupted by 6 introns ranging in size from 37-55 nt and encodes a mature protein consisting of 456 aa (Mr: 50,160), preceded by a putative 37-aa signal sequence. This predicted Mr is in agreement with the range of Mrs previously reported by us for the laccases of G. lucidum. The deduced aa sequence of LAC1 showed a relatively high degree of homology with laccases of other basidiomycetes. It showed 96% homology to full-length LAC4 protein and 47-53% similarity to unpublished partial laccase sequences of other G. lucidum strains.
Among the other basidiomycete laccases, LAC1 showed the highest similarity of 53-55% to Trametes versicolor LAC3 and LAC4. The consensus copper-binding domains found in other basidiomycete laccases are conserved in the LAC1 protein of G. lucidum. Eight putative N-glycosylation sites as well as a consensus eukaryotic promoter sequence and polyadenylation signal sequences are also found. The coding sequence of lac4 is interrupted by 7 introns, encodes a mature protein of 525 aa (Mr: 57,750), and has 98% nt homology to lac1, but was otherwise identical. Molecular masses of GLAC1 and GLAC4 were 49.8 kDa (462 aa) and 52.5 kDa (524 aa) in comparison to T. versicolor laccase, which was 56.3 kDa (524 aa). Predicted pI values of GLAC1, GLAC4 and T. versicolor laccase are, respectively, 4.5, 4.7, and 4.2. Eight other laccase clones, distinct from lac1 and lac4, have recently been isolated from G. lucidum. Our results show the existence of a laccase multi-gene family in G. lucidum, in agreement with our earlier results showing multiple isoforms of laccase in this organism.

  10. Structure, inheritance, and expression of hybrid poplar (Populus trichocarpa x Populus deltoides) phenylalanine ammonia-lyase genes.

    PubMed Central

    Subramaniam, R; Reinold, S; Molitor, E K; Douglas, C J

    1993-01-01

    A heterologous probe encoding phenylalanine ammonia-lyase (PAL) was used to identify PAL clones in cDNA libraries made with RNA from young leaf tissue of two Populus deltoides x P. trichocarpa F1 hybrid clones. Sequence analysis of a 2.4-kb cDNA confirmed its identity as a full-length PAL clone. The predicted amino acid sequence is conserved in comparison with that of PAL genes from several other plants. Southern blot analysis of poplar genomic DNA from parental and hybrid individuals, restriction site polymorphism in PAL cDNA clones, and sequence heterogeneity in the 3' ends of several cDNA clones suggested that PAL is encoded by at least two genes that can be distinguished by HindIII restriction site polymorphisms. Clones containing each type of PAL gene were isolated from a poplar genomic library. Analysis of the segregation of PAL-specific HindIII restriction fragment-length polymorphisms demonstrated the existence of two independently segregating PAL loci, one of which was mapped to a linkage group of the poplar genetic map. Developmentally regulated PAL expression in poplar was analyzed using RNA blots. Highest expression was observed in young stems, apical buds, and young leaves. Expression was lower in older stems and undetectable in mature leaves. Cellular localization of PAL expression by in situ hybridization showed very high levels of expression in subepidermal cells of leaves early during leaf development. In stems and petioles, expression was associated with subepidermal cells and vascular tissues. PMID:8108506

  11. Sequence analysis, expression profiles and function of thioredoxin 2 and thioredoxin reductase 1 in resistance to nucleopolyhedrovirus in Helicoverpa armigera

    PubMed Central

    Zhang, Songdou; Li, Zhen; Nian, Xiaoge; Wu, Fengming; Shen, Zhongjian; Zhang, Boyu; Zhang, Qingwen; Liu, Xiaoxia

    2015-01-01

    The thioredoxin system, including NADPH, thioredoxin (Trx), and thioredoxin reductase (TrxR), plays significant roles in maintaining intracellular redox homeostasis and protecting organisms against oxidative damage. In this study, the characteristics and functions of H. armigera HaTrx2 and HaTrxR1 were identified. Sequence analysis showed that HaTrx2 and HaTrxR1 were both highly conserved and shared high sequence identity with other insect counterparts. The mRNA of HaTrx2 was expressed the highest in 5th instar 96 h and was mainly detected in heads and epidermis. The expression of HaTrxR1 was highly concentrated in 5th instar 72 h and 96 h, and higher in malpighian tube, midgut and hemocyte than other examined tissues. HaTrx2 and HaTrxR1 were markedly induced by various types of stress. HaTrx2- or HaTrxR1-knockdown increased ROS production in hemocytes and also increased the lipid damage in NPV infected H. armigera larvae. Furthermore, interference with expression of HaTrx2 or HaTrxR1 transcripts in H. armigera larvae resulted in increased sensitivity to NPV infection and shortened LT50 values. Our findings indicated that HaTrx2 and HaTrxR1 contribute to the susceptibility of H. armigera to NPV and also provided the theoretical basis for the in-depth study of insect thioredoxin system. PMID:26502992

  12. Heterochrony and patterns of cranial suture closure in hystricognath rodents

    PubMed Central

    Wilson, Laura A B; Sánchez-Villagra, Marcelo R

    2009-01-01

    Sutures, joints that allow one bone to articulate with another through intervening fibrous connective tissue, serve as major sites of bone expansion during postnatal craniofacial growth in the vertebrate skull and represent an aspect of cranial ontogeny which may exhibit functional and phylogenetic correlates. Suture evolution among hystricognath rodents, an ecologically diverse group represented here by 26 species, is examined using sequence heterochrony methods, i.e. event pairing and parsimov. Although minor nuances in suture closure sequence exist between species, the overall sequence was found to be conserved both across the hystricognath group and, to an increasing degree, within selected clades. At species level, suture closure pattern exhibited a significant positive correlation with patterns previously reported for hominoids. Patterns for most clades revealed the first sutures to close are those contacting the exoccipital, interparietal, and palatine bones. Heterochronic shifts were found along 19 of 35 branches within the hystricognath phylogeny. The number of shifts per node ranged from one to seven events and, overall, involved 21 of 34 suture sites. The topology generated by parsimony analyses of the event pair matrix yielded only one grouping that was congruent with the evolutionary relationships, compiled from morphological and molecular studies, taken as framework. Sutures contacting the exoccipital displayed the highest levels of most complete closure across all species. Level of suture closure is negatively correlated with cranial length (P < 0.05). Differing life history and locomotory strategies are coupled in part with differing suture closure patterns among several species. PMID:19245501

  13. Genomewide analysis of Drosophila circular RNAs reveals their structural and sequence properties and age-dependent neural accumulation

    PubMed Central

    Westholm, Jakub O.; Miura, Pedro; Olson, Sara; Shenker, Sol; Joseph, Brian; Sanfilippo, Piero; Celniker, Susan E.; Graveley, Brenton R.; Lai, Eric C.

    2014-01-01

    Circularization was recently recognized to broadly expand transcriptome complexity. Here, we exploit massive Drosophila total RNA-sequencing data, >5 billion paired-end reads from >100 libraries covering diverse developmental stages, tissues and cultured cells, to rigorously annotate >2500 fruitfly circular RNAs. These mostly derive from back-splicing of protein-coding genes and lack poly(A) tails, and circularization of hundreds of genes is conserved across multiple Drosophila species. We elucidate structural and sequence properties of Drosophila circular RNAs, which exhibit commonalities and distinctions from mammalian circles. Notably, Drosophila circular RNAs harbor >1000 well-conserved canonical miRNA seed matches, especially within coding regions, and coding conserved miRNA sites reside preferentially within circularized exons. Finally, we analyze the developmental and tissue specificity of circular RNAs, and note their preferred derivation from neural genes and enhanced accumulation in neural tissues. Interestingly, circular isoforms increase dramatically relative to linear isoforms during CNS aging, and constitute a novel aging biomarker. PMID:25544350

  14. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.

    PubMed

    Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H

    2016-04-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.

  15. Genome-wide Analysis of Drosophila Circular RNAs Reveals Their Structural and Sequence Properties and Age-Dependent Neural Accumulation

    DOE PAGES

    Westholm, Jakub O.; Miura, Pedro; Olson, Sara; ...

    2014-11-26

    Circularization was recently recognized to broadly expand transcriptome complexity. Here, we exploit massive Drosophila total RNA-sequencing data, >5 billion paired-end reads from >100 libraries covering diverse developmental stages, tissues, and cultured cells, to rigorously annotate >2,500 fruit fly circular RNAs. These mostly derive from back-splicing of protein-coding genes and lack poly(A) tails, and the circularization of hundreds of genes is conserved across multiple Drosophila species. We elucidate structural and sequence properties of Drosophila circular RNAs, which exhibit commonalities and distinctions from mammalian circles. Notably, Drosophila circular RNAs harbor >1,000 well-conserved canonical miRNA seed matches, especially within coding regions, and coding conserved miRNA sites reside preferentially within circularized exons. Finally, we analyze the developmental and tissue specificity of circular RNAs and note their preferred derivation from neural genes and enhanced accumulation in neural tissues. Interestingly, circular isoforms increase substantially relative to linear isoforms during CNS aging and constitute an aging biomarker.

  16. Genome-wide Analysis of Drosophila Circular RNAs Reveals Their Structural and Sequence Properties and Age-Dependent Neural Accumulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Westholm, Jakub O.; Miura, Pedro; Olson, Sara

    Circularization was recently recognized to broadly expand transcriptome complexity. Here, we exploit massive Drosophila total RNA-sequencing data, >5 billion paired-end reads from >100 libraries covering diverse developmental stages, tissues, and cultured cells, to rigorously annotate >2,500 fruit fly circular RNAs. These mostly derive from back-splicing of protein-coding genes and lack poly(A) tails, and the circularization of hundreds of genes is conserved across multiple Drosophila species. We elucidate structural and sequence properties of Drosophila circular RNAs, which exhibit commonalities and distinctions from mammalian circles. Notably, Drosophila circular RNAs harbor >1,000 well-conserved canonical miRNA seed matches, especially within coding regions, and coding conserved miRNA sites reside preferentially within circularized exons. Finally, we analyze the developmental and tissue specificity of circular RNAs and note their preferred derivation from neural genes and enhanced accumulation in neural tissues. Interestingly, circular isoforms increase substantially relative to linear isoforms during CNS aging and constitute an aging biomarker.

  17. Hairpin structures with conserved sequence motifs determine the 3' ends of non-polyadenylated invertebrate iridovirus transcripts.

    PubMed

    İnce, İkbal Agah; Pijlman, Gorben P; Vlak, Just M; van Oers, Monique M

    2017-11-01

    Previously, we observed that the transcripts of Invertebrate iridescent virus 6 (IIV6) are not polyadenylated, in line with the absence of canonical poly(A) motifs (AATAAA) downstream of the open reading frames (ORFs) in the genome. Here, we determined the 3' ends of the transcripts of fifty-four IIV6 virion protein genes in infected Drosophila Schneider 2 (S2) cells. By using ligation-based amplification of cDNA ends (LACE) it was shown that the IIV6 mRNAs often ended with a CAUUA motif. In silico analysis showed that the 3'-untranslated regions of IIV6 genes have the ability to form hairpin structures (22-56 nt in length) and that for about half of all IIV6 genes these 3' sequences contained complementary TAATG and CATTA motifs. We also show that a hairpin in the 3' flanking region with conserved sequence motifs is a conserved feature in invertebrate-infecting iridoviruses (genus Iridovirus and Chloriridovirus). Copyright © 2017 Elsevier Inc. All rights reserved.
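
    The complementarity of the TAATG and CATTA motifs mentioned in this abstract can be checked with a few lines of code. The following is a minimal sketch, not the authors' in silico analysis; the function names `reverse_complement` and `can_pair` are hypothetical and chosen only for illustration. It shows that one motif is the reverse complement of the other, which is what allows the two to base-pair in the stem of a hairpin.

```python
# Illustrative sketch only: test whether two DNA motifs are reverse complements.
# Function names are hypothetical and not taken from the cited study.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def can_pair(motif_a: str, motif_b: str) -> bool:
    """True if motif_b is the reverse complement of motif_a."""
    return reverse_complement(motif_a) == motif_b


if __name__ == "__main__":
    print(reverse_complement("TAATG"))   # CATTA
    print(can_pair("TAATG", "CATTA"))    # True
```

    Real hairpin prediction also depends on loop length and folding energy, which this check does not model.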

  18. The spotted gar genome illuminates vertebrate evolution and facilitates human-to-teleost comparisons

    PubMed Central

    Braasch, Ingo; Gehrke, Andrew R.; Smith, Jeramiah J.; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M.; Campbell, Michael S.; Barrell, Daniel; Martin, Kyle J.; Mulley, John F.; Ravi, Vydianathan; Lee, Alison P.; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E. G.; Sun, Yi; Hertel, Jana; Beam, Michael J.; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H.; Litman, Gary W.; Litman, Ronda T.; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F.; Wang, Han; Taylor, John S.; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M. J.; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A.; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T.; Venkatesh, Byrappa; Holland, Peter W. H.; Guiguen, Yann; Bobe, Julien; Shubin, Neil H.; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H.

    2016-01-01

    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences. PMID:26950095

  19. Determinism and randomness in the evolution of introns and sine inserts in mouse and human mitochondrial solute carrier and cytokine receptor genes.

    PubMed

    Cianciulli, Antonia; Calvello, Rosa; Panaro, Maria A

    2015-04-01

    In the homologous genes studied, the exons and introns alternated in the same order in mouse and human. We studied, in both species: corresponding short segments of introns, whole corresponding introns and complete homologous genes. We considered the total number of nucleotides and the number and orientation of the SINE inserts. Comparisons of mouse and human data series showed that at the level of individual relatively short segments of intronic sequences the stochastic variability prevails in the local structuring, but at higher levels of organization a deterministic component emerges, conserved in mouse and human during the divergent evolution, despite the ample re-editing of the intronic sequences and the fact that processes such as SINE spread had taken place in an independent way in the two species. Intron conservation is negatively correlated with the SINE occupancy, suggesting that virus inserts interfere with the conservation of the sequences inherited from the common ancestor. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. A +1 ribosomal frameshifting motif prevalent among plant amalgaviruses.

    PubMed

    Nibert, Max L; Pyle, Jesse D; Firth, Andrew E

    2016-11-01

    Sequence accessions attributable to novel plant amalgaviruses have been found in the Transcriptome Shotgun Assembly database. Sixteen accessions, derived from 12 different plant species, appear to encompass the complete protein-coding regions of the proposed amalgaviruses, which would substantially expand the size of genus Amalgavirus from 4 current species. Other findings include evidence for UUU_CGN as a +1 ribosomal frameshifting motif prevalent among plant amalgaviruses; for a variant version of this motif found thus far in only two amalgaviruses from solanaceous plants; for a region of α-helical coiled coil propensity conserved in a central region of the ORF1 translation product of plant amalgaviruses; and for conserved sequences in a C-terminal region of the ORF2 translation product (RNA-dependent RNA polymerase) of plant amalgaviruses, seemingly beyond the region of conserved polymerase motifs. These results additionally illustrate the value of mining the TSA database and others for novel viral sequences for comparative analyses. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  1. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

    PubMed Central

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-01-01

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

  2. Conservation of CD44 exon v3 functional elements in mammals

    PubMed Central

    Vela, Elena; Hilari, Josep M; Delclaux, María; Fernández-Bellon, Hugo; Isamat, Marcos

    2008-01-01

    Background The human CD44 gene contains 10 variable exons (v1 to v10) that can be alternatively spliced to generate hundreds of different CD44 protein isoforms. Human CD44 variable exon v3 inclusion in the final mRNA depends on a multisite bipartite splicing enhancer located within the exon itself, which we have recently described, and provides the protein domain responsible for growth factor binding to CD44. Findings We have analyzed the sequence of CD44v3 in 95 mammalian species to report high conservation levels for both its splicing regulatory elements (the 3' splice site and the exonic splicing enhancer), and the functional glycosaminoglycan binding site coded by v3. We also report the functional expression of CD44v3 isoforms in peripheral blood cells of different mammalian taxa with both consensus and variant v3 sequences. Conclusion CD44v3 mammalian sequences maintain all functional splicing regulatory elements as well as the GAG binding site with the same relative positions and sequence identity previously described during alternative splicing of human CD44. The sequence within the GAG attachment site, which in turn contains the Y motif of the exonic splicing enhancer, is more conserved relative to the rest of the exon. Amplification of CD44v3 sequence from mammalian species but not from birds, fish or reptiles may support classifying CD44v3 as an exclusively mammalian gene trait. PMID:18710510

  3. [Identification of new conserved and variable regions in the 16S rRNA gene of acetic acid bacteria and acetobacteraceae family].

    PubMed

    Chakravorty, S; Sarkar, S; Gachhui, R

    2015-01-01

    The Acetobacteraceae family of the class Alpha Proteobacteria is comprised of high sugar and acid tolerant bacteria. The Acetic Acid Bacteria are the economically most significant group of this family because of its association with food products like vinegar, wine etc. Acetobacteraceae are often hard to culture in laboratory conditions and they also maintain very low abundances in their natural habitats. Thus identification of the organisms in such environments is greatly dependent on modern tools of molecular biology which require a thorough knowledge of specific conserved gene sequences that may act as primers and/or probes. Moreover, unconserved domains in genes also become markers for differentiating closely related genera. In bacteria, the 16S rRNA gene is an ideal candidate for such conserved and variable domains. In order to study the conserved and variable domains of the 16S rRNA gene of Acetic Acid Bacteria and the Acetobacteraceae family, sequences from publicly available databases were aligned and compared. Near complete sequences of the gene were also obtained from Kombucha tea biofilm, a known Acetobacteraceae family habitat, in order to corroborate the domains obtained from the alignment studies. The study indicated that the degree of conservation in the gene is significantly higher among the Acetic Acid Bacteria than the whole Acetobacteraceae family. Moreover, it was also observed that the previously described hypervariable regions V1, V3, V5, V6 and V7 were more or less conserved in the family and the spans of the variable regions are quite distinct as well.

  4. Identification of novel microRNAs in Hevea brasiliensis and computational prediction of their targets

    PubMed Central

    2012-01-01

    Background Plants respond to external stimuli through fine regulation of gene expression partially ensured by small RNAs. Of these, microRNAs (miRNAs) play a crucial role. They negatively regulate gene expression by targeting the cleavage or translational inhibition of target messenger RNAs (mRNAs). In Hevea brasiliensis, environmental and harvesting stresses are known to affect natural rubber production. This study set out to identify abiotic stress-related miRNAs in Hevea using next-generation sequencing and bioinformatic analysis. Results Deep sequencing of small RNAs was carried out on plantlets subjected to severe abiotic stress using the Solexa technique. By combining the LeARN pipeline, data from the Plant microRNA database (PMRD) and Hevea EST sequences, we identified 48 conserved miRNA families already characterized in other plant species, and 10 putatively novel miRNA families. The results showed the most abundant size for miRNAs to be 24 nucleotides, except for seven families. Several MIR genes produced both 20-22 nucleotides and 23-27 nucleotides. The two miRNA class sizes were detected for both conserved and putative novel miRNA families, suggesting their functional duality. The EST databases were scanned with conserved and novel miRNA sequences. MiRNA targets were computationally predicted and analysed. The predicted targets involved in "responses to stimuli" and to "antioxidant" and "transcription activities" are presented. Conclusions Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs when the complete genome is not yet available. Our study provided additional information for evolutionary studies and revealed potentially specific regulation of the control of redox status in Hevea. PMID:22330773

  5. Analysis of hepatitis B virus preS1 variability and prevalence of the rs2296651 polymorphism in a Spanish population

    PubMed Central

    Casillas, Rosario; Tabernero, David; Gregori, Josep; Belmonte, Irene; Cortese, Maria Francesca; González, Carolina; Riveiro-Barciela, Mar; López, Rosa Maria; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco

    2018-01-01

    AIM To determine the variability/conservation of the domain of hepatitis B virus (HBV) preS1 region that interacts with sodium-taurocholate cotransporting polypeptide (hereafter, NTCP-interacting domain) and the prevalence of the rs2296651 polymorphism (S267F, NTCP variant) in a Spanish population. METHODS Serum samples from 246 individuals were included and divided into 3 groups: patients with chronic HBV infection (CHB) (n = 41, 73% Caucasians), patients with resolved HBV infection (n = 100, 100% Caucasians) and an HBV-uninfected control group (n = 105, 100% Caucasians). Variability/conservation of the amino acid (aa) sequences of the NTCP-interacting domain, (aa 2-48 in viral genotype D) and a highly conserved preS1 domain associated with virion morphogenesis (aa 92-103 in viral genotype D) were analyzed by next-generation sequencing and compared in 18 CHB patients with viremia > 4 log IU/mL. The rs2296651 polymorphism was determined in all individuals in all 3 groups using an in-house real-time PCR melting curve analysis. RESULTS The HBV preS1 NTCP-interacting domain showed a high degree of conservation among the examined viral genomes especially between aa 9 and 21 (in the genotype D consensus sequence). As compared with the virion morphogenesis domain, the NTCP-interacting domain had a smaller proportion of HBV genotype-unrelated changes comprising > 1% of the quasispecies (25.5% vs 31.8%), but a larger proportion of genotype-associated viral polymorphisms (34% vs 27.3%), according to consensus sequences from GenBank patterns of HBV genotypes A to H. Variation/conservation in both domains depended on viral genotype, with genotype C being the most highly conserved and genotype E the most variable (limited finding, only 2 genotype E included). Of note, proline residues were highly conserved in both domains, and serine residues showed changes only to threonine or tyrosine in the virion morphogenesis domain. The rs2296651 polymorphism was not detected in any participant. CONCLUSION In our CHB population, the NTCP-interacting domain was highly conserved, particularly the proline residues and essential amino acids related with the NTCP interaction, and the prevalence of rs2296651 was low/null. PMID:29456407

  6. Tourism and the conservation of critically endangered frogs.

    PubMed

    Morrison, Clare; Simpkins, Clay; Castley, J Guy; Buckley, Ralf C

    2012-01-01

    Protected areas are critical for the conservation of many threatened species. Despite this, many protected areas are acutely underfunded, which reduces their effectiveness significantly. Tourism is one mechanism to promote and fund conservation in protected areas, but there are few studies analyzing its tangible conservation outcomes for threatened species. This study uses the 415 IUCN critically endangered frog species to evaluate the contribution of protected area tourism revenue to conservation. Contributions were calculated for each species as the proportion of geographic range inside protected areas multiplied by the proportion of protected area revenues derived from tourism. Geographic ranges were determined from IUCN Extent of Occurrence maps. Almost 60% (239) of critically endangered frog species occur in protected areas. Higher proportions of total range are protected in Nearctic, Australasian and Afrotropical regions. Tourism contributions to protected area budgets ranged from 5-100%. These financial contributions are highest for developing countries in the Afrotropical, Indomalayan and Neotropical regions. Data for both geographic range and budget are available for 201 critically endangered frog species with proportional contributions from tourism to species protection ranging from 0.8-99%. Tourism's financial contributions to critically endangered frog species protection are highest in the Afrotropical region. This study uses a coarse measure but at the global scale it demonstrates that tourism has significant potential to contribute to global frog conservation efforts.

  7. Tourism and the Conservation of Critically Endangered Frogs

    PubMed Central

    Morrison, Clare; Simpkins, Clay; Castley, J. Guy; Buckley, Ralf C.

    2012-01-01

    Protected areas are critical for the conservation of many threatened species. Despite this, many protected areas are acutely underfunded, which reduces their effectiveness significantly. Tourism is one mechanism to promote and fund conservation in protected areas, but there are few studies analyzing its tangible conservation outcomes for threatened species. This study uses the 415 IUCN critically endangered frog species to evaluate the contribution of protected area tourism revenue to conservation. Contributions were calculated for each species as the proportion of geographic range inside protected areas multiplied by the proportion of protected area revenues derived from tourism. Geographic ranges were determined from IUCN Extent of Occurrence maps. Almost 60% (239) of critically endangered frog species occur in protected areas. Higher proportions of total range are protected in Nearctic, Australasian and Afrotropical regions. Tourism contributions to protected area budgets ranged from 5–100%. These financial contributions are highest for developing countries in the Afrotropical, Indomalayan and Neotropical regions. Data for both geographic range and budget are available for 201 critically endangered frog species with proportional contributions from tourism to species protection ranging from 0.8–99%. Tourism's financial contributions to critically endangered frog species protection are highest in the Afrotropical region. This study uses a coarse measure but at the global scale it demonstrates that tourism has significant potential to contribute to global frog conservation efforts. PMID:22984440
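
    The per-species measure described in this abstract is a product of two proportions. The sketch below is one plausible reading of that sentence, not the authors' code; the function name `tourism_contribution` and the example values are hypothetical and are not data from the study.

```python
# Illustrative sketch of the per-species measure described in the abstract:
# (proportion of range inside protected areas) x (proportion of protected
# area revenue derived from tourism). Names and inputs are hypothetical.


def tourism_contribution(range_protected: float, revenue_from_tourism: float) -> float:
    """Return the proportional contribution of tourism to a species' protection.

    Both arguments are proportions in the closed interval [0, 1].
    """
    for label, value in (("range_protected", range_protected),
                         ("revenue_from_tourism", revenue_from_tourism)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{label} must be between 0 and 1, got {value}")
    return range_protected * revenue_from_tourism


if __name__ == "__main__":
    # Hypothetical species: 40% of its range is protected, and tourism
    # supplies half of the protected area budget.
    print(tourism_contribution(0.4, 0.5))
```

    Because both factors are at most 1, the result can never exceed the smaller of the two, which is why the abstract calls the measure coarse: it says nothing about how revenue is actually spent.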

  8. Global meta-analysis reveals low consistency of biodiversity congruence relationships.

    PubMed

    Westgate, Martin J; Barton, Philip S; Lane, Peter W; Lindenmayer, David B

    2014-05-21

    Knowledge of the number and distribution of species is fundamental to biodiversity conservation efforts, but this information is lacking for the majority of species on earth. Consequently, subsets of taxa are often used as proxies for biodiversity; but this assumes that different taxa display congruent distribution patterns. Here we use a global meta-analysis to show that studies of cross-taxon congruence rarely give consistent results. Instead, species richness congruence is highest at extreme spatial scales and close to the equator, while congruence in species composition is highest at large extents and grain sizes. Studies display highest variance in cross-taxon congruence when conducted in areas with dissimilar areal extents (for species richness) or latitudes (for species composition). These results undermine the assumption that a subset of taxa can be representative of biodiversity. Therefore, researchers whose goal is to prioritize locations or actions for conservation should use data from a range of taxa.

  9. The sequence, structure and evolutionary features of HOTAIR in mammals

    PubMed Central

    2011-01-01

    Background An increasing number of long noncoding RNAs (lncRNAs) have been identified recently. Different from all the others that function in cis to regulate local gene expression, the newly identified HOTAIR is located between HoxC11 and HoxC12 in the human genome and regulates HoxD expression in multiple tissues. Like the well-characterised lncRNA Xist, HOTAIR binds to polycomb proteins to methylate histones at multiple HoxD loci, but unlike Xist, many details of its structure and function, as well as the trans regulation, remain unclear. Moreover, HOTAIR is involved in the aberrant regulation of gene expression in cancer. Results To identify conserved domains in HOTAIR and study the phylogenetic distribution of this lncRNA, we searched the genomes of 10 mammalian and 3 non-mammalian vertebrates for matches to its 6 exons and the two conserved domains within the 1800 bp exon6 using Infernal. There was just one high-scoring hit for each mammal, but many low-scoring hits were found in both mammals and non-mammalian vertebrates. These hits and their flanking genes in four placental mammals and platypus were examined to determine whether HOTAIR contained elements shared by other lncRNAs. Several of the hits were within unknown transcripts or ncRNAs, many were within introns of, or antisense to, protein-coding genes, and conservation of the flanking genes was observed only between human and chimpanzee. Phylogenetic analysis revealed discrete evolutionary dynamics for orthologous sequences of HOTAIR exons. Exon1 at the 5' end and a domain in exon6 near the 3' end, which contain domains that bind to multiple proteins, have evolved faster in primates than in other mammals. Structures were predicted for exon1, two domains of exon6 and the full HOTAIR sequence. The sequence and structure of two fragments, in exon1 and the domain B of exon6 respectively, were identified to robustly occur in predicted structures of exon1, domain B of exon6 and the full HOTAIR in mammals. Conclusions HOTAIR exists in mammals, has poorly conserved sequences and considerably conserved structures, and has evolved faster than nearby HoxC genes. Exons of HOTAIR show distinct evolutionary features, and a 239 bp domain in the 1804 bp exon6 is especially conserved. These features, together with the absence of some exons and sequences in mouse, rat and kangaroo, suggest ab initio generation of HOTAIR in marsupials. Structure prediction identifies two fragments in the 5' end exon1 and the 3' end domain B of exon6, with sequence and structure invariably occurring in various predicted structures of exon1, the domain B of exon6 and the full HOTAIR. PMID:21496275

  10. Conserved and species-specific transcription factor co-binding patterns drive divergent gene regulation in human and mouse

    PubMed Central

    Diehl, Adam G

    2018-01-01

    Abstract The mouse is widely used as a system to study human genetic mechanisms. However, extensive rewiring of transcriptional regulatory networks often confounds translation of findings between human and mouse. Site-specific gain and loss of individual transcription factor binding sites (TFBS) has caused functional divergence of orthologous regulatory loci, and so we must look beyond this positional conservation to understand common themes of regulatory control. Fortunately, transcription factor co-binding patterns shared across species often perform conserved regulatory functions. These can be compared to ‘regulatory sentences’ that retain the same meanings regardless of sequence and species context. By analyzing TFBS co-occupancy patterns observed in four human and mouse cell types, we learned a regulatory grammar: the rules by which TFBS are combined into meaningful regulatory sentences. Different parts of this grammar associate with specific sets of functional annotations regardless of sequence conservation and predict functional signatures more accurately than positional conservation. We further show that both species-specific and conserved portions of this grammar are involved in gene expression divergence and human disease risk. These findings expand our understanding of transcriptional regulatory mechanisms, suggesting that phenotypic divergence and disease risk are driven by a complex interplay between deeply conserved and species-specific transcriptional regulatory pathways. PMID:29361190

  11. RNA-seq Transcriptome Analysis of Panax japonicus, and Its Comparison with Other Panax Species to Identify Potential Genes Involved in the Saponins Biosynthesis

    PubMed Central

    Rai, Amit; Yamazaki, Mami; Takahashi, Hiroki; Nakamura, Michimi; Kojoma, Mareshige; Suzuki, Hideyuki; Saito, Kazuki

    2016-01-01

    The Panax genus has been a source of natural medicine, benefitting human health over the ages, among which the Panax japonicus represents an important species. Our understanding of several key pathways and enzymes involved in the biosynthesis of ginsenosides, a pharmacologically active class of metabolites and a major chemical constituent of the rhizome extracts from the Panax species, is limited. Limited genomic information, and lack of studies on comparative transcriptomics across the Panax species have restricted our understanding of the biosynthetic mechanisms of these and many other important classes of phytochemicals. Herein, we describe Illumina based RNA sequencing analysis to characterize the transcriptome and expression profiles of genes expressed in the five tissues of P. japonicus, and its comparison with other Panax species. RNA sequencing and de novo transcriptome assembly for P. japonicus resulted in a total of 135,235 unigenes with 78,794 (58.24%) unigenes being annotated using NCBI-nr database. Transcriptome profiling, and gene ontology enrichment analysis for five tissues of P. japonicus showed that, although overall processes were evenly conserved across all tissues, each tissue was characterized by several unique unigenes with the leaves showing the most unique unigenes among the tissues studied. A comparative analysis of the P. japonicus transcriptome assembly with publicly available transcripts from other Panax species, namely, P. ginseng, P. notoginseng, and P. quinquefolius also displayed high sequence similarity across all Panax species, with P. japonicus showing highest similarity with P. ginseng. Annotation of P. japonicus transcriptome resulted in the identification of putative genes encoding all enzymes from the triterpene backbone biosynthetic pathways, and identified 24 and 48 unigenes annotated as cytochrome P450 (CYP) and glycosyltransferases (GT), respectively. These CYP- and GT-annotated unigenes were conserved across all Panax species and co-expressed with other transcripts involved in the triterpenoid backbone biosynthesis pathways. Unigenes identified in this study represent strong candidates for being involved in the triterpenoid saponins biosynthesis, and can serve as a basis for future validation studies. PMID:27148308

  12. Typical Window, Interior Wall Paint Sequence, Wall Section, and Foundation ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    Typical Window, Interior Wall Paint Sequence, Wall Section, and Foundation Sections - Civilian Conservation Corps (CCC) Camp NP-5-C, Barracks No. 5, CCC Camp Historic District at Chapin Mesa, Cortez, Montezuma County, CO

  13. Conservation of the sequence of the Alzheimer's disease amyloid peptide in dog, polar bear and five other mammals by cross-species polymerase chain reaction analysis.

    PubMed

    Johnstone, E M; Chaney, M O; Norris, F H; Pascual, R; Little, S P

    1991-07-01

    Neuritic plaque and cerebrovascular amyloid deposits have been detected in the aged monkey, dog, and polar bear and have rarely been found in aged rodents (Biochem. Biophy. Res. Commun., 12 (1984) 885-890; Proc. Natl. Acad. Sci. U.S.A., 82 (1985) 4245-4249). To determine if the primary structure of the 42-43 residue amyloid peptide is conserved in species that accumulate plaques, the region of the amyloid precursor protein (APP) cDNA that encodes the peptide region was amplified by the polymerase chain reaction and sequenced. The deduced amino acid sequence was compared to those species where amyloid accumulation has not been detected. The DNA sequences of dog, polar bear, rabbit, cow, sheep, pig and guinea pig were compared and a phylogenetic tree was generated. We conclude that the amino acid sequence of dog and polar bear and other mammals which may form amyloid plaques is conserved and the species where amyloid has not been detected (mouse, rat) may be evolutionarily a distinct group. In addition, the predicted secondary structure of mouse and rat amyloid that differs from that of amyloid bearing species is its lack of propensity to form a beta sheeted structure. Thus, a cross-species examination of the amyloid peptide may suggest what is essential for amyloid deposition.

  14. A dehydrin cognate protein from pea (Pisum sativum L.) with an atypical pattern of expression.

    PubMed

    Robertson, M; Chandler, P M

    1994-11-01

    Dehydrins are a family of proteins characterised by conserved amino acid motifs, and induced in plants by dehydration or treatment with ABA. An antiserum was raised against a synthetic oligopeptide based on the most highly conserved dehydrin amino acid motif, the lysine-rich block (core sequence KIKEK-LPG). This antiserum detected a novel M(r) 40,000 polypeptide and enabled isolation of a corresponding cDNA clone, pPsB61 (B61). The deduced amino acid sequence contained two lysine-rich blocks; however, the remainder of the sequence differed markedly from other pea dehydrins. Surprisingly, the sequence contained a stretch of serine residues, a characteristic common to dehydrins from many plant species but which is missing in pea dehydrin. The expression patterns of B61 mRNA and polypeptide were distinctively different from those of the pea dehydrins during seed development, germination and in young seedlings exposed to dehydration stress or treated with ABA. In particular, dehydration stress led to slightly reduced levels of B61 RNA, and ABA application to young seedlings had no marked effect on its abundance. The M(r) 40,000 polypeptide is thus related to pea dehydrin by the presence of the most highly conserved amino acid sequence motifs, but lacks the characteristic expression pattern of dehydrin. By analogy with heat shock cognate proteins we refer to this protein as a dehydrin cognate.

  15. Synteny of Prunus and other model plant species

    PubMed Central

    Jung, Sook; Jiwan, Derick; Cho, Ilhyung; Lee, Taein; Abbott, Albert; Sosinski, Bryon; Main, Dorrie

    2009-01-01

    Background Fragmentary conservation of synteny has been reported between map-anchored Prunus sequences and Arabidopsis. With the availability of genome sequence for fellow rosid I members Populus and Medicago, we analyzed the synteny between Prunus and the three model genomes. Eight Prunus BAC sequences and map-anchored Prunus sequences were used in the comparison. Results We found a well conserved synteny across the Prunus species – peach, plum, and apricot – and Populus using a set of homologous Prunus BACs. Conversely, we could not detect any synteny with Arabidopsis in this region. Other peach BACs also showed extensive synteny with Populus. The syntenic regions detected were up to 477 kb in Populus. Two syntenic regions between Arabidopsis and these BACs were much shorter, around 10 kb. We also found syntenic regions that are conserved between the Prunus BACs and Medicago. The array of synteny corresponded with the proposed whole genome duplication events in Populus and Medicago. Using map-anchored Prunus sequences, we detected many syntenic blocks with several gene pairs between Prunus and Populus or Arabidopsis. We observed a more complex network of synteny between Prunus-Arabidopsis, indicative of multiple genome duplication and subsequent gene loss in Arabidopsis. Conclusion Our result shows the striking microsynteny between the Prunus BACs and the genome of Populus and Medicago. In macrosynteny analysis, more distinct Prunus regions were syntenic to Populus than to Arabidopsis. PMID:19208249

  16. Local Renyi entropic profiles of DNA sequences.

    PubMed

    Vinga, Susana; Almeida, Jonas S

    2007-10-16

    In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
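
    The continuous entropy measure mentioned at the start of this abstract can be illustrated with a minimal sketch. It assumes the conventional CGR corner assignment (A, C, G, T at the corners of the unit square), a Gaussian Parzen kernel, and Rényi order 2, for which the entropy of the kernel estimate has a closed form as a double sum over point pairs. The function names, the corner layout, and the kernel width `sigma` are illustrative choices, not taken from the record, and the sketch computes one global entropy value rather than the local entropic profiles the report defines.

```python
import math

# Conventional CGR corner assignment (an assumption; the abstract does not fix one).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Map a DNA string to Chaos Game Representation points in the unit square."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        # Each step moves halfway towards the corner of the current base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def renyi2_entropy(points, sigma=0.1):
    """Order-2 Renyi entropy of a Gaussian Parzen-window density estimate.

    For Gaussian kernels of variance sigma^2, the integral of the squared
    density estimate equals the mean over all point pairs of a Gaussian
    with variance 2*sigma^2 evaluated at the pair difference.
    """
    n = len(points)
    var = 2.0 * sigma ** 2                 # variance of the convolved kernel
    norm = 1.0 / (2.0 * math.pi * var)     # 2-D Gaussian normalisation
    total = 0.0
    for xi, yi in points:
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += norm * math.exp(-d2 / (2.0 * var))
    return -math.log(total / (n * n))


if __name__ == "__main__":
    # A homopolymer collapses towards one corner, so its entropy is lower
    # than that of a sequence that visits all four quadrants.
    print(renyi2_entropy(cgr_points("A" * 40)))
    print(renyi2_entropy(cgr_points("ACGT" * 10)))
```

    The double loop is quadratic in sequence length, which is acceptable for short illustrative inputs only.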

  17. Local Renyi entropic profiles of DNA sequences

    PubMed Central

    Vinga, Susana; Almeida, Jonas S

    2007-01-01

    Background In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. Conclusion The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures. PMID:17939871

  18. Characterization of the complete genome segments from BmCPV-SZ, a novel Bombyx mori cypovirus 1 isolate.

    PubMed

    Cao, Guangli; Meng, Xiangkun; Xue, Renyu; Zhu, Yuexiong; Zhang, Xiaorong; Pan, Zhonghua; Zheng, Xiaojian; Gong, Chengliang

    2012-07-01

    A novel Bombyx mori cypovirus 1 was isolated from infected silkworm larvae and tentatively assigned as Bombyx mori cypovirus 1 isolate Suzhou (BmCPV-SZ). The complete nucleotide sequences of genomic segments S1-S10 from BmCPV-SZ were determined. All segments possessed a single open reading frame; however, bioinformatic evidence suggested a short overlapping coding sequence in S1. Each BmCPV-SZ segment possessed the conserved terminal sequences AGUAA and GUUAGCC at the 5' and 3' ends, respectively. The conserved A/G at the -3 position in relation to the AUG codon could be found in the BmCPV-SZ genome, and it was postulated that this conserved A/G may be the most important nucleotide for efficient translation initiation in cypoviruses (CPVs). Examination of the putative amino acid sequences encoded by BmCPV-SZ revealed some characteristic motifs. Homology searches showed that viral structural proteins VP1, VP3, and VP4 had localized homologies with proteins of Rice ragged stunt virus, a member of the genus Oryzavirus within the family Reoviridae. A phylogenetic tree based on RNA-dependent RNA polymerase sequences demonstrated that CPV is more closely related to Rice ragged stunt virus and Aedes pseudoscutellaris reovirus than to other members of Reoviridae, suggesting that they may have originated from common ancestors.

  19. A bacterial laccase from marine microbial metagenome exhibiting chloride tolerance and dye decolorization ability.

    PubMed

    Fang, Zemin; Li, Tongliang; Wang, Quan; Zhang, Xuecheng; Peng, Hui; Fang, Wei; Hong, Yuzhi; Ge, Honghua; Xiao, Yazhong

    2011-02-01

    Laccases are blue multicopper oxidases with potential applications in environmental and industrial biotechnology. In this study, a new bacterial laccase gene of 1.32 kb was obtained from a marine microbial metagenome of the South China Sea by using a sequence screening strategy. The protein (named as Lac15) of 439 amino acids encoded by the gene contains three conserved Cu(2+)-binding domains, but shares less than 40% of sequence identities with all of the bacterial multicopper oxidases characterized. Lac15, recombinantly expressed in Escherichia coli, showed high activity towards syringaldazine at pH 6.5-9.0 with an optimum pH of 7.5 and with the highest activity occurring at 45 °C. Lac15 was stable at pH ranging from 5.5 to 9.0 and at temperatures from 15 to 45 °C. Distinguished from fungal laccases, the activity of Lac15 was enhanced twofold by chloride at concentrations lower than 700 mM, and kept the original level even at 1,000 mM chloride. Furthermore, Lac15 showed an ability to decolorize several industrial dyes of reactive azo class under alkalescent conditions. The properties of alkalescence-dependent activity, high chloride tolerance, and dye decolorization ability make the new laccase Lac15 an alternative for specific industrial applications.

  20. HCIV-1 and Other Tailless Icosahedral Internal Membrane-Containing Viruses of the Family Sphaerolipoviridae.

    PubMed

    Demina, Tatiana A; Pietilä, Maija K; Svirskaitė, Julija; Ravantti, Janne J; Atanasova, Nina S; Bamford, Dennis H; Oksanen, Hanna M

    2017-02-18

    Members of the virus family Sphaerolipoviridae include both archaeal viruses and bacteriophages that possess a tailless icosahedral capsid with an internal membrane. The genera Alpha- and Betasphaerolipovirus comprise viruses that infect halophilic euryarchaea, whereas viruses of thermophilic Thermus bacteria belong to the genus Gammasphaerolipovirus. Both sequence-based and structural clustering of the major capsid proteins and ATPases of sphaerolipoviruses yield three distinct clades corresponding to these three genera. Conserved virion architectural principles observed in sphaerolipoviruses suggest that these viruses belong to the PRD1-adenovirus structural lineage. Here we focus on archaeal alphasphaerolipoviruses and their related putative proviruses. The highest sequence similarities among alphasphaerolipoviruses are observed in the core structural elements of their virions: the two major capsid proteins, the major membrane protein, and a putative packaging ATPase. A recently described tailless icosahedral haloarchaeal virus, Haloarcula californiae icosahedral virus 1 (HCIV-1), has a double-stranded DNA genome and an internal membrane lining the capsid. HCIV-1 shares significant similarities with the other tailless icosahedral internal membrane-containing haloarchaeal viruses of the family Sphaerolipoviridae. The proposal to include a new virus species, Haloarcula virus HCIV1, into the genus Alphasphaerolipovirus was submitted to the International Committee on Taxonomy of Viruses (ICTV) in 2016.

  1. Isolation of prawn (Exopalaemon carinicauda) lipopolysaccharide and β-1,3-glucan binding protein gene and its expression in response to bacterial and viral infections

    NASA Astrophysics Data System (ADS)

    Ge, Qianqian; Li, Jian; Duan, Yafei; Li, Jitao; Sun, Ming; Zhao, Fazhen

    2016-04-01

    Pattern recognition proteins (PRPs) play a major role in the immune response of crustaceans against pathogens. In the present study, the gene for one PRP, lipopolysaccharide and β-1,3-glucan binding protein (LGBP), was isolated from the ridgetail white prawn (Exopalaemon carinicauda) and designated EcLGBP. The full-length cDNA of EcLGBP was 1338 bp, encoding a polypeptide of 366 amino acid residues. The deduced amino acid sequence of EcLGBP shared high similarities with LGBP and BGBP from other crustaceans. Some conserved domains were predicted in the EcLGBP sequence. EcLGBP was constitutively expressed in most tissues at different levels, and the highest expression was observed in hepatopancreas. With infection time, the cumulative mortality increased gradually, following the proliferation of Vibrio parahaemolyticus and white spot syndrome virus (WSSV). The expression of EcLGBP in response to V. parahaemolyticus infection was up-regulated in hemocytes and hepatopancreas, and the up-regulation in hepatopancreas was earlier than that in hemocytes. EcLGBP expression after WSSV infection increased at 3 h, then significantly decreased in both hemocytes and hepatopancreas. The results indicated that EcLGBP was involved in the immune defense against bacterial and viral infections.

  2. [Cloning and bioinformatic analysis and expression analysis of beta-glucuronidase in Scutellaria baicalensis].

    PubMed

    Guo, Shuang-shuang; Cheng, Lin; Yang, Li-min; Han, Mei

    2015-11-01

    The β-glucuronidase gene (sbGUS) cDNA was cloned for the first time from Scutellaria baicalensis leaf by RT-PCR, with GenBank accession number KR364726. The full-length cDNA of sbGUS was 1,584 bp with an open reading frame (ORF), encoding an unstable protein with 527 amino acids. The bioinformatic analysis showed that the sbGUS-encoded protein had an isoelectric point (pI) of 5.55 and a calculated molecular weight of about 58.7248 kDa, with a transmembrane region and a signal peptide, and had conserved domains of the glycoside hydrolase superfamily and an unintegrated trans-glycosidase catalytic structure. In the secondary structure, the percentages of alpha helix, extended strand, β-extended and random coil were 25.62%, 28.84%, 13.28% and 32.26%, respectively. Homology analysis indicated 98.93% nucleotide sequence similarity and 98.29% amino acid sequence similarity with S. baicalensis (BAA97804.1), with differences at nine positions. The expression level of sbGUS was the highest in root based on a real-time PCR analysis, followed by flower and stem, and the lowest was in stem. The results provide a foundation for exploring the molecular function of sbGUS involved in baicalcin biosynthesis based on synthetic biology approach in S. baicalensis plants.

  3. Metallothionein from Wild Populations of the African Catfish Clarias gariepinus: From Sequence, Protein Expression and Metal Binding Properties to Transcriptional Biomarker of Metal Pollution.

    PubMed

    M'kandawire, Ethel; Mierek-Adamska, Agnieszka; Stürzenbaum, Stephen R; Choongo, Kennedy; Yabe, John; Mwase, Maxwell; Saasa, Ngonda; Blindauer, Claudia A

    2017-07-18

    Anthropogenic pollution with heavy metals is an on-going concern throughout the world, and methods to monitor release and impact of heavy metals are of high importance. With a view to probe its suitability as molecular biomarker of metal pollution, this study has determined a coding sequence for metallothionein of the African sharptooth catfish Clarias gariepinus. The gene product was recombinantly expressed in Escherichia coli in presence of Zn(II), Cd(II), or Cu, and characterised by Electrospray Ionisation Mass Spectrometry and elemental analysis. C. gariepinus MT displays typical features of fish MTs, including 20 conserved cysteines, and seven bound divalent cations (Zn(II) or Cd(II)) when saturated. Livers from wild C. gariepinus fish collected in all three seasons from four different sites on the Kafue River of Zambia were analysed for their metal contents and for MT expression levels by quantitative PCR. Significant correlations were found between Zn and Cu levels and MT expression in livers, with MT expression clearly highest at the most polluted site, Chililabombwe, which is situated in the Copperbelt region. Based on our findings, hepatic expression of MT from C. gariepinus may be further developed as a major molecular biomarker of heavy metal pollution resulting from mining activities in this region.

  4. Cloning, sequencing and phylogenetic analysis of the small GTPase gene cdc-42 from Ancylostoma caninum.

    PubMed

    Yang, Yurong; Zheng, Jing; Chen, Jiaxin

    2012-12-01

    CDC-42 is a member of the Rho GTPase subfamily that is involved in many signaling pathways, including mitosis, cell polarity, cell migration and cytoskeleton remodeling. Here, we present the first characterization of a full-length cDNA encoding the small GTPase cdc-42, designated as Accdc-42, isolated from the parasitic nematode Ancylostoma caninum. The encoded protein contains 191 amino acid residues with a predicted molecular weight of 21 kDa and displays a high level of identity with the Rho-family GTPase protein CDC-42. Phylogenetic analysis revealed that Accdc-42 was most closely related to Caenorhabditis briggsae cdc-42. Comparison with selected sequences from the free-living nematode Caenorhabditis elegans, Drosophila melanogaster, Xenopus laevis, Danio rerio, Mus musculus and human genomes showed that Accdc-42 is highly conserved. AcCDC-42 demonstrates the highest identity to CDC-42 from C. briggsae (94.2%), and it also exhibits 91.6% identity to CDC-42 from C. elegans and 91.1% from Brugia malayi. Additionally, the transcript of Accdc-42 was analyzed during the different developmental stages of the worm. Accdc-42 was expressed in the L1/L2 larvae, L3 larvae and female and male adults of A. caninum. Copyright © 2012 Elsevier Inc. All rights reserved.

  5. Characterization, Molecular Cloning, and Differential Expression Analysis of Laccase Genes from the Edible Mushroom Lentinula edodes

    PubMed Central

    Zhao, J.; Kwan, H. S.

    1999-01-01

    The effect of different substrates and various developmental stages (mycelium growth, primordium appearance, and fruiting-body formation) on laccase production in the edible mushroom Lentinula edodes was studied. The cap of the mature mushroom showed the highest laccase activity, and laccase activity was not stimulated by some well-known laccase inducers or sawdust. For our molecular studies, two genomic DNA sequences, representing allelic variants of the L. edodes lac1 gene, were isolated, and DNA sequence analysis demonstrated that lac1 encodes a putative polypeptide of 526 amino acids which is interrupted by 13 introns. The two allelic genes differ at 95 nucleotides, which results in seven amino acid differences in the encoded protein. The copper-binding domains found in other laccase enzymes are conserved in the L. edodes Lac1 proteins. A fragment of a second laccase gene (lac2) was also isolated, and competitive PCR showed that expression of lac1 and lac2 genes was different under various conditions. Our results suggest that laccases may play a role in the morphogenesis of the mushroom. To our knowledge, this is the first report on the cloning of genes involved in lignocellulose degradation in this economically important edible fungus. PMID:10543802

  6. Sequence Bundles: a novel method for visualising, discovering and exploring sequence motifs

    PubMed Central

    2014-01-01

    Background We introduce Sequence Bundles--a novel data visualisation method for representing multiple sequence alignments (MSAs). We identify and address key limitations of the existing bioinformatics data visualisation methods (i.e. the Sequence Logo) by enabling Sequence Bundles to give salient visual expression to sequence motifs and other data features, which would otherwise remain hidden. Methods For the development of Sequence Bundles we employed research-led information design methodologies. Sequences are encoded as uninterrupted, semi-opaque lines plotted on a 2-dimensional reconfigurable grid. Each line represents a single sequence. The thickness and opacity of the stack at each residue in each position indicates the level of conservation and the lines' curved paths expose patterns in correlation and functionality. Several MSAs can be visualised in a composite image. The Sequence Bundles method is designed to favour a tangible, continuous and intuitive display of information. Results We have developed a software demonstration application for generating a Sequence Bundles visualisation of MSAs provided for the BioVis 2013 redesign contest. A subsequent exploration of the visualised line patterns allowed for the discovery of a number of interesting features in the dataset. Reported features include the extreme conservation of sequences displaying a specific residue and bifurcations of the consensus sequence. Conclusions Sequence Bundles is a novel method for visualisation of MSAs and the discovery of sequence motifs. It can aid in generating new insight and hypothesis making. Sequence Bundles is well disposed for future implementation as an interactive visual analytics software, which can complement existing visualisation tools. PMID:25237395
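
    The per-position conservation level that this record describes as line thickness and opacity can be approximated numerically. The sketch below is not the Sequence Bundles software; it is a hypothetical helper (`column_conservation`) that assumes an already aligned set of equal-length sequences and reports, for each column, the most common residue and the fraction of sequences carrying it.

```python
from collections import Counter


def column_conservation(alignment):
    """Return, per alignment column, the most common residue and its frequency (0-1).

    `alignment` is a list of equal-length strings (one aligned sequence each).
    Gap characters, if present, are counted like any other symbol.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    result = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        residue, count = counts.most_common(1)[0]
        result.append((residue, count / len(alignment)))
    return result


if __name__ == "__main__":
    toy_alignment = ["ACD", "ACE", "AGE", "ACE"]  # made-up example data
    for position, (residue, freq) in enumerate(column_conservation(toy_alignment), 1):
        print(position, residue, freq)
```

    A frequency of 1.0 marks a fully conserved column; lower values indicate positions where a consensus would split into several residues.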

  7. Selection of Optimal Polypurine Tract Region Sequences during Moloney Murine Leukemia Virus Replication

    PubMed Central

    Robson, Nicole D.; Telesnitsky, Alice

    2000-01-01

    Retrovirus plus-strand synthesis is primed by a cleavage remnant of the polypurine tract (PPT) region of viral RNA. In this study, we tested replication properties for Moloney murine leukemia viruses with targeted mutations in the PPT and in conserved sequences upstream, as well as for pools of mutants with randomized sequences in these regions. The importance of maintaining some purine residues within the PPT was indicated both by examining the evolution of random PPT pools and from the replication properties of targeted mutants. Although many different PPT sequences could support efficient replication and one mutant that contained two differences in the core PPT was found to replicate as well as the wild type, some sequences in the core PPT clearly conferred advantages over others. Contributions of sequences upstream of the core PPT were examined with deletion mutants. A conserved T-stretch within the upstream sequence was examined in detail and found to be unimportant to helper functions. Evolution of virus pools containing randomized T-stretch sequences demonstrated marked preference for the wild-type sequence in six of its eight positions. These findings demonstrate that maintenance of the T-rich element is more important to viral replication than is maintenance of the core PPT. PMID:11044073

  8. Genome variability of foot-and-mouth disease virus during the short period of the 2010 epidemic in Japan.

    PubMed

    Nishi, Tatsuya; Yamada, Manabu; Fukai, Katsuhiko; Shimada, Nobuaki; Morioka, Kazuki; Yoshida, Kazuo; Sakamoto, Kenichi; Kanno, Toru; Yamakawa, Makoto

    2017-02-01

    Foot-and-mouth disease virus (FMDV) is highly contagious and has a high mutation rate, leading to extensive genetic variation. To investigate how FMDV genetically evolves over a short period of an epidemic after initial introduction into an FMD-free area, whole L-fragment sequences of 104 FMDVs isolated from the 2010 epidemic in Japan, which continued for less than three months, were determined and phylogenetically and comparatively analyzed. Phylogenetic analysis of whole L-fragment sequences showed that these isolates were classified into a single group, indicating that FMDV was introduced into Japan in the epidemic via a single introduction. Nucleotide sequences of 104 virus isolates showed more than 99.56% pairwise identity rates without any genetic deletion or insertion, although no sequences were completely identical with each other. These results indicate that genetic substitutions of FMDV occurred gradually and constantly during the epidemic and generation of an extensive mutant virus could have been prevented by a rapid eradication strategy. From comparative analysis of variability of each FMDV protein coding region, VP4 and 2C regions showed the highest average identity rates and invariant rates, and were confirmed as highly conserved. In contrast, the protein coding regions VP2 and VP1 were confirmed to be highly variable regions with the lowest average identity rates and invariant rates, respectively. Our data demonstrate the importance of a rapid eradication strategy in an FMD epidemic and provide valuable information on the genome variability of FMDV during the short period of an epidemic. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
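
    The two statistics reported in this abstract, pairwise identity rate and invariant rate, can be computed in a few lines when the sequences have equal length and no insertions or deletions, as the abstract states for these isolates. The sketch below is an illustration under that assumption; the function names and the toy sequences are hypothetical and do not reproduce the authors' pipeline or data.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length (no indels assumed)")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def pairwise_identities(seqs):
    """Percent identity for every unordered pair, keyed by (index_i, index_j)."""
    return {
        (i, j): percent_identity(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }


def invariant_rate(seqs):
    """Percentage of alignment columns that are identical in all sequences."""
    columns = list(zip(*seqs))
    invariant = sum(len(set(col)) == 1 for col in columns)
    return 100.0 * invariant / len(columns)


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]  # made-up example data
    print(pairwise_identities(toy))
    print(invariant_rate(toy))
```

    Applying both functions to one coding region at a time gives a per-region comparison of the kind the abstract describes for VP4, 2C, VP2 and VP1.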

  9. The Physics of Toys.

    ERIC Educational Resources Information Center

    Levinstein, Henry

    1982-01-01

    Outlines a course in which toys are used to demonstrate physics concepts, stressing available or easily constructed toys. A recent lecture sequence included toys demonstrating equilibrium, force/torque, linear/angular momentum conservation, energy conservation/storage, flying, vibrations, programed music, and others. Illustrations of selected toys…

  10. Landscape responses of bats to habitat fragmentation in Atlantic Forest of Paraguay

    USGS Publications Warehouse

    Gorresen, P.M.; Willig, M.R.

    2004-01-01

    Understanding effects of habitat loss and fragmentation on populations or communities is critical to effective conservation and restoration. This is particularly important for bats because they provide vital services to ecosystems via pollination and seed dispersal, especially in tropical and subtropical habitats. Based on more than 1,000 h of survey during a 15-month period, we quantified species abundances and community structure of phyllostomid bats at 14 sites in a 3,000-km² region of eastern Paraguay. Abundance was highest for Artibeus lituratus in deforested landscapes and for Chrotopterus auritus in forested habitats. In contrast, Artibeus fimbriatus, Carollia perspicillata, Glossophaga soricina, Platyrrhinus lineatus, Pygoderma bilabiatum, and Sturnira lilium attained highest abundance in moderately fragmented forest landscapes. Forest cover, patch size, and patch density frequently were associated with abundance of species. At the community level, species richness was highest in partly deforested landscapes, whereas evenness was greatest in forested habitat. In general, the highest diversity of bats occurred in landscapes comprising moderately fragmented forest habitat. This underscores the importance of remnant habitat patches to conservation strategies.

  11. Evaluating, Comparing, and Interpreting Protein Domain Hierarchies

    PubMed Central

    2014-01-01

    Arranging protein domain sequences hierarchically into evolutionarily divergent subgroups is important for investigating evolutionary history, for speeding up web-based similarity searches, for identifying sequence determinants of protein function, and for genome annotation. However, whether or not a particular hierarchy is optimal is often unclear, and independently constructed hierarchies for the same domain can often differ significantly. This article describes methods for statistically evaluating specific aspects of a hierarchy, for probing the criteria underlying its construction and for direct comparisons between hierarchies. Information theoretical notions are used to quantify the contributions of specific hierarchical features to the underlying statistical model. Such features include subhierarchies, sequence subgroups, individual sequences, and subgroup-associated signature patterns. Underlying properties are graphically displayed in plots of each specific feature's contributions, in heat maps of pattern residue conservation, in “contrast alignments,” and through cross-mapping of subgroups between hierarchies. Together, these approaches provide a deeper understanding of protein domain functional divergence, reveal uncertainties caused by inconsistent patterns of sequence conservation, and help resolve conflicts between competing hierarchies. PMID:24559108

  12. Cube - an online tool for comparison and contrasting of protein sequences.

    PubMed

    Zhang, Zong Hong; Khoo, Aik Aun; Mihalek, Ivana

    2013-01-01

    When comparing sequences of similar proteins, two kinds of questions can be asked, and the related two kinds of inference made. First, one may ask to what degree they are similar, and then, how they differ. In the first case one may tentatively conclude that the conserved elements common to all sequences are of central and common importance to the protein's function. In the latter case the regions of specialization may be discriminative of the function or binding partners across subfamilies of related proteins. Experimental efforts - mutagenesis or pharmacological intervention - can then be pointed in either direction, depending on the context of the study. Cube simplifies this process for users that already have their favorite sets of sequences, and helps them collate the information by visualization of the conservation and specialization scores on the sequence and on the structure, and by spreadsheet tabulation. All information can be visualized on the spot, or downloaded for reference and later inspection. http://eopsf.org/cube.

  13. Centromere-Like Regions in the Budding Yeast Genome

    PubMed Central

    Lefrançois, Philippe; Auerbach, Raymond K.; Yellman, Christopher M.; Roeder, G. Shirleen; Snyder, Michael

    2013-01-01

    Accurate chromosome segregation requires centromeres (CENs), the DNA sequences where kinetochores form, to attach chromosomes to microtubules. In contrast to most eukaryotes, which have broad centromeres, Saccharomyces cerevisiae possesses sequence-defined point CENs. Chromatin immunoprecipitation followed by sequencing (ChIP–Seq) reveals colocalization of four kinetochore proteins at novel, discrete, non-centromeric regions, especially when levels of the centromeric histone H3 variant, Cse4 (a.k.a. CENP-A or CenH3), are elevated. These regions of overlapping protein binding enhance the segregation of plasmids and chromosomes and have thus been termed Centromere-Like Regions (CLRs). CLRs form in close proximity to S. cerevisiae CENs and share characteristics typical of both point and regional CENs. CLR sequences are conserved among related budding yeasts. Many genomic features characteristic of CLRs are also associated with these conserved homologous sequences from closely related budding yeasts. These studies provide general and important insights into the origin and evolution of centromeres. PMID:23349633

  14. Conserved thioredoxin fold is present in Pisum sativum L. sieve element occlusion-1 protein

    PubMed Central

    Umate, Pavan; Tuteja, Renu

    2010-01-01

    Homology-based three-dimensional model for Pisum sativum sieve element occlusion 1 (Ps.SEO1) (forisomes) protein was constructed. A stretch of amino acids (residues 320 to 456) which is well conserved in all known members of forisomes proteins was used to model the 3D structure of Ps.SEO1. The structural prediction was done using Protein Homology/analogY Recognition Engine (PHYRE) web server. Based on studies of local sequence alignment, the thioredoxin-fold containing protein [Structural Classification of Proteins (SCOP) code d1o73a_], a member of the glutathione peroxidase family was selected as a template for modeling the spatial structure of Ps.SEO1. Selection was based on comparison of primary sequence, higher match quality and alignment accuracy. Motif 1 (EVF) is conserved in Ps.SEO1, Vicia faba (Vf.For1) and Medicago truncatula (MT.SEO3); motif 2 (KKED) is well conserved across all forisomes proteins and motif 3 (IGYIGNP) is conserved in Ps.SEO1 and Vf.For1. PMID:20404566

  15. Development and application of a PCR assay to detect chicken and turkey parvoviruses in commercial poultry flocks in the United States.

    USDA-ARS's Scientific Manuscript database

    Comparative sequence analysis of six independent chicken and turkey parvovirus nonstructural (NS) genes revealed specific genomic regions with 100% nucleotide sequence identity. A PCR assay with primers targeting these conserved genome sequences proved to be highly specific and sensitive to detect p...

  16. Generating carbon finance through avoided deforestation and its potential to create climatic, conservation and human development benefits.

    PubMed

    Ebeling, Johannes; Yasué, Maï

    2008-05-27

    Recent proposals to compensate developing countries for reducing emissions from deforestation (RED) under forthcoming climate change mitigation regimes are receiving increasing attention. Here we demonstrate that if RED credits were traded on international carbon markets, even moderate decreases in deforestation rates could generate billions of Euros annually for tropical forest conservation. We also discuss the main challenges for a RED mechanism that delivers real climatic benefits. These include providing sufficient incentives while only rewarding deforestation reductions beyond business-as-usual scenarios, addressing risks arising from forest degradation and international leakage, and ensuring permanence of emission reductions. Governance may become a formidable challenge for RED because some countries with the highest RED potentials score poorly on governance indices. In addition to climate mitigation, RED funds could help achieve substantial co-benefits for biodiversity conservation and human development. However, this will probably require targeted additional support because the highest biodiversity threats and human development needs may exist in countries that have limited income potentials from RED. In conclusion, how successfully a market-based RED mechanism can contribute to climate change mitigation, conservation and development will strongly depend on accompanying measures and carefully designed incentive structures involving governments, business, as well as the conservation and development communities.

  17. Embryonic lethality is not sufficient to explain hourglass-like conservation of vertebrate embryos.

    PubMed

    Uchida, Yui; Uesaka, Masahiro; Yamamoto, Takayoshi; Takeda, Hiroyuki; Irie, Naoki

    2018-01-01

    Understanding the general trends in developmental changes during animal evolution, which are often associated with morphological diversification, has long been a central issue in evolutionary developmental biology. Recent comparative transcriptomic studies revealed that gene expression profiles of mid-embryonic period tend to be more evolutionarily conserved than those in earlier or later periods. While the hourglass-like divergence of developmental processes has been demonstrated in a variety of animal groups such as vertebrates, arthropods, and nematodes, the exact mechanism leading to this mid-embryonic conservation remains to be clarified. One possibility is that the mid-embryonic period (pharyngula period in vertebrates) is highly prone to embryonic lethality, and the resulting negative selections lead to evolutionary conservation of this phase. Here, we tested this "mid-embryonic lethality hypothesis" by measuring the rate of lethal phenotypes of three different species of vertebrate embryos subjected to two kinds of perturbations: transient perturbations and genetic mutations. By subjecting zebrafish (Danio rerio), African clawed frog (Xenopus laevis), and chicken (Gallus gallus) embryos to transient perturbations, namely heat shock and inhibitor treatments during three developmental periods [early (represented by blastula and gastrula), pharyngula, and late], we found that the early stages showed the highest rate of lethal phenotypes in all three species. This result was corroborated by perturbation with genetic mutations. By tracking the survival rate of wild-type embryos and embryos with genetic mutations induced by UV irradiation in zebrafish and African clawed frogs, we found that the highest decrease in survival rate was at the early stages particularly around gastrulation in both these species. In opposition to the "mid-embryonic lethality hypothesis," our results consistently showed that the stage with the highest lethality was not around the conserved pharyngula period, but rather around the early period in all the vertebrate species tested. These results suggest that negative selection by embryonic lethality could not explain hourglass-like conservation of animal embryos. This highlights the potential contribution of alternative mechanisms such as the diversifying effect of positive selections against earlier and later stages, and developmental constraints which lead to conservation of mid-embryonic stages.

  18. Plant centromeres: structure and control.

    PubMed

    Richards, E J; Dawe, R K

    1998-04-01

    Recent work has led to a better understanding of the molecular components of plant centromeres. Conservation of at least some centromere protein constituents between plant and non-plant systems has been demonstrated. The identity and organization of plant centromeric DNA sequences are also beginning to yield to analysis. While there is little primary DNA sequence conservation among the characterized plant centromeres and their non-plant counterparts, some parallels in centromere genomic organisation can be seen across species. Finally, the emerging idea that centromere activity is controlled epigenetically finds support in an examination of the plant centromere literature.

  19. Optimal packaging of FIV genomic RNA depends upon a conserved long-range interaction and a palindromic sequence within gag.

    PubMed

    Rizvi, Tahir A; Kenyon, Julia C; Ali, Jahabar; Aktar, Suriya J; Phillip, Pretty S; Ghazawi, Akela; Mustafa, Farah; Lever, Andrew M L

    2010-10-15

    The feline immunodeficiency virus (FIV) is a lentivirus that is related to human immunodeficiency virus (HIV), causing a similar pathology in cats. It is a potential small animal model for AIDS and the FIV-based vectors are also being pursued for human gene therapy. Previous studies have mapped the FIV packaging signal (ψ) to two or more discontinuous regions within the 5' 511 nt of the genomic RNA and structural analyses have determined its secondary structure. The 5' and 3' sequences within ψ region interact through extensive long-range interactions (LRIs), including a conserved heptanucleotide interaction between R/U5 and gag. Other secondary structural elements identified include a conserved 150 nt stem-loop (SL2) and a small palindromic stem-loop within gag open reading frame that might act as a viral dimerization initiation site. We have performed extensive mutational analysis of these sequences and structures and ascertained their importance in FIV packaging using a trans-complementation assay. Disrupting the conserved heptanucleotide LRI to prevent base pairing between R/U5 and gag reduced packaging by 2.8-5.5 fold. Restoration of pairing using an alternative, non-wild type (wt) LRI sequence restored RNA packaging and propagation to wt levels, suggesting that it is the structure of the LRI, rather than its sequence, that is important for FIV packaging. Disrupting the palindrome within gag reduced packaging by 1.5-3-fold, but substitution with a different palindromic sequence did not restore packaging completely, suggesting that the sequence of this region as well as its palindromic nature is important. Mutation of individual regions of SL2 did not have a pronounced effect on FIV packaging, suggesting that either it is the structure of SL2 as a whole that is necessary for optimal packaging, or that there is redundancy within this structure. 
The mutational analysis presented here has further validated the previously predicted RNA secondary structure of FIV ψ. Copyright © 2010 Elsevier Ltd. All rights reserved.

  20. On a new class of completely integrable nonlinear wave equations. II. Multi-Hamiltonian structure

    NASA Astrophysics Data System (ADS)

    Nutku, Y.

    1987-11-01

    The multi-Hamiltonian structure of a class of nonlinear wave equations governing the propagation of finite amplitude waves is discussed. Infinitely many conservation laws had earlier been obtained for these equations. Starting from a (primary) Hamiltonian formulation of these equations, the necessary and sufficient conditions for the existence of bi-Hamiltonian structure are obtained, and it is shown that the second Hamiltonian operator can be constructed solely through a knowledge of the first Hamiltonian function. The recursion operator, which first appears at the level of bi-Hamiltonian structure, gives rise to an infinite sequence of conserved Hamiltonians. It is found that in general there exist two different infinite sequences of conserved quantities for these equations. The recursion relation defining higher Hamiltonian structures enables one to obtain the necessary and sufficient conditions for the existence of the (k+1)st Hamiltonian operator, which depends on the kth Hamiltonian function. The infinite sequence of conserved Hamiltonians is common to all the higher Hamiltonian structures. The equations of gas dynamics are discussed as an illustration of this formalism and it is shown that in general they admit tri-Hamiltonian structure with two distinct infinite sets of conserved quantities. The isothermal case of γ=1 is an exceptional one that requires separate treatment. This corresponds to a specialization of the equations governing the expansion of plasma into vacuum which will be shown to be equivalent to Poisson's equation in nonlinear acoustics.

  1. Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets.

    PubMed

    Nelson, Christopher S; Fuller, Chris K; Fordyce, Polly M; Greninger, Alexander L; Li, Hao; DeRisi, Joseph L

    2013-07-01

    The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein's DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2's binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved.

  2. Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets

    PubMed Central

    Nelson, Christopher S.; Fuller, Chris K.; Fordyce, Polly M.; Greninger, Alexander L.; Li, Hao; DeRisi, Joseph L.

    2013-01-01

    The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein’s DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2’s binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved. PMID:23625967

  3. Conservation and divergence of plant LHP1 protein sequences and expression patterns in angiosperms and gymnosperms.

    PubMed

    Guan, Hexin; Zheng, Zhengui; Grey, Paris H; Li, Yuhua; Oppenheimer, David G

    2011-05-01

    Floral transition is a critical and strictly regulated developmental process in plants. Mutations in Arabidopsis LIKE HETEROCHROMATIN PROTEIN 1 (AtLHP1)/TERMINAL FLOWER 2 (TFL2) result in early and terminal flowers. Little is known about the gene expression, function and evolution of plant LHP1 homologs, except for Arabidopsis LHP1. In this study, the conservation and divergence of plant LHP1 protein sequences was analyzed by sequence alignments and phylogeny. LHP1 expression patterns were compared among taxa that occupy pivotal phylogenetic positions. Several relatively conserved new motifs/regions were identified among LHP1 homologs. Phylogeny of plant LHP1 proteins agreed with established angiosperm relationships. In situ hybridization unveiled conserved expression of plant LHP1 in the axillary bud/tiller, vascular bundles, developing stamens, and carpels. Unlike AtLHP1, cucumber CsLHP1-2, sugarcane SoLHP1 and maize ZmLHP1, rice OsLHP1 is not expressed in the shoot apical meristem (SAM) and the OsLHP1 transcript level is consistently low in shoots. "Unequal crossover" might have contributed to the divergence in the N-terminal and hinge region lengths of LHP1 homologs. We propose an "insertion-deletion" model for soybean (Glycine max L.) GmLHP1s evolution. Plant LHP1 homologs are more conserved than previously expected, and may favor vegetative meristem identity and primordia formation. OsLHP1 may not function in rice SAM during floral induction.

  4. The impact of age, biogenesis, and genomic clustering on Drosophila microRNA evolution

    PubMed Central

    Mohammed, Jaaved; Flynt, Alex S.; Siepel, Adam; Lai, Eric C.

    2013-01-01

    The molecular evolutionary signatures of miRNAs inform our understanding of their emergence, biogenesis, and function. The known signatures of miRNA evolution have derived mostly from the analysis of deeply conserved, canonical loci. In this study, we examine the impact of age, biogenesis pathway, and genomic arrangement on the evolutionary properties of Drosophila miRNAs. Crucial to the accuracy of our results was our curation of high-quality miRNA alignments, which included nearly 150 corrections to ortholog calls and nucleotide sequences of the global 12-way Drosophilid alignments currently available. Using these data, we studied primary sequence conservation, normalized free-energy values, and types of structure-preserving substitutions. We expand upon common miRNA evolutionary patterns that reflect fundamental features of miRNAs that are under functional selection. We observe that melanogaster-subgroup-specific miRNAs, although recently emerged and rapidly evolving, nonetheless exhibit evolutionary signatures that are similar to well-conserved miRNAs and distinct from other structured noncoding RNAs and bulk conserved non-miRNA hairpins. This provides evidence that even young miRNAs may be selected for regulatory activities. More strikingly, we observe that mirtrons and clustered miRNAs both exhibit distinct evolutionary properties relative to solo, well-conserved miRNAs, even after controlling for sequence depth. These studies highlight the previously unappreciated impact of biogenesis strategy and genomic location on the evolutionary dynamics of miRNAs, and affirm that miRNAs do not evolve as a unitary class. PMID:23882112

  5. Protection of CpG islands from DNA methylation is DNA-encoded and evolutionarily conserved

    PubMed Central

    Long, Hannah K.; King, Hamish W.; Patient, Roger K.; Odom, Duncan T.; Klose, Robert J.

    2016-01-01

    DNA methylation is a repressive epigenetic modification that covers vertebrate genomes. Regions known as CpG islands (CGIs), which are refractory to DNA methylation, are often associated with gene promoters and play central roles in gene regulation. Yet how CGIs in their normal genomic context evade the DNA methylation machinery and whether these mechanisms are evolutionarily conserved remains enigmatic. To address these fundamental questions we exploited a transchromosomic animal model and genomic approaches to understand how the hypomethylated state is formed in vivo and to discover whether mechanisms governing CGI formation are evolutionarily conserved. Strikingly, insertion of a human chromosome into mouse revealed that promoter-associated CGIs are refractory to DNA methylation regardless of host species, demonstrating that DNA sequence plays a central role in specifying the hypomethylated state through evolutionarily conserved mechanisms. In contrast, elements distal to gene promoters exhibited more variable methylation between host species, uncovering a widespread dependence on nucleotide frequency and occupancy of DNA-binding transcription factors in shaping the DNA methylation landscape away from gene promoters. This was exemplified by young CpG rich lineage-restricted repeat sequences that evaded DNA methylation in the absence of co-evolved mechanisms targeting methylation to these sequences, and species specific DNA binding events that protected against DNA methylation in CpG poor regions. Finally, transplantation of mouse chromosomal fragments into the evolutionarily distant zebrafish uncovered the existence of a mechanistically conserved and DNA-encoded logic which shapes CGI formation across vertebrate species. PMID:27084945

  6. Assessment of phylogenetic relationship of rare plant species collected from Saudi Arabia using internal transcribed spacer sequences of nuclear ribosomal DNA.

    PubMed

    Al-Qurainy, F; Khan, S; Nadeem, M; Tarroum, M; Alaklabi, A

    2013-03-11

    The rare and endangered plants of any country are important genetic resources that often require urgent conservation measures. Assessment of phylogenetic relationships and evaluation of genetic diversity are very important prior to implementation of conservation strategies for saving rare and endangered plant species. We used internal transcribed spacer sequences of nuclear ribosomal DNA for the evaluation of sequence identity from the available taxa in the GenBank database by using the Basic Local Alignment Search Tool (BLAST). Two rare plant species clustered clearly with congeners: Heliotropium strigosum with H. pilosum (98% branch support) and Pancratium tortuosum with P. tenuifolium (61% branch support). However, some species, namely Scadoxus multiflorus, Commiphora myrrha, and Senecio hadiensis, showed close relationships with more than one species. We conclude that nuclear ribosomal internal transcribed spacer sequences are useful markers for phylogenetic study of these rare plant species in Saudi Arabia.

  7. Biocuration in the structure-function linkage database: the anatomy of a superfamily.

    PubMed

    Holliday, Gemma L; Brown, Shoshana D; Akiva, Eyal; Mischel, David; Hicks, Michael A; Morris, John H; Huang, Conrad C; Meng, Elaine C; Pegg, Scott C-H; Ferrin, Thomas E; Babbitt, Patricia C

    2017-01-01

    With ever-increasing amounts of sequence data available in both the primary literature and sequence repositories, there is a bottleneck in annotating molecular function to a sequence. This article describes the biocuration process and methods used in the structure-function linkage database (SFLD) to help address some of the challenges. We discuss how the hierarchy within the SFLD allows us to infer detailed functional properties for functionally diverse enzyme superfamilies in which all members are homologous, conserve an aspect of their chemical function and have associated conserved structural features that enable the chemistry. Also presented is the Enzyme Structure-Function Ontology (ESFO), which has been designed to capture the relationships between enzyme sequence, structure and function that underlie the SFLD and is used to guide the biocuration processes within the SFLD. http://sfld.rbvi.ucsf.edu/. © The Author 2017. Published by Oxford University Press.

  8. An alternative nested-PCR assay for the detection of Toxoplasma gondii strains based on GRA7 gene sequences.

    PubMed

    Costa, Maria Eduarda S M; Oliveira, Claudio Bruno S; Andrade, Joelma Maria de A; Medeiros, Thatiany A; Neto, Valter F Andrade; Lanza, Daniel C F

    2016-07-01

    Toxoplasma gondii is a widespread parasite able to infect virtually any nucleated cell of warm-blooded hosts. In some cases, T. gondii detection using already developed PCR primers can be inefficient in routine laboratory tests, especially for detecting atypical strains. Here we report a new nested-PCR protocol able to detect virtually all T. gondii isolates. Analyzing 685 sequences available in GenBank, we determined that GRA7 is one of the most conserved genes of the T. gondii genome. Based on an alignment of 85 GRA7 sequences, new primer sets that anneal in the highly conserved regions of this gene were designed. The new GRA7 nested-PCR assay provides sensitivity and specificity equal to or greater than the gold standard PCR assays for T. gondii detection, which amplify the B1 sequence or the repetitive 529bp element. Copyright © 2016 Elsevier B.V. All rights reserved.
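
    The step of locating highly conserved regions in an alignment before placing primers can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function name `conserved_windows`, the window length, the conservation threshold, and the toy alignment are all assumptions chosen for illustration, and a column counts as conserved only when every sequence carries the same non-gap character.

```python
def conserved_windows(alignment, window=20, min_fraction=0.95):
    """Return (start, fraction) for windows whose share of fully
    conserved columns is at least min_fraction.

    alignment: list of equal-length aligned sequences ('-' marks a gap).
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    # A column is conserved when all rows agree and the shared symbol is not a gap.
    conserved = [len(set(col)) == 1 and col[0] != "-" for col in zip(*alignment)]

    hits = []
    for start in range(length - window + 1):
        fraction = sum(conserved[start:start + window]) / window
        if fraction >= min_fraction:
            hits.append((start, fraction))
    return hits


# Toy alignment (made-up data, not GRA7): columns 0-7 and 9 are conserved.
toy_alignment = ["ACGTACGTAA", "ACGTACGTTA", "ACGTACGTCA"]
toy_hits = conserved_windows(toy_alignment, window=8, min_fraction=1.0)
```

    In practice a candidate window would still need the usual primer checks (melting temperature, self-complementarity, specificity), which this sketch does not attempt.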

  9. A TALE-inspired computational screen for proteins that contain approximate tandem repeats.

    PubMed

    Perycz, Malgorzata; Krwawicz, Joanna; Bochtler, Matthias

    2017-01-01

    TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen.
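
    The core idea of the screen, finding adjacent sequence units of 30 to 43 residues that are similar but not identical, can be sketched as follows. The authors report using PERL and several additional scoring criteria; this Python sketch covers only a gap-free comparison of neighbouring windows, and the function name, the identity threshold of 0.7, and the toy sequence are assumptions made for illustration.

```python
def find_approx_tandem_repeats(seq, min_period=30, max_period=43, min_identity=0.7):
    """Return (start, period, identity) for every position where a window
    of length `period` resembles the window immediately following it.

    Gap-free comparison only: insertions or deletions between repeat
    units are not handled in this sketch.
    """
    hits = []
    for period in range(min_period, max_period + 1):
        for start in range(len(seq) - 2 * period + 1):
            first = seq[start:start + period]
            second = seq[start + period:start + 2 * period]
            matches = sum(a == b for a, b in zip(first, second))
            identity = matches / period
            if identity >= min_identity:
                hits.append((start, period, identity))
    return hits


# Toy 34-residue unit (made-up, not a real TALE repeat) followed by a copy
# that differs at two positions, mimicking variable residues in a repeat.
unit = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMNPQ"
variant = unit[:11] + "HD" + unit[13:]
repeat_hits = find_approx_tandem_repeats(unit + variant)
```

    Here the only reported hit is the period-34 pair starting at position 0, with 32 of 34 residues identical. A fuller screen would merge overlapping hits into repeat arrays and then examine which positions vary between units.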

  10. A TALE-inspired computational screen for proteins that contain approximate tandem repeats

    PubMed Central

    Krwawicz, Joanna

    2017-01-01

    TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen. PMID:28617832

  11. Identification of evolutionarily conserved Momordica charantia microRNAs using computational approach and its utility in phylogeny analysis.

    PubMed

    Thirugnanasambantham, Krishnaraj; Saravanan, Subramanian; Karikalan, Kulandaivelu; Bharanidharan, Rajaraman; Lalitha, Perumal; Ilango, S; HairulIslam, Villianur Ibrahim

    2015-10-01

    Momordica charantia (bitter gourd, bitter melon) is a monoecious Cucurbitaceae with anti-oxidant, anti-microbial, anti-viral and anti-diabetic potential. Molecular studies on this economically valuable plant are essential to understand its phylogeny and evolution. MicroRNAs (miRNAs) are conserved, small, non-coding RNAs that regulate gene expression by binding the 3' UTR of target mRNAs and have evolved at different rates in different plant species. In this study we used a homology-based computational approach and identified 27 mature miRNAs for the first time from this biomedically important plant. The phylogenetic tree built from binary presence/absence data for the identified miRNAs was found to be uncertain and biased. Most of the identified miRNAs were highly conserved among the plant species, and sequence-based phylogeny analysis of miRNAs resolved the above difficulties in the phylogeny approach using miRNA. Predicted gene targets of the identified miRNAs revealed their importance in regulation of plant developmental processes. The reported miRNAs showed sequence conservation in mature miRNAs, and the detailed phylogeny analysis of pre-miRNA sequences revealed genus-specific segregation of clusters. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Conserved Sequences at the Origin of Adenovirus DNA Replication

    PubMed Central

    Stillman, Bruce W.; Topp, William C.; Engler, Jeffrey A.

    1982-01-01

    The origin of adenovirus DNA replication lies within an inverted sequence repetition at either end of the linear, double-stranded viral DNA. Initiation of DNA replication is primed by a deoxynucleoside that is covalently linked to a protein, which remains bound to the newly synthesized DNA. We demonstrate that virion-derived DNA-protein complexes from five human adenovirus serological subgroups (A to E) can act as a template for both the initiation and the elongation of DNA replication in vitro, using nuclear extracts from adenovirus type 2 (Ad2)-infected HeLa cells. The heterologous template DNA-protein complexes were not as active as the homologous Ad2 DNA, most probably due to inefficient initiation by Ad2 replication factors. In an attempt to identify common features which may permit this replication, we have also sequenced the inverted terminal repeated DNA from human adenovirus serotypes Ad4 (group E), Ad9 and Ad10 (group D), and Ad31 (group A), and we have compared these to previously determined sequences from Ad2 and Ad5 (group C), Ad7 (group B), and Ad12 and Ad18 (group A) DNA. In all cases, the sequence around the origin of DNA replication can be divided into two structural domains: a proximal A · T-rich region which is partially conserved among these serotypes, and a distal G · C-rich region which is less well conserved. The G · C-rich region contains sequences similar to sequences present in papovavirus replication origins. The two domains may reflect a dual mechanism for initiation of DNA replication: adenovirus-specific protein priming of replication, and subsequent utilization of this primer by host replication factors for completion of DNA synthesis. PMID:7143575

  13. Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

    PubMed Central

    Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia

    2011-01-01

    Background Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358

  14. Antimicrobial activity of Brassica nectar lipid transfer protein

    USDA-ARS's Scientific Manuscript database

    Antimicrobial peptides (AMPs) provide an ancient, innate immunity conserved in all multicellular organisms. In plants, there are several large families of AMPs defined by sequence similarity. The nonspecific lipid transfer protein (LTP) family is defined by a conserved signature of eight cysteines a...

  15. Information analysis of sequences that bind the replication initiator RepA | Center for Cancer Research

    Cancer.gov

    The tall letters represent the highly conserved bases in DNA binding sites of several prokaryotic repressors and activators. Conservation is strongest where major grooves of the double helical DNA (represented by crests of a cosine wave) face the protein. This shows that conservation analysis alone can be used to predict the face of DNA that contacts the proteins.
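
    The letter heights in a sequence logo of this kind reflect per-position information content, which for DNA is commonly computed as 2 bits minus the Shannon entropy of the base frequencies at that position. A minimal Python sketch of that calculation follows; the function name and the example sites are made up for illustration (they are not RepA binding sites), and the small-sample correction used in published analyses is omitted.

```python
import math
from collections import Counter


def information_content(sites):
    """Return per-position information content (bits) for aligned,
    equal-length DNA binding sites, as 2 - H with H the Shannon entropy.

    No small-sample correction is applied in this sketch.
    """
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")

    bits = []
    for pos in range(length):
        counts = Counter(s[pos].upper() for s in sites)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        bits.append(2.0 - entropy)  # 2 bits is the maximum for four bases
    return bits


# Made-up aligned sites for illustration only.
example_sites = ["TTGACA", "TTGACT", "TTGATA", "TAGACA"]
profile = information_content(example_sites)
```

    A fully conserved position scores 2 bits and a position with all four bases equally frequent scores 0 bits, so plotting `profile` along the site shows where conservation peaks.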

  16. Whole-Genome Sequencing and Variant Analysis of Human Papillomavirus 16 Infections.

    PubMed

    van der Weele, Pascal; Meijer, Chris J L M; King, Audrey J

    2017-10-01

    Human papillomavirus (HPV) is a strongly conserved DNA virus, high-risk types of which can cause cervical cancer in persistent infections. The most common type found in HPV-attributable cancer is HPV16, which can be subdivided into four lineages (A to D) with different carcinogenic properties. Studies have shown HPV16 sequence diversity in different geographical areas, but only limited information is available regarding HPV16 diversity within a population, especially at the whole-genome level. We analyzed HPV16 major variant diversity and conservation in persistent infections and performed a single nucleotide polymorphism (SNP) comparison between persistent and clearing infections. Materials were obtained in the Netherlands from a cohort study with longitudinal follow-up for up to 3 years. Our analysis shows a remarkably large variant diversity in the population. Whole-genome sequences were obtained for 57 persistent and 59 clearing HPV16 infections, resulting in 109 unique variants. Interestingly, persistent infections were completely conserved through time. One reinfection event was identified where the initial and follow-up samples clustered differently. Non-A1/A2 variants seemed to clear preferentially ( P = 0.02). Our analysis shows that population-wide HPV16 sequence diversity is very large. In persistent infections, the HPV16 sequence was fully conserved. Sequencing can identify HPV16 reinfections, although occurrence is rare. SNP comparison identified no strongly acting effect of the viral genome affecting HPV16 infection clearance or persistence in up to 3 years of follow-up. These findings suggest the progression of an early HPV16 infection could be host related. IMPORTANCE Human papillomavirus 16 (HPV16) is the predominant type found in cervical cancer. 
Progression of initial infection to cervical cancer has been linked to sequence properties; however, knowledge of variants circulating in European populations, especially with longitudinal follow-up, is limited. By sequencing a number of infections with known follow-up for up to 3 years, we gained initial insights into the genetic diversity of HPV16 and the effects of the viral genome on the persistence of infections. A SNP comparison between sequences obtained from clearing and persistent infections did not identify strongly acting DNA variations responsible for these infection outcomes. In addition, we identified an HPV16 reinfection event where sequencing of initial and follow-up samples showed different HPV16 variants. Based on conventional genotyping, this infection would incorrectly be considered a persistent HPV16 infection. In the context of vaccine efficacy and monitoring studies, such infections could potentially cause reduced reported efficacy or efficiency. Copyright © 2017 van der Weele et al.

  17. Antibacterial activity and immune responses of a molluscan macrophage expressed gene-1 from disk abalone, Haliotis discus discus.

    PubMed

    Bathige, S D N K; Umasuthan, Navaneethaiyer; Whang, Ilson; Lim, Bong-Soo; Won, Seung Hwan; Lee, Jehee

    2014-08-01

    The membrane-attack complex/perforin (MACPF) domain-containing proteins play an important role in the innate immune response against invading microbial pathogens. In the current study, a member of the MACPF domain-containing proteins, macrophage expressed gene-1 (MPEG1) encoding 730 amino acids with the theoretical molecular mass of 79.6 kDa and an isoelectric point (pI) of 6.49 was characterized from disk abalone Haliotis discus discus (AbMPEG1). We found that the characteristic MACPF domain (Val(131)-Tyr(348)) and transmembrane segment (Ala(669)-Ile(691)) of AbMPEG1 are located in the N- and C-terminal ends of the protein, respectively. Ortholog comparison revealed that AbMPEG1 has the highest sequence identity with its pink abalone counterpart, while sequence identities of greater than 90% were observed with MPEG1 members from other abalone species. Likewise, the furin cleavage site KRRRK was highly conserved in all abalone species, but not in other species investigated. We identified an intron-less genomic sequence within disk abalone AbMPEG1, which was similar to other mammalian, avian, and reptilian counterparts. Transcription factor binding sites, which are important for immune responses, were identified in the 5'-flanking region of AbMPEG1. qPCR revealed AbMPEG1 transcripts are present in every tissue examined, with the highest expression level occurring in mantle tissue. Significant up-regulation of AbMPEG1 transcript levels was observed in hemocytes and gill tissues following challenges with pathogens (Vibrio parahaemolyticus, Listeria monocytogenes and viral hemorrhagic septicemia virus) as well as pathogen-associated molecular patterns (PAMPs: lipopolysaccharides and poly I:C immunostimulant). Finally, the antibacterial activity of the MACPF domain was characterized against Gram-negative and -positive bacteria using a recombinant peptide. Taken together, these results indicate that the biological significance of the AbMPEG1 gene includes a role in protecting disk abalone through the ability of AbMPEG1 to initiate an innate immune response upon pathogen invasion. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. Screening of broad spectrum natural pesticides against conserved target arginine kinase in cotton pests by molecular modeling.

    PubMed

    Sakthivel, Seethalakshmi; Habeeb, S K M; Raman, Chandrasekar

    2018-03-12

    Cotton is an economically important crop, and its production is challenged by the diversity of pests and the related insecticide resistance. Identifying a target that is conserved across cotton pests will help in designing broad-spectrum insecticides. In this study, we identified conserved sequences by Expressed Sequence Tag profiling of three cotton pests, namely Aphis gossypii, Helicoverpa armigera, and Spodoptera exigua. One target protein, arginine kinase, which has a key role in insect physiology and energy metabolism, was studied further using homology modeling, virtual screening, molecular docking, and molecular dynamics simulation to identify potential biopesticide compounds from the Zinc natural database. We identified four compounds with excellent inhibitory potential against the identified broad-spectrum target that are highly specific to invertebrates.

  19. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

    PubMed

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-07-20

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. Identification of miRNAs and their targets in wild tomato at moderately and acutely elevated temperatures by high-throughput sequencing and degradome analysis

    PubMed Central

    Zhou, Rong; Wang, Qian; Jiang, Fangling; Cao, Xue; Sun, Mintao; Liu, Min; Wu, Zhen

    2016-01-01

    MicroRNAs (miRNAs) are 19–24 nucleotide (nt) noncoding RNAs that play important roles in abiotic stress responses in plants. High temperatures have been the subject of considerable attention due to their negative effects on plant growth and development. Heat-responsive miRNAs have been identified in some plants. However, there have been no reports on the global identification of miRNAs and their targets in tomato at high temperatures, especially at different elevated temperatures. Here, three small-RNA libraries and three degradome libraries were constructed from the leaves of the heat-tolerant tomato at normal, moderately and acutely elevated temperatures (26/18 °C, 33/33 °C and 40/40 °C, respectively). Following high-throughput sequencing, 662 conserved and 97 novel miRNAs were identified in total with 469 conserved and 91 novel miRNAs shared in the three small-RNA libraries. Of these miRNAs, 96 and 150 miRNAs were responsive to the moderately and acutely elevated temperature, respectively. Following degradome sequencing, 349 sequences were identified as targets of 138 conserved miRNAs, and 13 sequences were identified as targets of eight novel miRNAs. The expression levels of seven miRNAs and six target genes obtained by quantitative real-time PCR (qRT-PCR) were largely consistent with the sequencing results. This study enriches the number of heat-responsive miRNAs and lays a foundation for the elucidation of the miRNA-mediated regulatory mechanism in tomatoes at elevated temperatures. PMID:27653374

  1. Structure-Related Roles for the Conservation of the HIV-1 Fusion Peptide Sequence Revealed by Nuclear Magnetic Resonance.

    PubMed

    Serrano, Soraya; Huarte, Nerea; Rujas, Edurne; Andreu, David; Nieva, José L; Jiménez, María Angeles

    2017-10-17

    Despite extensive characterization of the human immunodeficiency virus type 1 (HIV-1) hydrophobic fusion peptide (FP), the structure-function relationships underlying its extraordinary degree of conservation remain poorly understood. Specifically, the fact that the tandem repeat of the FLGFLG tripeptide is absolutely conserved suggests that high hydrophobicity may not suffice to unleash FP function. Here, we have compared the nuclear magnetic resonance (NMR) structures adopted in nonpolar media by two FP surrogates, wtFP-tag and scrFP-tag, which had equal hydrophobicity but contained wild-type and scrambled core sequences LFLGFLG and FGLLGFL, respectively. In addition, these peptides were tagged at their C-termini with an epitope sequence that folded independently, thereby allowing Western blot detection without interfering with FP structure. We observed similar α-helical FP conformations for both specimens dissolved in the low-polarity medium 25% (v/v) 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP), but important differences in contact with micelles of the membrane mimetic dodecylphosphocholine (DPC). Thus, whereas wtFP-tag preserved a helix displaying a Gly-rich ridge, the scrambled sequence lost in great part the helical structure upon being solubilized in DPC. Western blot analyses further revealed the capacity of wtFP-tag to assemble trimers in membranes, whereas membrane oligomers were not observed in the case of the scrFP-tag sequence. We conclude that, beyond hydrophobicity, preserving sequence order is an important feature for defining the secondary structures and oligomeric states adopted by the HIV FP in membranes.

  2. Unravelling the complexity of microRNA-mediated gene regulation in black pepper (Piper nigrum L.) using high-throughput small RNA profiling.

    PubMed

    Asha, Srinivasan; Sreekumar, Sweda; Soniya, E V

    2016-01-01

    Analysis of high-throughput small RNA deep sequencing data, in combination with black pepper transcriptome sequences, revealed microRNA-mediated gene regulation in black pepper (Piper nigrum L.). Black pepper is an important spice crop and its berries are used worldwide as a natural food additive that contributes unique flavour to foods. In the present study, to characterize microRNAs from black pepper, we generated a small RNA library from black pepper leaf and sequenced it by Illumina high-throughput sequencing technology. MicroRNAs belonging to a total of 303 conserved miRNA families were identified from the sRNAome data. Subsequent analysis from recently sequenced black pepper transcriptome confirmed precursor sequences of 50 conserved miRNAs and four potential novel miRNA candidates. Stem-loop qRT-PCR experiments demonstrated differential expression of eight conserved miRNAs in black pepper. Computational analysis of targets of the miRNAs showed 223 potential black pepper unigene targets that encode diverse transcription factors and enzymes involved in plant development, disease resistance, metabolic and signalling pathways. RLM-RACE experiments further mapped miRNA-mediated cleavage at five of the mRNA targets. In addition, miRNA isoforms corresponding to 18 miRNA families were also identified from black pepper. This study presents the first large-scale identification of microRNAs from black pepper and provides the foundation for the future studies of miRNA-mediated gene regulation of stress responses and diverse metabolic processes in black pepper.

  3. Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm.

    PubMed

    Rani, R Ranjani; Ramyachitra, D

    2016-12-01

    Multiple sequence alignment (MSA) is a widespread approach in computational biology and bioinformatics. MSA deals with how sequences of nucleotides and amino acids are arranged into an alignment with a minimum number of gaps between them, which points to the functional, evolutionary and structural relationships among the sequences. Computing an MSA that is both accurate and statistically significant nevertheless remains a challenging task. In this work, the Bacterial Foraging Optimization Algorithm was employed to align the biological sequences, which resulted in a non-dominated optimal solution. It optimizes multiple objectives: maximization of similarity, non-gap percentage and conserved blocks, and minimization of gap penalty. The BAliBASE 3.0 benchmark database was utilized to examine the proposed algorithm against other methods. In this paper, two algorithms have been proposed: Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC) and Bacterial Foraging Optimization Algorithm. It was found that Hybrid Genetic Algorithm with Artificial Bee Colony performed better than the existing optimization algorithms. However, the conserved blocks were not obtained using GA-ABC. BFO was then used for the alignment and the conserved blocks were obtained. The proposed Multi-Objective Bacterial Foraging Optimization Algorithm (MO-BFO) was compared with widely used MSA methods Clustal Omega, Kalign, MUSCLE, MAFFT, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC). The final results show that the proposed MO-BFO algorithm yields better alignment than most widely used methods. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
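
    The objectives named in this abstract (similarity, non-gap percentage, conserved blocks) can be illustrated on a toy alignment. The sketch below is only an illustration: the function names and the identity-based pair score are assumptions made here, not the definitions used by the MO-BFO authors.

```python
from itertools import combinations

def non_gap_percentage(alignment):
    """Percentage of alignment cells that are not gap characters."""
    total = sum(len(row) for row in alignment)
    gaps = sum(row.count("-") for row in alignment)
    return 100.0 * (total - gaps) / total

def conserved_columns(alignment):
    """Indices of columns where all sequences share one non-gap residue."""
    return [i for i, col in enumerate(zip(*alignment))
            if "-" not in col and len(set(col)) == 1]

def pairwise_identity_score(alignment):
    """Number of identical non-gap residue pairs, summed over all sequence pairs."""
    score = 0
    for a, b in combinations(alignment, 2):
        score += sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return score

# Hypothetical toy alignment: three rows of equal length, "-" marks a gap.
toy = ["ACG-T", "ACGAT", "AC--T"]
print(non_gap_percentage(toy), conserved_columns(toy), pairwise_identity_score(toy))
```

    A multi-objective optimizer would trade such quantities off against a gap penalty rather than maximize any one of them alone.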

  4. Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis

    PubMed Central

    Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting

    2016-01-01

    Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181

  5. In silico Analysis of 3′-End-Processing Signals in Aspergillus oryzae Using Expressed Sequence Tags and Genomic Sequencing Data

    PubMed Central

    Tanaka, Mizuki; Sakai, Yoshifumi; Yamada, Osamu; Shintani, Takahiro; Gomi, Katsuya

    2011-01-01

    To investigate 3′-end-processing signals in Aspergillus oryzae, we created a nucleotide sequence data set of the 3′-untranslated region (3′ UTR) plus 100 nucleotides (nt) sequence downstream of the poly(A) site using A. oryzae expressed sequence tags and genomic sequencing data. This data set comprised 1065 sequences derived from 1042 unique genes. The average 3′ UTR length in A. oryzae was 241 nt, which is greater than that in yeast but similar to that in plants. The 3′ UTR and 100 nt sequence downstream of the poly(A) site is notably U-rich, while the region located 15–30 nt upstream of the poly(A) site is markedly A-rich. The most frequently found hexanucleotide in this A-rich region is AAUGAA, although this sequence accounts for only 6% of all transcripts. These data suggested that A. oryzae has no highly conserved sequence element equivalent to AAUAAA, a mammalian polyadenylation signal. We identified that putative 3′-end-processing signals in A. oryzae, while less well conserved than those in mammals, comprised four sequence elements: the furthest upstream U-rich element, A-rich sequence, cleavage site, and downstream U-rich element flanking the cleavage site. Although these putative 3′-end-processing signals are similar to those in yeast and plants, some notable differences exist between them. PMID:21586533
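
    The hexanucleotide survey described above can be sketched as follows: for each transcript, take the window 15 to 30 nt upstream of the poly(A) site and record which hexamers occur in it. The record layout (sequence plus poly(A) site index) and the function name `hexamer_frequency` are assumptions for illustration; the authors' actual pipeline is not described in the abstract.

```python
from collections import Counter

def hexamer_frequency(records, upstream_far=30, upstream_near=15, k=6):
    """Fraction of records whose upstream window contains each k-mer.

    records: list of (sequence, polya_index) pairs, where polya_index is the
    0-based position of the poly(A) site in the sequence (assumed layout).
    """
    counts = Counter()
    for seq, site in records:
        lo = max(0, site - upstream_far)
        hi = max(0, site - upstream_near)
        window = seq[lo:hi]
        # A set counts each k-mer at most once per transcript.
        kmers = {window[i:i + k] for i in range(len(window) - k + 1)}
        counts.update(kmers)
    n = len(records)
    return {kmer: c / n for kmer, c in counts.items()}

# Two made-up transcripts, each with the poly(A) site at index 40.
records = [
    ("C" * 15 + "AAUGAA" + "C" * 19, 40),
    ("C" * 40, 40),
]
freq = hexamer_frequency(records)
print(freq["AAUGAA"])
```

    Counting presence per transcript, rather than total occurrences, matches the way the abstract reports the hexamer as a share of transcripts.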

  6. Incorporating evolution of transcription factor binding sites into annotated alignments.

    PubMed

    Bais, Abha S; Grossmann, Stefen; Vingron, Martin

    2007-08-01

    Identifying transcription factor binding sites (TFBSs) is essential to elucidate putative regulatory mechanisms. A common strategy is to combine cross-species conservation with single sequence TFBS annotation to yield "conserved TFBSs". Most current methods in this field adopt a multi-step approach that segregates the two aspects. Again, it is widely accepted that the evolutionary dynamics of binding sites differ from those of the surrounding sequence. Hence, it is desirable to have an approach that explicitly takes this factor into account. Although a plethora of approaches have been proposed for the prediction of conserved TFBSs, very few explicitly model TFBS evolutionary properties, while additionally being multi-step. Recently, we introduced a novel approach to simultaneously align and annotate conserved TFBSs in a pair of sequences. Building upon the standard Smith-Waterman algorithm for local alignments, SimAnn introduces additional states for profiles to output extended alignments or annotated alignments. That is, alignments with parts annotated as gaplessly aligned TFBSs (pair-profile hits) are generated. Moreover, the pair-profile related parameters are derived in a sound statistical framework. In this article, we extend this approach to explicitly incorporate evolution of binding sites in the SimAnn framework. We demonstrate the extension in the theoretical derivations through two position-specific evolutionary models, previously used for modelling TFBS evolution. In a simulated setting, we provide a proof of concept that the approach works given the underlying assumptions, as compared to the original work. Finally, using a real dataset of experimentally verified binding sites in human-mouse sequence pairs, we compare the new approach (eSimAnn) to an existing multi-step tool that also considers TFBS evolution.
Although it is widely accepted that binding sites evolve differently from the surrounding sequences, most comparative TFBS identification methods do not explicitly consider this. Additionally, prediction of conserved binding sites is carried out in a multi-step approach that segregates alignment from TFBS annotation. In this paper, we demonstrate how the simultaneous alignment and annotation approach of SimAnn can be further extended to incorporate TFBS evolutionary relationships. We study how alignments and binding site predictions interplay at varying evolutionary distances and for various profile qualities.
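
    For orientation, the standard Smith-Waterman local alignment that SimAnn is said to build upon can be sketched as below. This is the textbook algorithm with a linear gap penalty, not SimAnn or eSimAnn: it has no profile states, and the scoring values are arbitrary assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores are floored at zero.
            H[i][j] = max(0, H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

print(smith_waterman("TTACGTAA", "GGACGTCC"))
```

    The approach described in the abstract adds further states to this recursion so that gaplessly aligned binding-site hits are annotated during alignment; those states are not reproduced here.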

  7. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering.

    PubMed

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M

    2015-05-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.
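
    The sliding-window setup and the count of variable sites mentioned above can be sketched as follows. The abstract gives window sizes (1,000 and 2,000 bp) but not the step between windows, so `step` is a free parameter here, and the helper names are hypothetical.

```python
def sliding_windows(length, size, step):
    """Yield (start, end) half-open intervals for every full window."""
    for start in range(0, length - size + 1, step):
        yield start, start + size

def variable_sites(alignment, start, end):
    """Count columns in [start, end) with more than one distinct non-gap base."""
    count = 0
    for col in zip(*(seq[start:end] for seq in alignment)):
        bases = {b for b in col if b != "-"}
        if len(bases) > 1:
            count += 1
    return count

# Made-up aligned sequences of equal length.
aln = ["ACGTAC", "ACCTAC", "AC-TAA"]
for start, end in sliding_windows(len(aln[0]), size=4, step=2):
    print(start, end, variable_sites(aln, start, end))
```

    Per-window counts of this kind are what one would then relate to the proportion of sequences found in clusters.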

  8. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering

    PubMed Central

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor

    2015-01-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice. PMID:25560745

  9. Combined sequence and structure analysis of the fungal laccase family.

    PubMed

    Kumar, S V Suresh; Phale, Prashant S; Durani, S; Wangikar, Pramod P

    2003-08-20

    Plant and fungal laccases belong to the family of multi-copper oxidases and show much broader substrate specificity than other members of the family. Laccases have consequently been of interest for potential industrial applications. We have analyzed the essential sequence features of fungal laccases based on multiple sequence alignments of more than 100 laccases. This has resulted in identification of a set of four ungapped sequence regions, L1-L4, as the overall signature sequences that can be used to identify the laccases, distinguishing them within the broader class of multi-copper oxidases. The 12 amino acid residues in the enzymes serving as the copper ligands are housed within these four identified conserved regions, of which L2 and L4 conform to the earlier reported copper signature sequences of multi-copper oxidases while L1 and L3 are distinctive to the laccases. The mapping of regions L1-L4 on to the three-dimensional structure of the Coprinus cinerius laccase indicates that many of the non-copper-ligating residues of the conserved regions could be critical in maintaining a specific, more or less C-2 symmetric, protein conformational motif characterizing the active site apparatus of the enzymes. The observed intraprotein homologies between L1 and L3 and between L2 and L4 at both the structure and the sequence levels suggest that the quasi C-2 symmetric active site conformational motif may have arisen from a structural duplication event that neither the sequence homology analysis nor the structure homology analysis alone would have unraveled. Although the sequence and structure homology is not detectable in the rest of the protein, the relative orientation of region L1 with L2 is similar to that of L3 with L4. 
The structure duplication of first-shell and second-shell residues has become cryptic because the intraprotein sequence homology noticeable for a given laccase becomes significant only after comparing the conservation pattern in several fungal laccases. The identified motifs, L1-L4, can be useful in searching the newly sequenced genomes for putative laccase enzymes. Copyright 2003 Wiley Periodicals, Inc. Biotechnol Bioeng 83: 386-394, 2003.

  10. Finding functional features in Saccharomyces genomes by phylogenetic footprinting.

    PubMed

    Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark

    2003-07-04

    The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.

  11. Plastome Sequence Determination and Comparative Analysis for Members of the Lolium-Festuca Grass Species Complex

    PubMed Central

    Hand, Melanie L.; Spangenberg, German C.; Forster, John W.; Cogan, Noel O. I.

    2013-01-01

    Chloroplast genome sequences are of broad significance in plant biology, due to frequent use in molecular phylogenetics, comparative genomics, population genetics, and genetic modification studies. The present study used a second-generation sequencing approach to determine and assemble the plastid genomes (plastomes) of four representatives from the agriculturally important Lolium-Festuca species complex of pasture grasses (Lolium multiflorum, Festuca pratensis, Festuca altissima, and Festuca ovina). Total cellular DNA was extracted from either roots or leaves, was sequenced, and the output was filtered for plastome-related reads. A comparison between sources revealed fewer plastome-related reads from root-derived template but an increase in incidental bacterium-derived sequences. Plastome assembly and annotation indicated high levels of sequence identity and a conserved organization and gene content between species. However, frequent deletions within the F. ovina plastome appeared to contribute to a smaller plastid genome size. Comparative analysis with complete plastome sequences from other members of the Poaceae confirmed conservation of most grass-specific features. Detailed analysis of the rbcL–psaI intergenic region, however, revealed a “hot-spot” of variation characterized by independent deletion events. The evolutionary implications of this observation are discussed. The complete plastome sequences are anticipated to provide the basis for potential organelle-specific genetic modification of pasture grasses. PMID:23550121

  12. rpoB Gene Sequence-Based Identification of Aerobic Gram-Positive Cocci of the Genera Streptococcus, Enterococcus, Gemella, Abiotrophia, and Granulicatella

    PubMed Central

    Drancourt, Michel; Roux, Véronique; Fournier, Pierre-Edouard; Raoult, Didier

    2004-01-01

    We developed a new molecular tool based on rpoB gene (encoding the beta subunit of RNA polymerase) sequencing to identify streptococci. We first sequenced the complete rpoB gene for Streptococcus anginosus, S. equinus, and Abiotrophia defectiva. Sequences were aligned with those of S. pyogenes, S. agalactiae, and S. pneumoniae available in GenBank. Using an in-house analysis program (SVARAP), we identified a 740-bp variable region surrounded by conserved, 20-bp zones and, by using these conserved zones as PCR primer targets, we amplified and sequenced this variable region in an additional 30 Streptococcus, Enterococcus, Gemella, Granulicatella, and Abiotrophia species. This region exhibited 71.2 to 99.3% interspecies homology. We therefore applied our identification system by PCR amplification and sequencing to a collection of 102 streptococci and 60 bacterial isolates belonging to other genera. Amplicons were obtained in streptococci and Bacillus cereus, and sequencing allowed us to make a correct identification of streptococci. Molecular signatures were determined for the discrimination of closely related species within the S. pneumoniae-S. oralis-S. mitis group and the S. agalactiae-S. difficile group. These signatures allowed us to design a S. pneumoniae-specific PCR and sequencing primer pair. PMID:14766807
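
    The idea of locating conserved zones that can serve as primer targets around a variable region can be sketched as a scan for fully conserved windows in an alignment. This is not the SVARAP program, whose method is not described in the abstract; the function name and the strict definition of "conserved" (identical, gap-free columns) are assumptions.

```python
def conserved_windows(alignment, width):
    """Start indices of windows in which every column is identical and gap-free."""
    conserved = [("-" not in col) and len(set(col)) == 1
                 for col in zip(*alignment)]
    starts = []
    run = 0  # length of the current run of conserved columns
    for i, ok in enumerate(conserved):
        run = run + 1 if ok else 0
        if run >= width:
            starts.append(i - width + 1)
    return starts

# Toy alignment with one variable column between two conserved stretches.
aln = ["AAACGTTT", "AAAGGTTT"]
print(conserved_windows(aln, 3))
```

    With real data one would use `width=20` and look for a pair of such windows flanking the variable region of interest.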

  13. Maintenance of an Intact Human Immunodeficiency Virus Type 1 vpr Gene following Mother-to-Infant Transmission

    PubMed Central

    Yedavalli, Venkat R. K.; Chappey, Colombe; Ahmad, Nafees

    1998-01-01

    The vpr sequences from six human immunodeficiency virus type 1 (HIV-1)-infected mother-infant pairs following perinatal transmission were analyzed. We found that 153 of the 166 clones analyzed from uncultured peripheral blood mononuclear cell DNA samples showed a 92.17% frequency of intact vpr open reading frames. There was a low degree of heterogeneity of vpr genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vpr sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Moreover, the infants’ sequences displayed patterns similar to those seen in their mothers. The functional domains essential for Vpr activity, including virion incorporation, nuclear import, and cell cycle arrest and differentiation were highly conserved in most of the sequences. Phylogenetic analyses of 166 mother-infant pairs and 195 other available vpr sequences from HIV databases formed distinct clusters for each mother-infant pair and for other vpr sequences and grouped the six mother-infant pairs’ sequences with subtype B sequences. A high degree of conservation of intact and functional vpr supports the notion that vpr plays an important role in HIV-1 infection and replication in mother-infant isolates that are involved in perinatal transmission. PMID:9658150

  14. Sequence analysis of Jembrana disease virus strains reveals a genetically stable lentivirus.

    PubMed

    Desport, Moira; Stewart, Meredith E; Mikosza, Andrew S; Sheridan, Carol A; Peterson, Shane E; Chavand, Olivier; Hartaningsih, Nining; Wilcox, Graham E

    2007-06-01

    Jembrana disease virus (JDV) is a lentivirus associated with an acute disease syndrome with a 20% case fatality rate in Bos javanicus (Bali cattle) in Indonesia, occurring after a short incubation period and with no recurrence of the disease after recovery. Partial regions of gag and pol and the entire env were examined for sequence variation in DNA samples from cases of Jembrana disease obtained from Bali, Sumatra and South Kalimantan in Indonesian Borneo. A high level of nucleotide conservation (97-100%) was observed in gag sequences from samples taken in Bali and Sumatra, indicating that the source of JDV in Sumatra was most likely to have originated from Bali. The pol sequences and, unexpectedly, the env sequences from Bali samples were also well conserved with low nucleotide (96-99%) and amino acid substitutions (95-99%). However, the sample from South Kalimantan (JDV(KAL/01)) contained more divergent sequences, particularly in env (88% identity). Phylogenetic analysis revealed that the JDV(KAL/01)env sequences clustered with the sequence from the Pulukan sample (Bali) from 2001. JDV appears to be remarkably stable genetically and has undergone minor genetic changes over a period of nearly 20 years in Bali despite becoming endemic in the cattle population of the island.

  15. Homology to peptide pattern for annotation of carbohydrate-active enzymes and prediction of function.

    PubMed

    Busk, P K; Pilgaard, B; Lezyk, M J; Meyer, A S; Lange, L

    2017-04-12

    Carbohydrate-active enzymes are found in all organisms and participate in key biological processes. These enzymes are classified in 274 families in the CAZy database but the sequence diversity within each family makes it a major task to identify new family members and to provide basis for prediction of enzyme function. A fast and reliable method for de novo annotation of genes encoding carbohydrate-active enzymes is to identify conserved peptides in the curated enzyme families followed by matching of the conserved peptides to the sequence of interest as demonstrated for the glycosyl hydrolase and the lytic polysaccharide monooxygenase families. This approach not only assigns the enzymes to families but also provides functional prediction of the enzymes with high accuracy. We identified conserved peptides for all enzyme families in the CAZy database with Peptide Pattern Recognition. The conserved peptides were matched to protein sequence for de novo annotation and functional prediction of carbohydrate-active enzymes with the Hotpep method. Annotation of protein sequences from 12 bacterial and 16 fungal genomes to families with Hotpep had an accuracy of 0.84 (measured as F1-score) compared to semiautomatic annotation by the CAZy database whereas the dbCAN HMM-based method had an accuracy of 0.77 with optimized parameters. Furthermore, Hotpep provided a functional prediction with 86% accuracy for the annotated genes. Hotpep is available as a stand-alone application for MS Windows. Hotpep is a state-of-the-art method for automatic annotation and functional prediction of carbohydrate-active enzymes.
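
    The F1-score used above as the accuracy measure is the harmonic mean of precision and recall. A minimal sketch, assuming annotations are represented as (gene, family) pairs and compared against a reference set; the gene and family labels below are invented for illustration:

```python
def f1_score(predicted, reference):
    """Harmonic mean of precision and recall over two sets of annotations."""
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)

# Hypothetical annotations: two of three predictions agree with the reference.
predicted = {("g1", "GH5"), ("g2", "GH7"), ("g3", "AA9")}
reference = {("g1", "GH5"), ("g2", "GH7"), ("g4", "GH10"), ("g5", "GH3")}
print(round(f1_score(predicted, reference), 3))
```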

  16. CRISPR Diversity and Microevolution in Clostridium difficile

    PubMed Central

    Andersen, Joakim M.; Shoup, Madelyn; Robinson, Cathy; Britton, Robert; Olsen, Katharina E.P.; Barrangou, Rodolphe

    2016-01-01

    Abstract Virulent strains of Clostridium difficile have become a global health problem associated with morbidity and mortality. Traditional typing methods do not provide ideal resolution to track outbreak strains, ascertain genetic diversity between isolates, or monitor the phylogeny of this species on a global basis. Here, we investigate the occurrence and diversity of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated genes (cas) in C. difficile to assess the potential of CRISPR-based phylogeny and high-resolution genotyping. A single Type-IB CRISPR-Cas system was identified in 217 analyzed genomes with cas gene clusters present at conserved chromosomal locations, suggesting vertical evolution of the system, assessing a total of 1,865 CRISPR arrays. The CRISPR arrays, markedly enriched (8.5 arrays/genome) compared with other species, occur both at conserved and variable locations across strains, and thus provide a basis for typing based on locus occurrence and spacer polymorphism. Clustering of strains by array composition correlated with sequence type (ST) analysis. Spacer content and polymorphism within conserved CRISPR arrays revealed phylogenetic relationship across clades and within ST. Spacer polymorphisms of conserved arrays were instrumental for differentiating closely related strains, e.g., ST1/RT027/B1 strains and pathogenicity locus encoding ST3/RT001 strains. CRISPR spacers showed sequence similarity to phage sequences, which is consistent with the native role of CRISPR-Cas as adaptive immune systems in bacteria. Overall, CRISPR-Cas sequences constitute a valuable basis for genotyping of C. difficile isolates, provide insights into the micro-evolutionary events that occur between closely related strains, and reflect the evolutionary trajectory of these genomes. PMID:27576538
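
    Comparing strains by the spacer content of their CRISPR arrays, as this abstract describes, can be sketched with a simple set-based similarity. The Jaccard index used here is an assumption chosen for illustration, not the clustering method of the study, and the spacer identifiers are hypothetical.

```python
def spacer_jaccard(spacers_a, spacers_b):
    """Jaccard similarity of two strains' spacer sets (1.0 = identical content)."""
    a, b = set(spacers_a), set(spacers_b)
    if not a and not b:
        return 1.0  # two empty arrays are treated as identical by convention
    return len(a & b) / len(a | b)

def private_spacers(spacers_a, spacers_b):
    """Spacers present in only one of the two strains (the polymorphic ones)."""
    a, b = set(spacers_a), set(spacers_b)
    return a - b, b - a

# Hypothetical spacer identifiers for two closely related strains.
strain_1 = ["sp01", "sp02", "sp03"]
strain_2 = ["sp02", "sp03", "sp04"]
similarity = spacer_jaccard(strain_1, strain_2)
only_1, only_2 = private_spacers(strain_1, strain_2)
```

    Closely related strains would be expected to share most spacers and differ in a few, which is what the private-spacer sets expose.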

  17. Computational approaches for identification of conserved/unique binding pockets in the A chain of ricin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ecale Zhou, C L; Zemla, A T; Roe, D

    2005-01-29

    Specific and sensitive ligand-based protein detection assays that employ antibodies or small molecules such as peptides, aptamers, or other small molecules require that the corresponding surface region of the protein be accessible and that there be minimal cross-reactivity with non-target proteins. To reduce the time and cost of laboratory screening efforts for diagnostic reagents, we developed new methods for evaluating and selecting protein surface regions for ligand targeting. We devised combined structure- and sequence-based methods for identifying 3D epitopes and binding pockets on the surface of the A chain of ricin that are conserved with respect to a set of ricin A chains and unique with respect to other proteins. We (1) used structure alignment software to detect structural deviations and extracted from this analysis the residue-residue correspondence, (2) devised a method to compare corresponding residues across sets of ricin structures and structures of closely related proteins, (3) devised a sequence-based approach to determine residue infrequency in local sequence context, and (4) modified a pocket-finding algorithm to identify surface crevices in close proximity to residues determined to be conserved/unique based on our structure- and sequence-based methods. In applying this combined informatics approach to ricin A we identified a conserved/unique pocket in close proximity (but not overlapping) the active site that is suitable for bi-dentate ligand development. These methods are generally applicable to identification of surface epitopes and binding pockets for development of diagnostic reagents, therapeutics, and vaccines.

  18. Taxonomic uncertainty and the loss of biodiversity on Christmas Island, Indian Ocean.

    PubMed

    Eldridge, Mark D B; Meek, Paul D; Johnson, Rebecca N

    2014-04-01

    The taxonomic uniqueness of island populations is often uncertain, which hinders effective prioritization for conservation. The Christmas Island shrew (Crocidura attenuata trichura) is the only member of the highly speciose eutherian family Soricidae recorded from Australia. It is currently classified as a subspecies of the Asian gray or long-tailed shrew (C. attenuata), although it was originally described as a subspecies of the southeast Asian white-toothed shrew (C. fuliginosa). The Christmas Island shrew is currently listed as endangered and has not been recorded in the wild since 1984-1985, when 2 specimens were collected after an 80-year absence. We aimed to obtain DNA sequence data for cytochrome b (cytb) from Christmas Island shrew museum specimens to determine their taxonomic affinities and to confirm the identity of the 1980s specimens. The cytb sequences from five 1898 specimens and a 1985 specimen were identical. In addition, the Christmas Island shrew cytb sequence was divergent at the species level from all available Crocidura cytb sequences. Rather than a population of a widespread species, current evidence suggests the Christmas Island shrew is a critically endangered endemic species, C. trichura, and a high priority for conservation. As the decisions typically required to save declining species can be delayed or deferred if the taxonomic status of the population in question is uncertain, it is hoped that the history of the Christmas Island shrew will encourage the clarification of taxonomy to be seen as an important first step in initiating informed and effective conservation action. © 2013 Society for Conservation Biology.

  19. Ermelin, an endoplasmic reticulum transmembrane protein, contains the novel HELP domain conserved in eukaryotes.

    PubMed

    Suzuki, Akiko; Endo, Takeshi

    2002-02-06

    We have cloned a cDNA encoding a novel protein referred to as ermelin from mouse C2 skeletal muscle cells. This protein contained six hydrophobic amino acid stretches corresponding to transmembrane domains, two histidine-rich sequences, and a sequence homologous to the fusion peptides of certain fusion proteins. Ermelin also contained a novel modular sequence, designated as HELP domain, which was highly conserved among eukaryotes, from yeast to higher plants and animals. All these HELP domain-containing proteins, including mouse KE4, Drosophila Catsup, and Arabidopsis IAR1, possessed multipass transmembrane domains and histidine-rich sequences. Ermelin was predominantly expressed in brain and testis, and induced during neuronal differentiation of N1E-115 neuroblastoma cells but downregulated during myogenic differentiation of C2 cells. The mRNA was accumulated in hippocampus and cerebellum of brain and central areas of seminiferous tubules in testis. Epitope-tagging experiments located ermelin and KE4 to a network structure throughout the cytoplasm. Staining with the fluorescent dye DiOC(6)(3) identified this structure as the endoplasmic reticulum. These results suggest that at least some, if not all, of the HELP domain-containing proteins are multipass endoplasmic reticulum membrane proteins with functions conserved among eukaryotes.

  20. Long-range comparison of human and mouse Sprr loci to identify conserved noncoding sequences involved in coordinate regulation

    PubMed Central

    Martin, Natalia; Patel, Satyakam; Segre, Julia A.

    2004-01-01

    Mammalian epidermis provides a permeability barrier between an organism and its environment. Under homeostatic conditions, epidermal cells produce structural proteins, which are cross-linked in an orderly fashion to form a cornified envelope (CE). However, under genetic or environmental stress, specific genes are induced to rapidly build a temporary barrier. Small proline-rich (SPRR) proteins are the primary constituents of the CE. Under stress the entire family of 14 Sprr genes is upregulated. The Sprr genes are clustered within the larger epidermal differentiation complex on mouse chromosome 3, human chromosome 1q21. The clustering of the Sprr genes and their upregulation under stress suggest that these genes may be coordinately regulated. To identify enhancer elements that regulate this stress response activation of the Sprr locus, we utilized bioinformatic tools and classical biochemical dissection. Long-range comparative sequence analysis identified conserved noncoding sequences (CNSs). Clusters of epidermal-specific DNaseI-hypersensitive sites (HSs) mapped to specific CNSs. Increased prevalence of these HSs in barrier-deficient epidermis provides in vivo evidence of the regulation of the Sprr locus by these conserved sequences. Individual components of these HSs were cloned, and one was shown to have strong enhancer activity specific to conditions when the Sprr genes are coordinately upregulated. PMID:15574822

  1. ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules

    PubMed Central

    Ashkenazy, Haim; Abadi, Shiran; Martz, Eric; Chay, Ofer; Mayrose, Itay; Pupko, Tal; Ben-Tal, Nir

    2016-01-01

    The degree of evolutionary conservation of an amino acid in a protein or a nucleic acid in DNA/RNA reflects a balance between its natural tendency to mutate and the overall need to retain the structural integrity and function of the macromolecule. The ConSurf web server (http://consurf.tau.ac.il), established over 15 years ago, analyses the evolutionary pattern of the amino/nucleic acids of the macromolecule to reveal regions that are important for structure and/or function. Starting from a query sequence or structure, the server automatically collects homologues, infers their multiple sequence alignment and reconstructs a phylogenetic tree that reflects their evolutionary relations. These data are then used, within a probabilistic framework, to estimate the evolutionary rates of each sequence position. Here we introduce several new features into ConSurf, including automatic selection of the best evolutionary model used to infer the rates, the ability to homology-model query proteins, prediction of the secondary structure of query RNA molecules from sequence, the ability to view the biological assembly of a query (in addition to the single chain), mapping of the conservation grades onto 2D RNA models and an advanced view of the phylogenetic tree that enables interactively rerunning ConSurf with the taxa of a sub-tree. PMID:27166375

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bahl, C.; Morisseau, C; Bomberger, J

    Cystic fibrosis transmembrane conductance regulator (CFTR) inhibitory factor (Cif) is a virulence factor secreted by Pseudomonas aeruginosa that reduces the quantity of CFTR in the apical membrane of human airway epithelial cells. Initial sequence analysis suggested that Cif is an epoxide hydrolase (EH), but its sequence violates two strictly conserved EH motifs and also is compatible with other α/β hydrolase family members with diverse substrate specificities. To investigate the mechanistic basis of Cif activity, we have determined its structure at 1.8-Å resolution by X-ray crystallography. The catalytic triad consists of residues Asp129, His297, and Glu153, which are conserved across the family of EHs. At other positions, sequence deviations from canonical EH active-site motifs are stereochemically conservative. Furthermore, detailed enzymatic analysis confirms that Cif catalyzes the hydrolysis of epoxide compounds, with specific activity against both epibromohydrin and cis-stilbene oxide, but with a relatively narrow range of substrate selectivity. Although closely related to two other classes of α/β hydrolase in both sequence and structure, Cif does not exhibit activity as either a haloacetate dehalogenase or a haloalkane dehalogenase. A reassessment of the structural and functional consequences of the H269A mutation suggests that Cif's effect on host-cell CFTR expression requires the hydrolysis of an extended endogenous epoxide substrate.

  3. A Bioinformatic Strategy for the Detection, Classification and Analysis of Bacterial Autotransporters

    PubMed Central

    Celik, Nermin; Webb, Chaille T.; Leyton, Denisse L.; Holt, Kathryn E.; Heinz, Eva; Gorrell, Rebecca; Kwok, Terry; Naderer, Thomas; Strugnell, Richard A.; Speed, Terence P.; Teasdale, Rohan D.; Likić, Vladimir A.; Lithgow, Trevor

    2012-01-01

    Autotransporters are secreted proteins that are assembled into the outer membrane of bacterial cells. The passenger domains of autotransporters are crucial for bacterial pathogenesis, with some remaining attached to the bacterial surface while others are released by proteolysis. An enigma remains as to whether autotransporters should be considered a class of secretion system, or simply a class of substrate with peculiar requirements for their secretion. We sought to establish a sensitive search protocol that could identify and characterize diverse autotransporters from bacterial genome sequence data. The new sequence analysis pipeline identified more than 1500 autotransporter sequences from diverse bacteria, including numerous species of Chlamydiales and Fusobacteria as well as all classes of Proteobacteria. Interrogation of the proteins revealed that there are numerous classes of passenger domains beyond the known proteases, adhesins and esterases. In addition, the barrel-domain, a characteristic feature of autotransporters, was found to be composed of seven conserved sequence segments that can be arranged in multiple ways in the tertiary structure of the assembled autotransporter. One of these conserved motifs overlays the targeting information required for autotransporters to reach the outer membrane. Another conserved and diagnostic motif maps to the linker region between the passenger domain and barrel-domain, indicating it as an important feature in the assembly of autotransporters. PMID:22905239

  4. Antisense Transcription Is Pervasive but Rarely Conserved in Enteric Bacteria

    PubMed Central

    Raghavan, Rahul; Sloan, Daniel B.; Ochman, Howard

    2012-01-01

    ABSTRACT Noncoding RNAs, including antisense RNAs (asRNAs) that originate from the complementary strand of protein-coding genes, are involved in the regulation of gene expression in all domains of life. Recent application of deep-sequencing technologies has revealed that the transcription of asRNAs occurs genome-wide in bacteria. Although the role of the vast majority of asRNAs remains unknown, it is often assumed that their presence implies important regulatory functions, similar to those of other noncoding RNAs. Alternatively, many antisense transcripts may be produced by chance transcription events from promoter-like sequences that result from the degenerate nature of bacterial transcription factor binding sites. To investigate the biological relevance of antisense transcripts, we compared genome-wide patterns of asRNA expression in closely related enteric bacteria, Escherichia coli and Salmonella enterica serovar Typhimurium, by performing strand-specific transcriptome sequencing. Although antisense transcripts are abundant in both species, less than 3% of asRNAs are expressed at high levels in both species, and only about 14% appear to be conserved among species. And unlike the promoters of protein-coding genes, asRNA promoters show no evidence of sequence conservation between, or even within, species. Our findings suggest that many or even most bacterial asRNAs are nonadaptive by-products of the cell’s transcription machinery. PMID:22872780

  5. Finding a (pine) needle in a haystack: chloroplast genome sequence divergence in rare and widespread pines

    Treesearch

    J.B. Whittall; J. Syring; M. Parks; J. Buenrostro; C. Dick; A. Liston; R. Cronn

    2010-01-01

    Critical to conservation efforts and other investigations at low taxonomic levels, DNA sequence data offer important insights into the distinctiveness, biogeographic partitioning, and evolutionary histories of species. The resolving power of DNA sequences is often limited by insufficient variability at the intraspecific level. This is particularly true of studies...

  6. Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936.

    PubMed

    Durrens, Pascal; Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J; Noël, Thierry

    2017-08-03

    Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. Copyright © 2017 Durrens et al.

  7. Genome Sequence of the Yeast Clavispora lusitaniae Type Strain CBS 6936

    PubMed Central

    Klopp, Christophe; Biteau, Nicolas; Fitton-Ouhabi, Valérie; Dementhon, Karine; Accoceberry, Isabelle; Sherman, David J.; Noël, Thierry

    2017-01-01

    ABSTRACT Clavispora lusitaniae, an environmental saprophytic yeast belonging to the CTG clade of Candida, can behave occasionally as an opportunistic pathogen in humans. We report here the genome sequence of the type strain CBS 6936. Comparison with sequences of strain ATCC 42720 indicates conservation of chromosomal structure but significant nucleotide divergence. PMID:28774979

  8. Non-biological synthetic spike-in controls and the AMPtk software pipeline improve mycobiome data

    Treesearch

    Jonathan M. Palmer; Michelle A. Jusino; Mark T. Banik; Daniel L. Lindner

    2018-01-01

    High-throughput amplicon sequencing (HTAS) of conserved DNA regions is a powerful technique to characterize microbial communities. Recently, spike-in mock communities have been used to measure accuracy of sequencing platforms and data analysis pipelines. To assess the ability of sequencing platforms and data processing pipelines using fungal internal transcribed spacer...

  9. Analysis of developmental gene conservation in the Actinomycetales using DNA/DNA microarray comparisons.

    PubMed

    Kirby, Ralph; Herron, Paul; Hoskisson, Paul

    2011-02-01

    Based on available genome sequences, Actinomycetales show significant gene synteny across a wide range of species and genera. In addition, many genera show varying degrees of complex morphological development. Using the presence of gene synteny as a basis, it is clear that an analysis of gene conservation across the Streptomyces and various other Actinomycetales will provide information on both the importance of genes and gene clusters and the evolution of morphogenesis in these bacteria. Genome sequencing, although becoming cheaper, is still relatively expensive for comparing large numbers of strains. Thus, a heterologous DNA/DNA microarray hybridization dataset based on a Streptomyces coelicolor microarray allows a cheaper and greater depth of analysis of gene conservation. This study, using both bioinformatical and microarray approaches, was able to classify genes previously identified as involved in morphogenesis in Streptomyces into various subgroups in terms of conservation across species and genera. This will allow the targeting of genes for further study based on their importance at the species level and at higher evolutionary levels.

  10. Variability among the Most Rapidly Evolving Plastid Genomic Regions is Lineage-Specific: Implications of Pairwise Genome Comparisons in Pyrus (Rosaceae) and Other Angiosperms for Marker Choice

    PubMed Central

    Ter-Voskanyan, Hasmik; Allgaier, Martin; Borsch, Thomas

    2014-01-01

    Plastid genomes exhibit different levels of variability in their sequences, depending on the respective kinds of genomic regions. Genes are usually more conserved while noncoding introns and spacers evolve at a faster pace. While a set of about thirty maximum variable noncoding genomic regions has been suggested to provide universally promising phylogenetic markers throughout angiosperms, applications often require several regions to be sequenced for many individuals. Our project aims to illuminate evolutionary relationships and species-limits in the genus Pyrus (Rosaceae), a typical case with very low genetic distances between taxa. In this study, we have sequenced the plastid genome of Pyrus spinosa and aligned it to the already available P. pyrifolia sequence. The overall p-distance of the two Pyrus genomes was 0.00145. The intergenic spacers between ndhC–trnV, trnR–atpA, ndhF–rpl32, psbM–trnD, and trnQ–rps16 were the most variable regions, also comprising the highest total numbers of substitutions, indels and inversions (potentially informative characters). Our comparative analysis of further plastid genome pairs with similar low p-distances from Oenothera (representing another rosid), Olea (asterids) and Cymbidium (monocots) showed in each case a different ranking of genomic regions in terms of variability and potentially informative characters. Only two intergenic spacers (ndhF–rpl32 and trnK–rps16) were consistently found among the 30 top-ranked regions. We have mapped the occurrence of substitutions and microstructural mutations in the four genome pairs. High AT content in specific sequence elements seems to foster frequent mutations. We conclude that the variability among the fastest evolving plastid genomic regions is lineage-specific and thus cannot be precisely predicted across angiosperms. The often lineage-specific occurrence of stem-loop elements in the sequences of introns and spacers also governs lineage-specific mutations. Sequencing whole plastid genomes to find markers for evolutionary analyses is therefore particularly useful when overall genetic distances are low. PMID:25405773
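
    The p-distance quoted in this abstract is the proportion of compared sites at which two aligned sequences differ. A minimal sketch follows, assuming the two sequences are already aligned to equal length and that gapped positions are skipped; how the study itself handled indels is not stated here, so that choice is an assumption.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differences / compared
```

    For instance, `p_distance("ACGTACGT", "ACGTACGA")` gives 0.125, one difference in eight compared sites. A genome-wide value of 0.00145 corresponds to roughly 1.45 differences per 1,000 compared sites.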

  11. Sequence Complexity of Chromosome 3 in Caenorhabditis elegans

    PubMed Central

    Pierro, Gaetano

    2012-01-01

    The complexity of the nucleotide sequences in chromosome 3 of Caenorhabditis elegans (C. elegans) is studied. The complexity of these sequences is compared with that of some random sequences. Moreover, using parameters related to complexity, such as fractal dimension and frequency, together with the indicator matrix, a first classification of the C. elegans sequences is given. In particular, the sequences with the highest and lowest fractal values are singled out. It is shown that the low fractal dimension sequences have many features in common with the random sequences. PMID:22919380

  12. Sequence of Radiotherapy and Chemotherapy in Breast Cancer After Breast-Conserving Surgery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jobsen, Jan J., E-mail: J.Jobsen@mst.nl; Palen, Job van der; Department of Research Methodology, Measurement and Data Analysis, Faculty of Behavioural Science, University of Twente

    2012-04-01

    Purpose: The optimal sequence of radiotherapy and chemotherapy in breast-conserving therapy is unknown. Methods and Materials: From 1983 through 2007, a total of 641 patients with 653 instances of breast-conserving therapy (BCT) received both chemotherapy and radiotherapy and are the basis of this analysis. Patients were divided into three groups. Groups A and B comprised patients treated before 2005, Group A radiotherapy first and Group B chemotherapy first. Group C consisted of patients treated from 2005 onward, when we had a fixed sequence of radiotherapy first, followed by chemotherapy. Results: Local control did not show any differences among the three groups. For distant metastasis, no difference was shown between Groups A and B. Group C, when compared with Group A, showed, on univariate and multivariate analyses, a significantly better distant metastasis-free survival. The same was noted for disease-free survival. With respect to disease-specific survival, no differences were shown on multivariate analysis among the three groups. Conclusion: Radiotherapy, as an integral part of the primary treatment of BCT, should be administered first, followed by adjuvant chemotherapy.

  13. Mutations that alter a conserved element upstream of the potato virus X triple block and coat protein genes affect subgenomic RNA accumulation.

    PubMed

    Kim, K H; Hemenway, C

    1997-05-26

    The putative subgenomic RNA (sgRNA) promoter regions upstream of the potato virus X (PVX) triple block and coat protein (CP) genes contain sequences common to other potexviruses. The importance of these sequences to PVX sgRNA accumulation was determined by inoculation of Nicotiana tabacum NT1 cell suspension protoplasts with transcripts derived from wild-type and modified PVX cDNA clones. Analyses of RNA accumulation by S1 nuclease digestion and primer extension indicated that a conserved octanucleotide sequence element and the spacing between this element and the start-site for sgRNA synthesis are critical for accumulation of the two major sgRNA species. The impact of mutations on CP sgRNA levels was also reflected in the accumulation of CP. In contrast, genomic minus- and plus-strand RNA accumulation were not significantly affected by mutations in these regions. Studies involving inoculation of tobacco plants with the modified transcripts suggested that the conserved octanucleotide element functions in sgRNA accumulation and some other aspect of the infection process.

  14. The Use of DNA Barcoding in Identification and Conservation of Rosewood (Dalbergia spp.)

    PubMed Central

    Hartvig, Ida; Czako, Mihaly; Kjær, Erik Dahl; Nielsen, Lene Rostgaard; Theilade, Ida

    2015-01-01

    The genus Dalbergia contains many valuable timber species threatened by illegal logging and deforestation, but knowledge on distributions and threats is often limited and accurate species identification difficult. The aim of this study was to apply DNA barcoding methods to support conservation efforts of Dalbergia species in Indochina. We used the recommended rbcL, matK and ITS barcoding markers on 95 samples covering 31 species of Dalbergia, and tested their discrimination ability with both traditional distance-based as well as different model-based machine learning methods. We specifically tested whether the markers could be used to solve taxonomic confusion concerning the timber species Dalbergia oliveri, and to identify the CITES-listed Dalbergia cochinchinensis. We also applied the barcoding markers to 14 samples of unknown identity. In general, we found that the barcoding markers discriminated among Dalbergia species with high accuracy. We found that ITS yielded the single highest discrimination rate (100%), but due to difficulties in obtaining high-quality sequences from degraded material, the better overall choice for Dalbergia seems to be the standard rbcL+matK barcode, as this yielded discrimination rates close to 90% and amplified well. The distance-based method TaxonDNA showed the highest identification rates overall, although a more complete specimen sampling is needed to conclude on the best analytic method. We found strong support for a monophyletic Dalbergia oliveri and encourage that this name is used consistently in Indochina. The CITES-listed Dalbergia cochinchinensis was successfully identified, and a species-specific assay can be developed from the data generated in this study for the identification of illegally traded timber. We suggest that the use of DNA barcoding is integrated into the work flow during floristic studies and at national herbaria in the region, as this could significantly increase the number of identified specimens and improve knowledge about species distributions. PMID:26375850

  15. Amino acid sequence analysis of the annexin super-gene family of proteins.

    PubMed

    Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J

    1991-06-15

    The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried salt bridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
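
    The conservation analysis mentioned in this abstract (finding positions that are absolutely conserved across aligned repeats) can be sketched on a toy alignment. The sequences below are hypothetical, and the Shannon-entropy score is a generic per-column measure added for illustration, not the scoring scheme used in the paper.

```python
from collections import Counter
import math

def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column; 0 = invariant."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def invariant_columns(alignment, gap="-"):
    """Zero-based indices of columns where every sequence has the same residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        residues = {seq[i] for seq in alignment}
        if len(residues) == 1 and gap not in residues:
            conserved.append(i)
    return conserved

# Hypothetical toy alignment (not annexin repeats).
toy_alignment = ["MKGLV", "MRGIV", "MKGLA"]
conserved_positions = invariant_columns(toy_alignment)
```

    In the toy alignment only the first and third columns are invariant, analogous to the single absolutely conserved Gly reported for the 88 repeats.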

  16. Conservation and divergence of ADAM family proteins in the Xenopus genome

    PubMed Central

    2010-01-01

    Background Members of the disintegrin metalloproteinase (ADAM) family play important roles in cellular and developmental processes through their functions as proteases and/or binding partners for other proteins. The amphibian Xenopus has long been used as a model for early vertebrate development, but genome-wide analyses for large gene families were not possible until the recent completion of the X. tropicalis genome sequence and the availability of large scale expressed sequence tag (EST) databases. In this study we carried out a systematic analysis of the X. tropicalis genome and uncovered several interesting features of ADAM genes in this species. Results Based on the X. tropicalis genome sequence and EST databases, we identified Xenopus orthologues of mammalian ADAMs and obtained full-length cDNA clones for these genes. The deduced protein sequences, synteny and exon-intron boundaries are conserved between most human and X. tropicalis orthologues. The alternative splicing patterns of certain Xenopus ADAM genes, such as adams 22 and 28, are similar to those of their mammalian orthologues. However, we were unable to identify an orthologue for ADAM7 or 8. The Xenopus orthologue of ADAM15, an active metalloproteinase in mammals, does not contain the conserved zinc-binding motif and is hence considered proteolytically inactive. We also found evidence for gain of ADAM genes in Xenopus as compared to other species. There is a homologue of ADAM10 in Xenopus that is missing in most mammals. Furthermore, a single scaffold of X. tropicalis genome contains four genes encoding ADAM28 homologues, suggesting genome duplication in this region. Conclusions Our genome-wide analysis of ADAM genes in X. tropicalis revealed both conservation and evolutionary divergence of these genes in this amphibian species. On the one hand, all ADAMs implicated in normal development and health in other species are conserved in X. tropicalis. On the other hand, some ADAM genes and ADAM protease activities are absent, while other novel ADAM proteins in this species are predicted by this study. The conservation and unique divergence of ADAM genes in Xenopus probably reflect the particular selective pressures these amphibian species faced during evolution. PMID:20630080

  17. Genomic positional conservation identifies topological anchor point RNAs linked to developmental loci.

    PubMed

    Amaral, Paulo P; Leonardi, Tommaso; Han, Namshik; Viré, Emmanuelle; Gascoigne, Dennis K; Arias-Carrasco, Raúl; Büscher, Magdalena; Pandolfini, Luca; Zhang, Anda; Pluchino, Stefano; Maracaja-Coutinho, Vinicius; Nakaya, Helder I; Hemberg, Martin; Shiekhattar, Ramin; Enright, Anton J; Kouzarides, Tony

    2018-03-15

    The mammalian genome is transcribed into large numbers of long noncoding RNAs (lncRNAs), but the definition of functional lncRNA groups has proven difficult, partly due to their low sequence conservation and lack of identified shared properties. Here we consider promoter conservation and positional conservation as indicators of functional commonality. We identify 665 conserved lncRNA promoters in mouse and human that are preserved in genomic position relative to orthologous coding genes. These positionally conserved lncRNA genes are primarily associated with developmental transcription factor loci with which they are coexpressed in a tissue-specific manner. Over half of positionally conserved RNAs in this set are linked to chromatin organization structures, overlapping binding sites for the CTCF chromatin organiser and located at chromatin loop anchor points and borders of topologically associating domains (TADs). We define these RNAs as topological anchor point RNAs (tapRNAs). Characterization of these noncoding RNAs and their associated coding genes shows that they are functionally connected: they regulate each other's expression and influence the metastatic phenotype of cancer cells in vitro in a similar fashion. Furthermore, we find that tapRNAs contain conserved sequence domains that are enriched in motifs for zinc finger domain-containing RNA-binding proteins and transcription factors, whose binding sites are found mutated in cancers. This work leverages positional conservation to identify lncRNAs with potential importance in genome organization, development and disease. The evidence that many developmental transcription factors are physically and functionally connected to lncRNAs represents an exciting stepping-stone to further our understanding of genome regulation.

  18. Energy Conservation Curriculum for Secondary and Post-Secondary Students. Module 6: Hot Water Heating Conservation Opportunities.

    ERIC Educational Resources Information Center

    Navarro Coll., Corsicana, TX.

    This module is the sixth in a series of eleven modules in an energy conservation curriculum for secondary and postsecondary vocational students. It is designed for use by itself or as part of a sequence of four modules on understanding utilities (see also modules 3, 5, and 7). The objective of this module is to train students in the recognition,…

  19. Ultra-deep sequencing reveals high prevalence and broad structural diversity of hepatitis B surface antigen mutations in a global population

    PubMed Central

    Gencay, Mikael; Hübner, Kirsten; Gohl, Peter; Seffner, Anja; Weizenegger, Michael; Neofytos, Dionysios; Batrla, Richard; Woeste, Andreas; Kim, Hyon-suk; Westergaard, Gaston; Reinsch, Christine; Brill, Eva; Thu Thuy, Pham Thi; Hoang, Bui Huu; Sonderup, Mark; Spearman, C. Wendy; Pabinger, Stephan; Gautier, Jérémie; Brancaccio, Giuseppina; Fasano, Massimo; Santantonio, Teresa; Gaeta, Giovanni B.; Nauck, Markus; Kaminski, Wolfgang E.

    2017-01-01

    The diversity of the hepatitis B surface antigen (HBsAg) has a significant impact on the performance of diagnostic screening tests and the clinical outcome of hepatitis B infection. Neutralizing or diagnostic antibodies against the HBsAg are directed towards its highly conserved major hydrophilic region (MHR), in particular towards its “a” determinant subdomain. Here, we explored, on a global scale, the genetic diversity of the HBsAg MHR in a large, multi-ethnic cohort of randomly selected subjects with HBV infection from four continents. A total of 1553 HBsAg positive blood samples of subjects originating from 20 different countries across Africa, America, Asia and central Europe were characterized for amino acid variation in the MHR. Using highly sensitive ultra-deep sequencing, we found that 72.8% of the successfully sequenced subjects (n = 1391) demonstrated amino acid sequence variation in the HBsAg MHR. This indicates that the global variation frequency in the HBsAg MHR is threefold higher than previously reported. The majority of the amino acid mutations were found in the HBV genotypes B (28.9%) and C (25.4%). Collectively, we identified 345 distinct amino acid mutations in the MHR. Among these, we report 62 previously unknown mutations, which extends the worldwide pool of currently known HBsAg MHR mutations by 22%. Importantly, topological analysis identified the “a” determinant upstream flanking region as the structurally most diverse subdomain of the HBsAg MHR. The highest prevalence of “a” determinant region mutations was observed in subjects from Asia, followed by the African, American and European cohorts, respectively. Finally, we found that more than half (59.3%) of all HBV subjects investigated carried multiple MHR mutations. Together, this worldwide ultra-deep sequencing based genotyping study reveals that the global prevalence and structural complexity of variation in the hepatitis B surface antigen have, to date, been significantly underappreciated. PMID:28472040
