Sample records for sequence analysis suggested

  1. The complete sequence of Cymbidium mosaic virus from Vanilla fragrans in Hainan, China.

    PubMed

    He, Zhen; Jiang, Dongmei; Liu, Aiqin; Sang, Liwei; Li, Wenfeng; Li, Shifang

    2011-06-01

    The complete nucleotide sequence of Cymbidium mosaic virus (CymMV) isolated from vanilla in Hainan Province, China, was determined for the first time. It comprised 6,224 nucleotides; sequence analysis suggested that the isolate we obtained was a member of the genus Potexvirus, and its sequence shared 86.67-96.61% identity with previously reported sequences. Phylogenetic analysis suggested that CymMV from Vanilla fragrans clustered into subgroup A and that the isolates in this subgroup displayed little regional difference.
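
    The identity range quoted above is a pairwise percent-identity figure. A minimal sketch of one common way to compute it is given below, assuming the two sequences have already been aligned to equal length; the function name, the gap character and the choice to skip gap columns are illustrative assumptions, not the authors' stated method.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: inputs are rows of an existing pairwise alignment (equal length).
    Skipping gap columns is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # ignore columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not CymMV data).
print(percent_identity("ACGT-A", "ACGA-A"))
```

    Other conventions (for example, counting gap columns as mismatches) give different values, so reported identities are only comparable when the same convention is used.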

  2. Novel primer specific false terminations during DNA sequencing reactions: danger of inaccuracy of mutation analysis in molecular diagnostics

    PubMed Central

    Anwar, R; Booth, A; Churchill, A J; Markham, A F

    1996-01-01

    The determination of nucleotide sequence is fundamental to the identification and molecular analysis of genes. Direct sequencing of PCR products is now becoming a commonplace procedure for haplotype analysis, and for defining mutations and polymorphism within genes, particularly for diagnostic purposes. A previously unrecognised phenomenon, primer related variability, observed in sequence data generated using Taq cycle sequencing and T7 Sequenase sequencing, is reported. This suggests that caution is necessary when interpreting DNA sequence data. This is particularly important in situations where treatment may be dependent on the accuracy of the molecular diagnosis. PMID:16696096

  3. Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

    PubMed

    Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

    2017-03-27

    Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.

  4. Kickoff to Conflict: A Sequence Analysis of Intra-State Conflict-Preceding Event Structures

    PubMed Central

    D'Orazio, Vito; Yonamine, James E.

    2015-01-01

    While many studies have suggested or assumed that the periods preceding the onset of intra-state conflict are similar across time and space, few have empirically tested this proposition. Using the Integrated Crisis Early Warning System's domestic event data in Asia from 1998–2010, we subject this proposition to empirical analysis. We code the similarity of government-rebel interactions in sequences preceding the onset of intra-state conflict to those preceding further periods of peace using three different metrics: Euclidean, Levenshtein, and mutual information. These scores are then used as predictors in a bivariate logistic regression to forecast whether we are likely to observe conflict in neither, one, or both of the states. We find that our model accurately classifies cases where both sequences precede peace, but struggles to distinguish between cases in which one sequence escalates to conflict and where both sequences escalate to conflict. These findings empirically suggest that generalizable patterns exist between event sequences that precede peace. PMID:25951105
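
    Of the three similarity metrics named above, the Levenshtein (edit) distance is the easiest to illustrate. The sketch below is a standard dynamic-programming implementation with unit costs; the event labels in the example are hypothetical and are not drawn from the Integrated Crisis Early Warning System data, and the abstract does not say how the authors encoded events or weighted edits.

```python
def levenshtein(a, b):
    """Edit distance between two sequences with unit-cost insert, delete, substitute."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical sequences of coded government-rebel interactions.
seq_a = ["protest", "arrest", "protest", "attack"]
seq_b = ["protest", "negotiate", "protest"]
print(levenshtein(seq_a, seq_b))
```

    A smaller distance means that fewer event insertions, deletions or substitutions are needed to turn one sequence into the other, which is the sense in which two pre-onset periods would be scored as similar.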

  5. Direct repeat sequences in the Streptomyces chitinase-63 promoter direct both glucose repression and chitin induction

    PubMed Central

    Ni, Xiangyang; Westpheling, Janet

    1997-01-01

    The chi63 promoter directs glucose-sensitive, chitin-dependent transcription of a gene involved in the utilization of chitin as carbon source. Analysis of 5′ and 3′ deletions of the promoter region revealed that a 350-bp segment is sufficient for wild-type levels of expression and regulation. The analysis of single base changes throughout the promoter region, introduced by random and site-directed mutagenesis, identified several sequences to be important for activity and regulation. Single base changes at −10, −12, −32, −33, −35, and −37 upstream of the transcription start site resulted in loss of activity from the promoter, suggesting that bases in these positions are important for RNA polymerase interaction. The sequences centered around −10 (TATTCT) and −35 (TTGACC) in this promoter are, in fact, prototypical of eubacterial promoters. Overlapping the RNA polymerase binding site is a perfect 12-bp direct repeat sequence. Some base changes within this direct repeat resulted in constitutive expression, suggesting that this sequence is an operator for negative regulation. Other base changes resulted in loss of glucose repression while retaining the requirement for chitin induction, suggesting that this sequence is also involved in glucose repression. The fact that cis-acting mutations resulted in glucose resistance but not inducer independence rules out the possibility that glucose repression acts exclusively by inducer exclusion. The fact that mutations that affect glucose repression and chitin induction fall within the same direct repeat sequence module suggests that the direct repeat sequence facilitates both chitin induction and glucose repression. PMID:9371809
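
    A perfect direct repeat such as the 12-bp element described above can be located computationally by indexing every k-mer and keeping those that recur without overlapping. The sketch below illustrates the idea on a toy sequence; the abstract does not give the chi63 repeat sequence, so the input here is hypothetical and the function name is an assumption.

```python
def find_direct_repeats(seq, k):
    """Map each k-mer to its start positions, keeping only k-mers that occur
    at two or more positions at least k bases apart (non-overlapping copies)."""
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    # Positions are stored in ascending order, so first and last bound the spread.
    return {kmer: pos for kmer, pos in positions.items()
            if pos[-1] - pos[0] >= k}


# Toy sequence (hypothetical): ACGT occurs at offsets 2 and 6.
print(find_direct_repeats("AAACGTACGTTT", 4))
```

    For a promoter fragment, the same call with k=12 would report any perfect 12-bp direct repeat; imperfect repeats would need a mismatch-tolerant search instead.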

  6. Therapeutic change in interaction: conversation analysis of a transforming sequence.

    PubMed

    Voutilainen, Liisa; Perakyla, Anssi; Ruusuvuori, Johanna

    2011-05-01

    A process of change within a single case of cognitive-constructivist therapy is analyzed by means of conversation analysis (CA). The focus is on a process of change in the sequences of interaction, which consist of the therapist's conclusion and the patient's response to it. In the conclusions, the therapist investigates and challenges the patient's tendency to transform her feelings of disappointment and anger into self-blame. Over the course of the therapy, the patient's responses to these conclusions are recast: from the patient first rejecting the conclusion, to then being ambivalent, and finally to agreeing with the therapist. On the basis of this case study, we suggest that an analysis that focuses on sequences of talk that are interactionally similar offers a sensitive method to investigate the manifestation of therapeutic change. It is suggested that this line of research can complement assimilation analysis and other methods of analyzing changes in a client's talk.

  7. Characterization of shark complement factor I gene(s): genomic analysis of a novel shark-specific sequence.

    PubMed

    Shin, Dong-Ho; Webb, Barbara M; Nakao, Miki; Smith, Sylvia L

    2009-07-01

    Complement factor I is a crucial regulator of mammalian complement activity. Very little is known of complement regulators in non-mammalian species. We isolated and sequenced four highly similar complement factor I cDNAs from the liver of the nurse shark (Ginglymostoma cirratum), designated as GcIf-1, GcIf-2, GcIf-3 and GcIf-4 (previously referred to as nsFI-a, -b, -c and -d) which encode 689, 673, 673 and 657 amino acid residues, respectively. They share 95% (

  8. Characterization of shark complement factor I gene(s): genomic analysis of a novel shark-specific sequence

    PubMed Central

    Shin, Dong-Ho; Webb, Barbara M.; Nakao, Miki; Smith, Sylvia L.

    2009-01-01

    Complement factor I is a crucial regulator of mammalian complement activity. Very little is known of complement regulators in non-mammalian species. We isolated and sequenced four highly similar complement factor I cDNAs from the liver of the nurse shark (Ginglymostoma cirratum), designated as GcIf-1, GcIf-2, GcIf-3 and GcIf-4 (previously referred to as nsFI-a, -b, -c and –d) which encode 689, 673, 673 and 657 amino acid residues, respectively. They share 95% (≤) amino acid identities with each other, 35.4 ~ 39.6% and 62.8 ~ 65.9% with factor I of mammals and banded houndshark (Triakis scyllium), respectively. The modular structure of the GcIf is similar to that of mammals with one notable exception, the presence of a novel shark-specific sequence between the leader peptide (LP) and the factor I membrane attack complex (FIMAC) domain. The cDNA sequences differ only in the size and composition of the shark-specific region (SSR). Sequence analysis of each SSR has identified within the region two novel short sequences (SS1 and SS2) and three repeat sequences (RS1, 2 and 3). Genomic analysis has revealed the existence of three introns between the leader peptide and the FIMAC domain, tentatively designated intron 1, intron 2, and intron 3 which span 4067, 2293 and 2082 bp, respectively. Southern blot analysis suggests the presence of a single gene copy for each cDNA type. Phylogenetic analysis suggests that complement factor I of cartilaginous fish diverged prior to the emergence of mammals. All four GcIf cDNA species are expressed in four different tissues and the liver is the main tissue in which expression level of all four is high. This suggests that the expression of GcIf isotypes is tissue-dependent. PMID:19423168

  9. Molecular Phylogenetic Analysis of Archaeal Intron-Containing Genes Coding for rRNA Obtained from a Deep-Subsurface Geothermal Water Pool

    PubMed Central

    Takai, Ken; Horikoshi, Koki

    1999-01-01

    Molecular phylogenetic analysis of a naturally occurring microbial community in a deep-subsurface geothermal environment indicated that the phylogenetic diversity of the microbial population in the environment was extremely limited and that only hyperthermophilic archaeal members closely related to Pyrobaculum were present. All archaeal ribosomal DNA sequences contained intron-like sequences, some of which had open reading frames with repeated homing-endonuclease motifs. The sequence similarity analysis and the phylogenetic analysis of these homing endonucleases suggested the possible phylogenetic relationship among archaeal rRNA-encoded homing endonucleases. PMID:10584021

  10. Ribosomal RNA sequence suggests microsporidia are extremely ancient eukaryotes

    NASA Technical Reports Server (NTRS)

    Vossbrinck, C. R.; Maddox, J. V.; Friedman, S.; Debrunner-Vossbrinck, B. A.; Woese, C. R.

    1987-01-01

    A comparative sequence analysis of the 18S small subunit ribosomal RNA (rRNA) of the microsporidium Vairimorpha necatrix is presented. The results show that this rRNA sequence is more unlike those of other eukaryotes than any known eukaryote rRNA sequence. It is concluded that the lineage leading to microsporidia branched very early from that leading to other eukaryotes.

  11. Identification of Human Lineage-Specific Transcriptional Coregulators Enabled by a Glossary of Binding Modules and Tunable Genomic Backgrounds.

    PubMed

    Mariani, Luca; Weinand, Kathryn; Vedenko, Anastasia; Barrera, Luis A; Bulyk, Martha L

    2017-09-27

    Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggests the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE in the analysis of the indirect binding motifs for the co-occurrence of tethering factors, suggesting novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs.

  12. Phylodynamic and Phylogeographic Patterns of the HIV Type 1 Subtype F1 Parenteral Epidemic in Romania

    PubMed Central

    Hué, Stéphane; Buckton, Andrew J.; Myers, Richard E.; Duiculescu, Dan; Ene, Luminita; Oprea, Cristiana; Tardei, Gratiela; Rugina, Sorin; Mardarescu, Mariana; Floch, Corinne; Notheis, Gundula; Zöhrer, Bettina; Cane, Patricia A.; Pillay, Deenan

    2012-01-01

    In the late 1980s an HIV-1 epidemic emerged in Romania that was dominated by subtype F1. The main route of infection is believed to be parenteral transmission in children. We sequenced partial pol coding regions of 70 subtype F1 samples from children and adolescents from the PENTA-EPPICC network of which 67 were from Romania. Phylogenetic reconstruction using the sequences and other publicly available global subtype F sequences showed that 79% of Romanian F1 sequences formed a statistically robust monophyletic cluster. The monophyletic cluster was epidemiologically linked to parenteral transmission in children. Coalescent-based analysis dated the origins of the parenteral epidemic to 1983 [1981–1987; 95% HPD]. The analysis also shows that the epidemic's effective population size has remained fairly constant since the early 1990s suggesting limited onward spread of the virus within the population. Furthermore, phylogeographic analysis suggests that the root location of the parenteral epidemic was Bucharest. PMID:22251065

  13. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data.

    PubMed

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E; Greenwood, Alex D

    2015-11-24

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggest that UrsusERV is a recent cross-species transmission from an unknown reservoir and place the viral group among the youngest of ERVs identified in mammals.

  14. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data

    PubMed Central

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E.; Greenwood, Alex D.

    2015-01-01

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggest that UrsusERV is a recent cross-species transmission from an unknown reservoir and place the viral group among the youngest of ERVs identified in mammals. PMID:26610552

  15. Mucosal and Cutaneous Human Papillomaviruses Detected in Raw Sewages

    PubMed Central

    La Rosa, Giuseppina; Fratini, Marta; Accardi, Luisa; D'Oro, Graziana; Della Libera, Simonetta; Muscillo, Michele; Di Bonito, Paola

    2013-01-01

    Epitheliotropic viruses can find their way into sewage. The aim of the present study was to investigate the occurrence, distribution, and genetic diversity of Human Papillomaviruses (HPVs) in urban wastewaters. Sewage samples were collected from treatment plants distributed throughout Italy. The DNA extracted from these samples was analyzed by PCR using five PV-specific sets of primers targeting the L1 (GP5/GP6, MY09/MY11, FAP59/64, SKF/SKR) and E1 regions (PM-A/PM-B), according to the protocols previously validated for the detection of mucosal and cutaneous HPV genotypes. PCR products underwent sequencing analysis and the sequences were aligned to reference genomes from the Papillomavirus Episteme database. Phylogenetic analysis was then performed to assess the genetic relationships among the different sequences and between the sequences of the samples and those of the prototype strains. A broad spectrum of sequences related to mucosal and cutaneous HPV types was detected in 81% of the sewage samples analyzed. Surprisingly, sequences related to the anogenital HPV6 and 11 were detected in 19% of the samples, and sequences related to the “high risk” oncogenic HPV16 were identified in two samples. Sequences related to HPV9, HPV20, HPV25, HPV76, HPV80, HPV104, HPV110, HPV111, HPV120 and HPV145 beta Papillomaviruses were detected in 76% of the samples. In addition, similarity searches and phylogenetic analysis of some sequences suggest that they could belong to putative new genotypes of the beta genus. In this study, for the first time, the presence of HPV viruses strongly related to human cancer is reported in sewage samples. Our data increases the knowledge of HPV genomic diversity and suggests that virological analysis of urban sewage can provide key information useful in supporting epidemiological studies. PMID:23341898

  16. Sequence Analysis and Domain Motifs in the Porcine Skin Decorin Glycosaminoglycan Chain*

    PubMed Central

    Zhao, Xue; Yang, Bo; Solakylidirim, Kemal; Joo, Eun Ji; Toida, Toshihiko; Higashi, Kyohei; Linhardt, Robert J.; Li, Lingyun

    2013-01-01

    Decorin proteoglycan is comprised of a core protein containing a single O-linked dermatan sulfate/chondroitin sulfate glycosaminoglycan (GAG) chain. Although the sequence of the decorin core protein is determined by the gene encoding its structure, the structure of its GAG chain is determined in the Golgi. The recent application of modern MS to bikunin, a far simpler chondroitin sulfate proteoglycan, suggests that it has a single or small number of defined sequences. On this basis, a similar approach was undertaken to sequence the much larger and more structurally complex dermatan sulfate/chondroitin sulfate GAG chain of porcine skin decorin. This approach resulted in information on the consistency/variability of its linkage region at the reducing end of the GAG chain, its iduronic acid-rich domain, glucuronic acid-rich domain, and non-reducing end. A general motif for the porcine skin decorin GAG chain was established. A single small decorin GAG chain was sequenced using MS/MS analysis. The data obtained in the study suggest that the decorin GAG chain has a small or a limited number of sequences. PMID:23423381

  17. Can natural proteins designed with 'inverted' peptide sequences adopt native-like protein folds?

    PubMed

    Sridhar, Settu; Guruprasad, Kunchur

    2014-01-01

    We have carried out a systematic computational analysis on a representative dataset of proteins of known three-dimensional structure, in order to evaluate whether it would be possible to 'swap' certain short peptide sequences in naturally occurring proteins with their corresponding 'inverted' peptides and generate 'artificial' proteins that are predicted to retain native-like protein fold. The analysis of 3,967 representative proteins from the Protein Data Bank revealed 102,677 unique identical inverted peptide sequence pairs that vary in sequence length between 5-12 and 18 amino acid residues. Our analysis illustrates with examples that such 'artificial' proteins may be generated by identifying peptides with 'similar structural environment' and by using comparative protein modeling and validation studies. Our analysis suggests that natural proteins may be tolerant to accommodating such peptides.
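
    The search for inverted peptide pairs can be pictured as a k-mer lookup. The sketch below assumes that "inverted" means the same residues read in reverse order, which the abstract does not state explicitly; the function name and the toy sequences are hypothetical, and palindromic peptides are excluded by choice.

```python
def inverted_peptide_pairs(sequences, k):
    """Return the set of (peptide, reversed_peptide) pairs of length k for which
    both orientations occur somewhere in the given protein sequences.

    Assumption: 'inverted' is taken to mean the reversed residue order.
    """
    kmers = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmers.add(seq[i:i + k])
    pairs = set()
    for pep in kmers:
        rev = pep[::-1]
        # pep < rev reports each pair once and skips palindromic peptides.
        if rev in kmers and pep < rev:
            pairs.add((pep, rev))
    return pairs


# Toy one-letter amino acid sequences (hypothetical, not Protein Data Bank entries).
print(sorted(inverted_peptide_pairs(["MKTAYIA", "GAIYATK"], 5)))
```

    A sequence-level match of this kind is only the first step; the abstract indicates that candidate swaps were further screened by structural environment and comparative modeling.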

  18. Genomic DNA sequence and cytosine methylation changes of adult rice leaves after seeds space flight

    NASA Astrophysics Data System (ADS)

    Shi, Jinming

    In this study, cytosine methylation at CCGG sites and genomic DNA sequence changes in adult rice leaves after seed space flight were detected by methylation-sensitive amplification polymorphism (MSAP) and amplified fragment length polymorphism (AFLP) techniques, respectively. Rice seeds were planted in the trial field after a 4-day space flight on the Shenzhou-6 spaceship of China. Adult leaves of space-treated rice, including 8 plants chosen randomly and 2 plants with phenotypic mutation, were used for AFLP and MSAP analysis. Polymorphisms in both DNA sequence and cytosine methylation were detected. For MSAP analysis, the average polymorphic frequencies of the on-ground controls, space-treated plants and mutants are 1.3%, 3.1% and 11%, respectively. For AFLP analysis, the average polymorphic frequencies are 1.4%, 2.9% and 8%, respectively. A total of 27 and 22 polymorphic fragments were cloned and sequenced from MSAP and AFLP analysis, respectively. Nine of the 27 fragments from MSAP analysis show homology to coding sequence. Of the 22 polymorphic fragments from AFLP analysis, none shows homology to mRNA sequence and eight show homology to repeat regions or retrotransposon sequences. These results suggest that although both genomic DNA sequence and cytosine methylation status can be affected by space flight, the genomic regions homologous to the fragments from the genomic DNA and cytosine methylation analyses were different.

  19. A proteomic analysis of leaf sheaths from rice.

    PubMed

    Shen, Shihua; Matsubae, Masami; Takao, Toshifumi; Tanaka, Naoki; Komatsu, Setsuko

    2002-10-01

    The proteins extracted from the leaf sheaths of rice seedlings were separated by 2-D PAGE, and analyzed by Edman sequencing and mass spectrometry, followed by database searching. Image analysis revealed 352 protein spots on 2-D PAGE after staining with Coomassie Brilliant Blue. The amino acid sequences of 44 of 84 proteins were determined; for 31 of these proteins, a clear function could be assigned, whereas for 12 proteins, no function could be assigned. Forty proteins did not yield amino acid sequence information, because they were N-terminally blocked, or the obtained sequences were too short and/or did not give unambiguous results. Fifty-nine proteins were analyzed by mass spectrometry; all of these proteins were identified by matching to the protein database. The amino acid sequences of 19 of 27 proteins analyzed by mass spectrometry were similar to the results of Edman sequencing. These results suggest that 2-D PAGE combined with Edman sequencing and mass spectrometry analysis can be effectively used to identify plant proteins.

  20. Forensic strategy to ensure the quality of sequencing data of mitochondrial DNA in highly degraded samples.

    PubMed

    Adachi, Noboru; Umetsu, Kazuo; Shojo, Hideki

    2014-01-01

    Mitochondrial DNA (mtDNA) is widely used for DNA analysis of highly degraded samples because of its polymorphic nature and high number of copies in a cell. However, as endogenous mtDNA in deteriorated samples is scarce and highly fragmented, it is not easy to obtain reliable data. In the current study, we report the risks of direct sequencing of mtDNA in highly degraded material, and suggest a strategy to ensure the quality of sequencing data. It was observed that direct sequencing data of the hypervariable segment (HVS) 1 by using primer sets that generate an amplicon of 407 bp (long-primer sets) was different from results obtained by using newly designed primer sets that produce an amplicon of 120-139 bp (mini-primer sets). The data aligned with the results of mini-primer sets analysis in an amplicon length-dependent manner; the shorter the amplicon, the more evident the endogenous sequence became. Coding region analysis using multiplex amplified product-length polymorphisms revealed the incongruence of single nucleotide polymorphisms between the coding region and HVS 1 caused by contamination with exogenous mtDNA. Although the sequencing data obtained using long-primer sets turned out to be erroneous, it was unambiguous and reproducible. These findings suggest that PCR primers that produce amplicons shorter than those currently recognized should be used for mtDNA analysis in highly degraded samples. Haplogroup motif analysis of the coding region and HVS should also be performed to improve the reliability of forensic mtDNA data.

  1. Effects of Sequences of Cognitions on Group Performance Over Time

    PubMed Central

    Molenaar, Inge; Chiu, Ming Ming

    2017-01-01

    Extending past research showing that sequences of low cognitions (low-level processing of information) and high cognitions (high-level processing of information through questions and elaborations) influence the likelihoods of subsequent high and low cognitions, this study examines whether sequences of cognitions are related to group performance over time; 54 primary school students (18 triads) discussed and wrote an essay about living in another country (32,375 turns of talk). Content analysis and statistical discourse analysis showed that within each lesson, groups with more low cognitions or more sequences of low cognition followed by high cognition added more essay words. Groups with more high cognitions, sequences of low cognition followed by low cognition, or sequences of high cognition followed by an action followed by low cognition, showed different words and sequences, suggestive of new ideas. The links between cognition sequences and group performance over time can inform facilitation and assessment of student discussions. PMID:28490854

  2. Effects of Sequences of Cognitions on Group Performance Over Time.

    PubMed

    Molenaar, Inge; Chiu, Ming Ming

    2017-04-01

    Extending past research showing that sequences of low cognitions (low-level processing of information) and high cognitions (high-level processing of information through questions and elaborations) influence the likelihoods of subsequent high and low cognitions, this study examines whether sequences of cognitions are related to group performance over time; 54 primary school students (18 triads) discussed and wrote an essay about living in another country (32,375 turns of talk). Content analysis and statistical discourse analysis showed that within each lesson, groups with more low cognitions or more sequences of low cognition followed by high cognition added more essay words. Groups with more high cognitions, sequences of low cognition followed by low cognition, or sequences of high cognition followed by an action followed by low cognition, showed different words and sequences, suggestive of new ideas. The links between cognition sequences and group performance over time can inform facilitation and assessment of student discussions.

  3. Identification of the sequence motif of glycoside hydrolase 13 family members

    PubMed Central

    Kumar, Vikash

    2011-01-01

    A bioinformatics analysis of sequences of glycoside hydrolase (GH) 13 family members such as α-amylase, cyclodextrin glycosyltransferase (CGTase), branching enzyme and cyclomaltodextrinase has been carried out using hidden Markov model (HMM) profiles in order to find the sequence motifs that govern the reaction specificities of these enzymes. This analysis suggests the existence of such sequence motifs and that residues of these motifs constitute the −1 to +3 catalytic subsites of the enzyme. Hence, by introducing mutations in the residues of these four subsites, one can change the reaction specificities of the enzymes. In general, it has been observed that the α-amylase sequence motifs have lower sequence conservation than the rest of the motifs of the GH13 family members. PMID:21544166

  4. Library preparation and data analysis packages for rapid genome sequencing.

    PubMed

    Pomraning, Kyle R; Smith, Kristina M; Bredeweg, Erin L; Connolly, Lanelle R; Phatale, Pallavi A; Freitag, Michael

    2012-01-01

    High-throughput sequencing (HTS) has quickly become a valuable tool for comparative genetics and genomics and is now regularly carried out in laboratories that are not connected to large sequencing centers. Here we describe an updated version of our protocol for constructing single- and paired-end Illumina sequencing libraries, beginning with purified genomic DNA. The present protocol can also be used for "multiplexing," i.e. the analysis of several samples in a single flowcell lane by generating "barcoded" or "indexed" Illumina sequencing libraries in a way that is independent from Illumina-supported methods. To analyze sequencing results, we suggest several independent approaches but end users should be aware that this is a quickly evolving field and that currently many alignment (or "mapping") and counting algorithms are being developed and tested.

  5. A novel peptide from the ACEI/BPP-CNP precursor in the venom of Crotalus durissus collilineatus.

    PubMed

    Higuchi, Shigesada; Murayama, Nobuhiro; Saguchi, Ken-ichi; Ohi, Hiroaki; Fujita, Yoshiaki; da Silva, Nelson Jorge; de Siqueira, Rodrigo José Bezerra; Lahlou, Saad; Aird, Steven D

    2006-10-01

    In crotaline venoms, angiotensin-converting enzyme inhibitors [ACEIs, also known as bradykinin potentiating peptides (BPPs)], are products of a gene coding for an ACEI/BPP-C-type natriuretic peptide (CNP) precursor. In the genes from Bothrops jararaca and Gloydius blomhoffii, ACEI/BPP sequences are repeated. Sequencing of a cDNA clone from venom glands of Crotalus durissus collilineatus showed that two ACEIs/BPPs are located together at the N-terminus, but without repeats. An additional sequence for CNP was unexpectedly found at the C-terminus. Homologous genes for the ACEI/BPP-CNP precursor suggest that most crotaline venoms contain both ACEIs/BPPs and CNP. The sequence of ACEIs/BPPs is separated from the CNP sequence by a long spacer sequence. Previously, there was no evidence that this spacer actually coded any expressed peptides. Aird and Kaiser (1986, unpublished) previously isolated and sequenced a peptide of 11 residues (TPPAGPDVGPR) from Crotalus viridis viridis venom. In the present study, analysis of the cDNA clone from C. d. collilineatus revealed a nearly identical sequence in the ACEI/BPP-CNP spacer. Fractionation of the crude venom by reverse phase HPLC (C(18)), and analysis of the fractions by mass spectrometry (MS) indicated a component of 1020.5 Da. Amino acid sequencing by MS/MS confirmed that C. d. collilineatus venom contains the peptide TPPAGPDGGPR. Its high proline content and paired proline residues are typical of venom hypotensive peptides, although it lacks the usual N-terminal pyroglutamate. It has no demonstrable hypotensive activity when injected intravenously in rats; however, its occurrence in the venoms of dissimilar species suggests that its presence is not accidental. Evidence suggests that these novel toxins probably activate anaphylatoxin C3a receptors.
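
    The 1020.5 Da component reported above can be checked against the sequence TPPAGPDGGPR with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes the reported value is a neutral monoisotopic mass of an unmodified linear peptide, and it uses standard monoisotopic residue masses. The function and variable names are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the amino acids that occur
# in the two peptides discussed in the abstract.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "D": 115.02694, "R": 156.10111,
}
WATER = 18.01056  # one H2O accounts for the free N- and C-termini


def peptide_mass(sequence):
    """Return the neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


# Peptide confirmed by MS/MS in C. d. collilineatus venom.
mass_cdc = peptide_mass("TPPAGPDGGPR")
# Peptide previously isolated from C. v. viridis venom (Gly -> Val at position 8).
mass_cvv = peptide_mass("TPPAGPDVGPR")

print(f"TPPAGPDGGPR: {mass_cdc:.2f} Da")
print(f"TPPAGPDVGPR: {mass_cvv:.2f} Da")
```

    Under these assumptions the computed mass of TPPAGPDGGPR agrees with the 1020.5 Da component, and the single Gly-to-Val difference between the two peptides accounts for a shift of about 42 Da.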

  6. Extensive sequence analysis of CFTR, SCNN1A, SCNN1B, SCNN1G and SERPINA1 suggests an oligogenic basis for cystic fibrosis-like phenotypes.

    PubMed

    Ramos, M D; Trujillano, D; Olivar, R; Sotillo, F; Ossowski, S; Manzanares, J; Costa, J; Gartner, S; Oliva, C; Quintana, E; Gonzalez, M I; Vazquez, C; Estivill, X; Casals, T

    2014-07-01

    The term cystic fibrosis (CF)-like disease is used to describe patients with a borderline sweat test and suggestive CF clinical features but without two CFTR (cystic fibrosis transmembrane conductance regulator) mutations. We have performed the extensive molecular analysis of four candidate genes (SCNN1A, SCNN1B, SCNN1G and SERPINA1) in a cohort of 10 uncharacterized patients with CF and CF-like disease. We have used whole-exome sequencing to characterize mutations in the CFTR gene and these four candidate genes. CFTR molecular analysis allowed a complete characterization of three of four CF patients. Candidate variants in SCNN1A, SCNN1B, SCNN1G and SERPINA1 in six patients with CF-like phenotypes were confirmed by Sanger sequencing and were further supported by in silico predictive analysis, pedigree studies, sweat test in other family members, and analysis in CF patients and healthy subjects. Our results suggest that CF-like disease probably results from complex genotypes in several genes in an oligogenic form, with rare variants interacting with environmental factors. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  7. Determination of the sequences of protein-derived peptides and peptide mixtures by mass spectrometry

    PubMed Central

    Morris, Howard R.; Williams, Dudley H.; Ambler, Richard P.

    1971-01-01

    Micro-quantities of protein-derived peptides have been converted into N-acetylated permethyl derivatives, and their sequences determined by low-resolution mass spectrometry without prior knowledge of their amino acid compositions or lengths. A new strategy is suggested for the mass spectrometric sequencing of oligopeptides or proteins, involving gel filtration of protein hydrolysates and subsequent sequence analysis of peptide mixtures. Finally, results are given that demonstrate for the first time the use of mass spectrometry for the analysis of a protein-derived peptide mixture, again without prior knowledge of the protein or components within the mixture. PMID:5158904

  8. JGI Fungal Genomics Program

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grigoriev, Igor V.

    2011-03-14

    Genomes of energy and environment fungi are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 50 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such 'parts' suggested by comparative genomics and functional analysis in these areas are presented here.

  9. Functional identification and regulatory analysis of Δ6-fatty acid desaturase from the oleaginous fungus Mucor sp. EIM-10.

    PubMed

    Jiang, Xianzhang; Liu, Hongjiao; Niu, Yongchao; Qi, Feng; Zhang, Mingliang; Huang, Jianzhong

    2017-03-01

    To enlarge the diversity of the desaturases associated with PUFA biosynthesis and to better understand the transcriptional regulation of desaturases, a Δ6-desaturase gene (Md6) from Mucor sp. and its 5'-upstream sequence was functionally identified in Saccharomyces cerevisiae. Expression of the Δ6-fatty acid desaturase (Md6) in S. cerevisiae showed that Md6 could convert linolenic acid to γ-linolenic acid. Computational analysis of the promoter of Md6 suggested it contains several eukaryotic fundamental transcription regulatory elements. In vivo functional analysis of the promoter showed the 5'-upstream sequence of Md6 could initiate expression of GFP and Md6 itself in S. cerevisiae. A series deletion analysis of the promoter suggested that sequence between -919 to -784 bp (relative to start site) named as eMd6 is the key factor for high activity of Δ6-desaturase. The activity of Δ6-desaturase was increased by 2.8-fold and 2.5-fold when the eMd6 sequence was placed upstream of -434 with forward or reverse orientations respectively. To our best knowledge, the native promoter of Md6 from Mucor is the strongest promoter for Δ6-desaturase reported so far and the sequence between -919 to -784 bp is an enhancer for Δ6-desaturase activity.

  10. Comparative sequence analysis suggests a conserved gating mechanism for TRP channels

    PubMed Central

    Palovcak, Eugene; Delemotte, Lucie; Klein, Michael L.

    2015-01-01

    The transient receptor potential (TRP) channel superfamily plays a central role in transducing diverse sensory stimuli in eukaryotes. Although dissimilar in sequence and domain organization, all known TRP channels act as polymodal cellular sensors and form tetrameric assemblies similar to those of their distant relatives, the voltage-gated potassium (Kv) channels. Here, we investigated the related questions of whether the allosteric mechanism underlying polymodal gating is common to all TRP channels, and how this mechanism differs from that underpinning Kv channel voltage sensitivity. To provide insight into these questions, we performed comparative sequence analysis on large, comprehensive ensembles of TRP and Kv channel sequences, contextualizing the patterns of conservation and correlation observed in the TRP channel sequences in light of the well-studied Kv channels. We report sequence features that are specific to TRP channels and, based on insight from recent TRPV1 structures, we suggest a model of TRP channel gating that differs substantially from the one mediating voltage sensitivity in Kv channels. The common mechanism underlying polymodal gating involves the displacement of a defect in the H-bond network of S6 that changes the orientation of the pore-lining residues at the hydrophobic gate. PMID:26078053

  11. Using information content and base frequencies to distinguish mutations from genetic polymorphisms in splice junction recognition sites.

    PubMed

    Rogan, P K; Schneider, T D

    1995-01-01

    Predicting the effects of nucleotide substitutions in human splice sites has been based on analysis of consensus sequences. We used a graphic representation of sequence conservation and base frequency, the sequence logo, to demonstrate that a change in a splice acceptor of hMSH2 (a gene associated with familial nonpolyposis colon cancer) probably does not reduce splicing efficiency. This confirms a population genetic study that suggested that this substitution is a genetic polymorphism. The information theory-based sequence logo is quantitative and more sensitive than the corresponding splice acceptor consensus sequence for detection of true mutations. Information analysis may potentially be used to distinguish polymorphisms from mutations in other types of transcriptional, translational, or protein-coding motifs.
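
    The per-position information measure behind a sequence logo can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it computes 2 bits minus the Shannon entropy of the observed base frequencies at each position and omits any small-sample correction. The toy alignment and the function name are hypothetical.

```python
import math
from collections import Counter


def information_content(aligned_sites):
    """Per-position information (bits) for equal-length, gap-free DNA sites.

    Uses 2 - H, where H is the Shannon entropy of the observed base
    frequencies; no small-sample correction is applied.
    """
    length = len(aligned_sites[0])
    bits = []
    for pos in range(length):
        counts = Counter(site[pos] for site in aligned_sites)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        bits.append(2.0 - entropy)
    return bits


# Hypothetical toy alignment (not real splice acceptor data).
toy_sites = ["CAGG", "TAGG", "CAGA", "TAGG"]
bits = information_content(toy_sites)
print([round(b, 3) for b in bits])
```

    Invariant positions carry the full 2 bits, while a position split evenly between two bases carries 1 bit. A substitution at a low-information position is therefore less likely to disrupt recognition than one at a highly conserved position, which is the intuition the abstract relies on.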

  12. Complete sequence determination of a novel reptile iridovirus isolated from soft-shelled turtle and evolutionary analysis of Iridoviridae

    PubMed Central

    Huang, Youhua; Huang, Xiaohong; Liu, Hong; Gong, Jie; Ouyang, Zhengliang; Cui, Huachun; Cao, Jianhao; Zhao, Yingtao; Wang, Xiujie; Jiang, Yulin; Qin, Qiwei

    2009-01-01

    Background Soft-shelled turtle iridovirus (STIV) is the causative agent of severe systemic diseases in cultured soft-shelled turtles (Trionyx sinensis). To our knowledge, the only molecular information available on STIV mainly concerns the highly conserved STIV major capsid protein. The complete sequence of the STIV genome is not yet available. Therefore, determining the genome sequence of STIV and providing a detailed bioinformatic analysis of its genome content and evolution status will facilitate further understanding of the taxonomic elements of STIV and the molecular mechanisms of reptile iridovirus pathogenesis. Results We determined the complete nucleotide sequence of the STIV genome using 454 Life Science sequencing technology. The STIV genome is 105 890 bp in length with a base composition of 55.1% G+C. Computer assisted analysis revealed that the STIV genome contains 105 potential open reading frames (ORFs), which encode polypeptides ranging from 40 to 1,294 amino acids and 20 microRNA candidates. Among the putative proteins, 20 share homology with the ancestral proteins of the nuclear and cytoplasmic large DNA viruses (NCLDVs). Comparative genomic analysis showed that STIV has the highest degree of sequence conservation and a colinear arrangement of genes with frog virus 3 (FV3), followed by Tiger frog virus (TFV), Ambystoma tigrinum virus (ATV), Singapore grouper iridovirus (SGIV), Grouper iridovirus (GIV) and other iridovirus isolates. Phylogenetic analysis based on conserved core genes and complete genome sequence of STIV with other virus genomes was performed. Moreover, analysis of the gene gain-and-loss events in the family Iridoviridae suggested that the genes encoded by iridoviruses have evolved for favoring adaptation to different natural host species. Conclusion This study has provided the complete genome sequence of STIV. 
Phylogenetic analysis suggested that STIV and FV3 are strains of the same viral species belonging to the Ranavirus genus in the Iridoviridae family. Given virus-host co-evolution and the phylogenetic relationship among vertebrates from fish to reptiles, we propose that iridovirus might transmit between reptiles and amphibians and that STIV and FV3 are strains of the same viral species in the Ranavirus genus. PMID:19439104

  13. Identification of human-to-human transmissibility factors in PB2 proteins of influenza A by large-scale mutual information analysis

    PubMed Central

    Miotto, Olivo; Heiny, AT; Tan, Tin Wee; August, J Thomas; Brusic, Vladimir

    2008-01-01

    Background The identification of mutations that confer unique properties to a pathogen, such as host range, is of fundamental importance in the fight against disease. This paper describes a novel method for identifying amino acid sites that distinguish specific sets of protein sequences, by comparative analysis of matched alignments. The use of mutual information to identify distinctive residues responsible for functional variants makes this approach highly suitable for analyzing large sets of sequences. To support mutual information analysis, we developed the AVANA software, which utilizes sequence annotations to select sets for comparison, according to user-specified criteria. The method presented was applied to an analysis of influenza A PB2 protein sequences, with the objective of identifying the components of adaptation to human-to-human transmission, and reconstructing the mutation history of these components. Results We compared over 3,000 PB2 protein sequences of human-transmissible and avian isolates, to produce a catalogue of sites involved in adaptation to human-to-human transmission. This analysis identified 17 characteristic sites, five of which have been present in human-transmissible strains since the 1918 Spanish flu pandemic. Sixteen of these sites are located in functional domains, suggesting they may play functional roles in host-range specificity. The catalogue of characteristic sites was used to derive sequence signatures from historical isolates. These signatures, arranged in chronological order, reveal an evolutionary timeline for the adaptation of the PB2 protein to human hosts. Conclusion By providing the most complete elucidation to date of the functional components participating in PB2 protein adaptation to humans, this study demonstrates that mutual information is a powerful tool for comparative characterization of sequence sets. 
In addition to confirming previously reported findings, several novel characteristic sites within PB2 are reported. Sequence signatures generated using the characteristic sites catalogue characterize concisely the adaptation characteristics of individual isolates. Evolutionary timelines derived from signatures of early human influenza isolates suggest that characteristic variants emerged rapidly, and remained remarkably stable through subsequent pandemics. In addition, the signatures of human-infecting H5N1 isolates suggest that this avian subtype has low pandemic potential at present, although it presents more human adaptation components than most avian subtypes. PMID:18315849
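
    The idea of scoring an alignment column by its mutual information with a group label (for example, human-transmissible versus avian) can be illustrated as follows. This is a sketch of the textbook definition only; it is not the AVANA software, whose internals are not described here, and the toy columns, labels, and function name are hypothetical.

```python
import math
from collections import Counter


def mutual_information(residues, labels):
    """Mutual information (bits) between one alignment column and group labels.

    residues[i] is the residue of sequence i at the column of interest and
    labels[i] is the group that sequence i belongs to.
    """
    n = len(residues)
    joint = Counter(zip(residues, labels))
    residue_counts = Counter(residues)
    label_counts = Counter(labels)
    mi = 0.0
    for (res, lab), count in joint.items():
        p_joint = count / n
        # p_joint / (p_res * p_lab), written with raw counts
        ratio = count * n / (residue_counts[res] * label_counts[lab])
        mi += p_joint * math.log2(ratio)
    return mi


# Hypothetical toy data: four sequences per group.
labels = ["human"] * 4 + ["avian"] * 4
discriminating_column = list("KKKKEEEE")   # residue fully determined by group
uninformative_column = list("KEKEKEKE")    # residue independent of group

mi_high = mutual_information(discriminating_column, labels)
mi_low = mutual_information(uninformative_column, labels)
print(mi_high, mi_low)
```

    A column whose residue is fully determined by the group reaches the maximum of 1 bit for two equally sized groups, while a column that varies independently of the group scores 0. Ranking columns by this score is one simple way to shortlist candidate characteristic sites.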

  14. Molecular and comparative analysis of Salmonella enterica Senftenberg from humans and animals using PFGE, MLST and NARMS.

    PubMed

    Stepan, Ryan M; Sherwood, Julie S; Petermann, Shana R; Logue, Catherine M

    2011-06-27

    Salmonella species are recognized worldwide as a significant cause of human and animal disease. In this study the molecular profiles and characteristics of Salmonella enterica Senftenberg isolated from human cases of illness and those recovered from healthy or diagnostic cases in animals were assessed. Included in the study was a comparison with our own sequenced strain of S. Senftenberg recovered from production turkeys in North Dakota. Isolates examined in this study were subjected to antimicrobial susceptibility profiling using the National Antimicrobial Resistance Monitoring System (NARMS) panel which tested susceptibility to 15 different antimicrobial agents. The molecular profiles of all isolates were determined using Pulsed Field Gel Electrophoresis (PFGE) and the sequence types of the strains were obtained using Multi-Locus Sequence Type (MLST) analysis based on amplification and sequence interrogation of seven housekeeping genes (aroC, dnaN, hemD, hisD, purE, sucA, and thrA). PFGE data was input into BioNumerics analysis software to generate a dendrogram of relatedness among the strains. The study found 93 profiles among 98 S. Senftenberg isolates tested and there were primarily two sequence types associated with humans and animals (ST185 and ST14) with overlap observed in all host types suggesting that the distribution of S. Senftenberg sequence types is not host dependent. Antimicrobial resistance was observed among the animal strains, however no resistance was detected in human isolates suggesting that animal husbandry has a significant influence on the selection and promotion of antimicrobial resistance. The data demonstrates the circulation of at least two strain types in both animal and human health suggesting that S. Senftenberg is relatively homogeneous in its distribution. The data generated in this study could be used towards defining a pathotype for this serovar.

  15. Toward a method for tracking virus evolutionary trajectory applied to the pandemic H1N1 2009 influenza virus.

    PubMed

    Squires, R Burke; Pickett, Brett E; Das, Sajal; Scheuermann, Richard H

    2014-12-01

    In 2009 a novel pandemic H1N1 influenza virus (H1N1pdm09) emerged as the first official influenza pandemic of the 21st century. Early genomic sequence analysis pointed to the swine origin of the virus. Here we report a novel computational approach to determine the evolutionary trajectory of viral sequences that uses data-driven estimations of nucleotide substitution rates to track the gradual accumulation of observed sequence alterations over time. Phylogenetic analysis and multiple sequence alignments show that sequences belonging to the resulting evolutionary trajectory of the H1N1pdm09 lineage exhibit a gradual accumulation of sequence variations and tight temporal correlations in the topological structure of the phylogenetic trees. These results suggest that our evolutionary trajectory analysis (ETA) can more effectively pinpoint the evolutionary history of viruses, including the host and geographical location traversed by each segment, when compared against either BLAST or traditional phylogenetic analysis alone. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. Fungal Genomics for Energy and Environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grigoriev, Igor V.

    2013-03-11

    Genomes of fungi relevant to energy and environment are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). One of its projects, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts) by means of genome sequencing and analysis. New chapters of the Encyclopedia can be opened with user proposals to the JGI Community Sequencing Program (CSP). Another JGI project, the 1000 fungal genomes, explores fungal diversity on genome level at scale and is open for users to nominate new species for sequencing. Over 200 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.

  17. Genome sequence diversity and clues to the evolution of variola (smallpox) virus.

    PubMed

    Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M

    2006-08-11

    Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.

  18. Cloning and sequence analysis of Hemonchus contortus HC58cDNA.

    PubMed

    Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li

    2007-06-01

    The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with open reading frame of 717 bp, precursors to 239 amino acids coding for approximately 27 kDa protein. Analysis of amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, occluding loop pattern, hemoglobinase motif and glutamine of the oxyanion hole characteristic of cathepsin B like proteases (CBL). Comparison of the predicted amino acid sequences showed the protein shared 33.5-58.7% identity to cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.

  19. Genetic analysis of tumorigenesis: XXXII. Localization of constitutionally amplified KRAS sequences to Chinese hamster chromosomes X and Y by in situ hybridization.

    PubMed

    Stenman, G; Anisowicz, A; Sager, R

    1988-11-01

    The KRAS gene is constitutionally amplified in the Chinese hamster. We have mapped the amplified sequences by in situ hybridization to two major sites on the X and Y chromosomes, Xq4 and Yp2. No autosomal site was detected despite a search under relaxed hybridization conditions. KRAS DNA is amplified about 50-fold compared to a human cell line known to have a diploid number of KRAS sequences, whereas mRNA expression is 5- to 10-fold lower than in normal human cells. While mRNA expression levels do not necessarily parallel gene copy number, the low expression level strongly suggests that the amplified sequences are transcriptionally silent. It is suggested that the amplified sequences arose from the original KRAS gene on chromosome 8 and that the KRAS sequences on the Y chromosome arose by X-Y recombination.

  20. Isolation and cloning of a metalloproteinase from king cobra snake venom.

    PubMed

    Guo, Xiao-Xi; Zeng, Lin; Lee, Wen-Hui; Zhang, Yun; Jin, Yang

    2007-06-01

    A 50 kDa fibrinogenolytic protease, ohagin, from the venom of Ophiophagus hannah was isolated by a combination of gel filtration, ion-exchange and heparin affinity chromatography. Ohagin specifically degraded the alpha-chain of human fibrinogen and the proteolytic activity was completely abolished by EDTA, but not by PMSF, suggesting it is a metalloproteinase. It dose-dependently inhibited platelet aggregation induced by ADP, TMVA and stejnulxin. The full sequence of ohagin was deduced by cDNA cloning and confirmed by protein sequencing and peptide mass fingerprinting. The full-length cDNA sequence of ohagin encodes an open reading frame of 611 amino acids that includes signal peptide, proprotein and mature protein comprising metalloproteinase, disintegrin-like and cysteine-rich domains, suggesting it belongs to P-III class metalloproteinase. In addition, P-III class metalloproteinases from the venom glands of Naja atra, Bungarus multicinctus and Bungarus fasciatus were also cloned in this study. Sequence analysis and phylogenetic analysis indicated that metalloproteinases from elapid snake venoms form a new subgroup of P-III SVMPs.

  1. Structural analysis of the rDNA intergenic spacer of Brassica nigra: evolutionary divergence of the spacers of the three diploid Brassica species.

    PubMed

    Bhatia, S; Singh Negi, M; Lakshmikumaran, M

    1996-11-01

    EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families-RF 'A' and RF 'B.' The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21-bp in size and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species-namely, B. nigra, B. campestris, and B. oleracea-was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other as the repeat families in both showed high sequence homology between each other. Second, the repeat elements in both the species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea. 
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.

  2. Single-Cell RNA-Sequencing: Assessment of Differential Expression Analysis Methods.

    PubMed

    Dal Molin, Alessandra; Baruzzo, Giacomo; Di Camillo, Barbara

    2017-01-01

    The sequencing of the transcriptomes of single cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types and for the study of stochastic gene expression. In recent years, various tools for analyzing single-cell RNA-sequencing data have been proposed, many of them with the purpose of performing differential expression analysis. In this work, we compare four different tools for single-cell RNA-sequencing differential expression, together with two popular methods originally developed for the analysis of bulk RNA-sequencing data, but largely applied to single-cell data. We discuss results obtained on two real and one synthetic dataset, along with considerations about the perspectives of single-cell differential expression analysis. In particular, we explore the methods' performance in four different scenarios, mimicking different unimodal or bimodal distributions of the data, as characteristic of single-cell transcriptomics. We observed marked differences between the selected methods in terms of precision and recall, the number of detected differentially expressed genes and the overall performance. Globally, the results obtained in our study suggest that it is difficult to identify a best performing tool and that efforts are needed to improve the methodologies for single-cell RNA-sequencing data analysis and gain better accuracy of results.
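
    Precision and recall, the two measures used above to compare methods, can be computed from a tool's list of called genes and a known truth set, as is available for a synthetic dataset. The sketch below is a generic illustration rather than the evaluation code of the study; the gene identifiers and the function name are hypothetical.

```python
def precision_recall(called_genes, true_genes):
    """Return (precision, recall) for a set of genes called differentially expressed.

    precision = true positives / all called genes
    recall    = true positives / all truly differentially expressed genes
    """
    called, truth = set(called_genes), set(true_genes)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical toy example: four truly differential genes, three calls.
truth = ["g1", "g2", "g3", "g4"]
calls = ["g1", "g2", "g5"]

precision, recall = precision_recall(calls, truth)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

    A conservative tool tends to score high on precision and low on recall, and a permissive one the reverse, which is one reason a single best performing method is hard to name.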

  3. Genomic Encyclopedia of Fungi

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grigoriev, Igor

    Genomes of fungi relevant to energy and environment are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 150 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.

  4. Complete Genome Sequence of a Rhodococcus Species Isolated from the Winter Skate Leucoraja ocellata.

    PubMed

    Wiens, Julia; Ho, Ryan; Fernando, Dinesh; Kumar, Ayush; Loewen, Peter C; Brassinga, Ann Karen C; Anderson, W Gary

    2016-09-01

    We report here a genome sequence for Rhodococcus sp. isolate UM008 isolated from the renal/interrenal tissue of the winter skate Leucoraja ocellata. Genome sequence analysis suggests that Rhodococcus bacteria may act in a novel mutualistic relationship with their elasmobranch host, serving as biocatalysts in the steroidogenic pathway of 1α-hydroxycorticosterone. Copyright © 2016 Wiens et al.

  5. Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

    PubMed Central

    Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

    2010-01-01

    Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665

  6. Fast Dissemination of New HIV-1 CRF02/A1 Recombinants in Pakistan

    PubMed Central

    Chen, Yue; Hora, Bhavna; DeMarco, Todd; Shah, Sharaf Ali; Ahmed, Manzoor; Sanchez, Ana M.; Su, Chang; Carter, Meredith; Stone, Mars; Hasan, Rumina; Hasan, Zahra; Busch, Michael P.; Denny, Thomas N.; Gao, Feng

    2016-01-01

    A number of HIV-1 subtypes have been identified in Pakistan by characterization of partial viral gene sequences. Little is known about whether new recombinants are generated and how they disseminate, since whole genome sequences for these viruses have not been characterized. Near full-length genome (NFLG) sequences were obtained by amplifying two overlapping half genomes or by next generation sequencing from 34 HIV-1-infected individuals in Pakistan. Phylogenetic tree analysis showed that the newly characterized sequences were 16 subtype As, one subtype C, and 17 A/G recombinants. Further analysis showed that all 16 subtype A1 sequences (47%), together with the vast majority of sequences from Pakistan from other studies, formed a tight subcluster (A1a) within the subtype A1 clade, suggesting that they were derived from a single introduction. More in-depth analysis of the 17 A/G NFLG sequences showed that five shared similar recombination breakpoints as in CRF02 (15%) but were phylogenetically distinct from the prototype CRF02 by forming a tight subcluster (CRF02a), while 12 (38%) were new recombinants between CRF02a and A1a or divergent A1b viruses. Unique recombination patterns among the majority of the newly characterized recombinants indicated ongoing recombination. Interestingly, recombination breakpoints in these CRF02/A1 recombinants were similar to those in prototype CRF02 viruses, indicating that recombination at these sites is more likely to generate variable recombinant viruses. The dominance and fast dissemination of new CRF02a/A1 recombinants over prototype CRF02 suggest that these recombinants are better adapted and may become major epidemic strains in Pakistan. PMID:27973597

  7. Complete genome sequence analysis identifies a new genotype of brassica yellows virus that infects cabbage and radish in China.

    PubMed

    Zhang, Xiao-Yan; Xiang, Hai-Ying; Zhou, Cui-Ji; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui

    2014-08-01

    For brassica yellows virus (BrYV), proposed to be a member of a new polerovirus species, two clearly distinct genotypes (BrYV-A and BrYV-B) have been described. In this study, the complete nucleotide sequences of two BrYV isolates from radish and Chinese cabbage were determined. Sequence analysis suggested that these isolates represent a new genotype, referred to here as BrYV-C. The full-length sequences of the two BrYV-C isolates shared 93.4-94.8 % identity with BrYV-A and BrYV-B. Further phylogenetic analysis showed that the BrYV-C isolates formed a subgroup that was distinct from the BrYV-A and BrYV-B isolates based on all of the proteins except P5.
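Identity figures such as the 93.4-94.8 % quoted above come from comparing aligned sequences position by position. The sketch below shows one common convention, assumed here for illustration only: both sequences are already aligned to equal length, and columns containing a gap (`-`) in either sequence are excluded from the denominator. Alignment tools differ in how they treat gaps, so the abstract's values need not follow this exact definition; `percent_identity` is a hypothetical helper name.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns with a gap in either sequence are skipped (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For example, two aligned 10-nt sequences that differ at a single ungapped position score 90.0.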

  8. Evaluating the protein coding potential of exonized transposable element sequences

    PubMed Central

    Piriyapongsa, Jittima; Rutledge, Mark T; Patel, Sanil; Borodovsky, Mark; Jordan, I King

    2007-01-01

    Background: Transposable element (TE) sequences, once thought to be merely selfish or parasitic members of the genomic community, have been shown to contribute a wide variety of functional sequences to their host genomes. Analysis of complete genome sequences has turned up numerous cases where TE sequences have been incorporated as exons into mRNAs, and it is widely assumed that such 'exonized' TEs encode protein sequences. However, the extent to which TE-derived sequences actually encode proteins is unknown and a matter of some controversy. We have tried to address this outstanding issue from two perspectives: (i) by evaluating ascertainment biases related to the search methods used to uncover TE-derived protein coding sequences (CDS) and (ii) through a probabilistic codon-frequency based analysis of the protein coding potential of TE-derived exons. Results: We compared the ability of three classes of sequence similarity search methods to detect TE-derived sequences among data sets of experimentally characterized proteins: (1) a profile-based hidden Markov model (HMM) approach, (2) BLAST methods and (3) RepeatMasker. Profile-based methods are more sensitive and more selective than the other methods evaluated. However, the application of profile-based search methods to the detection of TE-derived sequences among well-curated experimentally characterized protein data sets did not turn up many more cases than had been previously detected and nowhere near as many cases as recent genome-wide searches have. We observed that the different search methods used were complementary in the sense that they yielded largely non-overlapping sets of hits and differed in their ability to recover known cases of TE-derived CDS. The probabilistic analysis of TE-derived exon sequences indicates that these sequences have low protein coding potential on average. In particular, non-autonomous TEs that do not encode protein sequences, such as Alu elements, are frequently exonized but unlikely to encode protein sequences. Conclusion: The exaptation of the numerous TE sequences found in exons as bona fide protein coding sequences may prove to be far less common than has been suggested by the analysis of complete genomes. We hypothesize that many exonized TE sequences actually function as post-transcriptional regulators of gene expression, rather than coding sequences, which may act through a variety of double stranded RNA related regulatory pathways. Indeed, their relatively high copy numbers and similarity to sequences dispersed throughout the genome suggest that exonized TE sequences could serve as master regulators with a wide scope of regulatory influence. Reviewers: This article was reviewed by Itai Yanai, Kateryna D. Makova, Melissa Wilson (nominated by Kateryna D. Makova) and Cedric Feschotte (nominated by John M. Logsdon Jr.). PMID:18036258

  9. Complete mitochondrial DNA sequence of a tadpole shrimp (Triops cancriformis) and analysis of museum samples.

    PubMed

    Umetsu, Kazuo; Iwabuchi, Naruki; Yuasa, Isao; Saitou, Naruya; Clark, Paul F; Boxshall, Geoff; Osawa, Motoki; Igarashi, Keiji

    2002-12-01

    The complete mitochondrial DNA (mtDNA) of the tadpole shrimp Triops cancriformis was sequenced. The sequence consisted of 15,101 bp with an A+T content of 69%. Its gene arrangement was identical to that of the water flea (Daphnia pulex) and giant tiger prawn (Penaeus monodon), whereas it differed from that of the brine shrimp (Artemia franciscana) in the arrangement of its genes for tRNAs. Phylogenetic analysis revealed T. cancriformis to be more closely related to the water flea than to the brine shrimp and giant tiger prawn. We also compared the 16S rRNA sequences of five formalin-fixed tadpole shrimps that had been collected in five different locations and stored in a museum. The sequence divergence was in the range of 0-1.51%, suggesting that those samples were closely related to each other.

  10. Mansonella ozzardi mitogenome and pseudogene characterisation provides new perspectives on filarial parasite systematics and CO-1 barcoding.

    PubMed

    Crainey, James Lee; Marín, Michel Abanto; Silva, Túllio Romão Ribeiro da; de Medeiros, Jansen Fernandes; Pessoa, Felipe Arley Costa; Santos, Yago Vinícius; Vicente, Ana Carolina Paulo; Luz, Sérgio Luiz Bessa

    2018-04-18

    Despite the broad distribution of M. ozzardi in Latin America and the Caribbean, there is still very little DNA sequence data available to study this neglected parasite's epidemiology. Mitochondrial DNA (mtDNA) sequences, especially the cytochrome oxidase (CO1) gene's barcoding region, have been targeted successfully for filarial diagnostics and for epidemiological, ecological and evolutionary studies. MtDNA-based studies can, however, be compromised by unrecognised mitochondrial pseudogenes, such as Numts. Here, we have used shot-gun Illumina-HiSeq sequencing to recover the first complete Mansonella genus mitogenome and to identify several mitochondrial-origin pseudogenes. Mitogenome phylogenetic analysis placed M. ozzardi in the Onchocercidae "ONC5" clade and suggested that Mansonella parasites are more closely related to Wuchereria and Brugia genera parasites than they are to Loa genus parasites. DNA sequence alignments, BLAST searches and conceptual translations have been used to complement phylogenetic analysis showing that M. ozzardi from the Amazon and Caribbean regions are near-identical and that previously reported Peruvian M. ozzardi CO1 reference sequences are probably of pseudogene origin. In addition to adding a much-needed resource to the Mansonella genus's molecular tool-kit and providing evidence that some M. ozzardi CO1 sequence deposits are pseudogenes, our results suggest that all Neotropical M. ozzardi parasites are closely related.

  11. Effects of the Laramide Structures on the Regional Distribution of Tight-Gas Sandstone in the Upper Mesaverde Group, Uinta Basin, Utah

    NASA Astrophysics Data System (ADS)

    Sitaula, R. P.; Aschoff, J.

    2013-12-01

    Regional-scale sequence stratigraphic correlation, well log analysis, syntectonic unconformity mapping, isopach maps, and depositional environment maps of the upper Mesaverde Group (UMG) in Uinta basin, Utah suggest higher accommodation in northeastern part (Natural Buttes area) and local development of lacustrine facies due to increased subsidence caused by uplift of San Rafael Swell (SRS) in southern and Uinta Uplift in northern parts. Recently discovered lacustrine facies in Natural Buttes area are completely different than the dominant fluvial facies in outcrops along Book Cliffs and could have implications for significant amount of tight-gas sand production from this area. Data used for sequence stratigraphic correlation, isopach maps and depositional environmental maps include > 100 well logs, 20 stratigraphic profiles, 35 sandstone thin sections and 10 outcrop-based gamma ray profiles. Seven 4th order depositional sequences (~0.5 my duration) are identified and correlated within UMG. Correlation was constructed using a combination of fluvial facies and stacking patterns in outcrops, chert-pebble conglomerates and tidally influenced strata. These surfaces were extrapolated into subsurface by matching GR profiles. GR well logs and core log of Natural Buttes area show intervals of coarsening upward patterns suggesting possible lacustrine intervals that might contain high TOC. Locally, younger sequences are completely truncated across SRS whereas older sequences are truncated and thinned toward SRS. The cycles of truncation and thinning represent phases of SRS uplift. Thinning possibly related with the Uinta Uplift is also observed in northwestern part. Paleocurrents are consistent with interpretation of periodic segmentation and deflection of sedimentation. Regional paleocurrents are generally E-NE-directed in Sequences 1-4, and N-directed in Sequences 5-7. 
From isopach maps and paleocurrent direction it can be interpreted that uplift of SRS changed route of sediment supply from west to southwest. Locally, paleocurrents are highly variable near SRS further suggesting UMG basin-fill was partitioned by uplift of SRS. Sandstone composition analysis also suggests the uplift of SRS causing the variation of source rocks in upper sequences than the lower sequences. In conclusion, we suggest that Uinta basin was episodically partitioned during the deposition of UMG due to uplift of Laramide structures in the basin and accommodation was localized in northeastern part. Understanding of structural controls on accommodation, sedimentation patterns and depositional environments will aid prediction of the best-producing gas reservoirs.

  12. Within-genome evolution of REPINs: a new family of miniature mobile DNA in bacteria.

    PubMed

    Bertels, Frederic; Rainey, Paul B

    2011-06-01

    Repetitive sequences are a conserved feature of many bacterial genomes. While first reported almost thirty years ago, and frequently exploited for genotyping purposes, little is known about their origin, maintenance, or processes affecting the dynamics of within-genome evolution. Here, beginning with analysis of the diversity and abundance of short oligonucleotide sequences in the genome of Pseudomonas fluorescens SBW25, we show that over-represented short sequences define three distinct groups (GI, GII, and GIII) of repetitive extragenic palindromic (REP) sequences. Patterns of REP distribution suggest that closely linked REP sequences form a functional replicative unit: REP doublets are over-represented, randomly distributed in extragenic space, and more highly conserved than singlets. In addition, doublets are organized as inverted repeats, which together with intervening spacer sequences are predicted to form hairpin structures in ssDNA or mRNA. We refer to these newly defined entities as REPINs (REP doublets forming hairpins) and identify short reads from population sequencing that reveal putative transposition intermediates. The proximal relationship between GI, GII, and GIII REPINs and specific REP-associated tyrosine transposases (RAYTs), combined with features of the putative transposition intermediate, suggests a mechanism for within-genome dissemination. Analysis of the distribution of REPs in a range of RAYT-containing bacterial genomes, including Escherichia coli K-12 and Nostoc punctiforme, show that REPINs are a widely distributed, but hitherto unrecognized, family of miniature non-autonomous mobile DNA.
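The starting point described in this abstract, looking for over-represented short oligonucleotides, can be sketched as a k-mer count compared against a simple expectation. This is an illustration only: the null model here (all four bases equally likely, so each k-mer is expected (L - k + 1) / 4^k times) is an assumption chosen for brevity and is not taken from the paper, and the function names and the `min_ratio` threshold are hypothetical.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def overrepresented(seq: str, k: int, min_ratio: float = 5.0) -> dict:
    """Return {k-mer: observed/expected} for k-mers whose count is at
    least min_ratio times the expectation under a uniform-base null
    model (an assumption; real genomes have skewed composition)."""
    counts = kmer_counts(seq, k)
    expected = (len(seq) - k + 1) / 4 ** k
    return {
        kmer: count / expected
        for kmer, count in counts.items()
        if count / expected >= min_ratio
    }
```

On a real genome a composition-aware null (for example, shuffled sequences or a Markov background) would give a more meaningful expectation than the uniform model used here.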

  13. Rapid Detection & Identification of Bacillus Species using MALDI-TOF/TOF and Biomarker Database

    DTIC Science & Technology

    2006-06-01

    rRNA sequence analysis. Multilocus enzyme electrophoresis (MEE) and comparative DNA sequence analysis suggest that they may represent a single species...adaptation of the MEE method [63] but with greater discrimination [64]. All of these new PCR-based subtyping methods are certainly superior and more...Demirev, P.A., Lin, J.S., Pineda, F.J., and Fenselau, C. (2001). Bioinformatics and mass spectrometry for microorganism identification: proteome-wide

  14. Prefrontal neural correlates of memory for sequences.

    PubMed

    Averbeck, Bruno B; Lee, Daeyeol

    2007-02-28

    The sequence of actions appropriate to solve a problem often needs to be discovered by trial and error and recalled in the future when faced with the same problem. Here, we show that when monkeys had to discover and then remember a sequence of decisions across trials, ensembles of prefrontal cortex neurons reflected the sequence of decisions the animal would make throughout the interval between trials. This signal could reflect either an explicit memory process or a sequence-planning process that begins far in advance of the actual sequence execution. This finding extended to error trials such that, when the neural activity during the intertrial interval specified the wrong sequence, the animal also attempted to execute an incorrect sequence. More specifically, we used a decoding analysis to predict the sequence the monkey was planning to execute at the end of the fore-period, just before sequence execution. When this analysis was applied to error trials, we were able to predict where in the sequence the error would occur, up to three movements into the future. This suggests that prefrontal neural activity can retain information about sequences between trials, and that regardless of whether information is remembered correctly or incorrectly, the prefrontal activity veridically reflects the animal's action plan.

  15. Integer sequence discovery from small graphs

    PubMed Central

    Hoppe, Travis; Petrone, Anna

    2015-01-01

    We have exhaustively enumerated all simple, connected graphs of a finite order and have computed a selection of invariants over this set. Integer sequences were constructed from these invariants and checked against the Online Encyclopedia of Integer Sequences (OEIS). 141 new sequences were added and six sequences were extended. From the graph database, we were able to programmatically suggest relationships among the invariants. It will be shown that we can readily visualize any sequence of graphs with a given criteria. The code has been released as an open-source framework for further analysis and the database was constructed to be extensible to invariants not considered in this work. PMID:27034526
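The idea of building an integer sequence from a graph invariant can be shown on the simplest case: counting connected simple graphs on n vertices up to isomorphism. The sketch below is a brute-force illustration, not the authors' released framework; it tries every edge subset and every vertex relabeling, so it is only practical for very small n, and all function names are hypothetical.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """Depth-first search from vertex 0; True if all n vertices are reached."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n


def canonical_form(n, edges):
    """Smallest sorted edge list over all vertex relabelings (brute force)."""
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(sorted(
            tuple(sorted((perm[a], perm[b]))) for a, b in edges
        ))
        if best is None or relabeled < best:
            best = relabeled
    return best


def count_connected_graphs(n):
    """Number of connected simple graphs on n vertices, up to isomorphism."""
    all_edges = list(combinations(range(n), 2))
    classes = set()
    for r in range(len(all_edges) + 1):
        for edges in combinations(all_edges, r):
            if is_connected(n, edges):
                classes.add(canonical_form(n, edges))
    return len(classes)
```

Calling `[count_connected_graphs(n) for n in range(1, 6)]` gives `[1, 1, 2, 6, 21]`, the opening terms of the known count of connected graphs; replacing the count with another invariant (for example, the number of such graphs that are bipartite) yields a different sequence that could be checked against the OEIS in the same way.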

  16. Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution.

    PubMed

    Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L

    2013-01-30

    Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.

  17. Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

    PubMed Central

    2013-01-01

    Background: Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results: Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions: While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705

  18. Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures.

    PubMed

    Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi

    2014-09-18

    Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.

  19. Sequence-structure mapping errors in the PDB: OB-fold domains

    PubMed Central

    Venclovas, Česlovas; Ginalski, Krzysztof; Kang, Chulhee

    2004-01-01

    The Protein Data Bank (PDB) is the single most important repository of structural data for proteins and other biologically relevant molecules. Therefore, it is critically important to keep the PDB data, as much as possible, error-free. In this study, we have analyzed PDB crystal structures possessing oligonucleotide/oligosaccharide binding (OB)-fold, one of the highly populated folds, for the presence of sequence-structure mapping errors. Using energy-based structure quality assessment coupled with sequence analyses, we have found that there are at least five OB-structures in the PDB that have regions where sequences have been incorrectly mapped onto the structure. We have demonstrated that the combination of these computation techniques is effective not only in detecting sequence-structure mapping errors, but also in providing guidance to correct them. Namely, we have used results of computational analysis to direct a revision of X-ray data for one of the PDB entries containing a fairly inconspicuous sequence-structure mapping error. The revised structure has been deposited with the PDB. We suggest use of computational energy assessment and sequence analysis techniques to facilitate structure determination when homologs having known structure are available to use as a reference. Such computational analysis may be useful in either guiding the sequence-structure assignment process or verifying the sequence mapping within poorly defined regions. PMID:15133161

  20. Genetic Analyses of the Internal Transcribed Spacer Sequences Suggest Introgression and Duplication in the Medicinal Mushroom Agaricus subrufescens

    PubMed Central

    Chen, Jie; Moinard, Magalie; Xu, Jianping; Wang, Shouxian; Foulongne-Oriol, Marie; Zhao, Ruilin; Hyde, Kevin D.; Callac, Philippe

    2016-01-01

    The internal transcribed spacer (ITS) region of the nuclear ribosomal RNA gene cluster is widely used in fungal taxonomy and phylogeographic studies. The medicinal and edible mushroom Agaricus subrufescens has a worldwide distribution with a high level of polymorphism in the ITS region. A previous analysis suggested notable ITS sequence heterogeneity within the wild French isolate CA487. The objective of this study was to investigate the pattern and potential mechanism of ITS sequence heterogeneity within this strain. Using PCR, cloning, and sequencing, we identified three types of ITS sequences, A, B, and C with a balanced distribution, which differed from each other at 13 polymorphic positions. The phylogenetic comparisons with samples from different continents revealed that the type C sequence was similar to those found in Oceanian and Asian specimens of A. subrufescens while types A and B sequences were close to those found in the Americas or in Europe. We further investigated the inheritance of these three ITS sequence types by analyzing their distribution among single-spore isolates from CA487. In this analysis, three co-dominant markers were used firstly to distinguish the homokaryotic offspring from the heterokaryotic offspring. The homokaryotic offspring were then analyzed for their ITS types. Our genetic analyses revealed that types A and B were two alleles segregating at one locus ITSI, while type C was not allelic with types A and B but was located at another unlinked locus ITSII. Furthermore, type C was present in only one of the two constitutive haploid nuclei (n) of the heterokaryotic (n+n) parent CA487. These data suggest that there was a relatively recent introduction of the type C sequence and a duplication of the ITS locus in this strain. Whether other genes were also transferred and duplicated and their impacts on genome structure and stability remain to be investigated. PMID:27228131

  1. A new polymorphic and multicopy MHC gene family related to nonmammalian class I

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leelayuwat, C.; Degli-Esposti, M.A.; Abraham, L.J.

    1994-12-31

    The authors have used genomic analysis to characterize a region of the central major histocompatibility complex (MHC) spanning approximately 300 kilobases (kb) between TNF and HLA-B. This region has been suggested to carry genetic factors relevant to the development of autoimmune diseases such as myasthenia gravis (MG) and insulin dependent diabetes mellitus (IDDM). Genomic sequence was analyzed for coding potential, using two neural network programs, GRAIL and GeneParser. A genomic probe, JAB, containing putative coding sequences (PERB11) located 60 kb centromeric of HLA-B, was used for northern analysis of human tissues. Multiple transcripts were detected. Southern analysis of genomic DNA and overlapping YAC clones, covering the region from BAT1 to HLA-F, indicated that there are at least five copies of PERB11, four of which are located within this region of the MHC. The partial cDNA sequence of PERB11 was obtained from poly-A RNA derived from skeletal muscle. The putative amino acid sequence of PERB11 shares approximately 30% identity to MHC class I molecules from various species, including reptiles, chickens, and frogs, as well as to other MHC class I-like molecules, such as the IgG FcR of the mouse and rat and the human Zn-α2-glycoprotein. From direct comparison of amino acid sequences, it is concluded that PERB11 is a distinct molecule more closely related to nonmammalian than known mammalian MHC class I molecules. Genomic sequence analysis of PERB11 from five MHC ancestral haplotypes (AH) indicated that the gene is polymorphic at both DNA and protein level. The results suggest that the authors have identified a novel polymorphic gene family with multiple copies within the MHC. 48 refs., 10 figs., 2 tabs.

  2. Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

    PubMed

    Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

    2012-01-01

    Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up of three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar to higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

  3. Fueling the Future with Fungal Genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grigoriev, Igor V.

    2014-10-27

    Genomes of fungi relevant to energy and environment are in focus of the JGI Fungal Genomic Program. One of its projects, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts and pathogens) and biorefinery processes (cellulose degradation and sugar fermentation) by means of genome sequencing and analysis. New chapters of the Encyclopedia can be opened with user proposals to the JGI Community Science Program (CSP). Another JGI project, the 1000 fungal genomes, explores fungal diversity on genome level at scale and is open for users to nominate new species for sequencing. Over 400 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics will lead to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such ‘parts’ suggested by comparative genomics and functional analysis in these areas are presented here.

  4. 5S ribosomal ribonucleic acid sequences in Bacteroides and Fusobacterium: evolutionary relationships within these genera and among eubacteria in general

    NASA Technical Reports Server (NTRS)

    Van den Eynde, H.; De Baere, R.; Shah, H. N.; Gharbia, S. E.; Fox, G. E.; Michalik, J.; Van de Peer, Y.; De Wachter, R.

    1989-01-01

    The 5S ribosomal ribonucleic acid (rRNA) sequences were determined for Bacteroides fragilis, Bacteroides thetaiotaomicron, Bacteroides capillosus, Bacteroides veroralis, Porphyromonas gingivalis, Anaerorhabdus furcosus, Fusobacterium nucleatum, Fusobacterium mortiferum, and Fusobacterium varium. A dendrogram constructed by a clustering algorithm from these sequences, which were aligned with all other hitherto known eubacterial 5S rRNA sequences, showed differences as well as similarities with respect to results derived from 16S rRNA analyses. In the 5S rRNA dendrogram, Bacteroides clustered together with Cytophaga and Fusobacterium, as in 16S rRNA analyses. Intraphylum relationships deduced from 5S rRNAs suggested that Bacteroides is specifically related to Cytophaga rather than to Fusobacterium, as was suggested by 16S rRNA analyses. Previous taxonomic considerations concerning the genus Bacteroides, based on biochemical and physiological data, were confirmed by the 5S rRNA sequence analysis.

  5. Genomic Diversity and Evolution of the Lyssaviruses

    PubMed Central

    Delmas, Olivier; Holmes, Edward C.; Talbi, Chiraz; Larrous, Florence; Dacheux, Laurent; Bouchier, Christiane; Bourhy, Hervé

    2008-01-01

    Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as ‘Lagos Bat’. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses. PMID:18446239

  6. cyclostratigraphy, sequence stratigraphy and organic matter accumulation mechanism

    NASA Astrophysics Data System (ADS)

    Cong, F.; Li, J.

    2016-12-01

    The first member of the Maokou Formation of the Sichuan basin is composed of well-preserved carbonate ramp couplets of limestone and marlstone/shale. It is a potential shale gas source rock and is suitable for time-series analysis. We conducted time-series analysis to identify high-frequency sequences, reconstruct the high-resolution sedimentation rate, estimate detailed primary productivity for the first time in the study intervals, and discuss the organic matter accumulation mechanism of the source rock under a sequence stratigraphic framework. Using the theory of cyclostratigraphy and sequence stratigraphy, the high-frequency sequences of one outcrop profile and one drilling well are identified. Two third-order sequences and eight fourth-order sequences are distinguished on the outcrop profile based on the cycle stacking patterns. For the drilling well, the sequence boundary and four system tracts are distinguished by "integrated prediction error filter analysis" (INPEFA) of gamma-ray logging data, and eight fourth-order sequences are identified by the 405 ka long eccentricity curve in the depth domain, which is quantified and filtered by integrated MTM spectral analysis, evolutive harmonic analysis (EHA), evolutive average spectral misfit (eASM) and band-pass filtering. This suggests that high-frequency sequences correlate well with Milankovitch orbital signals recorded in sediments, and that cyclostratigraphy theory is applicable to dividing high-frequency (4-6 orders) sequence stratigraphy. The high-resolution sedimentation rate is reconstructed through the study interval by tracking the highly statistically significant short eccentricity component (123 ka) revealed by EHA. Based on sedimentation rate, measured TOC and density data, the burial flux, delivery flux and primary productivity of organic carbon were estimated. By integrating redox proxies, we can discuss the controls on organic matter accumulation by primary production and preservation under the high-resolution sequence stratigraphic framework. Results show that high average organic carbon contents in the study interval are mainly attributed to high primary production. The results also show a good correlation between high organic carbon accumulation and intervals of transgression.

  7. Genome sequence analysis of five Canadian isolates of strawberry mottle virus reveals extensive intra-species diversity and a longer RNA2 with increased coding capacity compared to a previously characterized European isolate.

    PubMed

    Bhagwat, Basdeo; Dickison, Virginia; Ding, Xinlun; Walker, Melanie; Bernardy, Michael; Bouthillier, Michel; Creelman, Alexa; DeYoung, Robyn; Li, Yinzi; Nie, Xianzhou; Wang, Aiming; Xiang, Yu; Sanfaçon, Hélène

    2016-06-01

    In this study, we report the genome sequence of five isolates of strawberry mottle virus (family Secoviridae, order Picornavirales) from strawberry field samples with decline symptoms collected in Eastern Canada. The Canadian isolates differed from the previously characterized European isolate 1134 in that they had a longer RNA2, resulting in a 239-amino-acid extension of the C-terminal region of the polyprotein. Sequence analysis suggests that reassortment and recombination occurred among the isolates. Phylogenetic analysis revealed that the Canadian isolates are diverse, grouping in two separate branches along with isolates from Europe and the Americas.

  8. A new cryptic virus belonging to the family Partitiviridae was found in watermelon co-infected with Melon necrotic spot virus.

    PubMed

    Sela, Noa; Lachman, Oded; Reingold, Victoria; Dombrovsky, Aviv

    2013-10-01

    A novel virus was detected in watermelon plants (Citrullus lanatus Thunb.) infected with Melon necrotic spot virus (MNSV) using SOLiD next-generation sequence analysis. In addition to the expected MNSV genome, two double-stranded RNA (dsRNA) segments of 1,312 and 1,118 bp were also identified and sequenced from the purified virus preparations. These two dsRNA segments encode two putative partitivirus-related proteins, an RNA-dependent RNA polymerase (RdRP) and a capsid protein, which were sequenced. Genomic-sequence analysis and analysis of phylogenetic relationships indicate that these two dsRNAs together make up the genome of a novel Partitivirus. This virus was found to be closely related to the Pepper cryptic virus 1 and Raphanus sativus cryptic virus. It is suggested that this novel virus, putatively named Citrullus lanatus cryptic virus, be considered as a new member of the family Partitiviridae.

  9. Protein Sectors: Statistical Coupling Analysis versus Conservation

    PubMed Central

    Teşileanu, Tiberiu; Colwell, Lucy J.; Leibler, Stanislas

    2015-01-01

    Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. PMID:25723535

  10. VWF mutations and new sequence variations identified in healthy controls are more frequent in the African-American population.

    PubMed

    Bellissimo, Daniel B; Christopherson, Pamela A; Flood, Veronica H; Gill, Joan Cox; Friedman, Kenneth D; Haberichter, Sandra L; Shapiro, Amy D; Abshire, Thomas C; Leissinger, Cindy; Hoots, W Keith; Lusher, Jeanne M; Ragni, Margaret V; Montgomery, Robert R

    2012-03-01

    Diagnosis and classification of VWD is aided by molecular analysis of the VWF gene. Because VWF polymorphisms have not been fully characterized, we performed VWF laboratory testing and gene sequencing of 184 healthy controls with a negative bleeding history. The controls included 66 (35.9%) African Americans (AAs). We identified 21 new sequence variations, 13 (62%) of which occurred exclusively in AAs and 2 (G967D, T2666M) that were found in 10%-15% of the AA samples, suggesting they are polymorphisms. We identified 14 sequence variations reported previously as VWF mutations, the majority of which were type 1 mutations. These controls had VWF Ag levels within the normal range, suggesting that these sequence variations might not always reduce plasma VWF levels. Eleven mutations were found in AAs, and the frequency of M740I, H817Q, and R2185Q was 15%-18%. Ten AA controls had the 2N mutation H817Q; 1 was homozygous. The average factor VIII level in this group was 99 IU/dL, suggesting that this variation may confer little or no clinical symptoms. This study emphasizes the importance of sequencing healthy controls to understand ethnic-specific sequence variations so that asymptomatic sequence variations are not misidentified as mutations in other ethnic or racial groups.

  11. Polymorphism of CRISPR shows separated natural groupings of Shigella subtypes and evidence of horizontal transfer of CRISPR

    PubMed Central

    Yang, Chaojie; Li, Peng; Su, Wenli; Li, Hao; Liu, Hongbo; Yang, Guang; Xie, Jing; Yi, Shengjie; Wang, Jian; Cui, Xianyan; Wu, Zhihao; Wang, Ligui; Hao, Rongzhang; Jia, Leili; Qiu, Shaofu; Song, Hongbin

    2015-01-01

    Clustered, regularly interspaced, short palindromic repeats (CRISPR) act as an adaptive RNA-mediated immune mechanism in bacteria. They can also be used for identification and evolutionary studies based on polymorphisms within the CRISPR locus. We amplified and analyzed 6 CRISPR loci from 237 Shigella strains belonging to the 4 species groups, as well as 13 Escherichia coli strains. The CRISPR-associated (cas) gene sequence arrays of these strains were screened and compared. The CRISPR sequences from Shigella were conserved among subtypes, suggesting that CRISPR may represent a new identification tool for the detection and discrimination of Shigella species. Secondary structure analysis showed a different stem-loop structure at the terminal repeat, suggesting a distinct recognition mechanism in the formation of crRNA. In addition, the presence of “self-target” spacers and polymorphisms within CRISPR in Shigella indicated a selective pressure for inhibition of this system, which has the potential to damage “self DNA.” Homology analysis of spacers showed that CRISPR might be involved in the regulation of virulence transmission. Phylogenetic analysis based on CRISPR sequences from Shigella and E. coli indicated that although phenotypic properties maintain convergent evolution, the 4 Shigella species do not represent natural groupings. Surprisingly, comparative analysis of Shigella repeats with other species provided new evidence for CRISPR horizontal transfer. Our results suggested that CRISPR analysis is applicable for the detection of Shigella species and for investigation of evolutionary relationships. PMID:26327282

  12. Polymorphism of CRISPR shows separated natural groupings of Shigella subtypes and evidence of horizontal transfer of CRISPR.

    PubMed

    Yang, Chaojie; Li, Peng; Su, Wenli; Li, Hao; Liu, Hongbo; Yang, Guang; Xie, Jing; Yi, Shengjie; Wang, Jian; Cui, Xianyan; Wu, Zhihao; Wang, Ligui; Hao, Rongzhang; Jia, Leili; Qiu, Shaofu; Song, Hongbin

    2015-01-01

    Clustered, regularly interspaced, short palindromic repeats (CRISPR) act as an adaptive RNA-mediated immune mechanism in bacteria. They can also be used for identification and evolutionary studies based on polymorphisms within the CRISPR locus. We amplified and analyzed 6 CRISPR loci from 237 Shigella strains belonging to the 4 species groups, as well as 13 Escherichia coli strains. The CRISPR-associated (cas) gene sequence arrays of these strains were screened and compared. The CRISPR sequences from Shigella were conserved among subtypes, suggesting that CRISPR may represent a new identification tool for the detection and discrimination of Shigella species. Secondary structure analysis showed a different stem-loop structure at the terminal repeat, suggesting a distinct recognition mechanism in the formation of crRNA. In addition, the presence of "self-target" spacers and polymorphisms within CRISPR in Shigella indicated a selective pressure for inhibition of this system, which has the potential to damage "self DNA." Homology analysis of spacers showed that CRISPR might be involved in the regulation of virulence transmission. Phylogenetic analysis based on CRISPR sequences from Shigella and E. coli indicated that although phenotypic properties maintain convergent evolution, the 4 Shigella species do not represent natural groupings. Surprisingly, comparative analysis of Shigella repeats with other species provided new evidence for CRISPR horizontal transfer. Our results suggested that CRISPR analysis is applicable for the detection of Shigella species and for investigation of evolutionary relationships.

  13. Molecular and comparative analysis of Salmonella enterica Senftenberg from humans and animals using PFGE, MLST and NARMS

    PubMed Central

    2011-01-01

    Background: Salmonella species are recognized worldwide as a significant cause of human and animal disease. In this study the molecular profiles and characteristics of Salmonella enterica Senftenberg isolated from human cases of illness and those recovered from healthy or diagnostic cases in animals were assessed. Included in the study was a comparison with our own sequenced strain of S. Senftenberg recovered from production turkeys in North Dakota. Isolates examined in this study were subjected to antimicrobial susceptibility profiling using the National Antimicrobial Resistance Monitoring System (NARMS) panel, which tested susceptibility to 15 different antimicrobial agents. The molecular profiles of all isolates were determined using Pulsed Field Gel Electrophoresis (PFGE) and the sequence types of the strains were obtained using Multi-Locus Sequence Type (MLST) analysis based on amplification and sequence interrogation of seven housekeeping genes (aroC, dnaN, hemD, hisD, purE, sucA, and thrA). PFGE data were entered into BioNumerics analysis software to generate a dendrogram of relatedness among the strains. Results: The study found 93 profiles among 98 S. Senftenberg isolates tested, and there were primarily two sequence types associated with humans and animals (ST185 and ST14), with overlap observed in all host types, suggesting that the distribution of S. Senftenberg sequence types is not host dependent. Antimicrobial resistance was observed among the animal strains; however, no resistance was detected in human isolates, suggesting that animal husbandry has a significant influence on the selection and promotion of antimicrobial resistance. Conclusion: The data demonstrate the circulation of at least two strain types in both animal and human health, suggesting that S. Senftenberg is relatively homogeneous in its distribution. The data generated in this study could be used towards defining a pathotype for this serovar. PMID:21708021

  14. Spliced leader RNA of trypanosomes: in vivo mutational analysis reveals extensive and distinct requirements for trans splicing and cap4 formation.

    PubMed Central

    Lücke, S; Xu, G L; Palfi, Z; Cross, M; Bellofatto, V; Bindereif, A

    1996-01-01

    In trypanosomes mRNAs are generated through trans splicing. The spliced leader (SL) RNA, which donates the 5'-terminal mini-exon to each of the protein coding exons, plays a central role in the trans splicing process. We have established in vivo assays to study in detail trans splicing, cap4 modification, and RNP assembly of the SL RNA in the trypanosomatid species Leptomonas seymouri. First, we found that extensive sequences within the mini-exon are required for SL RNA function in vivo, although a conserved length of 39 nt is not essential. In contrast, the intron sequence appears to be surprisingly tolerant to mutation; only the stem-loop II structure is indispensable. The asymmetry of the sequence requirements in the stem I region suggests that this domain may exist in different functional conformations. Second, distinct mini-exon sequences outside the modification site are important for efficient cap4 formation. Third, all SL RNA mutations tested allowed core RNP assembly, suggesting flexible requirements for core protein binding. In sum, the results of our mutational analysis provide evidence for a discrete domain structure of the SL RNA and help to explain the strong phylogenetic conservation of the mini-exon sequence and of the overall SL RNA secondary structure; they also suggest that there may be certain differences between trans splicing in nematodes and trypanosomes. This approach provides a basis for studying RNA-RNA interactions in the trans spliceosome. PMID:8861965

  15. Re-examination of population structure and phylogeography of hawksbill turtles in the wider Caribbean using longer mtDNA sequences.

    PubMed

    Leroux, Robin A; Dutton, Peter H; Abreu-Grobois, F Alberto; Lagueux, Cynthia J; Campbell, Cathi L; Delcroix, Eric; Chevalier, Johan; Horrocks, Julia A; Hillis-Starr, Zandy; Troëng, Sebastian; Harrison, Emma; Stapleton, Seth

    2012-01-01

    Management of the critically endangered hawksbill turtle in the Wider Caribbean (WC) has been hampered by knowledge gaps regarding stock structure. We carried out a comprehensive stock structure re-assessment of 11 WC hawksbill rookeries using longer mtDNA sequences, larger sample sizes (N = 647), and additional rookeries compared to previous surveys. Additional variation detected by 740 bp sequences between populations allowed us to differentiate populations such as Barbados-Windward and Guadeloupe (F_ST = 0.683, P < 0.05) that appeared genetically indistinguishable based on shorter 380 bp sequences. POWSIM analysis showed that longer sequences improved power to detect population structure and that when N < 30, increasing the variation detected was as effective in increasing power as increasing sample size. Geographic patterns of genetic variation suggest a model of periodic long-distance colonization coupled with region-wide dispersal and subsequent secondary contact within the WC. Mismatch analysis results for individual clades suggest a general population expansion in the WC following a historic bottleneck about 100,000-300,000 years ago. We estimated an effective female population size (N_ef) of 6000-9000 for the WC, similar to the current estimated numbers of breeding females, highlighting the importance of these regional rookeries to maintaining genetic diversity in hawksbills. Our results provide a basis for standardizing future work to 740 bp sequence reads and establish a more complete baseline for determining stock boundaries in this migratory marine species. Finally, our findings illustrate the value of maintaining an archive of specimens for re-analysis as new markers become available.

  16. Geoseq: a tool for dissecting deep-sequencing datasets.

    PubMed

    Gurtowski, James; Cancio, Anthony; Shah, Hardik; Levovitz, Chaya; George, Ajish; Homann, Robert; Sachidanandam, Ravi

    2010-10-12

    Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to (a) identify differential isoform expression in mRNA-seq datasets, (b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and (c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
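
    The tiling idea described in this abstract can be illustrated with a minimal sketch. This is not Geoseq's implementation: the abstract says Geoseq uses suffix arrays, whereas the sketch below swaps in a plain hash-based index of read substrings for brevity. The function name `tile_coverage` and its parameters are hypothetical.

```python
from collections import Counter


def tile_coverage(reference, reads, tile_len=20, step=1):
    """Return (start, count) pairs: how many reads contain each reference tile.

    Assumption: exact matching only, and a Counter of read substrings
    stands in for the suffix arrays mentioned in the abstract.
    """
    # Index every tile_len-long substring of each read, counted once per read.
    substring_counts = Counter()
    for read in reads:
        seen = {read[i:i + tile_len] for i in range(len(read) - tile_len + 1)}
        substring_counts.update(seen)

    # Cut the reference into overlapping tiles and look each one up.
    coverage = []
    for start in range(0, len(reference) - tile_len + 1, step):
        tile = reference[start:start + tile_len]
        coverage.append((start, substring_counts[tile]))
    return coverage


if __name__ == "__main__":
    ref = "ACGTACGTAA"
    library = ["ACGTAC", "CGTACG", "TTTT"]
    for start, count in tile_coverage(ref, library, tile_len=4):
        print(start, count)
```

    A hash index is simple but memory-hungry for large libraries; a suffix array supports the same lookups more compactly, which is presumably why the authors chose one.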

  17. Molecular Characterization of Epiphytic Bacterial Communities on Charophycean Green Algae

    PubMed Central

    Fisher, Madeline M.; Wilcox, Lee W.; Graham, Linda E.

    1998-01-01

    Epiphytic bacterial communities within the sheath material of three filamentous green algae, Desmidium grevillii, Hyalotheca dissiliens, and Spondylosium pulchrum (class Charophyceae, order Zygnematales), collected from a Sphagnum bog were characterized by PCR amplification, cloning, and sequencing of 16S ribosomal DNA. A total of 20 partial sequences and nine different sequence types were obtained, and one sequence type was recovered from the bacterial communities on all three algae. By phylogenetic analysis, the cloned sequences were placed into several major lineages of the Bacteria domain: the Flexibacter/Cytophaga/Bacteroides phylum and the α, β, and γ subdivisions of the phylum Proteobacteria. Analysis at the subphylum level revealed that the majority of our sequences were not closely affiliated with those of known, cultured taxa, although the estimated evolutionary distances between our sequences and their nearest neighbors were always less than 0.1 (i.e., greater than 90% similar). This result suggests that the majority of sequences obtained in this study represent as yet phenotypically undescribed bacterial species and that the range of bacterial-algal interactions that occur in nature has not yet been fully described. PMID:9797295
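
    The correspondence between a distance below 0.1 and more than 90% similarity can be shown with a small sketch. It assumes an uncorrected distance (the proportion of differing sites, or p-distance) computed on pre-aligned sequences; the abstract does not state which distance estimator the authors used, and corrected models would give slightly larger values. The function name `p_distance` is hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Assumption: the sequences are already aligned (equal length) and
    columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) sites")
    return mismatches / compared


if __name__ == "__main__":
    d = p_distance("ACGTACGTAC", "ACGTACGTAA")
    print(f"distance = {d:.2f}, similarity = {100 * (1 - d):.0f}%")
```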

  18. Cloning and sequence analysis of the meso-diaminopimelate decarboxylase gene from Bacillus methanolicus MGA3 and comparison to other decarboxylase genes.

    PubMed Central

    Mills, D A; Flickinger, M C

    1993-01-01

    The lysA gene of Bacillus methanolicus MGA3 was cloned by complementation of an auxotrophic Escherichia coli lysA22 mutant with a genomic library of B. methanolicus MGA3 chromosomal DNA. Subcloning localized the B. methanolicus MGA3 lysA gene into a 2.3-kb SmaI-SstI fragment. Sequence analysis of the 2.3-kb fragment indicated an open reading frame encoding a protein of 48,223 Da, which was similar to the meso-diaminopimelate (DAP) decarboxylase amino acid sequences of Bacillus subtilis (62%) and Corynebacterium glutamicum (40%). Amino acid sequence analysis indicated several regions of conservation among bacterial DAP decarboxylases, eukaryotic ornithine decarboxylases, and arginine decarboxylases, suggesting a common structural arrangement for positioning of substrate and the cofactor pyridoxal 5'-phosphate. The B. methanolicus MGA3 DAP decarboxylase was shown to be a dimer (M(r) 86,000) with a subunit molecular mass of approximately 50,000 Da. This decarboxylase is inhibited by lysine (Ki = 0.93 mM) with a Km of 0.8 mM for DAP. The inhibition pattern suggests that the activity of this enzyme in lysine-overproducing strains of B. methanolicus MGA3 may limit lysine synthesis. PMID:8215365

  19. Cloning and sequence analysis of the meso-diaminopimelate decarboxylase gene from Bacillus methanolicus MGA3 and comparison to other decarboxylase genes.

    PubMed

    Mills, D A; Flickinger, M C

    1993-09-01

    The lysA gene of Bacillus methanolicus MGA3 was cloned by complementation of an auxotrophic Escherichia coli lysA22 mutant with a genomic library of B. methanolicus MGA3 chromosomal DNA. Subcloning localized the B. methanolicus MGA3 lysA gene into a 2.3-kb SmaI-SstI fragment. Sequence analysis of the 2.3-kb fragment indicated an open reading frame encoding a protein of 48,223 Da, which was similar to the meso-diaminopimelate (DAP) decarboxylase amino acid sequences of Bacillus subtilis (62%) and Corynebacterium glutamicum (40%). Amino acid sequence analysis indicated several regions of conservation among bacterial DAP decarboxylases, eukaryotic ornithine decarboxylases, and arginine decarboxylases, suggesting a common structural arrangement for positioning of substrate and the cofactor pyridoxal 5'-phosphate. The B. methanolicus MGA3 DAP decarboxylase was shown to be a dimer (M(r) 86,000) with a subunit molecular mass of approximately 50,000 Da. This decarboxylase is inhibited by lysine (Ki = 0.93 mM) with a Km of 0.8 mM for DAP. The inhibition pattern suggests that the activity of this enzyme in lysine-overproducing strains of B. methanolicus MGA3 may limit lysine synthesis.

  20. Intron loss from the NADH dehydrogenase subunit 4 gene of lettuce mitochondrial DNA: evidence for homologous recombination of a cDNA intermediate.

    PubMed

    Geiss, K T; Abbas, G M; Makaroff, C A

    1994-04-01

    The mitochondrial gene coding for subunit 4 of the NADH dehydrogenase complex I (nad4) has been isolated and characterized from lettuce, Lactuca sativa. Analysis of nad4 genes in a number of plants by Southern hybridization had previously suggested that the intron content varied between species. Characterization of the lettuce gene confirms this observation. Lettuce nad4 contains two exons and one group IIA intron, whereas previously sequenced nad4 genes from turnip and wheat contain three group IIA introns. Northern analysis identified a transcript of 1600 nucleotides, which represents the mature nad4 mRNA and a primary transcript of 3200 nucleotides. Sequence analysis of lettuce and turnip nad4 cDNAs was used to confirm the intron/exon border sequences and to examine RNA editing patterns. Editing is observed at the 5' and 3' ends of the lettuce transcript, but is absent from sequences that correspond to exons two, three and the 5' end of exon four in turnip and wheat. In contrast, turnip transcripts are highly edited in this region, suggesting that homologous recombination of an edited and spliced cDNA intermediate was involved in the loss of introns two and three from an ancestral lettuce nad4 gene.

  1. Molecular analysis of immunoglobulin variable genes supports a germinal center experienced normal counterpart in primary cutaneous diffuse large B-cell lymphoma, leg-type.

    PubMed

    Pham-Ledard, Anne; Prochazkova-Carlotti, Martina; Deveza, Mélanie; Laforet, Marie-Pierre; Beylot-Barry, Marie; Vergier, Béatrice; Parrens, Marie; Feuillard, Jean; Merlio, Jean-Philippe; Gachard, Nathalie

    2017-11-01

    Immunophenotype of primary cutaneous diffuse large B-cell lymphoma, leg-type (PCLBCL-LT) suggests a germinal center-experienced B lymphocyte (BCL2+ MUM1+ BCL6+/-). As maturation history of B-cell is "imprinted" during B-cell development on the immunoglobulin gene sequence, we studied the structure and sequence of the variable part of the genes (IGHV, IGLV, IGKV), immunoglobulin surface expression and features of class switching in order to determine the PCLBCL-LT cell of origin. Clonality analysis with BIOMED2 protocol and VH leader primers was done on DNA extracted from frozen skin biopsies on retrospective samples from 14 patients. The clonal DNA IGHV sequence of the tumor was aligned and compared with the closest germline sequence and homology percentage was calculated. Superantigen binding sites were studied. Features of selection pressure were evaluated with the multinomial Lossos model. A functional monoclonal sequence was observed in 14 cases as determined for IGHV (10), IGLV (2) or IGKV (3). IGV mutation rates were high (>5%) in all cases but one (median: 15.5%), with superantigen binding sites conservation. Features of selection pressure were identified in 11/12 interpretable cases, more frequently negative (75%) than positive (25%). Intraclonal variation was detected in 3 of 8 tumor specimens with a low rate of mutations. Surface immunoglobulin was an IgM in 12/12 cases. FISH analysis of IGHM locus, deleted during class switching, showed heterozygous IGHM gene deletion in half of cases. The genomic PCR analysis confirmed the deletions within the switch μ region. IGV sequences were highly mutated but functional, with negative features of selection pressure suggesting one or more germinal center passage(s) with somatic hypermutation, but superantigen (SpA) binding sites conservation. Genetic features of class switch were observed, but on the non-functional allele and co-existing with primary isotype IgM expression. These data suggest that the cell of origin is a germinal center-experienced, superantigen-driven selected B-cell, in a stage between germinal center B-cell and plasma cell. Copyright © 2017 Japanese Society for Investigative Dermatology. Published by Elsevier B.V. All rights reserved.

  2. Structural Analysis of Biodiversity

    PubMed Central

    Sirovich, Lawrence; Stoeckle, Mark Y.; Zhang, Yu

    2010-01-01

    Large, recently-available genomic databases cover a wide range of life forms, suggesting opportunity for insights into genetic structure of biodiversity. In this study we refine our recently-described technique using indicator vectors to analyze and visualize nucleotide sequences. The indicator vector approach generates correlation matrices, dubbed Klee diagrams, which represent a novel way of assembling and viewing large genomic datasets. To explore its potential utility, here we apply the improved algorithm to a collection of almost 17000 DNA barcode sequences covering 12 widely-separated animal taxa, demonstrating that indicator vectors for classification gave correct assignment in all 11000 test cases. Indicator vector analysis revealed discontinuities corresponding to species- and higher-level taxonomic divisions, suggesting an efficient approach to classification of organisms from poorly-studied groups. As compared to standard distance metrics, indicator vectors preserve diagnostic character probabilities, enable automated classification of test sequences, and generate high-information density single-page displays. These results support application of indicator vectors for comparative analysis of large nucleotide data sets and raise prospect of gaining insight into broad-scale patterns in the genetic structure of biodiversity. PMID:20195371

  3. R/S analysis of reaction time in Neuron Type Test for human activity in civil aviation

    NASA Astrophysics Data System (ADS)

    Zhang, Hong-Yan; Kang, Ming-Cui; Li, Jing-Qiang; Liu, Hai-Tao

    2017-03-01

    Human factors have become the most serious problem leading to accidents in civil aviation, which motivates the design and analysis of the Neuron Type Test (NTT) system to explore the intrinsic properties and patterns behind the behaviors of professionals and students in civil aviation. In the experiment, normal practitioners' reaction time sequences, collected from NTT, are approximately log-normally distributed. We apply the χ2 test to compute the goodness-of-fit by transforming the time sequence with the Box-Cox transformation to cluster practitioners. The long-term correlation of each individual practitioner's time sequence is represented by the Hurst exponent via Rescaled Range Analysis, also called Range/Standard deviation (R/S) Analysis. The different Hurst exponents suggest the existence of different collective behavior and different intrinsic patterns of human factors in civil aviation.
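
    For readers unfamiliar with R/S analysis, the following is a minimal sketch of the textbook procedure, not the authors' code. It computes the rescaled range over non-overlapping windows of doubling size and estimates the Hurst exponent as the slope of log(R/S) against log(window size). The function names, the window-doubling scheme and the `min_window` default are illustrative assumptions.

```python
import math


def rescaled_range(series):
    """R/S of one window: range of cumulative mean-adjusted sums over the std."""
    n = len(series)
    mean = sum(series) / n
    cum = cum_min = cum_max = 0.0
    for x in series:
        cum += x - mean
        cum_min = min(cum_min, cum)
        cum_max = max(cum_max, cum)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return None  # R/S is undefined for a constant window
    return (cum_max - cum_min) / std


def hurst_exponent(series, min_window=8):
    """Estimate H as the least-squares slope of log(R/S) vs log(window size)."""
    n = len(series)
    xs, ys = [], []
    size = min_window
    while size <= n // 2:
        values = []
        for start in range(0, n - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None and rs > 0:
                values.append(rs)
        if values:
            xs.append(math.log(size))
            ys.append(math.log(sum(values) / len(values)))
        size *= 2  # assumption: window sizes double at each step
    if len(xs) < 2:
        raise ValueError("series too short for the chosen min_window")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope_num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope_den = sum((x - mx) ** 2 for x in xs)
    return slope_num / slope_den


if __name__ == "__main__":
    trend = [float(i) for i in range(64)]  # strongly persistent series
    print(round(hurst_exponent(trend), 3))
```

    Values of H near 0.5 indicate no long-term correlation, values above 0.5 indicate persistence, and values below 0.5 indicate anti-persistence.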

  4. The Role of the Y-Chromosome in the Establishment of Murine Hybrid Dysgenesis and in the Analysis of the Nucleotide Sequence Organization, Genetic Transmission and Evolution of Repeated Sequences.

    NASA Astrophysics Data System (ADS)

    Nallaseth, Ferez Soli

    The Y-chromosome presents a unique cytogenetic framework for the evolution of nucleotide sequences. Alignment of nine Y-chromosomal fragments in their increasing Y-specific/non Y-specific (male/female) sequence divergence ratios was directly and inversely related to their interspersion on these two respective genomic fractions. Sequence analysis confirmed a direct relationship between divergence ratios and the Alu, LINE-1, Satellite and their derivative oligonucleotide contents. Thus their relocation on the Y-chromosome is followed by sequence divergence rather than the well documented concerted evolution of these non-coding progenitor repeated sequences. Five of the nine Y-chromosomal fragments are non-pseudoautosomal and transcribed into heterogeneous poly(A)+ RNA and thus can be retrotransposed. Evolutionary and computer analysis identified homologous oligonucleotide tracts in several human loci suggesting common and random mechanistic origins. Dysgenic genomes represent the accelerated evolution driving sequence divergence (McClintock, 1984). Sex reversal and sterility characterizing dysgenesis occurs in C57BL/6JY^Pos but not in 129/SvY^Pos derivative strains. High frequency, random, multi-locus deletion products of the feral Y^Pos-chromosome are generated in the germlines of F1(C57BL/6J X 129/SvY^Pos)(male) and C57BL/6JY^Pos(male) but not in 129/SvY^Pos(male). Equal, 10^-1, 10^-2, and 0 copies (relative to males) of Y^Pos-specific deletion products respectively characterize C57BL/6JY^Pos (HC), (LC), (T) and (F) females. The testes determining loci of inactive Y^Pos-chromosomes in C57BL/6JY^Pos HC females are the preferentially deleted/rearranged Y^Pos-sequences. Disruption of regulation of plasma testosterone and hepatic MUP-A mRNA levels, TRD of a 4.7 Kbp EcoR1 fragment suggest disruption of autosomal/X-chromosomal sequences. These data and the highly repeated progenitor (Alu, GATA, LINE-1) sequence content of deletion products confirmed the previously unidentified loss of genetic control of mammalian chromosome biology and hybrid dysgenesis.

  5. Within-Genome Evolution of REPINs: a New Family of Miniature Mobile DNA in Bacteria

    PubMed Central

    Bertels, Frederic; Rainey, Paul B.

    2011-01-01

    Repetitive sequences are a conserved feature of many bacterial genomes. While first reported almost thirty years ago, and frequently exploited for genotyping purposes, little is known about their origin, maintenance, or processes affecting the dynamics of within-genome evolution. Here, beginning with analysis of the diversity and abundance of short oligonucleotide sequences in the genome of Pseudomonas fluorescens SBW25, we show that over-represented short sequences define three distinct groups (GI, GII, and GIII) of repetitive extragenic palindromic (REP) sequences. Patterns of REP distribution suggest that closely linked REP sequences form a functional replicative unit: REP doublets are over-represented, randomly distributed in extragenic space, and more highly conserved than singlets. In addition, doublets are organized as inverted repeats, which together with intervening spacer sequences are predicted to form hairpin structures in ssDNA or mRNA. We refer to these newly defined entities as REPINs (REP doublets forming hairpins) and identify short reads from population sequencing that reveal putative transposition intermediates. The proximal relationship between GI, GII, and GIII REPINs and specific REP-associated tyrosine transposases (RAYTs), combined with features of the putative transposition intermediate, suggests a mechanism for within-genome dissemination. Analysis of the distribution of REPs in a range of RAYT–containing bacterial genomes, including Escherichia coli K-12 and Nostoc punctiforme, show that REPINs are a widely distributed, but hitherto unrecognized, family of miniature non-autonomous mobile DNA. PMID:21698139
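As a rough illustration of what "over-represented short sequences" can mean, the sketch below counts k-mers and compares each count with its expectation under an independent-base model. The abstract does not state the statistical model the authors used, so the null model, the `min_ratio` threshold, and the function names `kmer_counts` and `overrepresented` are assumptions made for this example only.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def overrepresented(seq, k, min_ratio=2.0):
    """Return {k-mer: observed/expected} for k-mers whose observed count
    is at least min_ratio times the count expected if bases were drawn
    independently at the sequence's own base frequencies."""
    seq = seq.upper()
    n = len(seq)
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    windows = n - k + 1
    hits = {}
    for kmer, observed in kmer_counts(seq, k).items():
        expected = windows
        for base in kmer:
            expected *= base_freq[base]
        ratio = observed / expected
        if ratio >= min_ratio:
            hits[kmer] = ratio
    return hits
```

On a genome-sized string one would typically use a larger k and rank the returned ratios. A real analysis would also need to consider both strands and correct for local composition, which this sketch does not attempt.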

  6. The complete genome sequence of a new polerovirus in strawberry plants from eastern Canada showing strawberry decline symptoms.

    PubMed

    Xiang, Yu; Bernardy, Mike; Bhagwat, Basdeo; Wiersma, Paul A; DeYoung, Robyn; Bouthillier, Michel

    2015-02-01

    Strawberry decline disease, probably caused by synergistic reactions of mixed virus infections, threatens the North American strawberry industry. Deep sequencing of strawberry plant samples from eastern Canada resulted in the identification of a new virus genome resembling poleroviruses in sequence and genome structure. Phylogenetic analysis suggests that it is a new member of the genus Polerovirus, family Luteoviridae. The virus is tentatively named "strawberry polerovirus 1" (SPV1).

  7. Novel insect-specific flavivirus isolated from northern Europe

    PubMed Central

    Huhtamo, Eili; Moureau, Gregory; Cook, Shelley; Julkunen, Ora; Putkuri, Niina; Kurkela, Satu; Uzcátegui, Nathalie Y.; Harbach, Ralph E.; Gould, Ernest A.; Vapalahti, Olli; de Lamballerie, Xavier

    2012-01-01

    Mosquitoes collected in Finland were screened for flaviviral RNA leading to the discovery and isolation of a novel flavivirus designated Hanko virus (HANKV). Virus characterization, including phylogenetic analysis of the complete coding sequence, confirmed HANKV as a member of the “insect-specific” flavivirus (ISF) group. HANKV is the first member of this group isolated from northern Europe, and therefore the first northern European ISF for which the complete coding sequence has been determined. HANKV was not transcribed as DNA in mosquito cell culture, which appears atypical for an ISF. HANKV shared highest sequence homology with the partial NS5 sequence available for the recently discovered Spanish Ochlerotatus flavivirus (SOcFV). Retrospective analysis of mitochondrial sequences from the virus-positive mosquito pool suggested an Ochlerotatus mosquito species as the most likely host for HANKV. HANKV and SOcFV may therefore represent a novel group of Ochlerotatus-hosted insect-specific flaviviruses in Europe and further afield. PMID:22999256

  8. Comparative analysis of the prion protein gene sequences in African lion.

    PubMed

    Wu, Chang-De; Pang, Wan-Yong; Zhao, De-Ming

    2006-10-01

    The prion protein gene of the African lion (Panthera leo) was cloned for the first time and screened for polymorphisms. The results suggest that the prion protein gene of eight African lions is highly homogeneous. The amino acid sequences of the prion protein (PrP) of all samples tested were identical. Four single nucleotide polymorphisms (C42T, C81A, C420T, T600C) in the prion protein gene (Prnp) of African lion were found, but no amino acid substitutions. Sequence analysis showed higher homology to Felis catus (GenBank accession AF003087; 96.7%) and to sheep (GenBank accession M31313.1; 96.2%). With respect to all the mammalian prion protein sequences compared, the African lion prion protein sequence has three amino acid substitutions. The homology might in turn affect the potential intermolecular interactions critical for cross species transmission of prion disease.

  9. Characterization, genetic diversity, and evolutionary link of Cucumber mosaic virus strain New Delhi from India.

    PubMed

    Koundal, Vikas; Haq, Qazi Mohd Rizwanul; Praveen, Shelly

    2011-02-01

    The genome of Cucumber mosaic virus New Delhi strain (CMV-ND) from India, obtained from tomato, was completely sequenced and compared with full genome sequences of 14 known CMV strains from subgroups I and II, for their genetic diversity. Sequence analysis suggests CMV-ND shares maximum sequence identity at the nucleotide level with a CMV strain from Taiwan. Among all 15 strains of CMV, the encoded protein 2b is least conserved, whereas the coat protein (CP) is most conserved. Sequence identity values and phylogram results indicate that CMV-ND belongs to subgroup I. Based on the recombination detection program result, it appears that CMV is prone to recombination, and different RNA components of CMV-ND have evolved differently. Recombinational analysis of all 15 CMV strains detected maximum recombination breakpoints in RNA2; CP showed the least recombination sites.

  10. A Single Early Introduction of HIV-1 Subtype B into Central America Accounts for Most Current Cases

    PubMed Central

    Murillo, Wendy; Veras, Nazle; Prosperi, Mattia; de Rivera, Ivette Lorenzana; Paz-Bailey, Gabriela; Morales-Miranda, Sonia; Juarez, Sandra I.; Yang, Chunfu; DeVos, Joshua; Marín, José Pablo; Mild, Mattias; Albert, Jan

    2013-01-01

    Human immunodeficiency virus type 1 (HIV-1) variants show considerable geographical separation across the world, but there is limited information from Central America. We provide the first detailed investigation of the genetic diversity and molecular epidemiology of HIV-1 in six Central American countries. Phylogenetic analysis was performed on 625 HIV-1 pol gene sequences collected between 2002 and 2010 in Honduras, El Salvador, Nicaragua, Costa Rica, Panama, and Belize. Published sequences from neighboring countries (n = 57) and the rest of the world (n = 740) were included as controls. Maximum likelihood methods were used to explore phylogenetic relationships. Bayesian coalescence-based methods were used to time HIV-1 introductions. Nearly all (98.9%) Central American sequences were of subtype B. Phylogenetic analysis revealed that 437 (70%) sequences clustered within five significantly supported monophyletic clades formed essentially by Central American sequences. One clade contained 386 (62%) sequences from all six countries; the other four clades were smaller and more country specific, suggesting discrete subepidemics. The existence of one large well-supported Central American clade provides evidence that a single introduction of HIV-1 subtype B in Central America accounts for most current cases. An introduction during the early phase of the HIV-1 pandemic may explain its epidemiological success. Moreover, the smaller clades suggest a subsequent regional spread related to specific transmission networks within each country. PMID:23616665

  11. Identification of Novel Sequence Types among Staphylococcus haemolyticus Isolated from Variety of Infections in India.

    PubMed

    Panda, Sasmita; Jena, Smrutiti; Sharma, Savitri; Dhawan, Benu; Nath, Gopal; Singh, Durg Vijai

    2016-01-01

    The aim of this study was to determine sequence types of 34 S. haemolyticus strains isolated from a variety of infections between 2013 and 2016 in India by MLST. The MEGA5.2 software was used to align and compare the nucleotide sequences. Advanced cluster analysis was performed to define the clonal complexes. MLST analysis showed 24 new sequence types (ST) among S. haemolyticus isolates, irrespective of sources and place of isolation. The findings of this study allowed an MLST database to be set up on the PubMLST.org website using BIGSdb software and made available at http://pubmlst.org/shaemolyticus/. The data of this study thus suggest that MLST can be used to study population structure and diversity among S. haemolyticus isolates.

  12. Identification of Novel Sequence Types among Staphylococcus haemolyticus Isolated from Variety of Infections in India

    PubMed Central

    Panda, Sasmita; Jena, Smrutiti; Sharma, Savitri; Dhawan, Benu; Nath, Gopal

    2016-01-01

    The aim of this study was to determine sequence types of 34 S. haemolyticus strains isolated from a variety of infections between 2013 and 2016 in India by MLST. The MEGA5.2 software was used to align and compare the nucleotide sequences. Advanced cluster analysis was performed to define the clonal complexes. MLST analysis showed 24 new sequence types (ST) among S. haemolyticus isolates, irrespective of sources and place of isolation. The findings of this study allowed an MLST database to be set up on the PubMLST.org website using BIGSdb software and made available at http://pubmlst.org/shaemolyticus/. The data of this study thus suggest that MLST can be used to study population structure and diversity among S. haemolyticus isolates. PMID:27824930

  13. Iterative Correction of Reference Nucleotides (iCORN) using second generation sequencing technology.

    PubMed

    Otto, Thomas D; Sanders, Mandy; Berriman, Matthew; Newbold, Chris

    2010-07-15

    The accuracy of reference genomes is important for downstream analysis but a low error rate requires expensive manual interrogation of the sequence. Here, we describe a novel algorithm (Iterative Correction of Reference Nucleotides) that iteratively aligns deep coverage of short sequencing reads to correct errors in reference genome sequences and evaluate their accuracy. Using Plasmodium falciparum (81% A + T content) as an extreme example, we show that the algorithm is highly accurate and corrects over 2000 errors in the reference sequence. We give examples of its application to numerous other eukaryotic and prokaryotic genomes and suggest additional applications. The software is available at http://icorn.sourceforge.net

  14. Quasispecies Analyses of the HIV-1 Near-full-length Genome With Illumina MiSeq

    PubMed Central

    Ode, Hirotaka; Matsuda, Masakazu; Matsuoka, Kazuhiro; Hachiya, Atsuko; Hattori, Junko; Kito, Yumiko; Yokomaku, Yoshiyuki; Iwatani, Yasumasa; Sugiura, Wataru

    2015-01-01

    Human immunodeficiency virus type-1 (HIV-1) exhibits high between-host genetic diversity and within-host heterogeneity, recognized as quasispecies. Because HIV-1 quasispecies fluctuate in terms of multiple factors, such as antiretroviral exposure and host immunity, analyzing the HIV-1 genome is critical for selecting effective antiretroviral therapy and understanding within-host viral coevolution mechanisms. Here, to obtain HIV-1 genome sequence information that includes minority variants, we sought to develop a method for evaluating quasispecies throughout the HIV-1 near-full-length genome using the Illumina MiSeq benchtop deep sequencer. To ensure the reliability of minority mutation detection, we applied an analysis method of sequence read mapping onto a consensus sequence derived from de novo assembly followed by iterative mapping and subsequent unique error correction. Deep sequencing analyses of an HIV-1 clone showed that the analysis method reduced erroneous base prevalence below 1% in each sequence position and discarded only <1% of all collected nucleotides, maximizing the usage of the collected genome sequences. Further, we designed primer sets to amplify the HIV-1 near-full-length genome from clinical plasma samples. Deep sequencing of 92 samples in combination with the primer sets and our analysis method provided sufficient coverage to identify >1%-frequency sequences throughout the genome. When we evaluated sequences of pol genes from 18 treatment-naïve patients' samples, the deep sequencing results were in agreement with Sanger sequencing and identified numerous additional minority mutations. The results suggest that our deep sequencing method would be suitable for identifying within-host viral population dynamics throughout the genome. PMID:26617593
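The 1% frequency threshold described above can be illustrated with a toy per-position tally over already-aligned reads. This is a simplified sketch, not the authors' pipeline: it assumes equal-length reads aligned to the same coordinates with "-" marking gaps, performs no mapping or error correction, and the function name `minority_variants` and its `threshold` parameter are hypothetical.

```python
from collections import Counter


def minority_variants(aligned_reads, threshold=0.01):
    """For each alignment column, report non-consensus bases whose
    frequency among covering reads is at least `threshold`.

    Returns {position: {base: frequency}} with 0-based positions."""
    length = len(aligned_reads[0])
    result = {}
    for pos in range(length):
        # Ignore gap characters so depth reflects reads covering the site.
        column = [read[pos] for read in aligned_reads if read[pos] != "-"]
        if not column:
            continue
        counts = Counter(column)
        consensus, _ = counts.most_common(1)[0]
        depth = len(column)
        minor = {
            base: count / depth
            for base, count in counts.items()
            if base != consensus and count / depth >= threshold
        }
        if minor:
            result[pos] = minor
    return result
```

In practice the reliability of such calls depends on sequencing depth and on the residual error rate at each position, which is why the abstract stresses reducing erroneous base prevalence below 1% before variants are interpreted.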

  15. Molecular phylogeny for marine turtles based on sequences of the ND4-leucine tRNA and control regions of mitochondrial DNA.

    PubMed

    Dutton, P H; Davis, S K; Guerra, T; Owens, D

    1996-06-01

    Marine turtles are divided into two families, the Dermochelyidae and the Cheloniidae. The majority of species are currently placed within the two tribes of the Cheloniidae, the Chelonini and the Carettini, but debate continues over generic and tribal affinities as well as species boundaries. We used nucleotide sequences (907 bp) from the ND4-LEU tRNA region and the control region (526 bp) of mitochondrial DNA to resolve areas of uncertainty in marine turtle (Chelonioidae) systematics. The ND4-LEU tRNA fragment was more conserved than the fragment from the control region, with sequence divergences ranging from 0.026 to 0.148 and 0.067 to 0.267, respectively. Parsimony analysis based only on the ND4-LEU tRNA data suggests that the hawksbill, Eretmochelys imbricata, lies within the tribe Carettini and is closely related to the genus Caretta, but could not resolve the position of the flatback, Natator depressus. A similar analysis based only on the control region sequence data suggested that N. depressus is affiliated with the Chelonini, but failed to resolve the position of E. imbricata and the loggerhead, Caretta caretta. In contrast to these results, the combination of both data sets with published cytochrome b data produced a phylogeny based on 1924 bp of sequence data which resolves the position of E. imbricata relative to Caretta and Lepidochelys and joins N. depressus as sister to the Carettini. Based on the molecular data, the Chelonini contains the Chelonia species, while the Carettini contains the remaining species of Cheloniidae. The control region sequence divergence between Pacific and Atlantic populations of the leatherback, Dermochelys coriacea, was relatively low (0.0081) when compared with the green turtle, Chelonia mydas (0.071-0.074). Atlantic and Pacific populations of Ch. mydas were found to be paraphyletic with respect to the black turtle, Ch. agassizi, suggesting that the current taxonomic designations within the Pacific Chelonia are questionable. This analysis shows the utility of combining sequence data for different regions of mtDNA that by themselves are insufficient to obtain robust phylogenies.

  16. High quality de novo sequencing and assembly of the Saccharomyces arboricolus genome

    PubMed Central

    2013-01-01

    Background Comparative genomics is a formidable tool to identify functional elements throughout a genome. In the past ten years, studies in the budding yeast Saccharomyces cerevisiae and a set of closely related species have been instrumental in showing the benefit of analyzing patterns of sequence conservation. Increasing the number of closely related genome sequences makes the comparative genomics approach more powerful and accurate. Results Here, we report the genome sequence and analysis of Saccharomyces arboricolus, a yeast species recently isolated in China, that is closely related to S. cerevisiae. We obtained high quality de novo sequence and assemblies using a combination of next generation sequencing technologies, established the phylogenetic position of this species and considered its phenotypic profile under multiple environmental conditions in the light of its gene content and phylogeny. Conclusions We suggest that the genome of S. arboricolus will be useful in future comparative genomics analysis of the Saccharomyces sensu stricto yeasts. PMID:23368932

  17. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach

    DOE PAGES

    Musumeci, Matias A.; Lozada, Mariana; Rial, Daniela V.; ...

    2017-04-09

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. As a result, this work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.

  18. Genome-wide screening of Oryza sativa ssp. japonica and indica reveals a complex family of proteins with ribosome-inactivating protein domains.

    PubMed

    Wytynck, Pieter; Rougé, Pierre; Van Damme, Els J M

    2017-11-01

    Ribosome-inactivating proteins (RIPs) are cytotoxic enzymes capable of halting protein synthesis by irreversible modification of ribosomes. Although RIPs are widespread they are not ubiquitous in the plant kingdom. The physiological importance of RIPs is not fully elucidated, but evidence suggests a role in the protection of the plant against biotic and abiotic stresses. Searches in the rice genome revealed a large and highly complex family of proteins with a RIP domain. A comparative analysis retrieved 38 RIP sequences from the genome sequence of Oryza sativa subspecies japonica and 34 sequences from the subspecies indica. The RIP sequences are scattered over different chromosomes but are mostly found on the third chromosome. The phylogenetic tree revealed the pairwise clustering of RIPs from japonica and indica. Molecular modeling and sequence analysis yielded information on the catalytic site of the enzyme, and suggested that a large part of RIP domains probably possess N-glycosidase activity. Several RIPs are differentially expressed in plant tissues and in response to specific abiotic stresses. This study provides an overview of RIP motifs in rice and will help to understand their biological role(s) and evolutionary relationships. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Musumeci, Matias A.; Lozada, Mariana; Rial, Daniela V.

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. As a result, this work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.

  20. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach.

    PubMed

    Musumeci, Matías A; Lozada, Mariana; Rial, Daniela V; Mac Cormack, Walter P; Jansson, Janet K; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M

    2017-04-09

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.

  1. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach

    PubMed Central

    Musumeci, Matías A.; Lozada, Mariana; Rial, Daniela V.; Mac Cormack, Walter P.; Jansson, Janet K.; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M.

    2017-01-01

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer–Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments. PMID:28397770

  2. Nucleotide sequence and regulatory studies of VGF, a nervous system-specific mRNA that is rapidly and relatively selectively induced by nerve growth factor.

    PubMed

    Salton, S R

    1991-09-01

    A nervous system-specific mRNA that is rapidly induced in PC12 cells to a greater extent by nerve growth factor (NGF) than by epidermal growth factor treatment has been cloned. The polypeptide deduced from the nucleic acid sequence of the NGF33.1 cDNA clone contains regions of amino acid sequence identity with that predicted by the cDNA clone VGF, and further analysis suggests that both NGF33.1 and VGF cDNA clones very likely correspond to the same mRNA (VGF). In this report both the nucleic acid sequence that corresponds to VGF mRNA and the polypeptide predicted by the NGF33.1 cDNA clone are presented. Genomic Southern analysis and database comparison did not detect additional sequences with high homology to the VGF gene. Induction of VGF mRNA by depolarization and phorbol 12-myristate 13-acetate treatment was greater than by serum stimulation or protein kinase A pathway activation. These studies suggest that VGF mRNA is induced to the greatest extent by NGF treatment and that VGF is one of the most rapidly regulated neuronal mRNAs identified in PC12 cells.

  3. Mutational Analysis of Extranodal NK/T-Cell Lymphoma Using Targeted Sequencing with a Comprehensive Cancer Panel.

    PubMed

    Choi, Seungkyu; Go, Jai Hyang; Kim, Eun Kyung; Lee, Hojung; Lee, Won Mi; Cho, Chun-Sung; Han, Kyudong

    2016-09-01

    Extranodal natural killer (NK)/T-cell lymphoma, nasal type (NKTCL), is a malignant disorder of cytotoxic lymphocytes of NK or T cells. It is an aggressive neoplasm with a very poor prognosis. Although extranodal NKTCL reportedly has a strong association with Epstein-Barr virus, the molecular pathogenesis of NKTCL has been unexplored. The recent technological advancements in next-generation sequencing (NGS) have made DNA sequencing cost- and time-effective, with more reliable results. Using the Ion Proton Comprehensive Cancer Panel, we sequenced 409 cancer-related genes to identify somatic mutations in five NKTCL tissue samples. The sequencing analysis detected 25 mutations in 21 genes. Among them, KMT2D, a histone modification-related gene, was the most frequently mutated gene (four of the five cases). This result was consistent with recent NGS studies that have suggested KMT2D as a novel driver gene in NKTCL. Mutations were also found in ARID1A, a chromatin remodeling gene, and TP53, which also recurred in recent NGS studies. We also found mutations in 18 novel candidate genes, with molecular functions that were potentially implicated in cancer development. We suggest that these genes may result in multiple oncogenic events and may be used as potential bio-markers of NKTCL in the future.

  4. Core-SINE blocks comprise a large fraction of monotreme genomes; implications for vertebrate chromosome evolution.

    PubMed

    Kirby, Patrick J; Greaves, Ian K; Koina, Edda; Waters, Paul D; Marshall Graves, Jennifer A

    2007-01-01

    The genomes of the egg-laying platypus and echidna are of particular interest because monotremes are the most basal mammal group. The chromosomal distribution of an ancient family of short interspersed repeats (SINEs), the core-SINEs, was investigated to better understand monotreme genome organization and evolution. Previous studies have identified the core-SINE as the predominant SINE in the platypus genome, and in this study we quantified, characterized and localized subfamilies. Dot blot analysis suggested that a very large fraction (32% of the platypus and 16% of the echidna genome) is composed of Mon core-SINEs. Core-SINE-specific primers were used to amplify PCR products from platypus and echidna genomic DNA. Sequence analysis suggests a common consensus sequence Mon 1-B, shared by platypus and echidna, as well as platypus-specific Mon 1-C and echidna specific Mon 1-D consensus sequences. FISH mapping of the Mon core-SINE products to platypus metaphase spreads demonstrates that the Mon-1C subfamily is responsible for the striking Mon core-SINE accumulation in the distal regions of the six large autosomal pairs and the largest X chromosome. This unusual distribution highlights the dichotomy between the seven large chromosome pairs and the 19 smaller pairs in the monotreme karyotype, which has some similarity to the macro- and micro-chromosomes of birds and reptiles, and suggests that accumulation of repetitive sequences may have enlarged small chromosomes in an ancestral vertebrate. In the forthcoming sequence of the platypus genome there are still large gaps, and the extensive Mon core-SINE accumulation on the distal regions of the six large autosomal pairs may provide one explanation for this missing sequence.

  5. Identification of RAN1 orthologue associated with sex determination through whole genome sequencing analysis in fig (Ficus carica L.).

    PubMed

    Mori, Kazuki; Shirasawa, Kenta; Nogata, Hitoshi; Hirata, Chiharu; Tashiro, Kosuke; Habu, Tsuyoshi; Kim, Sangwan; Himeno, Shuichi; Kuhara, Satoru; Ikegami, Hidetoshi

    2017-01-25

    With the aim of identifying sex determinants of fig, we generated the first draft genome sequence of fig and conducted the subsequent analyses. Linkage analysis with a high-density genetic map established by a restriction-site associated sequencing technique, and genome-wide association study followed by whole-genome resequencing analysis identified two missense mutations in RESPONSIVE-TO-ANTAGONIST1 (RAN1) orthologue encoding copper-transporting ATPase completely associated with sex phenotypes of investigated figs. This result suggests that RAN1 is a possible sex determinant candidate in the fig genome. The genomic resources and genetic findings obtained in this study can contribute to general understanding of Ficus species and provide an insight into fig's and plant's sex determination system.

  6. Cellulase Linkers Are Optimized Based on Domain Type and Function: Insights from Sequence Analysis, Biophysical Measurements, and Molecular Simulation

    PubMed Central

    Sammond, Deanne W.; Payne, Christina M.; Brunecky, Roman; Himmel, Michael E.; Crowley, Michael F.; Beckham, Gregg T.

    2012-01-01

    Cellulase enzymes deconstruct cellulose to glucose, and are often comprised of glycosylated linkers connecting glycoside hydrolases (GHs) to carbohydrate-binding modules (CBMs). Although linker modifications can alter cellulase activity, the functional role of linkers beyond domain connectivity remains unknown. Here we investigate cellulase linkers connecting GH Family 6 or 7 catalytic domains to Family 1 or 2 CBMs, from both bacterial and eukaryotic cellulases to identify conserved characteristics potentially related to function. Sequence analysis suggests that the linker lengths between structured domains are optimized based on the GH domain and CBM type, such that linker length may be important for activity. Longer linkers are observed in eukaryotic GH Family 6 cellulases compared to GH Family 7 cellulases. Bacterial GH Family 6 cellulases are found with structured domains in either N to C terminal order, and similar linker lengths suggest there is no effect of domain order on length. O-glycosylation is uniformly distributed across linkers, suggesting that glycans are required along entire linker lengths for proteolysis protection and, as suggested by simulation, for extension. Sequence comparisons show that proline content for bacterial linkers is more than double that observed in eukaryotic linkers, but with fewer putative O-glycan sites, suggesting alternative methods for extension. Conversely, near linker termini where linkers connect to structured domains, O-glycosylation sites are observed less frequently, whereas glycines are more prevalent, suggesting the need for flexibility to achieve proper domain orientations. Putative N-glycosylation sites are quite rare in cellulase linkers, while an N-P motif, which strongly disfavors the attachment of N-glycans, is commonly observed. These results suggest that linkers exhibit features that are likely tailored for optimal function, despite possessing low sequence identity. This study suggests that cellulase linkers may exhibit function in enzyme action, and highlights the need for additional studies to elucidate cellulase linker functions. PMID:23139804

  7. Molecular Evidence of Chlamydia-Like Organisms in the Feces of Myotis daubentonii Bats.

    PubMed

    Hokynar, K; Vesterinen, E J; Lilley, T M; Pulliainen, A T; Korhonen, S J; Paavonen, J; Puolakkainen, M

    2017-01-15

    Chlamydia-like organisms (CLOs) are recently identified members of the Chlamydiales order. CLOs share intracellular lifestyles and biphasic developmental cycles, and they have been detected in environmental samples as well as in various hosts such as amoebae and arthropods. In this study, we screened bat feces for the presence of CLOs by molecular analysis. Using pan-Chlamydiales PCR targeting the 16S rRNA gene, Chlamydiales DNA was detected in 54% of the specimens. PCR amplification, sequencing, and phylogenetic analysis of the 16S rRNA and 23S rRNA genes were used to classify positive specimens and infer their phylogenetic relationships. Most sequences matched best with Rhabdochlamydia species or uncultured Chlamydia sequences identified in ticks. Another set of sequences matched best with sequences of the Chlamydia genus or uncultured Chlamydiales from snakes. To gain evidence of whether CLOs in bat feces are merely diet borne, we analyzed insects trapped from the same location where the bats foraged. Interestingly, the CLO sequences resembling Rhabdochlamydia spp. were detected in insect material as well, but the other set of CLO sequences was not, suggesting that this set might not originate from prey. Thus, bats represent another potential host for Chlamydiales and could harbor novel, previously unidentified members of this order. Several pathogenic viruses are known to colonize bats, and recent analyses indicate that bats are also reservoir hosts for bacterial genera. Chlamydia-like organisms (CLOs) have been detected in several animal species. CLOs have high 16S rRNA sequence similarity to Chlamydiaceae and exhibit similar intracellular lifestyles and biphasic developmental cycles. Our study describes the frequent occurrence of CLO DNA in bat feces, suggesting an expanding host species spectrum for the Chlamydiales. As bats can acquire various infectious agents through their diet, prey insects were also studied. We identified CLO sequences in bats that matched best with sequences in prey insects but also CLO sequences not detected in prey insects. This suggests that a portion of CLO DNA present in bat feces is not prey borne. Furthermore, some sequences from bat droppings not originating from their diet might well represent novel, previously unidentified members of the Chlamydiales order. Copyright © 2016 American Society for Microbiology.

  8. Analysis of beta-carotene hydroxylase gene cDNA isolated from the American oil-palm (Elaeis oleifera) mesocarp tissue cDNA library

    PubMed Central

    Bhore, Subhash J; Kassim, Amelia; Loh, Chye Ying; Shah, Farida H

    2010-01-01

    It is well known that the nutritional quality of the American oil-palm (Elaeis oleifera) mesocarp oil is superior to that of African oil-palm (Elaeis guineensis Jacq. Tenera) mesocarp oil. Therefore, it is important to identify the genetic features responsible for its superior value. This could be achieved through genome sequencing of the oil-palm. However, the genome sequence is not available in the public domain due to commercial secrecy. Hence, we constructed a cDNA library and generated expressed sequence tags (3,205) from the mesocarp tissue of the American oil-palm. We continued to annotate each of these cDNAs after submitting them to GenBank/DDBJ/EMBL. A rough analysis turned our attention to the cDNA encoding the beta-carotene hydroxylase (Chyb) enzyme. We then completed the full sequencing of both strands of the cDNA clone using M13 forward and reverse primers. The full nucleotide and protein sequences were further analyzed and annotated using various bioinformatics tools. The analysis showed the presence of a fatty acid hydroxylase superfamily domain in the protein sequence. Multiple sequence alignment, using ClustalW, of E. oleifera Chyb with selected Chyb amino acid sequences from other plant species and algal members, together with phylogenetic analysis, suggests that Chyb from the monocotyledonous plant species Lilium hybrid, Crocus sativus and Zea mays is the most closely related evolutionarily to E. oleifera Chyb. This study reports the annotation of E. oleifera Chyb. Abbreviations: ESTs - expressed sequence tags, EoChyb - Elaeis oleifera beta-carotene hydroxylase, MC - main cluster. PMID:21364789

  9. A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.

    PubMed

    Razvi, F; Gargiulo, G; Worcel, A

    1983-08-01

    Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32Ps at the reformed restriction site. Double digestion of the circular DNA with the original enzyme and a second restriction enzyme that cleaves near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. A similar double digestion, using an isoschizomer that cleaves differently at the 32P-labeled site, allows direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.

  10. Genomic characterization and taxonomic position of a rhabdovirus from a hybrid snakehead.

    PubMed

    Zeng, Weiwei; Wang, Qing; Wang, Yingying; Liu, Cun; Liang, Hongru; Fang, Xiang; Wu, Shuqin

    2014-09-01

    A new rhabdovirus, tentatively designated as hybrid snakehead rhabdovirus C1207 (HSHRV-C1207), was first isolated from a moribund hybrid snakehead (Channa maculata×Channa argus) in China. We present the complete genome sequence of HSHRV-C1207 and a comprehensive sequence comparison between HSHRV-C1207 and other rhabdoviruses. Sequence alignment and phylogenetic analysis revealed that HSHRV-C1207 shared the highest degree of homology with Monopterus albus rhabdovirus and Siniperca chuatsi rhabdovirus. All three viruses clustered into a single group that was distinct from the recognized genera in the family Rhabdoviridae. Our analysis suggests that HSHRV-C1207, as well as MARV and SCRV, should be assigned to a new rhabdovirus genus.

  11. Exome sequence analysis suggests genetic burden contributes to phenotypic variability and complex neuropathy

    PubMed Central

    Gonzaga-Jauregui, Claudia; Harel, Tamar; Gambin, Tomasz; Kousi, Maria; Griffin, Laurie B.; Francescatto, Ludmila; Ozes, Burcak; Karaca, Ender; Jhangiani, Shalini; Bainbridge, Matthew N.; Lawson, Kim S.; Pehlivan, Davut; Okamoto, Yuji; Withers, Marjorie; Mancias, Pedro; Slavotinek, Anne; Reitnauer, Pamela J; Goksungur, Meryem T.; Shy, Michael; Crawford, Thomas O.; Koenig, Michel; Willer, Jason; Flores, Brittany N.; Pediaditrakis, Igor; Us, Onder; Wiszniewski, Wojciech; Parman, Yesim; Antonellis, Anthony; Muzny, Donna M.; Katsanis, Nicholas; Battaloglu, Esra; Boerwinkle, Eric; Gibbs, Richard A.; Lupski, James R.

    2015-01-01

    Charcot-Marie-Tooth (CMT) disease is a clinically and genetically heterogeneous distal symmetric polyneuropathy. Whole-exome sequencing (WES) of 40 individuals from 37 unrelated families with CMT-like peripheral neuropathy refractory to molecular diagnosis identified apparent causal mutations in ~45% (17/37) of families. Three candidate disease genes are proposed, supported by a combination of genetic and in vivo studies. Aggregate analysis of mutation data revealed a significantly increased number of rare variants across 58 neuropathy associated genes in subjects versus controls; confirmed in a second ethnically discrete neuropathy cohort, suggesting mutation burden potentially contributes to phenotypic variability. Neuropathy genes shown to have highly penetrant Mendelizing variants (HMPVs) and implicated by burden in families were shown to interact genetically in a zebrafish assay exacerbating the phenotype established by the suppression of single genes. Our findings suggest that the combinatorial effect of rare variants contributes to disease burden and variable expressivity. PMID:26257172

  12. Great majority of recombination events in Arabidopsis are gene conversion events

    PubMed Central

    Yang, Sihai; Yuan, Yang; Wang, Long; Li, Jing; Wang, Wen; Liu, Haoxuan; Chen, Jian-Qun; Hurst, Laurence D.; Tian, Dacheng

    2012-01-01

    The evolutionary importance of meiosis may not solely be associated with allelic shuffling caused by crossing-over but also have to do with its more immediate effects such as gene conversion. Although estimates of the crossing-over rate are often well resolved, the gene conversion rate is much less clear. In Arabidopsis, for example, next-generation sequencing approaches suggest that the two rates are about the same, which contrasts with indirect measures, these suggesting an excess of gene conversion. Here, we provide analysis of this problem by sequencing 40 F2 Arabidopsis plants and their parents. Small gene conversion tracts, with biased gene conversion content, represent over 90% (probably nearer 99%) of all recombination events. The rate of alteration of protein sequence caused by gene conversion is over 600 times that caused by mutation. Finally, our analysis reveals recombination hot spots and unexpectedly high recombination rates near centromeres. This may be responsible for the previously unexplained pattern of high genetic diversity near Arabidopsis centromeres. PMID:23213238

  13. Genetic diversity of HIV-1 non-B strains in Sicily: evidence of intersubtype recombinants by sequence analysis of gag, pol, and env genes.

    PubMed

    Tramuto, Fabio; Bonura, Filippa; Perna, Anna Maria; Mancuso, Salvatrice; Firenze, Alberto; Romano, Nino; Vitale, Francesco

    2007-09-01

    The molecular epidemiology of HIV-1 strains in Sicily (Italy) was phylogenetically investigated by the analysis of HIV-1 gag, pol, and env gene sequences from 11 HIV-1 non-B strains from 408 HIV-1-seropositive patients observed from September 2001 to August 2006. Sequences suggestive of recombination were further investigated by bootscanning analysis of various fragments. Overall, we identified several second-generation recombinant (SGR) strains, which contained genetic material of CRF02_AG in at least one gene. Notably, three individuals were found to be infected with subsubtype A3, and one of them showed genetic recombination with subsubtype A4. The current study emphasizes the genetic analysis of gag, pol, and env genes as a powerful tool to trace the spread of complex HIV-1 recombinant forms, and highlights the genetic diversity of HIV-1 non-B strains in Italy.

  14. MetaDP: a comprehensive web server for disease prediction of 16S rRNA metagenomic datasets.

    PubMed

    Xu, Xilin; Wu, Aiping; Zhang, Xinlei; Su, Mingming; Jiang, Taijiao; Yuan, Zhe-Ming

    2016-01-01

    High-throughput sequencing-based metagenomics has garnered considerable interest in recent years. Numerous methods and tools have been developed for the analysis of metagenomic data. However, it is still a daunting task to install a large number of tools and complete a complicated analysis, especially for researchers with minimal bioinformatics backgrounds. To address this problem, we constructed an automated software named MetaDP for 16S rRNA sequencing data analysis, including data quality control, operational taxonomic unit clustering, diversity analysis, and disease risk prediction modeling. Furthermore, a support vector machine-based prediction model for irritable bowel syndrome (IBS) was built by applying MetaDP to microbial 16S sequencing data from 108 children. The success of the IBS prediction model suggests that the platform may also be applied to other diseases related to gut microbes, such as obesity, metabolic syndrome, or intestinal cancer, among others (http://metadp.cn:7001/).
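    The abstract above lists "diversity analysis" as one pipeline step without saying which measure is used. As a minimal sketch only, assuming the Shannon index over per-OTU read counts (a common alpha-diversity measure, not confirmed as MetaDP's own method), the calculation could look like this; the function name and the counts are hypothetical.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with nonzero counts.

    `otu_counts` is a list of read counts per OTU (hypothetical input format).
    """
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("need at least one read")
    h = 0.0
    for count in otu_counts:
        if count > 0:  # zero-count OTUs contribute nothing to H
            p = count / total
            h -= p * math.log(p)
    return h


# Hypothetical sample: two equally abundant OTUs give H = ln 2.
print(shannon_index([10, 10]))
```

    A single dominant OTU gives H = 0, and H grows as reads spread more evenly across more OTUs.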

  15. Phylogenetic Network Analysis Revealed the Occurrence of Horizontal Gene Transfer of 16S rRNA in the Genus Enterobacter

    PubMed Central

    Sato, Mitsuharu; Miyazaki, Kentaro

    2017-01-01

    Horizontal gene transfer (HGT) is a ubiquitous genetic event in bacterial evolution, but it seldom occurs for genes involved in highly complex supramolecules (or biosystems), which consist of many gene products. The ribosome is one such supramolecule, but several bacteria harbor dissimilar and/or chimeric 16S rRNAs in their genomes, suggesting the occurrence of HGT of this gene. However, we know little about whether the genes actually experience HGT and, if so, the frequency of such a transfer. This is primarily because the methods currently employed for phylogenetic analysis (e.g., neighbor-joining, maximum likelihood, and maximum parsimony) of 16S rRNA genes assume point mutation-driven tree-shape evolution as an evolutionary model, which is intrinsically inappropriate to decipher the evolutionary history for genes driven by recombination. To address this issue, we applied a phylogenetic network analysis, which has been used previously for detection of genetic recombination in homologous alleles, to the 16S rRNA gene. We focused on the genus Enterobacter, whose phylogenetic relationships inferred by multi-locus sequence alignment analysis and 16S rRNA sequences are incompatible. All 10 complete genomic sequences were retrieved from the NCBI database, in which 71 16S rRNA genes were included. Neighbor-joining analysis demonstrated that the genes residing in the same genomes clustered, indicating the occurrence of intragenomic recombination. However, as suggested by the low bootstrap values, evolutionary relationships between the clusters were uncertain. We then applied phylogenetic network analysis to representative sequences from each cluster. We found three ancestral 16S rRNA groups; the others were likely created through recursive recombination between the ancestors and chimeric descendants. Despite the large sequence changes caused by the recombination events, the RNA secondary structures were conserved. Successive intergenomic and intragenomic recombination thus shaped the evolution of 16S rRNA genes in the genus Enterobacter. PMID:29180992

  16. A novel typing method for Listeria monocytogenes using high-resolution melting analysis (HRMA) of tandem repeat regions.

    PubMed

    Ohshima, Chihiro; Takahashi, Hajime; Iwakawa, Ai; Kuda, Takashi; Kimura, Bon

    2017-07-17

    Listeria monocytogenes, which is responsible for causing food poisoning known as listeriosis, infects humans and animals. Widely distributed in the environment, this bacterium is known to contaminate food products after being transmitted to factories via raw materials. To minimize the contamination of products by food pathogens, it is critical to identify and eliminate factory entry routes and pathways for the causative bacteria. High resolution melting analysis (HRMA) is a method that takes advantage of differences in DNA sequences and PCR product lengths that are reflected by the dissociation temperature. Through our research, we have developed a multiple locus variable-number tandem repeat analysis (MLVA) using HRMA as a simple and rapid method to differentiate L. monocytogenes isolates. To evaluate the method, we compared the ability of MLVA-HRMA, MLVA using capillary electrophoresis, and multilocus sequence typing (MLST) to discriminate between strains. The MLVA-HRMA method displayed greater discriminatory ability than MLST and MLVA using capillary electrophoresis, suggesting that the variation in the number of repeat units, along with mutations within the DNA sequence, was accurately reflected by the melting curve of HRMA. Rather than relying on DNA sequence analysis or high-resolution electrophoresis, the MLVA-HRMA method employs the same process as PCR until the analysis step, suggesting a combination of speed and simplicity. Results of the MLVA-HRMA method can be shared between laboratories. There are high expectations that this method will be adopted for regular inspections at food processing facilities in the near future. Copyright © 2017. Published by Elsevier B.V.

  17. Genes encoding calmodulin-binding proteins in the Arabidopsis genome

    NASA Technical Reports Server (NTRS)

    Reddy, Vaka S.; Ali, Gul S.; Reddy, Anireddy S N.

    2002-01-01

    Analysis of the recently completed Arabidopsis genome sequence indicates that approximately 31% of the predicted genes could not be assigned to functional categories, as they do not show any sequence similarity with proteins of known function from other organisms. Calmodulin (CaM), a ubiquitous and multifunctional Ca(2+) sensor, interacts with a wide variety of cellular proteins and modulates their activity/function in regulating diverse cellular processes. However, the primary amino acid sequence of the CaM-binding domain in different CaM-binding proteins (CBPs) is not conserved. One way to identify most of the CBPs in the Arabidopsis genome is by protein-protein interaction-based screening of expression libraries with CaM. Here, using a mixture of radiolabeled CaM isoforms from Arabidopsis, we screened several expression libraries prepared from flower meristem, seedlings, or tissues treated with hormones, an elicitor, or a pathogen. Sequence analysis of 77 positive clones that interact with CaM in a Ca(2+)-dependent manner revealed 20 CBPs, including 14 previously unknown CBPs. In addition, by searching the Arabidopsis genome sequence with the newly identified and known plant or animal CBPs, we identified a total of 27 CBPs. Among these, 16 CBPs are represented by families with 2-20 members in each family. Gene expression analysis revealed that CBPs and CBP paralogs are expressed differentially. Our data suggest that Arabidopsis has a large number of CBPs including several plant-specific ones. Although CaM is highly conserved between plants and animals, only a few CBPs are common to both plants and animals. Analysis of Arabidopsis CBPs revealed the presence of a variety of interesting domains. Our analyses identified several hypothetical proteins in the Arabidopsis genome as CaM targets, suggesting their involvement in Ca(2+)-mediated signaling networks.

  18. Multi-Virulence-Locus Sequence Typing of Staphylococcus lugdunensis Generates Results Consistent with a Clonal Population Structure and Is Reliable for Epidemiological Typing

    PubMed Central

    Didi, Jennifer; Lemée, Ludovic; Gibert, Laure; Pons, Jean-Louis

    2014-01-01

    Staphylococcus lugdunensis is an emergent virulent coagulase-negative staphylococcus responsible for severe infections similar to those caused by Staphylococcus aureus. To understand its potentially pathogenic capacity and have further detailed knowledge of the molecular traits of this organism, 93 isolates from various geographic origins were analyzed by multi-virulence-locus sequence typing (MVLST), targeting seven known or putative virulence-associated loci (atlLR2, atlLR3, hlb, isdJ, SLUG_09050, SLUG_16930, and vwbl). The polymorphisms of the putative virulence-associated loci were moderate and comparable to those of the housekeeping genes analyzed by multilocus sequence typing (MLST). However, the MVLST scheme generated 43 virulence types (VTs) compared to 20 sequence types (STs) based on MLST, indicating that MVLST was significantly more discriminating (Simpson's index [D], 0.943). No hypervirulent lineage or cluster specific to carriage strains was defined. The results of multilocus sequence analysis of known and putative virulence-associated loci are consistent with a clonal population structure for S. lugdunensis, suggesting a coevolution of these genes with housekeeping genes. Indeed, the nonsynonymous to synonymous evolutionary substitutions (dN/dS) ratio, the Tajima's D test, and Single-likelihood ancestor counting (SLAC) analysis suggest that all virulence-associated loci were under negative selection, even atlLR2 (AtlL protein) and SLUG_16930 (FbpA homologue), for which the dN/dS ratios were higher. In addition, this analysis of virulence-associated loci allowed us to propose a trilocus sequence typing scheme based on the intragenic regions of atlLR3, isdJ, and SLUG_16930, which is more discriminant than MLST for studying short-term epidemiology and further characterizing the lineages of the rare but highly pathogenic S. lugdunensis. PMID:25078912
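    The abstract above reports a Simpson's index (D) of 0.943 but does not give the formula. As a minimal sketch, assuming the Hunter-Gaston form of the index that is commonly used for typing schemes (not confirmed as the authors' exact calculation), D is the probability that two isolates drawn without replacement belong to different types. The function name and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def simpsons_index(type_labels):
    """Hunter-Gaston discriminatory index.

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)),
    where n_j is the number of isolates of type j and N is the total.
    """
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    type_sizes = Counter(type_labels).values()
    same_type_pairs = sum(n * (n - 1) for n in type_sizes)
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical toy data: six isolates assigned to three virulence types.
labels = ["VT1", "VT1", "VT2", "VT3", "VT3", "VT3"]
print(round(simpsons_index(labels), 3))  # 1 - 8/30, about 0.733
```

    A scheme that assigns every isolate its own type gives D = 1, and a scheme that lumps all isolates into one type gives D = 0, which is why a scheme yielding more types over the same isolates tends to score higher.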

  19. Subsurface microbial diversity in deep-granitic-fracture water in Colorado

    USGS Publications Warehouse

    Sahl, J.W.; Schmidt, R.; Swanner, E.D.; Mandernack, K.W.; Templeton, A.S.; Kieft, Thomas L.; Smith, R.L.; Sanford, W.E.; Callaghan, R.L.; Mitton, J.B.; Spear, J.R.

    2008-01-01

    A microbial community analysis using 16S rRNA gene sequencing was performed on borehole water and a granite rock core from Henderson Mine, a >1,000-meter-deep molybdenum mine near Empire, CO. Chemical analysis of borehole water at two separate depths (1,044 m and 1,004 m below the mine entrance) suggests that a sharp chemical gradient exists, likely from the mixing of two distinct subsurface fluids, one metal rich and one relatively dilute; this has created unique niches for microorganisms. The microbial community analyzed from filtered, oxic borehole water indicated an abundance of sequences from iron-oxidizing bacteria (Gallionella spp.) and was compared to the community from the same borehole after 2 weeks of being plugged with an expandable packer. Statistical analyses with UniFrac revealed a significant shift in community structure following the addition of the packer. Phospholipid fatty acid (PLFA) analysis suggested that Nitrosomonadales dominated the oxic borehole, while PLFAs indicative of anaerobic bacteria were most abundant in the samples from the plugged borehole. Microbial sequences were represented primarily by Firmicutes, Proteobacteria, and a lineage of sequences which did not group with any identified bacterial division; phylogenetic analyses confirmed the presence of a novel candidate division. This "Henderson candidate division" dominated the clone libraries from the dilute anoxic fluids. Sequences obtained from the granitic rock core (1,740 m below the surface) were represented by the divisions Proteobacteria (primarily the family Ralstoniaceae) and Firmicutes. Sequences grouping within Ralstoniaceae were also found in the clone libraries from metal-rich fluids yet were absent in more dilute fluids. Lineage-specific comparisons, combined with phylogenetic statistical analyses, show that geochemical variance has an important effect on microbial community structure in deep, subsurface systems. Copyright © 2008, American Society for Microbiology. All Rights Reserved.

  20. Subsurface Microbial Diversity in Deep-Granitic-Fracture Water in Colorado

    PubMed Central

    Sahl, Jason W.; Schmidt, Raleigh; Swanner, Elizabeth D.; Mandernack, Kevin W.; Templeton, Alexis S.; Kieft, Thomas L.; Smith, Richard L.; Sanford, William E.; Callaghan, Robert L.; Mitton, Jeffry B.; Spear, John R.

    2008-01-01

    A microbial community analysis using 16S rRNA gene sequencing was performed on borehole water and a granite rock core from Henderson Mine, a >1,000-meter-deep molybdenum mine near Empire, CO. Chemical analysis of borehole water at two separate depths (1,044 m and 1,004 m below the mine entrance) suggests that a sharp chemical gradient exists, likely from the mixing of two distinct subsurface fluids, one metal rich and one relatively dilute; this has created unique niches for microorganisms. The microbial community analyzed from filtered, oxic borehole water indicated an abundance of sequences from iron-oxidizing bacteria (Gallionella spp.) and was compared to the community from the same borehole after 2 weeks of being plugged with an expandable packer. Statistical analyses with UniFrac revealed a significant shift in community structure following the addition of the packer. Phospholipid fatty acid (PLFA) analysis suggested that Nitrosomonadales dominated the oxic borehole, while PLFAs indicative of anaerobic bacteria were most abundant in the samples from the plugged borehole. Microbial sequences were represented primarily by Firmicutes, Proteobacteria, and a lineage of sequences which did not group with any identified bacterial division; phylogenetic analyses confirmed the presence of a novel candidate division. This “Henderson candidate division” dominated the clone libraries from the dilute anoxic fluids. Sequences obtained from the granitic rock core (1,740 m below the surface) were represented by the divisions Proteobacteria (primarily the family Ralstoniaceae) and Firmicutes. Sequences grouping within Ralstoniaceae were also found in the clone libraries from metal-rich fluids yet were absent in more dilute fluids. Lineage-specific comparisons, combined with phylogenetic statistical analyses, show that geochemical variance has an important effect on microbial community structure in deep, subsurface systems. PMID:17981950

  1. Initial genome sequencing and analysis of multiple myeloma

    PubMed Central

    Chapman, Michael A.; Lawrence, Michael S.; Keats, Jonathan J.; Cibulskis, Kristian; Sougnez, Carrie; Schinzel, Anna C.; Harview, Christina L.; Brunet, Jean-Philippe; Ahmann, Gregory J.; Adli, Mazhar; Anderson, Kenneth C.; Ardlie, Kristin G.; Auclair, Daniel; Baker, Angela; Bergsagel, P. Leif; Bernstein, Bradley E.; Drier, Yotam; Fonseca, Rafael; Gabriel, Stacey B.; Hofmeister, Craig C.; Jagannath, Sundar; Jakubowiak, Andrzej J.; Krishnan, Amrita; Levy, Joan; Liefeld, Ted; Lonial, Sagar; Mahan, Scott; Mfuko, Bunmi; Monti, Stefano; Perkins, Louise M.; Onofrio, Robb; Pugh, Trevor J.; Vincent Rajkumar, S.; Ramos, Alex H.; Siegel, David S.; Sivachenko, Andrey; Trudel, Suzanne; Vij, Ravi; Voet, Douglas; Winckler, Wendy; Zimmerman, Todd; Carpten, John; Trent, Jeff; Hahn, William C.; Garraway, Levi A.; Meyerson, Matthew; Lander, Eric S.; Getz, Gad; Golub, Todd R.

    2013-01-01

    Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumor genomes and their comparison to matched normal DNAs. Several new and unexpected oncogenic mechanisms were suggested by the pattern of somatic mutation across the dataset. These include the mutation of genes involved in protein translation (seen in nearly half of the patients), genes involved in histone methylation, and genes involved in blood coagulation. In addition, a broader than anticipated role of NF-κB signaling was suggested by mutations in 11 members of the NF-κB pathway. Of potential immediate clinical relevance, activating mutations of the kinase BRAF were observed in 4% of patients, suggesting the evaluation of BRAF inhibitors in multiple myeloma clinical trials. These results indicate that cancer genome sequencing of large collections of samples will yield new insights into cancer not anticipated by existing knowledge. PMID:21430775

  2. Genomic characterisation of the feline sarcoid-associated papillomavirus and proposed classification as Bos taurus papillomavirus type 14.

    PubMed

    Munday, John S; Thomson, Neroli; Dunowska, Magda; Knight, Cameron G; Laurie, Rebecca E; Hills, Simon

    2015-06-12

    Feline sarcoids are rare mesenchymal neoplasms of domestic and exotic cats. Previous studies have consistently detected short DNA sequences from a papillomavirus (PV), designated feline sarcoid-associated papillomavirus (FeSarPV), in these neoplasms. The FeSarPV sequence has never been detected in any non-sarcoid sample from cats but has been amplified from the skin of cattle, suggesting that feline sarcoids are caused by cross-species infection by a bovine papillomavirus (BPV). The aim of the present study was to determine the genome of the PV that contains the FeSarPV sequence. Using the circular nature of PV DNA, four specifically designed 'outward facing' primers were used to amplify two approximately 4,000 bp DNA segments from a feline sarcoid. The two PCR products were sequenced using next generation sequencing and the full genome of the PV, consisting of 7,966 bp, was assembled and analysed. Phylogenetic analysis revealed the PV was closely related to the species 4 delta BPVs-1, -2, and -13, but distantly related to any carnivoran PV genus. These results are consistent with feline sarcoids being caused by a BPV type and we propose a classification of BPV-14 for this novel PV. Initial analysis suggests that, like other delta BPVs, the BPV-14 E5 protein could cause mesenchymal proliferation by binding to the platelet derived growth factor beta receptor. Interestingly, BPV-14 has not been detected in any equine sarcoid, suggesting that BPV-14 has a host range that is limited to bovids and felids. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Levels of integration in cognitive control and sequence processing in the prefrontal cortex.

    PubMed

    Bahlmann, Jörg; Korb, Franziska M; Gratton, Caterina; Friederici, Angela D

    2012-01-01

    Cognitive control is necessary to flexibly act in changing environments. Sequence processing is needed in language comprehension to build the syntactic structure in sentences. Functional imaging studies suggest that sequence processing engages the left ventrolateral prefrontal cortex (PFC). In contrast, cognitive control processes additionally recruit bilateral rostral lateral PFC regions. The present study aimed to investigate these two types of processes in one experimental paradigm. Sequence processing was manipulated using two different sequencing rules varying in complexity. Cognitive control was varied with different cue-sets that determined the choice of a sequencing rule. Univariate analyses revealed distinct PFC regions for the two types of processing (i.e. sequence processing: left ventrolateral PFC and cognitive control processing: bilateral dorsolateral and rostral PFC). Moreover, in a common brain network (including left lateral PFC and intraparietal sulcus) no interaction between sequence and cognitive control processing was observed. In contrast, a multivariate pattern analysis revealed an interaction of sequence and cognitive control processing, such that voxels in left lateral PFC and parietal cortex showed different tuning functions for tasks involving different sequencing and cognitive control demands. These results suggest that the difference between the process of rule selection (i.e. cognitive control) and the process of rule-based sequencing (i.e. sequence processing) find their neuronal underpinnings in distinct activation patterns in lateral PFC. Moreover, the combination of rule selection and rule sequencing can shape the response of neurons in lateral PFC and parietal cortex.

  4. Levels of Integration in Cognitive Control and Sequence Processing in the Prefrontal Cortex

    PubMed Central

    Bahlmann, Jörg; Korb, Franziska M.; Gratton, Caterina; Friederici, Angela D.

    2012-01-01

    Cognitive control is necessary to flexibly act in changing environments. Sequence processing is needed in language comprehension to build the syntactic structure in sentences. Functional imaging studies suggest that sequence processing engages the left ventrolateral prefrontal cortex (PFC). In contrast, cognitive control processes additionally recruit bilateral rostral lateral PFC regions. The present study aimed to investigate these two types of processes in one experimental paradigm. Sequence processing was manipulated using two different sequencing rules varying in complexity. Cognitive control was varied with different cue-sets that determined the choice of a sequencing rule. Univariate analyses revealed distinct PFC regions for the two types of processing (i.e. sequence processing: left ventrolateral PFC and cognitive control processing: bilateral dorsolateral and rostral PFC). Moreover, in a common brain network (including left lateral PFC and intraparietal sulcus) no interaction between sequence and cognitive control processing was observed. In contrast, a multivariate pattern analysis revealed an interaction of sequence and cognitive control processing, such that voxels in left lateral PFC and parietal cortex showed different tuning functions for tasks involving different sequencing and cognitive control demands. These results suggest that the difference between the process of rule selection (i.e. cognitive control) and the process of rule-based sequencing (i.e. sequence processing) find their neuronal underpinnings in distinct activation patterns in lateral PFC. Moreover, the combination of rule selection and rule sequencing can shape the response of neurons in lateral PFC and parietal cortex. PMID:22952762

  5. Streptococcus oriloxodontae sp. nov., isolated from the oral cavities of elephants.

    PubMed

    Shinozaki-Kuwahara, Noriko; Saito, Masanori; Hirasawa, Masatomo; Takada, Kazuko

    2014-11-01

    Two strains were isolated from oral cavity samples of healthy elephants. The isolates were Gram-positive, catalase-negative, coccus-shaped organisms that were tentatively identified as a streptococcal species based on the results of biochemical tests. Comparative 16S rRNA gene sequence analysis suggested classification of these organisms in the genus Streptococcus with Streptococcus criceti ATCC 19642(T) and Streptococcus orisuis NUM 1001(T) as their closest phylogenetic neighbours with 98.2 and 96.9% gene sequence similarity, respectively. When multi-locus sequence analysis using four housekeeping genes, groEL, rpoB, gyrB and sodA, was carried out, similarity of concatenated sequences of the four housekeeping genes from the new isolates and Streptococcus mutans was 89.7%. DNA-DNA hybridization experiments suggested that the new isolates were distinct from S. criceti and other species of the genus Streptococcus. On the basis of genotypic and phenotypic differences, it is proposed that the novel isolates are classified in the genus Streptococcus as representatives of Streptococcus oriloxodontae sp. nov. The type strain of S. oriloxodontae is NUM 2101(T) (= JCM 19285(T) = DSM 27377(T)). © 2014 IUMS.

  6. Influence of Cognitive Functioning on Age-Related Performance Declines in Visuospatial Sequence Learning.

    PubMed

    Krüger, Melanie; Hinder, Mark R; Puri, Rohan; Summers, Jeffery J

    2017-01-01

    Objectives: The aim of this study was to investigate how age-related performance differences in a visuospatial sequence learning task relate to age-related declines in cognitive functioning. Method: Cognitive functioning of 18 younger and 18 older participants was assessed using a standardized test battery. Participants then undertook a perceptual visuospatial sequence learning task. Various relationships between sequence learning and participants' cognitive functioning were examined through correlation and factor analysis. Results: Older participants exhibited significantly lower performance than their younger counterparts in the sequence learning task as well as in multiple cognitive functions. Factor analysis revealed two independent subsets of cognitive functions associated with performance in the sequence learning task, related to either the processing and storage of sequence information (first subset) or problem solving (second subset). Age-related declines were only found for the first subset of cognitive functions, which also explained a significant degree of the performance differences in the sequence learning task between age-groups. Discussion: The results suggest that age-related performance differences in perceptual visuospatial sequence learning can be explained by declines in the ability to process and store sequence information in older adults, while a set of cognitive functions related to problem solving mediates performance differences independent of age.

  7. It’s More Than Stamp Collecting: How Genome Sequencing Can Unify Biological Research

    PubMed Central

    Richards, Stephen

    2015-01-01

    The availability of reference genome sequences, especially the human reference, has revolutionized the study of biology. However, whilst the genomes of some species have been fully sequenced, a wide range of biological problems still cannot be effectively studied for lack of genome sequence information. Here, I identify neglected areas of biology and describe how both targeted species sequencing and more broad taxonomic surveys of the tree of life can address important biological questions. I enumerate the significant benefits that would accrue from sequencing a broader range of taxa, as well as discuss the technical advances in sequencing and assembly methods that would allow for wide-ranging application of whole-genome analysis. Finally, I suggest that in addition to “Big Science” survey initiatives to sequence the tree of life, a modified infrastructure-funding paradigm would better support reference genome sequence generation for research communities most in need. PMID:26003218

  8. It's more than stamp collecting: how genome sequencing can unify biological research.

    PubMed

    Richards, Stephen

    2015-07-01

    The availability of reference genome sequences, especially the human reference, has revolutionized the study of biology. However, while the genomes of some species have been fully sequenced, a wide range of biological problems still cannot be effectively studied for lack of genome sequence information. Here, I identify neglected areas of biology and describe how both targeted species sequencing and more broad taxonomic surveys of the tree of life can address important biological questions. I enumerate the significant benefits that would accrue from sequencing a broader range of taxa, as well as discuss the technical advances in sequencing and assembly methods that would allow for wide-ranging application of whole-genome analysis. Finally, I suggest that in addition to 'big science' survey initiatives to sequence the tree of life, a modified infrastructure-funding paradigm would better support reference genome sequence generation for research communities most in need. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. A novel endogenous betaretrovirus group characterized from polar bears (Ursus maritimus) and giant pandas (Ailuropoda melanoleuca).

    PubMed

    Mayer, Jens; Tsangaras, Kyriakos; Heeger, Felix; Avila-Arcos, María; Stenglein, Mark D; Chen, Wei; Sun, Wei; Mazzoni, Camila J; Osterrieder, Nikolaus; Greenwood, Alex D

    2013-08-15

    Transcriptome analysis of polar bears (Ursus maritimus) yielded sequences with highest similarity to the human endogenous retrovirus group HERV-K(HML-2). Further analysis of the polar bear draft genome identified an endogenous betaretrovirus group comprising 26 proviral copies and 231 solo LTRs. Molecular dating indicates the group originated before the divergence of bears from a common ancestor but is not present in all carnivores. Closely related sequences were identified in the giant panda (Ailuropoda melanoleuca) and characterized from its genome. We have designated the polar bear and giant panda sequences U. maritimus endogenous retrovirus (UmaERV) and A. melanoleuca endogenous retrovirus (AmeERV), respectively. Phylogenetic analysis demonstrated that the bear virus group is nested within the HERV-K supergroup among bovine and bat endogenous retroviruses suggesting a complex evolutionary history within the HERV-K group. All individual remnants of proviral sequences contain numerous frameshifts and stop codons and thus, the virus is likely non-infectious. Copyright © 2013 Elsevier Inc. All rights reserved.

  10. Complete genome sequence of Enterobacter sp. IIT-BT 08: A potential microbial strain for high rate hydrogen production.

    PubMed

    Khanna, Namita; Ghosh, Ananta Kumar; Huntemann, Marcel; Deshpande, Shweta; Han, James; Chen, Amy; Kyrpides, Nikos; Mavrommatis, Kostas; Szeto, Ernest; Markowitz, Victor; Ivanova, Natalia; Pagani, Ioanna; Pati, Amrita; Pitluck, Sam; Nolan, Matt; Woyke, Tanja; Teshima, Hazuki; Chertkov, Olga; Daligault, Hajnalka; Davenport, Karen; Gu, Wei; Munk, Christine; Zhang, Xiaojing; Bruce, David; Detter, Chris; Xu, Yan; Quintana, Beverly; Reitenga, Krista; Kunde, Yulia; Green, Lance; Erkkila, Tracy; Han, Cliff; Brambilla, Evelyne-Marie; Lang, Elke; Klenk, Hans-Peter; Goodwin, Lynne; Chain, Patrick; Das, Debabrata

    2013-12-20

    Enterobacter sp. IIT-BT 08 belongs to Phylum: Proteobacteria, Class: Gammaproteobacteria, Order: Enterobacteriales, Family: Enterobacteriaceae. The organism was isolated from the leaves of a local plant near the Kharagpur railway station, Kharagpur, West Bengal, India. It has been extensively studied for fermentative hydrogen production because of its high hydrogen yield. For further enhancement of hydrogen production by strain development, complete genome sequence analysis was carried out. Sequence analysis revealed that the genome was linear, 4.67 Mbp long and had a GC content of 56.01%. The genome properties encode 4,393 protein-coding and 179 RNA genes. Additionally, a putative pathway of hydrogen production was suggested based on the presence of formate hydrogen lyase complex and other related genes identified in the genome. Thus, in the present study we describe the specific properties of the organism and the generation, annotation and analysis of its genome sequence as well as discuss the putative pathway of hydrogen production by this organism.

  11. A novel endogenous betaretrovirus group characterized from polar bears (Ursus maritimus) and giant pandas (Ailuropoda melanoleuca)

    PubMed Central

    Mayer, Jens; Tsangaras, Kyriakos; Heeger, Felix; Ávila-Arcos, Maria; Stenglein, Mark D.; Chen, Wei; Sun, Wei; Mazzoni, Camila; Osterrieder, Nikolaus; Greenwood, Alex D.

    2013-01-01

    Transcriptome analysis of polar bears (Ursus maritimus) yielded sequences with highest similarity to the human endogenous retrovirus group HERV-K(HML-2). Further analysis of the polar bear draft genome identified an endogenous betaretrovirus group comprising 26 proviral copies and 231 solo LTRs. Molecular dating indicates the group originated before the divergence of bears from a common ancestor but is not present in all carnivores. Closely related sequences were identified in the giant panda (Ailuropoda melanoleuca) and characterized from its genome. We have designated the polar bear and giant panda sequences Ursus maritimus endogenous retrovirus (UmaERV) and Ailuropoda melanoleuca endogenous retrovirus (AmeERV), respectively. Phylogenetic analysis demonstrated that the bear virus group is nested within the HERV-K supergroup among bovine and bat endogenous retroviruses suggesting a complex evolutionary history within the HERV-K group. All individual remnants of proviral sequences contain numerous frameshifts and stop codons and thus, the virus is likely non-infectious. PMID:23725819

  12. Strain-specific and pooled genome sequences for populations of Drosophila melanogaster from three continents.

    PubMed Central

    Bergman, Casey M.; Haddrill, Penelope R.

    2015-01-01

    To contribute to our general understanding of the evolutionary forces that shape variation in genome sequences in nature, we have sequenced genomes from 50 isofemale lines and six pooled samples from populations of Drosophila melanogaster on three continents. Analysis of raw and reference-mapped reads indicates the quality of these genomic sequence data is very high. Comparison of the predicted and experimentally-determined Wolbachia infection status of these samples suggests that strain or sample swaps are unlikely to have occurred in the generation of these data. Genome sequences are freely available in the European Nucleotide Archive under accession ERP009059. Isofemale lines can be obtained from the Drosophila Species Stock Center. PMID:25717372

  13. Strain-specific and pooled genome sequences for populations of Drosophila melanogaster from three continents.

    PubMed

    Bergman, Casey M; Haddrill, Penelope R

    2015-01-01

    To contribute to our general understanding of the evolutionary forces that shape variation in genome sequences in nature, we have sequenced genomes from 50 isofemale lines and six pooled samples from populations of Drosophila melanogaster on three continents. Analysis of raw and reference-mapped reads indicates the quality of these genomic sequence data is very high. Comparison of the predicted and experimentally-determined Wolbachia infection status of these samples suggests that strain or sample swaps are unlikely to have occurred in the generation of these data. Genome sequences are freely available in the European Nucleotide Archive under accession ERP009059. Isofemale lines can be obtained from the Drosophila Species Stock Center.

  14. PstI repeat: a family of short interspersed nucleotide element (SINE)-like sequences in the genomes of cattle, goat, and buffalo.

    PubMed

    Sheikh, Faruk G; Mukhopadhyay, Sudit S; Gupta, Prabhakar

    2002-02-01

    The PstI family of elements are short, highly repetitive DNA sequences interspersed throughout the genome of the Bovidae. We have cloned and sequenced some members of the PstI family from cattle, goat, and buffalo. These elements are approximately 500 bp, have a copy number of 2 × 10^5 to 4 × 10^5, and comprise about 4% of the haploid genome. Studies of nucleotide sequence homology indicate that the buffalo and goat PstI repeats (type II) are similar types of short interspersed nucleotide element (SINE) sequences, but the cattle PstI repeat (type I) is considerably more divergent. Additionally, the goat PstI sequence showed significant sequence homology with bovine serine tRNA, and is therefore likely derived from serine tRNA. Interestingly, Southern hybridization suggests that both types of SINEs (I and II) are present in all the species of Bovidae. Dendrogram analysis indicates that cattle PstI SINE is similar to bovine Alu-like SINEs. Goat and buffalo SINEs formed a separate cluster, suggesting that these two types of SINEs evolved separately in the genome of the Bovidae.

  15. Phylogenetic Analysis of Aedes aegypti Based on Mitochondrial ND4 Gene Sequences in Almadinah, Saudi Arabia.

    PubMed

    Ali, Khalil H Al; El-Badry, Ayman A; Ali, Mouhanad Al; El-Sayed, Wael S M; El-Beshbishy, Hesham A

    2016-06-01

    Aedes aegypti is the main vector of the yellow fever and dengue viruses. This mosquito has become the major indirect cause of human morbidity and mortality worldwide. Dengue virus activity has been reported recently in the western areas of Saudi Arabia. To date, there is no vaccine for dengue virus, and control of the disease depends on control of the vector. The present study aimed to perform phylogenetic analysis of Aedes aegypti based on the mitochondrial NADH dehydrogenase subunit 4 (ND4) gene in Almadinah, Saudi Arabia, in order to gain further insight into the epidemiology and transmission of this vector. The mitochondrial ND4 gene was sequenced in eight Aedes aegypti mosquitoes isolated from Almadinah, Saudi Arabia; the sequences were aligned, and phylogenetic analyses were performed and compared with 54 Aedes sequences reported in previous studies from Mexico, Thailand, Brazil, and Africa. Our results suggest that increased gene flow among Aedes aegypti populations occurs between Africa and Saudi Arabia. Phylogenetic relationship analysis showed two genetically distinct Aedes aegypti lineages in Saudi Arabia, derived from a dual African ancestry.

  16. Consistency of VDJ Rearrangement and Substitution Parameters Enables Accurate B Cell Receptor Sequence Annotation.

    PubMed

    Ralph, Duncan K; Matsen, Frederick A

    2016-01-01

    VDJ rearrangement and somatic hypermutation work together to produce antibody-coding B cell receptor (BCR) sequences for a remarkable diversity of antigens. It is now possible to sequence these BCRs in high throughput; analysis of these sequences is bringing new insight into how antibodies develop, in particular for broadly-neutralizing antibodies against HIV and influenza. A fundamental step in such sequence analysis is to annotate each base as coming from a specific one of the V, D, or J genes, or from an N-addition (a.k.a. non-templated insertion). Previous work has used simple parametric distributions to model transitions from state to state in a hidden Markov model (HMM) of VDJ recombination, and assumed that mutations occur via the same process across sites. However, codon frame and other effects have been observed to violate these parametric assumptions for such coding sequences, suggesting that a non-parametric approach to modeling the recombination process could be useful. In our paper, we find that indeed large modern data sets suggest a model using parameter-rich per-allele categorical distributions for HMM transition probabilities and per-allele-per-position mutation probabilities, and that using such a model for inference leads to significantly improved results. We present an accurate and efficient BCR sequence annotation software package using a novel HMM "factorization" strategy. This package, called partis (https://github.com/psathyrella/partis/), is built on a new general-purpose HMM compiler that can perform efficient inference given a simple text description of an HMM.
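
    The annotation step described in this abstract, labelling each base as coming from a V, D, or J gene or from an N-addition, can be illustrated with a toy hidden Markov model decoded by the Viterbi algorithm. The sketch below is not the partis implementation and does not use its factorization strategy; the three states, the transition table, and the emission table are invented values chosen only to show how categorical (table-based) probabilities drive a per-base labelling.

```python
import math

def lg(p):
    """Log of a probability, with log(0) mapped to -inf."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, states, start, trans, emit):
    """Return the most probable state path for the observed bases."""
    # best[i][s]: best log-probability of any path ending in state s at base i
    best = [{s: lg(start[s]) + lg(emit[s][obs[0]]) for s in states}]
    back = []  # back[i][s]: predecessor of s on the best path at base i + 1
    for base in obs[1:]:
        prev, col, ptr = best[-1], {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + lg(trans[r][s]))
            col[s] = prev[p] + lg(trans[p][s]) + lg(emit[s][base])
            ptr[s] = p
        best.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Hypothetical toy model: left-to-right V -> N -> J, one table row per state.
STATES = ["V", "N", "J"]
START = {"V": 1.0, "N": 0.0, "J": 0.0}
TRANS = {
    "V": {"V": 0.8, "N": 0.2, "J": 0.0},
    "N": {"V": 0.0, "N": 0.5, "J": 0.5},
    "J": {"V": 0.0, "N": 0.0, "J": 1.0},
}
EMIT = {
    "V": {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    "N": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "J": {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
}

labels = viterbi("AAGTT", STATES, START, TRANS, EMIT)
print(labels)  # ['V', 'V', 'N', 'J', 'J']
```

    In a realistic annotator each germline allele would contribute its own states and per-position probabilities, which is where the parameter-rich categorical distributions mentioned in the abstract would come in.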

  17. Molecular characterization of Giardia psittaci by multilocus sequence analysis.

    PubMed

    Abe, Niichiro; Makino, Ikuko; Kojima, Atsushi

    2012-12-01

    Multilocus sequence analyses targeting small subunit ribosomal DNA (SSU rDNA), elongation factor 1 alpha (ef1α), glutamate dehydrogenase (gdh), and beta giardin (β-giardin) were performed on Giardia psittaci isolates from three Budgerigars (Melopsittacus undulatus) and four Barred parakeets (Bolborhynchus lineola) kept in individual households or imported from overseas. Nucleotide differences and phylogenetic analyses at four loci indicate the distinction of G. psittaci from the other known Giardia species: Giardia muris, Giardia microti, Giardia ardeae, and Giardia duodenalis assemblages. Furthermore, G. psittaci was related more closely to G. duodenalis than to the other known Giardia species, except for G. microti. Conflicting signals regarded as "double peaks" were found at the same nucleotide positions of the ef1α in all isolates. However, the sequences of the other three loci, including gdh and β-giardin, which are known to be highly variable, from all isolates were also mutually identical at every locus. They showed no double peaks. These results suggest that double peaks found in the ef1α sequences are caused not by mixed infection with genetically different G. psittaci isolates but by allelic sequence heterogeneity (ASH), which is observed in diplomonad lineages including G. duodenalis. No sequence difference was found in any G. psittaci isolates at the gdh and β-giardin, suggesting that G. psittaci is indeed not more diverse genetically than other Giardia species. This report is the first to provide evidence related to the genetic characteristics of G. psittaci obtained using multilocus sequence analysis. Copyright © 2012 Elsevier B.V. All rights reserved.

  18. Potential ligand-binding residues in rat olfactory receptors identified by correlated mutation analysis

    NASA Technical Reports Server (NTRS)

    Singer, M. S.; Oliveira, L.; Vriend, G.; Shepherd, G. M.

    1995-01-01

    A family of G-protein-coupled receptors is believed to mediate the recognition of odor molecules. In order to identify potential ligand-binding residues, we have applied correlated mutation analysis to receptor sequences from the rat. This method identifies pairs of sequence positions where residues remain conserved or mutate in tandem, thereby suggesting structural or functional importance. The analysis supported molecular modeling studies in suggesting several residues in positions that were consistent with ligand-binding function. Two of these positions, dominated by histidine residues, may play important roles in ligand binding and could confer broad specificity to mammalian odor receptors. The presence of positive (overdominant) selection at some of the identified positions provides additional evidence for roles in ligand binding. Higher-order groups of correlated residues were also observed. Each group may interact with an individual ligand determinant, and combinations of these groups may provide a multi-dimensional mechanism for receptor diversity.

  19. Effects of informed consent for individual genome sequencing on relevant knowledge.

    PubMed

    Kaphingst, K A; Facio, F M; Cheng, M-R; Brooks, S; Eidem, H; Linn, A; Biesecker, B B; Biesecker, L G

    2012-11-01

    Increasing availability of individual genomic information suggests that patients will need knowledge about genome sequencing to make informed decisions, but prior research is limited. In this study, we examined genome sequencing knowledge before and after informed consent among 311 participants enrolled in the ClinSeq™ sequencing study. An exploratory factor analysis of knowledge items yielded two factors (sequencing limitations knowledge; sequencing benefits knowledge). In multivariable analysis, high pre-consent sequencing limitations knowledge scores were significantly related to education [odds ratio (OR): 8.7, 95% confidence interval (CI): 2.45-31.10 for post-graduate education, and OR: 3.9; 95% CI: 1.05, 14.61 for college degree compared with less than college degree] and race/ethnicity (OR: 2.4, 95% CI: 1.09, 5.38 for non-Hispanic Whites compared with other racial/ethnic groups). Mean values increased significantly between pre- and post-consent for the sequencing limitations knowledge subscale (6.9-7.7, p < 0.0001) and sequencing benefits knowledge subscale (7.0-7.5, p < 0.0001); increase in knowledge did not differ by sociodemographic characteristics. This study highlights gaps in genome sequencing knowledge and underscores the need to target educational efforts toward participants with less education or from minority racial/ethnic groups. The informed consent process improved genome sequencing knowledge. Future studies could examine how genome sequencing knowledge influences informed decision making. © 2012 John Wiley & Sons A/S.

  20. Molecular Analysis of Dehalococcoides 16S Ribosomal DNA from Chloroethene-Contaminated Sites throughout North America and Europe

    PubMed Central

    Hendrickson, Edwin R.; Payne, Jo Ann; Young, Roslyn M.; Starr, Mark G.; Perry, Michael P.; Fahnestock, Stephen; Ellis, David E.; Ebersole, Richard C.

    2002-01-01

    The environmental distribution of Dehalococcoides group organisms and their association with chloroethene-contaminated sites were examined. Samples from 24 chloroethene-dechlorinating sites scattered throughout North America and Europe were tested for the presence of members of the Dehalococcoides group by using a PCR assay developed to detect Dehalococcoides 16S rRNA gene (rDNA) sequences. Sequences identified by sequence analysis as sequences of members of the Dehalococcoides group were detected at 21 sites. Full dechlorination of chloroethenes to ethene occurred at these sites. Dehalococcoides sequences were not detected in samples from three sites at which partial dechlorination of chloroethenes occurred, where dechlorination appeared to stop at 1,2-cis-dichloroethene. Phylogenetic analysis of the 16S rDNA amplicons confirmed that Dehalococcoides sequences formed a unique 16S rDNA group. These 16S rDNA sequences were divided into three subgroups based on specific base substitution patterns in variable regions 2 and 6 of the Dehalococcoides 16S rDNA sequence. Analyses also demonstrated that specific base substitution patterns were signature patterns. The specific base substitutions distinguished the three sequence subgroups phylogenetically. These results demonstrated that members of the Dehalococcoides group are widely distributed in nature and can be found in a variety of geological formations and in different climatic zones. Furthermore, the association of these organisms with full dechlorination of chloroethenes suggests that they are promising candidates for engineered bioremediation and may be important contributors to natural attenuation of chloroethenes. PMID:11823182

  1. Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution.

    PubMed

    Denas, Olgert; Sandstrom, Richard; Cheng, Yong; Beal, Kathryn; Herrero, Javier; Hardison, Ross C; Taylor, James

    2015-02-14

    Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, relationships among sequence, conservation, and function are still poorly understood. We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence on the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity on the other species for human and mouse respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of repurposed TFos, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest exaptation of some functional regulatory sequences into new function. Despite TFos repurposing, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events allowing little or no change in the TFos - target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions. We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse using WGA. 
    A comparative analysis of this correspondence unveiled the extent of the shared regulatory sequence across TFs and cell types under study. Importantly, a large part of the shared regulatory sequence is repurposed on the other species. This sequence, fueled by turnover events, provides a strong case for exaptation in regulatory elements.

  2. Characterization of a species-specific repetitive DNA from a highly endangered wild animal, Rhinoceros unicornis, and assessment of genetic polymorphism by microsatellite associated sequence amplification (MASA).

    PubMed

    Ali, S; Azfer, M A; Bashamboo, A; Mathur, P K; Malik, P K; Mathur, V B; Raha, A K; Ansari, S

    1999-03-04

    We have cloned and sequenced a 906bp EcoRI repeat DNA fraction from Rhinoceros unicornis genome. The contig pSS(R)2 is AT rich with 340 A (37.53%), 187 C (20.64%), 173 G (19.09%) and 206 T (22.74%). The sequence contains MALT box, NF-E1, Poly-A signal, lariat consensus sequences, TATA box, translational initiation sequences and several stop codons. Translation of the contig showed seven different types of protein motifs, among which, EGF-like domain cysteine pattern signatures and Bowman-Birk serine protease inhibitor family signatures were prominent. The presence of eukaryotic transcriptional elements, protein signatures and analysis of subset sequences in the 5' region from 1 to 165nt indicating coding potential (test code value=0.97) suggest possible regulatory and/or functional role(s) of these sequences in the rhino genome. Translation of the complementary strand from 906 to 706nt and 190 to 2nt showed proteins of more than 7kDa rich in non-polar residues. This suggests that pSS(R)2 is either a part of, or adjacent to, a functional gene. The contig contains mostly non-consecutive simple repeat units from 2 to 17nt with varying frequencies, of which four base motifs were found to be predominant. Zoo-blot hybridization revealed that pSS(R)2 sequences are unique to R. unicornis genome because they do not cross-hybridize, even with the genomic DNA of South African black rhino Diceros bicornis. Southern blot analysis of R. unicornis genomic DNA with pSS(R)2 and other synthetic oligo probes revealed a high level of genetic homogeneity, which was also substantiated by microsatellite associated sequence amplification (MASA). Owing to its uniqueness, the pSS(R)2 probe has a potential application in the area of conservation biology for unequivocal identification of horn or other body tissues of R. unicornis. The evolutionary aspect of this repeat fraction in the context of comparative genome analysis is discussed.
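
    The base composition reported in this abstract can be checked with a few lines of arithmetic. The sketch below takes only the four base counts quoted above; the function name is illustrative, and the AT content it derives is a computed value that is not stated in the abstract itself.

```python
def base_composition(counts):
    """Return (total length, percentage per base rounded to 2 decimals)."""
    total = sum(counts.values())
    percent = {base: round(100 * n / total, 2) for base, n in counts.items()}
    return total, percent

# Base counts quoted in the abstract for the pSS(R)2 contig.
counts = {"A": 340, "C": 187, "G": 173, "T": 206}
total, percent = base_composition(counts)
at_content = round(100 * (counts["A"] + counts["T"]) / total, 2)

print(total)       # 906
print(percent)     # {'A': 37.53, 'C': 20.64, 'G': 19.09, 'T': 22.74}
print(at_content)  # 60.26
```

    The counts sum to the stated 906 bp and reproduce the quoted percentages, and the combined A + T share of about 60% is consistent with the description of the contig as AT rich.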

  3. Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India.

    PubMed

    Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

    2017-03-01

    Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability.

  4. Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India

    PubMed Central

    Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

    2017-01-01

    Aim: Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. Materials and Methods: The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. Results: The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Conclusion: Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability. PMID:28435199

  5. Isolation and characterization of a novel Rhabdovirus from a wild boar (Sus scrofa) in Japan.

    PubMed

    Sakai, Kouji; Hagiwara, Katsuro; Omatsu, Tsutomu; Hamasaki, Chinami; Kuwata, Ryusei; Shimoda, Hiroshi; Suzuki, Kazuo; Endoh, Daiji; Nagata, Noriyo; Nagai, Makoto; Katayama, Yukie; Oba, Mami; Kurane, Ichiro; Saijo, Masayuki; Morikawa, Shigeru; Mizutani, Tetsuya; Maeda, Ken

    2015-09-30

    A novel rhabdovirus was isolated from the serum of a healthy Japanese wild boar (Sus scrofa leucomystax) and identified using the rapid determination system for viral nucleic acid sequences (RDV), next-generation sequencing, and electron microscopy. The virus was tentatively named wild boar rhabdovirus 1 (WBRV1). Phylogenetic analysis of the entire genome sequence indicated that WBRV1 is closely related to Tupaia rhabdovirus (TRV), which was isolated from cultured cells of hepatocellular carcinoma tissue of tree shrew. TRV has not been assigned to any genus of Rhabdoviridae till date. Analysis of the L gene indicated that WBRV1 belongs to the genus Vesiculovirus. These observations suggest that both TRV and WBRV1 belong to a new genus of Rhabdoviridae. Next-generation genome sequencing of WBRV1 revealed 5 open reading frames of 1329, 765, 627, 1629, and 6336 bases in length. The WBRV1 gene sequences are similar to those of other rhabdoviruses. Epizootiological analysis of a population of wild boars in Wakayama prefecture in Japan indicated that 6.5% were positive for the WBRV1 gene and 52% were positive for WBRV1-neutralizing antibodies. Furthermore, such viral neutralizing antibodies were found in domestic pigs in another prefecture. WBRV1 was inoculated intranasally and intraperitoneally into SCID and BALB/c mice and viral RNA was detected in SCID mice, suggesting that WBRV1 can replicate in immunocompromised mice. These results indicate this novel virus is endemic in wild animals and livestock in Japan. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. Characterization of the gene encoding component C3 of the complement system from the spider Loxosceles laeta venom glands: Phylogenetic implications.

    PubMed

    Myamoto, D T; Pidde-Queiroz, G; Pedroso, A; Gonçalves-de-Andrade, R M; van den Berg, C W; Tambourgi, D V

    2016-09-01

    A transcriptome analysis of the venom glands of the spider Loxosceles laeta, performed by our group in a previous study (Fernandes-Pedrosa et al., 2008), revealed a transcript with a sequence similar to the human complement component C3. Here we present the analysis of this transcript. cDNA fragments encoding the C3 homologue (Lox-C3) were amplified from total RNA isolated from the venom glands of L. laeta by RACE-PCR. Lox-C3 is a 5178 bp cDNA sequence encoding a 190kDa protein, with a domain configuration similar to human C3. Multiple alignments of C3-like proteins revealed two processing sites, suggesting that Lox-C3 is composed of three chains. Furthermore, the amino acid consensus sequence for the thioester was found, in addition to putative sequences responsible for FB binding. The phylogenetic analysis showed that Lox-C3 belongs to the same group as two C3 isoforms from the spider Hasarius adansoni (Family Salticidae), showing 53% homology with these. This is the first characterization of a Loxosceles cDNA sequence encoding a human C3 homologue, and this finding, together with our previous finding of the expression of a FB-like molecule, suggests that this spider species also has a complement system. This work will help to improve our understanding of the innate immune system in these spiders and the ancestral structure of C3. Copyright © 2016 Elsevier GmbH. All rights reserved.

  7. Cloning and Characterization of the Pyrrolomycin Biosynthetic Gene Clusters from Actinosporangium vitaminophilum ATCC 31673 and Streptomyces sp. Strain UC 11065

    PubMed Central

    Zhang, Xiujun; Parry, Ronald J.

    2007-01-01

    The pyrrolomycins are a family of polyketide antibiotics, some of which contain a nitro group. To gain insight into the nitration mechanism associated with the formation of these antibiotics, the pyrrolomycin biosynthetic gene cluster from Actinosporangium vitaminophilum was cloned. Sequencing of ca. 56 kb of A. vitaminophilum DNA revealed 35 open reading frames (ORFs). Sequence analysis revealed a clear relationship between some of these ORFs and the biosynthetic gene cluster for pyoluteorin, a structurally related antibiotic. Since a gene transfer system could not be devised for A. vitaminophilum, additional proof for the identity of the cloned gene cluster was sought by cloning the pyrrolomycin gene cluster from Streptomyces sp. strain UC 11065, a transformable pyrrolomycin producer. Sequencing of ca. 26 kb of UC 11065 DNA revealed the presence of 17 ORFs, 15 of which exhibit strong similarity to ORFs in the A. vitaminophilum cluster as well as a nearly identical organization. Single-crossover disruption of two genes in the UC 11065 cluster abolished pyrrolomycin production in both cases. These results confirm that the genetic locus cloned from UC 11065 is essential for pyrrolomycin production, and they also confirm that the highly similar locus in A. vitaminophilum encodes pyrrolomycin biosynthetic genes. Sequence analysis revealed that both clusters contain genes encoding the two components of an assimilatory nitrate reductase. This finding suggests that nitrite is required for the formation of the nitrated pyrrolomycins. However, sequence analysis did not provide additional insights into the nitration process, suggesting the operation of a novel nitration mechanism. PMID:17158935

  8. Sensitivity of BRCA1/2 testing in high-risk breast/ovarian/male breast cancer families: little contribution of comprehensive RNA/NGS panel testing.

    PubMed

    Byers, Helen; Wallis, Yvonne; van Veen, Elke M; Lalloo, Fiona; Reay, Kim; Smith, Philip; Wallace, Andrew J; Bowers, Naomi; Newman, William G; Evans, D Gareth

    2016-11-01

    The sensitivity of testing BRCA1 and BRCA2 remains unresolved as the frequency of deep intronic splicing variants has not been defined in high-risk familial breast/ovarian cancer families. This variant category is reported at significant frequency in other tumour predisposition genes, including NF1 and MSH2. We carried out comprehensive whole gene RNA analysis on 45 high-risk breast/ovary and male breast cancer families with no identified pathogenic variant on exonic sequencing and copy number analysis of BRCA1/2. In addition, we undertook variant screening of a 10-gene high/moderate risk breast/ovarian cancer panel by next-generation sequencing. DNA testing identified the causative variant in 50/56 (89%) breast/ovarian/male breast cancer families with Manchester scores of ≥50 with two variants being confirmed to affect splicing on RNA analysis. RNA sequencing of BRCA1/BRCA2 on 45 individuals from high-risk families identified no deep intronic variants and did not suggest loss of RNA expression as a cause of lost sensitivity. Panel testing in 42 samples identified a known RAD51D variant, a high-risk ATM variant in another breast ovary family and a truncating CHEK2 mutation. Current exonic sequencing and copy number analysis variant detection methods of BRCA1/2 have high sensitivity in high-risk breast/ovarian cancer families. Sequence analysis of RNA does not identify any variants undetected by current analysis of BRCA1/2. However, RNA analysis clarified the pathogenicity of variants of unknown significance detected by current methods. The low diagnostic uplift achieved through sequence analysis of the other known breast/ovarian cancer susceptibility genes indicates that further high-risk genes remain to be identified.

  9. Analysis of Epstein-Barr Virus Genomes and Expression Profiles in Gastric Adenocarcinoma.

    PubMed

    Borozan, Ivan; Zapatka, Marc; Frappier, Lori; Ferretti, Vincent

    2018-01-15

    Epstein-Barr virus (EBV) is a causative agent of a variety of lymphomas, nasopharyngeal carcinoma (NPC), and ∼9% of gastric carcinomas (GCs). An important question is whether particular EBV variants are more oncogenic than others, but conclusions are currently hampered by the lack of sequenced EBV genomes. Here, we contribute to this question by mining whole-genome sequences of 201 GCs to identify 13 EBV-positive GCs and by assembling 13 new EBV genome sequences, almost doubling the number of available GC-derived EBV genome sequences and providing the first non-Asian EBV genome sequences from GC. Whole-genome sequence comparisons of all EBV isolates sequenced to date (85 from tumors and 57 from healthy individuals) showed that most GC and NPC EBV isolates were closely related although American Caucasian GC samples were more distant, suggesting a geographical component. However, EBV GC isolates were found to contain some consistent changes in protein sequences regardless of geographical origin. In addition, transcriptome data available for eight of the EBV-positive GCs were analyzed to determine which EBV genes are expressed in GC. In addition to the expected latency proteins (EBNA1, LMP1, and LMP2A), specific subsets of lytic genes were consistently expressed that did not reflect a typical lytic or abortive lytic infection, suggesting a novel mechanism of EBV gene regulation in the context of GC. These results are consistent with a model in which a combination of specific latent and lytic EBV proteins promotes tumorigenesis. Importance: Epstein-Barr virus (EBV) is a widespread virus that causes cancer, including gastric carcinoma (GC), in a small subset of individuals. An important question is whether particular EBV variants are more cancer associated than others, but more EBV sequences are required to address this question. Here, we have generated 13 new EBV genome sequences from GC, almost doubling the number of EBV sequences from GC isolates and providing the first EBV sequences from non-Asian GC. We further identify sequence changes in some EBV proteins common to GC isolates. In addition, gene expression analysis of eight of the EBV-positive GCs showed consistent expression of both the expected latency proteins and a subset of lytic proteins that was not consistent with typical lytic or abortive lytic expression. These results suggest that novel mechanisms activate expression of some EBV lytic proteins and that their expression may contribute to oncogenesis. Copyright © 2018 American Society for Microbiology.

  10. Rather than by direct acquisition via lateral gene transfer, GHF5 cellulases were passed on from early Pratylenchidae to root-knot and cyst nematodes.

    PubMed

    Rybarczyk-Mydłowska, Katarzyna; Maboreke, Hazel Ruvimbo; van Megen, Hanny; van den Elsen, Sven; Mooyman, Paul; Smant, Geert; Bakker, Jaap; Helder, Johannes

    2012-11-21

    Plant parasitic nematodes are unusual Metazoans as they are equipped with genes that allow for symbiont-independent degradation of plant cell walls. Among the cell wall-degrading enzymes, glycoside hydrolase family 5 (GHF5) cellulases are relatively well characterized, especially for high impact parasites such as root-knot and cyst nematodes. Interestingly, ancestors of extant nematodes most likely acquired these GHF5 cellulases from a prokaryote donor by one or multiple lateral gene transfer events. To obtain insight into the origin of GHF5 cellulases among evolutionarily advanced members of the order Tylenchida, cellulase biodiversity data from less distal family members were collected and analyzed. Single nematodes were used to obtain (partial) genomic sequences of cellulases from representatives of the genera Meloidogyne, Pratylenchus, Hirschmanniella and Globodera. Combined Bayesian analysis of ≈ 100 cellulase sequences revealed three types of catalytic domains (A, B, and C). Represented by 84 sequences, type B is numerically dominant, and the overall topology of the catalytic domain type shows remarkable resemblance with trees based on neutral (= pathogenicity-unrelated) small subunit ribosomal DNA sequences. Bayesian analysis further suggested a sister relationship between the lesion nematode Pratylenchus thornei and all type B cellulases from root-knot nematodes. Yet, the relationship between the three catalytic domain types remained unclear. Superposition of intron data onto the cellulase tree suggests that types B and C are related, and together distinct from type A that is characterized by two unique introns. All Tylenchida members investigated here harbored one or multiple GHF5 cellulases. Three types of catalytic domains are distinguished, and the presence of at least two types is relatively common among plant parasitic Tylenchida. Analysis of coding sequences of cellulases suggests that root-knot and cyst nematodes did not acquire this gene directly by lateral gene transfer. More likely, these genes were passed on by ancestors of a family nowadays known as the Pratylenchidae.

  11. Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

    PubMed

    Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

    2005-09-01

    We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
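
The general idea of scoring oligo sequences for over-representation in a set of coregulated promoters can be illustrated with a small sketch. This is not the MotifFinder algorithm, whose details the abstract does not give; the presence-based ratio, the pseudocount, the function names, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        # A set ensures each k-mer is counted once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enrichment(foreground, background, k=6, pseudocount=1.0):
    """Rank k-mers by the ratio of foreground to background presence frequency."""
    fg = kmer_presence(foreground, k)
    bg = kmer_presence(background, k)
    scores = {}
    for kmer, n_fg in fg.items():
        # The pseudocount avoids division by zero for k-mers absent from the background.
        f_fg = (n_fg + pseudocount) / (len(foreground) + pseudocount)
        f_bg = (bg.get(kmer, 0) + pseudocount) / (len(background) + pseudocount)
        scores[kmer] = f_fg / f_bg
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy promoters, invented for illustration only.
coregulated = ["TTACGTGTCA", "GGACGTGTAA", "CCACGTGTTT"]
other = ["TTTTGGGCCA", "GATTACAGAT", "CCCGGGAAAT"]

ranked = enrichment(coregulated, other)
top_kmer, top_score = ranked[0]
print(top_kmer, top_score)
```

In this toy input the only hexamer shared by all three foreground sequences contains the ACGT core, so it ranks first. A real analysis would also need a significance test and a realistic background set, neither of which is attempted here.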

  12. Bartonella dromedarii sp. nov. isolated from domesticated camels (Camelus dromedarius) in Israel.

    PubMed

    Rasis, Michal; Rudoler, Nir; Schwartz, David; Giladi, Michael

    2014-11-01

    Bartonella spp. are fastidious, Gram-negative bacilli that cause a wide spectrum of diseases in humans. Most Bartonella spp. have adapted to a specific host, generally a domestic or wild mammal. Dromedary camels (Camelus dromedarius) have become a focus of growing public-health interest because they have been identified as a reservoir host for the Middle East respiratory syndrome coronavirus. Nevertheless, data on camel zoonoses are limited. We aimed to study the occurrence of Bartonella bacteremia among dromedaries in Israel. Nine of 51 (17.6%) camels were found to be bacteremic with Bartonella spp.; bacteremia levels ranged from five to >1000 colony-forming units/mL. Phylogenetic reconstruction based on the concatenated sequences of gltA and rpoB genes demonstrated that the dromedary Bartonella isolates are closely related to other ruminant-derived Bartonella spp., with B. bovis being the nearest relative. Using electron microscopy, the novel isolates were shown to be flagellated, whereas B. bovis is nonflagellated. Sequence comparison analysis of the housekeeping genes ftsZ, ribC, and groEL showed the highest homology to B. chomelii, B. capreoli, and B. birtlesii, respectively. Sequence analysis of gltA and rpoB revealed ∼96% identity to B. bovis, a previously suggested cutoff value for sequence-based differentiation of Bartonella spp., suggesting that this approach does not have sufficient discriminatory power for differentiating ruminant-related Bartonella spp. A comprehensive multilocus sequence typing (MLST) analysis based on nine genetic loci (gltA, rpoB, ftsZ, internal transcribed spacer (ITS), 16S rRNA, ribC, groEL, nuoG, and SsrA) identified seven sequence types of the new dromedary isolates. This is the first description of a Bartonella sp. from camelids. On the basis of a distinct reservoir and ecological niche, sequence analyses, and expression of flagella, we designate these isolates as a novel Bartonella sp. named Bartonella dromedarii sp. nov. Further studies are required to explore its zoonotic potential.

  13. Identification of a sequence element on the 3' side of AAUAAA which is necessary for simian virus 40 late mRNA 3'-end processing.

    PubMed Central

    Sadofsky, M; Connelly, S; Manley, J L; Alwine, J C

    1985-01-01

    Our previous studies of the 3'-end processing of simian virus 40 late mRNAs indicated the existence of an essential element (or elements) downstream of the AAUAAA signal. We report here the use of transient expression analysis to study a functional element which we located within the sequence AGGUUUUUU, beginning 59 nucleotides downstream of the recognized signal AAUAAA. Deletion of this element resulted in (i) at least a 75% drop in 3'-end processing at the normal site and (ii) appearance of readthrough transcripts with alternate 3' ends. Some flexibility in the downstream position of this element relative to the AAUAAA was noted by deletion analysis. Using computer sequence comparison, we located homologous regions within downstream sequences of other genes, suggesting a generalized sequence element. In addition, specific complementarity is noted between the downstream element and U4 RNA. The possibility that this complementarity could participate in 3'-end site selection is discussed. PMID:3016512

  14. Protein evolution analysis of S-hydroxynitrile lyase by complete sequence design utilizing the INTMSAlign software.

    PubMed

    Nakano, Shogo; Asano, Yasuhisa

    2015-02-03

    Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.

  15. Protein evolution analysis of S-hydroxynitrile lyase by complete sequence design utilizing the INTMSAlign software

    NASA Astrophysics Data System (ADS)

    Nakano, Shogo; Asano, Yasuhisa

    2015-02-01

    Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.

  16. Molecular analysis of the microbial diversity present in the colonic wall, colonic lumen, and cecal lumen of a pig.

    PubMed

    Pryde, S E; Richardson, A J; Stewart, C S; Flint, H J

    1999-12-01

    Random clones of 16S ribosomal DNA gene sequences were isolated after PCR amplification with eubacterial primers from total genomic DNA recovered from samples of the colonic lumen, colonic wall, and cecal lumen from a pig. Sequences were also obtained for cultures isolated anaerobically from the same colonic-wall sample. Phylogenetic analysis showed that many sequences were related to those of Lactobacillus or Streptococcus spp. or fell into clusters IX, XIVa, and XI of gram-positive bacteria. In addition, 59% of randomly cloned sequences showed less than 95% similarity to database entries or sequences from cultivated organisms. Cultivation bias is also suggested by the fact that the majority of isolates (54%) recovered from the colon wall by culturing were related to Lactobacillus and Streptococcus, whereas this group accounted for only one-third of the sequence variation for the same sample from random cloning. The remaining cultured isolates were mainly Selenomonas related. A higher proportion of Lactobacillus reuteri-related sequences than of Lactobacillus acidophilus- and Lactobacillus amylovorus-related sequences were present in the colonic-wall sample. Since the majority of bacterial ribosomal sequences recovered from the colon wall are less than 95% related to known organisms, the roles of many of the predominant wall-associated bacteria remain to be defined.

  17. Molecular Analysis of the Microbial Diversity Present in the Colonic Wall, Colonic Lumen, and Cecal Lumen of a Pig

    PubMed Central

    Pryde, Susan E.; Richardson, Anthony J.; Stewart, Colin S.; Flint, Harry J.

    1999-01-01

    Random clones of 16S ribosomal DNA gene sequences were isolated after PCR amplification with eubacterial primers from total genomic DNA recovered from samples of the colonic lumen, colonic wall, and cecal lumen from a pig. Sequences were also obtained for cultures isolated anaerobically from the same colonic-wall sample. Phylogenetic analysis showed that many sequences were related to those of Lactobacillus or Streptococcus spp. or fell into clusters IX, XIVa, and XI of gram-positive bacteria. In addition, 59% of randomly cloned sequences showed less than 95% similarity to database entries or sequences from cultivated organisms. Cultivation bias is also suggested by the fact that the majority of isolates (54%) recovered from the colon wall by culturing were related to Lactobacillus and Streptococcus, whereas this group accounted for only one-third of the sequence variation for the same sample from random cloning. The remaining cultured isolates were mainly Selenomonas related. A higher proportion of Lactobacillus reuteri-related sequences than of Lactobacillus acidophilus- and Lactobacillus amylovorus-related sequences were present in the colonic-wall sample. Since the majority of bacterial ribosomal sequences recovered from the colon wall are less than 95% related to known organisms, the roles of many of the predominant wall-associated bacteria remain to be defined. PMID:10583991

  18. Silence of the centromeres--not.

    PubMed

    Cooke, Howard J

    2004-07-01

    Centromeres are a conundrum; although many proteins associated with centromeres are conserved from yeast to humans, the underlying DNA sequence is not. A proposed solution to this problem is that an epigenetic, largely heterochromatic, state is imposed by these proteins. Recent analysis of a human neocentromere and the complete sequence of a rice centromere suggest that this epigenetic state can enable transcription of at least some genes within a centromere.

  19. Draft Genome Sequence of Exiguobacterium sp. Strain BMC-KP, an Environmental Isolate from Bryn Mawr, Pennsylvania.

    PubMed

    Hyson, Peter; Shapiro, Joshua A; Wien, Michelle W

    2015-10-08

    Exiguobacterium sp. strain BMC-KP was isolated as part of a student environmental sampling project at Bryn Mawr College, PA. Sequencing of bacterial DNA assembled a 3.32-Mb draft genome. Analysis suggests the presence of genes for tolerance to cold and toxic metals, broad carbohydrate metabolism, and genes derived from phage. Copyright © 2015 Hyson et al.

  20. Correlating low-similarity peptide sequences and allergenic epitopes.

    PubMed

    Kanduc, D

    2008-01-01

    Although many allergenic peptide epitopes have been experimentally identified and defined, the molecular basis and the precise mechanisms underlying peptide allergenicity are unknown. This issue was analyzed by exploring the relationship between peptide allergenicity and sequence similarity to the human proteome. A structured analysis of data reported in the literature showed that most IgE-binding epitopes are (or harbor) pentapeptide unit(s) with no or low similarity to the human proteome, suggesting that no or low sequence similarity to the host proteome might represent a minimum common denominator identifying allergenic peptides. The present literature analysis might be of relevance in devising and designing short amino acid modules to be used for blocking pathogenic IgE.

  1. Whole-genome CNV analysis: advances in computational approaches.

    PubMed

    Pirooznia, Mehdi; Goes, Fernando S; Zandi, Peter P

    2015-01-01

    Accumulating evidence indicates that DNA copy number variation (CNV) is likely to make a significant contribution to human diversity and also play an important role in disease susceptibility. Recent advances in genome sequencing technologies have enabled the characterization of a variety of genomic features, including CNVs. This has led to the development of several bioinformatics approaches to detect CNVs from next-generation sequencing data. Here, we review recent advances in CNV detection from whole genome sequencing. We discuss the informatics approaches and current computational tools that have been developed as well as their strengths and limitations. This review will assist researchers and analysts in choosing the most suitable tools for CNV analysis as well as provide suggestions for new directions in future development.

  2. No apparent correlation between honey bee forager gut microbiota and honey production.

    PubMed

    Horton, Melissa A; Oliver, Randy; Newton, Irene L

    2015-01-01

    One of the best indicators of colony health for the European honey bee (Apis mellifera) is its performance in the production of honey. Recent research into the microbial communities naturally populating the bee gut raises the question as to whether there is a correlation between microbial community structure and colony productivity. In this work, we used 16S rRNA amplicon sequencing to explore the microbial composition associated with forager bees from honey bee colonies producing large amounts of surplus honey (productive) and compared them to colonies producing less (unproductive). As supported by previous work, the honey bee microbiome was found to be dominated by three major phyla: the Proteobacteria, Bacilli and Actinobacteria, within which we found a total of 23 different bacterial genera, including known "core" honey bee microbiome members. Using discriminant function analysis and correlation-based network analysis, we identified highly abundant members (such as Frischella and Gilliamella) as important in shaping the bacterial community; libraries from colonies with high quantities of these Orbaceae members were also likely to contain fewer Bifidobacteria and Lactobacillus species (such as Firm-4). However, co-culture assays, using isolates from these major clades, were unable to confirm any antagonistic interaction between Gilliamella and honey bee gut bacteria. Our results suggest that honey bee colony productivity is associated with increased bacterial diversity, although the mechanism behind this correlation has yet to be determined. Our results also suggest researchers should not base inferences of bacterial interactions solely on correlations found using sequencing. Instead, we suggest that depth of sequencing and library size can dramatically influence statistically significant results from sequence analysis of amplicons and should be cautiously interpreted.

  3. Epidemic history of hepatitis C virus infection in two remote communities in Nigeria, West Africa.

    PubMed

    Forbi, Joseph C; Purdy, Michael A; Campo, David S; Vaughan, Gilberto; Dimitrova, Zoya E; Ganova-Raeva, Lilia M; Xia, Guo-Liang; Khudyakov, Yury E

    2012-07-01

    We investigated the molecular epidemiology and population dynamics of HCV infection among indigenes of two semi-isolated communities in North-Central Nigeria. Despite remoteness and isolation, ~15% of the population had serological or molecular markers of hepatitis C virus (HCV) infection. Phylogenetic analysis of the NS5b sequences obtained from 60 HCV-infected residents showed that HCV variants belonged to genotype 1 (n=51; 85%) and genotype 2 (n=9; 15%). All sequences were unique and intermixed in the phylogenetic tree with HCV sequences from people infected from other West African countries. The high-throughput 454 pyrosequencing of the HCV hypervariable region 1 and an empirical threshold error correction algorithm were used to evaluate intra-host heterogeneity of HCV strains of genotype 1 (n=43) and genotype 2 (n=6) from residents of the communities. Analysis revealed a rare detectable intermixing of HCV intra-host variants among residents. Identification of genetically close HCV variants among all known groups of relatives suggests a common intra-familial HCV transmission in the communities. Applying Bayesian coalescent analysis to the NS5b sequences, the most recent common ancestors for genotype 1 and 2 variants were estimated to have existed 675 and 286 years ago, respectively. Bayesian skyline plots suggest that HCV lineages of both genotypes identified in the Nigerian communities experienced epidemic growth for 200-300 years until the mid-20th century. The data suggest a massive introduction of numerous HCV variants to the communities during the 20th century in the background of a dynamic evolutionary history of the hepatitis C epidemic in Nigeria over the past three centuries.

  4. Large-Scale Genomic Analysis of Codon Usage in Dengue Virus and Evaluation of Its Phylogenetic Dependence

    PubMed Central

    Lara-Ramírez, Edgar E.; Salazar, Ma Isabel; López-López, María de Jesús; Salas-Benito, Juan Santiago; Sánchez-Varela, Alejandro

    2014-01-01

    The increasing number of dengue virus (DENV) genome sequences available allows identifying the contributing factors to DENV evolution. In the present study, the codon usage in serotypes 1–4 (DENV1–4) has been explored for 3047 sequenced genomes using different statistical methods. The correlation analysis of total GC content (GC) with GC content at the three nucleotide positions of codons (GC1, GC2, and GC3) as well as the effective number of codons (ENC, ENCp) versus GC3 plots revealed mutational bias and purifying selection pressures as the major forces influencing the codon usage, but with distinct pressure on specific nucleotide positions in the codon. The correspondence analysis (CA) and clustering analysis on relative synonymous codon usage (RSCU) within each serotype showed similar clustering patterns to the phylogenetic analysis of nucleotide sequences for DENV1–4. These clustering patterns are strongly related to the virus geographic origin. The phylogenetic dependence analysis also suggests that stabilizing selection acts on the codon usage bias. Our large-scale analysis reveals new features of DENV genomic evolution. PMID:25136631
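
    As an illustration of the GC, GC1, GC2, and GC3 statistics named in this abstract, the following is a minimal sketch and not the authors' pipeline. The function name gc_by_codon_position is hypothetical, and the sketch assumes an in-frame coding sequence with no ambiguity codes.

```python
def gc_by_codon_position(cds: str) -> dict:
    """Return total GC and GC at codon positions 1, 2 and 3 as fractions.

    Assumes `cds` is an in-frame coding sequence; a trailing partial
    codon, if any, is dropped before counting.
    """
    seq = cds.upper()
    seq = seq[: len(seq) - len(seq) % 3]  # keep whole codons only
    if not seq:
        raise ValueError("sequence is shorter than one codon")

    def gc_fraction(bases: str) -> float:
        return sum(base in "GC" for base in bases) / len(bases)

    return {
        "GC": gc_fraction(seq),
        "GC1": gc_fraction(seq[0::3]),  # first codon positions
        "GC2": gc_fraction(seq[1::3]),  # second codon positions
        "GC3": gc_fraction(seq[2::3]),  # third (mostly synonymous) positions
    }
```

    Computing these values per genome and plotting GC3 against an effective-number-of-codons estimate would give the kind of plot the abstract refers to; the ENC calculation itself is not shown here.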

  5. Comparative genomic analysis reveals a novel mitochondrial isoform of human rTS protein and unusual phylogenetic distribution of the rTS gene

    PubMed Central

    Liang, Ping; Nair, Jayakumar R; Song, Lei; McGuire, John J; Dolnick, Bruce J

    2005-01-01

    Background: The rTS gene (ENOSF1), first identified in Homo sapiens as a gene complementary to the thymidylate synthase (TYMS) mRNA, is known to encode two protein isoforms, rTSα and rTSβ. The rTSβ isoform appears to be an enzyme responsible for the synthesis of signaling molecules involved in the down-regulation of thymidylate synthase, but the exact cellular functions of rTS genes are largely unknown. Results: Through comparative genomic sequence analysis, we predicted the existence of a novel protein isoform, rTSγ, which has a 27 residue longer N-terminus by virtue of utilizing an alternative start codon located upstream of the start codon in rTSβ. We observed that a similar extended N-terminus could be predicted in all rTS genes for which genomic sequences are available and the extended regions are conserved from bacteria to human. Therefore, we reasoned that the protein with the extended N-terminus might represent an ancestral form of the rTS protein. Sequence analysis strongly predicts a mitochondrial signal sequence in the extended N-terminal of human rTSγ, which is absent in rTSβ. We confirmed the existence of rTSγ in human mitochondria experimentally by demonstrating the presence of both rTSγ and rTSβ proteins in mitochondria isolated by subcellular fractionation. In addition, our comprehensive analysis of rTS orthologous sequences reveals an unusual phylogenetic distribution of this gene, which suggests the occurrence of one or more horizontal gene transfer events. Conclusion: The presence of two rTS isoforms in mitochondria suggests that the rTS signaling pathway may be active within mitochondria. Our report also presents an example of identifying novel protein isoforms and of improving gene annotation through comparative genomic analysis. PMID:16162288

  6. Molecular analysis of a 11 700-year-old rodent midden from the Atacama Desert, Chile

    USGS Publications Warehouse

    Kuch, M.; Rohland, N.; Betancourt, J.L.; Latorre, C.; Steppan, S.; Poinar, H.N.

    2002-01-01

    DNA was extracted from an 11 700-year-old rodent midden from the Atacama Desert, Chile and the chloroplast and animal mitochondrial DNA (mtDNA) gene sequences were analysed to investigate the floral environment surrounding the midden, and the identity of the midden agent. The plant sequences, together with the macroscopic identifications, suggest the presence of 13 plant families and three orders that no longer exist today at the midden locality, and thus point to a much more diverse and humid climate 11 700 years ago. The mtDNA sequences suggest the presence of at least four different vertebrates, which have been putatively identified as a camelid (vicuna), two rodents (Phyllotis and Abrocoma), and a cardinal bird (Passeriformes). To identify the midden agent, DNA was extracted from pooled faecal pellets, three small overlapping fragments of the mitochondrial cytochrome b gene were amplified and multiple clones were sequenced. These results were analysed along with complete cytochrome b sequences for several modern Phyllotis species to place the midden sequence phylogenetically. The results identified the midden agent as belonging to an ancestral P. limatus. Today, P. limatus is not found at the midden locality but it can be found 100 km to the north, indicating at least a small range shift. The more extensive sampling of modern Phyllotis reinforces the suggestion that P. limatus is recently derived from a peripheral isolate.

  7. A Generalized Least-Squares Estimate for the Origin of Sporophytic Self-Incompatibility

    PubMed Central

    Uyenoyama, M. K.

    1995-01-01

    Analysis of nucleotide sequences that regulate the expression of self-incompatibility in flowering plants affords a direct means of examining classical hypotheses for the origin and evolution of this major feature of mating systems. Departing from the classical view of monophyly of all forms of self-incompatibility, the current paradigm for the origin of self-incompatibility postulates multiple episodes of recruitment and modification of preexisting genes. In Brassica, the S locus, which regulates sporophytic self-incompatibility, shows homology to a multigene family present both in self-compatible congeners and in groups for which this form of self-incompatibility is atypical. A phylogenetic analysis of S-allele sequences together with homologous sequences that do not cosegregate with self-incompatibility permits dating the change of function that marked the origin of self-incompatibility. A generalized least-squares method is introduced that provides closed-form expressions for estimates and standard errors for function-specific divergence rates and times of divergence among sequences. This analysis suggests that the age of the sporophytic self-incompatibility system expressed in Brassica exceeds species divergence within the genus by four- to fivefold. The extraordinarily high levels of sequence diversity exhibited by S alleles appear to reflect their ancient derivation, with the alternative hypothesis of hypermutability rejected by the analysis. PMID:7713446

  8. Interpreting the biological relevance of bioinformatic analyses with T-DNA sequence for protein allergenicity.

    PubMed

    Harper, B; McClain, S; Ganko, E W

    2012-08-01

    Global regulatory agencies require bioinformatic sequence analysis as part of their safety evaluation for transgenic crops. Analysis typically focuses on encoded proteins and adjacent endogenous flanking sequences. Recently, regulatory expectations have expanded to include all reading frames of the inserted DNA. The intent is to provide biologically relevant results that can be used in the overall assessment of safety. This paper evaluates the relevance of assessing the allergenic potential of all DNA reading frames found in common food genes using methods considered for the analysis of T-DNA sequences used in transgenic crops. FASTA and BLASTX algorithms were used to compare genes from maize, rice, soybean, cucumber, melon, watermelon, and tomato using international regulatory guidance. Results show that BLASTX for maize yielded 7254 alignments that exceeded allergen similarity thresholds and 210,772 alignments that matched eight or more consecutive amino acids with an allergen; other crops produced similar results. This analysis suggests that each nontransgenic crop has a much greater potential for allergenic risk than what has been observed clinically. We demonstrate that a meaningful safety assessment is unlikely to be provided by using methods with inherently high frequencies of false positive alignments when broadly applied to all reading frames of DNA sequence. Copyright © 2012 Elsevier Inc. All rights reserved.
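
    The "eight or more consecutive amino acids" criterion mentioned in this abstract can be illustrated with an exact k-mer lookup. This is a minimal sketch under stated assumptions: the function name shared_kmers and the example sequences are hypothetical, and the abstract does not say how the authors implemented the search.

```python
def shared_kmers(query: str, allergen: str, k: int = 8) -> list:
    """List (position, peptide) for every k-residue window of `query`
    that occurs exactly, anywhere, in `allergen`."""
    allergen_kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    hits = []
    for i in range(len(query) - k + 1):
        window = query[i:i + k]
        if window in allergen_kmers:
            hits.append((i, window))
    return hits


# Hypothetical sequences, for illustration only.
print(shared_kmers("MKTAYIAKQRQISFVK", "XXTAYIAKQRYY"))
```

    Because short exact matches occur by chance, running such a lookup over all six reading frames of many genes would be expected to produce many hits, which is the false-positive concern the abstract raises.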

  9. Complete Genome Sequence and Comparative Analysis of the Fish Pathogen Lactococcus garvieae

    PubMed Central

    Oshima, Kenshiro; Yoshizaki, Mariko; Kawanishi, Michiko; Nakaya, Kohei; Suzuki, Takehito; Miyauchi, Eiji; Ishii, Yasuo; Tanabe, Soichi; Murakami, Masaru; Hattori, Masahira

    2011-01-01

    Lactococcus garvieae causes fatal haemorrhagic septicaemia in fish such as yellowtail. The comparative analysis of genomes of a virulent strain Lg2 and a non-virulent strain ATCC 49156 of L. garvieae revealed that the two strains shared a high degree of sequence identity, but Lg2 had a 16.5-kb capsule gene cluster that is absent in ATCC 49156. The capsule gene cluster was composed of 15 genes, of which eight genes are highly conserved with those in the exopolysaccharide biosynthesis gene cluster often found in Lactococcus lactis strains. Sequence analysis of the capsule gene cluster in the less virulent strain L. garvieae Lg2-S, an Lg2-derived strain, showed that two conserved genes were each disrupted by a single base pair deletion. These results strongly suggest that the capsule is crucial for virulence of Lg2. The capsule gene cluster of Lg2 may be a genomic island, judging from several features such as the presence of insertion sequences flanked on both ends, different GC content from the chromosomal average, integration into the locus syntenic to other lactococcal genome sequences, and distribution in human gut microbiomes. The analysis also predicted other potential virulence factors such as haemolysin. The present study provides new insights into understanding of the virulence mechanisms of L. garvieae in fish. PMID:21829716

  10. High diversity and rapid diversification in the head louse, Pediculus humanus (Pediculidae: Phthiraptera)

    PubMed Central

    Ashfaq, Muhammad; Prosser, Sean; Nasir, Saima; Masood, Mariyam; Ratnasingham, Sujeevan; Hebert, Paul D. N.

    2015-01-01

    The study analyzes sequence variation of two mitochondrial genes (COI, cytb) in Pediculus humanus from three countries (Egypt, Pakistan, South Africa) that have received little prior attention, and integrates these results with prior data. Analysis indicates a maximum K2P distance of 10.3% among 960 COI sequences and 13.8% among 479 cytb sequences. Three analytical methods (BIN, PTP, ABGD) reveal five concordant OTUs for COI and cytb. Neighbor-Joining analysis of the COI sequences confirms five clusters; three corresponding to previously recognized mitochondrial clades A, B, C and two new clades, “D” and “E”, showing 2.3% and 2.8% divergence from their nearest neighbors (NN). Cytb data corroborate five clusters showing that clades “D” and “E” are both 4.6% divergent from their respective NN clades. Phylogenetic analysis supports the monophyly of all clusters recovered by NJ analysis. Divergence time estimates suggest that the earliest split of P. humanus clades occurred slightly more than one million years ago (MYa) and the latest about 0.3 MYa. Sequence divergences in COI and cytb among the five clades of P. humanus are 10X those in their human host, a difference that likely reflects both rate acceleration and the acquisition of lice clades from several archaic hominid lineages. PMID:26373806
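
    The K2P (Kimura two-parameter) distance quoted in this abstract is, in its standard form, d = -0.5 ln(1 - 2P - Q) - 0.25 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions between two aligned sequences. The sketch below computes it for one pair; the function name k2p_distance is hypothetical, and skipping gapped or ambiguous sites is an assumption of this sketch, not a detail taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or a non-ACGT symbol are
    skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

    A maximum K2P distance such as the one reported would be the largest value of this quantity over all sequence pairs in the data set.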

  11. Hv1 Proton Channels in Dinoflagellates: Not Just for Bioluminescence?

    PubMed

    Kigundu, Gabriel; Cooper, Jennifer L; Smith, Susan M E

    2018-04-26

    Bioluminescence in dinoflagellates is controlled by HV1 proton channels. Database searches of dinoflagellate transcriptomes and genomes yielded hits with sequence features diagnostic of all confirmed HV1, and show that HV1 is widely distributed in the dinoflagellate phylogeny including the basal species Oxyrrhis marina. Multiple sequence alignments followed by phylogenetic analysis revealed three major subfamilies of HV1 that do not correlate with presence of theca, autotrophy, geographic location, or bioluminescence. These data suggest that most dinoflagellates express an HV1 which has a function separate from bioluminescence. Sequence evidence also suggests that dinoflagellates can contain more than one HV1 gene. This article is protected by copyright. All rights reserved.

  12. Molecular comparison of the structural proteins encoding gene clusters of two related Lactobacillus delbrueckii bacteriophages.

    PubMed Central

    Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T

    1993-01-01

    Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conservative module. PMID:8497043

  13. Genetic discovery in Xylella fastidiosa through sequence analysis of selected randomly amplified polymorphic DNAs.

    PubMed

    Chen, Jianchi; Civerolo, Edwin L; Jarret, Robert L; Van Sluys, Marie-Anne; de Oliveira, Mariana C

    2005-02-01

    Xylella fastidiosa causes many important plant diseases including Pierce's disease (PD) in grape and almond leaf scorch disease (ALSD). DNA-based methodologies, such as randomly amplified polymorphic DNA (RAPD) analysis, have played key roles in collecting genetic information on the bacterium. This study further analyzed the nucleotide sequences of selected RAPDs from X. fastidiosa strains in conjunction with the available genome sequence databases and unveiled several previously unknown genetic traits. These include a sequence highly similar to those in the phage family of Podoviridae. Genome comparisons among X. fastidiosa strains suggested that the "phage" is currently active. Two other RAPDs were also related to horizontal gene transfer: one was part of a broadly distributed cryptic plasmid and the other was associated with conjugal transfer. One RAPD inferred a genomic rearrangement event among X. fastidiosa PD strains and another identified a single nucleotide polymorphism of evolutionary value.

  14. Cutaneous Granulomas in Dolphins Caused by Novel Uncultivated Paracoccidioides brasiliensis

    PubMed Central

    Vilela, Raquel; Bossart, Gregory D.; St. Leger, Judy A.; Dalton, Leslie M.; Reif, John S.; Schaefer, Adam M.; McCarthy, Peter J.; Fair, Patricia A.

    2016-01-01

    Cutaneous granulomas in dolphins were believed to be caused by Lacazia loboi, which also causes a similar disease in humans. This hypothesis was recently challenged by reports that fungal DNA sequences from dolphins grouped this pathogen with Paracoccidioides brasiliensis. We conducted phylogenetic analysis of fungi from 6 bottlenose dolphins (Tursiops truncatus) with cutaneous granulomas and chains of yeast cells in infected tissues. Kex gene sequences of P. brasiliensis from dolphins showed 100% homology with sequences from cultivated P. brasiliensis, 73% with those of L. loboi, and 93% with those of P. lutzii. Parsimony analysis placed DNA sequences from dolphins within a cluster with human P. brasiliensis strains. This cluster was the sister taxon to P. lutzii and L. loboi. Our molecular data support previous findings and suggest that a novel uncultivated strain of P. brasiliensis restricted to cutaneous lesions in dolphins is probably the cause of lacaziosis/lobomycosis, herein referred to as paracoccidioidomycosis ceti. PMID:27869614

  15. Cutaneous Granulomas in Dolphins Caused by Novel Uncultivated Paracoccidioides brasiliensis.

    PubMed

    Vilela, Raquel; Bossart, Gregory D; St Leger, Judy A; Dalton, Leslie M; Reif, John S; Schaefer, Adam M; McCarthy, Peter J; Fair, Patricia A; Mendoza, Leonel

    2016-12-01

    Cutaneous granulomas in dolphins were believed to be caused by Lacazia loboi, which also causes a similar disease in humans. This hypothesis was recently challenged by reports that fungal DNA sequences from dolphins grouped this pathogen with Paracoccidioides brasiliensis. We conducted phylogenetic analysis of fungi from 6 bottlenose dolphins (Tursiops truncatus) with cutaneous granulomas and chains of yeast cells in infected tissues. Kex gene sequences of P. brasiliensis from dolphins showed 100% homology with sequences from cultivated P. brasiliensis, 73% with those of L. loboi, and 93% with those of P. lutzii. Parsimony analysis placed DNA sequences from dolphins within a cluster with human P. brasiliensis strains. This cluster was the sister taxon to P. lutzii and L. loboi. Our molecular data support previous findings and suggest that a novel uncultivated strain of P. brasiliensis restricted to cutaneous lesions in dolphins is probably the cause of lacaziosis/lobomycosis, herein referred to as paracoccidioidomycosis ceti.

  16. Analysis of simple sequence repeat (SSR) structure and sequence within Epichloë endophyte genomes reveals impacts on gene structure and insights into ancestral hybridization events.

    PubMed

    Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry

    2017-01-01

    Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Known for the beneficial characteristics they endow upon their grass hosts, these endophyte species have been of great interest to identify, both agronomically and scientifically. Simple sequence repeat loci and the variation in their repeat elements have been used to rapidly identify endophyte species and strains; however, little is known of how the structure of repeat elements changes between species and strains, and where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.

  17. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution

    PubMed Central

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2005-01-01

    We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement and high coadaptation of both male genes and cis-regulatory sequences emerge as important themes of genome divergence between these species of Drosophila. PMID:15632085

  18. Conservation and variability of West Nile virus proteins.

    PubMed

    Koo, Qi Ying; Khan, Asif M; Jung, Keun-Ok; Ramdas, Shweta; Miotto, Olivo; Tan, Tin Wee; Brusic, Vladimir; Salmon, Jerome; August, J Thomas

    2009-01-01

    West Nile virus (WNV) has emerged globally as an increasingly important pathogen for humans and domestic animals. Studies of the evolutionary diversity of the virus over its known history will help to elucidate conserved sites, and characterize their correspondence to other pathogens and their relevance to the immune system. We describe a large-scale analysis of the entire WNV proteome, aimed at identifying and characterizing evolutionarily conserved amino acid sequences. This study, which used 2,746 WNV protein sequences collected from the NCBI GenPept database, focused on analysis of peptides of length 9 amino acids or more, which are immunologically relevant as potential T-cell epitopes. Entropy-based analysis of the diversity of WNV sequences revealed the presence of numerous evolutionarily stable nonamer positions across the proteome (entropy value of ≤ 1). The representation (frequency) of nonamers variant to the predominant peptide at these stable positions was, generally, low (≤ 10% of the WNV sequences analyzed). Eighty-eight fragments of length 9-29 amino acids, representing approximately 34% of the WNV polyprotein length, were identified to be identical and evolutionarily stable in all analyzed WNV sequences. Of the 88 completely conserved sequences, 67 are also present in other flaviviruses, and several have been associated with the functional and structural properties of viral proteins. Immunoinformatic analysis revealed that the majority (78/88) of conserved sequences are potentially immunogenic, while 44 contained experimentally confirmed human T-cell epitopes. This study identified a comprehensive catalogue of completely conserved WNV sequences, many of which are shared by other flaviviruses, and the majority are potential epitopes. The complete conservation of these immunologically relevant sequences through the entire recorded WNV history suggests they will be valuable as components of peptide-specific vaccines or other therapeutic applications, for sequence-specific diagnosis of a wide-range of Flavivirus infections, and for studies of homologous sequences among other flaviviruses.
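
    The entropy-based measure of nonamer diversity described in this abstract can be sketched as Shannon entropy over the 9-residue peptides observed at each start position of an alignment. This is a minimal illustration only: the function name nonamer_entropy is hypothetical, and the use of base-2 logarithms and of pre-aligned, equal-length sequences are assumptions, since the abstract does not give those details.

```python
import math
from collections import Counter


def nonamer_entropy(aligned_seqs: list, k: int = 9) -> list:
    """Shannon entropy (bits) of the k-mers seen at each start position.

    `aligned_seqs` must be equal-length, pre-aligned protein sequences.
    An entropy of 0 means every sequence carries the same k-mer there.
    """
    if not aligned_seqs:
        raise ValueError("no sequences given")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must have equal length")
    entropies = []
    for start in range(length - k + 1):
        counts = Counter(seq[start:start + k] for seq in aligned_seqs)
        total = sum(counts.values())
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        entropies.append(entropy)
    return entropies
```

    Positions whose entropy falls at or below a chosen cutoff could then be flagged as stable, and runs of zero-entropy positions would correspond to completely conserved fragments.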

  19. A diverse family of serine proteinase genes expressed in cotton boll weevil (Anthonomus grandis): implications for the design of pest-resistant transgenic cotton plants.

    PubMed

    Oliveira-Neto, Osmundo B; Batista, João A N; Rigden, Daniel J; Fragoso, Rodrigo R; Silva, Rodrigo O; Gomes, Eliane A; Franco, Octávio L; Dias, Simoni C; Cordeiro, Célia M T; Monnerat, Rose G; Grossi-De-Sá, Maria F

    2004-09-01

    Fourteen different cDNA fragments encoding serine proteinases were isolated by reverse transcription-PCR from cotton boll weevil (Anthonomus grandis) larvae. A large diversity between the sequences was observed, with a mean pairwise identity of 22% in the amino acid sequence. The cDNAs encompassed 11 trypsin-like sequences classifiable into three families and three chymotrypsin-like sequences belonging to a single family. Using a combination of 5' and 3' RACE, the full-length sequence was obtained for five of the cDNAs, named Agser2, Agser5, Agser6, Agser10 and Agser21. The encoded proteins included amino acid sequence motifs of serine proteinase active sites, conserved cysteine residues, and both zymogen activation and signal peptides. Southern blotting analysis suggested that one or two copies of these serine proteinase genes exist in the A. grandis genome. Northern blotting analysis of Agser2 and Agser5 showed that for both genes, expression is induced upon feeding and is concentrated in the gut of larvae and adult insects. Reverse northern analysis of the 14 cDNA fragments showed that only two trypsin-like and two chymotrypsin-like were expressed at detectable levels. Under the effect of the serine proteinase inhibitors soybean Kunitz trypsin inhibitor and black-eyed pea trypsin/chymotrypsin inhibitor, expression of one of the trypsin-like sequences was upregulated while expression of the two chymotrypsin-like sequences was downregulated. Copyright 2004 Elsevier Ltd.

  20. DNA Barcode Analysis of Thrips (Thysanoptera) Diversity in Pakistan Reveals Cryptic Species Complexes.

    PubMed

    Iftikhar, Romana; Ashfaq, Muhammad; Rasool, Akhtar; Hebert, Paul D N

    2016-01-01

    Although thrips are globally important crop pests and vectors of viral disease, species identifications are difficult because of their small size and inconspicuous morphological differences. Sequence variation in the mitochondrial COI-5' (DNA barcode) region has proven effective for the identification of species in many groups of insect pests. We analyzed barcode sequence variation among 471 thrips from various plant hosts in north-central Pakistan. The Barcode Index Number (BIN) system assigned these sequences to 55 BINs, while the Automatic Barcode Gap Discovery detected 56 partitions, a count that coincided with the number of monophyletic lineages recognized by Neighbor-Joining analysis and Bayesian inference. Congeneric species showed an average of 19% sequence divergence (range = 5.6% - 27%) at COI, while intraspecific distances averaged 0.6% (range = 0.0% - 7.6%). BIN analysis suggested that all intraspecific divergence >3.0% actually involved a species complex. In fact, sequences for three major pest species (Haplothrips reuteri, Thrips palmi, Thrips tabaci), and one predatory thrips (Aeolothrips intermedius) showed deep intraspecific divergences, providing evidence that each is a cryptic species complex. The study compiles the first barcode reference library for the thrips of Pakistan, and examines global haplotype diversity in four important pest thrips.

  1. Biosynthesis of Lipoic Acid in Arabidopsis: Cloning and Characterization of the cDNA for Lipoic Acid Synthase

    PubMed Central

    Yasuno, Rie; Wada, Hajime

    1998-01-01

    Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide. PMID:9808738

  2. Large-scale genomic analyses reveal the population structure and evolutionary trends of Streptococcus agalactiae strains in Brazilian fish farms.

    PubMed

    Barony, Gustavo M; Tavares, Guilherme C; Pereira, Felipe L; Carvalho, Alex F; Dorella, Fernanda A; Leal, Carlos A G; Figueiredo, Henrique C P

    2017-10-19

    Streptococcus agalactiae is a major pathogen and a hindrance to tilapia farming worldwide. The aims of this work were to analyze the genomic evolution of Brazilian strains of S. agalactiae and to establish spatial and temporal relations between strains isolated from different outbreaks of streptococcosis. A total of 39 strains were obtained from outbreaks and their whole genomes were sequenced and annotated for comparative analysis of multilocus sequence typing, genomic similarity and whole genome multilocus sequence typing (wgMLST). The Brazilian strains presented two sequence types, including a newly described ST, and a non-typeable lineage. The use of wgMLST could differentiate each strain in a single clone and was used to establish temporal and geographical correlations among strains. Bayesian phylogenomic analysis suggests that the studied Brazilian population was co-introduced in the country with their host, approximately 60 years ago. Brazilian strains of S. agalactiae were shown to be heterogeneous in their genome sequences and were distributed in different regions of the country according to their genotype, which allowed the use of wgMLST analysis to track each outbreak event individually.

  3. Phylogenetic analysis of nitrite, nitric oxide, and nitrous oxide respiratory enzymes reveal a complex evolutionary history for denitrification.

    PubMed

    Jones, Christopher M; Stres, Blaz; Rosenquist, Magnus; Hallin, Sara

    2008-09-01

    Denitrification is a facultative respiratory pathway in which nitrite (NO2(-)), nitric oxide (NO), and nitrous oxide (N2O) are successively reduced to nitrogen gas (N(2)), effectively closing the nitrogen cycle. The ability to denitrify is widely dispersed among prokaryotes, and this polyphyletic distribution has raised the possibility of horizontal gene transfer (HGT) having a substantial role in the evolution of denitrification. Comparisons of 16S rRNA and denitrification gene phylogenies in recent studies support this possibility; however, these results remain speculative as they are based on visual comparisons of phylogenies from partial sequences. We reanalyzed publicly available nirS, nirK, norB, and nosZ partial sequences using Bayesian and maximum likelihood phylogenetic inference. Concomitant analysis of denitrification genes with 16S rRNA sequences from the same organisms showed substantial differences between the trees, which were supported by examining the posterior probability of monophyletic constraints at different taxonomic levels. Although these differences suggest HGT of denitrification genes, the presence of structural variants for nirK, norB, and nosZ makes it difficult to determine HGT from other evolutionary events. Additional analysis using phylogenetic networks and likelihood ratio tests of phylogenies based on full-length sequences retrieved from genomes also revealed significant differences in tree topologies among denitrification and 16S rRNA gene phylogenies, with the exception of the nosZ gene phylogeny within the data set of the nirK-harboring genomes. However, inspection of codon usage and G + C content plots from complete genomes gave no evidence for recent HGT. Instead, the close proximity of denitrification gene copies in the genomes of several denitrifying bacteria suggests duplication. 
Although HGT cannot be ruled out as a factor in the evolution of denitrification genes, our analysis suggests that other phenomena, such as gene duplication/divergence and lineage sorting, may have differently influenced the evolution of each denitrification gene.
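
    The abstract above mentions inspecting G + C content plots from complete genomes for signs of recent HGT. As a minimal sketch of how such a profile might be computed (the function names, window size and step are illustrative assumptions, not the authors' actual procedure), a sliding-window G + C fraction can be written as:

```python
def gc_content(seq: str) -> float:
    """Fraction of G or C among unambiguous bases (A, C, G, T); 0.0 if none."""
    seq = seq.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / unambiguous


def gc_windows(seq: str, window: int, step: int) -> list[float]:
    """G + C fraction of each full-length window along the sequence."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence: an AT-rich stretch followed by a GC-rich one.
    toy = "ATATATATAT" + "GCGCGGCCGC"
    print(gc_windows(toy, window=10, step=5))
```

    A region whose G + C fraction departs sharply from the genome-wide value is one commonly used hint of recently acquired DNA; a flat profile, as reported above, gives no such hint.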

  4. Efficient experimental design and analysis strategies for the detection of differential expression using RNA-Sequencing

    PubMed Central

    2012-01-01

    Background: RNA sequencing (RNA-Seq) has emerged as a powerful approach for the detection of differential gene expression with both high-throughput and high resolution capabilities possible depending upon the experimental design chosen. Multiplex experimental designs are now readily available; these can be utilised to increase the numbers of samples or replicates profiled at the cost of decreased sequencing depth generated per sample. These strategies impact on the power of the approach to accurately identify differential expression. This study presents a detailed analysis of the power to detect differential expression in a range of scenarios including simulated null and differential expression distributions with varying numbers of biological or technical replicates, sequencing depths and analysis methods. Results: Differential and non-differential expression datasets were simulated using a combination of negative binomial and exponential distributions derived from real RNA-Seq data. These datasets were used to evaluate the performance of three commonly used differential expression analysis algorithms and to quantify the changes in power with respect to true and false positive rates when simulating variations in sequencing depth, biological replication and multiplex experimental design choices. Conclusions: This work quantitatively explores comparisons between contemporary analysis tools and experimental design choices for the detection of differential expression using RNA-Seq. We found that the DESeq algorithm performs more conservatively than edgeR and NBPSeq. With regard to testing of various experimental designs, this work strongly suggests that greater power is gained through the use of biological replicates relative to library (technical) replicates and sequencing depth. Strikingly, sequencing depth could be reduced as low as 15% without substantial impacts on false positive or true positive rates. PMID:22985019

  5. Efficient experimental design and analysis strategies for the detection of differential expression using RNA-Sequencing.

    PubMed

    Robles, José A; Qureshi, Sumaira E; Stephen, Stuart J; Wilson, Susan R; Burden, Conrad J; Taylor, Jennifer M

    2012-09-17

    RNA sequencing (RNA-Seq) has emerged as a powerful approach for the detection of differential gene expression with both high-throughput and high resolution capabilities possible depending upon the experimental design chosen. Multiplex experimental designs are now readily available; these can be utilised to increase the numbers of samples or replicates profiled at the cost of decreased sequencing depth generated per sample. These strategies impact on the power of the approach to accurately identify differential expression. This study presents a detailed analysis of the power to detect differential expression in a range of scenarios including simulated null and differential expression distributions with varying numbers of biological or technical replicates, sequencing depths and analysis methods. Differential and non-differential expression datasets were simulated using a combination of negative binomial and exponential distributions derived from real RNA-Seq data. These datasets were used to evaluate the performance of three commonly used differential expression analysis algorithms and to quantify the changes in power with respect to true and false positive rates when simulating variations in sequencing depth, biological replication and multiplex experimental design choices. This work quantitatively explores comparisons between contemporary analysis tools and experimental design choices for the detection of differential expression using RNA-Seq. We found that the DESeq algorithm performs more conservatively than edgeR and NBPSeq. With regard to testing of various experimental designs, this work strongly suggests that greater power is gained through the use of biological replicates relative to library (technical) replicates and sequencing depth. Strikingly, sequencing depth could be reduced as low as 15% without substantial impacts on false positive or true positive rates.
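
    The abstract describes simulating read counts from negative binomial distributions at varying sequencing depths and replicate numbers. The sketch below shows one generic way to draw such counts, as a gamma-Poisson mixture using only the Python standard library. It is not the authors' simulation code: the function names, the mean of 20 reads, the dispersion of 0.2 and the replicate counts are all hypothetical values chosen for illustration.

```python
import math
import random


def poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson variate by Knuth's multiplication method (fine for modest means)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def negative_binomial(mean: float, dispersion: float, rng: random.Random) -> int:
    """Gamma-Poisson mixture with variance = mean + dispersion * mean**2."""
    shape = 1.0 / dispersion
    scale = mean * dispersion  # shape * scale equals the requested mean
    return poisson(rng.gammavariate(shape, scale), rng)


def simulate_counts(mean, dispersion, depth_fraction, replicates, seed=0):
    """Counts for one gene across replicates, with depth scaled by depth_fraction."""
    rng = random.Random(seed)
    return [
        negative_binomial(mean * depth_fraction, dispersion, rng)
        for _ in range(replicates)
    ]


if __name__ == "__main__":
    # Hypothetical design comparison: full depth with 3 replicates
    # versus 15% depth with 20 replicates.
    print(simulate_counts(20.0, 0.2, 1.0, replicates=3))
    print(simulate_counts(20.0, 0.2, 0.15, replicates=20))
```

    Scaling the mean by `depth_fraction` while holding the dispersion fixed is a simplifying assumption; it captures the idea that multiplexing trades per-sample depth for additional replicates.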

  6. Structural analysis of DNA binding by C.Csp231I, a member of a novel class of R-M controller proteins regulating gene expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shevtsov, M. B.; Streeter, S. D.; Thresh, S.-J.

    2015-02-01

    The structure of the new class of controller proteins (exemplified by C.Csp231I) in complex with its 21 bp DNA-recognition sequence is presented, and the molecular basis of sequence recognition in this class of proteins is discussed. An unusual extended spacer between the dimer binding sites suggests a novel interaction between the two C-protein dimers. In a wide variety of bacterial restriction–modification systems, a regulatory ‘controller’ protein (or C-protein) is required for effective transcription of its own gene and for transcription of the endonuclease gene found on the same operon. We have recently turned our attention to a new class of controller proteins (exemplified by C.Csp231I) that have quite novel features, including a much larger DNA-binding site with an 18 bp (∼60 Å) spacer between the two palindromic DNA-binding sequences and a very different recognition sequence from the canonical GACT/AGTC. Using X-ray crystallography, the structure of the protein in complex with its 21 bp DNA-recognition sequence was solved to 1.8 Å resolution, and the molecular basis of sequence recognition in this class of proteins was elucidated. An unusual aspect of the promoter sequence is the extended spacer between the dimer binding sites, suggesting a novel interaction between the two C-protein dimers when bound to both recognition sites correctly spaced on the DNA. A U-bend model is proposed for this tetrameric complex, based on the results of gel-mobility assays, hydrodynamic analysis and the observation of key contacts at the interface between dimers in the crystal.

  7. Factor structure of paediatric timed motor examination and its relationship with IQ

    PubMed Central

    MARTIN, REBECCA; TIGERA, CASSIE; DENCKLA, MARTHA B; MAHONE, E MARK

    2012-01-01

    AIM Brain systems supporting higher cognitive and motor control develop in a parallel manner, dependent on functional integrity and maturation of related regions, suggesting neighbouring neural circuitry. Concurrent examination of motor and cognitive control can provide a window into neurological development. However, identification of performance-based measures that do not correlate with IQ has been a challenge. METHOD Timed motor performance from the Physical and Neurological Examination of Subtle Signs and IQ were analysed in 136 children aged 6 to 16 (mean age 10y 2.6mo, SD 2y 6.4mo; 98 female, 38 male) attending an outpatient neuropsychology clinic and 136 right-handed comparison individuals aged 6 to 16 (mean age 10y 3.1mo, SD 2y 6.1mo; 98 female, 38 male). Timed activities – three repetitive movements (toe tapping, hand patting, finger tapping) and three sequenced movements (heel–toe tap, hand pronate/supinate, finger sequencing) each performed on the right and left – were included in exploratory factor analyses. RESULTS Among comparison individuals, factor analysis yielded two factors – repetitive and sequenced movements – with the sequenced factor significantly predictive of Verbal IQ (VIQ) (ΔR2=0.018, p=0.019), but not the repetitive factor (ΔR2=0.004, p=0.39). Factor analysis within the clinical group yielded two similar factors (repetitive and sequenced), both significantly predictive of VIQ (ΔR2=0.028, p=0.015; ΔR2=0.046, p=0.002 respectively). INTERPRETATION Among typical children, repetitive timed tasks may be independent of IQ; however, sequenced tasks share more variance, implying shared neural substrates. Among neurologically vulnerable populations, however, both sequenced and repetitive movements covary with IQ, suggesting that repetitive speed is more indicative of underlying neurological integrity. PMID:20412260

  8. Comparative Analysis and Distribution of Omega-3 lcPUFA Biosynthesis Genes in Marine Molluscs

    PubMed Central

    Surm, Joachim M.; Prentis, Peter J.; Pavasovic, Ana

    2015-01-01

    Recent research has identified marine molluscs as an excellent source of omega-3 long-chain polyunsaturated fatty acids (lcPUFAs), based on their potential for endogenous synthesis of lcPUFAs. In this study we generated a representative list of fatty acyl desaturase (Fad) and elongation of very long-chain fatty acid (Elovl) genes from major orders of Phylum Mollusca, through the interrogation of transcriptome and genome sequences, and various publicly available databases. We have identified novel and uncharacterised Fad and Elovl sequences in the following species: Anadara trapezia, Nerita albicilla, Nerita melanotragus, Crassostrea gigas, Lottia gigantea, Aplysia californica, Loligo pealeii and Chlamys farreri. Based on alignments of translated protein sequences of Fad and Elovl genes, the haeme binding motif and histidine boxes of Fad proteins, and the histidine box and seventeen important amino acids in Elovl proteins, were highly conserved. Phylogenetic analysis of aligned reference sequences was used to reconstruct the evolutionary relationships for Fad and Elovl genes separately. Multiple, well resolved clades for both the Fad and Elovl sequences were observed, suggesting that repeated rounds of gene duplication best explain the distribution of Fad and Elovl proteins across the major orders of molluscs. For Elovl sequences, one clade contained the functionally characterised Elovl5 proteins, while another clade contained proteins hypothesised to have Elovl4 function. Additional well resolved clades consisted only of uncharacterised Elovl sequences. One clade from the Fad phylogeny contained only uncharacterised proteins, while the other clade contained functionally characterised delta-5 desaturase proteins. The discovery of an uncharacterised Fad clade is particularly interesting as these divergent proteins may have novel functions. 
Overall, this paper presents a number of novel Fad and Elovl genes suggesting that many mollusc groups possess most of the required enzymes for the synthesis of lcPUFAs. PMID:26308548

  9. Distinct profiles of expressed sequence tags during intestinal regeneration in the sea cucumber Holothuria glaberrima

    PubMed Central

    Rojas-Cartagena, Carmencita; Ortíz-Pineda, Pablo; Ramírez-Gómez, Francisco; Suárez-Castillo, Edna C.; Matos-Cruz, Vanessa; Rodríguez, Carlos; Ortíz-Zuazaga, Humberto; García-Arrarás, José E.

    2010-01-01

    Repair and regeneration are key processes for tissue maintenance, and their disruption may lead to disease states. Little is known about the molecular mechanisms that underlie the repair and regeneration of the digestive tract. The sea cucumber Holothuria glaberrima represents an excellent model to dissect and characterize the molecular events during intestinal regeneration. To study the gene expression profile, cDNA libraries were constructed from normal, 3-day, and 7-day regenerating intestines of H. glaberrima. Clones were randomly sequenced and queried against the nonredundant protein database at the National Center for Biotechnology Information. RT-PCR analyses were made of several genes to determine their expression profile during intestinal regeneration. A total of 5,173 sequences from three cDNA libraries were obtained. About 46.2, 35.6, and 26.2% of the sequences for the normal, 3-day, and 7-day cDNA libraries, respectively, shared significant similarity with known sequences in the protein database of GenBank but showed only 10% similarity among themselves. Analysis of the libraries in terms of functional processes, protein domains, and most common sequences suggests that a differential expression profile is taking place during the regeneration process. Further examination of the expressed sequence tag dataset revealed that 12 putative genes are differentially expressed at a significant level (R > 6). Experimental validation by RT-PCR analysis reveals that at least three genes (unknown C-4677-1, melanotransferrin, and centaurin) present a differential expression during regeneration. These findings strongly suggest that the gene expression profile varies among regeneration stages and provide evidence for the existence of differential gene expression. PMID:17579180

  10. Detection of integrated papillomavirus sequences by ligation-mediated PCR (DIPS-PCR) and molecular characterization in cervical cancer cells.

    PubMed

    Luft, F; Klaes, R; Nees, M; Dürst, M; Heilmann, V; Melsheimer, P; von Knebel Doeberitz, M

    2001-04-01

    Human papillomavirus (HPV) genomes usually persist as episomal molecules in HPV-associated preneoplastic lesions whereas they are frequently integrated into the host cell genome in HPV-related cancer cells. This suggests that malignant conversion of HPV-infected epithelia is linked to recombination of cellular and viral sequences. Due to technical limitations, precise sequence information on viral-cellular junctions was obtained only for a few cell lines and primary lesions. In order to facilitate the molecular analysis of genomic HPV integration, we established a ligation-mediated PCR assay for the detection of integrated papillomavirus sequences (DIPS-PCR). DIPS-PCR was initially used to amplify genomic viral-cellular junctions from HPV-associated cervical cancer cell lines (C4-I, C4-II, SW756, and HeLa) and HPV-immortalized keratinocyte lines (HPKIA, HPKII). In addition to junctions already reported in public databases, various new fusion fragments were identified. Subsequently, 22 different viral-cellular junctions were amplified from 17 cervical carcinomas and 1 vulval intraepithelial neoplasia (VIN III). Sequence analysis of each junction revealed that the viral E1 open reading frame (ORF) was fused to cellular sequences in 20 of 22 (91%) cases. Chromosomal integration loci mapped to chromosomes 1 (2n), 2 (3n), 7 (2n), 8 (3n), 10 (1n), 14 (5n), 16 (1n), 17 (2n), and mitochondrial DNA (1n), suggesting random distribution of chromosomal integration sites. Precise sequence information obtained by DIPS-PCR was further used to monitor the monoclonal origin of 4 cervical cancers, 1 case of recurrent premalignant lesions and 1 lymph node metastasis. Therefore, DIPS-PCR might allow efficient therapy control and prediction of relapse in patients with HPV-associated anogenital cancers. Copyright 2001 Wiley-Liss, Inc.

  11. Genome Fragmentation Is Not Confined to the Peridinin Plastid in Dinoflagellates

    PubMed Central

    Espelund, Mari; Minge, Marianne A.; Gabrielsen, Tove M.; Nederbragt, Alexander J.; Shalchian-Tabrizi, Kamran; Otis, Christian; Turmel, Monique; Lemieux, Claude; Jakobsen, Kjetill S.

    2012-01-01

    When plastids are transferred between eukaryote lineages through a series of endosymbioses, their environment changes dramatically. Comparison of dinoflagellate plastids that originated from different algal groups has revealed convergent evolution, suggesting that the host environment mainly influences the evolution of the newly acquired organelle. Recently the genome from the anomalously pigmented dinoflagellate Karlodinium veneficum plastid was uncovered as a conventional chromosome. To determine if this haptophyte-derived plastid contains additional chromosomal fragments that resemble the mini-circles of the peridinin-containing plastids, we have investigated its genome by in-depth sequencing using 454 pyrosequencing technology, PCR and clone library analysis. Sequence analyses show several genes with significantly higher copy numbers than present in the chromosome. These genes are most likely extrachromosomal fragments, and the ones with highest copy numbers include genes encoding the chaperone DnaK (Hsp70), the rubisco large subunit (rbcL), and two tRNAs (trnE and trnM). In addition, some photosystem genes such as psaB, psaA, psbB and psbD are overrepresented. Most of the dnaK and rbcL sequences are found as shortened or fragmented gene sequences, typically missing the 3′-terminal portion. Both dnaK and rbcL are associated with a common sequence element consisting of about 120 bp of highly conserved AT-rich sequence followed by a trnE gene, possibly serving as a control region. Decatenation assays and Southern blot analysis indicate that the extrachromosomal plastid sequences do not have the same organization or lengths as the minicircles of the peridinin dinoflagellates. The fragmentation of the haptophyte-derived plastid genome of K. veneficum is likely a sign of a host-driven process shaping the plastid genomes of dinoflagellates. PMID:22719952

  12. Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

    DOE PAGES

    Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.; ...

    2018-01-09

    The fungal genus Aspergillus is highly interesting, containing everything from industrial cell factories and model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole-genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. The comparison suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus, and therefore that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.

  13. Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.

    The fungal genus Aspergillus is highly interesting, containing everything from industrial cell factories and model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole-genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. The comparison suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus, and therefore that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.

  14. Phylogenetic analysis of Sarcocystis nesbitti (Coccidia: Sarcocystidae) suggests a snake as its probable definitive host

    USDA-ARS's Scientific Manuscript database

    Sarcocystis nesbitti was first described by Mandour in 1969 from rhesus monkey muscle. Its definitive host remains unknown. The 18S rRNA gene of Sarcocystis nesbitti was amplified, sequenced, and subjected to phylogenetic analysis. Among those congeners available for comparison, it shares closest affinit...

  15. The major histocompatibility complex of tassel-eared squirrels. II. Genetic diversity associated with Abert squirrels.

    PubMed

    Wettstein, P J; States, J S

    1986-01-01

    The extent of polymorphism and the rate of divergence of class I and class II sequences mapping to the mammalian major histocompatibility complex (MHC) have been the subject of experimentation and speculation. To provide further insight into the evolution of the MHC we have initiated the analysis of two geographically isolated subspecies of tassel-eared squirrels. In the preceding communication we described the number and polymorphism of TSLA class I and class II sequences in Kaibab squirrels (S. aberti kaibabensis), which live north of the Grand Canyon. In this report we present a parallel analysis of Abert squirrels (S. aberti aberti), which live south of the Grand Canyon in northern Arizona. Genomic DNA from 12 Abert squirrels was digested with restriction enzymes, electrophoresed, blotted, and hybridized with DR alpha, DR beta, DQ alpha, DQ beta, and HLA-B7 probes. The results of these hybridizations were remarkably similar to those obtained in Kaibab squirrels. The majority of class I and class II bands were identical in size and number, suggesting that Abert and Kaibab squirrels have not significantly diverged in the TSLA complex despite their geographical separation. Relative polymorphism of class II sequences was similar to that observed with Kaibab squirrels: beta sequences exhibited higher polymorphism than alpha sequences. As in Kaibab squirrels, a number of alpha and beta sequences were apparently carried on the same fragments. In comparison to class II beta sequences, there was limited polymorphism in class I sequences, although a diverse number of class I genotypes were observed. Attempts to identify segregating TSLA haplotypes were futile in that the only families of sequences with concordant distributions were DQ alpha and DQ beta. 
These observations and those obtained with Kaibab squirrels suggest that the present-day TSLA haplotypes of both subspecies are derived from a limited number of common, progenitor haplotypes through repeated intra-TSLA recombination.

  16. Microbial evolution of sulphate reduction when lateral gene transfer is geographically restricted.

    PubMed

    Chi Fru, E

    2011-07-01

    Lateral gene transfer (LGT) is an important mechanism by which micro-organisms acquire new functions. This process has been suggested to be central to prokaryotic evolution in various environments. However, the influence of geographical constraints on the evolution of laterally acquired genes in microbial metabolic evolution is not yet well understood. In this study, the influence of geographical isolation on the evolution of laterally acquired dissimilatory sulphite reductase (dsr) gene sequences in the sulphate-reducing micro-organisms (SRM) was investigated. Sequences on four continental blocks related to SRM known to have received dsr by LGT were analysed using standard phylogenetic and multidimensional statistical methods. Sequences related to lineages with large genetic diversity correlated positively with habitat divergence. Those affiliated to Thermodesulfobacterium indicated strong biogeographical delineation; hydrothermal-vent sequences clustered independently from hot-spring sequences. Some of the hydrothermal-vent and hot-spring sequences suggested to have been acquired from a common ancestral source may have diverged upon isolation within distinct habitats. In contrast, analysis of some Desulfotomaculum sequences indicated they could have been transferred from different ancestral sources but converged upon isolation within the same niche. These results hint that, after lateral acquisition of dsr genes, barriers to gene flow probably play a strong role in their subsequent evolution.

  17. Sequence-dependent DNA deformability studied using molecular dynamics simulations.

    PubMed

    Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori

    2007-01-01

    Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained by the simulations is consistent with that by the crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can highly depend on the flanking base pairs. The conformations of xATx tetramers show the most rigidity and are not affected by the flanking base pairs and the xYRx show by contrast the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
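
    The figure of 136 tetrameric sequences quoted above is what one obtains by counting the 256 possible four-base sequences once per duplex, that is, treating a sequence and its reverse complement as the same double-stranded tetramer. The sketch below is an illustrative check of that arithmetic, not code from the study; the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_duplex_kmers(k: int) -> list[str]:
    """All k-mers, keeping one representative per reverse-complement pair."""
    representatives = set()
    for bases in product("ACGT", repeat=k):
        kmer = "".join(bases)
        # The lexicographically smaller strand stands for the duplex.
        representatives.add(min(kmer, reverse_complement(kmer)))
    return sorted(representatives)


if __name__ == "__main__":
    tetramers = unique_duplex_kmers(4)
    palindromes = [t for t in tetramers if t == reverse_complement(t)]
    # 16 self-complementary tetramers plus (256 - 16) / 2 pairs gives 136.
    print(len(tetramers), len(palindromes))  # prints "136 16"
```

    The same count with k = 2 gives the 10 unique dimeric steps usually considered in analyses of base-pair step deformability.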

  18. Origins of domestication and polyploidy in oca (Oxalis Tuberosa: Oxalidaceae). 2. Chloroplast-expressed glutamine synthetase data.

    PubMed

    Emshwiller, Eve; Doyle, Jeff J

    2002-07-01

    In continuing study of the origins of the octoploid tuber crop oca, Oxalis tuberosa Molina, we used phylogenetic analysis of DNA sequences of the chloroplast-active (nuclear encoded) isozyme of glutamine synthetase (ncpGS) from cultivated oca, its allies in the "Oxalis tuberosa alliance," and other Andean Oxalis. Multiple ncpGS sequences found within individuals of both the cultigen and a yet unnamed wild tuber-bearing taxon of Bolivia were separated by molecular cloning, but some cloned sequences appeared to be artifacts of polymerase chain reaction (PCR) recombination and/or Taq error. Nonetheless, three classes of nonrecombinant sequences each joined a different part of the O. tuberosa alliance clade on the ncpGS gene tree. Octoploid oca shares two sequence classes with the Bolivian tuber-bearing taxon (of unknown ploidy level). Fixed heterozygosity of these two sequence classes in all ocas sampled suggests that they represent homeologous loci and that oca is allopolyploid. A third sequence class, found in eight of nine oca plants sampled, might represent a third homeologous locus, suggesting that oca may be autoallopolyploid, and is shared with another wild tuber-bearing species, tetraploid O. picchensis of southern Peru. Thus, ncpGS data identify these two taxa as the best candidates as progenitors of cultivated oca.

  19. Alt a 1 allergen homologs from Alternaria and related taxa: analysis of phylogenetic content and secondary structure.

    PubMed

    Hong, Soon Gyu; Cramer, Robert A; Lawrence, Christopher B; Pryor, Barry M

    2005-02-01

    A gene for the Alternaria major allergen, Alt a 1, was amplified from 52 species of Alternaria and related genera, and sequence information was used for phylogenetic study. Alt a 1 gene sequences evolved 3.8 times faster and contained 3.5 times more parsimony-informative sites than glyceraldehyde-3-phosphate dehydrogenase (gpd) sequences. Analyses of Alt a 1 gene and gpd exon sequences strongly supported grouping of Alternaria spp. and related taxa into several species-groups described in previous studies, especially the infectoria, alternata, porri, brassicicola, and radicina species-groups and the Embellisia group. The sonchi species-group was newly suggested in this study. Monophyly of the Nimbya group was moderately supported, and monophyly of the Ulocladium group was weakly supported. Relationships among species-groups and among closely related species of the same species-group were not fully resolved. However, higher resolution could be obtained using Alt a 1 sequences or a combined dataset than using gpd sequences alone. Despite high levels of variation in amino acid sequences, results of in silico prediction of protein secondary structure for Alt a 1 demonstrated a high degree of structural similarity for most of the species suggesting a conservation of function.

  20. Crystal structure of Bacillus subtilis YdaF protein: a putative ribosomal N-acetyltransferase.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brunzelle, J. S.; Wu, R.; Korolev, S. V.

    2004-12-01

    Comparative sequence analysis suggests that the ydaF gene encodes a protein (YdaF) that functions as an N-acetyltransferase, more specifically, a ribosomal N-acetyltransferase. Sequence analysis using basic local alignment search tool (BLAST) suggests that YdaF belongs to a large family of proteins (199 proteins found in 88 unique species of bacteria, archaea, and eukaryotes). YdaF also belongs to the COG1670, which includes the Escherichia coli RimL protein that is known to acetylate ribosomal protein L12. N-acetylation (NAT) has been found in all kingdoms. NAT enzymes catalyze the transfer of an acetyl group from acetyl-CoA (AcCoA) to a primary amino group. For example, NATs can acetylate the N-terminal α-amino group, the ε-amino group of lysine residues, aminoglycoside antibiotics, spermine/spermidine, or arylalkylamines such as serotonin. The crystal structure of the alleged ribosomal NAT protein, YdaF, from Bacillus subtilis presented here was determined as a part of the Midwest Center for Structural Genomics. The structure maintains the conserved tertiary structure of other known NATs and a high sequence similarity in the presumed AcCoA binding pocket in spite of a very low overall level of sequence identity to other NATs of known structure.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bahl, C.; Morisseau, C; Bomberger, J

    Cystic fibrosis transmembrane conductance regulator (CFTR) inhibitory factor (Cif) is a virulence factor secreted by Pseudomonas aeruginosa that reduces the quantity of CFTR in the apical membrane of human airway epithelial cells. Initial sequence analysis suggested that Cif is an epoxide hydrolase (EH), but its sequence violates two strictly conserved EH motifs and also is compatible with other α/β hydrolase family members with diverse substrate specificities. To investigate the mechanistic basis of Cif activity, we have determined its structure at 1.8-Å resolution by X-ray crystallography. The catalytic triad consists of residues Asp129, His297, and Glu153, which are conserved across the family of EHs. At other positions, sequence deviations from canonical EH active-site motifs are stereochemically conservative. Furthermore, detailed enzymatic analysis confirms that Cif catalyzes the hydrolysis of epoxide compounds, with specific activity against both epibromohydrin and cis-stilbene oxide, but with a relatively narrow range of substrate selectivity. Although closely related to two other classes of α/β hydrolase in both sequence and structure, Cif does not exhibit activity as either a haloacetate dehalogenase or a haloalkane dehalogenase. A reassessment of the structural and functional consequences of the H269A mutation suggests that Cif's effect on host-cell CFTR expression requires the hydrolysis of an extended endogenous epoxide substrate.

  2. 1,4-Benzoquinone reductase from Phanerochaete chrysosporium: cDNA cloning and regulation of expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Akileswaran, L.; Brock, B.J.; Cereghino, J.L.

    1999-02-01

    A cDNA clone encoding a quinone reductase (QR) from the white rot basidiomycete Phanerochaete chrysosporium was isolated and sequenced. The cDNA consisted of 1,007 nucleotides and a poly(A) tail and encoded a deduced protein containing 271 amino acids. The experimentally determined eight-amino-acid N-terminal sequence of the purified QR protein from P. chrysosporium matched amino acids 72 to 79 of the predicted translation product of the cDNA. The M_r of the predicted translation product, beginning with Pro-72, was essentially identical to the experimentally determined M_r of one monomer of the QR dimer, and this finding suggested that QR is synthesized as a proenzyme. The results of in vitro transcription-translation experiments suggested that QR is synthesized as a proenzyme with a 71-amino-acid leader sequence. This leader sequence contains two potential KEX2 cleavage sites and numerous potential cleavage sites for dipeptidyl aminopeptidase. The QR activity in cultures of P. chrysosporium increased following the addition of 2-dimethoxybenzoquinone, vanillic acid, or several other aromatic compounds. An immunoblot analysis indicated that induction resulted in an increase in the amount of QR protein, and a Northern blot analysis indicated that this regulation occurs at the level of the qr mRNA.

  3. Heterogeneous Suppression of Sequential Effects in Random Sequence Generation, but Not in Operant Learning.

    PubMed

    Shteingart, Hanan; Loewenstein, Yonatan

    2016-01-01

    There is a long history of experiments in which participants are instructed to generate a long sequence of binary random numbers. The scope of this line of research has shifted over the years from identifying the basic psychological principles and/or the heuristics that lead to deviations from randomness, to one of predicting future choices. In this paper, we used generalized linear regression and the framework of Reinforcement Learning in order to address both points. In particular, we used logistic regression analysis in order to characterize the temporal sequence of participants' choices. Surprisingly, a population analysis indicated that the most recent trial has only a weak effect on behavior compared to earlier trials, a result that seems irreconcilable with standard sequential effects that decay monotonically with the delay. However, when considering each participant separately, we found that the magnitudes of the sequential effect are a monotonically decreasing function of the delay, yet these individual sequential effects are largely averaged out in a population analysis because of heterogeneity. The substantial behavioral heterogeneity in this task is further demonstrated quantitatively by considering the predictive power of the model. We show that a heterogeneous model of sequential dependencies captures the structure available in random sequence generation. Finally, we show that the results of the logistic regression analysis can be interpreted in the framework of reinforcement learning, allowing us to compare the sequential effects in the random sequence generation task to those in an operant learning task. We show that in contrast to the random sequence generation task, sequential effects in operant learning are far more homogeneous across the population. These results suggest that in the random sequence generation task, different participants adopt different cognitive strategies to suppress sequential dependencies when generating the "random" sequences.

  4. Assessing pathogenicity for novel mutation/sequence variants: the value of healthy older individuals.

    PubMed

    Zatz, Mayana; Pavanello, Rita de Cassia M; Lourenço, Naila Cristina V; Cerqueira, Antonia; Lazar, Monize; Vainzof, Mariz

    2012-12-01

    Improvement in DNA technology is increasingly revealing unexpected/unknown mutations in healthy persons and generating anxiety due to their still unknown health consequences. We report a 44-year-old healthy father of a 10-year-old daughter with bilateral coloboma and hearing loss, but without muscle weakness, in whom a whole-genome CGH revealed a deletion of exons 38-44 in the dystrophin gene. This mutation was inherited from her asymptomatic father, who was further clinically and molecularly evaluated for prognosis and genetic counseling (GC). We have never identified this deletion in 982 Duchenne/Becker patients. To assess whether the present case represents a rare case of non-penetrance, and aiming to obtain more information for prognosis and GC, we suggested that healthy older relatives submit their DNA for analysis, and several complied. Mutation analysis revealed that his mother, brother, and 56-year-old maternal uncle also carry the 38-44 deletion, suggesting that it is an unlikely cause of muscle weakness. Genome sequencing will disclose mutations and variants whose health impact is still unknown, raising important problems in interpreting results, defining prognosis, and discussing GC. We suggest that, in addition to family history, keeping the DNA of older relatives could be very informative, in particular for those interested in having their genome sequenced.

  5. Evaluation of the genetic diversity of Plum pox virus in a single plum tree.

    PubMed

    Predajňa, Lukáš; Šubr, Zdeno; Candresse, Thierry; Glasa, Miroslav

    2012-07-01

    Genetic diversity of Plum pox virus (PPV) and its distribution within a single perennial woody host (plum, Prunus domestica) has been evaluated. A plum tree was triply infected by chip-budding with PPV-M, PPV-D and PPV-Rec isolates in 2003 and left to develop untreated under open field conditions. In September 2010 leaf and fruit samples were collected from different parts of the tree canopy. A 745-bp NIb-CP fragment of PPV genome, containing the hypervariable region encoding the CP N-terminal end was amplified by RT-PCR from each sample and directly sequenced to determine the dominant sequence. In parallel, the PCR products were cloned and a total of 105 individual clones were sequenced. Sequence analysis revealed that after 7 years of infection, only PPV-M was still detectable in the tree and that the two other isolates (PPV-Rec and PPV-D) had been displaced. Despite the fact that the analysis targeted a relatively short portion of the genome, a substantial amount of intra-isolate variability was observed for PPV-M. A total of 51 different haplotypes could be identified from the 105 individual sequences, two of which were largely dominant. However, no clear-cut structuration of the viral population by the tree architecture could be highlighted although the results obtained suggest the possibility of intra-leaf/fruit differentiation of the viral population. Comparison of the consensus sequence with the original source isolate showed no difference, suggesting within-plant stability of this original isolate under open field conditions. Copyright © 2012 Elsevier B.V. All rights reserved.
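
    The haplotype tally described in this abstract (51 distinct haplotypes among 105 cloned sequences, two of them dominant) amounts to counting identical sequences. A minimal sketch follows, assuming the clones are already trimmed to the same aligned fragment; the function name `haplotype_summary` and the 8-nucleotide toy sequences are hypothetical and are not the PPV data.

```python
from collections import Counter


def haplotype_summary(clone_seqs):
    """Count distinct haplotypes among equal-length clone sequences."""
    # Identical sequences (case-insensitive) are treated as one haplotype.
    counts = Counter(seq.upper() for seq in clone_seqs)
    total = sum(counts.values())
    # Rank haplotypes from most to least frequent, with their frequency.
    ranked = [(hap, n, n / total) for hap, n in counts.most_common()]
    return {"n_clones": total, "n_haplotypes": len(counts), "ranked": ranked}


# Toy data (hypothetical): 10 clones falling into 4 haplotypes.
clones = ["ACGTACGT"] * 5 + ["ACGTACGA"] * 3 + ["ACGAACGT", "TCGTACGT"]
summary = haplotype_summary(clones)
print(summary["n_clones"], summary["n_haplotypes"])
for hap, n, freq in summary["ranked"]:
    print(hap, n, round(freq, 2))
```

    Exact-match counting treats every sequencing or PCR error as a new haplotype, so a real analysis would probably need an error-handling step that this sketch omits.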

  6. Whole genome sequence phylogenetic analysis of four Mexican rabies viruses isolated from cattle.

    PubMed

    Bárcenas-Reyes, I; Loza-Rubio, E; Cantó-Alarcón, G J; Luna-Cozar, J; Enríquez-Vázquez, A; Barrón-Rodríguez, R J; Milián-Suazo, F

    2017-08-01

    Phylogenetic analysis of the rabies virus in molecular epidemiology has been traditionally performed on partial sequences of the genome, such as the N, G, and P genes; however, that approach raises concerns about the discriminatory power compared to whole genome sequencing. In this study we characterized four strains of the rabies virus isolated from cattle in Querétaro, Mexico by comparing the whole genome sequence to that of strains from the American, European and Asian continents. Four cattle brain samples positive for rabies and characterized as AgV11, genotype 1, were used in the study. A cDNA sequence was generated by reverse transcription PCR (RT-PCR) using oligo dT. cDNA samples were sequenced on an Illumina NextSeq 500 platform. The phylogenetic analysis was performed with MEGA 6.0. Minimum evolution phylogenetic trees were constructed with the Neighbor-Joining method and bootstrapped with 1000 replicates. Three large and seven small clusters were formed with the 26 sequences used. The largest cluster grouped strains from different species in South America: Brazil and French Guiana. The second cluster grouped five strains from Mexico. A Mexican strain reported in a different study was highly related to our four strains, suggesting a common source of infection. The phylogenetic analysis shows that the type of host is different for the different regions in the American Continent; rabies is more related to bats. It was concluded that the rabies virus in central Mexico is genetically stable and that it is transmitted by the vampire bat Desmodus rotundus. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Nucleotide sequence of the gag gene and gag-pol junction of feline leukemia virus.

    PubMed Central

    Laprevotte, I; Hampe, A; Sherr, C J; Galibert, F

    1984-01-01

    The nucleotide sequence of the gag gene of feline leukemia virus and its flanking sequences were determined and compared with the corresponding sequences of two strains of feline sarcoma virus and with that of the Moloney strain of murine leukemia virus. A high degree of nucleotide sequence homology between the feline leukemia virus and murine leukemia virus gag genes was observed, suggesting that retroviruses of domestic cats and laboratory mice have a common, proximal evolutionary progenitor. The predicted structure of the complete feline leukemia virus gag gene precursor suggests that the translation of nonglycosylated and glycosylated gag gene polypeptides is initiated at two different AUG codons. These initiator codons fall in the same reading frame and are separated by a 222-base-pair segment which encodes an amino terminal signal peptide. The nucleotide sequence predicts the order of amino acids in each of the individual gag-coded proteins (p15, p12, p30, p10), all of which derive from the gag gene precursor. Stable stem-and-loop secondary structures are proposed for two regions of viral RNA. The first falls within sequences at the 5' end of the viral genome, together with adjacent palindromic sequences which may play a role in dimer linkage of RNA subunits. The second includes coding sequences at the gag-pol junction and is proposed to be involved in translation of the pol gene product. Sequence analysis of the latter region shows that the gag and pol genes are translated in different reading frames. Classical consensus splice donor and acceptor sequences could not be localized to regions which would permit synthesis of the expected gag-pol precursor protein. Alternatively, we suggest that the pol gene product (RNA-dependent DNA polymerase) could be translated by a frameshift suppressing mechanism which could involve cleavage modification of stems and loops in a manner similar to that observed in tRNA processing. PMID:6328019

  8. Nucleotide sequence determination of guinea-pig casein B mRNA reveals homology with bovine and rat alpha s1 caseins and conservation of the non-coding regions of the mRNA.

    PubMed Central

    Hall, L; Laird, J E; Craig, R K

    1984-01-01

    Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375

  9. Species-specific identification of commercial probiotic strains.

    PubMed

    Yeung, P S M; Sanders, M E; Kitts, C L; Cano, R; Tong, P S

    2002-05-01

    Products containing probiotic bacteria are gaining popularity, increasing the importance of their accurate speciation. Unfortunately, studies have suggested that improper labeling of probiotic species is common in commercial products. Species identification of a bank of commercial probiotic strains was attempted using partial 16S rDNA sequencing, carbohydrate fermentation analysis, and cellular fatty acid methyl ester analysis. Results from partial 16S rDNA sequencing indicated discrepancies between species designations for 26 out of 58 strains tested, including two ATCC Lactobacillus strains. When considering only the commercial strains obtained directly from the manufacturers, 14 of 29 strains carried species designations different from those obtained by partial 16S rDNA sequencing. Strains from six commercial products were species not listed on the label. The discrepancies mainly occurred in Lactobacillus acidophilus and Lactobacillus casei groups. Carbohydrate fermentation analysis was not sensitive enough to identify species within the L. acidophilus group. Fatty acid methyl ester analysis was found to be variable and inaccurate and is not recommended to identify probiotic lactobacilli.

  10. Single-cell sequencing reveals karyotype heterogeneity in murine and human malignancies.

    PubMed

    Bakker, Bjorn; Taudt, Aaron; Belderbos, Mirjam E; Porubsky, David; Spierings, Diana C J; de Jong, Tristan V; Halsema, Nancy; Kazemier, Hinke G; Hoekstra-Wakker, Karina; Bradley, Allan; de Bont, Eveline S J M; van den Berg, Anke; Guryev, Victor; Lansdorp, Peter M; Colomé-Tatché, Maria; Foijer, Floris

    2016-05-31

    Chromosome instability leads to aneuploidy, a state in which cells have abnormal numbers of chromosomes, and is found in two out of three cancers. In a chromosomal instable p53 deficient mouse model with accelerated lymphomagenesis, we previously observed whole chromosome copy number changes affecting all lymphoma cells. This suggests that chromosome instability is somehow suppressed in the aneuploid lymphomas or that selection for frequently lost/gained chromosomes out-competes the CIN-imposed mis-segregation. To distinguish between these explanations and to examine karyotype dynamics in chromosome instable lymphoma, we use a newly developed single-cell whole genome sequencing (scWGS) platform that provides a complete and unbiased overview of copy number variations (CNV) in individual cells. To analyse these scWGS data, we develop AneuFinder, which allows annotation of copy number changes in a fully automated fashion and quantification of CNV heterogeneity between cells. Single-cell sequencing and AneuFinder analysis reveals high levels of copy number heterogeneity in chromosome instability-driven murine T-cell lymphoma samples, indicating ongoing chromosome instability. Application of this technology to human B cell leukaemias reveals different levels of karyotype heterogeneity in these cancers. Our data show that even though aneuploid tumours select for particular and recurring chromosome combinations, single-cell analysis using AneuFinder reveals copy number heterogeneity. This suggests ongoing chromosome instability that other platforms fail to detect. As chromosome instability might drive tumour evolution, karyotype analysis using single-cell sequencing technology could become an essential tool for cancer treatment stratification.

  11. Resources and costs for microbial sequence analysis evaluated using virtual machines and cloud computing.

    PubMed

    Angiuoli, Samuel V; White, James R; Matalka, Malcolm; White, Owen; Fricke, W Florian

    2011-01-01

    The widespread popularity of genomic applications is threatened by the "bioinformatics bottleneck" resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. 
    Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers.

  12. Resources and Costs for Microbial Sequence Analysis Evaluated Using Virtual Machines and Cloud Computing

    PubMed Central

    Angiuoli, Samuel V.; White, James R.; Matalka, Malcolm; White, Owen; Fricke, W. Florian

    2011-01-01

    Background The widespread popularity of genomic applications is threatened by the “bioinformatics bottleneck” resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. Results We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. 
    Conclusions Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers. PMID:22028928

  13. How Many Protein Sequences Fold to a Given Structure? A Coevolutionary Analysis.

    PubMed

    Tian, Pengfei; Best, Robert B

    2017-10-17

    Quantifying the relationship between protein sequence and structure is key to understanding the protein universe. A fundamental measure of this relationship is the total number of amino acid sequences that can fold to a target protein structure, known as the "sequence capacity," which has been suggested as a proxy for how designable a given protein fold is. Although sequence capacity has been extensively studied using lattice models and theory, numerical estimates for real protein structures are currently lacking. In this work, we have quantitatively estimated the sequence capacity of 10 proteins with a variety of different structures using a statistical model based on residue-residue co-evolution to capture the variation of sequences from the same protein family. Remarkably, we find that even for the smallest protein folds, such as the WW domain, the number of foldable sequences is extremely large, exceeding the Avogadro constant. In agreement with earlier theoretical work, the calculated sequence capacity is positively correlated with the size of the protein, or better, the density of contacts. This allows the absolute sequence capacity of a given protein to be approximately predicted from its structure. On the other hand, the relative sequence capacity, i.e., normalized by the total number of possible sequences, is an extremely tiny number and is strongly anti-correlated with the protein length. Thus, although there may be more foldable sequences for larger proteins, it will be much harder to find them. Lastly, we have correlated the evolutionary age of proteins in the CATH database with their sequence capacity as predicted by our model. The results suggest a trade-off between the opposing requirements of high designability and the likelihood of a novel fold emerging by chance. Published by Elsevier Inc.
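
    The distinction this abstract draws between absolute and relative sequence capacity can be made concrete with a little arithmetic in log10 units, since the raw numbers overflow ordinary intuition. The sketch below is illustrative only: the chain lengths and the capacity values are hypothetical placeholders, not estimates from the paper, and the function names are invented for this example.

```python
import math

AVOGADRO = 6.02214076e23  # Avogadro constant, per mole


def log10_total_sequences(length, alphabet=20):
    """log10 of the number of possible sequences of a given length."""
    return length * math.log10(alphabet)


def log10_relative_capacity(log10_capacity, length):
    """log10 of (foldable sequences / all possible sequences)."""
    return log10_capacity - log10_total_sequences(length)


# Hypothetical inputs: (chain length, log10 of absolute sequence capacity).
small = (35, 25.0)    # a small domain; capacity assumed above Avogadro's number
large = (150, 90.0)   # a larger protein; capacity assumed much higher

for length, log10_cap in (small, large):
    rel = log10_relative_capacity(log10_cap, length)
    print(f"L={length}: capacity 10^{log10_cap:.0f}, relative 10^{rel:.1f}")
```

    With these made-up inputs the larger protein has the larger absolute capacity but the smaller relative capacity, which is the trade-off the abstract describes: more foldable sequences exist, yet they are harder to find.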

  14. The Impact of Normalization Methods on RNA-Seq Data Analysis

    PubMed Central

    Zyprych-Walczak, J.; Szabelska, A.; Handschuh, L.; Górczak, K.; Klamecka, K.; Figlerowicz, M.; Siatkowski, I.

    2015-01-01

    High-throughput sequencing technologies, such as the Illumina Hi-seq, are powerful new tools for investigating a wide range of biological and medical problems. Massive and complex data sets produced by the sequencers create a need for development of statistical and computational methods that can tackle the analysis and management of data. The data normalization is one of the most crucial steps of data processing and this process must be carefully considered as it has a profound effect on the results of the analysis. In this work, we focus on a comprehensive comparison of five normalization methods related to sequencing depth, widely used for transcriptome sequencing (RNA-seq) data, and their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow that can be applied for the selection of the optimal normalization procedure for any particular data set. The described workflow includes calculation of the bias and variance values for the control genes, sensitivity and specificity of the methods, and classification errors as well as generation of the diagnostic plots. Combining the above information facilitates the selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably. PMID:26176014
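
    To illustrate what a normalization "related to sequencing depth" does, and how control-gene variance can serve as a diagnostic, here is a minimal sketch using counts per million (CPM). CPM is chosen here only because it is simple; the abstract does not name the five methods compared, so this is not necessarily one of them. The gene names, sample names and counts are hypothetical.

```python
import statistics


def counts_per_million(counts):
    """Scale raw counts so every sample sums to one million.

    counts: dict mapping sample name -> dict of gene -> raw read count.
    """
    cpm = {}
    for sample, genes in counts.items():
        depth = sum(genes.values())  # total reads in this sample
        cpm[sample] = {g: c * 1e6 / depth for g, c in genes.items()}
    return cpm


def control_gene_variance(values, gene):
    """Population variance of one gene's values across samples."""
    return statistics.pvariance([v[gene] for v in values.values()])


# Toy data (hypothetical): sample B was sequenced twice as deeply as A.
raw = {
    "A": {"g1": 100, "g2": 300, "ctrl": 600},
    "B": {"g1": 200, "g2": 600, "ctrl": 1200},
}
normalized = counts_per_million(raw)
print(control_gene_variance(raw, "ctrl"))         # variance before scaling
print(control_gene_variance(normalized, "ctrl"))  # variance after scaling
```

    In this toy case the only difference between samples is depth, so the control gene's variance collapses after scaling; with real data, comparing such diagnostics across candidate methods is the kind of selection step the abstract's workflow describes.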

  15. Constructing storyboards based on hierarchical clustering analysis

    NASA Astrophysics Data System (ADS)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

    2005-07-01

    There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoiding repeated, computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.
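
    The idea of picking a user-specified number of keyframes by hierarchical clustering of per-frame feature vectors can be sketched as follows. This is an assumption-laden illustration, not the authors' algorithm: the abstract does not state the linkage criterion or how a keyframe is chosen within a cluster, so centroid linkage and "frame nearest the centroid" are choices made here, and the toy 2-D vectors stand in for wavelet-derived features.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[d] for v in vectors) / n for d in range(len(vectors[0])))


def storyboard(features, n_keyframes):
    """Return sorted frame indices, one keyframe per cluster."""
    if n_keyframes < 1:
        raise ValueError("n_keyframes must be at least 1")
    # Start with one cluster per frame; features are computed once and reused.
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > n_keyframes:
        cents = [centroid([features[i] for i in c]) for c in clusters]
        best = None
        # Find the pair of clusters whose centroids are closest.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = math.dist(cents[a], cents[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    keyframes = []
    for c in clusters:
        cen = centroid([features[i] for i in c])
        # The keyframe is the member frame nearest its cluster centroid.
        keyframes.append(min(c, key=lambda i: math.dist(features[i], cen)))
    return sorted(keyframes)


# Toy feature vectors (hypothetical): three visually distinct "shots".
frames = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0),
          (5.0, 5.0), (5.1, 5.0), (5.3, 5.0),
          (9.0, 0.0)]
print(storyboard(frames, 3))
```

    Because the function takes precomputed feature vectors, asking for a different number of keyframes only repeats the clustering, not the feature extraction, which is in the spirit of the reuse the abstract emphasizes.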

  16. HTLV-1aA introduction into Brazil and its association with the trans-Atlantic slave trade.

    PubMed

    Amoussa, Adjile Edjide Roukiyath; Wilkinson, Eduan; Giovanetti, Marta; de Almeida Rego, Filipe Ferreira; Araujo, Thessika Hialla A; de Souza Gonçalves, Marilda; de Oliveira, Tulio; Alcantara, Luiz Carlos Junior

    2017-03-01

    Human T-lymphotropic virus (HTLV) is an endemic virus in some parts of the world, with Africa being home to most of the viral genetic diversity. In Brazil, HTLV-1 is endemic amongst Japanese and African immigrant populations. Multiple introductions of the virus in Brazil from other epidemic foci were hypothesized. The long terminal repeat (LTR) region of HTLV-1 was used to infer the origin of the virus in Brazil, using phylogenetic analysis. LTR sequences were obtained from the HTLV-1 database (http://htlv1db.bahia.fiocruz.br). Sequences were aligned and maximum-likelihood and Bayesian tree topologies were inferred. Brazilian specific clusters were identified and molecular-clock and coalescent models were used to estimate each cluster's time to the most recent common ancestor (tMRCA). Three Brazilian clusters were identified with posterior probabilities ranging from 0.61 to 0.99. Molecular clock analysis of these three clusters dated their respective tMRCAs to between the year 1499 and the year 1668. Additional analysis also identified a close association between Brazilian sequences and new sequences from South Africa. Our results support the hypothesis of multiple introductions of HTLV-1 into Brazil, with the majority of introductions occurring in the post-Columbian period. Our results further suggest that HTLV-1 introduction into Brazil was facilitated by the trans-Atlantic slave trade from endemic areas of Africa. The close association between southern African and Brazilian sequences also suggested that greater numbers of the southern African Bantu population might also have been part of the slave trade than previously thought. Copyright © 2016. Published by Elsevier B.V.

  17. Change in IgHV Mutational Status of CLL Suggests Origin From Multiple Clones.

    PubMed

    Osman, Afaf; Gocke, Christopher D; Gladstone, Douglas E

    2017-02-01

    Fluorescence in situ hybridization and immunoglobulin (Ig) heavy-chain variable-region (IgHV) mutational status are used to predict outcome in chronic lymphocytic leukemia (CLL). Although DNA aberrations change over time, IgHV sequences and mutational status are considered stable. In a retrospective review, 409 CLL patients, between 2008 and 2015, had IgHV analysis: 56 patients had multiple analyses performed. Seven patients' IgHV results changed: 2 from unmutated to mutated and 5 from mutated to unmutated IgHV sequence. Three concurrently changed their variable heavy-chain sequence. Secondary to allelic exclusion, 2 of the new variable heavy chains produced were biologically nonplausible. The existence of these new nonplausible heavy-chain variable regions suggests either the CLL cancer stem-cell maintains the ability to rearrange a previously silenced IgH allele or more likely that the cancer stem-cell produced at least 2 subclones, suggesting that the CLL cancer stem cell exists before the process of allelic exclusion occurs. Copyright © 2016 Elsevier Inc. All rights reserved.

  18. RNA Editing in Plant Mitochondria

    NASA Astrophysics Data System (ADS)

    Hiesel, Rudolf; Wissinger, Bernd; Schuster, Wolfgang; Brennicke, Axel

    1989-12-01

    Comparative sequence analysis of genomic and complementary DNA clones from several mitochondrial genes in the higher plant Oenothera revealed nucleotide sequence divergences between the genomic and the messenger RNA-derived sequences. These sequence alterations could be most easily explained by specific post-transcriptional nucleotide modifications. Most of the nucleotide exchanges in coding regions lead to altered codons in the mRNA that specify amino acids better conserved in evolution than those encoded by the genomic DNA. Several instances show that the genomic arginine codon CGG is edited in the mRNA to the tryptophan codon TGG in amino acid positions that are highly conserved as tryptophan in the homologous proteins of other species. This editing suggests that the standard genetic code is used in plant mitochondria and resolves the frequent coincidence of CGG codons and tryptophan in different plant species. The apparently frequent and non-species-specific equivalency of CGG and TGG codons in particular suggests that RNA editing is a common feature of all higher plant mitochondria.
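
    The comparison this abstract describes, genomic DNA against cDNA derived from the mRNA, can be sketched as a codon-by-codon scan that reports each difference together with the amino acids involved. The sequences below are hypothetical, and only the handful of standard-code codons needed for the example are tabulated; CGG (arginine) to TGG (tryptophan) is the edit highlighted in the abstract.

```python
# Partial standard genetic code: only the codons used in this example.
CODON_TO_AA = {"ATG": "Met", "CGG": "Arg", "TGG": "Trp", "GCT": "Ala"}


def editing_sites(genomic, cdna):
    """List codons that differ between in-frame genomic and cDNA sequences.

    Returns tuples of (1-based codon number, genomic codon, cDNA codon).
    """
    if len(genomic) != len(cdna) or len(genomic) % 3 != 0:
        raise ValueError("sequences must be equal-length and a multiple of 3")
    sites = []
    for i in range(0, len(genomic), 3):
        g, c = genomic[i:i + 3], cdna[i:i + 3]
        if g != c:
            sites.append((i // 3 + 1, g, c))
    return sites


# Toy sequences (hypothetical): a single C-to-T change in codon 2.
genomic = "ATGCGGGCT"
cdna = "ATGTGGGCT"
for pos, g, c in editing_sites(genomic, cdna):
    print(f"codon {pos}: {g} ({CODON_TO_AA[g]}) -> {c} ({CODON_TO_AA[c]})")
```

    A mismatch found this way is only a candidate editing site; distinguishing editing from sequencing error or from divergent gene copies requires the kind of independent evidence the authors discuss.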

  19. Listeria monocytogenes sequence type 1 is predominant in ruminant rhombencephalitis

    PubMed Central

    Dreyer, Margaux; Aguilar-Bultet, Lisandra; Rupp, Sebastian; Guldimann, Claudia; Stephan, Roger; Schock, Alexandra; Otter, Arthur; Schüpbach, Gertraud; Brisse, Sylvain; Lecuit, Marc; Frey, Joachim; Oevermann, Anna

    2016-01-01

    Listeria (L.) monocytogenes is an opportunistic pathogen causing life-threatening infections in diverse mammalian species including humans and ruminants. As little is known on the link between strains and clinicopathological phenotypes, we studied potential strain-associated virulence and organ tropism in L. monocytogenes isolates from well-defined ruminant cases of clinical infections and the farm environment. The phylogeny of isolates and their virulence-associated genes were analyzed by multilocus sequence typing (MLST) and sequence analysis of virulence-associated genes. Additionally, a panel of representative isolates was subjected to in vitro infection assays. Our data suggest the environmental exposure of ruminants to a broad range of strains and yet the strong association of sequence type (ST) 1 from clonal complex (CC) 1 with rhombencephalitis, suggesting increased neurotropism of ST1 in ruminants, which is possibly related to its hypervirulence. This study emphasizes the importance of considering clonal background of L. monocytogenes isolates in surveillance, epidemiological investigation and disease control. PMID:27848981

  20. High sequence variability among hemocyte-specific Kazal-type proteinase inhibitors in decapod crustaceans.

    PubMed

    Cerenius, Lage; Liu, Haipeng; Zhang, Yanjiao; Rimphanitchayakit, Vichien; Tassanakajon, Anchalee; Gunnar Andersson, M; Söderhäll, Kenneth; Söderhäll, Irene

    2010-01-01

    Crustacean hemocytes were found to produce a large number of transcripts coding for Kazal-type proteinase inhibitors (KPIs). A detailed study performed with the crayfish Pacifastacus leniusculus and the shrimp Penaeus monodon revealed the presence of at least 26 and 20 different Kazal domains from the hemocyte KPIs, respectively. Comparisons with KPIs from other taxa indicate that the sequences of these domains evolve rapidly. A few conserved positions, e.g. six invariant cysteines, were present in all domain sequences, whereas the position of the P1 amino acid, a determinant for substrate specificity, varied highly. A study with a single crayfish animal suggested that even at the individual level considerable sequence variability exists among the hemocyte KPIs produced. Expression analysis of four crayfish KPI transcripts in hematopoietic tissue cells and different hemocyte types suggests that some of these KPIs are likely to be involved in hematopoiesis or hemocyte release, as they were produced in particular hemocyte types or maturation stages only.

  1. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  2. Evaluation of cysteine proteases of Plasmodium vivax as antimalarial drug targets: sequence analysis and sensitivity to cysteine protease inhibitors.

    PubMed

    Na, Byoung-Kuk; Kim, Tong-Soo; Rosenthal, Philip J; Lee, Jong-Koo; Kong, Yoon

    2004-10-01

    Cysteine proteases perform critical roles in the life cycles of malaria parasites. In Plasmodium falciparum, treatment with cysteine protease inhibitors inhibits hemoglobin hydrolysis and blocks parasite development in vitro and in vivo, suggesting that plasmodial cysteine proteases may be interesting targets for new chemotherapeutics. To determine whether sequence diversity may limit chemotherapy against Plasmodium vivax, we analyzed sequence variations in the genes encoding three cysteine proteases, vivapain-1, -2 and -3, in 22 wild isolates of P. vivax. The sequences were highly conserved among wild isolates. A small number of substitutions leading to amino acid changes were found, but they did not modify residues essential for the function or structure of the enzymes. The substrate specificities and sensitivities to synthetic cysteine protease inhibitors of vivapain-2 and -3 from wild isolates were also very similar. These results support the suggestion that cysteine proteases of P. vivax are promising antimalarial chemotherapeutic targets.

  3. Detection of novel gammaherpesviruses from fruit bats in Indonesia.

    PubMed

    Wada, Yuji; Sasaki, Michihito; Setiyono, Agus; Handharyani, Ekowati; Rahmadani, Ibenu; Taha, Siswatiana; Adiani, Sri; Latief, Munira; Kholilullah, Zainal Abidin; Subangkit, Mawar; Kobayashi, Shintaro; Nakamura, Ichiro; Kimura, Takashi; Orba, Yasuko; Sawa, Hirofumi

    2018-03-01

    Bats are an important natural reservoir of zoonotic viral pathogens. We previously isolated an alphaherpesvirus in fruit bats in Indonesia, and here establish the presence of viruses belonging to other taxa of the family Herpesviridae. We screened the same fruit bat population with pan-herpesvirus PCR and discovered 68 sequences of novel gammaherpesvirus, designated 'megabat gammaherpesvirus' (MgGHV). A phylogenetic analysis of approximately 3.4 kbp of continuous MgGHV sequences encompassing the glycoprotein B gene and DNA polymerase gene revealed that the MgGHV sequences are distinct from those of other reported gammaherpesviruses. Further analysis suggested the existence of co-infections of herpesviruses in Indonesian fruit bats. Our findings extend our understanding of the infectious cycles of herpesviruses in bats in Indonesia and the phylogenetic diversity of the gammaherpesviruses.

  4. Analysis of BAC end sequences in oak, a keystone forest tree species, providing insight into the composition of its genome

    PubMed Central

    2011-01-01

    Background: One of the key goals of oak genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of forests to increase their health and productivity. Deep-coverage large-insert genomic libraries are a crucial tool for attaining this objective. We report herein the construction of a BAC library for Quercus robur, its characterization and an analysis of BAC end sequences. Results: The EcoRI library generated consisted of 92,160 clones, 7% of which had no insert. Levels of chloroplast and mitochondrial contamination were below 3% and 1%, respectively. Mean clone insert size was estimated at 135 kb. The library represents 12 haploid genome equivalents, and the likelihood of finding a particular oak sequence of interest is greater than 99%. Genome coverage was confirmed by PCR screening of the library with 60 unique genetic loci sampled from the genetic linkage map. In total, about 20,000 high-quality BAC end sequences (BESs) were generated by sequencing 15,000 clones. Roughly 5.88% of the combined BAC end sequence length corresponded to known retroelements, while ab initio repeat detection methods identified 41 additional repeats. Collectively, characterized and novel repeats account for roughly 8.94% of the genome. Further analysis of the BESs revealed 1,823 putative genes, suggesting at least 29,340 genes in the oak genome. BESs were aligned with the genome sequences of Arabidopsis thaliana, Vitis vinifera and Populus trichocarpa. One putative collinear microsyntenic region encoding an alcohol acyl transferase protein was observed between oak and chromosome 2 of V. vinifera. Conclusions: This BAC library provides a new resource for genomic studies, including SSR marker development, physical mapping, comparative genomics and genome sequencing. BES analysis provided insight into the structure of the oak genome. These sequences will be used in the assembly of a future genome sequence for oak. PMID:21645357
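    The link between the reported library depth (12 haploid genome equivalents) and the greater-than-99% likelihood of recovering a given sequence can be sketched as follows. The abstract does not say how the authors computed this likelihood; the Poisson approximation used here is an assumption, and the function name is hypothetical.

```python
import math

def prob_locus_in_library(genome_equivalents):
    """Poisson approximation: probability that a given locus lies in at least one clone.

    With coverage c (total insert length divided by genome size), the chance
    that no clone contains the locus is about exp(-c).
    """
    return 1.0 - math.exp(-genome_equivalents)

coverage = 12  # haploid genome equivalents reported in the abstract
p = prob_locus_in_library(coverage)
print(f"P(locus represented) = {p:.6f}")
```

    At a coverage of 12 the approximation gives a probability well above 0.99, consistent with the figure quoted in the abstract.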

  5. Selection and Validation of a Multilocus Variable-Number Tandem-Repeat Analysis Panel for Typing Shigella spp.

    PubMed Central

    Gorgé, Olivier; Lopez, Stéphanie; Hilaire, Valérie; Lisanti, Olivier; Ramisse, Vincent; Vergnaud, Gilles

    2008-01-01

    The Shigella genus has historically been separated into four species, based on biochemical assays. The classification within each species relies on serotyping. Recently, genome sequencing and DNA assays, in particular the multilocus sequence typing (MLST) approach, greatly improved the current knowledge of the origin and phylogenetic evolution of Shigella spp. The Shigella and Escherichia genera are now considered to belong to a unique genomospecies. Multilocus variable-number tandem-repeat (VNTR) analysis (MLVA) provides valuable polymorphic markers for genotyping and performing phylogenetic analyses of highly homogeneous bacterial pathogens. Here, we assess the capability of MLVA for Shigella typing. Thirty-two potentially polymorphic VNTRs were selected by analyzing in silico five Shigella genomic sequences and subsequently evaluated. Eventually, a panel of 15 VNTRs was selected (i.e., MLVA15 analysis). MLVA15 analysis of 78 strains or genome sequences of Shigella spp. and 11 strains or genome sequences of Escherichia coli distinguished 83 genotypes. Shigella population cluster analysis gave consistent results compared to MLST. MLVA15 analysis showed capabilities for E. coli typing, providing classification among pathogenic and nonpathogenic E. coli strains included in the study. The resulting data can be queried on our genotyping webpage (http://mlva.u-psud.fr). The MLVA15 assay is rapid, highly discriminatory, and reproducible for Shigella and Escherichia strains, suggesting that it could significantly contribute to epidemiological trace-back analysis of Shigella infections and pathogenic Escherichia outbreaks. Typing was performed on strains obtained mostly from collections. Further studies should include strains of much more diverse origins, including all pathogenic E. coli types. PMID:18216214

  6. Identification of three duplicated Spin genes in medaka (Oryzias latipes).

    PubMed

    Wang, Xiao-Lei; Mei, Jie; Sun, Min; Hong, Yun-Han; Gui, Jian-Fang

    2005-05-09

    Gene and genomic duplications are very important and frequent events in fish evolution, and the divergence of duplicated genes in sequence and function is a focus of research on gene evolution. Here, we report the identification and characterization of three duplicated Spindlin (Spin) genes from medaka (Oryzias latipes): OlSpinA, OlSpinB, and OlSpinC. Molecular cloning, genomic DNA Blast analysis and phylogenetic relationship analysis indicated that the three OlSpin genes arose by gene duplication. Furthermore, Western blot analysis revealed significant expression differences of the three OlSpins among different tissues and during embryogenesis in medaka, and suggested that sequence and functional divergence might have occurred among them during evolution.

  7. Evidence for tyrosine-linked glycosaminoglycan in a bacterial surface protein.

    PubMed

    Peters, J; Rudolf, S; Oschkinat, H; Mengele, R; Sumper, M; Kellermann, J; Lottspeich, F; Baumeister, W

    1992-04-01

    The S-layer protein of Acetogenium kivui was subjected to proteolysis with different proteases and several high molecular mass glycosaminoglycan peptides containing glucose, galactosamine and an unidentified sugar-related component were separated by molecular sieve chromatography and reversed-phase HPLC and subjected to N-terminal sequence analysis. By methylation analysis glucose was found to be uniformly 1,6-linked, whereas galactosamine was exclusively 1,4-linked. Hydrazinolysis and subsequent amino-acid analysis as well as two-dimensional NMR spectroscopy were used to demonstrate that in these peptides carbohydrate was covalently linked to tyrosine. As all of the four Tyr-glycosylation sites were found to be preceded by valine, a new recognition sequence for glycosylation is suggested.

  8. Characteristics of the Lotus japonicus gene repertoire deduced from large-scale expressed sequence tag (EST) analysis.

    PubMed

    Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

    2004-02-01

    To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.

  9. Patient perspectives on whole-genome sequencing for undiagnosed diseases.

    PubMed

    Boeldt, Debra L; Cheung, Cynthia; Ariniello, Lauren; Darst, Burcu F; Topol, Sarah; Schork, Nicholas J; Philis-Tsimikas, Athena; Torkamani, Ali; Fortmann, Addie L; Bloss, Cinnamon S

    2017-01-01

    This study assessed perspectives on whole-genome sequencing (WGS) for rare disease diagnosis and the process of receiving genetic results. Semistructured interviews were conducted with adult patients and parents of minor patients affected by idiopathic diseases (n = 10 cases). Three main themes were identified through qualitative data analysis and interpretation: perceived benefits of WGS; perceived drawbacks of WGS; and perceptions of the return of results from WGS. Findings suggest that patients and their families have important perspectives on the use of WGS in diagnostic odyssey cases. These perspectives could inform clinical sequencing research study designs as well as the appropriate deployment of patient and family support services in the context of clinical genome sequencing.

  10. Complete genome sequence of a new enamovirus from Argentina infecting alfalfa plants showing dwarfism symptoms.

    PubMed

    Bejerman, Nicolás; Giolitti, Fabián; Trucco, Verónica; de Breuil, Soledad; Dietzgen, Ralf G; Lenardon, Sergio

    2016-07-01

    Alfalfa dwarf disease, probably caused by synergistic interactions of mixed virus infections, is a major and emergent disease that threatens alfalfa production in Argentina. Deep sequencing of diseased alfalfa plant samples from the central region of Argentina resulted in the identification of a new virus genome resembling enamoviruses in sequence and genome structure. Phylogenetic analysis suggests that it is a new member of the genus Enamovirus, family Luteoviridae. The virus is tentatively named "alfalfa enamovirus 1" (AEV-1). The availability of the AEV-1 genome sequence will make it possible to assess the genetic variability of this virus and to construct an infectious clone to investigate its role in alfalfa dwarfism disease.

  11. Conservation and diversification of Msx protein in metazoan evolution.

    PubMed

    Takahashi, Hirokazu; Kamiya, Akiko; Ishiguro, Akira; Suzuki, Atsushi C; Saitou, Naruya; Toyoda, Atsushi; Aruga, Jun

    2008-01-01

    Msx (/msh) family genes encode homeodomain (HD) proteins that control ontogeny in many animal species. We compared the structures of Msx genes from a wide range of Metazoa (Porifera, Cnidaria, Nematoda, Arthropoda, Tardigrada, Platyhelminthes, Mollusca, Brachiopoda, Annelida, Echiura, Echinodermata, Hemichordata, and Chordata) to gain an understanding of the role of these genes in phylogeny. Exon-intron boundary analysis suggested that the position of the intron located N-terminally to the HDs was widely conserved in all the genes examined, including those of cnidarians. Amino acid (aa) sequence comparison revealed 3 new evolutionarily conserved domains, as well as very strong conservation of the HDs. Two of the three domains were associated with Groucho-like protein binding in both a vertebrate and a cnidarian Msx homolog, suggesting that the interaction between Groucho-like proteins and Msx proteins was established in eumetazoan ancestors. Pairwise comparison among the collected HDs and their C-flanking aa sequences revealed that the degree of sequence conservation varied depending on the animal taxa from which the sequences were derived. Highly conserved Msx genes were identified in the Vertebrata, Cephalochordata, Hemichordata, Echinodermata, Mollusca, Brachiopoda, and Anthozoa. The wide distribution of the conserved sequences in the animal phylogenetic tree suggested that metazoan ancestors had already acquired a set of conserved domains of the current Msx family genes. Interestingly, although strongly conserved sequences were recovered from the Vertebrata, Cephalochordata, and Anthozoa, the sequences from the Urochordata and Hydrozoa showed weak conservation. Because the Vertebrata-Cephalochordata-Urochordata and Anthozoa-Hydrozoa represent sister groups in the Chordata and Cnidaria, respectively, Msx sequence diversification may have occurred differentially in the course of evolution. We speculate that selective loss of the conserved domains in Msx family proteins contributed to the diversification of animal body organization.

  12. A gene-specific non-enhancer sequence is critical for expression from the promoter of the small heat shock protein gene αB-crystallin

    PubMed Central

    2014-01-01

    Background: Deciphering of the information content of eukaryotic promoters has remained confined to universal landmarks and conserved sequence elements such as enhancers and transcription factor binding motifs, which are considered sufficient for gene activation and regulation. Gene-specific sequences, interspersed between the canonical transacting factor binding sites or adjoining them within a promoter, are generally taken to be devoid of any regulatory information and have therefore been largely ignored. An unanswered question therefore is: do gene-specific sequences within a eukaryotic promoter have a role in gene activation? Here, we present an exhaustive experimental analysis of a gene-specific sequence adjoining the heat shock element (HSE) in the proximal promoter of the small heat shock protein gene, αB-crystallin (cryab). These sequences are highly conserved between rodents and humans. Results: Using human retinal pigment epithelial cells in culture as the host, we have identified a 10-bp gene-specific promoter sequence (GPS), which, unlike an enhancer, controls expression from the promoter of this gene only when in appropriate position and orientation. Notably, the data suggest that GPS in comparison with the HSE works in a context-independent fashion. Additionally, when moved upstream, about a nucleosome length of DNA (−154 bp) from the transcription start site (TSS), the activity of the promoter is markedly inhibited, suggesting its involvement in local promoter access. Importantly, we demonstrate that deletion of the GPS results in complete loss of cryab promoter activity in transgenic mice. Conclusions: These data suggest that gene-specific sequences such as the GPS, identified here, may have critical roles in regulating gene-specific activity from eukaryotic promoters. PMID:24589182

  13. Molecular cloning and sequence analysis of full-length growth hormone cDNAs from six important economic fishes.

    PubMed

    Zhang, Jing-Nan; Song, Ping; Hu, Jia-Rui; Mo, Sai-Jun; Peng, Mao-Yu; Zhou, Wei; Zou, Ji-Xing; Hu, Yin-Chang

    2005-01-01

    In this study, full-length growth hormone (GH) cDNAs were isolated from six economically important fishes: Siniperca kneri, Epinephelus coioides, Monopterus albus, Silurus asotus, Misgurnus anguillicaudatus and Carassius auratus gibelio Bloch. This is the first report of cloning these GH sequences, except for E. coioides GH. The lengths of the cDNAs, in the order listed, are 953 bp, 1,023 bp, 825 bp, 1,082 bp, 1,154 bp and 1,180 bp. Each sequence includes an ORF of about 600 bp that encodes a protein of about 200 amino acids: 204 amino acids for the S. kneri, E. coioides and M. albus GHs, 200 amino acids for the S. asotus GH, and 210 amino acids for the M. anguillicaudatus and C. auratus gibelio GHs. A detailed sequence analysis of the six GHs together with many other fish GH sequences was then performed. All six sequences showed high homology to other sequences, especially to sequences within the same order, and many conserved residues were identified, most localized in five domains. The phylogenetic trees (MP and NJ) of many fish GH ORF sequences (including the six new ones), with Amia calva as outgroup, were generally resolved and largely congruent with the morphology-based tree, though some incongruities were observed, suggesting that the GH ORF deserves more attention in teleostean phylogeny.

  14. Characterization and mapping of cDNA encoding aspartate aminotransferase in rice, Oryza sativa L.

    PubMed

    Song, J; Yamamoto, K; Shomura, A; Yano, M; Minobe, Y; Sasaki, T

    1996-10-31

    Fifteen cDNA clones, putatively identified as encoding aspartate aminotransferase (AST, EC 2.6.1.1), were isolated and partially sequenced. Together with six previously isolated clones putatively identified as encoding ASTs (Sasaki, et al. 1994, Plant Journal 6, 615-624), their sequences were characterized and classified into 4 cDNA species. Two of the isolated clones, C60213 and C2079, were full-length cDNAs, and their complete nucleotide sequences were determined. C60213 was 1612 bp long and its deduced amino acid sequence showed 88% homology with that of Panicum miliaceum L. mitochondrial AST. The C60213-encoded protein had an N-terminal amino acid sequence that was characteristic of a mitochondrial transit peptide. On the other hand, C2079 was 1546 bp long and had 91% amino acid sequence homology with P. miliaceum L. cytosolic AST but lacked the transit peptide sequence. The homologies of nucleotide sequences and deduced amino acid sequences of C2079 and C60213 were 54% and 52%, respectively. C2079 and C60213 were mapped on chromosomes 1 and 6, respectively, by restriction fragment length polymorphism linkage analysis. Northern blot analysis using C2079 as a probe revealed much higher transcript levels in callus and root than in green and etiolated shoots, suggesting tissue-specific variations of AST gene expression.

  15. Genomic analysis suggests that mRNA destabilization by the microprocessor is specialized for the auto-regulation of Dgcr8.

    PubMed

    Shenoy, Archana; Blelloch, Robert

    2009-09-11

    The Microprocessor, containing the RNA binding protein Dgcr8 and RNase III enzyme Drosha, is responsible for processing primary microRNAs to precursor microRNAs. The Microprocessor regulates its own levels by cleaving hairpins in the 5'UTR and coding region of the Dgcr8 mRNA, thereby destabilizing the mature transcript. To determine whether the Microprocessor has a broader role in directly regulating other coding mRNA levels, we integrated results from expression profiling and ultra high-throughput deep sequencing of small RNAs. Expression analysis of mRNAs in wild-type, Dgcr8 knockout, and Dicer knockout mouse embryonic stem (ES) cells uncovered mRNAs that were specifically upregulated in the Dgcr8 null background. A number of these transcripts had evolutionarily conserved predicted hairpin targets for the Microprocessor. However, analysis of deep sequencing data of 18 to 200nt small RNAs in mouse ES, HeLa, and HepG2 indicates that exonic sequence reads that map in a pattern consistent with Microprocessor activity are unique to Dgcr8. We conclude that the Microprocessor's role in directly destabilizing coding mRNAs is likely specifically targeted to Dgcr8 itself, suggesting a specialized cellular mechanism for gene auto-regulation.

  16. Transcriptomic analysis of rice aleurone cells identified a novel abscisic acid response element.

    PubMed

    Watanabe, Kenneth A; Homayouni, Arielle; Gu, Lingkun; Huang, Kuan-Ying; Ho, Tuan-Hua David; Shen, Qingxi J

    2017-09-01

    Seeds serve as a great model to study plant responses to drought stress, which is largely mediated by abscisic acid (ABA). The ABA responsive element (ABRE) is a key cis-regulatory element in ABA signalling. However, its consensus sequence (ACGTG(G/T)C) is present in the promoters of only about 40% of ABA-induced genes in rice aleurone cells, suggesting other ABREs may exist. To identify novel ABREs, RNA sequencing was performed on aleurone cells of rice seeds treated with 20 μM ABA. Gibbs sampling was used to identify enriched elements, and particle bombardment-mediated transient expression studies were performed to verify the function. Gene ontology analysis was performed to predict the roles of genes containing the novel ABREs. This study revealed 2443 ABA-inducible genes and a novel ABRE, designated as ABREN, which was experimentally verified to mediate ABA signalling in rice aleurone cells. Many of the ABREN-containing genes are predicted to be involved in stress responses and transcription. Analysis of other species suggests that the ABREN may be monocot specific. This study also revealed interesting expression patterns of genes involved in ABA metabolism and signalling. Collectively, this study advanced our understanding of diverse cis-regulatory sequences and the transcriptomes underlying ABA responses in rice aleurone cells. © 2017 John Wiley & Sons Ltd.
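    The ABRE consensus quoted in this abstract, ACGTG(G/T)C, maps directly onto a regular expression. The sketch below scans both strands of a promoter for that consensus; it only illustrates consensus matching and is not the Gibbs-sampling motif discovery used in the study. The function names and the toy promoter sequence are hypothetical.

```python
import re

# Consensus from the abstract: ACGTG followed by G or T, then C.
ABRE = re.compile(r"ACGTG[GT]C")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_abre(promoter):
    """Return (strand, 0-based start on the forward strand) for each consensus match."""
    promoter = promoter.upper()
    n = len(promoter)
    hits = [("+", m.start()) for m in ABRE.finditer(promoter)]
    # A match ending at m.end() on the reverse strand starts at n - m.end() on the forward strand.
    hits += [("-", n - m.end()) for m in ABRE.finditer(reverse_complement(promoter))]
    return sorted(hits, key=lambda hit: hit[1])

toy_promoter = "TTGACGTGGCAAATTTGACACGTCC"  # made-up sequence for illustration
print(find_abre(toy_promoter))
```

    Promoters of ABA-induced genes that return no hits from a scan like this are the ones for which the abstract proposes that other ABREs may exist.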

  17. Whole genome sequencing and phylogenetic analysis of Bluetongue virus serotype 2 strains isolated in the Americas including a novel strain from the western United States.

    PubMed

    Gaudreault, Natasha N; Mayo, Christie E; Jasperson, Dane C; Crossley, Beate M; Breitmeyer, Richard E; Johnson, Donna J; Ostlund, Eileen N; MacLachlan, N James; Wilson, William C

    2014-07-01

    Bluetongue is a potentially fatal arboviral disease of domestic and wild ruminants that is characterized by widespread edema and tissue necrosis. Bluetongue virus (BTV) serotypes 10, 11, 13, and 17 occur throughout much of the United States, whereas serotype 2 (BTV-2) was previously only detected in the southeastern United States. Since 1998, 10 other BTV serotypes have also been isolated from ruminants in the southeastern United States. In 2010, BTV-2 was identified in California for the first time, and preliminary sequence analysis indicated that the virus isolate was closely related to BTV strains circulating in the southeastern United States. In the current study, the whole genome sequence of the California strain of BTV-2 was compared with those of other BTV-2 strains in the Americas. The results of the analysis suggest co-circulation of genetically distinct viruses in the southeastern United States, and further suggest that the 2010 western isolate is closely related to southeastern strains of BTV. Although it remains uncertain as to how this novel virus was translocated to California, the findings of the current study underscore the need for ongoing surveillance of this economically important livestock disease.

  18. Complete genome analysis of jasmine virus T from Jasminum sambac in China.

    PubMed

    Tang, Yajun; Gao, Fangluan; Yang, Zhen; Wu, Zujian; Yang, Liang

    2016-07-01

    The genome of a potyvirus (isolate JaVT_FZ) recovered from jasmine (Jasminum sambac L.) showing yellow ringspot symptoms in Fuzhou, China, was sequenced. JaVT_FZ is closely related to seven other potyviruses with completely sequenced genomes, with which it shares 66-70 % nucleotide and 52-56 % amino acid sequence identity. However, the coat protein (CP) gene shares 82-92 % nucleotide and 90-97 % amino acid sequence identity with those of two partially sequenced potyviruses, named jasmine potyvirus T (JaVT-jasmine) and jasmine yellow mosaic potyvirus (JaYMV-India), respectively. This suggests that JaVT_FZ, JaVT-jasmine and JaYMV-India should be regarded as members of a single potyvirus species, for which the name "Jasmine virus T" has priority.

  19. Isolation and characterization of major histocompatibility complex class II B genes in cranes.

    PubMed

    Kohyama, Tetsuo I; Akiyama, Takuya; Nishida, Chizuko; Takami, Kazutoshi; Onuma, Manabu; Momose, Kunikazu; Masuda, Ryuichi

    2015-11-01

    In this study, we isolated and characterized the major histocompatibility complex (MHC) class II B genes in cranes. Genomic sequences spanning exons 1 to 4 were amplified and determined in 13 crane species and three other species closely related to cranes. In all, 55 unique sequences were identified, and at least two polymorphic MHC class II B loci were found in most species. An analysis of sequence polymorphisms showed the signature of positive selection and recombination. A phylogenetic reconstruction based on exon 2 sequences indicated that trans-species polymorphism has persisted for at least 10 million years, whereas phylogenetic analyses of the sequences flanking exon 2 revealed a pattern of concerted evolution. These results suggest that both balancing selection and recombination play important roles in the crane MHC evolution.

  20. Multi-virulence-locus sequence typing of Staphylococcus lugdunensis generates results consistent with a clonal population structure and is reliable for epidemiological typing.

    PubMed

    Didi, Jennifer; Lemée, Ludovic; Gibert, Laure; Pons, Jean-Louis; Pestel-Caron, Martine

    2014-10-01

    Staphylococcus lugdunensis is an emergent virulent coagulase-negative staphylococcus responsible for severe infections similar to those caused by Staphylococcus aureus. To understand its potentially pathogenic capacity and have further detailed knowledge of the molecular traits of this organism, 93 isolates from various geographic origins were analyzed by multi-virulence-locus sequence typing (MVLST), targeting seven known or putative virulence-associated loci (atlLR2, atlLR3, hlb, isdJ, SLUG_09050, SLUG_16930, and vwbl). The polymorphisms of the putative virulence-associated loci were moderate and comparable to those of the housekeeping genes analyzed by multilocus sequence typing (MLST). However, the MVLST scheme generated 43 virulence types (VTs) compared to 20 sequence types (STs) based on MLST, indicating that MVLST was significantly more discriminating (Simpson's index [D], 0.943). No hypervirulent lineage or cluster specific to carriage strains was defined. The results of multilocus sequence analysis of known and putative virulence-associated loci are consistent with a clonal population structure for S. lugdunensis, suggesting a coevolution of these genes with housekeeping genes. Indeed, the nonsynonymous to synonymous evolutionary substitutions (dN/dS) ratio, the Tajima's D test, and Single-likelihood ancestor counting (SLAC) analysis suggest that all virulence-associated loci were under negative selection, even atlLR2 (AtlL protein) and SLUG_16930 (FbpA homologue), for which the dN/dS ratios were higher. In addition, this analysis of virulence-associated loci allowed us to propose a trilocus sequence typing scheme based on the intragenic regions of atlLR3, isdJ, and SLUG_16930, which is more discriminant than MLST for studying short-term epidemiology and further characterizing the lineages of the rare but highly pathogenic S. lugdunensis. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
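    The discriminatory power quoted in this abstract (Simpson's index, D) can be computed from the type assigned to each isolate. The abstract does not state which formulation was used; the sketch below assumes the Hunter-Gaston form that is common in typing studies, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), and the isolate labels are invented for illustration.

```python
from collections import Counter

def simpsons_diversity(type_labels):
    """Simpson's index of diversity for a list of per-isolate type labels.

    D is the probability that two isolates drawn without replacement
    belong to different types.
    """
    counts = Counter(type_labels).values()
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two isolates")
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))

# Hypothetical panel of 10 isolates falling into 4 types.
toy_types = ["VT1"] * 4 + ["VT2"] * 3 + ["VT3"] * 2 + ["VT4"]
print(round(simpsons_diversity(toy_types), 3))
```

    A scheme that splits the same isolates into more, smaller groups yields a higher D, which is the sense in which the abstract describes MVLST as more discriminating than MLST.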

  1. Complete genome sequencing and phylogenetic analysis of dengue type 1 virus isolated from Jeddah, Saudi Arabia.

    PubMed

    Azhar, Esam I; Hashem, Anwar M; El-Kafrawy, Sherif A; Abol-Ela, Said; Abd-Alla, Adly M M; Sohrab, Sayed Sartaj; Farraj, Suha A; Othman, Norah A; Ben-Helaby, Huda G; Ashshi, Ahmed; Madani, Tariq A; Jamjoom, Ghazi

    2015-01-16

    Dengue viruses (DENVs) are mosquito-borne viruses which can cause disease ranging from mild fever to severe dengue infection. These viruses are endemic in several tropical and subtropical regions. Multiple outbreaks of DENV serotypes 1, 2 and 3 (DENV-1, DENV-2 and DENV-3) have been reported from the western region of Saudi Arabia since 1994. Strains from at least two genotypes of DENV-1 (Asia and America/Africa genotypes) were circulating in western Saudi Arabia until 2006. However, all previous studies reported from Saudi Arabia were based on partial sequencing data of the envelope (E) gene without any reports of full genome sequences for any DENV serotypes circulating in Saudi Arabia. Here, we report the isolation and the first complete genome sequence of a DENV-1 strain (DENV-1-Jeddah-1-2011) isolated from a patient from Jeddah, Saudi Arabia in 2011. Whole genome sequence alignment and phylogenetic analysis showed high similarity between the DENV-1-Jeddah-1-2011 strain and the D1/H/IMTSSA/98/606 isolate (Asian genotype) reported from Djibouti in 1998. Further analysis of the full envelope gene revealed a close relationship between the DENV-1-Jeddah-1-2011 strain and isolates reported between 2004 and 2006 from Jeddah as well as recent isolates from Somalia, suggesting that the Asian genotype is widespread in this region. These data suggest that strains belonging to the Asian genotype might have been introduced into Saudi Arabia long before 2004, most probably by African pilgrims, and continued to circulate in western Saudi Arabia at least until 2011. Most importantly, these results indicate that pilgrims from dengue endemic regions can play an important role in the spread of new DENVs in Saudi Arabia and the rest of the world. Therefore, availability of complete genome sequences would serve as a reference for future epidemiological studies of DENV-1 viruses.

  2. Analysis of Claviceps africana and C. sorghi from India using AFLPs, EF-1alpha gene intron 4, and beta-tubulin gene intron 3.

    PubMed

    Tooley, Paul W; Bandyopadhyay, Ranajit; Carras, Marie M; Pazoutová, Sylvie

    2006-04-01

    Isolates of Claviceps causing ergot on sorghum in India were analysed by AFLP analysis, and by analysis of DNA sequences of the EF-1alpha gene intron 4 and beta-tubulin gene intron 3 region. Of 89 isolates assayed from six states in India, four were determined to be C. sorghi, and the rest C. africana. A relatively low level of genetic diversity was observed within the Indian C. africana population. No evidence of genetic exchange between C. africana and C. sorghi was observed in either AFLP or DNA sequence analysis. Phylogenetic analysis was conducted using DNA sequences from 14 different Claviceps species. A multigene phylogeny based on the EF-1alpha gene intron 4, the beta-tubulin gene intron 3 region, and rDNA showed that C. sorghi grouped most closely with C. gigantea and C. africana. Although the Claviceps species we analysed were closely related, they colonize hosts that are taxonomically very distinct suggesting that there is no direct coevolution of Claviceps with its hosts.

  3. Teaching Crystallography to Noncrystallographers.

    ERIC Educational Resources Information Center

    Glusker, Jenny P.

    1988-01-01

    Addresses the requirements of high school students and noncrystallographers in lectures on crystals, diffraction, and structure analysis. Discusses basic understanding and a sequence that addresses these requirements. Suggests visual and descriptive teaching methods used in this effort. (CW)

  4. Optimal packaging of FIV genomic RNA depends upon a conserved long-range interaction and a palindromic sequence within gag.

    PubMed

    Rizvi, Tahir A; Kenyon, Julia C; Ali, Jahabar; Aktar, Suriya J; Phillip, Pretty S; Ghazawi, Akela; Mustafa, Farah; Lever, Andrew M L

    2010-10-15

    The feline immunodeficiency virus (FIV) is a lentivirus that is related to human immunodeficiency virus (HIV), causing a similar pathology in cats. It is a potential small animal model for AIDS and the FIV-based vectors are also being pursued for human gene therapy. Previous studies have mapped the FIV packaging signal (ψ) to two or more discontinuous regions within the 5' 511 nt of the genomic RNA and structural analyses have determined its secondary structure. The 5' and 3' sequences within ψ region interact through extensive long-range interactions (LRIs), including a conserved heptanucleotide interaction between R/U5 and gag. Other secondary structural elements identified include a conserved 150 nt stem-loop (SL2) and a small palindromic stem-loop within gag open reading frame that might act as a viral dimerization initiation site. We have performed extensive mutational analysis of these sequences and structures and ascertained their importance in FIV packaging using a trans-complementation assay. Disrupting the conserved heptanucleotide LRI to prevent base pairing between R/U5 and gag reduced packaging by 2.8-5.5 fold. Restoration of pairing using an alternative, non-wild type (wt) LRI sequence restored RNA packaging and propagation to wt levels, suggesting that it is the structure of the LRI, rather than its sequence, that is important for FIV packaging. Disrupting the palindrome within gag reduced packaging by 1.5-3-fold, but substitution with a different palindromic sequence did not restore packaging completely, suggesting that the sequence of this region as well as its palindromic nature is important. Mutation of individual regions of SL2 did not have a pronounced effect on FIV packaging, suggesting that either it is the structure of SL2 as a whole that is necessary for optimal packaging, or that there is redundancy within this structure. 
    The mutational analysis presented here has further validated the previously predicted RNA secondary structure of FIV ψ. Copyright © 2010 Elsevier Ltd. All rights reserved.

  5. [Isolation and identification of specific sequences correlated to cytoplasmic male sterility and fertile maintenance in cauliflower (Brassica oleracea var. botrytis)].

    PubMed

    Wang, Chun Guo; Chen, Xiao Qiang; Li, Hui; Zhao, Qian Cheng; Sun, De Ling; Song, Wen Qin

    2008-02-01

    ISSR (inter-simple sequence repeat) and DDRT-PCR (differential display reverse transcriptase polymerase chain reaction) analyses were performed on the cytoplasmic male sterile cauliflower line ogura-A and its corresponding maintainer line ogura-B. In total, 306 detectable bands were obtained by ISSR using thirty oligonucleotide primers, with six to twelve bands typically produced per primer. Among all these primers, only the amplification with primer ISSR3 was polymorphic: a specific 1,100 bp band, named ISSR3(1100), was detected only in the maintainer line. Sequence analysis indicated that ISSR3(1100) was highly homologous to the corresponding mitochondrial genome sequences of Brassica napus and Arabidopsis thaliana, which suggested that ISSR3(1100) may derive from the mitochondrial genome of cauliflower. For DDRT-PCR analysis, three anchor primers were combined with fifteen random primers. In total, 1,122 bands ranging from 1,000 bp to 50 bp were detected. However, only four bands, named ogura-A205, ogura-A383, ogura-B307 and ogura-B352, were confirmed to be differentially displayed between the two lines. This result was further confirmed by reverse Northern dot blot analysis. Among these four bands, ogura-A205 and ogura-A383 were expressed only in the cytoplasmic male sterile line, while ogura-B307 and ogura-B352 were detected only in the maintainer line. Sequence analysis indicated that these four sequences were reported in cauliflower for the first time. Interestingly, ogura-A205 and ogura-B307 did not show similarity to any reported sequences from other species, and further investigation is required to characterize them. ogura-A383 and ogura-B352 were also new sequences; they showed high similarity to the corresponding chloroplast sequences of Arabidopsis thaliana and Brassica rapa subsp. pekinensis, so we speculate that these two sequences may derive from the chloroplast genome. All the results obtained in this study offer new and significant information for investigating the molecular mechanism of cytoplasmic male sterility and fertility maintenance in cauliflower.

  6. Genome sequence analysis of dengue virus 1 isolated in Key West, Florida.

    PubMed

    Shin, Dongyoung; Richards, Stephanie L; Alto, Barry W; Bettinardi, David J; Smartt, Chelsea T

    2013-01-01

    Dengue virus (DENV) is transmitted to humans through the bite of mosquitoes. In November 2010, a dengue outbreak was reported in Monroe County in southern Florida (FL), including more than 20 confirmed human cases. The virus collected from the human cases was verified as DENV serotype 1 (DENV-1) and one isolate was provided for sequence analysis. RNA was extracted from the DENV-1 isolate and was used in reverse transcription polymerase chain reaction (RT-PCR) to amplify PCR fragments to sequence. Nucleic acid primers were designed to generate overlapping PCR fragments that covered the entire genome. The DENV-1 isolate found in Key West (KW), FL was sequenced for whole genome characterization. Sequence assembly, GenBank searches, and recombination analyses were performed to verify the identity of the genome sequences and to determine percent similarity to known DENV-1 sequences. We show that the KW DENV-1 strain is 99% identical to Nicaraguan and Mexican DENV-1 strains. Phylogenetic and recombination analyses suggest that the DENV-1 isolated in KW originated from Nicaragua (NI) and the KW strain may circulate in KW. Also, recombination analysis results detected recombination events in the KW strain compared to DENV-1 strains from Puerto Rico. We evaluate the relative growth of the KW strain of DENV-1 compared to other dengue viruses to determine whether the underlying genetics of the strain is associated with a replicative advantage, an important consideration because domestic tourism can spread DENVs and lead to local transmission.

  7. Quantitative mutant analysis of viral quasispecies by chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry

    PubMed Central

    Amexis, Georgios; Oeth, Paul; Abel, Kenneth; Ivshina, Anna; Pelloquin, Francois; Cantor, Charles R.; Braun, Andreas; Chumakov, Konstantin

    2001-01-01

    RNA viruses exist as quasispecies, heterogeneous and dynamic mixtures of mutants having one or more consensus sequences. An adequate description of the genomic structure of such viral populations must include the consensus sequence(s) plus a quantitative assessment of sequence heterogeneities. For example, in quality control of live attenuated viral vaccines, the presence of even small quantities of mutants or revertants may indicate incomplete or unstable attenuation that may influence vaccine safety. Previously, we demonstrated the monitoring of oral poliovirus vaccine with the use of mutant analysis by PCR and restriction enzyme cleavage (MAPREC). In this report, we investigate genetic variation in live attenuated mumps virus vaccine by using both MAPREC and a platform (DNA MassArray) based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. Mumps vaccines prepared from the Jeryl Lynn strain typically contain at least two distinct viral substrains, JL1 and JL2, which have been characterized by full length sequencing. We report the development of assays for characterizing sequence variants in these substrains and demonstrate their use in quantitative analysis of substrains and sequence variations in mixed virus cultures and mumps vaccines. The results obtained from both the MAPREC and MALDI-TOF methods showed excellent correlation. This suggests the potential utility of MALDI-TOF for routine quality control of live viral vaccines and for assessment of genetic stability and quantitative monitoring of genetic changes in other RNA viruses of clinical interest. PMID:11593021

  8. Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds

    PubMed Central

    Dean, Rebecca; Harrison, Peter W.; Wright, Alison E.; Zimmer, Fabian; Mank, Judith E.

    2015-01-01

    The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. PMID:26067773

  9. Dynamic scanpaths: eye movement analysis methods

    NASA Astrophysics Data System (ADS)

    Blackmon, Theodore T.; Ho, Yeuk F.; Chernyak, Dimitri A.; Azzariti, Michela; Stark, Lawrence W.

    1999-05-01

    An eye movement sequence, or scanpath, during viewing of a stationary stimulus has been described as a set of fixations onto regions of interest (ROIs) and the saccades or transitions between them. Such scanpaths have high similarity for the same subject and stimulus both in the spatial loci of the ROIs and their sequence; scanpaths also take place during recollection of a previously viewed stimulus, suggesting that they play a similar role in visual memory and recall.

  10. Whole-Genome Sequencing and Comparative Genome Analysis of Bacillus subtilis Strains Isolated from Non-Salted Fermented Soybean Foods.

    PubMed

    Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi

    2015-01-01

    Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and their relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genomes of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which were related to the fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequence typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from "Tua Nao" of Thailand traces a different evolutionary process from other strains.

  11. Analysis of the cytochrome c oxidase subunit II (COX2) gene in giant panda, Ailuropoda melanoleuca.

    PubMed

    Ling, S S; Zhu, Y; Lan, D; Li, D S; Pang, H Z; Wang, Y; Li, D Y; Wei, R P; Zhang, H M; Wang, C D; Hu, Y D

    2017-01-23

    The giant panda, Ailuropoda melanoleuca (Ursidae), has a unique bamboo-based diet; however, this low-energy intake has been sufficient to maintain the metabolic processes of this species since the fourth ice age. As mitochondria are the main sites for energy metabolism in animals, the protein-coding genes involved in mitochondrial respiratory chains, particularly cytochrome c oxidase subunit II (COX2), which is the rate-limiting enzyme in electron transfer, could play an important role in giant panda metabolism. Therefore, the present study aimed to isolate, sequence, and analyze the COX2 DNA from individuals kept at the Giant Panda Protection and Research Center, China, and compare these sequences with those of the other Ursidae family members. Multiple sequence alignment showed that the COX2 gene had three point mutations that defined three haplotypes, with 60% of the sequences corresponding to haplotype I. The neutrality tests revealed that the COX2 gene was conserved throughout evolution, and the maximum likelihood phylogenetic analysis, using homologous sequences from other Ursidae species, showed clustering of the COX2 sequences of giant pandas, suggesting that this gene evolved differently in them.

  12. A core microbiome associated with the peritoneal tumors of pseudomyxoma peritonei

    PubMed Central

    2013-01-01

    Background: Pseudomyxoma peritonei (PMP) is a malignancy characterized by dissemination of mucus-secreting cells throughout the peritoneum. This disease is associated with significant morbidity and mortality and despite effective treatment options for early-stage disease, patients with PMP often relapse. Thus, there is a need for additional treatment options to reduce relapse rate and increase long-term survival. A previous study identified the presence of both typed and non-culturable bacteria associated with PMP tissue and determined that increased bacterial density was associated with more severe disease. These findings highlighted the possible role for bacteria in PMP disease. Methods: To more clearly define the bacterial communities associated with PMP disease, we employed a sequence-based analysis to profile the bacterial populations found in PMP tumor and mucin tissue in 11 patients. Sequencing data were confirmed by in situ hybridization at multiple taxonomic depths and by culturing. A pilot clinical study was initiated to determine whether the addition of antibiotic therapy affected PMP patient outcome. Main results: We determined that the types of bacteria present are highly conserved in all PMP patients; the dominant phyla are the Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes. A core set of taxon-specific sequences was found in all 11 patients; many of these sequences were classified into taxonomic groups that also contain known human pathogens. In situ hybridization directly confirmed the presence of bacteria in PMP at multiple taxonomic depths and supported our sequence-based analysis. Furthermore, culturing of PMP tissue samples allowed us to isolate 11 different bacterial strains from eight independent patients, and in vitro analysis of a subset of these isolates suggests that at least some of these strains may interact with the PMP-associated mucin MUC2. Finally, we provide evidence suggesting that targeting these bacteria with antibiotic treatment may increase the survival of PMP patients. Conclusions: Using 16S amplicon-based sequencing, direct in situ hybridization analysis and culturing methods, we have identified numerous bacterial taxa that are consistently present in all PMP patients tested. Combined with data from a pilot clinical study, these data support the hypothesis that adding antimicrobials to the standard PMP treatment could improve PMP patient survival. PMID:23844722

  13. Using DGGE and 16S rRNA gene sequence analysis to evaluate changes in oral bacterial composition.

    PubMed

    Chen, Zhou; Trivedi, Harsh M; Chhun, Nok; Barnes, Virginia M; Saxena, Deepak; Xu, Tao; Li, Yihong

    2011-01-01

    To investigate whether a standard dental prophylaxis followed by tooth brushing with an antibacterial dentifrice will affect the oral bacterial community, as determined by denaturing gradient gel electrophoresis (DGGE) combined with 16S rRNA gene sequence analysis. Twenty-four healthy adults were instructed to brush their teeth using commercial dentifrice for 1 week during a washout period. An initial set of pooled supragingival plaque samples was collected from each participant at baseline (0 h) before prophylaxis treatment. The subjects were given a clinical examination and dental prophylaxis and asked to brush for 1 min with a dentifrice containing 0.3% triclosan, 2.0% PVM/MA copolymer and 0.243% sodium fluoride (Colgate Total). On the following day, a second set of pooled supragingival plaque samples (24 h) was collected. Total bacterial genomic DNA was isolated from the samples. Differences in the microbial composition before and after the prophylactic procedure and tooth brushing were assessed by comparing the DGGE profiles and 16S rRNA gene segments sequence analysis. Two distinct clusters of DGGE profiles were found, suggesting that a shift in the microbial composition had occurred 24 h after the prophylaxis and brushing. A detailed sequencing analysis of 16S rRNA gene segments further identified 6 phyla and 29 genera, including known and unknown bacterial species. Importantly, an increase in bacterial diversity was observed after 24 h, including members of the Streptococcaceae family, Prevotella, Corynebacterium, TM7 and other commensal bacteria. The results suggest that the use of a standard prophylaxis followed by the use of the dentifrice containing 0.3% triclosan, 2.0% PVM/MA copolymer and 0.243% sodium fluoride may promote a healthier composition within the oral bacterial community.

  14. Genetic analysis of Trichuris suis and Trichuris trichiura recovered from humans and pigs in a sympatric setting in Uganda.

    PubMed

    Nissen, Sofie; Al-Jubury, Azmi; Hansen, Tina V A; Olsen, Annette; Christensen, Henrik; Thamsborg, Stig M; Nejsum, Peter

    2012-08-13

    The whipworms Trichuris trichiura and Trichuris suis in humans and pigs, respectively, are believed to be two different yet closely related species. Morphologically, adult worms, eggs and larvae of the two species are indistinguishable. The aim of this study was to examine the genetic variation of Trichuris sp. mainly recovered from naturally infected pigs and humans. Worm material isolated from humans and pigs living in the same geographical region in Uganda was analyzed by PCR, cloning and sequencing. Measurements of morphometric characters were also performed. The analysis of the ITS-2 (internal transcribed spacer) region showed a high genetic variation in the human-derived worms with two sequence types, designated type 1 and type 2, differing by up to 45%, the type 2 being identical to the sequence found in pig-derived worms. A single human-derived worm showed exclusively the type 2 genotype (T. suis-type) and three cases of 'heterozygote' worms in humans were identified. However, the analysis showed that sympatric Trichuris primarily assorted with host origin. Sequence analysis of a part of the genetically conserved β-tubulin gene confirmed two separate populations/species but also showed that the 'heterozygote' worms had a T. suis-like β-tubulin gene. A PCR-RFLP on the ITS-2 region was developed that could distinguish between worms of the pig, human and 'heterozygote' type. The data suggest that Trichuris in pigs and humans belong to two different populations (i.e. are two different species). However, the data presented also suggest that cross-infections of humans with T. suis take place. Further studies on sympatric Trichuris populations are highly warranted in order to explore transmission dynamics and unravel the zoonotic potential of T. suis. Copyright © 2012 Elsevier B.V. All rights reserved.

  15. Group B Streptococcus Vaginal Carriage in Pregnant Women as Deciphered by Clustered Regularly Interspaced Short Palindromic Repeat Analysis.

    PubMed

    Beauruelle, Clemence; Pastuszka, Adeline; Mereghetti, Laurent; Lanotte, Philippe

    2018-06-01

    We evaluated the diversity of group B Streptococcus (GBS) vaginal carriage populations in pregnant women. For this purpose, we studied each isolate present in a primary culture of a vaginal swab using a new approach based on clustered regularly interspaced short palindromic repeats (CRISPR) locus analysis. To evaluate the CRISPR array composition rapidly, a restriction fragment length polymorphism (RFLP) analysis was performed. For each different pattern observed, the CRISPR array was sequenced and capsular typing and multilocus sequence typing (MLST) were performed. A total of 970 isolates from 10 women were analyzed by CRISPR-RFLP. Each woman carrying GBS isolates presented one to five specific "personal" patterns. Five women showed similar isolates with specific and unique restriction patterns, suggesting the carriage of a single GBS clone. Different patterns were observed among isolates from the other five women. For three of these, CRISPR locus sequencing highlighted low levels of internal modifications in the locus backbone, whereas there were high levels of modifications for the last two women, suggesting the carriage of two different clones. These two clones were closely related, having the same ancestral spacer(s), the same capsular type and, in one case, the same ST, but showed different antibiotic resistance patterns in pairs. Eight of 10 women were colonized by a single GBS clone, while two of them were colonized by two strains, leading to a risk of selection of more-virulent and/or more-resistant clones during antibiotic prophylaxis. This CRISPR analysis made it possible to separate isolates belonging to a single capsular type and sequence type, highlighting the greater discriminating power of this approach. Copyright © 2018 American Society for Microbiology.

  16. Putative cross-kingdom horizontal gene transfer in sponge (Porifera) mitochondria.

    PubMed

    Rot, Chagai; Goldfarb, Itay; Ilan, Micha; Huchon, Dorothée

    2006-09-14

    The mitochondrial genome of Metazoa is usually a compact molecule without introns. Exceptions to this rule have been reported only in corals and sea anemones (Cnidaria), in which group I introns have been discovered in the cox1 and nad5 genes. Here we show several lines of evidence demonstrating that introns can also be found in the mitochondria of sponges (Porifera). A 2,349 bp fragment of the mitochondrial cox1 gene was sequenced from the sponge Tetilla sp. (Spirophorida). This fragment suggests the presence of a 1143 bp intron. Similar to all the cnidarian mitochondrial introns, the putative intron has group I intron characteristics. The intron is present in the cox1 gene and encodes a putative homing endonuclease. In order to establish the distribution of this intron in sponges, the cox1 gene was sequenced from several representatives of the demosponge diversity. The intron was found only in the sponge order Spirophorida. A phylogenetic analysis of the COI protein sequence and of the intron open reading frame suggests that the intron may have been transmitted horizontally from a fungus donor. Little is known about sponge-associated fungi, although in the last few years the latter have been frequently isolated from sponges. We suggest that the horizontal gene transfer of a mitochondrial intron was facilitated by a symbiotic relationship between fungus and sponge. Ecological relationships are known to have implications at the genomic level. Here, an ecological relationship between sponge and fungus is suggested based on the genomic analysis.

  17. CRISPR regulation of intraspecies diversification by limiting IS transposition and intercellular recombination.

    PubMed

    Watanabe, Takayasu; Nozawa, Takashi; Aikawa, Chihiro; Amano, Atsuo; Maruyama, Fumito; Nakagawa, Ichiro

    2013-01-01

    Mobile genetic elements (MGEs) and genetic rearrangement are considered as major driving forces of bacterial diversification. Previous comparative genome analysis of Porphyromonas gingivalis, a pathogen related to periodontitis, implied such an important relationship. As a counterpart system to MGEs, clustered regularly interspaced short palindromic repeats (CRISPRs) in bacteria may be useful for genetic typing. We found that CRISPR typing could be a reasonable alternative to conventional methods for characterizing phylogenetic relationships among 60 highly diverse P. gingivalis isolates. Examination of genetic recombination along with multilocus sequence typing suggests the importance of such events between different isolates. MGEs appear to be strategically located at the breakpoint gaps of complicated genome rearrangements. Of these MGEs, insertion sequences (ISs) were found most frequently. CRISPR analysis identified 2,150 spacers that were clustered into 1,187 unique ones. Most of these spacers exhibited no significant nucleotide similarity to known sequences (97.6%: 1,158/1,187). Surprisingly, CRISPR spacers exhibiting high nucleotide similarity to regions of P. gingivalis genomes including ISs were predominant. The proportion of such spacers to all the unique spacers (1.6%: 19/1,187) was the highest among previous studies, suggesting novel functions for these CRISPRs. These results indicate that P. gingivalis is a bacterium with high intraspecies diversity caused by frequent insertion sequence (IS) transposition, whereas both the introduction of foreign DNA, primarily from other P. gingivalis cells, and IS transposition are limited by CRISPR interference. It is suggested that P. gingivalis CRISPRs could be an important source for understanding the role of CRISPRs in the development of bacterial diversity.

  18. Community and gene composition of a human dental plaque microbiota obtained by metagenomic sequencing

    PubMed Central

    Xie, G.; Chain, P.S.G.; Lo, C.; Liu, K-L.; Gans, J.; Merritt, J.; Qi, F.

    2010-01-01

    Human dental plaque is a complex microbial community containing an estimated 700 to 19,000 species/phylotypes. Despite numerous studies analysing species richness in healthy and diseased human subjects, the true genomic composition of the human dental plaque microbiota remains unknown. Here we report a metagenomic analysis of a healthy human plaque sample using a combination of second-generation sequencing platforms. A total of 860 million base pairs of non-human sequences were generated. Various analysis tools revealed the presence of 12 well-characterized phyla, members of the TM-7 and BRC1 clade, and sequences that could not be classified. Both pathogens and opportunistic pathogens were identified, supporting the ecological plaque hypothesis for oral diseases. Mapping the metagenomic reads to sequenced reference genomes demonstrated that 4% of the reads could be assigned to the sequenced species. Preliminary annotation identified genes belonging to all known functional categories. Interestingly, although 73% of the total assembled contig sequences were predicted to code for proteins, only 51% of them could be assigned a functional role. Furthermore, ~2.8% of the total predicted genes coded for proteins involved in resistance to antibiotics and toxic compounds, suggesting that the oral cavity is an important reservoir for antimicrobial resistance. PMID:21040513

  19. Community and gene composition of a human dental plaque microbiota obtained by metagenomic sequencing.

    PubMed

    Xie, G; Chain, P S G; Lo, C-C; Liu, K-L; Gans, J; Merritt, J; Qi, F

    2010-12-01

    Human dental plaque is a complex microbial community containing an estimated 700 to 19,000 species/phylotypes. Despite numerous studies analysing species richness in healthy and diseased human subjects, the true genomic composition of the human dental plaque microbiota remains unknown. Here we report a metagenomic analysis of a healthy human plaque sample using a combination of second-generation sequencing platforms. A total of 860 million base pairs of non-human sequences were generated. Various analysis tools revealed the presence of 12 well-characterized phyla, members of the TM-7 and BRC1 clade, and sequences that could not be classified. Both pathogens and opportunistic pathogens were identified, supporting the ecological plaque hypothesis for oral diseases. Mapping the metagenomic reads to sequenced reference genomes demonstrated that 4% of the reads could be assigned to the sequenced species. Preliminary annotation identified genes belonging to all known functional categories. Interestingly, although 73% of the total assembled contig sequences were predicted to code for proteins, only 51% of them could be assigned a functional role. Furthermore, ~2.8% of the total predicted genes coded for proteins involved in resistance to antibiotics and toxic compounds, suggesting that the oral cavity is an important reservoir for antimicrobial resistance. © 2010 John Wiley & Sons A/S.

  20. Sequence-based analysis of pQBR103; a representative of a unique, transfer-proficient mega plasmid resident in the microbial community of sugar beet

    PubMed Central

    Tett, Adrian; Spiers, Andrew J; Crossman, Lisa C; Ager, Duane; Ciric, Lena; Dow, J Maxwell; Fry, John C; Harris, David; Lilley, Andrew; Oliver, Anna; Parkhill, Julian; Quail, Michael A; Rainey, Paul B; Saunders, Nigel J; Seeger, Kathy; Snyder, Lori AS; Squares, Rob; Thomas, Christopher M; Turner, Sarah L; Zhang, Xue-Xian; Field, Dawn; Bailey, Mark J

    2009-01-01

    The plasmid pQBR103 was found within Pseudomonas populations colonizing the leaf and root surfaces of sugar beet plants growing at Wytham, Oxfordshire, UK. At 425 kb it is the largest self-transmissible plasmid yet sequenced from the phytosphere. It is known to enhance the competitive fitness of its host, and parts of the plasmid are known to be actively transcribed in the plant environment. Analysis of the complete sequence of this plasmid predicts a coding sequence (CDS)-rich genome containing 478 CDSs and an exceptional degree of genetic novelty; 80% of predicted coding sequences cannot be ascribed a function and 60% are orphans. Of those to which function could be assigned, 40% bore greatest similarity to sequences from Pseudomonas spp., and the majority of the remainder showed similarity to other γ-proteobacterial genera and plasmids. pQBR103 has identifiable regions presumed responsible for replication and partitioning, but despite being tra+ lacks the full complement of any previously described conjugal transfer functions. The DNA sequence provided few insights into the functional significance of plant-induced transcriptional regions, but suggests that 14% of CDSs may be expressed (11 CDSs with functional annotation and 54 without), further highlighting the ecological importance of these novel CDSs. Comparative analysis indicates that pQBR103 shares significant regions of sequence with other plasmids isolated from sugar beet plants grown at the same geographic location. These plasmid sequences indicate there is more novelty in the mobile DNA pool accessible to phytosphere pseudomonas than is currently appreciated or understood. PMID:18043644

  1. Analysis of Two Cosmid Clones from Chromosome 4 of Drosophila melanogaster Reveals Two New Genes Amid an Unusual Arrangement of Repeated Sequences

    PubMed Central

    Locke, John; Podemski, Lynn; Roy, Ken; Pilgrim, David; Hodgetts, Ross

    1999-01-01

    Chromosome 4 from Drosophila melanogaster has several unusual features that distinguish it from the other chromosomes. These include a diffuse appearance in salivary gland polytene chromosomes, an absence of recombination, and the variegated expression of P-element transgenes. As part of a larger project to understand these properties, we are assembling a physical map of this chromosome. Here we report the sequence of two cosmids representing ∼5% of the polytenized region. Both cosmid clones contain numerous repeated DNA sequences, as identified by cross hybridization with labeled genomic DNA, BLAST searches, and dot matrix analysis, which are positioned between and within the transcribed sequences. The repetitive sequences include three copies of the mobile element Hoppel, one copy of the mobile element HB, and 18 DINE repeats. DINE is a novel, short repeated sequence dispersed throughout both cosmid sequences. One cosmid includes the previously described cubitus interruptus (ci) gene and two new genes: a gene with a predicted amino acid sequence similar to ribosomal protein S3a, which is consistent with the Minute(4)101 locus thought to be in the region, and a novel member of the protein family that includes plexin and met–hepatocyte growth factor receptor. The other cosmid contains only the two short 5′-most exons from the zinc-finger-homolog-2 (zfh-2) gene. This is the first extensive sequence analysis of noncoding DNA from chromosome 4. The distribution of the various repeats suggests its organization is similar to the β-heterochromatic regions near the base of the major chromosome arms. Such a pattern may account for the diffuse banding of the polytene chromosome 4 and the variegation of many P-element transgenes on the chromosome. PMID:10022978

  2. Transcriptome de novo assembly sequencing and analysis of the toxic dinoflagellate Alexandrium catenella using the Illumina platform.

    PubMed

    Zhang, Shu; Sui, Zhenghong; Chang, Lianpeng; Kang, Kyoungho; Ma, Jinhua; Kong, Fanna; Zhou, Wei; Wang, Jinguo; Guo, Liliang; Geng, Huili; Zhong, Jie; Ma, Qingxia

    2014-03-10

    In this article, high-throughput de novo transcriptomic sequencing was performed in Alexandrium catenella, which provided the first view of the gene repertoire in this dinoflagellate based on next-generation sequencing (NGS) technologies. A total of 118,304 unigenes were identified with an average length of 673 bp (base pairs). Of these unigenes, 77,936 (65.9%) were annotated with known proteins based on sequence similarities, among which 24,149 and 22,956 unigenes were assigned to gene ontology categories (GO) and clusters of orthologous groups (COGs), respectively. Furthermore, 16,467 unigenes were mapped onto 322 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG). We also detected 1,143 simple sequence repeats (SSRs), in which the tri-nucleotide repeat motif (69.3%) was the most abundant. The genetic facts and significance derived from the transcriptome dataset were suggested and discussed. All four core nucleosomal histones and linker histones were detected, in addition to the unigenes involved in histone modifications. A total of 190 unigenes were identified as being involved in the endocytosis pathway, and clathrin-dependent endocytosis was suggested to play a role in the heterotrophy of A. catenella. A conserved 22-nt spliced leader (SL) was identified in 21 unigenes, which suggested the existence of trans-splicing processing of mRNA in A. catenella. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
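    The SSR counts reported in the preceding abstract come from scanning sequences for short motifs repeated in tandem. A minimal sketch of that idea follows; it is not the authors' pipeline. The function name `find_ssrs` and the minimum-repeat thresholds are hypothetical choices for illustration, since the abstract does not state which tool or cutoffs were used.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    if min_repeats is None:
        # Hypothetical thresholds: motif length -> minimum number of copies.
        min_repeats = {2: 6, 3: 5, 4: 5}
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are just repeats of a shorter unit ("AAA", "ATAT").
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // motif_len))
    return hits

if __name__ == "__main__":
    # Toy sequence containing six copies of the tri-nucleotide motif CAG.
    print(find_ssrs("TT" + "CAG" * 6 + "TT"))
```

    Only perfect repeats are reported here; dedicated SSR tools also handle compound and imperfect repeats, which this sketch ignores.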

  3. Identification of a precursor genomic segment that provided a sequence unique to glycophorin B and E genes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Onda, M.; Kudo, S.; Fukuda, M.

    Human glycophorin A, B, and E (GPA, GPB, and GPE) genes belong to a gene family located at the long arm of chromosome 4. These three genes are homologous from the 5'-flanking sequence to the Alu sequence, which is 1 kb downstream from the exon encoding the transmembrane domain. Analysis of the Alu sequence and flanking direct repeat sequences suggested that the GPA gene most closely resembles the ancestral gene, whereas the GPB and GPE genes arose by homologous recombination within the Alu sequence, acquiring 3' sequences from an unrelated precursor genomic segment. Here the authors describe the identification of this putative precursor genomic segment. A human genomic library was screened by using the sequence of the 3' region of the GPB gene as a probe. The genomic clones isolated were found to contain an Alu sequence that appeared to be involved in the recombination. Downstream from the Alu sequence, the nucleotide sequence of the precursor genomic segment is almost identical to that of the GPB or GPE gene. In contrast, the upstream sequence of the genomic segment differs entirely from that of the GPA, GPB, and GPE genes. Conservation of the direct repeats flanking the Alu sequence of the genomic segment strongly suggests that the sequence of this genomic segment has been maintained during evolution. This identified genomic segment was found to reside downstream from the GPA gene by both gene mapping and in situ chromosomal localization. The precursor genomic segment was also identified in the orangutan genome, which is known to lack GPB and GPE genes. These results indicate that one of the duplicated ancestral glycophorin genes acquired a unique 3' sequence by unequal crossing-over through its Alu sequence and the further downstream Alu sequence present in the duplicated gene. Further duplication and divergence of this gene yielded the GPB and GPE genes. 37 refs., 5 figs.

  4. Correlation of fitness landscapes from three orthologous TIM barrels originates from sequence and structure constraints

    PubMed Central

    Chan, Yvonne H.; Venev, Sergey V.; Zeldovich, Konstantin B.; Matthews, C. Robert

    2017-01-01

    Sequence divergence of orthologous proteins enables adaptation to environmental stresses and promotes evolution of novel functions. Limits on evolution imposed by constraints on sequence and structure were explored using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Analysis of beneficial mutations pointed to an unexpected, long-range allosteric pathway towards the active site of the protein. Significant correlations between the fitness landscapes of distant orthologues implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape and suggest that fitness landscapes can be translocated in sequence space. Exploration of fitness landscapes in the context of a protein fold provides a strategy for elucidating the sequence-structure-fitness relationships in other common motifs. PMID:28262665

  5. [Identification and phylogenetic application of unique nucleotide sequence of nad7 intron2 in Rhodiola (Crassulaceae) species].

    PubMed

    Deng, Ke-Jun; Yang, Zu-Jun; Liu, Cheng; Zhao, Wei; Liu, Chang; Feng, Juan; Ren, Zheng-Long

    2007-03-01

    Nine populations of Rhodiola crenulata, R. fastigiata and R. sachalinensis (Crassulaceae) from Sichuan and Jilin Provinces of China were characterized genetically using conserved primers for nad7 intron 2. All PCR products were about 800 bp long, shorter than those of other Crassulaceae plants, and were used as molecular markers to identify the Rhodiola species. Sequencing of the products indicated that the exon (53 bp in total) and the intron (738 bp) exhibit only 9 nucleotide variations. BLAST searches of the nad7 sequences against GenBank and phylogenetic analysis showed that the sequences of the Rhodiola species clustered independently and were shorter than all the registered sequences of higher plants. The result suggests that the Rhodiola species have a unique sequence in this gene region, which might be related to their special growth conditions.

  6. Application of Inter-Simple Sequence Repeat Markers in the Analysis of Populations of the Chagas Disease Vector Triatoma infestans (Hemiptera, Reduviidae)

    PubMed Central

    Pérez de Rosas, Alicia R.; Restelli, María F.; Fernández, Cintia J.; Blariza, María J.; García, Beatriz A.

    2017-01-01

    Here we apply inter-simple sequence repeat (ISSR) markers to explore the fine-scale genetic structure and dispersal in populations of Triatoma infestans. Five selected primers from 30 primers were used to amplify ISSRs by polymerase chain reaction. A total of 90 polymorphic bands were detected across 134 individuals captured from 11 peridomestic sites from the locality of San Martín (Capayán Department, Catamarca Province, Argentina). Significant levels of genetic differentiation suggest limited gene flow among sampling sites. Spatial autocorrelation analysis confirms that dispersal occurs on the scale of ∼469 m, suggesting that insecticide spraying should be extended at least within a radius of ∼500 m around the infested area. Moreover, Bayesian clustering algorithms indicated genetic exchange among different sites analyzed, supporting the hypothesis of an important role of peridomestic structures in the process of reinfestation. PMID:28115670

  7. First report of the complete sequence of Sida golden yellow vein virus from Jamaica.

    PubMed

    Stewart, Cheryl S; Kon, Tatsuya; Gilbertson, Robert L; Roye, Marcia E

    2011-08-01

    Begomoviruses are phytopathogens that threaten food security [18]. Sida spp. are ubiquitous weed species found in Jamaica. Sida samples were collected island-wide, DNA was extracted via a modified Dellaporta method, and the viral genome was amplified using degenerate and sequence-specific primers [2, 11]. The amplicons were cloned and sequenced. Sequence analysis revealed that a DNA-A molecule isolated from a plant in Liguanea, St. Andrew, was 90.9% similar to Sida golden yellow vein virus-[United States of America:Homestead:A11], making it a strain of SiGYVV. It was named Sida golden yellow vein virus-[Jamaica:Liguanea 2:2008] (SiGYVV-[JM:Lig2:08]). The cognate DNA-B, previously unreported, was successfully cloned and was most similar to that of Malvastrum yellow mosaic Jamaica virus (MaYMJV). Phylogenetic analysis suggested that this virus was most closely related to begomoviruses that infect malvaceous hosts in Jamaica, Cuba and Florida in the United States.

  8. Uniform, optimal signal processing of mapped deep-sequencing data.

    PubMed

    Kumar, Vibhor; Muratani, Masafumi; Rayan, Nirmala Arul; Kraus, Petra; Lufkin, Thomas; Ng, Huck Hui; Prabhakar, Shyam

    2013-07-01

    Despite their apparent diversity, many problems in the analysis of high-throughput sequencing data are merely special cases of two general problems, signal detection and signal estimation. Here we adapt formally optimal solutions from signal processing theory to analyze signals of DNA sequence reads mapped to a genome. We describe DFilter, a detection algorithm that identifies regulatory features in ChIP-seq, DNase-seq and FAIRE-seq data more accurately than assay-specific algorithms. We also describe EFilter, an estimation algorithm that accurately predicts mRNA levels from as few as 1-2 histone profiles (R ∼0.9). Notably, the presence of regulatory motifs in promoters correlates more with histone modifications than with mRNA levels, suggesting that histone profiles are more predictive of cis-regulatory mechanisms. We show by applying DFilter and EFilter to embryonic forebrain ChIP-seq data that regulatory protein identification and functional annotation are feasible despite tissue heterogeneity. The mathematical formalism underlying our tools facilitates integrative analysis of data from virtually any sequencing-based functional profile.

  9. Hepatitis C Virus Antigenic Convergence

    PubMed Central

    Campo, David S.; Dimitrova, Zoya; Yokosawa, Jonny; Hoang, Duc; Perez, Nestor O.; Ramachandran, Sumathi; Khudyakov, Yury

    2012-01-01

    Vaccine development against hepatitis C virus (HCV) is hindered by poor understanding of factors defining cross-immunoreactivity among heterogeneous epitopes. Using synthetic peptides and mouse immunization as a model, we conducted a quantitative analysis of cross-immunoreactivity among variants of the HCV hypervariable region 1 (HVR1). Analysis of 26,883 immunological reactions among pairs of peptides showed that the distribution of cross-immunoreactivity among HVR1 variants was skewed, with antibodies against a few variants reacting with all tested peptides. The HVR1 cross-immunoreactivity was accurately modeled based on amino acid sequence alone. The tested peptides were mapped in the HVR1 sequence space, which was visualized as a network of 11,319 sequences. The HVR1 variants with a greater network centrality showed a broader cross-immunoreactivity. The entire sequence space is explored by each HCV genotype and subtype. These findings indicate that HVR1 antigenic diversity is extensively convergent and effectively limited, suggesting significant implications for vaccine development. PMID:22355779

  10. Complete Nucleotide Sequence of Watermelon Chlorotic Stunt Virus Originating from Oman

    PubMed Central

    Khan, Akhtar J.; Akhtar, Sohail; Briddon, Rob W.; Ammara, Um; Al-Matrooshi, Abdulrahman M.; Mansoor, Shahid

    2012-01-01

    Watermelon chlorotic stunt virus (WmCSV) is a bipartite begomovirus (genus Begomovirus, family Geminiviridae) that causes economic losses to cucurbits, particularly watermelon, across the Middle East and North Africa. Recently squash (Cucurbita moschata) grown in an experimental field in Oman was found to display symptoms such as leaf curling, yellowing and stunting, typical of a begomovirus infection. Sequence analysis of the virus isolated from squash showed 97.6–99.9% nucleotide sequence identity to previously described WmCSV isolates for the DNA A component and 93–98% identity for the DNA B component. Agrobacterium-mediated inoculation to Nicotiana benthamiana resulted in the development of symptoms fifteen days post inoculation. This is the first bipartite begomovirus identified in Oman. Overall the Oman isolate showed the highest levels of sequence identity to a WmCSV isolate originating from Iran, which was confirmed by phylogenetic analysis. This suggests that WmCSV present in Oman has been introduced from Iran. The significance of this finding is discussed. PMID:22852046

  11. Complete nucleotide sequence of watermelon chlorotic stunt virus originating from Oman.

    PubMed

    Khan, Akhtar J; Akhtar, Sohail; Briddon, Rob W; Ammara, Um; Al-Matrooshi, Abdulrahman M; Mansoor, Shahid

    2012-07-01

    Watermelon chlorotic stunt virus (WmCSV) is a bipartite begomovirus (genus Begomovirus, family Geminiviridae) that causes economic losses to cucurbits, particularly watermelon, across the Middle East and North Africa. Recently squash (Cucurbita moschata) grown in an experimental field in Oman was found to display symptoms such as leaf curling, yellowing and stunting, typical of a begomovirus infection. Sequence analysis of the virus isolated from squash showed 97.6-99.9% nucleotide sequence identity to previously described WmCSV isolates for the DNA A component and 93-98% identity for the DNA B component. Agrobacterium-mediated inoculation to Nicotiana benthamiana resulted in the development of symptoms fifteen days post inoculation. This is the first bipartite begomovirus identified in Oman. Overall the Oman isolate showed the highest levels of sequence identity to a WmCSV isolate originating from Iran, which was confirmed by phylogenetic analysis. This suggests that WmCSV present in Oman has been introduced from Iran. The significance of this finding is discussed.

  12. Identification and phylogenetic diversity of parvovirus circulating in commercial chicken and turkey flocks in Croatia.

    PubMed

    Bidin, M; Lojkić, I; Bidin, Z; Tiljar, M; Majnarić, D

    2011-12-01

    Phylogenetic diversity of parvovirus detected in commercial chicken and turkey flocks is described. Nine chicken and six turkey flocks from Croatian farms were tested for parvovirus presence. Intestinal samples from one turkey and seven chicken flocks were found positive, and were sequenced. Natural parvovirus infection was more frequently detected in chickens than in turkeys examined in this study. Sequence analysis of 400-nucleotide fragments of the nonstructural gene (NS) showed that our sequences had more similarity with chicken parvovirus (ChPV) (92.3%-99.7%) than turkey parvovirus (TuPV) (89.5%-98.9%) strains. Phylogenetic analysis grouped our sequences in two clades. A higher prevalence of ChPV than TuPV was also found in the tested flocks. The necropsy findings suggested a malabsorption syndrome followed by a preascitic condition. Further research of parvovirus infection, pathogenesis, and the possibility of its association with poult enteritis and mortality syndrome (PEMS) and runting and stunting syndrome (RSS) is needed to clarify its significance as an agent of enteric disease.

  13. Novel, non-symbiotic isolates of Neorhizobium from a dryland agricultural soil.

    PubMed

    Soenens, Amalia; Imperial, Juan

    2018-01-01

    Semi-selective enrichment, followed by PCR screening, resulted in the successful direct isolation of fast-growing Rhizobia from a dryland agricultural soil. Over 50% of these isolates belong to the genus Neorhizobium, as concluded from partial rpoB and near-complete 16S rDNA sequence analysis. Further genotypic and genomic analysis of five representative isolates confirmed that they form a coherent group within Neorhizobium, closer to N. galegae than to the remaining Neorhizobium species, but clearly differentiated from the former, and constituting at least one new genomospecies within Neorhizobium. All the isolates lacked nod and nif symbiotic genes but contained a repABC replication/maintenance region, characteristic of rhizobial plasmids, within large contigs from their draft genome sequences. These repABC sequences were related, but not identical, to repABC sequences found in symbiotic plasmids from N. galegae, suggesting that the non-symbiotic isolates have the potential to harbor symbiotic plasmids. This is the first report of non-symbiotic members of Neorhizobium from soil.

  14. Tn5401, a new class II transposable element from Bacillus thuringiensis.

    PubMed Central

    Baum, J A

    1994-01-01

    A new class II (Tn3-like) transposable element, designated Tn5401, was recovered from a sporulation-deficient variant of Bacillus thuringiensis subsp. morrisoni EG2158 following its insertion into a recombinant plasmid. Sequence analysis of the insert revealed a 4,837-bp transposon with two large open reading frames, in the same orientation, encoding proteins of 36 kDa (306 residues) and 116 kDa (1,005 residues) and 53-bp terminal inverted repeats. The deduced amino acid sequence for the 36-kDa protein shows 24% sequence identity with the TnpI recombinase of the B. thuringiensis transposon Tn4430, a member of the phage integrase family of site-specific recombinases. The deduced amino acid sequence for the 116-kDa protein shows 42% sequence identity with the transposase of Tn3 but only 28% identity with the TnpA transposase of Tn4430. Two small open reading frames of unknown function, designated orf1 (85 residues) and orf2 (74 residues), were also identified. Southern blot analysis indicated that Tn5401, in contrast to Tn4430, is not commonly found among different subspecies of B. thuringiensis and is not typically associated with known insecticidal crystal protein genes. Transposition was studied with B. thuringiensis by using plasmid pEG922, a temperature-sensitive shuttle vector containing Tn5401. Tn5401 transposed to both chromosomal and plasmid target sites but displayed an apparent preference for plasmid sites. Transposition was replicative and resulted in the generation of a 5-bp duplication at the target site. Transcriptional start sites within Tn5401 were mapped by primer extension analysis. Two promoters, designated PL and PR, direct the transcription of orf1-orf2 and tnpI-tnpA, respectively, and are negatively regulated by TnpI. Sequence comparison of the promoter regions of Tn5401 and Tn4430 suggests that the conserved sequence element ATGTCCRCTAAY mediates TnpI binding and cointegrate resolution. The same element is contained within the 53-bp terminal inverted repeats, thus accounting for their unusual lengths and suggesting an additional role for TnpI in regulating Tn5401 transposition. PMID:7514590
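    The conserved element ATGTCCRCTAAY in the preceding abstract is written with IUPAC degenerate codes, where R stands for A or G and Y for C or T. The sketch below shows one way such a consensus could be searched for in a sequence. The helper names `motif_to_regex` and `find_motif` and the test sequence are hypothetical, and only the given strand is scanned.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Translate a degenerate consensus such as ATGTCCRCTAAY into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=%s)" % motif_to_regex(motif))
    return [m.start() for m in pattern.finditer(seq.upper())]

if __name__ == "__main__":
    # Made-up sequence with one occurrence of the element starting at index 2.
    print(find_motif("GGATGTCCACTAACTT", "ATGTCCRCTAAY"))
```

    To cover both strands, the same search could be repeated on the reverse complement of the sequence.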

  15. A sequence-based survey of the complex structural organization of tumor genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  16. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    PubMed Central

    Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C

    2003-01-01

    Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626

  17. Communities of archaea and bacteria in a subsurface radioactive thermal spring in the Austrian Central Alps, and evidence of ammonia-oxidizing Crenarchaeota.

    PubMed

    Weidler, Gerhard W; Dornmayr-Pfaffenhuemer, Marion; Gerbl, Friedrich W; Heinen, Wolfgang; Stan-Lotter, Helga

    2007-01-01

    Scanning electron microscopy revealed great morphological diversity in biofilms from several largely unexplored subterranean thermal Alpine springs, which contain radium 226 and radon 222. A culture-independent molecular analysis of microbial communities on rocks and in the water of one spring, the "Franz-Josef-Quelle" in Bad Gastein, Austria, was performed. Four hundred fifteen clones were analyzed. One hundred thirty-two sequences were affiliated with 14 bacterial operational taxonomic units (OTUs) and 283 with four archaeal OTUs. Rarefaction analysis indicated a high diversity of bacterial sequences, while archaeal sequences were less diverse. The majority of the cloned archaeal 16S rRNA gene sequences belonged to the soil-freshwater-subsurface (1.1b) crenarchaeotic group; other representatives belonged to the freshwater-wastewater-soil (1.3b) group, except one clone, which was related to a group of uncultivated Euryarchaeota. These findings support recent reports that Crenarchaeota are not restricted to high-temperature environments. Most of the bacterial sequences were related to the Proteobacteria (alpha, beta, gamma, and delta), Bacteroidetes, and Planctomycetes. One OTU was allied with Nitrospina sp. (delta-Proteobacteria) and three others grouped with Nitrospira. Statistical analyses suggested high diversity based on 16S rRNA gene analyses; the rarefaction plot of archaeal clones showed a plateau. Since Crenarchaeota have been implicated recently in the nitrogen cycle, the spring environment was probed for the presence of the ammonia monooxygenase subunit A (amoA) gene. Sequences were obtained which were related to crenarchaeotic amoA genes from marine and soil habitats. The data suggested that nitrification processes are occurring in the subterranean environment and that ammonia may possibly be an energy source for the resident communities.
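    Rarefaction analysis, as mentioned in the preceding abstract, estimates how many OTUs would be expected in a random subsample of n clones; a curve that flattens suggests the library has sampled most of the diversity. The sketch below uses the standard expectation E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)]. The per-OTU clone counts are invented for illustration (the abstract reports only totals, e.g. 283 archaeal clones in four OTUs), and the function names are hypothetical.

```python
from math import comb

def rarefied_richness(otu_counts, n):
    """Expected number of OTUs in a random subsample of n clones drawn without replacement."""
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must be between 0 and the library size")
    # Each OTU contributes the probability that at least one of its clones is drawn.
    return sum(1 - comb(total - c, n) / comb(total, n) for c in otu_counts)

def rarefaction_curve(otu_counts, step=10):
    """Return (subsample size, expected OTU count) pairs up to the full library."""
    total = sum(otu_counts)
    sizes = list(range(0, total, step)) + [total]
    return [(n, rarefied_richness(otu_counts, n)) for n in sizes]

if __name__ == "__main__":
    # Hypothetical split of 283 clones over 4 OTUs; not data from the study.
    for n, expected in rarefaction_curve([250, 25, 7, 1], step=50):
        print(n, round(expected, 2))
```

    With a library dominated by one OTU, as in this made-up example, the expected count approaches the total quickly except for the rarest OTU, which is the kind of plateau the abstract describes for the archaeal clones.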

  18. Molecular Population Genetics of the Alcohol Dehydrogenase Gene Region of Drosophila melanogaster

    PubMed Central

    Aquadro, Charles F.; Desse, Susan F.; Bland, Molly M.; Langley, Charles H.; Laurie-Ahlberg, Cathy C.

    1986-01-01

    Variation in the DNA restriction map of a 13-kb region of chromosome II including the alcohol dehydrogenase structural gene (Adh) was examined in Drosophila melanogaster from natural populations. Detailed analysis of 48 D. melanogaster lines representing four eastern United States populations revealed extensive DNA sequence variation due to base substitutions, insertions and deletions. Cloning of this region from several lines allowed characterization of length variation as due to unique sequence insertions or deletions [nine sizes; 21–200 base pairs (bp)] or transposable element insertions (several sizes, 340 bp to 10.2 kb, representing four different elements). Despite this extensive variation in sequences flanking the Adh gene, only one length polymorphism is clearly associated with altered Adh expression (a copia element approximately 250 bp 5' to the distal transcript start site). Nonetheless, the frequency spectra of transposable elements within and between Drosophila species suggest they are slightly deleterious. Strong nonrandom associations are observed among Adh region sequence variants, ADH allozyme (Fast vs. Slow), ADH enzyme activity and the chromosome inversion In(2L)t. Phylogenetic analysis of restriction map haplotypes suggests that the major twofold component of ADH activity variation (high vs. low, typical of Fast and Slow allozymes, respectively) is due to sequence variation tightly linked to and possibly distinct from that underlying the allozyme difference. The patterns of nucleotide and haplotype variation for Fast and Slow allozyme lines are consistent with the recent increase in frequency and spread of the Fast haplotype associated with high ADH activity. These data emphasize the important role of evolutionary history and strong nonrandom associations among tightly linked sequence variation as determinants of the patterns of variation observed in natural populations. PMID:3026893

  19. Prevalence and genome characteristics of canine astrovirus in southwest China.

    PubMed

    Li, Mingxiang; Yan, Nan; Ji, Conghui; Wang, Min; Zhang, Bin; Yue, Hua; Tang, Cheng

    2018-05-30

    The aim of this study was to investigate canine astrovirus (CaAstV) infection in southwest China. We collected 107 faecal samples from domestic dogs with obvious diarrhoea. Forty-two diarrhoeic samples (39.3 %) were positive for CaAstV by RT-PCR, and 41/42 samples showed co-infection with canine coronavirus (CCoV), canine parvovirus-2 (CPV-2) and canine distemper virus (CDV). Phylogenetic analysis based on 26 CaAstV partial ORF1a and ORF1b sequences revealed that most CaAstV strains showed unique evolutionary features. Interestingly, putative recombination events were observed among four of the five complete ORF2 sequences cloned in this study, and three of the five complete ORF2 sequences formed a single unique group, suggesting that these strains could be a novel genotype. We successfully sequenced the complete genome of one CaAstV strain (designated 2017/44/CHN), which was 6628 nt in length. The features of this genome include putative recombination events in the ORF1a, ORF1b and ORF2 genes, while the ORF2 gene had a continuous insertion of 7 aa in region II compared with the other complete ORF2 sequences available in GenBank. Phylogenetic analysis showed that 2017/44/CHN formed a single group based on genome sequences, suggesting that this strain might be a novel genotype. The results of this study revealed that CaAstV circulates widely in diarrhoeic dogs in southwest China and exhibits unique evolutionary events. To the best of our knowledge, this is the first report of recombination events in CaAstV, and it contributes to further understanding of the genetic evolution of CaAstV.

  20. Evolutionary insight into the ionotropic glutamate receptor superfamily of photosynthetic organisms.

    PubMed

    De Bortoli, Sara; Teardo, Enrico; Szabò, Ildikò; Morosinotto, Tomas; Alboresi, Alessandro

    2016-11-01

    Photosynthetic eukaryotes have a complex evolutionary history shaped by multiple endosymbiosis events that required a tight coordination between the organelles and the rest of the cell. Plant ionotropic glutamate receptors (iGLRs) form a large superfamily of proteins with a predicted or proven non-selective cation channel activity regulated by a broad range of amino acids. They are involved in different physiological processes such as C/N sensing, resistance against fungal infection, root and pollen tube growth and response to wounding and pathogens. Most of the present knowledge is limited to iGLRs located in plasma membranes. However, recent studies localized different iGLR isoforms to mitochondria and/or chloroplasts, suggesting the possibility that they play a specific role in bioenergetic processes. In this work, we performed a comparative analysis of GLR sequences from bacteria and various photosynthetic eukaryotes. In particular, novel types of selectivity filters of bacteria are reported, adding new examples of the great diversity of the GLR superfamily. The highest variability in GLR sequences was found among the algal sequences (cryptophytes, diatoms, brown and green algae). GLRs of land plants are not closely related to the GLRs of green algae analyzed in this work. The GLR family underwent a great expansion in vascular plants. Among plant GLRs, Clade III includes sequences from Physcomitrella patens, Marchantia polymorpha and gymnosperms and can be considered the most ancient, while other clades likely emerged later. In silico analysis allowed the identification of sequences with a putative target to organelles. Sequences with a predicted localization to mitochondria and chloroplasts are randomly distributed among different types of GLRs, suggesting that no compartment-related specific function has been maintained across the species. Copyright © 2016 Elsevier B.V. All rights reserved.

  1. MHC class I loci of the Bar-Headed goose (Anser indicus)

    PubMed Central

    2010-01-01

    MHC class I proteins mediate functions in anti-pathogen defense. MHC diversity has already been investigated by many studies in model avian species, but here we chose the bar-headed goose, a worldwide migrant bird, as a non-model avian species. Sequences from exons encoding the peptide-binding region (PBR) of MHC class I molecules were isolated from liver genomic DNA, to investigate variation in these genes. These are the first MHC class I partial sequences of the bar-headed goose to be reported. A preliminary analysis suggests the presence of at least four MHC class I genes, which share great similarity with those of the goose and duck. A phylogenetic analysis of bar-headed goose, goose and duck MHC class I sequences using the NJ method supports the idea that they all cluster within the anseriforms clade. PMID:21637434

  2. Improved diagnostic yield compared with targeted gene sequencing panels suggests a role for whole-genome sequencing as a first-tier genetic test

    PubMed Central

    Lionel, Anath C; Costain, Gregory; Monfared, Nasim; Walker, Susan; Reuter, Miriam S; Hosseini, S Mohsen; Thiruvahindrapuram, Bhooma; Merico, Daniele; Jobling, Rebekah; Nalpathamkalam, Thomas; Pellecchia, Giovanna; Sung, Wilson W L; Wang, Zhuozhi; Bikangaga, Peter; Boelman, Cyrus; Carter, Melissa T; Cordeiro, Dawn; Cytrynbaum, Cheryl; Dell, Sharon D; Dhir, Priya; Dowling, James J; Heon, Elise; Hewson, Stacy; Hiraki, Linda; Inbar-Feigenberg, Michal; Klatt, Regan; Kronick, Jonathan; Laxer, Ronald M; Licht, Christoph; MacDonald, Heather; Mercimek-Andrews, Saadet; Mendoza-Londono, Roberto; Piscione, Tino; Schneider, Rayfel; Schulze, Andreas; Silverman, Earl; Siriwardena, Komudi; Snead, O Carter; Sondheimer, Neal; Sutherland, Joanne; Vincent, Ajoy; Wasserman, Jonathan D; Weksberg, Rosanna; Shuman, Cheryl; Carew, Chris; Szego, Michael J; Hayeems, Robin Z; Basran, Raveen; Stavropoulos, Dimitri J; Ray, Peter N; Bowdin, Sarah; Meyn, M Stephen; Cohn, Ronald D; Scherer, Stephen W; Marshall, Christian R

    2018-01-01

    Purpose: Genetic testing is an integral diagnostic component of pediatric medicine. Standard of care is often a time-consuming stepwise approach involving chromosomal microarray analysis and targeted gene sequencing panels, which can be costly and inconclusive. Whole-genome sequencing (WGS) provides a comprehensive testing platform that has the potential to streamline genetic assessments, but there are limited comparative data to guide its clinical use. Methods: We prospectively recruited 103 patients from pediatric non-genetic subspecialty clinics, each with a clinical phenotype suggestive of an underlying genetic disorder, and compared the diagnostic yield and coverage of WGS with those of conventional genetic testing. Results: WGS identified diagnostic variants in 41% of individuals, representing a significant increase over conventional testing results (24%; P = 0.01). Genes clinically sequenced in the cohort (n = 1,226) were well covered by WGS, with a median exonic coverage of 40 × ±8 × (mean ±SD). All the molecular diagnoses made by conventional methods were captured by WGS. The 18 new diagnoses made with WGS included structural and non-exonic sequence variants not detectable with whole-exome sequencing, and confirmed recent disease associations with the genes PIGG, RNU4ATAC, TRIO, and UNC13A. Conclusion: WGS as a primary clinical test provided a higher diagnostic yield than conventional genetic testing in a clinically heterogeneous cohort. PMID:28771251

  3. Improved diagnostic yield compared with targeted gene sequencing panels suggests a role for whole-genome sequencing as a first-tier genetic test.

    PubMed

    Lionel, Anath C; Costain, Gregory; Monfared, Nasim; Walker, Susan; Reuter, Miriam S; Hosseini, S Mohsen; Thiruvahindrapuram, Bhooma; Merico, Daniele; Jobling, Rebekah; Nalpathamkalam, Thomas; Pellecchia, Giovanna; Sung, Wilson W L; Wang, Zhuozhi; Bikangaga, Peter; Boelman, Cyrus; Carter, Melissa T; Cordeiro, Dawn; Cytrynbaum, Cheryl; Dell, Sharon D; Dhir, Priya; Dowling, James J; Heon, Elise; Hewson, Stacy; Hiraki, Linda; Inbar-Feigenberg, Michal; Klatt, Regan; Kronick, Jonathan; Laxer, Ronald M; Licht, Christoph; MacDonald, Heather; Mercimek-Andrews, Saadet; Mendoza-Londono, Roberto; Piscione, Tino; Schneider, Rayfel; Schulze, Andreas; Silverman, Earl; Siriwardena, Komudi; Snead, O Carter; Sondheimer, Neal; Sutherland, Joanne; Vincent, Ajoy; Wasserman, Jonathan D; Weksberg, Rosanna; Shuman, Cheryl; Carew, Chris; Szego, Michael J; Hayeems, Robin Z; Basran, Raveen; Stavropoulos, Dimitri J; Ray, Peter N; Bowdin, Sarah; Meyn, M Stephen; Cohn, Ronald D; Scherer, Stephen W; Marshall, Christian R

    2018-04-01

    Purpose: Genetic testing is an integral diagnostic component of pediatric medicine. Standard of care is often a time-consuming stepwise approach involving chromosomal microarray analysis and targeted gene sequencing panels, which can be costly and inconclusive. Whole-genome sequencing (WGS) provides a comprehensive testing platform that has the potential to streamline genetic assessments, but there are limited comparative data to guide its clinical use. Methods: We prospectively recruited 103 patients from pediatric non-genetic subspecialty clinics, each with a clinical phenotype suggestive of an underlying genetic disorder, and compared the diagnostic yield and coverage of WGS with those of conventional genetic testing. Results: WGS identified diagnostic variants in 41% of individuals, representing a significant increase over conventional testing results (24%; P = 0.01). Genes clinically sequenced in the cohort (n = 1,226) were well covered by WGS, with a median exonic coverage of 40 × ±8 × (mean ±SD). All the molecular diagnoses made by conventional methods were captured by WGS. The 18 new diagnoses made with WGS included structural and non-exonic sequence variants not detectable with whole-exome sequencing, and confirmed recent disease associations with the genes PIGG, RNU4ATAC, TRIO, and UNC13A. Conclusion: WGS as a primary clinical test provided a higher diagnostic yield than conventional genetic testing in a clinically heterogeneous cohort.

  4. Mosaic CREBBP mutation causes overlapping clinical features of Rubinstein-Taybi and Filippi syndromes.

    PubMed

    de Vries, Tamar I; Monroe, Glen R; van Belzen, Martine J; van der Lans, Christian A; Savelberg, Sanne Mc; Newman, William G; van Haaften, Gijs; Nievelstein, Rutger A; van Haelst, Mieke M

    2016-08-01

    Rubinstein-Taybi syndrome (RTS, OMIM 180849) and Filippi syndrome (FLPIS, OMIM 272440) are both rare syndromes, with multiple congenital anomalies and intellectual deficit (MCA/ID). We present a patient with intellectual deficit, short stature, bilateral syndactyly of hands and feet, broad thumbs, ocular abnormalities, and dysmorphic facial features. These clinical features suggest both RTS and FLPIS. Initial analysis of DNA isolated from blood did not identify variants to confirm either of these syndrome diagnoses. Whole-exome sequencing identified a homozygous variant in C9orf173, which was novel at the time of analysis. Further Sanger sequencing analysis of FLPIS cases that had tested negative for CKAP2L variants did not, however, reveal any further variants. Subsequent analysis using DNA isolated from buccal mucosa revealed a mosaic variant in CREBBP. This report highlights the importance of excluding mosaic variants in patients with a strong but atypical clinical presentation of an MCA/ID syndrome if no disease-causing variants can be detected in DNA isolated from blood samples. As the striking syndactyly observed in the present case is typical for FLPIS, we suggest CREBBP analysis in saliva samples for FLPIS cases in which no causal CKAP2L variant is detected.

  5. Granulometry of pebble beach ridges in Fort Williams Point, Greenwich Island, Antarctic Peninsula; a possible result from Holocene climate fluctuations

    USGS Publications Warehouse

    Santana, E.; Dumont, J.F.

    2007-01-01

    We present a granulometric study of emerged pebble beach ridges at Fort Williams Point, Greenwich Island, Antarctic Peninsula. We studied 8 beach ridges from the shore up to 13.5 m above current sea level. The beach ridges are made of volcanic material from the surrounding relief, but also include glacially transported gneiss and granodiorite pebbles and cobbles. Based on granulometric distribution analysis of 2100 samples from 39 locations, we identified evidence of 4 sequences of 1 to 3 ridges. Most of the material seems to be reworked from a till. Pavement formation by icebergs between the sequences of beach ridges suggests periods of lower temperature. The interpretation suggests that sequences of beach ridge construction formed during warmer periods of the late Holocene. This occurs in the framework of an isostatic postglacial uplift allowing the progressive mobilization of periglacial material.

  6. Extensive concerted evolution of rice paralogs and the road to regaining independence.

    PubMed

    Wang, Xiyin; Tang, Haibao; Bowers, John E; Feltus, Frank A; Paterson, Andrew H

    2007-11-01

    Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the approximately 0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, approximately 8% of japonica paralogs produced 5-7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while approximately 70-MY-old "paleologs" resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice-sorghum divergence approximately 41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity--that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5-7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization.

  7. Construction of a full-length cDNA Library from Chinese oak silkworm pupa and identification of a KK-42-binding protein gene in relation to pupa-diapause termination.

    PubMed

    Li, Yu-Ping; Xia, Run-Xi; Wang, Huan; Li, Xi-Sheng; Liu, Yan-Qun; Wei, Zhao-Jun; Lu, Cheng; Xiang, Zhong-Huai

    2009-06-24

    In this study we successfully constructed a full-length cDNA library from Chinese oak silkworm, Antheraea pernyi, the most well-known wild silkworm used for silk production and insect food. Total RNA was extracted from a single fresh female pupa at the diapause stage. The titer of the library was 5 × 10⁵ cfu/ml and the proportion of recombinant clones was approximately 95%. Expressed sequence tag (EST) analysis was used to characterize the library. A total of 175 clustered ESTs consisting of 24 contigs and 151 singlets were generated from 250 effective sequences. Of the 175 unigenes, 97 (55.4%) were known genes but only five from A. pernyi, 37 (21.2%) were known ESTs without function annotation, and 41 (23.4%) were novel ESTs. By EST sequencing, a gene coding KK-42-binding protein in A. pernyi (named as ApKK42-BP; GenBank accession no. FJ744151) was identified and characterized. Protein sequence analysis showed that ApKK42-BP was not a membrane protein but an extracellular protein with a signal peptide at position 1-18, and contained two putative conserved domains, abhydro_lipase and abhydrolase_1, suggesting it may be a member of lipase superfamily. Expression analysis based on number of ESTs showed that ApKK42-BP was an abundant gene in the period of diapause stage, suggesting it may also be involved in pupa-diapause termination.

  8. Construction of a full-length cDNA Library from Chinese oak silkworm pupa and identification of a KK-42-binding protein gene in relation to pupa-diapause termination

    PubMed Central

    Li, Yu-Ping; Xia, Run-Xi; Wang, Huan; Li, Xi-Sheng; Liu, Yan-Qun; Wei, Zhao-Jun; Lu, Cheng; Xiang, Zhong-Huai

    2009-01-01

    In this study we successfully constructed a full-length cDNA library from Chinese oak silkworm, Antheraea pernyi, the most well-known wild silkworm used for silk production and insect food. Total RNA was extracted from a single fresh female pupa at the diapause stage. The titer of the library was 5 × 10⁵ cfu/ml and the proportion of recombinant clones was approximately 95%. Expressed sequence tag (EST) analysis was used to characterize the library. A total of 175 clustered ESTs consisting of 24 contigs and 151 singlets were generated from 250 effective sequences. Of the 175 unigenes, 97 (55.4%) were known genes but only five from A. pernyi, 37 (21.2%) were known ESTs without function annotation, and 41 (23.4%) were novel ESTs. By EST sequencing, a gene coding KK-42-binding protein in A. pernyi (named as ApKK42-BP; GenBank accession no. FJ744151) was identified and characterized. Protein sequence analysis showed that ApKK42-BP was not a membrane protein but an extracellular protein with a signal peptide at position 1-18, and contained two putative conserved domains, abhydro_lipase and abhydrolase_1, suggesting it may be a member of lipase superfamily. Expression analysis based on number of ESTs showed that ApKK42-BP was an abundant gene in the period of diapause stage, suggesting it may also be involved in pupa-diapause termination. PMID:19564928

  9. Routine HLA-B genotyping with PCR-sequence-specific oligonucleotides detects a B*52 variant (B*5206).

    PubMed

    Hoelsch, K; Lenggeler, I; Pfannes, W; Knabe, H; Klein, H-G; Woelpl, A

    2005-05-01

    A new human leukocyte antigen (HLA)-B allele was found during routine typing of samples for a German unrelated bone marrow donor registry, the "Aktion Knochenmarkspende Bayern". Initial interpretation of data from two independent low-resolution sequence-specific oligonucleotide typing tests suggested a B*51 variant. Further analysis via sequence-based typing identified the sequence as a new B*52 allele. This new allele, officially assigned as B*5206, differs from HLA-B*520102 by one nucleotide exchange in exon 2. The mutation is located at nucleotide position 274, at which a cytosine is substituted by a thymine, leading to an amino acid change at protein position 67 from serine (TCC) to phenylalanine (TTC).
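
    The codon change reported above can be checked with a minimal sketch. It assumes only the standard genetic code for the two codons named in the abstract; the helper name substitute_base and the two-entry table are illustrative and are not taken from the study.

```python
# Standard genetic code entries for the two codons named in the abstract.
# (Only these two are included; this is not a full codon table.)
CODON_TABLE = {"TCC": "Ser", "TTC": "Phe"}


def substitute_base(codon: str, index: int, base: str) -> str:
    """Return a copy of `codon` with the base at 0-based `index` replaced."""
    return codon[:index] + base + codon[index + 1:]


reference_codon = "TCC"
# A cytosine-to-thymine exchange at the middle base of the codon.
variant_codon = substitute_base(reference_codon, 1, "T")

print(reference_codon, CODON_TABLE[reference_codon])
print(variant_codon, CODON_TABLE[variant_codon])
```

    The sketch only illustrates why a single C-to-T exchange turns a serine codon into a phenylalanine codon; it does not model the exon coordinates or the allele sequences themselves.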

  10. What is a melody? On the relationship between pitch and brightness of timbre.

    PubMed

    Cousineau, Marion; Carcagno, Samuele; Demany, Laurent; Pressnitzer, Daniel

    2013-01-01

    Previous studies showed that the perceptual processing of sound sequences is more efficient when the sounds vary in pitch than when they vary in loudness. We show here that sequences of sounds varying in brightness of timbre are processed with the same efficiency as pitch sequences. The sounds used consisted of two simultaneous pure tones one octave apart, and the listeners' task was to make same/different judgments on pairs of sequences varying in length (one, two, or four sounds). In one condition, brightness of timbre was varied within the sequences by changing the relative level of the two pure tones. In other conditions, pitch was varied by changing fundamental frequency, or loudness was varied by changing the overall level. In all conditions, only two possible sounds could be used in a given sequence, and these two sounds were equally discriminable. When sequence length increased from one to four, discrimination performance decreased substantially for loudness sequences, but to a smaller extent for brightness sequences and pitch sequences. In the latter two conditions, sequence length had a similar effect on performance. These results suggest that the processes dedicated to pitch and brightness analysis, when probed with a sequence-discrimination task, share unexpected similarities.

  11. Acyl carrier protein structural classification and normal mode analysis

    PubMed Central

    Cantu, David C; Forrester, Michael J; Charov, Katherine; Reilly, Peter J

    2012-01-01

    All acyl carrier protein primary and tertiary structures were gathered into the ThYme database. They are classified into 16 families by amino acid sequence similarity, with members of the different families having sequences with statistically highly significant differences. These classifications are supported by tertiary structure superposition analysis. Tertiary structures from a number of families are very similar, suggesting that these families may come from a single distant ancestor. Normal vibrational mode analysis was conducted on experimentally determined freestanding structures, showing greater fluctuations at chain termini and loops than in most helices. Their modes overlap more within families than between different families. The tertiary structures of three acyl carrier protein families that lacked any known structures were predicted as well. PMID:22374859

  12. [Detection of the mitochondrial DNA haplotype characteristic of the least cisco (Coregonus sardinella, Valenciennes, 1848) in the vendace (C. albula, Linnaeus, 1758) population of Vodlozero (the Baltic Sea basin)].

    PubMed

    Borovikova, E A; Makhrov, A A

    2009-01-01

    Analysis of the nucleotide sequence of the mitochondrial ND-1 gene in the vendace population in Lake Vodlozero (the eastern part of the Baltic Sea basin) revealed a sequence variant that is closely related to that of the least cisco of Siberia (the Indigirka River). Thus, together with the results of morphological and allozyme analysis of this population performed earlier, the results obtained in this study are suggestive of the immigration of the least cisco to the Baltic Sea basin during the last glaciation.

  13. Common features and peculiarities of the seismic activity at Phlegraean Fields, Long Valley, and Vesuvius

    USGS Publications Warehouse

    Marzocchi, W.; Vilardo, G.; Hill, D.P.; Ricciardi, G.P.; Ricco, C.

    2001-01-01

    We analyzed and compared the seismic activity that has occurred in the last two to three decades in three distinct volcanic areas: Phlegraean Fields, Italy; Vesuvius, Italy; and Long Valley, California. Our main goal is to identify and discuss common features and peculiarities in the temporal evolution of earthquake sequences that may reflect similarities and differences in the generating processes between these volcanic systems. In particular, we tried to characterize the time series of the number of events and of the seismic energy release in terms of stochastic, deterministic, and chaotic components. The time sequences from each area consist of thousands of earthquakes that allow a detailed quantitative analysis and comparison. The results obtained showed no evidence for either deterministic or chaotic components in the earthquake sequences in Long Valley caldera, which appears to be dominated by stochastic behavior. In contrast, earthquake sequences at Phlegraean Fields and Mount Vesuvius show a deterministic signal mainly consisting of a 24-hour periodicity. Our analysis suggests that the modulation in seismicity is in some way related to thermal diurnal processes, rather than luni-solar tidal effects. Independently of the process that generates these periodicities in the seismicity, it is suggested that the lack (or presence) of diurnal cycles in seismic swarms of volcanic areas could be closely linked to the presence (or lack) of magma motion.

  14. Quantifying transfer after perceptual-motor sequence learning: how inflexible is implicit learning?

    PubMed

    Sanchez, Daniel J; Yarnik, Eric N; Reber, Paul J

    2015-03-01

    Studies of implicit perceptual-motor sequence learning have often shown learning to be inflexibly tied to the training conditions during learning. Since sequence learning is seen as a model task of skill acquisition, limits on the ability to transfer knowledge from the training context to a performance context indicate important constraints on skill learning approaches. Lack of transfer across contexts has been demonstrated by showing that when task elements are changed following training, this leads to a disruption in performance. These results have typically been taken as suggesting that the sequence knowledge relies on integrated representations across task elements (Abrahamse, Jiménez, Verwey, & Clegg, Psychon Bull Rev 17:603-623, 2010a). Using a relatively new sequence learning task, serial interception sequence learning, three experiments are reported that quantify this magnitude of performance disruption after selectively manipulating individual aspects of motor performance or perceptual information. In Experiment 1, selective disruption of the timing or order of sequential actions was examined using a novel response manipulandum that allowed for separate analysis of these two motor response components. In Experiments 2 and 3, transfer was examined after selective disruption of perceptual information that left the motor response sequence intact. All three experiments provided quantifiable estimates of partial transfer to novel contexts that suggest some level of information integration across task elements. However, the ability to identify quantifiable levels of successful transfer indicates that integration is not all-or-none and that measurement sensitivity is key to understanding sequence knowledge representations.

  15. Exploring fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing

    NASA Astrophysics Data System (ADS)

    Zhang, Xiao-Yong; Wang, Guang-Hua; Xu, Xin-Ya; Nong, Xu-Hua; Wang, Jie; Amin, Muhammad; Qi, Shu-Hua

    2016-10-01

    The present study investigated the fungal diversity in four different deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing of the nuclear ribosomal internal transcribed spacer-1 (ITS1). A total of 40,297 fungal ITS1 sequences, clustered into 420 operational taxonomic units (OTUs) at 97% sequence similarity and 170 taxa, were recovered from these sediments. Most ITS1 sequences (78%) belonged to the phylum Ascomycota, followed by Basidiomycota (17.3%), Zygomycota (1.5%) and Chytridiomycota (0.8%), and a small proportion (2.4%) belonged to unassigned fungal phyla. Compared with previous studies on fungal diversity of sediments from deep-sea environments by culture-dependent approaches and clone library analysis, the present result suggested that Illumina sequencing has dramatically accelerated the discovery of fungal communities in deep-sea sediments. Furthermore, our results revealed that Sordariomycetes was the most diverse and abundant fungal class in this study, challenging the traditional view that the diversity of Sordariomycetes phylotypes is low in deep-sea environments. In addition, more than 12 taxa, accounting for 21.5% of sequences, have rarely been reported as deep-sea fungi, suggesting that the deep-sea sediments from Okinawa Trough harbor a plethora of fungal communities that differ from those of other deep-sea environments. To our knowledge, this study is the first exploration of the fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing.
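
    Grouping sequences into OTUs at a 97% similarity cut-off, as described above, can be illustrated with a minimal sketch. It is an assumption-laden toy, not the pipeline used in the study: it compares only equal-length sequences position by position (no alignment), and the greedy first-match strategy and the names identity and greedy_otus are hypothetical choices made for illustration.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Greedy clustering: each sequence joins the first cluster whose
    centroid (its first member) it matches at >= threshold; otherwise
    it starts a new cluster."""
    clusters = []  # each cluster is a list; clusters[i][0] is its centroid
    for s in seqs:
        for cluster in clusters:
            if identity(s, cluster[0]) >= threshold:
                cluster.append(s)
                break
        else:
            # No existing centroid was similar enough.
            clusters.append([s])
    return clusters


# Toy 100-base reads: one differs from the first at 1 position,
# another at 5 positions.
reads = ["A" * 100, "A" * 99 + "C", "A" * 95 + "C" * 5]
otus = greedy_otus(reads)
print(len(otus), [len(c) for c in otus])
```

    With this input the read that differs at one position (99% identity) joins the first cluster, while the read that differs at five positions (95% identity) falls below the cut-off and founds its own cluster. Real OTU-picking tools align sequences and handle length differences, which this sketch deliberately omits.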

  16. Phylogenetic analysis of dissimilatory Fe(III)-reducing bacteria

    USGS Publications Warehouse

    Lonergan, D.J.; Jenter, H.L.; Coates, J.D.; Phillips, E.J.P.; Schmidt, T.M.; Lovley, D.R.

    1996-01-01

    Evolutionary relationships among strictly anaerobic dissimilatory Fe(III)-reducing bacteria obtained from a diversity of sedimentary environments were examined by phylogenetic analysis of 16S rRNA gene sequences. Members of the genera Geobacter, Desulfuromonas, Pelobacter, and Desulfuromusa formed a monophyletic group within the delta subdivision of the class Proteobacteria. On the basis of their common ancestry and the shared ability to reduce Fe(III) and/or S⁰, we propose that this group be considered a single family, Geobacteraceae. Bootstrap analysis, characteristic nucleotides, and higher-order secondary structures support the division of Geobacteraceae into two subgroups, designated the Geobacter and Desulfuromonas clusters. The genus Desulfuromusa and Pelobacter acidigallici make up a distinct branch with the Desulfuromonas cluster. Several members of the family Geobacteraceae, none of which reduce sulfate, were found to contain the target sequences of probes that have been previously used to define the distribution of sulfate-reducing bacteria and sulfate-reducing bacterium-like microorganisms. The recent isolations of Fe(III)-reducing microorganisms distributed throughout the domain Bacteria suggest that development of 16S rRNA probes that would specifically target all Fe(III) reducers may not be feasible. However, all of the evidence suggests that if a 16S rRNA sequence falls within the family Geobacteraceae, then the organism has the capacity for Fe(III) reduction. The suggestion, based on geological evidence, that Fe(III) reduction was the first globally significant process for oxidizing organic matter back to carbon dioxide is consistent with the finding that acetate-oxidizing Fe(III) reducers are phylogenetically diverse.

  17. Ray Wu as Fifth Business: Deconstructing collective memory in the history of DNA sequencing.

    PubMed

    Onaga, Lisa A

    2014-06-01

    The concept of 'Fifth Business' is used to analyze a minority standpoint and bring serious attention to the role of scientists who play a galvanizing role in a science but for multiple reasons appear less prominently in more common recounts of any particular development. Biochemist Ray Wu (1928-2008) published a DNA sequencing experiment in March 1970 using DNA polymerase catalysis and specific nucleotide labeling, both of which are foundational to general sequencing methods today. The scant mention of Wu's work in textbooks, research articles, and other accounts of DNA sequencing calls into question how scientific collective memory forms. This alternative history seeks to understand why a key figure in nucleic acid sequence analysis has remained less visibly connected or peripheral to solidifying narratives about the history of DNA sequencing. The study resists predictable dismissals of Wu's work in order to seriously examine the formation of his nucleic acid sequence analysis research program and how he shared his knowledge of sequencing during a period of rapid advancement in the field. An analysis of Wu's work on sequencing the cohesive ends of lambda bacteriophage in the 1960s and 1970s exemplifies how a variety of individuals and groups attempted to develop protocols for sequencing the order of nucleotide base pairs comprising DNA. This historical examination of the sociality of scientific research suggests a way to understand how Wu and others contributed to the very collective memory of DNA sequencing that Wu eventually tried to repair. The study of Wu, who was a Chinese immigrant to the United States, provides a foundation for further critical scholarship on the heterogeneous histories of Asian American bioscientists, the sociality of their scientific works, and how the resulting knowledge produced is preserved, if not evenly, in a scientific field's collective memory. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. Phylogenetic Analysis and Epidemic History of Hepatitis C Virus Genotype 2 in Tunisia, North Africa

    PubMed Central

    Rajhi, Mouna; Ghedira, Kais; Chouikha, Anissa; Djebbi, Ahlem; Cheikh, Imed; Ben Yahia, Ahlem; Sadraoui, Amel; Hammami, Walid; Azouz, Msaddek; Ben Mami, Nabil; Triki, Henda

    2016-01-01

    HCV genotype 2 (HCV-2) has a worldwide distribution with prevalence rates that vary from country to country. High genetic diversity and long-term endemicity were suggested in West African countries. A global dispersal of HCV-2 would have occurred during the 20th century, especially in European countries. In Tunisia, genotype 2 was the second most prevalent genotype after genotype 1, and most isolates belong to subtypes 2c and 2k. In this study, phylogenetic analyses based on the NS5B genomic sequences of 113 Tunisian HCV isolates from subtypes 2c and 2k were carried out. A Bayesian coalescent-based framework was used to estimate the origin and the spread of these subtypes circulating in Tunisia. Phylogenetic analyses of HCV-2c sequences suggest the absence of country-specific or time-specific variants. In contrast, the phylogenetic grouping of HCV-2k sequences shows the existence of two major genetic clusters that may represent two distinct circulating variants. Coalescent analysis dated the most recent common ancestor (tMRCA) of Tunisian HCV-2c to around 1886 (1869–1902), before the introduction of HCV-2k in 1901 (1867–1931). Our findings suggest that the introduction of HCV-2c in Tunisia is possibly a result of population movements between Tunisian and European populations following the French colonization. PMID:27100294

  19. Phylogenetic Analysis and Epidemic History of Hepatitis C Virus Genotype 2 in Tunisia, North Africa.

    PubMed

    Rajhi, Mouna; Ghedira, Kais; Chouikha, Anissa; Djebbi, Ahlem; Cheikh, Imed; Ben Yahia, Ahlem; Sadraoui, Amel; Hammami, Walid; Azouz, Msaddek; Ben Mami, Nabil; Triki, Henda

    2016-01-01

    HCV genotype 2 (HCV-2) has a worldwide distribution with prevalence rates that vary from country to country. High genetic diversity and long-term endemicity were suggested in West African countries. A global dispersal of HCV-2 would have occurred during the 20th century, especially in European countries. In Tunisia, genotype 2 was the second most prevalent genotype after genotype 1, and most isolates belong to subtypes 2c and 2k. In this study, phylogenetic analyses based on the NS5B genomic sequences of 113 Tunisian HCV isolates from subtypes 2c and 2k were carried out. A Bayesian coalescent-based framework was used to estimate the origin and the spread of these subtypes circulating in Tunisia. Phylogenetic analyses of HCV-2c sequences suggest the absence of country-specific or time-specific variants. In contrast, the phylogenetic grouping of HCV-2k sequences shows the existence of two major genetic clusters that may represent two distinct circulating variants. Coalescent analysis dated the most recent common ancestor (tMRCA) of Tunisian HCV-2c to around 1886 (1869-1902), before the introduction of HCV-2k in 1901 (1867-1931). Our findings suggest that the introduction of HCV-2c in Tunisia is possibly a result of population movements between Tunisian and European populations following the French colonization.

  20. Linear and Nonlinear Statistical Characterization of DNA

    NASA Astrophysics Data System (ADS)

    Norio Oiwa, Nestor; Goldman, Carla; Glazier, James

    2002-03-01

    We find spatial order in the distribution of protein-coding (including RNAs) and control segments of GenBank genomic sequences, irrespective of ATCG content. This is achieved by correlations, histograms, fractal dimensions and singularity spectra. Estimates of these quantities in complete nuclear genomes indicate that coding sequences are long-range correlated and that their disposition is self-similar (multifractal) in eukaryotes. These characteristics are absent in prokaryotes, where there are few noncoding sequences, suggesting that 'junk' DNA plays a relevant role in genome structure and function. Concerning the genetic message of ATCG sequences, we build a random walk (Levy flight), using DNA symmetry arguments, where we associate A, T, C and G with left, right, down and up steps, respectively. Nonlinear analysis of mitochondrial DNA walks reveals a multifractal pattern based on palindromic sequences, which fold into hairpins and loops.
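
    The base-to-step mapping described in this abstract (A, T, C, G as left, right, down, up) can be illustrated with a minimal sketch. The function name `dna_walk`, the unit step length, and the choice to skip ambiguous symbols are assumptions made for illustration; the abstract does not specify the step-length distribution behind its Levy flight, so this is only a plain unit-step walk.

```python
from typing import List, Tuple

# Step mapping stated in the abstract: A=left, T=right, C=down, G=up.
# Unit step length is an assumption, not taken from the source.
STEPS = {"A": (-1, 0), "T": (1, 0), "C": (0, -1), "G": (0, 1)}


def dna_walk(seq: str) -> List[Tuple[int, int]]:
    """Return every (x, y) position visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for base in seq.upper():
        if base not in STEPS:  # skip ambiguous symbols such as N (assumption)
            continue
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(dna_walk("ATCG"))
```

    A palindromic or compositionally balanced stretch returns the walk to where it started, which is why such walks are a convenient way to visualise sequence symmetry.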

  1. Analysis of Variability in HIV-1 Subtype A Strains in Russia Suggests a Combination of Deep Sequencing and Multitarget RNA Interference for Silencing of the Virus.

    PubMed

    Kretova, Olga V; Chechetkin, Vladimir R; Fedoseeva, Daria M; Kravatsky, Yuri V; Sosin, Dmitri V; Alembekov, Ildar R; Gorbacheva, Maria A; Gashnikova, Natalya M; Tchurikov, Nickolai A

    2017-02-01

    Any method for silencing the activity of the HIV-1 retrovirus should tackle the extremely high variability of HIV-1 sequences and mutational escape. We studied sequence variability in the vicinity of selected RNA interference (RNAi) targets from isolates of HIV-1 subtype A in Russia, and we propose that using artificial RNAi is a potential alternative to traditional antiretroviral therapy. We prove that using multiple RNAi targets overcomes the variability in HIV-1 isolates. The optimal number of targets critically depends on the conservation of the target sequences. The total number of targets that are conserved with a probability of 0.7-0.8 should exceed at least 2. Combining deep sequencing and multitarget RNAi may provide an efficient approach to cure HIV/AIDS.

  2. T cell receptor repertoires of mice and humans are clustered in similarity networks around conserved public CDR3 sequences

    PubMed Central

    Madi, Asaf; Poran, Asaf; Shifrut, Eric; Reich-Zeliger, Shlomit; Greenstein, Erez; Zaretsky, Irena; Arnon, Tomer; Laethem, Francois Van; Singer, Alfred; Lu, Jinghua; Sun, Peter D; Cohen, Irun R; Friedman, Nir

    2017-01-01

    Diversity of T cell receptor (TCR) repertoires, generated by somatic DNA rearrangements, is central to immune system function. However, the level of sequence similarity of TCR repertoires within and between species has not been characterized. Using network analysis of high-throughput TCR sequencing data, we found that abundant CDR3-TCRβ sequences were clustered within networks generated by sequence similarity. We discovered a substantial number of public CDR3-TCRβ segments that were identical in mice and humans. These conserved public sequences were central within TCR sequence-similarity networks. Annotated TCR sequences, previously associated with self-specificities such as autoimmunity and cancer, were linked to network clusters. Mechanistically, CDR3 networks were promoted by MHC-mediated selection, and were reduced following immunization, immune checkpoint blockade or aging. Our findings provide a new view of T cell repertoire organization and physiology, and suggest that the immune system distributes its TCR sequences unevenly, attending to specific foci of reactivity. DOI: http://dx.doi.org/10.7554/eLife.22057.001 PMID:28731407

  3. Genomic analysis of coxsackieviruses A1, A19, A22, enteroviruses 113 and 104: viruses representing two clades with distinct tropism within enterovirus C

    PubMed Central

    Haq, Saddef; Sameroff, Stephen; Howie, Stephen R. C.; Lipkin, W. Ian

    2013-01-01

    Coxsackieviruses (CV) A1, CV-A19 and CV-A22 have historically comprised a distinct phylogenetic clade within Enterovirus (EV) C. Several novel serotypes that are genetically similar to these three viruses have been recently discovered and characterized. Here, we report the coding sequence analysis of two genotypes of a previously uncharacterized serotype EV-C113 from Bangladesh and demonstrate that it is most similar to CV-A22 and EV-C116 within the capsid region. We sequenced novel genotypes of CV-A1, CV-A19 and CV-A22 from Bangladesh and observed a high rate of recombination within this group. We also report genomic analysis of the rarely reported EV-C104 circulating in the Gambia in 2009. All available EV-C104 sequences displayed a high degree of similarity within the structural genes but formed two clusters within the non-structural genes. One cluster included the recently reported EV-C117, suggesting an ancestral recombination between these two serotypes. Phylogenetic analysis of all available complete genome sequences indicated the existence of two subgroups within this distinct Enterovirus C clade: one has been exclusively recovered from gastrointestinal samples, while the other cluster has been implicated in respiratory disease. PMID:23761409

  4. Correcting names of bacteria deposited in National Microbial Repositories: an analysed sequence data necessary for taxonomic re-categorization of misclassified bacteria-ONE example, genus Lysinibacillus.

    PubMed

    Rekadwad, Bhagwan N; Gonzalez, Juan M

    2017-08-01

    A report on 16S rRNA gene sequence re-analysis and digitalization is presented using Lysinibacillus species (one example) deposited in National Microbial Repositories in India. Lysinibacillus species 16S rRNA gene sequences were digitalized to provide quick response (QR) codes, Chaos Game Representation (CGR) and Frequency of Chaos Game Representation (FCGR). GC percentage, phylogenetic analysis, and principal component analysis (PCA) were the tools used for the differentiation and reclassification of the strains under investigation. Seven reasons supporting our statement that Lysinibacillus species deposited in National Microbial Repositories are misclassified are given in this paper. Based on these seven reasons, bacteria deposited in National Microbial Repositories, such as Lysinibacillus and many others, need reanalysis to establish their exact identity. Levels of identity with type strains of related species show differences of 2 to 8%, suggesting that reclassification is needed to correctly assign species names to the analyzed Lysinibacillus strains available in National Microbial Repositories.

  5. Microsatellite markers identify three lineages of Phytophthora ramorum in US nurseries, yet single lineages in US forest and European nursery populations.

    PubMed

    Ivors, K; Garbelotto, M; Vries, I D E; Ruyter-Spira, C; Te Hekkert, B; Rosenzweig, N; Bonants, P

    2006-05-01

    Analysis of 12 polymorphic simple sequence repeats identified in the genome sequence of Phytophthora ramorum, causal agent of 'sudden oak death', revealed genotypic diversity to be significantly higher in nurseries (91% of total) than in forests (18% of total). Our analysis identified only two closely related genotypes in US forests, while the genetic structure of populations from European nurseries was of intermediate complexity, including multiple, closely related genotypes. Multilocus analysis determined populations in US forests reproduce clonally and are likely descendants of a single introduced individual. The 151 isolates analysed clustered in three clades. US forest and European nursery isolates clustered into two distinct clades, while one isolate from a US nursery belonged to a third novel clade. The combined microsatellite, sequencing and morphological analyses suggest the three clades represent distinct evolutionary lineages. All three clades were identified in some US nurseries, emphasizing the role of commercial plant trade in the movement of this pathogen.

  6. Analysis of sequencing data for probing RNA secondary structures and protein-RNA binding in studying posttranscriptional regulations.

    PubMed

    Hu, Xihao; Wu, Yang; Lu, Zhi John; Yip, Kevin Y

    2016-11-01

    High-throughput sequencing has been used to study posttranscriptional regulation, where the identification of protein-RNA binding is a major and fast-developing sub-area, which in turn benefits from sequencing methods for whole-transcriptome probing of RNA secondary structures. In the study of RNA secondary structures using high-throughput sequencing, bases are modified or cleaved according to their structural features, which alters the resulting composition of sequencing reads. In the study of protein-RNA binding, methods have been proposed to immuno-precipitate (IP) protein-bound RNA transcripts in vitro or in vivo. By sequencing these transcripts, the protein-RNA interactions and the binding locations can be identified. For both types of data, read counts are affected by a combination of confounding factors, including expression levels of transcripts, sequence biases, mapping errors and the probing or IP efficiency of the experimental protocols. Careful processing of the sequencing data and proper extraction of important features are fundamentally important to a successful analysis. Here we review and compare different experimental methods for probing RNA secondary structures and binding sites of RNA-binding proteins (RBPs), and the computational methods proposed for analyzing the corresponding sequencing data. We suggest how these two types of data should be integrated to study the structural properties of RBP binding sites as a systematic way to better understand posttranscriptional regulation.

  7. Isolation and characterization of full-length putative alcohol dehydrogenase genes from Polygonum minus

    NASA Astrophysics Data System (ADS)

    Hamid, Nur Athirah Abd; Ismail, Ismanizan

    2013-11-01

    Polygonum minus, locally known as Kesum, is an aromatic herb that is high in secondary metabolite content. Alcohol dehydrogenase (ADH) is an important enzyme that catalyzes the reversible oxidation of alcohol to aldehyde in the presence of NAD(P)(H) as a cofactor. The main focus of this research is to identify the ADH gene. Total RNA was extracted from leaves of P. minus that had been treated with 150 μM jasmonic acid. The full-length cDNA sequence of ADH was isolated via rapid amplification of cDNA ends (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence, and PCR was performed on genomic DNA to determine the exon and intron organization. Two ADH sequences, designated PmADH1 and PmADH2, were successfully isolated. Both sequences have an ORF of 801 bp that encodes 266 aa residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that the two sequences are highly similar in the ORF region but divergent in the 3' untranslated regions (UTR). The amino acid sequences differ at residue 107: PmADH1 contains a Gly (G) residue while PmADH2 contains a Cys (C) residue. The intron-exon organization of both sequences is also the same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain the "classical" short chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are members of the short chain alcohol dehydrogenase family.

  8. Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

    PubMed

    Dong, Zheng; Zhou, Hongyu; Tao, Peng

    2018-02-01

    PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from RCSB Protein Data Bank of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The result showed that the proteins with same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant difference in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.

  9. Genetic differences between blood- and brain-derived viral sequences from human immunodeficiency virus type 1-infected patients: evidence of conserved elements in the V3 region of the envelope protein of brain-derived sequences.

    PubMed Central

    Korber, B T; Kunstman, K J; Patterson, B K; Furtado, M; McEvilly, M M; Levy, R; Wolinsky, S M

    1994-01-01

    Human immunodeficiency virus type 1 (HIV-1) sequences were generated from blood and from brain tissue obtained by stereotactic biopsy from six patients undergoing a diagnostic neurosurgical procedure. Proviral DNA was directly amplified by nested PCR, and 8 to 36 clones from each sample were sequenced. Phylogenetic analysis of intrapatient envelope V3-V5 region HIV-1 DNA sequence sets revealed that brain viral sequences were clustered relative to the blood viral sequences, suggestive of tissue-specific compartmentalization of the virus in four of the six cases. In the other two cases, the blood and brain virus sequences were intermingled in the phylogenetic analyses, suggesting trafficking of virus between the two tissues. Slide-based PCR-driven in situ hybridization of two of the patients' brain biopsy samples confirmed our interpretation of the intrapatient phylogenetic analyses. Interpatient V3 region brain-derived sequence distances were significantly less than blood-derived sequence distances. Relative to the tip of the loop, the set of brain-derived viral sequences had a tendency towards negative or neutral charge compared with the set of blood-derived viral sequences. Entropy calculations were used as a measure of the variability at each position in alignments of blood and brain viral sequences. A relatively conserved set of positions was found, with a significantly lower entropy in the brain- than in the blood-derived viral sequences. These sites constitute a brain "signature pattern," or a noncontiguous set of amino acids in the V3 region conserved in viral sequences derived from brain tissue. This brain-derived signature pattern was also well preserved among isolates previously characterized in vitro as macrophage tropic. Macrophage-monocyte tropism may be the biological constraint that results in the conservation of the viral brain signature pattern. PMID:7933130
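
    The per-position entropy measure mentioned in this abstract can be sketched as Shannon entropy over the columns of an alignment. This is an illustrative sketch only: the function names, the use of log base 2, and the treatment of gap characters as ordinary symbols are assumptions, not details taken from the study.

```python
import math
from collections import Counter
from typing import List, Sequence


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    # p * log2(1/p) summed over observed symbols; 0.0 for a conserved column.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(seqs: Sequence[str]) -> List[float]:
    """Entropy at each position of equal-length aligned sequences."""
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    return [column_entropy("".join(col)) for col in zip(*seqs)]


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from the paper.
    print(alignment_entropy(["CTRPN", "CTRPG", "CSRPN"]))
```

    Positions with entropy near zero are the conserved ones; comparing the per-position values between two sets of sequences (for example, two tissue compartments) is one way to look for a signature pattern.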

  10. Genetic Variation and Geographic Differentiation Among Populations of the Nonmigratory Agricultural Pest Oedaleus infernalis (Orthoptera: Acridoidea) in China

    PubMed Central

    Sun, Wei; Dong, Hui; Gao, Yue-Bo; Su, Qian-Fu; Qian, Hai-Tao; Bai, Hong-Yan; Zhang, Zhu-Ting; Cong, Bin

    2015-01-01

    The nonmigratory grasshopper Oedaleus infernalis Saussure (Orthoptera: Acridoidea) is an agricultural pest of crops and forage grasses over a wide natural geographical distribution in China. The genetic diversity and genetic variation among 10 geographically separated populations of O. infernalis were assessed using polymerase chain reaction-based molecular markers, including the intersimple sequence repeat and mitochondrial cytochrome oxidase sequences. A high level of genetic diversity was detected among these populations from the intersimple sequence repeat (H: 0.2628, I: 0.4129, Hs: 0.2130) and cytochrome oxidase analyses (Hd: 0.653). There was no obvious geographical structure based on an unweighted pair group method analysis and median-joining network. The values of FST, θII, and Gst estimated in this study are low, and the gene flow is high (Nm > 4). Analysis of the molecular variance suggested that most of the genetic variation occurs within populations, whereas only a small variation takes place between populations. No significant correlation was found between the genetic distance and geographical distance. Overall, our results suggest that geographical distance does not impede gene flow among O. infernalis populations. PMID:26496789
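
    For orientation, the link between low differentiation and high gene flow can be written with one widely used approximation, Wright's island-model formula. Whether the authors used exactly this form is an assumption; the abstract does not say, and estimates based on haploid mitochondrial markers or on Gst are often computed with a factor of 2 in place of 4.

```latex
N_m \approx \frac{1 - F_{ST}}{4\,F_{ST}}
\quad\Longrightarrow\quad
N_m > 4 \iff F_{ST} < \frac{1}{17} \approx 0.059
```

    Under that assumption, the reported Nm > 4 corresponds to an FST below roughly 0.06.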

  11. Sequence analyses reveal that a TPR-DP module, surrounded by recombinable flanking introns, could be at the origin of eukaryotic Hop and Hip TPR-DP domains and prokaryotic GerD proteins.

    PubMed

    Hernández Torres, Jorge; Papandreou, Nikolaos; Chomilier, Jacques

    2009-05-01

    The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR-DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperones into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR-DP domains.

  12. Genetic and molecular characterization of the maize rp3 rust resistance locus.

    PubMed Central

    Webb, Craig A; Richter, Todd E; Collins, Nicholas C; Nicolas, Marie; Trick, Harold N; Pryor, Tony; Hulbert, Scot H

    2002-01-01

    In maize, the Rp3 gene confers resistance to common rust caused by Puccinia sorghi. Flanking marker analysis of rust-susceptible rp3 variants suggested that most of them arose via unequal crossing over, indicating that rp3 is a complex locus like rp1. The PIC13 probe identifies a nucleotide binding site-leucine-rich repeat (NBS-LRR) gene family that maps to the complex. Rp3 variants show losses of PIC13 family members relative to the resistant parents when probed with PIC13, indicating that the Rp3 gene is a member of this family. Gel blots and sequence analysis suggest that at least 9 family members are at the locus in most Rp3-carrying lines and that at least 5 of these are transcribed in the Rp3-A haplotype. The coding regions of 14 family members, isolated from three different Rp3-carrying haplotypes, had DNA sequence identities from 93 to 99%. Partial sequencing of clones of a BAC contig spanning the rp3 locus in the maize inbred line B73 identified five different PIC13 paralogues in a region of approximately 140 kb. PMID:12242248

  13. A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.

    PubMed

    Álvarez-Martos, Isabel; Ferapontova, Elena E

    2017-08-05

    The unique specificity of aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by replacing the RNA aptamer bases with their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence does, and thus is not an aptamer and cannot be used for either in vivo or in situ analysis of dopamine in the presence of structurally related neurotransmitters.

  14. Streptococcus pneumoniae PstS production is phosphate responsive and enhanced during growth in the murine peritoneal cavity

    NASA Technical Reports Server (NTRS)

    Orihuela, C. J.; Mills, J.; Robb, C. W.; Wilson, C. J.; Watson, D. A.; Niesel, D. W.

    2001-01-01

    Differential display-PCR (DDPCR) was used to identify a Streptococcus pneumoniae gene with enhanced transcription during growth in the murine peritoneal cavity. Northern dot blot analysis and comparative densitometry confirmed a 1.8-fold increase in expression of the encoded sequence following murine peritoneal culture (MPC) versus laboratory culture or control culture (CC). Sequencing and basic local alignment search tool analysis identified the DDPCR fragment as pstS, the phosphate-binding protein of a high-affinity phosphate uptake system. PCR amplification of the complete pstS gene followed by restriction analysis and sequencing suggests a high level of conservation between strains and serotypes. Quantitative immunodot blotting using antiserum to recombinant PstS (rPstS) demonstrated an approximately twofold increase in PstS production during MPC from that during CCs, a finding consistent with the low levels of phosphate observed in the peritoneum. Moreover, immunodot blot and Northern analysis demonstrated phosphate-dependent production of PstS in six of seven strains examined. These results identify pstS expression as responsive to the MPC environment and extracellular phosphate concentrations. Presently, it remains unclear if phosphate concentrations in vivo contribute to the regulation of pstS. Finally, polyclonal antiserum to rPstS did not inhibit growth of the pneumococcus in vitro, suggesting that antibodies do not block phosphate uptake; moreover, vaccination of mice with rPstS did not protect against intraperitoneal challenge as assessed by the 50% lethal dose.

  15. Phylogenetic analysis of Newcastle disease viruses from Bangladesh suggests continuing evolution of genotype XIII.

    PubMed

    Barman, Lalita Rani; Nooruzzaman, Mohammed; Sarker, Rahul Deb; Rahman, Md Tazinur; Saife, Md Rajib Bin; Giasuddin, Mohammad; Das, Bidhan Chandra; Das, Priya Mohan; Chowdhury, Emdadul Haque; Islam, Mohammad Rafiqul

    2017-10-01

    A total of 23 Newcastle disease virus (NDV) isolates from Bangladesh taken between 2010 and 2012 were characterized on the basis of partial F gene sequences. All the isolates belonged to genotype XIII of class II NDV but segregated into three sub-clusters. One sub-cluster with 17 isolates aligned with sub-genotype XIIIc. The other two sub-clusters were phylogenetically distinct from the previously described sub-genotypes XIIIa, XIIIb and XIIIc and could be candidates of new sub-genotypes; however, that needs to be validated through full-length F gene sequence data. The results of the present study suggest that genotype XIII NDVs are under continuing evolution in Bangladesh.

  16. Isolation and in silico analysis of a novel H+-pyrophosphatase gene orthologue from the halophytic grass Leptochloa fusca

    NASA Astrophysics Data System (ADS)

    Rauf, Muhammad; Saeed, Nasir A.; Habib, Imran; Ahmed, Moddassir; Shahzad, Khurram; Mansoor, Shahid; Ali, Rashid

    2017-02-01

    Structure prediction can provide information about the function and active sites of a protein, which helps in designing new functional proteins. H+-pyrophosphatase is a transmembrane protein involved in establishing the proton motive force for active transport of Na+ across the membrane by Na+/H+ antiporters. A full-length novel H+-pyrophosphatase gene was isolated from the halophytic grass Leptochloa fusca using RT-PCR and the RACE method. The full-length LfVP1 gene sequence of 2292 nucleotides encodes a protein of 764 amino acids. DNA and protein sequences were characterized using bioinformatics tools. Various important potential sites were predicted by the PROSITE webserver. Primary structural analysis showed LfVP1 to be a stable protein, and the grand average of hydropathy (GRAVY) indicated that the LfVP1 protein has good hydrosolubility. Secondary structure analysis showed that the LfVP1 protein sequence contains a significant proportion of alpha helix and random coil. Protein membrane topology suggested the presence of 14 transmembrane domains and of a catalytic domain in TM3. The three-dimensional structure predicted from the LfVP1 protein sequence also indicated the presence of 14 transmembrane domains, and a hydrophobicity surface model showed the amino acid hydrophobicity. A Ramachandran plot showed that 98% of amino acid residues were predicted in the favored region.
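
    The GRAVY index mentioned in this abstract is conventionally the mean Kyte-Doolittle hydropathy value over all residues of a protein. The sketch below assumes that definition; the function name `gravy` and the decision to ignore non-standard residue symbols are illustrative choices, and the abstract does not say which tool the authors used.

```python
# Kyte-Doolittle hydropathy scale (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Mean hydropathy of a protein; negative values are more hydrophilic."""
    # Non-standard symbols (X, *, gaps) are ignored: an assumption.
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acid residues found")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


if __name__ == "__main__":
    # Hypothetical peptide, not a fragment of LfVP1.
    print(round(gravy("MKTLLVAG"), 3))
```

    By this convention, a positive score points to an overall hydrophobic protein and a negative score to an overall hydrophilic one.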

  17. Rolling circle amplification-based analysis of Sri Lankan cassava mosaic virus isolates from Tamil Nadu, India, suggests a low level of genetic variability.

    PubMed

    Kushawaha, Akhilesh Kumar; Rabindran, Ramalingam; Dasgupta, Indranil

    2018-03-01

    Cassava mosaic disease (CMD) is a widespread disease of cassava in south Asia and the African continent. In India, CMD is known to be caused by two single-stranded DNA viruses (geminiviruses), Indian cassava mosaic virus (ICMV) and Sri Lankan cassava mosaic virus (SLCMV). Previously, the diversity of ICMV and SLCMV in India has been studied using PCR, a sequence-dependent method. To study the variability of these viruses in more depth and to detect any novel geminiviruses associated with CMD, sequence-independent amplification using rolling circle amplification (RCA)-based methods was used. CMD-affected cassava plants were sampled across eighty locations in nine districts of the southern Indian state of Tamil Nadu. Twelve complete sequences of coat protein genes of the resident geminiviruses, comprising 256 amino acid residues, were generated from the above samples, and these indicated changes at only six positions. RCA followed by RFLP of the 80 samples indicated that most samples (47) contained only SLCMV, followed by 8 that were infected jointly with ICMV and SLCMV. In 11 samples, the pattern did not match the expected patterns from either of the two viruses, and these were hence considered variants. Sequence analysis of an average of 700 nucleotides from 31 RCA-generated fragments of the variants indicated identities of 97-99% with the sequence of a previously reported infectious clone of SLCMV. The evidence suggests low levels of genetic variability in the begomoviruses infecting cassava, mainly in the form of scattered single nucleotide changes.

  18. Comparison of large-insert, small-insert and pyrosequencing libraries for metagenomic analysis.

    PubMed

    Danhorn, Thomas; Young, Curtis R; DeLong, Edward F

    2012-11-01

    The development of DNA sequencing methods for characterizing microbial communities has evolved rapidly over the past decades. To evaluate more traditional, as well as newer methodologies for DNA library preparation and sequencing, we compared fosmid, short-insert shotgun and 454 pyrosequencing libraries prepared from the same metagenomic DNA samples. GC content was elevated in all fosmid libraries, compared with shotgun and 454 libraries. Taxonomic composition of the different libraries suggested that this was caused by a relative underrepresentation of dominant taxonomic groups with low GC content, notably Prochlorales and the SAR11 cluster, in fosmid libraries. While these abundant taxa had a large impact on library representation, we also observed a positive correlation between taxon GC content and fosmid library representation in other low-GC taxa, suggesting a general trend. Analysis of gene category representation in different libraries indicated that the functional composition of a library was largely a reflection of its taxonomic composition, and no additional systematic biases against particular functional categories were detected at the level of sequencing depth in our samples. Another important but less predictable factor influencing the apparent taxonomic and functional library composition was the read length afforded by the different sequencing technologies. Our comparisons and analyses provide a detailed perspective on the influence of library type on the recovery of microbial taxa in metagenomic libraries and underscore the different uses and utilities of more traditional, as well as contemporary 'next-generation' DNA library construction and sequencing technologies for exploring the genomics of the natural microbial world.
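
    The GC-content comparison between library types described in this abstract rests on a simple per-sequence calculation, sketched below. The function names and the choice to exclude ambiguous bases from the denominator are assumptions for illustration and do not describe the authors' pipeline.

```python
from typing import Iterable


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / acgt


def mean_gc(reads: Iterable[str]) -> float:
    """Unweighted mean GC fraction across a collection of reads."""
    values = [gc_content(r) for r in reads]
    if not values:
        raise ValueError("no reads supplied")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical reads standing in for two library types.
    print(mean_gc(["GGCCAATT", "GCGCGCAT"]), mean_gc(["ATATATGC", "AATTAATT"]))
```

    Comparing `mean_gc` across libraries built from the same DNA sample is the kind of summary that would expose a systematic GC shift such as the one reported for fosmid libraries.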

  19. Alteration of gene expression in human hepatocellular carcinoma with integrated hepatitis B virus DNA.

    PubMed

    Tamori, Akihiro; Yamanishi, Yoshihiro; Kawashima, Shuichi; Kanehisa, Minoru; Enomoto, Masaru; Tanaka, Hiromu; Kubo, Shoji; Shiomi, Susumu; Nishiguchi, Shuhei

    2005-08-15

    Integration of hepatitis B virus (HBV) DNA into the human genome is one of the most important steps in HBV-related carcinogenesis. This study attempted to find the link between HBV DNA, the adjoining cellular sequence, and altered gene expression in hepatocellular carcinoma (HCC) with integrated HBV DNA. We examined 15 cases of HCC infected with HBV by cassette ligation-mediated PCR. The human DNA adjacent to the integrated HBV DNA was sequenced. Protein coding sequences were searched for in the human sequence. In five cases with HBV DNA integration, from which good quality RNA was extracted, gene expression was examined by cDNA microarray analysis. The human DNA sequence successive to integrated HBV DNA was determined in the 15 HCCs. Eight protein-coding regions were involved: ras-responsive element binding protein 1, calmodulin 1, mixed lineage leukemia 2 (MLL2), FLJ333655, LOC220272, LOC255345, LOC220220, and LOC168991. The MLL2 gene was expressed in three cases with HBV DNA integrated into exon 3 of MLL2 and in one case with HBV DNA integrated into intron 3 of MLL2. Gene expression analysis suggested that two HCCs with HBV integrated into MLL2 had similar patterns of gene expression compared with three HCCs with HBV integrated into other loci of human chromosomes. HBV DNA was integrated at random sites of human DNA, and the MLL2 gene was one of the targets for integration. Our results suggest that HBV DNA might modulate human genes near integration sites, followed by integration site-specific expression of such genes during hepatocarcinogenesis.

  20. Identification of a novel astrovirus in domestic sheep in Hungary.

    PubMed

    Reuter, Gábor; Pankovics, Péter; Delwart, Eric; Boros, Ákos

    2012-02-01

    The family Astroviridae consists of two genera, Avastrovirus and Mamastrovirus, whose members are associated with gastroenteritis in avian and mammalian hosts, respectively. We serendipitously identified a novel ovine astrovirus in a fecal specimen from a domestic sheep (Ovis aries) in Hungary by viral metagenomic analysis. Sequencing of the fragment indicated that it was an ORF1b/ORF2/3'UTR sequence, and it has been submitted to the GenBank database as ovine astrovirus type 2 (OAstV-2/Hungary/2009) with accession number JN592482. The unique sequence characteristics and the phylogenetic position of OAstV-2 suggest that genetically divergent lineages of astroviruses exist in sheep.

  1. A 12-year molecular survey of clinical herpes simplex virus type 2 isolates demonstrates the circulation of clade A and B strains in Germany.

    PubMed

    Schmidt-Chanasit, Jonas; Bialonski, Alexandra; Heinemann, Patrick; Ulrich, Rainer G; Günther, Stephan; Rabenau, Holger F; Doerr, Hans Wilhelm

    2010-07-01

    Recently, two different herpes simplex virus type 2 (HSV-2) clades (A and B) were described on the basis of DNA sequence data of the glycoprotein E (gE), G (gG) and I (gI) genes. The aim of this study was to type the circulating HSV-2 wild-type strains in Germany by a novel approach and to monitor potential changes in the molecular epidemiology between 1997 and 2008. A total of 64 clinical HSV-2 isolates were analyzed by a novel approach using the DNA sequences of the complete open reading frames of glycoprotein B (gB) and gG. Recombination analysis of the gB and gG gene sequences was performed to reveal intragenic recombinants. Based on the phylogenetic analysis of the gB coding DNA sequence, 8 of 64 (12%) isolates were classified as clade A strains and 56 of 64 (88%) isolates were classified as clade B strains. Analysis of the gG coding DNA sequence classified 4 (6%) isolates as clade A strains and 60 (94%) isolates as clade B strains. In comparison, the 8 isolates classified as clade A strains using the gB sequence data were classified as clade B strains when using the gG coding DNA sequence, suggesting intergenic recombination events. Intragenic recombination events were not detected. The first molecular survey of clinical HSV-2 isolates from Germany demonstrated the circulation of clade A and B strains and of intergenic recombinants over a period of 12 years. Copyright (c) 2010 Elsevier B.V. All rights reserved.

  2. The utility of DNA sequences of an intron from the beta-fibrinogen gene in phylogenetic analysis of woodpeckers (Aves: Picidae).

    PubMed

    Prychitko, T M; Moore, W S

    1997-10-01

    Estimating phylogenies from DNA sequence data has become the major methodology of molecular phylogenetics. To date, molecular phylogenetics of the vertebrates has been very dependent on mtDNA, but studies involving mtDNA are limited because the several genes comprising the mt-genome are inherited as a single linkage group. The only apparent solution to this problem is to sequence additional genes, each representing a distinct linkage group, so that the resultant gene trees provide independent estimates of the species tree. There exists the need to find novel gene sequences which contain enough phylogenetic information to resolve relationships between closely related species. A possible source is the nuclear-encoded introns, because they evolve more rapidly than exons. We designed primers to amplify and sequence the 7 intron from the beta-fibrinogen gene for a recently evolved group, the woodpeckers. We sequenced the entire intron for 10 specimens representing five species. Nucleotide substitutions are randomly distributed along the length of the intron, suggesting selective neutrality. A preliminary analysis indicates that the phylogenetic signal in the intron is as strong as that in the mitochondrial encoded cytochrome b (cyt b) gene. The topology of the beta-fibrinogen tree is identical to that of the cyt b tree. This analysis demonstrates the ability of the 7 intron of beta-fibrinogen to provide well resolved, independent gene trees for recently evolved groups and establishes it as a source of sequences to be used in other phylogenetic studies. Copyright 1997 Academic Press

  3. Mining, identification and function analysis of microRNAs and target genes in peanut (Arachis hypogaea L.).

    PubMed

    Zhang, Tingting; Hu, Shuhao; Yan, Caixia; Li, Chunjuan; Zhao, Xiaobo; Wan, Shubo; Shan, Shihua

    2017-02-01

    In the present investigation, a total of 60 conserved peanut (Arachis hypogaea L.) microRNA (miRNA) sequences, belonging to 16 families, were identified using bioinformatics methods. A total of 392 target gene sequences were identified for 58 miRNAs using Target-align software and BLASTx analyses. Gene Ontology (GO) functional analysis suggested that these target genes were involved in mediating peanut growth and development, signal transduction and stress resistance. A total of 55 miRNA sequences were verified using a poly(A) tailing test, with a success rate of up to 91.67%. Twenty peanut target gene sequences were randomly selected, and the 5' rapid amplification of cDNA ends (5'-RACE) method was used to validate the cleavage sites of these target genes. Of these, 14 (70%) peanut miRNA targets were verified by means of gel electrophoresis, cloning and sequencing. Furthermore, functional analysis and homologous sequence retrieval were conducted for target gene sequences, and 26 target genes were chosen for experimental study of stress resistance. Real-time fluorescence quantitative PCR (qRT-PCR) was applied to measure the expression level of resistance-associated miRNAs and their target genes in peanut exposed to Aspergillus flavus (A. flavus) infection and drought stress, respectively. As a result, 5 miRNA-target groups were found to be consistent with a model in which the miRNA negatively regulates the expression of its target genes. This study preliminarily determined the biological functions of some resistance-associated miRNAs and their target genes in peanut. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  4. Comparative full-length genome sequence analysis of 14 SARS coronavirus isolates and common mutations associated with putative origins of infection.

    PubMed

    Ruan, Yi Jun; Wei, Chia Lin; Ee, Ai Ling; Vega, Vinsensius B; Thoreau, Herve; Su, Se Thoe Yun; Chia, Jer-Ming; Ng, Patrick; Chiu, Kuo Ping; Lim, Landri; Zhang, Tao; Peng, Chan Kwai; Lin, Ean Oon Lynette; Lee, Ng Mah; Yee, Sin Leo; Ng, Lisa F P; Chee, Ren Ee; Stanton, Lawrence W; Long, Philip M; Liu, Edison T

    2003-05-24

    The cause of severe acute respiratory syndrome (SARS) has been identified as a new coronavirus. Whole genome sequence analysis of various isolates might provide an indication of potential strain differences of this new virus. Moreover, mutation analysis will help to develop effective vaccines. We sequenced the entire SARS viral genome of cultured isolates from the index case (SIN2500) presenting in Singapore, from three primary contacts (SIN2774, SIN2748, and SIN2677), and one secondary contact (SIN2679). These sequences were compared with the isolates from Canada (TOR2), Hong Kong (CUHK-W1 and HKU39849), Hanoi (URBANI), Guangzhou (GZ01), and Beijing (BJ01, BJ02, BJ03, BJ04). We identified 129 sequence variations among the 14 isolates, with 16 recurrent variant sequences. Common variant sequences at four loci define two distinct genotypes of the SARS virus. One genotype was linked with infections originating in Hotel M in Hong Kong, the second contained isolates from Hong Kong, Guangzhou, and Beijing with no association with Hotel M (p<0.0001). Moreover, other common sequence variants further distinguished the geographical origins of the isolates, especially between Singapore and Beijing. Despite the recent onset of the SARS epidemic, genetic signatures are emerging that partition the worldwide SARS viral isolates into groups on the basis of contact source history and geography. These signatures can be used to trace sources of infection. In addition, a common variant associated with a non-conservative amino acid change in the S1 region of the spike protein suggests that immunological pressures might be starting to influence the evolution of the SARS virus in human populations.

  5. Seismic analysis of clinoform depositional sequences and shelf-margin trajectories in Lower Cretaceous (Albian) strata, Alaska North Slope

    USGS Publications Warehouse

    Houseknecht, D.W.; Bird, K.J.; Schenk, C.J.

    2009-01-01

    Lower Cretaceous strata beneath the Alaska North Slope include clinoform depositional sequences that filled the western Colville foreland basin and overstepped the Beaufort rift shoulder. Analysis of Albian clinoform sequences with two-dimensional (2D) seismic data resulted in the recognition of seismic facies inferred to represent lowstand, transgressive and highstand systems tracts. These are stacked to produce shelf-margin trajectories that appear in low-resolution seismic data to alternate between aggradational and progradational. Higher-resolution seismic data reveal shelf-margin trajectories that are more complex, particularly in net-aggradational areas, where three patterns commonly are observed: (1) a negative (downward) step across the sequence boundary followed by mostly aggradation in the lowstand systems tract (LST), (2) a positive (upward) step across the sequence boundary followed by mostly progradation in the LST and (3) an upward backstep across a mass-failure décollement. These different shelf-margin trajectories are interpreted as (1) fall of relative sea level below the shelf edge, (2) fall of relative sea level to above the shelf edge and (3) mass-failure removal of shelf-margin sediment. Lowstand shelf margins mapped using these criteria are oriented north-south in the foreland basin, indicating longitudinal filling from west to east. The shelf margins turn westward in the north, where the clinoform depositional system overstepped the rift shoulder, and turn eastward in the south, suggesting progradation of depositional systems from the ancestral Brooks Range into the foredeep. Lowstand shelf-margin orientations are consistently perpendicular to clinoform-foreset-dip directions. Although the Albian clinoform sequences of the Alaska North Slope are generally similar in stratal geometry to clinoform sequences elsewhere, they are significantly thicker. Clinoform-sequence thickness ranges from 600-1000 m in the north to 1700-2000 m in the south, reflecting increased accommodation from the rift shoulder into the foredeep. The unusually thick clinoform sequences suggest significant subsidence followed by rapid sediment influx. No claim to original US government works. Journal Compilation © Blackwell Publishing Ltd, European Association of Geoscientists & Engineers and International Association of Sedimentologists.

  6. Next-Generation Sequence Analysis of the Genome of RFHVMn, the Macaque Homolog of Kaposi's Sarcoma (KS)-Associated Herpesvirus, from a KS-Like Tumor of a Pig-Tailed Macaque

    PubMed Central

    Bruce, A. Gregory; Ryan, Jonathan T.; Thomas, Mathew J.; Peng, Xinxia; Grundhoff, Adam; Tsai, Che-Chung

    2013-01-01

    The complete sequence of retroperitoneal fibromatosis-associated herpesvirus Macaca nemestrina (RFHVMn), the pig-tailed macaque homolog of Kaposi's sarcoma-associated herpesvirus (KSHV), was determined by next-generation sequence analysis of a Kaposi's sarcoma (KS)-like macaque tumor. Colinearity of genes was observed with the KSHV genome, and the core herpesvirus genes had strong sequence homology to the corresponding KSHV genes. RFHVMn lacked homologs of open reading frame 11 (ORF11) and KSHV ORFs K5 and K6, which appear to have been generated by duplication of ORFs K3 and K4 after the divergence of KSHV and RFHV. RFHVMn contained positional homologs of all other unique KSHV genes, although some showed limited sequence similarity. RFHVMn contained a number of candidate microRNA genes. Although there was little sequence similarity with KSHV microRNAs, one candidate contained the same seed sequence as the positional homolog, kshv-miR-K12-10a, suggesting functional overlap. RNA transcript splicing was highly conserved between RFHVMn and KSHV, and strong sequence conservation was noted in specific promoters and putative origins of replication, predicting important functional similarities. Sequence comparisons indicated that RFHVMn and KSHV developed in long-term synchrony with the evolution of their hosts, and both viruses phylogenetically group within the RV1 lineage of Old World primate rhadinoviruses. RFHVMn is the closest homolog of KSHV to be completely sequenced and the first sequenced RV1 rhadinovirus homolog of KSHV from a nonhuman Old World primate. The strong genetic and sequence similarity between RFHVMn and KSHV, coupled with similarities in biology and pathology, demonstrate that RFHVMn infection in macaques offers an important and relevant model for the study of KSHV in humans. PMID:24109218

  7. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

    PubMed Central

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-01-01

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

  8. Toward the 1,000 dollars human genome.

    PubMed

    Bennett, Simon T; Barnes, Colin; Cox, Anthony; Davies, Lisa; Brown, Clive

    2005-06-01

    Revolutionary new technologies, capable of transforming the economics of sequencing, are providing an unparalleled opportunity to analyze human genetic variation comprehensively at the whole-genome level within a realistic timeframe and at affordable costs. Current estimates suggest that it would cost somewhere in the region of 30 million US dollars to sequence an entire human genome using Sanger-based sequencing, and on one machine it would take about 60 years. Solexa is widely regarded as a company with the necessary disruptive technology to be the first to achieve the ultimate goal of the so-called 1,000 dollars human genome - the conceptual cost-point needed for routine analysis of individual genomes. Solexa's technology is based on completely novel sequencing chemistry capable of sequencing billions of individual DNA molecules simultaneously, a base at a time, to enable highly accurate, low cost analysis of an entire human genome in a single experiment. When applied over a large enough genomic region, these new approaches to resequencing will enable the simultaneous detection and typing of known, as well as unknown, polymorphisms, and will also offer information about patterns of linkage disequilibrium in the population being studied. Technological progress, leading to the advent of single-molecule-based approaches, is beginning to dramatically drive down costs and increase throughput to unprecedented levels, each being several orders of magnitude better than that which is currently available. A new sequencing paradigm based on single molecules will be faster, cheaper and more sensitive, and will permit routine analysis at the whole-genome level.

  9. Whole-Genome Sequencing and Comparative Genome Analysis of Bacillus subtilis Strains Isolated from Non-Salted Fermented Soybean Foods

    PubMed Central

    Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi

    2015-01-01

    Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and its relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genome of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which were related to the fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequencing typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from “Tua Nao” of Thailand traces a different evolutionary process from other strains. PMID:26505996

  10. Molecular Cytogenetics Guides Massively Parallel Sequencing of a Radiation-Induced Chromosome Translocation in Human Cells.

    PubMed

    Cornforth, Michael N; Anur, Pavana; Wang, Nicholas; Robinson, Erin; Ray, F Andrew; Bedford, Joel S; Loucas, Bradford D; Williams, Eli S; Peto, Myron; Spellman, Paul; Kollipara, Rahul; Kittler, Ralf; Gray, Joe W; Bailey, Susan M

    2018-05-11

    Chromosome rearrangements are large-scale structural variants that are recognized drivers of oncogenic events in cancers of all types. Cytogenetics allows for their rapid, genome-wide detection, but does not provide gene-level resolution. Massively parallel sequencing (MPS) promises DNA sequence-level characterization of the specific breakpoints involved, but is strongly influenced by bioinformatics filters that affect detection efficiency. We sought to characterize the breakpoint junctions of chromosomal translocations and inversions in the clonal derivatives of human cells exposed to ionizing radiation. Here, we describe the first successful use of DNA paired-end analysis to locate and sequence across the breakpoint junctions of a radiation-induced reciprocal translocation. The analyses employed, with varying degrees of success, several well-known bioinformatics algorithms, a task made difficult by the involvement of repetitive DNA sequences. As for underlying mechanisms, the results of Sanger sequencing suggested that the translocation in question was likely formed via microhomology-mediated non-homologous end joining (mmNHEJ). To our knowledge, this represents the first use of MPS to characterize the breakpoint junctions of a radiation-induced chromosomal translocation in human cells. Curiously, these same approaches were unsuccessful when applied to the analysis of inversions previously identified by directional genomic hybridization (dGH). We conclude that molecular cytogenetics continues to provide critical guidance for structural variant discovery, validation and in "tuning" analysis filters to enable robust breakpoint identification at the base pair level.

  11. Global molecular genetic analysis of porcine circovirus type 2 (PCV2) sequences confirms the presence of four main PCV2 genotypes and reveals a rapid increase of PCV2d.

    PubMed

    Xiao, Chao-Ting; Halbur, Patrick G; Opriessnig, Tanja

    2015-07-01

    The oldest porcine circovirus type 2 (PCV2) sequence dates back to 1962 and is among several hundreds of publicly available PCV2 sequences. Despite this resource, few studies have investigated the global genetic diversity of PCV2. To evaluate the phylogenetic relationship of PCV2 strains, 1680 PCV2 open reading frame 2 (ORF2) sequences were compared and analysed by methods of neighbour-joining, maximum-likelihood, Bayesian inference and network analysis. Four distinct clades were consistently identified and included PCV2a, PCV2b, PCV2c and PCV2d; the p-distance between PCV2d and PCV2b was 0.055±0.008, larger than the PCV2 genotype-definition cut-off of 0.035, supporting PCV2d as an independent genotype. Among the 1680 sequences, 278-285 (16.5-17 %) were classified as PCV2a, 1007-1058 (59.9-63 %) as PCV2b, three (0.2 %) as PCV2c and 322-323 (19.2 %) as PCV2d, with the remaining 12-78 sequences (0.7-4.6 %) classified as intermediate clades or strains by the various methods. Classification of strains to genotypes differed based on the number of sequences used for the analysis, indicating that sample size is important when determining classification and assessing PCV2 trends and shifts. PCV2d was initially identified in 1999 in samples collected in Switzerland, now appears to be widespread in China and has been present in North America since 2012. During 2012-2013, 37 % of all investigated PCV2 sequences from US pigs were classified as PCV2d and overall data analysis suggests an ongoing genotype shift from PCV2b towards PCV2d. The present analyses indicate that PCV2d emerged approximately 20 years ago.
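
    The genotype call above compares a p-distance (the proportion of aligned sites that differ) against the 0.035 cut-off quoted in the abstract. The following is a minimal sketch of that comparison for a single pair of aligned sequences; it is not the authors' method, which reports a between-genotype distance over many ORF2 sequences. The names `p_distance` and `exceeds_cutoff` are hypothetical, and skipping sites with gaps or ambiguous bases is an assumption of this sketch.

```python
GENOTYPE_CUTOFF = 0.035  # PCV2 genotype-definition cut-off quoted in the abstract


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (pairwise deletion; an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def exceeds_cutoff(distance: float, cutoff: float = GENOTYPE_CUTOFF) -> bool:
    """True when a distance is larger than the genotype cut-off."""
    return distance > cutoff
```

    With the figures reported above, `exceeds_cutoff(0.055)` is true, which matches the abstract's argument that PCV2d is distinct from PCV2b.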

  12. An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region.

    PubMed Central

    Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S

    1999-01-01

    A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of from 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it. Milne 1926 PMID:10471707
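
    The density and percentage figures in this abstract can be checked with a few lines of arithmetic. The sketch below assumes a region length of 2,900 kb, taken from the 2.9-Mb figure in the record title; the variable names are illustrative.

```python
# Worked check of the densities and percentages quoted in the abstract.
REGION_KB = 2900            # assumption: 2.9 Mb, from the record title
PROTEIN_CODING_GENES = 218  # predicted protein-coding genes
TRANSPOSABLE_ELEMENTS = 17  # transposable element sequences
EST_MATCHES = 95            # genes matching a Drosophila EST
PROTEIN_SIMILARITIES = 144  # genes similar to proteins in other organisms

kb_per_gene = REGION_KB / PROTEIN_CODING_GENES          # about 13.3 kb
kb_per_element = REGION_KB / TRANSPOSABLE_ELEMENTS      # about 170.6 kb
est_percent = 100 * EST_MATCHES / PROTEIN_CODING_GENES  # about 43.6 %
similar_percent = 100 * PROTEIN_SIMILARITIES / PROTEIN_CODING_GENES  # about 66.1 %

print(f"one gene every {kb_per_gene:.1f} kb")
print(f"one transposable element every {kb_per_element:.1f} kb")
print(f"EST matches: {est_percent:.1f} %, protein similarities: {similar_percent:.1f} %")
```

    Rounded to whole numbers, these values agree with the 13 kb, 171 kb, 44% and 66% stated in the abstract.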

  13. A Cluster of Legionella-Associated Pneumonia Cases in a Population of Military Recruits

    DTIC Science & Technology

    2007-06-01

    this cluster may suggest a previously unrecognized suscep- FIG. 1. Phylogenic analysis of the training center strain (represented by the MCRD consensus...military recruits during population- based surveillance for pneumonia pathogens. Results were confirmed by sequence analysis. Cases cluster tightly...17 April 2007 A Legionella cluster was identified through retrospective PCR analysis of 240 throat swab samples from X-ray-confirmed pneumonia cases

  14. A palindrome-mediated mechanism distinguishes translocations involving LCR-B of chromosome 22q11.2.

    PubMed

    Gotter, Anthony L; Shaikh, Tamim H; Budarf, Marcia L; Rhodes, C Harker; Emanuel, Beverly S

    2004-01-01

    Two known recurrent constitutional translocations, t(11;22) and t(17;22), as well as a non-recurrent t(4;22), display derivative chromosomes that have joined to a common site within the low copy repeat B (LCR-B) region of 22q11.2. This breakpoint is located between two AT-rich inverted repeats that form a nearly perfect palindrome. Breakpoints within the 11q23, 17q11 and 4q35 partner chromosomes also fall near the center of palindromic sequences. In the present work the breakpoints of a fourth translocation involving LCR-B, a balanced ependymoma-associated t(1;22), were characterized not only to localize this junction relative to known genes, but also to further understand the mechanism underlying these rearrangements. FISH mapping was used to localize the 22q11.2 breakpoint to LCR-B and the 1p21 breakpoint to single BAC clones. STS mapping narrowed the 1p21.2 breakpoint to a 1990 bp AT-rich region, and junction fragments were amplified by nested PCR. Junction fragment-derived sequence indicates that the 1p21.2 breakpoint splits a 278 nt palindrome capable of forming stem-loop secondary structure. In contrast, the 1p21.2 reference genomic sequence from clones in the database does not exhibit this configuration, suggesting a predisposition for regional genomic instability perhaps etiologic for this rearrangement. Given its similarity to known chromosomal fragile site (FRA) sequences, this polymorphic 1p21.2 sequence may represent one of the FRA1 loci. Comparative analysis of the secondary structure of sequences surrounding translocation breakpoints that involve LCR-B with those not involving this region indicate a unique ability of the former to form stem-loop structures. The relative likelihood of forming these configurations appears to be related to the rate of translocation occurrence. Further analysis suggests that constitutional translocations in general occur between sequences of similar melting temperature and propensity for secondary structure.
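
    In the DNA sense used above, a palindrome is a sequence that equals its own reverse complement, which is what allows a single strand to fold back into a stem-loop. The sketch below illustrates that definition and counts mismatched base pairs for a "nearly perfect" palindrome. It is not the structure-prediction analysis used in the study, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def palindrome_mismatches(seq: str) -> int:
    """Number of position pairs (i, n-1-i) that fail to base-pair.

    Each failing pair shows up twice when the sequence is compared with its
    reverse complement, so the raw count is halved. The unpaired middle base
    of an odd-length sequence is ignored by the integer division.
    """
    s = seq.upper()
    rc = reverse_complement(s)
    return sum(a != b for a, b in zip(s, rc)) // 2
```

    For example, `is_palindrome("GAATTC")` is true, while `palindrome_mismatches("GAATTA")` reports one non-pairing position pair.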

  15. A palindrome-mediated mechanism distinguishes translocations involving LCR-B of chromosome 22q11.2

    PubMed Central

    Gotter, Anthony L.; Shaikh, Tamim H.; Budarf, Marcia L.; Rhodes, C. Harker; Emanuel, Beverly S.

    2010-01-01

    Two known recurrent constitutional translocations, t(11;22) and t(17;22), as well as a non-recurrent t(4;22), display derivative chromosomes that have joined to a common site within the low copy repeat B (LCR-B) region of 22q11.2. This breakpoint is located between two AT-rich inverted repeats that form a nearly perfect palindrome. Breakpoints within the 11q23, 17q11 and 4q35 partner chromosomes also fall near the center of palindromic sequences. In the present work the breakpoints of a fourth translocation involving LCR-B, a balanced ependymoma-associated t(1;22), were characterized not only to localize this junction relative to known genes, but also to further understand the mechanism underlying these rearrangements. FISH mapping was used to localize the 22q11.2 breakpoint to LCR-B and the 1p21 breakpoint to single BAC clones. STS mapping narrowed the 1p21.2 breakpoint to a 1990 bp AT-rich region, and junction fragments were amplified by nested PCR. Junction fragment-derived sequence indicates that the 1p21.2 breakpoint splits a 278 nt palindrome capable of forming stem–loop secondary structure. In contrast, the 1p21.2 reference genomic sequence from clones in the database does not exhibit this configuration, suggesting a predisposition for regional genomic instability perhaps etiologic for this rearrangement. Given its similarity to known chromosomal fragile site (FRA) sequences, this polymorphic 1p21.2 sequence may represent one of the FRA1 loci. Comparative analysis of the secondary structure of sequences surrounding translocation breakpoints that involve LCR-B with those not involving this region indicates a unique ability of the former to form stem–loop structures. The relative likelihood of forming these configurations appears to be related to the rate of translocation occurrence. Further analysis suggests that constitutional translocations in general occur between sequences of similar melting temperature and propensity for secondary structure. PMID:14613967

  16. Genetic diversity and geographical structure of the pitcher plant Nepenthes vieillardii in New Caledonia: A chloroplast DNA haplotype analysis.

    PubMed

    Kurata, Kaoruko; Jaffré, Tanguy; Setoguchi, Hiroaki

    2008-12-01

    Among the many species that grow in New Caledonia, the pitcher plant Nepenthes vieillardii (Nepenthaceae) has a high degree of morphological variation. In this study, we present the patterns of genetic differentiation of pitcher plant populations based on chloroplast DNA haplotype analysis using the sequences of five spacers. We analyzed 294 samples from 16 populations covering the entire range of the species, using 4660 bp of sequence. Our analysis identified 17 haplotypes, including one that is widely distributed across the islands, as well as regional and private haplotypes. The greatest haplotype diversity was detected on the eastern coast of the largest island and included several private haplotypes, while haplotype diversity was low in the southern plains region. The parsimony network analysis of the 17 haplotypes suggested that the genetic divergence is the result of long-term isolation of individual populations. Results from a spatial analysis of molecular variance and a cluster analysis suggest that the plants once covered the entire serpentine area of New Caledonia and that subsequent regional fragmentation resulted in the isolation of each population and significantly restricted seed flow. This isolation may have been an important factor in the development of the morphological and genetic variation among pitcher plants in New Caledonia.

  17. Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing

    PubMed Central

    2013-01-01

    Background Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Results Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li’s D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li’s D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. 
    FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low FST. Conclusions This initial survey of genetic variation within honey bee RNA viruses suggests future directions for studies examining the underlying causes of population-genetic structure in these economically important pathogens. PMID:23497218

  18. Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing.

    PubMed

    Cornman, Robert Scott; Boncristiani, Humberto; Dainat, Benjamin; Chen, Yanping; vanEngelsdorp, Dennis; Weaver, Daniel; Evans, Jay D

    2013-03-07

    Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li's D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li's D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low FST. 
    This initial survey of genetic variation within honey bee RNA viruses suggests future directions for studies examining the underlying causes of population-genetic structure in these economically important pathogens.

  19. Discovery of common sequences absent in the human reference genome using pooled samples from next generation sequencing.

    PubMed

    Liu, Yu; Koyutürk, Mehmet; Maxwell, Sean; Xiang, Min; Veigl, Martina; Cooper, Richard S; Tayo, Bamidele O; Li, Li; LaFramboise, Thomas; Wang, Zhenghe; Zhu, Xiaofeng; Chance, Mark R

    2014-08-16

    Sequences up to several megabases in length have been found to be present in individual genomes but absent in the human reference genome. These sequences may be common in populations, and their absence in the reference genome may indicate rare variants in the genomes of individuals who served as donors for the human genome project. As the reference genome is used in probe design for microarray technology and mapping short reads in next generation sequencing (NGS), this missing sequence could be a source of bias in functional genomic studies and variant analysis. One End Anchor (OEA) and/or orphan reads from paired-end sequencing have been used to identify novel sequences that are absent in reference genome. However, there is no study to investigate the distribution, evolution and functionality of those sequences in human populations. To systematically identify and study the missing common sequences (micSeqs), we extended the previous method by pooling OEA reads from large number of individuals and applying strict filtering methods to remove false sequences. The pipeline was applied to data from phase 1 of the 1000 Genomes Project. We identified 309 micSeqs that are present in at least 1% of the human population, but absent in the reference genome. We confirmed 76% of these 309 micSeqs by comparison to other primate genomes, individual human genomes, and gene expression data. Furthermore, we randomly selected fifteen micSeqs and confirmed their presence using PCR validation in 38 additional individuals. Functional analysis using published RNA-seq and ChIP-seq data showed that eleven micSeqs are highly expressed in human brain and three micSeqs contain transcription factor (TF) binding regions, suggesting they are functional elements. 
    In addition, the identified micSeqs are absent in non-primates and show dynamic acquisition during primate evolution culminating with most micSeqs being present in Africans, suggesting some micSeqs may be important sources of human diversity. 76% of micSeqs were confirmed by a comparative genomics approach. Fourteen micSeqs are expressed in human brain or contain TF binding regions. Some micSeqs are primate-specific, conserved and may play a role in the evolution of primates.

  20. Sequence, distribution and chromosomal context of class I and class II pilin genes of Neisseria meningitidis identified in whole genome sequences

    PubMed Central

    2014-01-01

    Background Neisseria meningitidis expresses type four pili (Tfp) which are important for colonisation and virulence. Tfp have been considered as one of the most variable structures on the bacterial surface due to high frequency gene conversion, resulting in amino acid sequence variation of the major pilin subunit (PilE). Meningococci express either a class I or a class II pilE gene and recent work has indicated that class II pilins do not undergo antigenic variation, as class II pilE genes encode conserved pilin subunits. The purpose of this work was to use whole genome sequences to further investigate the frequency and variability of the class II pilE genes in meningococcal isolate collections. Results We analysed over 600 publicly available whole genome sequences of N. meningitidis isolates to determine the sequence and genomic organization of pilE. We confirmed that meningococcal strains belonging to a limited number of clonal complexes (ccs, namely cc1, cc5, cc8, cc11 and cc174) harbour a class II pilE gene which is conserved in terms of sequence and chromosomal context. We also identified pilS cassettes in all isolates with class II pilE; however, our analysis indicates that these do not serve as donor sequences for pilE/pilS recombination. Furthermore, our work reveals that the class II pilE locus lacks the DNA sequence motifs that enable (G4) or enhance (Sma/Cla repeat) pilin antigenic variation. Finally, through analysis of pilin genes in commensal Neisseria species we found that meningococcal class II pilE genes are closely related to pilE from Neisseria lactamica and Neisseria polysaccharea, suggesting horizontal transfer among these species. Conclusions Class II pilins can be defined by their amino acid sequence and genomic context and are present in meningococcal isolates which have persisted and spread globally. The absence of G4 and Sma/Cla sequences adjacent to the class II pilE genes is consistent with the lack of pilin subunit variation in these isolates, although horizontal transfer may generate class II pilin diversity. This study supports the suggestion that high frequency antigenic variation of pilin is not universal in pathogenic Neisseria. PMID:24690385

  1. Security Analysis of a Block Encryption Algorithm Based on Dynamic Sequences of Multiple Chaotic Systems

    NASA Astrophysics Data System (ADS)

    Du, Mao-Kang; He, Bo; Wang, Yong

    2011-01-01

    Recently, the cryptosystem based on chaos has attracted much attention. Wang and Yu (Commun. Nonlin. Sci. Numer. Simulat. 14 (2009) 574) proposed a block encryption algorithm based on dynamic sequences of multiple chaotic systems. We analyze the potential flaws in the algorithm. Then, a chosen-plaintext attack is presented. Some remedial measures are suggested to avoid the flaws effectively. Furthermore, an improved encryption algorithm is proposed to resist the attacks and to keep all the merits of the original cryptosystem.

  2. Organization and transient expression of the gene for human U11 snRNA

    PubMed Central

    Suter-Crazzolara, Clemens; Keller, Walter

    1991-01-01

    The nucleotide sequence of U11 small nuclear RNA, a minor U RNA from HeLa cells, was determined. Computer analysis of the sequence (135 residues) predicts two strong hairpin loops which are separated by seventeen nucleotides containing an Sm binding site (AAUUUUUUGG). A synthetic gene was constructed in which the coding region of U11 RNA is under the control of a T7 promoter. This vector can be used to produce U11 RNA in vitro. Southern hybridization and PCR analysis of HeLa genomic DNA suggest that U11 RNA is encoded by a single copy gene, and that at least three genomic regions could be U11 RNA pseudogenes. A HeLa genomic copy of a U11 gene was isolated by inverted PCR. This gene contains the U11 RNA coding sequence and several sequence elements unique for the U RNA genes. These include a Distal Sequence Element (DSE, ATTTGCATA) present between positions −215 and −223 relative to the start of transcription; a Proximal Sequence Element (PSE, TTCACCTTTACCAAAAATG) located between positions −43 and −63; and a 3′ box (GTTAGGCGAAATATTA) between positions +150 and +166. Transfection of HeLa cells with this gene revealed that it is functioning in vivo and can produce U11 RNA. PMID:1820214

  3. An Outbreak of Acute Hepatitis Caused by Genotype IB Hepatitis A Viruses Contaminating the Water Supply in Thailand.

    PubMed

    Ruchusatsawat, Kriangsak; Wongpiyabovorn, Jongkonnee; Kawidam, Chonthicha; Thiemsing, Laddawan; Sangkitporn, Somchai; Yoshizaki, Sayaka; Tatsumi, Masashi; Takeda, Naokazu; Ishii, Koji

    2016-01-01

    In 2000, an outbreak of acute hepatitis A was reported in a province adjacent to Bangkok, Thailand. The aim of this study was to investigate the cause of the 2000 hepatitis A outbreaks in Thailand using molecular epidemiological analysis. Serum and stool specimens were collected from patients who were clinically diagnosed with acute viral hepatitis. Water samples from drinking water and deep-drilled wells were also collected. These specimens were subjected to polymerase chain reaction (PCR) amplification and sequencing of the VP1/2A region of the hepatitis A virus (HAV) genome. The entire genome sequence of one of the fecal specimens was determined and phylogenetically analyzed with those of known HAV sequences. Eleven of 24 fecal specimens collected from acute viral hepatitis patients were positive as determined by semi-nested reverse transcription PCR targeting the VP1/2A region of HAV. These samples shared an identical genotype IB nucleotide sequence, suggesting that the same causative agent was present. The complete nucleotide sequence derived from one of the samples indicated that the Thai genotype IB strain should be classified in a unique phylogenetic cluster. The analysis using an adjusted odds ratio showed that the consumption of groundwater was the most likely risk factor associated with the disease.

  4. The complete genome sequence of a south Indian isolate of Rice tungro spherical virus reveals evidence of genetic recombination between distinct isolates.

    PubMed

    Sailaja, B; Anjum, Najreen; Patil, Yogesh K; Agarwal, Surekha; Malathi, P; Krishnaveni, D; Balachandran, S M; Viraktamath, B C; Mangrauthia, Satendra K

    2013-12-01

    In this study, the complete genome of a south Indian isolate of Rice tungro spherical virus (RTSV) from Andhra Pradesh (AP) was sequenced, and the predicted amino acid sequence was analysed. The RTSV RNA genome consists of 12,171 nt without the poly(A) tail, encoding a putative typical polyprotein of 3,470 amino acids. Furthermore, cleavage sites and sequence motifs of the polyprotein were predicted. Multiple alignment with other RTSV isolates showed a nucleotide sequence identity of 95% to east Indian isolates and 90% to Philippines isolates. A phylogenetic tree based on the complete genome sequence showed that Indian isolates clustered together, while the Vt6 and PhilA isolates from the Philippines formed two separate clusters. Twelve recombination events were detected in the RNA genome of RTSV using the Recombination Detection Program version 3. Recombination analysis suggested a significant role for the 5' end and central region of the genome in virus evolution. Further, the AP and Odisha isolates appeared to be important RTSV isolates involved in the diversification of this virus in India through recombination. The addition of the complete genome of the first south Indian isolate provided an opportunity to establish the molecular evolution of RTSV through recombination analysis and phylogenetic relationships.

  5. First molecular data on the phylum Loricifera: an investigation into the phylogeny of ecdysozoa with emphasis on the positions of Loricifera and Priapulida.

    PubMed

    Park, Joong-Ki; Rho, Hyun Soo; Kristensen, Reinhardt Møbjerg; Kim, Won; Giribet, Gonzalo

    2006-11-01

    Recent progress in molecular techniques has generated a wealth of information for phylogenetic analysis. Among metazoans all but a single phylum have been incorporated into some sort of molecular analysis. However, the minute and rare species of the phylum Loricifera have remained elusive to molecular systematists. Here we report the first molecular sequence data (nearly complete 18S rRNA) for a member of the phylum Loricifera, Pliciloricus sp. from Korea. The new sequence data were analyzed together with 52 other ecdysozoan sequences, with all other phyla represented by three or more sequences. The data set was analyzed using parsimony as an optimality criterion under direct optimization as well as using a Bayesian approach. The parsimony analysis was also accompanied by a sensitivity analysis. The results of both analyses are largely congruent, finding monophyly of each ecdysozoan phylum, except for Priapulida, in which the coelomate Meiopriapulus is separate from a clade of pseudocoelomate priapulids. The data also suggest a relationship of the pseudocoelomate priapulids to kinorhynchs, and a relationship of nematodes to tardigrades. The Bayesian analysis placed the arthropods as the sister group to a clade that includes tardigrades and nematodes. However, these results were shown to be parameter dependent in the sensitivity analysis. The position of Loricifera was extremely unstable to parameter variation, and support for a relationship of loriciferans to any particular ecdysozoan phylum was not found in the data.

  6. Putative cross-kingdom horizontal gene transfer in sponge (Porifera) mitochondria

    PubMed Central

    Rot, Chagai; Goldfarb, Itay; Ilan, Micha; Huchon, Dorothée

    2006-01-01

    Background The mitochondrial genome of Metazoa is usually a compact molecule without introns. Exceptions to this rule have been reported only in corals and sea anemones (Cnidaria), in which group I introns have been discovered in the cox1 and nad5 genes. Here we show several lines of evidence demonstrating that introns can also be found in the mitochondria of sponges (Porifera). Results A 2,349 bp fragment of the mitochondrial cox1 gene was sequenced from the sponge Tetilla sp. (Spirophorida). This fragment suggests the presence of a 1143 bp intron. Similar to all the cnidarian mitochondrial introns, the putative intron has group I intron characteristics. The intron is present in the cox1 gene and encodes a putative homing endonuclease. In order to establish the distribution of this intron in sponges, the cox1 gene was sequenced from several representatives of the demosponge diversity. The intron was found only in the sponge order Spirophorida. A phylogenetic analysis of the COI protein sequence and of the intron open reading frame suggests that the intron may have been transmitted horizontally from a fungus donor. Conclusion Little is known about sponge-associated fungi, although in the last few years the latter have been frequently isolated from sponges. We suggest that the horizontal gene transfer of a mitochondrial intron was facilitated by a symbiotic relationship between fungus and sponge. Ecological relationships are known to have implications at the genomic level. Here, an ecological relationship between sponge and fungus is suggested based on the genomic analysis. PMID:16972986

  7. Molecular evolution of miraculin-like proteins in soybean Kunitz super-family.

    PubMed

    Selvakumar, Purushotham; Gahloth, Deepankar; Tomar, Prabhat Pratap Singh; Sharma, Nidhi; Sharma, Ashwani Kumar

    2011-12-01

    Miraculin-like proteins (MLPs) belong to soybean Kunitz super-family and have been characterized from many plant families like Rutaceae, Solanaceae, Rubiaceae, etc. Many of them possess trypsin inhibitory activity and are involved in plant defense. MLPs exhibit significant sequence identity (~30-95%) to native miraculin protein, also belonging to Kunitz super-family compared with a typical Kunitz family member (~30%). The sequence and structure-function comparison of MLPs with that of a classical Kunitz inhibitor have demonstrated that MLPs have evolved to form a distinct group within Kunitz super-family. Sequence analysis of new genes along with available MLP sequences in the literature revealed three major groups for these proteins. A significant feature of Rutaceae MLP type 2 sequences is the presence of phosphorylation motif. Subtle changes are seen in putative reactive loop residues among different MLPs suggesting altered specificities to specific proteases. In phylogenetic analysis, Rutaceae MLP type 1 and type 2 proteins clustered together on separate branches, whereas native miraculin along with other MLPs formed distinct clusters. Site-specific positive Darwinian selection was observed at many sites in both the groups of Rutaceae MLP sequences with most of the residues undergoing positive selection located in loop regions. The results demonstrate the sequence and thereby the structure-function divergence of MLPs as a distinct group within soybean Kunitz super-family due to biotic and abiotic stresses of local environment.

  8. The complete mitochondrial genome and phylogenetic analysis of the giant panda (Ailuropoda melanoleuca).

    PubMed

    Peng, Rui; Zeng, Bo; Meng, Xiuxiang; Yue, Bisong; Zhang, Zhihe; Zou, Fangdong

    2007-08-01

    The complete mitochondrial genome sequence of the giant panda, Ailuropoda melanoleuca, was determined by long and accurate polymerase chain reaction (LA-PCR) with conserved primers and primer-walking sequencing. The complete mitochondrial DNA is 16,805 nucleotides in length and contains two ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and one control region. The total length of the 13 protein-coding genes is longer than that of the American black bear, brown bear and polar bear by 3 amino acids at the end of the ND5 gene. The codon usage also followed the typical vertebrate pattern except for an unusual ATT start codon, which initiates the NADH dehydrogenase subunit 5 (ND5) gene. The molecular phylogenetic analysis was performed on the sequences of 12 concatenated heavy-strand encoded protein-coding genes, and suggested that the giant panda is most closely related to bears.

  9. Genome Sequencing and Analysis of the Tasmanian Devil and Its Transmissible Cancer

    PubMed Central

    Murchison, Elizabeth P.; Schulz-Trieglaff, Ole B.; Ning, Zemin; Alexandrov, Ludmil B.; Bauer, Markus J.; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R.; Cheetham, R. Keira; Cheng, William; Connor, Thomas R.; Cox, Anthony J.; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J.; Harris, Simon R.; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J.; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J.; Wedge, David C.; Woods, Gregory M.; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M.J.; Carter, Nigel P.; Papenfuss, Anthony T.; Futreal, P. Andrew; Campbell, Peter J.; Yang, Fengtang; Bentley, David R.; Evers, Dirk J.; Stratton, Michael R.

    2012-01-01

    Summary The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations. PMID:22341448

  10. Role of indirect readout mechanism in TATA box binding protein-DNA interaction.

    PubMed

    Mondal, Manas; Choudhury, Devapriya; Chakrabarti, Jaydeb; Bhattacharyya, Dhananjay

    2015-03-01

    Gene expression generally initiates with the binding of TATA-box binding protein (TBP) to the minor groove of the TATA box DNA sequence, where the DNA structure is significantly different from B-DNA. We have carried out molecular dynamics simulation studies of the TBP-DNA system to understand how the DNA structure alters for efficient binding. We observed that the protein is rigid, while the DNA of the TATA box sequence has an inherent flexibility in terms of bending and minor groove widening. The bending analysis of the free DNA and the TBP-bound DNA systems indicates the presence of some similar structures. Principal coordinate ordination analysis also indicates that some structural features of the protein-bound and free DNA are similar. Thus we suggest that the DNA of the TATA box sequence regularly oscillates between several alternate structures and the one suitable for TBP binding is induced further by the protein for proper complex formation.

  11. Alu repeat discovery and characterization within human genomes

    PubMed Central

    Hormozdiari, Fereydoun; Alkan, Can; Ventura, Mario; Hajirasouliha, Iman; Malig, Maika; Hach, Faraz; Yorukoglu, Deniz; Dao, Phuong; Bakhshi, Marzieh; Sahinalp, S. Cenk; Eichler, Evan E.

    2011-01-01

    Human genomes are now being rapidly sequenced, but not all forms of genetic variation are routinely characterized. In this study, we focus on Alu retrotransposition events and seek to characterize differences in the pattern of mobile insertion between individuals based on the analysis of eight human genomes sequenced using next-generation sequencing. Applying a rapid read-pair analysis algorithm, we discover 4342 Alu insertions not found in the human reference genome and show that 98% of a selected subset (63/64) experimentally validate. Of these new insertions, 89% correspond to AluY elements, suggesting that they arose by retrotransposition. Eighty percent of the Alu insertions have not been previously reported and more novel events were detected in Africans when compared with non-African samples (76% vs. 69%). Using these data, we develop an experimental and computational screen to identify ancestry informative Alu retrotransposition events among different human populations. PMID:21131385

  12. Evaluation of massively parallel sequencing for forensic DNA methylation profiling.

    PubMed

    Richards, Rebecca; Patel, Jayshree; Stevenson, Kate; Harbison, SallyAnn

    2018-05-11

    Epigenetics is an emerging area of interest in forensic science. DNA methylation, a type of epigenetic modification, can be applied to chronological age estimation, identical twin differentiation and body fluid identification. However, there is not yet an agreed, established methodology for targeted detection and analysis of DNA methylation markers in forensic research. Recently a massively parallel sequencing-based approach has been suggested. The use of massively parallel sequencing is well established in clinical epigenetics and is emerging as a new technology in the forensic field. This review investigates the potential benefits, limitations and considerations of this technique for the analysis of DNA methylation in a forensic context. The importance of a robust protocol, regardless of the methodology used, that minimises potential sources of bias is highlighted.

  13. Computational analysis and functional expression of ancestral copepod luciferase.

    PubMed

    Takenaka, Yasuhiro; Noda-Ogura, Akiko; Imanishi, Tadashi; Yamaguchi, Atsushi; Gojobori, Takashi; Shigeri, Yasushi

    2013-10-10

    We recently reported the cDNA sequences of 11 copepod luciferases from the superfamily Augaptiloidea in the order Calanoida. They were classified into two groups, Metridinidae and Heterorhabdidae/Lucicutiidae families, by phylogenetic analyses. To elucidate the evolutionary processes, we have now further isolated 12 copepod luciferases from Augaptiloidea species (Metridia asymmetrica, Metridia curticauda, Pleuromamma scutullata, Pleuromamma xiphias, Lucicutia ovaliformis and Heterorhabdus tanneri). Codon-based synonymous/nonsynonymous tests of positive selection for 25 identified copepod luciferases suggested that positive Darwinian selection operated in the evolution of Heterorhabdidae luciferases, whereas two types of Metridinidae luciferases had diversified via neutral mechanism. By in silico analysis of the decoded amino acid sequences of 25 copepod luciferases, we inferred two protein sequences as ancestral copepod luciferases. They were expressed in HEK293 cells where they exhibited notable luciferase activity both in intracellular lysates and cultured media, indicating that the luciferase activity was established before evolutionary diversification of these copepod species.

  14. Evaluation of Helicobacter pylori infection and clarithromycin resistance in strains from symptomatic Colombian children.

    PubMed

    Rosero Lasso, Yuliet Liliana; Arévalo-Jaimes, Betsy Verónica; Delgado, María de Pilar; Vera-Chamorro, José Fernando; García, Daniella; Ramírez, Andrea; Rodríguez-Urrego, Paula A; Álvarez, Johanna; Jaramillo, Carlos Alberto

    2018-04-27

    The aim was to determine the current prevalence of Helicobacter pylori in symptomatic Colombian children and to evaluate the presence of mutations associated with clarithromycin resistance. Biopsies from 133 children were analyzed. The gastric fragment was used for the urease test and reused for PCR-sequencing of the 23S rDNA gene. Mutations were detected by bioinformatic analysis. PCR-sequencing established that H. pylori infection was present in 47% of patients. Bioinformatic analysis of the 62 positive sequences for 23S rDNA revealed that 92% exhibited a genotype susceptible to clarithromycin, whereas the remaining strains (8%) showed mutations associated with clarithromycin resistance. The low rate of resistance to clarithromycin (8%) suggests that conventional treatment methods are an appropriate choice for children. Recycling a biopsy that is normally discarded reduces the risks associated with the procedure. The 23S rDNA gene amplification could be used for a dual purpose: detection of H. pylori and determination of susceptibility to clarithromycin.

  15. Nucleotide Sequence and Genetic Structure of a Novel Carbaryl Hydrolase Gene (cehA) from Rhizobium sp. Strain AC100

    PubMed Central

    Hashimoto, Masayuki; Fukui, Mitsuru; Hayano, Kouichi; Hayatsu, Masahito

    2002-01-01

    Rhizobium sp. strain AC100, which is capable of degrading carbaryl (1-naphthyl-N-methylcarbamate), was isolated from soil treated with carbaryl. This bacterium hydrolyzed carbaryl to 1-naphthol and methylamine. Carbaryl hydrolase from the strain was purified to homogeneity, and its N-terminal sequence, molecular mass (82 kDa), and enzymatic properties were determined. The purified enzyme hydrolyzed 1-naphthyl acetate and 4-nitrophenyl acetate, indicating that the enzyme is an esterase. We then cloned the carbaryl hydrolase gene (cehA) from the plasmid DNA of the strain and determined the nucleotide sequence of the 10-kb region containing cehA. No homologous sequences were found by a database homology search using the nucleotide and deduced amino acid sequences of the cehA gene. Six open reading frames including the cehA gene were found in the 10-kb region, and sequencing analysis shows that the cehA gene is flanked by two copies of an insertion sequence-like sequence, suggesting that it forms part of a composite transposon. PMID:11872471

  16. Taxonomic evaluation of selected Ganoderma species and database sequence validation

    PubMed Central

    Jargalmaa, Suldbold; Eimes, John A.; Park, Myung Soo; Park, Jae Young; Oh, Seung-Yoon

    2017-01-01

    Species in the genus Ganoderma include several ecologically important and pathogenic fungal species whose medicinal and economic value is substantial. Due to the highly similar morphological features within Ganoderma, identification of species has relied heavily on DNA sequencing using BLAST searches, which are only reliable if the GenBank submissions are accurately labeled. In this study, we examined 113 specimens collected from 1969 to 2016 from various regions in Korea using morphological features and multigene analysis (internal transcribed spacer, translation elongation factor 1-α, and the second largest subunit of RNA polymerase II). These specimens were identified as four Ganoderma species: G. sichuanense, G. cf. adspersum, G. cf. applanatum, and G. cf. gibbosum. With the exception of G. sichuanense, these species were difficult to distinguish based solely on morphological features. However, phylogenetic analysis at three different loci yielded concordant phylogenetic information and supported the four species distinctions with high bootstrap support. A survey of over 600 Ganoderma sequences available on GenBank revealed that 65% of sequences were either misidentified or ambiguously labeled. Here, we suggest corrected annotations for GenBank sequences based on our phylogenetic validation and provide updated global distribution patterns for these Ganoderma species. PMID:28761785

  17. Study of cnidarian-algal symbiosis in the "omics" age.

    PubMed

    Meyer, Eli; Weis, Virginia M

    2012-08-01

    The symbiotic associations between cnidarians and dinoflagellate algae (Symbiodinium) support productive and diverse ecosystems in coral reefs. Many aspects of this association, including the mechanistic basis of host-symbiont recognition and metabolic interaction, remain poorly understood. The first completed genome sequence for a symbiotic anthozoan is now available (the coral Acropora digitifera), and extensive expressed sequence tag resources are available for a variety of other symbiotic corals and anemones. These resources make it possible to profile gene expression, protein abundance, and protein localization associated with the symbiotic state. Here we review the history of "omics" studies of cnidarian-algal symbiosis and the current availability of sequence resources for corals and anemones, identifying genes putatively involved in symbiosis across 10 anthozoan species. The public availability of candidate symbiosis-associated genes leaves the field of cnidarian-algal symbiosis poised for in-depth comparative studies of sequence diversity and gene expression and for targeted functional studies of genes associated with symbiosis. Reviewing the progress to date suggests directions for future investigations of cnidarian-algal symbiosis that include (i) sequencing of Symbiodinium, (ii) proteomic analysis of the symbiosome membrane complex, (iii) glycomic analysis of Symbiodinium cell surfaces, and (iv) expression profiling of the gastrodermal cells hosting Symbiodinium.

  18. DNA sequence analysis of the photosynthesis region of Rhodobacter sphaeroides 2.4.1.

    PubMed

    Choudhary, M; Kaplan, S

    2000-02-15

    This paper describes the DNA sequence of the photosynthesis region of Rhodobacter sphaeroides 2.4.1 (T). The photosynthesis gene cluster is located within an approximately 73 kb Ase I genomic DNA fragment containing the puf, puhA, cycA and puc operons. A total of 65 open reading frames (ORFs) have been identified, of which 61 showed significant similarity to genes/proteins of other organisms while only four did not reveal any significant sequence similarity to any gene/protein sequences in the database. The data were compared with the corresponding genes/ORFs from a different strain of R. sphaeroides and Rhodobacter capsulatus, a close relative of R. sphaeroides. A detailed analysis of the gene organization in the photosynthesis region revealed a similar gene order in both species with some notable differences localized to the pucBAC-cycA region. In addition, photosynthesis gene regulatory protein (PpsR, FNR, IHF) binding motifs in upstream sequences of a number of photosynthesis genes have been identified and shown to differ between these two species. The difference in gene organization relative to pucBAC and cycA suggests that this region originated independently of the photosynthesis gene cluster of R. sphaeroides.

  19. Sieve analysis of breakthrough HIV-1 sequences in HVTN 505 identifies vaccine pressure targeting the CD4 binding site of Env-gp120.

    PubMed

    deCamp, Allan C; Rolland, Morgane; Edlefsen, Paul T; Sanders-Buell, Eric; Hall, Breana; Magaret, Craig A; Fiore-Gartland, Andrew J; Juraska, Michal; Carpp, Lindsay N; Karuna, Shelly T; Bose, Meera; LePore, Steven; Miller, Shana; O'Sullivan, Annemarie; Poltavee, Kultida; Bai, Hongjun; Dommaraju, Kalpana; Zhao, Hong; Wong, Kim; Chen, Lennie; Ahmed, Hasan; Goodman, Derrick; Tay, Matthew Z; Gottardo, Raphael; Koup, Richard A; Bailer, Robert; Mascola, John R; Graham, Barney S; Roederer, Mario; O'Connell, Robert J; Michael, Nelson L; Robb, Merlin L; Adams, Elizabeth; D'Souza, Patricia; Kublin, James; Corey, Lawrence; Geraghty, Daniel E; Frahm, Nicole; Tomaras, Georgia D; McElrath, M Juliana; Frenkel, Lisa; Styrchak, Sheila; Tovanabutra, Sodsai; Sobieszczyk, Magdalena E; Hammer, Scott M; Kim, Jerome H; Mullins, James I; Gilbert, Peter B

    2017-01-01

    Although the HVTN 505 DNA/recombinant adenovirus type 5 vector HIV-1 vaccine trial showed no overall efficacy, analysis of breakthrough HIV-1 sequences in participants can help determine whether vaccine-induced immune responses impacted viruses that caused infection. We analyzed 480 HIV-1 genomes sampled from 27 vaccine and 20 placebo recipients and found that intra-host HIV-1 diversity was significantly lower in vaccine recipients (P ≤ 0.04, Q-values ≤ 0.09) in Gag, Pol, Vif and envelope glycoprotein gp120 (Env-gp120). Furthermore, Env-gp120 sequences from vaccine recipients were significantly more distant from the subtype B vaccine insert than sequences from placebo recipients (P = 0.01, Q-value = 0.12). These vaccine effects were associated with signatures mapping to CD4 binding site and CD4-induced monoclonal antibody footprints. These results suggest either (i) no vaccine efficacy to block acquisition of any viral genotype but vaccine-accelerated Env evolution post-acquisition; or (ii) vaccine efficacy against HIV-1s with Env sequences closest to the vaccine insert combined with increased acquisition due to other factors, potentially including the vaccine vector.

  20. Sieve analysis of breakthrough HIV-1 sequences in HVTN 505 identifies vaccine pressure targeting the CD4 binding site of Env-gp120

    PubMed Central

    Edlefsen, Paul T.; Sanders-Buell, Eric; Hall, Breana; Magaret, Craig A.; Fiore-Gartland, Andrew J.; Juraska, Michal; Carpp, Lindsay N.; Karuna, Shelly T.; Bose, Meera; LePore, Steven; Miller, Shana; O'Sullivan, Annemarie; Poltavee, Kultida; Bai, Hongjun; Dommaraju, Kalpana; Zhao, Hong; Wong, Kim; Chen, Lennie; Ahmed, Hasan; Goodman, Derrick; Tay, Matthew Z.; Gottardo, Raphael; Koup, Richard A.; Bailer, Robert; Mascola, John R.; Graham, Barney S.; Roederer, Mario; O’Connell, Robert J.; Michael, Nelson L.; Robb, Merlin L.; Adams, Elizabeth; D’Souza, Patricia; Kublin, James; Corey, Lawrence; Geraghty, Daniel E.; Frahm, Nicole; Tomaras, Georgia D.; McElrath, M. Juliana; Frenkel, Lisa; Styrchak, Sheila; Tovanabutra, Sodsai; Sobieszczyk, Magdalena E.; Hammer, Scott M.; Kim, Jerome H.; Mullins, James I.; Gilbert, Peter B.

    2017-01-01

    Although the HVTN 505 DNA/recombinant adenovirus type 5 vector HIV-1 vaccine trial showed no overall efficacy, analysis of breakthrough HIV-1 sequences in participants can help determine whether vaccine-induced immune responses impacted viruses that caused infection. We analyzed 480 HIV-1 genomes sampled from 27 vaccine and 20 placebo recipients and found that intra-host HIV-1 diversity was significantly lower in vaccine recipients (P ≤ 0.04, Q-values ≤ 0.09) in Gag, Pol, Vif and envelope glycoprotein gp120 (Env-gp120). Furthermore, Env-gp120 sequences from vaccine recipients were significantly more distant from the subtype B vaccine insert than sequences from placebo recipients (P = 0.01, Q-value = 0.12). These vaccine effects were associated with signatures mapping to CD4 binding site and CD4-induced monoclonal antibody footprints. These results suggest either (i) no vaccine efficacy to block acquisition of any viral genotype but vaccine-accelerated Env evolution post-acquisition; or (ii) vaccine efficacy against HIV-1s with Env sequences closest to the vaccine insert combined with increased acquisition due to other factors, potentially including the vaccine vector. PMID:29149197

  1. High Bacterial Diversity in Permanently Cold Marine Sediments

    PubMed Central

    Ravenschlag, Katrin; Sahm, Kerstin; Pernthaler, Jakob; Amann, Rudolf

    1999-01-01

    A 16S ribosomal DNA (rDNA) clone library from permanently cold marine sediments was established. Screening 353 clones by dot blot hybridization with group-specific oligonucleotide probes suggested a predominance of sequences related to bacteria of the sulfur cycle (43.4% potential sulfate reducers). Within this fraction, the major cluster (19.0%) was affiliated with Desulfotalea sp. and other closely related psychrophilic sulfate reducers isolated from the same habitat. The cloned sequences showed between 93 and 100% similarity to these bacteria. Two additional groups were frequently encountered: 13% of the clones were related to Desulfuromonas palmitatis, and a second group was affiliated with Myxobacteria spp. and Bdellovibrio spp. Many clones (18.1%) belonged to the γ subclass of the class Proteobacteria and were closest to symbiotic or free-living sulfur oxidizers. Probe target groups were further characterized by amplified rDNA restriction analysis to determine diversity within the groups and within the clone library. Rarefaction analysis suggested that the total diversity assessed by 16S rDNA analysis was very high in these permanently cold sediments and was only partially revealed by screening of 353 clones. PMID:10473405
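
    The rarefaction analysis mentioned in this record can be illustrated with a minimal sketch. It uses the standard analytical (hypergeometric) rarefaction formula, which may or may not be the exact procedure the authors applied, and the clone counts per group below are hypothetical, not taken from the study.

```python
from math import comb

def rarefaction(counts, n):
    """Expected number of distinct groups in a random subsample of n clones.

    counts: number of clones observed in each group (hypothetical data).
    Uses E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)).
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the library size")
    return sum(1 - comb(total - c, n) / comb(total, n) for c in counts)

# Hypothetical clone library: a few dominant groups and many rare ones.
example_counts = [40, 25, 10, 5, 3, 2, 1, 1, 1, 1]

if __name__ == "__main__":
    for n in (10, 30, 60, sum(example_counts)):
        print(n, round(rarefaction(example_counts, n), 2))
```

    A curve that is still rising at the full library size indicates, as the abstract reports for the 353 screened clones, that the sampled diversity is only partially covered.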

  2. The evolution and population structure of Lactobacillus fermentum from different naturally fermented products as determined by multilocus sequence typing (MLST).

    PubMed

    Dan, Tong; Liu, Wenjun; Song, Yuqin; Xu, Haiyan; Menghe, Bilige; Zhang, Heping; Sun, Zhihong

    2015-05-20

    Lactobacillus fermentum is economically important in the production and preservation of fermented foods. A repeatable and discriminative typing method was devised to characterize L. fermentum at the molecular level. The multilocus sequence typing (MLST) scheme developed was based on analysis of the internal sequence of 11 housekeeping gene fragments (clpX, dnaA, dnaK, groEL, murC, murE, pepX, pyrG, recA, rpoB, and uvrC). MLST analysis of 203 isolates of L. fermentum from Mongolia and seven provinces/autonomous regions in China identified 57 sequence types (ST), 27 of which were represented by only a single isolate, indicating high genetic diversity. Phylogenetic analyses based on the sequence of the 11 housekeeping gene fragments indicated that the L. fermentum isolates analyzed belonged to two major groups. A standardized index of association (I_A^S) indicated a weak clonal population structure in L. fermentum. Split decomposition analysis indicated that recombination played an important role in generating the genetic diversity observed in L. fermentum. The results from the minimum spanning tree strongly suggested that evolution of L. fermentum STs was not correlated with geography or food-type. The MLST scheme developed will be valuable for further studies on the evolution and population structure of L. fermentum isolates used in food products.
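
    The core bookkeeping of an MLST scheme can be sketched as follows. Each isolate is reduced to a profile of allele numbers at the 11 loci named above, and every distinct profile receives a sequence type (ST). The isolate names and allele numbers are hypothetical, and the ST numbering is arbitrary rather than the numbering used in the study.

```python
from collections import Counter

LOCI = ("clpX", "dnaA", "dnaK", "groEL", "murC", "murE",
        "pepX", "pyrG", "recA", "rpoB", "uvrC")

def assign_sequence_types(profiles):
    """Map each isolate to an ST; identical allele profiles share one ST."""
    st_of_profile = {}
    st_of_isolate = {}
    for isolate, profile in profiles.items():
        key = tuple(profile[locus] for locus in LOCI)
        if key not in st_of_profile:
            st_of_profile[key] = len(st_of_profile) + 1  # next unused ST number
        st_of_isolate[isolate] = st_of_profile[key]
    return st_of_isolate

def singleton_sts(st_of_isolate):
    """Return the STs represented by exactly one isolate."""
    counts = Counter(st_of_isolate.values())
    return sorted(st for st, k in counts.items() if k == 1)

# Hypothetical isolates: A and B share a profile, C differs at one locus.
base = {locus: 1 for locus in LOCI}
example_profiles = {
    "isolate_A": dict(base),
    "isolate_B": dict(base),
    "isolate_C": {**base, "recA": 2},
}
example_sts = assign_sequence_types(example_profiles)
```

    In this toy data set the two identical profiles collapse into one ST and the third isolate forms a singleton ST, the same kind of tally behind the 27 single-isolate STs reported in the abstract.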

  3. A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing

    PubMed Central

    Green, Richard E.; Malaspinas, Anna-Sapfo; Krause, Johannes; Briggs, Adrian W.; Johnson, Philip L. F.; Uhler, Caroline; Meyer, Matthias; Good, Jeffrey M.; Maricic, Tomislav; Stenzel, Udo; Prüfer, Kay; Siebauer, Michael; Burbano, Hernán A.; Ronan, Michael; Rothberg, Jonathan M.; Egholm, Michael; Rudan, Pavao; Brajković, Dejana; Kućan, Željko; Gušić, Ivan; Wikström, Mårten; Laakkonen, Liisa; Kelso, Janet; Slatkin, Montgomery; Pääbo, Svante

    2008-01-01

    A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000-year-old Neandertal individual using 8,341 mtDNA sequences identified among 4.8 Gb of DNA generated from ~0.3 grams of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs and allows an estimate of the divergence date between the two mtDNA lineages of 660,000±140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared to other primate lineages, suggesting that the effective population size of Neandertals was small. PMID:18692465

  4. New species and phylogenetic relationships of the spider genus Coptoprepes using morphological and sequence data (Araneae: Anyphaenidae).

    PubMed

    Barone, Mariana L; Werenkraut, Victoria; Ramírez, Martín J

    2016-10-17

    We present evidence from the standard cytochrome c oxidase subunit I (COI) barcoding marker and from new collections, showing that the males and females of C. ecotono Werenkraut & Ramírez were mismatched, and describe the female of that species for the first time. An undescribed male from Chile is assigned to the new species Coptoprepes laudani, together with the female that was previously thought to be C. ecotono. The matching of sexes is justified by a cladistic analysis of morphological and sequence data in combination. New locality data and barcoding sequences are provided for other species of Coptoprepes, all endemic to the temperate forests of Chile and adjacent Argentina. Although morphology and sequences are not conclusive on the relationships of Coptoprepes species, the sequence data suggest that the species without a retrolateral tibial apophysis may belong to an independent lineage.

  5. Nucleotide sequence analysis of the L gene of Newcastle disease virus: homologies with Sendai and vesicular stomatitis viruses.

    PubMed Central

    Yusoff, K; Millar, N S; Chambers, P; Emmerson, P T

    1987-01-01

    The nucleotide sequence of the L gene of the Beaudette C strain of Newcastle disease virus (NDV) has been determined. The L gene is 6704 nucleotides long and encodes a protein of 2204 amino acids with a calculated molecular weight of 248822. Mung bean nuclease mapping of the 5' terminus of the L gene mRNA indicates that the transcription of the L gene is initiated 11 nucleotides upstream of the translational start site. Comparison with the amino acid sequences of the L genes of Sendai virus and vesicular stomatitis virus (VSV) suggests that there are several regions of homology between the sequences. These data provide further evidence for an evolutionary relationship between the Paramyxoviridae and the Rhabdoviridae. A non-coding sequence of 46 nucleotides downstream of the presumed polyadenylation site of the L gene may be part of a negative strand leader RNA. PMID:3035486

  6. Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome

    PubMed Central

    Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

    2016-01-01

    Accelerated evolution of regulatory sequences can alter the expression pattern of target genes and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. PMID:27401230

  7. Sequence stratigraphy of the Triassic in the Barentsz Sea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Skjold, L.JU.; Van Veen, P.M.; Gjelberg, J.

    1990-05-01

    A regional study of the Triassic in the Barentsz Sea (20-32°E, 71-74°N) revealed sequences that correlate seismically for hundreds of kilometers. Recent offshore drilling results enabled them to establish a biostratigraphic time framework. Comparisons with information from onshore outcrops (such as the Svalbard Archipelago) aided the piecing together of these superregional sequences. Seismic character analysis identified three units with composite progradational patterns (Induan, Olenekian, and Anisian). Fluvial, deltaic, and marine deposits can be distinguished and located relative to the paleocoastlines. Corresponding downlap surfaces suggest the development of condensed intervals, predicted to consist of organic-rich source rocks, as was later confirmed by drilling. Regional predictions based on this sequence-stratigraphic approach have proved valuable when correlating and evaluating well information. The sequences identified also help define third-order sea level curves for the area; these improve published curves thought to have global significance.

  8. Recurrent Network models of sequence generation and memory

    PubMed Central

    Rajan, Kanaka; Harvey, Christopher D; Tank, David W

    2016-01-01

    Sequential activation of neurons is a common feature of network activity during a variety of behaviors, including working memory and decision making. Previous network models for sequences and memory emphasized specialized architectures in which a principled mechanism is pre-wired into their connectivity. Here, we demonstrate that starting from random connectivity and modifying a small fraction of connections, a largely disordered recurrent network can produce sequences and implement working memory efficiently. We use this process, called Partial In-Network training (PINning), to model and match cellular-resolution imaging data from the posterior parietal cortex during a virtual memory-guided two-alternative forced choice task [Harvey, Coen and Tank, 2012]. Analysis of the connectivity reveals that sequences propagate by the cooperation between recurrent synaptic interactions and external inputs, rather than through feedforward or asymmetric connections. Together our results suggest that neural sequences may emerge through learning from largely unstructured network architectures. PMID:26971945

  9. SNPs in putative regulatory regions identified by human mouse comparative sequencing and transcription factor binding site data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.

    2002-01-01

    Genome wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits and a vast number of SNPs have been generated for this purpose. As there are cost and throughput limitations of genotyping large numbers of SNPs and statistical issues regarding the large number of dependent tests on the same data set, to make association analysis practical it has been proposed that SNPs should be prioritized based on likely functional importance. The most easily identifiable functional SNPs are coding SNPs (cSNPs) and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs, is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.

  10. Phylogenomics of Phrynosomatid Lizards: Conflicting Signals from Sequence Capture versus Restriction Site Associated DNA Sequencing

    PubMed Central

    Leaché, Adam D.; Chavez, Andreas S.; Jones, Leonard N.; Grummer, Jared A.; Gottscho, Andrew D.; Linkem, Charles W.

    2015-01-01

    Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both “recent” and “deep” timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus. PMID:25663487

  11. RAD-Seq analysis of typical and minor Citrus accessions, including Bhutanese varieties

    PubMed Central

    Penjor, Tshering; Mimura, Takashi; Kotoda, Nobuhiro; Matsumoto, Ryoji; Nagano, Atsushi J.; Honjo, Mie N.; Kudoh, Hiroshi; Yamamoto, Masashi; Nagano, Yukio

    2016-01-01

    We analyzed the reduced-representation genome sequences of Citrus species by double-digest restriction site-associated DNA sequencing (ddRAD-Seq) using 44 accessions, including typical and minor accessions, such as Bhutanese varieties. The results of this analysis using typical accessions were consistent with previous reports that citron, papeda, pummelo, and mandarin are ancestral species, and that most Citrus species are derivatives or hybrids of these four species. Citrus varieties often reproduce asexually and heterozygosity is highly conserved within each variety. Because this approach could readily detect conservation of heterozygosity, it was able to discriminate citrus varieties such as satsuma mandarin from closely related species. Thus, this method provides an inexpensive way to protect citrus varieties from unintended introduction and to prevent the provision of incorrect nursery stocks to customers. One Citrus variety in Bhutan was morphologically similar to Mexican lime and was designated as Himalayan lime. The current analysis confirmed the previous proposition that Mexican lime is a hybrid between papeda and citron, and also suggested that Himalayan lime is a probable hybrid between mandarin and citron. In addition to Himalayan lime, current analysis suggested that several accessions were formed by previously undescribed combinations. PMID:28163596

  12. Analysis of the complete mitochondrial genome of the Zhedong White goose and characterization of NUMTs: Reveal domestication history of goose in China and Euro.

    PubMed

    Ren, Ting; Liang, Shiri; Zhao, Ayong; He, Ke

    2016-02-10

    To understand the phyletic evolution of geese, the complete mitogenome of the Zhedong goose was sequenced for the first time. It is composed of 37 genes and 1 control region, and the structure and arrangement of all genes sequenced are identical to those of other goose breeds. We confirmed the accuracy of the mitogenome sequence through RT-PCR and found numts from amplification in genomic DNA. Comparisons of the phylogenetic trees and sequences of geese suggested that Chinese geese form one clade, except the Yili goose, which was classified in the Euro clade. Several breed-specific mutations and Chinese breed-specific mutations were found. Our results suggest that Chinese geese evolved from the swan goose, splitting from their common ancestors at different times, which is consistent with previous studies. Furthermore, numts in most genes of the Zhedong goose clustered with European geese in the phylogenetic tree, suggesting that the haplotypes in the Euro clade might be more ancient. However, the mitogenome of the swan goose shows distinctive evolutionary positions in some genes, which suggests that its relationship with Chinese and European geese remains unclear. The current study adds to the understanding of the evolution of geese and provides evidence that the typing of numts is a promising approach for the evolutionary study of geese and that the mitochondrial genomes of geese deserve further investigation. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Centromere Locations in Brassica A and C Genomes Revealed Through Half-Tetrad Analysis

    PubMed Central

    Mason, Annaliese S.; Rousseau-Gueutin, Mathieu; Morice, Jérôme; Bayer, Philipp E.; Besharat, Naghmeh; Cousin, Anouska; Pradhan, Aneeta; Parkin, Isobel A. P.; Chèvre, Anne-Marie; Batley, Jacqueline; Nelson, Matthew N.

    2016-01-01

    Locating centromeres on genome sequences can be challenging. The high density of repetitive elements in these regions makes sequence assembly problematic, especially when using short-read sequencing technologies. It can also be difficult to distinguish between active and recently extinct centromeres through sequence analysis. An effective solution is to identify genetically active centromeres (functional in meiosis) by half-tetrad analysis. This genetic approach involves detecting heterozygosity along chromosomes in segregating populations derived from gametes (half-tetrads). Unreduced gametes produced by first division restitution mechanisms comprise complete sets of nonsister chromatids. Along these chromatids, heterozygosity is maximal at the centromeres, and homologous recombination events result in homozygosity toward the telomeres. We genotyped populations of half-tetrad-derived individuals (from Brassica interspecific hybrids) using a high-density array of physically anchored SNP markers (Illumina Brassica 60K Infinium array). Mapping the distribution of heterozygosity in these half-tetrad individuals allowed the genetic mapping of all 19 centromeres of the Brassica A and C genomes to the reference Brassica napus genome. Gene and transposable element density across the B. napus genome were also assessed and corresponded well to previously reported genetic map positions. Known centromere-specific sequences were located in the reference genome, but mostly matched unanchored sequences, suggesting that the core centromeric regions may not yet be assembled into the pseudochromosomes of the reference genome. The increasing availability of genetic markers physically anchored to reference genomes greatly simplifies the genetic and physical mapping of centromeres using half-tetrad analysis. We discuss possible applications of this approach, including in species where half-tetrads are currently difficult to isolate. PMID:26614742
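
    The logic of half-tetrad centromere mapping described in this record (heterozygosity is maximal at the centromere and decays toward the telomeres) can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: the marker positions and genotype calls are hypothetical, and the centromere is simply taken as the span of markers with the highest heterozygous fraction.

```python
def heterozygosity_by_marker(genotypes):
    """Fraction of heterozygous individuals at each marker.

    genotypes: one list per individual, ordered by marker position,
    with True meaning the individual is heterozygous at that marker.
    """
    n = len(genotypes)
    n_markers = len(genotypes[0])
    return [sum(ind[i] for ind in genotypes) / n for i in range(n_markers)]

def centromere_interval(positions, het):
    """Return (start, end) of the markers with maximal heterozygosity."""
    peak = max(het)
    at_peak = [p for p, h in zip(positions, het) if h == peak]
    return min(at_peak), max(at_peak)

# Hypothetical chromosome: five markers (positions in Mb), three individuals.
example_positions = [1, 5, 10, 15, 20]
example_genotypes = [
    [False, True, True, True, False],
    [True, True, True, False, False],
    [False, False, True, True, True],
]
example_het = heterozygosity_by_marker(example_genotypes)
example_interval = centromere_interval(example_positions, example_het)
```

    With real array data the peak is usually a plateau spanning several markers, so reporting an interval rather than a single position is the more cautious choice.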

  14. Centromere Locations in Brassica A and C Genomes Revealed Through Half-Tetrad Analysis.

    PubMed

    Mason, Annaliese S; Rousseau-Gueutin, Mathieu; Morice, Jérôme; Bayer, Philipp E; Besharat, Naghmeh; Cousin, Anouska; Pradhan, Aneeta; Parkin, Isobel A P; Chèvre, Anne-Marie; Batley, Jacqueline; Nelson, Matthew N

    2016-02-01

    Locating centromeres on genome sequences can be challenging. The high density of repetitive elements in these regions makes sequence assembly problematic, especially when using short-read sequencing technologies. It can also be difficult to distinguish between active and recently extinct centromeres through sequence analysis. An effective solution is to identify genetically active centromeres (functional in meiosis) by half-tetrad analysis. This genetic approach involves detecting heterozygosity along chromosomes in segregating populations derived from gametes (half-tetrads). Unreduced gametes produced by first division restitution mechanisms comprise complete sets of nonsister chromatids. Along these chromatids, heterozygosity is maximal at the centromeres, and homologous recombination events result in homozygosity toward the telomeres. We genotyped populations of half-tetrad-derived individuals (from Brassica interspecific hybrids) using a high-density array of physically anchored SNP markers (Illumina Brassica 60K Infinium array). Mapping the distribution of heterozygosity in these half-tetrad individuals allowed the genetic mapping of all 19 centromeres of the Brassica A and C genomes to the reference Brassica napus genome. Gene and transposable element density across the B. napus genome were also assessed and corresponded well to previously reported genetic map positions. Known centromere-specific sequences were located in the reference genome, but mostly matched unanchored sequences, suggesting that the core centromeric regions may not yet be assembled into the pseudochromosomes of the reference genome. The increasing availability of genetic markers physically anchored to reference genomes greatly simplifies the genetic and physical mapping of centromeres using half-tetrad analysis. We discuss possible applications of this approach, including in species where half-tetrads are currently difficult to isolate. 
Copyright © 2016 by the Genetics Society of America.
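
    The mapping principle in this abstract (heterozygosity among half-tetrad individuals peaks at the centromere and decays toward the telomeres) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the two-letter genotype encoding ("AB" = heterozygous) and the toy positions are all assumptions made for illustration.

```python
def heterozygosity_by_marker(genotypes):
    """Return (position, fraction of heterozygous individuals) per marker.

    genotypes: dict mapping a marker position (bp) to a list of two-letter
    genotype calls such as "AA", "AB" or "BB" (hypothetical encoding).
    """
    result = []
    for position, calls in genotypes.items():
        if not calls:
            continue  # skip markers without any calls
        het = sum(1 for call in calls if call[0] != call[1])
        result.append((position, het / len(calls)))
    return sorted(result)


def centromere_interval(het_profile, tolerance=0.0):
    """Return (start, end) of the markers whose heterozygosity is within
    `tolerance` of the maximum, as a rough centromere-containing interval."""
    if not het_profile:
        raise ValueError("no markers supplied")
    peak = max(fraction for _, fraction in het_profile)
    positions = [pos for pos, fraction in sorted(het_profile)
                 if fraction >= peak - tolerance]
    return positions[0], positions[-1]


# Toy chromosome: heterozygosity rises toward the middle markers.
toy_genotypes = {
    100: ["AA", "AB", "BB", "AB"],
    200: ["AB", "AB", "AB", "AA"],
    300: ["AB", "AB", "AB", "AB"],
    400: ["AB", "AB", "AB", "AB"],
    500: ["AB", "AA", "BB", "AB"],
}
profile = heterozygosity_by_marker(toy_genotypes)
interval = centromere_interval(profile)
```

    With real array data the profile would be noisy, so a non-zero `tolerance` or a smoothing step would probably be needed; neither choice is taken from the abstract.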

  15. Performance comparison of two commercial human whole-exome capture systems on formalin-fixed paraffin-embedded lung adenocarcinoma samples.

    PubMed

    Bonfiglio, Silvia; Vanni, Irene; Rossella, Valeria; Truini, Anna; Lazarevic, Dejan; Dal Bello, Maria Giovanna; Alama, Angela; Mora, Marco; Rijavec, Erika; Genova, Carlo; Cittaro, Davide; Grossi, Francesco; Coco, Simona

    2016-08-30

    Next Generation Sequencing (NGS) has become a valuable tool for molecular landscape characterization of cancer genomes, leading to a better understanding of tumor onset and progression, and opening new avenues in translational oncology. Formalin-fixed paraffin-embedded (FFPE) tissue is the method of choice for storage of clinical samples; however, the low quality of FFPE genomic DNA (gDNA) can limit its use for downstream applications. To investigate the FFPE specimen suitability for NGS analysis and to establish the performance of two solution-based exome capture technologies, we compared the whole-exome sequencing (WES) data of gDNA extracted from 5 fresh frozen (FF) and 5 matched FFPE lung adenocarcinoma tissues using: SeqCap EZ Human Exome v.3.0 (Roche NimbleGen) and SureSelect XT Human All Exon v.5 (Agilent Technologies). Sequencing metrics on Illumina HiSeq were optimal for both exome systems and comparable among FFPE and FF samples, with a slight increase of PCR duplicates in FFPE, mainly in Roche NimbleGen libraries. Comparison of single nucleotide variants (SNVs) between FFPE-FF pairs reached overlapping values >90 % in both systems. Both WES showed high concordance with target re-sequencing data by Ion PGM™ in 22 lung-cancer genes, regardless of the source of samples. Exon coverage of 623 cancer-related genes revealed high coverage efficiency of both kits, proposing WES as a valid alternative to target re-sequencing. High-quality and reliable data can be successfully obtained from WES of FFPE samples starting from a relatively low amount of input gDNA, suggesting the inclusion of NGS-based tests into the clinical context. In conclusion, our analysis suggests that the WES approach could be extended to a translational research context as well as to the clinic (e.g. to study rare malignancies), where the simultaneous analysis of the whole coding region of the genome may help in the detection of cancer-linked variants.
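
    The abstract reports SNV overlap between matched FFPE and FF samples without stating how the overlap was computed. The sketch below shows one plausible way to quantify it from two call sets; the function name, the (chromosome, position, ref, alt) tuple layout and the three reported ratios are assumptions, not the authors' definition.

```python
def snv_overlap(ff_calls, ffpe_calls):
    """Compare two SNV call sets from a matched sample pair.

    Each call is a hashable key such as (chrom, pos, ref, alt).
    Returns the shared count plus three overlap ratios, because the
    abstract does not say which denominator was used.
    """
    ff, ffpe = set(ff_calls), set(ffpe_calls)
    shared = ff & ffpe
    union = ff | ffpe
    return {
        "shared": len(shared),
        "fraction_of_ff": len(shared) / len(ff) if ff else 0.0,
        "fraction_of_ffpe": len(shared) / len(ffpe) if ffpe else 0.0,
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Toy call sets: three shared variants, one private to each sample.
ff_toy = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
          ("chr2", 40, "G", "A"), ("chr3", 7, "T", "C")]
ffpe_toy = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
            ("chr2", 40, "G", "A"), ("chr5", 99, "C", "T")]
overlap = snv_overlap(ff_toy, ffpe_toy)
```

    C>T calls private to the FFPE sample, like the toy `chr5` entry, are a commonly described fixation artifact, which is one reason to inspect the non-shared calls rather than only the ratio.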

  16. HIV-1 diversity, transmission dynamics and primary drug resistance in Angola.

    PubMed

    Bártolo, Inês; Zakovic, Suzana; Martin, Francisco; Palladino, Claudia; Carvalho, Patrícia; Camacho, Ricardo; Thamm, Sven; Clemente, Sofia; Taveira, Nuno

    2014-01-01

    To assess HIV-1 diversity, transmission dynamics and prevalence of transmitted drug resistance (TDR) in Angola, five years after ART scale-up. Population sequencing of the pol gene was performed on 139 plasma samples collected in 2009 from drug-naive HIV-1 infected individuals living in Luanda. HIV-1 subtypes were determined using phylogenetic analysis. Drug resistance mutations were identified using the Calibrated Population Resistance Tool (CPR). Transmission networks were determined using phylogenetic analysis of all Angolan sequences present in the databases. Evolutionary trends were determined by comparison with a similar survey performed in 2001. 47.1% of the viruses were pure subtypes (all except B), 47.1% were recombinants and 5.8% were untypable. The prevalence of subtype A decreased significantly from 2001 to 2009 (40.0% to 10.8%, P = 0.0019) while the prevalence of unique recombinant forms (URFs) increased > 2-fold (40.0% to 83.1%, P < 0.0001). The most frequent URFs comprised untypable sequences with subtypes H (U/H, n = 7, 10.8%), A (U/A, n = 6, 9.2%) and G (G/U, n = 4, 6.2%). Newly identified U/H recombinants formed a highly supported monophyletic cluster suggesting a local and common origin. TDR mutation K103N was found in one (0.7%) patient (1.6% in 2001). Out of the 364 sequences sampled for transmission network analysis, 130 (35.7%) were part of a transmission network. Forty eight transmission clusters were identified; the majority (56.3%) comprised sequences sampled in 2008-2010 in Luanda which is consistent with a locally fuelled epidemic. Very low genetic distance was found in 27 transmission pairs sampled in the same year, suggesting recent transmission events. Transmission of drug resistant strains was still negligible in Luanda in 2009, five years after the scale-up of ART. 
The dominance of small and recent transmission clusters and the emergence of new URFs are consistent with a rising HIV-1 epidemic mainly driven by heterosexual transmission.

  17. Gene structure and evolution of transthyretin in the order Chiroptera.

    PubMed

    Khwanmunee, Jiraporn; Leelawatwattana, Ladda; Prapunpoj, Porntip

    2016-02-01

    Bats are mammals in the order Chiroptera. Although many extensive morphologic and molecular genetic analyses have been attempted, the phylogenetic relationships of bats have not been completely resolved. The paraphyly of microbats is particularly controversial and needs to be confirmed. In this study, we attempted to use the nucleotide sequence of transthyretin (TTR) intron 1 to resolve the relationship among bats. To explore its utility, the complete sequences of the TTR gene and intron 1 region of bats in Vespertilionidae: genus Eptesicus (Eptesicus fuscus) and genus Myotis (Myotis brandtii, Myotis davidii, and Myotis lucifugus), and Pteropodidae (Pteropus alecto and Pteropus vampyrus) were extracted from the retrieved sequences, whereas those of Rhinolophus affinis and Scotophilus kuhlii were amplified and sequenced. The derived overall amino acid sequences of bat TTRs were found to be very similar to those in other eutherians but differed from those in other classes of vertebrates. However, amino acids missing from the N-terminal or C-terminal region were observed. The phylogenetic analysis of amino acid sequences suggested that bat and other eutherian TTRs descend from a single most recent common ancestor, which differed from those of non-placental mammals and the other classes of vertebrates. The splicing of bat TTR precursor mRNAs was similar to that of other eutherians but different from that of marsupials, birds, reptiles and amphibians. Based on the TTR intron 1 sequence, the inferred evolutionary relationship within Chiroptera revealed that R. affinis is more closely related to megabats than to microbats. Accordingly, the paraphyly of microbats was suggested.

  18. Impact of sequencing depth in ChIP-seq experiments

    PubMed Central

    Jung, Youngsook L.; Luquette, Lovelace J.; Ho, Joshua W.K.; Ferrari, Francesco; Tolstorukov, Michael; Minoda, Aki; Issner, Robbyn; Epstein, Charles B.; Karpen, Gary H.; Kuroda, Mitzi I.; Park, Peter J.

    2014-01-01

    In a chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experiment, an important consideration in experimental design is the minimum number of sequenced reads required to obtain statistically significant results. We present an extensive evaluation of the impact of sequencing depth on identification of enriched regions for key histone modifications (H3K4me3, H3K36me3, H3K27me3 and H3K9me2/me3) using deep-sequenced datasets in human and fly. We propose to define sufficient sequencing depth as the number of reads at which detected enrichment regions increase <1% for an additional million reads. Although the required depth depends on the nature of the mark and the state of the cell in each experiment, we observe that sufficient depth is often reached at <20 million reads for fly. For human, there are no clear saturation points for the examined datasets, but our analysis suggests 40–50 million reads as a practical minimum for most marks. We also devise a mathematical model to estimate the sufficient depth and total genomic coverage of a mark. Lastly, we find that the five algorithms tested do not agree well for broad enrichment profiles, especially at lower depths. Our findings suggest that sufficient sequencing depth and an appropriate peak-calling algorithm are essential for ensuring robustness of conclusions derived from ChIP-seq data. PMID:24598259
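
    The saturation criterion proposed in this abstract (sufficient depth is the number of reads at which detected enriched regions increase by less than 1% per additional million reads) can be sketched as follows. The function name, the input layout and the toy numbers are assumptions; in practice the curve would come from peak calling on subsampled reads, and this is not the mathematical model the authors describe.

```python
def sufficient_depth(curve, threshold=0.01):
    """Return the first depth (millions of reads) at which the number of
    detected regions grows by less than `threshold` (relative) per
    additional million reads, or None if the curve never saturates.

    curve: iterable of (reads_in_millions, n_regions) pairs.
    """
    points = sorted(curve)
    for (depth0, regions0), (depth1, regions1) in zip(points, points[1:]):
        if regions0 <= 0 or depth1 <= depth0:
            continue  # cannot compute a relative gain for this step
        gain_per_million = (regions1 - regions0) / regions0 / (depth1 - depth0)
        if gain_per_million < threshold:
            return depth0
    return None


# Toy saturation curve: gains shrink as depth grows.
saturating = [(5, 10000), (10, 14000), (15, 15000), (20, 15300), (25, 15400)]
# Toy curve that keeps growing, as the abstract reports for some human data.
not_saturating = [(10, 1000), (20, 2000), (30, 4000)]

depth = sufficient_depth(saturating)
no_depth = sufficient_depth(not_saturating)
```

    Returning `None` for a curve that never flattens mirrors the abstract's observation that some human datasets show no clear saturation point.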

  19. Target Site Recognition by a Diversity-Generating Retroelement

    PubMed Central

    Guo, Huatao; Tse, Longping V.; Nieh, Angela W.; Czornyj, Elizabeth; Williams, Steven; Oukil, Sabrina; Liu, Vincent B.; Miller, Jeff F.

    2011-01-01

    Diversity-generating retroelements (DGRs) are in vivo sequence diversification machines that are widely distributed in bacterial, phage, and plasmid genomes. They function to introduce vast amounts of targeted diversity into protein-encoding DNA sequences via mutagenic homing. Adenine residues are converted to random nucleotides in a retrotransposition process from a donor template repeat (TR) to a recipient variable repeat (VR). Using the Bordetella bacteriophage BPP-1 element as a prototype, we have characterized requirements for DGR target site function. Although sequences upstream of VR are dispensable, a 24 bp sequence immediately downstream of VR, which contains short inverted repeats, is required for efficient retrohoming. The inverted repeats form a hairpin or cruciform structure and mutational analysis demonstrated that, while the structure of the stem is important, its sequence can vary. In contrast, the loop has a sequence-dependent function. Structure-specific nuclease digestion confirmed the existence of a DNA hairpin/cruciform, and marker coconversion assays demonstrated that it influences the efficiency, but not the site of cDNA integration. Comparisons with other phage DGRs suggested that similar structures are a conserved feature of target sequences. Using a kanamycin resistance determinant as a reporter, we found that transplantation of the IMH and hairpin/cruciform-forming region was sufficient to target the DGR diversification machinery to a heterologous gene. In addition to furthering our understanding of DGR retrohoming, our results suggest that DGRs may provide unique tools for directed protein evolution via in vivo DNA diversification. PMID:22194701

  20. Origins of Genes: "Big Bang" or Continuous Creation?

    NASA Astrophysics Data System (ADS)

    Kesse, Paul K.; Gibbs, Adrian

    1992-10-01

    Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes.

  1. The role of RT carry-over for congruence sequence effects in masked priming.

    PubMed

    Huber-Huber, Christoph; Ansorge, Ulrich

    2017-05-01

    The present study disentangles 2 sources of the congruence sequence effect with masked primes: congruence and response time of the previous trial (reaction time [RT] carry-over). Using arrows as primes and targets and a metacontrast masking procedure we found congruence as well as congruence sequence effects. In addition, congruence sequence effects decreased when RT carry-over was accounted for in a mixed model analysis, suggesting that RT carry-over contributes to congruence sequence effects in masked priming. Crucially, effects of previous trial congruence were not cancelled out completely indicating that RT carry-over and previous trial congruence are 2 sources feeding into the congruence sequence effect. A secondary task requiring response speed judgments demonstrated general awareness of response speed (Experiment 1), but removing this secondary task (Experiment 2) showed that RT carry-over effects were also present in single-task conditions. During (dual-task) prime-awareness test parts of both experiments, however, RT carry-over failed to modulate congruence effects, suggesting that some task sets of the participants can prevent the effect. The basic RT carry-over effects are consistent with the conflict adaptation account, with the adaptation to the statistics of the environment (ASE) model, and possibly with the temporal learning explanation. Additionally considering the task-dependence of RT carry-over, the results are most compatible with the conflict adaptation account. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  2. Second-order shaped pulses for solid-state quantum computation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sengupta, Pinaki

    2008-01-01

    We present the construction and detailed analysis of highly optimized self-refocusing pulse shapes for several rotation angles. We characterize the constructed pulses by the coefficients appearing in the Magnus expansion up to second order. This allows a semianalytical analysis of the performance of the constructed shapes in sequences and composite pulses by computing the corresponding leading-order error operators. Higher orders can be analyzed with the numerical technique suggested by us previously. We illustrate the technique by analyzing several composite pulses designed to protect against pulse amplitude errors, and on decoupling sequences for potentially long chains of qubits with on-site and nearest-neighbor couplings.

  3. Selective Exposure to Televised Violence.

    ERIC Educational Resources Information Center

    Atkin, Charles; And Others

    1979-01-01

    Presents the results of a study conducted to determine the correlation between children's selection of television programs and aggression. The regression analysis suggests that the relationship between viewing and aggression may be attributable to selective exposure rather than the reverse viewing-causes-aggression sequence. (Author/JVP)

  4. What is a melody? On the relationship between pitch and brightness of timbre

    PubMed Central

    Cousineau, Marion; Carcagno, Samuele; Demany, Laurent; Pressnitzer, Daniel

    2014-01-01

    Previous studies showed that the perceptual processing of sound sequences is more efficient when the sounds vary in pitch than when they vary in loudness. We show here that sequences of sounds varying in brightness of timbre are processed with the same efficiency as pitch sequences. The sounds used consisted of two simultaneous pure tones one octave apart, and the listeners’ task was to make same/different judgments on pairs of sequences varying in length (one, two, or four sounds). In one condition, brightness of timbre was varied within the sequences by changing the relative level of the two pure tones. In other conditions, pitch was varied by changing fundamental frequency, or loudness was varied by changing the overall level. In all conditions, only two possible sounds could be used in a given sequence, and these two sounds were equally discriminable. When sequence length increased from one to four, discrimination performance decreased substantially for loudness sequences, but to a smaller extent for brightness sequences and pitch sequences. In the latter two conditions, sequence length had a similar effect on performance. These results suggest that the processes dedicated to pitch and brightness analysis, when probed with a sequence-discrimination task, share unexpected similarities. PMID:24478638

  5. Quantifying transfer after perceptual-motor sequence learning: how inflexible is implicit learning?

    PubMed Central

    Sanchez, Daniel J.; Yarnik, Eric N.

    2015-01-01

    Studies of implicit perceptual-motor sequence learning have often shown learning to be inflexibly tied to the training conditions during learning. Since sequence learning is seen as a model task of skill acquisition, limits on the ability to transfer knowledge from the training context to a performance context indicate important constraints on skill learning approaches. Lack of transfer across contexts has been demonstrated by showing that when task elements are changed following training, this leads to a disruption in performance. These results have typically been taken as suggesting that the sequence knowledge relies on integrated representations across task elements (Abrahamse, Jiménez, Verwey, & Clegg, Psychon Bull Rev 17:603–623, 2010a). Using a relatively new sequence learning task, serial interception sequence learning, three experiments are reported that quantify this magnitude of performance disruption after selectively manipulating individual aspects of motor performance or perceptual information. In Experiment 1, selective disruption of the timing or order of sequential actions was examined using a novel response manipulandum that allowed for separate analysis of these two motor response components. In Experiments 2 and 3, transfer was examined after selective disruption of perceptual information that left the motor response sequence intact. All three experiments provided quantifiable estimates of partial transfer to novel contexts that suggest some level of information integration across task elements. However, the ability to identify quantifiable levels of successful transfer indicates that integration is not all-or-none and that measurement sensitivity is a key in understanding sequence knowledge representations. PMID:24668505

  6. 'Candidatus Phytoplasma phoenicium' associated with almond witches'-broom disease: from draft genome to genetic diversity among strain populations.

    PubMed

    Quaglino, Fabio; Kube, Michael; Jawhari, Maan; Abou-Jawdah, Yusuf; Siewert, Christin; Choueiri, Elia; Sobh, Hana; Casati, Paola; Tedeschi, Rosemarie; Lova, Marina Molino; Alma, Alberto; Bianco, Piero Attilio

    2015-07-30

    Almond witches'-broom (AlmWB), a devastating disease of almond, peach and nectarine in Lebanon, is associated with 'Candidatus Phytoplasma phoenicium'. In the present study, we generated a draft genome sequence of 'Ca. P. phoenicium' strain SA213, representative of phytoplasma strain populations from different host plants, and determined the genetic diversity among phytoplasma strain populations by phylogenetic analyses of 16S rRNA, groEL, tufB and inmp gene sequences. Sequence-based typing and phylogenetic analysis of the gene inmp, coding an integral membrane protein, distinguished AlmWB-associated phytoplasma strains originating from diverse host plants, whereas their 16S rRNA, tufB and groEL genes shared 100 % sequence identity. Moreover, dN/dS analysis indicated positive selection acting on inmp gene. Additionally, the analysis of 'Ca. P. phoenicium' draft genome revealed the presence of integral membrane proteins and effector-like proteins and potential candidates for interaction with hosts. One of the integral membrane proteins was predicted as BI-1, an inhibitor of apoptosis-promoting Bax factor. Bioinformatics analyses revealed the presence of putative BI-1 in draft and complete genomes of other 'Ca. Phytoplasma' species. The genetic diversity within 'Ca. P. phoenicium' strain populations in Lebanon suggested that AlmWB disease could be associated with phytoplasma strains derived from the adaptation of an original strain to diverse hosts. Moreover, the identification of a putative inhibitor of apoptosis-promoting Bax factor (BI-1) in 'Ca. P. phoenicium' draft genome and within genomes of other 'Ca. Phytoplasma' species suggested its potential role as a phytoplasma fitness-increasing factor by modification of the host-defense response.

  7. Phylogeny of the family Moraxellaceae by 16S rDNA sequence analysis, with special emphasis on differentiation of Moraxella species.

    PubMed

    Pettersson, B; Kodjo, A; Ronaghi, M; Uhlén, M; Tønjum, T

    1998-01-01

    Thirty-three strains previously classified into 11 species in the bacterial family Moraxellaceae were subjected to phylogenetic analysis based on 16S rRNA sequences. The family Moraxellaceae formed a distinct clade consisting of four phylogenetic groups as judged from branch lengths, bootstrap values and signature nucleotides. Group I contained the classical moraxellae and strains of the coccal moraxellae, previously known as Branhamella, with 16S rRNA similarity of ≥ 95%. A further division of group I into five tentative clusters is discussed. Group II consisted of two strains representing Moraxella atlantae and Moraxella osloensis. These strains were only distantly related to each other (93.4%) and also to the other members of the Moraxellaceae (≤ 93%). Therefore, reasons for reclassification of these species into separate and new genera are discussed. Group III harboured strains of the genus Psychrobacter and strain 752/52 of [Moraxella] phenylpyruvica. This strain of [M.] phenylpyruvica formed an early branch from the group III line of descent. Interestingly, a distant relationship was found between Psychrobacter phenylpyruvicus strain ATCC 23333T (formerly classified as [M.] phenylpyruvica) and [M.] phenylpyruvica strain 752/52, exhibiting less than 96% nucleotide similarity between their 16S rRNA sequences. The establishment of a new genus for [M.] phenylpyruvica strain 752/52 is therefore suggested. Group IV contained only two strains of the genus Acinetobacter. Strategies for the development of diagnostic probes and distinctive sequences for 16S rRNA-based species-specific assays within group I are suggested. Although these findings add to the classificatory placements within the Moraxellaceae, analysis of a more comprehensive selection of strains is still needed to obtain a complete classification system within this family.

  8. Whole Genome Sequence and Phylogenetic Analysis Show Helicobacter pylori Strains from Latin America Have Followed a Unique Evolution Pathway

    PubMed Central

    Muñoz-Ramírez, Zilia Y.; Mendez-Tenorio, Alfonso; Kato, Ikuko; Bravo, Maria M.; Rizzato, Cosmeri; Thorell, Kaisa; Torres, Roberto; Aviles-Jimenez, Francisco; Camorlinga, Margarita; Canzian, Federico; Torres, Javier

    2017-01-01

    Helicobacter pylori (HP) genetics may determine its clinical outcomes. Despite high prevalence of HP infection in Latin America (LA), there have been no phylogenetic studies in the region. We aimed to understand the structure of HP populations in LA mestizo individuals, where gastric cancer incidence remains high. The genomes of 107 HP strains from Mexico, Nicaragua and Colombia were analyzed with 59 publicly available worldwide genomes. To study bacterial relationship on whole genome level we propose a virtual hybridization technique using thousands of high-entropy 13 bp DNA probes to generate fingerprints. Phylogenetic virtual genome fingerprint (VGF) was compared with Multi Locus Sequence Analysis (MLST) and with phylogenetic analyses of cagPAI virulence island sequences. With MLST some Nicaraguan and Mexican strains clustered close to African isolates, whereas European isolates were spread without clustering and intermingled with LA isolates. VGF analysis resulted in increased resolution of populations, separating European from LA strains. Furthermore, clusters with exclusively Colombian, Mexican, or Nicaraguan strains were observed, where the Colombian cluster separated from Europe, Asia, and Africa, while Nicaraguan and Mexican clades grouped close to Africa. In addition, a mixed large LA cluster including Mexican, Colombian, Nicaraguan, Peruvian, and Salvadorian strains was observed; all LA clusters separated from the Amerind clade. With cagPAI sequence analyses LA clades clearly separated from Europe, Asia and Amerind, and Colombian strains formed a single cluster. A NeighborNet analysis suggested frequent and recent recombination events particularly among LA strains. Results suggest that in the new world, H. pylori has evolved to fit mestizo LA populations, already 500 years after the Spanish colonization. This co-adaptation may account for regional variability in gastric cancer risk. PMID:28293542
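
    The virtual hybridization idea in this abstract (score a genome against a fixed probe set and compare the resulting fingerprints) can be illustrated with a minimal sketch. Several things here are assumptions rather than the authors' method: probes must match exactly on either strand, the fingerprint is the set of matching probes, the distance is one minus the Jaccard index, and the toy probes are 5 bp instead of 13 bp.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def fingerprint(genome, probes):
    """Set of probes found in the genome on either strand (exact match)."""
    genome = genome.upper()
    return frozenset(
        probe for probe in probes
        if probe in genome or reverse_complement(probe) in genome
    )


def fingerprint_distance(fp_a, fp_b):
    """1 - Jaccard index of two fingerprints (0.0 means identical)."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return 1.0 - len(fp_a & fp_b) / len(union)


# Toy probes and genomes, invented for illustration.
toy_probes = ["ACGTA", "GGCAT", "TTTCC"]
genome_a = "AAACGTAGGCATTT"     # carries ACGTA and GGCAT on the forward strand
genome_b = "CCTACGTCCGGAAAGG"   # carries ACGTA and TTTCC as reverse complements

fp_a = fingerprint(genome_a, toy_probes)
fp_b = fingerprint(genome_b, toy_probes)
distance = fingerprint_distance(fp_a, fp_b)
```

    A matrix of such pairwise distances could then be passed to any distance-based tree-building method; which method was used for the VGF phylogeny is not stated in the abstract.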

  9. Genomic sequences of murine gamma B- and gamma C-crystallin-encoding genes: promoter analysis and complete evolutionary pattern of mouse, rat and human gamma-crystallins.

    PubMed

    Graw, J; Liebstein, A; Pietrowski, D; Schmitt-John, T; Werner, T

    1993-12-22

    The murine genes, gamma B-cry and gamma C-cry, encoding the gamma B- and gamma C-crystallins, were isolated from a genomic DNA library. The complete nucleotide (nt) sequences of both genes were determined from 661 and 711 bp, respectively, upstream from the first exon to the corresponding polyadenylation sites, comprising more than 2650 and 2890 bp, respectively. The new sequences were compared to the partial cDNA sequences available for the murine gamma B-cry and gamma C-cry, as well as to the corresponding genomic sequences from rat and man, at both the nt and predicted amino acid (aa) sequence levels. In the gamma B-cry promoter region, a canonical CCAAT-box, a TATA-box, putative NF-I and C/EBP sites were detected. An R-repeat is inserted 366 bp upstream from the transcription start point. In contrast, the gamma C-cry promoter does not contain a CCAAT-box, but some other putative binding sites for transcription factors (AP-2, UBP-1, LBP-1) were located by computer analysis. The promoter regions of all six gamma-cry from mouse, rat and human, except human psi gamma F-cry, were analyzed for common sequence elements. A complex sequence element of about 70-80 bp was found in the proximal promoter, which contains a gamma-cry-specific and almost invariant sequence (crygpel) of 14 nt, and ends with the also invariant TATA-box. Within the complex sequence element, a minimum of three further features specific for the gamma A-, gamma B- and gamma D/E/F-cry genes can be defined, at least two of which were recently shown to be functional. In addition to these four sequence elements, a subtype-specific structure of inverted repeats with different-sized spacers can be deduced from the multiple sequence alignment. 
A phylogenetic analysis based on the promoter region, as well as the complete exon 3 of all gamma-cry from mouse, rat and man, suggests separation of only five gamma-cry subtypes (gamma A-, gamma B-, gamma C-, gamma D- and gamma E/F-cry) prior to species separation.

  10. Sequence Variation in the Small-Subunit rRNA Gene of Plasmodium malariae and Prevalence of Isolates with the Variant Sequence in Sichuan, China

    PubMed Central

    Liu, Qing; Zhu, Shenghua; Mizuno, Sahoko; Kimura, Masatsugu; Liu, Peina; Isomura, Shin; Wang, Xingzhen; Kawamoto, Fumihiko

    1998-01-01

    By two PCR-based diagnostic methods, Plasmodium malariae infections have been rediscovered at two foci in the Sichuan province of China, a region where no cases of P. malariae have been officially reported for the last 2 decades. In addition, a variant form of P. malariae which has a deletion of 19 bp and seven substitutions of base pairs in the target sequence of the small-subunit (SSU) rRNA gene was detected with high frequency. Alignment analysis of Plasmodium sp. SSU rRNA gene sequences revealed that the 5′ region of the variant sequence is identical to that of P. vivax or P. knowlesi and its 3′ region is identical to that of P. malariae. The same sequence variations were also found in P. malariae isolates collected along the Thai-Myanmar border, suggesting a wide distribution of this variant form from southern China to Southeast Asia. PMID:9774600

  11. Power law tails in phylogenetic systems.

    PubMed

    Qin, Chongli; Colwell, Lucy J

    2018-01-23

    Covariance analysis of protein sequence alignments uses coevolving pairs of sequence positions to predict features of protein structure and function. However, current methods ignore the phylogenetic relationships between sequences, potentially corrupting the identification of covarying positions. Here, we use random matrix theory to demonstrate the existence of a power law tail that distinguishes the spectrum of covariance caused by phylogeny from that caused by structural interactions. The power law is essentially independent of the phylogenetic tree topology, depending on just two parameters: the sequence length and the average branch length. We demonstrate that these power law tails are ubiquitous in the large protein sequence alignments used to predict contacts in 3D structure, as predicted by our theory. This suggests that to decouple phylogenetic effects from the interactions between sequence distal sites that control biological function, it is necessary to remove or down-weight the eigenvectors of the covariance matrix with largest eigenvalues. We confirm that truncating these eigenvectors improves contact prediction.
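
    The correction suggested in this abstract (removing the eigenvectors of the covariance matrix with the largest eigenvalues) can be sketched in pure Python with power iteration and deflation. This is an illustrative toy, not the authors' implementation: the function names are hypothetical, the matrix is assumed symmetric and positive semi-definite, and a production analysis would use a linear algebra library instead.

```python
import math


def top_eigenpair(matrix, iterations=500):
    """Approximate the largest eigenvalue and its unit eigenvector of a
    symmetric positive semi-definite matrix by power iteration."""
    n = len(matrix)
    # Slightly non-uniform start so it is unlikely to be orthogonal
    # to the leading eigenvector.
    vec = [1.0 + 0.1 * i for i in range(n)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # matrix maps the vector to zero; nothing left to find
        vec = [x / norm for x in nxt]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    # Rayleigh quotient gives the eigenvalue estimate.
    value = sum(vec[i] * matrix[i][j] * vec[j]
                for i in range(n) for j in range(n))
    return value, vec


def remove_top_modes(matrix, k=1):
    """Return a copy of `matrix` with its k largest eigen-components
    subtracted (deflation): C - sum(lambda * v v^T)."""
    cleaned = [row[:] for row in matrix]
    n = len(cleaned)
    for _ in range(k):
        value, vec = top_eigenpair(cleaned)
        for i in range(n):
            for j in range(n):
                cleaned[i][j] -= value * vec[i] * vec[j]
    return cleaned


# Toy covariance matrix with eigenvalues 3 and 1.
toy_cov = [[2.0, 1.0], [1.0, 2.0]]
top_value, _ = top_eigenpair(toy_cov)
cleaned = remove_top_modes(toy_cov, k=1)
```

    How many modes to remove, or whether to down-weight them instead, is the substantive choice; the abstract ties it to the power law tail of the spectrum, which this toy does not attempt to estimate.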

  12. Epidemiological survey of idiopathic scoliosis and sequence alignment analysis of multiple candidate genes.

    PubMed

    Yang, Tao; Jia, Quanzhang; Guo, Hong; Xu, Jianzhong; Bai, Yun; Yang, Kai; Luo, Fei; Zhang, Zehua; Hou, Tianyong

    2012-06-01

    To investigate the effects of genetic factors on idiopathic scoliosis (IS) and genetic modes through genetic epidemiological survey on IS in Chongqing City, China, and to determine whether SH3GL1, GADD45B, and FGF22 in the chromosome 19p13.3 are the pathogenic genes of IS through genetic sequence analysis. 214 nuclear families were investigated to analyse the age incidence, familial aggregation, and heritability. SH3GL1, GADD45B, and FGF22 were chosen as candidate genes for mutation screening in 56 IS patients of 214 families. The sequence alignment analysis was performed to determine mutations and predict the protein structure. The average age of onset of 10.8 years suggests that IS is an early-onset disease. Incidences of IS in first-, second-, third-degree relatives and the overall incidence in families (5.68%) were also significantly higher than that of the general population (1.04%). The U test indicated a significant difference, suggesting that IS has a familial aggregation. The heritability of first-degree relatives (77.68 ±10.39%), second-degree relatives (69.89 ±3.14%), and third-degree relatives (62.14 ±11.92%) illustrated that genetic factors play an important role in IS pathogenesis. The incidence of first-degree relatives (10.01%), second-degree relatives (2.55%) and third-degree relatives (1.76%) illustrated that IS is not in simple accord with monogenic Mendel's law but manifests as traits of multifactorial hereditary diseases. Sequence alignment of exons of SH3GL1, GADD45B, and FGF22 showed 17 base mutations, of which 16 mutations do not induce open reading frame (ORF) shift or amino acid changes whereas one mutation (C→T) in SH3GL1 results in formation of a termination codon, which alters the protein reading frame. Prediction analysis of protein sequence showed that the SH3GL1 mutant encoded a truncated protein, thus affecting the protein structure. 
IS is a multifactorial genetic disease and SH3GL1 may be one of the pathogenic genes for IS.
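
    As an illustration of how a single C→T substitution can create a termination codon, the sketch below lists the sense codons that turn into a stop codon after one C→T change on the coding strand, under the standard genetic code. The abstract does not report which codon is affected, so this does not identify the actual SH3GL1 variant; the function name is hypothetical.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def c_to_t_nonsense_codons():
    """Map each sense codon to the stop codons it becomes after one C->T change.

    Only the coding strand is considered; a C->T change on the opposite
    strand would appear here as G->A and is not enumerated.
    """
    hits = {}
    for stop in STOP_CODONS:
        for i, base in enumerate(stop):
            if base != "T":
                continue
            # Revert one T to C to recover a possible pre-mutation codon.
            codon = stop[:i] + "C" + stop[i + 1:]
            if codon not in STOP_CODONS:
                hits.setdefault(codon, []).append(stop)
    return {codon: sorted(stops) for codon, stops in sorted(hits.items())}


if __name__ == "__main__":
    for codon, stops in c_to_t_nonsense_codons().items():
        print(codon, "->", ", ".join(stops))
```

    Under these assumptions only three codons qualify: CAA and CAG (glutamine) and CGA (arginine).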

  13. Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures

    PubMed Central

    Pride, David T; Schoenfeld, Thomas

    2008-01-01

    Background: Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. Results: From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses.
    As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs are predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. Conclusion: That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis. PMID:18798991
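
    The general idea of comparing a fragment to reference oligonucleotide signatures can be sketched as follows. This is not the authors' GSPC implementation: the k-mer length, the L1 distance, the nearest-reference rule, and all names and toy sequences are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def signature(seq, k=4):
    """Return the normalized frequency vector of all A/C/G/T k-mers in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)  # k-mers with other symbols are ignored
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def distance(sig_a, sig_b):
    """L1 (Manhattan) distance between two signatures of equal length."""
    return sum(abs(x - y) for x, y in zip(sig_a, sig_b))


def classify(fragment, references, k=4):
    """Return the name of the reference whose signature is closest to the fragment."""
    frag_sig = signature(fragment, k)
    return min(references,
               key=lambda name: distance(frag_sig, signature(references[name], k)))


if __name__ == "__main__":
    # Toy reference sequences, invented for this example.
    refs = {"at_rich": "ATATATTTAAATATTA" * 5, "gc_rich": "GCGGCCGCGGGCCCGC" * 5}
    print(classify("TTATATAAATTTATAT", refs, k=2))
```

    Short fragments give sparse k-mer counts, so a small k is used in the toy call; real contigs would support longer oligonucleotides.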

  14. Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures.

    PubMed

    Pride, David T; Schoenfeld, Thomas

    2008-09-17

    Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. 
    As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs are predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis.

  15. Characterization of X Chromosome Inactivation Using Integrated Analysis of Whole-Exome and mRNA Sequencing

    PubMed Central

    Szelinger, Szabolcs; Malenica, Ivana; Corneveaux, Jason J.; Siniard, Ashley L.; Kurdoglu, Ahmet A.; Ramsey, Keri M.; Schrauwen, Isabelle; Trent, Jeffrey M.; Narayanan, Vinodh; Huentelman, Matthew J.; Craig, David W.

    2014-01-01

    In females, X chromosome inactivation (XCI) is an epigenetic, gene dosage compensatory mechanism by inactivation of one copy of X in cells. Random XCI of one of the parental chromosomes results in an approximately equal proportion of cells expressing alleles from either the maternally or paternally inherited active X, and is defined by the XCI ratio. Skewed XCI ratio is suggestive of non-random inactivation, which can play an important role in X-linked genetic conditions. Current methods rely on indirect, semi-quantitative DNA methylation-based assays to estimate XCI ratio. Here we report a direct approach to estimate XCI ratio by integrated, family-trio based whole-exome and mRNA sequencing using phase-by-transmission of alleles coupled with allele-specific expression analysis. We applied this method to in silico data and to a clinical patient with mild cognitive impairment but no clear diagnosis or understanding of the molecular mechanism underlying the phenotype. Simulation showed that phased and unphased heterozygous allele expression can be used to estimate XCI ratio. Segregation analysis of the patient's exome uncovered a de novo, interstitial, 1.7 Mb deletion on Xp22.31 that originated on the paternally inherited X and has previously been associated with a heterogeneous, neurological phenotype. Phased, allelic expression data suggested an 83∶20 moderately skewed XCI that favored the expression of the maternally inherited, cytogenetically normal X and suggested that the deleterious effect of the de novo event on the paternal copy may be offset by skewed XCI that favors expression of the wild-type X. This study shows the utility of an integrated sequencing approach in XCI ratio estimation. PMID:25503791
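
    A minimal sketch of how phased allele-specific read counts could be pooled into an XCI ratio is given below. The pooling estimator, the function name, and the read counts are assumptions for illustration; the abstract does not specify the authors' estimator.

```python
def xci_ratio(sites):
    """Estimate the XCI ratio from phased heterozygous X-linked sites.

    sites is a list of (maternal_reads, paternal_reads) tuples, one per site.
    Returns (maternal_fraction, paternal_fraction) of the pooled reads.
    """
    maternal = sum(m for m, _ in sites)
    paternal = sum(p for _, p in sites)
    total = maternal + paternal
    if total == 0:
        raise ValueError("no allele-specific reads to estimate from")
    return maternal / total, paternal / total


if __name__ == "__main__":
    # Toy read counts, not data from the study.
    mat, pat = xci_ratio([(80, 20), (40, 10)])
    print(f"maternal:paternal = {mat:.0%}:{pat:.0%}")
```

    Pooling weights each site by its read depth; averaging per-site fractions instead would weight sites equally, which is a different design choice.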

  16. Characterization of the telomere complex, TERF1 and TERF2 genes in muntjac species with fusion karyotypes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hartmann, Nils; Scherthan, Harry

    The telomere binding proteins TRF1 and TRF2 maintain and protect chromosome ends and confer karyotypic stability. Chromosome evolution in the genus Muntiacus is characterized by numerous tandem (end-to-end) fusions. To study TRF1 and TRF2 telomere binding proteins in Muntiacus species, we isolated and characterized the TERF1 and -2 genes from Indian muntjac (Muntiacus muntjak vaginalis; 2n = 6 female) and from Chinese muntjac (Muntiacus reevesi; 2n = 46). Expression analysis revealed that both genes are ubiquitously expressed and sequence analysis identified several transcript variants of both TERF genes. Control experiments disclosed a novel testis-specific splice variant of TERF1 in human testes. Amino acid sequence comparisons demonstrate that Muntiacus TRF1 and in particular TRF2 are highly conserved between muntjac and human. In vivo TRF2-GFP and immuno-staining studies in muntjac cell lines revealed telomeric TRF2 localization, while deletion of the DNA binding domain abrogated this localization, suggesting muntjac TRF2 represents a functional telomere protein. Finally, expression analysis of a set of telomere-related genes revealed their presence in muntjac fibroblasts and testis tissue, which suggests the presence of a conserved telomere complex in muntjacs. However, a deviation from the common theme was noted for the TERT gene, encoding the catalytic subunit of telomerase; TERT expression could not be detected in Indian or Chinese muntjac cDNA or genomic DNA using a series of conserved primers, while TRAP assay revealed functional telomerase in Chinese muntjac testis tissues. This suggests muntjacs may harbor a diverged telomerase sequence.

  17. Analysis of the grape MYB R2R3 subfamily reveals expanded wine quality-related clades and conserved gene structure organization across Vitis and Arabidopsis genomes

    PubMed Central

    Matus, José Tomás; Aquea, Felipe; Arce-Johnson, Patricio

    2008-01-01

    Background: The MYB superfamily constitutes the most abundant group of transcription factors described in plants. Members control processes such as epidermal cell differentiation, stomatal aperture, flavonoid synthesis, cold and drought tolerance and pathogen resistance. No genome-wide characterization of this family has been conducted in a woody species such as grapevine. In addition, previous analysis of the recently released grape genome sequence suggested expansion events of several gene families involved in wine quality. Results: We describe and classify 108 members of the grape R2R3 MYB gene subfamily in terms of their genomic gene structures and similarity to their putative Arabidopsis thaliana orthologues. Seven gene models were derived and analyzed in terms of gene expression and their DNA binding domain structures. Despite low overall sequence homology in the C-terminus of all proteins, even in those with similar functions across Arabidopsis and Vitis, highly conserved motif sequences and exon lengths were found. The grape epidermal cell fate clade is expanded when compared with the Arabidopsis and rice MYB subfamilies. Two anthocyanin MYBA related clusters were identified in chromosomes 2 and 14, one of which includes the previously described grape colour locus. Tannin related loci were also detected with eight candidate homologues in chromosomes 4, 9 and 11. Conclusion: This genome wide transcription factor analysis in Vitis suggests that clade-specific grape R2R3 MYB genes are expanded while other MYB genes could be well conserved compared to Arabidopsis. MYB gene abundance, homology and orientation within particular loci also suggest that expanded MYB clades conferring quality attributes of grapes and wines, such as colour and astringency, could possess redundant, overlapping and cooperative functions. PMID:18647406

  18. Extensive Concerted Evolution of Rice Paralogs and the Road to Regaining Independence

    PubMed Central

    Wang, Xiyin; Tang, Haibao; Bowers, John E.; Feltus, Frank A.; Paterson, Andrew H.

    2007-01-01

    Many genes duplicated by whole-genome duplications (WGDs) are more similar to one another than expected. We investigated whether concerted evolution through conversion and crossing over, well-known to affect tandem gene clusters, also affects dispersed paralogs. Genome sequences for two Oryza subspecies reveal appreciable gene conversion in the ∼0.4 MY since their divergence, with a gradual progression toward independent evolution of older paralogs. Since divergence from subspecies indica, ∼8% of japonica paralogs produced 5–7 MYA on chromosomes 11 and 12 have been affected by gene conversion and several reciprocal exchanges of chromosomal segments, while ∼70-MY-old “paleologs” resulting from a genome duplication (GD) show much less conversion. Sequence similarity analysis in proximal gene clusters also suggests more conversion between younger paralogs. About 8% of paleologs may have been converted since rice–sorghum divergence ∼41 MYA. Domain-encoding sequences are more frequently converted than nondomain sequences, suggesting a sort of circularity—that sequences conserved by selection may be further conserved by relatively frequent conversion. The higher level of concerted evolution in the 5–7 MY-old segmental duplication may reflect the behavior of many genomes within the first few million years after duplication or polyploidization. PMID:18039882

  19. Amyloid fibril formation from sequences of a natural beta-structured fibrous protein, the adenovirus fiber.

    PubMed

    Papanikolopoulou, Katerina; Schoehn, Guy; Forge, Vincent; Forsyth, V Trevor; Riekel, Christian; Hernandez, Jean-François; Ruigrok, Rob W H; Mitraki, Anna

    2005-01-28

    Amyloid fibrils are fibrous beta-structures that derive from abnormal folding and assembly of peptides and proteins. Despite a wealth of structural studies on amyloids, the nature of the amyloid structure remains elusive; possible connections to natural, beta-structured fibrous motifs have been suggested. In this work we focus on understanding amyloid structure and formation from sequences of a natural, beta-structured fibrous protein. We show that short peptides (25 to 6 amino acids) corresponding to repetitive sequences from the adenovirus fiber shaft have an intrinsic capacity to form amyloid fibrils as judged by electron microscopy, Congo Red binding, infrared spectroscopy, and x-ray fiber diffraction. In the presence of the globular C-terminal domain of the protein that acts as a trimerization motif, the shaft sequences adopt a triple-stranded, beta-fibrous motif. We discuss the possible structure and arrangement of these sequences within the amyloid fibril, as compared with the one adopted within the native structure. A 6-amino acid peptide, corresponding to the last beta-strand of the shaft, was found to be sufficient to form amyloid fibrils. Structural analysis of these amyloid fibrils suggests that perpendicular stacking of beta-strand repeat units is an underlying common feature of amyloid formation.

  20. Conserved noncoding sequences conserve biological networks and influence genome evolution.

    PubMed

    Xie, Jianbo; Qian, Kecheng; Si, Jingna; Xiao, Liang; Ci, Dong; Zhang, Deqiang

    2018-05-01

    Comparative genomics approaches have identified numerous conserved cis-regulatory sequences near genes in plant genomes. Despite the identification of these conserved noncoding sequences (CNSs), our knowledge of their functional importance and selection remains limited. Here, we used a combination of DNA methylome analysis, microarray expression analyses, and functional annotation to study these sequences in the model tree Populus trichocarpa. Methylation in CG contexts and non-CG contexts was lower in CNSs, particularly CNSs in the 5'-upstream regions of genes, compared with other sites in the genome. We observed that CNSs are enriched in genes with transcription and binding functions, and this also associated with syntenic genes and those from whole-genome duplications, suggesting that cis-regulatory sequences play a key role in genome evolution. We detected a significant positive correlation between CNS number and protein interactions, suggesting that CNSs may have roles in the evolution and maintenance of biological networks. The divergence of CNSs indicates that duplication-degeneration-complementation drives the subfunctionalization of a proportion of duplicated genes from whole-genome duplication. Furthermore, population genomics confirmed that most CNSs are under strong purifying selection and only a small subset of CNSs shows evidence of adaptive evolution. These findings provide a foundation for future studies exploring these key genomic features in the maintenance of biological networks, local adaptation, and transcription.

  1. Extraordinary Structured Noncoding RNAs Revealed by Bacterial Metagenome Analysis

    PubMed Central

    Weinberg, Zasha; Perreault, Jonathan; Meyer, Michelle M.; Breaker, Ronald R.

    2012-01-01

    Estimates of the total number of bacterial species [1-3] suggest that existing DNA sequence databases carry only a tiny fraction of the total amount of DNA sequence space represented by this division of life. Indeed, environmental DNA samples have been shown to encode many previously unknown classes of proteins [4] and RNAs [5]. Bioinformatics searches [6-10] of genomic DNA from bacteria commonly identify novel noncoding RNAs (ncRNAs) [10-12] such as riboswitches [13,14]. In rare instances, RNAs that exhibit more extensive sequence and structural conservation across a wide range of bacteria are encountered [15,16]. Given that large structured RNAs are known to carry out complex biochemical functions such as protein synthesis and RNA processing reactions, identifying more RNAs of great size and intricate structure is likely to reveal additional biochemical functions that can be achieved by RNA. We applied an updated computational pipeline [17] to discover ncRNAs that rival the known large ribozymes in size and structural complexity or that are among the most abundant RNAs in bacteria that encode them. These RNAs would have been difficult or impossible to detect without examining environmental DNA sequences, suggesting that numerous RNAs with extraordinary size, structural complexity, or other exceptional characteristics remain to be discovered in unexplored sequence space. PMID:19956260

  2. Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds.

    PubMed

    Dean, Rebecca; Harrison, Peter W; Wright, Alison E; Zimmer, Fabian; Mank, Judith E

    2015-10-01

    The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  3. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome”

    PubMed Central

    Tettelin, Hervé; Masignani, Vega; Cieslewicz, Michael J.; Donati, Claudio; Medini, Duccio; Ward, Naomi L.; Angiuoli, Samuel V.; Crabtree, Jonathan; Jones, Amanda L.; Durkin, A. Scott; DeBoy, Robert T.; Davidsen, Tanja M.; Mora, Marirosa; Scarselli, Maria; Margarit y Ros, Immaculada; Peterson, Jeremy D.; Hauser, Christopher R.; Sundaram, Jaideep P.; Nelson, William C.; Madupu, Ramana; Brinkac, Lauren M.; Dodson, Robert J.; Rosovitz, Mary J.; Sullivan, Steven A.; Daugherty, Sean C.; Haft, Daniel H.; Selengut, Jeremy; Gwinn, Michelle L.; Zhou, Liwei; Zafar, Nikhat; Khouri, Hoda; Radune, Diana; Dimitrov, George; Watkins, Kisha; O'Connor, Kevin J. B.; Smith, Shannon; Utterback, Teresa R.; White, Owen; Rubens, Craig E.; Grandi, Guido; Madoff, Lawrence C.; Kasper, Dennis L.; Telford, John L.; Wessels, Michael R.; Rappuoli, Rino; Fraser, Claire M.

    2005-01-01

    The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for ≈80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes. PMID:16172379
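
    The pan-genome decomposition described above (core genome shared by all isolates, plus a dispensable genome of partially shared and strain-specific genes) can be sketched with set operations. The function name, strain labels, and gene identifiers are hypothetical, and real analyses first need an orthology step to decide which genes count as "the same".

```python
def pan_genome(strain_genes):
    """Split gene content into pan, core, dispensable and strain-specific sets.

    strain_genes maps a strain name to an iterable of gene identifiers.
    """
    gene_sets = {name: set(genes) for name, genes in strain_genes.items()}
    if not gene_sets:
        return {"pan": set(), "core": set(), "dispensable": set(), "specific": {}}
    pan = set().union(*gene_sets.values())        # genes seen in any strain
    core = set.intersection(*gene_sets.values())  # genes seen in every strain
    specific = {}
    for name, genes in gene_sets.items():
        others = set().union(*(g for n, g in gene_sets.items() if n != name))
        specific[name] = genes - others           # genes unique to this strain
    return {"pan": pan, "core": core, "dispensable": pan - core, "specific": specific}


if __name__ == "__main__":
    # Toy gene content, invented for this example.
    toy = {"s1": {"a", "b", "c"}, "s2": {"a", "b", "d"}, "s3": {"a", "c", "e"}}
    result = pan_genome(toy)
    print(sorted(result["core"]), sorted(result["dispensable"]))
```

    Re-running such a function as strains are added, and tracking how many new strain-specific genes each one contributes, is the kind of count that an extrapolation of pan-genome size would build on.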

  4. Molecular Characterization of Watermelon Chlorotic Stunt Virus (WmCSV) from Palestine

    PubMed Central

    Ali-Shtayeh, Mohammed S.; Jamous, Rana M.; Mallah, Omar B.; Abu-Zeitoun, Salam Y.

    2014-01-01

    The incidence of watermelon chlorotic stunt disease and molecular characterization of the Palestinian isolate of Watermelon chlorotic stunt virus (WmCSV-[PAL]) are described in this study. Symptomatic leaf samples obtained from watermelon (Citrullus lanatus (Thunb.)) and cucumber (Cucumis sativus L.) plants were tested for WmCSV-[PAL] infection by polymerase chain reaction (PCR) and Rolling Circle Amplification (RCA). Disease incidence ranged between 25% and 98% in watermelon fields in the studied area; 77% of leaf samples collected from Jenin were found to have mixed infections with WmCSV-[PAL] and SLCV. The full-length DNA-A and DNA-B genomes of WmCSV-[PAL] were amplified and sequenced, and the sequences were deposited in GenBank. Sequence analysis of virus genomes showed that DNA-A and DNA-B had 97.6%–99.42% and 93.16%–98.26% nucleotide identity with other virus isolates in the region, respectively. Sequence analysis also revealed that the Palestinian isolate of WmCSV shared the highest nucleotide identity with an isolate from Israel, suggesting that the virus was introduced to Palestine from Israel. PMID:24956181
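
    Nucleotide identity values such as those quoted above are computed from pairwise alignments. The sketch below shows one common convention, assuming the two sequences are already aligned to equal length and that columns containing a gap are excluded from the denominator; the abstract does not state which convention or software was used, and the function name is hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of matching positions between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences, invented for this example.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```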

  5. Comparison and phylogenetic analysis of the ISS gene in two predominant avian pathogenic E. coli serogroups isolated from avian colibacillosis in Iran.

    PubMed

    Zahraei Salehi, Taghi; Derakhshandeh, Abdollah; Tadjbakhsh, Hasan; Karimi, Vahid

    2013-02-01

    The ISS (increased serum survival) gene and its protein product (ISS) of avian pathogenic Escherichia coli (APEC) are important characteristics of resistance to the complement system. The aims of this study were to clone, sequence and characterize sequence diversity of the ISS gene between two predominant serogroups in Iran and among those previously deposited in Genbank. The ISS gene of 309 bp from the APEC χ1390 strain was amplified by PCR, cloned and sequenced using pTZ57R/T vector. The ISS gene from the χ1390 strain has 100% identity among different serogroups of APEC in different geographical regions throughout the world. Phylogenetic analysis shows two different phylogenetic groups among the different strains. Strong association of nucleotide sequences among different E. coli strains suggests that it may be a conserved gene and could be a suitable antigen to control and detect avian pathogenic E. coli, at least in our region. Currently, our group is working on the ISS protein as a candidate vaccine in SPF poultry. Copyright © 2012 Elsevier Ltd. All rights reserved.

  6. Community structure of free-floating filamentous cyanobacterial mats from the Wonder Lake geothermal springs in the Philippines.

    PubMed

    Lacap, Donnabella C; Smith, Gavin J D; Warren-Rhodes, Kimberley; Pointing, Stephen B

    2005-07-01

    Cyanobacterial mats were characterized from pools of 45-60 °C in near-neutral pH, low-sulphide geothermal springs in the Philippines. Mat structure did not vary with temperature. All mats possessed highly ordered layers of airspaces at both the macroscopic and microscopic level, and these appear to be an adaptation to a free-floating growth habit. Upper mat layers supported biomass with elevated carotenoid:chlorophyll a ratios and an as yet uncharacterized waxy layer on the dorsal surface. Microscopic examination revealed mats comprised a single Fischerella morphotype, with abundant heterocysts throughout mats at all temperatures. Molecular analysis of mat community structure only partly matched morphological identification. All samples supported greater 16S rDNA-defined diversity than morphology suggested, with a progressive loss in the number of genotypes with increasing temperature. Fischerella-like sequences were recovered from mats occurring at all temperatures, but some mats also yielded Oscillatoria-like sequences, although corresponding phenotypes were not observed. Phylogenetic analysis revealed that Fischerella-like sequences were most closely affiliated with Fischerella major and the Oscillatoria-like sequences with Oscillatoria amphigranulata.

  7. Organization of nif gene cluster in Frankia sp. EuIK1 strain, a symbiont of Elaeagnus umbellata.

    PubMed

    Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun

    2012-01-01

    The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.

  8. Complex Subtype Diversity of HIV-1 Among Drug Users in Major Kenyan Cities.

    PubMed

    Gounder, Kamini; Oyaro, Micah; Padayachi, Nagavelli; Zulu, Thando Mbali; de Oliveira, Tulio; Wylie, John; Ndung'u, Thumbi

    2017-05-01

    Drug users are increasingly recognized as a key population driving human immunodeficiency virus (HIV) spread in sub-Saharan Africa. To determine HIV-1 subtypes circulating in this population group and explore possible geographic differences, we analyzed HIV-1 sequences among drug users from Nairobi, Mombasa, and Kisumu in Kenya. We sequenced gag and env from 55 drug users. Subtype analysis from 220 gag clonal sequences from 54 of 55 participants (median = 4/participant) showed that 44.4% were A, 16.7% were C, 3.7% were D, and 35.2% were intersubtype recombinants. Of 156 env clonal sequences from 48 of 55 subjects (median = 3/participant), 45.8% were subtype A, 14.6% were C, 6.3% were D, and 33.3% were recombinants. Comparative analysis of both genes showed that 30 (63.8%) participants had concordant subtypes, while 17 (36.2%) were discordant. We identified one genetically linked transmission pair and two cases of dual infection. These data are indicative of extensive HIV-1 intersubtype recombination in Kenya and suggest decline in subtype D prevalence.

  9. Reference genotype and exome data from an Australian Aboriginal population for health-based research

    PubMed Central

    Tang, Dave; Anderson, Denise; Francis, Richard W.; Syn, Genevieve; Jamieson, Sarra E.; Lassmann, Timo; Blackwell, Jenefer M.

    2016-01-01

    Genetic analyses, including genome-wide association studies and whole exome sequencing (WES), provide powerful tools for the analysis of complex and rare genetic diseases. To date there are no reference data for Aboriginal Australians to underpin the translation of health-based genomic research. Here we provide a catalogue of variants called after sequencing the exomes of 72 Aboriginal individuals to a depth of 20X coverage in ∼80% of the sequenced nucleotides. We determined 320,976 single nucleotide variants (SNVs) and 47,313 insertions/deletions using the Genome Analysis Toolkit. We had previously genotyped a subset of the Aboriginal individuals (70/72) using the Illumina Omni2.5 BeadChip platform and found ~99% concordance at overlapping sites, which suggests high quality genotyping. Finally, we compared our SNVs to six publicly available variant databases, such as dbSNP and the Exome Sequencing Project, and 70,115 of our SNVs did not overlap any of the single nucleotide polymorphic sites in all the databases. Our data set provides a useful reference point for genomic studies on Aboriginal Australians. PMID:27070114

  10. Reference genotype and exome data from an Australian Aboriginal population for health-based research.

    PubMed

    Tang, Dave; Anderson, Denise; Francis, Richard W; Syn, Genevieve; Jamieson, Sarra E; Lassmann, Timo; Blackwell, Jenefer M

    2016-04-12

    Genetic analyses, including genome-wide association studies and whole exome sequencing (WES), provide powerful tools for the analysis of complex and rare genetic diseases. To date there are no reference data for Aboriginal Australians to underpin the translation of health-based genomic research. Here we provide a catalogue of variants called after sequencing the exomes of 72 Aboriginal individuals to a depth of 20X coverage in ∼80% of the sequenced nucleotides. We determined 320,976 single nucleotide variants (SNVs) and 47,313 insertions/deletions using the Genome Analysis Toolkit. We had previously genotyped a subset of the Aboriginal individuals (70/72) using the Illumina Omni2.5 BeadChip platform and found ~99% concordance at overlapping sites, which suggests high quality genotyping. Finally, we compared our SNVs to six publicly available variant databases, such as dbSNP and the Exome Sequencing Project, and 70,115 of our SNVs did not overlap any of the single nucleotide polymorphic sites in all the databases. Our data set provides a useful reference point for genomic studies on Aboriginal Australians.

  11. Genotypes and subgenotypes of hepatitis B virus circulating in an endemic area in Peru.

    PubMed

    Ramírez-Soto, Max Carlos; Bracho, Maria Alma; González-Candelas, Fernando; Huichi-Atamari, Milagros

    2018-01-01

    Although hepatitis B virus (HBV) infection is still endemic in Abancay, Peru, two decades after vaccination against hepatitis B started in the area, little is known about the diversity and circulation of genotypes and subgenotypes of the virus. To identify the genotypes and subgenotypes of HBV circulating in Abancay, complete genome sequences of 11 treatment-naive HBV-infected patients were obtained, and phylogenetic analysis was conducted with these and additional sequences from GenBank. Genotyping revealed the presence of genotype F in all the samples from Abancay. Subgenotype F1b was dominant and only one isolate belonged to subgenotype F4, which represents the first description of this subgenotype in Peru. Phylogenetic analysis revealed that most subgenotype F1b isolates from Peru clustered in a subgroup along with two sequences from Argentina, whereas two clusters with two HBV/F1b sequences each were indicative of recent epidemiological linkage, but only one could be verified by independent data. These results suggest that HBV subgenotype F1b is the predominant subgenotype in Abancay, Peru.

  12. Complex Subtype Diversity of HIV-1 Among Drug Users in Major Kenyan Cities

    PubMed Central

    Gounder, Kamini; Oyaro, Micah; Padayachi, Nagavelli; Zulu, Thando Mbali; de Oliveira, Tulio; Wylie, John

    2017-01-01

    Drug users are increasingly recognized as a key population driving human immunodeficiency virus (HIV) spread in sub-Saharan Africa. To determine HIV-1 subtypes circulating in this population group and explore possible geographic differences, we analyzed HIV-1 sequences among drug users from Nairobi, Mombasa, and Kisumu in Kenya. We sequenced gag and env from 55 drug users. Subtype analysis from 220 gag clonal sequences from 54 of 55 participants (median = 4/participant) showed that 44.4% were A, 16.7% were C, 3.7% were D, and 35.2% were intersubtype recombinants. Of 156 env clonal sequences from 48 of 55 subjects (median = 3/participant), 45.8% were subtype A, 14.6% were C, 6.3% were D, and 33.3% were recombinants. Comparative analysis of both genes showed that 30 (63.8%) participants had concordant subtypes, while 17 (36.2%) were discordant. We identified one genetically linked transmission pair and two cases of dual infection. These data are indicative of extensive HIV-1 intersubtype recombination in Kenya and suggest a decline in subtype D prevalence. PMID:28068781

  13. Community-led comparative genomic and phenotypic analysis of the aquaculture pathogen Pseudomonas baetica a390T sequenced by Ion semiconductor and Nanopore technologies

    PubMed Central

    Beaton, Ainsley; Lood, Cédric; Cunningham-Oakes, Edward; MacFadyen, Alison; Mullins, Alex J; Bestawy, Walid El; Botelho, João; Chevalier, Sylvie; Dalzell, Chloe; Dolan, Stephen K; Faccenda, Alberto; Ghequire, Maarten G K; Higgins, Steven; Kutschera, Alexander; Murray, Jordan; Redway, Martha; Salih, Talal; Smith, Brian A; Smits, Nathan; Thomson, Ryan; Woodcock, Stuart; Cornelis, Pierre; Lavigne, Rob; van Noort, Vera

    2018-01-01

    Pseudomonas baetica strain a390T is the type strain of this recently described species and here we present its high-contiguity draft genome. To celebrate the 16th International Conference on Pseudomonas, the genome of P. baetica strain a390T was sequenced using a unique combination of Ion Torrent semiconductor and Oxford Nanopore methods as part of a collaborative community-led project. The use of high-quality Ion Torrent sequences with long Nanopore reads rapidly gave a high-contiguity, high-quality, 16-contig genome sequence. Whole genome phylogenetic analysis places P. baetica within the P. koreensis clade of the P. fluorescens group. Comparison of the main genomic features of P. baetica with a variety of other Pseudomonas spp. suggests that it is a highly adaptable organism, typical of the genus. This strain was originally isolated from the liver of a diseased wedge sole fish, and genotypic and phenotypic analyses show that it is tolerant to osmotic stress and to oxytetracycline. PMID:29579234

  14. Molecular Cloning and Sequence Analysis of a Phenylalanine Ammonia-Lyase Gene from Dendrobium

    PubMed Central

    Cai, Yongping; Lin, Yi

    2013-01-01

    In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bp and contains a complete open reading frame (ORF) of 2,142 bp, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PAL proteins of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those found in PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also found in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum. PMID:23638048
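
    The ORF length and residue count reported above can be checked against each other. The sketch below assumes, as is conventional but not stated in the abstract, that the quoted ORF length includes the terminating stop codon; the function name is hypothetical.

```python
def orf_to_residues(orf_bp, includes_stop=True):
    """Convert an ORF length in base pairs to the number of encoded residues.

    Assumes the ORF is in frame (length divisible by 3). When the length
    includes the stop codon, that final codon encodes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 2,142 bp is the ORF length given for Dc-PAL1.
dc_pal1_residues = orf_to_residues(2142)
print(dc_pal1_residues)
```

    Under that assumption, 2,142 bp gives 714 codons and 713 residues, which agrees with the figure in the abstract.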

  15. Distribution and abundance of archaeal and bacterial ammonia oxidizers in the sediments of the Dongjiang River, a drinking water supply for Hong Kong.

    PubMed

    Sun, Wei; Xia, Chunyu; Xu, Meiying; Guo, Jun; Wang, Aijie; Sun, Guoping

    2013-01-01

    Ammonia-oxidizing archaea (AOA) and bacteria (AOB) play important roles in nitrification. However, limited information about the characteristics of AOA and AOB in the river ecosystem is available. The distribution and abundance of AOA and AOB in the sediments of the Dongjiang River, a drinking water source for Hong Kong, were investigated by clone library analysis and quantitative real-time PCR. Phylogenetic analysis showed that Group 1.1b- and Group 1.1b-associated sequences of AOA predominated in sediments with comparatively high carbon and nitrogen contents (e.g. total carbon (TC) >13 g kg(-1) sediment, NH4(+)-N >144 mg kg(-1) sediment), while Group 1.1a- and Group 1.1a-associated sequences were dominant in sediments with opposite conditions (e.g. TC <4 g kg(-1) sediment, NH4(+)-N <93 mg kg(-1) sediment). Although Nitrosomonas- and Nitrosospira-related sequences of AOB were detected in the sediments, nearly 70% of the sequences fell into the Nitrosomonas-like B cluster, suggesting similar sediment AOB communities along the river. Higher abundance of AOB than AOA was observed in almost all of the sediments in the Dongjiang River, while significant correlations were only detected between the distribution of AOA and the sediment pH and TC, which suggested that AOA responded more sensitively than AOB to variations of environmental factors. These results extend our knowledge about the environmental responses of ammonia oxidizers in the river ecosystem.

  16. New insights into the paleolake sequence of Baumkirchen (Austria): multiple lake phases and a minor ice advance during MIS 4?

    NASA Astrophysics Data System (ADS)

    Barrett, Samuel; Starnberger, Reinhard; Spötl, Christoph; Brauer, Achim; Tjallingii, Rik; Dulski, Peter; Abfalterer, Christof

    2015-04-01

    The sequence of pre-LGM lacustrine sediments at Baumkirchen (Austria) provides a key record in Alpine Quaternary stratigraphy. These sediments from within the boundary of the Alps potentially provide unique insights into the regional paleoclimate. Recent drilling revealed at least ~250 m (the base was not reached) of almost entirely mm- to cm-scale lacustrine sediments. The laminated sediments are comprised of alternations between clayey silt and event layers of medium silt to fine sand. The sequence is interrupted only by a short section of gravel supported in an unlaminated clay-rich matrix. Optically stimulated luminescence dating identifies two distinct sequences: the upper sequence spanning mid-late Marine Isotope Stage (MIS) 3 (~33 to ~45 ka BP), agreeing with existing calibrated radiocarbon ages, and the lower section dating to MIS 4 (~59 to ~73 ka BP). Whether the hiatus is an erosional unconformity, or if the sequences represent two separate lake phases is unclear. Although the precise location of the hiatus is hard to identify, the gravel-rich section lies at the very top of the lower sequence. Pebbles in these gravels are largely angular and contain a significant proportion of non-local, regional lithologies. Such gravels are absent in the remainder of the entire 250 m-thick sequence and hence suggest a unique event rather than, e.g., an interfingering of local delta gravel foresets with the basin sediments. The gravels are therefore likely to be ice-rafted debris from icebergs from nearby glaciers calving into the lake. This therefore represents the first sedimentological evidence of a MIS 4 ice advance in the Eastern Alps. X-ray fluorescence analysis (ITRAX core scanning) of event layers indicates a strong change in the geochemical composition from generally K, Zr and Ti-rich layers in the upper sequence to mainly Ca and/or Si-rich layers in the lower sequence. X-ray diffraction analysis shows the Ca and Si signals to be controlled by carbonate (both calcite and dolomite) and quartz, respectively. This suggests a change in dominant sediment source and may indicate a change in catchment or paleolake configuration, re-raising the long outstanding question of how the lake or lakes were dammed.

  17. Technologically important extremophile 16S rRNA sequence Shannon entropy and fractal property comparison with long term dormant microbes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Gadura, N.; Dehipawala, S.; Cheung, E.; Tuffour, M.; Schneider, P.; Tremberger, G., Jr.; Lieberman, D.; Cheung, T.

    2011-10-01

    Technologically important extremophiles including oil eating microbes, uranium and rocket fuel perchlorate reduction microbes, electron producing microbes and electrode electrons feeding microbes were compared in terms of their 16S rRNA sequences, a standard targeted sequence in comparative phylogeny studies. Microbes that were reported to have survived a prolonged dormant duration were also studied. Examples included the recently discovered microbe that survives after 34,000 years in a salty environment while feeding off organic compounds from other trapped dead microbes. Shannon entropy of the 16S rRNA nucleotide composition and fractal dimension of the nucleotide sequence in terms of its atomic number fluctuation analyses suggest a selected range for these extremophiles as compared to other microbes, consistent with the experience of relatively mild evolutionary pressure. However, most of the microbes that have been reported to survive a prolonged dormant duration carry sequences with fractal dimension between 1.995 and 2.005 (N = 10 out of 13). Similar results are observed for halophiles, red-shifted chlorophyll and radiation resistant microbes. The results suggest that prolonged dormant duration, analogous to a high-salt or radiation environment, would select high fractal 16S rRNA sequences. Path analysis in structural equation modeling supports a causal relation between entropy and fractal dimension for the studied 16S rRNA sequences (N = 7). Candidate choices for high fractal 16S rRNA microbes could offer protection for prolonged spaceflights. BioBrick gene network manipulation could include extremophile 16S rRNA sequences in synthetic biology and shed more light on exobiology and future colonization in shielded spaceflights. Whether the high fractal 16S rRNA sequences contain an asteroidlike extra-terrestrial source could be speculative but interesting.
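
    For readers unfamiliar with the first measure named above, the following is a minimal sketch of Shannon entropy over nucleotide composition, in bits per symbol. The abstract does not give the authors' exact formulation (for example, whether single nucleotides or longer words were counted), so this single-nucleotide version and the function name are assumptions. The fractal-dimension analysis is not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(sequence):
    """Shannon entropy (bits per symbol) of the symbol composition of a sequence.

    H = -sum(p_i * log2(p_i)) over the observed symbols, where p_i is the
    relative frequency of symbol i. Case is ignored.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy sequences, not real 16S rRNA data.
uniform = shannon_entropy("ACGT")    # four equally frequent symbols
two_symbol = shannon_entropy("AACC")  # two equally frequent symbols
single = shannon_entropy("AAAA")      # one symbol only
print(uniform, two_symbol, single)
```

    With four possible nucleotides the value ranges from 0 bits (one nucleotide only) to 2 bits (all four equally frequent), so a "selected range" of entropy corresponds to a constrained nucleotide composition.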

  18. Chromosome arm-specific BAC end sequences permit comparative analysis of homoeologous chromosomes and genomes of polyploid wheat

    PubMed Central

    2012-01-01

    Background: Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat. Results: The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% as coding regions (estimated 2,850 genes) and another 19% of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695 Insertion Site Based Polymorphisms (ISBPs). Of the 96 ISBP primer pairs tested, 28 (29%) were 3A-specific, compared to 17 (18%) of the 96 SSRs. Conclusion: This work reports on the use of wheat chromosome arm 3AS-specific BAC library for the targeted generation of sequence data from a particular region of the huge genome of wheat. A large quantity of sequences were generated from the A genome of hexaploid wheat for comparative genome analysis with homoeologous B and D genomes and other model grass genomes. Hundreds of molecular markers were developed from the 3AS arm-specific sequences; these and other sequences will be useful in gene discovery and physical mapping. PMID:22559868
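
    Several of the summary figures in this abstract can be recovered from the raw counts it reports. The sketch below assumes that the quoted densities are simply total sequenced bases divided by feature count; the abstract does not state how they were computed, and the variable names are hypothetical.

```python
# Counts as reported in the abstract.
total_bp = 11_014_359   # high quality BAC-end sequence
bac_ends = 17_591       # number of BAC-end reads
ssr_count = 1_057       # simple sequence repeats found
te_junctions = 7_928    # transposable element junctions found

# Mean read length in bp (reported: 626 bp).
mean_read_bp = total_bp / bac_ends

# Average spacing between features in kb (reported: 10.4 kb and 1.39 kb).
ssr_spacing_kb = total_bp / ssr_count / 1000
junction_spacing_kb = total_bp / te_junctions / 1000

# Share of chromosome-specific primer pairs among 96 tested of each type
# (reported: 29% for ISBPs, 18% for SSRs).
isbp_specific_pct = 100 * 28 / 96
ssr_specific_pct = 100 * 17 / 96

print(round(mean_read_bp), round(ssr_spacing_kb, 1), round(junction_spacing_kb, 2))
print(round(isbp_specific_pct), round(ssr_specific_pct))
```

    Each recomputed value rounds to the figure given in the abstract, which supports reading the densities as averages over the whole BAC-end sequence set.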

  19. Mosaic CREBBP mutation causes overlapping clinical features of Rubinstein–Taybi and Filippi syndromes

    PubMed Central

    de Vries, Tamar I; R Monroe, Glen; van Belzen, Martine J; van der Lans, Christian A; Savelberg, Sanne MC; Newman, William G; van Haaften, Gijs; Nievelstein, Rutger A; van Haelst, Mieke M

    2016-01-01

    Rubinstein–Taybi syndrome (RTS, OMIM 180849) and Filippi syndrome (FLPIS, OMIM 272440) are both rare syndromes, with multiple congenital anomalies and intellectual deficit (MCA/ID). We present a patient with intellectual deficit, short stature, bilateral syndactyly of hands and feet, broad thumbs, ocular abnormalities, and dysmorphic facial features. These clinical features suggest both RTS and FLPIS. Initial DNA analysis of DNA isolated from blood did not identify variants to confirm either of these syndrome diagnoses. Whole-exome sequencing identified a homozygous variant in C9orf173, which was novel at the time of analysis. Further Sanger sequencing analysis of FLPIS cases tested negative for CKAP2L variants did not, however, reveal any further variants. Subsequent analysis using DNA isolated from buccal mucosa revealed a mosaic variant in CREBBP. This report highlights the importance of excluding mosaic variants in patients with a strong but atypical clinical presentation of a MCA/ID syndrome if no disease-causing variants can be detected in DNA isolated from blood samples. As the striking syndactyly observed in the present case is typical for FLPIS, we suggest CREBBP analysis in saliva samples for FLPIS syndrome cases in which no causal CKAP2L variant is detected. PMID:26956253

  20. Fox parasites in Pre-columbian times: Evidence from the past to understand the current helminth assemblages.

    PubMed

    Fugassa, M H; Petrigh, R S; Fernández, P M; Carballido Catalayud, M; Belleli, C

    2018-06-11

    This work aims to increase the information on the entero-parasitism in Holocene carnivores, by examining coprolites found in Patagonia. Molecular analysis was conducted following the Authenticity Criteria to Determine Ancient DNA sequences. The nucleotide sequences showed 99% identity with the Control Region sequences of Lycalopex culpaeus (culpeo fox). Coprolites were positive for gastrointestinal parasites. The presence of Alaria sp. and Clonorchis sp. represents the first record for pre-Columbian America. The parasitological findings suggest the importance of these carnivores for the dissemination of their own parasites and those of their prey in rockshelters, areas with high re-use of space. Copyright © 2018. Published by Elsevier B.V.

  1. Phytoplasma phylogenetics based on analysis of secA and 23S rRNA gene sequences for improved resolution of candidate species of 'Candidatus Phytoplasma'.

    PubMed

    Hodgetts, Jennifer; Boonham, Neil; Mumford, Rick; Harrison, Nigel; Dickinson, Matthew

    2008-08-01

    Phytoplasma phylogenetics has focused primarily on sequences of the non-coding 16S rRNA gene and the 16S-23S rRNA intergenic spacer region (16-23S ISR), and primers that enable amplification of these regions from all phytoplasmas by PCR are well established. In this study, primers based on the secA gene have been developed into a semi-nested PCR assay that results in a sequence of the expected size (about 480 bp) from all 34 phytoplasmas examined, including strains representative of 12 16Sr groups. Phylogenetic analysis of secA gene sequences showed similar clustering of phytoplasmas when compared with clusters resolved by similar sequence analyses of a 16-23S ISR-23S rRNA gene contig or of the 16S rRNA gene alone. The main differences between trees were in the branch lengths, which were elongated in the 16-23S ISR-23S rRNA gene tree when compared with the 16S rRNA gene tree and elongated still further in the secA gene tree, despite this being a shorter sequence. The improved resolution in the secA gene-derived phylogenetic tree resulted in the 16SrII group splitting into two distinct clusters, while phytoplasmas associated with coconut lethal yellowing-type diseases split into three distinct groups, thereby supporting past proposals that they represent different candidate species within 'Candidatus Phytoplasma'. The ability to differentiate 16Sr groups and subgroups by virtual RFLP analysis of secA gene sequences suggests that this gene may provide an informative alternative molecular marker for pathogen identification and diagnosis of phytoplasma diseases.

  2. The transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) male reproductive organs.

    PubMed

    Azevedo, Renata V D M; Dias, Denise B S; Bretãs, Jorge A C; Mazzoni, Camila J; Souza, Nataly A; Albano, Rodolpho M; Wagner, Glauber; Davila, Alberto M R; Peixoto, Alexandre A

    2012-01-01

    It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. We generated 2678 high quality ESTs ("Expressed Sequence Tags") of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies.

  3. The Transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) Male Reproductive Organs

    PubMed Central

    Bretãs, Jorge A. C.; Mazzoni, Camila J.; Souza, Nataly A.; Albano, Rodolpho M.; Wagner, Glauber; Davila, Alberto M. R.; Peixoto, Alexandre A.

    2012-01-01

    Background: It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. Methods/Principal Findings: We generated 2678 high quality ESTs (“Expressed Sequence Tags”) of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). Conclusions: The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies. PMID:22496818

  4. Molecular variation and horizontal gene transfer of the homocysteine methyltransferase gene mmuM and its distribution in clinical pathogens.

    PubMed

    Ying, Jianchao; Wang, Huifeng; Bao, Bokan; Zhang, Ying; Zhang, Jinfang; Zhang, Cheng; Li, Aifang; Lu, Junwan; Li, Peizhen; Ying, Jun; Liu, Qi; Xu, Teng; Yi, Huiguang; Li, Jinsong; Zhou, Li; Zhou, Tieli; Xu, Zuyuan; Ni, Liyan; Bao, Qiyu

    2015-01-01

    The homocysteine methyltransferase encoded by mmuM is widely distributed among microbial organisms. It is the key enzyme that catalyzes the last step in methionine biosynthesis and plays an important role in the metabolism process. It also enables the microbial organisms to tolerate high concentrations of selenium in the environment. In this research, 533 mmuM gene sequences covering 70 genera of bacteria were selected from the GenBank database. The distribution frequency of mmuM differs among the investigated genera of bacteria. The mapping results of 160 mmuM reference sequences showed that the mmuM genes were found in 7 species of pathogen genomes sequenced in this work. The polymerase chain reaction products of one mmuM genotype (NC_013951 as the reference) were sequenced and the sequencing results confirmed the mapping results. Furthermore, 144 representative sequences were chosen for phylogenetic analysis and some mmuM genes from totally different genera (such as those of Escherichia and Klebsiella, and of Enterobacter and Kosakonia) shared a closer phylogenetic relationship than those from the same genus. Comparative genomic analysis of the mmuM encoding regions on plasmids and bacterial chromosomes showed that the pKF3-140 and pIP1206 plasmids shared a 21 kb homology region and that a 4.9 kb fragment in this region in fact originated from the Escherichia coli chromosome. These results further suggested that the mmuM gene underwent horizontal gene transfer among different species or genera of bacteria. High-throughput sequencing combined with comparative genomics analysis can be used to explore the distribution and dissemination of the mmuM gene among bacteria and its evolution at a molecular level.

  5. Molecular cloning and expression analysis of annexin A2 gene in sika deer antler tip.

    PubMed

    Xia, Yanling; Qu, Haomiao; Lu, Binshan; Zhang, Qiang; Li, Heping

    2018-04-01

    Molecular cloning and bioinformatics analysis of the annexin A2 (ANXA2) gene in sika deer antler tip were conducted. The role of the ANXA2 gene in the growth and development of the antler was analyzed initially. The reverse transcriptase polymerase chain reaction (RT-PCR) was used to clone the cDNA sequence of the ANXA2 gene from antler tip of sika deer (Cervus nippon hortulorum) and bioinformatics methods were applied to analyze the amino acid sequence of the Anxa2 protein. The mRNA expression levels of the ANXA2 gene in different growth stages were examined by real time reverse transcriptase polymerase chain reaction (real time RT-PCR). The nucleotide sequence analysis revealed an open reading frame of 1,020 bp encoding a 339-amino-acid protein with a calculated molecular weight of 38.6 kDa and an isoelectric point of 6.09. Homologous sequence alignment and phylogenetic analysis indicated that the Anxa2 mature protein of sika deer had the closest genetic distance with Cervus elaphus and Bos mutus. Real time RT-PCR results showed that the gene had differential expression levels in different growth stages, and the expression level of the ANXA2 gene was the highest at metaphase (rapid growing period). The ANXA2 gene may promote cell proliferation, and the finding suggested Anxa2 as an important candidate for regulating the growth and development of deer antler.

  6. Quantification of the methylation status of the PWS/AS imprinted region: comparison of two approaches based on bisulfite sequencing and methylation-sensitive MLPA.

    PubMed

    Dikow, Nicola; Nygren, Anders Oh; Schouten, Jan P; Hartmann, Carolin; Krämer, Nikola; Janssen, Bart; Zschocke, Johannes

    2007-06-01

    Standard methods used for genomic methylation analysis allow the detection of complete absence of either methylated or non-methylated alleles but are usually unable to detect changes in the proportion of methylated and unmethylated alleles. We compare two methods for quantitative methylation analysis, using the chromosome 15q11-q13 imprinted region as model. Absence of the non-methylated paternal allele in this region leads to Prader-Willi syndrome (PWS) whilst absence of the methylated maternal allele results in Angelman syndrome (AS). A proportion of AS is caused by mosaic imprinting defects which may be missed with standard methods and require quantitative analysis for their detection. Sequence-based quantitative methylation analysis (SeQMA) involves quantitative comparison of peaks generated through sequencing reactions after bisulfite treatment. It is simple, cost-effective and can be easily established for a large number of genes. However, our results support previous suggestions that methods based on bisulfite treatment may be problematic for exact quantification of methylation status. Methylation-specific multiplex ligation-dependent probe amplification (MS-MLPA) avoids bisulfite treatment. It detects changes in both CpG methylation as well as copy number of up to 40 chromosomal sequences in one simple reaction. Once established in a laboratory setting, the method is more accurate, reliable and less time consuming.

  7. SSU rRNA-based phylogenetic position of the genera Amoeba and Chaos (Lobosea, Gymnamoebia): the origin of gymnamoebae revisited.

    PubMed

    Bolivar, I; Fahrni, J F; Smirnov, A; Pawlowski, J

    2001-12-01

    Naked lobose amoebae (gymnamoebae) are among the most abundant group of protists present in all aquatic and terrestrial biotopes. Yet, because of lack of informative morphological characters, the origin and evolutionary history of gymnamoebae are poorly known. The first molecular studies revealed multiple origins for the amoeboid lineages and an extraordinary diversity of amoebae species. Molecular data, however, exist only for a few species of the numerous taxa belonging to this group. Here, we present the small-subunit (SSU) rDNA sequences of four species of typical large gymnamoebae: Amoeba proteus, Amoeba leningradensis, Chaos nobile, and Chaos carolinense. Sequence analysis suggests that the four species are closely related to the species of genera Saccamoeba, Leptomyxa, Rhizamoeba, Paraflabellula, Hartmannella, and Echinamoeba. All of them form a relatively well-supported clade, which corresponds to the subclass Gymnamoebia, in agreement with morphology-based taxonomy. The other gymnamoebae cluster in small groups or branch separately. Their relationships change depending on the type of analysis and the model of nucleotide substitution. All gymnamoebae branch together in Neighbor-Joining analysis with corrections for among-site rate heterogeneity and proportion of invariable sites. This clade, however, is not statistically supported by SSU rRNA gene sequences and further analysis of protein sequence data will be necessary to test the monophyly of gymnamoebae.

  8. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Riley, Robert; Salamov, Asaf; Otillar, Robert

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37% of the described fungi and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48% of basidiomycete proteins are unique to the phylum with nearly half of those (22%) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has ligninolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  9. Molecular cloning, mRNA expression and tissue distribution analysis of Slc7a11 gene in alpaca (Lama paco) skins associated with different coat colors.

    PubMed

    Tian, Xue; Meng, Xiaolin; Wang, Liangyan; Song, Yunfei; Zhang, Danli; Ji, Yuankai; Li, Xuejun; Dong, Changsheng

    2015-01-25

    Slc7a11 encoding solute carrier family 7 member 11 (anionic amino acid transporter light chain, xCT), has been identified to be a critical genetic regulator of pheomelanin synthesis in hair and melanocytes. To better understand the molecular characterization of Slc7a11 and the expression patterns in skin of white versus brown alpaca (Lama paco), we cloned the full length coding sequence (CDS) of alpaca Slc7a11 gene and analyzed the expression patterns using Real Time PCR, Western blotting and immunohistochemistry. The full length CDS of 1512 bp encodes a 503 amino acid polypeptide. Sequence analysis showed that alpaca xCT contains 12 transmembrane regions consistent with the highly conserved amino acid permease (AA_permease_2) domain similar to other vertebrates. Sequence alignment and phylogenetic analysis revealed that alpaca xCT had the highest identity and shared the same branch with Camelus ferus. Real Time PCR and Western blotting suggested that xCT was expressed at significantly high levels in brown alpaca skin, and transcripts and protein possessed the same expression pattern in white and brown alpaca skins. Additionally, immunohistochemical analysis further demonstrated that xCT staining was robustly increased in the matrix and root sheath of brown alpaca skin compared with that of white. These results suggest that Slc7a11 functions in alpaca coat color regulation and offer essential information for further exploration on the role of Slc7a11 in melanogenesis. Copyright © 2014 Elsevier B.V. All rights reserved.
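
    The CDS length and polypeptide length quoted in this abstract can be checked against each other with a few lines of arithmetic. The sketch below assumes that the reported CDS length includes the terminal stop codon; the function name is illustrative and not taken from the paper.

```python
def protein_length_from_cds(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of the given length.

    Assumption: the CDS is a whole number of codons and, when
    includes_stop is True, its last codon is a stop codon.
    """
    if cds_length_bp <= 0 or cds_length_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_length_bp // 3
    return codons - 1 if includes_stop else codons


# 1512 bp = 504 codons; one of them is the stop codon.
alpaca_xct_length = protein_length_from_cds(1512)
print(alpaca_xct_length)
```

    Under that assumption the two figures in the abstract agree: 504 codons minus one stop codon gives 503 residues.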

  10. Comparative genome analysis of Pediococcus damnosus LMG 28219, a strain well-adapted to the beer environment.

    PubMed

    Snauwaert, Isabel; Stragier, Pieter; De Vuyst, Luc; Vandamme, Peter

    2015-04-03

    Pediococcus damnosus LMG 28219 is a lactic acid bacterium dominating the maturation phase of Flemish acid beer productions. It proved to be capable of growing in beer, thereby resisting this environment, which is unfavorable for microbial growth. The molecular mechanisms underlying its metabolic capabilities and niche adaptations were unknown up to now. In the present study, whole-genome sequencing and comparative genome analysis were used to investigate this strain's mechanisms to reside in the beer niche, with special focus on not only stress and hop resistances but also folate biosynthesis and exopolysaccharide (EPS) production. The draft genome sequence of P. damnosus LMG 28219 harbored 183 contigs, including an intact prophage region and several coding sequences involved in plasmid replication. The annotation of 2178 coding sequences revealed the presence of many transporters and transcriptional regulators and several genes involved in oxidative stress response, hop resistance, de novo folate biosynthesis, and EPS production. Comparative genome analysis of P. damnosus LMG 28219 with Pediococcus claussenii ATCC BAA-344(T) (beer origin) and Pediococcus pentosaceus ATCC 25745 (plant origin) revealed that various hop resistance genes and genes involved in de novo folate biosynthesis were unique to the strains isolated from beer. This contrasted with the genes related to osmotic stress responses, which were shared between the strains compared. Furthermore, transcriptional regulators were enriched in the genomes of bacteria capable of growth in beer, suggesting that those cause rapid up- or down-regulation of gene expression. Genome sequence analysis of P. damnosus LMG 28219 provided insights into the underlying mechanisms of its adaptation to the beer niche. The results presented will enable analysis of the transcriptome and proteome of P. damnosus LMG 28219, which will result in additional knowledge on its metabolic activities.

  11. Phylogenetic analysis of Mycobacterium massiliense strains having recombinant rpoB gene laterally transferred from Mycobacterium abscessus.

    PubMed

    Kim, Byoung-Jun; Kim, Ga-Na; Kim, Bo-Ram; Shim, Tae-Sun; Kook, Yoon-Hoh; Kim, Bum-Joon

    2017-01-01

    Recent multi locus sequence typing (MLST) and genome based studies indicate that lateral gene transfer (LGT) events in the rpoB gene are prevalent between Mycobacterium abscessus complex strains. To check the prevalence of the M. massiliense strains subject to rpoB LGT (Rec-mas), we applied rpoB typing (711 bp) to 106 Korean strains of M. massiliense infection that had already been identified by hsp65 sequence analysis (603 bp). The analysis indicated 6 smooth strains in M. massiliense Type I (10.0%, 6/60) genotypes but no strains in M. massiliense Type II genotypes (0%, 0/46), showing a discrepancy between the 2 typing methods. Further MLST analysis based on the partial sequencing of seven housekeeping genes, argH, cya, glpK, gnd, murC, pta and purH, as well as erm(41) PCR proved that these 6 Rec-mas strains consisted of two distinct genotypes belonging to M. massiliense and not M. abscessus. The complete rpoB sequencing analysis showed that these 6 Rec-mas strains have an identical hybrid rpoB gene, of which a 478 bp partial rpoB fragment may be laterally transferred from M. abscessus. Notably, five of the 6 Rec-mas strains showed complete identical sequences in a total of nine genes, including the seven MLST genes, hsp65, and rpoB, suggesting their clonal propagation in South Korea. In conclusion, we identified 6 M. massiliense smooth strains of 2 phylogenetically distinct genotypes with a specific hybrid rpoB gene laterally transferred from M. abscessus from Korean patients. Their clinical relevance and bacteriological traits remain to be elucidated.

  12. Phylogenetic analysis of Mycobacterium massiliense strains having recombinant rpoB gene laterally transferred from Mycobacterium abscessus

    PubMed Central

    Kim, Byoung-Jun; Kim, Ga-Na; Kim, Bo-Ram; Shim, Tae-Sun; Kook, Yoon-Hoh

    2017-01-01

    Recent multi locus sequence typing (MLST) and genome based studies indicate that lateral gene transfer (LGT) events in the rpoB gene are prevalent between Mycobacterium abscessus complex strains. To check the prevalence of the M. massiliense strains subject to rpoB LGT (Rec-mas), we applied rpoB typing (711 bp) to 106 Korean strains of M. massiliense infection that had already been identified by hsp65 sequence analysis (603 bp). The analysis indicated 6 smooth strains in M. massiliense Type I (10.0%, 6/60) genotypes but no strains in M. massiliense Type II genotypes (0%, 0/46), showing a discrepancy between the 2 typing methods. Further MLST analysis based on the partial sequencing of seven housekeeping genes, argH, cya, glpK, gnd, murC, pta and purH, as well as erm(41) PCR proved that these 6 Rec-mas strains consisted of two distinct genotypes belonging to M. massiliense and not M. abscessus. The complete rpoB sequencing analysis showed that these 6 Rec-mas strains have an identical hybrid rpoB gene, of which a 478 bp partial rpoB fragment may be laterally transferred from M. abscessus. Notably, five of the 6 Rec-mas strains showed complete identical sequences in a total of nine genes, including the seven MLST genes, hsp65, and rpoB, suggesting their clonal propagation in South Korea. In conclusion, we identified 6 M. massiliense smooth strains of 2 phylogenetically distinct genotypes with a specific hybrid rpoB gene laterally transferred from M. abscessus from Korean patients. Their clinical relevance and bacteriological traits remain to be elucidated. PMID:28604829

  13. KM+, a mannose-binding lectin from Artocarpus integrifolia: amino acid sequence, predicted tertiary structure, carbohydrate recognition, and analysis of the beta-prism fold.

    PubMed Central

    Rosa, J. C.; De Oliveira, P. S.; Garratt, R.; Beltramini, L.; Resing, K.; Roque-Barreira, M. C.; Greene, L. J.

    1999-01-01

    The complete amino acid sequence of the lectin KM+ from Artocarpus integrifolia (jackfruit), which contains 149 residues/mol, is reported and compared to those of other members of the Moraceae family, particularly that of jacalin, also from jackfruit, with which it shares 52% sequence identity. KM+ presents an acetyl-blocked N-terminus and is not posttranslationally modified by proteolytic cleavage as is the case for jacalin. Rather, it possesses a short, glycine-rich linker that unites the regions homologous to the alpha- and beta-chains of jacalin. The results of homology modeling implicate the linker sequence in sterically impeding rotation of the side chain of Asp141 within the binding site pocket. As a consequence, the aspartic acid is locked into a conformation adequate only for the recognition of equatorial hydroxyl groups on the C4 epimeric center (alpha-D-mannose, alpha-D-glucose, and their derivatives). In contrast, the internal cleavage of the jacalin chain permits free rotation of the homologous aspartic acid, rendering it capable of accepting hydrogen bonds from both possible hydroxyl configurations on C4. We suggest that, together with direct recognition of epimeric hydroxyls and the steric exclusion of disfavored ligands, conformational restriction of the lectin should be considered to be a new mechanism by which selectivity may be built into carbohydrate binding sites. Jacalin and KM+ adopt the beta-prism fold already observed in two unrelated protein families. Despite presenting little or no sequence similarity, an analysis of the beta-prism reveals a canonical feature repeatedly present in all such structures, which is based on six largely hydrophobic residues within a beta-hairpin containing two classic-type beta-bulges. We suggest the term beta-prism motif to describe this feature. PMID:10210179

  14. Assessment of acute myocarditis by cardiac magnetic resonance imaging: Comparison of qualitative and quantitative analysis methods.

    PubMed

    Imbriaco, Massimo; Nappi, Carmela; Puglia, Marta; De Giorgi, Marco; Dell'Aversana, Serena; Cuocolo, Renato; Ponsiglione, Andrea; De Giorgi, Igino; Polito, Maria Vincenza; Klain, Michele; Piscione, Federico; Pace, Leonardo; Cuocolo, Alberto

    2017-10-26

    To compare cardiac magnetic resonance (CMR) qualitative and quantitative analysis methods for the noninvasive assessment of myocardial inflammation in patients with suspected acute myocarditis (AM). A total of 61 patients with suspected AM underwent coronary angiography and CMR. Qualitative analysis was performed applying Lake-Louise Criteria (LLC), followed by quantitative analysis based on the evaluation of edema ratio (ER) and global relative enhancement (RE). Diagnostic performance was assessed for each method by measuring the area under the curves (AUC) of the receiver operating characteristic analyses. The final diagnosis of AM was based on symptoms and signs suggestive of cardiac disease, evidence of myocardial injury as defined by electrocardiogram changes, elevated troponin I, exclusion of coronary artery disease by coronary angiography, and clinical and echocardiographic follow-up at 3 months after admission to the chest pain unit. In all patients, coronary angiography did not show significant coronary artery stenosis. Troponin I levels and creatine kinase were higher in patients with AM compared to those without (both P < .001). There were no significant differences among LLC, T2-weighted short inversion time inversion recovery (STIR) sequences, early (EGE), and late (LGE) gadolinium-enhancement sequences for diagnosis of AM. The AUC for qualitative (T2-weighted STIR 0.92, EGE 0.87 and LGE 0.88) and quantitative (ER 0.89 and global RE 0.80) analyses were also similar. Qualitative and quantitative CMR analysis methods show similar diagnostic accuracy for the diagnosis of AM. These findings suggest that a simplified approach using a shortened CMR protocol including only T2-weighted STIR sequences might be useful to rule out AM in patients with acute coronary syndrome and normal coronary angiography.

  15. Direct Detection and Identification of Prosthetic Joint Infection Pathogens in Synovial Fluid by Metagenomic Shotgun Sequencing.

    PubMed

    Ivy, Morgan I; Thoendel, Matthew J; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Hanssen, Arlen D; Abdel, Matthew P; Chia, Nicholas; Yao, Janet Z; Tande, Aaron J; Mandrekar, Jayawant N; Patel, Robin

    2018-05-30

    Background: Metagenomic shotgun sequencing has the potential to transform how serious infections are diagnosed by offering universal, culture-free pathogen detection. This may be especially advantageous for microbial diagnosis of prosthetic joint infection (PJI) by synovial fluid analysis, since synovial fluid cultures are not universally positive, and synovial fluid is easily obtained pre-operatively. We applied a metagenomics-based approach to synovial fluid in an attempt to detect microorganisms in 168 failed total knee arthroplasties. Results: Genus- and species-level analysis of metagenomic sequencing yielded the known pathogen in 74 (90%) and 68 (83%) of the 82 culture-positive PJIs analyzed, respectively, with testing of two (2%) and three (4%) samples, respectively, yielding additional pathogens not detected by culture. For the 25 culture-negative PJIs tested, genus- and species-level analysis yielded 19 (76%) and 21 (84%) samples with insignificant findings, respectively, and 6 (24%) and 4 (16%) with potential pathogens detected, respectively. Genus- and species-level analysis of the 60 culture-negative aseptic failure cases yielded 53 (88.3%) and 56 (93.3%) cases with insignificant findings, and 7 (11.7%) and 4 (6.7%) with potential clinically-significant organisms detected, respectively. There was one case of aseptic failure with synovial fluid culture growth; metagenomic analysis showed insignificant findings, suggesting possible synovial fluid culture contamination. Conclusion: Metagenomic shotgun sequencing can detect pathogens involved in PJI when applied to synovial fluid and may be particularly useful for culture-negative cases. Copyright © 2018 American Society for Microbiology.

  16. Characterization of HIV Transmission in South-East Austria

    PubMed Central

    Kessler, Harald H.; Haas, Bernhard; Stelzl, Evelyn; Weninger, Karin; Little, Susan J.; Mehta, Sanjay R.

    2016-01-01

    To gain deeper insight into the epidemiology of HIV-1 transmission in South-East Austria we performed a retrospective analysis of 259 HIV-1 partial pol sequences obtained from unique individuals newly diagnosed with HIV infection in South-East Austria from 2008 through 2014. After quality filtering, putative transmission linkages were inferred when two sequences were ≤1.5% genetically different. Multiple linkages were resolved into putative transmission clusters. Further phylogenetic analyses were performed using BEAST v1.8.1. Finally, we investigated putative links between the 259 sequences from South-East Austria and all publicly available HIV polymerase sequences in the Los Alamos National Laboratory HIV sequence database. We found that 45.6% (118/259) of the sampled sequences were genetically linked with at least one other sequence from South-East Austria forming putative transmission clusters. Clustering individuals were more likely to be men who have sex with men (MSM; p<0.001), infected with subtype B (p<0.001) or subtype F (p = 0.02). Among clustered males who reported only heterosexual (HSX) sex as an HIV risk, 47% clustered closely with MSM (either as pairs or within larger MSM clusters). One hundred and seven of the 259 sequences (41.3%) from South-East Austria had at least one putative inferred linkage with sequences from a total of 69 other countries. In conclusion, analysis of HIV-1 sequences from newly diagnosed individuals residing in South-East Austria revealed a high degree of national and international clustering mainly within MSM. Interestingly, we found that a high number of heterosexual males clustered within MSM networks, suggesting either linkage between risk groups or misrepresentation of sexual risk behaviors by subjects. PMID:26967154

  17. Characterization of HIV Transmission in South-East Austria.

    PubMed

    Hoenigl, Martin; Chaillon, Antoine; Kessler, Harald H; Haas, Bernhard; Stelzl, Evelyn; Weninger, Karin; Little, Susan J; Mehta, Sanjay R

    2016-01-01

    To gain deeper insight into the epidemiology of HIV-1 transmission in South-East Austria we performed a retrospective analysis of 259 HIV-1 partial pol sequences obtained from unique individuals newly diagnosed with HIV infection in South-East Austria from 2008 through 2014. After quality filtering, putative transmission linkages were inferred when two sequences were ≤1.5% genetically different. Multiple linkages were resolved into putative transmission clusters. Further phylogenetic analyses were performed using BEAST v1.8.1. Finally, we investigated putative links between the 259 sequences from South-East Austria and all publicly available HIV polymerase sequences in the Los Alamos National Laboratory HIV sequence database. We found that 45.6% (118/259) of the sampled sequences were genetically linked with at least one other sequence from South-East Austria forming putative transmission clusters. Clustering individuals were more likely to be men who have sex with men (MSM; p<0.001), infected with subtype B (p<0.001) or subtype F (p = 0.02). Among clustered males who reported only heterosexual (HSX) sex as an HIV risk, 47% clustered closely with MSM (either as pairs or within larger MSM clusters). One hundred and seven of the 259 sequences (41.3%) from South-East Austria had at least one putative inferred linkage with sequences from a total of 69 other countries. In conclusion, analysis of HIV-1 sequences from newly diagnosed individuals residing in South-East Austria revealed a high degree of national and international clustering mainly within MSM. Interestingly, we found that a high number of heterosexual males clustered within MSM networks, suggesting either linkage between risk groups or misrepresentation of sexual risk behaviors by subjects.
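
    The linkage rule described in this abstract (two sequences are linked when they differ by no more than 1.5%, and multiple linkages are merged into clusters) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes pre-aligned sequences of equal length and uses a plain proportion of differing sites, whereas the study may have used a model-based genetic distance. All sequence names and data are made up.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def link_clusters(seqs: dict, threshold: float = 0.015) -> list:
    """Link pairs with distance <= threshold and merge links into clusters.

    Uses a small union-find structure; singletons are not reported.
    """
    parent = {name: name for name in seqs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]


def mutate(seq: str, positions) -> str:
    """Helper for the toy data: substitute the base at each given position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for i in positions:
        chars[i] = swap[chars[i]]
    return "".join(chars)


# Hypothetical 200-nt alignment: P2 is 1% from P1, P3 is 1% from P2
# but 2% from P1, and P4 is far from everything.
base = "ACGT" * 50
toy = {
    "P1": base,
    "P2": mutate(base, [0, 10]),
    "P3": mutate(mutate(base, [0, 10]), [20, 30]),
    "P4": mutate(base, range(0, 200, 5)),
}
clusters = link_clusters(toy)
print(clusters)
```

    In the toy data P1 and P3 exceed the threshold when compared directly, yet both link to P2, so all three fall into one cluster. This is the sense in which multiple pairwise linkages resolve into a single putative transmission cluster.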

  18. Sequence analysis of dolphin ferritin H and L subunits and possible iron-dependent translational control of dolphin ferritin gene

    PubMed Central

    Takaesu, Azusa; Watanabe, Kiyotaka; Takai, Shinji; Sasaki, Yukako; Orino, Koichi

    2008-01-01

    Background: Iron-storage protein, ferritin plays a central role in iron metabolism. Ferritin has dual function to store iron and segregate iron for protection of iron-catalyzed reactive oxygen species. Tissue ferritin is composed of two kinds of subunits (H: heavy chain or heart-type subunit; L: light chain or liver-type subunit). Ferritin gene expression is controlled at translational level in iron-dependent manner or at transcriptional level in iron-independent manner. However, sequencing analysis of marine mammalian ferritin subunits has not yet been performed fully. The purpose of this study is to reveal cDNA-derived amino acid sequences of cetacean ferritin H and L subunits, and demonstrate the possibility of expression of these subunits, especially H subunit, by iron. Methods: Sequence analyses of cetacean ferritin H and L subunits were performed by direct sequencing of polymerase chain reaction (PCR) fragments from cDNAs generated via reverse transcription-PCR of leukocyte total RNA prepared from blood samples of six different dolphin species (Pseudorca crassidens, Lagenorhynchus obliquidens, Grampus griseus, Globicephala macrorhynchus, Tursiops truncatus, and Delphinapterus leucas). The putative iron-responsive element sequence in the 5'-untranslated region of the six different dolphin species was revealed by direct sequencing of PCR fragments obtained using leukocyte genomic DNA. Results: Dolphin H and L subunits consist of 182 and 174 amino acids, respectively, and amino acid sequence identities of ferritin subunits among these dolphins are highly conserved (H: 99–100%, (99→98) ; L: 98–100%). The conserved 28 bp IRE sequence was located -144 bp upstream from the initiation codon in the six different dolphin species. Conclusion: These results indicate that six different dolphin species have conserved ferritin sequences, and suggest that these genes are iron-dependently expressed. PMID:18954429

  19. Lipopolysaccharide-induced innate immune factors in the bottlenose dolphin (Tursiops truncatus) detected in expression sequence tag analysis.

    PubMed

    Ohishi, Kazue; Shishido, Reiko; Iwata, Yasunao; Saitoh, Masafumi; Takenaka, Ryota; Ohtsu, Dai; Okutsu, Kenji; Maruyama, Tadashi

    2011-11-01

    EST analysis based on the megaclone-megasorting method was performed using leukocytes from the bottlenose dolphin (Tursiops truncatus) with or without LPS stimulation. A total of 849 upregulated and 384 downregulated EST clones were sequenced, annotated, and functionally classified. Ferritin heavy peptide I was the most abundant upregulated transcript, suggesting that LPS stimulation induced high production of reactive oxygen species, which were sequestered in ferritin. Among the immune factors, the transcripts coding for an IL-1Ra, homologs to bovine serum amyloid A3, and canine intercellular adhesion molecule-1 were highly expressed. Markedly downregulated transcripts of immune factors were those for homologs of calcium-binding proteins belonging to the S100 family, S100A12, S100A8, and S100A6. Time-course experiments on the expression of some immune factors including IL-1Ra suggested that these factors interact and control cetacean innate immunity. © 2011 The Societies and Blackwell Publishing Asia Pty Ltd.

  20. Accelerated Evolution of the Pituitary Adenylate Cyclase-Activating Polypeptide Precursor Gene During Human Origin

    PubMed Central

    Wang, Yin-qiu; Qian, Ya-ping; Yang, Su; Shi, Hong; Liao, Cheng-hong; Zheng, Hong-Kun; Wang, Jun; Lin, Alice A.; Cavalli-Sforza, L. Luca; Underhill, Peter A.; Chakraborty, Ranajit; Jin, Li; Su, Bing

    2005-01-01

    Pituitary adenylate cyclase-activating polypeptide (PACAP) is a neuropeptide abundantly expressed in the central nervous system and involved in regulating neurogenesis and neuronal signal transduction. The amino acid sequence of PACAP is extremely conserved across vertebrate species, indicating a strong functional constraint during the course of evolution. However, through comparative sequence analysis, we demonstrated that the PACAP precursor gene underwent an accelerated evolution in the human lineage since the divergence from chimpanzees, and the amino acid substitution rate in humans is at least seven times faster than that in other mammal species resulting from strong Darwinian positive selection. Eleven human-specific amino acid changes were identified in the PACAP precursors, which are conserved from murine to African apes. Protein structural analysis suggested that a putative novel neuropeptide might have originated during human evolution and functioned in the human brain. Our data suggested that the PACAP precursor gene underwent adaptive changes during human origin and may have contributed to the formation of human cognition. PMID:15834139

  1. Elucidation of cross-species proteomic effects in human and hominin bone proteome identification through a bioinformatics experiment.

    PubMed

    Welker, F

    2018-02-20

    The study of ancient protein sequences is increasingly focused on the analysis of older samples, including those of ancient hominins. The analysis of such ancient proteomes thereby potentially suffers from "cross-species proteomic effects": the loss of peptide and protein identifications at increased evolutionary distances due to a larger number of protein sequence differences between the database sequence and the analyzed organism. Error-tolerant proteomic search algorithms should theoretically overcome this problem at both the peptide and protein level; however, this has not been demonstrated. If error-tolerant searches do not overcome the cross-species proteomic issue then there might be inherent biases in the identified proteomes. Here, a bioinformatics experiment is performed to test this using a set of modern human bone proteomes and three independent searches against sequence databases at increasing evolutionary distances: the human (0 Ma), chimpanzee (6-8 Ma) and orangutan (16-17 Ma) reference proteomes, respectively. Incorrectly suggested amino acid substitutions are absent when employing adequate filtering criteria for mutable Peptide Spectrum Matches (PSMs), but roughly half of the mutable PSMs were not recovered. As a result, peptide and protein identification rates are higher in error-tolerant mode compared to non-error-tolerant searches but did not recover protein identifications completely. Data indicates that peptide length and the number of mutations between the target and database sequences are the main factors influencing mutable PSM identification. The error-tolerant results suggest that the cross-species proteomics problem is not overcome at increasing evolutionary distances, even at the protein level. Peptide and protein loss has the potential to significantly impact divergence dating and proteome comparisons when using ancient samples as there is a bias towards the identification of conserved sequences and proteins. Effects are minimized between moderately divergent proteomes, as indicated by almost complete recovery of informative positions in the search against the chimpanzee proteome (≈90%, 6-8 Ma). This provides a bioinformatic background to future phylogenetic and proteomic analysis of ancient hominin proteomes, including the future description of novel hominin amino acid sequences, but also has negative implications for the study of fast-evolving proteins in hominins, non-hominin animals, and ancient bacterial proteins in evolutionary contexts.

  2. Multilocus sequence typing of Lactococcus lactis from naturally fermented milk foods in ethnic minority areas of China.

    PubMed

    Xu, Haiyan; Sun, Zhihong; Liu, Wenjun; Yu, Jie; Song, Yuqin; Lv, Qiang; Zhang, Jiachao; Shao, Yuyu; Menghe, Bilige; Zhang, Heping

    2014-05-01

    To determine the genetic diversity and phylogenetic relationships among Lactococcus lactis isolates, 197 strains isolated from naturally homemade yogurt in 9 ethnic minority areas of 6 provinces of China were subjected to multilocus sequence typing (MLST). The MLST analysis was performed using internal fragment sequences of 12 housekeeping genes (carB, clpX, dnaA, groEL, murC, murE, pepN, pepX, pyrG, recA, rpoB, and pheS). Six (dnaA) to 8 (murC) different alleles were detected for these genes, which ranged from 33.62 (clpX) to 41.95% (recA) GC (guanine-cytosine) content. The nucleotide diversity (π) ranged from 0.00362 (murE) to 0.08439 (carB). Despite this limited allelic diversity, the allele combinations of each strain revealed 72 different sequence types, which denoted significant genotypic diversity. The dN/dS ratios (where dS is the number of synonymous substitutions per synonymous site, and dN is the number of nonsynonymous substitutions per nonsynonymous site) were lower than 1, suggesting potential negative selection for these genes. The standardized index of association of the alleles IA(S)=0.3038 supported the clonality of Lc. lactis, but the presence of network structure revealed by the split decomposition analysis of the concatenated sequence was strong evidence for intraspecies recombination. Therefore, this suggests that recombination contributed to the evolution of Lc. lactis. A minimum spanning tree analysis of the 197 isolates identified 14 clonal complexes and 23 singletons. Phylogenetic trees were constructed based on the sequence types, using the minimum evolution algorithm, and on the concatenated sequence (6,192 bp), using the unweighted pair-group method with arithmetic mean, and these trees indicated that the evolution of our Lc. lactis population was correlated with geographic origin. Taken together, our results demonstrated that MLST could provide a better understanding of Lc. lactis genome evolution, as well as useful information for future studies on global Lc. lactis structure and genetic evolution, which will lay the foundation for screening Lc. lactis as starter cultures in fermented dairy products. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
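
    Several quantities in this abstract (alleles per locus, sequence types defined by allele combinations, GC content, and nucleotide diversity π) follow simple definitions that a short sketch can make concrete. The code below is illustrative only: the isolates and six-base "gene fragments" are invented, only two of the twelve loci are mimicked, and π is computed as the mean pairwise proportion of differing sites, which is an assumption about the estimator rather than a statement of the authors' software.

```python
from itertools import combinations


def allele_numbers(sequences: list) -> list:
    """Number distinct sequences at one locus 1, 2, 3, ... by first appearance."""
    seen = {}
    return [seen.setdefault(seq, len(seen) + 1) for seq in sequences]


def sequence_types(profiles: dict) -> dict:
    """Assign one sequence type (ST) number per distinct allele profile."""
    seen = {}
    return {iso: seen.setdefault(profile, len(seen) + 1)
            for iso, profile in profiles.items()}


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def nucleotide_diversity(sequences: list) -> float:
    """Mean pairwise proportion of differing sites (aligned, equal length)."""
    pairs = list(combinations(sequences, 2))
    diffs = [sum(x != y for x, y in zip(a, b)) / len(a) for a, b in pairs]
    return sum(diffs) / len(pairs)


# Hypothetical data: four isolates typed at two loci.
loci = ["carB", "recA"]
isolates = {
    "iso1": {"carB": "ATGGCA", "recA": "GGTACC"},
    "iso2": {"carB": "ATGGCA", "recA": "GGTACC"},
    "iso3": {"carB": "ATGACA", "recA": "GGTACC"},
    "iso4": {"carB": "ATGACA", "recA": "GGTTCC"},
}

names = list(isolates)
alleles = {locus: allele_numbers([isolates[n][locus] for n in names])
           for locus in loci}
profiles = {n: tuple(alleles[locus][i] for locus in loci)
            for i, n in enumerate(names)}
sts = sequence_types(profiles)
pi_carB = nucleotide_diversity([isolates[n]["carB"] for n in names])

print(profiles)
print(sts)
print(gc_content("ATGGCA"), pi_carB)
```

    Each toy locus has only two alleles, yet the four isolates carry three distinct allele combinations. The same effect, at larger scale, is how 6 to 8 alleles per locus across 12 loci can yield 72 sequence types.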

  3. A comparative molecular analysis of water-filled limestone sinkholes in north-eastern Mexico.

    PubMed

    Sahl, Jason W; Gary, Marcus O; Harris, J Kirk; Spear, John R

    2011-01-01

    Sistema Zacatón in north-eastern Mexico is host to several deep, water-filled, anoxic, karstic sinkholes (cenotes). These cenotes were explored, mapped, and geochemically and microbiologically sampled by the autonomous underwater vehicle deep phreatic thermal explorer (DEPTHX). The community structure of the filterable fraction of the water column and extensive microbial mats that coat the cenote walls was investigated by comparative analysis of small-subunit (SSU) 16S rRNA gene sequences. Full-length Sanger gene sequence analysis revealed novel microbial diversity that included three putative bacterial candidate phyla and three additional groups that showed high intra-clade distance with poorly characterized bacterial candidate phyla. Limited functional gene sequence analysis in these anoxic environments identified genes associated with methanogenesis, sulfate reduction and anaerobic ammonium oxidation. A directed, barcoded amplicon, multiplex pyrosequencing approach was employed to compare ∼100,000 bacterial SSU gene sequences from water column and wall microbial mat samples from five cenotes in Sistema Zacatón. A new, high-resolution sequence distribution profile (SDP) method identified changes in specific phylogenetic types (phylotypes) in microbial mats at varied depths; Mantel tests showed a correlation of the genetic distances between mat communities in two cenotes and the geographic location of each cenote. Community structure profiles from the water column of three neighbouring cenotes showed distinct variation; statistically significant differences in the concentration of geochemical constituents suggest that the variation observed in microbial communities between neighbouring cenotes is due to geochemical variation. © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.

  4. Human T-lymphotropic virus type 1 (HTLV-1) genetic typing in Kakeroma Island, an island at the crossroads of the Ryukyuans and Wajin in Japan, providing further insights into the origin of the virus in Japan.

    PubMed

    Eguchi, Katsuyuki; Fujii, Hidefumi; Oshima, Kengo; Otani, Masashi; Matsuo, Toshiaki; Yamamoto, Taro

    2009-08-01

    Peripheral blood samples were collected from 23 human T-lymphotropic virus type-1 (HTLV-1) carriers residing in Kakeroma Island, Japan (Kagoshima Prefecture, Oshima County, Setouchi Town), one of the most highly endemic areas in Japan. The samples were subjected to amplification by PCR and sequencing of the Long Terminal Repeat in order to reconstruct a phylogenetic tree of HTLV-1 isolates. Restriction Fragment Length Polymorphism (RFLP) analysis of env region was also conducted for subgrouping of HTLV-1. Although one sample could not be amplified by PCR, and three more could not be sequenced due to the existence of conspicuous nonspecific bands or repeated sequences, the phylogenetic analysis revealed that the remaining 19 isolates obtained from Kakeroma Island belonged to either the Transcontinental or the Japanese subgroups of the Cosmopolitan subtype, one of the three major subtypes. The RFLP data corresponded closely with the typing data throughout the sequencing. The proportion of the Transcontinental subgroup among the isolates was 26.3% (5 of 19) by sequence analysis and 27.3% (6 of 22) by RFLP. Unlike in Taiwan, China and Okinawa, the Japanese subgroup was dominant in Kakeroma Island. The analysis would also suggest that the Japanese subgroup seems not to have derived from the Transcontinental subgroup, but rather that the Transcontinental subgroup came to Japan first and was followed later by the Japanese one. 2009 Wiley-Liss, Inc.

  5. Evolutionary interpretations of mycobacteriophage biodiversity and host-range through the analysis of codon usage bias.

    PubMed

    Esposito, Lauren A; Gupta, Swati; Streiter, Fraida; Prasad, Ashley; Dennehy, John J

    2016-10-01

    In a genomics course sponsored by the Howard Hughes Medical Institute (HHMI), undergraduate students have isolated and sequenced the genomes of more than 1,150 mycobacteriophages, creating the largest database of sequenced bacteriophages able to infect a single host, Mycobacterium smegmatis, a soil bacterium. Genomic analysis indicates that these mycobacteriophages can be grouped into 26 clusters based on genetic similarity. These clusters span a continuum of genetic diversity, with extensive genomic mosaicism among phages in different clusters. However, little is known regarding the primary hosts of these mycobacteriophages in their natural habitats, nor of their broader host ranges. As such, it is possible that the primary host of many newly isolated mycobacteriophages is not M. smegmatis, but instead a range of closely related bacterial species. However, determining mycobacteriophage host range presents difficulties associated with mycobacterial cultivability, pathogenicity and growth. Another way to gain insight into mycobacteriophage host range and ecology is through bioinformatic analysis of their genomic sequences. To this end, we examined the correlations between the codon usage biases of 199 different mycobacteriophages and those of several fully sequenced mycobacterial species in order to gain insight into the natural host range of these mycobacteriophages. We find that UPGMA clustering tends to match, but not consistently, clustering by shared nucleotide sequence identity. In addition, analysis of GC content, tRNA usage and correlations between mycobacteriophage and mycobacterial codon usage bias suggests that the preferred host of many clustered mycobacteriophages is not M. smegmatis but other, as yet unknown, members of the mycobacteria complex or closely allied bacterial species.

  6. Evolutionary interpretations of mycobacteriophage biodiversity and host-range through the analysis of codon usage bias

    PubMed Central

    Esposito, Lauren A.; Gupta, Swati; Streiter, Fraida; Prasad, Ashley

    2016-01-01

    In a genomics course sponsored by the Howard Hughes Medical Institute (HHMI), undergraduate students have isolated and sequenced the genomes of more than 1,150 mycobacteriophages, creating the largest database of sequenced bacteriophages able to infect a single host, Mycobacterium smegmatis, a soil bacterium. Genomic analysis indicates that these mycobacteriophages can be grouped into 26 clusters based on genetic similarity. These clusters span a continuum of genetic diversity, with extensive genomic mosaicism among phages in different clusters. However, little is known regarding the primary hosts of these mycobacteriophages in their natural habitats, nor of their broader host ranges. As such, it is possible that the primary host of many newly isolated mycobacteriophages is not M. smegmatis, but instead a range of closely related bacterial species. However, determining mycobacteriophage host range presents difficulties associated with mycobacterial cultivability, pathogenicity and growth. Another way to gain insight into mycobacteriophage host range and ecology is through bioinformatic analysis of their genomic sequences. To this end, we examined the correlations between the codon usage biases of 199 different mycobacteriophages and those of several fully sequenced mycobacterial species in order to gain insight into the natural host range of these mycobacteriophages. We find that UPGMA clustering tends to match, but not consistently, clustering by shared nucleotide sequence identity. In addition, analysis of GC content, tRNA usage and correlations between mycobacteriophage and mycobacterial codon usage bias suggests that the preferred host of many clustered mycobacteriophages is not M. smegmatis but other, as yet unknown, members of the mycobacteria complex or closely allied bacterial species. PMID:28348827
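
The comparison of phage and host codon usage described above can be illustrated with a small sketch. It assumes raw relative codon frequencies and a Pearson correlation; the abstract does not say which codon usage measure or correlation statistic the authors used, and the short sequences below are made-up placeholders rather than real genes.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 64 codons in a fixed order so that frequency vectors are comparable.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
_CODON_SET = set(CODONS)


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence.

    Trailing bases that do not fill a codon, and codons containing
    characters other than A, C, G, T, are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)
    return [counts[c] / total if total else 0.0 for c in CODONS]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical toy sequences standing in for concatenated phage and host genes.
phage_cds = "ATGGCCGCCGACGGCTAA"
host_cds = "ATGGCCGACGGCGGCTGA"
similarity = pearson(codon_frequencies(phage_cds), codon_frequencies(host_cds))
```

A higher correlation between a phage and a candidate host would be read as a closer match in codon preference, which is the kind of signal the abstract uses to argue about host range.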

  7. A Partial Least Squares Based Procedure for Upstream Sequence Classification in Prokaryotes.

    PubMed

    Mehmood, Tahir; Bohlin, Jon; Snipen, Lars

    2015-01-01

    The upstream region of coding genes is important for several reasons, for instance for locating transcription factor binding sites and start sites in genomic DNA. Motivated by a recently conducted study in which a multivariate approach was successfully applied to coding sequence modeling, we have introduced a partial least squares (PLS) based procedure for the classification of true upstream prokaryotic sequence from background upstream sequence. The upstream sequences of conserved coding genes over genomes were considered in the analysis, where conserved coding genes were found by using the pan-genomics concept for each considered prokaryotic species. PLS uses a position specific scoring matrix (PSSM) to study the characteristics of the upstream region. Results obtained by the PLS based method were compared with the Gini importance of random forest (RF) and with support vector machine (SVM), which is a much-used method for sequence classification. The upstream sequence classification performance was evaluated by using cross validation, and the suggested approach identifies the prokaryotic upstream region significantly better than RF (p-value < 0.01) and SVM (p-value < 0.01). Further, the proposed method also produced results that concurred with known biological characteristics of the upstream region.
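
A position specific scoring matrix of the kind mentioned above can be sketched as follows. This is an illustration of a generic log-odds PSSM only: the pseudocount, the uniform background, and the example sequences are assumptions, and the PLS classification step of the study is not shown.

```python
from math import log2

ALPHABET = "ACGT"


def build_pssm(aligned, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length aligned DNA sequences.

    Each column becomes a dict mapping base -> log2(observed / background),
    with a pseudocount added so unseen bases get a finite score.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    pssm = []
    for pos in range(length):
        column = [seq[pos].upper() for seq in aligned]
        total = len(column) + pseudocount * len(ALPHABET)
        pssm.append({
            base: log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pssm


def score(pssm, seq):
    """Sum the per-position log-odds scores of seq (A, C, G, T only)."""
    return sum(column[base] for column, base in zip(pssm, seq.upper()))


# Hypothetical aligned upstream fragments resembling a -10 promoter element.
training = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
pssm = build_pssm(training)
motif_score = score(pssm, "TATAAT")
background_score = score(pssm, "GGGGGG")
```

Per-position scores like these could serve as input features for a classifier that separates true upstream regions from background sequence.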

  8. Whole genome analysis of porcine astroviruses detected in Japanese pigs reveals genetic diversity and possible intra-genotypic recombination.

    PubMed

    Ito, Mika; Kuroda, Moegi; Masuda, Tsuneyuki; Akagami, Masataka; Haga, Kei; Tsuchiaka, Shinobu; Kishimoto, Mai; Naoi, Yuki; Sano, Kaori; Omatsu, Tsutomu; Katayama, Yukie; Oba, Mami; Aoki, Hiroshi; Ichimaru, Toru; Mukono, Itsuro; Ouchi, Yoshinao; Yamasato, Hiroshi; Shirai, Junsuke; Katayama, Kazuhiko; Mizutani, Tetsuya; Nagai, Makoto

    2017-06-01

    Porcine astroviruses (PoAstVs) are ubiquitous enteric viruses of pigs that are distributed in several countries throughout the world. Since PoAstVs are detected in apparently healthy pigs, the clinical significance of infection is unknown. However, AstVs have recently been associated with a severe neurological disorder in animals, including humans, and zoonotic potential has been suggested. To date, little is known about the epidemiology of PoAstVs among the pig population in Japan. In this report, we present an analysis of nearly complete genomes of 36 PoAstVs detected by a metagenomics approach in the feces of Japanese pigs. Based on a phylogenetic analysis and pairwise sequence comparison, 10, 5, 15, and 6 sequences were classified as PoAstV2, PoAstV3, PoAstV4, and PoAstV5, respectively. Co-infection with two or three strains was found in individual fecal samples from eight pigs. The phylogenetic trees of ORF1a, ORF1b, and ORF2 of PoAstV2 and PoAstV4 showed differences in their topologies. The PoAstV3 and PoAstV5 strains shared high sequence identities within each genotype in all ORFs; however, one PoAstV3 strain and one PoAstV5 strain showed considerable sequence divergence from the other PoAstV3 and PoAstV5 strains, respectively, in ORF2. Recombination analysis using whole genomes revealed evidence of multiple possible intra-genotype recombination events in PoAstV2 and PoAstV4, suggesting that recombination might have contributed to the genetic diversity and played an important role in the evolution of Japanese PoAstVs. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

    PubMed

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-07-20

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. Non-exomic and synonymous variants in ABCA4 are an important cause of Stargardt disease

    PubMed Central

    Braun, Terry A.; Mullins, Robert F.; Wagner, Alex H.; Andorf, Jeaneen L.; Johnston, Rebecca M.; Bakall, Benjamin B.; Deluca, Adam P.; Fishman, Gerald A.; Lam, Byron L.; Weleber, Richard G.; Cideciyan, Artur V.; Jacobson, Samuel G.; Sheffield, Val C.; Tucker, Budd A.; Stone, Edwin M.

    2013-01-01

    Mutations in ABCA4 cause Stargardt disease and other blinding autosomal recessive retinal disorders. However, sequencing of the complete coding sequence in patients with clinical features of Stargardt disease sometimes fails to detect one or both mutations. For example, among 208 individuals with clear clinical evidence of ABCA4 disease ascertained at a single institution, 28 had only one disease-causing allele identified in the exons and splice junctions of the primary retinal transcript of the gene. Haplotype analysis of these 28 probands revealed 3 haplotypes shared among ten families, suggesting that 18 of the 28 missing alleles were rare enough to be present only once in the cohort. We hypothesized that mutations near rare alternate splice junctions in ABCA4 might cause disease by increasing the probability of mis-splicing at these sites. Next-generation sequencing of RNA extracted from human donor eyes revealed more than a dozen alternate exons that are occasionally incorporated into the ABCA4 transcript in normal human retina. We sequenced the genomic DNA containing 15 of these minor exons in the 28 one-allele subjects and observed five instances of two different variations in the splice signals of exon 36.1 that were not present in normal individuals (P < 10⁻⁶). Analysis of RNA obtained from the keratinocytes of patients with these mutations revealed the predicted alternate transcript. This study illustrates the utility of RNA sequence analysis of human donor tissue and patient-derived cell lines to identify mutations that would be undetectable by exome sequencing. PMID:23918662

  11. Recurring sequence-structure motifs in (βα)8-barrel proteins and experimental optimization of a chimeric protein designed based on such motifs.

    PubMed

    Wang, Jichao; Zhang, Tongchuan; Liu, Ruicun; Song, Meilin; Wang, Juncheng; Hong, Jiong; Chen, Quan; Liu, Haiyan

    2017-02-01

    An interesting way of generating novel artificial proteins is to combine sequence motifs from natural proteins, mimicking the evolutionary path suggested by natural proteins comprising recurring motifs. We analyzed the βα and αβ modules of TIM barrel proteins by structure alignment-based sequence clustering. A number of preferred motifs were identified. A chimeric TIM was designed by using recurring elements as mutually compatible interfaces. The foldability of the designed TIM protein was then significantly improved by six rounds of directed evolution. The melting temperature has been improved by more than 20°C. A variety of characteristics suggested that the resulting protein is well-folded. Our analysis provided a library of peptide motifs that is potentially useful for different protein engineering studies. The protein engineering strategy of using recurring motifs as interfaces to connect partial natural proteins may be applied to other protein folds. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Characterization of Chemosynthetic Microbial Mats Associated with Intertidal Hydrothermal Sulfur Vents in White Point, San Pedro, CA, USA

    PubMed Central

    Miranda, Priscilla J.; McLain, Nathan K.; Hatzenpichler, Roland; Orphan, Victoria J.; Dillon, Jesse G.

    2016-01-01

    The shallow-sea hydrothermal vents at White Point (WP) in Palos Verdes on the southern California coast support microbial mats and provide easily accessed settings in which to study chemolithoautotrophic sulfur cycling. Previous studies have cultured sulfur-oxidizing bacteria from the WP mats; however, almost nothing is known about the in situ diversity and activity of the microorganisms in these habitats. We studied the diversity, micron-scale spatial associations and metabolic activity of the mat community via sequence analysis of 16S rRNA and aprA genes, fluorescence in situ hybridization (FISH) microscopy and sulfate reduction rate (SRR) measurements. Sequence analysis revealed a diverse group of bacteria, dominated by sulfur cycling gamma-, epsilon-, and deltaproteobacterial lineages such as Marithrix, Sulfurovum, and Desulfuromusa. FISH microscopy suggests a close physical association between sulfur-oxidizing and sulfur-reducing genotypes, while radiotracer studies showed low, but detectable, SRR. Comparative 16S rRNA gene sequence analyses indicate the WP sulfur vent microbial mat community is similar to, but distinct from, other hydrothermal vent communities representing a range of biotopes and lithologic settings. These findings suggest a complete biological sulfur cycle is operating in the WP mat ecosystem mediated by diverse bacterial lineages, with some similarity with deep-sea hydrothermal vent communities. PMID:27512390

  13. Sequence analyses reveal that a TPR–DP module, surrounded by recombinable flanking introns, could be at the origin of eukaryotic Hop and Hip TPR–DP domains and prokaryotic GerD proteins

    PubMed Central

    Papandreou, Nikolaos; Chomilier, Jacques

    2008-01-01

    The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR–DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperones into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR–DP domains. Electronic supplementary material: The online version of this article (doi:10.1007/s12192-008-0083-8) contains supplementary material, which is available to authorized users. PMID:18987995

  14. Identification and Characterization of an Acinetobacter baumannii Biofilm-Associated Protein

    PubMed Central

    Loehfelm, Thomas W.; Luke, Nicole R.; Campagnari, Anthony A.

    2008-01-01

    We have identified a homologue to the staphylococcal biofilm-associated protein (Bap) in a bloodstream isolate of Acinetobacter baumannii. The fully sequenced open reading frame is 25,863 bp and encodes a protein with a predicted molecular mass of 854 kDa. Analysis of the nucleotide sequence reveals a repetitive structure consistent with bacterial cell surface adhesins. Bap-specific monoclonal antibody (MAb) 6E3 was generated to an epitope conserved among 41% of A. baumannii strains isolated during a recent outbreak in the U.S. military health care system. Flow cytometry confirms that the MAb 6E3 epitope is surface exposed. Random transposon mutagenesis was used to generate A. baumannii bap1302::EZ-Tn5, a mutant negative for surface reactivity to MAb 6E3 in which the transposon disrupts the coding sequence of bap. Time course confocal laser scanning microscopy and three-dimensional image analysis of actively growing biofilms demonstrate that this mutant is unable to sustain biofilm thickness and volume, suggesting a role for Bap in supporting the development of the mature biofilm structure. This is the first identification of a specific cell surface protein directly involved in biofilm formation by A. baumannii and suggests that Bap is involved in intercellular adhesion within the mature biofilm. PMID:18024522

  15. A Comprehensive Analysis of Transcript-Supported De Novo Genes in Saccharomyces sensu stricto Yeasts

    PubMed Central

    Lu, Tzu-Chiao; Leu, Jun-Yi; Lin, Wen-Chang

    2017-01-01

    Novel genes arising from random DNA sequences (de novo genes) have been suggested to be widespread in the genomes of different organisms. However, our knowledge about the origin and evolution of de novo genes is still limited. To systematically understand the general features of de novo genes, we established a robust pipeline to analyze >20,000 transcript-supported coding sequences (CDSs) from the budding yeast Saccharomyces cerevisiae. Our analysis pipeline combined phylogeny, synteny, and sequence alignment information to identify possible orthologs across 20 Saccharomycetaceae yeasts and discovered 4,340 S. cerevisiae-specific de novo genes and 8,871 S. sensu stricto-specific de novo genes. We further combine information on CDS positions and transcript structures to show that >65% of de novo genes arose from transcript isoforms of ancient genes, especially in the upstream and internal regions of ancient genes. Fourteen identified de novo genes with high transcript levels were chosen to verify their protein expressions. Ten of them, including eight transcript isoform-associated CDSs, showed translation signals and five proteins exhibited specific cytosolic localizations. Our results suggest that de novo genes frequently arise in the S. sensu stricto complex and have the potential to be quickly integrated into ancient cellular networks. PMID:28981695

  16. Genomic insights into the taxonomic status of the Bacillus cereus group

    PubMed Central

    Liu, Yang; Lai, Qiliang; Göker, Markus; Meier-Kolthoff, Jan P.; Wang, Meng; Sun, Yamin; Wang, Lei; Shao, Zongze

    2015-01-01

    The identification and phylogenetic relationships of bacteria within the Bacillus cereus group are controversial. This study aimed at determining the taxonomic affiliations of these strains using the whole-genome sequence-based Genome BLAST Distance Phylogeny (GBDP) approach. The GBDP analysis clearly separated 224 strains into 30 clusters, representing eleven known, partially merged species and accordingly 19–20 putative novel species. Additionally, 16S rRNA gene analysis, a novel variant of multi-locus sequence analysis (nMLSA) and screening of virulence genes were performed. The 16S rRNA gene sequence was not sufficient to differentiate the bacteria within this group due to its high conservation. The nMLSA results were consistent with GBDP. Moreover, a fast typing method was proposed using the pycA gene, and where necessary, the ccpA gene. The pXO plasmids and cry genes were widely distributed, suggesting little correlation with the phylogenetic positions of the host bacteria. This might explain why classifications based on virulence characteristics proved unsatisfactory in the past. In summary, this is the first large-scale and systematic study of the taxonomic status of the bacteria within the B. cereus group using whole-genome sequences, and is likely to contribute to further insights into their pathogenicity, phylogeny and adaptation to diverse environments. PMID:26373441

  17. Genetic diversity and molecular evolution of Naga King Chili inferred from internal transcribed spacer sequence of nuclear ribosomal DNA.

    PubMed

    Kehie, Mechuselie; Kumaria, Suman; Devi, Khumuckcham Sangeeta; Tandon, Pramod

    2016-02-01

    Sequences of the Internal Transcribed Spacer (ITS1-5.8S-ITS2) of nuclear ribosomal DNAs were explored to study the genetic diversity and molecular evolution of Naga King Chili. Our study indicated the occurrence of nucleotide polymorphism and haplotypic diversity in the ITS regions. The present study demonstrated that the variability of ITS1 with respect to nucleotide diversity and sequence polymorphism exceeded that of ITS2. Sequence analysis of the 5.8S gene revealed a highly conserved region in all the accessions of Naga King Chili. However, a distinct 13 bp deletion in the 5.8S gene provides strong phylogenetic information for this species, discriminating Naga King Chili from the rest of the Capsicum spp. Neutrality test results implied neutral variation, and the population seems to be evolving at drift-mutation equilibrium and free from directed selection pressure. Furthermore, mismatch analysis showed a multimodal curve indicating a demographic equilibrium. Phylogenetic relationships revealed by Median Joining Network (MJN) analysis denoted a clear discrimination of Naga King Chili from its closest sister species (Capsicum chinense and Capsicum frutescens). The absence of a star-like network of haplotypes suggested an ancient population expansion of this chili.

  18. Identification and Characterization of Novel Surface Proteins in Lactobacillus johnsonii and Lactobacillus gasseri

    PubMed Central

    Ventura, Marco; Jankovic, Ivana; Walker, D. Carey; Pridmore, R. David; Zink, Ralf

    2002-01-01

    We have identified and sequenced the genes encoding the aggregation-promoting factor (APF) protein from six different strains of Lactobacillus johnsonii and Lactobacillus gasseri. Both species harbor two apf genes, apf1 and apf2, which are in the same orientation and encode proteins of 257 to 326 amino acids. Multiple alignments of the deduced amino acid sequences of these apf genes demonstrate a very strong sequence conservation of all of the genes with the exception of their central regions. Northern blot analysis showed that both genes are transcribed, reaching their maximum expression during the exponential phase. Primer extension analysis revealed that apf1 and apf2 harbor a putative promoter sequence that is conserved in all of the genes. Western blot analysis of the LiCl cell extracts showed that APF proteins are located on the cell surface. Intact cells of L. johnsonii revealed the typical cell wall architecture of S-layer-carrying gram-positive eubacteria, which could be selectively removed with LiCl treatment. In addition, the amino acid composition, physical properties, and genetic organization were found to be quite similar to those of S-layer proteins. These results suggest that APF is a novel surface protein of the Lactobacillus acidophilus B-homology group which might belong to an S-layer-like family. PMID:12450842

  19. Combined sequence and structure analysis of the fungal laccase family.

    PubMed

    Kumar, S V Suresh; Phale, Prashant S; Durani, S; Wangikar, Pramod P

    2003-08-20

    Plant and fungal laccases belong to the family of multi-copper oxidases and show much broader substrate specificity than other members of the family. Laccases have consequently been of interest for potential industrial applications. We have analyzed the essential sequence features of fungal laccases based on multiple sequence alignments of more than 100 laccases. This has resulted in identification of a set of four ungapped sequence regions, L1-L4, as the overall signature sequences that can be used to identify the laccases, distinguishing them within the broader class of multi-copper oxidases. The 12 amino acid residues in the enzymes serving as the copper ligands are housed within these four identified conserved regions, of which L2 and L4 conform to the earlier reported copper signature sequences of multi-copper oxidases while L1 and L3 are distinctive to the laccases. The mapping of regions L1-L4 on to the three-dimensional structure of the Coprinus cinereus laccase indicates that many of the non-copper-ligating residues of the conserved regions could be critical in maintaining a specific, more or less C-2 symmetric, protein conformational motif characterizing the active site apparatus of the enzymes. The observed intraprotein homologies between L1 and L3 and between L2 and L4 at both the structure and the sequence levels suggest that the quasi C-2 symmetric active site conformational motif may have arisen from a structural duplication event that neither the sequence homology analysis nor the structure homology analysis alone would have unraveled. Although the sequence and structure homology is not detectable in the rest of the protein, the relative orientation of region L1 with L2 is similar to that of L3 with L4. The structure duplication of first-shell and second-shell residues has become cryptic because the intraprotein sequence homology noticeable for a given laccase becomes significant only after comparing the conservation pattern in several fungal laccases. The identified motifs, L1-L4, can be useful in searching the newly sequenced genomes for putative laccase enzymes. Copyright 2003 Wiley Periodicals, Inc. Biotechnol Bioeng 83: 386-394, 2003.

  20. A Phylogenetic and Phenotypic Analysis of Salmonella enterica Serovar Weltevreden, an Emerging Agent of Diarrheal Disease in Tropical Regions

    PubMed Central

    Makendi, Carine; Page, Andrew J.; Wren, Brendan W.; Le Thi Phuong, Tu; Clare, Simon; Hale, Christine; Goulding, David; Klemm, Elizabeth J.; Pickard, Derek; Okoro, Chinyere; Hunt, Martin; Thompson, Corinne N.; Phu Huong Lan, Nguyen; Tran Do Hoang, Nhu; Thwaites, Guy E.; Le Hello, Simon; Brisabois, Anne; Weill, François-Xavier; Baker, Stephen; Dougan, Gordon

    2016-01-01

    Salmonella enterica serovar Weltevreden (S. Weltevreden) is an emerging cause of diarrheal and invasive disease in humans residing in tropical regions. Despite the regional and international emergence of this Salmonella serovar, relatively little is known about its genetic diversity, genomics or virulence potential in model systems. Here we used whole genome sequencing and bioinformatics analyses to define the phylogenetic structure of a diverse global selection of S. Weltevreden. Phylogenetic analysis of more than 100 isolates demonstrated that the population of S. Weltevreden can be segregated into two main phylogenetic clusters, one associated predominantly with continental Southeast Asia and the other more internationally dispersed. Subcluster analysis suggested the local evolution of S. Weltevreden within specific geographical regions. Four of the isolates were sequenced using long read sequencing to produce high quality reference genomes. Phenotypic analysis in Hep-2 cells and in a murine infection model indicated that S. Weltevreden were significantly attenuated in these models compared to the classical S. Typhimurium reference strain SL1344. Our work outlines novel insights into this important emerging pathogen and provides a baseline understanding for future research studies. PMID:26867150

  1. Origin of pitcher plant mosquitoes in Aedes (Stegomyia): a molecular phylogenetic analysis using mitochondrial and nuclear gene sequences.

    PubMed

    Sota, Teiji; Mogi, Motoyoshi

    2006-09-01

    Two mosquito species of the subgenus Stegomyia (genus Aedes) (Diptera: Culicidae) on the islands of Palau and Yap (Aedes dybasi Bohart and Aedes maehleri Bohart) are adapted to aquatic habitats occupied by Nepenthes pitcher plants. To reveal the origin of these pitcher plant mosquitoes, we attempted a molecular phylogenetic analysis with 11 Stegomyia species by using sequence data from mitochondrial cytochrome oxidase subunit I and 16S rRNA genes as well as the nuclear 28S rRNA gene. Ae. dybasi, a pitcher plant specialist, was sister to Aedes palauensis Bohart within the scutellaris group from the same islands. Ae. maehleri, an opportunistic pitcher plant mosquito, was in a distinct lineage related to the scutellaris group. The adaptation to pitcher plants could have occurred independently in these two species, and recent differentiation of the pitcher plant mosquito Ae. dybasi from the nonpitcher plant mosquito Ae. palauensis was suggested by a relatively small sequence divergence between these species. We also discuss the implications of this analysis for the phylogeny of some other Stegomyia species.

  2. Preliminary report for analysis of genome wide mutations from four ciprofloxacin resistant B. anthracis Sterne isolates generated by Illumina, 454 sequencing and microarrays for DHS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jaing, Crystal; Vergez, Lisa; Hinckley, Aubree

    2011-06-21

    The objective of this project is to provide DHS a comprehensive evaluation of the current genomic technologies including genotyping, Taqman PCR, multiple locus variable tandem repeat analysis (MLVA), microarray and high-throughput DNA sequencing in the analysis of biothreat agents from complex environmental samples. As the result of a different DHS project, we have selected for and isolated a large number of ciprofloxacin resistant B. anthracis Sterne isolates. These isolates vary in the concentrations of ciprofloxacin that they can tolerate, suggesting multiple mutations in the samples. In collaboration with University of Houston, Eureka Genomics and Oak Ridge National Laboratory, we analyzed the ciprofloxacin resistant B. anthracis Sterne isolates by microarray hybridization, Illumina and Roche 454 sequencing to understand the error rates and sensitivity of the different methods. The report provides an assessment of the results and a complete set of all protocols used and all data generated along with information to interpret the protocols and data sets.

  3. Illumina sequencing-based analysis of a microbial community enriched under anaerobic methane oxidation condition coupled to denitrification revealed coexistence of aerobic and anaerobic methanotrophs.

    PubMed

    Siniscalchi, Luciene Alves Batista; Leite, Laura Rabelo; Oliveira, Guilherme; Chernicharo, Carlos Augusto Lemos; de Araújo, Juliana Calabria

    2017-07-01

    Methane is produced in anaerobic environments, such as reactors used to treat wastewaters, and can be consumed by methanotrophs. The composition and structure of a microbial community enriched from anaerobic sewage sludge under methane-oxidation condition coupled to denitrification were investigated. Denaturing gradient gel electrophoresis (DGGE) analysis retrieved sequences of Methylocaldum and Chloroflexi. Deep sequencing analysis revealed a complex community that changed over time and was affected by methane concentration. Methylocaldum (8.2%), Methylosinus (2.3%), Methylomonas (0.02%), Methylacidiphilales (0.45%), Nitrospirales (0.18%), and Methanosarcinales (0.3%) were detected. Despite denitrifying conditions provided, Nitrospirales and Methanosarcinales, known to perform anaerobic methane oxidation coupled to denitrification (DAMO) process, were in very low abundance. Results demonstrated that aerobic and anaerobic methanotrophs coexisted in the reactor together with heterotrophic microorganisms, suggesting that a diverse microbial community was important to sustain methanotrophic activity. The methanogenic sludge was a good inoculum to enrich methanotrophs, and cultivation conditions play a selective role in determining community composition.

  4. Qualitative and quantitative assessment of Illumina's forensic STR and SNP kits on MiSeq FGx™.

    PubMed

    Sharma, Vishakha; Chow, Hoi Yan; Siegel, Donald; Wurmbach, Elisa

    2017-01-01

    Massively parallel sequencing (MPS) is a powerful tool transforming DNA analysis in multiple fields ranging from medicine to environmental science to evolutionary biology. In forensic applications, MPS offers the ability to significantly increase the discriminatory power of human identification as well as aid in mixture deconvolution. However, before the benefits of any new technology can be realized, its quality, consistency, sensitivity, and specificity must be rigorously evaluated in order to gain a detailed understanding of the technique, including sources of error, error rates, and other restrictions/limitations. This extensive study assessed the performance of Illumina's MiSeq FGx MPS system and ForenSeq™ kit in nine experimental runs including 314 reaction samples. In-depth data analysis evaluated the consequences of different assay conditions on test results. Variables included: sample numbers per run, targets per run, DNA input per sample, and replications. Results are presented as heat maps revealing patterns for each locus. Data analysis focused on read numbers (allele coverage), drop-outs, drop-ins, and sequence analysis. The study revealed that loci with high read numbers performed better and resulted in fewer drop-outs and well-balanced heterozygous alleles. Several loci were prone to drop-outs, which led to falsely typed homozygotes and therefore to genotype errors. Sequence analysis of allele drop-in typically revealed a single nucleotide change (deletion, insertion, or substitution). Analyses of sequences, no-template controls, and spurious alleles suggest no contamination during library preparation, pooling, and sequencing, but indicate that sequencing or PCR errors may have occurred due to DNA polymerase infidelities.
Finally, we found that using Illumina's FGx System under the recommended conditions does not guarantee 100% outcomes for all samples tested, including the positive control, and that manual editing was required because of low read numbers and/or allele drop-in. These findings are important for progressing towards implementation of MPS in forensic DNA testing.

  5. Can We Improve Structured Sequence Processing? Exploring the Direct and Indirect Effects of Computerized Training Using a Mediational Model

    PubMed Central

    Smith, Gretchen N. L.; Conway, Christopher M.; Bauernschmidt, Althea; Pisoni, David B.

    2015-01-01

    Recent research suggests that language acquisition may rely on domain-general learning abilities, such as structured sequence processing, which is the ability to extract, encode, and represent structured patterns in a temporal sequence. If structured sequence processing supports language, then it may be possible to improve language function by enhancing this foundational learning ability. The goal of the present study was to use a novel computerized training task as a means to better understand the relationship between structured sequence processing and language function. Participants first were assessed on pre-training tasks to provide baseline behavioral measures of structured sequence processing and language abilities. Participants were then quasi-randomly assigned to either a treatment group involving adaptive structured visuospatial sequence training, a treatment group involving adaptive non-structured visuospatial sequence training, or a control group. Following four days of sequence training, all participants were assessed with the same pre-training measures. Overall comparison of the post-training means revealed no group differences. However, in order to examine the potential relations between sequence training, structured sequence processing, and language ability, we used a mediation analysis that showed two competing effects. In the indirect effect, adaptive sequence training with structural regularities had a positive impact on structured sequence processing performance, which in turn had a positive impact on language processing. This finding not only identifies a potential novel intervention to treat language impairments but also may be the first demonstration that structured sequence processing can be improved and that this, in turn, has an impact on language processing. However, in the direct effect, adaptive sequence training with structural regularities had a direct negative impact on language processing. 
This unexpected finding suggests that adaptive training with structural regularities might potentially interfere with language processing. Taken together, these findings underscore the importance of pursuing designs that promote a better understanding of the mechanisms underlying training-related changes, so that regimens can be developed that help reduce these types of negative effects while simultaneously maximizing the benefits to outcome measures of interest. PMID:25946222
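
    The competing direct and indirect effects described in this abstract can be illustrated with a minimal product-of-coefficients mediation sketch. This is not the authors' analysis: the function name, variable names, and all data values below are hypothetical and were chosen only so that a positive indirect effect and a negative direct effect cancel, giving no overall group difference.

```python
# Illustrative product-of-coefficients mediation sketch.
# All names and data are hypothetical; nothing here comes from the study.
from statistics import mean


def _centered(values):
    """Subtract the mean from every value."""
    m = mean(values)
    return [v - m for v in values]


def _dot(u, v):
    """Sum of element-wise products."""
    return sum(a * b for a, b in zip(u, v))


def mediation_effects(x, m, y):
    """Return (indirect, direct, total) effects of x on y via mediator m.

    Path a is the slope of m regressed on x. Paths b and c' come from the
    ordinary least-squares regression of y on both x and m.
    """
    xc, mc, yc = _centered(x), _centered(m), _centered(y)
    sxx, smm, sxm = _dot(xc, xc), _dot(mc, mc), _dot(xc, mc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)

    a = sxm / sxx                            # x -> mediator
    det = sxx * smm - sxm ** 2               # determinant of the 2x2 normal equations
    b = (sxx * smy - sxm * sxy) / det        # mediator -> y, holding x fixed
    direct = (smm * sxy - sxm * smy) / det   # x -> y, holding the mediator fixed
    indirect = a * b
    return indirect, direct, indirect + direct


# Hypothetical scores: 0 = control, 1 = structured sequence training.
training = [0, 0, 0, 0, 1, 1, 1, 1]
sequence_score = [1, 2, 3, 2, 3, 4, 5, 4]   # mediator (sequence processing)
language_score = [10 + 1.5 * m - 3.0 * x    # outcome built from known paths
                  for x, m in zip(training, sequence_score)]

indirect, direct, total = mediation_effects(training, sequence_score, language_score)
print(f"indirect={indirect:.2f} direct={direct:.2f} total={total:.2f}")
```

    In this constructed example the two group means of the outcome are identical, yet the mediation decomposition still separates a positive path through the mediator from a negative direct path.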

  6. Can we improve structured sequence processing? Exploring the direct and indirect effects of computerized training using a mediational model.

    PubMed

    Smith, Gretchen N L; Conway, Christopher M; Bauernschmidt, Althea; Pisoni, David B

    2015-01-01

    Recent research suggests that language acquisition may rely on domain-general learning abilities, such as structured sequence processing, which is the ability to extract, encode, and represent structured patterns in a temporal sequence. If structured sequence processing supports language, then it may be possible to improve language function by enhancing this foundational learning ability. The goal of the present study was to use a novel computerized training task as a means to better understand the relationship between structured sequence processing and language function. Participants first were assessed on pre-training tasks to provide baseline behavioral measures of structured sequence processing and language abilities. Participants were then quasi-randomly assigned to either a treatment group involving adaptive structured visuospatial sequence training, a treatment group involving adaptive non-structured visuospatial sequence training, or a control group. Following four days of sequence training, all participants were assessed with the same pre-training measures. Overall comparison of the post-training means revealed no group differences. However, in order to examine the potential relations between sequence training, structured sequence processing, and language ability, we used a mediation analysis that showed two competing effects. In the indirect effect, adaptive sequence training with structural regularities had a positive impact on structured sequence processing performance, which in turn had a positive impact on language processing. This finding not only identifies a potential novel intervention to treat language impairments but also may be the first demonstration that structured sequence processing can be improved and that this, in turn, has an impact on language processing. However, in the direct effect, adaptive sequence training with structural regularities had a direct negative impact on language processing. 
This unexpected finding suggests that adaptive training with structural regularities might potentially interfere with language processing. Taken together, these findings underscore the importance of pursuing designs that promote a better understanding of the mechanisms underlying training-related changes, so that regimens can be developed that help reduce these types of negative effects while simultaneously maximizing the benefits to outcome measures of interest.

  7. Large-scale collection of full-length cDNA and transcriptome analysis in Hevea brasiliensis

    PubMed Central

    Makita, Yuko; Ng, Kiaw Kiaw; Veera Singham, G.; Kawashima, Mika; Hirakawa, Hideki; Sato, Shusei

    2017-01-01

    Natural rubber has unique physical properties that cannot be replaced by products from other latex-producing plants or petrochemically produced synthetic rubbers. Rubber from Hevea brasiliensis is the main commercial source for this natural rubber that has a cis-polyisoprene configuration. To sustainably produce enough rubber to meet demand, elucidation of the molecular mechanisms involved in the production of latex is vital. To this end, we first constructed rubber full-length cDNA libraries of the RRIM 600 cultivar and sequenced around 20,000 clones by the Sanger method and over 15,000 contigs with an Illumina sequencer. With these data, we updated around 5,500 gene structures and newly annotated around 9,500 transcription start sites. Second, to elucidate the rubber biosynthetic pathways and their transcriptional regulation, we carried out tissue- and cultivar-specific RNA-Seq analysis. By using our recently published genome sequence, we confirmed the expression patterns of the rubber biosynthetic genes. Our data suggest that the cytoplasmic mevalonate (MVA) pathway is the main route for isoprenoid biosynthesis in latex production. In addition to the well-studied polymerization factors, we suggest that rubber elongation factor 8 (REF8) is a candidate factor in cis-polyisoprene biosynthesis. We have also identified 39 transcription factors that may be key regulators in latex production. Expression profile analysis using two additional cultivars, RRIM 901 and PB 350, via an RNA-Seq approach revealed possible expression differences between a high latex-yielding cultivar and a disease-resistant cultivar. PMID:28431015

  8. Defining objective clusters for rabies virus sequences using affinity propagation clustering

    PubMed Central

    Fischer, Susanne; Freuling, Conrad M.; Pfaff, Florian; Bodenhofer, Ulrich; Höper, Dirk; Fischer, Mareike; Marston, Denise A.; Fooks, Anthony R.; Mettenleiter, Thomas C.; Conraths, Franz J.; Homeier-Bachmann, Timo

    2018-01-01

    Rabies is caused by lyssaviruses, and is one of the oldest known zoonoses. In recent years, more than 21,000 nucleotide sequences of rabies viruses (RABV), from the prototype species rabies lyssavirus, have been deposited in public databases. Subsequent phylogenetic analyses in combination with metadata suggest geographic distributions of RABV. However, these analyses have difficulty defining verifiable criteria for cluster allocation in phylogenetic trees, which calls for a more rational approach. Therefore, we applied a relatively new mathematical clustering algorithm named ‘affinity propagation clustering’ (AP) to propose a standardized sub-species classification utilizing full-genome RABV sequences. Because AP is computationally fast and works for any meaningful measure of similarity between data samples, it has previously been applied successfully in bioinformatics for the analysis of microarray and gene expression data. However, cluster analysis of sequences is still in its infancy. Existing (516) and original (46) full-genome RABV sequences were used to demonstrate the application of AP for RABV clustering. On a global scale, AP proposed four clusters, i.e. New World cluster, Arctic/Arctic-like, Cosmopolitan, and Asian, as previously assigned by phylogenetic studies. By combining AP with established phylogenetic analyses, it is possible to resolve phylogenetic relationships between verifiably determined clusters and sequences. This workflow will be useful in confirming cluster distributions in a uniform, transparent manner, not only for RABV, but also for other comparative sequence analyses. PMID:29357361

  9. Intestinal flora of FAP patients containing APC-like sequences.

    PubMed

    Hainova, K; Adamcikova, Z; Ciernikova, S; Stevurkova, V; Tyciakova, S; Zajac, V

    2014-01-01

    Colorectal cancer is one of the most common causes of cancer-related mortality. Multiple risk factors are associated with colorectal cancer, including hereditary, environmental, and inflammatory syndromes affecting the gastrointestinal tract. Familial adenomatous polyposis (FAP) is characterized by the emergence of hundreds to thousands of colorectal adenomatous polyps, and FAP syndrome is caused by mutations within the adenomatous polyposis coli (APC) tumor suppressor gene. We analyzed 21 rectal bacterial subclones isolated from FAP patient 41-1, with a confirmed 5 bp ACAAA deletion within codons 1060-1063, for the presence of APC-like sequences in the longest exon, exon 15. The studied section was defined by primers 15Efor-15Erev, which corresponds to the mutation cluster region (MCR), in which 75% of all APC germline mutations were detected. Sequencing and subsequent software comparison showed more than 90% homology. The expression of APC-like sequences was demonstrated by Western blot analysis using monoclonal and polyclonal antibodies against APC protein. To study the missing link between the DNA analysis (PCR, DNA sequencing) and the protein expression experiments (Western blotting), we analyzed bacterial transcripts containing the 15Efor-15Erev sequence of the APC gene by reverse transcription-PCR, which indicated that an APC gene-derived fragment may be produced. We observed 97-100% homology after computer comparison of cDNA PCR products. Our results suggest that the presence of APC-like sequences in intestinal/rectal bacteria represents an enrichment of bacterial genetic information in which horizontal gene transfer between humans and microflora plays an important role.

  10. Assessing the diversity of AM fungi in arid gypsophilous plant communities.

    PubMed

    Alguacil, M M; Roldán, A; Torres, M P

    2009-10-01

    In the present study, we used PCR-Single-Stranded Conformation Polymorphism (SSCP) techniques to analyse arbuscular mycorrhizal fungi (AMF) communities in four sites within a 10 km² gypsum area in Southern Spain. Four common plant species from these ecosystems were selected. The AM fungal small-subunit (SSU) rRNA genes were subjected to PCR, cloning, SSCP analysis, sequencing and phylogenetic analyses. A total of 1443 SSU rRNA sequences were analysed, for 21 AM fungal types: 19 belonged to the genus Glomus, 1 to the genus Diversispora and 1 to the genus Scutellospora. Four sequence groups were identified, which showed high similarity to sequences of known glomalean species or isolates: Glo G18 to Glomus constrictum, Glo G1 to Glomus intraradices, Glo G16 to Glomus clarum, Scut to Scutellospora dipurpurescens and Div to one new genus in the family Diversisporaceae identified recently as Otospora bareai. There were three sequence groups that received strong support in the phylogenetic analysis, and did not seem to be related to any sequences of AM fungi in culture or previously found in the database; thus, they could be novel taxa within the genus Glomus: Glo G4, Glo G2 and Glo G14. We have detected the presence of both generalist and potential specialist AMF in gypsum ecosystems. The AMF communities differed among the plant species studied, suggesting some degree of preference in the interactions between these symbionts.

  11. Population genomic analysis of strain variation in Leptospirillum group II bacteria involved in acid mine drainage formation.

    PubMed

    Simmons, Sheri L; Dibartolo, Genevieve; Denef, Vincent J; Goltsman, Daniela S Aliaga; Thelen, Michael P; Banfield, Jillian F

    2008-07-22

    Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth approximately 20x). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types ( approximately 94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. 
Thus, the most likely explanation for the observed patterns of polymorphism is divergence of ancestral strains due to geographic isolation, followed by mixing and subsequent recombination.
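
    The identity thresholds quoted in this abstract (>99.5% and approximately 94%) are pairwise sequence identities. As a rough illustration only, the sketch below computes percent identity for two pre-aligned sequences. The function name and example sequences are hypothetical, and the convention of skipping gapped columns is an assumption; the abstract does not state how identity was calculated.

```python
# Minimal sketch of pairwise percent identity for two pre-aligned sequences.
# Assumption: columns containing a gap in either sequence are skipped.
def percent_identity(seq_a, seq_b, gap="-"):
    """Return percent identity over ungapped alignment columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue            # ignore columns with a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical 400-nt "dominant type" and a variant with one substitution.
dominant = "ACGT" * 100
variant = dominant[:10] + "A" + dominant[11:]   # position 10 is 'G' in dominant

identity = percent_identity(dominant, variant)
print(f"{identity:.2f}% identity")
```

    With one mismatch in 400 compared positions, the variant would fall above a 99.5% identity threshold of the kind described.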

  12. Population Genomic Analysis of Strain Variation in Leptospirillum Group II Bacteria Involved in Acid Mine Drainage Formation

    PubMed Central

    Denef, Vincent J; Goltsman, Daniela S. Aliaga; Thelen, Michael P; Banfield, Jillian F

    2008-01-01

    Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth ∼20×). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (∼94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. 
Thus, the most likely explanation for the observed patterns of polymorphism is divergence of ancestral strains due to geographic isolation, followed by mixing and subsequent recombination. PMID:18651792

  13. Deep Sequencing of Random Mutant Libraries Reveals the Active Site of the Narrow Specificity CphA Metallo-β-Lactamase is Fragile to Mutations.

    PubMed

    Sun, Zhizeng; Mehta, Shrenik C; Adamski, Carolyn J; Gibbs, Richard A; Palzkill, Timothy

    2016-09-12

    CphA is a Zn²⁺-dependent metallo-β-lactamase that efficiently hydrolyzes only carbapenem antibiotics. To understand the sequence requirements for CphA function, single codon random mutant libraries were constructed for residues in and near the active site and mutants were selected for E. coli growth on increasing concentrations of imipenem, a carbapenem antibiotic. At high concentrations of imipenem that select for phenotypically wild-type mutants, the active-site residues exhibit stringent sequence requirements in that nearly all residues in positions that contact zinc, the substrate, or the catalytic water do not tolerate amino acid substitutions. In addition, at high imipenem concentrations a number of residues that do not directly contact zinc or substrate are also essential and do not tolerate substitutions. Biochemical analysis confirmed that amino acid substitutions at essential positions decreased the stability or catalytic activity of the CphA enzyme. Therefore, the CphA active site is fragile to substitutions, suggesting active-site residues are optimized for imipenem hydrolysis. These results also suggest that resistance to inhibitors targeted to the CphA active site would be slow to develop because of the strong sequence constraints on function.

  14. Genomics of high molecular weight plasmids isolated from an on-farm biopurification system.

    PubMed

    Martini, María C; Wibberg, Daniel; Lozano, Mauricio; Torres Tejerizo, Gonzalo; Albicoro, Francisco J; Jaenicke, Sebastian; van Elsas, Jan Dirk; Petroni, Alejandro; Garcillán-Barcia, M Pilar; de la Cruz, Fernando; Schlüter, Andreas; Pühler, Alfred; Pistorio, Mariano; Lagares, Antonio; Del Papa, María F

    2016-06-20

    The use of biopurification systems (BPS) constitutes an efficient strategy to eliminate pesticides from polluted wastewaters from farm activities. BPS environments contain a high microbial density and diversity facilitating the exchange of information among bacteria, mediated by mobile genetic elements (MGEs), which play a key role in bacterial adaptation and evolution in such environments. Here we sequenced and characterized high-molecular-weight plasmids from a bacterial collection of an on-farm BPS. High-throughput sequencing of the plasmid pool yielded a total of several Mb of sequence information. Assembly of the sequence data resulted in six complete replicons. Using in silico analyses we identified plasmid replication genes whose encoded proteins represent 13 different Pfam families, as well as proteins involved in plasmid conjugation, indicating a large diversity of plasmid replicons and suggesting the occurrence of horizontal gene transfer (HGT) events within the habitat analyzed. In addition, genes conferring resistance to 10 classes of antimicrobial compounds and those encoding enzymes potentially involved in pesticide and aromatic hydrocarbon degradation were found. Global analysis of the plasmid pool suggests that the analyzed BPS represents a key environment for further studies addressing the dissemination of MGEs carrying catabolic genes and pathway assembly regarding degradation capabilities.

  15. Sequence of the bchG gene from Chloroflexus aurantiacus: relationship between chlorophyll synthase and other polyprenyltransferases

    NASA Technical Reports Server (NTRS)

    Lopez, J. C.; Ryan, S.; Blankenship, R. E.

    1996-01-01

    The sequence of the Chloroflexus aurantiacus open reading frame thought to be the C. aurantiacus homolog of the Rhodobacter capsulatus bchG gene is reported. The BchG gene product catalyzes esterification of bacteriochlorophyllide a by geranylgeraniol-PPi during bacteriochlorophyll a biosynthesis. Homologs from Arabidopsis thaliana, Synechocystis sp. strain PCC6803, and C. aurantiacus were identified in database searches. Profile analysis identified three related polyprenyltransferase enzymes which attach an aliphatic alcohol PPi to an aromatic substrate. This suggests a broader relationship between chlorophyll synthases and other polyprenyltransferases.

  16. Metagenomic analysis of bacterial and archaeal assemblages in the soil-mousse surrounding a geothermal spring.

    PubMed

    Bhatia, Sonu; Batra, Navneet; Pathak, Ashish; Joshi, Amit; Souza, Leila; Almeida, Paulo; Chauhan, Ashvini

    2015-09-01

    The soil-mousse surrounding a geothermal spring was analyzed for bacterial and archaeal diversity using 16S rRNA gene amplicon metagenomic sequencing which revealed the presence of 18 bacterial phyla distributed across 109 families and 219 genera. Firmicutes, Actinobacteria, and the Deinococcus-Thermus group were the predominant bacterial assemblages with Crenarchaeota and Thaumarchaeota as the main archaeal assemblages in this largely understudied geothermal habitat. Several metagenome sequences remained taxonomically unassigned suggesting the presence of a repertoire of hitherto undescribed microbes in this geothermal soil-mousse econiche.

  17. Identification of BRCA1 missense substitutions that confer partial functional activity: potential moderate risk variants?

    PubMed

    Lovelock, Paul K; Spurdle, Amanda B; Mok, Myth T S; Farrugia, Daniel J; Lakhani, Sunil R; Healey, Sue; Arnold, Stephen; Buchanan, Daniel; Couch, Fergus J; Henderson, Beric R; Goldgar, David E; Tavtigian, Sean V; Chenevix-Trench, Georgia; Brown, Melissa A

    2007-01-01

    Many of the DNA sequence variants identified in the breast cancer susceptibility gene BRCA1 remain unclassified in terms of their potential pathogenicity. Both multifactorial likelihood analysis and functional approaches have been proposed as a means to elucidate likely clinical significance of such variants, but analysis of the comparative value of these methods for classifying all sequence variants has been limited. We have compared the results from multifactorial likelihood analysis with those from several functional analyses for the four BRCA1 sequence variants A1708E, G1738R, R1699Q, and A1708V. Our results show that multifactorial likelihood analysis, which incorporates sequence conservation, co-inheritance, segregation, and tumour immunohistochemical analysis, may improve classification of variants. For A1708E, previously shown to be functionally compromised, analysis of oestrogen receptor, cytokeratin 5/6, and cytokeratin 14 tumour expression data significantly strengthened the prediction of pathogenicity, giving a posterior probability of pathogenicity of 99%. For G1738R, shown to be functionally defective in this study, immunohistochemistry analysis confirmed previous findings of inconsistent 'BRCA1-like' phenotypes for the two tumours studied, and the posterior probability for this variant was 96%. The posterior probabilities of R1699Q and A1708V were 54% and 69%, respectively, only moderately suggestive of increased risk. Interestingly, results from functional analyses suggest that both of these variants have only partial functional activity. R1699Q was defective in foci formation in response to DNA damage and displayed intermediate transcriptional transactivation activity but showed no evidence for centrosome amplification. In contrast, A1708V displayed an intermediate transcriptional transactivation activity and a normal foci formation response in response to DNA damage but induced centrosome amplification. 
These data highlight the need for a range of functional studies to be performed in order to identify variants with partially compromised function. The results also raise the possibility that A1708V and R1699Q may be associated with a low or moderate risk of cancer. While data pooling strategies may provide more information for multifactorial analysis to improve the interpretation of the clinical significance of these variants, it is likely that the development of current multifactorial likelihood approaches and the consideration of alternative statistical approaches will be needed to determine whether these individually rare variants do confer a low or moderate risk of breast cancer.
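
    The posterior probabilities quoted in this abstract come from a multifactorial likelihood analysis. The general idea of combining independent lines of evidence through Bayes' rule can be sketched as below. This is a generic illustration, not the authors' model: the function name, the prior, and every likelihood ratio are hypothetical, and treating the evidence sources as independent is an assumption.

```python
# Generic Bayes-rule sketch: combine a prior probability of pathogenicity
# with independent likelihood ratios (LRs). All numbers are hypothetical.
from math import prod


def posterior_probability(prior, likelihood_ratios):
    """Return the posterior probability given a prior and independent LRs."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical evidence for one variant: each LR > 1 favours pathogenicity,
# each LR < 1 favours neutrality.
hypothetical_lrs = {
    "sequence conservation": 4.0,
    "co-inheritance": 0.8,
    "segregation": 2.5,
    "tumour immunohistochemistry": 3.0,
}
posterior = posterior_probability(0.10, hypothetical_lrs.values())
print(f"posterior probability = {posterior:.3f}")
```

    Because the likelihood ratios multiply, a single strong line of evidence (such as tumour immunohistochemistry in this hypothetical example) can shift the posterior substantially, which is consistent with the abstract's point that adding tumour data strengthened the prediction for A1708E.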

  18. Sequence Analysis of Raspberry latent virus Suggests a New Genus of Dicot Infecting Reoviruses

    USDA-ARS's Scientific Manuscript database

    Currently, there are three assigned genera of plant reoviruses: Phytoreovirus, Fijivirus and Oryzavirus. With only two exceptions, all plant reoviruses infect monocotyledonous plants. The recent characterization of Raspberry latent virus (RpLV) isolated from red raspberry plants in northern Washingt...

  19. Four distinct types of E.C. 1.2.1.30 enzymes can catalyze the reduction of carboxylic acids to aldehydes.

    PubMed

    Stolterfoht, Holly; Schwendenwein, Daniel; Sensen, Christoph W; Rudroff, Florian; Winkler, Margit

    2017-09-10

    Increasing demand for chemicals from renewable resources calls for the development of new biotechnological methods for the reduction of oxidized bio-based compounds. Enzymatic carboxylate reduction is highly selective, both in terms of chemo- and product selectivity, but not many carboxylate reductase enzymes (CARs) have been identified on the sequence level to date. Thus far, their phylogeny is unexplored and very little is known about their structure-function-relationship. CARs minimally contain an adenylation domain, a phosphopantetheinylation domain and a reductase domain. We have recently identified new enzymes of fungal origin, using similarity searches against genomic sequences from organisms in which aldehydes were detected upon incubation with carboxylic acids. Analysis of sequences with known CAR functionality and CAR enzymes recently identified in our laboratory suggests that the three-domain architecture mentioned above is modular. The construction of a distance tree with a subsequent 1000-replicate bootstrap analysis showed that the CAR sequences included in our study fall into four distinct subgroups (one of bacterial origin and three of fungal origin, respectively), each with a bootstrap value of 100%. The multiple sequence alignment of all experimentally confirmed CAR protein sequences revealed fingerprint sequences of residues which are likely to be involved in substrate and co-substrate binding and one of the three catalytic substeps, respectively. The fingerprint sequences broaden our understanding of the amino acids that might be essential for the reduction of organic acids to the corresponding aldehydes in CAR proteins.

  20. Isolation and characterization of the pea cytochrome c oxidase Vb gene.

    PubMed

    Kubo, Nakao; Arimura, Shin-Ichi; Tsutsumi, Nobuhiro; Kadowaki, Koh-Ichi; Hirai, Masashi

    2006-11-01

    Three copies of the gene that encodes cytochrome c oxidase subunit Vb were isolated from the pea (PscoxVb-1, PscoxVb-2, and PscoxVb-3). Northern Blot and reverse transcriptase-PCR analyses suggest that all 3 genes are transcribed in the pea. Each pea coxVb gene has an N-terminal extended sequence that can encode a mitochondrial targeting signal, called a presequence. The localization of green fluorescent proteins fused with the presequence strongly suggests the targeting of pea COXVb proteins to mitochondria. Each pea coxVb gene has 5 intron sites within the coding region. These are similar to Arabidopsis and rice, although the intron lengths vary greatly. A phylogenetic analysis of coxVb suggests the occurrence of gene duplication events during angiosperm evolution. In particular, 2 duplication events might have occurred in legumes, grasses, and Solanaceae. A comparison of amino acid sequences in COXVb or its counterpart shows the conservation of several amino acids within a zinc finger motif. Interestingly, a homology search analysis showed that bacterial protein COG4391 and a mitochondrial complex I 13 kDa subunit also have similar amino acid compositions around this motif. Such similarity might reflect evolutionary relationships among the 3 proteins.

  1. Clustering of Genetically Defined Allele Classes in the Caenorhabditis elegans DAF-2 Insulin/IGF-1 Receptor

    PubMed Central

    Patel, Dhaval S.; Garza-Garcia, Acely; Nanji, Manoj; McElwee, Joshua J.; Ackerman, Daniel; Driscoll, Paul C.; Gems, David

    2008-01-01

    The DAF-2 insulin/IGF-1 receptor regulates development, metabolism, and aging in the nematode Caenorhabditis elegans. However, complex differences among daf-2 alleles complicate analysis of this gene. We have employed epistasis analysis, transcript profile analysis, mutant sequence analysis, and homology modeling of mutant receptors to understand this complexity. We define an allelic series of nonconditional daf-2 mutants, including nonsense and deletion alleles, and a putative null allele, m65. The most severe daf-2 alleles show incomplete suppression by daf-18(0) and daf-16(0) and have a range of effects on early development. Among weaker daf-2 alleles there exist distinct mutant classes that differ in epistatic interactions with mutations in other genes. Mutant sequence analysis (including 11 newly sequenced alleles) reveals that class 1 mutant lesions lie only in certain extracellular regions of the receptor, while class 2 (pleiotropic) and nonconditional missense mutants have lesions only in the ligand-binding pocket of the receptor ectodomain or the tyrosine kinase domain. Effects of equivalent mutations on the human insulin receptor suggest an altered balance of intracellular signaling in class 2 alleles. These studies consolidate and extend our understanding of the complex genetics of daf-2 and its underlying molecular biology. PMID:18245374

  2. Identification and expression analysis of cDNA encoding insulin-like growth factor 2 in horses

    PubMed Central

    KIKUCHI, Kohta; SASAKI, Keisuke; AKIZAWA, Hiroki; TSUKAHARA, Hayato; BAI, Hanako; TAKAHASHI, Masashi; NAMBO, Yasuo; HATA, Hiroshi; KAWAHARA, Manabu

    2017-01-01

    Insulin-like growth factor 2 (IGF2) is responsible for a broad range of physiological processes during fetal development and adulthood, but genomic analyses of IGF2 containing the 5ʹ- and 3ʹ-untranslated regions (UTRs) in equines have been limited. In this study, we characterized the IGF2 mRNA containing the UTRs, and determined its expression pattern in the fetal tissues of horses. The complete equine IGF2 mRNA sequence harboring another exon approximately 2.8 kb upstream from the canonical transcription start site was identified as a new transcript variant. As this upstream exon did not contain the start codon, the amino acid sequence was identical to the canonical variant. Analysis of the deduced amino acid sequence revealed that the protein possessed two major domains, IlGF and IGF2_C, and analysis of IGF2 sequence polymorphism in fetal tissues of Hokkaido native horse and Thoroughbreds revealed a single nucleotide polymorphism (T to C transition) at position 398 in Thoroughbreds, which caused an amino acid substitution at position 133 in the IGF2 sequence. Furthermore, the expression pattern of the IGF2 mRNA in the fetal tissues of horses was determined for the first time, and was found to be consistent with those of other species. Taken together, these results suggested that the transcriptional and translational products of the IGF2 gene have conserved functions in the fetal development of mammals, including horses. PMID:29151450
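
The correspondence between nucleotide position 398 and amino acid position 133 reported above can be checked with simple codon arithmetic. The sketch below assumes that position 398 is counted from the first base of the start codon (coding-sequence coordinates), which the abstract does not state explicitly; the function name `codon_of` is hypothetical.

```python
def codon_of(nt_pos: int) -> tuple[int, int]:
    """Map a 1-based coding-sequence nucleotide position to
    (codon number, position within that codon), both 1-based.

    Assumes numbering starts at the first base of the start codon.
    """
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    codon = (nt_pos - 1) // 3 + 1      # which triplet the base falls in
    offset = (nt_pos - 1) % 3 + 1      # 1st, 2nd or 3rd base of the triplet
    return codon, offset


print(codon_of(398))  # (133, 2)
```

Under that assumption, nucleotide 398 is the second base of codon 133, which is consistent with a substitution at amino acid position 133.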

  3. [Analysis of genotype and phenotype correlation of MYH7-V878A mutation among ethnic Han Chinese pedigrees affected with hypertrophic cardiomyopathy].

    PubMed

    Wang, Bo; Guo, Ruiqi; Zuo, Lei; Shao, Hong; Liu, Ying; Wang, Yu; Ju, Yan; Sun, Chao; Wang, Lifeng; Zhang, Yanmin; Liu, Liwen

    2017-08-10

    To analyze the phenotype-genotype correlation of MYH7-V878A mutation. Exonic amplification and high-throughput sequencing of 96-cardiovascular disease-related genes were carried out on probands from 210 pedigrees affected with hypertrophic cardiomyopathy (HCM). For the probands, their family members, and 300 healthy volunteers, the identified MYH7-V878A mutation was verified by Sanger sequencing. Information of the HCM patients and their family members, including clinical data, physical examination, echocardiography (UCG), electrocardiography (ECG), and conserved sequence of the mutation among various species were analyzed. A MYH7-V878A mutation was detected in five HCM pedigrees containing 31 family members. Fourteen members have carried the mutation, among whom 11 were diagnosed with HCM, while 3 did not meet the diagnostic criteria. Some of the fourteen members also carried other mutations. Family members not carrying the mutation had normal UCG and ECG. No MYH7-V878A mutation was found among the 300 healthy volunteers. Analysis of sequence conservation showed that the amino acid is located in highly conserved regions among various species. MYH7-V878A is a hot spot among ethnic Han Chinese with a high penetrance. Functional analysis of the conserved sequences suggested that the mutation may cause significant alteration of the function. MYH7-V878A has a significant value for the early diagnosis of HCM.

  4. Archaeal and bacterial diversity in two hot springs from geothermal regions in Bulgaria as demostrated by 16S rRNA and GH-57 genes.

    PubMed

    Stefanova, Katerina; Tomova, Iva; Tomova, Anna; Radchenkova, Nadja; Atanassov, Ivan; Kambourova, Margarita

    2015-12-01

    Archaeal and bacterial diversity in two Bulgarian hot springs, geographically separated with different tectonic origin and different temperature of water was investigated exploring two genes, 16S rRNA and GH-57. Archaeal diversity was significantly higher in the hotter spring Levunovo (LV) (82°C); on the contrary, bacterial diversity was higher in the spring Vetren Dol (VD) (68°C). The analyzed clones from LV library were referred to twenty eight different sequence types belonging to five archaeal groups from Crenarchaeota and Euryarchaeota. A domination of two groups was observed, Candidate Thaumarchaeota and Methanosarcinales. The majority of the clones from VD were referred to HWCG (Hot Water Crenarchaeotic Group). The formation of a group of thermophiles in the order Methanosarcinales was suggested. Phylogenetic analysis revealed high numbers of novel sequences, more than one third of archaeal and half of the bacterial phylotypes displayed similarity lower than 97% with known ones. The retrieved GH-57 gene sequences showed a complex phylogenic distribution. The main part of the retrieved homologous GH-57 sequences affiliated with bacterial phyla Bacteroidetes, Deltaproteobacteria, Candidate Saccharibacteria and affiliation of almost half of the analyzed sequences is not fully resolved. GH-57 gene analysis allows an increased resolution of the biodiversity assessment and in depth analysis of specific taxonomic groups. [Int Microbiol 18(4):217-223 (2015)]. Copyright© by the Spanish Society for Microbiology and Institute for Catalan Studies.

  5. Customisation of the exome data analysis pipeline using a combinatorial approach.

    PubMed

    Pattnaik, Swetansu; Vaidyanathan, Srividya; Pooja, Durgad G; Deepak, Sa; Panda, Binay

    2012-01-01

    The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable option compared to whole genome runs but it still requires follow up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post alignment variant detection algorithms. However, the aforementioned approach warrants the use of multiple sequencing chemistries, multiple alignment tools and multiple variant callers, which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives and to make the choice of the right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post alignment stage, suggesting the use of the most suitable combination determined by a simple framework of pre-existing metrics to create significant datasets.

  6. Viral morphogenesis is the dominant source of sequence censorship in M13 combinatorial peptide phage display.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rodi, D. J.; Soares, A. S.; Makowski, L.

    Novel statistical methods have been developed and used to quantitate and annotate the sequence diversity within combinatorial peptide libraries on the basis of small numbers (1-200) of sequences selected at random from commercially available M13 p3-based phage display libraries. These libraries behave statistically as though they correspond to populations containing roughly 4.0±1.6% of the random dodecapeptides and 7.9±2.6% of the random constrained heptapeptides that are theoretically possible within the phage populations. Analysis of amino acid residue occurrence patterns shows no demonstrable influence on sequence censorship by Escherichia coli tRNA isoacceptor profiles or either overall codon or Class II codon usage patterns, suggesting no metabolic constraints on recombinant p3 synthesis. There is an overall depression in the occurrence of cysteine, arginine and glycine residues and an overabundance of proline, threonine and histidine residues. The majority of position-dependent amino acid sequence bias is clustered at three positions within the inserted peptides of the dodecapeptide library, +1, +3 and +12 downstream from the signal peptidase cleavage site. Conformational tendency measures of the peptides indicate a significant preference for inserts favoring a β-turn conformation. The observed protein sequence limitations can primarily be attributed to genetic codon degeneracy and signal peptidase cleavage preferences. These data suggest that for applications in which maximal sequence diversity is essential, such as epitope mapping or novel receptor identification, combinatorial peptide libraries should be constructed using codon-corrected trinucleotide cassettes within vector-host systems designed to minimize morphogenesis-related censorship.

  7. Enhancing the detection of barcoded reads in high throughput DNA sequencing data by controlling the false discovery rate.

    PubMed

    Buschmann, Tilo; Zhang, Rong; Brash, Douglas E; Bystrykh, Leonid V

    2014-08-07

    DNA barcodes are short unique sequences used to label DNA or RNA-derived samples in multiplexed deep sequencing experiments. During the demultiplexing step, barcodes must be detected and their position identified. In some cases (e.g., with PacBio SMRT), the position of the barcode and DNA context is not well defined. Many reads start inside the genomic insert so that adjacent primers might be missed. The matter is further complicated by coincidental similarities between barcode sequences and reference DNA. Therefore, a robust strategy is required in order to detect barcoded reads and avoid a large number of false positives or negatives. For mass inference problems such as this one, false discovery rate (FDR) methods are powerful and balanced solutions. Since existing FDR methods cannot be applied to this particular problem, we present an adapted FDR method that is suitable for the detection of barcoded reads and suggest possible improvements. In our analysis, barcode sequences showed high rates of coincidental similarities with the Mus musculus reference DNA. This problem became more acute when the length of the barcode sequence decreased and the number of barcodes in the set increased. The method presented in this paper controls the tail area-based false discovery rate to distinguish between barcoded and unbarcoded reads. This method helps to establish the highest acceptable minimal distance between reads and barcode sequences. In a proof of concept experiment we correctly detected barcodes in 83% of the reads with a precision of 89%. Sensitivity improved to 99% at 99% precision when the adjacent primer sequence was incorporated in the analysis. The analysis was further improved using a paired end strategy. Following an analysis of the data for sequence variants induced in the Atp1a1 gene of C57BL/6 murine melanocytes by ultraviolet light and conferring resistance to ouabain, we found no evidence of cross-contamination of DNA material between samples. Our method offers a proper quantitative treatment of the problem of detecting barcoded reads in a noisy sequencing environment. It is based on false discovery rate statistics, which allow a proper trade-off between sensitivity and precision to be chosen.
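
The idea of accepting a barcode only when a read lies within a maximum distance of it can be illustrated with a short sketch. This is not the adapted FDR method described in the abstract: it uses a fixed Hamming-distance cutoff chosen by hand, whereas the paper derives the acceptable distance from false discovery rate control. The barcode set, the function names and the cutoff are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(read_prefix: str, barcodes: list[str], max_dist: int):
    """Return the closest barcode, or None if the read is treated as unbarcoded.

    A read is rejected when its best match is further away than max_dist,
    or when two barcodes tie for the best match (ambiguous assignment).
    """
    scored = sorted((hamming(read_prefix, bc), bc) for bc in barcodes)
    best_dist, best_bc = scored[0]
    if best_dist > max_dist:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None
    return best_bc


# Hypothetical 6-nt barcode set, for illustration only.
BARCODES = ["ACGTAC", "TGCATG", "GATCGA"]

print(assign_barcode("ACGTAA", BARCODES, max_dist=1))  # ACGTAC
print(assign_barcode("TTTTTT", BARCODES, max_dist=1))  # None
```

Raising `max_dist` accepts more reads (higher sensitivity) at the cost of more coincidental matches (lower precision), which is the trade-off the abstract describes.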

  8. Comparative genome analysis of Basidiomycete fungi

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37% of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48% of basidiomycete proteins are unique to the phylum with nearly half of those (22%) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  9. Metagenomics and the protein universe

    PubMed Central

    Godzik, Adam

    2011-01-01

    Metagenomics sequencing projects have dramatically increased our knowledge of the protein universe and provided over one-half of currently known protein sequences; they have also introduced a much broader phylogenetic diversity into the protein databases. The full analysis of metagenomic datasets is only beginning, but it has already led to the discovery of thousands of new protein families, likely representing novel functions specific to given environments. At the same time, a deeper analysis of such novel families, including experimental structure determination of some representatives, suggests that most of them represent distant homologs of already characterized protein families, and thus most of the protein diversity present in the new environments are due to functional divergence of the known protein families rather than the emergence of new ones. PMID:21497084

  10. Pathogenicity, sequence and phylogenetic analysis of Malaysian Chicken anaemia virus obtained after low and high passages in MSB-1 cells.

    PubMed

    Chowdhury, S M Z H; Omar, A R; Aini, I; Hair-Bejo, M; Jamaluddin, A A; Md-Zain, B M; Kono, Y

    2003-12-01

    Specific-pathogen-free (SPF) chickens inoculated with low passage Chicken anaemia virus (CAV), SMSC-1 and 3-1 isolates produced lesions suggestive of CAV infection. Repeated passages of the isolates in cell culture until passage 60 (P60) and passage 123 produced viruses that showed a significantly reduced level of pathogenicity in SPF chickens compared to the low passage isolates. Sequence comparison indicated that nucleotide changes in only the coding region of the P60 passage isolates were thought to contribute to virus attenuation. Phylogenetic analysis indicated that SMSC-1 and 3-1 were highly divergent, but their P60 passage derivatives shared significant homology to a Japanese isolate A2.

  11. Identification and expression analysis of a novel R-type lectin from the coleopteran beetle, Tenebrio molitor.

    PubMed

    Kim, Dong Hyun; Patnaik, Bharat Bhusan; Seo, Gi Won; Kang, Seong Min; Lee, Yong Seok; Lee, Bok Luel; Han, Yeon Soo

    2013-11-01

    We have identified novel ricin-type (R-type) lectin by sequencing of random clones from cDNA library of the coleopteran beetle, Tenebrio molitor. The cDNA sequence is comprised of 495 bp encoding a protein of 164 amino acid residues and shows 49% identity with galectin of Tribolium castaneum. Bioinformatics analysis shows that the amino acid residues from 35 to 162 belong to ricin-type beta-trefoil structure. The transcript was significantly upregulated after early hours of injection with peptidoglycans derived from Gram (+) and Gram (-) bacteria, beta-1, 3 glucan from fungi and an intracellular pathogen, Listeria monocytogenes suggesting putative function in innate immunity. Copyright © 2013 Elsevier Inc. All rights reserved.
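
The relationship between the 495 bp sequence and the 164-residue protein reported above follows from dividing by three and discounting the stop codon. The sketch below assumes that the 495 bp figure is the open reading frame including its stop codon, which the abstract does not state; the function name is hypothetical.

```python
def residues_from_orf(orf_len_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acid residues encoded by an ORF of the given length.

    Assumes the length is a whole number of codons; the stop codon, if
    counted in the length, does not contribute a residue.
    """
    if orf_len_bp <= 0 or orf_len_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_len_bp // 3
    return codons - 1 if includes_stop else codons


print(residues_from_orf(495))  # 164
```

Here 495 bp gives 165 codons, and removing the stop codon leaves 164 residues, which matches the reported protein length under the stated assumption.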

  12. Bioremediation potential of a highly mercury resistant bacterial strain Sphingobium SA2 isolated from contaminated soil.

    PubMed

    Mahbub, Khandaker Rayhan; Krishnan, Kannan; Megharaj, Mallavarapu; Naidu, Ravi

    2016-02-01

    A mercury resistant bacterial strain, SA2, was isolated from soil contaminated with mercury. The 16S rRNA gene sequence of this isolate showed 99% sequence similarity to the genera Sphingobium and Sphingomonas of α-proteobacteria group. However, the isolate formed a distinct phyletic line with the genus Sphingobium suggesting the strain belongs to Sphingobium sp. Toxicity studies indicated resistance to high levels of mercury with estimated EC50 values 4.5 mg L(-1) and 44.15 mg L(-1) and MIC values 5.1 mg L(-1) and 48.48 mg L(-1) in minimal and rich media, respectively. The strain SA2 was able to volatilize mercury by producing mercuric reductase enzyme which makes it potential candidate for remediating mercury. ICP-QQQ-MS analysis of Hg supplemented culture solutions confirmed that almost 79% mercury in the culture suspension was volatilized in 6 h. A very small amount of mercury was observed to accumulate in cell pellets which was also evident according to ESEM-EDX analysis. The mercuric reductase gene merA was amplified and sequenced. The deduced amino acid sequence demonstrated sequence homology with α-proteobacteria and Ascomycota group. Copyright © 2015 Elsevier Ltd. All rights reserved.

  13. Sequencing of the amylopullulanase (apu) gene of Thermoanaerobacter ethanolicus 39E, and identification of the active site by site-directed mutagenesis.

    PubMed

    Mathupala, S P; Lowe, S E; Podkovyrov, S M; Zeikus, J G

    1993-08-05

    The complete nucleotide sequence of the gene encoding the dual active amylopullulanase of Thermoanaerobacter ethanolicus 39E (formerly Clostridium thermohydrosulfuricum) was determined. The structural gene (apu) contained a single open reading frame 4443 base pairs in length, corresponding to 1481 amino acids, with an estimated molecular weight of 162,780. Analysis of the deduced sequence of apu with sequences of alpha-amylases and alpha-1,6 debranching enzymes enabled the identification of four conserved regions putatively involved in substrate binding and in catalysis. The conserved regions were localized within a 2.9-kilobase pair gene fragment, which encoded a M(r) 100,000 protein that maintained the dual activities and thermostability of the native enzyme. The catalytic residues of amylopullulanase were tentatively identified by using hydrophobic cluster analysis for comparison of amino acid sequences of amylopullulanase and other amylolytic enzymes. Asp597, Glu626, and Asp703 were individually modified to their respective amide form, or the alternate acid form, and in all cases both alpha-amylase and pullulanase activities were lost, suggesting the possible involvement of 3 residues in a catalytic triad, and the presence of a putative single catalytic site within the enzyme. These findings substantiate amylopullulanase as a new type of amylosaccharidase.

  14. Novel proteases from the genome of the carnivorous plant Drosera capensis: structural prediction and comparative analysis

    PubMed Central

    Butts, Carter T.; Bierma, Jan C.; Martin, Rachel W.

    2016-01-01

    In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a “ferment” similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. PMID:27353064

  15. Novel gastric helicobacters and oral campylobacters are present in captive and wild cetaceans

    PubMed Central

    Goldman, Cinthia G.; Matteo, Mario J.; Loureiro, Julio D.; Almuzara, Marisa; Barberis, Claudia; Vay, Carlos; Catalano, Mariana; Heredia, Sergio Rodríguez; Mantero, Paula; Boccio, Jose R.; Zubillaga, Marcela B.; Cremaschi, Graciela A.; Solnick, Jay V.; Perez-Perez, Guillermo I.; Blaser, Martin J.

    2011-01-01

    The mammalian gastric and oral mucosa may be colonized by mixed Helicobacter and Campylobacter species, respectively, in individual animals. To better characterize the presence and distribution of Helicobacter and Campylobacter among marine mammals, we used PCR and 16S rDNA sequence analysis to examine gastric and oral samples from ten dolphins (Tursiops gephyreus), one killer whale (Orcinus orca), one false killer whale (Pseudorca crassidens), and three wild La Plata river dolphins (Pontoporia blainvillei). Helicobacter spp. DNA was widely distributed in gastric and oral samples from both captive and wild cetaceans. Phylogenetic analysis demonstrated two Helicobacter sequence clusters, one closely related to H. cetorum, a species isolated from dolphins and whales in North America. The second related cluster was to sequences obtained from dolphins in Australia and to gastric non-Helicobacter pylori helicobacters, and may represent a novel taxonomic group. Dental plaque sequences from four dolphins formed a third cluster within the Campylobacter genus that likely represents a novel species isolated from marine mammals. Identification of identical Helicobacter spp. DNA sequences from dental plaque, saliva and gastric fluids from the same hosts, suggests that the oral cavity may be involved in transmission. These results demonstrate that Helicobacter and Campylobacter species are commonly distributed in marine mammals, and identify taxonomic clusters that may represent novel species. PMID:21592686

  16. Evolution and Diversity in Human Herpes Simplex Virus Genomes

    PubMed Central

    Gatherer, Derek; Ochoa, Alejandro; Greenbaum, Benjamin; Dolan, Aidan; Bowden, Rory J.; Enquist, Lynn W.; Legendre, Matthieu; Davison, Andrew J.

    2014-01-01

    Herpes simplex virus 1 (HSV-1) causes a chronic, lifelong infection in >60% of adults. Multiple recent vaccine trials have failed, with viral diversity likely contributing to these failures. To understand HSV-1 diversity better, we comprehensively compared 20 newly sequenced viral genomes from China, Japan, Kenya, and South Korea with six previously sequenced genomes from the United States, Europe, and Japan. In this diverse collection of passaged strains, we found that one-fifth of the newly sequenced members share a gene deletion and one-third exhibit homopolymeric frameshift mutations (HFMs). Individual strains exhibit genotypic and potential phenotypic variation via HFMs, deletions, short sequence repeats, and single-nucleotide polymorphisms, although the protein sequence identity between strains exceeds 90% on average. In the first genome-scale analysis of positive selection in HSV-1, we found signs of selection in specific proteins and residues, including the fusion protein glycoprotein H. We also confirmed previous results suggesting that recombination has occurred with high frequency throughout the HSV-1 genome. Despite this, the HSV-1 strains analyzed clustered by geographic origin during whole-genome distance analysis. These data shed light on likely routes of HSV-1 adaptation to changing environments and will aid in the selection of vaccine antigens that are invariant worldwide. PMID:24227835

  17. In silico study of breast cancer associated gene 3 using LION Target Engine and other tools.

    PubMed

    León, Darryl A; Cànaves, Jaume M

    2003-12-01

    Sequence analysis of individual targets is an important step in annotation and validation. As a test case, we investigated human breast cancer associated gene 3 (BCA3) with LION Target Engine and with other bioinformatics tools. LION Target Engine confirmed that the BCA3 gene is located on 11p15.4 and that the two most likely splice variants (lacking exon 3 and exons 3 and 5, respectively) exist. Based on our manual curation of sequence data, it is proposed that an additional variant (missing only exon 5) published in a public sequence repository, is a prediction artifact. A significant number of new orthologs were also identified, and these were the basis for a high-quality protein secondary structure prediction. Moreover, our research confirmed several distinct functional domains as described in earlier reports. Sequence conservation from multiple sequence alignments, splice variant identification, secondary structure predictions, and predicted phosphorylation sites suggest that the removal of interaction sites through alternative splicing might play a modulatory role in BCA3. This in silico approach shows the depth and relevance of an analysis that can be accomplished by including a variety of publicly available tools with an integrated and customizable life science informatics platform.

  18. Molecular Characterization of Bombyx mori Cytoplasmic Polyhedrosis Virus Genome Segment 4

    PubMed Central

    Ikeda, Keiko; Nagaoka, Sumiharu; Winkler, Stefan; Kotani, Kumiko; Yagi, Hiroaki; Nakanishi, Kae; Miyajima, Shigetoshi; Kobayashi, Jun; Mori, Hajime

    2001-01-01

    The complete nucleotide sequence of the genome segment 4 (S4) of Bombyx mori cytoplasmic polyhedrosis virus (BmCPV) was determined. The 3,259-nucleotide sequence contains a single long open reading frame which spans nucleotides 14 to 3187 and which is predicted to encode a protein with a molecular mass of about 130 kDa. Western blot analysis showed that S4 encodes BmCPV protein VP3, which is one of the outer components of the BmCPV virion. Sequence analysis of the deduced amino acid sequence of BmCPV VP3 revealed possible sequence homology with proteins from rice ragged stunt virus (RRSV) S2, Nilaparvata lugens reovirus S4, and Fiji disease fijivirus S4. This may suggest that plant reoviruses originated from insect viruses and that RRSV emerged more recently than other plant reoviruses. A chimeric protein consisting of BmCPV VP3 and green fluorescent protein (GFP) was constructed and expressed with BmCPV polyhedrin using a baculovirus expression vector. The VP3-GFP chimera was incorporated into BmCPV polyhedra and released under alkaline conditions. The results indicate that specific interactions occur between BmCPV polyhedrin and VP3 which might facilitate BmCPV virion occlusion into the polyhedra. PMID:11134312

  19. The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

    PubMed Central

    Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

    2013-01-01

    Background: Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. Methodology/Principal Findings: We used 454 technology for sequencing, combined with screening of a fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. Conclusion: The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520

  20. The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

    PubMed

    Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

    2013-01-01

    Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. We used 454 technology for sequencing, combined with screening of a fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.
