Sample records for multiple sequence types

  1. Analysis of Ribosome Inactivating Protein (RIP): A Bioinformatics Approach

    NASA Astrophysics Data System (ADS)

    Jothi, G. Edward Gnana; Majilla, G. Sahaya Jose; Subhashini, D.; Deivasigamani, B.

    2012-10-01

    In spite of the medical advances in recent years, the world is in need of different sources to address certain health issues. Ribosome Inactivating Proteins (RIPs) were found to be one among them. To provide easy access to information about RIPs, there is a need to analyse RIPs towards constructing a database on RIPs. Also, multiple sequence alignment was done towards screening for homologues of significant RIPs from rare sources against RIPs from easily available sources in terms of similarity. Protein sequences were retrieved from SWISS-PROT and further analysed using pairwise and multiple sequence alignment. Analysis shows that 151 RIPs have been characterized to date. Amongst them, there are 87 type I, 37 type II, 1 type III and 25 unknown RIPs. The sequence length information of various RIPs about the availability of full or partial sequence was also found. The multiple sequence alignment of 37 type I RIPs using the online server Multalin indicates the presence of 20 conserved residues. Pairwise alignment and multiple sequence alignment of certain selected RIPs in two groups namely Group I and Group II were carried out and the consensus level was found to be 98%, 98% and 90% respectively.
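
    The abstract above counts conserved residues in a multiple sequence alignment. As a minimal sketch of what such a count involves, the snippet below reports the alignment columns in which every sequence carries the same non-gap residue. The function name, the toy alignment, and the strict "identical in all sequences" criterion are assumptions made for illustration; this is not Multalin's consensus calculation, which the abstract does not describe.

```python
def conserved_columns(alignment, gap="-"):
    """Return 0-based indices of columns where all sequences share one non-gap residue."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        # Collect the distinct residues seen in this column (case-insensitive).
        column = {seq[i].upper() for seq in alignment}
        if len(column) == 1 and gap not in column:
            conserved.append(i)
    return conserved


# Hypothetical toy alignment, not real RIP sequences.
toy_alignment = [
    "MK-LVEA",
    "MKALVDA",
    "MR-LVEA",
]
print(conserved_columns(toy_alignment))
```

    A looser criterion (for example, a residue shared by 90% of sequences) would only change the test applied to each column.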

  2. How Should Intelligent Tutoring Systems Sequence Multiple Graphical Representations of Fractions? A Multi-Methods Study

    ERIC Educational Resources Information Center

    Rau, M. A.; Aleven, V.; Rummel, N.; Pardos, Z.

    2014-01-01

    Providing learners with multiple representations of learning content has been shown to enhance learning outcomes. When multiple representations are presented across consecutive problems, we have to decide in what sequence to present them. Prior research has demonstrated that interleaving "tasks types" (as opposed to blocking them) can…

  3. Interleaved Practice in Multi-Dimensional Learning Tasks: Which Dimension Should We Interleave?

    ERIC Educational Resources Information Center

    Rau, Martina A.; Aleven, Vincent; Rummel, Nikol

    2013-01-01

    Research shows that multiple representations can enhance student learning. Many curricula use multiple representations across multiple task types. The temporal sequence of representations and task types is likely to impact student learning. Research on contextual interference shows that interleaving learning tasks leads to better learning results…

  4. Clonal Relatedness of Enterotoxigenic Escherichia coli (ETEC) Strains Expressing LT and CS17 Isolated from Children with Diarrhoea in La Paz, Bolivia

    PubMed Central

    Rodas, Claudia; Klena, John D.; Nicklasson, Matilda; Iniguez, Volga; Sjöling, Åsa

    2011-01-01

    Background: Enterotoxigenic Escherichia coli (ETEC) is a major cause of traveller's and infantile diarrhoea in the developing world. ETEC produces two toxins, a heat-stable toxin (known as ST) and a heat-labile toxin (LT) and colonization factors that help the bacteria to attach to epithelial cells. Methodology/Principal Findings: In this study, we characterized a subset of ETEC clinical isolates recovered from Bolivian children under 5 years of age using a combination of multilocus sequence typing (MLST) analysis, virulence typing, serotyping and antimicrobial resistance test patterns in order to determine the genetic background of ETEC strains circulating in Bolivia. We found that strains expressing the heat-labile (LT) enterotoxin and colonization factor CS17 were common and belonged to several MLST sequence types but mainly to sequence type-423 and sequence type-443 (Achtman scheme). To further study the LT/CS17 strains we analysed the nucleotide sequence of the CS17 operon and compared the structure to LT/CS17 ETEC isolates from Bangladesh. Sequence analysis confirmed that all sequence type-423 strains from Bolivia had a single nucleotide polymorphism; SNPbol in the CS17 operon that was also found in some other MLST sequence types from Bolivia but not in strains recovered from Bangladeshi children. The dominant ETEC clone in Bolivia (sequence type-423/SNPbol) was found to persist over multiple years and was associated with severe diarrhoea but these strains were variable with respect to antimicrobial resistance patterns. Conclusion/Significance: The results showed that although the LT/CS17 phenotype is common among ETEC strains in Bolivia, multiple clones, as determined by unique MLST sequence types, populate this phenotype. Our data also appear to suggest that acquisition and loss of antimicrobial resistance in LT-expressing CS17 ETEC clones is more dynamic than acquisition or loss of virulence factors. PMID:22140423

  5. Clonal relatedness of enterotoxigenic Escherichia coli (ETEC) strains expressing LT and CS17 isolated from children with diarrhoea in La Paz, Bolivia.

    PubMed

    Rodas, Claudia; Klena, John D; Nicklasson, Matilda; Iniguez, Volga; Sjöling, Asa

    2011-01-01

    Enterotoxigenic Escherichia coli (ETEC) is a major cause of traveller's and infantile diarrhoea in the developing world. ETEC produces two toxins, a heat-stable toxin (known as ST) and a heat-labile toxin (LT) and colonization factors that help the bacteria to attach to epithelial cells. In this study, we characterized a subset of ETEC clinical isolates recovered from Bolivian children under 5 years of age using a combination of multilocus sequence typing (MLST) analysis, virulence typing, serotyping and antimicrobial resistance test patterns in order to determine the genetic background of ETEC strains circulating in Bolivia. We found that strains expressing the heat-labile (LT) enterotoxin and colonization factor CS17 were common and belonged to several MLST sequence types but mainly to sequence type-423 and sequence type-443 (Achtman scheme). To further study the LT/CS17 strains we analysed the nucleotide sequence of the CS17 operon and compared the structure to LT/CS17 ETEC isolates from Bangladesh. Sequence analysis confirmed that all sequence type-423 strains from Bolivia had a single nucleotide polymorphism; SNP(bol) in the CS17 operon that was also found in some other MLST sequence types from Bolivia but not in strains recovered from Bangladeshi children. The dominant ETEC clone in Bolivia (sequence type-423/SNP(bol)) was found to persist over multiple years and was associated with severe diarrhoea but these strains were variable with respect to antimicrobial resistance patterns. The results showed that although the LT/CS17 phenotype is common among ETEC strains in Bolivia, multiple clones, as determined by unique MLST sequence types, populate this phenotype. Our data also appear to suggest that acquisition and loss of antimicrobial resistance in LT-expressing CS17 ETEC clones is more dynamic than acquisition or loss of virulence factors.

  6. Coupling detrended fluctuation analysis for multiple warehouse-out behavioral sequences

    NASA Astrophysics Data System (ADS)

    Yao, Can-Zhong; Lin, Ji-Nan; Zheng, Xu-Zhou

    2017-01-01

    Interaction patterns among different warehouses could make the warehouse-out behavioral sequences less predictable. We first apply a coupling detrended fluctuation analysis to the warehouse-out quantity, and find that the multivariate sequences exhibit significant coupling multifractal characteristics regardless of the types of steel products. Secondly, we track the sources of multifractal warehouse-out sequences by shuffling and surrogating the original ones, and we find that the fat-tail distribution contributes more to multifractal features than the long-term memory, regardless of the types of steel products. From the perspective of warehouse contribution, some warehouses steadily contribute more to multifractality than other warehouses. Finally, based on multiscale multifractal analysis, we propose a Hurst surface structure to investigate coupling multifractality, and show that multiple behavioral sequences exhibit significant coupling multifractal features that emerge and are usually restricted to relatively greater time scale intervals.

  7. The Genome Sequence of a Type ST239 Methicillin-Resistant Staphylococcus aureus Isolate from a Malaysian Hospital

    PubMed Central

    Lee, LS; Teh, LK; Zainuddin, ZF; Salleh, MZ

    2014-01-01

    We report the genome sequence of a healthcare-associated MRSA type ST239 clone isolated from a patient with septicemia in Malaysia. This clone typifies the characteristics of ST239 lineage, including resistance to multiple antibiotics and antiseptics. PMID:25197474

  8. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks.

    PubMed

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S K; Mammel, Mark K; Tarr, Phillip I; Eppinger, Mark

    2016-01-01

    Multi isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North America outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulse-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. 
    Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and long-term evolution and can complement currently employed typing schemes for outbreak ex- and inclusion, diagnostics, surveillance, and forensic studies.

  9. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks

    PubMed Central

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S. K.; Mammel, Mark K.; Tarr, Phillip I.; Eppinger, Mark

    2016-01-01

    Multi isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North America outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulse-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. 
    Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and long-term evolution and can complement currently employed typing schemes for outbreak ex- and inclusion, diagnostics, surveillance, and forensic studies. PMID:27446025

  10. The 5S rDNA in two Abracris grasshoppers (Ommatolampidinae: Acrididae): molecular and chromosomal organization.

    PubMed

    Bueno, Danilo; Palacios-Gimenez, Octavio Manuel; Martí, Dardo Andrea; Mariguela, Tatiane Casagrande; Cabral-de-Mello, Diogo Cavalcanti

    2016-08-01

    The 5S ribosomal DNA (rDNA) sequences are subject to dynamic evolution at chromosomal and molecular levels, evolving through concerted and/or birth-and-death fashion. Among grasshoppers, the chromosomal location for this sequence was established for some species, but little molecular information was obtained to infer evolutionary patterns. Here, we integrated data from chromosomal and nucleotide sequence analysis for 5S rDNA in two Abracris species aiming to identify evolutionary dynamics. For both species, two arrays were identified, a larger sequence (named type-I) that consisted of the entire 5S rDNA gene plus NTS (non-transcribed spacer) and a smaller (named type-II) with truncated 5S rDNA gene plus short NTS that was considered a pseudogene. For type-I sequences, the gene corresponding region contained the internal control region and poly-T motif and the NTS presented partial transposable elements. Between the species, nucleotide differences for type-I were noticed, while type-II was identical, suggesting pseudogenization in a common ancestor. From a chromosomal point of view, the type-II was placed in one bivalent, while type-I occurred in multiple copies in distinct chromosomes. In Abracris, the evolution of 5S rDNA was apparently influenced by the chromosomal distribution of clusters (single or multiple location), resulting in a mixed mechanism integrating concerted and birth-and-death evolution depending on the unit.

  11. Simultaneous phylogeny reconstruction and multiple sequence alignment

    PubMed Central

    Yue, Feng; Shi, Jian; Tang, Jijun

    2009-01-01

    Background: A phylogeny is the evolutionary history of a group of organisms. To date, sequence data is still the most used data type for phylogenetic reconstruction. Before any sequences can be used for phylogeny reconstruction, they must be aligned, and the quality of the multiple sequence alignment has been shown to affect the quality of the inferred phylogeny. At the same time, all the current multiple sequence alignment programs use a guide tree to produce the alignment and experiments showed that good guide trees can significantly improve the multiple alignment quality. Results: We devise a new algorithm to simultaneously align multiple sequences and search for the phylogenetic tree that leads to the best alignment. We also implemented the algorithm as a C program package, which can handle both DNA and protein data and can take a simple cost model as well as complex substitution matrices, such as PAM250 or BLOSUM62. The performance of the new method is compared with that of other popular multiple sequence alignment tools, including widely used programs such as ClustalW and T-Coffee. Experimental results suggest that this method has good performance in terms of both phylogeny accuracy and alignment quality. Conclusion: We present an algorithm to align multiple sequences and reconstruct the phylogenies that minimize the alignment score, which is based on an efficient algorithm to solve the median problems for three sequences. Our extensive experiments suggest that this method is very promising and can produce high quality phylogenies and alignments. PMID:19208110

  12. An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

    PubMed

    Horn, T; Chang, C A; Urdea, M S

    1997-12-01

    The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays.

  13. An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

    PubMed Central

    Horn, T; Chang, C A; Urdea, M S

    1997-01-01

    The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays. PMID:9365265

  14. Typing of canine parvovirus isolates using mini-sequencing based single nucleotide polymorphism analysis.

    PubMed

    Naidu, Hariprasad; Subramanian, B Mohana; Chinchkar, Shankar Ramchandra; Sriraman, Rajan; Rana, Samir Kumar; Srinivasan, V A

    2012-05-01

    The antigenic types of canine parvovirus (CPV) are defined based on differences in the amino acids of the major capsid protein VP2. Type specificity is conferred by a limited number of amino acid changes and in particular by few nucleotide substitutions. PCR based methods are not particularly suitable for typing circulating variants which differ in a few specific nucleotide substitutions. Assays for determining SNPs can detect efficiently nucleotide substitutions and can thus be adapted to identify CPV types. In the present study, CPV typing was performed by single nucleotide extension using the mini-sequencing technique. A mini-sequencing signature was established for all the four CPV types (CPV2, 2a, 2b and 2c) and feline panleukopenia virus. The CPV typing using the mini-sequencing reaction was performed for 13 CPV field isolates and the two vaccine strains available in our repository. All the isolates had been typed earlier by full-length sequencing of the VP2 gene. The typing results obtained from mini-sequencing matched completely with that of sequencing. Typing could be achieved with less than 100 copies of standard plasmid DNA constructs or ≤10¹ FAID₅₀ of virus by mini-sequencing technique. The technique was also efficient for detecting multiple types in mixed infections. Copyright © 2012 Elsevier B.V. All rights reserved.

  15. PFAAT version 2.0: a tool for editing, annotating, and analyzing multiple sequence alignments.

    PubMed

    Caffrey, Daniel R; Dana, Paul H; Mathur, Vidhya; Ocano, Marco; Hong, Eun-Jong; Wang, Yaoyu E; Somaroo, Shyamal; Caffrey, Brian E; Potluri, Shobha; Huang, Enoch S

    2007-10-11

    By virtue of their shared ancestry, homologous sequences are similar in their structure and function. Consequently, multiple sequence alignments are routinely used to identify trends that relate to function. This type of analysis is particularly productive when it is combined with structural and phylogenetic analysis. Here we describe the release of PFAAT version 2.0, a tool for editing, analyzing, and annotating multiple sequence alignments. Support for multiple annotations is a key component of this release as it provides a framework for most of the new functionalities. The sequence annotations are accessible from the alignment and tree, where they are typically used to label sequences or hyperlink them to related databases. Sequence annotations can be created manually or extracted automatically from UniProt entries. Once a multiple sequence alignment is populated with sequence annotations, sequences can be easily selected and sorted through a sophisticated search dialog. The selected sequences can be further analyzed using statistical methods that explicitly model relationships between the sequence annotations and residue properties. Residue annotations are accessible from the alignment viewer and are typically used to designate binding sites or properties for a particular residue. Residue annotations are also searchable, and allow one to quickly select alignment columns for further sequence analysis, e.g. computing percent identities. Other features include: novel algorithms to compute sequence conservation, mapping conservation scores to a 3D structure in Jmol, displaying secondary structure elements, and sorting sequences by residue composition. PFAAT provides a framework whereby end-users can specify knowledge for a protein family in the form of annotation. The annotations can be combined with sophisticated analysis to test hypotheses that relate to sequence, structure and function.
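
    The abstract above mentions selecting alignment columns and then computing percent identities. A minimal sketch of that calculation for two aligned sequences follows. The function name and the convention of skipping any column where either sequence has a gap are assumptions for illustration (several denominators are in common use), and the code is not taken from PFAAT.

```python
def percent_identity(a, b, columns=None, gap="-"):
    """Percent of compared columns in which two aligned sequences match.

    Columns where either sequence has a gap are skipped (an assumed convention).
    `columns` optionally restricts the comparison to selected 0-based indices.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    if columns is None:
        columns = range(len(a))
    compared = 0
    matches = 0
    for i in columns:
        if a[i] == gap or b[i] == gap:
            continue
        compared += 1
        if a[i].upper() == b[i].upper():
            matches += 1
    # Return 0.0 when nothing could be compared, rather than dividing by zero.
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned pair; restrict to three selected columns in the second call.
print(percent_identity("MK-LVEA", "MKALVDA"))
print(percent_identity("MK-LVEA", "MKALVDA", columns=[0, 1, 5]))
```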

  16. Uptake, Results, and Outcomes of Germline Multiple-Gene Sequencing After Diagnosis of Breast Cancer.

    PubMed

    Kurian, Allison W; Ward, Kevin C; Hamilton, Ann S; Deapen, Dennis M; Abrahamse, Paul; Bondarenko, Irina; Li, Yun; Hawley, Sarah T; Morrow, Monica; Jagsi, Reshma; Katz, Steven J

    2018-05-10

    Low-cost sequencing of multiple genes is increasingly available for cancer risk assessment. Little is known about uptake or outcomes of multiple-gene sequencing after breast cancer diagnosis in community practice. To examine the effect of multiple-gene sequencing on the experience and treatment outcomes for patients with breast cancer. For this population-based retrospective cohort study, patients with breast cancer diagnosed from January 2013 to December 2015 and accrued from SEER registries across Georgia and in Los Angeles, California, were surveyed (n = 5080, response rate = 70%). Responses were merged with SEER data and results of clinical genetic tests, either BRCA1 and BRCA2 (BRCA1/2) sequencing only or including additional other genes (multiple-gene sequencing), provided by 4 laboratories. Type of testing (multiple-gene sequencing vs BRCA1/2-only sequencing), test results (negative, variant of unknown significance, or pathogenic variant), patient experiences with testing (timing of testing, who discussed results), and treatment (strength of patient consideration of, and surgeon recommendation for, prophylactic mastectomy), and prophylactic mastectomy receipt. We defined a patient subgroup with higher pretest risk of carrying a pathogenic variant according to practice guidelines. Among 5026 patients (mean [SD] age, 59.9 [10.7]), 1316 (26.2%) were linked to genetic results from any laboratory. Multiple-gene sequencing increasingly replaced BRCA1/2-only testing over time: in 2013, the rate of multiple-gene sequencing was 25.6% and BRCA1/2-only testing, 74.4%; in 2015 the rate of multiple-gene sequencing was 66.5% and BRCA1/2-only testing, 33.5%. Multiple-gene sequencing was more often ordered by genetic counselors (multiple-gene sequencing, 25.5% and BRCA1/2-only testing, 15.3%) and delayed until after surgery (multiple-gene sequencing, 32.5% and BRCA1/2-only testing, 19.9%).
    Multiple-gene sequencing substantially increased rate of detection of any pathogenic variant (multiple-gene sequencing: higher-risk patients, 12%; average-risk patients, 4.2% and BRCA1/2-only testing: higher-risk patients, 7.8%; average-risk patients, 2.2%) and variants of uncertain significance, especially in minorities (multiple-gene sequencing: white patients, 23.7%; black patients, 44.5%; and Asian patients, 50.9% and BRCA1/2-only testing: white patients, 2.2%; black patients, 5.6%; and Asian patients, 0%). Multiple-gene sequencing was not associated with an increase in the rate of prophylactic mastectomy use, which was highest with pathogenic variants in BRCA1/2 (BRCA1/2, 79.0%; other pathogenic variant, 37.6%; variant of uncertain significance, 30.2%; negative, 35.3%). Multiple-gene sequencing rapidly replaced BRCA1/2-only testing for patients with breast cancer in the community and enabled 2-fold higher detection of clinically relevant pathogenic variants without an associated increase in prophylactic mastectomy. However, important targets for improvement in the clinical utility of multiple-gene sequencing include postsurgical delay and racial/ethnic disparity in variants of uncertain significance.

  17. AlignMe—a membrane protein sequence alignment web server

    PubMed Central

    Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.

    2014-01-01

    We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
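
    The HP mode described above converts each input multiple sequence alignment into a hydrophobicity profile averaged over the sequence homologs. The sketch below illustrates that averaging step only. It assumes the Kyte-Doolittle hydropathy scale and assumes that gaps are simply left out of each column's mean; both choices are illustrative, the function name is hypothetical, and this is not the AlignMe implementation.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def averaged_hydropathy_profile(alignment, scale=KYTE_DOOLITTLE, gap="-"):
    """Mean hydropathy per alignment column; None where a column has no scored residue."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        # Gaps and residues missing from the scale are excluded from the mean.
        values = [scale[r] for r in column if r != gap and r in scale]
        profile.append(sum(values) / len(values) if values else None)
    return profile


# Hypothetical two-sequence alignment with an all-gap final column.
print(averaged_hydropathy_profile(["AI-", "AL-"]))
```

    Two such profiles, one per family, would then be aligned against each other; that step is not shown here.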

  18. Real-Time PCR Typing of Escherichia coli Based on Multiple Single Nucleotide Polymorphisms--a Convenient and Rapid Method.

    PubMed

    Lager, Malin; Mernelius, Sara; Löfgren, Sture; Söderman, Jan

    2016-01-01

    Healthcare-associated infections caused by Escherichia coli and antibiotic resistance due to extended-spectrum beta-lactamase (ESBL) production constitute a threat against patient safety. To identify, track, and control outbreaks and to detect emerging virulent clones, typing tools of sufficient discriminatory power that generate reproducible and unambiguous data are needed. A probe based real-time PCR method targeting multiple single nucleotide polymorphisms (SNP) was developed. The method was based on the multi locus sequence typing scheme of Institute Pasteur and by adaptation of previously described typing assays. An 8 SNP-panel that reached a Simpson's diversity index of 0.95 was established, based on analysis of sporadic E. coli cases (ESBL n = 27 and non-ESBL n = 53). This multi-SNP assay was used to identify the sequence type 131 (ST131) complex according to the Achtman's multi locus sequence typing scheme. However, it did not fully discriminate within the complex but provided a diagnostic signature that outperformed a previously described detection assay. Pulsed-field gel electrophoresis typing of isolates from a presumed outbreak (n = 22) identified two outbreaks (ST127 and ST131) and three different non-outbreak-related isolates. Multi-SNP typing generated congruent data except for one non-outbreak-related ST131 isolate. We consider multi-SNP real-time PCR typing an accessible primary generic E. coli typing tool for rapid and uniform type identification.
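
    The abstract above reports a Simpson's diversity index of 0.95 for the 8-SNP panel. As a minimal sketch of how such an index is computed from typing results, the snippet below uses the unbiased form commonly applied to typing schemes, D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number of isolates assigned to type j and N is the total number of isolates. Whether the authors used exactly this form is an assumption, and the isolate labels are invented for illustration.

```python
from collections import Counter


def simpsons_diversity(type_labels):
    """Simpson's index of diversity for a list of per-isolate type assignments."""
    counts = Counter(type_labels)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two isolates are required")
    # Probability that two isolates drawn without replacement share a type.
    same_type = sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    return 1.0 - same_type


# Hypothetical typing result: four isolates falling into three types.
print(simpsons_diversity(["A", "A", "B", "C"]))
```

    A value near 1 means two randomly chosen isolates are very likely to be assigned different types; a value of 0 means the method places every isolate in the same type.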

  19. Evolutionarily conserved regions and hydrophobic contacts at the superfamily level: The case of the fold-type I, pyridoxal-5′-phosphate-dependent enzymes

    PubMed Central

    Paiardini, Alessandro; Bossa, Francesco; Pascarella, Stefano

    2004-01-01

    The wealth of biological information provided by structural and genomic projects opens new prospects of understanding life and evolution at the molecular level. In this work, it is shown how computational approaches can be exploited to pinpoint protein structural features that remain invariant upon long evolutionary periods in the fold-type I, PLP-dependent enzymes. A nonredundant set of 23 superposed crystallographic structures belonging to this superfamily was built. Members of this family typically display high-structural conservation despite low-sequence identity. For each structure, a multiple-sequence alignment of orthologous sequences was obtained, and the 23 alignments were merged using the structural information to obtain a comprehensive multiple alignment of 921 sequences of fold-type I enzymes. The structurally conserved regions (SCRs), the evolutionarily conserved residues, and the conserved hydrophobic contacts (CHCs) were extracted from this data set, using both sequence and structural information. The results of this study identified a structural pattern of hydrophobic contacts shared by all of the superfamily members of fold-type I enzymes and involved in native interactions. This profile highlights the presence of a nucleus for this fold, in which residues participating in the most conserved native interactions exhibit preferential evolutionary conservation, that correlates significantly (r = 0.70) with the extent of mean hydrophobic contact value of their apolar fraction. PMID:15498941

  20. HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

    PubMed

    Wan, Shixiang; Zou, Quan

    2017-01-01

    Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing has resulted in a shortage of efficient ultra-large biological sequence alignment approaches for coping with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. The experiments on large-scale DNA and protein data sets, with files of more than 1 GB, showed that HAlign-II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resource. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.

  1. Evolution of nuclear rDNA ITS sequences in the Cladophora albida/sericea clade (Chlorophyta).

    PubMed

    Bakker, F T; Olsen, J L; Stam, W T

    1995-06-01

    Ribosomal DNA ITS sequences were compared among 13 different species and biogeographic isolates from the monophyletic "albida/sericea clade" in the green algal genus Cladophora. Six distinct ITS sequence types were found, characterized by multiple insertions and deletions and high levels of nucleotide substitution. Conserved domains within the ITS regions indicate the presence of ITS secondary structure. Low transition/transversion ratios among the six types and nearly symmetrical tree-length frequency distributions indicate some saturation and low phylogenetic signal. Although branching order among five of the six ITS sequence types could not be resolved, estimates of ITS sequence divergence as compared with 18S divergence in a subset of the taxa suggest that the origin of the different ITS types is probably in the mid-Miocene (12 Ma ago) but that biogeographic isolates within a single ITS type (including both Pacific and Atlantic representatives) have probably dispersed on a time scale of thousands rather than millions of years.

  2. Advances in DNA sequencing technologies for high resolution HLA typing.

    PubMed

    Cereb, Nezih; Kim, Hwa Ran; Ryu, Jaejun; Yang, Soo Young

    2015-12-01

    This communication describes our experience in large-scale G group-level high resolution HLA typing using three different DNA sequencing platforms - ABI 3730 xl, Illumina MiSeq and PacBio RS II. Recent advances in DNA sequencing technologies, so-called next generation sequencing (NGS), have brought breakthroughs in deciphering the genetic information in all living species at a large scale and at an affordable level. The NGS DNA indexing system allows sequencing multiple genes for large number of individuals in a single run. Our laboratory has adopted and used these technologies for HLA molecular testing services. We found that each sequencing technology has its own strengths and weaknesses, and their sequencing performances complement each other. HLA genes are highly complex and genotyping them is quite challenging. Using these three sequencing platforms, we were able to meet all requirements for G group-level high resolution and high volume HLA typing. Copyright © 2015 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.

  3. Sequence Segmentation with changeptGUI.

    PubMed

    Tasker, Edward; Keith, Jonathan M

    2017-01-01

    Many biological sequences have a segmental structure that can provide valuable clues to their content, structure, and function. The program changept is a tool for investigating the segmental structure of a sequence, and can also be applied to multiple sequences in parallel to identify a common segmental structure, thus providing a method for integrating multiple data types to identify functional elements in genomes. In the previous edition of this book, a command line interface for changept is described. Here we present a graphical user interface for this package, called changeptGUI. This interface also includes tools for pre- and post-processing of data and results to facilitate investigation of the number and characteristics of segment classes.

  4. Genetic analysis of a Chinese family with members affected with Usher syndrome type II and Waardenburg syndrome type IV.

    PubMed

    Wang, Xueling; Lin, Xiao-Jiang; Tang, Xiangrong; Chai, Yong-Chuan; Yu, De-Hong; Chen, Dong-Ye; Wu, Hao

    2017-11-01

    The purpose of this study was to identify the genetic causes of a family presenting with multiple symptoms overlapping Usher syndrome type II (USH2) and Waardenburg syndrome type IV (WS4). Targeted next-generation sequencing including the exon and flanking intron sequences of 79 deafness genes was performed on the proband. Co-segregation of the disease phenotype and the detected variants was confirmed in all family members by PCR amplification and Sanger sequencing. The affected members of this family had two different recessive disorders, USH2 and WS4. By targeted next-generation sequencing, we identified that USH2 was caused by a novel missense mutation, p.V4907D in GPR98, whereas WS4 was due to p.V185M in EDNRB. This is the first report of a homozygous p.V185M mutation in EDNRB in a patient with WS4. This study reported a Chinese family with multiple independent and overlapping phenotypes. In this condition, molecular level analysis was efficient in identifying the causative variants p.V4907D in GPR98 and p.V185M in EDNRB, and was also helpful in confirming the clinical diagnosis of USH2 and WS4. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. The Genome sequences of four non-human/non-clinical Salmonella enterica serovar Kentucky ST198 isolates recovered between 1972 and 1973

    USDA-ARS's Scientific Manuscript database

    Salmonella Kentucky is a polyphyletic member of S. enterica subclade A1 with multiple sequence types that often colonize the same hosts but in different frequencies on different continents. To evaluate the genomic features involved in S. Kentucky host specificity we sequenced the genomes of four iso...

  6. Multilocus sequence typing (MLST) analysis of Propionibacterium acnes isolates from radical prostatectomy specimens.

    PubMed

    Mak, Tim N; Yu, Shu-Han; De Marzo, Angelo M; Brüggemann, Holger; Sfanos, Karen S

    2013-05-01

    Inflammation is commonly observed in radical prostatectomy specimens, and evidence suggests that inflammation may contribute to prostate carcinogenesis. Multiple microorganisms have been implicated in serving as a stimulus for prostatic inflammation. The pro-inflammatory anaerobe, Propionibacterium acnes, is ubiquitously found on human skin and is associated with the skin disease acne vulgaris. Recent studies have shown that P. acnes can be detected in prostatectomy specimens by bacterial culture or by culture-independent molecular techniques. Radical prostatectomy tissue samples were obtained from 30 prostate cancer patients and subject to both aerobic and anaerobic culture. Cultured species were identified by 16S rDNA gene sequencing. Propionibacterium acnes isolates were typed using multilocus sequence typing (MLST). Our study confirmed that P. acnes can be readily cultured from prostatectomy tissues (7 of 30 cases, 23%). In some cases, multiple isolates of P. acnes were cultured as well as other Propionibacterium species, such as P. granulosum and P. avidum. Overall, 9 of 30 cases (30%) were positive for Propionibacterium spp. MLST analyses identified eight different sequence types (STs) among prostate-derived P. acnes isolates. These STs belong to two clonal complexes, namely CC36 (type I-2) and CC53/60 (type II), or are CC53/60-related singletons. MLST typing results indicated that prostate-derived P. acnes isolates do not fall within the typical skin/acne STs, but rather are characteristic of STs associated with opportunistic infections and/or urethral flora. The MLST typing results argue against the likelihood that prostatectomy-derived P. acnes isolates represent contamination from skin flora. Copyright © 2012 Wiley Periodicals, Inc.

  7. Sleep-stage sequencing of sleep-onset REM periods in MSLT predicts treatment response in patients with narcolepsy.

    PubMed

    Drakatos, Panagis; Patel, Kishankumar; Thakrar, Chiraag; Williams, Adrian J; Kent, Brian D; Leschziner, Guy D

    2016-04-01

    Current treatment recommendations for narcolepsy suggest that modafinil should be used as a first-line treatment ahead of conventional stimulants or sodium oxybate. In this study, performed in a tertiary sleep disorders centre, treatment responses were examined following these recommendations, together with the ability of sleep-stage sequencing of sleep-onset rapid eye movement periods in the multiple sleep latency test to predict treatment response. Over a 3.5-year period, 255 patients were retrospectively identified in the authors' database as patients diagnosed with narcolepsy, type 1 (with cataplexy) or type 2 (without) using clinical and polysomnographic criteria. Eligible patients were examined in detail, sleep study data were abstracted and sleep-stage sequencing of sleep-onset rapid eye movement periods was analysed. Response to treatment was graded utilizing an internally developed scale. Seventy-five patients were included (39% males). Forty (53%) were diagnosed with type 1 narcolepsy with a mean follow-up of 2.37 ± 1.35 years. Ninety-seven percent of the patients were initially started on modafinil, and overall 59% reported complete response on the last follow-up. Twenty-nine patients (39%) had the sequence of sleep stage 1 or wake to rapid eye movement in all of their sleep-onset rapid eye movement periods, with most of these diagnosed as narcolepsy type 1 (72%). The presence of this specific sleep-stage sequence in all sleep-onset rapid eye movement periods was associated with worse treatment response (P = 0.0023). Sleep-stage sequence analysis of sleep-onset rapid eye movement periods in the multiple sleep latency test may aid the prediction of treatment response in narcoleptics and provide a useful prognostic tool in clinical practice, above and beyond their classification as narcolepsy type 1 or 2. © 2015 European Sleep Research Society.

  8. Prediction of beta-turns and beta-turn types by a novel bidirectional Elman-type recurrent neural network with multiple output layers (MOLEBRNN).

    PubMed

    Kirschner, Andreas; Frishman, Dmitrij

    2008-10-01

    Prediction of beta-turns from amino acid sequences has long been recognized as an important problem in structural bioinformatics due to their frequent occurrence as well as their structural and functional significance. Because various structural features of proteins are intercorrelated, secondary structure information has often been employed as an additional input for machine learning algorithms while predicting beta-turns. Here we present a novel bidirectional Elman-type recurrent neural network with multiple output layers (MOLEBRNN) capable of predicting multiple mutually dependent structural motifs and demonstrate its efficiency in recognizing three aspects of protein structure: beta-turns, beta-turn types, and secondary structure. The advantage of our method compared to other predictors is that it does not require any external input except for sequence profiles because interdependencies between different structural features are taken into account implicitly during the learning process. In a sevenfold cross-validation experiment on a standard test dataset our method exhibits a total prediction accuracy of 77.9% and a Matthews correlation coefficient of 0.45, the highest performance reported so far. It also outperforms other known methods in delineating individual turn types. We demonstrate how simultaneous prediction of multiple targets influences prediction performance on single targets. The MOLEBRNN presented here is a generic method applicable in a variety of research fields where multiple mutually dependent target classes need to be predicted. http://webclu.bio.wzw.tum.de/predator-web/.
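
    For readers unfamiliar with the metric quoted in this abstract, the Matthews correlation coefficient is computed from the four confusion-matrix counts of a binary classifier. The sketch below is a generic illustration; the function name and the counts used with it are hypothetical and are not taken from the paper.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp, tn, fp, fn are true/false positive/negative counts. The value ranges
    from -1 (total disagreement) through 0 (chance level) to +1 (perfect).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

    Unlike plain accuracy, this coefficient stays near zero for a predictor that simply favours the majority class, which is why it is commonly reported for turn/non-turn prediction where the classes are imbalanced.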

  9. NGSCheckMate: software for validating sample identity in next-generation sequencing studies within and across data types.

    PubMed

    Lee, Sejoon; Lee, Soohyun; Ouellette, Scott; Park, Woong-Yang; Lee, Eunjung A; Park, Peter J

    2017-06-20

    In many next-generation sequencing (NGS) studies, multiple samples or data types are profiled for each individual. An important quality control (QC) step in these studies is to ensure that datasets from the same subject are properly paired. Given the heterogeneity of data types, file types and sequencing depths in a multi-dimensional study, a robust program that provides a standardized metric for genotype comparisons would be useful. Here, we describe NGSCheckMate, a user-friendly software package for verifying sample identities from FASTQ, BAM or VCF files. This tool uses a model-based method to compare allele read fractions at known single-nucleotide polymorphisms, considering depth-dependent behavior of similarity metrics for identical and unrelated samples. Our evaluation shows that NGSCheckMate is effective for a variety of data types, including exome sequencing, whole-genome sequencing, RNA-seq, ChIP-seq, targeted sequencing and single-cell whole-genome sequencing, with a minimal requirement for sequencing depth (>0.5X). An alignment-free module can be run directly on FASTQ files for a quick initial check. We recommend using this software as a QC step in NGS studies. https://github.com/parklab/NGSCheckMate. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Rather than by direct acquisition via lateral gene transfer, GHF5 cellulases were passed on from early Pratylenchidae to root-knot and cyst nematodes.

    PubMed

    Rybarczyk-Mydłowska, Katarzyna; Maboreke, Hazel Ruvimbo; van Megen, Hanny; van den Elsen, Sven; Mooyman, Paul; Smant, Geert; Bakker, Jaap; Helder, Johannes

    2012-11-21

    Plant parasitic nematodes are unusual Metazoans as they are equipped with genes that allow for symbiont-independent degradation of plant cell walls. Among the cell wall-degrading enzymes, glycoside hydrolase family 5 (GHF5) cellulases are relatively well characterized, especially for high impact parasites such as root-knot and cyst nematodes. Interestingly, ancestors of extant nematodes most likely acquired these GHF5 cellulases from a prokaryote donor by one or multiple lateral gene transfer events. To obtain insight into the origin of GHF5 cellulases among evolutionary advanced members of the order Tylenchida, cellulase biodiversity data from less distal family members were collected and analyzed. Single nematodes were used to obtain (partial) genomic sequences of cellulases from representatives of the genera Meloidogyne, Pratylenchus, Hirschmanniella and Globodera. Combined Bayesian analysis of ≈ 100 cellulase sequences revealed three types of catalytic domains (A, B, and C). Represented by 84 sequences, type B is numerically dominant, and the overall topology of the catalytic domain type shows remarkable resemblance with trees based on neutral (= pathogenicity-unrelated) small subunit ribosomal DNA sequences. Bayesian analysis further suggested a sister relationship between the lesion nematode Pratylenchus thornei and all type B cellulases from root-knot nematodes. Yet, the relationship between the three catalytic domain types remained unclear. Superposition of intron data onto the cellulase tree suggests that types B and C are related, and together distinct from type A that is characterized by two unique introns. All Tylenchida members investigated here harbored one or multiple GHF5 cellulases. Three types of catalytic domains are distinguished, and the presence of at least two types is relatively common among plant parasitic Tylenchida. 
    Analysis of coding sequences of cellulases suggests that root-knot and cyst nematodes did not acquire this gene directly by lateral gene transfer. More likely, these genes were passed on by ancestors of a family nowadays known as the Pratylenchidae.

  11. BlockLogo: visualization of peptide and sequence motif conservation

    PubMed Central

    Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian; Sun, Jing; Schönbach, Christian; Reinherz, Ellis L.; Zhang, Guang Lan; Brusic, Vladimir

    2013-01-01

    BlockLogo is a web-server application for visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, selection of motif positions, type of sequence, and output format definition. The output has BlockLogo along with the sequence logo, and a table of motif frequencies. We deployed BlockLogo as an online application and have demonstrated its utility through examples that show visualization of T-cell epitopes and B-cell epitopes (both continuous and discontinuous). Our additional example shows a visualization and analysis of structural motifs that determine specificity of peptide binding to HLA-DR molecules. The BlockLogo server also employs selected experimentally validated prediction algorithms to enable on-the-fly prediction of MHC binding affinity to 15 common HLA class I and class II alleles as well as visual analysis of discontinuous epitopes from multiple sequence alignments. It enables the visualization and analysis of structural and functional motifs that are usually described as regular expressions. It provides a compact view of discontinuous motifs composed of distant positions within biological sequences. BlockLogo is available at: http://research4.dfci.harvard.edu/cvc/blocklogo/ and http://methilab.bu.edu/blocklogo/ PMID:24001880
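
    The block entropy mentioned in this abstract can be illustrated with a generic Shannon-entropy calculation over the blocks formed by selected alignment columns. This is a minimal sketch under that assumption, not BlockLogo's actual implementation; the function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def block_entropy(alignment, positions):
    """Shannon entropy (in bits) of the blocks found at the given columns.

    alignment: list of equal-length aligned sequences (strings).
    positions: 0-based column indices; they need not be contiguous, so
    discontinuous motifs are handled the same way as continuous ones.
    """
    # One block per sequence: the residues at the selected columns, in order.
    blocks = ["".join(seq[i] for i in positions) for seq in alignment]
    counts = Counter(blocks)
    total = len(blocks)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy alignment: columns 1 and 3 give the blocks CE, CE, GE, GF.
toy = ["ACDE", "ACDE", "AGDE", "AGDF"]
entropy_bits = block_entropy(toy, [1, 3])
```

    Treating the selected columns as one unit, rather than summing per-column entropies, keeps the linkage between positions, which is the point of displaying whole blocks in a logo.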

  12. Divergent nuclear 18S rDNA paralogs in a turkey coccidium, Eimeria meleagrimitis, complicate molecular systematics and identification.

    PubMed

    El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R

    2013-07-01

    Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
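
    The identity and divergence figures in this abstract are complementary: a mean identity of 97.4% corresponds to a divergence of about 2.6%. A minimal sketch of a pairwise percent-identity calculation on two pre-aligned sequences follows; skipping gapped columns is an assumption made here for illustration, since the abstract does not state how gaps were treated, and the function name is hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap are skipped (an assumption of
    this sketch; other conventions count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

    Divergence is then simply 100 minus this value.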

  13. Development of Mycoplasma synoviae (MS) core genome multilocus sequence typing (cgMLST) scheme.

    PubMed

    Ghanem, Mostafa; El-Gazzar, Mohamed

    2018-05-01

    Mycoplasma synoviae (MS) is a poultry pathogen with reported increased prevalence and virulence in recent years. MS strain identification is essential for prevention, control efforts and epidemiological outbreak investigations. Multiple multilocus-based sequence typing schemes have been developed for MS, yet the resolution of these schemes could be limited for outbreak investigation. The cost of whole genome sequencing has become close to that of sequencing the seven MLST targets; however, there is no standardized method for typing MS strains based on whole genome sequences. In this paper, we propose a core genome multilocus sequence typing (cgMLST) scheme as a standardized and reproducible method for typing MS based on whole genome sequences. A diverse set of 25 MS whole genome sequences was used to identify 302 core genome genes as cgMLST targets (35.5% of MS genome), and 44 whole genome sequences of MS isolates from six countries in four continents were typed by applying this scheme. cgMLST-based phylogenetic trees displayed a high degree of agreement with core genome SNP-based analysis and available epidemiological information. cgMLST allowed evaluation of two conventional MLST schemes of MS. The high discriminatory power of cgMLST allowed differentiation between samples of the same conventional MLST type. cgMLST represents a standardized, accurate, highly discriminatory, and reproducible method for differentiation between MS isolates. Like conventional MLST, it provides stable and expandable nomenclature, allowing for comparing and sharing the typing results between different laboratories worldwide. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.

  14. HPV-QUEST: A highly customized system for automated HPV sequence analysis capable of processing Next Generation sequencing data set.

    PubMed

    Yin, Li; Yao, Jiqiang; Gardner, Brent P; Chang, Kaifen; Yu, Fahong; Goodenow, Maureen M

    2012-01-01

    Next Generation sequencing (NGS) applied to human papilloma viruses (HPV) can provide sensitive methods to investigate the molecular epidemiology of multiple type HPV infection. Currently a genotyping system with a comprehensive collection of updated HPV reference sequences and a capacity to handle NGS data sets is lacking. HPV-QUEST was developed as an automated and rapid HPV genotyping system. The web-based HPV-QUEST subtyping algorithm was developed using HTML, PHP, the Perl scripting language, and MySQL as the database backend. HPV-QUEST includes a database of annotated HPV reference sequences with updated nomenclature covering 5 genera, 14 species and 150 mucosal and cutaneous types to genotype BLAST-searched query sequences. HPV-QUEST processes up to 10 megabases of sequences within 1 to 2 minutes. Results are reported in HTML, text and Excel formats and display e-value, BLAST score, and local and coverage identities; provide genus, species, type, infection site and risk for the best matched reference HPV sequence; and produce results ready for additional analyses.

  15. A VLT/NACO survey for triple and quadruple systems among visual pre-main sequence binaries

    NASA Astrophysics Data System (ADS)

    Correia, S.; Zinnecker, H.; Ratzka, Th.; Sterzik, M. F.

    2006-12-01

    Aims: This paper describes a systematic search for high-order multiplicity among wide visual Pre-Main Sequence (PMS) binaries. Methods: We conducted an Adaptive Optics survey of a sample of 58 PMS wide binaries from various star-forming regions, which include 52 T Tauri systems with mostly K- and M-type primaries, with the NIR instrument NACO at the VLT. Results: Of these 52 systems, 7 are found to be triple (2 new) and 7 quadruple (1 new). The new close companions are most likely physically bound based on their probability of chance projection and, for some of them, on their position on a color-color diagram. The corresponding degree of multiplicity among wide binaries (number of triples and quadruples divided by the number of systems) is 26.9 ± 7.2% in the projected separation range ~0.07''-12'', with the largest contribution from the Taurus-Auriga cloud. We also found that this degree of multiplicity is twice as high in Taurus as in Ophiuchus and Chamaeleon, for which the same number of sources are present in our sample. Considering a restricted sample composed of systems at distance 140-190 pc, the degree of multiplicity is 26.8 ± 8.1%, in the separation range 10/14 AU-1700/2300 AU (30 binaries, 5 triples, 6 quadruples). The observed frequency agrees with results from previous multiplicity surveys within the uncertainties, although a significant overabundance of quadruple systems compared to triple systems is apparent. Tentatively including the spectroscopic pairs in our restricted sample and comparing the multiplicity fractions to those measured for solar-type main-sequence stars in the solar neighborhood leads to the conclusion that both the ratio of triples to binaries and the ratio of quadruples to triples seem to be in excess among young stars.
    Most of the current numerical simulations of multiple star formation, and especially smoothed particle hydrodynamics simulations, over-predict the fraction of high-order multiplicity when compared to our results. The circumstellar properties around the individual components of our high-order multiple systems tend to favor mixed systems (i.e. systems including components of wTTS and cTTS type), which is in general agreement with previous studies of disks in binaries, with the exception of Taurus, where we find a preponderance of similar types of components among the multiples studied.
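
    The degree of multiplicity defined in this abstract (triples plus quadruples divided by the number of systems) can be checked directly from the counts it reports. In the sketch below the uncertainty is taken as a Poisson error, sqrt(N)/total; this is an assumption inferred from the quoted values, not something the abstract states, and the function name is hypothetical.

```python
import math

def multiplicity_fraction(n_triples, n_quadruples, n_systems):
    """Degree of multiplicity and an assumed Poisson uncertainty.

    Returns (fraction, error), both as fractions of n_systems, where
    error = sqrt(n_triples + n_quadruples) / n_systems.
    """
    n_high = n_triples + n_quadruples
    fraction = n_high / n_systems
    error = math.sqrt(n_high) / n_systems
    return fraction, error

# Full sample: 7 triples and 7 quadruples among 52 systems.
full = multiplicity_fraction(7, 7, 52)

# Restricted sample: 30 binaries, 5 triples, 6 quadruples (41 systems).
restricted = multiplicity_fraction(5, 6, 30 + 5 + 6)
```

    Rounded to one decimal place, these give 26.9 ± 7.2% and 26.8 ± 8.1%, matching the values quoted in the abstract.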

  16. Distribution of Bartonella henselae Variants in Patients, Reservoir Hosts and Vectors in Spain

    PubMed Central

    Gil, Horacio; Escudero, Raquel; Pons, Inmaculada; Rodríguez-Vargas, Manuela; García-Esteban, Coral; Rodríguez-Moreno, Isabel; García-Amil, Cristina; Lobo, Bruno; Valcárcel, Félix; Pérez, Azucena; Jiménez, Santos; Jado, Isabel; Juste, Ramón; Segura, Ferrán; Anda, Pedro

    2013-01-01

    We have studied the diversity of B. henselae circulating in patients, reservoir hosts and vectors in Spain. In total, we have fully characterized 53 clinical samples from 46 patients, as well as 78 B. henselae isolates obtained from 35 cats from La Rioja and Catalonia (northeastern Spain), four positive cat blood samples from which no isolates were obtained, and three positive fleas by Multiple Locus Sequence Typing and Multiple Locus Variable Number Tandem Repeats Analysis (MLVA). This study represents the largest series of human cases characterized with these methods, with 10 different sequence types and 41 MLVA profiles. Two of the sequence types and 35 of the profiles were not described previously. Most of the B. henselae variants belonged to ST5. Also, we have identified a common profile (72) which is well distributed in Spain and was found to persist over time. Indeed, this profile seems to be the origin from which most of the variants identified in this study have been generated. In addition, ST5, ST6 and ST9 were found associated with felines, whereas ST1, ST5 and ST8 were the most frequent sequence types found infecting humans. Interestingly, some of the feline-associated variants never found in patients were located in a separate clade, which could represent a group of strains less pathogenic for humans. PMID:23874563

  17. Duplication and concerted evolution of MiSp-encoding genes underlie the material properties of minor ampullate silks of cobweb weaving spiders.

    PubMed

    Vienneau-Hathaway, Jannelle M; Brassfield, Elizabeth R; Lane, Amanda Kelly; Collin, Matthew A; Correa-Garhwal, Sandra M; Clarke, Thomas H; Schwager, Evelyn E; Garb, Jessica E; Hayashi, Cheryl Y; Ayoub, Nadia A

    2017-03-14

    Orb-web weaving spiders and their relatives use multiple types of task-specific silks. The majority of spider silk studies have focused on the ultra-tough dragline silk synthesized in major ampullate glands, but other silk types have impressive material properties. For instance, minor ampullate silks of orb-web weaving spiders are as tough as draglines, due to their higher extensibility despite lower strength. Differences in material properties between silk types result from differences in their component proteins, particularly members of the spidroin (spider fibroin) gene family. However, the extent to which variation in material properties within a single silk type can be explained by variation in spidroin sequences is unknown. Here, we compare the minor ampullate spidroins (MiSp) of orb-weavers and cobweb weavers. Orb-web weavers use minor ampullate silk to form the auxiliary spiral of the orb-web while cobweb weavers use it to wrap prey, suggesting that selection pressures on minor ampullate spidroins (MiSp) may differ between the two groups. We report complete or nearly complete MiSp sequences from five cobweb weaving spider species and measure material properties of minor ampullate silks in a subset of these species. We also compare MiSp sequences and silk properties of our cobweb weavers to published data for orb-web weavers. We demonstrate that all our cobweb weavers possess multiple MiSp loci and that one locus is more highly expressed in at least two species. We also find that the proportion of β-spiral-forming amino acid motifs in MiSp positively correlates with minor ampullate silk extensibility across orb-web and cobweb weavers. MiSp sequences vary dramatically within and among spider species, and have likely been subject to multiple rounds of gene duplication and concerted evolution, which have contributed to the diverse material properties of minor ampullate silks. 
    Our sequences also provide templates for recombinant silk proteins with tailored properties.

  18. Genome Sequences of Mycobacteriophages Amgine, Amohnition, Bella96, Cain, DarthP, Hammy, Krueger, LastHope, Peanam, PhelpsODU, Phrank, SirPhilip, Slimphazie, and Unicorn

    PubMed Central

    Anders, Kirk R.; Mavrodi, Dmitri V.; Vazquez, Edwin; Amoh, Nana Yaa A.; Baliraine, Frederick N.; Buchser, William J.; Cast, Thomas P.; Chamberlain, Carmen E.; Chung, Hui-Min; D’Angelo, William A.; Farris, Christian T.; Fernandez-Martinez, Mariceli; Fischman, Haley D.; Forsyth, Mark H.; Fortier, Anna G.; Gallo, Kara F.; Held, Greta J.; Lomas, Miguel A.; Maldonado-Vazquez, Natalia Y.; Moonsammy, Claudia H.; Namboote, Peace; Paudel, Sudip; Reyes, Gabriella M.; Rubin, Michael R.; Saha, Margaret S.; Stukey, Joseph; Tobias, Tristan D.; Garlena, Rebecca A.; Stoner, Ty H.; Russell, Daniel A.

    2017-01-01

    We report the genome sequences of 14 cluster K mycobacteriophages isolated using Mycobacterium smegmatis mc²155 as host. Four are closely related to subcluster K1 phages, and 10 are members of subcluster K6. The phage genomes span considerable sequence diversity, including multiple types of integrases and integration sites. PMID:29217790

  19. Efficient Processing of the Immunodominant, HLA-A*0201-Restricted Human Immunodeficiency Virus Type 1 Cytotoxic T-Lymphocyte Epitope despite Multiple Variations in the Epitope Flanking Sequences

    PubMed Central

    Brander, Christian; Yang, Otto O.; Jones, Norman G.; Lee, Yun; Goulder, Philip; Johnson, R. Paul; Trocha, Alicja; Colbert, David; Hay, Christine; Buchbinder, Susan; Bergmann, Cornelia C.; Zweerink, Hans J.; Wolinsky, Steven; Blattner, William A.; Kalams, Spyros A.; Walker, Bruce D.

    1999-01-01

    Immune escape from cytotoxic T-lymphocyte (CTL) responses has been shown to occur not only by changes within the targeted epitope but also by changes in the flanking sequences which interfere with the processing of the immunogenic peptide. However, the frequency of such an escape mechanism has not been determined. To investigate whether naturally occurring variations in the flanking sequences of an immunodominant human immunodeficiency virus type 1 (HIV-1) Gag CTL epitope prevent antigen processing, cells infected with HIV-1 or vaccinia virus constructs encoding different patient-derived Gag sequences were tested for recognition by HLA-A*0201-restricted, p17-specific CTL. We found that the immunodominant p17 epitope (SL9) and its variants were efficiently processed from minigene expressing vectors and from six HIV-1 Gag variants expressed by recombinant vaccinia virus constructs. Furthermore, SL9-specific CTL clones derived from multiple donors efficiently inhibited virus replication when added to HLA-A*0201-bearing cells infected with primary or laboratory-adapted strains of virus, despite the variability in the SL9 flanking sequences. These data suggest that escape from this immunodominant CTL response is not frequently accomplished by changes in the epitope flanking sequences. PMID:10559335

  20. A functional U-statistic method for association analysis of sequencing data.

    PubMed

    Jadhav, Sneha; Tong, Xiaoran; Lu, Qing

    2017-11-01

    Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanism. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we apply our method to the sequencing data from Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, associated with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.

  1. The genetic structure of the A mating-type locus of Lentinula edodes.

    PubMed

    Au, Chun Hang; Wong, Man Chun; Bao, Dapeng; Zhang, Meiyan; Song, Chunyan; Song, Wenhua; Law, Patrick Tik Wan; Kües, Ursula; Kwan, Hoi Shan

    2014-02-10

    The Shiitake mushroom, Lentinula edodes (Berk.) Pegler is a tetrapolar basidiomycete with two unlinked mating-type loci, commonly called the A and B loci. Identifying the mating-types in shiitake is important for enhancing the breeding and cultivation of this economically-important edible mushroom. Here, we identified the A mating-type locus from the first draft genome sequence of L. edodes and characterized multiple alleles from different monokaryotic strains. Two intron-length polymorphism markers were developed to facilitate rapid molecular determination of A mating-type. L. edodes sequences were compared with those of known tetrapolar and bipolar basidiomycete species. The A mating-type genes are conserved at the homeodomain region across the order Agaricales. However, we observed unique genomic organization of the locus in L. edodes which exhibits atypical gene order and multiple repetitive elements around its A locus. To our knowledge, this is the first known exception among Homobasidiomycetes, in which the mitochondrial intermediate peptidase (mip) gene is not closely linked to A locus. Copyright © 2013 Elsevier B.V. All rights reserved.

  2. Sequences of multiple bacterial genomes and a Chlamydia trachomatis genotype from direct sequencing of DNA derived from a vaginal swab diagnostic specimen.

    PubMed

    Andersson, P; Klein, M; Lilliebridge, R A; Giffard, P M

    2013-09-01

    Ultra-deep Illumina sequencing was performed on whole genome amplified DNA derived from a Chlamydia trachomatis-positive vaginal swab. Alignment of reads with reference genomes allowed robust SNP identification from the C. trachomatis chromosome and plasmid. This revealed that the C. trachomatis in the specimen was very closely related to the sequenced urogenital, serovar F, clade T1 isolate F-SW4. In addition, high genome-wide coverage was obtained for Prevotella melaninogenica, Gardnerella vaginalis, Clostridiales genomosp. BVAB3 and Mycoplasma hominis. This illustrates the potential of metagenome data to provide high resolution bacterial typing data from multiple taxa in a diagnostic specimen. ©2013 The Authors Clinical Microbiology and Infection ©2013 European Society of Clinical Microbiology and Infectious Diseases.

  3. Multiple alignment-free sequence comparison

    PubMed Central

    Ren, Jie; Song, Kai; Sun, Fengzhu; Deng, Minghua; Reinert, Gesine

    2013-01-01

    Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, and , extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences, and second, , and , averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences. Results: Our investigation uses both simulated data and cis-regulatory module data, where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although for real data all of our statistics show a similar performance, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics. Availability: Our implementation of the five statistics is available as an R package named ‘multiAlignFree’ at http://www-rcf.usc.edu/∼fsun/Programs/multiAlignFree/multiAlignFreemain.html. Contact: reinert@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23990418
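    As a rough illustration of what "k-tuple word content" means in the abstract above, the sketch below counts overlapping k-tuples and computes the classic uncentred D2 statistic (the sum over words of the product of their counts in two sequences), then averages it over all pairs in a set. This is an assumption-laden toy, not the multiAlignFree package: the statistics in the paper are centred and normalised variants that are not reproduced here, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def ktuple_counts(seq, k):
    """Count every overlapping k-tuple (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Uncentred D2: sum over words of count-in-A times count-in-B."""
    a, b = ktuple_counts(seq_a, k), ktuple_counts(seq_b, k)
    return sum(n * b[word] for word, n in a.items())


def mean_pairwise_d2(seqs, k):
    """Average D2 over all unordered pairs of sequences (needs >= 2 sequences)."""
    pairs = list(combinations(seqs, 2))
    return sum(d2(x, y, k) for x, y in pairs) / len(pairs)
```

    A larger value indicates more shared word content; raw D2 grows with sequence length, which is one reason normalised variants are preferred in practice.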

  4. Wide distribution of O157-antigen biosynthesis gene clusters in Escherichia coli.

    PubMed

    Iguchi, Atsushi; Shirai, Hiroki; Seto, Kazuko; Ooka, Tadasuke; Ogura, Yoshitoshi; Hayashi, Tetsuya; Osawa, Kayo; Osawa, Ro

    2011-01-01

    Most Escherichia coli O157-serogroup strains are classified as enterohemorrhagic E. coli (EHEC), which is known as an important food-borne pathogen for humans. They usually produce Shiga toxin (Stx) 1 and/or Stx2, and express H7-flagella antigen (or nonmotile). However, O157 strains that do not produce Stxs and express H antigens different from H7 are sometimes isolated from clinical and other sources. Multilocus sequence analysis revealed that these 21 O157:non-H7 strains tested in this study belong to multiple evolutionary lineages different from that of EHEC O157:H7 strains, suggesting a wide distribution of the gene set encoding the O157-antigen biosynthesis in multiple lineages. To gain insight into the gene organization and the sequence similarity of the O157-antigen biosynthesis gene clusters, we conducted genomic comparisons of the chromosomal regions (about 59 kb in each strain) covering the O-antigen gene cluster and its flanking regions between six O157:H7/non-H7 strains. Gene organization of the O157-antigen gene cluster was identical among O157:H7/non-H7 strains, but was divided into two distinct types at the nucleotide sequence level. Interestingly, distribution of the two types did not clearly follow the evolutionary lineages of the strains, suggesting that horizontal gene transfer of both types of O157-antigen gene clusters has occurred independently among E. coli strains. Additionally, detailed sequence comparison revealed that some positions of the repetitive extragenic palindromic (REP) sequences in the regions flanking the O-antigen gene clusters were coincident with possible recombination points. From these results, we conclude that the horizontal transfer of the O157-antigen gene clusters induced the emergence of multiple O157 lineages within E. coli and speculate that REP sequences may involve one of the driving forces for exchange and evolution of O-antigen loci.

  5. FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation.

    PubMed

    Bolleman, Jerven T; Mungall, Christopher J; Strozzi, Francesco; Baran, Joachim; Dumontier, Michel; Bonnal, Raoul J P; Buels, Robert; Hoehndorf, Robert; Fujisawa, Takatomo; Katayama, Toshiaki; Cock, Peter J A

    2016-06-13

    Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. We have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned "omics" areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe - and potentially merge - sequence annotations from multiple sources. Data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
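    To make the "subject-predicate-object triples" idea concrete, the sketch below builds the triples that would place one feature on a reference sequence with a begin and an end position. It uses plain Python tuples rather than an RDF library. The FALDO term names (location, Region, ExactPosition, begin, end, position, reference) and the namespace are written from memory and should be checked against the published ontology; the function name and the example URIs are hypothetical.

```python
FALDO = "http://biohackathon.org/resource/faldo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def region_triples(feature, reference, begin, end):
    """Return (subject, predicate, object) triples locating feature on reference.

    feature and reference are URIs (strings); begin and end are integer
    coordinates. Intermediate node URIs are derived from the feature URI,
    which is an illustrative convention, not part of FALDO.
    """
    region = feature + "/region"
    triples = [
        (feature, FALDO + "location", region),
        (region, RDF_TYPE, FALDO + "Region"),
    ]
    for name, coord in (("begin", begin), ("end", end)):
        pos = f"{region}/{name}"
        triples += [
            (region, FALDO + name, pos),
            (pos, RDF_TYPE, FALDO + "ExactPosition"),
            (pos, FALDO + "position", coord),
            (pos, FALDO + "reference", reference),
        ]
    return triples


if __name__ == "__main__":
    for triple in region_triples("http://example.org/gene1",
                                 "http://example.org/chr1", 100, 250):
        print(triple)
```

    Because each position node carries its own reference, the same pattern can describe features whose two ends are known with different certainty, or that lie on circular sequences.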

  6. FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation

    DOE PAGES

    Bolleman, Jerven T.; Mungall, Christopher J.; Strozzi, Francesco; ...

    2016-06-13

    Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. In this paper, we have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe – and potentially merge – sequence annotations from multiple sources. Finally, data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.

  7. FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bolleman, Jerven T.; Mungall, Christopher J.; Strozzi, Francesco

    Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. In this paper, we have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned “omics” areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe – and potentially merge – sequence annotations from multiple sources. Finally, data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.

  8. Type-Specific Detection of 30 Oncogenic Human Papillomaviruses by Genotyping both E6 and L1 Genes

    PubMed Central

    Peng, Junping; Gao, Lei; Guo, Junhua; Wang, Ting; Wang, Ling; Yao, Qing; Zhu, Haijun

    2013-01-01

    Human papillomavirus (HPV) is the principal cause of invasive cervical cancer and benign genital lesions. There are currently 30 HPV types linked to cervical cancer. HPV infection also leads to other types of cancer. We developed a 61-plex analysis of these 30 HPV types by examining two genes, E6 and L1, using MassARRAY matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) (PCR-MS). Two hundred samples from homosexual males (HM) were screened by PCR-MS and MY09/MY11 primer set-mediated PCR (MY-PCR) followed by sequencing. One hundred thirty-five formalin-fixed, paraffin-embedded (FFPE) cervical cancer samples were also analyzed by PCR-MS, and results were compared to those of the commercially available GenoArray (GA) assay. One or more HPV types were identified in 64.5% (129/200) of the samples from HM. Comprising all 30 HPV types, PCR-MS detected 51.9% (67/129) of samples with multiple HPV types, whereas MY-PCR detected only one single HPV type in these samples. All PCR-MS results were confirmed by MY-PCR. In the cervical cancer samples, PCR-MS and GA detected 97% (131/135) and 90.4% (122/135) of HPV-positive samples, respectively. PCR-MS and GA results were fully concordant for 122 positive and 4 negative samples. The sequencing results for the 9 samples that tested negative by GA were completely concordant with the positive PCR-MS results. Multiple HPV types were identified in 25.2% (34/135) and 55.6% (75/135) of the cervical cancer samples by GA and PCR-MS, respectively, and results were confirmed by sequencing. The new assay allows the genotyping of >1,000 samples per day. It provides a good alternative to current methods, especially for large-scale investigations of multiple HPV infections and degraded FFPE samples. PMID:23152557

  9. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  10. Genome Sequences of Mycobacteriophages Amgine, Amohnition, Bella96, Cain, DarthP, Hammy, Krueger, LastHope, Peanam, PhelpsODU, Phrank, SirPhilip, Slimphazie, and Unicorn.

    PubMed

    Anders, Kirk R; Barekzi, Nazir; Best, Aaron A; Frederick, Gregory D; Mavrodi, Dmitri V; Vazquez, Edwin; Amoh, Nana Yaa A; Baliraine, Frederick N; Buchser, William J; Cast, Thomas P; Chamberlain, Carmen E; Chung, Hui-Min; D'Angelo, William A; Farris, Christian T; Fernandez-Martinez, Mariceli; Fischman, Haley D; Forsyth, Mark H; Fortier, Anna G; Gallo, Kara F; Held, Greta J; Lomas, Miguel A; Maldonado-Vazquez, Natalia Y; Moonsammy, Claudia H; Namboote, Peace; Paudel, Sudip; Polley, Sarah-Elizabeth M; Reyes, Gabriella M; Rubin, Michael R; Saha, Margaret S; Stukey, Joseph; Tobias, Tristan D; Garlena, Rebecca A; Stoner, Ty H; Cresawn, Steven G; Jacobs-Sera, Deborah; Pope, Welkin H; Russell, Daniel A; Hatfull, Graham F

    2017-12-07

    We report the genome sequences of 14 cluster K mycobacteriophages isolated using Mycobacterium smegmatis mc²155 as host. Four are closely related to subcluster K1 phages, and 10 are members of subcluster K6. The phage genomes span considerable sequence diversity, including multiple types of integrases and integration sites. Copyright © 2017 Anders et al.

  11. Mitochondrial sequence divergence among Antarctic killer whale ecotypes is consistent with multiple species.

    PubMed

    LeDuc, Richard G; Robertson, Kelly M; Pitman, Robert L

    2008-08-23

    Recently, three visually distinct forms of killer whales (Orcinus orca) were described from Antarctic waters and designated as types A, B and C. Based on consistent differences in prey selection and habitat preferences, morphological divergence and apparent lack of interbreeding among these broadly sympatric forms, it was suggested that they may represent separate species. To evaluate this hypothesis, we compared complete sequences of the mitochondrial control region from 81 Antarctic killer whale samples, including 9 type A, 18 type B, 47 type C and 7 type-undetermined individuals. We found three fixed differences that separated type A from B and C, and a single fixed difference that separated type C from A and B. These results are consistent with reproductive isolation among the different forms, although caution is needed in drawing further conclusions. Despite dramatic differences in morphology and ecology, the relatively low levels of sequence divergence in Antarctic killer whales indicate that these evolutionary changes occurred relatively rapidly and recently.
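    The notion of a "fixed difference" used above can be sketched in a few lines: an alignment column counts as fixed for a group if every sequence in that group carries the same base and no sequence outside the group carries it. The sequences below are invented toy data, not the killer whale control-region alignment, and the function name is hypothetical.

```python
def fixed_differences(group, others):
    """Return 0-based alignment columns where `group` is monomorphic
    and shares no base with any sequence in `others`.

    All sequences are assumed to be aligned and of equal length.
    """
    sites = []
    for pos in range(len(group[0])):
        inside = {seq[pos] for seq in group}
        outside = {seq[pos] for seq in others}
        if len(inside) == 1 and inside.isdisjoint(outside):
            sites.append(pos)
    return sites


if __name__ == "__main__":
    # Toy aligned sequences, made up for illustration only.
    type_a = ["ACGTA", "ACGTA"]
    type_b = ["ACCTA", "ACCTG"]
    type_c = ["TCCTA", "TCCTA"]
    print(fixed_differences(type_a, type_b + type_c))
    print(fixed_differences(type_c, type_a + type_b))
```

    With small sample sizes, a column can look fixed simply because a rare variant was not sampled, which is consistent with the caution the authors express.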

  12. Several Families of Sequences with Low Correlation and Large Linear Span

    NASA Astrophysics Data System (ADS)

    Zeng, Fanxin; Zhang, Zhenyu

    In DS-CDMA systems and DS-UWB radios, low correlation of spreading sequences can greatly help to minimize multiple access interference (MAI) and large linear span of spreading sequences can reduce their predictability. In this letter, new sequence sets with low correlation and large linear span are proposed. Based on the construction $\mathrm{Tr}_1^m[\mathrm{Tr}_m^n(\alpha^{bt}+\gamma_i\alpha^{dt})]^r$ for generating $p$-ary sequences of period $p^n-1$, where $n=2m$, $d=up^m\pm v$, $b=u\pm v$, $\gamma_i\in GF(p^n)$, and $p$ is an arbitrary prime number, several methods to choose the parameter $d$ are provided. The obtained sequences with family size $p^n$ are of four-valued, five-valued, six-valued or seven-valued correlation and the maximum nontrivial correlation value is $(u+v-1)p^m-1$. The simulation by a computer shows that the linear span of the new sequences is larger than that of the sequences with Niho-type and Welch-type decimations, and similar to that of [10].

  13. Multiple Locus Variable-Number Tandem-Repeat and Single-Nucleotide Polymorphism-Based Brucella Typing Reveals Multiple Lineages in Brucella melitensis Currently Endemic in China.

    PubMed

    Sun, Mingjun; Jing, Zhigang; Di, Dongdong; Yan, Hao; Zhang, Zhicheng; Xu, Quangang; Zhang, Xiyue; Wang, Xun; Ni, Bo; Sun, Xiangxiang; Yan, Chengxu; Yang, Zhen; Tian, Lili; Li, Jinping; Fan, Weixing

    2017-01-01

    Brucellosis is a worldwide zoonotic disease caused by Brucella spp. In China, brucellosis is recognized as a reemerging disease caused mainly by Brucella melitensis. To better understand the B. melitensis strains currently endemic in China, three Brucella genotyping methods were applied to 110 B. melitensis strains obtained over the past several years. By MLVA genotyping, five MLVA-8 genotypes were identified, among which genotype 42 (1-5-3-13-2-2-3-2) was the predominant genotype, while genotype 63 (1-5-3-13-2-3-3-2) and a novel genotype, 1-5-3-13-2-4-3-2, were the next most frequently observed. MLVA-16 discerned a total of 57 MLVA-16 genotypes among these Brucella strains, with 41 genotypes detected for the first time and the other 16 previously reported. By BruMLSA21 typing, six sequence types (STs) were identified; among them, ST8 is the most frequently seen in China, while the other five STs were detected for the first time and designated ST137, ST138, ST139, ST140, and ST141 by the international multilocus sequence typing database. Whole-genome sequence (WGS)-single-nucleotide polymorphism (SNP)-based typing and phylogenetic analysis resolved Chinese B. melitensis strains into five clusters, reflecting the existence of multiple lineages among these Chinese B. melitensis strains. Phylogenetically, Chinese lineages are more closely related to strains collected from East Mediterranean and Middle East countries, such as Turkey, Kuwait, and Iraq. In the next few years, MLVA typing will certainly remain an important epidemiological tool for Brucella infection analysis, as it displays a high discriminatory ability and achieves results largely in agreement with WGS-SNP-based typing. However, WGS-SNP-based typing was found to be the most powerful and reliable method for discerning Brucella strains and will be widely used in the future.
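    An MLVA genotype such as those quoted above is an ordered list of repeat counts, one per VNTR locus, so two genotypes can be compared locus by locus. The sketch below parses the three MLVA-8 profiles given in the abstract and reports the positions at which they differ; the function names are hypothetical and the approach is only one simple way to compare profiles.

```python
def parse_profile(text):
    """Turn a dash-separated MLVA profile into a tuple of repeat counts."""
    return tuple(int(x) for x in text.split("-"))


def differing_loci(p, q):
    """Return the 0-based indices of loci whose repeat counts differ."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same number of loci")
    return [i for i, (a, b) in enumerate(zip(p, q)) if a != b]


if __name__ == "__main__":
    genotype_42 = parse_profile("1-5-3-13-2-2-3-2")
    genotype_63 = parse_profile("1-5-3-13-2-3-3-2")
    novel = parse_profile("1-5-3-13-2-4-3-2")
    print(differing_loci(genotype_42, genotype_63))
    print(differing_loci(genotype_42, novel))
```

    For these three profiles the only varying position is the sixth locus, which carries 2, 3 and 4 repeats respectively.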

  14. Molecular Strain Typing of Mycobacterium tuberculosis: a Review of Frequently Used Methods

    PubMed Central

    2016-01-01

    Tuberculosis, caused by the bacterium Mycobacterium tuberculosis, remains one of the most serious global health problems. Molecular typing of M. tuberculosis has been used for various epidemiologic purposes as well as for clinical management. Currently, many techniques are available to type M. tuberculosis. Choosing the most appropriate technique in accordance with the existing laboratory conditions and the specific features of the geographic region is important. Insertion sequence IS6110-based restriction fragment length polymorphism (RFLP) analysis is considered the gold standard for the molecular epidemiologic investigations of tuberculosis. However, other polymerase chain reaction-based methods such as spacer oligonucleotide typing (spoligotyping), which detects 43 spacer sequence-interspersing direct repeats (DRs) in the genomic DR region; mycobacterial interspersed repetitive units–variable number tandem repeats, (MIRU-VNTR), which determines the number and size of tandem repetitive DNA sequences; repetitive-sequence-based PCR (rep-PCR), which provides high-throughput genotypic fingerprinting of multiple Mycobacterium species; and the recently developed genome-based whole genome sequencing methods demonstrate similar discriminatory power and greater convenience. This review focuses on techniques frequently used for the molecular typing of M. tuberculosis and discusses their general aspects and applications. PMID:27709842

  15. Molecular Strain Typing of Mycobacterium tuberculosis: a Review of Frequently Used Methods.

    PubMed

    Ei, Phyu Win; Aung, Wah Wah; Lee, Jong Seok; Choi, Go Eun; Chang, Chulhun L

    2016-11-01

    Tuberculosis, caused by the bacterium Mycobacterium tuberculosis, remains one of the most serious global health problems. Molecular typing of M. tuberculosis has been used for various epidemiologic purposes as well as for clinical management. Currently, many techniques are available to type M. tuberculosis. Choosing the most appropriate technique in accordance with the existing laboratory conditions and the specific features of the geographic region is important. Insertion sequence IS6110-based restriction fragment length polymorphism (RFLP) analysis is considered the gold standard for the molecular epidemiologic investigations of tuberculosis. However, other polymerase chain reaction-based methods such as spacer oligonucleotide typing (spoligotyping), which detects 43 spacer sequence-interspersing direct repeats (DRs) in the genomic DR region; mycobacterial interspersed repetitive units-variable number tandem repeats, (MIRU-VNTR), which determines the number and size of tandem repetitive DNA sequences; repetitive-sequence-based PCR (rep-PCR), which provides high-throughput genotypic fingerprinting of multiple Mycobacterium species; and the recently developed genome-based whole genome sequencing methods demonstrate similar discriminatory power and greater convenience. This review focuses on techniques frequently used for the molecular typing of M. tuberculosis and discusses their general aspects and applications.

  16. Identification of Variable-Number Tandem-Repeat (VNTR) Sequences in Acinetobacter baumannii and Interlaboratory Validation of an Optimized Multiple-Locus VNTR Analysis Typing Scheme▿†

    PubMed Central

    Pourcel, Christine; Minandri, Fabrizia; Hauck, Yolande; D'Arezzo, Silvia; Imperi, Francesco; Vergnaud, Gilles; Visca, Paolo

    2011-01-01

    Acinetobacter baumannii is an important opportunistic pathogen responsible for nosocomial outbreaks, mostly occurring in intensive care units. Due to the multiplicity of infection sources, reliable molecular fingerprinting techniques are needed to establish epidemiological correlations among A. baumannii isolates. Multiple-locus variable-number tandem-repeat analysis (MLVA) has proven to be a fast, reliable, and cost-effective typing method for several bacterial species. In this study, an MLVA assay compatible with simple PCR- and agarose gel-based electrophoresis steps as well as with high-throughput automated methods was developed for A. baumannii typing. Preliminarily, 10 potential polymorphic variable-number tandem repeats (VNTRs) were identified upon bioinformatic screening of six annotated genome sequences of A. baumannii. A collection of 7 reference strains plus 18 well-characterized isolates, including unique types and representatives of the three international A. baumannii lineages, was then evaluated in a two-center study aimed at validating the MLVA assay and comparing it with other genotyping assays, namely, macrorestriction analysis with pulsed-field gel electrophoresis (PFGE) and PCR-based sequence group (SG) profiling. The results showed that MLVA can discriminate between isolates with identical PFGE types and SG profiles. A panel of eight VNTR markers was selected, all showing the ability to be amplified and good amounts of polymorphism in the majority of strains. Independently generated MLVA profiles, composed of an ordered string of allele numbers corresponding to the number of repeats at each VNTR locus, were concordant between centers. Typeability, reproducibility, stability, discriminatory power, and epidemiological concordance were excellent. A database containing information and MLVA profiles for several A. baumannii strains is available from http://mlva.u-psud.fr/. PMID:21147956

  17. Emergence of Functional Hierarchy in a Multiple Timescale Neural Network Model: A Humanoid Robot Experiment

    PubMed Central

    Yamashita, Yuichi; Tani, Jun

    2008-01-01

    It is generally thought that skilled behavior in human beings results from a functional hierarchy of the motor control system, within which reusable motor primitives are flexibly integrated into various sensori-motor sequence patterns. The underlying neural mechanisms governing the way in which continuous sensori-motor flows are segmented into primitives and the way in which series of primitives are integrated into various behavior sequences have, however, not yet been clarified. In earlier studies, this functional hierarchy has been realized through the use of explicit hierarchical structure, with local modules representing motor primitives in the lower level and a higher module representing sequences of primitives switched via additional mechanisms such as gate-selecting. When sequences contain similarities and overlap, however, a conflict arises in such earlier models between generalization and segmentation, induced by this separated modular structure. To address this issue, we propose a different type of neural network model. The current model neither makes use of separate local modules to represent primitives nor introduces explicit hierarchical structure. Rather than forcing architectural hierarchy onto the system, functional hierarchy emerges through a form of self-organization that is based on two distinct types of neurons, each with different time properties (“multiple timescales”). Through the introduction of multiple timescales, continuous sequences of behavior are segmented into reusable primitives, and the primitives, in turn, are flexibly integrated into novel sequences. In experiments, the proposed network model, coordinating the physical body of a humanoid robot through high-dimensional sensori-motor control, also successfully situated itself within a physical environment. Our results suggest that it is not only the spatial connections between neurons but also the timescales of neural activity that act as important mechanisms leading to functional hierarchy in neural systems. PMID:18989398

  18. AgdbNet – antigen sequence database software for bacterial typing

    PubMed Central

    Jolley, Keith A; Maiden, Martin CJ

    2006-01-01

    Background: Bacterial typing schemes based on the sequences of genes encoding surface antigens require databases that provide a uniform, curated, and widely accepted nomenclature of the variants identified. Due to the differences in typing schemes, imposed by the diversity of genes targeted, creating these databases has typically required the writing of one-off code to link the database to a web interface. Here we describe agdbNet, widely applicable web database software that facilitates simultaneous BLAST querying of multiple loci using either nucleotide or peptide sequences. Results: Databases are described by XML files that are parsed by a Perl CGI script. Each database can have any number of loci, which may be defined by nucleotide and/or peptide sequences. The software is currently in use on at least five public databases for the typing of Neisseria meningitidis, Campylobacter jejuni and Streptococcus equi and can be set up to query internal isolate tables or suitably-configured external isolate databases, such as those used for multilocus sequence typing. The style of the resulting website can be fully configured by modifying stylesheets and through the use of customised header and footer files that surround the output of the script. Conclusion: The software provides a rapid means of setting up customised Internet antigen sequence databases. The flexible configuration options enable typing schemes with differing requirements to be accommodated. PMID:16790057

  19. The recent emergence in hospitals of multidrug-resistant community-associated sequence type 1 and spa type t127 methicillin-resistant Staphylococcus aureus investigated by whole-genome sequencing: Implications for screening

    PubMed Central

    Earls, Megan R.; Kinnevey, Peter M.; Brennan, Gráinne I.; Lazaris, Alexandros; Skally, Mairead; O’Connell, Brian; Humphreys, Hilary; Shore, Anna C.

    2017-01-01

    Community-associated spa type t127/t922 methicillin-resistant Staphylococcus aureus (MRSA) prevalence increased from 1%-7% in Ireland between 2010–2015. This study tracked the spread of 89 such isolates from June 2013-June 2016. These included 78 healthcare-associated and 11 community associated-MRSA isolates from a prolonged hospital outbreak (H1) (n = 46), 16 other hospitals (n = 28), four other healthcare facilities (n = 4) and community-associated sources (n = 11). Isolates underwent antimicrobial susceptibility testing, DNA microarray profiling and whole-genome sequencing. Minimum spanning trees were generated following core-genome multilocus sequence typing and pairwise single nucleotide variation (SNV) analysis was performed. All isolates were sequence type 1 MRSA staphylococcal cassette chromosome mec type IV (ST1-MRSA-IV) and 76/89 were multidrug-resistant. Fifty isolates, including 40/46 from H1, were high-level mupirocin-resistant, carrying a conjugative 39 kb iles2-encoding plasmid. Two closely related ST1-MRSA-IV strains (I and II) and multiple sporadic strains were identified. Strain I isolates (57/89), including 43/46 H1 and all high-level mupirocin-resistant isolates, exhibited ≤80 SNVs. Two strain I isolates from separate H1 healthcare workers differed from other H1/strain I isolates by 7–47 and 12–53 SNVs, respectively, indicating healthcare worker involvement in this outbreak. Strain II isolates (19/89), including the remaining H1 isolates, exhibited ≤127 SNVs. For each strain, the pairwise SNVs exhibited by healthcare-associated and community-associated isolates indicated recent transmission of ST1-MRSA-IV within and between multiple hospitals, healthcare facilities and communities in Ireland. Given the interchange between healthcare-associated and community-associated isolates in hospitals, the risk factors that inform screening for MRSA require revision. PMID:28399151

  20. A multilevel ant colony optimization algorithm for classical and isothermic DNA sequencing by hybridization with multiplicity information available.

    PubMed

    Kwarciak, Kamil; Radom, Marcin; Formanowicz, Piotr

    2016-04-01

    Classical sequencing by hybridization relies on binary information about sequence composition: a given element of an oligonucleotide library either is or is not part of the target sequence. However, DNA chip technology has advanced and now makes it possible to obtain partial information about the multiplicity of each oligonucleotide that the analyzed sequence consists of. Currently, it is not possible to obtain exact data of this type, but even partial information should be very useful. Two realistic multiplicity information models are considered in this paper. The first, called "one and many", assumes that it is possible to determine whether a given oligonucleotide occurs in the reconstructed sequence once or more than once. In the second model, called "one, two and many", the biochemical experiment reveals whether a given oligonucleotide is present in the analyzed sequence once, twice or at least three times. An ant colony optimization algorithm has been implemented to verify the above models and to compare it with existing algorithms for sequencing by hybridization that utilize the additional information. The proposed algorithm solves the problem with any kind of hybridization errors. Computational experiment results confirm that using even partial information about multiplicity leads to increased quality of reconstructed sequences. Moreover, they show that the more precise model yields better solutions and that the ant colony optimization algorithm outperforms the existing ones. Test data sets and the proposed ant colony optimization algorithm are available at: http://bioserver.cs.put.poznan.pl/download/ACO4mSBH.zip. Copyright © 2016 Elsevier Ltd. All rights reserved.
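
    The two multiplicity models named in this abstract can be made concrete with a small sketch that builds an idealized, error-free spectrum from a known sequence. The function name, the model keys and the toy sequence are assumptions made for illustration; this is not the authors' ant colony optimization algorithm, only the input such an algorithm would work from.

```python
from collections import Counter


def multiplicity_spectrum(sequence: str, k: int, model: str = "one_and_many") -> dict:
    """Map each k-mer of `sequence` to its multiplicity class under the chosen model."""
    # Count every overlapping oligonucleotide of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # The largest count the model can still distinguish.
    cap = {"one_and_many": 2, "one_two_and_many": 3}[model]
    labels = {1: "one", 2: "two" if cap == 3 else "many", 3: "many"}
    return {oligo: labels[min(n, cap)] for oligo, n in counts.items()}


# Hypothetical toy target: ACG occurs three times, CGA and GAC twice, CGT once.
target = "ACGACGACGT"
coarse = multiplicity_spectrum(target, 3, "one_and_many")
fine = multiplicity_spectrum(target, 3, "one_two_and_many")
```

    Under the coarser model CGA and ACG are indistinguishable ("many"), while the finer model separates "two" from "many", which is the extra information the abstract reports as improving reconstruction quality.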

  1. Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex.

    PubMed

    Pollen, Alex A; Nowakowski, Tomasz J; Shuga, Joe; Wang, Xiaohui; Leyrat, Anne A; Lui, Jan H; Li, Nianzhen; Szpankowski, Lukasz; Fowler, Brian; Chen, Peilin; Ramalingam, Naveen; Sun, Gang; Thu, Myo; Norris, Michael; Lebofsky, Ronald; Toppani, Dominique; Kemp, Darnell W; Wong, Michael; Clerkson, Barry; Jones, Brittnee N; Wu, Shiquan; Knutsson, Lawrence; Alvarado, Beatriz; Wang, Jing; Weaver, Lesley S; May, Andrew P; Jones, Robert C; Unger, Marc A; Kriegstein, Arnold R; West, Jay A A

    2014-10-01

    Large-scale surveys of single-cell gene expression have the potential to reveal rare cell populations and lineage relationships but require efficient methods for cell capture and mRNA sequencing. Although cellular barcoding strategies allow parallel sequencing of single cells at ultra-low depths, the limitations of shallow sequencing have not been investigated directly. By capturing 301 single cells from 11 populations using microfluidics and analyzing single-cell transcriptomes across downsampled sequencing depths, we demonstrate that shallow single-cell mRNA sequencing (~50,000 reads per cell) is sufficient for unbiased cell-type classification and biomarker identification. In the developing cortex, we identify diverse cell types, including multiple progenitor and neuronal subtypes, and we identify EGR1 and FOS as previously unreported candidate targets of Notch signaling in human but not mouse radial glia. Our strategy establishes an efficient method for unbiased analysis and comparison of cell populations from heterogeneous tissue by microfluidic single-cell capture and low-coverage sequencing of many cells.

  2. The (in)famous GWAS P-value threshold revisited and updated for low-frequency variants.

    PubMed

    Fadista, João; Manning, Alisa K; Florez, Jose C; Groop, Leif

    2016-08-01

    Genome-wide association studies (GWAS) have long relied on proposed statistical significance thresholds to be able to differentiate true positives from false positives. Although the genome-wide significance P-value threshold of 5 × 10⁻⁸ has become a standard for common-variant GWAS, it has not been updated to cope with the lower allele frequency spectrum used in many recent array-based GWAS studies and sequencing studies. Using a whole-genome- and -exome-sequencing data set of 2875 individuals of European ancestry from the Genetics of Type 2 Diabetes (GoT2D) project and a whole-exome-sequencing data set of 13 000 individuals from five ancestries from the GoT2D and T2D-GENES (Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples) projects, we describe guidelines for genome- and exome-wide association P-value thresholds needed to correct for multiple testing, explaining the impact of linkage disequilibrium thresholds for distinguishing independent variants, minor allele frequency and ancestry characteristics. We emphasize the advantage of studying recent genetic isolate populations when performing rare and low-frequency genetic association analyses, as the multiple testing burden is diminished due to higher genetic homogeneity.
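
    The arithmetic behind a multiple-testing threshold can be sketched as a Bonferroni-style division of the family-wise error rate by the number of independent tests. The conventional genome-wide threshold corresponds to 0.05 divided by roughly one million independent common variants; the larger test count used below for low-frequency variants is a hypothetical placeholder, not a figure from the paper.

```python
def bonferroni_threshold(alpha: float, n_independent_tests: float) -> float:
    """Per-test P-value threshold that keeps the family-wise error rate at `alpha`."""
    if n_independent_tests <= 0:
        raise ValueError("number of independent tests must be positive")
    return alpha / n_independent_tests


# Conventional common-variant setting: about one million independent tests.
common_variant_threshold = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical larger test burden when low-frequency variants are included.
low_frequency_threshold = bonferroni_threshold(0.05, 5_000_000)
```

    The effective number of independent tests depends on linkage disequilibrium, allele frequency and ancestry, which is why the abstract argues for thresholds tailored to each study design rather than a single fixed value.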

  3. Review and International Recommendation of Methods for Typing Neisseria gonorrhoeae Isolates and Their Implications for Improved Knowledge of Gonococcal Epidemiology, Treatment, and Biology

    PubMed Central

    Unemo, Magnus; Dillon, Jo-Anne R.

    2011-01-01

    Summary: Gonorrhea, which may become untreatable due to multiple resistance to available antibiotics, remains a public health problem worldwide. Precise methods for typing Neisseria gonorrhoeae, together with epidemiological information, are crucial for an enhanced understanding regarding issues involving epidemiology, test of cure and contact tracing, identifying core groups and risk behaviors, and recommending effective antimicrobial treatment, control, and preventive measures. This review evaluates methods for typing N. gonorrhoeae isolates and recommends various methods for different situations. Phenotypic typing methods, as well as some now-outdated DNA-based methods, have limited usefulness in differentiating between strains of N. gonorrhoeae. Genotypic methods based on DNA sequencing are preferred, and the selection of the appropriate genotypic method should be guided by its performance characteristics and whether short-term epidemiology (microepidemiology) or long-term and/or global epidemiology (macroepidemiology) matters are being investigated. Currently, for microepidemiological questions, the best methods for fast, objective, portable, highly discriminatory, reproducible, typeable, and high-throughput characterization are N. gonorrhoeae multiantigen sequence typing (NG-MAST) or full- or extended-length porB gene sequencing. However, pulsed-field gel electrophoresis (PFGE) and Opa typing can be valuable in specific situations, i.e., extreme microepidemiology, despite their limitations. For macroepidemiological studies and phylogenetic studies, DNA sequencing of chromosomal housekeeping genes, such as multilocus sequence typing (MLST), provides a more nuanced understanding. PMID:21734242

  4. DNA capture and next-generation sequencing can recover whole mitochondrial genomes from highly degraded samples for human identification

    PubMed Central

    2013-01-01

    Background Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. 
Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217

  5. Splicing predictions reliably classify different types of alternative splicing

    PubMed Central

    Busch, Anke; Hertel, Klemens J.

    2015-01-01

    Alternative splicing is a key player in the creation of complex mammalian transcriptomes and its misregulation is associated with many human diseases. Multiple mRNA isoforms are generated from most human genes, a process mediated by the interplay of various RNA signature elements and trans-acting factors that guide spliceosomal assembly and intron removal. Here, we introduce a splicing predictor that evaluates hundreds of RNA features simultaneously to successfully differentiate between exons that are constitutively spliced, exons that undergo alternative 5′ or 3′ splice-site selection, and alternative cassette-type exons. Surprisingly, the splicing predictor did not feature strong discriminatory contributions from binding sites for known splicing regulators. Rather, the ability of an exon to be involved in one or multiple types of alternative splicing is dictated by its immediate sequence context, mainly driven by the identity of the exon's splice sites, the conservation around them, and its exon/intron architecture. Thus, the splicing behavior of human exons can be reliably predicted based on basic RNA sequence elements. PMID:25805853

  6. Recurrent TERT promoter mutations identified in a large-scale study of multiple tumour types are associated with increased TERT expression and telomerase activation.

    PubMed

    Huang, Dong-Sheng; Wang, Zhaohui; He, Xu-Jun; Diplas, Bill H; Yang, Rui; Killela, Patrick J; Meng, Qun; Ye, Zai-Yuan; Wang, Wei; Jiang, Xiao-Ting; Xu, Li; He, Xiang-Lei; Zhao, Zhong-Sheng; Xu, Wen-Juan; Wang, Hui-Ju; Ma, Ying-Yu; Xia, Ying-Jie; Li, Li; Zhang, Ru-Xuan; Jin, Tao; Zhao, Zhong-Kuo; Xu, Ji; Yu, Sheng; Wu, Fang; Liang, Junbo; Wang, Sizhen; Jiao, Yuchen; Yan, Hai; Tao, Hou-Quan

    2015-05-01

    Several somatic mutation hotspots were recently identified in the telomerase reverse transcriptase (TERT) promoter region in human cancers. Large scale studies of these mutations in multiple tumour types are limited, in particular in Asian populations. This study aimed to: analyse TERT promoter mutations in multiple tumour types in a large Chinese patient cohort, investigate novel tumour types and assess the functional significance of the mutations. TERT promoter mutation status was assessed by Sanger sequencing for 13 different tumour types and 799 tumour tissues from Chinese cancer patients. Thymic epithelial tumours, gastrointestinal leiomyoma, and gastric schwannoma were included, for which the TERT promoter has not been previously sequenced. Functional studies included TERT expression by reverse-transcriptase quantitative polymerase chain reaction (RT-qPCR), telomerase activity by the telomeric repeat amplification protocol (TRAP) assay and promoter activity by the luciferase reporter assay. TERT promoter mutations were highly frequent in glioblastoma (83.9%), urothelial carcinoma (64.5%), oligodendroglioma (70.0%), medulloblastoma (33.3%) and hepatocellular carcinoma (31.4%). C228T and C250T were the most common mutations. In urothelial carcinoma, several novel rare mutations were identified. TERT promoter mutations were absent in gastrointestinal stromal tumour (GIST), thymic epithelial tumours, gastrointestinal leiomyoma, gastric schwannoma, cholangiocarcinoma, gastric and pancreatic cancer. TERT promoter mutations highly correlated with upregulated TERT mRNA expression and telomerase activity in adult gliomas. These mutations differentially enhanced the transcriptional activity of the TERT core promoter. TERT promoter mutations are frequent in multiple tumour types and have similar distributions in Chinese cancer patients. 
The functional significance of these mutations reflects their importance to telomere maintenance and hence tumourigenesis, making them potential therapeutic targets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Recurrent TERT promoter mutations identified in a large-scale study of multiple tumor types are associated with increased TERT expression and telomerase activation

    PubMed Central

    Huang, Dong-Sheng; Wang, Zhaohui; He, Xu-Jun; Diplas, Bill H.; Yang, Rui; Killela, Patrick J.; Liang, Junbo; Meng, Qun; Ye, Zai-Yuan; Wang, Wei; Jiang, Xiao-Ting; Xu, Li; He, Xiang-Lei; Zhao, Zhong-Sheng; Xu, Wen-Juan; Wang, Hui-Ju; Ma, Ying-Yu; Xia, Ying-Jie; Li, Li; Zhang, Ru-Xuan; Jin, Tao; Zhao, Zhong-Kuo; Xu, Ji; Yu, Sheng; Wu, Fang; Wang, Si-Zhen; Jiao, Yu-Chen; Yan, Hai; Tao, Hou-Quan

    2015-01-01

    Background Several somatic mutation hotspots were recently identified in the TERT promoter region in human cancers. Large scale studies of these mutations in multiple tumor types are limited, in particular in Asian populations. This study aimed to: analyze TERT promoter mutations in multiple tumor types in a large Chinese patient cohort, investigate novel tumor types and assess the functional significance of the mutations. Methods TERT promoter mutation status was assessed by Sanger sequencing for 13 different tumor types and 799 tumor tissues from Chinese cancer patients. Thymic epithelial tumors, gastrointestinal leiomyoma, and gastric schwannoma were included, for which the TERT promoter has not been previously sequenced. Functional studies included TERT expression by RT-qPCR, telomerase activity by the TRAP assay, and promoter activity by the luciferase reporter assay. Results TERT promoter mutations were highly frequent in glioblastoma (83.9%), urothelial carcinoma (64.5%), oligodendroglioma (70.0%), medulloblastoma (33.3%), and hepatocellular carcinoma (31.4%). C228T and C250T were the most common mutations. In urothelial carcinoma, several novel rare mutations were identified. TERT promoter mutations were absent in GIST, thymic epithelial tumors, gastrointestinal leiomyoma, gastric schwannoma, cholangiocarcinoma, gastric and pancreatic cancer. TERT promoter mutations highly correlated with upregulated TERT mRNA expression and telomerase activity in adult gliomas. These mutations differentially enhanced the transcriptional activity of the TERT core promoter. Conclusions TERT promoter mutations are frequent in multiple tumor types and have similar distributions in Chinese cancer patients. The functional significance of these mutations reflect the importance to telomere maintenance and hence tumorigenesis, making them potential therapeutic targets. PMID:25843513

  8. Formulaic Sequence (FS) Cannot Be an Umbrella Term in SLA: Focusing on Psycholinguistic FSs and Their Identification

    ERIC Educational Resources Information Center

    Myles, Florence; Cordier, Caroline

    2017-01-01

    The term "formulaic sequence" (FS) is used with a multiplicity of meanings in the SLA literature, some overlapping but others not, and researchers are not always clear in defining precisely what they are investigating, or in limiting the implicational domain of their findings to the type of formulaicity they focus on. The first part of…

  9. The complete genome sequence of human adenovirus 84, a highly recombinant new Human mastadenovirus D type with a unique fiber gene.

    PubMed

    Kaján, Győző L; Kajon, Adriana E; Pinto, Alexis Castillo; Bartha, Dániel; Arnberg, Niklas

    2017-10-15

    A novel human adenovirus was isolated from a pediatric case of acute respiratory disease in Panama City, Panama in 2011. The clinical isolate was initially identified as an intertypic recombinant based on hexon and fiber gene sequencing. Based on the analysis of its complete genome sequence, the novel complex recombinant Human mastadenovirus D (HAdV-D) strain was classified into a new HAdV type: HAdV-84, and it was designated Adenovirus D human/PAN/P309886/2011/84[P43H17F84]. HAdV-D types possess usually an ocular or gastrointestinal tropism, and respiratory association is scarcely reported. The virus has a novel fiber type, most closely related to, but still clearly distant from that of HAdV-36. The predicted fiber is hypothesised to bind sialic acid with lower affinity compared to HAdV-37. Bioinformatic analysis of the complete genomic sequence of HAdV-84 revealed multiple homologous recombination events and provided deeper insight into HAdV evolution. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Genotyping of Indian antigenic, vaccine, and field Brucella spp. using multilocus sequence typing.

    PubMed

    Shome, Rajeswari; Krithiga, Natesan; Shankaranarayana, Padmashree B; Jegadesan, Sankarasubramanian; Udayakumar S, Vishnu; Shome, Bibek Ranjan; Saikia, Girin Kumar; Sharma, Narendra Kumar; Chauhan, Harshad; Chandel, Bharat Singh; Jeyaprakash, Rajendhran; Rahman, Habibur

    2016-03-31

    Brucellosis is one of the most important zoonotic diseases that affects multiple livestock species and causes great economic losses. The highly conserved genomes of Brucella, with > 90% homology among species, make it important to study the genetic diversity circulating in the country. A total of 26 Brucella spp. (4 reference strains and 22 field isolates) and 1 B. melitensis draft genome sequence from India (B. melitensis Bm IND1) were included for sequence typing. The field isolates were identified by biochemical tests and confirmed by both conventional and quantitative polymerase chain reaction (qPCR) targeting the bcsp31 Brucella genus-specific marker. Brucella speciation and biotyping was done by Bruce ladder, probe qPCR, and AMOS PCRs, respectively, and genotyping was done by multilocus sequence typing (MLST). The MLST typing of 27 Brucella spp. revealed five distinct sequence types (STs); the B. abortus S99 reference strain and 21 B. abortus field isolates belonged to ST1. On the other hand, the vaccine strain B. abortus S19 was genotyped as ST5. Similarly, B. melitensis 16M reference strain and one B. melitensis field isolate were grouped into ST7. Another B. melitensis field isolate belonged to ST8 (draft genome sequence from India), and only B. suis 1330 reference strain was found to be ST14. The sequences revealed genetic similarity of the Indian strains to the global reference and field strains. The study highlights the usefulness of MLST for typing of field isolates and validation of reference strains used for diagnosis and vaccination against brucellosis.

  11. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies.

    PubMed

    Chen, Hui; Luthra, Rajyalakshmi; Goswami, Rashmi S; Singh, Rajesh R; Roy-Chowdhuri, Sinchita

    2015-08-28

    Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), a NGS sequencing platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin fixed and paraffin embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects.

  12. PHYLOViZ: phylogenetic inference and data visualization for sequence based typing methods

    PubMed Central

    2012-01-01

    Background With the decrease of DNA sequencing costs, sequence-based typing methods are rapidly becoming the gold standard for epidemiological surveillance. These methods provide reproducible and comparable results needed for a global scale bacterial population analysis, while retaining their usefulness for local epidemiological surveys. Online databases that collect the generated allelic profiles and associated epidemiological data are available but this wealth of data remains underused and is frequently poorly annotated since no user-friendly tool exists to analyze and explore it. Results PHYLOViZ is platform-independent Java software that allows the integrated analysis of sequence-based typing methods, including SNP data generated from whole genome sequence approaches, and associated epidemiological data. goeBURST and its Minimum Spanning Tree expansion are used for visualizing the possible evolutionary relationships between isolates. The results can be displayed as an annotated graph overlaying the query results of any other epidemiological data available. Conclusions PHYLOViZ is user-friendly software that allows the combined analysis of multiple data sources for microbial epidemiological and population studies. It is freely available at http://www.phyloviz.net. PMID:22568821

  13. Introduction of the hybcell-based compact sequencing technology and comparison to state-of-the-art methodologies for KRAS mutation detection.

    PubMed

    Zopf, Agnes; Raim, Roman; Danzer, Martin; Niklas, Norbert; Spilka, Rita; Pröll, Johannes; Gabriel, Christian; Nechansky, Andreas; Roucka, Markus

    2015-03-01

    The detection of KRAS mutations in codons 12 and 13 is critical for anti-EGFR therapy strategies; however, only those methodologies with high sensitivity, specificity, and accuracy as well as the best cost and turnaround balance are suitable for routine daily testing. Here we compared the performance of compact sequencing using the novel hybcell technology with 454 next-generation sequencing (454-NGS), Sanger sequencing, and pyrosequencing, using an evaluation panel of 35 specimens. A total of 32 mutations and 10 wild-type cases were reported using 454-NGS as the reference method. Specificity ranged from 100% for Sanger sequencing to 80% for pyrosequencing. Sanger sequencing and hybcell-based compact sequencing achieved a sensitivity of 96%, whereas pyrosequencing had a sensitivity of 88%. Accuracy was 97% for Sanger sequencing, 85% for pyrosequencing, and 94% for hybcell-based compact sequencing. Quantitative results were obtained for 454-NGS and hybcell-based compact sequencing data, resulting in a significant correlation (r = 0.914). Whereas pyrosequencing and Sanger sequencing were not able to detect multiple mutated cell clones within one tumor specimen, 454-NGS and the hybcell-based compact sequencing detected multiple mutations in two specimens. Our comparison shows that the hybcell-based compact sequencing is a valuable alternative to state-of-the-art methodologies used for detection of clinically relevant point mutations.
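
    The performance figures quoted in this abstract follow the standard definitions of sensitivity, specificity and accuracy relative to a reference method. A minimal sketch of those definitions is given below; the counts are hypothetical values chosen for illustration and are not the study's data.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute sensitivity, specificity and accuracy from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    return {
        "sensitivity": tp / (tp + fn),  # share of reference-positive cases detected
        "specificity": tn / (tn + fp),  # share of reference-negative cases called negative
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # share of all calls that agree
    }


# Hypothetical counts against a reference method (illustration only).
metrics = diagnostic_metrics(tp=18, fp=1, tn=9, fn=2)
```

    Because all three measures are computed against the reference method (454-NGS in this study), any error in the reference calls propagates into every figure.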

  14. An efficient method for the prediction of deleterious multiple-point mutations in the secondary structure of RNAs using suboptimal folding solutions

    PubMed Central

    Churkin, Alexander; Barash, Danny

    2008-01-01

    Background RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single point mutations to the general case of multiple point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformational rearranging. The approach is best examined using the dot plot representation for RNA secondary structure. Results Using RNAsubopt, the suboptimal solutions for a given wild-type sequence are calculated once. Then, specific mutations are selected that are most likely to cause a conformational rearrangement. For an RNA sequence of about 100 nts and 3-point mutations (n = 100, m = 3), for example, the proposed method reduces the running time from several hours or even days to several minutes, thus enabling the practical application of RNAmute to the analysis of multiple-point mutations. Conclusion A highly efficient addition to RNAmute that is as user friendly as the original application but that facilitates the practical analysis of multiple-point mutations is presented. Such an extension can now be exploited prior to site-directed mutagenesis experiments by virologists, for example, who investigate the change of function in an RNA virus via mutations that disrupt important motifs in its secondary structure. 
A complete explanation of the application, called MultiRNAmute, is available at [1]. PMID:18445289
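
    The cost of exhaustive enumeration described in this abstract can be checked with a short count. A sequence of length n has C(n, m) ways to choose m positions, and each chosen position can change to any of the 3 other nucleotides, giving C(n, m) · 3^m mutants, which grows as n^m for fixed m. The function name below is hypothetical and the sketch only counts candidates; it does not fold any structures or call the Vienna RNA package.

```python
from math import comb


def count_multipoint_mutants(n: int, m: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences that differ from a length-n sequence at exactly m positions."""
    if m < 0 or m > n:
        raise ValueError("m must satisfy 0 <= m <= n")
    # Choose the m mutated positions, then one of the other bases at each of them.
    return comb(n, m) * (alphabet_size - 1) ** m


single = count_multipoint_mutants(100, 1)  # 300 single-point mutants
triple = count_multipoint_mutants(100, 3)  # 4,365,900 three-point mutants
```

    For the n = 100, m = 3 case quoted above, folding each of several million mutants separately is what makes the exhaustive approach take hours or days, and pruning that candidate set is the stated purpose of the method.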

  15. Integrating genome assemblies with MAIA

    PubMed Central

    Nijkamp, Jurgen; Winterbach, Wynand; van den Broek, Marcel; Daran, Jean-Marc; Reinders, Marcel; de Ridder, Dick

    2010-01-01

    Motivation: De novo assembly of a eukaryotic genome with next-generation sequencing data is still a challenging task. Over the past few years several assemblers have been developed, often suitable for one specific type of sequencing data. The number of known genomes is expanding rapidly, therefore it becomes possible to use multiple reference genomes for assembly projects. We introduce an assembly integrator that makes use of all available data, i.e. multiple de novo assemblies and mappings against multiple related genomes, by optimizing a weighted combination of criteria. Results: The developed algorithm was applied on the de novo sequencing of the Saccharomyces cerevisiae CEN.PK 113-7D strain. Using Solexa and 454 read data, two de novo and three comparative assemblies were constructed and subsequently integrated, yielding 29 contigs, covering more than 12 Mbp; a drastic improvement compared with the single assemblies. Availability: MAIA is available as a Matlab package and can be downloaded from http://bioinformatics.tudelft.nl Contact: j.f.nijkamp@tudelft.nl PMID:20823304

  16. Viruses in case series of tumors: Consistent presence in different cancers in the same subject

    PubMed Central

    Arroyo Mühr, Laila Sara; Hortlund, Maria; Bzhalava, Zurab; Nordqvist Kleppe, Sara; Bzhalava, Davit; Dillner, Joakim

    2017-01-01

    Studies investigating presence of viruses in cancer often analyze case series of cancers, resulting in detection of many viruses that are not etiologically linked to the tumors where they are found. The incidence of virus-associated cancers is greatly increased in immunocompromised individuals. Non-melanoma skin cancer (NMSC) is also greatly increased and a variety of viruses have been detected in NMSC. As immunosuppressed patients often develop multiple independent NMSCs, we reasoned that viruses consistently present in independent tumors might be more likely to be involved in tumorigenesis. We sequenced 8 different NMSCs from 1 patient in comparison to 8 different NMSCs from 8 different patients. Among the latter, 12 different virus sequences were detected, but none in more than 1 tumor each. In contrast, the patient with multiple NMSCs had human papillomavirus type 15 and type 38 present in 6 out of 8 NMSCs. PMID:28257474

  17. Jointly characterizing epigenetic dynamics across multiple human cell types

    PubMed Central

    An, Lin; Yue, Feng; Hardison, Ross C

    2016-01-01

    Advanced sequencing technologies have generated a plethora of data for many chromatin marks in multiple tissues and cell types, yet there is a lack of a generalized tool for optimal utility of those data. A major challenge is to quantitatively model the epigenetic dynamics across both the genome and many cell types for understanding their impacts on differential gene regulation and disease. We introduce IDEAS, an integrative and discriminative epigenome annotation system, for jointly characterizing epigenetic landscapes in many cell types and detecting differential regulatory regions. A key distinction between our method and existing state-of-the-art algorithms is that IDEAS integrates epigenomes of many cell types simultaneously in a way that preserves the position-dependent and cell type-specific information at fine scales, thereby greatly improving segmentation accuracy and producing comparable annotations across cell types. PMID:27095202

  18. Whole Genome Sequence Typing to Investigate the Apophysomyces Outbreak following a Tornado in Joplin, Missouri, 2011

    PubMed Central

    Etienne, Kizee A.; Gillece, John; Hilsabeck, Remy; Schupp, Jim M.; Colman, Rebecca; Lockhart, Shawn R.; Gade, Lalitha; Thompson, Elizabeth H.; Sutton, Deanna A.; Neblett-Fanfair, Robyn; Park, Benjamin J.; Turabelidze, George; Keim, Paul; Brandt, Mary E.; Deak, Eszter; Engelthaler, David M.

    2012-01-01

    Case reports of Apophysomyces spp. in immunocompetent hosts have been a result of traumatic deep implantation of Apophysomyces spp. spore-contaminated soil or debris. On May 22, 2011 a tornado occurred in Joplin, MO, leaving 13 tornado victims with Apophysomyces trapeziformis infections as a result of lacerations from airborne material. We used whole genome sequence typing (WGST) for high-resolution phylogenetic SNP analysis of 17 outbreak Apophysomyces isolates and five additional temporally and spatially diverse Apophysomyces control isolates (three A. trapeziformis and two A. variabilis isolates). Whole genome SNP phylogenetic analysis revealed three clusters of genotypically related or identical A. trapeziformis isolates and multiple distinct isolates among the Joplin group; this indicated multiple genotypes from a single or multiple sources. Though no linkage between genotype and location of exposure was observed, WGST analysis determined that the Joplin isolates were more closely related to each other than to the control isolates, suggesting local population structure. Additionally, species delineation based on WGST demonstrated the need to reassess currently accepted taxonomic classifications of phylogenetic species within the genus Apophysomyces. PMID:23209631

  19. Whole genome sequence typing to investigate the Apophysomyces outbreak following a tornado in Joplin, Missouri, 2011.

    PubMed

    Etienne, Kizee A; Gillece, John; Hilsabeck, Remy; Schupp, Jim M; Colman, Rebecca; Lockhart, Shawn R; Gade, Lalitha; Thompson, Elizabeth H; Sutton, Deanna A; Neblett-Fanfair, Robyn; Park, Benjamin J; Turabelidze, George; Keim, Paul; Brandt, Mary E; Deak, Eszter; Engelthaler, David M

    2012-01-01

    Case reports of Apophysomyces spp. in immunocompetent hosts have been a result of traumatic deep implantation of Apophysomyces spp. spore-contaminated soil or debris. On May 22, 2011 a tornado occurred in Joplin, MO, leaving 13 tornado victims with Apophysomyces trapeziformis infections as a result of lacerations from airborne material. We used whole genome sequence typing (WGST) for high-resolution phylogenetic SNP analysis of 17 outbreak Apophysomyces isolates and five additional temporally and spatially diverse Apophysomyces control isolates (three A. trapeziformis and two A. variabilis isolates). Whole genome SNP phylogenetic analysis revealed three clusters of genotypically related or identical A. trapeziformis isolates and multiple distinct isolates among the Joplin group; this indicated multiple genotypes from a single or multiple sources. Though no linkage between genotype and location of exposure was observed, WGST analysis determined that the Joplin isolates were more closely related to each other than to the control isolates, suggesting local population structure. Additionally, species delineation based on WGST demonstrated the need to reassess currently accepted taxonomic classifications of phylogenetic species within the genus Apophysomyces.

  20. Detecting atypical examples of known domain types by sequence similarity searching: the SBASE domain library approach.

    PubMed

    Dhir, Somdutta; Pacurar, Mircea; Franklin, Dino; Gáspári, Zoltán; Kertész-Farkas, Attila; Kocsor, András; Eisenhaber, Frank; Pongor, Sándor

    2010-11-01

    SBASE is a project initiated to detect known domain types and predict domain architectures using sequence similarity searching (Simon et al., Protein Seq Data Anal, 5: 39-42, 1992; Pongor et al., Nucl. Acids. Res. 21:3111-3115, 1992). The current approach uses a curated collection of domain sequences - the SBASE domain library - and standard similarity search algorithms, followed by postprocessing based on simple statistics of the domain similarity network (http://hydra.icgeb.trieste.it/sbase/). It is especially useful in detecting rare, atypical examples of known domain types which are sometimes missed even by more sophisticated methodologies. This approach does not require multiple alignment or machine learning techniques, and can be a useful complement to other domain detection methodologies. This article gives an overview of the project history as well as of the concepts and principles developed within the project.

  1. The Applied Development of a Tiered Multilocus Sequence Typing (MLST) Scheme for Dichelobacter nodosus.

    PubMed

    Blanchard, Adam M; Jolley, Keith A; Maiden, Martin C J; Coffey, Tracey J; Maboni, Grazieli; Staley, Ceri E; Bollard, Nicola J; Warry, Andrew; Emes, Richard D; Davies, Peers L; Tötemeyer, Sabine

    2018-01-01

    Dichelobacter nodosus (D. nodosus) is the causative pathogen of ovine footrot, a disease that has a significant welfare and financial impact on the global sheep industry. Previous studies into the phylogenetics of D. nodosus have focused on Australia and Scandinavia, meaning the current diversity in the United Kingdom (U.K.) population and its relationship globally is poorly understood. Numerous epidemiological methods are available for bacterial typing; however, few account for whole genome diversity or provide the opportunity for future application of new computational techniques. Multilocus sequence typing (MLST) measures nucleotide variations within several loci with slow accumulation of variation to enable the designation of allele numbers to determine a sequence type. The usage of whole genome sequence data enables the application of MLST, but also core and whole genome MLST for higher levels of strain discrimination with a negligible increase in experimental cost. An MLST database was developed alongside a seven loci scheme using publicly available whole genome data from the sequence read archive. Sequence type designation and strain discrimination were compared to previously published data to ensure reproducibility. Multiple D. nodosus isolates from U.K. farms were directly compared to populations from other countries. The U.K. isolates define new clades within the global population of D. nodosus and predominantly consist of serogroups A, B and H; however, serogroups C, D, E, and I were also found. The scheme is publicly available at https://pubmlst.org/dnodosus/.
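
    The allele-number and sequence-type bookkeeping that MLST relies on can be sketched as follows. This is a minimal illustration, not the PubMLST implementation: the class name `MLSTScheme`, the locus names, and the rule of numbering alleles and profiles in order of first appearance are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


class MLSTScheme:
    """Toy MLST bookkeeping (hypothetical helper, not the PubMLST code).

    Each distinct sequence at a locus gets the next free allele number,
    and each distinct allele profile gets the next free sequence type (ST).
    """

    def __init__(self, loci: List[str]) -> None:
        self.loci = loci
        # Per-locus lookup: sequence -> allele number.
        self.alleles: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
        # Lookup: allele profile (one number per locus) -> sequence type.
        self.profiles: Dict[Tuple[int, ...], int] = {}

    def allele_number(self, locus: str, sequence: str) -> int:
        known = self.alleles[locus]
        if sequence not in known:
            known[sequence] = len(known) + 1  # assumed: numbered by first appearance
        return known[sequence]

    def sequence_type(self, isolate: Dict[str, str]) -> int:
        profile = tuple(self.allele_number(locus, isolate[locus]) for locus in self.loci)
        if profile not in self.profiles:
            self.profiles[profile] = len(self.profiles) + 1
        return self.profiles[profile]


# Two hypothetical loci and three made-up isolates.
scheme = MLSTScheme(["locusA", "locusB"])
st_first = scheme.sequence_type({"locusA": "ACGT", "locusB": "GGCC"})
st_same = scheme.sequence_type({"locusA": "ACGT", "locusB": "GGCC"})
st_variant = scheme.sequence_type({"locusA": "ACGA", "locusB": "GGCC"})
```

    Isolates with identical sequences at every locus share a sequence type, while a single nucleotide difference at one locus produces a new allele number and therefore a new sequence type.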

  2. Single-variant and multi-variant trend tests for genetic association with next-generation sequencing that are robust to sequencing error.

    PubMed

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Alejandro Q; Musolf, Anthony; Matise, Tara C; Finch, Stephen J; Gordon, Derek

    2012-01-01

    As with any new technology, next-generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to those data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power for all three methods is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. 
    Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single-variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies; first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p value, no matter how many loci. Copyright © 2013 S. Karger AG, Basel.
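
    For orientation, the classical linear (Cochran-Armitage) trend test that the abstract says it extends can be sketched as below. This is the standard test only and does not model misclassification error, so it is not the LTTae,NGS statistic; the function name `trend_test`, the additive weights (0, 1, 2), and the example counts are assumptions for illustration.

```python
import math
from typing import Sequence, Tuple


def trend_test(
    cases: Sequence[int],
    controls: Sequence[int],
    weights: Sequence[float] = (0.0, 1.0, 2.0),
) -> Tuple[float, float]:
    """Standard Cochran-Armitage trend test on genotype counts.

    cases[i] and controls[i] are counts of individuals carrying i copies
    of the variant allele. Returns (chi-square statistic, p-value); the
    statistic has one degree of freedom.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    if n == 0:
        raise ValueError("no observations")
    n_cases = sum(cases)
    n_controls = n - n_cases
    sum_xr = sum(x * r for x, r in zip(weights, cases))
    sum_xn = sum(x * c for x, c in zip(weights, totals))
    sum_x2n = sum(x * x * c for x, c in zip(weights, totals))
    # Trend statistic and its variance under the null hypothesis.
    t = n * sum_xr - n_cases * sum_xn
    variance = (n_cases * n_controls / n) * (n * sum_x2n - sum_xn ** 2)
    if variance == 0:
        raise ValueError("variance is zero; the test is undefined for these counts")
    chi2 = t * t / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up genotype counts: identical distributions, then a clear trend.
chi2_null, p_null = trend_test([10, 20, 30], [10, 20, 30])
chi2_trend, p_trend = trend_test([10, 20, 30], [30, 20, 10])
```

    Whatever the number of genotype categories, the statistic has a single degree of freedom, which is the property the abstract highlights for the multi-variant form of its own statistic.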

  3. Single variant and multi-variant trend tests for genetic association with next generation sequencing that are robust to sequencing error

    PubMed Central

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Andrew; Musolf, Anthony; Matise, Tara C.; Finch, Stephen J.; Gordon, Derek

    2013-01-01

    As with any new technology, next generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model, based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to that data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power for all three methods is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. 
    Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies; first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p-value, no matter how many loci. PMID:23594495

  4. Quantifying humpback whale song sequences to understand the dynamics of song exchange at the ocean basin scale.

    PubMed

    Garland, Ellen C; Noad, Michael J; Goldizen, Anne W; Lilley, Matthew S; Rekdahl, Melinda L; Garrigue, Claire; Constantine, Rochelle; Daeschler Hauser, Nan; Poole, M Michael; Robbins, Jooke

    2013-01-01

    Humpback whales have a continually evolving vocal sexual display, or "song," that appears to undergo both evolutionary and "revolutionary" change. All males within a population adhere to the current content and arrangement of the song. Populations within an ocean basin share similarities in their songs; this sharing is complex as multiple variations of the song (song types) may be present within a region at any one time. To quantitatively investigate the similarity of song types, songs were compared at both the individual singer and population level using the Levenshtein distance technique and cluster analysis. The highly stereotyped sequences of themes from the songs of 211 individuals from populations within the western and central South Pacific region from 1998 through 2008 were grouped together based on the percentage of song similarity, and compared to qualitatively assigned song types. The analysis produced clusters of highly similar songs that agreed with previous qualitative assignments. Each cluster contained songs from multiple populations and years, confirming the eastward spread of song types and their progressive evolution through the study region. Quantifying song similarity and exchange will assist in understanding broader song dynamics and contribute to the use of vocal displays as population identifiers.
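
    The Levenshtein distance used to compare theme sequences can be sketched as follows. The dynamic-programming distance itself is standard; converting it to a percentage by dividing by the longer sequence length is an assumption made here, and the study's exact similarity index may be defined differently. The theme labels are made up.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete item_a
                    current[j - 1] + 1,      # insert item_b
                    previous[j - 1] + cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def similarity_percent(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Percentage similarity, normalised by the longer sequence (an assumption)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


# Two hypothetical songs written as sequences of theme labels.
song_one = ["theme1", "theme2", "theme3"]
song_two = ["theme1", "theme3"]
distance = levenshtein(song_one, song_two)
similarity = similarity_percent(song_one, song_two)
```

    A matrix of such pairwise similarities between singers is the kind of input a cluster analysis would then group into song types.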

  5. PHYLOSCANNER: Inferring Transmission from Within- and Between-Host Pathogen Genetic Diversity

    PubMed Central

    Hall, Matthew; Ratmann, Oliver; Bonsall, David; Golubchik, Tanya; de Cesare, Mariateresa; Gall, Astrid; Cornelissen, Marion; Fraser, Christophe

    2018-01-01

    Abstract A central feature of pathogen genomics is that different infectious particles (virions and bacterial cells) within an infected individual may be genetically distinct, with patterns of relatedness among infectious particles being the result of both within-host evolution and transmission from one host to the next. Here, we present a new software tool, phyloscanner, which analyses pathogen diversity from multiple infected hosts. phyloscanner provides unprecedented resolution into the transmission process, allowing inference of the direction of transmission from sequence data alone. Multiply infected individuals are also identified, as they harbor subpopulations of infectious particles that are not connected by within-host evolution, except where recombinant types emerge. Low-level contamination is flagged and removed. We illustrate phyloscanner on both viral and bacterial pathogens, namely HIV-1 sequenced on Illumina and Roche 454 platforms, HCV sequenced with the Oxford Nanopore MinION platform, and Streptococcus pneumoniae with sequences from multiple colonies per individual. phyloscanner is available from https://github.com/BDI-pathogens/phyloscanner. PMID:29186559

  6. An efficient, versatile and scalable pattern growth approach to mine frequent patterns in unaligned protein sequences.

    PubMed

    Ye, Kai; Kosters, Walter A; Ijzerman, Adriaan P

    2007-03-15

    Pattern discovery in protein sequences is often based on multiple sequence alignments (MSA). The procedure can be computationally intensive and often requires manual adjustment, which may be particularly difficult for a set of deviating sequences. In contrast, two algorithms, PRATT2 (http://www.ebi.ac.uk/pratt/) and TEIRESIAS (http://cbcsrv.watson.ibm.com/), are used to directly identify frequent patterns from unaligned biological sequences without an attempt to align them. Here we propose a new algorithm with more efficiency and more functionality than both PRATT2 and TEIRESIAS, and discuss some of its applications to G protein-coupled receptors, a protein family of important drug targets. In this study, we designed and implemented six algorithms to mine three different pattern types from either one or two datasets using a pattern growth approach. We compared our approach to PRATT2 and TEIRESIAS in efficiency, completeness and the diversity of pattern types. Compared to PRATT2, our approach is faster, capable of processing large datasets and able to identify the so-called type III patterns. Our approach is comparable to TEIRESIAS in the discovery of the so-called type I patterns but has additional functionality such as mining the so-called type II and type III patterns and finding discriminating patterns between two datasets. The source code for pattern growth algorithms and their pseudo-code are available at http://www.liacs.nl/home/kosters/pg/.

  7. Multiple Epstein-Barr virus strains in patients with infectious mononucleosis: comparison of ex vivo samples with in vitro isolates by use of heteroduplex tracking assays.

    PubMed

    Tierney, Rosemary J; Edwards, Rachel Hood; Sitki-Green, Diane; Croom-Carter, Deborah; Roy, Sushmita; Yao, Qing-Yun; Raab-Traub, Nancy; Rickinson, Alan B

    2006-01-15

    Recent work using a heteroduplex tracking assay (HTA) to identify resident viral sequences has suggested that patients with infectious mononucleosis (IM) who are undergoing primary Epstein-Barr virus (EBV) infection frequently harbor different EBV strains. Here, we examine samples from patients with IM by use of a new Epstein-Barr nuclear antigen 2 HTA alongside the established latent membrane protein 1 HTA. Coresident allelic sequences were detected in ex vivo blood and throat wash samples from 13 of 14 patients with IM; most patients carried 2 or more type 1 strains, 1 patient carried 2 type 2 strains, and 1 patient carried both virus types. In contrast, coresident strains were detected in only 2 of 14 patients by in vitro B cell transformation, despite screening >20 isolates/patient. We infer that coacquisition of multiple strains is common in patients with IM, although only 1 strain tends to be rescued in vitro; whether nonrescued strains are present in low abundance or are transformation defective remains to be determined.

  8. Genetic characterization of Measles Viruses in China, 2004

    PubMed Central

    Zhang, Yan; Ji, Yixin; Jiang, Xiaohong; Xu, Songtao; Zhu, Zhen; Zheng, Lei; He, Jilan; Ling, Hua; Wang, Yan; Liu, Yang; Du, Wen; Yang, Xuelei; Mao, Naiying; Xu, Wenbo

    2008-01-01

    Wild-type measles viruses were genetically characterized by nucleotide sequencing of the C-terminal region of the N protein gene and phylogenetic analysis of 59 isolates from 16 provinces of China in 2004. The results showed that all of the isolates belonged to genotype H1: 51 isolates belonged to cluster 1 and 8 isolates to cluster 2, and viruses from both clusters were distributed throughout China without a distinct geographic pattern. The nucleotide sequence and predicted amino acid homologies of the 59 H1 strains were 96.5%–100% and 95.7%–100%, respectively. The report showed that the transmission pattern of genotype H1 viruses in China in 2004 was consistent with ongoing endemic transmission of multiple lineages of a single, endemic genotype. Multiple transmission pathways led to multiple lineages within the endemic genotype. PMID:18928575

  9. A Persistent Feature of Multiple Scattering of Waves in the Time-Domain: A Tutorial

    NASA Technical Reports Server (NTRS)

    Lock, James A.; Mishchenko, Michael I.

    2015-01-01

    The equations for frequency-domain multiple scattering are derived for a scalar or electromagnetic plane wave incident on a collection of particles at known positions, and in the time-domain for a plane wave pulse incident on the same collection of particles. The calculation is carried out for five different combinations of wave types and particle types of increasing geometrical complexity. The results are used to illustrate and discuss a number of physical and mathematical characteristics of multiple scattering in the frequency- and time-domains. We argue that frequency-domain multiple scattering is a purely mathematical construct since there is no temporal sequencing information in the frequency-domain equations and since the multi-particle path information can be dispelled by writing the equations in another mathematical form. However, multiple scattering becomes a definite physical phenomenon in the time-domain when the collection of particles is illuminated by an appropriately short localized pulse.

  10. Software for optimization of SNP and PCR-RFLP genotyping to discriminate many genomes with the fewest assays

    PubMed Central

    Gardner, Shea N; Wagner, Mark C

    2005-01-01

    Background Microbial forensics is important in tracking the source of a pathogen, whether the disease is a naturally occurring outbreak or part of a criminal investigation. Results A method and SPR Opt (SNP and PCR-RFLP Optimization) software to perform a comprehensive, whole-genome analysis to forensically discriminate multiple sequences is presented. Tools for the optimization of forensic typing using Single Nucleotide Polymorphism (SNP) and PCR-Restriction Fragment Length Polymorphism (PCR-RFLP) analyses across multiple isolate sequences of a species are described. The PCR-RFLP analysis includes prediction and selection of optimal primers and restriction enzymes to enable maximum isolate discrimination based on sequence information. SPR Opt calculates all SNP or PCR-RFLP variations present in the sequences, groups them into haplotypes according to their co-segregation across those sequences, and performs combinatoric analyses to determine which sets of haplotypes provide maximal discrimination among all the input sequences. Those set combinations requiring that membership in the fewest haplotypes be queried (i.e. the fewest assays be performed) are found. These analyses highlight variable regions based on existing sequence data. These markers may be heterogeneous among unsequenced isolates as well, and thus may be useful for characterizing the relationships among unsequenced as well as sequenced isolates. The predictions are multi-locus. Analyses of mumps and SARS viruses are summarized. Phylogenetic trees created based on SNPs, PCR-RFLPs, and full genomes are compared for SARS virus, illustrating that purported phylogenies based only on SNP or PCR-RFLP variations do not match those based on multiple sequence alignment of the full genomes. Conclusion This is the first software to optimize the selection of forensic markers to maximize information gained from the fewest assays, accepting whole or partial genome sequence data as input. 
    As more sequence data becomes available for multiple strains and isolates of a species, automated, computational approaches such as those described here will be essential to make sense of large amounts of information, and to guide and optimize efforts in the laboratory. The software and source code for SPR Opt is publicly available and free for non-profit use at . PMID:15904493
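
    The goal of discriminating all input sequences with the fewest assays can be illustrated with a greedy heuristic. This is a deliberately simplified stand-in and not the SPR Opt algorithm, which groups variations into haplotypes and performs combinatoric analyses; the function name `greedy_markers`, the input layout, and the isolate data are hypothetical, and a greedy choice is not guaranteed to find the smallest possible marker set.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Set, Tuple


def greedy_markers(
    genotypes: Dict[str, Sequence[str]],
) -> Tuple[List[int], Set[Tuple[str, str]]]:
    """Greedily pick marker indices until every pair of isolates differs at one.

    genotypes maps an isolate name to its state at each candidate marker
    (all sequences must have the same length). Returns the chosen marker
    indices and any pairs that no marker can separate.
    """
    names = sorted(genotypes)
    n_markers = len(genotypes[names[0]]) if names else 0
    unresolved = set(combinations(names, 2))
    chosen: List[int] = []
    while unresolved:
        best_marker = None
        best_split: Set[Tuple[str, str]] = set()
        for marker in range(n_markers):
            if marker in chosen:
                continue
            # Pairs of isolates that this marker would newly tell apart.
            split = {
                (a, b)
                for a, b in unresolved
                if genotypes[a][marker] != genotypes[b][marker]
            }
            if len(split) > len(best_split):
                best_marker, best_split = marker, split
        if best_marker is None:
            break  # remaining pairs are identical at every unused marker
        chosen.append(best_marker)
        unresolved -= best_split
    return chosen, unresolved


# Four made-up isolates typed at three candidate markers.
isolates = {
    "iso1": ("A", "C", "G"),
    "iso2": ("A", "T", "G"),
    "iso3": ("G", "C", "G"),
    "iso4": ("G", "T", "G"),
}
markers, inseparable = greedy_markers(isolates)
```

    In this toy example the third marker is invariant and is never selected, while the first two together separate all four isolates.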

  11. Mammalian transcriptional hotspots are enriched for tissue specific enhancers near cell type specific highly expressed genes and are predicted to act as transcriptional activator hubs.

    PubMed

    Joshi, Anagha

    2014-12-30

    Transcriptional hotspots are defined as genomic regions bound by multiple factors. They have been identified recently as cell type specific enhancers regulating developmentally essential genes in many species such as worm, fly and humans. The in-depth analysis of hotspots across multiple cell types in same species still remains to be explored and can bring new biological insights. We therefore collected 108 transcription-related factor (TF) ChIP sequencing data sets in ten murine cell types and classified the peaks in each cell type in three groups according to binding occupancy as singletons (low-occupancy), combinatorials (mid-occupancy) and hotspots (high-occupancy). The peaks in the three groups clustered largely according to the occupancy, suggesting priming of genomic loci for mid occupancy irrespective of cell type. We then characterized hotspots for diverse structural functional properties. The genes neighbouring hotspots had a small overlap with hotspot genes in other cell types and were highly enriched for cell type specific function. Hotspots were enriched for sequence motifs of key TFs in that cell type and more than 90% of hotspots were occupied by pioneering factors. Though we did not find any sequence signature in the three groups, the H3K4me1 binding profile had bimodal peaks at hotspots, distinguishing hotspots from mono-modal H3K4me1 singletons. In ES cells, differentially expressed genes after perturbation of activators were enriched for hotspot genes suggesting hotspots primarily act as transcriptional activator hubs. Finally, we proposed that ES hotspots might be under control of SetDB1 and not DNMT for silencing. Transcriptional hotspots are enriched for tissue specific enhancers near cell type specific highly expressed genes. In ES cells, they are predicted to act as transcriptional activator hubs and might be under SetDB1 control for silencing.

  12. Distilled single-cell genome sequencing and de novo assembly for sparse microbial communities.

    PubMed

    Taghavi, Zeinab; Movahedi, Narjes S; Draghici, Sorin; Chitsaz, Hamidreza

    2013-10-01

    Identification of every single genome present in a microbial sample is an important and challenging task with crucial applications. It is challenging because there are typically millions of cells in a microbial sample, the vast majority of which elude cultivation. The most accurate method to date is exhaustive single-cell sequencing using multiple displacement amplification, which is simply intractable for a large number of cells. However, there is hope for breaking this barrier, as the number of different cell types with distinct genome sequences is usually much smaller than the number of cells. Here, we present a novel divide and conquer method to sequence and de novo assemble all distinct genomes present in a microbial sample with a sequencing cost and computational complexity proportional to the number of genome types, rather than the number of cells. The method is implemented in a tool called Squeezambler. We evaluated Squeezambler on simulated data. The proposed divide and conquer method successfully reduces the cost of sequencing in comparison with the naïve exhaustive approach. Squeezambler and datasets are available at http://compbio.cs.wayne.edu/software/squeezambler/.

  13. Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.

    PubMed

    Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu

    2015-06-01

    High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.

  14. Multilocus Sequence Typing of Pathogenic Treponemes Isolated from Cloven-Hoofed Animals and Comparison to Treponemes Isolated from Humans

    PubMed Central

    Clegg, Simon R.; Carter, Stuart D.; Birtles, Richard J.; Brown, Jennifer M.; Hart, C. Anthony; Evans, Nicholas J.

    2016-01-01

    ABSTRACT Treponema species are implicated in many diseases of humans and animals. Digital dermatitis (DD) treponemes are reported to cause severe lesions in cattle, sheep, pigs, goats, and wild elk, causing substantial global animal welfare issues and economic losses. The fastidiousness of these spirochetes has previously precluded studies investigating within-phylogroup genetic diversity. An archive of treponemes that we isolated enabled multilocus sequence typing to quantify the diversity and population structure of DD treponemes. Isolates (n = 121) were obtained from different animal hosts in nine countries on three continents. The analyses herein of currently isolated DD treponemes at seven housekeeping gene loci confirm the classification of the three previously designated phylogroups: the Treponema medium, Treponema phagedenis, and Treponema pedis phylogroups. Sequence analysis of seven DD treponeme housekeeping genes revealed a generally low level of diversity among the strains within each phylogroup, removing the need for the previously used “-like” suffix. Surprisingly, all isolates within each phylogroup clustered together, regardless of host or geographic origin, suggesting that the same sequence types (STs) can infect different animals. Some STs were derived from multiple animals from the same farm, highlighting probable within-farm transmissions. Several STs infected multiple hosts from similar geographic regions, identifying probable frequent between-host transmissions. Interestingly, T. pedis appears to be evolving more quickly than the T. medium or T. phagedenis DD treponeme phylogroup, by forming two unique ST complexes. The lack of phylogenetic discrimination between treponemes isolated from different hosts or geographic regions substantially contrasts with the data for other clinically relevant spirochetes. 
    IMPORTANCE The recent expansion of the host range of digital dermatitis (DD) treponemes from cattle to sheep, goats, pigs, and wild elk, coupled with the high level of 16S rRNA gene sequence similarity across hosts and with human treponemes, suggests that the same bacterial species can cause disease in multiple different hosts. This multilocus sequence typing (MLST) study further demonstrates that these bacteria isolated from different hosts are indeed very similar, raising the potential for cross-species transmission. The study also shows that infection spread occurs frequently, both locally and globally, suggesting transmission by routes other than animal-animal transmission alone. These results indicate that on-farm biosecurity is important for controlling disease spread in domesticated species. Continued surveillance and vigilance are important for ascertaining the evolution and tracking any further host range expansion of these important pathogens. PMID:27208135

  15. Multilocus Sequence Typing of Pathogenic Treponemes Isolated from Cloven-Hoofed Animals and Comparison to Treponemes Isolated from Humans.

    PubMed

    Clegg, Simon R; Carter, Stuart D; Birtles, Richard J; Brown, Jennifer M; Hart, C Anthony; Evans, Nicholas J

    2016-08-01

    Treponema species are implicated in many diseases of humans and animals. Digital dermatitis (DD) treponemes are reported to cause severe lesions in cattle, sheep, pigs, goats, and wild elk, causing substantial global animal welfare issues and economic losses. The fastidiousness of these spirochetes has previously precluded studies investigating within-phylogroup genetic diversity. An archive of treponemes that we isolated enabled multilocus sequence typing to quantify the diversity and population structure of DD treponemes. Isolates (n = 121) were obtained from different animal hosts in nine countries on three continents. The analyses herein of currently isolated DD treponemes at seven housekeeping gene loci confirm the classification of the three previously designated phylogroups: the Treponema medium, Treponema phagedenis, and Treponema pedis phylogroups. Sequence analysis of seven DD treponeme housekeeping genes revealed a generally low level of diversity among the strains within each phylogroup, removing the need for the previously used "-like" suffix. Surprisingly, all isolates within each phylogroup clustered together, regardless of host or geographic origin, suggesting that the same sequence types (STs) can infect different animals. Some STs were derived from multiple animals from the same farm, highlighting probable within-farm transmissions. Several STs infected multiple hosts from similar geographic regions, identifying probable frequent between-host transmissions. Interestingly, T. pedis appears to be evolving more quickly than the T. medium or T. phagedenis DD treponeme phylogroup, by forming two unique ST complexes. The lack of phylogenetic discrimination between treponemes isolated from different hosts or geographic regions substantially contrasts with the data for other clinically relevant spirochetes. 
The recent expansion of the host range of digital dermatitis (DD) treponemes from cattle to sheep, goats, pigs, and wild elk, coupled with the high level of 16S rRNA gene sequence similarity across hosts and with human treponemes, suggests that the same bacterial species can cause disease in multiple different hosts. This multilocus sequence typing (MLST) study further demonstrates that these bacteria isolated from different hosts are indeed very similar, raising the potential for cross-species transmission. The study also shows that infection spread occurs frequently, both locally and globally, suggesting transmission by routes other than animal-animal transmission alone. These results indicate that on-farm biosecurity is important for controlling disease spread in domesticated species. Continued surveillance and vigilance are important for ascertaining the evolution and tracking any further host range expansion of these important pathogens. Copyright © 2016 Clegg et al.

  16. Leptospira interrogans serovars Bratislava and Muenchen animal infections: Implications for epidemiology and control.

    PubMed

    Arent, Z; Frizzell, C; Gilmore, C; Allen, A; Ellis, W A

    2016-07-15

    Strains of Leptospira interrogans belonging to two very closely related serovars - Bratislava and Muenchen - have been associated with disease in domestic animals, in particular pigs, but also in horses and dogs. Similar strains have also been recovered from various wildlife species. Their epidemiology is poorly understood. Two hundred and forty seven such isolates, from UK domestic animal and wildlife species, were examined by restriction endonuclease analysis in an attempt to elucidate their epidemiology. A representative sub-sample of 65 of these isolates was further examined by multiple-locus variable-number tandem repeat analysis and 22 by secY sequencing. Ten restriction pattern types were identified. The majority of isolates fell into one of three restriction endonuclease analysis pattern types designated B2a, B2b and M2a. B2a was ubiquitous and was isolated from 10 species and represented the majority of the horse and all dog isolates. B2b was very different, being isolated only from pigs, indicating that this type was maintained by pigs. The pattern M2a was reported for the majority of isolates from pigs but also was common in small rodent isolates. Five restriction pattern types were found only in wildlife suggesting that they are unlikely to pose a disease threat to domestic animals. Multiple-locus variable-number tandem repeat analysis identified six clusters. The REA types B2a and B2b were all found in one MLVA cluster while the majority of the M2a strains examined occurred in another cluster. The secY sequencing detected only one sequence type, clustered with other serovars of Leptospira interrogans. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters.

    PubMed

    Lan, Haidong; Chan, Yuandong; Xu, Kai; Schmidt, Bertil; Peng, Shaoliang; Liu, Weiguo

    2016-07-19

    Computing alignments between two or more sequences is a common operation in computational molecular biology. The continuing growth of biological sequence databases establishes the need for their efficient parallel implementation on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data parallelism, thread-level coarse-grained parallelism, and vector-level fine-grained parallelism. Furthermore, we re-organize the sequence datasets and use Xeon Phi shuffle operations to improve I/O efficiency. Evaluations show that our method achieves a peak overall performance up to 220 GCUPS for scanning real protein sequence databanks on a single node consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of sequence length and size, and number of compute nodes for both database scanning and multiple sequence alignment. Furthermore, the achieved performance is highly competitive in comparison to optimized Xeon Phi and GPU implementations. Our implementation is available at https://github.com/turbo0628/LSDBS-mpi .
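
    For readers unfamiliar with the terms in this record, the following is a minimal, serial sketch of Smith-Waterman database scanning and of the GCUPS metric (billions of dynamic-programming cell updates per second). It is not the authors' parallel implementation: the linear gap penalty, the scoring values, and the function names are assumptions chosen for illustration only.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty, assumed values)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def scan_database(query, database):
    """Score the query against every entry; return (name, score), best first."""
    hits = [(name, smith_waterman_score(query, seq)) for name, seq in database.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def gcups(cell_updates, seconds):
    """Giga cell updates per second for a scan that filled cell_updates cells."""
    return cell_updates / seconds / 1e9


if __name__ == "__main__":
    toy_db = {"s1": "TTTT", "s2": "ACGT"}  # toy sequences, not real data
    print(scan_database("ACGT", toy_db))
```

    Each database entry is scored independently, which is what makes the scan amenable to the data-level and thread-level parallelism the abstract describes.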

  18. Viral expression associated with gastrointestinal adenocarcinomas in TCGA high-throughput sequencing data

    PubMed Central

    2013-01-01

    Background Up to 20% of cancers worldwide are thought to be associated with microbial pathogens, including bacteria and viruses. The widely used methods of viral infection detection are usually limited to a few a priori suspected viruses in one cancer type. To our knowledge, there have not been many broad screening approaches to address this problem more comprehensively. Methods In this study, we performed a comprehensive screening for viruses in nine common cancers using a multistep computational approach. Tumor transcriptome and genome sequencing data were available from The Cancer Genome Atlas (TCGA). Nine hundred fifty eight primary tumors in nine common cancers with poor prognosis were screened against a non-redundant database of virus sequences. DNA sequences from normal matched tissue specimens were used as controls to test whether each virus is associated with tumors. Results We identified human papilloma virus type 18 (HPV-18) and four human herpes viruses (HHV) types 4, 5, 6B, and 8, also known as EBV, CMV, roseola virus, and KSHV, in colon, rectal, and stomach adenocarcinomas. In total, 59% of screened gastrointestinal adenocarcinomas (GIA) were positive for at least one virus: 26% for EBV, 21% for CMV, 7% for HHV-6B, and 20% for HPV-18. Over 20% of tumors were co-infected with multiple viruses. Two viruses (EBV and CMV) were statistically significantly associated with colorectal cancers when compared to the matched healthy tissues from the same individuals (p = 0.02 and 0.03, respectively). HPV-18 was not detected in DNA, and thus, no association testing was possible. Nevertheless, HPV-18 expression patterns suggest viral integration in the host genome, consistent with the potentially oncogenic nature of HPV-18 in colorectal adenocarcinomas. The estimated counts of viral copies were below one per cell for all identified viruses and approached the detection limit. 
Conclusions Our comprehensive screening for viruses in multiple cancer types using next-generation sequencing data clearly demonstrates the presence of viral sequences in GIA. EBV, CMV, and HPV-18 are potentially causal for GIA, although their oncogenic role is yet to be established. PMID:24279398

  19. Population Structure of Invasive Streptococcus pneumoniae in the Netherlands in the Pre-Vaccination Era Assessed by MLVA and Capsular Sequence Typing

    PubMed Central

    Elberse, Karin E. M.; van de Pol, Ingrid; Witteveen, Sandra; van der Heide, Han G. J.; Schot, Corrie S.; van Dijk, Anita; van der Ende, Arie; Schouls, Leo M.

    2011-01-01

    The introduction of nationwide pneumococcal vaccination may lead to serotype replacement and the emergence of new variants that have expanded their genetic repertoire through recombination. To monitor alterations in the pneumococcal population structure, we have developed and utilized Capsular Sequence Typing (CST) in addition to Multiple-Locus Variable number tandem repeat Analysis (MLVA). To assess the serotype of each isolate CST was used. Based on the determination of the partial sequence of the capsular wzh gene, this method assigns a capsular type of an isolate within a single PCR reaction using multiple primersets. The genetic background of pneumococcal isolates was assessed by MLVA. MLVA and CST were used to create a snapshot of the Dutch pneumococcal population causing invasive disease before the introduction of the 7-valent pneumococcal conjugate vaccine in the Netherlands in 2006. A total of 1154 clinical isolates collected and serotyped by the Netherlands Reference Laboratory for Bacterial Meningitis were included in the snapshot. The CST was successful in discriminating most serotypes present in our collection. MLVA demonstrated that isolates belonging to some serotypes had a relatively high genetic diversity whilst other serotypes had a very homogeneous genetic background. MLVA and CST appear to be valuable tools to determine the population structure of pneumococcal isolates and are useful in monitoring the effects of pneumococcal vaccination. PMID:21637810

  20. Identifying metabolic enzymes with multiple types of association evidence

    PubMed Central

    Kharchenko, Peter; Chen, Lifeng; Freund, Yoav; Vitkup, Dennis; Church, George M

    2006-01-01

    Background Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which can not be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. Results We present a novel method for identifying genes encoding for a specific metabolic function based on a local structure of metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as a top candidate within 43% of the cases. Conclusion We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities. PMID:16571130

  1. Histoimmunogenetics Markup Language 1.0: Reporting next generation sequencing-based HLA and KIR genotyping.

    PubMed

    Milius, Robert P; Heuer, Michael; Valiga, Daniel; Doroschak, Kathryn J; Kennedy, Caleb J; Bolon, Yung-Tsi; Schneider, Joel; Pollack, Jane; Kim, Hwa Ran; Cereb, Nezih; Hollenbach, Jill A; Mack, Steven J; Maiers, Martin

    2015-12-01

    We present an electronic format for exchanging data for HLA and KIR genotyping with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, NGS is supported by new XML structures to capture additional NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. A full genotype, consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are externally referenced. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity in a complete and accurate fashion. HML also continues to enable the transmission of legacy methods (e.g. site-specific oligonucleotide, sequence-specific priming, and Sequence Based Typing (SBT)), adding features such as allowing multiple group-specific sequencing primers, and fully leveraging techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
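
    The "hierarchical set of delimiters" mentioned in this record can be illustrated with a small recursive splitter. This is a sketch only: the delimiter characters and their precedence are recalled from the GL String specification and should be treated as an assumption to verify, and the function name parse_gl_string is hypothetical rather than part of HML or any library.

```python
# Delimiters from outermost to innermost. ASSUMPTION: recalled from the
# GL String specification; check the specification before relying on this.
DELIMITERS = [
    ("^", "loci"),
    ("|", "genotype ambiguity"),
    ("+", "gene copies"),
    ("~", "phased haplotype"),
    ("/", "allele ambiguity"),
]


def parse_gl_string(text, level=0):
    """Split text on each delimiter in turn, producing nested lists of alleles."""
    if level == len(DELIMITERS):
        return text  # innermost level: a single allele name
    delimiter, _meaning = DELIMITERS[level]
    return [parse_gl_string(part, level + 1) for part in text.split(delimiter)]


if __name__ == "__main__":
    # Illustrative string: one locus, one genotype, two gene copies,
    # the first of which is ambiguous between two alleles.
    parsed = parse_gl_string("HLA-A*02:01/HLA-A*02:02+HLA-A*03:01")
    print(parsed)
```

    Because every level always yields a list, the nesting depth is the same for every input, so a consumer can index by level without checking which delimiters happened to occur.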

  2. Clinical next-generation sequencing in patients with non-small cell lung cancer.

    PubMed

    Hagemann, Ian S; Devarakonda, Siddhartha; Lockwood, Christina M; Spencer, David H; Guebert, Kalin; Bredemeyer, Andrew J; Al-Kateb, Hussam; Nguyen, TuDung T; Duncavage, Eric J; Cottrell, Catherine E; Kulkarni, Shashikant; Nagarajan, Rakesh; Seibert, Karen; Baggstrom, Maria; Waqar, Saiama N; Pfeifer, John D; Morgensztern, Daniel; Govindan, Ramaswamy

    2015-02-15

    A clinical assay was implemented to perform next-generation sequencing (NGS) of genes commonly mutated in multiple cancer types. This report describes the feasibility and diagnostic yield of this assay in 381 consecutive patients with non-small cell lung cancer (NSCLC). Clinical targeted sequencing of 23 genes was performed with DNA from formalin-fixed, paraffin-embedded (FFPE) tumor tissue. The assay used Agilent SureSelect hybrid capture followed by Illumina HiSeq 2000, MiSeq, or HiSeq 2500 sequencing in a College of American Pathologists-accredited, Clinical Laboratory Improvement Amendments-certified laboratory. Single-nucleotide variants and insertion/deletion events were reported. This assay was performed before methods were developed to detect rearrangements by NGS. Two hundred nine of all requisitioned samples (55%) were successfully sequenced. The most common reason for not performing the sequencing was an insufficient quantity of tissue available in the blocks (29%). Excisional, endoscopic, and core biopsy specimens were sufficient for testing in 95%, 66%, and 40% of the cases, respectively. The median turnaround time (TAT) in the pathology laboratory was 21 days, and there was a trend of an improved TAT with more rapid sequencing platforms. Sequencing yielded a mean coverage of 1318×. Potentially actionable mutations (ie, predictive or prognostic) were identified in 46% of 209 samples and were most commonly found in KRAS (28%), epidermal growth factor receptor (14%), phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (4%), phosphatase and tensin homolog (1%), and BRAF (1%). Five percent of the samples had multiple actionable mutations. A targeted therapy was instituted on the basis of NGS in 11% of the sequenced patients or in 6% of all patients. NGS-based diagnostics are feasible in NSCLC and provide clinically relevant information from readily available FFPE tissue. 
The sample type is associated with the probability of successful testing. © 2014 American Cancer Society.

  3. Multiple cis-acting elements involved in up-regulation of a cytochrome P450 gene conferring resistance to deltamethrin in small brown planthopper, Laodelphax striatellus (Fallén).

    PubMed

    Pu, Jian; Sun, Haina; Wang, Jinda; Wu, Min; Wang, Kangxu; Denholm, Ian; Han, Zhaojun

    2016-11-01

    As well as arising from single point mutations in binding sites or detoxifying enzymes, it is likely that insecticide resistance mechanisms are frequently controlled by multiple genetic factors, resulting in resistance being inherited as a quantitative trait. However, empirical evidence for this is still rare. Here we analyse the causes of up-regulation of CYP6FU1, a monooxygenase implicated in resistance to deltamethrin in the rice pest Laodelphax striatellus. The 5'-flanking region of this gene was cloned and sequenced from individuals of a susceptible and a resistant strain. A luminescent reporter assay was used to evaluate different 5'-flanking regions and their fragments for promoter activity. Mutations enhancing promoter activity in various fragments were characterized, singly and in combination, by site mutation recovery. Nucleotide diversity in flanking sequences was greatly reduced in deltamethrin-resistant insects compared to susceptible ones. Phylogenetic sequence analysis found that CYP6FU1 had five different types of 5'-flanking region. All five types were present in a susceptible strain but only a single type showing the highest promoter activity was present in a resistant strain. Four cis-acting elements were identified whose influence on up-regulation was much more pronounced in combination than when present singly. Of these, two were new transcription factor (TF) binding sites produced by mutations, another one was also a new TF binding site altered from an existing one, and the fourth was a unique transcription start site. These results demonstrate that multiple cis-acting elements are involved in up-regulating CYP6FU1 to generate a resistance phenotype. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Understanding the molecular epidemiology and global relationships of Brachyspira hyodysenteriae from swine herds in the United States: a multi-locus sequence typing approach.

    PubMed

    Mirajkar, Nandita S; Gebhart, Connie J

    2014-01-01

    Outbreaks of mucohemorrhagic diarrhea in pigs caused by Brachyspira hyodysenteriae in the late 2000s indicated the re-emergence of Swine Dysentery (SD) in the U.S. Although the clinical disease was absent in the U.S. since the early 1990s, it continued to cause significant economic losses to other swine rearing countries worldwide. This study aims to fill the gap in knowledge pertaining to the re-emergence and epidemiology of B. hyodysenteriae in the U.S. and its global relationships using a multi-locus sequence typing (MLST) approach. Fifty-nine post re-emergent isolates originating from a variety of sources in the U.S. were characterized by MLST, analyzed for epidemiological relationships (within and between multiple sites of swine systems), and were compared with pre re-emergent isolates from the U.S. Information for an additional 272 global isolates from the MLST database was utilized for international comparisons. Thirteen nucleotide sequence types (STs) including a predominant genotype (ST93) were identified in the post re-emergent U.S. isolates; some of which showed genetic similarity to the pre re-emergent STs thereby suggesting its likely role in the re-emergence of SD. In the U.S., in general, no more than one ST was found on a site; multiple sites of a common system shared a ST; and STs found in the U.S. were distinct from those identified globally. Of the 110 STs characterized from ten countries, only two were found in more than one country. The U.S. and global populations, identified as clonal and heterogeneous based on STs, showed close relatedness based on amino acid types (AATs). One predicted founder type (AAT9) and multiple predicted subgroup founder types identified for both the U.S. and the global population indicate the potential microevolution of this pathogen. This study elucidates the strain diversity and microevolution of B. hyodysenteriae, and highlights the utility of MLST for epidemiological and surveillance studies.
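
    As a rough illustration of how MLST turns per-locus allele numbers into sequence types (STs), the sketch below looks up an isolate's allele profile in a table and groups isolates by ST. Everything in it is hypothetical: the locus names, the number of loci, the profiles, and the ST numbers are toy values and do not come from the study or from any MLST database.

```python
# Hypothetical seven-locus scheme; the locus names are placeholders.
SCHEME = ("locus1", "locus2", "locus3", "locus4", "locus5", "locus6", "locus7")

# Toy lookup table: allele-number profile -> sequence type (invented values).
ST_TABLE = {
    (1, 1, 1, 1, 1, 1, 1): 1,
    (1, 1, 2, 1, 1, 1, 1): 2,
}


def assign_sequence_type(profile, scheme=SCHEME, st_table=ST_TABLE):
    """Return the ST for a {locus: allele number} profile, or None if novel."""
    key = tuple(profile[locus] for locus in scheme)
    return st_table.get(key)


def group_by_st(isolates):
    """Group isolate names by assigned ST to expose shared types across sites."""
    groups = {}
    for name, profile in isolates.items():
        groups.setdefault(assign_sequence_type(profile), []).append(name)
    return groups


if __name__ == "__main__":
    base = dict(zip(SCHEME, (1, 1, 1, 1, 1, 1, 1)))
    variant = dict(base, locus3=2)  # differs from base at a single locus
    print(group_by_st({"siteA_1": base, "siteB_1": base, "siteC_1": variant}))
```

    Isolates from different sites that land in the same group share an ST, which is the kind of observation the abstract uses to infer relationships within and between sites.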

  5. Sequence evidence for the symbiotic origins of chloroplasts and mitochondria

    NASA Technical Reports Server (NTRS)

    George, D. G.; Hunt, L. T.; Dayhoff, M. O.

    1983-01-01

    The origin of mitochondria and chloroplasts is investigated on the basis of prokaryotic and early-eukaryotic evolutionary trees derived from protein and nucleic-acid sequences by the method of Dayhoff (1979). Trees for bacterial ferredoxins, 5S ribosomal RNA, c-type cytochromes, the lipid-binding subunit of ATPase, and dihydrofolate reductase are presented and discussed. Good agreement among the trees is found, and it is argued that the mitochondria and chloroplasts evolved by multiple symbiotic events.

  6. Integrating and mining the chromatin landscape of cell-type specificity using self-organizing maps.

    PubMed

    Mortazavi, Ali; Pepke, Shirley; Jansen, Camden; Marinov, Georgi K; Ernst, Jason; Kellis, Manolis; Hardison, Ross C; Myers, Richard M; Wold, Barbara J

    2013-12-01

    We tested whether self-organizing maps (SOMs) could be used to effectively integrate, visualize, and mine diverse genomics data types, including complex chromatin signatures. A fine-grained SOM was trained on 72 ChIP-seq histone modifications and DNase-seq data sets from six biologically diverse cell lines studied by The ENCODE Project Consortium. We mined the resulting SOM to identify chromatin signatures related to sequence-specific transcription factor occupancy, sequence motif enrichment, and biological functions. To highlight clusters enriched for specific functions such as transcriptional promoters or enhancers, we overlaid onto the map additional data sets not used during training, such as ChIP-seq, RNA-seq, CAGE, and information on cis-acting regulatory modules from the literature. We used the SOM to parse known transcriptional enhancers according to the cell-type-specific chromatin signature, and we further corroborated this pattern on the map by EP300 (also known as p300) occupancy. New candidate cell-type-specific enhancers were identified for multiple ENCODE cell types in this way, along with new candidates for ubiquitous enhancer activity. An interactive web interface was developed to allow users to visualize and custom-mine the ENCODE SOM. We conclude that large SOMs trained on chromatin data from multiple cell types provide a powerful way to identify complex relationships in genomic data at user-selected levels of granularity.

  7. Integrating and mining the chromatin landscape of cell-type specificity using self-organizing maps

    PubMed Central

    Mortazavi, Ali; Pepke, Shirley; Jansen, Camden; Marinov, Georgi K.; Ernst, Jason; Kellis, Manolis; Hardison, Ross C.; Myers, Richard M.; Wold, Barbara J.

    2013-01-01

    We tested whether self-organizing maps (SOMs) could be used to effectively integrate, visualize, and mine diverse genomics data types, including complex chromatin signatures. A fine-grained SOM was trained on 72 ChIP-seq histone modifications and DNase-seq data sets from six biologically diverse cell lines studied by The ENCODE Project Consortium. We mined the resulting SOM to identify chromatin signatures related to sequence-specific transcription factor occupancy, sequence motif enrichment, and biological functions. To highlight clusters enriched for specific functions such as transcriptional promoters or enhancers, we overlaid onto the map additional data sets not used during training, such as ChIP-seq, RNA-seq, CAGE, and information on cis-acting regulatory modules from the literature. We used the SOM to parse known transcriptional enhancers according to the cell-type-specific chromatin signature, and we further corroborated this pattern on the map by EP300 (also known as p300) occupancy. New candidate cell-type-specific enhancers were identified for multiple ENCODE cell types in this way, along with new candidates for ubiquitous enhancer activity. An interactive web interface was developed to allow users to visualize and custom-mine the ENCODE SOM. We conclude that large SOMs trained on chromatin data from multiple cell types provide a powerful way to identify complex relationships in genomic data at user-selected levels of granularity. PMID:24170599

  8. Assembly, Annotation, and Analysis of Multiple Mycorrhizal Fungal Genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Initiative Consortium, Mycorrhizal Genomics; Kuo, Alan; Grigoriev, Igor

    Mycorrhizal fungi play critical roles in host plant health, soil community structure and chemistry, and carbon and nutrient cycling, all areas of intense interest to the US Dept. of Energy (DOE) Joint Genome Institute (JGI). To this end we are building on our earlier sequencing of the Laccaria bicolor genome by partnering with INRA-Nancy and the mycorrhizal research community in the MGI to sequence and analyze dozens of mycorrhizal genomes of all Basidiomycota and Ascomycota orders and multiple ecological types (ericoid, orchid, and ectomycorrhizal). JGI has developed and deployed high-throughput sequencing techniques, and Assembly, RNASeq, and Annotation Pipelines. In 2012 alone we sequenced, assembled, and annotated 12 draft or improved genomes of mycorrhizae, and predicted ~232831 genes and ~15011 multigene families. All of this data is publicly available on JGI MycoCosm (http://jgi.doe.gov/fungi/), which provides access to both the genome data and tools with which to analyze the data. Preliminary comparisons of the current total of 14 public mycorrhizal genomes suggest that 1) short secreted proteins potentially involved in symbiosis are more enriched in some orders than in others amongst the mycorrhizal Agaricomycetes, 2) there are wide ranges of numbers of genes involved in certain functional categories, such as signal transduction and post-translational modification, and 3) novel gene families are specific to some ecological types.

  9. TGS-TB: Total Genotyping Solution for Mycobacterium tuberculosis Using Short-Read Whole-Genome Sequencing

    PubMed Central

    Sekizuka, Tsuyoshi; Yamashita, Akifumi; Murase, Yoshiro; Iwamoto, Tomotada; Mitarai, Satoshi; Kato, Seiya; Kuroda, Makoto

    2015-01-01

    Whole-genome sequencing (WGS) with next-generation DNA sequencing (NGS) is an increasingly accessible and affordable method for genotyping hundreds of Mycobacterium tuberculosis (Mtb) isolates, leading to more effective epidemiological studies involving single nucleotide variations (SNVs) in core genomic sequences based on molecular evolution. We developed an all-in-one web-based tool for genotyping Mtb, referred to as the Total Genotyping Solution for TB (TGS-TB), to facilitate multiple genotyping platforms using NGS for spoligotyping and the detection of phylogenies with core genomic SNVs, IS6110 insertion sites, and 43 customized loci for variable number tandem repeat (VNTR) through a user-friendly, simple click interface. This methodology is implemented with a KvarQ script to predict MTBC lineages/sublineages and potential antimicrobial resistance. Seven Mtb isolates (JP01 to JP07) in this study showing the same VNTR profile were accurately discriminated through median-joining network analysis using SNVs unique to those isolates. An additional IS6110 insertion was detected in one of those isolates as supportive genetic information in addition to core genomic SNVs. The results of in silico analyses using TGS-TB are consistent with those obtained using conventional molecular genotyping methods, suggesting that NGS short reads could provide multiple genotypes to discriminate multiple strains of Mtb, although longer NGS reads (≥300-mer) will be required for full genotyping on the TGS-TB web site. Most available short reads (~100-mer) can be utilized to discriminate the isolates based on the core genome phylogeny. TGS-TB provides a more accurate and discriminative strain typing for clinical and epidemiological investigations; NGS strain typing offers a total genotyping solution for Mtb outbreak and surveillance. TGS-TB web site: https://gph.niid.go.jp/tgs-tb/. PMID:26565975

  10. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results.

    PubMed

    Worley, K C; Wiese, B A; Smith, R F

    1995-09-01

    BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. 
BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL is <http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html>).

  11. Mining co-occurrence and sequence patterns from cancer diagnoses in New York State.

    PubMed

    Wang, Yu; Hou, Wei; Wang, Fusheng

    2018-01-01

    The goal of this study is to discover disease co-occurrence and sequence patterns from large scale cancer diagnosis histories in New York State. In particular, we want to identify disparities among different patient groups. Our study will provide essential knowledge for clinical researchers to further investigate comorbidities and disease progression for improving the management of multiple diseases. We used inpatient discharge and outpatient visit records from the New York State Statewide Planning and Research Cooperative System (SPARCS) from 2011-2015. We grouped each patient's visit history to generate diagnosis sequences for the seven most common cancer types. We performed frequent disease co-occurrence mining using the Apriori algorithm, and frequent disease sequence pattern discovery using the cSPADE algorithm. Different types of cancer demonstrated distinct patterns. Disparities in both disease co-occurrence and sequence patterns were observed among patients in different age groups. There were also considerable disparities in disease co-occurrence patterns with respect to different claim types (i.e., inpatient, outpatient, emergency department and ambulatory surgery). Disparities regarding gender were mostly found where the cancer types were gender specific. Supports of most patterns were usually higher for males than for females. Compared with secondary diagnosis codes, primary diagnosis codes can convey more stable results. Two disease sequences consisting of the same diagnoses but in different orders usually had different supports. Our results suggest that the methods adopted can generate potentially interesting and clinically meaningful disease co-occurrence and sequence patterns, and identify disparities among various patient groups. These patterns could imply comorbidities and disease progressions.
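
    The co-occurrence step named in this record can be sketched with a small, unoptimized Apriori implementation. This is an illustrative sketch, not the authors' pipeline: the function name, the toy transactions, and the support threshold are assumptions, and the item labels are generic placeholders rather than diagnosis codes. It covers only unordered co-occurrence; order-sensitive sequence mining such as cSPADE is not shown.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemsets, then prune any
        # candidate that has an infrequent (k-1)-subset.
        current = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                current.add(union)
    return frequent


if __name__ == "__main__":
    visits = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B"}]  # toy data
    for itemset, s in sorted(apriori(visits, 0.5).items(), key=lambda kv: -kv[1]):
        print(sorted(itemset), s)
```

    The pruning step relies on the fact that an itemset cannot be frequent unless all of its subsets are, which keeps the number of candidates small as the itemset size grows.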

  12. Variation in Campylobacter Multilocus Sequence Typing Subtypes Detected on Three Different Plating Media

    USDA-ARS's Scientific Manuscript database

    Introduction: There are multiple selective plating media available for detection and enumeration of naturally occurring Campylobacter. Campylobacter produce colonies with differing morphology and characteristics depending on the plating medium used. It is unclear if choice of plating medium can a...

  13. Aptamer-conjugated nanoparticles for cancer cell detection.

    PubMed

    Medley, Colin D; Bamrungsap, Suwussa; Tan, Weihong; Smith, Joshua E

    2011-02-01

    Aptamer-conjugated nanoparticles (ACNPs) have been used for a variety of applications, particularly dual nanoparticles for magnetic extraction and fluorescent labeling. In this type of assay, silica-coated magnetic and fluorophore-doped silica nanoparticles are conjugated to highly selective aptamers to detect and extract targeted cells in a variety of matrixes. However, considerable improvements are required in order to increase the selectivity and sensitivity of this two-particle assay to be useful in a clinical setting. To accomplish this, several parameters were investigated, including nanoparticle size, conjugation chemistry, use of multiple aptamer sequences on the nanoparticles, and use of multiple nanoparticles with different aptamer sequences. After identifying the best-performing elements, the improvements made to this assay's conditional parameters were combined to illustrate the overall enhanced sensitivity and selectivity of the two-particle assay using an innovative multiple aptamer approach, signifying a critical feature in the advancement of this technique.

  14. A Multiple-Sequence Variant of the Multiple-Baseline Design: A Strategy for Analysis of Sequence Effects and Treatment Comparison.

    ERIC Educational Resources Information Center

    Noell, George H.; Gresham, Frank M.

    2001-01-01

    Describes design logic and potential uses of a variant of the multiple-baseline design. The multiple-baseline multiple-sequence (MBL-MS) consists of multiple-baseline designs that are interlaced with one another and include all possible sequences of treatments. The MBL-MS design appears to be primarily useful for comparison of treatments taking…

  15. On the specificity of sequential congruency effects in implicit learning of motor and perceptual sequences.

    PubMed

    D'Angelo, Maria C; Jiménez, Luis; Milliken, Bruce; Lupiáñez, Juan

    2013-01-01

    Individuals experience less interference from conflicting information following events that contain conflicting information. Recently, Jiménez, Lupiáñez, and Vaquero (2009) demonstrated that such adaptations to conflict occur even when the source of conflict arises from implicit knowledge of sequences. There is accumulating evidence that momentary changes in adaptations made in response to conflicting information are conflict-type specific (e.g., Funes, Lupiáñez, & Humphreys, 2010a), suggesting that there are multiple modes of control. The current study examined whether conflict-specific sequential congruency effects occur when the 2 sources of conflict are implicitly learned. Participants implicitly learned a motor sequence while simultaneously learning a perceptual sequence. In a first experiment, after learning the 2 orthogonal sequences, participants expressed knowledge of the 2 sequences independently of each other in a transfer phase. In Experiments 2 and 3, within each sequence, the presence of a single control trial disrupted the expression of this specific type of learning on the following trial. There was no evidence of cross-conflict modulations in the expression of sequence learning. The results suggest that the mechanisms involved in transient shifts in conflict-specific control, as reflected in sequential congruency effects, are also engaged when the source of conflict is implicit. (c) 2013 APA, all rights reserved.

  16. Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

    DOEpatents

    Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S

    2013-06-25

    A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.

  17. Identification of four novel HLA-B alleles, B*1590, B*1591, B*2726, and B*4705, from an East African population by high-resolution sequence-based typing.

    PubMed

    Luo, M; Mao, X; Plummer, F A

    2005-02-01

    We report here four novel HLA-B alleles, B*1590, B*1591, B*2726, and B*4705, identified from an East African population during sequence-based HLA-B typing. The novel alleles were confirmed by sequencing two separate polymerase chain reaction products, and by molecular cloning and sequencing multiple clones. B*1590 is identical to B*1510 at exon 2 and exon 3, except for a difference (GCC→GTC) at codon 158. Sequence differences at codon 152 (GAG→GTG) and codon 167 (TGG→TCG) differentiate B*1591 from B*1503 at exon 3. B*2726 is identical to B*2708 at exon 2 and exon 3, except for a difference (AAG→CAG) at codon 70. B*4705 was identified in three Kenyan women. The allele is identical to B*47010101/02 at exon 2 and exon 3, except for differences at codon 97 (AGG→AAT) and codon 99 (TTT→TAT). These new alleles have been named by the WHO Nomenclature Committee. Identification of these novel HLA-B alleles reflects the genetic diversity of this East African population.

  18. Full genome sequence analysis of a novel adenovirus of rhesus macaque origin indicates a new simian adenovirus type and species.

    PubMed

    Malouli, Daniel; Howell, Grant L; Legasse, Alfred W; Kahl, Christoph; Axthelm, Michael K; Hansen, Scott G; Früh, Klaus

    2014-09-01

    Multiple novel simian adenoviruses have been isolated over the past years and their potential to cross the species barrier and infect the human population is an ever present threat. Here we describe the isolation and full genome sequencing of a novel simian adenovirus (SAdV) isolated from the urine of two independent, never co-housed, late stage simian immunodeficiency virus (SIV)-infected rhesus macaques. The viral genome sequences revealed a novel type with a unique genome length, GC content, E3 region and DNA polymerase amino acid sequence that is sufficiently distinct from all currently known human- or simian adenovirus species to warrant classifying these isolates as a novel species of simian adenovirus. This new species, termed Simian mastadenovirus D (SAdV-D), displays the standard genome organization for the genus Mastadenovirus containing only one copy of the fiber gene which sets it apart from the old world monkey adenovirus species HAdV-G, SAdV-B and SAdV-C.

  19. Detection of high-risk mucosal human papillomavirus DNA in human specimens by a novel and sensitive multiplex PCR method combined with DNA microarray.

    PubMed

    Gheit, Tarik; Tommasino, Massimo

    2011-01-01

    Epidemiological and functional studies have clearly demonstrated that certain types of human papillomavirus (HPV) from the genus alpha of the HPV phylogenetic tree, referred to as high-risk (HR) types, are the etiological cause of cervical cancer. Several methods for HPV detection and typing have been developed, and their importance in clinical and epidemiological studies has been well demonstrated. However, comparative studies have shown that several assays have different sensitivities for the detection of specific HPV types, particularly in the case of multiple infections. In this chapter, we describe a novel one-shot method for the detection and typing of 19 mucosal HR HPV types (types 16, 18, 26, 31, 33, 35, 39, 45, 51, 52, 53, 56, 58, 59, 66, 68, 70, 73, and 82). The assay combines the advantages of the multiplex PCR methods, i.e., high sensitivity and the possibility to perform multiple amplifications in a single reaction, with an array primer extension (APEX) assay. The latter method offers the benefits of Sanger dideoxy sequencing with the high-throughput potential of the microarray. Initial studies have revealed that the assay is very sensitive in detecting multiple HPV infections.

  20. On continuous user authentication via typing behavior.

    PubMed

    Roth, Joseph; Liu, Xiaoming; Metaxas, Dimitris

    2014-10-01

    We hypothesize that an individual computer user has a unique and consistent habitual pattern of hand movements, independent of the text, while typing on a keyboard. As a result, this paper proposes a novel biometric modality named typing behavior (TB) for continuous user authentication. Given a webcam pointing toward a keyboard, we develop real-time computer vision algorithms to automatically extract hand movement patterns from the video stream. Unlike the typical continuous biometrics, such as keystroke dynamics (KD), TB provides a reliable authentication with a short delay, while avoiding explicit key-logging. We collect a video database where 63 unique subjects type static text and free text for multiple sessions. For one typing video, the hands are segmented in each frame and a unique descriptor is extracted based on the shape and position of hands, as well as their temporal dynamics in the video sequence. We propose a novel approach, named bag of multi-dimensional phrases, to match the cross-feature and cross-temporal pattern between a gallery sequence and probe sequence. The experimental results demonstrate a superior performance of TB when compared with KD, which, together with our ultrareal-time demo system, warrant further investigation of this novel vision application and biometric modality.

  1. On-line resources for bacterial micro-evolution studies using MLVA or CRISPR typing.

    PubMed

    Grissa, Ibtissem; Bouchon, Patrick; Pourcel, Christine; Vergnaud, Gilles

    2008-04-01

    The control of bacterial pathogens requires the development of tools allowing the precise identification of strains at the subspecies level. It is now widely accepted that these tools will need to be DNA-based assays (in contrast to identification at the species level, where biochemical based assays are still widely used, even though very powerful 16S DNA sequence databases exist). Typing assays need to be cheap and amenable to the design of international databases. The success of such subspecies typing tools will eventually be measured by the size of the associated reference databases accessible over the internet. Three methods have shown some potential in this direction: the so-called spoligotyping assay (Mycobacterium tuberculosis, 40,000-entry database), Multiple Loci Sequence Typing (MLST; up to a few thousand entries for more than 20 bacterial species), and more recently Multiple Loci VNTR Analysis (MLVA; up to a few hundred entries, assays available for more than 20 pathogens). In the present report we review the current status of the tools and resources we have developed over the past seven years to help in the setting-up or the use of MLVA assays or, lately, for analysing Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs), which are the basis for spoligotyping assays.

  2. Axolotl hemoglobin: cDNA-derived amino acid sequences of two alpha globins and a beta globin from an adult Ambystoma mexicanum.

    PubMed

    Shishikura, Fumio; Takeuchi, Hiro-aki; Nagai, Takatoshi

    2005-11-01

    Erythrocytes of the adult axolotl, Ambystoma mexicanum, have multiple hemoglobins. We separated and purified two kinds of hemoglobin, termed major hemoglobin (Hb M) and minor hemoglobin (Hb m), from a five-year-old male by hydrophobic interaction column chromatography on Alkyl Superose. The hemoglobins have two distinct alpha type globin polypeptides (alphaM and alpham) and a common beta globin polypeptide, all of which were purified in FPLC on a reversed-phase column after S-pyridylethylation. The complete amino acid sequences of the three globin chains were determined separately using nucleotide sequencing with the assistance of protein sequencing. The mature globin molecules were composed of 141 amino acid residues for alphaM globin, 143 for alpham globin and 146 for beta globin. Comparing primary structures of the five kinds of axolotl globins, including two previously established alpha type globins from the same species, with other known globins of amphibians and representatives of other vertebrates, we constructed phylogenetic trees for amphibian hemoglobins and tetrapod hemoglobins. The molecular trees indicated that alphaM, alpham, beta and the previously known alpha major globin were adult types of globins and the other known alpha globin was a larval type. The existence of two to four more globins in the axolotl erythrocyte is predicted.

  3. Bypassing bacterial infection in phage display by sequencing DNA released from phage particles.

    PubMed

    Villequey, Camille; Kong, Xu-Dong; Heinis, Christian

    2017-11-01

    Phage display relies on a bacterial infection step in which the phage particles are replicated to perform multiple affinity selection rounds and to enable the identification of isolated clones by DNA sequencing. While this process is efficient for wild-type phage, the bacterial infection rate of phage with mutant or chemically modified coat proteins can be low. For example, a phage mutant with a disulfide-free p3 coat protein, used for the selection of bicyclic peptides, has a more than 100-fold reduced infection rate compared to the wild-type. A potential strategy for bypassing the bacterial infection step is to directly sequence DNA extracted from phage particles after a single round of phage panning using high-throughput sequencing. In this work, we have quantified the fraction of phage clones that can be identified by directly sequencing DNA from phage particles. The results show that the DNA of essentially all of the phage particles can be 'decoded', and that the sequence coverage for mutants equals that of amplified DNA extracted from cells infected with wild-type phage. This procedure is particularly attractive for selections with phage that have a compromised infection capacity, and it may allow phage display to be performed with particles that are not infective at all. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. [Genetic analysis of two children patients affected with CHARGE syndrome].

    PubMed

    Li, Guoqiang; Li, Niu; Xu, Yufei; Li, Juan; Ding, Yu; Shen, Yiping; Wang, Xiumin; Wang, Jian

    2018-04-10

    To analyze two Chinese pediatric patients with multiple malformations and growth and development delay. Both patients were subjected to targeted gene sequencing, and the results were analyzed with Ingenuity Variant Analysis software. Suspected pathogenic variations were verified by Sanger sequencing. High-throughput sequencing showed that both patients carried heterozygous variants of the CHD7 gene. Patient 1 carried a nonsense mutation in exon 36 (c.7957C>T, p.Arg2653*), while patient 2 carried a nonsense mutation in exon 2 (c.718C>T, p.Gln240*). Sanger sequencing confirmed the above mutations in both patients, while their parents were wild-type at the corresponding sites, indicating that the two mutations occurred de novo. The two patients were diagnosed with CHARGE syndrome by high-throughput sequencing.
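
    The coding-level and protein-level variant names in this abstract can be cross-checked with simple arithmetic. The Python sketch below is only an illustration, not part of the study: it maps a 1-based coding position to its codon number and offset, then lists which codons of the reference amino acid become a stop codon under a C>T change at that offset. It uses the standard genetic code; the function names are hypothetical, and the actual CHD7 reference codons are not given in the abstract, so the code only enumerates the possibilities.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}  # arginine
GLN_CODONS = {"CAA", "CAG"}                              # glutamine


def codon_of(cdna_pos):
    """Return (codon number, position within codon 1-3) for a 1-based coding position."""
    codon = (cdna_pos - 1) // 3 + 1
    offset = (cdna_pos - 1) % 3 + 1
    return codon, offset


def nonsense_by_c_to_t(codons, offset):
    """List (reference codon, mutated codon) pairs where C>T at `offset` creates a stop."""
    result = []
    for codon in sorted(codons):
        if codon[offset - 1] == "C":
            mutated = codon[:offset - 1] + "T" + codon[offset:]
            if mutated in STOP_CODONS:
                result.append((codon, mutated))
    return result


print(codon_of(7957))                      # codon and offset for c.7957
print(codon_of(718))                       # codon and offset for c.718
print(nonsense_by_c_to_t(ARG_CODONS, 1))   # Arg codons that can become a stop
print(nonsense_by_c_to_t(GLN_CODONS, 1))   # Gln codons that can become a stop
```

    Position 7957 falls on the first base of codon 2653 and position 718 on the first base of codon 240, consistent with p.Arg2653* and p.Gln240*.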

  5. Role of Modular Polyketide Synthases in the Production of Polyether Ladder Compounds in Ciguatoxin-Producing Gambierdiscus polynesiensis and G. excentricus (Dinophyceae).

    PubMed

    Kohli, Gurjeet S; Campbell, Katrina; John, Uwe; Smith, Kirsty F; Fraga, Santiago; Rhodes, Lesley L; Murray, Shauna A

    2017-09-01

    Gambierdiscus, a benthic dinoflagellate, produces ciguatoxins that cause the human illness Ciguatera. Ciguatoxins are polyether ladder compounds that have a polyketide origin, indicating that polyketide synthases (PKS) are involved in their production. We sequenced transcriptomes of Gambierdiscus excentricus and Gambierdiscus polynesiensis and found 264 contigs encoding single domain ketoacyl synthases (KS; G. excentricus: 106, G. polynesiensis: 143) and ketoreductases (KR; G. excentricus: 7, G. polynesiensis: 8) with sequence similarity to type I PKSs, as reported in other dinoflagellates. In addition, 24 contigs (G. excentricus: 3, G. polynesiensis: 21) encoding multiple PKS domains (forming typical type I PKSs modules) were found. The proposed structure produced by one of these megasynthases resembles a partial carbon backbone of a polyether ladder compound. Seventeen contigs encoding single domain KS, KR, s-malonyltransacylase, dehydratase and enoyl reductase with sequence similarity to type II fatty acid synthases (FAS) in plants were found. Type I PKS and type II FAS genes were distinguished based on the arrangement of domains on the contigs and their sequence similarity and phylogenetic clustering with known PKS/FAS genes in other organisms. This differentiation of PKS and FAS pathways in Gambierdiscus is important, as it will facilitate approaches to investigating toxin biosynthesis pathways in dinoflagellates. © 2017 The Author(s) Journal of Eukaryotic Microbiology © 2017 International Society of Protistologists.

  6. Phenotype–genotype correlation in Hirschsprung disease is illuminated by comparative analysis of the RET protein sequence

    PubMed Central

    Kashuk, Carl S.; Stone, Eric A.; Grice, Elizabeth A.; Portnoy, Matthew E.; Green, Eric D.; Sidow, Arend; Chakravarti, Aravinda; McCallion, Andrew S.

    2005-01-01

    The ability to discriminate between deleterious and neutral amino acid substitutions in the genes of patients remains a significant challenge in human genetics. The increasing availability of genomic sequence data from multiple vertebrate species allows sequence conservation and physicochemical properties of residues to be used for functional prediction. In this study, the RET receptor tyrosine kinase serves as a model disease gene in which a broad spectrum (≥116) of disease-associated mutations has been identified among patients with Hirschsprung disease and multiple endocrine neoplasia type 2. We report the alignment of the human RET protein sequence with the orthologous sequences of 12 non-human vertebrates (eight mammalian, one avian, and three teleost species), their comparative analysis, the evolutionary topology of the RET protein, and predicted tolerance for all published missense mutations. We show that, although evolutionary conservation alone provides significant information to predict the effect of a RET mutation, a model that combines comparative sequence data with analysis of physicochemical properties in a quantitative framework provides far greater accuracy. Although the ability to discern the impact of a mutation is imperfect, our analyses permit substantial discrimination between predicted functional classes of RET mutations and disease severity even for a multigenic disease such as Hirschsprung disease. PMID:15956201

  7. Differential Effects of Alcohol on Working Memory: Distinguishing Multiple Processes

    PubMed Central

    Saults, J. Scott; Cowan, Nelson; Sher, Kenneth J.; Moreno, Matthew V.

    2008-01-01

    We assessed effects of alcohol consumption on different types of working memory (WM) tasks in an attempt to characterize the nature of alcohol effects on cognition. The WM tasks varied in two properties of materials to be retained in a two-stimulus comparison procedure. Conditions included (1) spatial arrays of colors, (2) temporal sequences of colors, (3) spatial arrays of spoken digits, and (4) temporal sequences of spoken digits. Alcohol consumption impaired memory for auditory and visual sequences, but not memory for simultaneous arrays of auditory or visual stimuli. These results suggest that processes needed to encode and maintain stimulus sequences, such as rehearsal, are more sensitive to alcohol intoxication than other WM mechanisms needed to maintain multiple concurrent items, such as focusing attention on them. These findings help to resolve disparate findings from prior research into alcohol’s effect on WM and on divided attention. The results suggest that moderate doses of alcohol impair WM by affecting certain mnemonic strategies and executive processes rather than by shrinking the basic holding capacity of WM. PMID:18179311

  8. Differential effects of alcohol on working memory: distinguishing multiple processes.

    PubMed

    Saults, J Scott; Cowan, Nelson; Sher, Kenneth J; Moreno, Matthew V

    2007-12-01

    The authors assessed effects of alcohol consumption on different types of working memory (WM) tasks in an attempt to characterize the nature of alcohol effects on cognition. The WM tasks varied in 2 properties of materials to be retained in a 2-stimulus comparison procedure. Conditions included (a) spatial arrays of colors, (b) temporal sequences of colors, (c) spatial arrays of spoken digits, and (d) temporal sequences of spoken digits. Alcohol consumption impaired memory for auditory and visual sequences but not memory for simultaneous arrays of auditory or visual stimuli. These results suggest that processes needed to encode and maintain stimulus sequences, such as rehearsal, are more sensitive to alcohol intoxication than other WM mechanisms needed to maintain multiple concurrent items, such as focusing attention on them. These findings help to resolve disparate findings from prior research on alcohol's effect on WM and on divided attention. The results suggest that moderate doses of alcohol impair WM by affecting certain mnemonic strategies and executive processes rather than by shrinking the basic holding capacity of WM. (c) 2008 APA, all rights reserved.

  9. Effect of the sequence data deluge on the performance of methods for detecting protein functional residues.

    PubMed

    Garrido-Martín, Diego; Pazos, Florencio

    2018-02-27

    The exponential accumulation of new sequences in public databases is expected to improve the performance of all the approaches for predicting protein structural and functional features. Nevertheless, this was never assessed or quantified for some widely used methodologies, such as those aimed at detecting functional sites and functional subfamilies in protein multiple sequence alignments. Using raw protein sequences as only input, these approaches can detect fully conserved positions, as well as those with a family-dependent conservation pattern. Both types of residues are routinely used as predictors of functional sites and, consequently, understanding how the sequence content of the databases affects them is relevant and timely. In this work we evaluate how the growth and change with time in the content of sequence databases affect five sequence-based approaches for detecting functional sites and subfamilies. We do that by recreating historical versions of the multiple sequence alignments that would have been obtained in the past based on the database contents at different time points, covering a period of 20 years. Applying the methods to these historical alignments allows quantifying the temporal variation in their performance. Our results show that the number of families to which these methods can be applied sharply increases with time, while their ability to detect potentially functional residues remains almost constant. These results are informative for the methods' developers and final users, and may have implications in the design of new sequencing initiatives.
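
    As a minimal illustration of what "fully conserved positions" in a multiple sequence alignment means, the Python sketch below reports the columns in which every sequence carries the same residue. It is not one of the five methods evaluated in the paper, and the short alignment is invented; treating a gap ("-") as non-conserved is an assumption of this sketch.

```python
def conserved_columns(alignment):
    """Return 0-based column indices where all sequences share one residue (gaps excluded)."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in alignment}
        # A column is conserved if it holds exactly one symbol and that symbol is not a gap.
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Hypothetical aligned protein fragments.
alignment = [
    "MKV-LHG",
    "MRVALHG",
    "MKVSLHG",
]
cols = conserved_columns(alignment)
print(cols)
```

    Here columns 0, 2, 4, 5 and 6 are fully conserved. Detecting positions with a family-dependent conservation pattern would additionally require grouping the sequences into subfamilies, which this sketch does not attempt.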

  10. A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3

    PubMed Central

    Dietmann, Sabine; Park, Jong; Notredame, Cedric; Heger, Andreas; Lappe, Michael; Holm, Liisa

    2001-01-01

    The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank (PDB). The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to: (i) supersecondary structural motifs (attractors in fold space), (ii) the topology of globular domains (fold types), (iii) remote homologues (functional families) and (iv) homologues with sequence identity above 25% (sequence families). The computational definitions of attractors and functional families are new. In September 2000, the Dali classification contained 10 531 PDB entries comprising 17 101 chains, which were partitioned into five attractor regions, 1375 fold types, 2582 functional families and 3724 domain sequence families. Sequence families were further associated with 99 582 unique homologous sequences in the HSSP database, which increases the number of effectively known structures several-fold. The resulting database contains the description of protein domain architecture, the definition of structural neighbours around each known structure, the definition of structurally conserved cores and a comprehensive library of explicit multiple alignments of distantly related protein families. PMID:11125048

  11. The Poultry-Associated Microbiome: Network Analysis and Farm-to-Fork Characterizations

    PubMed Central

    Oakley, Brian B.; Morales, Cesar A.; Line, J.; Berrang, Mark E.; Meinersmann, Richard J.; Tillman, Glenn E.; Wise, Mark G.; Siragusa, Gregory R.; Hiett, Kelli L.; Seal, Bruce S.

    2013-01-01

    Microbial communities associated with agricultural animals are important for animal health, food safety, and public health. Here we combine high-throughput sequencing (HTS), quantitative-PCR assays, and network analysis to profile the poultry-associated microbiome and important pathogens at various stages of commercial poultry production from the farm to the consumer. Analysis of longitudinal data following two flocks from the farm through processing showed a core microbiome containing multiple sequence types most closely related to genera known to be pathogenic for animals and/or humans, including Campylobacter, Clostridium, and Shigella. After the final stage of commercial poultry processing, taxonomic richness was ca. 2–4 times lower than the richness of fecal samples from the same flocks and Campylobacter abundance was significantly reduced. Interestingly, however, carcasses sampled at 48 hr after processing harboured the greatest proportion of unique taxa (those not encountered in other samples), significantly more than expected by chance. Among these were anaerobes such as Prevotella, Veillonella, Leptotrichia, and multiple Campylobacter sequence types. Retail products were dominated by Pseudomonas, but also contained 27 other genera, most of which were potentially metabolically active and encountered in on-farm samples. Network analysis was focused on the foodborne pathogen Campylobacter and revealed a majority of sequence types with no significant interactions with other taxa, perhaps explaining the limited efficacy of previous attempts at competitive exclusion of Campylobacter. These data represent the first use of HTS to characterize the poultry microbiome across a series of farm-to-fork samples and demonstrate the utility of HTS in monitoring the food supply chain and identifying sources of potential zoonoses and interactions among taxa in complex communities. PMID:23468931

  12. The Democratization of the Oncogene

    PubMed Central

    Le, Anh T.; Doebele, Robert C.

    2014-01-01

    Summary: The identification of novel, oncogenic gene rearrangements in inflammatory myofibroblastic tumor (IMT) demonstrates the potential of next generation sequencing (NGS) platforms for the detection of therapeutically relevant oncogenes across multiple tumor types, but raises significant questions relating to the investigation of targeted therapies in this new era of widespread NGS testing. PMID:25092743

  13. A cDNA from a mouse pancreatic beta cell encoding a putative transcription factor of the insulin gene.

    PubMed Central

    Walker, M D; Park, C W; Rosen, A; Aronheim, A

    1990-01-01

    Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. PMID:2181401

  14. Holocentromeres in Rhynchospora are associated with genome-wide centromere-specific repeat arrays interspersed among euchromatin.

    PubMed

    Marques, André; Ribeiro, Tiago; Neumann, Pavel; Macas, Jiří; Novák, Petr; Schubert, Veit; Pellino, Marco; Fuchs, Jörg; Ma, Wei; Kuhlmann, Markus; Brandt, Ronny; Vanzela, André L L; Beseda, Tomáš; Šimková, Hana; Pedrosa-Harand, Andrea; Houben, Andreas

    2015-11-03

    Holocentric chromosomes lack a primary constriction, in contrast to monocentrics. They form kinetochores distributed along almost the entire poleward surface of the chromatids, to which spindle fibers attach. No centromere-specific DNA sequence has been found for any holocentric organism studied so far. It was proposed that centromeric repeats, typical for many monocentric species, could not occur in holocentrics, most likely because of differences in the centromere organization. Here we show that the holokinetic centromeres of the Cyperaceae Rhynchospora pubera are highly enriched by a centromeric histone H3 variant-interacting centromere-specific satellite family designated "Tyba" and by centromeric retrotransposons (i.e., CRRh) occurring as genome-wide interspersed arrays. Centromeric arrays vary in length from 3 to 16 kb and are intermingled with gene-coding sequences and transposable elements. We show that holocentromeres of metaphase chromosomes are composed of multiple centromeric units rather than possessing a diffuse organization, thus favoring the polycentric model. A cell-cycle-dependent shuffling of multiple centromeric units results in the formation of functional (poly)centromeres during mitosis. The genome-wide distribution of centromeric repeat arrays interspersing the euchromatin provides a previously unidentified type of centromeric chromatin organization among eukaryotes. Thus, different types of holocentromeres exist in different species, namely with and without centromeric repetitive sequences.

  15. [Dynamic mutations--a newly detected category of mutations which is the basis for certain neurologic diseases].

    PubMed

    Mardesić, D

    1995-01-01

    This review offers some basic information on the discovery of a new type of mutation that causes some significant neurologic diseases: myotonic dystrophy, Huntington's disease, spinocerebellar ataxia type 1, spinobulbar pallido-louysian muscular atrophy, fragile X syndrome and some others, up to a total of ten entities. The basis of the so-called dynamic mutations is an abnormal multiplication of a trinucleotide, producing sequences of several hundreds or even thousands of identical copies in the respective gene. The result is designated as an expanded or amplified trinucleotide (or triplet) repeat. These sequences are not stable, but increase (or exceptionally decrease) in length during cell multiplication in successive generations. They segregate within families with the affected members, demonstrating a significant correlation between the length of the repeat sequence and the severity of the pathologic phenotype, and an inverse correlation with the age at clinical manifestation of the disease. Thus, at least a formal explanation is offered for the anticipation phenomenon, that is, the change in the age at which the disease is manifested within a family. The importance of the discovery of dynamic mutations lies in the possibility of more precise and reliable genetic counselling. The discovery has opened many new questions, giving new impetus to intensive research.
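    As a rough illustration of how the length of a triplet repeat might be measured, the sketch below counts the longest uninterrupted run of a trinucleotide in a DNA string. The motif "CAG", the function name, and both example alleles are assumptions made for illustration; none of them come from the review.

```python
import re


def longest_repeat_run(sequence: str, motif: str = "CAG") -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    # One or more back-to-back copies of the motif, matched case-insensitively
    # by upper-casing both the motif and the sequence.
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Hypothetical alleles: the repeat counts are invented for illustration only.
normal_allele = "ATG" + "CAG" * 20 + "CCGCCA"
expanded_allele = "ATG" + "CAG" * 45 + "CCGCCA"

print(longest_repeat_run(normal_allele), longest_repeat_run(expanded_allele))
```

    Comparing the two counts within a family is the kind of measurement that the correlation between repeat length and severity relies on; disease-specific thresholds are not given in the abstract and are not assumed here.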

  16. The Salmonella In Silico Typing Resource (SISTR): An Open Web-Accessible Tool for Rapidly Typing and Subtyping Draft Salmonella Genome Assemblies.

    PubMed

    Yoshida, Catherine E; Kruczkiewicz, Peter; Laing, Chad R; Lingohr, Erika J; Gannon, Victor P J; Nash, John H E; Taboada, Eduardo N

    2016-01-01

    For nearly 100 years serotyping has been the gold standard for the identification of Salmonella serovars. Despite the increasing adoption of DNA-based subtyping approaches, serotype information remains a cornerstone in food safety and public health activities aimed at reducing the burden of salmonellosis. At the same time, recent advances in whole-genome sequencing (WGS) promise to revolutionize our ability to perform advanced pathogen characterization in support of improved source attribution and outbreak analysis. We present the Salmonella In Silico Typing Resource (SISTR), a bioinformatics platform for rapidly performing simultaneous in silico analyses for several leading subtyping methods on draft Salmonella genome assemblies. In addition to performing serovar prediction by genoserotyping, this resource integrates sequence-based typing analyses for: Multi-Locus Sequence Typing (MLST), ribosomal MLST (rMLST), and core genome MLST (cgMLST). We show how phylogenetic context from cgMLST analysis can supplement the genoserotyping analysis and increase the accuracy of in silico serovar prediction to over 94.6% on a dataset comprised of 4,188 finished genomes and WGS draft assemblies. In addition to allowing analysis of user-uploaded whole-genome assemblies, the SISTR platform incorporates a database comprising over 4,000 publicly available genomes, allowing users to place their isolates in a broader phylogenetic and epidemiological context. The resource incorporates several metadata driven visualizations to examine the phylogenetic, geospatial and temporal distribution of genome-sequenced isolates. As sequencing of Salmonella isolates at public health laboratories around the world becomes increasingly common, rapid in silico analysis of minimally processed draft genome assemblies provides a powerful approach for molecular epidemiology in support of public health investigations. Moreover, this type of integrated analysis using multiple sequence-based methods of sub-typing allows for continuity with historical serotyping data as we transition towards the increasing adoption of genomic analyses in epidemiology. The SISTR platform is freely available on the web at https://lfz.corefacility.ca/sistr-app/.

  17. Genomic Sequencing of Bordetella pertussis for Epidemiology and Global Surveillance of Whooping Cough.

    PubMed

    Bouchez, Valérie; Guglielmini, Julien; Dazas, Mélody; Landier, Annie; Toubiana, Julie; Guillot, Sophie; Criscuolo, Alexis; Brisse, Sylvain

    2018-06-01

    Bordetella pertussis causes whooping cough, a highly contagious respiratory disease that is reemerging in many world regions. The spread of antigen-deficient strains may threaten acellular vaccine efficacy. Dynamics of strain transmission are poorly defined because of shortcomings in current strain genotyping methods. Our objective was to develop a whole-genome genotyping strategy with sufficient resolution for local epidemiologic questions and sufficient reproducibility to enable international comparisons of clinical isolates. We defined a core genome multilocus sequence typing scheme comprising 2,038 loci and demonstrated its congruence with whole-genome single-nucleotide polymorphism variation. Most cases of intrafamilial groups of isolates or of multiple isolates recovered from the same patient were distinguished from temporally and geographically cocirculating isolates. However, epidemiologically unrelated isolates were sometimes nearly undistinguishable. We set up a publicly accessible core genome multilocus sequence typing database to enable global comparisons of B. pertussis isolates, opening the way for internationally coordinated surveillance.
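    Core genome MLST comparisons of this kind are commonly summarized as the number of loci at which two isolates carry different alleles. The sketch below shows one minimal way to compute such a distance, assuming profiles are stored as dictionaries mapping a locus name to an allele number, with None marking an uncalled locus. The locus names, allele numbers, and the choice to skip uncalled loci are illustrative assumptions, not details of the published scheme.

```python
from typing import Dict, Optional

Profile = Dict[str, Optional[int]]


def allelic_distance(profile_a: Profile, profile_b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ."""
    differences = 0
    for locus, allele_a in profile_a.items():
        allele_b = profile_b.get(locus)
        # Skip loci that are missing or uncalled in either isolate (an assumption).
        if allele_a is None or allele_b is None:
            continue
        if allele_a != allele_b:
            differences += 1
    return differences


# Hypothetical four-locus profiles; a real scheme would have thousands of loci.
isolate_a = {"locus1": 1, "locus2": 4, "locus3": None, "locus4": 7}
isolate_b = {"locus1": 1, "locus2": 5, "locus3": 2, "locus4": 9}

print(allelic_distance(isolate_a, isolate_b))
```

    A small distance between epidemiologically unrelated isolates would correspond to the "nearly undistinguishable" cases mentioned in the abstract.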

  18. Sequence and structural characterization of Trx-Grx type of monothiol glutaredoxins from Ashbya gossypii.

    PubMed

    Yadav, Saurabh; Kumari, Pragati; Kushwaha, Hemant Ritturaj

    2013-01-01

    Glutaredoxins are small, ubiquitous, glutathione-dependent enzymatic antioxidants classified under the thioredoxin-fold superfamily. Glutaredoxins are classified into two types: dithiol and monothiol. Monothiol glutaredoxins, which carry the signature "CGFS" as a redox-active motif, are known for their role in oxidative stress inside the cell. In the present analysis, the 138-amino-acid-long monothiol glutaredoxin AgGRX1 from Ashbya gossypii was identified and used for the analysis. The multiple sequence alignment of the AgGRX1 protein sequence revealed the characteristic motif of typical monothiol glutaredoxins as observed in various other organisms. The proposed structure of the AgGRX1 protein was used to analyze signature folds related to the thioredoxin superfamily. Further, the study highlighted the structural features pertaining to the complex mechanism of glutathione docking and interacting residues.

  19. Questioning short-term memory and its measurement: Why digit span measures long-term associative learning.

    PubMed

    Jones, Gary; Macken, Bill

    2015-11-01

    Traditional accounts of verbal short-term memory explain differences in performance for different types of verbal material by reference to inherent characteristics of the verbal items making up memory sequences. The role of previous experience with sequences of different types is ostensibly controlled for either by deliberate exclusion or by presenting multiple trials constructed from different random permutations. We cast doubt on this general approach in a detailed analysis of the basis for the robust finding that short-term memory for digit sequences is superior to that for other sequences of verbal material. Specifically, we show across four experiments that this advantage is not due to inherent characteristics of digits as verbal items, nor are individual digits within sequences better remembered than other types of individual verbal items. Rather, the advantage for digit sequences stems from the increased frequency, compared to other verbal material, with which digits appear in random sequences in natural language, and furthermore, relatively frequent digit sequences support better short-term serial recall than less frequent ones. We also provide corpus-based computational support for the argument that performance in a short-term memory setting is a function of basic associative learning processes operating on the linguistic experience of the rememberer. The experimental and computational results raise questions not only about the role played by measurement of digit span in cognition generally, but also about the way in which long-term memory processes impact on short-term memory functioning. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.

  20. Over a Decade of recA and tly Gene Sequence Typing of the Skin Bacterium Propionibacterium acnes: What Have We Learnt?

    PubMed Central

    2017-01-01

    The Gram-positive, anaerobic bacterium Propionibacterium acnes forms part of the normal microbiota on human skin and mucosal surfaces. While normally associated with skin health, P. acnes is also an opportunistic pathogen linked with a range of human infections and clinical conditions. Over the last decade, our knowledge of the intraspecies phylogenetics and taxonomy of this bacterium has increased tremendously due to the introduction of DNA typing schemes based on single and multiple gene loci, as well as whole genomes. Furthermore, this work has led to the identification of specific lineages associated with skin health and human disease. In this review we will look back at the introduction of DNA sequence typing of P. acnes based on recA and tly loci, and then describe how these methods provided a basic understanding of the population genetic structure of the bacterium, and even helped characterize the grapevine-associated lineage of P. acnes, known as P. acnes type Zappe, which appears to have undergone a host switch from humans-to-plants. Particular limitations of recA and tly sequence typing will also be presented, as well as a detailed discussion of more recent, higher resolution, DNA-based methods to type P. acnes and investigate its evolutionary history in greater detail. PMID:29267255

  1. First detection of canine parvovirus type 2b from diarrheic dogs in Himachal Pradesh.

    PubMed

    Sharma, Shalini; Dhar, Prasenjit; Thakur, Aneesh; Sharma, Vivek; Sharma, Mandeep

    2016-09-01

    The present study was conducted to detect the presence of canine parvovirus (CPV) among diarrheic dogs in Himachal Pradesh and to identify the most prevalent antigenic variant of CPV based on molecular typing and sequence analysis of the VP2 gene. A total of 102 fecal samples were collected from clinical cases of diarrhea or hemorrhagic gastroenteritis from CPV vaccinated or non-vaccinated dogs. Samples were tested using CPV-specific polymerase chain reaction (PCR) targeting the VP2 gene, multiplex PCR for detection of CPV-2a and CPV-2b antigenic variants, and a PCR for the detection of CPV-2c. A CPV-2b isolate was cultured on Madin-Darby canine kidney (MDCK) cell lines and sequenced using the VP2 structural protein gene. Multiple alignment and phylogenetic analysis were done using ClustalW and MEGA6, with the phylogeny inferred using the Neighbor-Joining method. No sample was found positive for the original CPV strain usually present in the vaccine. However, about 50% (52 out of 102) of the samples were found to be positive with the CPV-2ab PCR assay that detects newer variants of CPV circulating in the field. In addition, the multiplex PCR assay that identifies both CPV-2ab and CPV-2b revealed that CPV-2b was the major antigenic variant present in the affected dogs. A PCR-positive isolate of CPV-2b was adapted to grow in MDCK cells and produced a characteristic cytopathic effect after the 5th passage. In multiple sequence alignment, the VP2 structural gene of the CPV-2b isolate used in the study (accession number HG004610) was similar to other sequenced isolates in the NCBI sequence database, showing 98-99% homology. This study reports the first detection of CPV-2b in dogs with hemorrhagic gastroenteritis in Himachal Pradesh and the absence of other antigenic types of CPV. Further, the CPV-specific PCR assay can be used for rapid confirmation of circulating virus strains under field conditions.

  2. Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

    DOEpatents

    Gardner, Shea N [San Leandro, CA]; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA]; Young, Jennifer A [Berkeley, CA]; Clague, David S [Livermore, CA]

    2011-01-18

    A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
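    The idea of building a target sequence from segments that share a user-specified overlap can be pictured at the string level. The sketch below is only a toy model under stated assumptions: it ignores strands, hybridization chemistry, and the temporal separation step, and the function names, target sequence, segment length, and overlap are all hypothetical. It splits a target into fixed-length segments with a chosen overlap and then rejoins them by checking each overlap.

```python
from typing import List


def split_with_overlap(sequence: str, segment_length: int, overlap: int) -> List[str]:
    """Cut `sequence` into segments of `segment_length` sharing `overlap` bases."""
    if not 0 <= overlap < segment_length:
        raise ValueError("overlap must be non-negative and shorter than the segment")
    step = segment_length - overlap
    segments = []
    start = 0
    while True:
        segments.append(sequence[start:start + segment_length])
        if start + segment_length >= len(sequence):
            break  # the last segment reaches the end (it may be shorter)
        start += step
    return segments


def join_overlapping(segments: List[str], overlap: int) -> str:
    """Rejoin segments, verifying that each adjacent pair shares the overlap."""
    assembled = segments[0]
    for segment in segments[1:]:
        if overlap and assembled[-overlap:] != segment[:overlap]:
            raise ValueError("adjacent segments do not share the expected overlap")
        assembled += segment[overlap:]
    return assembled


# Hypothetical 20-base target, 8-base segments, 3-base overlap.
target = "ATGCGTACGTTAGCCTAGGA"
segments = split_with_overlap(target, segment_length=8, overlap=3)
print(segments, join_overlapping(segments, overlap=3) == target)
```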

  3. Sequence data and association statistics from 12,940 type 2 diabetes cases and controls.

    PubMed

    Flannick, Jason; Fuchsberger, Christian; Mahajan, Anubha; Teslovich, Tanya M; Agarwala, Vineeta; Gaulton, Kyle J; Caulkins, Lizz; Koesterer, Ryan; Ma, Clement; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Dennis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeri; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana Cn; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Altshuler, David; Burtt, Noël P; Florez, Jose C; Boehnke, Michael; McCarthy, Mark I

    2017-12-19

    To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1-5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D.
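    The abstract defines low-frequency variants as those with a minor allele frequency (MAF) of 0.1-5%. As a minimal sketch of that arithmetic, the code below computes a MAF from allele counts and tests it against that band. The function names, the example counts, and the treatment of both bounds as inclusive are assumptions; the allele total of 5,314 simply reflects two alleles for each of the 2,657 whole-genome sequenced individuals at an autosomal site.

```python
def minor_allele_frequency(alt_allele_count: int, total_alleles: int) -> float:
    """Return the frequency of the less common allele at a biallelic site."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    frequency = alt_allele_count / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(frequency, 1.0 - frequency)


def is_low_frequency(maf: float, lower: float = 0.001, upper: float = 0.05) -> bool:
    """True when MAF lies in the 0.1-5% band (inclusive bounds are an assumption)."""
    return lower <= maf <= upper


# Hypothetical variant: 53 alternate alleles among 2 * 2,657 = 5,314 alleles.
maf = minor_allele_frequency(53, 2 * 2657)
print(round(maf, 4), is_low_frequency(maf))
```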

  4. Sequence data and association statistics from 12,940 type 2 diabetes cases and controls

    PubMed Central

    Flannick, Jason; Fuchsberger, Christian; Mahajan, Anubha; Teslovich, Tanya M.; Agarwala, Vineeta; Gaulton, Kyle J.; Caulkins, Lizz; Koesterer, Ryan; Ma, Clement; Moutsianas, Loukas; McCarthy, Davis J.; Rivas, Manuel A.; Perry, John R. B.; Sim, Xueling; Blackwell, Thomas W.; Robertson, Neil R.; Rayner, N William; Cingolani, Pablo; Locke, Adam E.; Tajes, Juan Fernandez; Highland, Heather M.; Dupuis, Josee; Chines, Peter S.; Lindgren, Cecilia M.; Hartl, Christopher; Jackson, Anne U.; Chen, Han; Huyghe, Jeroen R.; van de Bunt, Martijn; Pearson, Richard D.; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M.; Gamazon, Eric R.; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A.; Below, Jennifer E.; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L.; Pasko, Dorota; Parker, Stephen C. J.; Varga, Tibor V.; Green, Todd; Beer, Nicola L.; Day-Williams, Aaron G.; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J.; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P.; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F.; Han, Bok-Ghee; Jenkinson, Christopher P.; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C. Y.; Palmer, Nicholette D.; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E.; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D.; Neale, Benjamin M.; Purcell, Shaun; Butterworth, Adam S.; Howson, Joanna M. M.; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K. L.; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H. T.; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E.; Rybin, Dennis; Farook, Vidya S.; Fowler, Sharon P.; Freedman, Barry I.; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J.; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; Loh, Marie; Musani, Solomon K.; Puppala, Sobha; Scott, William R.; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A.; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C.; Mangino, Massimo; Bonnycastle, Lori L.; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L.; Herder, Christian; Groves, Christopher J.; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A.; Doney, Alex S. F.; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J.; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeri; Hollensted, Mette; Jørgensen, Marit E.; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H.; Stirrups, Kathleen; Wood, Andrew R.; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O.; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P.; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B.; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N. A.; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M.; Syvänen, Ann-Christine; Bergman, Richard N.; Bharadwaj, Dwaipayan; Bottinger, Erwin P.; Cho, Yoon Shin; Chandak, Giriraj R.; Chan, Juliana CN; Chia, Kee Seng; Daly, Mark J.; Ebrahim, Shah B.; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A.; Lehman, Donna M.; Jia, Weiping; Ma, Ronald C. W.; Pollin, Toni I.; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J. F.; Small, Kerrin S.; Ried, Janina S.; DeFronzo, Ralph A.; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J.; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W.; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R.; Gloyn, Anna L.; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D.; Hattersley, Andrew T.; Bowden, Donald W.; Collins, Francis S.; Atzmon, Gil; Chambers, John C.; Spector, Timothy D.; Laakso, Markku; Strom, Tim M.; Bell, Graeme I.; Blangero, John; Duggirala, Ravindranath; Tai, E. Shyong; McVean, Gilean; Hanis, Craig L.; Wilson, James G.; Seielstad, Mark; Frayling, Timothy M.; Meigs, James B.; Cox, Nancy J.; Sladek, Rob; Lander, Eric S.; Gabriel, Stacey; Mohlke, Karen L.; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Scott, Laura J.; Morris, Andrew P.; Kang, Hyun Min; Altshuler, David; Burtt, Noël P.; Florez, Jose C.; Boehnke, Michael; McCarthy, Mark I.

    2017-01-01

    To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1–5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D. PMID:29257133

  5. Development and evaluation of a multi-locus sequence typing scheme for Mycoplasma synoviae.

    PubMed

    Dijkman, R; Feberwee, A; Landman, W J M

    2016-08-01

    Reproducible molecular Mycoplasma synoviae typing techniques with sufficient discriminatory power may help to expand knowledge on its epidemiology and contribute to the improvement of control and eradication programmes of this mycoplasma species. The present study describes the development and validation of a novel multi-locus sequence typing (MLST) scheme for M. synoviae. Thirteen M. synoviae isolates originating from different poultry categories, farms and lesions, were subjected to whole genome sequencing. Their sequences were compared to that of M. synoviae reference strain MS53. A high number of single nucleotide polymorphisms (SNPs) indicating considerable genetic diversity were identified. SNPs were present in over 40 putative target genes for MLST of which five target genes were selected (nanA, uvrA, lepA, ruvB and ugpA) for the MLST scheme. This scheme was evaluated analysing 209 M. synoviae samples from different countries, categories of poultry, farms and lesions. Eleven clonal clusters and 76 different sequence types (STs) were obtained. Clustering occurred following geographical origin, supporting the hypothesis of regional population evolution. M. synoviae samples obtained from epidemiologically linked outbreaks often harboured the same ST. In contrast, multiple M. synoviae lineages were found in samples originating from swollen joints or oviducts from hens that produce eggs with eggshell apex abnormalities indicating that further research is needed to identify the genetic factors of M. synoviae that may explain its variations in tissue tropism and disease inducing potential. Furthermore, MLST proved to have a higher discriminatory power compared to variable lipoprotein and haemagglutinin A typing, which generated 50 different genotypes on the same database.
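    In an MLST scheme, each distinct combination of allele numbers across the chosen loci defines one sequence type (ST). The sketch below illustrates that bookkeeping for the five loci named in the abstract (nanA, uvrA, lepA, ruvB, ugpA). The isolate names, allele numbers, and the rule of numbering STs in order of first appearance are assumptions for illustration; a curated scheme would assign ST numbers from a reference database.

```python
from typing import Dict, Tuple

# The five target genes named in the abstract.
LOCI = ("nanA", "uvrA", "lepA", "ruvB", "ugpA")


def assign_sequence_types(profiles: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Give every distinct five-locus allele profile its own ST number."""
    st_by_profile: Dict[Tuple[int, ...], int] = {}
    assignments: Dict[str, int] = {}
    for isolate, alleles in profiles.items():
        key = tuple(alleles[locus] for locus in LOCI)
        if key not in st_by_profile:
            # Number STs in order of first appearance (an illustrative choice).
            st_by_profile[key] = len(st_by_profile) + 1
        assignments[isolate] = st_by_profile[key]
    return assignments


# Hypothetical isolates and allele numbers.
example_profiles = {
    "isolate_A": {"nanA": 1, "uvrA": 2, "lepA": 1, "ruvB": 3, "ugpA": 1},
    "isolate_B": {"nanA": 1, "uvrA": 2, "lepA": 1, "ruvB": 3, "ugpA": 1},
    "isolate_C": {"nanA": 4, "uvrA": 2, "lepA": 1, "ruvB": 3, "ugpA": 2},
}

sequence_types = assign_sequence_types(example_profiles)
print(sequence_types)
```

    Isolates from epidemiologically linked outbreaks sharing an ST, as the abstract reports, would appear here as isolates mapped to the same number.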

  6. Exome Sequencing Finds a Novel PCSK1 Mutation in a Child With Generalized Malabsorptive Diarrhea and Diabetes Insipidus

    PubMed Central

    Yourshaw, Michael; Solorzano-Vargas, R. Sergio; Pickett, Lindsay A.; Lindberg, Iris; Wang, Jiafang; Cortina, Galen; Pawlikowska-Haddal, Anna; Baron, Howard; Venick, Robert S.; Nelson, Stanley F.; Martín, Martín G.

    2014-01-01

    Objectives: Congenital diarrhea disorders are a group of genetically diverse and typically autosomal recessive disorders that have yet to be well characterized phenotypically or molecularly. Diagnostic assessments are generally limited to nutritional challenges and histologic evaluation, and many subjects eventually require a prolonged course of intravenous nutrition. Here we describe next-generation sequencing techniques to investigate a child with perplexing congenital malabsorptive diarrhea and other presumably unrelated clinical problems; this method provides an alternative approach to molecular diagnosis. Methods: We screened the diploid genome of an affected individual, using exome sequencing, for uncommon variants that have observed protein-coding consequences. We assessed the functional activity of the mutant protein, as well as its lack of expression using immunohistochemistry. Results: Among several rare variants detected was a homozygous nonsense mutation in the catalytic domain of the proprotein convertase subtilisin/kexin type 1 gene. The mutation abolishes prohormone convertase 1/3 endoprotease activity as well as expression in the intestine. These primary genetic findings prompted a careful endocrine reevaluation of the child at 4.5 years of age, and multiple significant problems were subsequently identified consistent with the known phenotypic consequences of proprotein convertase subtilisin/kexin type 1 (PCSK1) gene mutations. Based on the molecular diagnosis, alternate medical and dietary management was implemented for diabetes insipidus, polyphagia, and micropenis. Conclusions: Whole-exome sequencing provides a powerful diagnostic tool to clinicians managing rare genetic disorders with multiple perplexing clinical manifestations. PMID:24280991

  7. Exome sequencing finds a novel PCSK1 mutation in a child with generalized malabsorptive diarrhea and diabetes insipidus.

    PubMed

    Yourshaw, Michael; Solorzano-Vargas, R Sergio; Pickett, Lindsay A; Lindberg, Iris; Wang, Jiafang; Cortina, Galen; Pawlikowska-Haddal, Anna; Baron, Howard; Venick, Robert S; Nelson, Stanley F; Martín, Martín G

    2013-12-01

    Congenital diarrhea disorders are a group of genetically diverse and typically autosomal recessive disorders that have yet to be well characterized phenotypically or molecularly. Diagnostic assessments are generally limited to nutritional challenges and histologic evaluation, and many subjects eventually require a prolonged course of intravenous nutrition. Here we describe next-generation sequencing techniques to investigate a child with perplexing congenital malabsorptive diarrhea and other presumably unrelated clinical problems; this method provides an alternative approach to molecular diagnosis. We screened the diploid genome of an affected individual, using exome sequencing, for uncommon variants that have observed protein-coding consequences. We assessed the functional activity of the mutant protein, as well as its lack of expression using immunohistochemistry. Among several rare variants detected was a homozygous nonsense mutation in the catalytic domain of the proprotein convertase subtilisin/kexin type 1 gene. The mutation abolishes prohormone convertase 1/3 endoprotease activity as well as expression in the intestine. These primary genetic findings prompted a careful endocrine reevaluation of the child at 4.5 years of age, and multiple significant problems were subsequently identified consistent with the known phenotypic consequences of proprotein convertase subtilisin/kexin type 1 (PCSK1) gene mutations. Based on the molecular diagnosis, alternate medical and dietary management was implemented for diabetes insipidus, polyphagia, and micropenis. Whole-exome sequencing provides a powerful diagnostic tool to clinicians managing rare genetic disorders with multiple perplexing clinical manifestations.

  8. Context-dependent control of alternative splicing by RNA-binding proteins

    PubMed Central

    Fu, Xiang-Dong; Ares, Manuel

    2015-01-01

    Sequence-specific RNA-binding proteins (RBPs) bind to pre-mRNA to control alternative splicing, but it is not yet possible to read the ‘splicing code’ that dictates splicing regulation on the basis of genome sequence. Each alternative splicing event is controlled by multiple RBPs, the combined action of which creates a distribution of alternatively spliced products in a given cell type. As each cell type expresses a distinct array of RBPs, the interpretation of regulatory information on a given RNA target is exceedingly dependent on the cell type. RBPs also control each other’s functions at many levels, including by mutual modulation of their binding activities on specific regulatory RNA elements. In this Review, we describe some of the emerging rules that govern the highly context-dependent and combinatorial nature of alternative splicing regulation. PMID:25112293

  9. Metallo-β-lactamase-producing Pseudomonas aeruginosa in the Netherlands: the nationwide emergence of a single sequence type.

    PubMed

    Van der Bij, A K; Van der Zwan, D; Peirano, G; Severin, J A; Pitout, J D D; Van Westreenen, M; Goessens, W H F

    2012-09-01

    Recently, the first outbreak of clonally related VIM-2 metallo-β-lactamase (MBL)-producing Pseudomonas aeruginosa in a Dutch tertiary-care centre was described. Subsequently, a nationwide surveillance study was performed in 2010-2011, which identified the presence of VIM-2 MBL-producing P. aeruginosa in 11 different hospitals. Genotyping by multiple-locus variable-number tandem-repeat analysis (MLVA) showed that the majority of the 82 MBL-producing isolates found belonged to a single MLVA type (n = 70, 85%), identified as ST111 by multilocus sequence typing (MLST). As MBL-producing isolates cause serious infections that are difficult to treat, the presence of clonally related isolates in various hospitals throughout the Netherlands is of nationwide concern. © 2012 The Authors. Clinical Microbiology and Infection © 2012 European Society of Clinical Microbiology and Infectious Diseases.

  10. Prolonged and mixed non-O157 Escherichia coli infection in an Australian household.

    PubMed

    Staples, M; Graham, R M A; Doyle, C J; Smith, H V; Jennison, A V

    2012-05-01

    An Australian family was identified through a Public Health follow up on a Shiga-toxigenic Escherichia coli (STEC) positive bloody diarrhoea case, with three of the four family members experiencing either symptomatic or asymptomatic STEC shedding. Bacterial isolates were submitted to stx sequence sub-typing, multi-locus variable number tandem repeat analysis (MLVA), multi-locus sequence typing (MLST) and binary typing. The analysis revealed that there were multiple strains of STEC being shed by the family members, with similar virulence gene profiles and the same serogroup but differing in their MLVA and MLST profiles. This study illustrates the potentially complicated nature of non-O157 STEC infections and the importance of molecular epidemiology in understanding disease clusters. © 2012 QUEENSLAND HEALTH. Clinical Microbiology and Infection © 2012 European Society of Clinical Microbiology and Infectious Diseases.

  11. Analysis of the genome-wide variations among multiple strains of the plant pathogenic bacterium Xylella fastidiosa

    PubMed Central

    Doddapaneni, Harshavardhan; Yao, Jiqiang; Lin, Hong; Walker, M Andrew; Civerolo, Edwin L

    2006-01-01

    Background The Gram-negative, xylem-limited phytopathogenic bacterium Xylella fastidiosa is responsible for causing economically important diseases in grapevine, citrus and many other plant species. Despite its economic impact, relatively little is known about the genomic variations among strains isolated from different hosts and their influence on the population genetics of this pathogen. With the availability of genome sequence information for four strains, it is now possible to perform genome-wide analyses to identify and categorize such DNA variations and to understand their influence on strain functional divergence. Results There are 1,579 genes and 194 non-coding homologous sequences present in the genomes of all four strains, representing a 76.2% conservation of the sequenced genome. About 60% of the X. fastidiosa unique sequences exist as tandem gene clusters of 6 or more genes. Multiple alignments identified 12,754 SNPs and 14,449 INDELs in the 1528 common genes and 20,779 SNPs and 10,075 INDELs in the 194 non-coding sequences. The average SNP frequency was 1.08 × 10⁻² per base pair of DNA and the average INDEL frequency was 2.06 × 10⁻² per base pair of DNA. On an average, 60.33% of the SNPs were synonymous type while 39.67% were non-synonymous type. The mutation frequency, primarily in the form of external INDELs was the main type of sequence variation. The relative similarity between the strains was discussed according to the INDEL and SNP differences. The number of genes unique to each strain were 60 (9a5c), 54 (Dixon), 83 (Ann1) and 9 (Temecula-1). A sub-set of the strain specific genes showed significant differences in terms of their codon usage and GC composition from the native genes suggesting their xenologous origin. Tandem repeat analysis of the genomic sequences of the four strains identified associations of repeat sequences with hypothetical and phage related functions. Conclusion INDELs and strain specific genes have been identified as the main source of variations among strains, with individual strains showing different rates of genome evolution. Based on these genome comparisons, it appears that the Pierce's disease strain Temecula-1 genome represents the ancestral genome of the X. fastidiosa. Results of this analysis are publicly available in the form of a web database. PMID:16948851
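
    The SNP and INDEL tallies reported above come from multiple alignments. As a rough illustration only, the sketch below counts both kinds of difference in a single pairwise alignment and converts a count to a per-base frequency. The function names and the toy sequences are hypothetical, and the rule of counting each run of gap columns as one INDEL event is a simplifying assumption, not the authors' pipeline.

```python
def count_variants(ref, alt, gap="-"):
    """Count SNP columns and INDEL events in two aligned, equal-length strings.

    Assumption: a contiguous run of gap columns counts as one INDEL event,
    regardless of which sequence carries the gap.
    """
    if len(ref) != len(alt):
        raise ValueError("aligned sequences must have the same length")
    snps = 0
    indels = 0
    in_gap = False
    for r, a in zip(ref, alt):
        if r == gap or a == gap:
            if not in_gap:          # first column of a new gap run
                indels += 1
            in_gap = True
        else:
            in_gap = False
            if r != a:              # mismatch between two real bases
                snps += 1
    return snps, indels


def per_base_frequency(count, aligned_bases):
    """Express a variant count as a frequency per aligned base pair."""
    return count / aligned_bases


# Toy alignment (made-up sequences, for illustration only).
snps, indels = count_variants("ACGTACGT--ACGT", "ACCTA--TGGACGA")
```

    Dividing each count by the number of aligned bases gives a frequency on the same per-base-pair scale as the values quoted in the abstract.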

  12. Highly Diverse Endophytic and Soil Fusarium oxysporum Populations Associated with Field-Grown Tomato Plants

    PubMed Central

    Demers, Jill E.; Gugino, Beth K.

    2014-01-01

    The diversity and genetic differentiation of populations of Fusarium oxysporum associated with tomato fields, both endophytes obtained from tomato plants and isolates obtained from soil surrounding the sampled plants, were investigated. A total of 609 isolates of F. oxysporum were obtained, 295 isolates from a total of 32 asymptomatic tomato plants in two fields and 314 isolates from eight soil cores sampled from the area surrounding the plants. Included in this total were 112 isolates from the stems of all 32 plants, a niche that has not been previously included in F. oxysporum population genetics studies. Isolates were characterized using the DNA sequence of the translation elongation factor 1α gene. A diverse population of 26 sequence types was found, although two sequence types represented nearly two-thirds of the isolates studied. The sequence types were placed in different phylogenetic clades within F. oxysporum, and endophytic isolates were not monophyletic. Multiple sequence types were found in all plants, with an average of 4.2 per plant. The population compositions differed between the two fields but not between soil samples within each field. A certain degree of differentiation was observed between populations associated with different tomato cultivars, suggesting that the host genotype may affect the composition of plant-associated F. oxysporum populations. No clear patterns of genetic differentiation were observed between endophyte populations and soil populations, suggesting a lack of specialization of endophytic isolates. PMID:25304514

  13. An investigation of the uniform random number generator

    NASA Technical Reports Server (NTRS)

    Temple, E. C.

    1982-01-01

    Most random number generators that are in use today are of the congruential form X(i+1) = AX(i) + C mod M where A, C, and M are nonnegative integers. If C=0, the generator is called the multiplicative type and those for which C≠0 are called mixed congruential generators. It is easy to see that congruential generators will repeat a sequence of numbers after a maximum of M values have been generated. The number of numbers that a procedure generates before restarting the sequence is called the length or the period of the generator. Generally, it is desirable to make the period as long as possible. A detailed discussion of congruential generators is given. Also, several promising procedures that differ from the multiplicative and mixed procedure are discussed.
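
    The congruential recurrence and the notion of period can be made concrete with a few lines of code. The sketch below is illustrative only: the constants (A=5, C=3 or 0, M=16) are small values chosen here for demonstration and do not come from the report.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield successive values of X(i+1) = (a * X(i) + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def period(seed, a, c, m):
    """Return the cycle length reached from `seed` (at most m)."""
    first_seen = {}
    x = seed
    step = 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + c) % m
        step += 1
    # Distance between the two visits to the repeated state.
    return step - first_seen[x]


mixed = period(seed=1, a=5, c=3, m=16)           # mixed generator, C != 0
multiplicative = period(seed=1, a=5, c=0, m=16)  # multiplicative, C == 0
first_values = list(islice(lcg(1, 5, 3, 16), 4))
```

    With these toy constants the mixed generator visits all 16 residues before repeating, whereas the multiplicative one cycles after 4 values, which shows why the choice of A, C, and M governs the period.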

  14. Two DNA-binding factors recognize specific sequences at silencers, upstream activating sequences, autonomously replicating sequences, and telomeres in Saccharomyces cerevisiae

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buchman, A.R.; Kimmerly, W.J.; Rine, J.

    1988-01-01

    Two DNA-binding factors from Saccharomyces cerevisiae have been characterized, GRFI (general regulatory factor I) and ABFI (ARS-binding factor I), that recognize specific sequences within diverse genetic elements. GRFI bound to sequences at the negative regulatory elements (silencers) of the silent mating type loci HML E and HMR E and to the upstream activating sequence (UAS) required for transcription of the MATα genes. A putative conserved UAS located at genes involved in translation (RPG box) was also recognized by GRFI. In addition, GRFI bound with high affinity to sequences within the (C1-3A)-repeat region at yeast telomeres. Binding sites for GRFI with the highest affinity appeared to be of the form 5'-(A/G)(A/C)ACCCANNCA(T/C)(T/C)-3', where N is any nucleotide. ABFI-binding sites were located next to autonomously replicating sequences (ARSs) at controlling elements of the silent mating type loci HMR E, HMR I, and HML I and were associated with ARS1, ARS2, and the 2μm plasmid ARS. Two tandem ABFI binding sites were found between the HIS3 and DED1 genes, several kilobase pairs from any ARS, indicating that ABFI-binding sites are not restricted to ARSs. The sequences recognized by ABFI showed partial dyad-symmetry and appeared to be variations of the consensus 5'-TATCATTNNNNACGA-3'. GRFI and ABFI were both abundant DNA-binding factors and did not appear to be encoded by the SIR genes, whose products are required for repression of the silent mating type loci. Together, these results indicate that both GRFI and ABFI play multiple roles within the cell.
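
    The two consensus sequences quoted above translate directly into regular expressions. The following is a minimal sketch, assuming the GRFI consensus reads ...ACCCANNCA... with N meaning any of A, C, G, T; the test sequences are invented, and only the strand supplied is scanned (a real search would also check the reverse complement).

```python
import re

# (A/G)(A/C)ACCCANNCA(T/C)(T/C): alternatives become character classes.
GRFI_SITE = re.compile(r"[AG][AC]ACCCA[ACGT]{2}CA[TC][TC]")
# TATCATTNNNNACGA: four unconstrained positions in the middle.
ABFI_SITE = re.compile(r"TATCATT[ACGT]{4}ACGA")


def find_sites(pattern, dna):
    """Return (start, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(dna.upper())]


# Hypothetical sequences, used only to exercise the patterns.
grfi_hits = find_sites(GRFI_SITE, "GGAAACCCATGCATCTT")
abfi_hits = find_sites(ABFI_SITE, "ccTATCATTgggtACGAaa")
```

    Because the abstract describes ABFI sites as variations of the consensus, an exact-match pattern like this would miss degenerate sites; it is a starting point rather than a site predictor.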

  15. Harmonic Analysis of Sedimentary Cyclic Sequences in Kansas, Midcontinent, USA

    USGS Publications Warehouse

    Merriam, D.F.; Robinson, J.E.

    1997-01-01

    Several stratigraphic sequences in the Upper Carboniferous (Pennsylvanian) in Kansas (Midcontinent, USA) were analyzed quantitatively for periodic repetitions. The sequences were coded by lithologic type into strings of datasets. The strings then were analyzed by an adaptation of a one-dimensional Fourier transform analysis and examined for evidence of periodicity. The method was tested using different states in coding to determine the robustness of the method and data. The most persistent response is in multiples of 8-10 ft (2.5-3.0 m) and probably is dependent on the depositional thickness of the original lithologic units. Other cyclicities occurred in multiples of the basic frequency of 8-10 with persistent ones at 22 and 30 feet (6.5-9.0 m) and large ones at 80 and 160 feet (25-50 m). These levels of thickness relate well to the basic cyclothem and megacyclothem as measured on outcrop. We propose that this approach is a suitable one for analyzing cyclic events in the stratigraphic record.
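
    The approach described above (code lithology as numbers, then look for periodicity with a one-dimensional Fourier transform) can be sketched as follows. This is an illustration under stated assumptions, not the authors' adaptation: the two-state coding (0 and 1), the one-sample-per-foot spacing, and the synthetic column are all hypothetical.

```python
import cmath


def power_spectrum(signal):
    """Return {k: |DFT_k|^2} for k = 1 .. n//2 of the mean-removed signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [value - mean for value in signal]
    spectrum = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(value * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, value in enumerate(centered))
        spectrum[k] = abs(coeff) ** 2
    return spectrum


def dominant_period(signal, spacing=1.0):
    """Thickness (in units of `spacing`) of the strongest periodic component."""
    spectrum = power_spectrum(signal)
    strongest_k = max(spectrum, key=spectrum.get)
    return len(signal) * spacing / strongest_k


# Synthetic column: two lithologies alternating in 4-sample beds,
# sampled once per foot, so the true cycle thickness is 8 ft.
column = [0, 0, 0, 0, 1, 1, 1, 1] * 10
cycle_thickness = dominant_period(column, spacing=1.0)
```

    Re-running the analysis with a different numeric coding of the same lithologies is one way to check robustness, in the spirit of the test with different coding states mentioned in the abstract.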

  16. Rare Variant Association Test with Multiple Phenotypes

    PubMed Central

    Lee, Selyeong; Won, Sungho; Kim, Young Jin; Kim, Yongkang; Kim, Bong-Jo; Park, Taesung

    2016-01-01

    Although genome-wide association studies (GWAS) have now discovered thousands of genetic variants associated with common traits, such variants cannot explain the large degree of “missing heritability,” likely due to rare variants. The advent of next generation sequencing technology has allowed rare variant detection and association with common traits, often by investigating specific genomic regions for rare variant effects on a trait. Although multiply correlated phenotypes are often concurrently observed in GWAS, most studies analyze only single phenotypes, which may lessen statistical power. To increase power, multivariate analyses, which consider correlations between multiple phenotypes, can be used. However, few existing multi-variant analyses can identify rare variants for assessing multiple phenotypes. Here, we propose Multivariate Association Analysis using Score Statistics (MAAUSS), to identify rare variants associated with multiple phenotypes, based on the widely used Sequence Kernel Association Test (SKAT) for a single phenotype. We applied MAAUSS to Whole Exome Sequencing (WES) data from a Korean population of 1,058 subjects, to discover genes associated with multiple traits of liver function. We then assessed validation of those genes by a replication study, using an independent dataset of 3,445 individuals. Notably, we detected the gene ZNF620 among five significant genes. We then performed a simulation study to compare MAAUSS's performance with existing methods. Overall, MAAUSS successfully conserved type 1 error rates and in many cases, had a higher power than the existing methods. This study illustrates a feasible and straightforward approach for identifying rare variants correlated with multiple phenotypes, with likely relevance to missing heritability. PMID:28039885

  17. Generalized causal mediation and path analysis: Extensions and practical considerations.

    PubMed

    Albert, Jeffrey M; Cho, Jang Ik; Liu, Yiying; Nelson, Suchitra

    2018-01-01

    Causal mediation analysis seeks to decompose the effect of a treatment or exposure among multiple possible paths and provide causally interpretable path-specific effect estimates. Recent advances have extended causal mediation analysis to situations with a sequence of mediators or multiple contemporaneous mediators. However, available methods still have limitations, and computational and other challenges remain. The present paper provides an extended causal mediation and path analysis methodology. The new method, implemented in the new R package, gmediation (described in a companion paper), accommodates both a sequence (two stages) of mediators and multiple mediators at each stage, and allows for multiple types of outcomes following generalized linear models. The methodology can also handle unsaturated models and clustered data. Addressing other practical issues, we provide new guidelines for the choice of a decomposition, and for the choice of a reference group multiplier for the reduction of Monte Carlo error in mediation formula computations. The new method is applied to data from a cohort study to illuminate the contribution of alternative biological and behavioral paths in the effect of socioeconomic status on dental caries in adolescence.

  18. Low-pass sequencing for microbial comparative genomics

    PubMed Central

    Goo, Young Ah; Roach, Jared; Glusman, Gustavo; Baliga, Nitin S; Deutsch, Kerry; Pan, Min; Kennedy, Sean; DasSarma, Shiladitya; Victor Ng, Wailap; Hood, Leroy

    2004-01-01

    Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich genome of H. sp. NRC-1. Identification of multiple TBP and TFB homologs in these four halophiles is consistent with the hypothesis that different types of complex transcriptional regulation may occur through multiple TBP-TFB combinations in response to rapidly changing environmental conditions. Low-pass shotgun sequence analyses of genomes permit extensive and diverse analyses, and should be generally useful for comparative microbial genomics. PMID:14718067
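
    GC content, one of the two shared traits highlighted above, is straightforward to estimate from a set of single-pass reads. The sketch below is a minimal illustration; the function names and reads are hypothetical, and ambiguous bases such as N are simply excluded from the denominator.

```python
def gc_content(seq):
    """Fraction of G or C among the unambiguous bases (A, C, G, T) of `seq`."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


def pooled_gc(reads):
    """GC fraction over all reads together, so longer reads weigh more."""
    return gc_content("".join(reads))


# Made-up reads, for illustration only.
estimate = pooled_gc(["GGCCGCGA", "ATGCGCNN", "CCGG"])
```

    Pooling the reads before dividing, rather than averaging per-read fractions, keeps short reads from skewing the genome-wide estimate.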

  19. Automated typing of red blood cell and platelet antigens: a whole-genome sequencing study.

    PubMed

    Lane, William J; Westhoff, Connie M; Gleadall, Nicholas S; Aguad, Maria; Smeland-Wagman, Robin; Vege, Sunitha; Simmons, Daimon P; Mah, Helen H; Lebo, Matthew S; Walter, Klaudia; Soranzo, Nicole; Di Angelantonio, Emanuele; Danesh, John; Roberts, David J; Watkins, Nick A; Ouwehand, Willem H; Butterworth, Adam S; Kaufman, Richard M; Rehm, Heidi L; Silberstein, Leslie E; Green, Robert C

    2018-06-01

    There are more than 300 known red blood cell (RBC) antigens and 33 platelet antigens that differ between individuals. Sensitisation to antigens is a serious complication that can occur in prenatal medicine and after blood transfusion, particularly for patients who require multiple transfusions. Although pre-transfusion compatibility testing largely relies on serological methods, reagents are not available for many antigens. Methods based on single-nucleotide polymorphism (SNP) arrays have been used, but typing for ABO and Rh, the most important blood groups, cannot be done with SNP typing alone. We aimed to develop a novel method based on whole-genome sequencing to identify RBC and platelet antigens. This whole-genome sequencing study is a subanalysis of data from patients in the whole-genome sequencing arm of the MedSeq Project randomised controlled trial (NCT01736566) with no measured patient outcomes. We created a database of molecular changes in RBC and platelet antigens and developed an automated antigen-typing algorithm based on whole-genome sequencing (bloodTyper). This algorithm was iteratively improved to address cis-trans haplotype ambiguities and homologous gene alignments. Whole-genome sequencing data from 110 MedSeq participants (30 × depth) were used to initially validate bloodTyper through comparison with conventional serology and SNP methods for typing of 38 RBC antigens in 12 blood-group systems and 22 human platelet antigens. bloodTyper was further validated with whole-genome sequencing data from 200 INTERVAL trial participants (15 × depth) with serological comparisons. We iteratively improved bloodTyper by comparing its typing results with conventional serological and SNP typing in three rounds of testing. The initial whole-genome sequencing typing algorithm was 99·5% concordant across the first 20 MedSeq genomes. Addressing discordances led to development of an improved algorithm that was 99·8% concordant for the remaining 90 MedSeq genomes. Additional modifications led to the final algorithm, which was 99·2% concordant across 200 INTERVAL genomes (or 99·9% after adjustment for the lower depth of coverage). By enabling more precise antigen-matching of patients with blood donors, antigen typing based on whole-genome sequencing provides a novel approach to improve transfusion outcomes with the potential to transform the practice of transfusion medicine. National Human Genome Research Institute, Doris Duke Charitable Foundation, National Health Service Blood and Transplant, National Institute for Health Research, and Wellcome Trust. Copyright © 2018 Elsevier Ltd. All rights reserved.

  20. Characterization of a protein that binds multiple sequences in mammalian type C retrovirus enhancers.

    PubMed Central

    Sun, W; O'Connell, M; Speck, N A

    1993-01-01

    Mammalian type C retrovirus enhancer factor 1 (MCREF-1) is a nuclear protein that binds several directly repeated sequences (CNGGN6CNGG) in the Moloney and Friend murine leukemia virus (MLV) enhancers (N. R. Manley, M. O'Connell, W. Sun, N. A. Speck, and N. Hopkins, J. Virol. 67:1967-1975, 1993). In this paper, we describe the partial purification of MCREF-1 from calf thymus nuclei and further characterize the binding properties of MCREF-1. MCREF-1 binds four sites in the Moloney MLV enhancer and three sites in the Friend MLV enhancer. Ethylation interference analysis suggests that the MCREF-1 binding site spans two adjacent minor grooves of DNA. Images PMID:8445719

  1. The Molecular Epidemiology of the Highly Virulent ST93 Australian Community Staphylococcus aureus Strain

    PubMed Central

    Coombs, Geoffrey W.; Goering, Richard V.; Chua, Kyra Y. L.; Monecke, Stefan; Howden, Benjamin P.; Stinear, Timothy P.; Ehricht, Ralf; O’Brien, Frances G.; Christiansen, Keryn J.

    2012-01-01

    In Australia the PVL-positive ST93-IV [2B], colloquially known as “Queensland CA-MRSA”, has become the dominant CA-MRSA clone. First described in the early 2000s, ST93-IV [2B] is associated with skin and severe invasive infections including necrotizing pneumonia. A singleton by multilocus sequence typing (MLST) eBURST analysis, ST93 is distinct from other S. aureus clones. To determine if the increased prevalence of ST93-IV [2B] is due to the widespread transmission of a single strain of ST93-IV [2B], the genetic relatedness of 58 S. aureus ST93 isolates collected throughout Australia over an extended period was studied in detail using a variety of molecular methods including pulsed-field gel electrophoresis, spa typing, MLST, microarray DNA, SCCmec typing and dru typing. Identification of the phage harbouring the lukS-PV/lukF-PV Panton Valentine leucocidin genes, detection of allelic variations in lukS-PV/lukF-PV, and quantification of LukF-PV expression were also performed. Although ST93-IV [2B] is known to have an apparent enhanced clinical virulence, the isolates harboured few known virulence determinants. All PVL-positive isolates carried the PVL-encoding phage ΦSa2USA and the lukS-PV/lukF-PV genes had the same R variant SNP profile. The isolates produced similar expression levels of LukF-PV. Although multiple rearrangements of the spa sequence have occurred, the core genome in ST93 is very stable. The emergence of ST93-MRSA is due to independent acquisitions of different dru-defined type IV and type V SCCmec elements in several spa-defined ST93-MSSA backgrounds. Rearrangement of the spa sequence in ST93-MRSA has subsequently occurred in some of these strains. Although multiple ST93-MRSA strains were characterised, little genetic diversity was identified for most isolates, with PVL-positive ST93-IVa [2B]-t202-dt10 predominant across Australia. Whether ST93-IVa [2B] t202-dt10 arose from one PVL-positive ST93-MSSA-t202, or by independent acquisitions of SCCmec-IVa [2B]-dt10 into multiple PVL-positive ST93-MSSA-t202 strains is not known. PMID:22900085

  2. Phylogenetic Diversity, Distribution, and Cophylogeny of Giant Bacteria (Epulopiscium) with their Surgeonfish Hosts in the Red Sea

    PubMed Central

    Miyake, Sou; Ngugi, David K.; Stingl, Ulrich

    2016-01-01

    Epulopiscium is a group of giant bacteria found in high abundance in intestinal tracts of herbivorous surgeonfish. Despite their peculiarly large cell size (can be up to 600 μm), extreme polyploidy (some with over 100,000 genome copies per cell) and viviparity (whereby mother cells produce live offspring), details about their diversity, distribution or their role in the host gut are lacking. Previous studies have highlighted the existence of morphologically distinct Epulopiscium cell types (defined as morphotypes A to J) in some surgeonfish genera, but the corresponding genetic diversity and distribution among other surgeonfishes remain mostly unknown. Therefore, we investigated the phylogenetic diversity of Epulopiscium, distribution and co-occurrence in multiple hosts. Here, we identified eleven new phylogenetic clades, six of which were also morphologically characterized. Three of these novel clades were phylogenetically and morphologically similar to cigar-shaped type A1 cells, found in a wide range of surgeonfishes including Acanthurus nigrofuscus, while three were similar to smaller, rod-shaped type E that has not been phylogenetically classified thus far. Our results also confirmed that biogeography appears to have relatively little influence on Epulopiscium diversity, as clades found in the Great Barrier Reef and Hawaii were also recovered from the Red Sea. Although multiple symbiont clades inhabited a given species of host surgeonfish and multiple host species possessed a given symbiont clade, statistical analysis of host and symbiont phylogenies indicated significant cophylogeny, which in turn suggests co-evolutionary relationships. A cluster analysis of Epulopiscium sequences from previously published amplicon sequencing dataset revealed a similar pattern, where specific clades were consistently found in high abundance amongst closely related surgeonfishes. Differences in abundance may indicate specialization of clades to certain gut environments reflected by inferred differences in the host diets. Overall, our analysis identified a large phylogenetic diversity of Epulopiscium (up to 10% sequence divergence of 16S rRNA genes), which lets us hypothesize that there are multiple species that are spread across guts of different host species. PMID:27014209

  3. fLPS: Fast discovery of compositional biases for the protein universe.

    PubMed

    Harrison, Paul M

    2017-11-13

    Proteins often contain regions that are compositionally biased (CB), i.e., they are made from a small subset of amino-acid residue types. These CB regions can be functionally important, e.g., the prion-forming and prion-like regions that are rich in asparagine and glutamine residues. Here I report a new program fLPS that can rapidly annotate CB regions. It discovers both single-residue and multiple-residue biases. It works through a process of probability minimization. First, contigs are constructed for each amino-acid type out of sequence windows with a low degree of bias; second, these contigs are searched exhaustively for low-probability subsequences (LPSs); third, such LPSs are iteratively assessed for merger into possible multiple-residue biases. At each of these stages, efficiency measures are taken to avoid or delay probability calculations unless/until they are necessary. On a current desktop workstation, the fLPS algorithm can annotate the biased regions of the yeast proteome (>5700 sequences) in <1 s, and of the whole current TrEMBL database (>65 million sequences) in as little as ~1 h, which is >2 times faster than the commonly used program SEG, using default parameters. fLPS discovers both shorter CB regions (of the sort that are often termed 'low-complexity sequence'), and milder biases that may only be detectable over long tracts of sequence. fLPS can readily handle very large protein data sets, such as might come from metagenomics projects. It is useful in searching for proteins with similar CB regions, and for making functional inferences about CB regions for a protein of interest. The fLPS package is available from: http://biology.mcgill.ca/faculty/harrison/flps.html , or https://github.com/pmharrison/flps , or is a supplement to this article.
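
    The idea of a low-probability subsequence can be illustrated with a binomial calculation: given a background frequency for one residue type, how unlikely is it to see at least k copies in a window of n residues? The sketch below is a deliberately simplified, fixed-window stand-in and not the fLPS algorithm itself (which builds contigs, searches them exhaustively, and merges biases). The function names, the background frequency of 0.05, and the test sequence are all assumptions made for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def lowest_probability_window(seq, residue, background, window):
    """Scan fixed-length windows; return (start, probability, count) of the
    window whose residue count is least probable under the background."""
    best = None
    for start in range(len(seq) - window + 1):
        count = seq[start:start + window].count(residue)
        prob = binomial_tail(count, window, background)
        if best is None or prob < best[1]:
            best = (start, prob, count)
    return best


# Invented sequence with a glutamine-rich tract; 0.05 is an assumed
# background frequency for Q, not a value taken from the article.
start, prob, count = lowest_probability_window(
    "MKTAYIAKQRQQQQQNQQQQGLSDE", residue="Q", background=0.05, window=10)
```

    A fixed window cannot detect the milder biases over long tracts that the abstract mentions; varying the window length and keeping the minimum probability would be the natural next step.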

  4. Occurrence of Stolbur Phytoplasma Disease in Spreading Type Petunia hybrida Cultivars in Korea

    PubMed Central

    Chung, Bong Nam; Jeong, Myeong Il; Choi, Seung Kook; Joa, Jae Ho; Choi, Kyeong San; Choi, In Myeong

    2013-01-01

    In January 2012, spreading type petunia cv. Wave Pink plants showing an abnormal growth habit of sprouting unusual multiple plantlets from the lateral buds were collected from a greenhouse in Gwacheon, Gyeonggi Province, Korea. The presence of phytoplasma was investigated using PCR with the primer pairs P1/P6, and R16F1/R1 for nested-PCR. In the nested PCR, 1,096 bp PCR products were obtained, and through sequencing 12 Pet-Stol isolates were identified. Comparison of the nucleotide sequences of 16S rRNA gene of the 12 Pet-Stol isolates with other phytoplasmas belonging to aster yellows or Stolbur showed that Pet-Stol isolates were members of Stolbur. The presence of phytoplasma in petunia was also confirmed by microscopic observation of the pathogens. In this study, Stolbur phytoplasma was identified from spreading type petunia cultivars by sequence analysis of 16S rRNA gene of phytoplasma and microscopic observation of phytoplasma bodies. This is the first report of Stolbur phytoplasma in commercial Petunia hybrida cultivars. PMID:25288978

  5. Functional versus non-functional intratumor heterogeneity in cancer

    PubMed Central

    Williams, Marc J.; Werner, Benjamin; Graham, Trevor A.; Sottoriva, Andrea

    2016-01-01

    Next-generation sequencing data from human cancers are often difficult to interpret within the context of tumor evolution. We developed a mathematical model describing the accumulation of mutations under neutral evolutionary dynamics and showed that 323/904 cancers (∼30%) from multiple types were consistent with the neutral model of tumor evolution. PMID:27652316

  6. The democratization of the oncogene.

    PubMed

    Le, Anh T; Doebele, Robert C

    2014-08-01

    The identification of novel, oncogenic gene rearrangements in inflammatory myofibroblastic tumor demonstrates the potential of next-generation sequencing (NGS) platforms for the detection of therapeutically relevant oncogenes across multiple tumor types, but raises significant questions relating to the investigation of targeted therapies in this new era of widespread NGS testing. ©2014 American Association for Cancer Research.

  7. Innovation of a Reinforcer Preference Assessment with the Difficult to Test

    ERIC Educational Resources Information Center

    Saunders, Muriel D.; Saunders, Richard R.

    2011-01-01

    In this study, we continued evaluation of a two-choice preference assessment aimed at identifying a hierarchy of reinforcers for individuals with only one voluntary motor sequence--closing and releasing an adaptive switch. We assessed preferences among types of sensory stimulation in 6 adults with multiple profound impairments using concurrent…

  8. Genetic analysis of human immunodeficiency virus type 1 envelope V3 region isolates from mothers and infants after perinatal transmission.

    PubMed Central

    Ahmad, N; Baroudy, B M; Baker, R C; Chappey, C

    1995-01-01

    The human immunodeficiency virus type 1 (HIV-1) sequences from variable region 3 (V3) of the envelope gene were analyzed from seven infected mother-infant pairs following perinatal transmission. The V3 region sequences directly derived from the DNA of the uncultured peripheral blood mononuclear cells from infected mothers displayed a heterogeneous population. In contrast, the infants' sequences were less diverse than those of their mothers. In addition, the sequences from the younger infants' peripheral blood mononuclear cell DNA were more homogeneous than the older infants' sequences. All infants' sequences were different but displayed patterns similar to those seen in their mothers. In the mother-infant pair sequences analyzed, a minor genotype or subtype found in the mothers predominated in their infants. The conserved N-linked glycosylation site proximal to the first cysteine of the V3 loop was absent only in one infant's sequence set and in some variants of two other infants' sequences. Furthermore, the HIV-1 sequences of the epidemiologically linked mother-infant pairs were closer than the sequences of epidemiologically unlinked individuals, suggesting that the sequence comparison of mother-infant pairs done in order to identify genetic variants transmitted from mother to infant could be performed even in older infants. There was no evidence for transmission of a major genotype or multiple genotypes from mother to infant. In conclusion, a minor genotype of maternal virus is transmitted to the infants, and this finding could be useful in developing strategies to prevent maternal transmission of HIV-1 by means of perinatal interventions. PMID:7815476

  9. Flagellin diversity in Clostridium botulinum groups I and II: a new strategy for strain identification.

    PubMed

    Paul, Catherine J; Twine, Susan M; Tam, Kevin J; Mullen, James A; Kelly, John F; Austin, John W; Logan, Susan M

    2007-05-01

    Strains of Clostridium botulinum are traditionally identified by botulinum neurotoxin type; however, identification of an additional target for typing would improve differentiation. Isolation of flagellar filaments and analysis by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) showed that C. botulinum produced multiple flagellin proteins. Nano-liquid chromatography-tandem mass spectrometry (nLC-MS/MS) analysis of in-gel tryptic digests identified peptides in all flagellin bands that matched two homologous tandem flagellin genes identified in the C. botulinum Hall A genome. Designated flaA1 and flaA2, these open reading frames encode the major structural flagellins of C. botulinum. Colony PCR and sequencing of flaA1/A2 variable regions classified 80 environmental and clinical strains into group I or group II and clustered isolates into 12 flagellar types. Flagellar type was distinct from neurotoxin type, and epidemiologically related isolates clustered together. Sequencing a larger PCR product, obtained during amplification of flaA1/A2 from type E strain Bennett identified a second flagellin gene, flaB. LC-MS analysis confirmed that flaB encoded a large type E-specific flagellin protein, and the predicted molecular mass for FlaB matched that observed by SDS-PAGE. In contrast, the molecular mass of FlaA was 2 to 12 kDa larger than the mass predicted by the flaA1/A2 sequence of a given strain, suggesting that FlaA is posttranslationally modified. While identification of FlaB, and the observation by SDS-PAGE of different masses of the FlaA proteins, showed the flagellin proteins of C. botulinum to be diverse, the presence of the flaA1/A2 gene in all strains examined facilitates single locus sequence typing of C. botulinum using the flagellin variable region.

  10. A sequence database allowing automated genotyping of Classical swine fever virus isolates.

    PubMed

    Dreier, Sabrina; Zimmermann, Bernd; Moennig, Volker; Greiser-Wilke, Irene

    2007-03-01

    Classical swine fever (CSF) is a highly contagious viral disease of pigs. According to the OIE classification of diseases it is classified as a notifiable (previously List A) disease, thus having the potential for causing severe socio-economic problems and severely affecting the international trade of pigs and pig products. Effective control measures are compulsory, and to expose weaknesses a reliable tracing of the spread of the virus is necessary. Genetic typing has proved to be the method of choice. However, genotyping involves the use of multiple software applications, which is laborious and complex. The implementation of a sequence database that is accessible via the World Wide Web, with the option to automatically type new CSF virus isolates once their sequence is available, is described. The sequence to be typed is tested for correct orientation and, if necessary, adjusted to the right length. The alignment and the neighbor-joining phylogenetic analysis with a standard set of sequences can then be calculated. The results are displayed as a graph. As an example, the determination of the genetic subgroup of the isolate obtained from the outbreaks registered in Russia in 2005 is shown. After registration (Irene.greiser-wilke@tiho-hannover.de) the database, including the module for genotyping, is accessible under http://viro08.tiho-hannover.de/eg/eurl_virus_db.htm.
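The orientation test mentioned in this abstract could be sketched roughly as follows. This is an illustrative assumption rather than the database's actual procedure: the query and its reverse complement are each compared to a reference by counting shared k-mers, and the orientation that shares more is kept. The function names, the k-mer size and the toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all length-k substrings of seq, upper-cased."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def orient_to_reference(query, reference, k=8):
    """Return (sequence, was_flipped) in the orientation closest to reference."""
    ref_kmers = kmers(reference, k)
    forward_hits = len(kmers(query, k) & ref_kmers)
    flipped = reverse_complement(query)
    reverse_hits = len(kmers(flipped, k) & ref_kmers)
    if reverse_hits > forward_hits:
        return flipped, True
    return query, False


# Toy reference and a fragment submitted in the opposite orientation.
reference = "ATGGCACTGTTGGAAGACCCTGTCAATGGGAAACCTAGC"
submitted = reverse_complement(reference[5:30])
oriented, was_flipped = orient_to_reference(submitted, reference)
print(oriented, was_flipped)
```

Trimming to a standard length would be a separate step after orientation, for example by keeping only the region that aligns to the reference.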

  11. Localization of migraine susceptibility genes in human brain by single-cell RNA sequencing.

    PubMed

    Renthal, William

    2018-01-01

    Background. Migraine is a debilitating disorder characterized by severe headaches and associated neurological symptoms. A key challenge to understanding migraine has been the cellular complexity of the human brain and the multiple cell types implicated in its pathophysiology. The present study leverages recent advances in single-cell transcriptomics to localize the specific human brain cell types in which putative migraine susceptibility genes are expressed. Methods. The cell-type specific expression of both familial and common migraine-associated genes was determined bioinformatically using data from 2,039 individual human brain cells across two published single-cell RNA sequencing datasets. Enrichment of migraine-associated genes was determined for each brain cell type. Results. Analysis of single-brain cell RNA sequencing data from five major subtypes of cells in the human cortex (neurons, oligodendrocytes, astrocytes, microglia, and endothelial cells) indicates that over 40% of known migraine-associated genes are enriched in the expression profiles of a specific brain cell type. Further analysis of neuronal migraine-associated genes demonstrated that approximately 70% were significantly enriched in inhibitory neurons and 30% in excitatory neurons. Conclusions. This study takes the next step in understanding the human brain cell types in which putative migraine susceptibility genes are expressed. Both familial and common migraine may arise from dysfunction of discrete cell types within the neurovascular unit, and localization of the affected cell type(s) in an individual patient may provide insight into their susceptibility to migraine.

  12. Spike-Based Bayesian-Hebbian Learning of Temporal Sequences

    PubMed Central

    Lindén, Henrik; Lansner, Anders

    2016-01-01

    Many cognitive and motor functions are enabled by the temporal representation and processing of stimuli, but it remains an open issue how neocortical microcircuits can reliably encode and replay such sequences of information. To better understand this, a modular attractor memory network is proposed in which meta-stable sequential attractor transitions are learned through changes to synaptic weights and intrinsic excitabilities via the spike-based Bayesian Confidence Propagation Neural Network (BCPNN) learning rule. We find that the formation of distributed memories, embodied by increased periods of firing in pools of excitatory neurons, together with asymmetrical associations between these distinct network states, can be acquired through plasticity. The model’s feasibility is demonstrated using simulations of adaptive exponential integrate-and-fire model neurons (AdEx). We show that the learning and speed of sequence replay depends on a confluence of biophysically relevant parameters including stimulus duration, level of background noise, ratio of synaptic currents, and strengths of short-term depression and adaptation. Moreover, sequence elements are shown to flexibly participate multiple times in the sequence, suggesting that spiking attractor networks of this type can support an efficient combinatorial code. The model provides a principled approach towards understanding how multiple interacting plasticity mechanisms can coordinate hetero-associative learning in unison. PMID:27213810

  13. Molecular authentication of Radix Puerariae Lobatae and Radix Puerariae Thomsonii by ITS and 5S rRNA spacer sequencing.

    PubMed

    Sun, Ye; Shaw, Pang-Chui; Fung, Kwok-Pui

    2007-01-01

    In the present study, we examined nuclear DNA sequences in an attempt to reveal the relationships between Pueraria lobata (Willd.) Ohwi, P. thomsonii Benth., and P. montana (Lour.) Merr. We found that internal transcribed spacer (ITS) sequences of nuclear ribosomal DNA are highly divergent in P. lobata and P. thomsonii, and four types of ITS with different lengths are found in the two species. On the other hand, DNA sequences of the 5S rRNA gene spacer are highly conserved across multiple copies in P. lobata and P. thomsonii. They could be used to identify P. lobata, P. thomsonii, and P. montana of this complex, and may serve as a useful tool in medical authentication of Radix Puerariae Lobatae and Radix Puerariae Thomsonii.

  14. Extreme diversity of scorpion venom peptides and proteins revealed by transcriptomic analysis: implication for proteome evolution of scorpion venom arsenal.

    PubMed

    Ma, Yibao; He, Yawen; Zhao, Ruiming; Wu, Yingliang; Li, Wenxin; Cao, Zhijian

    2012-02-16

    Venom is an important genetic development crucial to the survival of scorpions for over 400 million years. We studied the evolution of the scorpion venom arsenal by means of comparative transcriptome analysis of venom glands and phylogenetic analysis of shared types of venom peptides and proteins between buthids and euscorpiids. Fifteen types of venom peptides and proteins were sequenced during the venom gland transcriptome analyses of two Buthidae species (Lychas mucronatus and Isometrus maculatus) and one Euscorpiidae species (Scorpiops margerisonae). Great diversity has been observed in translated amino acid sequences of these transcripts for venom peptides and proteins. Seven types of venom peptides and proteins were shared between buthids and euscorpiids. Molecular phylogenetic analysis revealed that at least five of the seven common types of venom peptides and proteins were likely recruited into the scorpion venom proteome before the lineage split between Buthidae and Euscorpiidae with their corresponding genes undergoing individual or multiple gene duplication events. These are α-KTxs, βKSPNs (β-KTxs and scorpines), anionic peptides, La1-like peptides, and SPSVs (serine proteases from scorpion venom). Multiple types of venom peptides and proteins were demonstrated to be continuously recruited into the venom proteome during the evolution process of individual scorpion lineages. Our results provide an insight into the recruitment pattern of the scorpion venom arsenal for the first time. Copyright © 2011 Elsevier B.V. All rights reserved.

  15. Identification of Human Papillomavirus Type 16 L1 Surface Loops Required for Neutralization by Human Sera

    PubMed Central

    Carter, Joseph J.; Wipf, Greg C.; Madeleine, Margaret M.; Schwartz, Stephen M.; Koutsky, Laura A.; Galloway, Denise A.

    2006-01-01

    The variable surface loops on human papillomavirus (HPV) virions required for type-specific neutralization by human sera remain poorly defined. To determine which loops are required for neutralization, a series of hybrid virus-like particles (VLPs) were used to adsorb neutralizing activity from HPV type 16 (HPV16)-reactive human sera before being tested in an HPV16 pseudovirion neutralization assay. The hybrid VLPs used were composed of L1 sequences of either HPV16 or HPV31, on which one or two regions were replaced with homologous sequences from the other type. The regions chosen for substitution were the five known loops that form surface epitopes recognized by monoclonal antibodies and two additional variable regions between residues 400 and 450. Pretreatment of human sera, previously found to react to HPV16 VLPs in enzyme-linked immunosorbent assays, with wild-type HPV16 VLPs and hybrid VLPs that retained the neutralizing epitopes reduced or eliminated the ability of sera to inhibit pseudovirus infection in vitro. Surprisingly, substitution of a single loop often ablated the ability of VLPs to adsorb neutralizing antibodies from human sera. However, for all sera tested, multiple surface loops were found to be important for neutralizing activity. Three regions, defined by loops DE, FG, and HI, were most frequently identified as being essential for binding by neutralizing antibodies. These observations are consistent with the existence of multiple neutralizing epitopes on the HPV virion surface. PMID:16641259

  16. Identification of human papillomavirus type 16 L1 surface loops required for neutralization by human sera.

    PubMed

    Carter, Joseph J; Wipf, Greg C; Madeleine, Margaret M; Schwartz, Stephen M; Koutsky, Laura A; Galloway, Denise A

    2006-05-01

    The variable surface loops on human papillomavirus (HPV) virions required for type-specific neutralization by human sera remain poorly defined. To determine which loops are required for neutralization, a series of hybrid virus-like particles (VLPs) were used to adsorb neutralizing activity from HPV type 16 (HPV16)-reactive human sera before being tested in an HPV16 pseudovirion neutralization assay. The hybrid VLPs used were composed of L1 sequences of either HPV16 or HPV31, on which one or two regions were replaced with homologous sequences from the other type. The regions chosen for substitution were the five known loops that form surface epitopes recognized by monoclonal antibodies and two additional variable regions between residues 400 and 450. Pretreatment of human sera, previously found to react to HPV16 VLPs in enzyme-linked immunosorbent assays, with wild-type HPV16 VLPs and hybrid VLPs that retained the neutralizing epitopes reduced or eliminated the ability of sera to inhibit pseudovirus infection in vitro. Surprisingly, substitution of a single loop often ablated the ability of VLPs to adsorb neutralizing antibodies from human sera. However, for all sera tested, multiple surface loops were found to be important for neutralizing activity. Three regions, defined by loops DE, FG, and HI, were most frequently identified as being essential for binding by neutralizing antibodies. These observations are consistent with the existence of multiple neutralizing epitopes on the HPV virion surface.

  17. Using high-sensitivity sequencing for the detection of mutations in BTK and PLCγ2 genes in cellular and cell-free DNA and correlation with progression in patients treated with BTK inhibitors.

    PubMed

    Albitar, Adam; Ma, Wanlong; DeDios, Ivan; Estella, Jeffrey; Ahn, Inhye; Farooqui, Mohammed; Wiestner, Adrian; Albitar, Maher

    2017-03-14

    Patients with chronic lymphocytic leukemia (CLL) who develop resistance to Bruton tyrosine kinase (BTK) inhibitors are typically positive for mutations in BTK or phospholipase C gamma 2 (PLCγ2). We developed a high sensitivity (HS) assay utilizing wild-type blocking polymerase chain reaction achieved via bridged and locked nucleic acids. We used this high sensitivity assay in combination with Sanger sequencing and next generation sequencing (NGS) and tested cellular DNA and cell-free DNA (cfDNA) from patients with CLL treated with the BTK inhibitor, ibrutinib. We also tested ibrutinib-naïve patients with CLL. HS testing achieved 100x greater sensitivity than Sanger. HS Sanger sequencing was capable of detecting < 1 mutant allele in a background of 1000 wild-type alleles (1:1000). Similar sensitivity was achieved with HS NGS. No BTK or PLCγ2 mutations were detected in any of the 44 ibrutinib-naïve CLL patients. We demonstrate that without HS testing, 56% of BTK-positive samples and 85% of PLCγ2-positive samples would have been missed. With the use of HS, we were able to detect multiple mutant clones in the same sample in 37.5% of patients; most would have been missed without HS testing. We also demonstrate that with HS sequencing, plasma cfDNA is more reliable than cellular DNA in detecting mutations. Our studies indicate that wild-type blocking and HS sequencing are necessary for proper and early detection of BTK or PLCγ2 mutations in monitoring patients treated with BTK inhibitors. Furthermore, cfDNA from plasma is a very reliable sample type for testing.

  18. Streptococcus agalactiae Serotype IV in Humans and Cattle, Northern Europe

    PubMed Central

    Lyhs, Ulrike; Kulkas, Laura; Katholm, Jørgen; Waller, Karin Persson; Saha, Kerttu; Tomusk, Richard J.

    2016-01-01

    Streptococcus agalactiae is an emerging pathogen of nonpregnant human adults worldwide and a reemerging pathogen of dairy cattle in parts of Europe. To learn more about interspecies transmission of this bacterium, we compared contemporaneously collected isolates from humans and cattle in Finland and Sweden. Multilocus sequence typing identified 5 sequence types (STs) (ST1, 8, 12, 23, and 196) shared across the 2 host species, suggesting possible interspecies transmission. More than 54% of the isolates belonged to those STs. Molecular serotyping and pilus island typing of those isolates did not differentiate between populations isolated from different host species. Isolates from humans and cattle differed in lactose fermentation, which is encoded on the accessory genome and represents an adaptation to the bovine mammary gland. Serotype IV-ST196 isolates were obtained from multiple dairy herds in both countries. Cattle may constitute a previously unknown reservoir of this strain. PMID:27869599
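Multilocus sequence typing assigns a sequence type (ST) from the combination of allele numbers observed at a fixed set of loci, and STs shared between host species are then a simple set intersection. The sketch below shows both steps. The locus names, allele numbers and ST labels are invented placeholders, not entries from any real MLST scheme, and the isolates are made up.

```python
# Hypothetical scheme: three loci and a small profile table.
LOCI = ("locusA", "locusB", "locusC")
PROFILES = {
    (1, 1, 1): "ST1",
    (1, 2, 1): "ST2",
    (3, 2, 5): "ST3",
}


def assign_st(alleles, loci=LOCI, profiles=PROFILES):
    """Look up the ST for one isolate's allele calls; 'novel' if unlisted."""
    try:
        key = tuple(alleles[locus] for locus in loci)
    except KeyError as missing:
        raise ValueError(f"no allele call for locus {missing}") from None
    return profiles.get(key, "novel")


def shared_sts(isolates_by_host):
    """Return the set of known STs observed in every host group."""
    st_sets = [
        {assign_st(isolate) for isolate in isolates} - {"novel"}
        for isolates in isolates_by_host.values()
    ]
    return set.intersection(*st_sets) if st_sets else set()


# Made-up isolates from two host species.
isolates = {
    "human": [
        {"locusA": 1, "locusB": 1, "locusC": 1},
        {"locusA": 1, "locusB": 2, "locusC": 1},
    ],
    "cattle": [
        {"locusA": 1, "locusB": 2, "locusC": 1},
        {"locusA": 3, "locusB": 2, "locusC": 5},
    ],
}
print(shared_sts(isolates))
```

Real schemes typically use seven loci and a curated profile database; the three-locus table here only keeps the example short.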

  19. MANGO: a new approach to multiple sequence alignment.

    PubMed

    Zhang, Zefeng; Lin, Hao; Li, Ming

    2007-01-01

    Multiple sequence alignment is a classical and challenging task for biological sequence analysis. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art multiple sequence alignment programs suffer from the 'once a gap, always a gap' phenomenon. Is there a radically new way to do multiple sequence alignment? This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds are provably significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, Prob-ConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0 and Kalign 2.0.
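To illustrate the difference between consecutive k-mers and spaced seeds, here is a small sketch. It is not MANGO's algorithm: the seed patterns, helper names and toy sequences are assumptions chosen for illustration. A seed is written as a string of 1s (positions that must match) and 0s (positions that are ignored).

```python
def seed_keys(seq, pattern):
    """Map each seed key to the start positions where it occurs in seq."""
    care = [i for i, flag in enumerate(pattern) if flag == "1"]
    span = len(pattern)
    index = {}
    for start in range(len(seq) - span + 1):
        key = "".join(seq[start + i] for i in care)
        index.setdefault(key, []).append(start)
    return index


def seed_hits(a, b, pattern):
    """All (pos_a, pos_b) pairs where a and b agree at the seed's '1' positions."""
    index_b = seed_keys(b, pattern)
    hits = []
    for key, positions_a in seed_keys(a, pattern).items():
        for pos_a in positions_a:
            for pos_b in index_b.get(key, []):
                hits.append((pos_a, pos_b))
    return sorted(hits)


# Two toy sequences that differ at positions 2 and 5.
a = "ACGTTGCA"
b = "ACTTTCCA"
print(seed_hits(a, b, "1111"))   # consecutive 4-mer, weight 4
print(seed_hits(a, b, "11011"))  # spaced seed, also weight 4
```

In this toy case no run of four matching positions exists, so the consecutive seed finds nothing, while the spaced seed of the same weight still anchors the two sequences by skipping over each mismatch.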

  20. Human Leukocyte Antigen Typing Using a Knowledge Base Coupled with a High-Throughput Oligonucleotide Probe Array Analysis

    PubMed Central

    Zhang, Guang Lan; Keskin, Derin B.; Lin, Hsin-Nan; Lin, Hong Huang; DeLuca, David S.; Leppanen, Scott; Milford, Edgar L.; Reinherz, Ellis L.; Brusic, Vladimir

    2014-01-01

    Human leukocyte antigens (HLA) are important biomarkers because multiple diseases, drug toxicity, and vaccine responses reveal strong HLA associations. Current clinical HLA typing is an elimination process requiring serial testing. We present an alternative in situ synthesized DNA-based microarray method that contains hundreds of thousands of probes representing a complete overlapping set covering 1,610 clinically relevant HLA class I alleles accompanied by computational tools for assigning HLA type to 4-digit resolution. Our proof-of-concept experiment included 21 blood samples, 18 cell lines, and multiple controls. The method is accurate, robust, and amenable to automation. Typing errors were restricted to homozygous samples or those with very closely related alleles from the same locus, but readily resolved by targeted DNA sequencing validation of flagged samples. High-throughput HLA typing technologies that are effective, yet inexpensive, can be used to analyze the world’s populations, benefiting both global public health and personalized health care. PMID:25505899

  1. A molecular surveillance reveals the prevalence of Vibrio cholerae O139 isolates in China from 1993 to 2012.

    PubMed

    Zhang, Ping; Zhou, Haijian; Diao, Baowei; Li, Fengjuan; Du, Pengcheng; Li, Jie; Kan, Biao; Morris, J Glenn; Wang, Duochun

    2014-04-01

    Vibrio cholerae serogroup O139 was first identified in 1992 in India and Bangladesh, in association with major epidemics of cholera in both countries; cases were noted shortly thereafter in China. We characterized 211 V. cholerae O139 isolates that were isolated at multiple sites in China between 1993 and 2012 from patients (n = 92) and the environment (n = 119). Among clinical isolates, 88 (95.7%) of 92 were toxigenic, compared with 47 (39.5%) of 119 environmental isolates. Toxigenic isolates carried the El Tor CTX prophage and toxin-coregulated pilus A gene (tcpA), as well as the Vibrio seventh pandemic island I (VSP-I) and VSP-II. Among a subset of 42 toxigenic isolates screened by multilocus sequence typing (MLST), all were in the same sequence type as a clinical isolate (MO45) from the original Indian outbreak. Nontoxigenic isolates, in contrast, generally lacked VSP-I and -II, and fell within 13 additional sequence types in two clonal complexes distinct from the toxigenic isolates. In further pulsed-field gel electrophoresis (PFGE) (with NotI digestion) studies, toxigenic isolates formed 60 pulsotypes clustered in one group, while the nontoxigenic isolates formed 43 pulsotypes which clustered into 3 different groups. Our data suggest that toxigenic O139 isolates from widely divergent geographic locations, while showing some diversity, have maintained a relatively tight clonal structure across a 20-year time span. Nontoxigenic isolates, in contrast, exhibited greater diversity, with multiple clonal lineages, than did their toxigenic counterparts.

  2. A decade of genomic history for healthcare-associated Enterococcus faecium in the United Kingdom and Ireland.

    PubMed

    Raven, Kathy E; Reuter, Sandra; Reynolds, Rosy; Brodrick, Hayley J; Russell, Julie E; Török, M Estée; Parkhill, Julian; Peacock, Sharon J

    2016-10-01

    Vancomycin-resistant Enterococcus faecium (VREfm) is an important cause of healthcare-associated infections worldwide. We undertook whole-genome sequencing (WGS) of 495 E. faecium bloodstream isolates from 2001-2011 in the United Kingdom and Ireland (UK&I) and 11 E. faecium isolates from a reference collection. Comparison between WGS and multilocus sequence typing (MLST) identified major discrepancies for 17% of isolates, with multiple instances of the same sequence type (ST) being located in genetically distant positions in the WGS tree. This confirms that WGS is superior to MLST for evolutionary analyses and is more accurate than current typing methods used during outbreak investigations. E. faecium has been categorized as belonging to three clades (Clades A1, hospital-associated; A2, animal-associated; and B, community-associated). Phylogenetic analysis of our isolates replicated the distinction between Clade A (97% of isolates) and Clade B but did not support the subdivision of Clade A into Clade A1 and A2. Phylogeographic analyses revealed that Clade A had been introduced multiple times into each hospital referral network or country, indicating frequent movement of E. faecium between regions that rarely share hospital patients. Numerous genetic clusters contained highly related vanA-positive and -negative E. faecium, which implies that control of vancomycin-resistant enterococci (VRE) in hospitals also requires consideration of vancomycin-susceptible E. faecium. Our findings reveal the evolution and dissemination of hospital-associated E. faecium in the UK&I and provide evidence for WGS as an instrument for infection control. © 2016 Raven et al.; Published by Cold Spring Harbor Laboratory Press.

  3. A decade of genomic history for healthcare-associated Enterococcus faecium in the United Kingdom and Ireland

    PubMed Central

    Raven, Kathy E.; Reuter, Sandra; Reynolds, Rosy; Brodrick, Hayley J.; Russell, Julie E.; Török, M. Estée; Parkhill, Julian; Peacock, Sharon J.

    2016-01-01

    Vancomycin-resistant Enterococcus faecium (VREfm) is an important cause of healthcare-associated infections worldwide. We undertook whole-genome sequencing (WGS) of 495 E. faecium bloodstream isolates from 2001–2011 in the United Kingdom and Ireland (UK&I) and 11 E. faecium isolates from a reference collection. Comparison between WGS and multilocus sequence typing (MLST) identified major discrepancies for 17% of isolates, with multiple instances of the same sequence type (ST) being located in genetically distant positions in the WGS tree. This confirms that WGS is superior to MLST for evolutionary analyses and is more accurate than current typing methods used during outbreak investigations. E. faecium has been categorized as belonging to three clades (Clades A1, hospital-associated; A2, animal-associated; and B, community-associated). Phylogenetic analysis of our isolates replicated the distinction between Clade A (97% of isolates) and Clade B but did not support the subdivision of Clade A into Clade A1 and A2. Phylogeographic analyses revealed that Clade A had been introduced multiple times into each hospital referral network or country, indicating frequent movement of E. faecium between regions that rarely share hospital patients. Numerous genetic clusters contained highly related vanA-positive and -negative E. faecium, which implies that control of vancomycin-resistant enterococci (VRE) in hospitals also requires consideration of vancomycin-susceptible E. faecium. Our findings reveal the evolution and dissemination of hospital-associated E. faecium in the UK&I and provide evidence for WGS as an instrument for infection control. PMID:27527616

  4. Comparison of the Live Attenuated Yellow Fever Vaccine 17D-204 Strain to Its Virulent Parental Strain Asibi by Deep Sequencing

    PubMed Central

    Beck, Andrew; Tesh, Robert B.; Wood, Thomas G.; Widen, Steven G.; Ryman, Kate D.; Barrett, Alan D. T.

    2014-01-01

    Background. The first comparison of a live RNA viral vaccine strain to its wild-type parental strain by deep sequencing is presented using as a model the yellow fever virus (YFV) live vaccine strain 17D-204 and its wild-type parental strain, Asibi. Methods. The YFV 17D-204 vaccine genome was compared to that of the parental strain Asibi by massively parallel methods. Variability was compared on multiple scales of the viral genomes. A modeled exploration of small-frequency variants was performed to reconstruct plausible regions of mutational plasticity. Results. Overt quasispecies diversity is a feature of the parental strain, whereas the live vaccine strain lacks diversity according to multiple independent measurements. A lack of attenuating mutations in the Asibi population relative to that of 17D-204 was observed, demonstrating that the vaccine strain was derived by discrete mutation of Asibi and not by selection of genomes in the wild-type population. Conclusions. Relative quasispecies structure is a plausible correlate of attenuation for live viral vaccines. Analyses such as these of attenuated viruses improve our understanding of the molecular basis of vaccine attenuation and provide critical information on the stability of live vaccines and the risk of reversion to virulence. PMID:24141982

  5. Comparison of the live attenuated yellow fever vaccine 17D-204 strain to its virulent parental strain Asibi by deep sequencing.

    PubMed

    Beck, Andrew; Tesh, Robert B; Wood, Thomas G; Widen, Steven G; Ryman, Kate D; Barrett, Alan D T

    2014-02-01

    The first comparison of a live RNA viral vaccine strain to its wild-type parental strain by deep sequencing is presented using as a model the yellow fever virus (YFV) live vaccine strain 17D-204 and its wild-type parental strain, Asibi. The YFV 17D-204 vaccine genome was compared to that of the parental strain Asibi by massively parallel methods. Variability was compared on multiple scales of the viral genomes. A modeled exploration of small-frequency variants was performed to reconstruct plausible regions of mutational plasticity. Overt quasispecies diversity is a feature of the parental strain, whereas the live vaccine strain lacks diversity according to multiple independent measurements. A lack of attenuating mutations in the Asibi population relative to that of 17D-204 was observed, demonstrating that the vaccine strain was derived by discrete mutation of Asibi and not by selection of genomes in the wild-type population. Relative quasispecies structure is a plausible correlate of attenuation for live viral vaccines. Analyses such as these of attenuated viruses improve our understanding of the molecular basis of vaccine attenuation and provide critical information on the stability of live vaccines and the risk of reversion to virulence.

  6. Detection of cystic fibrosis mutations in a GeneChip™ assay format

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miyada, C.G.; Cronin, M.T.; Kim, S.M.

    1994-09-01

    We are developing assays for the detection of cystic fibrosis mutations based on DNA hybridization. A DNA sample is amplified by PCR, labeled by incorporating a fluorescein-tagged dNTP, enzymatically treated to produce smaller fragments and hybridized to a series of short (13-16 bases) oligonucleotides synthesized on a glass surface via photolithography. The hybrids are detected by epifluorescence and mutations are identified by the specific pattern of hybridization. In a GeneChip assay, the chip surface is composed of a series of subarrays, each being specific for a particular mutation. Each subarray is further subdivided into a series of probes (40 total), half based on the mutant sequence and the remainder based on the wild-type sequence. For each of the subarrays, there is a redundancy in the number of probes that should hybridize to either a wild-type or a mutant target. The multiple probe strategy provides sequence information for a short five base region overlapping the mutation site. In addition, homozygous wild-type and mutant as well as heterozygous samples are each identified by a specific pattern of hybridization. The small size of each probe feature (250 × 250 µm²) permits the inclusion of additional probes required to generate sequence information by hybridization.
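The genotype call described in this abstract (wild-type, mutant or heterozygous from the pattern of hybridization) can be sketched as a comparison of the summed signal from the wild-type and mutant probe sets of one subarray. This is an illustrative assumption, not the GeneChip analysis itself: the function name, the fraction thresholds and the intensity values are hypothetical.

```python
def call_genotype(wt_signals, mut_signals, low=0.2, high=0.8):
    """Call a genotype from probe intensities of one mutation subarray.

    wt_signals / mut_signals: intensities of the wild-type and mutant probes.
    low / high: hypothetical cut-offs on the mutant share of total signal.
    """
    wt_total = sum(wt_signals)
    mut_total = sum(mut_signals)
    total = wt_total + mut_total
    if total <= 0:
        return "no call"
    mut_fraction = mut_total / total
    if mut_fraction < low:
        return "homozygous wild-type"
    if mut_fraction > high:
        return "homozygous mutant"
    return "heterozygous"


# Made-up intensities for a 40-probe subarray (20 wild-type, 20 mutant).
print(call_genotype([900] * 20, [50] * 20))
print(call_genotype([500] * 20, [450] * 20))
print(call_genotype([40] * 20, [800] * 20))
```

Summing over the redundant probes is one simple way to use the redundancy the abstract mentions; a per-probe pattern comparison would be an equally plausible design.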

  7. Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models

    PubMed Central

    Chiu, Chi-yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-ling; Xiong, Momiao; Fan, Ruzong

    2017-01-01

    To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen at least for two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data. PMID:28000696

  8. Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models.

    PubMed

    Chiu, Chi-Yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-Ling; Xiong, Momiao; Fan, Ruzong

    2017-02-01

    To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen at least for two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data.

  9. Multiplex Reverse Transcription-PCR for Simultaneous Surveillance of Influenza A and B Viruses

    PubMed Central

    Zhou, Bin; Barnes, John R.; Sessions, October M.; Chou, Tsui-Wen; Wilson, Malania; Stark, Thomas J.; Volk, Michelle; Spirason, Natalie; Halpin, Rebecca A.; Kamaraj, Uma Sangumathi; Ding, Tao; Stockwell, Timothy B.; Ghedin, Elodie; Barr, Ian G.

    2017-01-01

    Influenza A and B viruses are the causative agents of annual influenza epidemics that can be severe, and influenza A viruses intermittently cause pandemics. Sequence information from influenza virus genomes is instrumental in determining mechanisms underpinning antigenic evolution and antiviral resistance. However, due to sequence diversity and the dynamics of influenza virus evolution, rapid and high-throughput sequencing of influenza viruses remains a challenge. We developed a single-reaction influenza A/B virus (FluA/B) multiplex reverse transcription-PCR (RT-PCR) method that amplifies the most critical genomic segments (hemagglutinin [HA], neuraminidase [NA], and matrix [M]) of seasonal influenza A and B viruses for next-generation sequencing, regardless of viral type, subtype, or lineage. Herein, we demonstrate that the strategy is highly sensitive and robust. The strategy was validated on thousands of seasonal influenza A and B virus-positive specimens using multiple next-generation sequencing platforms. PMID:28978683

  10. Team-based learning to improve learning outcomes in a therapeutics course sequence.

    PubMed

    Bleske, Barry E; Remington, Tami L; Wells, Trisha D; Dorsch, Michael P; Guthrie, Sally K; Stumpf, Janice L; Alaniz, Marissa C; Ellingrod, Vicki L; Tingen, Jeffrey M

    2014-02-12

    To compare the effectiveness of team-based learning (TBL) to that of traditional lectures on learning outcomes in a therapeutics course sequence. A revised TBL curriculum was implemented in a therapeutic course sequence. Multiple choice and essay questions identical to those used to test third-year students (P3) taught using a traditional lecture format were administered to the second-year pharmacy students (P2) taught using the new TBL format. One hundred thirty-one multiple-choice questions were evaluated; 79 tested recall of knowledge and 52 tested higher level, application of knowledge. For the recall questions, students taught through traditional lectures scored significantly higher compared to the TBL students (88%±12% vs. 82%±16%, p=0.01). For the questions assessing application of knowledge, no differences were seen between teaching pedagogies (81%±16% vs. 77%±20%, p=0.24). Scores on essay questions and the number of students who achieved 100% were also similar between groups. Transition to a TBL format from a traditional lecture-based pedagogy allowed P2 students to perform at a similar level as students with an additional year of pharmacy education on application of knowledge type questions. However, P3 students outperformed P2 students regarding recall type questions and overall. Further assessment of long-term learning outcomes is needed to determine if TBL produces more persistent learning and improved application in clinical settings.

  11. FASMA: a service to format and analyze sequences in multiple alignments.

    PubMed

    Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

    2007-12-01

    Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acids and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service aimed to visualize and analyze the multiple alignments obtained with different external algorithms, with new features useful for the comparison of the aligned sequences as well as for the creation of a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.

  12. Rational Design of Peptide Vaccines Against Multiple Types of Human Papillomavirus

    PubMed Central

    Dey, Sumanta; De, Antara; Nandy, Ashesh

    2016-01-01

    Human papillomavirus (HPV) occurs in many types, some of which cause cervical, genital, and other cancers. While vaccination is available against the major cancer-causing HPV types, many others are not covered by these preventive measures. Herein, we present a bioinformatics study for the designing of multivalent peptide vaccines against multiple HPV types as an alternative strategy to the virus-like particle vaccines being used now. Our technique of rational design of peptide vaccines is expected to ensure stability of the vaccine against many cycles of mutational changes, elicit immune response, and negate autoimmune possibilities. Using the L1 capsid protein sequences, we identified several peptides for potential vaccine design for HPV 16, 18, 33, 35, 45, and 11 types. Although there are concerns about the epitope-binding affinities for the peptides identified in this process, the technique indicates possibilities of multivalent, adjuvanted, peptide vaccines against a wider range of HPV types, and tailor-made different combinations of the peptides to address frequency variations of types over different population groups as required for prophylaxis and at lower cost than are in use at the present time. PMID:27279731

  13. [Advance of the study on LRRK2 gene in Parkinson's disease].

    PubMed

    Zhang, Yu; Chen, Shengdi

    2008-12-01

    The leucine-rich repeat kinase 2 (LRRK2) has been identified to be the gene causing autosomal dominant inherited Parkinson's disease (PD) 8. The clinical features of this type of PD are similar to those of idiopathic PD, but the pathological changes are diverse. The mutation types and frequencies of LRRK2 are distributed unevenly across different populations. LRRK2 is a large complex protein with multiple functions and is widely expressed in the human body. Sequence alignment shows that LRRK2 might be a multifunctional kinase for substrate phosphorylation and might also act as a scaffolding protein. Further study on the physiological function and pathogenic mechanism of LRRK2 will help to find out the possible pathogenesis and new treatment for PD.

  14. 1H and 15N NMR resonance assignments and secondary structure of titin type I domains.

    PubMed

    Muhle-Goll, C; Nilges, M; Pastore, A

    1997-01-01

    Titin/connectin is a giant muscle protein with a highly modular architecture consisting of multiple repeats of two sequence motifs, named type I and type II. Type I modules have been suggested to be intracellular members of the fibronectin type III (Fn3) domain family. Along the titin sequence they are exclusively present in the region of the molecule located in the sarcomere A-band. This region has been shown to interact with myosin and C-protein. One of the most noticeable features of type I modules is that they are particularly rich in semiconserved prolines, since these residues account for about 8% of their sequence. We have determined the secondary structure of a representative type I domain (A71) by 15N and 1H NMR. We show that the type I domains of titin have the Fn3 fold as proposed, consisting of a three- and a four-stranded beta-sheet. When the two sheets are placed on top of each other to form the beta-sandwich characteristic of the Fn3 fold, 8 out of 10 prolines are found on the same side of the molecule and form an exposed hydrophobic patch. This suggests that the semiconserved prolines might be relevant for the function of type I modules, providing a surface for binding to other A-band proteins. The secondary structure of A71 was structurally aligned to other extracellular Fn3 modules of known 3D structure. The alignment shows that titin type I modules have closest similarity to the first Fn3 domain of Drosophila neuroglian.

  15. Phylogenetic Copy-Number Factorization of Multiple Tumor Samples.

    PubMed

    Zaccaria, Simone; El-Kebir, Mohammed; Klau, Gunnar W; Raphael, Benjamin J

    2018-04-16

    Cancer is an evolutionary process driven by somatic mutations. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the many types of mutations in cancer and the fact that nearly all cancer sequencing is of a bulk tumor, measuring a superposition of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy-number aberrations (CNAs) measured in bulk-sequencing data. We introduce the Copy-Number Tree Mixture Deconvolution (CNTMD) problem, which aims to find the phylogenetic tree with the fewest number of CNAs that explain the copy-number data from multiple samples of a tumor. We design an algorithm for solving the CNTMD problem and apply the algorithm to both simulated and real data. On simulated data, we find that our algorithm outperforms existing approaches that either perform deconvolution/factorization of mixed tumor samples or build phylogenetic trees assuming homogeneous tumor samples. On real data, we analyze multiple samples from a prostate cancer patient, identifying clones within these samples and a phylogenetic tree that relates these clones and their differing proportions across samples. This phylogenetic tree provides a higher resolution view of copy-number evolution of this cancer than published analyses.
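
    The abstract describes a bulk tumor sample as a superposition of different cells and mentions deconvolution/factorization of mixed samples, but it does not give the formulation. A minimal sketch of the forward direction, under the assumption that each sample's observed copy number per genomic segment is the proportion-weighted sum of the integer copy numbers of its clones, is shown below. The function name mix_copy_numbers and the example values are hypothetical; the record's algorithm solves the much harder inverse problem and is not reproduced here.

```python
def mix_copy_numbers(proportions, clone_profiles):
    """Return the fractional copy number per segment for each bulk sample.

    proportions[s][k]    : assumed fraction of sample s made up of clone k
                           (each row must sum to 1)
    clone_profiles[k][j] : integer copy number of clone k in segment j
    """
    n_segments = len(clone_profiles[0])
    mixed = []
    for sample in proportions:
        if abs(sum(sample) - 1.0) > 1e-9:
            raise ValueError("clone proportions must sum to 1")
        mixed.append([
            sum(u * clone[j] for u, clone in zip(sample, clone_profiles))
            for j in range(n_segments)
        ])
    return mixed


# Two clones over three segments: a diploid clone and one with a gain and a loss.
clones = [[2, 2, 2], [2, 3, 1]]
samples = [[0.5, 0.5], [1.0, 0.0]]
print(mix_copy_numbers(samples, clones))
```

    Deconvolution runs this model backwards: given only the mixed values for several samples, it looks for clone profiles and proportions that reproduce them.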

  16. SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments

    PubMed Central

    Di Tommaso, Paolo; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric

    2014-01-01

    This article introduces the SARA-Coffee web server; a service allowing the online computation of 3D structure based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee. PMID:24972831

  17. Comparative genome analysis of multiple vancomycin-resistant Enterococcus faecium isolated from two fatal cases.

    PubMed

    Lim, Shu Yong; Yap, Kien-Pong; Teh, Cindy Shuan Ju; Jabar, Kartini Abdul; Thong, Kwai Lin

    2017-04-01

    Enterococcus faecium is both a commensal of the human intestinal tract and an opportunistic pathogen. The increasing incidence of enterococcal infections is mainly due to the ability of this organism to develop resistance to multiple antibiotics, including vancomycin. The aim of this study was to perform comparative genome analyses on four vancomycin-resistant Enterococcus faecium (VREfm) strains isolated from two fatal cases in a tertiary hospital in Malaysia. Two sequence types, ST80 and ST203, were identified which belong to the clinically important clonal complex (CC) 17. This is the first report on the emergence of ST80 strains in Malaysia. Three of the studied strains (VREr5, VREr6, VREr7) were each isolated from different body sites of a single patient (patient Y) and had different PFGE patterns. While VREr6 and VREr7 were phenotypically and genotypically similar, the initial isolate, VREr5, was found to be more similar to VRE2 isolated from another patient (patient X), in terms of the genome contents, sequence types and phylogenomic relationship. Both the clinical records and genome sequence data suggested that patient Y was infected by multiple strains from different clones and the strain that infected patient Y could have derived from the same clone from patient X. These multidrug resistant strains harbored a number of virulence genes such as the epa locus and pilus-associated genes which could enhance their persistence. Apart from that, a homolog of E. faecalis bee locus was identified in VREr5 which might be involved in biofilm formation. Overall, our comparative genomic analyses have provided insight into the genetic relatedness, as well as the virulence potential, of the four clinical strains.

  18. Genetic relationships and epidemiological links between wild type 1 poliovirus isolates in Pakistan and Afghanistan

    PubMed Central

    2012-01-01

    Background/Aim: Efforts have been made to eliminate wild poliovirus transmission since 1988 when the World Health Organization began its global eradication campaign. Since then, the incidence of polio has decreased significantly. However, serotype 1 and serotype 3 still circulate endemically in Pakistan and Afghanistan. Both countries constitute a single epidemiologic block representing one of the three remaining major global reservoirs of poliovirus transmission. In this study we used genetic sequence data to investigate transmission links among viruses from diverse locations during 2005-2007. Methods: In order to find the origins and routes of wild type 1 poliovirus circulation, polioviruses were isolated from faecal samples of Acute Flaccid Paralysis (AFP) patients. We used viral cultures, two intratypic differentiation methods (PCR and ELISA) to characterize isolates as vaccine or wild type 1, and nucleic acid sequencing of the entire VP1 region of the poliovirus genome to determine the genetic relatedness. Results: One hundred eleven wild type 1 poliovirus isolates were subjected to nucleotide sequencing for genetic variation study. Considering the 15% divergence of the sequences from Sabin 1, phylogenetic analysis by MEGA software revealed active inter- and intra-country transmission of many genetically distinct strains of wild poliovirus type 1 belonging to genotype SOAS, which is indigenous in this region. By grouping wild type 1 polioviruses according to nucleotide sequence homology, three distinct clusters A, B and C were obtained with multiple chains of transmission together with some silent circulations represented by orphan lineages. Conclusion: Our results emphasize that there was a persistent transmission of wild type 1 polioviruses in Pakistan and Afghanistan during 2005-2007. The epidemiologic information provided by the sequence data can contribute to the formulation of better strategies for poliomyelitis control in those critical areas, associated with high risk population groups which include migrants, internally displaced people, and refugees. The implication of this study is to maintain high quality mass immunization with oral polio vaccine (OPV) in order to interrupt chains of virus transmission in both countries to endorse substantial progress in Eastern-Mediterranean region. PMID:22353446

  19. Genotyping of ancient Mycobacterium tuberculosis strains reveals historic genetic diversity.

    PubMed

    Müller, Romy; Roberts, Charlotte A; Brown, Terence A

    2014-04-22

    The evolutionary history of the Mycobacterium tuberculosis complex (MTBC) has previously been studied by analysis of sequence diversity in extant strains, but not addressed by direct examination of strain genotypes in archaeological remains. Here, we use ancient DNA sequencing to type 11 single nucleotide polymorphisms and two large sequence polymorphisms in the MTBC strains present in 10 archaeological samples from skeletons from Britain and Europe dating to the second-nineteenth centuries AD. The results enable us to assign the strains to groupings and lineages recognized in the extant MTBC. We show that at least during the eighteenth-nineteenth centuries AD, strains of M. tuberculosis belonging to different genetic groups were present in Britain at the same time, possibly even at a single location, and we present evidence for a mixed infection in at least one individual. Our study shows that ancient DNA typing applied to multiple samples can provide sufficiently detailed information to contribute to both archaeological and evolutionary knowledge of the history of tuberculosis.

  20. Functional and mechanistic diversity of distal transcription enhancers

    PubMed Central

    Bulger, Michael; Groudine, Mark

    2013-01-01

    Biological differences among metazoans, and between cell types in a given organism, arise in large part due to differences in gene expression patterns. The sequencing of multiple metazoan genomes, coupled with recent advances in genome-wide analysis of histone modifications and transcription factor binding, has revealed that among regulatory DNA sequences, gene-distal enhancers appear to exhibit the greatest diversity and cell-type specificity. Moreover, such elements are emerging as important targets for mutations that can give rise to disease and to genetic variability that underlies evolutionary change. Studies of long-range interactions between distal genomic sequences in the nucleus indicate that enhancers are often important determinants of nuclear organization, contributing to a general model for enhancer function that involves direct enhancer-promoter contact. In a number of systems, however, mechanisms for enhancer function are emerging that do not fit solely within such a model, suggesting that enhancers as a class of DNA regulatory element may be functionally and mechanistically diverse. PMID:21295696

  1. Typing and comparative genome analysis of Brucella melitensis isolated from Lebanon.

    PubMed

    Abou Zaki, Natalia; Salloum, Tamara; Osman, Marwan; Rafei, Rayane; Hamze, Monzer; Tokajian, Sima

    2017-10-16

    Brucella melitensis is the main causative agent of the zoonotic disease brucellosis. This study aimed at typing and characterizing genetic variation in 33 Brucella isolates recovered from patients in Lebanon. Bruce-ladder multiplex PCR and PCR-RFLP of omp31, omp2a and omp2b were performed. Sixteen representative isolates were chosen for draft-genome sequencing and analyzed to determine variations in virulence, resistance, genomic islands, prophages and insertion sequences. Comparative whole-genome single nucleotide polymorphism analysis was also performed. The isolates were confirmed to be B. melitensis. Genome analysis revealed multiple virulence determinants and efflux pumps. Genome comparisons and single nucleotide polymorphisms divided the isolates based on geographical distribution but revealed high levels of similarity between the strains. Sequence divergence in B. melitensis was mainly due to lateral gene transfer of mobile elements. This is the first report of an in-depth genomic characterization of B. melitensis in Lebanon.

  2. Population genomic analysis of strain variation in Leptospirillum group II bacteria involved in acid mine drainage formation.

    PubMed

    Simmons, Sheri L; Dibartolo, Genevieve; Denef, Vincent J; Goltsman, Daniela S Aliaga; Thelen, Michael P; Banfield, Jillian F

    2008-07-22

    Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth approximately 20x). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (approximately 94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. Thus, the most likely explanation for the observed patterns of polymorphism is divergence of ancestral strains due to geographic isolation, followed by mixing and subsequent recombination.

  3. Population Genomic Analysis of Strain Variation in Leptospirillum Group II Bacteria Involved in Acid Mine Drainage Formation

    PubMed Central

    Denef, Vincent J; Goltsman, Daniela S. Aliaga; Thelen, Michael P; Banfield, Jillian F

    2008-01-01

    Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth ∼20×). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (∼94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. Thus, the most likely explanation for the observed patterns of polymorphism is divergence of ancestral strains due to geographic isolation, followed by mixing and subsequent recombination. PMID:18651792

  4. Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.

    PubMed

    Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao

    2016-04-01

    To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of the complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remains fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.

  5. Next Generation Sequencing Technology and Genomewide Data Analysis: Perspectives for Retinal Research

    PubMed Central

    Chaitankar, Vijender; Karakülah, Gökhan; Ratnapriya, Rinki; Giuste, Felipe O.; Brooks, Matthew J.; Swaroop, Anand

    2016-01-01

    The advent of high throughput next generation sequencing (NGS) has accelerated the pace of discovery of disease-associated genetic variants and genomewide profiling of expressed sequences and epigenetic marks, thereby permitting systems-based analyses of ocular development and disease. Rapid evolution of NGS and associated methodologies presents significant challenges in acquisition, management, and analysis of large data sets and for extracting biologically or clinically relevant information. Here we illustrate the basic design of commonly used NGS-based methods, specifically whole exome sequencing, transcriptome, and epigenome profiling, and provide recommendations for data analyses. We briefly discuss systems biology approaches for integrating multiple data sets to elucidate gene regulatory or disease networks. While we provide examples from the retina, the NGS guidelines reviewed here are applicable to other tissues/cell types as well. PMID:27297499

  6. The Papillomavirus Episteme: a major update to the papillomavirus sequence database.

    PubMed

    Van Doorslaer, Koenraad; Li, Zhiwen; Xirasagar, Sandhya; Maes, Piet; Kaminsky, David; Liou, David; Sun, Qiang; Kaur, Ramandeep; Huyen, Yentram; McBride, Alison A

    2017-01-04

    The Papillomavirus Episteme (PaVE) is a database of curated papillomavirus genomic sequences, accompanied by web-based sequence analysis tools. This update describes the addition of major new features. The papillomavirus genomes within PaVE have been further annotated, and now include the major spliced mRNA transcripts. Viral genes and transcripts can be visualized on both linear and circular genome browsers. Evolutionary relationships among PaVE reference protein sequences can be analysed using multiple sequence alignments and phylogenetic trees. To assist in viral discovery, PaVE offers a typing tool: a simplified algorithm to determine whether a newly sequenced virus is novel. PaVE also now contains an image library containing gross clinical and histopathological images of papillomavirus infected lesions. Database URL: https://pave.niaid.nih.gov/.

  7. Molecular Evolution of a Type 1 Wild-Vaccine Poliovirus Recombinant during Widespread Circulation in China

    PubMed Central

    Liu, Hong-Mei; Zheng, Du-Ping; Zhang, Li-Bi; Oberste, M. Steven; Pallansch, Mark A.; Kew, Olen M.

    2000-01-01

    Type 1 wild-vaccine recombinant polioviruses were isolated from poliomyelitis patients in China from 1991 to 1993. We compared the sequences of 34 recombinant isolates over the 1,353-nucleotide (nt) genomic interval (nt 2480 to 3832) encoding the major capsid protein, VP1, and the protease, 2A. All recombinants had a 367-nt block of sequence (nt 3271 to 3637) derived from the Sabin 1 oral poliovirus vaccine strain spanning the 3′-terminal sequences of VP1 (115 nt) and the 5′ half of 2A (252 nt). The remaining VP1 sequences were closely (up to 99.5%) related to those of a major genotype of wild type 1 poliovirus endemic to China up to 1994. In contrast, the non-vaccine-derived sequences at the 3′ half of 2A were more distantly related (<90% nucleotide sequence match) to those of other contemporary wild polioviruses from China. The vaccine-derived sequences of the earliest (April 1991) isolates completely matched those of Sabin 1. Later isolates diverged from the early isolates primarily by accumulation of synonymous base substitutions (at a rate of ∼3.7 × 10⁻² substitutions per synonymous site per year) over the entire VP1-2A interval. Distinct evolutionary lineages were found in different Chinese provinces. From the combined epidemiologic and evolutionary analyses, we propose that the recombinant virus arose during mixed infection of a single individual in northern China in early 1991 and that its progeny spread by multiple independent chains of transmission into some of the most populous areas of China within a year of the initiating infection. PMID:11070012

  8. Multiplexed fragaria chloroplast genome sequencing

    Treesearch

    W. Njuguna; A. Liston; R. Cronn; N.V. Bassil

    2010-01-01

    A method to sequence multiple chloroplast genomes using ultra high throughput sequencing technologies was recently described. Complete chloroplast genome sequences can resolve phylogenetic relationships at low taxonomic levels and identify informative point mutations and indels. The objective of this research was to sequence multiple Fragaria...

  9. Embedding strategies for effective use of information from multiple sequence alignments.

    PubMed Central

    Henikoff, S.; Henikoff, J. G.

    1997-01-01

    We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452
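
    To make the two family representations named in the abstract above concrete, the following is a minimal sketch of how consensus residues and a log-odds PSSM can be derived from an ungapped alignment. It is not the authors' procedure: the function names, the uniform background, the pseudocount of 1 and the toy DNA alignment are all assumptions chosen for brevity (the abstract concerns protein families).

```python
from collections import Counter
from math import log2


def column_counts(alignment):
    """Count residues in each column of an ungapped, equal-length alignment."""
    return [Counter(column) for column in zip(*alignment)]


def consensus(alignment):
    """Most frequent residue per column (ties resolved by first occurrence)."""
    return "".join(c.most_common(1)[0][0] for c in column_counts(alignment))


def pssm(alignment, alphabet, pseudocount=1.0):
    """Log-odds scores per column against a uniform background (assumption)."""
    n_seqs = len(alignment)
    background = 1.0 / len(alphabet)
    matrix = []
    for counts in column_counts(alignment):
        total = n_seqs + pseudocount * len(alphabet)
        matrix.append({
            residue: log2(((counts[residue] + pseudocount) / total) / background)
            for residue in alphabet
        })
    return matrix


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["ACGT", "ACGA", "ACCT"]
toy_consensus = consensus(toy_alignment)
toy_pssm = pssm(toy_alignment, "ACGT")
```

    In this sketch a fully conserved column scores its residue above zero and every other residue below zero, which is the property that lets an embedded PSSM carry more information than a single consensus letter.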

  10. Thysanophora penicillioides includes multiple genetically diverged groups that coexist respectively in Abies mariesii forests in Japan.

    PubMed

    Iwamoto, Susumu; Tokumasu, Seiji; Suyama, Yoshihisa; Kakishima, Makoto

    2005-01-01

    We investigated intraspecific diversity and genetic structures of a saprotrophic fungus--Thysanophora penicillioides--based on sequences of nuclear ribosomal internal transcribed spacer (ITS) in 15 discontinuous Abies mariesii forests of Japan. In such a well-defined morphological species, numerous unexpected ITS variations were revealed: 12 ITS sequence types detected in 254 isolates collected from 15 local populations were classified into five ITS sequence groups. Maximally, four ITS groups consisted of seven ITS types coexisting in one population. However, group 1 was dominant with approximately 65%; in particular, one haplotype, 1a, was most dominant with approximately 60% in respective populations. Therefore, few differences were recognized in genetic structure among local populations, implying that the gene flow of each lineage of the fungus occurs among local populations without geographic limitations. However, minor haplotypes in some ITS groups were found only in restricted areas, suggesting that they might expand steadily from their places of origin to neighboring A. mariesii forests. Aggregating sequence data of seven European strains and four North American strains from various substrates to those of Japanese strains, 18 ITS sequence types and 28 variable sites were recognized. They were clustered into nine lineages by phylogenetic analyses of the beta-tubulin and combined ITS and beta-tubulin datasets. According to phylogenetic species recognition by the concordance of genealogies, respective lineages correspond to phylogenetic species. Plural phylogenetic species coexist in a local population in an A. mariesii forest in Japan.

  11. pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

    PubMed

    Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

    2013-08-01

    With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichments analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.

  12. Characterisation of IS153, an IS3-family insertion sequence isolated from Lactobacillus sanfranciscensis and its use for strain differentiation.

    PubMed

    Ehrmann, M A; Vogel, R E

    2001-11-01

    An insertion sequence has been identified in the genome of Lactobacillus sanfranciscensis DSM 20451T as a segment of 1351 nucleotides containing 37-bp imperfect terminal inverted repeats. The sequence of this element encodes two out-of-phase, overlapping open reading frames, orfA and orfB, from which three putative proteins are produced. OrfAB is a transframe protein produced by -1 translational frameshifting between orfA and orfB that is presumed to be the transposase. The large orfAB of this element encodes a 342 amino acid protein that displays similarities with transposases encoded by bacterial insertion sequences belonging to the IS3 family. In L. sanfranciscensis type strain DSM 20451T multiple truncated IS elements were identified. Inverse PCR was used to analyze target sites of four of these elements, but apart from their highly AT-rich character no sequence specificity has been identified so far. Moreover, no flanking direct repeats were identified. Multiple copies of IS153 were detected by hybridization in other strains of L. sanfranciscensis. The resulting hybridization patterns were shown to differentiate between organisms at strain level, in contrast to a probe targeted against the 16S rDNA. With a PCR-based approach, IS153 or highly similar sequences were detected in L. acidophilus, L. casei, L. malefermentans, L. plantarum, L. hilgardii, L. collinoides, L. farciminis, L. sakei, L. salivarius and L. reuteri, as well as in Enterococcus faecium, Pediococcus acidilactici and P. pentosaceus.

  13. 'DNA Strider': a 'C' program for the fast analysis of DNA and protein sequences on the Apple Macintosh family of computers.

    PubMed Central

    Marck, C

    1988-01-01

    DNA Strider is a new integrated DNA and protein sequence analysis program written in the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for day-to-day sequence analysis work. The program consists of a multi-window sequence editor and various DNA and protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can simultaneously handle 6 sequences of any type, up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated, such as the graphic restriction map, the hydrophobicity profile and the CAI (codon adaptation index) plot according to Sharp and Li. The restriction site search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can therefore be computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds, and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831

  14. New Sequences with Low Correlation and Large Family Size

    NASA Astrophysics Data System (ADS)

    Zeng, Fanxin

    In direct-sequence code-division multiple-access (DS-CDMA) communication systems and direct-sequence ultra wideband (DS-UWB) radios, sequences with low correlation and large family size are important for reducing multiple access interference (MAI) and accepting more active users, respectively. In this paper, a new collection of families of sequences of length $p^n-1$, which includes three constructions, is proposed. The maximum number of cyclically distinct families without GMW sequences in each construction is $\frac{\varphi(p^n-1)}{n}\cdot\frac{\varphi(p^m-1)}{m}$, where $p$ is a prime number, $n$ is an even number, and $n=2m$, and these sequences can be binary or polyphase depending upon the choice of the parameter $p$. In Construction I, there are $p^n$ distinct sequences within each family and the new sequences have at most $d+2$ nontrivial periodic correlation values $\{-p^m-1, -1, p^m-1, 2p^m-1, \ldots, dp^m-1\}$. In Construction II, the new sequences have large family size $p^{2n}$ and possibly take the nontrivial correlation values in $\{-p^m-1, -1, p^m-1, 2p^m-1, \ldots, (3d-4)p^m-1\}$. In Construction III, the new sequences possess the largest family size $p^{(d-1)n}$ and have at most $2d$ correlation levels $\{-p^m-1, -1, p^m-1, 2p^m-1, \ldots, (2d-2)p^m-1\}$. The three constructions are near-optimal with respect to the Welch bound because the values of their Welch-Ratios are moderate, $WR \approx d$, $WR \approx 3d-4$ and $WR \approx 2d-2$, respectively. Each family in Constructions I, II and III contains a GMW sequence. In addition, Helleseth sequences and Niho sequences are special cases in Constructions I and III, and their restriction conditions on the integers $m$ and $n$, $p^m \not\equiv 2 \pmod 3$ and $n \equiv 0 \pmod 4$, respectively, are removed in our sequences. Our sequences in Construction III include the sequences with Niho type decimation $3\cdot 2^m-2$, too. Finally, some open questions are pointed out and an example that illustrates the performance of these sequences is given.

  15. KinView: A visual comparative sequence analysis tool for integrated kinome research

    PubMed Central

    McSkimming, Daniel Ian; Dastgheib, Shima; Baffi, Timothy R.; Byrne, Dominic P.; Ferries, Samantha; Scott, Steven Thomas; Newton, Alexandra C.; Eyers, Claire E.; Kochut, Krzysztof J.; Eyers, Patrick A.

    2017-01-01

    Multiple sequence alignments (MSAs) are a fundamental analysis tool used throughout biology to investigate relationships between protein sequence, structure, function, evolutionary history, and patterns of disease-associated variants. However, their widespread application in systems biology research is currently hindered by the lack of user-friendly tools to simultaneously visualize, manipulate and query the information conceptualized in large sequence alignments, and the challenges in integrating MSAs with multiple orthogonal data such as cancer variants and post-translational modifications, which are often stored in heterogeneous data sources and formats. Here, we present the Multiple Sequence Alignment Ontology (MSAOnt), which represents a profile or consensus alignment in an ontological format. Subsets of the alignment are easily selected through the SPARQL Protocol and RDF Query Language for downstream statistical analysis or visualization. We have also created the Kinome Viewer (KinView), an interactive integrative visualization that places eukaryotic protein kinase cancer variants in the context of natural sequence variation and experimentally determined post-translational modifications, which play central roles in the regulation of cellular signaling pathways. Using KinView, we identified differential phosphorylation patterns between tyrosine and serine/threonine kinases in the activation segment, a major kinase regulatory region that is often mutated in proliferative diseases. We discuss cancer variants that disrupt phosphorylation sites in the activation segment, and show how KinView can be used as a comparative tool to identify differences and similarities in natural variation, cancer variants and post-translational modifications between kinase groups, families and subfamilies. 
    Based on KinView comparisons, we identify and experimentally characterize a regulatory tyrosine (Y177^PLK4) in the PLK4 C-terminal activation segment region termed the P+1 loop. To further demonstrate the application of KinView in hypothesis generation and testing, we formulate and validate a hypothesis explaining a novel predicted loss-of-function variant (D523N^PKCβ) in the regulatory spine of PKCβ, a recently identified tumor suppressor kinase. KinView provides a novel, extensible interface for performing comparative analyses between subsets of kinases and for integrating multiple types of residue-specific annotations in user-friendly formats. PMID:27731453

  16. Application of Molecular Typing Results in Source Attribution Models: The Case of Multiple Locus Variable Number Tandem Repeat Analysis (MLVA) of Salmonella Isolates Obtained from Integrated Surveillance in Denmark.

    PubMed

    de Knegt, Leonardo V; Pires, Sara M; Löfström, Charlotta; Sørensen, Gitte; Pedersen, Karl; Torpdahl, Mia; Nielsen, Eva M; Hald, Tine

    2016-03-01

    Salmonella is an important cause of bacterial foodborne infections in Denmark. To identify the main animal-food sources of human salmonellosis, risk managers have relied on a routine application of a microbial subtyping-based source attribution model since 1995. In 2013, multiple locus variable number tandem repeat analysis (MLVA) substituted phage typing as the subtyping method for surveillance of S. Enteritidis and S. Typhimurium isolated from animals, food, and humans in Denmark. The purpose of this study was to develop a modeling approach applying a combination of serovars, MLVA types, and antibiotic resistance profiles for the Salmonella source attribution, and assess the utility of the results for the food safety decisionmakers. Full and simplified MLVA schemes from surveillance data were tested, and model fit and consistency of results were assessed using statistical measures. We conclude that loci schemes STTR5/STTR10/STTR3 for S. Typhimurium and SE9/SE5/SE2/SE1/SE3 for S. Enteritidis can be used in microbial subtyping-based source attribution models. Based on the results, we discuss that an adjustment of the discriminatory level of the subtyping method applied often will be required to fit the purpose of the study and the available data. The issues discussed are also considered highly relevant when applying, e.g., extended multi-locus sequence typing or next-generation sequencing techniques. © 2015 Society for Risk Analysis.

  17. DTWscore: differential expression and cell clustering analysis for time-series single-cell RNA-seq data.

    PubMed

    Wang, Zhuo; Jin, Shuilin; Liu, Guiyou; Zhang, Xiurui; Wang, Nan; Wu, Deliang; Hu, Yang; Zhang, Chiping; Jiang, Qinghua; Xu, Li; Wang, Yadong

    2017-05-23

    The development of single-cell RNA sequencing has enabled profound discoveries in biology, ranging from the dissection of the composition of complex tissues to the identification of novel cell types and dynamics in some specialized cellular environments. However, the large-scale generation of single-cell RNA-seq (scRNA-seq) data collected at multiple time points remains a challenge to the effective measurement of gene expression patterns in transcriptome analysis. We present an algorithm based on the Dynamic Time Warping score (DTWscore) combined with time-series data, which enables the detection of gene expression changes across scRNA-seq samples and the recovery of potential cell types from complex mixtures of multiple cell types. The DTWscore successfully classifies cells of different types using the most highly variable genes from time-series scRNA-seq data. The study was confined to methods that are implemented and available within the R framework. Sample datasets and R packages are available at https://github.com/xiaoxiaoxier/DTWscore .
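
    For readers unfamiliar with dynamic time warping, the sketch below shows the textbook DTW distance between two numeric time series. It illustrates only the underlying alignment idea and is not the DTWscore package or its scoring scheme; it is written in Python rather than R for consistency with other examples in this listing, and the function name and toy expression values are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric series."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance a only, advance b only, advance both.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical expression profiles of one gene over time in two conditions.
profile_a = [1.0, 2.0, 3.0]
profile_b = [1.0, 2.0, 2.0, 3.0]  # same shape, one time point stretched
stretched = dtw_distance(profile_a, profile_b)
shifted = dtw_distance([0.0, 0.0], [1.0, 1.0])
```

    Because warping absorbs differences in timing, the stretched profile is at distance zero from the original, while a profile shifted in level is not; this is the property that makes DTW attractive for comparing time courses sampled at uneven rates.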

  18. Complete Deletion of the Fucose Operon in Haemophilus influenzae Is Associated with a Cluster in Multilocus Sequence Analysis-Based Phylogenetic Group II Related to Haemophilus haemolyticus: Implications for Identification and Typing

    PubMed Central

    de Gier, Camilla; Kirkham, Lea-Ann S.

    2015-01-01

    Nonhemolytic variants of Haemophilus haemolyticus are difficult to differentiate from Haemophilus influenzae despite a wide difference in pathogenic potential. A previous investigation characterized a challenging set of 60 clinical strains using multiple PCRs for marker genes and described strains that could not be unequivocally identified as either species. We have analyzed the same set of strains by multilocus sequence analysis (MLSA) and near-full-length 16S rRNA gene sequencing. MLSA unambiguously allocated all study strains to either of the two species, while identification by 16S rRNA sequence was inconclusive for three strains. Notably, the two methods yielded conflicting identifications for two strains. Most of the “fuzzy species” strains were identified as H. influenzae that had undergone complete deletion of the fucose operon. Such strains, which are untypeable by the H. influenzae multilocus sequence type (MLST) scheme, have sporadically been reported and predominantly belong to a single branch of H. influenzae MLSA phylogenetic group II. We also found evidence of interspecies recombination between H. influenzae and H. haemolyticus within the 16S rRNA genes. Establishing an accurate method for rapid and inexpensive identification of H. influenzae is important for disease surveillance and treatment. PMID:26378279

  19. Molecular Epidemiology of Carbapenem-Resistant Acinetobacter baumannii Isolates in the Gulf Cooperation Council States: Dominance of OXA-23-Type Producers

    PubMed Central

    Sartor, Anna L.; Sidjabat, Hanna E.; Balkhy, Hanan H.; Walsh, Timothy R.; Al Johani, Sameera M.; AlJindan, Reem Y.; Alfaresi, Mubarak; Ibrahim, Emad; Al-Jardani, Amina; Al Salman, Jameela; Dashti, Ali A.; Johani, Khalid; Paterson, David L.

    2015-01-01

    The molecular epidemiology and mechanisms of resistance of carbapenem-resistant Acinetobacter baumannii (CRAB) were determined in hospitals in the states of the Cooperation Council for the Arab States of the Gulf (Gulf Cooperation Council [GCC]), namely, Saudi Arabia, United Arab Emirates, Oman, Qatar, Bahrain, and Kuwait. Isolates were subjected to PCR-based detection of antibiotic resistance genes and repetitive sequence-based PCR (rep-PCR) assessments of clonality. Selected isolates were subjected to multilocus sequence typing (MLST). We investigated 117 isolates resistant to carbapenem antibiotics (either imipenem or meropenem). All isolates were positive for OXA-51. The most common carbapenemases were the OXA-23-type, found in 107 isolates, followed by OXA-40-type (OXA-24-type), found in 5 isolates; 3 isolates carried the ISAba1 element upstream of blaOXA-51-type. No OXA-58-type, NDM-type, VIM-type, or IMP-type producers were detected. Multiple clones were detected with 16 clusters of clonally related CRAB. Some clusters involved hospitals in different states. MLST analysis of 15 representative isolates from different clusters identified seven different sequence types (ST195, ST208, ST229, ST436, ST450, ST452, and ST499), as well as three novel STs. The vast majority (84%) of the isolates in this study were associated with health care exposure. Awareness of multidrug-resistant organisms in GCC states has important implications for optimizing infection control practices; establishing antimicrobial stewardship programs within hospital, community, and agricultural settings; and emphasizing the need for establishing regional active surveillance systems. This will help to control the spread of CRAB in the Middle East and in hospitals accommodating transferred patients from this region. PMID:25568439

  20. Molecular epidemiology of carbapenem-resistant Acinetobacter baumannii isolates in the Gulf Cooperation Council States: dominance of OXA-23-type producers.

    PubMed

    Zowawi, Hosam M; Sartor, Anna L; Sidjabat, Hanna E; Balkhy, Hanan H; Walsh, Timothy R; Al Johani, Sameera M; AlJindan, Reem Y; Alfaresi, Mubarak; Ibrahim, Emad; Al-Jardani, Amina; Al Salman, Jameela; Dashti, Ali A; Johani, Khalid; Paterson, David L

    2015-03-01

    The molecular epidemiology and mechanisms of resistance of carbapenem-resistant Acinetobacter baumannii (CRAB) were determined in hospitals in the states of the Cooperation Council for the Arab States of the Gulf (Gulf Cooperation Council [GCC]), namely, Saudi Arabia, United Arab Emirates, Oman, Qatar, Bahrain, and Kuwait. Isolates were subjected to PCR-based detection of antibiotic resistance genes and repetitive sequence-based PCR (rep-PCR) assessments of clonality. Selected isolates were subjected to multilocus sequence typing (MLST). We investigated 117 isolates resistant to carbapenem antibiotics (either imipenem or meropenem). All isolates were positive for OXA-51. The most common carbapenemases were the OXA-23-type, found in 107 isolates, followed by OXA-40-type (OXA-24-type), found in 5 isolates; 3 isolates carried the ISAba1 element upstream of blaOXA-51-type. No OXA-58-type, NDM-type, VIM-type, or IMP-type producers were detected. Multiple clones were detected with 16 clusters of clonally related CRAB. Some clusters involved hospitals in different states. MLST analysis of 15 representative isolates from different clusters identified seven different sequence types (ST195, ST208, ST229, ST436, ST450, ST452, and ST499), as well as three novel STs. The vast majority (84%) of the isolates in this study were associated with health care exposure. Awareness of multidrug-resistant organisms in GCC states has important implications for optimizing infection control practices; establishing antimicrobial stewardship programs within hospital, community, and agricultural settings; and emphasizing the need for establishing regional active surveillance systems. This will help to control the spread of CRAB in the Middle East and in hospitals accommodating transferred patients from this region. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  1. Automated use of mutagenesis data in structure prediction.

    PubMed

    Nanda, Vikas; DeGrado, William F

    2005-05-15

    In the absence of experimental structural determination, numerous methods are available to indirectly predict or probe the structure of a target molecule. Genetic modification of a protein sequence is a powerful tool for identifying key residues involved in binding reactions or protein stability. Mutagenesis data is usually incorporated into the modeling process either through manual inspection of model compatibility with empirical data, or through the generation of geometric constraints linking sensitive residues to a binding interface. We present an approach derived from statistical studies of lattice models for introducing mutation information directly into the fitness score. The approach takes into account the phenotype of mutation (neutral or disruptive) and calculates the energy for a given structure over an ensemble of sequences. The structure prediction procedure searches for the optimal conformation where neutral sequences either have no impact or improve stability and disruptive sequences reduce stability relative to wild type. We examine three types of sequence ensembles: information from saturation mutagenesis, scanning mutagenesis, and homologous proteins. Incorporating multiple sequences into a statistical ensemble serves to energetically separate the native state and misfolded structures. As a result, the prediction of structure with a poor force field is sufficiently enhanced by mutational information to improve accuracy. Furthermore, by separating misfolded conformations from the target score, the ensemble energy serves to speed up conformational search algorithms such as Monte Carlo-based methods. Copyright 2005 Wiley-Liss, Inc.

  2. Single-cell genomic sequencing using Multiple Displacement Amplification.

    PubMed

    Lasken, Roger S

    2007-10-01

    Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).

  3. Costimulatory receptors in jawed vertebrates: Conserved CD28, odd CTLA4 and multiple BTLAs

    USGS Publications Warehouse

    Bernard, D.; Hansen, J.D.; Du, Pasquier L.; Lefranc, M.-P.; Benmansour, A.; Boudinot, P.

    2007-01-01

    CD28 family of costimulatory receptors is comprised of molecules with a single V-type extracellular Ig domain, a transmembrane and an intracytoplasmic region with signaling motifs. CD28 and cytotoxic T lymphocyte antigen-4 (CTLA4) homologs have been recently identified in rainbow trout. Other sequences similar to mammalian CD28 family members have now been identified using teleost, Xenopus and chicken databases. CD28- and CTLA4 homologs were found in all vertebrate classes whereas inducible costimulatory signal (ICOS) was restricted to tetrapods, and programmed cell death-1 (PD1) was limited to mammals and chicken. Multiple B and T Lymphocyte Attenuator (BTLA) sequences were found in teleosts, but not in Xenopus or in avian genomes. The intron/exon structure of btlas was different from that of cd28 and other members of the family. The Ig domain encoded in all the btla genes has features of the C-type structure, which suggests that BTLA does not belong to the CD28 family. The genomic localization of these genes in vertebrate genomes supports the split between the BTLA and CD28 families. © 2006 Elsevier Ltd. All rights reserved.

  4. Bellerophon: A program to detect chimeric sequences in multiple sequence alignments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip

    2003-12-23

    Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.

  5. Patterns of Viral DNA Integration in Cells Transformed by Wild Type or DNA-Binding Protein Mutants of Adenovirus Type 5 and Effect of Chemical Carcinogens on Integration

    PubMed Central

    Dorsch-Häsler, Karoline; Fisher, Paul B.; Weinstein, I. Bernard; Ginsberg, Harold S.

    1980-01-01

    The integration pattern of viral DNA was studied in a number of cell lines transformed by wild-type adenovirus type 5 (Ad5 WT) and two mutants of the DNA-binding protein gene, H5ts125 and H5ts107. The effect of chemical carcinogens on the integration of viral DNA was also investigated. Liquid hybridization (C0t) analyses showed that rat embryo cells transformed by Ad5 WT usually contained only the left-hand end of the viral genome, whereas cell lines transformed by H5ts125 or H5ts107 at either the semipermissive (36°C) or nonpermissive (39.5°C) temperature often contained one to five copies of all or most of the entire adenovirus genome. The arrangement of the integrated adenovirus DNA sequences was determined by cleavage of transformed cell DNA with restriction endonucleases XbaI, EcoRI, or HindIII followed by transfer of separated fragments to nitrocellulose paper and hybridization according to the technique of E. M. Southern (J. Mol. Biol. 98: 503-517, 1975). It was found that the adenovirus genome is integrated as a linear sequence covalently linked to host cell DNA; that the viral DNA is integrated into different host DNA sequences in each cell line studied; that in cell lines that contain multiple copies of the Ad5 genome the viral DNA sequences can be integrated in a single set of host cell DNA sequences and not as concatemers; and that chemical carcinogens do not alter the extent or pattern of viral DNA integration. PMID:6246266

  6. Dual signal amplification for highly sensitive electrochemical detection of uropathogens via enzyme-based catalytic target recycling.

    PubMed

    Su, Jiao; Zhang, Haijie; Jiang, Bingying; Zheng, Huzhi; Chai, Yaqin; Yuan, Ruo; Xiang, Yun

    2011-11-15

    We report an ultrasensitive electrochemical approach for the detection of uropathogen sequence-specific DNA targets. The sensing strategy involves a dual signal amplification process, which combines the signal enhancement by the enzymatic target recycling technique with the sensitivity improvement by the quantum dot (QD) layer-by-layer (LBL) assembled labels. The enzyme-based catalytic target DNA recycling process results in each target DNA sequence being used multiple times and leads to direct amplification of the analytical signal. Moreover, the LBL assembled QD labels can further enhance the sensitivity of the sensing system. The coupling of these two effective signal amplification strategies thus leads to low femtomolar (5 fM) detection of the target DNA sequences. The proposed strategy also shows excellent discrimination between the target DNA and single-base mismatch sequences. The advantageous intrinsic sequence-independent property of exonuclease III over other sequence-dependent enzymes makes our new dual signal amplification system a general sensing platform for monitoring ultralow levels of various types of target DNA sequences. Copyright © 2011 Elsevier B.V. All rights reserved.

  7. Targeted therapy according to next generation sequencing-based panel sequencing.

    PubMed

    Saito, Motonobu; Momma, Tomoyuki; Kono, Koji

    2018-04-17

    Targeted therapy against actionable gene mutations shows a significantly higher response rate as well as longer survival compared to conventional chemotherapy, and has become a standard therapy for many cancers. Recent progress in next-generation sequencing (NGS) has enabled the identification of a huge number of genetic aberrations. Based on sequencing results, patients are recommended to undergo targeted therapy or immunotherapy. In cases where there are no approved drugs available for the genetic mutations detected in a patient, it is recommended to facilitate registration for clinical trials. For that purpose, NGS-based sequencing panels that can simultaneously target multiple genes in a single investigation have been used in daily clinical practice. To date, various types of sequencing panels have been developed to investigate genetic aberrations among tumor somatic genome variants (gain-of-function or loss-of-function mutations, high-level copy number alterations, and gene fusions) through comprehensive bioinformatics. Because sequencing panels are efficient and cost-effective, they are quickly being adopted outside the lab, in hospitals and clinics, in order to identify personalized targeted therapy for individual cancer patients.

  8. Carriage of Cronobacter sakazakii in the very preterm infant gut.

    PubMed

    Chandrasekaran, Sukantha; Burnham, Carey-Ann D; Warner, Barbara B; Tarr, Phillip I; Wylie, Todd N

    2018-01-31

    Cronobacter sakazakii causes severe neonatal infections, but we know little about gut carriage of this pathogen in very low birthweight infants. We sequenced 16S rRNA genes from 2,304 stools from 121 children at St. Louis Children's Hospital whose birthweight was ≤1,500 grams, attempted to isolate C. sakazakii from 157 of these stools, genome sequenced the recovered isolates, and sought correlations between indices of Cronobacter excretion, host characteristics and unit formula use. Of these 2,304 stools, 1,271 (55.2%) contained Cronobacter rRNA gene sequences. The median (interquartile range) per-subject percent of specimens with at least one Cronobacter sequence and the median per-subject read density were 57.1 (25.5-87.3) and 0.07 (0.01-0.67), respectively. There was no variation according to commercially prepared liquid versus powdered formula use in the NICU, or the day-of-life that specimens were produced. However, the proportion of specimens containing >4.0% of reads mapping to Cronobacter fell from 4.3% to 0.9% after powdered infant formula was discontinued (P<0.0001). We isolated sequence type (ST) 4 C. sakazakii from multiple specimens from two subjects; one also harbored ST233. The sequenced ST4 isolates from the two subjects had >99.9% sequence identity in the ~93% of best-match reference genome that they contained, and shared multiple virulence loci. Very low birthweight infants excrete putatively pathogenic Cronobacter. High-density Cronobacter sequence samples were more common during the use of powdered infant formula. Better understanding of the ecology of Cronobacter in infant guts will inform future prevention and control strategies. © The Author(s) 2018. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail: journals.permissions@oup.com.

  9. A ganglioneuroma of the sigmoid colon presenting as leading point of intussusception in a child: a case report.

    PubMed

    Soccorso, Giampiero; Puls, Florian; Richards, Cathy; Pringle, Howard; Nour, Shawqui

    2009-01-01

    We present a case of intestinal ganglioneuroma (GN) of the sigmoid colon in a 5-year-old girl, which caused intermittent colocolic intussusception. Ganglioneuromas are rare benign tumors of the autonomic nervous system composed of mature ganglion cells and satellite cells. Colonic GNs are uncommon. The unusual intramural proliferation of neural elements in this case resembled the diffuse intestinal ganglioneuromatosis, which is known to be associated with multiple endocrine neoplasia type 2B. However, the specific mutations of multiple endocrine neoplasia type 2B were not found by genetic sequencing. This is the first pediatric case described in the literature of a solitary polypoid GN presenting as a colocolic intussusception. We present a brief overview of intestinal ganglioneuromatous lesions and associated conditions.

  10. Isolation and characterization of multiple F-box genes linked to the S9- and S10-RNase in apple (Malus × domestica Borkh.).

    PubMed

    Okada, Kazuma; Moriya, Shigeki; Haji, Takashi; Abe, Kazuyuki

    2013-06-01

    Using 11 consensus primer pairs designed from S-linked F-box genes of apple and Japanese pear, 10 new F-box genes (MdFBX21 to 30) were isolated from the apple cultivar 'Spartan' (S9S10). MdFBX21 to 23 and MdFBX24 to 30 were completely linked to the S9-RNase and S10-RNase, respectively, and showed pollen-specific expression and S-haplotype-specific polymorphisms. Therefore, these 10 F-box genes are good candidates for the pollen determinant of self-incompatibility in apple. Phylogenetic analysis and comparison of deduced amino acid sequences of MdFBX21 to 30 with those of 25 S-linked F-box genes previously isolated from apple showed that a deduced amino acid identity of greater than 88.0% can be used as the tentative criterion to classify F-box genes into one type. Using this criterion, 31 of 35 F-box genes of apple were classified into 11 types (SFBB1-11). All types included F-box genes derived from S3- and S9-haplotypes, and seven types included F-box genes derived from S3-, S9-, and S10-haplotypes. Moreover, comparison of nucleotide sequences of S-RNases and multiple F-box genes among S3-, S9-, and S10-haplotypes suggested that F-box genes within each type showed high nucleotide identity regardless of the identity of the S-RNase. The large number of F-box genes as candidates for the pollen determinant and the high degree of conservation within each type are consistent with the collaborative non-self-recognition model reported for Petunia. These findings support the existence of a collaborative non-self-recognition system in apple as well.
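    The identity criterion described in this abstract (greater than 88.0% deduced amino acid identity places two genes in one type) can be illustrated with a small sketch. It assumes the sequences are already aligned to equal length, and it groups them by single linkage, which is an assumption made here for illustration and not necessarily how the authors assigned types. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def group_by_identity(seqs, threshold=88.0):
    """Single-linkage grouping: join names whose identity exceeds the threshold."""
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy sequences, not real F-box proteins.
toy = {"g1": "MKTLLVAGSA", "g2": "MKTLLVAGSC", "g3": "QQQQQQQQQQ"}
groups = group_by_identity(toy)
```

    With these toy inputs, g1 and g2 share 9 of 10 residues and fall into one group, while g3 stays alone. Single linkage can chain distant sequences together through intermediates, so a stricter rule (for example complete linkage) may be preferable for real data.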

  11. The production of Multiple Small Peptaibol Families by Single 14-Module Peptide Synthetases in Trichoderma/Hypocrea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Degenkolb, Thomas; Aghchehb, Razieh Karimi; Dieckmann, Ralf

    2012-03-01

    The most common peptaibiotic structures are 11-residue peptaibols found widely distributed in the genus Trichoderma/Hypocrea. Frequently associated are 14-residue peptaibols sharing partial sequence identity. Genome sequencing projects of 3 Trichoderma strains of the major clades reveal the presence of up to 3 types of nonribosomal peptide synthetases with 7, 14, or 18-20 amino acid adding modules. We here provide evidence that the 14-module NRPS type found in T. virens, T. reesei (teleomorph Hypocrea jecorina) and T. atroviride produces both 11- and 14-residue peptaibols based on the disruption of the respective NRPS gene of T. reesei, and bioinformatic analysis of their amino acid activating domains and modules. The structures of these peptides may be predicted from the gene structures and have been confirmed by analysis of families of 11- and 14-residue peptaibols from the strain 618, termed hypojecorins A (23 sequences determined, 4 new) and B (3 new sequences), and the recently established trichovirins A from T. virens. The distribution of 11- and 14-residue products is strain-specific and depends on growth conditions as well. Possible mechanisms of module skipping are discussed.

  12. Conservation of a pH-sensitive structure in the C-terminal region of spider silk extends across the entire silk gene family.

    PubMed

    Strickland, Michelle; Tudorica, Victor; Řezáč, Milan; Thomas, Neil R; Goodacre, Sara L

    2018-06-01

    Spiders produce multiple silks with different physical properties that allow them to occupy a diverse range of ecological niches, including the underwater environment. Despite this functional diversity, past molecular analyses show a high degree of amino acid sequence similarity between C-terminal regions of silk genes that appear to be independent of the physical properties of the resulting silks; instead, this domain is crucial to the formation of silk fibers. Here, we present an analysis of the C-terminal domain of all known types of spider silk and include silk sequences from the spider Argyroneta aquatica, which spins the majority of its silk underwater. Our work indicates that spiders have retained a highly conserved mechanism of silk assembly, despite the extraordinary diversification of species, silk types and applications of silk over 350 million years. Sequence analysis of the silk C-terminal domain across the entire gene family shows the conservation of two uncommon amino acids that are implicated in the formation of a salt bridge, a functional bond essential to protein assembly. This conservation extends to the novel sequences isolated from A. aquatica. This finding is relevant to research regarding the artificial synthesis of spider silk, suggesting that synthesis of all silk types will be possible using a single process.

  13. Molecular Typing of Lung Adenocarcinoma on Cytological Samples Using a Multigene Next Generation Sequencing Panel

    PubMed Central

    Fassan, Matteo; Rachiglio, Anna Maria; Cappellesso, Rocco; Antonello, Davide; Amato, Eliana; Mafficini, Andrea; Lambiase, Matilde; Esposito, Claudia; Bria, Emilio; Simonato, Francesca; Scardoni, Maria; Turri, Giona; Chilosi, Marco; Tortora, Giampaolo; Fassina, Ambrogio; Normanno, Nicola

    2013-01-01

    Identification of driver mutations in lung adenocarcinoma has led to development of targeted agents that are already approved for clinical use or are in clinical trials. Therefore, the number of biomarkers that will need to be assessed is expected to increase rapidly. This calls for the implementation of methods probing the mutational status of multiple genes for inoperable cases, for which limited cytological or bioptic material is available. Cytology specimens from 38 lung adenocarcinomas were subjected to the simultaneous assessment of 504 mutational hotspots of 22 lung cancer-associated genes using 10 nanograms of DNA and Ion Torrent PGM next-generation sequencing. Thirty-six cases were successfully sequenced (95%). In 24/36 cases (67%) at least one mutated gene was observed, including EGFR, KRAS, PIK3CA, BRAF, TP53, PTEN, MET, SMAD4, FGFR3, STK11, MAP2K1. EGFR and KRAS mutations, respectively found in 6/36 (16%) and 10/36 (28%) cases, were mutually exclusive. Nine samples (25%) showed concurrent alterations in different genes. The next-generation sequencing test used is superior to current standard methodologies, as it interrogates multiple genes and requires limited amounts of DNA. Its applicability to routine cytology samples might allow a significant increase in the fraction of lung cancer patients eligible for personalized therapy. PMID:24236184

  14. Antimicrobial Peptides from Plants

    PubMed Central

    Tam, James P.; Wang, Shujing; Wong, Ka H.; Tan, Wei Liang

    2015-01-01

    Plant antimicrobial peptides (AMPs) have evolved differently from AMPs from other life forms. They are generally rich in cysteine residues which form multiple disulfides. In turn, the disulfides cross-brace plant AMPs as cystine-rich peptides, conferring on them extraordinarily high chemical, thermal and proteolytic stability. The cystine-rich peptides of plant AMPs, commonly known as cysteine-rich peptides (CRPs), are classified into families based on their sequence similarity, cysteine motifs that determine their distinctive disulfide bond patterns and tertiary structure fold. Cystine-rich plant AMP families include thionins, defensins, hevein-like peptides, knottin-type peptides (linear and cyclic), lipid transfer proteins, α-hairpinin and snakin families. In addition, there are AMPs which are rich in other amino acids. Plant AMPs organize into specific families with conserved structural folds, which allows sequence variation of non-Cys residues encased in the same scaffold within a particular family and thereby enables multiple functions. Furthermore, the ability of plant AMPs to tolerate hypervariable sequences using a conserved scaffold provides diversity to recognize different targets by varying the sequence of the non-cysteine residues. These properties bode well for developing plant AMPs as potential therapeutics and for protection of crops through transgenic methods. This review provides an overview of the major families of plant AMPs, including their structures, functions, and putative mechanisms. PMID:26580629

  15. The Reference Genome of the Halophytic Plant Eutrema salsugineum

    PubMed Central

    Yang, Ruolin; Jarvis, David E.; Chen, Hao; Beilstein, Mark A.; Grimwood, Jane; Jenkins, Jerry; Shu, ShengQiang; Prochnik, Simon; Xin, Mingming; Ma, Chuang; Schmutz, Jeremy; Wing, Rod A.; Mitchell-Olds, Thomas; Schumaker, Karen S.; Wang, Xiangfeng

    2013-01-01

    Halophytes are plants that can naturally tolerate high concentrations of salt in the soil, and their tolerance to salt stress may occur through various evolutionary and molecular mechanisms. Eutrema salsugineum is a halophytic species in the Brassicaceae that can naturally tolerate multiple types of abiotic stresses that typically limit crop productivity, including extreme salinity and cold. It has been widely used as a laboratorial model for stress biology research in plants. Here, we present the reference genome sequence (241 Mb) of E. salsugineum at 8× coverage sequenced using the traditional Sanger sequencing-based approach with comparison to its close relative Arabidopsis thaliana. The E. salsugineum genome contains 26,531 protein-coding genes and 51.4% of its genome is composed of repetitive sequences that mostly reside in pericentromeric regions. Comparative analyses of the genome structures, protein-coding genes, microRNAs, stress-related pathways, and estimated translation efficiency of proteins between E. salsugineum and A. thaliana suggest that halophyte adaptation to environmental stresses may occur via a global network adjustment of multiple regulatory mechanisms. The E. salsugineum genome provides a resource to identify naturally occurring genetic alterations contributing to the adaptation of halophytic plants to salinity and that might be bioengineered in related crop species. PMID:23518688

  16. SNPGenie: estimating evolutionary parameters to detect natural selection using pooled next-generation sequencing data.

    PubMed

    Nelson, Chase W; Moncla, Louise H; Hughes, Austin L

    2015-11-15

    New applications of next-generation sequencing technologies use pools of DNA from multiple individuals to estimate population genetic parameters. However, no publicly available tools exist to analyse single-nucleotide polymorphism (SNP) calling results directly for evolutionary parameters important in detecting natural selection, including nucleotide diversity and gene diversity. We have developed SNPGenie to fill this gap. The user submits a FASTA reference sequence(s), a Gene Transfer Format (.GTF) file with CDS information and a SNP report(s) in an increasing selection of formats. The program estimates nucleotide diversity, distance from the reference and gene diversity. Sites are flagged for multiple overlapping reading frames, and are categorized by polymorphism type: nonsynonymous, synonymous, or ambiguous. The results allow single nucleotide, single codon, sliding window, whole gene and whole genome/population analyses that aid in the detection of positive and purifying natural selection in the source population. SNPGenie version 1.2 is a Perl program with no additional dependencies. It is free, open-source, and available for download at https://github.com/hugheslab/snpgenie. Contact: nelsoncw@email.sc.edu or austin@biol.sc.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
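    Nucleotide diversity, one of the parameters this abstract mentions, is commonly defined as the mean number of pairwise differences per site. The sketch below applies that textbook definition to per-site nucleotide counts from pooled reads. It is written in Python for illustration and is not SNPGenie's code (SNPGenie is a Perl program); the function names and the example counts are hypothetical.

```python
def site_diversity(counts):
    """Fraction of read pairs that differ at one site, given nucleotide counts."""
    n = sum(counts.values())
    if n < 2:
        return 0.0  # fewer than two reads: no pair to compare
    total_pairs = n * (n - 1) / 2
    # Pairs drawn from the same nucleotide do not differ.
    same_pairs = sum(c * (c - 1) / 2 for c in counts.values())
    return (total_pairs - same_pairs) / total_pairs


def nucleotide_diversity(sites):
    """Mean per-site pairwise difference over all sites, variable or not."""
    if not sites:
        return 0.0
    return sum(site_diversity(s) for s in sites) / len(sites)


# Hypothetical pooled counts at two sites: one polymorphic, one monomorphic.
example_sites = [{"A": 3, "G": 1}, {"C": 4}]
pi = nucleotide_diversity(example_sites)
```

    At the first site, 3 of the 6 possible read pairs differ (0.5); the second site contributes 0, so the mean over both sites is 0.25. Averaging over all sites, including invariant ones, is what makes the value a per-site quantity.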

  17. Molecular identification and typing of Burkholderia pseudomallei and Burkholderia mallei: when is enough enough?

    PubMed

    Antonov, Valery A; Tkachenko, Galina A; Altukhova, Viktoriya V; Savchenko, Sergey S; Zinchenko, Olga V; Viktorov, Dmitry V; Zamaraev, Valery S; Ilyukhin, Vladimir I; Alekseev, Vladimir V

    2008-12-01

    Burkholderia mallei and B. pseudomallei are highly pathogenic microorganisms for both humans and animals. Moreover, they are regarded as potential agents of bioterrorism. Thus, rapid and unequivocal detection and identification of these dangerous pathogens is critical. In the present study, we describe the use of an optimized protocol for the early diagnosis of experimental glanders and melioidosis and for the rapid differentiation and typing of Burkholderia strains. This experience with PCR-based identification methods indicates that single PCR targets (23S and 16S rRNA genes, 16S-23S intergenic region, fliC and type III secretion gene cluster) should be used with caution for identification of B. mallei and B. pseudomallei, and need to be used alongside molecular methods such as gene sequencing. Several molecular typing procedures have been used to identify genetically related B. pseudomallei and B. mallei isolates, including ribotyping, pulsed-field gel electrophoresis and multilocus sequence typing. However, these methods are time consuming and technically challenging for many laboratories. RAPD, variable amplicon typing scheme, Rep-PCR, BOX-PCR and multiple-locus variable-number tandem repeat analysis have been recommended by us for the rapid differentiation of B. mallei and B. pseudomallei strains.

  18. NetTurnP – Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features

    PubMed Central

    Petersen, Bent; Lundegaard, Claus; Petersen, Thomas Nordahl

    2010-01-01

    β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC = 0.50, Qtotal = 82.1%, sensitivity = 75.6%, PPV = 68.8% and AUC = 0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17 – 0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. Conclusion: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences. PMID:21152409
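    The two-class measures quoted in this abstract (MCC, Qtotal, sensitivity, PPV) can all be computed from a 2×2 confusion matrix. The sketch below uses the standard definitions and is not NetTurnP's evaluation code. Qtotal is taken here to mean overall accuracy, which is an assumption; the function name and the example counts are hypothetical and are not from the paper.

```python
import math


def two_class_metrics(tp, tn, fp, fn):
    """Standard two-class metrics from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    total = tp + tn + fp + fn
    return {
        "mcc": mcc,
        "q_total": (tp + tn) / total if total else 0.0,       # overall accuracy
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,  # recall on turns
        "ppv": tp / (tp + fp) if (tp + fp) else 0.0,          # precision
    }


# Hypothetical counts for illustration only.
metrics = two_class_metrics(tp=6, tn=3, fp=1, fn=2)
```

    Unlike accuracy, MCC uses all four cells of the matrix, so it remains informative when one class is much rarer than the other. That is the situation here, since β-turns account for roughly a quarter of residues.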

  19. NetTurnP--neural network prediction of beta-turns by use of evolutionary information and predicted protein sequence features.

    PubMed

    Petersen, Bent; Lundegaard, Claus; Petersen, Thomas Nordahl

    2010-11-30

    β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type specific β-turn predictions, only type I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.

  20. Multiplicity among Solar-type Stars

    NASA Astrophysics Data System (ADS)

    Fuhrmann, K.; Chini, R.; Kaderhandt, L.; Chen, Z.

    2017-02-01

    We present a multiplicity census for a volume-complete all-sky survey of 422 stars with distances less than 25 pc and primary main-sequence effective temperatures T_eff ≥ 5300 K. Very similar to previous results that have been presented for various subsets of this survey, we confirm the positive correlation of the stellar multiplicities with primary mass. We find for the F- and G-type Population I stars that 58% are non-single and 21% are in triple or higher level systems. For the old intermediate-disk and Population II stars (virtually all of G type and less massive) even two out of three sources prove to be non-single. These numbers being lower limits because of the continuous flow of new discoveries, the unbiased survey clearly demonstrates that the standard case for solar-type field stars is a hydrogen-burning source with at least one ordinary or degenerate stellar companion, and a surprisingly large number of stars are organized in multiple systems. A principal consequence is that orbital evolution, including the formation of blue straggler stars, is a potentially important issue on all spatial scales and timescales for a significant percentage of the stellar systems, in particular among Population II stars. We discuss a number of recent observations of known or suspected companions in the local survey, including a new detection of a double-lined Ba-Bb subsystem to the visual binary HR 8635.

  1. A novel approach to multiple sequence alignment using hadoop data grids.

    PubMed

    Sudha Sadasivam, G; Baktavatchalam, G

    2010-01-01

    Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computation intensive. In this paper we propose a time efficient approach to sequence alignment that also produces quality alignment. The dynamic nature of the algorithm coupled with data and computational parallelism of hadoop data grids improves the accuracy and speed of sequence alignment. The principle of block splitting in hadoop coupled with its scalability facilitates alignment of very large sequences.
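    The cost of dynamic programming mentioned in this abstract can be seen in the pairwise case. The sketch below computes a Needleman-Wunsch global alignment score for two sequences; it is a generic textbook illustration, not the Hadoop-based method the authors propose. The function name and the scoring parameters (match, mismatch, gap) are assumptions chosen for the example.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of a and b, keeping only two rows of the table."""
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a aligned against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


identical = nw_score("GATTACA", "GATTACA")
one_gap = nw_score("AC", "A")
```

    Every cell of the len(a) × len(b) table is filled once, so the time grows with the product of the sequence lengths. Extending exact dynamic programming to many sequences multiplies this cost further, which is why practical multiple alignment relies on heuristics or, as in this record, on parallel computation.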

  2. Non-invasive prenatal diagnosis of multiple endocrine neoplasia type 2A using COLD-PCR combined with HRM genotyping analysis from maternal serum.

    PubMed

    Macher, Hada C; Martinez-Broca, Maria A; Rubio-Calvo, Amalia; Leon-Garcia, Cristina; Conde-Sanchez, Manuel; Costa, Alzenira; Navarro, Elena; Guerrero, Juan M

    2012-01-01

    Multiple endocrine neoplasia type 2A (MEN2A) is a monogenic disorder with an autosomal dominant pattern of inheritance, characterized by a high risk of medullary thyroid carcinoma in all mutation carriers. Although this disorder is classified as a rare disease, affected patients have a low quality of life and require very expensive, continuous treatment. At present, MEN2A is diagnosed by gene sequencing after birth, with the aim of starting treatment early and reducing morbidity and mortality. We first evaluated the presence of MEN2A mutation (C634Y) in serum of 25 patients, previously diagnosed by sequencing in peripheral blood leucocytes, using HRM genotyping analysis. In a second step, we used a COLD-PCR approach followed by HRM genotyping analysis for non-invasive prenatal diagnosis of a pregnant woman carrying a fetus with a C634Y mutation. HRM analysis revealed differences in melting curve shapes that correlated with patients diagnosed for MEN2A by gene sequencing analysis with 100% accuracy. Moreover, the pregnant woman carrying the fetus with the C634Y mutation revealed a melting curve shape in agreement with the positive controls in the COLD-PCR study. The mutation was confirmed by sequencing of the COLD-PCR amplification product. In conclusion, we have established an HRM analysis in serum samples as a new primary diagnosis method suitable for the detection of C634Y mutations in MEN2A patients. In addition, we have applied the increased sensitivity of the COLD-PCR approach combined with HRM analysis to the non-invasive prenatal diagnosis of C634Y fetal mutations using serum from pregnant women.

  3. A Homologue of an Operon Required for DNA Transfer in Agrobacterium Is Required in Brucella abortus for Virulence and Intracellular Multiplication

    PubMed Central

    Sieira, Rodrigo; Comerci, Diego J.; Sánchez, Daniel O.; Ugalde, Rodolfo A.

    2000-01-01

    As part of a Brucella abortus 2308 genome project carried out in our laboratory, we identified, cloned, and sequenced a genomic DNA fragment containing a locus (virB) highly homologous to bacterial type IV secretion systems. The B. abortus virB locus is a collinear arrangement of 13 open reading frames (ORFs). Between virB1 and virB2 and downstream of ORF12, two degenerated, palindromic repeat sequences characteristic of Brucella intergenic regions were found. Gene reporter studies demonstrated that the B. abortus virB locus constitutes an operon transcribed from virB1 which is turned on during the stationary phase of growth. A B. abortus polar virB1 mutant failed to replicate in HeLa cells, indicating that the virB operon plays a critical role in intracellular multiplication. Mutants with polar and nonpolar mutations introduced in virB10 showed different behaviors in mice and in the HeLa cell infection assay, suggesting that virB10 per se is necessary for the correct function of this type IV secretion apparatus. Mouse infection assays demonstrated that the virB operon constitutes a major determinant of B. abortus virulence. It is suggested that putative effector molecules secreted by this type IV secretion system determine routing of B. abortus to an endoplasmic reticulum-related replication compartment. PMID:10940027

  4. Dissemination of IMP-6-producing Pseudomonas aeruginosa ST244 in multiple cities in China.

    PubMed

    Chen, Y; Sun, M; Wang, M; Lu, Y; Yan, Z

    2014-07-01

    Pseudomonas aeruginosa is an important opportunistic pathogen responsible for nosocomial infections and is currently reported to be a worldwide nosocomial menace. The aim of this study was to investigate the epidemiological traits and the distribution of metallo-β-lactamase (MBL)-producing P. aeruginosa clinical isolates in ten cities in China between January 2010 and May 2012. Antimicrobial susceptibility was determined by disc diffusion assay and the minimum inhibitory concentrations (MICs) of imipenem and meropenem were also determined by the Etest according to Clinical and Laboratory Standards Institute (CLSI) guidelines. In addition, polymerase chain reaction (PCR) and DNA sequencing were applied to detect bla MBL genes, and their epidemiological relationships were investigated by multilocus sequence typing (MLST). Of 368 P. aeruginosa isolates, MLST analysis identified 138 sequence types (STs), including 122 known and 16 novel STs, and the most frequently detected clone was ST244, followed by ST235. Besides, our study revealed that 25 isolates carried the bla IMP-6 gene and three isolates carried the bla VIM-2 gene, and a probe specific for both genes could be hybridised to an ~1,125-kb fragment in all isolates. Interestingly, all of the bla IMP-6-producing isolates shared an identical ST, ST244, and exhibited a higher level of resistance to several antibiotics. Overall, these observations suggest that P. aeruginosa ST244 carrying the chromosomally located bla IMP-6 gene is widely disseminated in multiple cities in China.

  5. Cellulose in Cyanobacteria. Origin of Vascular Plant Cellulose Synthase?

    PubMed Central

    Nobles, David R.; Romanovicz, Dwight K.; Brown, R. Malcolm

    2001-01-01

    Although cellulose biosynthesis among the cyanobacteria has been suggested previously, we present the first conclusive evidence, to our knowledge, of the presence of cellulose in these organisms. Based on the results of x-ray diffraction, electron microscopy of microfibrils, and cellobiohydrolase I-gold labeling, we report the occurrence of cellulose biosynthesis in nine species representing three of the five sections of cyanobacteria. Sequence analysis of the genomes of four cyanobacteria revealed the presence of multiple amino acid sequences bearing the DDD35QXXRW motif conserved in all cellulose synthases. Pairwise alignments demonstrated that CesAs from plants were more similar to putative cellulose synthases from Anabaena sp. Pasteur Culture Collection 7120 and Nostoc punctiforme American Type Culture Collection 29133 than any other cellulose synthases in the database. Multiple alignments of putative cellulose synthases from Anabaena sp. Pasteur Culture Collection 7120 and N. punctiforme American Type Culture Collection 29133 with the cellulose synthases of other prokaryotes, Arabidopsis, Gossypium hirsutum, Populus alba × Populus tremula, corn (Zea mays), and Dictyostelium discoideum showed that cyanobacteria share an insertion between conserved regions U1 and U2 found previously only in eukaryotic sequences. Furthermore, phylogenetic analysis indicates that the cyanobacterial cellulose synthases share a common branch with CesAs of vascular plants in a manner similar to the relationship observed with cyanobacterial and chloroplast 16s rRNAs, implying endosymbiotic transfer of CesA from cyanobacteria to plants and an ancient origin for cellulose synthase in eukaryotes. PMID:11598227

  6. Spectral Sequences of Type Ia Supernovae. I. Connecting Normal and Subluminous SNe Ia and the Presence of Unburned Carbon

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Heringer, E.; Kerkwijk, M. H. van; Sim, S. A.

    2017-09-01

    Type Ia supernovae (SNe Ia) are generally agreed to arise from thermonuclear explosions of carbon–oxygen white dwarfs. The actual path to explosion, however, remains elusive, with numerous plausible parent systems and explosion mechanisms suggested. Observationally, SNe Ia have multiple subclasses, distinguished by their light curves and spectra. This raises the question of whether these indicate that multiple mechanisms occur in nature or that explosions have a large but continuous range of physical properties. We revisit the idea that normal and 91bg-like SNe can be understood as part of a spectral sequence in which changes in temperature dominate. Specifically, we find that a single ejecta structure is sufficient to provide reasonable fits of both the normal SN Ia SN 2011fe and the 91bg-like SN 2005bl, provided that the luminosity and thus temperature of the ejecta are adjusted appropriately. This suggests that the outer layers of the ejecta are similar, thus providing some support for a common explosion mechanism. Our spectral sequence also helps to shed light on the conditions under which carbon can be detected in premaximum SN Ia spectra: we find that emission from iron can “fill in” the carbon trough in cool SNe Ia. This may indicate that the outer layers of the ejecta of events in which carbon is detected are relatively metal-poor compared to events in which carbon is not detected.

  7. Flagellin Diversity in Clostridium botulinum Groups I and II: a New Strategy for Strain Identification

    PubMed Central

    Paul, Catherine J.; Twine, Susan M.; Tam, Kevin J.; Mullen, James A.; Kelly, John F.; Austin, John W.; Logan, Susan M.

    2007-01-01

    Strains of Clostridium botulinum are traditionally identified by botulinum neurotoxin type; however, identification of an additional target for typing would improve differentiation. Isolation of flagellar filaments and analysis by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) showed that C. botulinum produced multiple flagellin proteins. Nano-liquid chromatography-tandem mass spectrometry (nLC-MS/MS) analysis of in-gel tryptic digests identified peptides in all flagellin bands that matched two homologous tandem flagellin genes identified in the C. botulinum Hall A genome. Designated flaA1 and flaA2, these open reading frames encode the major structural flagellins of C. botulinum. Colony PCR and sequencing of flaA1/A2 variable regions classified 80 environmental and clinical strains into group I or group II and clustered isolates into 12 flagellar types. Flagellar type was distinct from neurotoxin type, and epidemiologically related isolates clustered together. Sequencing a larger PCR product, obtained during amplification of flaA1/A2 from type E strain Bennett identified a second flagellin gene, flaB. LC-MS analysis confirmed that flaB encoded a large type E-specific flagellin protein, and the predicted molecular mass for FlaB matched that observed by SDS-PAGE. In contrast, the molecular mass of FlaA was 2 to 12 kDa larger than the mass predicted by the flaA1/A2 sequence of a given strain, suggesting that FlaA is posttranslationally modified. While identification of FlaB, and the observation by SDS-PAGE of different masses of the FlaA proteins, showed the flagellin proteins of C. botulinum to be diverse, the presence of the flaA1/A2 gene in all strains examined facilitates single locus sequence typing of C. botulinum using the flagellin variable region. PMID:17351097

  8. Current sequencing technology makes microhaplotypes a powerful new type of genetic marker for forensics.

    PubMed

    Kidd, Kenneth K; Pakstis, Andrew J; Speed, William C; Lagacé, Robert; Chang, Joseph; Wootton, Sharon; Haigh, Eva; Kidd, Judith R

    2014-09-01

    SNPs that are molecularly very close (<10 kb) will generally have extremely low recombination rates, much less than 10^-4. Multiple haplotypes will often exist because of the history of the origins of the variants at the different sites, rare recombinants, and the vagaries of random genetic drift and/or selection. Such multiallelic haplotype loci are potentially important in forensic work for individual identification, for defining ancestry, and for identifying familial relationships. The new DNA sequencing capabilities currently available make possible continuous runs of a few hundred base pairs so that we can now determine the allelic combination of multiple SNPs on each chromosome of an individual, i.e., the phase, for multiple SNPs within a small segment of DNA. Therefore, we have begun to identify regions, encompassing two to four SNPs with an extent of <200 bp that define multiallelic haplotype loci. We have identified candidate regions and have collected pilot data on many candidate microhaplotype loci. Here we present 31 microhaplotype loci that have at least three alleles, have high heterozygosity, are globally informative, and are statistically independent at the population level. This study of microhaplotype loci (microhaps) provides proof of principle that such markers exist and validates their usefulness for ancestry inference, lineage-clan-family inference, and individual identification. The true value of microhaplotypes will come with sequencing methods that can establish alleles unambiguously, including disentangling of mixtures, because a single sequencing run on a single strand of DNA will encompass all of the SNPs. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
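
    The abstract above selects loci with "at least three alleles" and "high heterozygosity". As a minimal sketch of how such a screen could be computed from haplotype frequencies, the snippet below uses the standard expected-heterozygosity formula H = 1 - sum(p_i^2) and the effective number of alleles 1 / sum(p_i^2). The function names and the example frequencies are illustrative assumptions, not values or code from the study.

```python
def _check_frequencies(freqs):
    """Validate that allele (haplotype) frequencies are non-negative and sum to 1."""
    if any(p < 0 for p in freqs):
        raise ValueError("frequencies must be non-negative")
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("frequencies must sum to 1")


def expected_heterozygosity(freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) under random mating."""
    _check_frequencies(freqs)
    return 1.0 - sum(p * p for p in freqs)


def effective_alleles(freqs):
    """Effective number of alleles A_e = 1 / sum(p_i^2)."""
    _check_frequencies(freqs)
    return 1.0 / sum(p * p for p in freqs)


if __name__ == "__main__":
    # Hypothetical three-allele microhaplotype locus (made-up frequencies).
    locus = [0.5, 0.3, 0.2]
    print(expected_heterozygosity(locus), effective_alleles(locus))
```

    A locus with more evenly distributed haplotypes scores higher on both measures, which is why multiallelic loci can be more informative than a single biallelic SNP.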

  9. Emergence of a deviating genotype VI pigeon paramyxovirus type-1 isolated from India.

    PubMed

    Ganar, Ketan; Das, Moushumee; Raut, Ashwin Ashok; Mishra, Anamika; Kumar, Sachin

    2017-07-01

    Pigeon paramyxovirus type 1 (PPMV-1) is an antigenic variant of avian paramyxovirus type 1 (APMV-1), which infects pigeons. The virus causes high morbidity and mortality, creating an alarming state for the poultry industry. The present work describes the molecular and pathogenic characterization of a PPMV-1 strain isolated from pigeon in Bhopal, India. Complete genome sequence analysis revealed a genome of 15,192 nucleotides encoding six genes organized in the order 3'-N-P-M-F-HN-L-5'. The fusion gene sequence analysis showed the presence of multiple basic amino acids (R-R-Q-K-R-F, residues 112 to 117) at the cleavage site, corresponding to pathogenic strains. The mean death time and intracerebral pathogenicity index values indicated a mesogenic nature for the PPMV-1 isolate. On phylogenetic analysis, the strain clustered with genotype VI viruses, including isolates from pigeon and dove. The Bhopal strain showed significant intra and inter-genotype evolutionary distance, suggesting the emergence of a new sub-genotype, VIj.

  10. Abundant and Diverse Clustered Regularly Interspaced Short Palindromic Repeat Spacers in Clostridium difficile Strains and Prophages Target Multiple Phage Types within This Pathogen

    PubMed Central

    Hargreaves, Katherine R.; Flores, Cesar O.; Lawley, Trevor D.

    2014-01-01

    Clostridium difficile is an important human-pathogenic bacterium causing antibiotic-associated nosocomial infections worldwide. Mobile genetic elements and bacteriophages have helped shape C. difficile genome evolution. In many bacteria, phage infection may be controlled by a form of bacterial immunity called the clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas) system. This uses acquired short nucleotide sequences (spacers) to target homologous sequences (protospacers) in phage genomes. C. difficile carries multiple CRISPR arrays, and in this paper we examine the relationships between the host- and phage-carried elements of the system. We detected multiple matches between spacers and regions in 31 C. difficile phage and prophage genomes. A subset of the spacers was located in prophage-carried CRISPR arrays. The CRISPR spacer profiles generated suggest that related phages would have similar host ranges. Furthermore, we show that C. difficile strains of the same ribotype could either have similar or divergent CRISPR contents. Both synonymous and nonsynonymous mutations in the protospacer sequences were identified, as well as differences in the protospacer adjacent motif (PAM), which could explain how phages escape this system. This paper illustrates how the distribution and diversity of CRISPR spacers in C. difficile, and its prophages, could modulate phage predation for this pathogen and impact upon its evolution and pathogenicity. PMID:25161187
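
    The spacer-to-protospacer matching described above can be illustrated with a small sketch. It scans both strands of a genome for windows that match a spacer within a mismatch budget. This is an assumption-laden toy, not the authors' pipeline: the function names are hypothetical, the search is a naive sliding window, and the PAM that the abstract says also matters is not modelled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_protospacers(spacer, genome, max_mismatches=0):
    """Return (position, strand, mismatches) for each window of `genome`
    that matches `spacer` or its reverse complement within the budget."""
    if not spacer:
        raise ValueError("spacer must be non-empty")
    genome = genome.upper()
    hits = []
    for strand, query in (("+", spacer.upper()), ("-", reverse_complement(spacer))):
        for i in range(len(genome) - len(query) + 1):
            window = genome[i:i + len(query)]
            mismatches = sum(a != b for a, b in zip(query, window))
            if mismatches <= max_mismatches:
                hits.append((i, strand, mismatches))
    return hits


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    print(find_protospacers("ACGTAC", "TTACGTACTT"))
```

    Raising `max_mismatches` makes it possible to list near-matches, which is one simple way to look for the protospacer mutations that may let phages escape targeting.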

  11. Human immunodeficiency viruses appear compartmentalized to the female genital tract in cross-sectional analyses but genital lineages do not persist over time.

    PubMed

    Bull, Marta E; Heath, Laura M; McKernan-Mullin, Jennifer L; Kraft, Kelli M; Acevedo, Luis; Hitti, Jane E; Cohn, Susan E; Tapia, Kenneth A; Holte, Sarah E; Dragavon, Joan A; Coombs, Robert W; Mullins, James I; Frenkel, Lisa M

    2013-04-15

    Whether unique human immunodeficiency virus type 1 (HIV) genotypes occur in the genital tract is important for vaccine development and management of drug resistant viruses. Multiple cross-sectional studies suggest HIV is compartmentalized within the female genital tract. We hypothesize that bursts of HIV replication and/or proliferation of infected cells captured in cross-sectional analyses drive compartmentalization but over time genital-specific viral lineages do not form; rather viruses mix between genital tract and blood. Eight women with ongoing HIV replication were studied during a period of 1.5 to 4.5 years. Multiple viral sequences were derived by single-genome amplification of the HIV C2-V5 region of env from genital secretions and blood plasma. Maximum likelihood phylogenies were evaluated for compartmentalization using 4 statistical tests. In cross-sectional analyses compartmentalization of genital from blood viruses was detected in three of eight women by all tests; this was associated with tissue-specific clades containing multiple monotypic sequences. In longitudinal analysis, the tissue-specific clades did not persist to form viral lineages. Rather, across women, HIV lineages were comprised of both genital tract and blood sequences. The observation of genital-specific HIV clades only in cross-sectional analysis and an absence of genital-specific lineages in longitudinal analyses suggest a dynamic interchange of HIV variants between the female genital tract and blood.

  12. Comparative analysis of mitochondrial genomes between a wheat K-type cytoplasmic male sterility (CMS) line and its maintainer line.

    PubMed

    Liu, Huitao; Cui, Peng; Zhan, Kehui; Lin, Qiang; Zhuo, Guoyin; Guo, Xiaoli; Ding, Feng; Yang, Wenlong; Liu, Dongcheng; Hu, Songnian; Yu, Jun; Zhang, Aimin

    2011-03-29

    Plant mitochondria, semiautonomous organelles that function as manufacturers of cellular ATP, have their own genome that has a slow rate of evolution and rapid rearrangement. Cytoplasmic male sterility (CMS), a common phenotype in higher plants, is closely associated with rearrangements in mitochondrial DNA (mtDNA), and is widely used to produce F1 hybrid seeds in a variety of valuable crop species. Novel chimeric genes deduced from mtDNA rearrangements causing CMS have been identified in several plants, such as rice, sunflower, pepper, and rapeseed, but there are very few reports about mtDNA rearrangements in wheat. In the present work, we describe the mitochondrial genome of a wheat K-type CMS line and compare it with its maintainer line. The complete mtDNA sequence of a wheat K-type (with cytoplasm of Aegilops kotschyi) CMS line, Ks3, was assembled into a master circle (MC) molecule of 647,559 bp and found to harbor 34 known protein-coding genes, three rRNAs (18S, 26S, and 5S rRNAs), and 16 different tRNAs. Compared to our previously published sequence of a K-type maintainer line, Km3, we detected Ks3-specific mtDNA (> 100 bp, 11.38%) and repeats (> 100 bp, 29 units) as well as genes that are unique to each line: rpl5 was missing in Ks3 and trnH was absent from Km3. We also defined 32 single nucleotide polymorphisms (SNPs) in 13 protein-coding, albeit functionally irrelevant, genes, and predicted 22 unique ORFs in Ks3, representing potential candidates for K-type CMS. All these sequence variations are candidates for involvement in CMS. A comparative analysis of the mtDNA of several angiosperms, including those from Ks3, Km3, rice, maize, Arabidopsis thaliana, and rapeseed, showed that non-coding sequences of higher plants had mostly divergent multiple reorganizations during the mtDNA evolution of higher plants. The complete mitochondrial genome of the wheat K-type CMS line Ks3 is very different from that of its maintainer line Km3, especially in non-coding sequences. Sequence rearrangement has produced novel chimeric ORFs, which may be candidate genes for CMS. Comparative analysis of several angiosperm mtDNAs indicated that non-coding sequences are the most frequently reorganized during mtDNA evolution in higher plants.

  13. tRF2Cancer: A web server to detect tRNA-derived small RNA fragments (tRFs) and their expression in multiple cancers.

    PubMed

    Zheng, Ling-Ling; Xu, Wei-Lin; Liu, Shun; Sun, Wen-Ju; Li, Jun-Hao; Wu, Jie; Yang, Jian-Hua; Qu, Liang-Hu

    2016-07-08

    tRNA-derived small RNA fragments (tRFs) are one class of small non-coding RNAs derived from transfer RNAs (tRNAs). tRFs play important roles in cellular processes and are involved in multiple cancers. High-throughput small RNA (sRNA) sequencing experiments can detect all the cellular expressed sRNAs, including tRFs. However, distinguishing genuine tRFs from RNA fragments generated by random degradation remains a major challenge. In this study, we developed an integrated web-based computing system, tRF2Cancer, to accurately identify tRFs from sRNA deep-sequencing data and evaluate their expression in multiple cancers. The binomial test was introduced to evaluate whether reads from a small RNA-seq data set represent tRFs or degraded fragments. A classification method was then used to annotate the types of tRFs based on their sites of origin in pre-tRNA or mature tRNA. We applied the pipeline to analyze 10 991 data sets from 32 types of cancers and identified thousands of expressed tRFs. A tool called 'tRFinCancer' was developed to facilitate the users to inspect the expression of tRFs across different types of cancers. Another tool called 'tRFBrowser' shows both the sites of origin and the distribution of chemical modification sites in tRFs on their source tRNA. The tRF2Cancer web server is available at http://rna.sysu.edu.cn/tRFfinder/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
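
    The abstract above says a binomial test separates genuine tRFs from random degradation products but does not give the null model. The sketch below is therefore only one plausible reading: it assumes that, under random degradation, each read mapped to a tRNA is equally likely to start at any of `n_positions` candidate sites, and it asks whether the reads piling up at one site exceed that expectation. The function names, the uniform null, and the 0.05 threshold are all assumptions for illustration, not details of tRF2Cancer.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_candidate_trf(reads_at_site, total_reads, n_positions, alpha=0.05):
    """Flag a site whose read count is unlikely under a uniform
    random-degradation null (each site has probability 1 / n_positions)."""
    p_null = 1.0 / n_positions
    return binomial_upper_tail(reads_at_site, total_reads, p_null) < alpha


if __name__ == "__main__":
    # Made-up counts: 90 of 100 reads start at one of 50 possible sites.
    print(is_candidate_trf(90, 100, 50))
```

    In practice a multiple-testing correction across sites and tRNAs would probably be needed; that step is omitted here.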

  14. Draft genome sequence of a multidrug-resistant Aeromonas hydrophila ST508 strain carrying rmtD and blaCTX-M-131 isolated from a bloodstream infection.

    PubMed

    Moura, Quézia; Fernandes, Miriam R; Cerdeira, Louise; Santos, Ana Carolina M; de Souza, Tiago A; Ienne, Susan; Pignatari, Antonio Carlos C; Gales, Ana C; Silva, Rosa M; Lincopan, Nilton

    2017-09-01

    Here we report the draft genome sequence of a multidrug-resistant (MDR) Aeromonas hydrophila strain belonging to sequence type 508 (ST508) isolated from a human bloodstream infection. Assembly and annotation of this draft genome resulted in 5,028,498 bp and revealed the presence of 16S rRNA methylase rmtD and blaCTX-M-131 genes encoding high-level resistance to aminoglycosides and cephalosporins, respectively, as well as multiple virulence genes. This draft genome can provide significant information for understanding mechanisms on the establishment and treatment of infections caused by this pathogen. Copyright © 2017 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.

  15. Insertion sequence typing of Mycobacterium tuberculosis: characterization of a widespread subtype with a single copy of IS6110.

    PubMed

    Fomukong, N G; Tang, T H; al-Maamary, S; Ibrahim, W A; Ramayah, S; Yates, M; Zainuddin, Z F; Dale, J W

    1994-12-01

    DNA fingerprinting with the insertion sequence IS6110 (also known as IS986) has become established as a major tool for investigating the spread of tuberculosis. Most strains of Mycobacterium tuberculosis have multiple copies of IS6110, but a small minority carry a single copy only. We have examined selected strains from Malaysia, Tanzania and Oman, in comparison with M. bovis isolates and BCG strains carrying one or two copies of IS6110. The insertion sequence appears to be present in the same position in all these strains, which suggests that in these organisms the element is defective in transposition and that the loss of transposability may have occurred at an early stage in the evolution of the M. tuberculosis complex.

  16. Investigation of genetic diversity and epidemiological characteristics of Pasteurella multocida isolates from poultry in southwest China by population structure, multi-locus sequence typing and virulence-associated gene profile analysis.

    PubMed

    Li, Zhangcheng; Cheng, Fangjun; Lan, Shimei; Guo, Jianhua; Liu, Wei; Li, Xiaoyan; Luo, Zeli; Zhang, Manli; Wu, Juan; Shi, Yang

    2018-04-25

    Fowl cholera caused by Pasteurella multocida has always been a disease of global importance for poultry production. The aim of this study was to obtain more information about the epidemiology of avian P. multocida infection in southwest China and the genetic characteristics of clinical isolates. P. multocida isolates were characterized by biochemical and molecular-biological methods. The distributions of the capsular serogroups, the phenotypic antimicrobial resistance profiles, lipopolysaccharide (LPS) genotyping and the presence of 19 virulence genes were investigated in 45 isolates of P. multocida that were associated with clinical disease in poultry. The genetic diversity of P. multocida strains was assessed by 16S rRNA and rpoB gene sequence analysis as well as multilocus sequence typing (MLST). The results showed that most (80.0%) of the P. multocida isolates in this study represented special P. multocida subspecies, and 71.1% of the isolates showed multiple-drug resistance. All 45 isolates belonged to capsular type A (100%) and to two LPS genotypes: L1 (95.6%) and L3 (4.4%). MLST revealed two new alleles (pmi77 and gdh57) and one new sequence type (ST342). ST129 dominated among the 45 P. multocida isolates. Isolates belonging to ST129 carried the genes ompH+plpB+ptfA+tonB, whereas ST342 included isolates with fur+hgbA+tonB genes. Population genetic analysis and the MLST results revealed that at least one new ST genotype was present in the avian P. multocida in China. These findings provide novel insights into the epidemiological characteristics of avian P. multocida isolates in southwest China.

  17. Free Energy Landscape and Multiple Folding Pathways of an H-Type RNA Pseudoknot

    PubMed Central

    Bian, Yunqiang; Zhang, Jian; Wang, Jun; Wang, Jihua; Wang, Wei

    2015-01-01

    How RNA sequences fold to specific tertiary structures is one of the key problems for understanding their dynamics and functions. Here, we study the folding process of an H-type RNA pseudoknot by performing a large-scale all-atom MD simulation and bias-exchange metadynamics. The folding free energy landscapes are obtained and several folding intermediates are identified. It is suggested that the folding occurs via multiple mechanisms, including a step-wise mechanism starting either from the first helix or the second, and a cooperative mechanism with both helices forming simultaneously. Despite the multiple-mechanism nature of the folding, the ensemble folding kinetics estimated from a Markov state model is single-exponential. It is also found that the correlation between folding and binding of metal ions is significant, and the bound ions mediate long-range interactions in the intermediate structures. Non-native interactions are found to be dominant in the unfolded state and also present in some intermediates, possibly hindering the folding process of the RNA. PMID:26030098

  18. Asynchronous, Decentralized DS-CDMA Using Feedback-Controlled Spreading Sequences for Time-Dispersive Channels

    NASA Astrophysics Data System (ADS)

    Miyatake, Teruhiko; Chiba, Kazuki; Hamamura, Masanori; Tachikawa, Shin'ichi

    We propose a novel asynchronous direct-sequence code-division multiple access (DS-CDMA) using feedback-controlled spreading sequences (FCSSs) (FCSS/DS-CDMA). At the receiver of FCSS/DS-CDMA, the code-orthogonalizing filter (COF) produces a spreading sequence, and the receiver returns the spreading sequence to the transmitter. Then the transmitter uses the spreading sequence as its updated version. The performance of FCSS/DS-CDMA is evaluated over time-dispersive channels. The results indicate that FCSS/DS-CDMA greatly suppresses both the intersymbol interference (ISI) and multiple access interference (MAI) over time-invariant channels. FCSS/DS-CDMA is applicable to the decentralized multiple access.

  19. CRISPR interference and priming varies with individual spacer sequences

    PubMed Central

    Xue, Chaoyou; Seetharam, Arun S.; Musharova, Olga; Severinov, Konstantin; Brouns, Stan J. J.; Severin, Andrew J.; Sashital, Dipali G.

    2015-01-01

    CRISPR–Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) systems allow bacteria to adapt to infection by acquiring ‘spacer’ sequences from invader DNA into genomic CRISPR loci. Cas proteins use RNAs derived from these loci to target cognate sequences for destruction through CRISPR interference. Mutations in the protospacer adjacent motif (PAM) and seed regions block interference but promote rapid ‘primed’ adaptation. Here, we use multiple spacer sequences to reexamine the PAM and seed sequence requirements for interference and priming in the Escherichia coli Type I-E CRISPR–Cas system. Surprisingly, CRISPR interference is far more tolerant of mutations in the seed and the PAM than previously reported, and this mutational tolerance, as well as priming activity, is highly dependent on spacer sequence. We identify a large number of functional PAMs that can promote interference, priming or both activities, depending on the associated spacer sequence. Functional PAMs are preferentially acquired during unprimed ‘naïve’ adaptation, leading to a rapid priming response following infection. Our results provide numerous insights into the importance of both spacer and target sequences for interference and priming, and reveal that priming is a major pathway for adaptation during initial infection. PMID:26586800

  20. Genetic diversity of pneumococcal surface protein A in invasive pneumococcal isolates from Korean children, 1991-2016.

    PubMed

    Yun, Ki Wook; Choi, Eun Hwa; Lee, Hoan Jong

    2017-01-01

    Pneumococcal surface protein A (PspA) is an important virulence factor of pneumococci and has been investigated as a primary component of a capsular serotype-independent pneumococcal vaccine. Thus, we sought to determine the genetic diversity of PspA to explore its potential as a vaccine candidate. Among the 190 invasive pneumococcal isolates collected from Korean children between 1991 and 2016, two (1.1%) isolates were found to have no pspA by multiple polymerase chain reactions. The full length pspA genes from 185 pneumococcal isolates were sequenced. The length of pspA varied, ranging from 1,719 to 2,301 base pairs with 55.7-100% nucleotide identity. Based on the sequences of the clade-defining regions, 68.7% and 49.7% were in PspA family 2 and clade 3/family 2, respectively. PspA clade types were correlated with genotypes using multilocus sequence typing and divided into several subclades based on diversity analysis of the N-terminal α-helical regions, which showed nucleotide sequence identities of 45.7-100% and amino acid sequence identities of 23.1-100%. Putative antigenicity plots were also diverse among individual clades and subclades. The differences in antigenicity patterns were concentrated within the N-terminal 120 amino acids. In conclusion, the N-terminal α-helical domain, which is known to be the major immunogenic portion of PspA, is genetically variable and should be further evaluated for antigenic differences and cross-reactivity between various PspA types from pneumococcal isolates.
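
    The identity ranges quoted above (for example 55.7-100% nucleotide identity) come from comparing aligned sequences. As a rough sketch of the calculation, the function below reports percent identity over the aligned, gap-free columns of two pre-aligned sequences. Note that identity definitions differ in their choice of denominator and the abstract does not state which was used, so the gap handling here is an assumption, as are the function name and the toy sequences.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences,
    counted over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (one of several conventions)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real pspA sequences.
    print(percent_identity("ACGT-ACGT", "ACGTTACGA"))
```

    The same function works for amino acid alignments, since it only compares characters column by column.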

  1. Photosynthesis Is Widely Distributed among Proteobacteria as Demonstrated by the Phylogeny of PufLM Reaction Center Proteins

    PubMed Central

    Imhoff, Johannes F.; Rahn, Tanja; Künzel, Sven; Neulinger, Sven C.

    2018-01-01

    Two different photosystems for performing bacteriochlorophyll-mediated photosynthetic energy conversion are employed in different bacterial phyla. Those bacteria employing a photosystem II type of photosynthetic apparatus include the phototrophic purple bacteria (Proteobacteria), Gemmatimonas and Chloroflexus with their photosynthetic relatives. The proteins of the photosynthetic reaction center PufL and PufM are essential components and are common to all bacteria with a type-II photosynthetic apparatus, including the anaerobic as well as the aerobic phototrophic Proteobacteria. Therefore, PufL and PufM proteins and their genes are perfect tools to evaluate the phylogeny of the photosynthetic apparatus and to study the diversity of the bacteria employing this photosystem in nature. Almost complete pufLM gene sequences and the derived protein sequences from 152 type strains and 45 additional strains of phototrophic Proteobacteria employing photosystem II were compared. The results give interesting and comprehensive insights into the phylogeny of the photosynthetic apparatus and clearly define Chromatiales, Rhodobacterales, Sphingomonadales as major groups distinct from other Alphaproteobacteria, from Betaproteobacteria and from Caulobacterales (Brevundimonas subvibrioides). A special relationship exists between the PufLM sequences of those bacteria employing bacteriochlorophyll b instead of bacteriochlorophyll a. A clear phylogenetic association of aerobic phototrophic purple bacteria to anaerobic purple bacteria according to their PufLM sequences is demonstrated indicating multiple evolutionary lines from anaerobic to aerobic phototrophic purple bacteria. The impact of pufLM gene sequences for studies on the environmental diversity of phototrophic bacteria is discussed and the possibility of their identification on the species level in environmental samples is pointed out. PMID:29472894

  2. High-Molecular-Mass Multi-c-Heme Cytochromes from Methylococcus capsulatus Bath

    PubMed Central

    Bergmann, David J.; Zahn, James A.; DiSpirito, Alan A.

    1999-01-01

    The polypeptide and structural gene for a high-molecular-mass c-type cytochrome, cytochrome c553O, was isolated from the methanotroph Methylococcus capsulatus Bath. Cytochrome c553O is a homodimer with a subunit molecular mass of 124,350 Da and an isoelectric point of 6.0. The heme c concentration was estimated to be 8.2 ± 0.4 mol of heme c per subunit. The electron paramagnetic resonance spectrum showed the presence of multiple low spin, S = 1/2, hemes. A degenerate oligonucleotide probe synthesized based on the N-terminal amino acid sequence of cytochrome c553O was used to identify a DNA fragment from M. capsulatus Bath that contains occ, the gene encoding cytochrome c553O. occ is part of a gene cluster which contains three other open reading frames (ORFs). ORF1 encodes a putative periplasmic c-type cytochrome with a molecular mass of 118,620 Da that shows approximately 40% amino acid sequence identity with occ and contains nine c-heme-binding motifs. ORF3 encodes a putative periplasmic c-type cytochrome with a molecular mass of 94,000 Da and contains seven c-heme-binding motifs but shows no sequence homology to occ or ORF1. ORF4 encodes a putative 11,100-Da protein. The four ORFs have no apparent similarity to any proteins in the GenBank database. The subunit molecular masses, arrangement and number of hemes, and amino acid sequences demonstrate that cytochrome c553O and the gene products of ORF1 and ORF3 constitute a new class of c-type cytochrome. PMID:9922265

  3. Volcanic Soils as Sources of Novel CO-Oxidizing Paraburkholderia and Burkholderia: Paraburkholderia hiiakae sp. nov., Paraburkholderia metrosideri sp. nov., Paraburkholderia paradisi sp. nov., Paraburkholderia peleae sp. nov., and Burkholderia alpina sp. nov. a Member of the Burkholderia cepacia Complex

    PubMed Central

    Weber, Carolyn F.; King, Gary M.

    2017-01-01

    Previous studies showed that members of the Burkholderiales were important in the succession of aerobic, molybdenum-dependent CO oxidizing-bacteria on volcanic soils. During these studies, four isolates were obtained from Kilauea Volcano (Hawai‘i, USA); one strain was isolated from Pico de Orizaba (Mexico) during a separate study. Based on 16S rRNA gene sequence similarities, the Pico de Orizaba isolate and the isolates from Kilauea Volcano were provisionally assigned to the genera Burkholderia and Paraburkholderia, respectively. Each of the isolates possessed a form I coxL gene that encoded the catalytic subunit of carbon monoxide dehydrogenase (CODH); none of the most closely related type strains possessed coxL or oxidized CO. Genome sequences for Paraburkholderia type strains facilitated an analysis of 16S rRNA gene sequence similarities and average nucleotide identities (ANI). ANI did not exceed 95% (the recommended cutoff for species differentiation) for any of the pairwise comparisons among 27 reference strains related to the new isolates. However, since the highest 16S rRNA gene sequence similarity among this set of reference strains was 98.93%, DNA-DNA hybridizations (DDH) were performed for two isolates whose 16S rRNA gene sequence similarities with their nearest phylogenetic neighbors were 98.96 and 99.11%. In both cases DDH values were <16%. Based on multiple variables, four of the isolates represent novel species within the Paraburkholderia: Paraburkholderia hiiakae sp. nov. (type strain I2T = DSM 28029T = LMG 27952T); Paraburkholderia paradisi sp. nov. (type strain WAT = DSM 28027T = LMG 27949T); Paraburkholderia peleae sp. nov. (type strain PP52-1T = DSM 28028T = LMG 27950T); and Paraburkholderia metrosideri sp. nov. (type strain DNBP6-1T = DSM 28030T = LMG 28140T). The remaining isolate represents the first CO-oxidizing member of the Burkholderia cepacia complex: Burkholderia alpina sp. nov. (type strain PO-04-17-38T = DSM 28031T = LMG 28138T). 
    PMID:28270796

  4. Volcanic Soils as Sources of Novel CO-Oxidizing Paraburkholderia and Burkholderia: Paraburkholderia hiiakae sp. nov., Paraburkholderia metrosideri sp. nov., Paraburkholderia paradisi sp. nov., Paraburkholderia peleae sp. nov., and Burkholderia alpina sp. nov. a Member of the Burkholderia cepacia Complex.

    PubMed

    Weber, Carolyn F; King, Gary M

    2017-01-01

    Previous studies showed that members of the Burkholderiales were important in the succession of aerobic, molybdenum-dependent CO oxidizing-bacteria on volcanic soils. During these studies, four isolates were obtained from Kilauea Volcano (Hawai'i, USA); one strain was isolated from Pico de Orizaba (Mexico) during a separate study. Based on 16S rRNA gene sequence similarities, the Pico de Orizaba isolate and the isolates from Kilauea Volcano were provisionally assigned to the genera Burkholderia and Paraburkholderia, respectively. Each of the isolates possessed a form I coxL gene that encoded the catalytic subunit of carbon monoxide dehydrogenase (CODH); none of the most closely related type strains possessed coxL or oxidized CO. Genome sequences for Paraburkholderia type strains facilitated an analysis of 16S rRNA gene sequence similarities and average nucleotide identities (ANI). ANI did not exceed 95% (the recommended cutoff for species differentiation) for any of the pairwise comparisons among 27 reference strains related to the new isolates. However, since the highest 16S rRNA gene sequence similarity among this set of reference strains was 98.93%, DNA-DNA hybridizations (DDH) were performed for two isolates whose 16S rRNA gene sequence similarities with their nearest phylogenetic neighbors were 98.96 and 99.11%. In both cases DDH values were <16%. Based on multiple variables, four of the isolates represent novel species within the Paraburkholderia: Paraburkholderia hiiakae sp. nov. (type strain I2T = DSM 28029T = LMG 27952T); Paraburkholderia paradisi sp. nov. (type strain WAT = DSM 28027T = LMG 27949T); Paraburkholderia peleae sp. nov. (type strain PP52-1T = DSM 28028T = LMG 27950T); and Paraburkholderia metrosideri sp. nov. (type strain DNBP6-1T = DSM 28030T = LMG 28140T). The remaining isolate represents the first CO-oxidizing member of the Burkholderia cepacia complex: Burkholderia alpina sp. nov. (type strain PO-04-17-38T = DSM 28031T = LMG 28138T).

  5. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing.

    PubMed

    Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske

    2007-02-14

    The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5'-tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (misassignment rate <0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5'-primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products.
The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
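
    The tag-based assignment described in this abstract can be illustrated with a minimal Python sketch. The tag table, the primer sequence and the function name `assign_reads` are hypothetical placeholders chosen for illustration; they are not taken from the paper, and a real pipeline would also need to handle sequencing errors in the tag.

```python
from collections import defaultdict

# Hypothetical dinucleotide tags and primer; none of these come from the paper.
TAG_TO_SPECIMEN = {"AC": "specimen_1", "CA": "specimen_2", "TG": "specimen_3"}
PRIMER = "GGTCAACAAATCATAAAGATATTGG"  # placeholder forward-primer sequence


def assign_reads(reads, tag_to_specimen=TAG_TO_SPECIMEN, primer=PRIMER, tag_len=2):
    """Group reads by their 5' tag.

    A read is assigned only if its first `tag_len` bases match a known tag
    and the primer follows immediately; everything else is 'unassigned'.
    """
    bins = defaultdict(list)
    for read in reads:
        tag, rest = read[:tag_len], read[tag_len:]
        specimen = tag_to_specimen.get(tag)
        if specimen is not None and rest.startswith(primer):
            # Strip tag and primer, keeping only the amplified insert.
            bins[specimen].append(rest[len(primer):])
        else:
            bins["unassigned"].append(read)
    return dict(bins)
```

    Requiring the primer directly after the tag is one simple way to reject reads whose first bases match a tag only by chance.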

  6. [Molecular typing methods for Pasteurella multocida-A review].

    PubMed

    Peng, Zhong; Liang, Wan; Wu, Bin

    2016-10-04

    Pasteurella multocida is an important gram-negative pathogenic bacterium that can infect a wide range of animals. Humans can also be infected by P. multocida through animal bites or scratches. Current typing methods for P. multocida include serological and molecular typing methods. Serological typing methods are based on immunological assays, which are too complicated for clinical bacteriological studies. In contrast, molecular methods, including multiple PCRs and multilocus sequence typing (MLST), are more suitable for clinical bacteriological studies of P. multocida: compared with traditional serological typing, they are simpler to perform, more efficient and more accurate, and they are therefore widely used. In the current review, we briefly describe the molecular typing methods for P. multocida. Our aim is to provide a knowledge foundation for clinical bacteriological investigation, especially molecular investigation, of P. multocida.

  7. Mumps virus F gene and HN gene sequencing as a molecular tool to study mumps virus transmission.

    PubMed

    Gouma, Sigrid; Cremer, Jeroen; Parkkali, Saara; Veldhuijzen, Irene; van Binnendijk, Rob S; Koopmans, Marion P G

    2016-11-01

    Various mumps outbreaks have occurred in the Netherlands since 2004, particularly among persons who had received 2 doses of measles, mumps, and rubella (MMR) vaccination. Genomic typing of pathogens can be used to track outbreaks, but the established genotyping of mumps virus based on the small hydrophobic (SH) gene sequences did not provide sufficient resolution. Therefore, we expanded the sequencing to include fusion (F) gene and haemagglutinin-neuraminidase (HN) gene sequences in addition to the SH gene sequences from 109 mumps virus genotype G strains obtained between 2004 and mid 2015 in the Netherlands. When the molecular information from these 3 genes was combined, we were able to identify separate mumps virus clusters and track mumps virus transmission. The analyses suggested that multiple mumps virus introductions occurred in the Netherlands between 2004 and 2015 resulting in several mumps outbreaks throughout this period, whereas during some local outbreaks the molecular data pointed towards endemic circulation. Combined analysis of epidemiological data and sequence data collected in 2015 showed good support for the phylogenetic clustering. Copyright © 2016 Elsevier B.V. All rights reserved.

  8. Transcriptogenomics identification and characterization of RNA editing sites in human primary monocytes using high-depth next generation sequencing data.

    PubMed

    Leong, Wai-Mun; Ripen, Adiratna Mat; Mirsafian, Hoda; Mohamad, Saharuddin Bin; Merican, Amir Feisal

    2018-06-07

    High-depth next generation sequencing data provide valuable insights into the number and distribution of RNA editing events. Here, we report RNA editing events at the cellular level in human primary monocytes using high-depth whole-genome and transcriptome sequencing data. We identified over ten thousand putative RNA editing sites, 69% of which were A-to-I editing sites. The sites were enriched in repetitive sequences and intronic regions. High-depth sequencing datasets revealed that 90% of the canonical sites were edited at lower frequencies (<0.7). Single and multiple human monocyte and brain tissue samples were analyzed through a genome sequence-independent approach. The latter approach was observed to identify more editing sites. Monocytes were observed to contain more C-to-U editing sites than brain tissues. Our results establish a comparable pipeline that can address current limitations and demonstrate the potential for highly sensitive detection of RNA editing events in a single cell type. Copyright © 2018 Elsevier Inc. All rights reserved.
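
    As a rough illustration of the quantities mentioned in this abstract, the sketch below computes an editing frequency from read counts and labels a site by its mismatch type. It assumes the usual convention that A-to-I editing appears as an A>G mismatch in sequencing reads (inosine is read as guanosine) and C-to-U editing as C>T; strand handling is ignored, and the function names are hypothetical rather than part of the authors' pipeline.

```python
def editing_level(ref_count, alt_count):
    """Fraction of reads at a site that carry the edited base."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("site has no read coverage")
    return alt_count / total


def classify_site(ref_base, alt_base):
    """Label a mismatch as a canonical editing type, or 'other'.

    Assumes the transcript is on the reference strand, so no
    complementing is done (a simplification for illustration).
    """
    change = (ref_base.upper(), alt_base.upper())
    if change == ("A", "G"):
        return "A-to-I"  # inosine is sequenced as G
    if change == ("C", "T"):
        return "C-to-U"  # uridine is sequenced as T
    return "other"
```

    With these definitions, a site covered by 30 reference reads and 10 edited reads has an editing level of 0.25, which falls in the "<0.7" range the abstract calls lower-frequency editing.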

  9. Genetic diversity of Babesia bovis in virulent and attenuated strains.

    PubMed

    Mazuz, M L; Molad, T; Fish, L; Leibovitz, B; Wolkomirsky, R; Fleiderovitz, L; Shkap, V

    2012-03-01

    The aim of this study was to compare the genetic diversity of the single-copy Bv80 gene sequences of Babesia bovis in populations of attenuated and virulent parasites. PCR/RT-PCR followed by cloning and sequence analyses of 4 attenuated and 4 virulent strains were performed. Multiple fragments in the range of 420 to 744 bp were amplified by PCR or RT-PCR. Cloning of the PCR fragments and sequence analyses revealed the presence of mixed subpopulations in either virulent or attenuated parasites with a total of 19 variants with 12 different sequences that differed in number and type of tandem repeats. High levels of intra- and inter-strain diversity of the Bv80 gene, with the presence of mixed populations of parasites, were found in both the virulent field isolates and the attenuated vaccine strains. In addition, during the attenuation process, sequence analyses showed changes in the pattern of the parasite subpopulations. Despite high polymorphism found by sequence analyses, the patterns observed and the number of repeats, order, or motifs found could not discriminate between virulent field isolates and attenuated vaccine strains of the parasite.

  10. High resolution identity testing of inactivated poliovirus vaccines

    PubMed Central

    Mee, Edward T.; Minor, Philip D.; Martin, Javier

    2015-01-01

    Background: Definitive identification of poliovirus strains in vaccines is essential for quality control, particularly where multiple wild-type and Sabin strains are produced in the same facility. Sequence-based identification provides the ultimate in identity testing and would offer several advantages over serological methods. Methods: We employed random RT-PCR and high throughput sequencing to recover full-length genome sequences from monovalent and trivalent poliovirus vaccine products at various stages of the manufacturing process. Results: All expected strains were detected in previously characterised products and the method permitted identification of strains comprising as little as 0.1% of sequence reads. Highly similar Mahoney and Sabin 1 strains were readily discriminated on the basis of specific variant positions. Analysis of a product known to contain incorrect strains demonstrated that the method correctly identified the contaminants. Conclusion: Random RT-PCR and shotgun sequencing provided high resolution identification of vaccine components. In addition to the recovery of full-length genome sequences, the method could also be easily adapted to the characterisation of minor variant frequencies and distinction of closely related products on the basis of distinguishing consensus and low frequency polymorphisms. PMID:26049003

  11. BIOPEP database and other programs for processing bioactive peptide sequences.

    PubMed

    Minkiewicz, Piotr; Dziuba, Jerzy; Iwaniak, Anna; Dziuba, Marta; Darewicz, Małgorzata

    2008-01-01

    This review presents the potential for application of computational tools in peptide science based on a sample BIOPEP database and program as well as other programs and databases available via the World Wide Web. The BIOPEP application contains a database of biologically active peptide sequences and a program enabling construction of profiles of the potential biological activity of protein fragments, calculation of quantitative descriptors as measures of the value of proteins as potential precursors of bioactive peptides, and prediction of bonds susceptible to hydrolysis by endopeptidases in a protein chain. Other bioactive and allergenic peptide sequence databases are also presented. Programs enabling the construction of binary and multiple alignments between peptide sequences, the construction of sequence motifs attributed to a given type of bioactivity, searching for potential precursors of bioactive peptides, and the prediction of sites susceptible to proteolytic cleavage in protein chains are available via the Internet as are other approaches concerning secondary structure prediction and calculation of physicochemical features based on amino acid sequence. Programs for prediction of allergenic and toxic properties have also been developed. This review explores the possibilities of cooperation between various programs.

  12. Comparison of seven techniques for typing international epidemic strains of Clostridium difficile: restriction endonuclease analysis, pulsed-field gel electrophoresis, PCR-ribotyping, multilocus sequence typing, multilocus variable-number tandem-repeat analysis, amplified fragment length polymorphism, and surface layer protein A gene sequence typing.

    PubMed

    Killgore, George; Thompson, Angela; Johnson, Stuart; Brazier, Jon; Kuijper, Ed; Pepin, Jacques; Frost, Eric H; Savelkoul, Paul; Nicholson, Brad; van den Berg, Renate J; Kato, Haru; Sambol, Susan P; Zukowski, Walter; Woods, Christopher; Limbago, Brandi; Gerding, Dale N; McDonald, L Clifford

    2008-02-01

    Using 42 isolates contributed by laboratories in Canada, The Netherlands, the United Kingdom, and the United States, we compared the results of analyses done with seven Clostridium difficile typing techniques: multilocus variable-number tandem-repeat analysis (MLVA), amplified fragment length polymorphism (AFLP), surface layer protein A gene sequence typing (slpAST), PCR-ribotyping, restriction endonuclease analysis (REA), multilocus sequence typing (MLST), and pulsed-field gel electrophoresis (PFGE). We assessed the discriminating ability and typeability of each technique as well as the agreement among techniques in grouping isolates by allele profile A (AP-A) through AP-F, which are defined by toxinotype, the presence of the binary toxin gene, and deletion in the tcdC gene. We found that all isolates were typeable by all techniques and that discrimination index scores for the techniques tested ranged from 0.964 to 0.631 in the following order: MLVA, REA, PFGE, slpAST, PCR-ribotyping, MLST, and AFLP. All the techniques were able to distinguish the current epidemic strain of C. difficile (BI/027/NAP1) from other strains. All of the techniques showed multiple types for AP-A (toxinotype 0, binary toxin negative, and no tcdC gene deletion). REA, slpAST, MLST, and PCR-ribotyping all included AP-B (toxinotype III, binary toxin positive, and an 18-bp deletion in tcdC) in a single group that excluded other APs. PFGE, AFLP, and MLVA grouped two, one, and two different non-AP-B isolates, respectively, with their AP-B isolates. All techniques appear to be capable of detecting outbreak strains, but only REA and MLVA showed sufficient discrimination to distinguish strains from different outbreaks.
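
    The abstract reports discrimination index scores without stating the formula. A common choice for comparing typing methods is Simpson's index of diversity in the Hunter-Gaston form, D = 1 - [sum over types of n_j(n_j - 1)] / [N(N - 1)], where N is the number of isolates and n_j the number assigned to type j. The sketch below assumes that formulation; the function name is hypothetical.

```python
from collections import Counter


def discrimination_index(type_labels):
    """Hunter-Gaston form of Simpson's index of diversity.

    `type_labels` holds one type assignment per isolate. The result is the
    probability that two isolates drawn without replacement have different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels).values()
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))
```

    A method that gives every isolate its own type scores 1.0, and one that puts all isolates in a single type scores 0.0, so higher values indicate finer discrimination.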

  13. Characterization of Human Papillomavirus Type 154 and Tissue Tropism of Gammapapillomaviruses

    PubMed Central

    Ure, Agustín Enrique; Forslund, Ola

    2014-01-01

    The novel human papillomavirus type 154 (HPV154) was characterized from a wart on the crena ani of a three-year-old boy. It was previously designated as the putative HPV type FADI3 by sequencing of a subgenomic FAP amplicon. We obtained the complete genome by combined methods including rolling circle amplification (RCA), genome walking through an adapted method for detection of integrated papillomavirus sequences by ligation-mediated PCR (DIPS-PCR), long-range PCR, and finally by cloning of four overlapping amplicons. Phylogenetically, the HPV154 genome clustered together with members of the proposed species Gammapapillomavirus 11, and demonstrated the highest identity in L1 to HPV136 (68.6%). The HPV154 was detected in 3% (2/62) of forehead skin swabs from healthy children. In addition, the different detection sites of 62 gammapapillomaviruses were summarized in order to analyze their tissue tropism. Several of these HPV types have been detected from multiple sources such as skin, oral, nasal, and genital sites, suggesting that the gammapapillomaviruses are generalists with a broader tissue tropism than previously appreciated. The study expands current knowledge concerning genetic diversity and tropism among HPV types in the rapidly growing gammapapillomavirus genus. PMID:24551244

  14. Development of a multiplex polymerase chain reaction-sequence-specific primer method for NKG2D and NKG2F single-nucleotide polymorphism typing using isothermal multiple displacement amplification products.

    PubMed

    Kaewmanee, M; Phoksawat, W; Romphruk, A; Romphruk, A V; Jumnainsong, A; Leelayuwat, C

    2013-06-01

    Natural killer group 2 member D (NKG2D) on immune effector cells recognizes multiple stress-inducible ligands. NKG2D single-nucleotide polymorphism (SNP) haplotypes were related to the levels of cytotoxic activity of peripheral blood mononuclear cells. Indeed, these polymorphisms were also located in NKG2F. Isothermal multiple displacement amplification (IMDA) is used for whole genome amplification (WGA) and can amplify very small genomic DNA templates into microgram quantities with whole genome coverage. This is particularly useful when only a limited amount of a valuable DNA sample is available and multi-locus genotyping is required. In this study, we evaluated the quality and applicability of IMDA to genetic studies in terms of sensitivity, efficiency of IMDA re-amplification and stability of IMDA products. The smallest amount of DNA that could be effectively amplified by IMDA was 200 pg, yielding approximately 16 µg of final DNA within 1.5 h. IMDA products could be re-amplified only once (second round of amplification), and could be kept for 5 months at 4°C and more than a year at -20°C without losing genome coverage. The amplified products were used successfully to set up a multiplex polymerase chain reaction-sequence-specific primer method for SNP typing of the NKG2D/F genes. The NKG2D/F multiplex polymerase chain reaction (PCR) contained six PCR mixtures for detecting 10 selected SNPs, including 8 NKG2D/F SNP haplotypes and 2 additional NKG2D coding SNPs. This typing procedure will be applicable in both clinical and research laboratories. Thus, our data provide useful information on, and limitations for, the use of genome-wide amplification with IMDA and its application to multiplex NKG2D/F typing. © 2013 John Wiley & Sons Ltd.

  15. The human clone ST22 SCCmec IV methicillin-resistant Staphylococcus aureus isolated from swine herds and wild primates in Nepal: is man the common source?

    PubMed

    Roberts, Marilyn C; Joshi, Prabhu Raj; Greninger, Alexander L; Melendez, Daira; Paudel, Saroj; Acharya, Mahesh; Bimali, Nabin Kishor; Koju, Narayan P; No, David; Chalise, Mukesh; Kyes, Randall C

    2018-05-01

    Swine nasal samples [n = 282] were collected from healthy animals on 12 randomly selected farms around Kathmandu, Nepal. In addition, wild monkey (Macaca mulatta) saliva samples [n = 59] were collected near temple areas in Kathmandu using a non-invasive sampling technique. All samples were processed for MRSA using standardized selective media and conventional biochemical tests. MRSA verification was done and isolates were characterized by SCCmec typing, multilocus sequence typing, whole genome sequencing [WGS] and antibiotic susceptibility testing. Six (2.1%) swine MRSA were isolated from five of the swine herds tested; five were ST22 type IV and one was ST88 type V. Four (6.8%) macaque MRSA were isolated, three ST22 SCCmec type IV and one ST239 type III. WGS showed that the eight ciprofloxacin-resistant ST22 isolates carried the gyrA mutation [S84L]. Six isolates carried the erm(C) genes, five isolates carried aacC-aphD genes and four isolates carried blaZ genes. The linezolid-resistant swine ST22 isolate did not carry any known acquired linezolid resistance genes but had a mutation in ribosomal protein L22 [A29V] and an insertion in L4 [68KG69], both previously associated with linezolid resistance. Multiple virulence factors were also identified. This is the first time MRSA ST22 SCCmec IV has been isolated from livestock or primates.

  16. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    PubMed

    Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

    2015-01-01

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
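
    The idea of finding read pairs that span a genome/T-DNA junction can be sketched as follows. This is not the authors' tool: it assumes the read pairs have already been aligned to a combined reference (genome plus T-DNA vector), that each pair is summarized as the reference names hit by its two mates, and that the function and reference names are hypothetical.

```python
def junction_pairs(pairs, tdna_refs):
    """Return (pair_id, genome_reference) for pairs that span a junction.

    `pairs` is an iterable of (pair_id, mate1_ref, mate2_ref), where each ref
    is the name of the reference sequence the mate aligned to, or None if the
    mate is unaligned. A pair spans a junction when exactly one mate maps to
    a T-DNA reference and the other maps to the genome.
    """
    hits = []
    for pair_id, ref1, ref2 in pairs:
        if ref1 is None or ref2 is None:
            continue  # cannot place a pair with an unaligned mate
        if (ref1 in tdna_refs) != (ref2 in tdna_refs):
            genome_ref = ref2 if ref1 in tdna_refs else ref1
            hits.append((pair_id, genome_ref))
    return hits
```

    Collecting the genome-side references (and, in a fuller version, their coordinates) across many such pairs points to candidate insertion sites, which split reads could then refine to the exact junction.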

  17. Limitations of variable number of tandem repeat typing identified through whole genome sequencing of Mycobacterium avium subsp. paratuberculosis on a national and herd level.

    PubMed

    Ahlstrom, Christina; Barkema, Herman W; Stevenson, Karen; Zadoks, Ruth N; Biek, Roman; Kao, Rowland; Trewby, Hannah; Haupstein, Deb; Kelton, David F; Fecteau, Gilles; Labrecque, Olivia; Keefe, Greg P; McKenna, Shawn L B; De Buck, Jeroen

    2015-03-08

    Mycobacterium avium subsp. paratuberculosis (MAP), the causative bacterium of Johne's disease in dairy cattle, is widespread in the Canadian dairy industry and has significant economic and animal welfare implications. An understanding of the population dynamics of MAP can be used to identify introduction events, improve control efforts and target transmission pathways, although this requires an adequate understanding of MAP diversity and distribution between herds and across the country. Whole genome sequencing (WGS) offers a detailed assessment of the SNP-level diversity and genetic relationship of isolates, whereas several molecular typing techniques used to investigate the molecular epidemiology of MAP, such as variable number of tandem repeat (VNTR) typing, target relatively unstable repetitive elements in the genome that may be too unpredictable to draw accurate conclusions. The objective of this study was to evaluate the diversity of bovine MAP isolates in Canadian dairy herds using WGS and then determine if VNTR typing can distinguish truly related and unrelated isolates. Phylogenetic analysis based on 3,039 SNPs identified through WGS of 124 MAP isolates identified eight genetically distinct subtypes in dairy herds from seven Canadian provinces, with the dominant type including over 80% of MAP isolates. VNTR typing of 527 MAP isolates identified 12 types, including "bison type" isolates, from seven different herds. At a national level, MAP isolates differed from each other by 1-2 to 239-240 SNPs, regardless of whether they belonged to the same or different VNTR types. A herd-level analysis of MAP isolates demonstrated that VNTR typing may both over-estimate and under-estimate the relatedness of MAP isolates found within a single herd. The presence of multiple MAP subtypes in Canada suggests multiple introductions into the country including what has now become one dominant type, an important finding for Johne's disease control. 
VNTR typing often failed to identify closely and distantly related isolates, limiting the applicability of using this typing scheme to study the molecular epidemiology of MAP at a national and herd-level.
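
    The comparison between SNP differences and VNTR types rests on pairwise SNP distances between isolates. A minimal sketch of that calculation is given below, assuming each isolate is represented by a string of its bases at the same ordered set of variable positions and that `N` marks a missing call; the function names are hypothetical and not from the study.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count positions where two equal-length SNP strings differ, skipping 'N' calls."""
    if len(a) != len(b):
        raise ValueError("SNP strings must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y and "N" not in (x, y))


def pairwise_distances(isolates):
    """Map each unordered pair of isolate names to its SNP distance."""
    return {
        (i, j): snp_distance(isolates[i], isolates[j])
        for i, j in combinations(sorted(isolates), 2)
    }
```

    Grouping these distances by whether the two isolates share a VNTR type gives the kind of within-type versus between-type comparison the abstract describes.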

  18. A core microbiome associated with the peritoneal tumors of pseudomyxoma peritonei

    PubMed Central

    2013-01-01

    Background: Pseudomyxoma peritonei (PMP) is a malignancy characterized by dissemination of mucus-secreting cells throughout the peritoneum. This disease is associated with significant morbidity and mortality and despite effective treatment options for early-stage disease, patients with PMP often relapse. Thus, there is a need for additional treatment options to reduce relapse rate and increase long-term survival. A previous study identified the presence of both typed and non-culturable bacteria associated with PMP tissue and determined that increased bacterial density was associated with more severe disease. These findings highlighted the possible role for bacteria in PMP disease. Methods: To more clearly define the bacterial communities associated with PMP disease, we employed a sequence-based analysis to profile the bacterial populations found in PMP tumor and mucin tissue in 11 patients. Sequencing data were confirmed by in situ hybridization at multiple taxonomic depths and by culturing. A pilot clinical study was initiated to determine whether the addition of antibiotic therapy affected PMP patient outcome. Main results: We determined that the types of bacteria present are highly conserved in all PMP patients; the dominant phyla are the Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes. A core set of taxon-specific sequences was found in all 11 patients; many of these sequences were classified into taxonomic groups that also contain known human pathogens. In situ hybridization directly confirmed the presence of bacteria in PMP at multiple taxonomic depths and supported our sequence-based analysis. Furthermore, culturing of PMP tissue samples allowed us to isolate 11 different bacterial strains from eight independent patients, and in vitro analysis of a subset of these isolates suggests that at least some of these strains may interact with the PMP-associated mucin MUC2.
    Finally, we provide evidence suggesting that targeting these bacteria with antibiotic treatment may increase the survival of PMP patients. Conclusions: Using 16S amplicon-based sequencing, direct in situ hybridization analysis and culturing methods, we have identified numerous bacterial taxa that are consistently present in all PMP patients tested. Combined with data from a pilot clinical study, these data support the hypothesis that adding antimicrobials to the standard PMP treatment could improve PMP patient survival. PMID:23844722

  19. Using a Sequence of Earcons to Monitor Multiple Simulated Patients.

    PubMed

    Hickling, Anna; Brecknell, Birgit; Loeb, Robert G; Sanderson, Penelope

    2017-03-01

    The aim of this study was to determine whether a sequence of earcons can effectively convey the status of multiple processes, such as the status of multiple patients in a clinical setting. Clinicians often monitor multiple patients. An auditory display that intermittently conveys the status of multiple patients may help. Nonclinician participants listened to sequences of 500-ms earcons that each represented the heart rate (HR) and oxygen saturation (SpO2) levels of a different simulated patient. In each sequence, one, two, or three patients had an abnormal level of HR and/or SpO2. In Experiment 1, participants reported which of nine patients in a sequence were abnormal. In Experiment 2, participants identified the vital signs of one, two, or three abnormal patients in sequences of one, five, or nine patients, where the interstimulus interval (ISI) between earcons was 150 ms. Experiment 3 used the five-sequence condition of Experiment 2, but the ISI was either 150 ms or 800 ms. Participants reported which patient(s) were abnormal with median 95% accuracy. Identification accuracy for vital signs decreased as the number of abnormal patients increased from one to three, p < .001, but accuracy was unaffected by number of patients in a sequence. Overall, identification accuracy was significantly higher with an ISI of 800 ms (89%) compared with an ISI of 150 ms (83%), p < .001. A multiple-patient display can be created by cycling through earcons that represent individual patients. The principles underlying the multiple-patient display can be extended to other vital signs, designs, and domains.

  20. Multiple Use One-Sided Hypotheses Testing in Univariate Linear Calibration

    NASA Technical Reports Server (NTRS)

    Krishnamoorthy, K.; Kulkarni, Pandurang M.; Mathew, Thomas

    1996-01-01

    Consider a normally distributed response variable, related to an explanatory variable through the simple linear regression model. Data obtained on the response variable, corresponding to known values of the explanatory variable (i.e., calibration data), are to be used for testing hypotheses concerning unknown values of the explanatory variable. We consider the problem of testing an unlimited sequence of one-sided hypotheses concerning the explanatory variable, using the corresponding sequence of values of the response variable and the same set of calibration data. This is the situation of multiple use of the calibration data. The tests derived in this context are characterized by two types of uncertainties: one uncertainty associated with the sequence of values of the response variable, and a second uncertainty associated with the calibration data. We derive tests based on a condition that incorporates both of these uncertainties. The solution has practical applications in the decision limit problem. We illustrate our results using an example dealing with the estimation of blood alcohol concentration based on breath estimates of the alcohol concentration. In the example, the problem is to test if the unknown blood alcohol concentration of an individual exceeds a threshold that is safe for driving.
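
    The setting described in this abstract can be written out as follows. The notation (the intercept, slope, threshold c and the indices) is assumed here for illustration and is not taken from the report; the tests derived in the report are not reproduced, only the model, the sequence of hypotheses and the classical point estimate of the unknown explanatory value.

```latex
% Calibration data: known x_i, observed y_i
y_i = \alpha + \beta x_i + \varepsilon_i, \qquad
\varepsilon_i \sim N(0, \sigma^2), \qquad i = 1, \dots, n

% Future responses: observed y_{0j}, unknown x_{0j}, same calibration data for every j
y_{0j} = \alpha + \beta x_{0j} + \varepsilon_{0j}, \qquad j = 1, 2, \dots

% Sequence of one-sided hypotheses about the unknown x_{0j}, for a threshold c
H_{0j} : x_{0j} \le c \quad \text{versus} \quad H_{1j} : x_{0j} > c

% Classical calibration estimate, using the least-squares fit from the calibration data
\hat{x}_{0j} = \frac{y_{0j} - \hat{\alpha}}{\hat{\beta}}
```

    In the blood alcohol example, x would be the blood alcohol concentration, y the breath-based reading and c the legal or safety threshold; the two sources of uncertainty named in the abstract correspond to the error in each new reading and the error in the fitted intercept and slope.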

  1. Germline mutations in candidate predisposition genes in individuals with cutaneous melanoma and at least two independent additional primary cancers.

    PubMed

    Pritchard, Antonia L; Johansson, Peter A; Nathan, Vaishnavi; Howlie, Madeleine; Symmons, Judith; Palmer, Jane M; Hayward, Nicholas K

    2018-01-01

    While a number of autosomal dominant and autosomal recessive cancer syndromes have an associated spectrum of cancers, the prevalence and variety of cancer predisposition mutations in patients with multiple primary cancers have not been extensively investigated. An understanding of the variants predisposing to more than one cancer type could improve patient care, including screening and genetic counselling, as well as advancing the understanding of tumour development. A cohort of 57 patients ascertained due to their cutaneous melanoma (CM) diagnosis and with a history of two or more additional non-cutaneous independent primary cancer types were recruited for this study. Patient blood samples were assessed by whole exome or whole genome sequencing. We focussed on variants in 525 pre-selected genes, including 65 autosomal dominant and 31 autosomal recessive cancer predisposition genes, 116 genes involved in the DNA repair pathway, and 313 commonly somatically mutated in cancer. The same genes were analysed in exome sequence data from 1358 control individuals collected as part of non-cancer studies (UK10K). The identified variants were classified for pathogenicity using online databases, literature and in silico prediction tools. No known pathogenic autosomal dominant or previously described compound heterozygous mutations in autosomal recessive genes were observed in the multiple cancer cohort. Variants typically found somatically in haematological malignancies (in JAK1, JAK2, SF3B1, SRSF2, TET2 and TYK2) were present in lymphocyte DNA of patients with multiple primary cancers, all of whom had a history of haematological malignancy and cutaneous melanoma, as well as colorectal cancer and/or prostate cancer. Other potentially pathogenic variants were discovered in BUB1B, POLE2, ROS1 and DNMT3A. 
Compared to controls, multiple cancer cases had significantly more likely damaging mutations (nonsense, frameshift ins/del) in tumour suppressor and tyrosine kinase genes and higher overall burden of mutations in all cancer genes. We identified several pathogenic variants that likely predispose to at least one of the tumours in patients with multiple cancers. We additionally present evidence that there may be a higher burden of variants of unknown significance in 'cancer genes' in patients with multiple cancer types. Further screens of this nature need to be carried out to build evidence to show if the cancers observed in these patients form part of a cancer spectrum associated with single germline variants in these genes, whether multiple layers of susceptibility exist (oligogenic or polygenic), or if the occurrence of multiple different cancers is due to random chance.

  2. A multiplexable TALE-based binary expression system for in vivo cellular interaction studies.

    PubMed

    Toegel, Markus; Azzam, Ghows; Lee, Eunice Y; Knapp, David J H F; Tan, Ying; Fa, Ming; Fulga, Tudor A

    2017-11-21

    Binary expression systems have revolutionised genetic research by enabling delivery of loss-of-function and gain-of-function transgenes with precise spatial-temporal resolution in vivo. However, at present, each existing platform relies on a defined exogenous transcription activator capable of binding a unique recognition sequence. Consequently, none of these technologies alone can be used to simultaneously target different tissues or cell types in the same organism. Here, we report a modular system based on programmable transcription activator-like effector (TALE) proteins, which enables parallel expression of multiple transgenes in spatially distinct tissues in vivo. Using endogenous enhancers coupled to TALE drivers, we demonstrate multiplexed orthogonal activation of several transgenes carrying cognate variable activating sequences (VAS) in distinct neighbouring cell types of the Drosophila central nervous system. Since the number of combinatorial TALE-VAS pairs is virtually unlimited, this platform provides an experimental framework for highly complex genetic manipulation studies in vivo.

  3. A minimalist model protein with multiple folding funnels

    PubMed Central

    Locker, C. Rebecca; Hernandez, Rigoberto

    2001-01-01

    Kinetic and structural studies of wild-type proteins such as prions and amyloidogenic proteins provide suggestive evidence that proteins may adopt multiple long-lived states in addition to the native state. All of these states differ structurally because they lie far apart in configuration space, but their stability is not necessarily caused by cooperative (nucleation) effects. In this study, a minimalist model protein is designed to exhibit multiple long-lived states to explore the dynamics of the corresponding wild-type proteins. The minimalist protein is modeled as a 27-monomer sequence confined to a cubic lattice with three different monomer types. An order parameter—the winding index—is introduced to characterize the extent of folding. The winding index has several advantages over other commonly used order parameters like the number of native contacts. It can distinguish between enantiomers, its calculation requires less computational time than the number of native contacts, and reduced-dimensional landscapes can be developed when the native state structure is not known a priori. The results for the designed model protein prove by existence that the rugged energy landscape picture of protein folding can be generalized to include protein “misfolding” into long-lived states. PMID:11470921

  4. Genetic characterization of Anaplasma marginale strains from Tunisia using single and multiple gene typing reveals novel variants with an extensive genetic diversity.

    PubMed

    Ben Said, Mourad; Ben Asker, Alaa; Belkahia, Hanène; Ghribi, Raoua; Selmi, Rachid; Messadi, Lilia

    2018-05-12

    Anaplasma marginale, which is responsible for bovine anaplasmosis in tropical and subtropical regions, is a tick-borne obligatory intraerythrocytic bacterium of cattle and wild ruminants. In Tunisia, information about the genetic diversity and the phylogeny of A. marginale strains is limited to the msp4 gene analysis. The purpose of this study is to investigate A. marginale isolates infecting 16 cattle located in different bioclimatic areas of northern Tunisia with single gene analysis and multilocus sequence typing methods on the basis of seven partial genes (dnaA, ftsZ, groEL, lipA, secY, recA and sucB). The single gene analysis confirmed the presence of different and novel heterogenic A. marginale strains infecting cattle from the north of Tunisia. The concatenated sequence analysis showed a phylogeographical resolution at the global level and that most of the Tunisian sequence types (STs) formed a separate cluster from a South African isolate and from all New World isolates and strains. By combining the characteristics of each single locus with those of the multi-loci scheme, these results provide a more detailed understanding of the diversity and the evolution of Tunisian A. marginale strains. Copyright © 2018 Elsevier GmbH. All rights reserved.

  5. Multiple layers of temporal and spatial control regulate accumulation of the fruiting body-specific protein APP in Sordaria macrospora and Neurospora crassa.

    PubMed

    Nowrousian, Minou; Piotrowski, Markus; Kück, Ulrich

    2007-07-01

    During fungal fruiting body development, specialized cell types differentiate from vegetative mycelium. We have isolated a protein from the ascomycete Sordaria macrospora that is not present during vegetative growth but accumulates in perithecia. The protein was sequenced by mass spectrometry and the corresponding gene was termed app (abundant perithecial protein). app transcript occurs only after the onset of sexual development; however, the formation of ascospores is not a prerequisite for APP accumulation. The transcript of the Neurospora crassa ortholog is present prior to fertilization, but the protein accumulates only after fertilization. In crosses of N. crassa Δapp strains with the wild type, APP accumulates when the wild type serves as female parent, but not in the reciprocal cross; thus, the presence of a functional female app allele is necessary and sufficient for APP accumulation. These findings highlight multiple layers of temporal and spatial control of gene expression during fungal development.

  6. An unbiased study of debris discs around A-type stars with Herschel

    NASA Astrophysics Data System (ADS)

    Thureau, N. D.; Greaves, J. S.; Matthews, B. C.; Kennedy, G.; Phillips, N.; Booth, M.; Duchêne, G.; Horner, J.; Rodriguez, D. R.; Sibthorpe, B.; Wyatt, M. C.

    2014-12-01

    The Herschel DEBRIS (Disc Emission via a Bias-free Reconnaissance in the Infrared/Submillimetre) survey brings us a unique perspective on the study of debris discs around main-sequence A-type stars. Bias-free by design, the survey offers a remarkable data set with which to investigate the cold disc properties. The statistical analysis of the 100 and 160 μm data for 86 main-sequence A stars yields a lower than previously found debris disc rate. Considering better than 3σ excess sources, we find a detection rate ≥24 ± 5 per cent at 100 μm which is similar to the debris disc rate around main-sequence F/G/K-spectral type stars. While the 100 and 160 μm excesses slowly decline with time, debris discs with large excesses are found around some of the oldest A stars in our sample, evidence that the debris phenomenon can survive throughout the length of the main sequence (~1 Gyr). Debris discs are predominantly detected around the youngest and hottest stars in our sample. Stellar properties such as metallicity are found to have no effect on the debris disc incidence. Debris discs are found around A stars in single systems and multiple systems at similar rates. While tight and wide binaries (<1 and >100 au, respectively) host debris discs with a similar frequency and global properties, no intermediate separation debris systems were detected in our sample.

  7. Optimal rotation sequences for active perception

    NASA Astrophysics Data System (ADS)

    Nakath, David; Rachuy, Carsten; Clemens, Joachim; Schill, Kerstin

    2016-05-01

    One major objective of autonomous systems navigating in dynamic environments is gathering information needed for self localization, decision making, and path planning. To account for this, such systems are usually equipped with multiple types of sensors. As these sensors often have a limited field of view and a fixed orientation, the task of active perception breaks down to the problem of calculating alignment sequences which maximize the information gain regarding expected measurements. Action sequences that rotate the system according to the calculated optimal patterns then have to be generated. In this paper we present an approach for calculating these sequences for an autonomous system equipped with multiple sensors. We use a particle filter for multi-sensor fusion and state estimation. The planning task is modeled as a Markov decision process (MDP), where the system decides in each step which actions to perform next. The optimal control policy, which provides the best action depending on the current estimated state, maximizes the expected cumulative reward. The latter is computed from the expected information gain of all sensors over time using value iteration. The algorithm is applied to a manifold representation of the joint space of rotation and time. We show the performance of the approach in a spacecraft navigation scenario where the information gain is changing over time, caused by the dynamic environment and the continuous movement of the spacecraft.
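
    The value-iteration step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two orientations "A" and "B", the actions "stay" and "rotate", the per-orientation information gain, and the discount factor are all hypothetical, and the transitions are assumed deterministic.

```python
# Minimal value-iteration sketch for a toy "rotate to gain information" MDP.
# All states, actions, gains, and the discount factor are illustrative assumptions.

def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Return (values, policy) for a finite MDP.

    transition(s, a) -> dict mapping next state to probability
    reward(s, a, s2) -> immediate reward for that transition
    """
    def q_value(s, a, values):
        # Expected immediate reward plus discounted value of the successor state.
        return sum(p * (reward(s, a, s2) + gamma * values[s2])
                   for s2, p in transition(s, a).items())

    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, values) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = {s: max(actions, key=lambda a: q_value(s, a, values)) for s in states}
    return values, policy


# Hypothetical toy problem: orientation "A" yields information, "B" does not.
STATES = ["A", "B"]
ACTIONS = ["stay", "rotate"]
GAIN = {"A": 1.0, "B": 0.0}  # assumed expected information gain per orientation

def transition(s, a):
    other = "B" if s == "A" else "A"
    return {s: 1.0} if a == "stay" else {other: 1.0}

def reward(s, a, s2):
    return GAIN[s2]  # reward is the gain at the orientation reached

values, policy = value_iteration(STATES, ACTIONS, transition, reward)
print(policy)
```

    In this toy setting the greedy policy keeps the system in the informative orientation and rotates out of the uninformative one; a time-varying gain, as in the spacecraft scenario, would require adding time to the state.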

  8. Whole Genome Sequencing demonstrates that Geographic Variation of Escherichia coli O157 Genotypes Dominates Host Association.

    PubMed

    Strachan, Norval J C; Rotariu, Ovidiu; Lopes, Bruno; MacRae, Marion; Fairley, Susan; Laing, Chad; Gannon, Victor; Allison, Lesley J; Hanson, Mary F; Dallman, Tim; Ashton, Philip; Franz, Eelco; van Hoek, Angela H A M; French, Nigel P; George, Tessy; Biggs, Patrick J; Forbes, Ken J

    2015-10-07

    Genetic variation in an infectious disease pathogen can be driven by ecological niche dissimilarities arising from different host species and different geographical locations. Whole genome sequencing was used to compare E. coli O157 isolates from host reservoirs (cattle and sheep) from Scotland and to compare genetic variation of isolates (human, animal, environmental/food) obtained from Scotland, New Zealand, Netherlands, Canada and the USA. Nei's genetic distance calculated from core genome single nucleotide polymorphisms (SNPs) demonstrated that the animal isolates were from the same population. Investigation of the Shiga toxin bacteriophage and their insertion sites (SBI typing) revealed that cattle and sheep isolates had statistically indistinguishable rarefaction profiles, diversity and genotypes. In contrast, isolates from different countries exhibited significant differences in Nei's genetic distance and SBI typing. Hence, after successful international transmission, which has occurred on multiple occasions, local genetic variation occurs, resulting in a global patchwork of continental and trans-continental phylogeographic clades. These findings are important for three reasons: first, understanding transmission and evolution of infectious diseases associated with multiple host reservoirs and multi-geographic locations; second, highlighting the relevance of the sheep reservoir when considering farm based interventions; and third, improving our understanding of why human disease incidence varies across the world.

  9. Molecular cloning of human T-cell lymphotrophic virus type I-like proviral genome from the peripheral lymphocyte DNA of a patient with chronic neurologic disorders.

    PubMed Central

    Reddy, E P; Mettus, R V; DeFreitas, E; Wroblewska, Z; Cisco, M; Koprowski, H

    1988-01-01

    Human T-cell lymphotropic virus type 1 (HTLV-I), the etiologic agent of human T-cell leukemia, has recently been shown to be associated with neurologic disorders such as tropical spastic paraparesis, HTLV-associated myelopathy, and possibly with multiple sclerosis. In this communication, we have examined one specific case of neurologic disorder that can be classified as multiple sclerosis or tropical spastic paraparesis. The patient suffering from chronic neurologic disorder was found to contain antibodies to HTLV-I envelope and gag proteins in his serum and cerebrospinal fluid. Lymphocytes from peripheral blood and cerebrospinal fluid of the patient were shown to express viral RNA sequences by in situ hybridization. Southern blot analysis of the patient lymphocyte DNA revealed the presence of HTLV-I-related sequences. Blot-hybridization analysis of the RNA from fresh peripheral lymphocytes stimulated with interleukin 2 revealed the presence of abundant amounts of genomic viral RNA with little or no subgenomic RNA. We have cloned the proviral genome from the DNA of the peripheral lymphocytes and determined its restriction map. This analysis shows that this proviral genome is very similar if not identical to that of the prototype HTLV-I genome. PMID:2897123

  10. Incorporating information on predicted solvent accessibility to the co-evolution-based study of protein interactions.

    PubMed

    Ochoa, David; García-Gutiérrez, Ponciano; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2013-01-27

    A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility to three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it neither improves these methods, nor the original mirrortree method, in predicting other types of interactions. That improvement comes at no cost in terms of applicability since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
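
    As a rough illustration of how tree similarity can be turned into a co-evolution score, the sketch below correlates two inter-species distance matrices. It assumes that similarity is measured as the Pearson correlation of the matrices' upper triangles, a common formulation of mirrortree-style scoring; the matrices and function names are hypothetical and are not taken from the paper.

```python
# Sketch of a mirrortree-style score: Pearson correlation between the
# upper triangles of two distance matrices over the same ordered species.
# The matrices and names below are hypothetical.
from itertools import combinations
from math import sqrt


def upper_triangle(dist):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(dist)
    return [dist[i][j] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mirrortree_score(dist_a, dist_b):
    """Correlate two distance matrices built over the same species order."""
    if len(dist_a) != len(dist_b):
        raise ValueError("matrices must cover the same set of species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Hypothetical distances for protein family A across four species.
dist_a = [[0, 1, 4, 5],
          [1, 0, 3, 6],
          [4, 3, 0, 2],
          [5, 6, 2, 0]]
# Family B evolves at twice the rate but with the same tree shape.
dist_b = [[2 * d for d in row] for row in dist_a]
# Family C has an unrelated distance pattern.
dist_c = [[0, 6, 2, 1],
          [6, 0, 4, 3],
          [2, 4, 0, 5],
          [1, 3, 5, 0]]

score_ab = mirrortree_score(dist_a, dist_b)
score_ac = mirrortree_score(dist_a, dist_c)
print(round(score_ab, 3), round(score_ac, 3))
```

    Restricting the input alignments to positions with a given predicted accessibility, as the abstract describes, would change only how the distance matrices are built, not how they are compared.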

  11. “Epidemic Clones” of Listeria monocytogenes Are Widespread and Ancient Clonal Groups

    PubMed Central

    Cantinelli, Thomas; Chenal-Francisque, Viviane; Diancourt, Laure; Frezal, Lise; Leclercq, Alexandre; Wirth, Thierry

    2013-01-01

    The food-borne pathogen Listeria monocytogenes is genetically heterogeneous. Although some clonal groups have been implicated in multiple outbreaks, there is currently no consensus on how “epidemic clones” should be defined. The objectives of this work were to compare the patterns of sequence diversity on two sets of genes that have been widely used to define L. monocytogenes clonal groups: multilocus sequence typing (MLST) and multi-virulence-locus sequence typing (MvLST). Further, we evaluated the diversity within clonal groups by pulsed-field gel electrophoresis (PFGE). Based on 125 isolates of diverse temporal, geographical, and source origins, MLST and MvLST genes (i) had similar patterns of sequence polymorphisms, recombination, and selection, (ii) provided concordant phylogenetic clustering, and (iii) had similar discriminatory power, which was not improved when we combined both data sets. Inclusion of representative strains of previous outbreaks demonstrated the correspondence of epidemic clones with previously recognized MLST clonal complexes. PFGE analysis demonstrated heterogeneity within major clones, most of which were isolated decades before their involvement in outbreaks. We conclude that the “epidemic clone” denominations represent a redundant but largely incomplete nomenclature system for MLST-defined clones, which must be regarded as successful genetic groups that are widely distributed across time and space. PMID:24006010

  12. Clustered regularly interspaced short palindromic repeats (CRISPRs) analysis of members of the Mycobacterium tuberculosis complex.

    PubMed

    Botelho, Ana; Canto, Ana; Leão, Célia; Cunha, Mónica V

    2015-01-01

    Typical CRISPR (clustered, regularly interspaced, short palindromic repeat) regions are constituted by short direct repeats (DRs), interspersed with similarly sized non-repetitive spacers, derived from transmissible genetic elements, acquired when the cell is challenged with foreign DNA. The analysis of the structure, in number and nature, of CRISPR spacers is a valuable tool for molecular typing since these loci are polymorphic among strains, originating characteristic signatures. The existence of CRISPR structures in the genome of the members of Mycobacterium tuberculosis complex (MTBC) enabled the development of a genotyping method, based on the analysis of the presence or absence of 43 oligonucleotide spacers separated by conserved DRs. This method, called spoligotyping, consists of PCR amplification of the DR chromosomal region and recognition after hybridization of the spacers that are present. In the workflow underlying this methodology, the PCR products are applied to a membrane containing synthetic oligonucleotides that have complementary sequences to the spacer sequences. Lack of hybridization of the PCR products to a specific oligonucleotide sequence indicates absence of the corresponding spacer sequence in the examined strain. Spoligotyping gained great notoriety as a robust identification and typing tool for members of MTBC, enabling multiple epidemiological studies on human and animal tuberculosis.
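
    The presence/absence pattern of the 43 spacers is commonly reported as a 15-digit octal code. The sketch below shows that conversion under the usual convention (spacers 1 to 42 read as fourteen 3-bit groups, spacer 43 appended as a single 0 or 1). The abstract does not describe this encoding, so the function name and the example patterns are illustrative assumptions.

```python
# Sketch: convert a 43-spacer presence/absence pattern into the 15-digit
# octal code commonly used to report spoligotypes. Example patterns are
# hypothetical; "1" means the spacer hybridized, "0" means it was absent.

def spoligotype_to_octal(pattern):
    """Encode a 43-character string of '0'/'1' as a 15-digit octal code."""
    if len(pattern) != 43 or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be 43 characters of '0' or '1'")
    # Spacers 1-42: fourteen groups of three bits, each read as one octal digit.
    digits = [str(int(pattern[i:i + 3], 2)) for i in range(0, 42, 3)]
    # Spacer 43 stands alone and is written as 0 or 1.
    digits.append(pattern[42])
    return "".join(digits)


all_present = "1" * 43
only_last_nine = "0" * 34 + "1" * 9  # spacers 35-43 present, 1-34 absent

print(spoligotype_to_octal(all_present))
print(spoligotype_to_octal(only_last_nine))
```

    Two strains with the same code share the same set of hybridizing spacers, which is what makes the code convenient for comparing isolates across studies.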

  13. Occurrence of Carbapenemase-Producing Enterobacteriaceae Isolates in the Wildlife: First Report of OXA-48 in Wild Boars in Algeria.

    PubMed

    Bachiri, Taous; Bakour, Sofiane; Lalaoui, Rym; Belkebla, Nadia; Allouache, Meriem; Rolain, Jean Marc; Touati, Abdelaziz

    2018-04-01

    The aim of the present study was to screen for the presence of carbapenemase-producing Enterobacteriaceae (CPE) isolates from wild boars and Barbary macaques in Algeria. Fecal samples were collected from wild boars (n = 168) and Barbary macaques (n = 212), in Bejaia, Algeria, between September 2014 and April 2016. The isolates were identified and antimicrobial susceptibility was determined. Carbapenem resistance determinants were studied using PCR and sequencing, while clonal relatedness was assessed using multilocus sequence typing (MLST). PCR was used to investigate certain virulence genes. Three CPE isolates from three different samples (1.8%) recovered from wild boars were identified as Escherichia coli (two isolates) and Klebsiella pneumoniae (one isolate). These isolates were resistant to amoxicillin, amoxicillin-clavulanate, tobramycin, ertapenem, and meropenem. The results of PCR and sequencing analysis showed that all three isolates produced the OXA-48 enzyme. The MLST showed that the two E. coli isolates were assigned to the same sequence type, ST635, and belonged to phylogroup A, whereas the K. pneumoniae strain belonged to ST13. The K. pneumoniae strain was positive for multiple virulence factors, whereas no virulence determinants were found in the E. coli isolates. This is the first report of OXA-48-producing Enterobacteriaceae in wild animals from Algeria and Africa.

  14. Phylo: A Citizen Science Approach for Improving Multiple Sequence Alignment

    PubMed Central

    Kam, Alfred; Kwak, Daniel; Leung, Clarence; Wu, Chu; Zarour, Eleyine; Sarmenta, Luis; Blanchette, Mathieu; Waldispühl, Jérôme

    2012-01-01

    Background: Comparative genomics, or the study of the relationships of genome structure and function across different species, offers a powerful tool for studying evolution, annotating genomes, and understanding the causes of various genetic disorders. However, aligning multiple sequences of DNA, an essential intermediate step for most types of analyses, is a difficult computational task. In parallel, citizen science, an approach that takes advantage of the fact that the human brain is exquisitely tuned to solving specific types of problems, is becoming increasingly popular. There, instances of hard computational problems are dispatched to a crowd of non-expert human game players and solutions are sent back to a central server. Methodology/Principal Findings: We introduce Phylo, a human-based computing framework applying “crowd sourcing” techniques to solve the Multiple Sequence Alignment (MSA) problem. The key idea of Phylo is to convert the MSA problem into a casual game that can be played by ordinary web users with minimal prior knowledge of the biological context. We applied this strategy to improve the alignment of the promoters of disease-related genes from up to 44 vertebrate species. Since the launch in November 2010, we have received more than 350,000 solutions submitted by more than 12,000 registered users. Our results show that the submitted solutions contributed to improving the accuracy of up to 70% of the alignment blocks considered. Conclusions/Significance: We demonstrate that, combined with classical algorithms, crowd computing techniques can be successfully used to help improve the accuracy of MSA. More importantly, we show that an NP-hard computational problem can be embedded in a casual game that can be easily played by people without significant scientific training. This suggests that citizen science approaches can be used to exploit the billions of “human-brain peta-flops” of computation that are spent every day playing games.
Phylo is available at: http://phylo.cs.mcgill.ca. PMID:22412834
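
    To make concrete what it means for one candidate alignment to score better than another, the sketch below computes a simple sum-of-pairs score. This is a generic textbook objective, not Phylo's actual scoring function; the match, mismatch, and gap values and the example alignments are assumptions chosen for illustration.

```python
# Sketch of a sum-of-pairs score for comparing candidate multiple sequence
# alignments. This is a generic objective, not Phylo's scoring scheme; the
# match/mismatch/gap values and example alignments are assumptions.
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-1):
    """Score an alignment given as equal-length strings with '-' for gaps."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue          # two gaps facing each other score nothing
            if a == "-" or b == "-":
                total += gap      # residue aligned against a gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


candidate_1 = ["ACGT", "ACGT", "AC-T"]
candidate_2 = ["ACGT", "ACGT", "A-CT"]

print(sum_of_pairs(candidate_1), sum_of_pairs(candidate_2))
```

    Under these assumed parameters the first candidate scores higher because its gap placement leaves more identical residues in the same columns.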

  15. Staphylococcus aureus activates type I IFN signaling in mice and humans through the Xr repeated sequences of protein A

    PubMed Central

    Martin, Francis J.; Gomez, Marisa I.; Wetzel, Dawn M.; Memmi, Guido; O’Seaghdha, Maghnus; Soong, Grace; Schindler, Christian; Prince, Alice

    2009-01-01

    The activation of type I IFN signaling is a major component of host defense against viral infection, but it is not typically associated with immune responses to extracellular bacterial pathogens. Using mouse and human airway epithelial cells, we have demonstrated that Staphylococcus aureus activates type I IFN signaling, which contributes to its virulence as a respiratory pathogen. This response was dependent on the expression of protein A and, more specifically, the Xr domain, a short sequence–repeat region encoded by DNA that consists of repeated 24-bp sequences that are the basis of an internationally used epidemiological typing scheme. Protein A was endocytosed by airway epithelial cells and subsequently induced IFN-β expression, JAK-STAT signaling, and IL-6 production. Mice lacking IFN-α/β receptor 1 (IFNAR-deficient mice), which are incapable of responding to type I IFNs, were substantially protected against lethal S. aureus pneumonia compared with wild-type control mice. The profound immunological consequences of IFN-β signaling, particularly in the lung, may help to explain the conservation of multiple copies of the Xr domain of protein A in S. aureus strains and the importance of protein A as a virulence factor in the pathogenesis of staphylococcal pneumonia. PMID:19603548

  16. Monoterpene and sesquiterpene synthases and the origin of terpene skeletal diversity in plants.

    PubMed

    Degenhardt, Jörg; Köllner, Tobias G; Gershenzon, Jonathan

    2009-01-01

    The multitude of terpene carbon skeletons in plants is formed by enzymes known as terpene synthases. This review covers the monoterpene and sesquiterpene synthases presenting an up-to-date list of enzymes reported and evidence for their ability to form multiple products. The reaction mechanisms of these enzyme classes are described, and information on how terpene synthase proteins mediate catalysis is summarized. Correlations between specific amino acid motifs and terpene synthase function are described, including an analysis of the relationships between active site sequence and cyclization type and a discussion of whether specific protein features might facilitate multiple product formation.

  17. A Complete Developmental Sequence of a Drosophila Neuronal Lineage as Revealed by Twin-Spot MARCM

    PubMed Central

    He, Yisheng; Ding, Peng; Kao, Jui-Chun; Lee, Tzumin

    2010-01-01

    Drosophila brains contain numerous neurons that form complex circuits. These neurons are derived in stereotyped patterns from a fixed number of progenitors, called neuroblasts, and identifying individual neurons made by a neuroblast facilitates the reconstruction of neural circuits. An improved MARCM (mosaic analysis with a repressible cell marker) technique, called twin-spot MARCM, allows one to label the sister clones derived from a common progenitor simultaneously in different colors. It enables identification of every single neuron in an extended neuronal lineage based on the order of neuron birth. Here we report the first example, to our knowledge, of complete lineage analysis among neurons derived from a common neuroblast that relay olfactory information from the antennal lobe (AL) to higher brain centers. By identifying the sequentially derived neurons, we found that the neuroblast serially makes 40 types of AL projection neurons (PNs). During embryogenesis, one PN with multi-glomerular innervation and 18 uniglomerular PNs targeting 17 glomeruli of the adult AL are born. Many more PNs of 22 additional types, including four types of polyglomerular PNs, derive after the neuroblast resumes dividing in early larvae. Although different offspring are generated in a rather arbitrary sequence, the birth order strictly dictates the fate of each post-mitotic neuron, including the fate of programmed cell death. Notably, the embryonic progenitor has an altered temporal identity following each self-renewing asymmetric cell division. After larval hatching, the same progenitor produces multiple neurons for each cell type, but the number of neurons for each type is tightly regulated. These observations substantiate the origin-dependent specification of neuron types. Sequencing neuronal lineages will not only unravel how a complex brain develops but also permit systematic identification of neuron types for detailed structure and function analysis of the brain. 
PMID:20808769

  18. Diversity of Group I and II Clostridium botulinum Strains from France Including Recently Identified Subtypes

    PubMed Central

    Mazuet, Christelle; Legeay, Christine; Sautereau, Jean; Ma, Laurence; Bouchier, Christiane; Bouvet, Philippe; Popoff, Michel R.

    2016-01-01

    In France, human botulism is mainly a food-borne intoxication, whereas infant botulism is rare. A total of 99 group I and II Clostridium botulinum strains including 59 type A (12 historical isolates [1947–1961], 43 from France [1986–2013], 3 from other countries, and 1 collection strain), 31 type B (3 historical, 23 recent isolates, 4 from other countries, and 1 collection strain), and 9 type E (5 historical, 3 isolates, and 1 collection strain) were investigated by botulinum locus gene sequencing and multilocus sequence typing analysis. Historical C. botulinum A strains mainly belonged to subtype A1 and sequence type (ST) 1, whereas recent strains exhibited a wide genetic diversity: subtype A1 in orfX or ha locus, A1(B), A1(F), A2, A2b2, A5(B2′), A5(B3′), as well as the recently identified A7 and A8 subtypes, and were distributed into 25 STs. Clostridium botulinum A1(B) was the most frequent subtype from food-borne botulism and food. Group I C. botulinum type B in France were mainly subtype B2 (14 out of 20 historical and recent strains) and were divided into 19 STs. Food-borne botulism resulting from ham consumption during the recent period was due to group II C. botulinum B4. Type E botulism is rare in France; 5 historical strains and 1 recent strain were subtype E3. A subtype E12 was recently identified from an unusual ham contamination. Clostridium botulinum strains from human botulism in France showed a wide genetic diversity and seem to result not from a single evolutionary lineage but from multiple and independent genetic rearrangements. PMID:27189984

  19. Evaluation of simultaneous binding of Chromomycin A3 to the multiple sites of DNA by the new restriction enzyme assay.

    PubMed

    Murase, Hirotaka; Noguchi, Tomoharu; Sasaki, Shigeki

    2018-06-01

    Chromomycin A3 (CMA3) is an aureolic acid-type antitumor antibiotic. CMA3 forms dimeric complexes with divalent cations, such as Mg2+, which strongly bind to the GC-rich sequence of DNA to inhibit DNA replication and transcription. In this study, the binding property of CMA3 to the DNA sequence containing multiple GC-rich binding sites was investigated by measuring the protection from hydrolysis by the restriction enzymes, AccII and Fnu4HI, for the center of the CGCG site and the 5'-GC↓GGC site, respectively. In contrast to the standard DNase I footprinting method, the DNA substrates are fully hydrolyzed by the restriction enzymes; therefore, the full protection of DNA at all the cleavable sites indicates that CMA3 simultaneously binds to all the binding sites. The restriction enzyme assay has suggested that CMA3 has a high tendency to bind the successive CGCG sites and the CGG repeat. Copyright © 2018 Elsevier Ltd. All rights reserved.

  20. Burkholderia: an update on taxonomy and biotechnological potential as antibiotic producers.

    PubMed

    Depoorter, Eliza; Bull, Matt J; Peeters, Charlotte; Coenye, Tom; Vandamme, Peter; Mahenthiralingam, Eshwar

    2016-06-01

    Burkholderia is an incredibly diverse and versatile Gram-negative genus, within which over 80 species have been formally named and multiple other genotypic groups likely represent new species. Phylogenetic analysis based on the 16S rRNA gene sequence and core genome ribosomal multilocus sequence typing analysis indicates the presence of at least three major clades within the genus. Biotechnologically, Burkholderia are well-known for their bioremediation and biopesticidal properties. Within this review, we explore the ability of Burkholderia to synthesise a wide range of antimicrobial compounds ranging from historically characterised antifungals to recently described antibacterial antibiotics with activity against multiresistant clinical pathogens. The production of multiple Burkholderia antibiotics is controlled by quorum sensing and examples of quorum sensing pathways found across the genus are discussed. The capacity for antibiotic biosynthesis and secondary metabolism encoded within Burkholderia genomes is also evaluated. Overall, Burkholderia demonstrate significant biotechnological potential as a source of novel antibiotics and bioactive secondary metabolites.

  1. Bellerophon: a program to detect chimeric sequences in multiple sequence alignments.

    PubMed

    Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip

    2004-09-22

    Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments. Bellerophon is available as an interactive web server at http://foo.maths.uq.edu.au/~huber/bellerophon.pl

  2. Human immunodeficiency virus type 1 LTR TATA and TAR region sequences required for transcriptional regulation.

    PubMed Central

    Garcia, J A; Harrich, D; Soultanakis, E; Wu, F; Mitsuyasu, R; Gaynor, R B

    1989-01-01

    The human immunodeficiency virus (HIV) type 1 LTR is regulated at the transcriptional level by both cellular and viral proteins. Using HeLa cell extracts, multiple regions of the HIV LTR were found to serve as binding sites for cellular proteins. An untranslated region binding protein UBP-1 has been purified and fractions containing this protein bind to both the TAR and TATA regions. To investigate the role of cellular proteins binding to both the TATA and TAR regions and their potential interaction with other HIV DNA binding proteins, oligonucleotide-directed mutagenesis of both these regions was performed followed by DNase I footprinting and transient expression assays. In the TATA region, two direct repeats TC/AAGC/AT/AGCTGC surround the TATA sequence. Mutagenesis of both of these direct repeats or of the TATA sequence interrupted binding over the TATA region on the coding strand, but only a mutation of the TATA sequence affected in vivo assays for tat-activation. In addition to TAR serving as the site of binding of cellular proteins, RNA transcribed from TAR is capable of forming a stable stem-loop structure. To determine the relative importance of DNA binding proteins as compared to secondary structure, oligonucleotide-directed mutations in the TAR region were studied. Local mutations that disrupted either the stem or loop structure were defective in gene expression. However, compensatory mutations which restored base pairing in the stem resulted in complete tat-activation. This indicated a significant role for the stem-loop structure in HIV gene expression. To determine the role of TAR binding proteins, mutations were constructed which extensively changed the primary structure of the TAR region, yet left stem base pairing, stem energy and the loop sequence intact. These mutations resulted in decreased protein binding to TAR DNA and defects in tat-activation, and revealed factor binding specifically to the loop DNA sequence. 
Further mutagenesis which inverted this stem and loop mutation relative to the HIV LTR mRNA start site resulted in even larger decreases in tat-activation. This suggests that multiple determinants, including protein binding, the loop sequence, and RNA or DNA secondary structure, are important in tat-activation and suggests that tat may interact with cellular proteins binding to DNA to increase HIV gene expression. PMID:2721501

  3. DNA Multiple Sequence Alignment Guided by Protein Domains: The MSA-PAD 2.0 Method.

    PubMed

    Balech, Bachir; Monaco, Alfonso; Perniola, Michele; Santamaria, Monica; Donvito, Giacinto; Vicario, Saverio; Maggi, Giorgio; Pesole, Graziano

    2018-01-01

    Multiple sequence alignment (MSA) is a fundamental component in many DNA sequence analyses including metagenomics studies and phylogeny inference. When guided by protein profiles, DNA multiple alignments assume a higher precision and robustness. Here we present details of the use of the upgraded version of MSA-PAD (2.0), which is a DNA multiple sequence alignment framework able to align DNA sequences coding for single/multiple protein domains guided by PFAM or user-defined annotations. MSA-PAD has two alignment strategies, called "Gene" and "Genome," accounting for coding domain order and genomic rearrangements, respectively. Novel options were added to the present version, where the MSA can be guided by protein profiles provided by the user. This allows MSA-PAD 2.0 to run faster and to add custom protein profiles sometimes not present in the PFAM database according to the user's interest. MSA-PAD 2.0 is currently freely available as a Web application at https://recasgateway.cloud.ba.infn.it/ .
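    The general idea of letting a protein-level alignment guide a DNA alignment can be illustrated with a minimal Python sketch. This is not MSA-PAD's implementation: the function name `thread_codons` and the toy sequences are hypothetical, and the sketch assumes that each DNA sequence is in frame and that a gapped protein alignment is already available.

```python
def thread_codons(aligned_protein: str, dna: str, gap: str = "-") -> str:
    """Project one gapped protein row back onto its coding DNA, codon by codon.

    Assumption for this sketch: `dna` is in frame and encodes exactly the
    ungapped residues of `aligned_protein` (no stop codon, no frameshifts).
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(dna) != 3 * n_residues:
        raise ValueError("DNA length must be 3x the number of ungapped residues")

    out = []
    pos = 0
    for aa in aligned_protein:
        if aa == gap:
            out.append(gap * 3)           # a protein gap becomes a three-base gap
        else:
            out.append(dna[pos:pos + 3])  # copy the codon for this residue
            pos += 3
    return "".join(out)


# Hypothetical toy input: two protein rows that are already aligned.
protein_rows = {"seqA": "MK-L", "seqB": "MKTL"}
dna_seqs = {"seqA": "ATGAAACTG", "seqB": "ATGAAGACCCTG"}

codon_alignment = {
    name: thread_codons(protein_rows[name], dna_seqs[name])
    for name in protein_rows
}
```

    Because gaps are only ever inserted in multiples of three, the resulting DNA alignment keeps the reading frame intact, which is one reason protein-guided DNA alignments tend to be more robust than aligning the nucleotides directly.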

  4. Human Immunodeficiency Viruses Appear Compartmentalized to the Female Genital Tract in Cross-Sectional Analyses but Genital Lineages Do Not Persist Over Time

    PubMed Central

    Bull, Marta E.; Heath, Laura M.; McKernan-Mullin, Jennifer L.; Kraft, Kelli M.; Acevedo, Luis; Hitti, Jane E.; Cohn, Susan E.; Tapia, Kenneth A.; Holte, Sarah E.; Dragavon, Joan A.; Coombs, Robert W.; Mullins, James I.; Frenkel, Lisa M.

    2013-01-01

    Background. Whether unique human immunodeficiency virus type 1 (HIV) genotypes occur in the genital tract is important for vaccine development and management of drug resistant viruses. Multiple cross-sectional studies suggest HIV is compartmentalized within the female genital tract. We hypothesize that bursts of HIV replication and/or proliferation of infected cells captured in cross-sectional analyses drive compartmentalization but over time genital-specific viral lineages do not form; rather viruses mix between genital tract and blood. Methods. Eight women with ongoing HIV replication were studied during a period of 1.5 to 4.5 years. Multiple viral sequences were derived by single-genome amplification of the HIV C2-V5 region of env from genital secretions and blood plasma. Maximum likelihood phylogenies were evaluated for compartmentalization using 4 statistical tests. Results. In cross-sectional analyses compartmentalization of genital from blood viruses was detected in three of eight women by all tests; this was associated with tissue-specific clades containing multiple monotypic sequences. In longitudinal analysis, the tissue-specific clades did not persist to form viral lineages. Rather, across women, HIV lineages were comprised of both genital tract and blood sequences. Conclusions. The observation of genital-specific HIV clades only in cross-sectional analysis and an absence of genital-specific lineages in longitudinal analyses suggest a dynamic interchange of HIV variants between the female genital tract and blood. PMID:23315326

  5. Unraveling the genetic diversity and phylogeny of Leishmania RNA virus 1 strains of infected Leishmania isolates circulating in French Guiana.

    PubMed

    Tirera, Sourakhata; Ginouves, Marine; Donato, Damien; Caballero, Ignacio S; Bouchier, Christiane; Lavergne, Anne; Bourreau, Eliane; Mosnier, Emilie; Vantilcke, Vincent; Couppié, Pierre; Prevot, Ghislaine; Lacoste, Vincent

    2017-07-01

    Leishmania RNA virus type 1 (LRV1) is an endosymbiont of some Leishmania (Vianna) species in South America. Presence of LRV1 in parasites exacerbates disease severity in animal models and humans, related to a disproportioned innate immune response, and is correlated with drug treatment failures in humans. Although the virus was identified decades ago, its genomic diversity has been overlooked until now. We subjected LRV1 strains from 19 L. (V.) guyanensis and one L. (V.) braziliensis isolates obtained from cutaneous leishmaniasis samples identified throughout French Guiana with next-generation sequencing and de novo sequence assembly. We generated and analyzed 24 unique LRV1 sequences over their full-length coding regions. Multiple alignment of these new sequences revealed variability (0.5%-23.5%) across the entire sequence except for highly conserved motifs within the 5' untranslated region. Phylogenetic analyses showed that viral genomes of L. (V.) guyanensis grouped into five distinct clusters. They further showed a species-dependent clustering between viral genomes of L. (V.) guyanensis and L. (V.) braziliensis, confirming a long-term co-evolutionary history. Notably, we identified cases of multiple LRV1 infections in three of the 20 Leishmania isolates. Here, we present the first-ever estimate of LRV1 genomic diversity that exists in Leishmania (V.) guyanensis parasites. Genetic characterization and phylogenetic analyses of these viruses have shed light on their evolutionary relationships. To our knowledge, this study is also the first to report cases of multiple LRV1 infections in some parasites. Finally, this work has made it possible to develop molecular tools for adequate identification and genotyping of LRV1 strains for diagnostic purposes. Given the suspected worsening role of LRV1 infection in the pathogenesis of human leishmaniasis, these data have a major impact from a clinical viewpoint and for the management of Leishmania-infected patients.

  6. Unraveling the genetic diversity and phylogeny of Leishmania RNA virus 1 strains of infected Leishmania isolates circulating in French Guiana

    PubMed Central

    Caballero, Ignacio S.; Bouchier, Christiane; Lavergne, Anne; Bourreau, Eliane; Mosnier, Emilie; Vantilcke, Vincent; Couppié, Pierre; Prevot, Ghislaine

    2017-01-01

    Introduction: Leishmania RNA virus type 1 (LRV1) is an endosymbiont of some Leishmania (Vianna) species in South America. Presence of LRV1 in parasites exacerbates disease severity in animal models and humans, related to a disproportioned innate immune response, and is correlated with drug treatment failures in humans. Although the virus was identified decades ago, its genomic diversity has been overlooked until now. Methodology/Principal findings: We subjected LRV1 strains from 19 L. (V.) guyanensis and one L. (V.) braziliensis isolates obtained from cutaneous leishmaniasis samples identified throughout French Guiana with next-generation sequencing and de novo sequence assembly. We generated and analyzed 24 unique LRV1 sequences over their full-length coding regions. Multiple alignment of these new sequences revealed variability (0.5%–23.5%) across the entire sequence except for highly conserved motifs within the 5’ untranslated region. Phylogenetic analyses showed that viral genomes of L. (V.) guyanensis grouped into five distinct clusters. They further showed a species-dependent clustering between viral genomes of L. (V.) guyanensis and L. (V.) braziliensis, confirming a long-term co-evolutionary history. Notably, we identified cases of multiple LRV1 infections in three of the 20 Leishmania isolates. Conclusions/Significance: Here, we present the first-ever estimate of LRV1 genomic diversity that exists in Leishmania (V.) guyanensis parasites. Genetic characterization and phylogenetic analyses of these viruses have shed light on their evolutionary relationships. To our knowledge, this study is also the first to report cases of multiple LRV1 infections in some parasites. Finally, this work has made it possible to develop molecular tools for adequate identification and genotyping of LRV1 strains for diagnostic purposes. Given the suspected worsening role of LRV1 infection in the pathogenesis of human leishmaniasis, these data have a major impact from a clinical viewpoint and for the management of Leishmania-infected patients. PMID:28715422

  7. Longitudinal stability of MRI for mapping brain change using tensor-based morphometry.

    PubMed

    Leow, Alex D; Klunder, Andrea D; Jack, Clifford R; Toga, Arthur W; Dale, Anders M; Bernstein, Matt A; Britson, Paula J; Gunter, Jeffrey L; Ward, Chadwick P; Whitwell, Jennifer L; Borowski, Bret J; Fleisher, Adam S; Fox, Nick C; Harvey, Danielle; Kornak, John; Schuff, Norbert; Studholme, Colin; Alexander, Gene E; Weiner, Michael W; Thompson, Paul M

    2006-06-01

    Measures of brain change can be computed from sequential MRI scans, providing valuable information on disease progression, e.g., for patient monitoring and drug trials. Tensor-based morphometry (TBM) creates maps of these brain changes, visualizing the 3D profile and rates of tissue growth or atrophy, but its sensitivity depends on the contrast and geometric stability of the images. As part of the Alzheimer's Disease Neuroimaging Initiative (ADNI), 17 normal elderly subjects were scanned twice (at a 2-week interval) with several 3D 1.5 T MRI pulse sequences: high and low flip angle SPGR/FLASH (from which Synthetic T1 images were generated), MP-RAGE, IR-SPGR (N = 10) and MEDIC (N = 7) scans. For each subject and scan type, a 3D deformation map aligned baseline and follow-up scans, computed with a nonlinear, inverse-consistent elastic registration algorithm. Voxelwise statistics, in ICBM stereotaxic space, visualized the profile of mean absolute change and its cross-subject variance; these maps were then compared using permutation testing. Image stability depended on: (1) the pulse sequence; (2) the transmit/receive coil type (birdcage versus phased array); (3) spatial distortion corrections (using MEDIC sequence information); (4) B1-field intensity inhomogeneity correction (using N3). SPGR/FLASH images acquired using a birdcage coil had least overall deviation. N3 correction reduced coil type and pulse sequence differences and improved scan reproducibility, except for Synthetic T1 images (which were intrinsically corrected for B1-inhomogeneity). No strong evidence favored B0 correction. Although SPGR/FLASH images showed least deviation here, pulse sequence selection for the ADNI project was based on multiple additional image analyses, to be reported elsewhere.

  8. Phylogenetic and environmental diversity of DsrAB-type dissimilatory (bi)sulfite reductases

    PubMed Central

    Müller, Albert Leopold; Kjeldsen, Kasper Urup; Rattei, Thomas; Pester, Michael; Loy, Alexander

    2015-01-01

    The energy metabolism of essential microbial guilds in the biogeochemical sulfur cycle is based on a DsrAB-type dissimilatory (bi)sulfite reductase that either catalyzes the reduction of sulfite to sulfide during anaerobic respiration of sulfate, sulfite and organosulfonates, or acts in reverse during sulfur oxidation. Common use of dsrAB as a functional marker showed that dsrAB richness in many environments is dominated by novel sequence variants and collectively represents an extensive, largely uncharted sequence assemblage. Here, we established a comprehensive, manually curated dsrAB/DsrAB database and used it to categorize the known dsrAB diversity, reanalyze the evolutionary history of dsrAB and evaluate the coverage of published dsrAB-targeted primers. Based on a DsrAB consensus phylogeny, we introduce an operational classification system for environmental dsrAB sequences that integrates established taxonomic groups with operational taxonomic units (OTUs) at multiple phylogenetic levels, ranging from DsrAB enzyme families that reflect reductive or oxidative DsrAB types of bacterial or archaeal origin, superclusters, uncultured family-level lineages to species-level OTUs. Environmental dsrAB sequences constituted at least 13 stable family-level lineages without any cultivated representatives, suggesting that major taxa of sulfite/sulfate-reducing microorganisms have not yet been identified. Three of these uncultured lineages occur mainly in marine environments, while specific habitat preferences are not evident for members of the other 10 uncultured lineages. In summary, our publicly available dsrAB/DsrAB database, the phylogenetic framework, the multilevel classification system and a set of recommended primers provide a necessary foundation for large-scale dsrAB ecology studies with next-generation sequencing methods. PMID:25343514

  9. Longitudinal stability of MRI for mapping brain change using tensor-based morphometry

    PubMed Central

    Leow, Alex D.; Klunder, Andrea D.; Jack, Clifford R.; Toga, Arthur W.; Dale, Anders M.; Bernstein, Matt A.; Britson, Paula J.; Gunter, Jeffrey L.; Ward, Chadwick P.; Whitwell, Jennifer L.; Borowski, Bret J.; Fleisher, Adam S.; Fox, Nick C.; Harvey, Danielle; Kornak, John; Schuff, Norbert; Studholme, Colin; Alexander, Gene E.; Weiner, Michael W.; Thompson, Paul M.

    2007-01-01

    Measures of brain change can be computed from sequential MRI scans, providing valuable information on disease progression, e.g., for patient monitoring and drug trials. Tensor-based morphometry (TBM) creates maps of these brain changes, visualizing the 3D profile and rates of tissue growth or atrophy, but its sensitivity depends on the contrast and geometric stability of the images. As part of the Alzheimer’s Disease Neuroimaging Initiative (ADNI), 17 normal elderly subjects were scanned twice (at a 2-week interval) with several 3D 1.5 T MRI pulse sequences: high and low flip angle SPGR/FLASH (from which Synthetic T1 images were generated), MP-RAGE, IR-SPGR (N = 10) and MEDIC (N = 7) scans. For each subject and scan type, a 3D deformation map aligned baseline and follow-up scans, computed with a nonlinear, inverse-consistent elastic registration algorithm. Voxelwise statistics, in ICBM stereotaxic space, visualized the profile of mean absolute change and its cross-subject variance; these maps were then compared using permutation testing. Image stability depended on: (1) the pulse sequence; (2) the transmit/receive coil type (birdcage versus phased array); (3) spatial distortion corrections (using MEDIC sequence information); (4) B1-field intensity inhomogeneity correction (using N3). SPGR/FLASH images acquired using a birdcage coil had least overall deviation. N3 correction reduced coil type and pulse sequence differences and improved scan reproducibility, except for Synthetic T1 images (which were intrinsically corrected for B1-inhomogeneity). No strong evidence favored B0 correction. Although SPGR/FLASH images showed least deviation here, pulse sequence selection for the ADNI project was based on multiple additional image analyses, to be reported elsewhere. PMID:16480900

  10. Effective application of multiple locus variable number of tandem repeats analysis to tracing Staphylococcus aureus in food-processing environment.

    PubMed

    Rešková, Z; Koreňová, J; Kuchta, T

    2014-04-01

    A total of 256 isolates of Staphylococcus aureus were isolated from 98 samples (34 swabs and 64 food samples) obtained from small or medium meat- and cheese-processing plants in Slovakia. The strains were genotypically characterized by multiple locus variable number of tandem repeats analysis (MLVA), involving multiplex polymerase chain reaction (PCR) with subsequent separation of the amplified DNA fragments by an automated flow-through gel electrophoresis. With the panel of isolates, MLVA produced 31 profile types, which was a sufficient discrimination to facilitate the description of spatial and temporal aspects of contamination. Further data on MLVA discrimination were obtained by typing a subpanel of strains by multiple locus sequence typing (MLST). MLVA coupled to automated electrophoresis proved to be an effective, comparatively fast and inexpensive method for tracing S. aureus contamination of food-processing factories. Subspecies genotyping of microbial contaminants in food-processing factories may facilitate identification of spatial and temporal aspects of the contamination. This may help to properly manage the process hygiene. With S. aureus, multiple locus variable number of tandem repeats analysis (MLVA) proved to be an effective method for the purpose, being sufficiently discriminative, yet comparatively fast and inexpensive. The application of automated flow-through gel electrophoresis to separation of DNA fragments produced by multiplex PCR helped to improve the accuracy and speed of the method. © 2013 The Society for Applied Microbiology.

  11. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment

    PubMed Central

    2013-01-01

    Background: Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so-called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results: In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion: Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
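    The step of approximating edit distance between segments from p-mer frequency counts can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the Manhattan (L1) distance between count profiles is one plausible choice of measure, and the stream-clustering stage is not shown.

```python
from collections import Counter


def pmer_counts(segment: str, p: int = 3) -> Counter:
    """Count every overlapping p-mer (substring of length p) in a segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))


def pmer_distance(a: str, b: str, p: int = 3) -> int:
    """Manhattan distance between the p-mer count profiles of two segments.

    Assumption for this sketch: this cheap, alignment-free measure is used
    as a stand-in for edit distance. A single substitution changes at most
    p overlapping p-mers in each segment, so it moves the distance by at
    most 2 * p.
    """
    ca, cb = pmer_counts(a, p), pmer_counts(b, p)
    # Counter returns 0 for missing keys, so the union of keys is safe.
    return sum(abs(ca[k] - cb[k]) for k in ca.keys() | cb.keys())


# Two toy segments that differ by a single substitution (A -> T).
d_same = pmer_distance("ACGTACGT", "ACGTACGT")
d_one_sub = pmer_distance("ACGTACGT", "ACGTTCGT")
```

    Building each profile takes time linear in the segment length, and comparing two profiles needs no dynamic-programming table, which is consistent with the linear scaling the abstract reports.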

  12. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

    PubMed

    Nagar, Anurag; Hahsler, Michael

    2013-01-01

    Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so-called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.

  13. The detection and phylogenetic analysis of the alkane 1-monooxygenase gene of members of the genus Rhodococcus.

    PubMed

    Táncsics, András; Benedek, Tibor; Szoboszlay, Sándor; Veres, Péter G; Farkas, Milán; Máthé, István; Márialigeti, Károly; Kukolya, József; Lányi, Szabolcs; Kriszt, Balázs

    2015-02-01

    Naturally occurring and anthropogenic petroleum hydrocarbons are potential carbon sources for many bacteria. The AlkB-related alkane hydroxylases, which are integral membrane non-heme iron enzymes, play a key role in the microbial degradation of many of these hydrocarbons. Several members of the genus Rhodococcus are well-known alkane degraders and are known to harbor multiple alkB genes encoding for different alkane 1-monooxygenases. In the present study, 48 Rhodococcus strains, representing 35 species of the genus, were investigated to find out whether there was a dominant type of alkB gene widespread among species of the genus that could be used as a phylogenetic marker. Phylogenetic analysis of rhodococcal alkB gene sequences indicated that a certain type of alkB gene was present in almost every member of the genus Rhodococcus. These alkB genes were common in a unique nucleotide sequence stretch absent from other types of rhodococcal alkB genes that encoded a conserved amino acid motif: WLG(I/V/L)D(G/D)GL. The sequence identity of the targeted alkB gene in Rhodococcus ranged from 78.5 to 99.2% and showed higher nucleotide sequence variation at the inter-species level compared to the 16S rRNA gene (93.9-99.8%). The results indicated that the alkB gene type investigated might be applicable for: (i) differentiating closely related Rhodococcus species, (ii) properly assigning environmental isolates to existing Rhodococcus species, and finally (iii) assessing whether a new Rhodococcus isolate represents a novel species of the genus. Copyright © 2014 Elsevier GmbH. All rights reserved.
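    The conserved motif WLG(I/V/L)D(G/D)GL reported above maps directly onto a regular expression, where each parenthesised set of alternatives becomes a character class. The following Python sketch shows how such a motif could be located in a protein sequence; the helper name `find_motif` and the toy sequence are hypothetical and are not taken from the study.

```python
import re

# WLG(I/V/L)D(G/D)GL written as a regular expression:
# fixed residues are literals, alternatives become character classes.
ALKB_MOTIF = re.compile(r"WLG[IVL]D[GD]GL")


def find_motif(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start position, matched text) for every motif hit."""
    return [(m.start(), m.group()) for m in ALKB_MOTIF.finditer(protein.upper())]


# Hypothetical toy sequence containing two variants of the motif.
toy = "MSTWLGIDGGLAAHWLGVDDGLK"
hits = find_motif(toy)  # [(3, 'WLGIDGGL'), (14, 'WLGVDDGL')]
```

    Note that `finditer` reports non-overlapping matches only, which is sufficient here because the motif cannot overlap itself.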

  14. Clonality and distribution of clinical Ureaplasma isolates recovered from male patients and infertile couples in China

    PubMed Central

    Ruan, Zhi; Yang, Ting; Shi, Xinyan; Kong, Yingying; Xie, Xinyou

    2017-01-01

    Ureaplasma spp. have gained increasing recognition as pathogens in both adult and neonatal patients with multiple clinical presentations. However, the clonality of this organism in the male population and infertile couples in China is largely unknown. In this study, 96 (53 U. parvum and 43 U. urealyticum) of 103 Ureaplasma spp. strains recovered from genital specimens from male patients and 15 pairs of infertile couples were analyzed using multilocus sequence typing (MLST)/expanded multilocus sequence typing (eMLST) schemes. A total of 39 sequence types (STs) and 53 expanded sequence types (eSTs) were identified, with three predominant STs (ST1, ST9 and ST22) and eSTs (eST16, eST41 and eST82). Moreover, phylogenetic analysis revealed two distinct clusters that were highly congruent with the taxonomic differences between the two Ureaplasma species. We found significant differences in the distributions of both clusters and sub-groups between the male and female patients (P < 0.001). Moreover, 66.7% and 40.0% of the male and female partners of the infertile couples tested positive for Ureaplasma spp. The present study also attained excellent agreement of the identification of both Ureaplasma species between paired urine and semen specimens from the male partners (k > 0.80). However, this concordance was observed only for the detection of U. urealyticum within the infertile couples. In conclusion, the distributions of the clusters and sub-groups significantly differed between the male and female patients. U. urealyticum is more likely to transmit between infertile couples and be associated with clinical manifestations by the specific epidemic clonal lineages. PMID:28859153

  15. Clonality and distribution of clinical Ureaplasma isolates recovered from male patients and infertile couples in China.

    PubMed

    Ruan, Zhi; Yang, Ting; Shi, Xinyan; Kong, Yingying; Xie, Xinyou; Zhang, Jun

    2017-01-01

    Ureaplasma spp. have gained increasing recognition as pathogens in both adult and neonatal patients with multiple clinical presentations. However, the clonality of this organism in the male population and infertile couples in China is largely unknown. In this study, 96 (53 U. parvum and 43 U. urealyticum) of 103 Ureaplasma spp. strains recovered from genital specimens from male patients and 15 pairs of infertile couples were analyzed using multilocus sequence typing (MLST)/expanded multilocus sequence typing (eMLST) schemes. A total of 39 sequence types (STs) and 53 expanded sequence types (eSTs) were identified, with three predominant STs (ST1, ST9 and ST22) and eSTs (eST16, eST41 and eST82). Moreover, phylogenetic analysis revealed two distinct clusters that were highly congruent with the taxonomic differences between the two Ureaplasma species. We found significant differences in the distributions of both clusters and sub-groups between the male and female patients (P < 0.001). Moreover, 66.7% and 40.0% of the male and female partners of the infertile couples tested positive for Ureaplasma spp. The present study also attained excellent agreement of the identification of both Ureaplasma species between paired urine and semen specimens from the male partners (k > 0.80). However, this concordance was observed only for the detection of U. urealyticum within the infertile couples. In conclusion, the distributions of the clusters and sub-groups significantly differed between the male and female patients. U. urealyticum is more likely to transmit between infertile couples and be associated with clinical manifestations by the specific epidemic clonal lineages.

  16. DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors.

    PubMed

    Schmollinger, Martin; Nieselt, Kay; Kaufmann, Michael; Morgenstern, Burkhard

    2004-09-09

    Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic by splitting up sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. By distributing sub-routines to multiple processors, the running time of DIALIGN can be crucially improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
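    Strategy (a) above relies on the fact that each pairwise alignment is independent of all others. A minimal Python sketch of that idea follows. It is not DIALIGN P's code: the scoring function is a plain global-alignment stand-in (match +1, mismatch or gap -1) rather than DIALIGN's segment-based method, and all names are hypothetical.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from itertools import combinations


def pair_score(pair):
    """Stand-in pairwise step: global alignment score of two named sequences.

    Assumption for this sketch: match = +1, mismatch = -1, gap = -1.
    """
    (name_a, a), (name_b, b) = pair
    prev = [-j for j in range(len(b) + 1)]        # first row: gaps only
    for i in range(1, len(a) + 1):
        cur = [-i]                                # first column: gaps only
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (1 if a[i - 1] == b[j - 1] else -1)
            cur.append(max(diag, prev[j] - 1, cur[j - 1] - 1))
        prev = cur
    return (name_a, name_b), prev[-1]


def all_pairs(seqs: dict, executor: Executor) -> dict:
    """Distribute every sequence pair to the executor's workers.

    Pairs share no state, so the result does not depend on how many
    workers run or in which order the pairs finish.
    """
    pairs = combinations(seqs.items(), 2)
    return dict(executor.map(pair_score, pairs))


# Hypothetical toy input.
seqs = {"s1": "ACGT", "s2": "ACGA", "s3": "TTTT"}
with ThreadPoolExecutor(max_workers=2) as pool:
    scores = all_pairs(seqs, pool)
```

    The sketch takes any `concurrent.futures.Executor`, so a `ProcessPoolExecutor` can be passed instead of the thread pool when the pairwise step is CPU-bound pure Python and should be spread across several cores.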

  17. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, M.S.

    1998-08-18

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device. 27 figs.

  18. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

    2004-05-11

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  19. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    1998-08-18

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  20. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    2003-08-19

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  1. Mechanism of chimera formation during the Multiple Displacement Amplification reaction.

    PubMed

    Lasken, Roger S; Stockwell, Timothy B

    2007-04-12

    Multiple Displacement Amplification (MDA) is a method used for amplifying limiting DNA sources. The high molecular weight amplified DNA is ideal for DNA library construction. While this has enabled genomic sequencing from one or a few cells of unculturable microorganisms, the process is complicated by the tendency of MDA to generate chimeric DNA rearrangements in the amplified DNA. Determining the source of the DNA rearrangements would be an important step towards reducing or eliminating them. Here, we characterize the major types of chimeras formed by carrying out an MDA whole genome amplification from a single E. coli cell and sequencing by the 454 Life Sciences method. Analysis of 475 chimeras revealed the predominant reaction mechanisms that create the DNA rearrangements. The highly branched DNA synthesized in MDA can assume many alternative secondary structures. DNA strands extended on an initial template can be displaced becoming available to prime on a second template creating the chimeras. Evidence supports a model in which branch migration can displace 3'-ends freeing them to prime on the new templates. More than 85% of the resulting DNA rearrangements were inverted sequences with intervening deletions that the model predicts. Intramolecular rearrangements were favored, with displaced 3'-ends reannealing to single stranded 5'-strands contained within the same branched DNA molecule. In over 70% of the chimeric junctions, the 3' termini had initiated priming at complementary sequences of 2-21 nucleotides (nts) in the new templates. Formation of chimeras is an important limitation to the MDA method, particularly for whole genome sequencing. Identification of the mechanism for chimera formation provides new insight into the MDA reaction and suggests methods to reduce chimeras. The 454 sequencing approach used here will provide a rapid method to assess the utility of reaction modifications.

  2. Mechanism of chimera formation during the Multiple Displacement Amplification reaction

    PubMed Central

    Lasken, Roger S; Stockwell, Timothy B

    2007-01-01

    Background: Multiple Displacement Amplification (MDA) is a method used for amplifying limiting DNA sources. The high molecular weight amplified DNA is ideal for DNA library construction. While this has enabled genomic sequencing from one or a few cells of unculturable microorganisms, the process is complicated by the tendency of MDA to generate chimeric DNA rearrangements in the amplified DNA. Determining the source of the DNA rearrangements would be an important step towards reducing or eliminating them. Results: Here, we characterize the major types of chimeras formed by carrying out an MDA whole genome amplification from a single E. coli cell and sequencing by the 454 Life Sciences method. Analysis of 475 chimeras revealed the predominant reaction mechanisms that create the DNA rearrangements. The highly branched DNA synthesized in MDA can assume many alternative secondary structures. DNA strands extended on an initial template can be displaced, becoming available to prime on a second template and creating the chimeras. Evidence supports a model in which branch migration can displace 3'-ends, freeing them to prime on the new templates. More than 85% of the resulting DNA rearrangements were inverted sequences with intervening deletions that the model predicts. Intramolecular rearrangements were favored, with displaced 3'-ends reannealing to single-stranded 5'-strands contained within the same branched DNA molecule. In over 70% of the chimeric junctions, the 3' termini had initiated priming at complementary sequences of 2–21 nucleotides (nts) in the new templates. Conclusion: Formation of chimeras is an important limitation to the MDA method, particularly for whole genome sequencing. Identification of the mechanism for chimera formation provides new insight into the MDA reaction and suggests methods to reduce chimeras. The 454 sequencing approach used here will provide a rapid method to assess the utility of reaction modifications. PMID:17430586

  3. eShadow: A tool for comparing closely related sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ovcharenko, Ivan; Boffelli, Dario; Loots, Gabriela G.

    2004-01-15

    Primate sequence comparisons are difficult to interpret due to the high degree of sequence similarity shared between such closely related species. Recently, a novel method, phylogenetic shadowing, has been pioneered for predicting functional elements in the human genome through the analysis of multiple primate sequence alignments. We have expanded this theoretical approach to create a computational tool, eShadow, for the identification of elements under selective pressure in multiple sequence alignments of closely related genomes, such as in comparisons of human to primate or mouse to rat DNA. This tool integrates two different statistical methods and allows for the dynamic visualization of the resulting conservation profile. eShadow also includes a versatile optimization module capable of training the underlying Hidden Markov Model to differentially predict functional sequences. This module grants the tool high flexibility in the analysis of multiple sequence alignments and in comparing sequences with different divergence rates. Here, we describe the eShadow comparative tool and its potential uses for analyzing both multiple nucleotide and protein alignments to predict putative functional elements. The eShadow tool is publicly available at http://eshadow.dcode.org/

  4. Multiple Access Interference Reduction Using Received Response Code Sequence for DS-CDMA UWB System

    NASA Astrophysics Data System (ADS)

    Toh, Keat Beng; Tachikawa, Shin'ichi

    This paper proposes a combination of novel Received Response (RR) sequence at the transmitter and a Matched Filter-RAKE (MF-RAKE) combining scheme receiver system for the Direct Sequence-Code Division Multiple Access Ultra Wideband (DS-CDMA UWB) multipath channel model. This paper also demonstrates the effectiveness of the RR sequence in Multiple Access Interference (MAI) reduction for the DS-CDMA UWB system. It suggests that by using conventional binary code sequence such as the M sequence or the Gold sequence, there is a possibility of generating extra MAI in the UWB system. Therefore, it is quite difficult to collect the energy efficiently although the RAKE reception method is applied at the receiver. The main purpose of the proposed system is to overcome the performance degradation for UWB transmission due to the occurrence of MAI during multiple accessing in the DS-CDMA UWB system. The proposed system improves the system performance by improving the RAKE reception performance using the RR sequence which can reduce the MAI effect significantly. Simulation results verify that significant improvement can be obtained by the proposed system in the UWB multipath channel models.

  5. Synergy and contingency as driving forces for the evolution of multiple secondary metabolite production by Streptomyces species.

    PubMed

    Challis, Gregory L; Hopwood, David A

    2003-11-25

    In this article we briefly review theories about the ecological roles of microbial secondary metabolites and discuss the prevalence of multiple secondary metabolite production by strains of Streptomyces, highlighting results from analysis of the recently sequenced Streptomyces coelicolor and Streptomyces avermitilis genomes. We address this question: Why is multiple secondary metabolite production in Streptomyces species so commonplace? We argue that synergy or contingency in the action of individual metabolites against biological competitors may, in some cases, be a powerful driving force for the evolution of multiple secondary metabolite production. This argument is illustrated with examples of the coproduction of synergistically acting antibiotics and contingently acting siderophores: two well-known classes of secondary metabolite. We focus, in particular, on the coproduction of beta-lactam antibiotics and beta-lactamase inhibitors, the coproduction of type A and type B streptogramins, and the coregulated production and independent uptake of structurally distinct siderophores by species of Streptomyces. Possible mechanisms for the evolution of multiple synergistic and contingent metabolite production in Streptomyces species are discussed. It is concluded that the production by Streptomyces species of two or more secondary metabolites that act synergistically or contingently against biological competitors may be far more common than has previously been recognized, and that synergy and contingency may be common driving forces for the evolution of multiple secondary metabolite production by these sessile saprophytes.

  6. Synergy and contingency as driving forces for the evolution of multiple secondary metabolite production by Streptomyces species

    PubMed Central

    Challis, Gregory L.; Hopwood, David A.

    2003-01-01

    In this article we briefly review theories about the ecological roles of microbial secondary metabolites and discuss the prevalence of multiple secondary metabolite production by strains of Streptomyces, highlighting results from analysis of the recently sequenced Streptomyces coelicolor and Streptomyces avermitilis genomes. We address this question: Why is multiple secondary metabolite production in Streptomyces species so commonplace? We argue that synergy or contingency in the action of individual metabolites against biological competitors may, in some cases, be a powerful driving force for the evolution of multiple secondary metabolite production. This argument is illustrated with examples of the coproduction of synergistically acting antibiotics and contingently acting siderophores: two well-known classes of secondary metabolite. We focus, in particular, on the coproduction of β-lactam antibiotics and β-lactamase inhibitors, the coproduction of type A and type B streptogramins, and the coregulated production and independent uptake of structurally distinct siderophores by species of Streptomyces. Possible mechanisms for the evolution of multiple synergistic and contingent metabolite production in Streptomyces species are discussed. It is concluded that the production by Streptomyces species of two or more secondary metabolites that act synergistically or contingently against biological competitors may be far more common than has previously been recognized, and that synergy and contingency may be common driving forces for the evolution of multiple secondary metabolite production by these sessile saprophytes. PMID:12970466

  7. Microbial genomic taxonomy

    PubMed Central

    2013-01-01

    A need for a genomic species definition is emerging from several independent studies worldwide. In this commentary paper, we discuss recent studies on the genomic taxonomy of diverse microbial groups and a unified species definition based on genomics. Accordingly, strains from the same microbial species share >95% Average Amino Acid Identity (AAI) and Average Nucleotide Identity (ANI), >95% identity based on multiple alignment genes, <10 in Karlin genomic signature, and > 70% in silico Genome-to-Genome Hybridization similarity (GGDH). Species of the same genus will form monophyletic groups on the basis of 16S rRNA gene sequences, Multilocus Sequence Analysis (MLSA) and supertree analysis. In addition to the established requirements for species descriptions, we propose that new taxa descriptions should also include at least a draft genome sequence of the type strain in order to obtain a clear outlook on the genomic landscape of the novel microbe. The application of the new genomic species definition put forward here will allow researchers to use genome sequences to define simultaneously coherent phenotypic and genomic groups. PMID:24365132
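
The thresholds listed in this abstract can be read as a simple conjunctive test. The following is a minimal sketch, assuming the five metrics have already been computed for a pair of genomes by external tools; the class name `GenomePairMetrics`, its field names, and the function `same_species` are hypothetical and introduced only for illustration, and the numeric cut-offs are the ones quoted in the abstract.

```python
from dataclasses import dataclass


@dataclass
class GenomePairMetrics:
    """Pairwise genome metrics (hypothetical container; values come from other tools)."""
    aai: float               # Average Amino Acid Identity, percent
    ani: float               # Average Nucleotide Identity, percent
    mla_identity: float      # identity over multiple alignment genes, percent
    karlin_signature: float  # Karlin genomic signature dissimilarity
    ggdh: float              # in silico genome-to-genome hybridization similarity, percent


def same_species(m: GenomePairMetrics) -> bool:
    """Return True only if every threshold quoted in the abstract is met."""
    return (
        m.aai > 95.0
        and m.ani > 95.0
        and m.mla_identity > 95.0
        and m.karlin_signature < 10.0
        and m.ggdh > 70.0
    )


if __name__ == "__main__":
    pair = GenomePairMetrics(aai=97.0, ani=96.5, mla_identity=98.0,
                             karlin_signature=4.0, ggdh=82.0)
    print(same_species(pair))
```

Treating the criteria as a strict conjunction is an assumption of this sketch; the abstract does not say how borderline or conflicting metrics should be resolved.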

  8. A multi-omic future for microbiome studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jansson, Janet K.; Baker, Erin S.

    2016-04-26

    Microbes constitute about a third of the Earth’s biomass and play critical roles in sustaining life. While results from multiple sequence-based studies have illustrated the importance of microbial communities for human health and the environment, additional technological developments are still needed to gain more insight into their functions [1]. To date, the majority of sequencing studies have focused on the 16S rRNA gene as a phylogenetic marker. This approach has enabled exploration of microbial compositions in a range of sample types, while bypassing the need for cultivation. 16S rRNA gene sequencing has also enabled a vast majority of microorganisms never previously isolated in culture to be identified and placed into a phylogenetic context [2]. These technologies have been utilized to map the locations of microbes inhabiting various locations of the body [3]. Similarly, sequencing has been used to determine the identities and distributions of microorganisms inhabiting different ecosystems [4, 5], and efforts in single cell sequencing of the microbiome have helped fill in missing branches of the phylogenetic tree [6].

  9. A multiplexed system for quantitative comparisons of chromatin landscapes

    PubMed Central

    van Galen, Peter; Viny, Aaron D.; Ram, Oren; Ryan, Russell J.H.; Cotton, Matthew J.; Donohue, Laura; Sievers, Cem; Drier, Yotam; Liau, Brian B.; Gillespie, Shawn M.; Carroll, Kaitlin M.; Cross, Michael B.; Levine, Ross L.; Bernstein, Bradley E.

    2015-01-01

    Genome-wide profiling of histone modifications can provide systematic insight into the regulatory elements and programs engaged in a given cell type. However, conventional chromatin immunoprecipitation and sequencing (ChIP-seq) does not capture quantitative information on histone modification levels, requires large amounts of starting material, and involves tedious processing of each individual sample. Here we address these limitations with a technology that leverages DNA barcoding to profile chromatin quantitatively and in multiplexed format. We concurrently map relative levels of multiple histone modifications across multiple samples, each comprising as few as a thousand cells. We demonstrate the technology by monitoring dynamic changes following inhibition of P300, EZH2 or KDM5, by linking altered epigenetic landscapes to chromatin regulator mutations, and by mapping active and repressive marks in purified human hematopoietic stem cells. Hence, this technology enables quantitative studies of chromatin state dynamics across rare cell types, genotypes, environmental conditions and drug treatments. PMID:26687680

  10. Genetic diversity and multiple introductions of porcine reproductive and respiratory syndrome viruses in Thailand

    PubMed Central

    2011-01-01

    Porcine reproductive and respiratory syndrome virus (PRRSV) is prevalent in Thailand, causing a huge impact on the country's swine industry. Yet the diversity and origin of these Thai PRRSVs remained vague. In this context, we collected all the Thai PRRSV sequences described earlier and incorporated them into the global diversity. The results indicated that PRRSVs in Thailand originated from multiple introductions involving both Type 1 and Type 2 PRRSVs. Many of the introductions were followed by extensive geographic expansion, causing regional co-circulation of diverse PRRSV variants in three major pig-producing provinces. Based on these results, we suggest three potentially effective strategies for controlling PRRS in Thailand: (1) avoiding blind vaccination and applying vaccines tailor-made for the target diversity, (2) monitoring pig importation and transportation, and (3) implementing better biosecurity to reduce horizontal transmission. PMID:21486451

  11. Genetic diversity and multiple introductions of porcine reproductive and respiratory syndrome viruses in Thailand.

    PubMed

    Tun, Hein M; Shi, Mang; Wong, Charles L Y; Ayudhya, Suparlark N N; Amonsin, Alongkorn; Thanawonguwech, Roongroje; Leung, Frederick C C

    2011-04-12

    Porcine reproductive and respiratory syndrome virus (PRRSV) is prevalent in Thailand, causing a huge impact on the country's swine industry. Yet the diversity and origin of these Thai PRRSVs remained vague. In this context, we collected all the Thai PRRSV sequences described earlier and incorporated them into the global diversity. The results indicated that PRRSVs in Thailand originated from multiple introductions involving both Type 1 and Type 2 PRRSVs. Many of the introductions were followed by extensive geographic expansion, causing regional co-circulation of diverse PRRSV variants in three major pig-producing provinces. Based on these results, we suggest three potentially effective strategies for controlling PRRS in Thailand: (1) avoiding blind vaccination and applying vaccines tailor-made for the target diversity, (2) monitoring pig importation and transportation, and (3) implementing better biosecurity to reduce horizontal transmission.

  12. Differential correlation for sequencing data.

    PubMed

    Siska, Charlotte; Kechris, Katerina

    2017-01-19

    Several methods have been developed to identify differential correlation (DC) between pairs of molecular features from -omics studies. Most DC methods have only been tested with microarrays and other platforms producing continuous and Gaussian-like data. Sequencing data is in the form of counts, often modeled with a negative binomial distribution making it difficult to apply standard correlation metrics. We have developed an R package for identifying DC called Discordant which uses mixture models for correlations between features and the Expectation Maximization (EM) algorithm for fitting parameters of the mixture model. Several correlation metrics for sequencing data are provided and tested using simulations. Other extensions in the Discordant package include additional modeling for different types of differential correlation, and faster implementation, using a subsampling routine to reduce run-time and address the assumption of independence between molecular feature pairs. With simulations and breast cancer miRNA-Seq and RNA-Seq data, we find that Spearman's correlation has the best performance among the tested correlation methods for identifying differential correlation. Application of Spearman's correlation in the Discordant method demonstrated the most power in ROC curves and sensitivity/specificity plots, and improved ability to identify experimentally validated breast cancer miRNA. We also considered including additional types of differential correlation, which showed a slight reduction in power due to the additional parameters that need to be estimated, but more versatility in applications. Finally, subsampling within the EM algorithm considerably decreased run-time with negligible effect on performance. A new method and R package called Discordant is presented for identifying differential correlation with sequencing data. Based on comparisons with different correlation metrics, this study suggests Spearman's correlation is appropriate for sequencing data, but other correlation metrics are available to the user depending on the application and data type. The Discordant method can also be extended to investigate additional DC types and subsampling with the EM algorithm is now available for reduced run-time. These extensions to the R package make Discordant more robust and versatile for multiple -omics studies.
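
To make the idea of differential correlation on count data concrete, the sketch below computes Spearman's rank correlation for one feature pair in each of two groups and reports the difference. It is an illustration only, written in Python rather than R, and it does not reproduce the Discordant package, its mixture model, or its EM fitting; the function names and the toy counts are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_difference(x1, y1, x2, y2):
    """Difference in Spearman's correlation of a feature pair between two groups."""
    return spearman(x1, y1) - spearman(x2, y2)


if __name__ == "__main__":
    # Toy read counts for one feature pair in two sample groups (made up).
    group1 = ([3, 10, 25, 40], [1, 7, 19, 60])
    group2 = ([3, 10, 25, 40], [55, 20, 8, 2])
    print(correlation_difference(*group1, *group2))
```

Because ranks are used, the result is unchanged by any monotone transformation of the counts, which is one reason rank-based metrics are a natural fit for skewed count data.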

  13. [Clinical value of MRI united-sequences examination in diagnosis and differentiation of morphological sub-type of hilar and extrahepatic big bile duct cholangiocarcinoma].

    PubMed

    Yin, Long-Lin; Song, Bin; Guan, Ying; Li, Ying-Chun; Chen, Guang-Wen; Zhao, Li-Ming; Lai, Li

    2014-09-01

    To investigate MRI features and associated histological and pathological changes of hilar and extrahepatic big bile duct cholangiocarcinoma with different morphological sub-types, and its value in differentiating between nodular cholangiocarcinoma (NCC) and intraductal growing cholangiocarcinoma (IDCC). Imaging data of 152 patients with pathologically confirmed hilar and extrahepatic big bile duct cholangiocarcinoma were reviewed, which included 86 periductal infiltrating cholangiocarcinoma (PDCC), 55 NCC, and 11 IDCC. Imaging features of the three morphological sub-types were compared. Each of the subtypes demonstrated its unique imaging features. Significant differences (P < 0.05) were found between NCC and IDCC in tumor shape, dynamic enhanced pattern, enhancement degree during equilibrium phase, multiplicity or singleness of tumor, changes in wall and lumen of bile duct at the tumor-bearing segment, dilatation of tumor upstream or downstream bile duct, and invasion of adjacent organs. Imaging features reveal tumor growth patterns of hilar and extrahepatic big bile duct cholangiocarcinoma. MRI united-sequences examination can accurately describe those imaging features for differentiation diagnosis.

  14. Short-term evolution of Shiga toxin-producing Escherichia coli O157:H7 between two food-borne outbreaks.

    PubMed

    Cowley, Lauren A; Dallman, Timothy J; Fitzgerald, Stephen; Irvine, Neil; Rooney, Paul J; McAteer, Sean P; Day, Martin; Perry, Neil T; Bono, James L; Jenkins, Claire; Gally, David L

    2016-09-01

    Shiga toxin-producing Escherichia coli (STEC) O157:H7 is a public health threat and outbreaks occur worldwide. Here, we investigate genomic differences between related STEC O157:H7 that caused two outbreaks, eight weeks apart, at the same restaurant. Short-read genome sequencing divided the outbreak strains into two sub-clusters separated by only three single-nucleotide polymorphisms in the core genome while traditional typing identified them as separate phage types, PT8 and PT54. Isolates did not cluster with local strains but with those associated with foreign travel to the Middle East/North Africa. Combined long-read sequencing approaches and optical mapping revealed that the two outbreak strains had undergone significant microevolution in the accessory genome with prophage gain, loss and recombination. In addition, the PT54 sub-type had acquired a 240 kbp multi-drug resistance (MDR) IncHI2 plasmid responsible for the phage type switch. A PT54 isolate had a general fitness advantage over a PT8 isolate in rich medium, including an increased capacity to use specific amino acids and dipeptides as a nitrogen source. The second outbreak was considerably larger and there were multiple secondary cases indicative of effective human-to-human transmission. We speculate that MDR plasmid acquisition and prophage changes have adapted the PT54 strain for human infection and transmission. Our study shows the added insights provided by combining whole-genome sequencing approaches for outbreak investigations.

  15. Short-term evolution of Shiga toxin-producing Escherichia coli O157:H7 between two food-borne outbreaks

    PubMed Central

    Dallman, Timothy J.; Fitzgerald, Stephen; Irvine, Neil; Rooney, Paul J.; McAteer, Sean P.; Day, Martin; Perry, Neil T.; Bono, James L.; Jenkins, Claire; Gally, David L.

    2016-01-01

    Shiga toxin-producing Escherichia coli (STEC) O157:H7 is a public health threat and outbreaks occur worldwide. Here, we investigate genomic differences between related STEC O157:H7 that caused two outbreaks, eight weeks apart, at the same restaurant. Short-read genome sequencing divided the outbreak strains into two sub-clusters separated by only three single-nucleotide polymorphisms in the core genome while traditional typing identified them as separate phage types, PT8 and PT54. Isolates did not cluster with local strains but with those associated with foreign travel to the Middle East/North Africa. Combined long-read sequencing approaches and optical mapping revealed that the two outbreak strains had undergone significant microevolution in the accessory genome with prophage gain, loss and recombination. In addition, the PT54 sub-type had acquired a 240 kbp multi-drug resistance (MDR) IncHI2 plasmid responsible for the phage type switch. A PT54 isolate had a general fitness advantage over a PT8 isolate in rich medium, including an increased capacity to use specific amino acids and dipeptides as a nitrogen source. The second outbreak was considerably larger and there were multiple secondary cases indicative of effective human-to-human transmission. We speculate that MDR plasmid acquisition and prophage changes have adapted the PT54 strain for human infection and transmission. Our study shows the added insights provided by combining whole-genome sequencing approaches for outbreak investigations. PMID:28348875

  16. Detailed analysis of stem I and its 5' and 3' neighbor regions in the trans-acting HDV ribozyme.

    PubMed Central

    Nishikawa, F; Roy, M; Fauzi, H; Nishikawa, S

    1999-01-01

    To determine the stem I structure of the human hepatitis delta virus (HDV) ribozyme, which is related to the substrate sequence in the trans-acting system, we kinetically studied stem I length and sequences. Stem I extension from 7 to 8 or 9 bp caused a loss of activity and a low amount of active complex with 9 bp in the trans-acting system. In a previous report, we presented cleavage in a 6 bp stem I. The observed reaction rates indicate that the original 7 bp stem I is in the most favorable location for catalytic reaction among the possible 6-8 bp stems. To test base specificity, we replaced the original GC-rich sequence in stem I with AU-rich sequences containing six AU or UA base pairs with the natural +1G.U wobble base pair at the cleavage site. The cis-acting AU-rich molecules demonstrated similar catalytic activity to that of the wild-type. In trans-acting molecules, due to stem I instability, reaction efficiency strongly depended on the concentration of the ribozyme-substrate complex and reaction temperature. Multiple turnover was observed at 37 °C, strongly suggesting that stem I has no base specificity and more efficient activity can be expected under multiple turnover conditions by substituting several UA or AU base pairs into stem I. We also studied the substrate damaging sequences linked to both ends of stem I for its development in therapeutic applications and confirmed the functions of the unique structure. PMID:9862958

  17. High-throughput analysis of T-DNA location and structure using sequence capture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA/genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  18. High-throughput analysis of T-DNA location and structure using sequence capture

    DOE PAGES

    Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.; ...

    2015-10-07

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA/genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  19. Multiple ice-binding proteins of probable prokaryotic origin in an Antarctic lake alga, Chlamydomonas sp. ICE-MDV (Chlorophyceae).

    PubMed

    Raymond, James A; Morgan-Kiss, Rachael

    2017-08-01

    Ice-associated algae produce ice-binding proteins (IBPs) to prevent freezing damage. The IBPs of the three chlorophytes that have been examined so far share little similarity across species, making it likely that they were acquired by horizontal gene transfer (HGT). To clarify the importance and source of IBPs in chlorophytes, we sequenced the IBP genes of another Antarctic chlorophyte, Chlamydomonas sp. ICE-MDV (Chlamy-ICE). Genomic DNA and total RNA were sequenced and screened for known ice-associated genes. Chlamy-ICE has as many as 50 IBP isoforms, indicating that they have an important role in survival. The IBPs are of the DUF3494 type and have similar exon structures. The DUF3494 sequences are much more closely related to prokaryotic sequences than they are to sequences in other chlorophytes, and the chlorophyte IBP and ribosomal 18S phylogenies are dissimilar. The multiple IBP isoforms found in Chlamy-ICE and other algae may allow the algae to adapt to a greater variety of ice conditions than prokaryotes, which typically have a single IBP gene. The predicted structure of the DUF3494 domain has an ice-binding face with an orderly array of hydrophilic side chains. The results indicate that Chlamy-ICE acquired its IBP genes by HGT in a single event. The acquisitions of IBP genes by this and other species of Antarctic algae by HGT appear to be key evolutionary events that allowed algae to extend their ranges into polar environments. © 2017 Phycological Society of America.

  20. Single-strand conformation polymorphism (SSCP)-based mutation scanning approaches to fingerprint sequence variation in ribosomal DNA of ascaridoid nematodes.

    PubMed

    Zhu, X Q; Gasser, R B

    1998-06-01

    In this study, we assessed single-strand conformation polymorphism (SSCP)-based approaches for their capacity to fingerprint sequence variation in ribosomal DNA (rDNA) of ascaridoid nematodes of veterinary and/or human health significance. The second internal transcribed spacer region (ITS-2) of rDNA was utilised as the target region because it is known to provide species-specific markers for this group of parasites. ITS-2 was amplified by PCR from genomic DNA derived from individual parasites and subjected to analysis. Direct SSCP analysis of amplicons from seven taxa (Toxocara vitulorum, Toxocara cati, Toxocara canis, Toxascaris leonina, Baylisascaris procyonis, Ascaris suum and Parascaris equorum) showed that the single-strand (ss) ITS-2 patterns produced allowed their unequivocal identification to species. While no variation in SSCP patterns was detected in the ITS-2 within four species for which multiple samples were available, the method allowed the direct display of four distinct sequence types of ITS-2 among individual worms of T. cati. Comparison of SSCP/sequencing with the methods of dideoxy fingerprinting (ddF) and restriction endonuclease fingerprinting (REF) revealed that also ddF allowed the definition of the four sequence types, whereas REF displayed three of four. The findings indicate the usefulness of the SSCP-based approaches for the identification of ascaridoid nematodes to species, the direct display of sequence variation in rDNA and the detection of population variation. The ability to fingerprint microheterogeneity in ITS-2 rDNA using such approaches also has implications for studying fundamental aspects relating to mutational change in rDNA.

  1. Variability and repertoire size of T-cell receptor V alpha gene segments.

    PubMed

    Becker, D M; Pattern, P; Chien, Y; Yokota, T; Eshhar, Z; Giedlin, M; Gascoigne, N R; Goodnow, C; Wolf, R; Arai, K

    The immune system of higher organisms is composed largely of two distinct cell types, B lymphocytes and T lymphocytes, each of which is independently capable of recognizing an enormous number of distinct entities through their antigen receptors; surface immunoglobulin in the case of the former, and the T-cell receptor (TCR) in the case of the latter. In both cell types, the genes encoding the antigen receptors consist of multiple gene segments which recombine during maturation to produce many possible peptides. One striking difference between B- and T-cell recognition that has not yet been resolved by the structural data is the fact that T cells generally require a major histocompatibility determinant together with an antigen whereas, in most cases, antibodies recognize antigen alone. Recently, we and others have found that a series of TCR V beta gene sequences show conservation of many of the same residues that are conserved between heavy- and light-chain immunoglobulin V regions, and these V beta sequences are predicted to have an immunoglobulin-like secondary structure. To extend these studies, we have isolated and sequenced eight additional alpha-chain complementary cDNA clones and compared them with published sequences. Analyses of these sequences, reported here, indicate that V alpha regions have many of the characteristics of V beta gene segments but differ in that they almost always occur as cross-hybridizing gene families. We conclude that there may be very different selective pressures operating on V alpha and V beta sequences and that the V alpha repertoire may be considerably larger than that of V beta.

  2. A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band

    NASA Astrophysics Data System (ADS)

    Zou, Quan; Shan, Xiao; Jiang, Yi

    Multiple sequence alignment is one of the most important topics in computational biology, but existing methods cannot yet handle large data sets. With the development of copy-number variant (CNV) and single nucleotide polymorphism (SNP) research, many researchers want to align large numbers of similar sequences to detect CNVs and SNPs. In this paper, we propose a novel multiple sequence alignment algorithm based on affine gap penalty and k-band. It aligns more quickly and accurately, which will be helpful for mining CNVs and SNPs. Experiments demonstrate the performance of our algorithm.
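
    The abstract above names two ingredients, a k-band restriction and a center star strategy, without giving details. The following is a minimal sketch of both ideas, not the authors' algorithm: it uses unit-cost edit distance instead of an affine gap penalty, and the function names `banded_edit_distance` and `pick_center` are hypothetical.

```python
from typing import List, Optional


def banded_edit_distance(a: str, b: str, k: int) -> Optional[int]:
    """Unit-cost edit distance using only DP cells with |i - j| <= k.

    Returns None when the length difference alone exceeds the band.
    The result equals the true edit distance whenever that distance is <= k.
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None
    # Row 0: aligning the empty prefix of `a` with b[:j] costs j insertions.
    prev = {j: j for j in range(0, min(m, k) + 1)}
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - k), min(m, i + k) + 1):
            if j == 0:
                cur[j] = i  # delete the first i characters of `a`
                continue
            # Diagonal move (match or substitution) is always inside the band.
            best = prev[j - 1] + (a[i - 1] != b[j - 1])
            if j - 1 in cur:   # insertion, if the left cell is in the band
                best = min(best, cur[j - 1] + 1)
            if j in prev:      # deletion, if the upper cell is in the band
                best = min(best, prev[j] + 1)
            cur[j] = best
        prev = cur
    return prev[m]


def pick_center(seqs: List[str], k: int) -> int:
    """Index of the sequence with the smallest summed distance to all others."""
    def dist(x: str, y: str) -> int:
        d = banded_edit_distance(x, y, k)
        # Assumption: fall back to a crude upper bound when the band is too narrow.
        return d if d is not None else max(len(x), len(y))

    totals = [
        sum(dist(seqs[i], seqs[j]) for j in range(len(seqs)) if j != i)
        for i in range(len(seqs))
    ]
    return min(range(len(seqs)), key=totals.__getitem__)
```

    In a center star scheme the remaining sequences would then each be aligned pairwise to the chosen center and merged; the band keeps every pairwise step at roughly O(k·n) cells instead of O(n²), which is where the speed-up for highly similar sequences would come from.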

  3. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    1999-10-26

    A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).

  4. Computer-aided visualization and analysis system for sequence evaluation

    DOEpatents

    Chee, Mark S.

    2001-06-05

    A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).

  5. Whole-Genome-Sequencing characterization of bloodstream infection-causing hypervirulent Klebsiella pneumoniae of capsular serotype K2 and ST374.

    PubMed

    Wang, Xiaoli; Xie, Yingzhou; Li, Gang; Liu, Jialin; Li, Xiaobin; Tian, Lijun; Sun, Jingyong; Ou, Hong-Yu; Qu, Hongping

    2018-01-01

    Hypervirulent K. pneumoniae variants (hvKP) have been increasingly reported worldwide, causing metastasis of severe infections such as liver abscesses and bacteremia. The capsular serotype K2 hvKP strains show diverse multi-locus sequence types (MLSTs), but with limited genetics and virulence information. In this study, we report a hypermucoviscous K. pneumoniae strain, RJF293, isolated from a human bloodstream sample in a Chinese hospital. It caused a metastatic infection and fatal septic shock in a critical patient. The microbiological features and genetic background were investigated with multiple approaches. Strain RJF293 was determined to be multilocus sequence type (ST) 374 and serotype K2, displayed a median lethal dose (LD50) of 1.5 × 10^2 CFU in BALB/c mice and was as virulent as the ST23 K1 serotype hvKP strain NTUH-K2044 in a mouse lethality assay. Whole genome sequencing revealed that the RJF293 genome codes for 32 putative virulence factors and exhibits a unique presence/absence pattern in comparison to the other 105 completely sequenced K. pneumoniae genomes. Whole genome SNP-based phylogenetic analysis revealed that strain RJF293 formed a single clade, distant from those containing either ST66 or ST86 hvKP. Compared to the other sequenced hvKP chromosomes, RJF293 contains several strain-variable regions, including one prophage, one ICEKp1 family integrative and conjugative element and six large genomic islands. The sequencing of the first complete genome of an ST374 K2 hvKP clinical strain should reinforce our understanding of the epidemiology and virulence mechanisms of this bloodstream infection-causing hvKP with clinical significance.

  6. Whole-Genome-Sequencing characterization of bloodstream infection-causing hypervirulent Klebsiella pneumoniae of capsular serotype K2 and ST374

    PubMed Central

    Wang, Xiaoli; Xie, Yingzhou; Li, Gang; Liu, Jialin; Li, Xiaobin; Tian, Lijun; Sun, Jingyong; Qu, Hongping

    2018-01-01

    ABSTRACT Hypervirulent K. pneumoniae variants (hvKP) have been increasingly reported worldwide, causing metastasis of severe infections such as liver abscesses and bacteremia. The capsular serotype K2 hvKP strains show diverse multi-locus sequence types (MLSTs), but with limited genetics and virulence information. In this study, we report a hypermucoviscous K. pneumoniae strain, RJF293, isolated from a human bloodstream sample in a Chinese hospital. It caused a metastatic infection and fatal septic shock in a critical patient. The microbiological features and genetic background were investigated with multiple approaches. Strain RJF293 was determined to be multilocus sequence type (ST) 374 and serotype K2, displayed a median lethal dose (LD50) of 1.5 × 10^2 CFU in BALB/c mice and was as virulent as the ST23 K1 serotype hvKP strain NTUH-K2044 in a mouse lethality assay. Whole genome sequencing revealed that the RJF293 genome codes for 32 putative virulence factors and exhibits a unique presence/absence pattern in comparison to the other 105 completely sequenced K. pneumoniae genomes. Whole genome SNP-based phylogenetic analysis revealed that strain RJF293 formed a single clade, distant from those containing either ST66 or ST86 hvKP. Compared to the other sequenced hvKP chromosomes, RJF293 contains several strain-variable regions, including one prophage, one ICEKp1 family integrative and conjugative element and six large genomic islands. The sequencing of the first complete genome of an ST374 K2 hvKP clinical strain should reinforce our understanding of the epidemiology and virulence mechanisms of this bloodstream infection-causing hvKP with clinical significance. PMID:29338592

  7. eXframe: reusable framework for storage, analysis and visualization of genomics experiments

    PubMed Central

    2011-01-01

    Background Genome-wide experiments are routinely conducted to measure gene expression, DNA-protein interactions and epigenetic status. Structured metadata for these experiments is imperative for a complete understanding of experimental conditions, to enable consistent data processing and to allow retrieval, comparison, and integration of experimental results. Even though several repositories have been developed for genomics data, only a few provide annotation of samples and assays using controlled vocabularies. Moreover, many of them are tailored for a single type of technology or measurement and do not support the integration of multiple data types. Results We have developed eXframe - a reusable web-based framework for genomics experiments that provides 1) the ability to publish structured data compliant with accepted standards 2) support for multiple data types including microarrays and next generation sequencing 3) query, analysis and visualization integration tools (enabled by consistent processing of the raw data and annotation of samples) and is available as open-source software. We present two case studies where this software is currently being used to build repositories of genomics experiments - one contains data from hematopoietic stem cells and another from Parkinson's disease patients. Conclusion The web-based framework eXframe offers structured annotation of experiments as well as uniform processing and storage of molecular data from microarray and next generation sequencing platforms. The framework allows users to query and integrate information across species, technologies, measurement types and experimental conditions. Our framework is reusable and freely modifiable - other groups or institutions can deploy their own custom web-based repositories based on this software. It is interoperable with the most important data formats in this domain. We hope that other groups will not only use eXframe, but also contribute their own useful modifications. 
PMID:22103807

  8. High resolution identity testing of inactivated poliovirus vaccines.

    PubMed

    Mee, Edward T; Minor, Philip D; Martin, Javier

    2015-07-09

    Definitive identification of poliovirus strains in vaccines is essential for quality control, particularly where multiple wild-type and Sabin strains are produced in the same facility. Sequence-based identification provides the ultimate in identity testing and would offer several advantages over serological methods. We employed random RT-PCR and high throughput sequencing to recover full-length genome sequences from monovalent and trivalent poliovirus vaccine products at various stages of the manufacturing process. All expected strains were detected in previously characterised products and the method permitted identification of strains comprising as little as 0.1% of sequence reads. Highly similar Mahoney and Sabin 1 strains were readily discriminated on the basis of specific variant positions. Analysis of a product known to contain incorrect strains demonstrated that the method correctly identified the contaminants. Random RT-PCR and shotgun sequencing provided high resolution identification of vaccine components. In addition to the recovery of full-length genome sequences, the method could also be easily adapted to the characterisation of minor variant frequencies and distinction of closely related products on the basis of distinguishing consensus and low frequency polymorphisms. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  9. Evaluation and validation of de novo and hybrid assembly techniques to derive high quality genome sequences

    DOE PAGES

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Land, Miriam L.; ...

    2014-06-14

    Our motivation with this work was to assess the potential of different types of sequence data combined with de novo and hybrid assembly approaches to improve existing draft genome sequences. Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in predictions of multiple copies of rDNA operons for each respective bacterium were evaluated by PCR and Sanger sequencing, and then the validated results were applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and most accurate rDNA operon number predictions. This study will aid others looking to improve existing draft genome assemblies. As to availability and implementation, all assembly tools except CLC Genomics Workbench are freely available under GNU General Public License.
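
    The record above uses N50 as a summary statistic but does not define it. As a hedged illustration, the sketch below uses the common definition (the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size). The function name `n50` and the example lengths are illustrative and do not come from the study.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """N50 under the common definition: walk contigs from longest to shortest
    and return the length at which the cumulative sum reaches half the total."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 requires at least one contig")


# Illustrative lengths only (total 32, so the halfway point is 16).
example = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example))
```

    A larger N50 together with fewer contigs generally points to a more contiguous assembly, although neither number says anything about correctness. That limitation is consistent with the study adding validated rDNA operon counts as a further ranking criterion.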

  10. Single-cell sequencing provides clues about the host interactions of segmented filamentous bacteria (SFB)

    PubMed Central

    Pamp, Sünje J.; Harrington, Eoghan D.; Quake, Stephen R.; Relman, David A.; Blainey, Paul C.

    2012-01-01

    Segmented filamentous bacteria (SFB) are host-specific intestinal symbionts that comprise a distinct clade within the Clostridiaceae, designated Candidatus Arthromitus. SFB display a unique life cycle within the host, involving differentiation into multiple cell types. The latter include filaments that attach intimately to intestinal epithelial cells, and from which “holdfasts” and spores develop. SFB induce a multifaceted immune response, leading to host protection from intestinal pathogens. Cultivation resistance has hindered characterization of these enigmatic bacteria. In the present study, we isolated five SFB filaments from a mouse using a microfluidic device equipped with laser tweezers, generated genome sequences from each, and compared these sequences with each other, as well as to recently published SFB genome sequences. Based on the resulting analyses, SFB appear to be dependent on the host for a variety of essential nutrients. SFB have a relatively high abundance of predicted proteins devoted to cell cycle control and to envelope biogenesis, and have a group of SFB-specific autolysins and a dynamin-like protein. Among the five filament genomes, an average of 8.6% of predicted proteins were novel, including a family of secreted SFB-specific proteins. Four ADP-ribosyltransferase (ADPRT) sequence types, and a myosin-cross-reactive antigen (MCRA) protein were discovered; we hypothesize that they are involved in modulation of host responses. The presence of polymorphisms among mouse SFB genomes suggests the evolution of distinct SFB lineages. Overall, our results reveal several aspects of SFB adaptation to the mammalian intestinal tract. PMID:22434425

  11. Detection of Quiescent Infections with Multiple Elephant Endotheliotropic Herpesviruses (EEHVs), Including EEHV2, EEHV3, EEHV6, and EEHV7, within Lymphoid Lung Nodules or Lung and Spleen Tissue Samples from Five Asymptomatic Adult African Elephants.

    PubMed

    Zong, Jian-Chao; Heaggans, Sarah Y; Long, Simon Y; Latimer, Erin M; Nofs, Sally A; Bronson, Ellen; Casares, Miguel; Fouraker, Michael D; Pearson, Virginia R; Richman, Laura K; Hayward, Gary S

    2015-12-30

    More than 80 cases of lethal hemorrhagic disease associated with elephant endotheliotropic herpesviruses (EEHVs) have been identified in young Asian elephants worldwide. Diagnostic PCR tests detected six types of EEHV in blood of elephants with acute disease, although EEHV1A is the predominant pathogenic type. Previously, the presence of herpesvirus virions within benign lung and skin nodules from healthy African elephants led to suggestions that African elephants may be the source of EEHV disease in Asian elephants. Here, we used direct PCR-based DNA sequencing to detect EEHV genomes in necropsy tissue from five healthy adult African elephants. Two large lung nodules collected from culled wild South African elephants contained high levels of either EEHV3 alone or both EEHV2 and EEHV3. Similarly, a euthanized U.S. elephant proved to harbor multiple EEHV types distributed nonuniformly across four small lung nodules, including high levels of EEHV6, lower levels of EEHV3 and EEHV2, and a new GC-rich branch type, EEHV7. Several of the same EEHV types were also detected in random lung and spleen samples from two other elephants. Sanger PCR DNA sequence data comprising 100 kb were obtained from a total of 15 different strains identified, with (except for a few hypervariable genes) the EEHV2, EEHV3, and EEHV6 strains all being closely related to known genotypes from cases of acute disease, whereas the seven loci (4.0 kb) obtained from EEHV7 averaged 18% divergence from their nearest relative, EEHV3. Overall, we conclude that these four EEHV species, but probably not EEHV1, occur commonly as quiescent infections in African elephants. Acute hemorrhagic disease characterized by high-level viremia due to infection by members of the Proboscivirus genus threatens the future breeding success of endangered Asian elephants worldwide. 
Although the genomes of six EEHV types from acute cases have been partially or fully characterized, lethal disease predominantly involves a variety of strains of EEHV1, whose natural host has been unclear. Here, we carried out genotype analyses by partial PCR sequencing of necropsy tissue from five asymptomatic African elephants and identified multiple simultaneous infections by several different EEHV types, including high concentrations in lymphoid lung nodules. Overall, the results provide strong evidence that EEHV2, EEHV3, EEHV6, and EEHV7 represent natural ubiquitous infections in African elephants, whereas Asian elephants harbor EEHV1A, EEHV1B, EEHV4, and EEHV5. Although a single case of fatal cross-species infection by EEHV3 is known, the results do not support the previous concept that highly pathogenic EEHV1A crossed from African to Asian elephants in zoos. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  12. Detection of Quiescent Infections with Multiple Elephant Endotheliotropic Herpesviruses (EEHVs), Including EEHV2, EEHV3, EEHV6, and EEHV7, within Lymphoid Lung Nodules or Lung and Spleen Tissue Samples from Five Asymptomatic Adult African Elephants

    PubMed Central

    Zong, Jian-Chao; Heaggans, Sarah Y.; Long, Simon Y.; Latimer, Erin M.; Nofs, Sally A.; Bronson, Ellen; Casares, Miguel; Fouraker, Michael D.; Pearson, Virginia R.; Richman, Laura K.

    2015-01-01

    ABSTRACT More than 80 cases of lethal hemorrhagic disease associated with elephant endotheliotropic herpesviruses (EEHVs) have been identified in young Asian elephants worldwide. Diagnostic PCR tests detected six types of EEHV in blood of elephants with acute disease, although EEHV1A is the predominant pathogenic type. Previously, the presence of herpesvirus virions within benign lung and skin nodules from healthy African elephants led to suggestions that African elephants may be the source of EEHV disease in Asian elephants. Here, we used direct PCR-based DNA sequencing to detect EEHV genomes in necropsy tissue from five healthy adult African elephants. Two large lung nodules collected from culled wild South African elephants contained high levels of either EEHV3 alone or both EEHV2 and EEHV3. Similarly, a euthanized U.S. elephant proved to harbor multiple EEHV types distributed nonuniformly across four small lung nodules, including high levels of EEHV6, lower levels of EEHV3 and EEHV2, and a new GC-rich branch type, EEHV7. Several of the same EEHV types were also detected in random lung and spleen samples from two other elephants. Sanger PCR DNA sequence data comprising 100 kb were obtained from a total of 15 different strains identified, with (except for a few hypervariable genes) the EEHV2, EEHV3, and EEHV6 strains all being closely related to known genotypes from cases of acute disease, whereas the seven loci (4.0 kb) obtained from EEHV7 averaged 18% divergence from their nearest relative, EEHV3. Overall, we conclude that these four EEHV species, but probably not EEHV1, occur commonly as quiescent infections in African elephants. IMPORTANCE Acute hemorrhagic disease characterized by high-level viremia due to infection by members of the Proboscivirus genus threatens the future breeding success of endangered Asian elephants worldwide. 
Although the genomes of six EEHV types from acute cases have been partially or fully characterized, lethal disease predominantly involves a variety of strains of EEHV1, whose natural host has been unclear. Here, we carried out genotype analyses by partial PCR sequencing of necropsy tissue from five asymptomatic African elephants and identified multiple simultaneous infections by several different EEHV types, including high concentrations in lymphoid lung nodules. Overall, the results provide strong evidence that EEHV2, EEHV3, EEHV6, and EEHV7 represent natural ubiquitous infections in African elephants, whereas Asian elephants harbor EEHV1A, EEHV1B, EEHV4, and EEHV5. Although a single case of fatal cross-species infection by EEHV3 is known, the results do not support the previous concept that highly pathogenic EEHV1A crossed from African to Asian elephants in zoos. PMID:26719245

  13. Molecular characterization and combined genotype association study of bovine cluster of differentiation 14 gene with clinical mastitis in crossbred dairy cattle

    PubMed Central

    Selvan, A. Sakthivel; Gupta, I. D.; Verma, A.; Chaudhari, M. V.; Magotra, A.

    2016-01-01

    Aim: The present study was undertaken to characterize the cluster of differentiation 14 (CD14) gene and to analyze its combined genotypes in order to explore its association with clinical mastitis in Karan Fries (KF) cows maintained in the National Dairy Research Institute herd, Karnal. Materials and Methods: Genomic DNA was extracted by the phenol-chloroform method from the blood of 94 randomly selected lactating KF cattle. After checking its quality and quantity, polymerase chain reaction (PCR) was carried out using six sets of reported gene-specific primers to amplify the complete KF CD14 gene. The forward and reverse sequences for each PCR fragment were assembled to form the complete sequence for the respective region of the KF CD14 gene. Multiple sequence alignments of the edited sequence with the corresponding reported Bos taurus reference sequence (EU148610.1) were performed with ClustalW software to identify single nucleotide polymorphisms (SNPs). Basic Local Alignment Search Tool analysis was performed to compare the sequence identity of the KF CD14 gene with other species. Restriction fragment length polymorphism (RFLP) analysis was carried out in all KF cows using the Helicobacter pylori 188I (Hpy188I) (contig 2) and Haemophilus influenzae I (HinfI) (contig 4) restriction enzymes (REs). Cows were assigned genotypes obtained by PCR-RFLP analysis, and the association study was done using the Chi-square (χ2) test. The genotypes of both contigs (loci) number 2 and 4 were combined with respect to each animal to construct combined genotype patterns. Results: Two types of KF sequences were obtained: one of 2630 bp having one insertion at nucleotide (nt) position 616 and one deletion at nt position 1117, and the other of 2629 bp having only one deletion at nt position 615. ClustalW multiple alignment of the KF CD14 gene sequence with the B. taurus sequence (EU148610.1) revealed 24 nt changes (SNPs). 
Cows were also screened using PCR-RFLP with the Hpy188I (contig 2) and HinfI (contig 4) REs, which revealed three genotypes each that differed significantly regarding mastitis incidence. The two loci allow a maximum of nine combined genotype patterns, of which only eight were observed: AACC, AACD, AADD, ABCD, ABDD, BBCC, BBCD, and BBDD. The combined genotype ABCC was not observed in the studied population of KF cows. Out of 94 animals, AACD combined genotype animals (10.63%) were found to be not affected with mastitis, and ABDD combined genotype animals were observed to have the highest mastitis incidence of 15.96%. Conclusion: AACD typed cows were found to be least susceptible to mastitis incidence as compared to other combined genotypes. PMID:27536026
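
    The count of nine possible combined genotypes follows from pairing three genotypes at each of two loci (3 × 3). The short sketch below enumerates them and recovers the unobserved pattern. The per-locus labels (AA/AB/BB for contig 2, CC/CD/DD for contig 4) are an assumption inferred from the combined genotype names listed in the abstract.

```python
from itertools import product

# Assumed per-locus genotype labels, inferred from the combined names above.
contig2_genotypes = ["AA", "AB", "BB"]  # Hpy188I locus (contig 2)
contig4_genotypes = ["CC", "CD", "DD"]  # HinfI locus (contig 4)

# The eight combined genotypes reported as observed in the abstract.
observed = {"AACC", "AACD", "AADD", "ABCD", "ABDD", "BBCC", "BBCD", "BBDD"}

# Every pairing of one contig 2 genotype with one contig 4 genotype.
possible = [g2 + g4 for g2, g4 in product(contig2_genotypes, contig4_genotypes)]
missing = [g for g in possible if g not in observed]

print(len(possible), missing)
```

    Under these assumed labels the enumeration yields nine patterns, and the only one absent from the observed set is ABCC, matching the abstract.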

  14. Molecular characterization and combined genotype association study of bovine cluster of differentiation 14 gene with clinical mastitis in crossbred dairy cattle.

    PubMed

    Selvan, A Sakthivel; Gupta, I D; Verma, A; Chaudhari, M V; Magotra, A

    2016-07-01

    The present study was undertaken to characterize the cluster of differentiation 14 (CD14) gene and to analyze its combined genotypes in order to explore its association with clinical mastitis in Karan Fries (KF) cows maintained in the National Dairy Research Institute herd, Karnal. Genomic DNA was extracted by the phenol-chloroform method from the blood of 94 randomly selected lactating KF cattle. After checking its quality and quantity, polymerase chain reaction (PCR) was carried out using six sets of reported gene-specific primers to amplify the complete KF CD14 gene. The forward and reverse sequences for each PCR fragment were assembled to form the complete sequence for the respective region of the KF CD14 gene. Multiple sequence alignments of the edited sequence with the corresponding reported Bos taurus reference sequence (EU148610.1) were performed with ClustalW software to identify single nucleotide polymorphisms (SNPs). Basic Local Alignment Search Tool analysis was performed to compare the sequence identity of the KF CD14 gene with other species. Restriction fragment length polymorphism (RFLP) analysis was carried out in all KF cows using the Helicobacter pylori 188I (Hpy188I) (contig 2) and Haemophilus influenzae I (HinfI) (contig 4) restriction enzymes (REs). Cows were assigned genotypes obtained by PCR-RFLP analysis, and the association study was done using the Chi-square (χ2) test. The genotypes of both contigs (loci) number 2 and 4 were combined with respect to each animal to construct combined genotype patterns. Two types of KF sequences were obtained: one of 2630 bp having one insertion at nucleotide (nt) position 616 and one deletion at nt position 1117, and the other of 2629 bp having only one deletion at nt position 615. ClustalW multiple alignment of the KF CD14 gene sequence with the B. taurus sequence (EU148610.1) revealed 24 nt changes (SNPs). 
Cows were also screened using PCR-RFLP with the Hpy188I (contig 2) and HinfI (contig 4) REs, which revealed three genotypes each that differed significantly regarding mastitis incidence. The two loci allow a maximum of nine combined genotype patterns, of which only eight were observed: AACC, AACD, AADD, ABCD, ABDD, BBCC, BBCD, and BBDD. The combined genotype ABCC was not observed in the studied population of KF cows. Out of 94 animals, AACD combined genotype animals (10.63%) were found to be not affected with mastitis, and ABDD combined genotype animals were observed to have the highest mastitis incidence of 15.96%. AACD typed cows were found to be least susceptible to mastitis incidence as compared to other combined genotypes.

  15. Development of Multiple-Locus Variable-Number Tandem-Repeat Analysis for Molecular Subtyping of Campylobacter jejuni by Using Capillary Electrophoresis

    PubMed Central

    Techaruvichit, Punnida; Vesaratchavest, Mongkol; Keeratipibul, Suwimon; Kuda, Takashi; Kimura, Bon

    2015-01-01

    Campylobacter jejuni is a common cause of the frequently reported food-borne diseases in developed and developing nations. This study describes the development of multiple-locus variable-number tandem-repeat (VNTR) analysis (MLVA) using capillary electrophoresis as a novel typing method for microbial source tracking and epidemiological investigation of C. jejuni. Among 36 tandem repeat loci detected by the Tandem Repeat Finder program, 7 VNTR loci were selected and used for characterizing 60 isolates recovered from chicken meat samples from retail shops, samples from chicken meat processing factory, and stool samples. The discrimination ability of MLVA was compared with that of multilocus sequence typing (MLST). MLVA (diversity index of 0.97 with 31 MLVA types) provided slightly higher discrimination than MLST (diversity index of 0.95 with 25 MLST types). The overall concordance between MLVA and MLST was estimated at 63% by adjusted Rand coefficient. MLVA predicted MLST type better than MLST predicted MLVA type, as reflected by Wallace coefficient (Wallace coefficient for MLVA to MLST versus MLST to MLVA, 86% versus 51%). MLVA is a useful tool and can be used for effective monitoring of C. jejuni and investigation of epidemics caused by C. jejuni. PMID:26025899
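
    The comparison above rests on a diversity index and on the directional Wallace coefficient. The sketch below shows how such figures could be computed from per-isolate type assignments. It assumes the diversity index is Simpson's index of diversity in the Hunter-Gaston form, which the abstract does not state, and the isolate labels are toy data, not the study's 60 isolates.

```python
from collections import Counter
from itertools import combinations
from typing import Hashable, Sequence


def simpson_diversity(types: Sequence[Hashable]) -> float:
    """Probability that two isolates drawn without replacement differ in type
    (Hunter-Gaston form; assumed here, not confirmed by the abstract)."""
    n = len(types)
    same_pairs = sum(c * (c - 1) for c in Counter(types).values())
    return 1 - same_pairs / (n * (n - 1))


def wallace(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Wallace coefficient a -> b: of the isolate pairs sharing a type under
    method `a`, the fraction that also share a type under method `b`.
    Raises ZeroDivisionError if no two isolates share a type under `a`."""
    same_a = same_both = 0
    for i, j in combinations(range(len(a)), 2):
        if a[i] == a[j]:
            same_a += 1
            if b[i] == b[j]:
                same_both += 1
    return same_both / same_a


# Toy assignments for six hypothetical isolates.
mlva = ["m1", "m1", "m2", "m3", "m3", "m4"]
mlst = ["s1", "s1", "s1", "s2", "s2", "s2"]

print(simpson_diversity(mlva), wallace(mlva, mlst), wallace(mlst, mlva))
```

    In the toy data the finer-grained method predicts the coarser one perfectly while the reverse direction is much weaker. That is the same kind of asymmetry the abstract reports (86% versus 51%) when the more discriminatory MLVA is compared with MLST.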

  16. The dynamics of genome replication using deep sequencing

    PubMed Central

    Müller, Carolin A.; Hawkins, Michelle; Retkute, Renata; Malla, Sunir; Wilson, Ray; Blythe, Martin J.; Nakato, Ryuichiro; Komata, Makiko; Shirahige, Katsuhiko; de Moura, Alessandro P.S.; Nieduszynski, Conrad A.

    2014-01-01

    Eukaryotic genomes are replicated from multiple DNA replication origins. We present complementary deep sequencing approaches to measure origin location and activity in Saccharomyces cerevisiae. Measuring the increase in DNA copy number during a synchronous S-phase allowed the precise determination of genome replication. To map origin locations, replication forks were stalled close to their initiation sites; therefore, copy number enrichment was limited to origins. Replication timing profiles were generated from asynchronous cultures using fluorescence-activated cell sorting. Applying this technique we show that the replication profiles of haploid and diploid cells are indistinguishable, indicating that both cell types use the same cohort of origins with the same activities. Finally, increasing sequencing depth allowed the direct measure of replication dynamics from an exponentially growing culture. This is the first time this approach, called marker frequency analysis, has been successfully applied to a eukaryote. These data provide a high-resolution resource and methodological framework for studying genome biology. PMID:24089142

  17. Computational and experimental analysis of DNA shuffling

    PubMed Central

    Maheshri, Narendra; Schaffer, David V.

    2003-01-01

    We describe a computational model of DNA shuffling based on the thermodynamics and kinetics of this process. The model independently tracks a representative ensemble of DNA molecules and records their states at every stage of a shuffling reaction. These data can subsequently be analyzed to yield information on any relevant metric, including reassembly efficiency, crossover number, type and distribution, and DNA sequence length distributions. The predictive ability of the model was validated by comparison to three independent sets of experimental data, and analysis of the simulation results led to several unique insights into the DNA shuffling process. We examine a tradeoff between crossover frequency and reassembly efficiency and illustrate the effects of experimental parameters on this relationship. Furthermore, we discuss conditions that promote the formation of useless “junk” DNA sequences or multimeric sequences containing multiple copies of the reassembled product. This model will therefore aid in the design of optimal shuffling reaction conditions. PMID:12626764

  18. Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding.

    PubMed

    Shahi, Payam; Kim, Samuel C; Haliburton, John R; Gartner, Zev J; Abate, Adam R

    2017-03-14

    Proteins are the primary effectors of cellular function, including cellular metabolism, structural dynamics, and information processing. However, quantitative characterization of proteins at the single-cell level is challenging due to the tiny amount of protein available. Here, we present Abseq, a method to detect and quantitate proteins in single cells at ultrahigh throughput. Like flow and mass cytometry, Abseq uses specific antibodies to detect epitopes of interest; however, unlike these methods, antibodies are labeled with sequence tags that can be read out with microfluidic barcoding and DNA sequencing. We demonstrate this novel approach by characterizing surface proteins of different cell types at the single-cell level and distinguishing between the cells by their protein expression profiles. DNA-tagged antibodies provide multiple advantages for profiling proteins in single cells, including the ability to amplify low-abundance tags to make them detectable with sequencing, to use molecular indices for quantitative results, and essentially limitless multiplexing.

  19. Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding

    NASA Astrophysics Data System (ADS)

    Shahi, Payam; Kim, Samuel C.; Haliburton, John R.; Gartner, Zev J.; Abate, Adam R.

    2017-03-01

    Proteins are the primary effectors of cellular function, including cellular metabolism, structural dynamics, and information processing. However, quantitative characterization of proteins at the single-cell level is challenging due to the tiny amount of protein available. Here, we present Abseq, a method to detect and quantitate proteins in single cells at ultrahigh throughput. Like flow and mass cytometry, Abseq uses specific antibodies to detect epitopes of interest; however, unlike these methods, antibodies are labeled with sequence tags that can be read out with microfluidic barcoding and DNA sequencing. We demonstrate this novel approach by characterizing surface proteins of different cell types at the single-cell level and distinguishing between the cells by their protein expression profiles. DNA-tagged antibodies provide multiple advantages for profiling proteins in single cells, including the ability to amplify low-abundance tags to make them detectable with sequencing, to use molecular indices for quantitative results, and essentially limitless multiplexing.

  20. Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding

    PubMed Central

    Shahi, Payam; Kim, Samuel C.; Haliburton, John R.; Gartner, Zev J.; Abate, Adam R.

    2017-01-01

    Proteins are the primary effectors of cellular function, including cellular metabolism, structural dynamics, and information processing. However, quantitative characterization of proteins at the single-cell level is challenging due to the tiny amount of protein available. Here, we present Abseq, a method to detect and quantitate proteins in single cells at ultrahigh throughput. Like flow and mass cytometry, Abseq uses specific antibodies to detect epitopes of interest; however, unlike these methods, antibodies are labeled with sequence tags that can be read out with microfluidic barcoding and DNA sequencing. We demonstrate this novel approach by characterizing surface proteins of different cell types at the single-cell level and distinguishing between the cells by their protein expression profiles. DNA-tagged antibodies provide multiple advantages for profiling proteins in single cells, including the ability to amplify low-abundance tags to make them detectable with sequencing, to use molecular indices for quantitative results, and essentially limitless multiplexing. PMID:28290550

  1. Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

    PubMed Central

    2011-01-01

    Background Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510
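
    The vertical decomposition step described in this abstract (cut the sequences into column-wise segments, align each segment separately, then join the results) can be sketched as follows. This is a minimal illustration and not the authors' VDGA: the function names are hypothetical, the cut points are simply proportional to sequence length, and `pad_align` is a placeholder that only right-pads with gaps where the paper uses a guide tree approach inside a genetic algorithm.

```python
def vertical_split(sequences, parts):
    """Cut every sequence into `parts` consecutive segments (assumed proportional cuts)."""
    blocks = [[] for _ in range(parts)]
    for seq in sequences:
        n = len(seq)
        cuts = [i * n // parts for i in range(parts + 1)]
        for k in range(parts):
            blocks[k].append(seq[cuts[k]:cuts[k + 1]])
    return blocks


def pad_align(block):
    """Placeholder aligner (assumption): right-pad with gaps to a common width."""
    width = max(len(s) for s in block)
    return [s.ljust(width, "-") for s in block]


def combine(aligned_blocks):
    """Join the aligned segments row by row into one alignment."""
    return ["".join(rows) for rows in zip(*aligned_blocks)]


def vertical_decomposition_align(sequences, parts=3, align_block=pad_align):
    """Split vertically, align each block independently, then recombine."""
    blocks = vertical_split(sequences, parts)
    return combine([align_block(block) for block in blocks])


if __name__ == "__main__":
    for row in vertical_decomposition_align(["ACGTACGT", "ACGTAC", "ACGACGT"], parts=2):
        print(row)
```

    Any real aligner with the same interface (a list of segments in, a list of equal-length gapped rows out) could be passed as `align_block`.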

  2. Single-Cell Sequencing of the Healthy and Diseased Heart Reveals Ckap4 as a New Modulator of Fibroblasts Activation.

    PubMed

    Gladka, Monika M; Molenaar, Bas; de Ruiter, Hesther; van der Elst, Stefan; Tsui, Hoyee; Versteeg, Danielle; Lacraz, Grègory P A; Huibers, Manon M H; van Oudenaarden, Alexander; van Rooij, Eva

    2018-01-31

    Background: Genome-wide transcriptome analysis has greatly advanced our understanding of the regulatory networks underlying basic cardiac biology and mechanisms driving disease. However, so far, the resolution of studying gene expression patterns in the adult heart has been limited to the level of extracts from whole tissues. The use of tissue homogenates inherently causes the loss of any information on cellular origin or cell type-specific changes in gene expression. Recent developments in RNA amplification strategies provide a unique opportunity to use small amounts of input RNA for genome-wide sequencing of single cells. Methods: Here, we present a method to obtain high quality RNA from digested cardiac tissue from adult mice for automated single-cell sequencing of both the healthy and diseased heart. Results: After optimization, we were able to perform single-cell sequencing on adult cardiac tissue under both homeostatic conditions and after ischemic injury. Clustering analysis based on differential gene expression unveiled known and novel markers of all main cardiac cell types. Based on differential gene expression we were also able to identify multiple subpopulations within a certain cell type. Furthermore, applying single-cell sequencing on both the healthy and the injured heart indicated the presence of disease-specific cell subpopulations. As such, we identified cytoskeleton associated protein 4 (Ckap4) as a novel marker for activated fibroblasts that positively correlates with known myofibroblast markers in both mouse and human cardiac tissue. Ckap4 inhibition in activated fibroblasts treated with TGFβ triggered a greater increase in the expression of genes related to activated fibroblasts compared to control, suggesting a role of Ckap4 in modulating fibroblast activation in the injured heart. Conclusions: Single-cell sequencing on both the healthy and diseased adult heart allows us to study transcriptomic differences between cardiac cells, as well as cell type-specific changes in gene expression during cardiac disease. This new approach provides a wealth of novel insights into molecular changes that underlie the cellular processes relevant for cardiac biology and pathophysiology. Applying this technology could lead to the discovery of new therapeutic targets relevant for heart disease.

  3. Ultrathin type-II GaSb/GaAs quantum wells grown by OMVPE

    NASA Astrophysics Data System (ADS)

    Pitts, O. J.; Watkins, S. P.; Wang, C. X.; Stotz, J. A. H.; Meyer, T. A.; Thewalt, M. L. W.

    2004-09-01

    Heterostructures containing monolayer (ML) and submonolayer GaSb insertions in GaAs were grown using organometallic vapour phase epitaxy. At the GaAs-on-GaSb interface, strong intermixing occurs due to the surface segregation of Sb. To form structures with relatively abrupt interfaces, a flashoff growth sequence, in which growth interruptions are employed to desorb Sb from the surface, was introduced. Reflectance-difference spectroscopy and high-resolution X-ray diffraction data demonstrate that interfacial grading is strongly reduced by this procedure. For layer structures grown with the flashoff sequence, a GaSb coverage up to 1 ML can be obtained in the two-dimensional (2D) growth mode. For uncapped GaSb layers, on the other hand, atomic force microscope images show that the 2D-3D growth mode transition occurs at a submonolayer coverage between 0.3 and 0.5 ML. Low-temperature photoluminescence spectra of multiple quantum well samples grown using the flashoff sequence show a strong quantum well-related peak which shifts to lower energies as the amount of Sb incorporated increases. The PL peak energies are consistent with a type-II band lineup at the GaAs/GaSb interface.

  4. Streptococcus moroccensis sp. nov. and Streptococcus rifensis sp. nov., isolated from raw camel milk.

    PubMed

    Kadri, Zaina; Amar, Mohamed; Ouadghiri, Mouna; Cnockaert, Margo; Aerts, Maarten; El Farricha, Omar; Vandamme, Peter

    2014-07-01

    Two catalase- and oxidase-negative Streptococcus-like strains, LMG 27682(T) and LMG 27684(T), were isolated from raw camel milk in Morocco. Comparative 16S rRNA gene sequencing assigned these bacteria to the genus Streptococcus with Streptococcus rupicaprae 2777-2-07(T) as their closest phylogenetic neighbour (95.9% and 95.7% similarity, respectively). 16S rRNA gene sequence similarity between the two strains was 96.7%. Although strains LMG 27682(T) and LMG 27684(T) shared a DNA-DNA hybridization value that corresponded to the threshold level for species delineation (68%), the two strains could be distinguished by multiple biochemical tests, sequence analysis of the phenylalanyl-tRNA synthase (pheS), RNA polymerase (rpoA) and ATP synthase (atpA) genes and by their MALDI-TOF MS profiles. On the basis of these considerable phenotypic and genotypic differences, we propose to classify both strains as novel species of the genus Streptococcus, for which the names Streptococcus moroccensis sp. nov. (type strain, LMG 27682(T)  = CCMM B831(T)) and Streptococcus rifensis sp. nov. (type strain, LMG 27684(T)  = CCMM B833(T)) are proposed. © 2014 IUMS.

  5. Discovery of ALK-PTPN3 gene fusion from human non-small cell lung carcinoma cell line using next generation RNA sequencing.

    PubMed

    Jung, Yeonjoo; Kim, Pora; Jung, Yeonhwa; Keum, Juhee; Kim, Soon-Nam; Choi, Yong Soo; Do, In-Gu; Lee, Jinseon; Choi, So-Jung; Kim, Sujin; Lee, Jong-Eun; Kim, Jhingook; Lee, Sanghyuk; Kim, Jaesang

    2012-06-01

    An increasing number of chromosomal aberrations is being identified in solid tumors providing novel biomarkers for various types of cancer and new insights into the mechanisms of carcinogenesis. We applied next generation sequencing technique to analyze the transcriptome of the non-small cell lung carcinoma (NSCLC) cell line H2228 and discovered a fusion transcript composed of multiple exons of ALK (anaplastic lymphoma receptor tyrosine kinase) and PTPN3 (protein tyrosine phosphatase, nonreceptor Type 3). Detailed analysis of the genomic structure revealed that a portion of genomic region encompassing Exons 10 and 11 of ALK has been translocated into the intronic region between Exons 2 and 3 of PTPN3. The key net result appears to be the null mutation of one allele of PTPN3, a gene with tumor suppressor activity. Consistently, ectopic expression of PTPN3 in NSCLC cell lines led to inhibition of colony formation. Our study confirms the utility of next generation sequencing as a tool for the discovery of somatic mutations and has led to the identification of a novel mutation in NSCLC that may be of diagnostic, prognostic, and therapeutic importance. Copyright © 2012 Wiley Periodicals, Inc.

  6. Novel genomic findings in multiple myeloma identified through routine diagnostic sequencing.

    PubMed

    Ryland, Georgina L; Jones, Kate; Chin, Melody; Markham, John; Aydogan, Elle; Kankanige, Yamuna; Caruso, Marisa; Guinto, Jerick; Dickinson, Michael; Prince, H Miles; Yong, Kwee; Blombery, Piers

    2018-05-14

    Multiple myeloma is a genomically complex haematological malignancy with many genomic alterations recognised as important in diagnosis, prognosis and therapeutic decision making. Here, we provide a summary of genomic findings identified through routine diagnostic next-generation sequencing at our centre. A cohort of 86 patients with multiple myeloma underwent diagnostic sequencing using a custom hybridisation-based panel targeting 104 genes. Sequence variants, genome-wide copy number changes and structural rearrangements were detected using an in-house-developed bioinformatics pipeline. At least one mutation was found in 69 (80%) patients. Frequently mutated genes included TP53 (36%), KRAS (22.1%), NRAS (15.1%), FAM46C/DIS3 (8.1%) and TET2/FGFR3 (5.8%), including multiple mutations not previously described in myeloma. Importantly, we observed TP53 mutations in the absence of a 17p deletion in 8% of the cohort, highlighting the need for sequencing-based assessment in addition to cytogenetics to identify these high-risk patients. Multiple novel copy number changes and immunoglobulin heavy chain translocations are also discussed. Our results demonstrate that many clinically relevant genomic findings remain in multiple myeloma which have not yet been identified through large-scale sequencing efforts, and provide important mechanistic insights into plasma cell pathobiology. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  7. Sequence analyses of fimbriae subunit FimA proteins on Actinomyces naeslundii genospecies 1 and 2 and Actinomyces odontolyticus with variant carbohydrate binding specificities

    PubMed Central

    Drobni, Mirva; Hallberg, Kristina; Öhman, Ulla; Birve, Anna; Persson, Karina; Johansson, Ingegerd; Strömberg, Nicklas

    2006-01-01

    Background Actinomyces naeslundii genospecies 1 and 2 express type-2 fimbriae (FimA subunit polymers) with variant Galβ binding specificities and Actinomyces odontolyticus a sialic acid specificity to colonize different oral surfaces. However, the fimbrial nature of the sialic acid binding property and sequence information about FimA proteins from multiple strains are lacking. Results Here we have sequenced fimA genes from strains of A. naeslundii genospecies 1 (n = 4) and genospecies 2 (n = 4), both of which harboured variant Galβ-dependent hemagglutination (HA) types, and from A. odontolyticus PK984 with a sialic acid-dependent HA pattern. Three unique subtypes of FimA proteins with 63.8–66.4% sequence identity were present in strains of A. naeslundii genospecies 1 and 2 and A. odontolyticus. The generally high FimA sequence identity (>97.2%) within a genospecies revealed species specific sequences or segments that coincided with binding specificity. All three FimA protein variants contained a signal peptide, pilin motif, E box, proline-rich segment and an LPXTG sorting motif among other conserved segments for secretion, assembly and sorting of fimbrial proteins. The highly conserved pilin, E box and LPXTG motifs are present in fimbriae proteins from other Gram-positive bacteria. Moreover, only strains of genospecies 1 were agglutinated with type-2 fimbriae antisera derived from A. naeslundii genospecies 1 strain 12104, emphasizing that the overall folding of FimA may generate different functionalities. Western blot analyses with FimA antisera revealed monomers and oligomers of FimA in whole cell protein extracts and a purified recombinant FimA preparation, indicating a sortase-independent oligomerization of FimA. Conclusion The genus Actinomyces involves a diversity of unique FimA proteins with conserved pilin, E box and LPXTG motifs, depending on subspecies and associated binding specificity. In addition, a sortase-independent oligomerization of FimA subunit proteins in solution was indicated. PMID:16686953

  8. A standardized framework for accurate, high-throughput genotyping of recombinant and non-recombinant viral sequences.

    PubMed

    Alcantara, Luiz Carlos Junior; Cassol, Sharon; Libin, Pieter; Deforche, Koen; Pybus, Oliver G; Van Ranst, Marc; Galvão-Castro, Bernardo; Vandamme, Anne-Mieke; de Oliveira, Tulio

    2009-07-01

    Human immunodeficiency virus type-1 (HIV-1), hepatitis B and C and other rapidly evolving viruses are characterized by extremely high levels of genetic diversity. To facilitate diagnosis and the development of prevention and treatment strategies that efficiently target the diversity of these viruses, and other pathogens such as human T-lymphotropic virus type-1 (HTLV-1), human herpes virus type-8 (HHV8) and human papillomavirus (HPV), we developed a rapid high-throughput-genotyping system. The method involves the alignment of a query sequence with a carefully selected set of pre-defined reference strains, followed by phylogenetic analysis of multiple overlapping segments of the alignment using a sliding window. Each segment of the query sequence is assigned the genotype and sub-genotype of the reference strain with the highest bootstrap (>70%) and bootscanning (>90%) scores. Results from all windows are combined and displayed graphically using color-coded genotypes. The new Virus-Genotyping Tools provide accurate classification of recombinant and non-recombinant viruses and are currently being assessed for their diagnostic utility. They have been incorporated into several HIV drug resistance algorithms including the Stanford (http://hivdb.stanford.edu) and two European databases (http://www.umcutrecht.nl/subsite/spread-programme/ and http://www.hivrdb.org.uk/) and have been successfully used to genotype a large number of sequences in these and other databases. The tools are a PHP/JAVA web application and are freely accessible on a number of servers including: http://bioafrica.mrc.ac.za/rega-genotype/html/, http://lasp.cpqgm.fiocruz.br/virus-genotype/html/, http://jose.med.kuleuven.be/genotypetool/html/.
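
    The sliding-window assignment described in this abstract can be illustrated with a simplified sketch. It is not the published tool: the per-window phylogenetic bootstrap and bootscanning analysis is replaced here by plain percent identity, and the function names, window size, step and the 0.9 threshold are all assumptions chosen for illustration. The query and the references are assumed to be rows of one alignment and therefore of equal length.

```python
def identity(a, b):
    """Fraction of matching positions, ignoring columns where either row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def genotype_windows(query, references, window=400, step=100, min_identity=0.9):
    """Assign each window the best-scoring reference genotype, or None if below threshold.

    Identity stands in for the bootstrap support used by the real method (assumption).
    """
    calls = []
    last_start = max(len(query) - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        scores = {name: identity(query[start:end], ref[start:end])
                  for name, ref in references.items()}
        best = max(scores, key=scores.get)
        calls.append((start, end, best if scores[best] >= min_identity else None))
    return calls


if __name__ == "__main__":
    refs = {"A": "A" * 16, "B": "C" * 16}
    recombinant = "A" * 8 + "C" * 8
    for call in genotype_windows(recombinant, refs, window=8, step=4):
        print(call)
```

    In the toy example the window that spans the breakpoint gets no call, which mirrors how a recombinant query shows different genotypes on either side of an unresolved region.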

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baraibar, Martin A.; Muhoberac, Barry B.; Garringer, Holly J.

    Mutations in the coding sequence of the ferritin light chain (FTL) gene cause a neurodegenerative disease known as neuroferritinopathy or hereditary ferritinopathy, which is characterized by the presence of intracellular inclusion bodies containing the mutant FTL polypeptide and by abnormal accumulation of iron in the brain. Here, we describe the x-ray crystallographic structure and report functional studies of ferritin homopolymers formed from the mutant FTL polypeptide p.Phe167SerfsX26, which has a C terminus that is altered in amino acid sequence and length. The structure was determined and refined to 2.85 Å resolution and was very similar to the wild type between residues Ile-5 and Arg-154. However, instead of the E-helices normally present in wild type ferritin, the C-terminal sequences of all 24 mutant subunits showed substantial amounts of disorder, leading to multiple C-terminal polypeptide conformations and a large disruption of the normally tiny 4-fold axis pores. Functional studies underscored the importance of the mutant C-terminal sequence in iron-induced precipitation and revealed iron mishandling by soluble mutant FTL homopolymers in that only wild type incorporated iron when in direct competition in solution with mutant ferritin. Even without competition, the amount of iron incorporation over the first few minutes differed severalfold. Our data suggest that disruption at the 4-fold pores may lead to direct iron mishandling through attenuated iron incorporation by the soluble form of mutant ferritin and that the disordered C-terminal polypeptides may play a major role in iron-induced precipitation and formation of ferritin inclusion bodies in hereditary ferritinopathy.

  10. Diversity of mitochondrial large subunit rDNA haplotypes of Glomus intraradices in two agricultural field experiments and two semi-natural grasslands.

    PubMed

    Börstler, Boris; Thiéry, Odile; Sýkorová, Zuzana; Berner, Alfred; Redecker, Dirk

    2010-04-01

    Glomus intraradices, an arbuscular mycorrhizal fungus (AMF), is frequently found in a surprisingly wide range of ecosystems all over the world. It is used as model organism for AMF and its genome is being sequenced. Despite the ecological importance of AMF, little has been known about their population structure, because no adequate molecular markers have been available. In the present study we analyse for the first time the intraspecific genetic structure of an AMF directly from colonized roots in the field. A recently developed PCR-RFLP approach for the mitochondrial rRNA large subunit gene (mtLSU) of these obligate symbionts was used and complemented by sequencing and primers specific for a particularly frequent mtLSU haplotype. We analysed root samples from two agricultural field experiments in Switzerland and two semi-natural grasslands in France and Switzerland. RFLP type composition of G. intraradices (phylogroup GLOM A-1) differed strongly between agricultural and semi-natural sites and the G. intraradices populations of the two agricultural sites were significantly differentiated. RFLP type richness was higher in the agricultural sites compared with the grasslands. Detailed sequence analyses which resolved multiple sequence haplotypes within some RFLP types even revealed that there was no overlap of haplotypes among any of the study sites except between the two grasslands. Our results demonstrate a surprisingly high differentiation among semi-natural and agricultural field sites for G. intraradices. These findings will have major implications on our views of processes of adaptation and specialization in these plant/fungus associations.

  11. Explicit pre-training instruction does not improve implicit perceptual-motor sequence learning

    PubMed Central

    Sanchez, Daniel J.; Reber, Paul J.

    2012-01-01

    Memory systems theory argues for separate neural systems supporting implicit and explicit memory in the human brain. Neuropsychological studies support this dissociation, but empirical studies of cognitively healthy participants generally observe that both kinds of memory are acquired to at least some extent, even in implicit learning tasks. A key question is whether this observation reflects parallel intact memory systems or an integrated representation of memory in healthy participants. Learning of complex tasks in which both explicit instruction and practice is used depends on both kinds of memory, and how these systems interact will be an important component of the learning process. Theories that posit an integrated, or single, memory system for both types of memory predict that explicit instruction should contribute directly to strengthening task knowledge. In contrast, if the two types of memory are independent and acquired in parallel, explicit knowledge should have no direct impact and may serve in a “scaffolding” role in complex learning. Using an implicit perceptual-motor sequence learning task, the effect of explicit pre-training instruction on skill learning and performance was assessed. Explicit pre-training instruction led to robust explicit knowledge, but sequence learning did not benefit from the contribution of pre-training sequence memorization. The lack of an instruction benefit suggests that during skill learning, implicit and explicit memory operate independently. While healthy participants will generally accrue parallel implicit and explicit knowledge in complex tasks, these types of information appear to be separately represented in the human brain consistent with multiple memory systems theory. PMID:23280147

  12. Sequence History Update Tool

    NASA Technical Reports Server (NTRS)

    Khanampompan, Teerapat; Gladden, Roy; Fisher, Forest; DelGuercio, Chris

    2008-01-01

    The Sequence History Update Tool performs Web-based sequence statistics archiving for Mars Reconnaissance Orbiter (MRO). Using a single UNIX command, the software takes advantage of sequencing conventions to automatically extract the needed statistics from multiple files. This information is then used to populate a PHP database, which is then seamlessly formatted into a dynamic Web page. This tool replaces a previous tedious and error-prone process of manually editing HTML code to construct a Web-based table. Because the tool manages all of the statistics gathering and file delivery to and from multiple data sources spread across multiple servers, there is also a considerable time and effort savings. With the use of The Sequence History Update Tool what previously took minutes is now done in less than 30 seconds, and now provides a more accurate archival record of the sequence commanding for MRO.

  13. Differential evolution-simulated annealing for multiple sequence alignment

    NASA Astrophysics Data System (ADS)

    Addawe, R. C.; Addawe, J. M.; Sueño, M. R. K.; Magadia, J. C.

    2017-10-01

    Multiple sequence alignments (MSA) are used in the analysis of molecular evolution and sequence structure relationships. In this paper, a hybrid algorithm, Differential Evolution - Simulated Annealing (DESA) is applied in optimizing multiple sequence alignments (MSAs) based on structural information, non-gaps percentage and totally conserved columns. DESA is a robust algorithm characterized by self-organization, mutation, crossover, and SA-like selection scheme of the strategy parameters. Here, the MSA problem is treated as a multi-objective optimization problem of the hybrid evolutionary algorithm, DESA. Thus, we name the algorithm as DESA-MSA. Simulated sequences and alignments were generated to evaluate the accuracy and efficiency of DESA-MSA using different indel sizes, sequence lengths, deletion rates and insertion rates. The proposed hybrid algorithm obtained acceptable solutions particularly for the MSA problem evaluated based on the three objectives.
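
    Two of the three objectives named in this abstract, the non-gaps percentage and the number of totally conserved columns, are simple to compute for a given alignment. The sketch below shows one plausible reading of them; the exact definitions used in DESA-MSA are not given in the abstract, so the function names and the convention that a conserved column must contain no gaps are assumptions. The structural-information objective is not covered.

```python
def non_gap_percentage(alignment):
    """Percentage of alignment cells that hold a residue rather than a gap ('-')."""
    total = sum(len(row) for row in alignment)
    non_gaps = sum(ch != "-" for row in alignment for ch in row)
    return 100.0 * non_gaps / total


def totally_conserved_columns(alignment):
    """Count columns in which every row has the same residue and no row has a gap."""
    return sum(
        1 for column in zip(*alignment)
        if "-" not in column and len(set(column)) == 1
    )


if __name__ == "__main__":
    msa = ["ACG-T", "ACGAT", "A-GAT"]
    print(non_gap_percentage(msa), totally_conserved_columns(msa))
```

    An optimizer such as the one described would try to raise both values at once, which is why the abstract frames the task as multi-objective.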

  14. Capture-based next-generation sequencing reveals multiple actionable mutations in cancer patients failed in traditional testing.

    PubMed

    Xie, Jing; Lu, Xiongxiong; Wu, Xue; Lin, Xiaoyi; Zhang, Chao; Huang, Xiaofang; Chang, Zhili; Wang, Xinjing; Wen, Chenlei; Tang, Xiaomei; Shi, Minmin; Zhan, Qian; Chen, Hao; Deng, Xiaxing; Peng, Chenghong; Li, Hongwei; Fang, Yuan; Shao, Yang; Shen, Baiyong

    2016-05-01

    Targeted therapies, including monoclonal antibodies and small molecule inhibitors, have dramatically changed the treatment of cancer over the past 10 years. Their therapeutic advantages are greater tumor specificity and fewer side effects. To precisely tailor available targeted therapies to each individual or a subset of cancer patients, next-generation sequencing (NGS) has been utilized as a promising diagnostic tool with its advantages of accuracy, sensitivity, and high throughput. We developed and validated an NGS-based cancer genomic diagnosis targeting 115 prognosis- and therapeutics-relevant genes on multiple specimens including blood, tumor tissue, and body fluid from 10 patients with different cancer types. The sequencing data were then analyzed by clinically applicable analytical pipelines developed in house. We assessed the analytical sensitivity, specificity, and accuracy of the NGS-based molecular diagnosis. Our analytical pipelines were also capable of detecting base substitutions, indels, and gene copy number variations (CNVs). For instance, several actionable mutations of EGFR, PIK3CA, TP53, and KRAS were detected, indicating drug susceptibility and resistance in cases of lung cancer. Our study has shown that NGS-based molecular diagnosis is more sensitive and comprehensive in detecting genomic alterations in cancer, and supports direct clinical use for guiding targeted therapy.

  15. Genetic heterogeneity in type 1 Gaucher disease: Multiple genotypes in Ashkenazic and non-Ashkenazic individuals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tsuji, Shoji; Martin, B.M.; Stubblefield, B.K.

    1988-04-01

    Nucleotide sequence analysis of a genomic clone from an Ashkenazic Jewish patient with type 1 Gaucher disease revealed a single-base mutation (adenosine to guanosine transition) in exon 9 of the glucocerebrosidase gene. This change results in the amino acid substitution of serine for asparagine. Transient expression studies following oligonucleotide-directed mutagenesis of the normal cDNA confirmed that the mutation results in loss of glucocerebrosidase activity. Allele-specific hybridization with oligonucleotide probes demonstrated that this mutation was found exclusively in type 1 phenotype. None of the 6 type 2 patients, 11 type 3 patients, or 12 normal controls had this allele. In contrast, 15 of 24 type 1 patients had one allele with this mutation, and 3 others were homozygous for the mutation. Furthermore, some of the Ashkenazic Jewish type 1 patients had only one allele with this mutation, suggesting that even in this population there is allelic heterozygosity. These findings indicate that there are multiple allelic mutations responsible for type 1 Gaucher disease in both the Jewish and non-Jewish populations. Allelic-specific hybridization demonstrating this mutation in exon 9, used in conjunction with the Nci I restriction fragment length polymorphism described as a marker for neuronopathic Gaucher disease, provides a tool for diagnosis and genetic counseling that is ~80% informative in all Gaucher patients studied.

  16. Bisulfite-independent analysis of CpG island methylation enables genome-scale stratification of single cells

    PubMed Central

    Han, Lin; Wu, Hua-Jun; Zhu, Haiying; Kim, Kun-Yong; Marjani, Sadie L.; Riester, Markus; Euskirchen, Ghia; Zi, Xiaoyuan; Yang, Jennifer; Han, Jasper; Snyder, Michael; Park, In-Hyun; Irizarry, Rafael; Weissman, Sherman M.

    2017-01-01

    Abstract Conventional DNA bisulfite sequencing has been extended to single cell level, but the coverage consistency is insufficient for parallel comparison. Here we report a novel method for genome-wide CpG island (CGI) methylation sequencing for single cells (scCGI-seq), combining methylation-sensitive restriction enzyme digestion and multiple displacement amplification for selective detection of methylated CGIs. We applied this method to analyzing single cells from two types of hematopoietic cells, K562 and GM12878, and small populations of fibroblasts and induced pluripotent stem cells. The method detected 21 798 CGIs (76% of all CGIs) per cell, and the number of CGIs consistently detected from all 16 profiled single cells was 20 864 (72.7%), with 12 961 promoters covered. This coverage represents a substantial improvement over results obtained using single cell reduced representation bisulfite sequencing, with a 66-fold increase in the fraction of consistently profiled CGIs across individual cells. Single cells of the same type were more similar to each other than to other types, but also displayed epigenetic heterogeneity. The method was further validated by comparing the CpG methylation pattern, methylation profile of CGIs/promoters and repeat regions and 41 classes of known regulatory markers to the ENCODE data. Although not every minor methylation difference between cells is detectable, scCGI-seq provides a solid tool for unsupervised stratification of a heterogeneous cell population. PMID:28126923

  17. Triple helix-forming oligonucleotide corresponding to the polypyrimidine sequence in the rat alpha 1(I) collagen promoter specifically inhibits factor binding and transcription.

    PubMed

    Kovacs, A; Kandala, J C; Weber, K T; Guntaka, R V

    1996-01-19

    Type I and III fibrillar collagens are the major structural proteins of the extracellular matrix found in various organs including the myocardium. Abnormal and progressive accumulation of fibrillar type I collagen in the interstitial spaces compromises organ function and therefore, the study of transcriptional regulation of this gene and specific targeting of its expression is of major interest. Transient transfection of adult cardiac fibroblasts indicates that the polypurine-polypyrimidine sequence of alpha 1(I) collagen promoter between nucleotides -200 and -140 represents an overall positive regulatory element. DNase I footprinting and electrophoretic mobility shift assays suggest that multiple factors bind to different elements of this promoter region. We further demonstrate that the unique polypyrimidine sequence between -172 and -138 of the promoter represents a suitable target for a single-stranded polypurine oligonucleotide (TFO) to form a triple helix DNA structure. Modified electrophoretic mobility shift assays show that this TFO specifically inhibits the protein-DNA interaction within the target region. In vitro transcription assays and transient transfection experiments demonstrate that the transcriptional activity of the promoter is inhibited by this oligonucleotide. We propose that TFOs represent a therapeutic potential to specifically influence the expression of alpha 1(I) collagen gene in various disease states where abnormal type I collagen accumulation is known to occur.

  18. Analysis of enterovirus types in patients with symptoms of aseptic meningitis in 2014 in Shandong, China.

    PubMed

    Chen, Peng; Lin, Xiaojuan; Liu, Guifang; Wang, Suting; Song, Lizhi; Tao, Zexin; Xu, Aiqiang

    2018-03-01

    We reviewed the epidemiological and clinical characteristics of 927 aseptic meningitis patients in Shandong in 2014, and the phylogeny of predominant enterovirus (EV) types causing this disease was analyzed. A total of 209 patients that were positive for EV were identified by both cell culture and a reverse transcription-seminested PCR in cerebrospinal fluid samples. The positive patients were most likely to be children within 15 years of age, and had symptoms such as fever, vomiting and nausea (P < .05). The 209 EV sequences belonged to 11 types, and coxsackievirus B5 and echovirus types 6 and 30 were the predominant types. VP1 analysis showed that multiple lineages were co-circulating. The significance of the study is that surveillance is important to monitor the prevalence of EV types in the population, and it shows that enterovirus meningitis remains an important public health problem in China. Copyright © 2018 Elsevier Inc. All rights reserved.

  19. Clinical Epidemiology and Molecular Analysis of Extended-Spectrum-β-Lactamase-Producing Escherichia coli in Nepal: Characteristics of Sequence Types 131 and 648

    PubMed Central

    Sherchan, Jatan Bahadur; Miyoshi-Akiyama, Tohru; Ohmagari, Norio; Kirikae, Teruo; Nagamatsu, Maki; Tojo, Masayoshi; Ohara, Hiroshi; Sherchand, Jeevan B.; Tandukar, Sarmila

    2015-01-01

    Recently, CTX-M-type extended-spectrum-β-lactamase (ESBL)-producing Escherichia coli strains have emerged worldwide. In particular, E. coli with O antigen type 25 (O25) and sequence type 131 (ST131), which is often associated with the CTX-M-15 ESBL, has been increasingly reported globally; however, epidemiology reports on ESBL-producing E. coli in Asia are limited. Patients with clinical isolates of ESBL-producing E. coli in the Tribhuvan University teaching hospital in Kathmandu, Nepal, were included in this study. Whole-genome sequencing of the isolates was conducted to analyze multilocus sequence types, phylotypes, virulence genotypes, O25b-ST131 clones, and distribution of acquired drug resistance genes. During the study period, 105 patients with ESBL-producing E. coli isolation were identified, and the majority (90%) of these isolates were CTX-M-15 positive. The most dominant ST was ST131 (n = 54; 51.4%), followed by ST648 (n = 15; 14.3%). All ST131 isolates were identified as O25b-ST131 clones, subclone H30-Rx. Three ST groups (ST131, ST648, and non-ST131/648) were compared in further analyses. ST648 isolates had a proportionally higher resistance to non-β-lactam antibiotics and featured drug-resistant genes more frequently than ST131 or non-ST131/648 isolates. ST131 possessed the most virulence genes, followed by ST648. The clinical characteristics were similar among groups. More than 38% of ESBL-producing E. coli isolates were from the outpatient clinic, and pregnant patients comprised 24% of ESBL-producing E. coli cases. We revealed that the high resistance of ESBL-producing E. coli to multiple classes of antibiotics in Nepal is driven mainly by CTX-M-producing ST131 and ST648. Their immense prevalence in the communities is a matter of great concern. PMID:25824221

  20. Enhanced sequencing coverage with digital droplet multiple displacement amplification

    PubMed Central

    Sidore, Angus M.; Lan, Freeman; Lim, Shaun W.; Abate, Adam R.

    2016-01-01

    Sequencing small quantities of DNA is important for applications ranging from the assembly of uncultivable microbial genomes to the identification of cancer-associated mutations. To obtain sufficient quantities of DNA for sequencing, the small amount of starting material must be amplified significantly. However, existing methods often yield errors or non-uniform coverage, reducing sequencing data quality. Here, we describe digital droplet multiple displacement amplification, a method that enables massive amplification of low-input material while maintaining sequence accuracy and uniformity. The low-input material is compartmentalized as single molecules in millions of picoliter droplets. Because the molecules are isolated in compartments, they amplify to saturation without competing for resources; this yields uniform representation of all sequences in the final product and, in turn, enhances the quality of the sequence data. We demonstrate the ability to uniformly amplify the genomes of single Escherichia coli cells, comprising just 4.7 fg of starting DNA, and obtain sequencing coverage distributions that rival that of unamplified material. Digital droplet multiple displacement amplification provides a simple and effective method for amplifying minute amounts of DNA for accurate and uniform sequencing. PMID:26704978

  1. Burkholderia pseudomallei sequencing identifies genomic clades with distinct recombination, accessory, and epigenetic profiles

    PubMed Central

    Nandi, Tannistha; Holden, Matthew T.G.; Didelot, Xavier; Mehershahi, Kurosh; Boddey, Justin A.; Beacham, Ifor; Peak, Ian; Harting, John; Baybayan, Primo; Guo, Yan; Wang, Susana; How, Lee Chee; Sim, Bernice; Essex-Lopresti, Angela; Sarkar-Tyson, Mitali; Nelson, Michelle; Smither, Sophie; Ong, Catherine; Aw, Lay Tin; Hoon, Chua Hui; Michell, Stephen; Studholme, David J.; Titball, Richard; Chen, Swaine L.; Parkhill, Julian

    2015-01-01

    Burkholderia pseudomallei (Bp) is the causative agent of the infectious disease melioidosis. To investigate population diversity, recombination, and horizontal gene transfer in closely related Bp isolates, we performed whole-genome sequencing (WGS) on 106 clinical, animal, and environmental strains from a restricted Asian locale. Whole-genome phylogenies resolved multiple genomic clades of Bp, largely congruent with multilocus sequence typing (MLST). We discovered widespread recombination in the Bp core genome, involving hundreds of regions associated with multiple haplotypes. Highly recombinant regions exhibited functional enrichments that may contribute to virulence. We observed clade-specific patterns of recombination and accessory gene exchange, and provide evidence that this is likely due to ongoing recombination between clade members. Reciprocally, interclade exchanges were rarely observed, suggesting mechanisms restricting gene flow between clades. Interrogation of accessory elements revealed that each clade harbored a distinct complement of restriction-modification (RM) systems, predicted to cause clade-specific patterns of DNA methylation. Using methylome sequencing, we confirmed that representative strains from separate clades indeed exhibit distinct methylation profiles. Finally, using an E. coli system, we demonstrate that Bp RM systems can inhibit uptake of non-self DNA. Our data suggest that RM systems borne on mobile elements, besides preventing foreign DNA invasion, may also contribute to limiting exchanges of genetic material between individuals of the same species. Genomic clades may thus represent functional units of genetic isolation in Bp, modulating intraspecies genetic diversity. PMID:25236617

  2. Whole genome sequencing analyses of Listeria monocytogenes that persisted in a milkshake machine for a year and caused illnesses in Washington State.

    PubMed

    Li, Zhen; Pérez-Osorio, Ailyn; Wang, Yu; Eckmann, Kaye; Glover, William A; Allard, Marc W; Brown, Eric W; Chen, Yi

    2017-06-15

    In 2015, in addition to a United States multistate outbreak linked to contaminated ice cream, another outbreak linked to ice cream was reported in the Pacific Northwest of the United States. It was a hospital-acquired outbreak linked to milkshakes, made from contaminated ice cream mixes and milkshake maker, served to patients. Here we performed multiple analyses on isolates associated with this outbreak: pulsed-field gel electrophoresis (PFGE), whole genome single nucleotide polymorphism (SNP) analysis, species-specific core genome multilocus sequence typing (cgMLST), lineage-specific cgMLST and whole genome-specific MLST (wgsMLST)/outbreak-specific cgMLST. We also analyzed the prophages and virulence genes. The outbreak isolates belonged to sequence type 1038, clonal complex 101, genetic lineage II. There were no premature stop codons in inlA. Isolates contained Listeria Pathogenicity Island 1 and multiple internalins. PFGE and multiple whole genome sequencing (WGS) analyses all clustered together food, environmental and clinical isolates when compared to an outgroup from the same clonal complex, which supported the finding that L. monocytogenes likely persisted in the soft serve ice cream/milkshake maker from November 2014 to November 2015 and caused 3 illnesses, and that the outbreak strain was transmitted between two ice cream production facilities. The whole genome SNP analysis, one of the two species-specific cgMLST, the lineage II-specific cgMLST and the wgsMLST/outbreak-specific cgMLST showed that L. monocytogenes cells that persisted in the milkshake maker for a year formed a unique clade inside the outbreak cluster. This clustering was consistent with the cleaning practice after the outbreak was initially recognized in late 2014 and early 2015. Putative prophages were conserved among prophage-containing isolates. The loss of a putative prophage in two isolates resulted in the loss of the AscI restriction site in the prophage, which contributed to their AscI-PFGE banding pattern differences from other isolates. The high resolution of WGS analyses allowed the differentiation of epidemiologically unrelated isolates, as well as the elucidation of the microevolution and persistence of isolates within the scope of one outbreak. We applied a wgsMLST scheme which is essentially the outbreak-specific cgMLST. This scheme can be combined with lineage-specific cgMLST and species-specific cgMLST to maximize the resolution of WGS.

  3. Prevalence of Avian-Pathogenic Escherichia coli Strain O1 Genomic Islands among Extraintestinal and Commensal E. coli Isolates

    PubMed Central

    Johnson, Timothy J.; Wannemuehler, Yvonne; Kariyawasam, Subhashinie; Johnson, James R.; Logue, Catherine M.

    2012-01-01

    Escherichia coli strains that cause disease outside the intestine are known as extraintestinal pathogenic E. coli (ExPEC) and include pathogens of humans and animals. Previously, the genome of avian-pathogenic E. coli (APEC) O1:K1:H7 strain O1, from ST95, was sequenced and compared to those of several other E. coli strains, identifying 43 genomic islands. Here, the genomic islands of APEC O1 were compared to those of other sequenced E. coli strains, and the distribution of 81 genes belonging to 12 APEC O1 genomic islands among 828 human and avian ExPEC and commensal E. coli isolates was determined. Multiple islands were highly prevalent among isolates belonging to the O1 and O18 serogroups within phylogenetic group B2, which are implicated in human neonatal meningitis. Because of the extensive genomic similarities between APEC O1 and other human ExPEC strains belonging to the ST95 phylogenetic lineage, its ability to cause disease in a rat model of sepsis and meningitis was assessed. Unlike other ST95 lineage strains, APEC O1 was unable to cause bacteremia or meningitis in the neonatal rat model and was significantly less virulent than uropathogenic E. coli (UPEC) CFT073 in a mouse sepsis model, despite carrying multiple neonatal meningitis E. coli (NMEC) virulence factors and belonging to the ST95 phylogenetic lineage. These results suggest that host adaptation or genome modifications have occurred either in APEC O1 or in highly virulent ExPEC isolates, resulting in differences in pathogenicity. Overall, the genomic islands examined provide targets for further discrimination of the different ExPEC subpathotypes, serogroups, phylogenetic types, and sequence types. PMID:22467781

  4. Prevalence of avian-pathogenic Escherichia coli strain O1 genomic islands among extraintestinal and commensal E. coli isolates.

    PubMed

    Johnson, Timothy J; Wannemuehler, Yvonne; Kariyawasam, Subhashinie; Johnson, James R; Logue, Catherine M; Nolan, Lisa K

    2012-06-01

    Escherichia coli strains that cause disease outside the intestine are known as extraintestinal pathogenic E. coli (ExPEC) and include pathogens of humans and animals. Previously, the genome of avian-pathogenic E. coli (APEC) O1:K1:H7 strain O1, from ST95, was sequenced and compared to those of several other E. coli strains, identifying 43 genomic islands. Here, the genomic islands of APEC O1 were compared to those of other sequenced E. coli strains, and the distribution of 81 genes belonging to 12 APEC O1 genomic islands among 828 human and avian ExPEC and commensal E. coli isolates was determined. Multiple islands were highly prevalent among isolates belonging to the O1 and O18 serogroups within phylogenetic group B2, which are implicated in human neonatal meningitis. Because of the extensive genomic similarities between APEC O1 and other human ExPEC strains belonging to the ST95 phylogenetic lineage, its ability to cause disease in a rat model of sepsis and meningitis was assessed. Unlike other ST95 lineage strains, APEC O1 was unable to cause bacteremia or meningitis in the neonatal rat model and was significantly less virulent than uropathogenic E. coli (UPEC) CFT073 in a mouse sepsis model, despite carrying multiple neonatal meningitis E. coli (NMEC) virulence factors and belonging to the ST95 phylogenetic lineage. These results suggest that host adaptation or genome modifications have occurred either in APEC O1 or in highly virulent ExPEC isolates, resulting in differences in pathogenicity. Overall, the genomic islands examined provide targets for further discrimination of the different ExPEC subpathotypes, serogroups, phylogenetic types, and sequence types.

  5. Comparison of the performance in detection of HPV infections between the high-risk HPV genotyping real time PCR and the PCR-reverse dot blot assays.

    PubMed

    Zhang, Lahong; Dai, Yibei; Chen, Jiahuan; Hong, Liquan; Liu, Yuhua; Ke, Qiang; Chen, Yiwen; Cai, Chengsong; Liu, Xia; Chen, Zhaojun

    2018-01-01

    A new multiplex real-time PCR assay, the high-risk HPV genotyping real time PCR assay (HR HPV RT-PCR), has been developed to detect 15 high-risk HPV types with respective viral loads. In this report, a total of 684 cervical specimens from women diagnosed with vaginitis were assessed by the HR HPV RT-PCR and the PCR reaction and reverse dot blot (PCR-RDB) assays, using a PCR-sequencing method as a reference standard. A total coincidence of 97.7% between the HR HPV RT-PCR and the PCR-RDB assays was determined with a Kappa value of 0.953. The HR HPV RT-PCR assay had sensitivity, specificity, and concordance rates (accuracy) of 99.7%, 99.7%, and 99.7%, respectively, as confirmed by PCR-sequencing, while the PCR-RDB assay had respective rates of 98.8%, 97.1%, and 98.0%. The overall rate of HPV infection, determined by PCR-sequencing, in women diagnosed with vaginitis was 49.85%, including 36.26% of single infections and 13.6% of multiple infections. The most common infections among the 15 high-risk HPV types in women diagnosed with vaginitis were HPV-52, HPV-16, and HPV-58, with a total detection rate of 10.23%, 7.75%, and 5.85%, respectively. We conclude that the HR HPV RT-PCR assay exhibits better clinical performance than the PCR-RDB assay, and is an ideal alternative method for HPV genotyping. In addition, the HR HPV RT-PCR assay provides HPV DNA viral loads, and could serve as a quantitative marker in the diagnosis and treatment of single and multiple HPV infections. © 2017 Wiley Periodicals, Inc.
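
    The agreement statistics quoted in the record above (sensitivity, specificity, accuracy, and Cohen's kappa) all derive from a 2x2 table of assay calls against a reference. The following is a minimal sketch of those standard formulas, not the authors' analysis: the class name, function names, and the counts used in the example are hypothetical and are not taken from the study. The kappa formula applies equally to a comparison between two assays, as in the record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of assay calls versus a reference method (hypothetical container)."""
    tp: int  # assay positive, reference positive
    fp: int  # assay positive, reference negative
    fn: int  # assay negative, reference positive
    tn: int  # assay negative, reference negative

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def sensitivity(t: TwoByTwo) -> float:
    # Fraction of reference-positive specimens the assay calls positive.
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    # Fraction of reference-negative specimens the assay calls negative.
    return t.tn / (t.tn + t.fp)


def accuracy(t: TwoByTwo) -> float:
    # Overall concordance with the reference.
    return (t.tp + t.tn) / t.n


def cohen_kappa(t: TwoByTwo) -> float:
    # Observed agreement corrected for the agreement expected by chance.
    observed = (t.tp + t.tn) / t.n
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / t.n ** 2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Illustrative counts only; they do not come from the record above.
    table = TwoByTwo(tp=90, fp=5, fn=10, tn=95)
    print(sensitivity(table), specificity(table), accuracy(table), cohen_kappa(table))
```

    Keeping the four counts in one immutable object makes it hard to mix up cells when the same table is passed to several metrics.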

  6. Multilocus Sequence Analysis of Nectar Pseudomonads Reveals High Genetic Diversity and Contrasting Recombination Patterns

    PubMed Central

    Álvarez-Pérez, Sergio; de Vega, Clara; Herrera, Carlos M.

    2013-01-01

    The genetic and evolutionary relationships among floral nectar-dwelling Pseudomonas ‘sensu stricto’ isolates associated with South African and Mediterranean plants were investigated by multilocus sequence analysis (MLSA) of four core housekeeping genes (rrs, gyrB, rpoB and rpoD). A total of 35 different sequence types were found for the 38 nectar bacterial isolates characterised. Phylogenetic analyses resulted in the identification of three main clades [nectar groups (NGs) 1, 2 and 3] of nectar pseudomonads, which were closely related to five intrageneric groups: Pseudomonas oryzihabitans (NG 1); P. fluorescens, P. lutea and P. syringae (NG 2); and P. rhizosphaerae (NG 3). Linkage disequilibrium analysis pointed to a mostly clonal population structure, even when the analysis was restricted to isolates from the same floristic region or belonging to the same NG. Nevertheless, signatures of recombination were observed for NG 3, which exclusively included isolates retrieved from the floral nectar of insect-pollinated Mediterranean plants. In contrast, the other two NGs comprised both South African and Mediterranean isolates. Analyses relating diversification to floristic region and pollinator type revealed that there has been more unique evolution of the nectar pseudomonads within the Mediterranean region than would be expected by chance. This is the first work analysing the sequence of multiple loci to reveal geno- and ecotypes of nectar bacteria. PMID:24116076
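
    As an illustration of how sequence types can be counted in a multilocus analysis such as the one in the record above, the sketch below gives each distinct combination of locus sequences its own sequence type number. It assumes that a sequence type is defined by exact identity across all four loci; the record does not state the rule the authors used. The function name, isolate names, and toy sequences are hypothetical.

```python
from typing import Dict, Tuple

# The four housekeeping loci named in the record.
LOCI: Tuple[str, ...] = ("rrs", "gyrB", "rpoB", "rpoD")


def assign_sequence_types(isolates: Dict[str, Dict[str, str]]) -> Dict[str, int]:
    """Map each isolate to a sequence type (ST) number.

    Isolates whose sequences are identical at every locus share an ST;
    numbers are handed out in order of first appearance, starting at 1.
    """
    st_of_profile: Dict[Tuple[str, ...], int] = {}
    st_of_isolate: Dict[str, int] = {}
    for name, sequences in isolates.items():
        # The ordered tuple of locus sequences is the isolate's profile.
        profile = tuple(sequences[locus] for locus in LOCI)
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        st_of_isolate[name] = st_of_profile[profile]
    return st_of_isolate


if __name__ == "__main__":
    # Toy data: iso_A and iso_B are identical, iso_C differs at gyrB.
    toy = {
        "iso_A": {"rrs": "ACGT", "gyrB": "GGCA", "rpoB": "TTAG", "rpoD": "CCAT"},
        "iso_B": {"rrs": "ACGT", "gyrB": "GGCA", "rpoB": "TTAG", "rpoD": "CCAT"},
        "iso_C": {"rrs": "ACGT", "gyrB": "GGCT", "rpoB": "TTAG", "rpoD": "CCAT"},
    }
    types = assign_sequence_types(toy)
    print(types, "distinct STs:", len(set(types.values())))
```

    Counting the distinct values in the result gives the number of sequence types, the kind of figure the record reports as 35 types among 38 isolates.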

  7. The B chromosomes in Brachycome.

    PubMed

    Leach, C R; Houben, A; Timmis, J N

    2004-01-01

    This review presents a historical account of studies of B chromosomes in the genus Brachycome Cass. (synonym: Brachyscome) from the earliest cytological investigations carried out in the late 1960s through to the most recent molecular analyses. Molecular analyses provide insights into the origin and evolution of the B chromosomes (Bs) of Brachycome dichromosomatica, a species which has Bs of two different sizes. The larger Bs are somatically stable whereas the smaller, or micro, Bs are somatically unstable. Both B types contain clusters of ribosomal RNA genes that have been shown unequivocally to be inactive in the case of the larger Bs. The large Bs carry a family of tandem repeat sequences (Bd49) that are located mainly at the centromere. Multiple copies of sequences related to this repeat are present on the A chromosomes (As) of related species, whereas only a few copies exist in the A chromosomes of B. dichromosomatica. The micro Bs share DNA sequences with the As and the larger Bs, and they also have B-specific repeats (Bdm29 and Bdm54). In some cases repeat sequences on the micro Bs have been shown to occur as clusters on the A chromosomes in a proportion of individuals within a population. It is clear that none of these B types originated by simple excision of segments from the A chromosomes. Copyright 2004 S. Karger AG, Basel

  8. Population genetics, taxonomy, phylogeny and evolution of Borrelia burgdorferi sensu lato

    PubMed Central

    Margos, Gabriele; Vollmer, Stephanie A.; Ogden, Nicholas H.; Fish, Durland

    2011-01-01

    In order to understand the population structure and dynamics of bacterial microorganisms, typing systems that accurately reflect the phylogenetic and evolutionary relationship of the agents are required. Over the past 15 years multilocus sequence typing schemes have replaced single locus approaches, giving novel insights into phylogenetic and evolutionary relationships of many bacterial species and facilitating taxonomy. Since 2004, several schemes using multiple loci have been developed to better understand the taxonomy, phylogeny and evolution of Lyme borreliosis spirochetes and in this paper we have reviewed and summarized the progress that has been made for this important group of vector-borne zoonotic bacteria. PMID:21843658

  9. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  10. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  11. Multiplex Detection of KRAS Mutations Using Passive Droplet Fusion.

    PubMed

    Pekin, Deniz; Taly, Valerie

    2017-01-01

    We describe a droplet microfluidics method to screen for multiple mutations of the same oncogene in a single experiment using passive droplet fusion. Genomic DNA from the H1573 cell line was screened for the presence of the six common mutations of the KRAS oncogene as well as wild-type sequences with a detection efficiency of 98%. Furthermore, the mutant allelic fraction of the cell line was also assessed correctly, showing that the technique is quantitative.

  12. Sulfur-doped Graphene Nanoribbons with a Sequence of Distinct Band Gaps

    NASA Astrophysics Data System (ADS)

    Du, Shi-Xuan; Zhang, Yan-Fang; Zhang, Yi; Berger, Reinhard; Feng, Xinliang; Mullen, Klaus; Lin, Xiao; Zhang, Yu-Yang; Pantelides, Sokrates T.; Gao, Hong-Jun

    Unlike free-standing graphene, graphene nanoribbons (GNRs) can possess a semiconducting band gap. However, achieving such control has been a major challenge in the fabrication of GNRs. Chevron-type GNRs were recently achieved by surface-assisted polymerization of pristine or N-substituted oligophenylene monomers. By mixing two different monomers, GNR heterojunctions can in principle be fabricated. Here we report fabrication and characterization of chevron-type GNRs by using sulfur-substituted oligophenylene monomers to achieve GNRs and related heterostructures for the first time. Importantly, our first-principles calculations show that the band gaps of GNRs can be tailored by different S configurations in cyclodehydrogenated isomers through debromination and intramolecular cyclodehydrogenation. This feature should open up new avenues to create multiple GNR heterojunctions by engineering the sulfur configurations. These predictions have been confirmed by Scanning Tunneling Microscopy (STM) and Scanning Tunneling Spectroscopy (STS). The unusual sequence of intraribbon heterojunctions may be useful for nanoscale optoelectronic applications based on quantum dots.

  13. Survival of antibiotic resistant bacteria following artificial solar radiation of secondary wastewater effluent.

    PubMed

    Glady-Croue, Julie; Niu, Xi-Zhi; Ramsay, Joshua P; Watkin, Elizabeth; Murphy, Riley J T; Croue, Jean-Philippe

    2018-06-01

    Urban wastewater treatment plant effluents represent one of the major emission sources of antibiotic-resistant bacteria (ARB) in natural aquatic environments. In this study, the effect of artificial solar radiation on total culturable heterotrophic bacteria and ARB (including amoxicillin-resistant, ciprofloxacin-resistant, rifampicin-resistant, sulfamethoxazole-resistant, and tetracycline-resistant bacteria) present in secondary effluent was investigated. Artificial solar radiation was effective in inactivating the majority of environmental bacteria; however, the proportion of strains with ciprofloxacin-resistance and rifampicin-resistance increased in the surviving populations. Isolates of Pseudomonas putida, Serratia marcescens, and Stenotrophomonas maltophilia nosocomial pathogens were identified as resistant to solar radiation and to at least three antibiotics. Draft genome sequencing and typing revealed isolates carrying multiple resistance genes, and the S. maltophilia sequence type (resistant to all studied antibiotics) was similar to that of strains isolated from blood infections. Results from this study confirm that solar radiation reduces total bacterial load in secondary effluent, but may indirectly increase the relative abundance of ARB. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. Bowhead whale (Balaena mysticetus) songs in the Chukchi Sea between October 2007 and May 2008.

    PubMed

    Delarue, Julien; Laurinolli, Marjo; Martin, Bruce

    2009-12-01

    This paper reports on the acoustic detection of bowhead whale (Balaena mysticetus) songs from the Bering-Chukchi-Beaufort stock, including the first recordings of songs in the fall and early winter. Bowhead whale songs were detected almost continuously in the Chukchi Sea between October 30, 2007 and January 1, 2008 and twice from April 16 to May 5, 2008 during a long-term deployment of five acoustic recorders moored off Point Lay and Wainwright, AK, between October 21, 2007 and August 3, 2008. Two complex and four simple songs were detected. The complex songs consisted of highly stereotyped sequences of four units. The simple songs were primarily made of sequences of two to three moan types whose repetition patterns were constant over short periods but more variable over time. Multiple song types were recorded simultaneously and there is evidence of synchronized song variation over time. The implications of the spatiotemporal distribution of song detection with respect to the migratory and mating behavior of western Arctic bowheads are discussed.

  15. Multidrug-resistant Escherichia coli in Asia: epidemiology and management.

    PubMed

    Sidjabat, Hanna E; Paterson, David L

    2015-05-01

    Escherichia coli has become multiresistant by way of production of a variety of β-lactamases. The prevalence of CTX-M-producing E. coli has reached 60-79% in certain parts of Asia. The acquisition of CTX-M plasmids by E. coli sequence type 131, a successful clone of E. coli, has caused further dissemination of CTX-M-producing E. coli. The prevalence of carbapenemase-producing E. coli, especially Klebsiella pneumoniae carbapenemase, and New Delhi metallo-β-lactamase (NDM)-producing E. coli has been increasing in Asia. K. pneumoniae carbapenemase and NDM have now been found in E. coli sequence type 131. The occurrence of NDM-producing E. coli is a major concern particularly in the Indian subcontinent, but now elsewhere in Asia as well. There are multiple reasons why antibiotic resistance in E. coli in Asia has reached such extreme levels. Approaches beyond antibiotic therapy, such as prevention of antibiotic resistance by antibiotic stewardship and protecting the natural microbiome, are strategies to avoid further spread of antibiotic resistance.

  16. Horizontal gene transfer of chromosomal Type II toxin-antitoxin systems of Escherichia coli.

    PubMed

    Ramisetty, Bhaskar Chandra Mohan; Santhosh, Ramachandran Sarojini

    2016-02-01

    Type II toxin-antitoxin systems (TAs) are small autoregulated bicistronic operons that encode a toxin protein with the potential to inhibit metabolic processes and an antitoxin protein to neutralize the toxin. Most of the bacterial genomes encode multiple TAs. However, the diversity and accumulation of TAs on bacterial genomes and its physiological implications are highly debated. Here we provide evidence that Escherichia coli chromosomal TAs (encoding RNase toxins) are 'acquired' DNA likely originated from heterologous DNA and are the smallest known autoregulated operons with the potential for horizontal propagation. Sequence analyses revealed that integration of TAs into the bacterial genome is unique and contributes to variations in the coding and/or regulatory regions of flanking host genome sequences. Plasmids and genomes encoding identical TAs of natural isolates are mutually exclusive. Chromosomal TAs might play significant roles in the evolution and ecology of bacteria by contributing to host genome variation and by moderation of plasmid maintenance. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  17. Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism

    PubMed Central

    Willsey, A. Jeremy; Sanders, Stephan J.; Li, Mingfeng; Dong, Shan; Tebbenkamp, Andrew T.; Muhle, Rebecca A.; Reilly, Steven K.; Lin, Leon; Fertuzinhos, Sofia; Miller, Jeremy A.; Murtha, Michael T.; Bichsel, Candace; Niu, Wei; Cotney, Justin; Ercan-Sencicek, A. Gulhan; Gockley, Jake; Gupta, Abha; Han, Wenqi; He, Xin; Hoffman, Ellen; Klei, Lambertus; Lei, Jing; Liu, Wenzhong; Liu, Li; Lu, Cong; Xu, Xuming; Zhu, Ying; Mane, Shrikant M.; Lein, Edward S.; Wei, Liping; Noonan, James P.; Roeder, Kathryn; Devlin, Bernie; Šestan, Nenad; State, Matthew W.

    2013-01-01

    Autism spectrum disorder (ASD) is a complex developmental syndrome of unknown etiology. Recent studies employing exome- and genome-wide sequencing have identified nine high-confidence ASD (hcASD) genes. Working from the hypothesis that ASD-associated mutations in these biologically pleiotropic genes will disrupt intersecting developmental processes to contribute to a common phenotype, we have attempted to identify time periods, brain regions, and cell types in which these genes converge. We have constructed coexpression networks based on the hcASD “seed” genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood. By assessing enrichment of an independent set of probable ASD (pASD) genes, derived from the same sequencing studies, we demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons. This approach informs when, where, and in what cell types mutations in these specific genes may be productively studied to clarify ASD pathophysiology. PMID:24267886

  18. High-Resolution Spectroscopy of some very Active Southern Stars

    NASA Technical Reports Server (NTRS)

    Soderblom, David R.; King, Jeremy R.; Henry, Todd J.

    1998-01-01

    We have obtained high-resolution echelle spectra of 18 solar-type stars that an earlier survey showed to have very high levels of Ca II H and K emission. Most of these stars belong to close binary systems, but five remain as probable single stars or well-separated binaries that are younger than the Pleiades on the basis of their lithium abundances and Hα emission. Three of these probable single stars also lie more than 1 mag above the main sequence in a color-magnitude diagram, and appear to have ages of 10 to 15 Myr. Two of them, HD 202917 and HD 222259, also appear to have a kinematic association with the pre-main-sequence multiple system HD 98800.

  19. Robust one-Tube Ω-PCR Strategy Accelerates Precise Sequence Modification of Plasmids for Functional Genomics

    PubMed Central

    Chen, Letian; Wang, Fengpin; Wang, Xiaoyu; Liu, Yao-Guang

    2013-01-01

    Functional genomics requires vector construction for protein expression and functional characterization of target genes; therefore, a simple, flexible and low-cost molecular manipulation strategy will be highly advantageous for genomics approaches. Here, we describe an Ω-PCR strategy that enables multiple types of sequence modification, including precise insertion, deletion and substitution, in any position of a circular plasmid. Ω-PCR is based on an overlap extension site-directed mutagenesis technique, and is named for its characteristic Ω-shaped secondary structure during PCR. Ω-PCR can be performed either in two steps, or in one tube in combination with exonuclease I treatment. These strategies have wide applications for protein engineering, gene function analysis and in vitro gene splicing. PMID:23335613

  20. Constructing and Modifying Sequence Statistics for relevent Using informR in 𝖱

    PubMed Central

    Marcum, Christopher Steven; Butts, Carter T.

    2015-01-01

    The informR package greatly simplifies the analysis of complex event histories in 𝖱 by providing user friendly tools to build sufficient statistics for the relevent package. Historically, building sufficient statistics to model event sequences (of the form a→b) using the egocentric generalization of Butts’ (2008) relational event framework for modeling social action has been cumbersome. The informR package simplifies the construction of the complex list of arrays needed by the rem() model fitting for a variety of cases involving egocentric event data, multiple event types, and/or support constraints. This paper introduces these tools using examples from real data extracted from the American Time Use Survey. PMID:26185488

  1. Multiplication of Legionella pneumophila Sequence Types 1, 47, and 62 in Buffered Yeast Extract Broth and Biofilms Exposed to Flowing Tap Water at Temperatures of 38°C to 42°C

    PubMed Central

    van der Kooij, Dick; Brouwer-Hanzens, Anke J.; Veenendaal, Harm R.

    2016-01-01

    ABSTRACT Legionella pneumophila proliferates in freshwater environments at temperatures ranging from 25 to 45°C. To investigate the preference of different sequence types (ST) for a specific temperature range, growth of L. pneumophila serogroup 1 (SG1) ST1 (environmental strains), ST47, and ST62 (disease-associated strains) was measured in buffered yeast extract broth (BYEB) and biofilms grown on plasticized polyvinyl chloride in flowing heated drinking water originating from a groundwater supply. The optimum growth temperatures in BYEB were approximately 37°C (ST1), 39°C (ST47), and 41°C (ST62), with maximum growth temperatures of 42°C (ST1) and 43°C (ST47 and ST62). In the biofilm at 38°C, the ST47 and ST62 strains multiplied equally well compared to growth of the environmental ST1 strain and an indigenous L. pneumophila non-SG1 strain, all attaining a concentration of approximately 10^7 CFU cm^-2. Raising the temperature to 41°C did not impact these levels within 4 weeks, but the colony counts of all strains tested declined (at a specific decline rate of 0.14 to 0.41 day^-1) when the temperature was raised to 42°C. At this temperature, the concentration of Vermamoeba vermiformis in the biofilm, determined with quantitative PCR (qPCR), was about 2 log units lower than the concentration at 38°C. In columns operated at a constant temperature, ranging from 38 to 41°C, none of the tested strains multiplied in the biofilm at 41°C, in which also V. vermiformis was not detected. These observations suggest that strains of ST47 and ST62 did not multiply in the biofilm at a temperature of ≥41°C because of the absence of a thermotolerant host. IMPORTANCE Growth of Legionella pneumophila in tap water installations is a serious public health concern. The organism includes more than 2,100 varieties (sequence types). More than 50% of the reported cases of Legionnaires' disease are caused by a few sequence types which are very rarely detected in the environment. Strains of selected virulent sequence types proliferated in biofilms on surfaces exposed to warm (38°C) tap water to the same level as environmental varieties and multiplied well as pure culture in a nutrient-rich medium at temperatures of 42 and 43°C. However, these organisms did not grow in the biofilms at temperatures of ≥41°C. Typical host amoebae also did not multiply at these temperatures. Apparently, proliferation of thermotolerant host amoebae is needed to enable multiplication of the virulent L. pneumophila strains in the environment at elevated temperatures. The detection of these amoebae in water installations therefore is a scientific challenge with practical implications. PMID:27613680

  2. Optical Processing Techniques For Pseudorandom Sequence Prediction

    NASA Astrophysics Data System (ADS)

    Gustafson, Steven C.

    1983-11-01

    Pseudorandom sequences are series of apparently random numbers generated, for example, by linear or nonlinear feedback shift registers. An important application of these sequences is in spread spectrum communication systems, in which, for example, the transmitted carrier phase is digitally modulated rapidly and pseudorandomly and in which the information to be transmitted is incorporated as a slow modulation in the pseudorandom sequence. In this case the transmitted information can be extracted only by a receiver that uses for demodulation the same pseudorandom sequence used by the transmitter, and thus this type of communication system has a very high immunity to third-party interference. However, if a third party can predict in real time the probable future course of the transmitted pseudorandom sequence given past samples of this sequence, then interference immunity can be significantly reduced. In this application effective pseudorandom sequence prediction techniques should be (1) applicable in real time to rapid (e.g., megahertz) sequence generation rates, (2) applicable to both linear and nonlinear pseudorandom sequence generation processes, and (3) applicable to error-prone past sequence samples of limited number and continuity. Certain optical processing techniques that may meet these requirements are discussed in this paper. In particular, techniques based on incoherent optical processors that perform general linear transforms or (more specifically) matrix-vector multiplications are considered. Computer simulation examples are presented which indicate that significant prediction accuracy can be obtained using these transforms for simple pseudorandom sequences. However, the useful prediction of more complex pseudorandom sequences will probably require the application of more sophisticated optical processing techniques.
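
    Why a linear feedback shift register is predictable from past samples can be shown with a small digital sketch. It is not the optical method of the paper: it solves a linear system exactly over GF(2) and assumes error-free samples. The tap pattern, the seed, and all function names are illustrative assumptions.

```python
def lfsr_bits(taps, state, n):
    """Generate n new bits from a linear feedback shift register over GF(2).

    taps  -- 0/1 coefficients; the next bit is sum(taps[i] * state[i]) mod 2
    state -- the most recent len(taps) bits, oldest first
    """
    state = list(state)
    out = []
    for _ in range(n):
        new_bit = sum(c * s for c, s in zip(taps, state)) % 2
        out.append(new_bit)
        state = state[1:] + [new_bit]
    return out

def solve_gf2(matrix, rhs):
    """Solve a square linear system over GF(2) by Gauss-Jordan elimination."""
    rows = [row[:] + [b] for row, b in zip(matrix, rhs)]
    n = len(rows)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col]), None)
        if pivot is None:
            raise ValueError("singular system: these samples do not fix the taps")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[col])]
    return [rows[r][n] for r in range(n)]

def recover_taps(bits, k):
    """Recover the k taps from 2k consecutive error-free bits."""
    matrix = [bits[j:j + k] for j in range(k)]   # each row is a past window
    rhs = [bits[j + k] for j in range(k)]        # the bit that followed it
    return solve_gf2(matrix, rhs)

def predict(bits, k, n_future):
    """Predict the next n_future bits from the observed bits."""
    taps = recover_taps(bits[:2 * k], k)
    return lfsr_bits(taps, bits[-k:], n_future)

# Hypothetical 4-stage register with recurrence s[n+4] = s[n] + s[n+1] (mod 2).
true_taps = [1, 1, 0, 0]
seed = [1, 0, 0, 0]
sequence = seed + lfsr_bits(true_taps, seed, 30)
observed = sequence[:8]
forecast = predict(observed, 4, 10)
```

    Each predicted bit is one row of a matrix-vector product reduced modulo 2, which is the kind of operation the abstract assigns to the optical processor. Noisy or nonlinear sequences would need a different approach from this exact solve.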

  3. Quasispecies Analyses of the HIV-1 Near-full-length Genome With Illumina MiSeq

    PubMed Central

    Ode, Hirotaka; Matsuda, Masakazu; Matsuoka, Kazuhiro; Hachiya, Atsuko; Hattori, Junko; Kito, Yumiko; Yokomaku, Yoshiyuki; Iwatani, Yasumasa; Sugiura, Wataru

    2015-01-01

    Human immunodeficiency virus type-1 (HIV-1) exhibits high between-host genetic diversity and within-host heterogeneity, recognized as quasispecies. Because HIV-1 quasispecies fluctuate in terms of multiple factors, such as antiretroviral exposure and host immunity, analyzing the HIV-1 genome is critical for selecting effective antiretroviral therapy and understanding within-host viral coevolution mechanisms. Here, to obtain HIV-1 genome sequence information that includes minority variants, we sought to develop a method for evaluating quasispecies throughout the HIV-1 near-full-length genome using the Illumina MiSeq benchtop deep sequencer. To ensure the reliability of minority mutation detection, we applied an analysis method of sequence read mapping onto a consensus sequence derived from de novo assembly followed by iterative mapping and subsequent unique error correction. Deep sequencing analyses of an HIV-1 clone showed that the analysis method reduced erroneous base prevalence below 1% in each sequence position and discarded only < 1% of all collected nucleotides, maximizing the usage of the collected genome sequences. Further, we designed primer sets to amplify the HIV-1 near-full-length genome from clinical plasma samples. Deep sequencing of 92 samples in combination with the primer sets and our analysis method provided sufficient coverage to identify >1%-frequency sequences throughout the genome. When we evaluated sequences of pol genes from 18 treatment-naïve patients' samples, the deep sequencing results were in agreement with Sanger sequencing and identified numerous additional minority mutations. The results suggest that our deep sequencing method would be suitable for identifying within-host viral population dynamics throughout the genome. PMID:26617593
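
    The idea of calling minority variants above a frequency threshold can be sketched as follows. The sketch assumes reads are already mapped and padded to the same coordinate system. The function name, the 1% cutoff default, and the minimum depth are illustrative assumptions, and the error-correction step of the study is not modelled.

```python
from collections import Counter

def minority_variants(aligned_reads, min_freq=0.01, min_depth=100):
    """Return {position: {base: frequency}} for non-consensus bases.

    aligned_reads -- non-empty list of strings in a shared coordinate system
    min_freq      -- lowest frequency reported (hypothetical default of 1%)
    min_depth     -- positions with fewer called bases are skipped
    """
    length = max(len(read) for read in aligned_reads)
    result = {}
    for pos in range(length):
        counts = Counter(read[pos] for read in aligned_reads
                         if pos < len(read) and read[pos] in "ACGT")
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        consensus_base, _ = counts.most_common(1)[0]
        minor = {base: count / depth for base, count in counts.items()
                 if base != consensus_base and count / depth >= min_freq}
        if minor:
            result[pos] = minor
    return result

# Toy reads: a variant at 3% is reported, one at 0.5% is not.
reported = minority_variants(["ACGT"] * 97 + ["ACAT"] * 3)
ignored = minority_variants(["ACGT"] * 995 + ["ACAT"] * 5)
```

    In practice the threshold only makes sense if the per-position error rate is below it, which is why the abstract reports erroneous base prevalence under 1% before calling variants above 1%.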

  4. Human Papillomavirus Community in Healthy Persons, Defined by Metagenomics Analysis of Human Microbiome Project Shotgun Sequencing Data Sets

    PubMed Central

    Ma, Yingfei; Madupu, Ramana; Karaoz, Ulas; Nossa, Carlos W.; Yang, Liying; Yooseph, Shibu; Yachimski, Patrick S.; Brodie, Eoin L.; Nelson, Karen E.

    2014-01-01

    ABSTRACT Human papillomavirus (HPV) causes a number of neoplastic diseases in humans. Here, we show a complex normal HPV community in a cohort of 103 healthy human subjects, by metagenomics analysis of the shotgun sequencing data generated from the NIH Human Microbiome Project. The overall HPV prevalence was 68.9% and was highest in the skin (61.3%), followed by the vagina (41.5%), mouth (30%), and gut (17.3%). Of the 109 HPV types as well as additional unclassified types detected, most were undetectable by the widely used commercial kits targeting the vaginal/cervical HPV types. These HPVs likely represent true HPV infections rather than transitory exposure because of strong organ tropism and persistence of the same HPV types in repeat samples. Coexistence of multiple HPV types was found in 48.1% of the HPV-positive samples. Networking between HPV types, cooccurrence or exclusion, was detected in vaginal and skin samples. Large contigs assembled from short HPV reads were obtained from several samples, confirming their genuine HPV origin. This first large-scale survey of HPV using a shotgun sequencing approach yielded a comprehensive map of HPV infections among different body sites of healthy human subjects. IMPORTANCE This nonbiased survey indicates that the HPV community in healthy humans is much more complex than previously defined by widely used kits that are target selective for only a few high- and low-risk HPV types for cervical cancer. The importance of nononcogenic viruses in a mixed HPV infection could be for stimulating or inhibiting a coexisting oncogenic virus via viral interference or immune cross-reaction. Knowledge gained from this study will be helpful to guide the designing of epidemiological and clinical studies in the future to determine the impact of nononcogenic HPV types on the outcome of HPV infections. PMID:24522917

  5. Impact of the HIV-1 genetic background and HIV-1 population size on the evolution of raltegravir resistance.

    PubMed

    Fun, Axel; Leitner, Thomas; Vandekerckhove, Linos; Däumer, Martin; Thielen, Alexander; Buchholz, Bernd; Hoepelman, Andy I M; Gisolf, Elizabeth H; Schipper, Pauline J; Wensing, Annemarie M J; Nijhuis, Monique

    2018-01-05

    Emergence of resistance against integrase inhibitor raltegravir in human immunodeficiency virus type 1 (HIV-1) patients is generally associated with selection of one of three signature mutations: Y143C/R, Q148K/H/R or N155H, representing three distinct resistance pathways. The mechanisms that drive selection of a specific pathway are still poorly understood. We investigated the impact of the HIV-1 genetic background and population dynamics on the emergence of raltegravir resistance. Using deep sequencing we analyzed the integrase coding sequence (CDS) in longitudinal samples from five patients who initiated raltegravir plus optimized background therapy at viral loads > 5000 copies/ml. To investigate the role of the HIV-1 genetic background we created recombinant viruses containing the viral integrase coding region from pre-raltegravir samples from two patients in whom raltegravir resistance developed through different pathways. The in vitro selections performed with these recombinant viruses were designed to mimic natural population bottlenecks. Deep sequencing analysis of the viral integrase CDS revealed that the virological response to raltegravir containing therapy inversely correlated with the relative amount of unique sequence variants that emerged suggesting diversifying selection during drug pressure. In 4/5 patients multiple signature mutations representing different resistance pathways were observed. Interestingly, the resistant population can consist of a single resistant variant that completely dominates the population but also of multiple variants from different resistance pathways that coexist in the viral population. We also found evidence for increased diversification after stronger bottlenecks. In vitro selections with low viral titers, mimicking population bottlenecks, revealed that both recombinant viruses and HXB2 reference virus were able to select mutations from different resistance pathways, although typically only one resistance pathway emerged in each individual culture. The generation of a specific raltegravir resistant variant is not predisposed in the genetic background of the viral integrase CDS. Typically, in the early phases of therapy failure the sequence space is explored and multiple resistance pathways emerge and then compete for dominance which frequently results in a switch of the dominant population over time towards the fittest variant or even multiple variants of similar fitness that can coexist in the viral population.

  6. Successful treatment of multifocal pedal Prototheca wickerhamii infection in a feline immunodeficiency virus-positive cat with multiple Bowenoid in situ carcinomas containing papillomaviral DNA sequences

    PubMed Central

    Kessell, Allan E; McNair, Derek; Munday, John S; Savory, Richard; Halliday, Catriona; Malik, Richard

    2017-01-01

    Case summary A 16-year-old, castrated male, feline immunodeficiency virus (FIV)-positive, domestic shorthair cat developed multiple skin lesions. Most of these were Bowenoid carcinoma in situ and contained DNA sequences consistent with Felis catus papillomavirus type 2. Two additional lesions that developed in the skin and subcutaneous tissues between the digital and carpal pads on the left forelimb and right hindlimb were shown by cytology, histology and culture to be caused by Prototheca wickerhamii. These lesions failed to improve in response to systemic treatment with itraconazole, but excision by sharp en bloc resection with follow-up oral itraconazole therapy proved curative for one lesion, although the other lesion recurred, necessitating a second surgery. Relevance and novel information This is only the second reported case of feline protothecosis from Australia and the first case that has been cultured and identified to the species level. Also of great interest was the presence of multiple papillomavirus-associated neoplastic lesions, which may have afforded a portal of entry for the algal pathogen and the cat’s positive FIV status; the latter might have impacted on both viral and algal pathogenesis by effects on immunocompetence. PMID:28491447

  7. AN EVALUATION OF ANTECEDENT EXERCISE ON BEHAVIOR MAINTAINED BY AUTOMATIC REINFORCEMENT USING A THREE-COMPONENT MULTIPLE SCHEDULE

    PubMed Central

    Morrison, Heather; Roscoe, Eileen M; Atwell, Amy

    2011-01-01

    We evaluated antecedent exercise for treating the automatically reinforced problem behavior of 4 individuals with autism. We conducted preference assessments to identify leisure and exercise items that were associated with high levels of engagement and low levels of problem behavior. Next, we conducted three 3-component multiple-schedule sequences: an antecedent-exercise test sequence, a noncontingent leisure-item control sequence, and a social-interaction control sequence. Within each sequence, we used a 3-component multiple schedule to evaluate preintervention, intervention, and postintervention effects. Problem behavior decreased during the postintervention component relative to the preintervention component for 3 of the 4 participants during the exercise-item assessment; however, the effects could not be attributed solely to exercise for 1 of these participants. PMID:21941383

  8. Homology-integrated CRISPR-Cas (HI-CRISPR) system for one-step multigene disruption in Saccharomyces cerevisiae.

    PubMed

    Bao, Zehua; Xiao, Han; Liang, Jing; Zhang, Lu; Xiong, Xiong; Sun, Ning; Si, Tong; Zhao, Huimin

    2015-05-15

    One-step multiple gene disruption in the model organism Saccharomyces cerevisiae is a highly useful tool for both basic and applied research, but it remains a challenge. Here, we report a rapid, efficient, and potentially scalable strategy based on the type II Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-CRISPR associated proteins (Cas) system to generate multiple gene disruptions simultaneously in S. cerevisiae. A 100 bp dsDNA mutagenizing homologous recombination donor is inserted between two direct repeats for each target gene in a CRISPR array consisting of multiple donor and guide sequence pairs. An ultrahigh copy number plasmid carrying iCas9, a variant of wild-type Cas9, trans-encoded RNA (tracrRNA), and a homology-integrated crRNA cassette is designed to greatly increase the gene disruption efficiency. As proof of concept, three genes, CAN1, ADE2, and LYP1, were simultaneously disrupted in 4 days with an efficiency ranging from 27 to 87%. Another three genes involved in an artificial hydrocortisone biosynthetic pathway, ATF2, GCY1, and YPR1, were simultaneously disrupted in 6 days with 100% efficiency. This homology-integrated CRISPR (HI-CRISPR) strategy represents a powerful tool for creating yeast strains with multiple gene knockouts.

  9. Isolation of an invertebrate-type lysozyme from the nephridia of the echiura, Urechis unicinctus, and its recombinant production and activities.

    PubMed

    Oh, Hye Young; Kim, Chan-Hee; Go, Hye-Jin; Park, Nam Gyu

    2018-05-09

    Invertebrates, unlike vertebrates which have an adaptive immune system, rely heavily on the innate immune system for the defense against pathogenic bacteria. Lysozymes, along with other immune effectors, are regarded as an important group in this defense. An invertebrate-type (i-type) lysozyme, designated Urechis unicinctus invertebrate-type lysozyme, Uu-ilys, has been isolated from nephridia of Urechis unicinctus using a series of high performance liquid chromatography (HPLC), and ultrasensitive radial diffusion assay (URDA) as a bioassay system. Analyses of the primary structure and cDNA cloning revealed that Uu-ilys was approximately 14 kDa and composed of 122 amino acids (AAs) of which the precursor had a total of 160 AAs containing a signal peptide of 18 AAs and a pro-sequence of 20 AAs encoded by the nucleotide sequence of 714 bp that comprises a 5' untranslated region (UTR) of 42 bp, an open reading frame (ORF) of 483 bp, and a 3' UTR of 189 bp. Multiple sequence alignment showed Uu-ilys has high homology to i-type lysozymes from several annelids. Relatively high transcriptional expression levels of Uu-ilys were detected in nephridia, anal vesicle, and intestine. The native Uu-ilys exhibited comparable lysozyme enzymatic and antibacterial activities to hen egg white lysozyme. Collectively, these data suggest that Uu-ilys, the isolated antibacterial protein, plays a role in the immune defense mechanism of U. unicinctus. Recombinant Uu-ilys (rUu-ilys) produced in a bacterial expression system showed significantly decreased lysozyme lytic activity from that of the native while its potency on radial diffusion assay detecting antibacterial activity was retained, which may indicate the non-enzymatic antibacterial capacity of Uu-ilys. Copyright © 2018. Published by Elsevier Ltd.
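
    The sequence lengths quoted in this abstract are mutually consistent, which a short arithmetic check makes explicit. The check assumes the 483 bp open reading frame count includes a single stop codon. The function name and return layout are illustrative.

```python
def transcript_summary(utr5_bp, orf_bp, utr3_bp, signal_aa, pro_aa):
    """Derive cDNA length and protein lengths from the reported segment sizes.

    Assumes the ORF length includes exactly one stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    cdna_bp = utr5_bp + orf_bp + utr3_bp
    precursor_aa = orf_bp // 3 - 1          # drop the stop codon
    mature_aa = precursor_aa - signal_aa - pro_aa
    return {"cdna_bp": cdna_bp, "precursor_aa": precursor_aa, "mature_aa": mature_aa}

# Values taken from the abstract: 42 bp 5' UTR, 483 bp ORF, 189 bp 3' UTR,
# 18 AA signal peptide, 20 AA pro-sequence.
summary = transcript_summary(42, 483, 189, signal_aa=18, pro_aa=20)
```

    Under that assumption the result matches the reported 714 bp cDNA, 160 AA precursor, and 122 AA mature protein.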

  10. Reconstructing evolutionary trees in parallel for massive sequences.

    PubMed

    Zou, Quan; Wan, Shixiang; Zeng, Xiangxiang; Ma, Zhanshan Sam

    2017-12-14

    Building evolutionary trees for massive unaligned DNA sequences is challenging and crucial. However, reconstructing an evolutionary tree for ultra-large sequences is hard. Massive multiple sequence alignment is also challenging and time/space consuming. Hadoop and Spark were developed recently and bring new possibilities to the classical computational biology problems. In this paper, we tried to solve the multiple sequence alignment and evolutionary reconstruction in parallel. HPTree, which is developed in this paper, can deal with big DNA sequence files quickly. It works well on >1GB files, and gets better performance than other evolutionary reconstruction tools. Users could use HPTree for reconstructing evolutionary trees on computer clusters or a cloud platform (e.g. Amazon Cloud). HPTree could help with population evolution research and metagenomics analysis. In this paper, we employ the Hadoop and Spark platform and design an evolutionary tree reconstruction software tool for unaligned massive DNA sequences. Clustering and multiple sequence alignment are done in parallel. The neighbour-joining model was employed for the evolutionary tree building. We opened our software together with source codes via http://lab.malab.cn/soft/HPtree/ .
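
    The neighbour-joining method named in this abstract proceeds by repeatedly picking the pair of taxa that minimises a corrected distance criterion. A single joining step can be sketched as below. This is the textbook criterion written serially, not HPTree's parallel implementation, and the function name and example matrix are illustrative.

```python
def nj_join_step(dist):
    """Pick the next pair to join under the neighbour-joining criterion.

    dist -- symmetric distance matrix as a list of lists, at least 3 taxa
    Returns (i, j, branch_i, branch_j): the pair of indices to join and the
    branch lengths from each member to the new internal node.
    """
    n = len(dist)
    if n < 3:
        raise ValueError("need at least three taxa")
    row_sums = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: raw distance corrected by each taxon's total divergence.
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    branch_i = dist[i][j] / 2 + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    branch_j = dist[i][j] - branch_i
    return i, j, branch_i, branch_j

# Small illustrative matrix for five taxa (not data from the paper).
example = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
step = nj_join_step(example)
```

    A full tree is obtained by replacing the joined pair with a new node, recomputing distances to it, and repeating until three nodes remain. The pairwise loop is quadratic in the number of taxa per step, which is one reason large inputs motivate clustering and parallel execution.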

  11. Direct typing of Canine parvovirus (CPV) from infected dog faeces by rapid mini sequencing technique.

    PubMed

    V, Pavana Jyothi; S, Akila; Selvan, Malini K; Naidu, Hariprasad; Raghunathan, Shwethaa; Kota, Sathish; Sundaram, R C Raja; Rana, Samir Kumar; Raj, G Dhinakar; Srinivasan, V A; Mohana Subramanian, B

    2016-12-01

    Canine parvovirus (CPV) is a non-enveloped single stranded DNA virus with an icosahedral capsid. Mini-sequencing based CPV typing was developed earlier to detect and differentiate all the CPV types and FPV in a single reaction. This technique was further evaluated in the present study by performing the mini-sequencing directly from fecal samples which avoided tedious virus isolation steps by cell culture system. Fecal swab samples were collected from 84 dogs with enteritis symptoms, suggestive of parvoviral infection from different locations across India. Seventy six of these samples were positive by PCR; the subsequent mini-sequencing reaction typed 74 of them as type 2a virus, and 2 samples as type 2b. Additionally, 25 of the positive samples were typed by cycle sequencing of PCR products. Direct CPV typing from fecal samples using mini-sequencing showed 100% correlation with CPV typing by cycle sequencing. Moreover, CPV typing was achieved by mini-sequencing even with faintly positive PCR amplicons which was not possible by cycle sequencing. Therefore, the mini-sequencing technique is recommended for regular epidemiological follow up of CPV types, since the technique is a rapid, highly sensitive and high-capacity method for CPV typing. Copyright © 2016. Published by Elsevier B.V.

  12. The CD8α gene in duck (Anatidae): cloning, characterization, and expression during viral infection.

    PubMed

    Xu, Qi; Chen, Yang; Zhao, Wen Ming; Huang, Zheng Yang; Duan, Xiu Jun; Tong, Yi Yu; Zhang, Yang; Li, Xiu; Chang, Guo Bin; Chen, Guo Hong

    2015-02-01

    Cluster of differentiation 8 alpha (CD8α) is critical for cell-mediated immune defense and T-cell development. Although CD8α sequences have been reported for several species, very little is known about CD8α in ducks. To elucidate the mechanisms involved in the innate and adaptive immune responses of ducks, we cloned CD8α coding sequences from domestic, Muscovy, Mallard, and Spotbill ducks using reverse transcription polymerase chain reaction (RT-PCR). Each sequence consisted of 714 nucleotides and encoded a signal peptide, an IgV-like domain, a stalk region, a transmembrane region, and a cytoplasmic tail. We identified 58 nucleotide differences and 37 amino acid differences among the four types of duck; of these, 53 nucleotide and 33 amino acid differences were between Muscovy ducks and the other duck species. The CD8α cDNA sequence from domestic duck consisted of a 61-nucleotide 5' untranslated region (UTR), a 714-nucleotide open reading frame, and an 849-nucleotide 3' UTR. Multiple sequence alignments showed that the amino acid sequence of CD8α is conserved in vertebrates. RT-PCR revealed that expression of CD8α mRNA of domestic ducks was highest in the thymus and very low in the kidney, cerebrum, cerebellum, and muscle. Immunohistochemical analyses detected CD8α on the splenic corpuscle and periarterial lymphatic sheath of the spleen. CD8α mRNA in domestic ducklings was initially up-regulated, and then down-regulated, in the thymus, spleen, and liver after treatment with duck hepatitis virus type I (DHV-1) or the immunostimulant polyriboinosinic polyribocytidylic acid (poly I:C).

  13. Evolutionary insight into the ionotropic glutamate receptor superfamily of photosynthetic organisms.

    PubMed

    De Bortoli, Sara; Teardo, Enrico; Szabò, Ildikò; Morosinotto, Tomas; Alboresi, Alessandro

    2016-11-01

    Photosynthetic eukaryotes have a complex evolutionary history shaped by multiple endosymbiosis events that required a tight coordination between the organelles and the rest of the cell. Plant ionotropic glutamate receptors (iGLRs) form a large superfamily of proteins with a predicted or proven non-selective cation channel activity regulated by a broad range of amino acids. They are involved in different physiological processes such as C/N sensing, resistance against fungal infection, root and pollen tube growth and response to wounding and pathogens. Most of the present knowledge is limited to iGLRs located in plasma membranes. However, recent studies localized different iGLR isoforms to mitochondria and/or chloroplasts, suggesting the possibility that they play a specific role in bioenergetic processes. In this work, we performed a comparative analysis of GLR sequences from bacteria and various photosynthetic eukaryotes. In particular, novel types of selectivity filters of bacteria are reported adding new examples of the great diversity of the GLR superfamily. The highest variability in GLR sequences was found among the algal sequences (cryptophytes, diatoms, brown and green algae). GLRs of land plants are not closely related to the GLRs of green algae analyzed in this work. The GLR family underwent a great expansion in vascular plants. Among plant GLRs, Clade III includes sequences from Physcomitrella patens, Marchantia polymorpha and gymnosperms and can be considered the most ancient, while other clades likely emerged later. In silico analysis allowed the identification of sequences with a putative target to organelles. Sequences with a predicted localization to mitochondria and chloroplasts are randomly distributed among different types of GLRs, suggesting that no compartment-related specific function has been maintained across the species. Copyright © 2016 Elsevier B.V. All rights reserved.

  14. Epstein-Barr Virus, Human Papillomavirus and Mouse Mammary Tumour Virus as Multiple Viruses in Breast Cancer

    PubMed Central

    Glenn, Wendy K.; Heng, Benjamin; Delprado, Warick; Iacopetta, Barry; Whitaker, Noel J.; Lawson, James S.

    2012-01-01

    Background The purpose of this investigation is to determine if Epstein Barr virus (EBV), high risk human papillomavirus (HPV), and mouse mammary tumour viruses (MMTV) co-exist in some breast cancers. Materials and Methods All the specimens were from women residing in Australia. For investigations based on standard PCR, we used fresh frozen DNA extracts from 50 unselected invasive breast cancers. For normal breast specimens, we used DNA extracts from epithelial cells from milk donated by 40 lactating women. For investigations based on in situ PCR we used 27 unselected archival formalin fixed breast cancer specimens and 18 unselected archival formalin fixed normal breast specimens from women who had breast reduction surgery. Thirteen of these fixed breast cancer specimens were ductal carcinoma in situ (dcis) and 14 were predominantly invasive ductal carcinomas (idc). Results EBV sequences were identified in 68%, high risk HPV sequences in 50%, and MMTV sequences in 78% of DNA extracted from 50 invasive breast cancer specimens. These same viruses were identified in selected normal and breast cancer specimens by in situ PCR. Sequences from more than one viral type were identified in 72% of the same breast cancer specimens. Normal controls showed these viruses were also present in epithelial cells in human milk – EBV (35%), HPV (20%) and MMTV (32%) of 40 milk samples from normal lactating women, with multiple viruses being identified in 13% of the same milk samples. Conclusions We conclude that (i) EBV, HPV and MMTV gene sequences are present and co-exist in many human breast cancers, (ii) the presence of these viruses in breast cancer is associated with young age of diagnosis and possibly an increased grade of breast cancer. PMID:23183846

  15. Epstein-Barr virus, human papillomavirus and mouse mammary tumour virus as multiple viruses in breast cancer.

    PubMed

    Glenn, Wendy K; Heng, Benjamin; Delprado, Warick; Iacopetta, Barry; Whitaker, Noel J; Lawson, James S

    2012-01-01

    The purpose of this investigation is to determine if Epstein Barr virus (EBV), high risk human papillomavirus (HPV), and mouse mammary tumour viruses (MMTV) co-exist in some breast cancers. All the specimens were from women residing in Australia. For investigations based on standard PCR, we used fresh frozen DNA extracts from 50 unselected invasive breast cancers. For normal breast specimens, we used DNA extracts from epithelial cells from milk donated by 40 lactating women. For investigations based on in situ PCR we used 27 unselected archival formalin fixed breast cancer specimens and 18 unselected archival formalin fixed normal breast specimens from women who had breast reduction surgery. Thirteen of these fixed breast cancer specimens were ductal carcinoma in situ (dcis) and 14 were predominantly invasive ductal carcinomas (idc). EBV sequences were identified in 68%, high risk HPV sequences in 50%, and MMTV sequences in 78% of DNA extracted from 50 invasive breast cancer specimens. These same viruses were identified in selected normal and breast cancer specimens by in situ PCR. Sequences from more than one viral type were identified in 72% of the same breast cancer specimens. Normal controls showed these viruses were also present in epithelial cells in human milk - EBV (35%), HPV (20%) and MMTV (32%) of 40 milk samples from normal lactating women, with multiple viruses being identified in 13% of the same milk samples. We conclude that (i) EBV, HPV and MMTV gene sequences are present and co-exist in many human breast cancers, (ii) the presence of these viruses in breast cancer is associated with young age of diagnosis and possibly an increased grade of breast cancer.

  16. Conservation of tubulin-binding sequences in TRPV1 throughout evolution.

    PubMed

    Sardar, Puspendu; Kumar, Abhishek; Bhandari, Anita; Goswami, Chandan

    2012-01-01

    Transient Receptor Potential Vanilloid sub type 1 (TRPV1), commonly known as the capsaicin receptor, can detect multiple stimuli including noxious compounds, low pH, temperature, and electromagnetic waves at different ranges. In addition, this receptor is involved in multiple physiological and sensory processes. Therefore, functions of TRPV1 also have direct influences on adaptation and further evolution. Availability of various eukaryotic genomic sequences in the public domain facilitates studying the molecular evolution of the TRPV1 protein and the respective conservation of certain domains, motifs and interacting regions that are functionally important. Using statistical and bioinformatics tools, our analysis reveals that TRPV1 evolved ∼420 million years ago (MYA). Our analysis reveals that specific regions, domains and motifs of TRPV1 have gone through different selection pressures and thus have different levels of conservation. We found that among all, the TRP box is the most conserved and thus has functional significance. Our results also indicate that the tubulin binding sequences (TBS) have evolutionary significance as these stretch sequences are more conserved than many other essential regions of TRPV1. The overall distribution of positively charged residues within the TBS motifs is conserved throughout evolution. In silico analysis reveals that the TBS-1 and TBS-2 of TRPV1 can form helical structures and may play an important role in TRPV1 function. Our analysis identifies the regions of TRPV1, which are important for structure-function relationship. This analysis indicates that tubulin binding sequence-1 (TBS-1) near the TRP-box forms a potential helix and the tubulin interactions with TRPV1 via TBS-1 have evolutionary significance. This interaction may be required for the proper channel function and regulation and may also have significance in the context of Taxol®-induced neuropathy.

  17. Wolbachia association with the tsetse fly, Glossina fuscipes fuscipes, reveals high levels of genetic diversity and complex evolutionary dynamics

    PubMed Central

    2013-01-01

    Background: Wolbachia pipientis, a diverse group of α-proteobacteria, can alter arthropod host reproduction and confer a reproductive advantage to Wolbachia-infected females (cytoplasmic incompatibility (CI)). This advantage can alter host population genetics because Wolbachia-infected females produce more offspring with their own mitochondrial DNA (mtDNA) haplotypes than uninfected females. Thus, these host haplotypes become common or fixed (selective sweep). Although simulations suggest that for a CI-mediated sweep to occur, there must be a transient phase with repeated initial infections of multiple individual hosts by different Wolbachia strains, this has not been observed empirically. Wolbachia has been found in the tsetse fly, Glossina fuscipes fuscipes, but it is not limited to a single host haplotype, suggesting that CI did not impact its population structure. However, host population genetic differentiation could have been generated if multiple Wolbachia strains interacted in some populations. Here, we investigated Wolbachia genetic variation in G. f. fuscipes populations of known host genetic composition in Uganda. We tested for the presence of multiple Wolbachia strains using Multi-Locus Sequence Typing (MLST) and for an association between geographic region and host mtDNA haplotype using Wolbachia DNA sequence from a variable locus, groEL (heat shock protein 60). Results: MLST demonstrated that some G. f. fuscipes carry Wolbachia strains from two lineages. GroEL revealed high levels of sequence diversity within and between individuals (Haplotype diversity = 0.945). We found Wolbachia associated with 26 host mtDNA haplotypes, an unprecedented result. We observed a geographical association of one Wolbachia lineage with southern host mtDNA haplotypes, but it was non-significant (p = 0.16). Though most Wolbachia-infected host haplotypes were those found in the contact region between host mtDNA groups, this association was non-significant (p = 0.17). Conclusions: High Wolbachia sequence diversity and the association of Wolbachia with multiple host haplotypes suggest that different Wolbachia strains infected G. f. fuscipes multiple times independently. We suggest that these observations reflect a transient phase in Wolbachia evolution that is influenced by the long gestation and low reproductive output of tsetse. Although G. f. fuscipes is superinfected with Wolbachia, our data does not support that bidirectional CI has influenced host genetic diversity in Uganda. PMID:23384159

  18. Searching for Partners of Cool Senior Citizens

    NASA Astrophysics Data System (ADS)

    Jao, Wei-Chun; Henry, T. J.

    2012-01-01

    Mass is one of the most fundamental parameters in stellar astronomy. In order to measure dynamical masses, one needs to find nearby binary systems that can be resolved and monitored, ideally with orbital periods that completely wrap in a reasonable amount of time. Many surveys have been made of nearby main sequence dwarfs, and their mass-luminosity relation is well established. As part of our Cool Subdwarf Investigations (CSI) program, we are searching for subdwarf binaries of spectral types K and M within 60 parsecs to measure their multiplicity rate and to reveal binaries appropriate for mass determinations. Here we present results of our CSI work using HST's Fine Guidance Sensors. When combined with previous CSI work and results in the literature, we find the multiplicity rate of subdwarfs, 21%, to be surprisingly low compared to that of similar main sequence K and M stars, 37%. This work has several implications, including that the star formation and/or evolution history of subdwarfs is different than for dwarfs, and that ideal systems for subdwarf mass determinations are difficult to find. This work is supported by HST grant GO-11943.

  19. Discovery and analysis of an active long terminal repeat-retrotransposable element in Aspergillus oryzae.

    PubMed

    Jie Jin, Feng; Hara, Seiichi; Sato, Atsushi; Koyama, Yasuji

    2014-01-01

    Wild-type Aspergillus oryzae RIB40 contains two copies of the AO090005001597 gene. We previously constructed an A. oryzae RIB40 strain, RKuAF8B, with multiple chromosomal deletions, in which the AO090005001597 copy number was found to be increased significantly. Sequence analysis indicated that AO090005001597 is part of a putative 6,000-bp retrotransposable element, flanked by two long terminal repeats (LTRs) of 669 bp, with characteristics of retroviruses and retrotransposons, and thus designated AoLTR (A. oryzae LTR-retrotransposable element). AoLTR comprised putative reverse transcriptase, RNase H, and integrase domains. The deduced amino acid sequence alignment of AoLTR showed 94% overall identity with AFLAV, an A. flavus Tf1/sushi retrotransposon. Quantitative real-time RT-PCR showed that AoLTR gene expression was significantly increased in RKuAF8B, in accordance with the increased copy number. Inverse PCR indicated that the full-length retrotransposable element was randomly integrated into multiple genomic locations. However, no obvious phenotypic changes were associated with the increased AoLTR gene copy number.

  20. MoonProt: a database for proteins that are known to moonlight

    PubMed Central

    Mani, Mathew; Chen, Chang; Amblee, Vaishak; Liu, Haipeng; Mathur, Tanu; Zwicke, Grant; Zabad, Shadi; Patel, Bansi; Thakkar, Jagravi; Jeffery, Constance J.

    2015-01-01

    Moonlighting proteins comprise a class of multifunctional proteins in which a single polypeptide chain performs multiple biochemical functions that are not due to gene fusions, multiple RNA splice variants or pleiotropic effects. The known moonlighting proteins perform a variety of diverse functions in many different cell types and species, and information about their structures and functions is scattered across many publications. We have constructed the manually curated, searchable, internet-based MoonProt Database (http://www.moonlightingproteins.org) with information about more than 200 proteins that have been experimentally verified to be moonlighting proteins. The availability of this organized information provides a more complete picture of what is currently known about moonlighting proteins. The database will also aid researchers in other fields, including determining the functions of genes identified in genome sequencing projects, interpreting data from proteomics projects and annotating protein sequence and structural databases. In addition, information about the structures and functions of moonlighting proteins can be helpful in understanding how novel protein functional sites evolved on an ancient protein scaffold, which can also help in the design of proteins with novel functions. PMID:25324305

  1. Diversity in copy number and structure of a silkworm morphogenetic gene as a result of domestication.

    PubMed

    Sakudoh, Takashi; Nakashima, Takeharu; Kuroki, Yoko; Fujiyama, Asao; Kohara, Yuji; Honda, Naoko; Fujimoto, Hirofumi; Shimada, Toru; Nakagaki, Masao; Banno, Yutaka; Tsuchida, Kozo

    2011-03-01

    The carotenoid-binding protein (CBP) of the domesticated silkworm, Bombyx mori, a major determinant of cocoon color, is likely to have been substantially influenced by domestication of this species. We analyzed the structure of the CBP gene in multiple strains of B. mori, in multiple individuals of the wild silkworm, B. mandarina (the putative wild ancestor of B. mori), and in a number of other lepidopterans. We found the CBP gene copy number in genomic DNA to vary widely among B. mori strains, ranging from 1 to 20. The copies of CBP are of several types, based on the presence of a retrotransposon or partial deletion of the coding sequence. In contrast to B. mori, B. mandarina was found to possess a single copy of CBP without the retrotransposon insertion, regardless of habitat. Several other lepidopterans were found to contain sequences homologous to CBP, revealing that this gene is evolutionarily conserved in the lepidopteran lineage. Thus, domestication can generate significant diversity of gene copy number and structure over a relatively short evolutionary time. © 2011 by the Genetics Society of America

  2. Diversity in Copy Number and Structure of a Silkworm Morphogenetic Gene as a Result of Domestication

    PubMed Central

    Sakudoh, Takashi; Nakashima, Takeharu; Kuroki, Yoko; Fujiyama, Asao; Kohara, Yuji; Honda, Naoko; Fujimoto, Hirofumi; Shimada, Toru; Nakagaki, Masao; Banno, Yutaka; Tsuchida, Kozo

    2011-01-01

    The carotenoid-binding protein (CBP) of the domesticated silkworm, Bombyx mori, a major determinant of cocoon color, is likely to have been substantially influenced by domestication of this species. We analyzed the structure of the CBP gene in multiple strains of B. mori, in multiple individuals of the wild silkworm, B. mandarina (the putative wild ancestor of B. mori), and in a number of other lepidopterans. We found the CBP gene copy number in genomic DNA to vary widely among B. mori strains, ranging from 1 to 20. The copies of CBP are of several types, based on the presence of a retrotransposon or partial deletion of the coding sequence. In contrast to B. mori, B. mandarina was found to possess a single copy of CBP without the retrotransposon insertion, regardless of habitat. Several other lepidopterans were found to contain sequences homologous to CBP, revealing that this gene is evolutionarily conserved in the lepidopteran lineage. Thus, domestication can generate significant diversity of gene copy number and structure over a relatively short evolutionary time. PMID:21242537

  3. Generation of Recombinant Polioviruses Harboring RNA Affinity Tags in the 5′ and 3′ Noncoding Regions of Genomic RNAs

    PubMed Central

    Flather, Dylan; Cathcart, Andrea L.; Cruz, Casey; Baggs, Eric; Ngo, Tuan; Gershon, Paul D.; Semler, Bert L.

    2016-01-01

    Despite being intensely studied for more than 50 years, a complete understanding of the enterovirus replication cycle remains elusive. Specifically, only a handful of cellular proteins have been shown to be involved in the RNA replication cycle of these viruses. In an effort to isolate and identify additional cellular proteins that function in enteroviral RNA replication, we have generated multiple recombinant polioviruses containing RNA affinity tags within the 3′ or 5′ noncoding region of the genome. These recombinant viruses retained RNA affinity sequences within the genome while remaining viable and infectious over multiple passages in cell culture. Further characterization of these viruses demonstrated that viral protein production and growth kinetics were unchanged or only slightly altered relative to wild type poliovirus. However, attempts to isolate these genetically-tagged viral genomes from infected cells have been hindered by high levels of co-purification of nonspecific proteins and the limited matrix-binding efficiency of RNA affinity sequences. Regardless, these recombinant viruses represent a step toward more thorough characterization of enterovirus ribonucleoprotein complexes involved in RNA replication. PMID:26861382

  4. Immunological cross-reactivity to multiple autoantigens in patients with liver kidney microsomal type 1 autoimmune hepatitis.

    PubMed

    Choudhuri, K; Gregorio, G V; Mieli-Vergani, G; Vergani, D

    1998-11-01

    We describe two patients with liver kidney microsomal antibody type 1 (LKM1)-positive autoimmune hepatitis (AIH) with associated endocrinopathies. The first patient had insulin-dependent diabetes (IDDM), and the second patient had Addison's disease and hypoparathyroidism, and is also positive for islet cell antibodies, without overt diabetes. To account for the existence of multiple endocrinopathy in these patients, we investigated whether there is sequence similarity between the target of LKM1 antibodies, cytochrome P4502D6 (CYP2D6), and other human proteins, and if so, whether this structural similarity produces a detectable cross-reactive immune response. Our database search identified two proteins, carboxypeptidase H, an autoantigen in insulin-dependent diabetes, and 21-hydroxylase, the major autoantigen in Addison's disease, that share sequence similarity to the second major LKM1 epitope on CYP2D6. We tested the reactivity of sera from these patients to the homologous regions of the three autoantigens using an enzyme-linked immunosorbent assay (ELISA). The cut-off for positivity was established by testing sera from 22 healthy children. To determine the significance of reactivity to the peptide homologues of the three autoantigens, we investigated 16 additional patients with LKM1 AIH and 20 children with chronic hepatitis B virus infection as pathological controls. We found that reactivity to the second major epitope of CYP2D6 is significantly associated with reactivity to the homologous regions of carboxypeptidase H (CPH) and 21-hydroxylase (21-OHase) in patients with LKM1 AIH, and that this simultaneous recognition is cross-reactive. We suggest that a cross-reactive immune response between homologous autoantigens may contribute to the development of multiple endocrinopathies in LKM1 AIH.

  5. Modeling and interoperability of heterogeneous genomic big data for integrative processing and querying.

    PubMed

    Masseroli, Marco; Kaitoua, Abdulrahman; Pinoli, Pietro; Ceri, Stefano

    2016-12-01

    While a huge amount of (epi)genomic data of multiple types is becoming available through Next Generation Sequencing (NGS) technologies, the most important emerging problem is the so-called tertiary analysis, concerned with sense making, e.g., discovering how different (epi)genomic regions and their products interact and cooperate with each other. We propose a paradigm shift in tertiary analysis, based on the use of the Genomic Data Model (GDM), a simple data model which links genomic feature data to their associated experimental, biological and clinical metadata. GDM encompasses all the data formats which have been produced for feature extraction from (epi)genomic datasets. We specifically describe the mapping to GDM of SAM (Sequence Alignment/Map), VCF (Variant Call Format), NARROWPEAK (for called peaks produced by NGS ChIP-seq or DNase-seq methods), and BED (Browser Extensible Data) formats, but GDM also supports all the formats describing experimental datasets (e.g., copy number variations, DNA somatic mutations, or gene expressions) and annotations (e.g., regarding transcription start sites, genes, enhancers or CpG islands). We downloaded and integrated samples of all the above-mentioned data types and formats from multiple sources. The GDM is able to homogeneously describe semantically heterogeneous data and lays the groundwork for data interoperability, e.g., achieved through the GenoMetric Query Language (GMQL), a high-level, declarative query language for genomic big data. The combined use of the data model and the query language allows comprehensive processing of multiple heterogeneous data, and supports the development of domain-specific data-driven computations and bio-molecular knowledge discovery. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. Multiple DNA and protein sequence alignment on a workstation and a supercomputer.

    PubMed

    Tajima, K

    1988-11-01

    This paper describes a multiple alignment method using a workstation and supercomputer. The method is based on the alignment of a set of aligned sequences with the new sequence, and uses a recursive procedure of such alignment. The alignment is executed in a reasonable computation time on diverse levels from a workstation to a supercomputer, from the viewpoint of alignment results and computational speed by parallel processing. The application of the algorithm is illustrated by several examples of multiple alignment of 12 amino acid and DNA sequences of HIV (human immunodeficiency virus) env genes. Colour graphic programs on a workstation and parallel processing on a supercomputer are discussed.

  7. Piscine reovirus: Genomic and molecular phylogenetic analysis from farmed and wild salmonids collected on the Canada/US Pacific Coast

    USGS Publications Warehouse

    Siah, Ahmed; Morrison, Diane B.; Fringuelli, Elena; Savage, Paul S.; Richmond, Zina; Purcell, Maureen K.; Johns, Robert; Johnson, Stewart C.; Saksida, Sonja M.

    2015-01-01

    Piscine reovirus (PRV) is a double stranded non-enveloped RNA virus detected in farmed and wild salmonids. This study examined the phylogenetic relationships among different PRV sequence types present in samples from salmonids in Western Canada and the US, including Alaska (US), British Columbia (Canada) and Washington State (US). Tissues testing positive for PRV were partially sequenced for segment S1, producing 71 sequences that grouped into 10 unique sequence types. Sequence analysis revealed no identifiable geographical or temporal variation among the sequence types. Identical sequence types were found in fish sampled in 2001, 2005 and 2014. In addition, PRV positive samples from fish derived from Alaska, British Columbia and Washington State share identical sequence types. Comparative analysis of the phylogenetic tree indicated that Canada/US Pacific Northwest sequences formed a subgroup with some Norwegian sequence types (group II), distinct from other Norwegian and Chilean sequences (groups I, III and IV). Representative PRV positive samples from farmed and wild fish in British Columbia and Washington State were subjected to genome sequencing using next generation sequencing methods. Individual analysis of each of the 10 partial segments indicated that the Canadian and US PRV sequence types clustered separately from available whole genome sequences of some Norwegian and Chilean sequences for all segments except the segment S4. In summary, PRV was genetically homogenous over a large geographic distance (Alaska to Washington State), and the sequence types were relatively stable over a 13 year period.

  8. Piscine Reovirus: Genomic and Molecular Phylogenetic Analysis from Farmed and Wild Salmonids Collected on the Canada/US Pacific Coast

    PubMed Central

    Siah, Ahmed; Morrison, Diane B.; Fringuelli, Elena; Savage, Paul; Richmond, Zina; Johns, Robert; Purcell, Maureen K.; Johnson, Stewart C.; Saksida, Sonja M.

    2015-01-01

    Piscine reovirus (PRV) is a double stranded non-enveloped RNA virus detected in farmed and wild salmonids. This study examined the phylogenetic relationships among different PRV sequence types present in samples from salmonids in Western Canada and the US, including Alaska (US), British Columbia (Canada) and Washington State (US). Tissues testing positive for PRV were partially sequenced for segment S1, producing 71 sequences that grouped into 10 unique sequence types. Sequence analysis revealed no identifiable geographical or temporal variation among the sequence types. Identical sequence types were found in fish sampled in 2001, 2005 and 2014. In addition, PRV positive samples from fish derived from Alaska, British Columbia and Washington State share identical sequence types. Comparative analysis of the phylogenetic tree indicated that Canada/US Pacific Northwest sequences formed a subgroup with some Norwegian sequence types (group II), distinct from other Norwegian and Chilean sequences (groups I, III and IV). Representative PRV positive samples from farmed and wild fish in British Columbia and Washington State were subjected to genome sequencing using next generation sequencing methods. Individual analysis of each of the 10 partial segments indicated that the Canadian and US PRV sequence types clustered separately from available whole genome sequences of some Norwegian and Chilean sequences for all segments except the segment S4. In summary, PRV was genetically homogenous over a large geographic distance (Alaska to Washington State), and the sequence types were relatively stable over a 13 year period. PMID:26536673

  9. TaxI: a software tool for DNA barcoding using distance methods

    PubMed Central

    Steinke, Dirk; Vences, Miguel; Salzburger, Walter; Meyer, Axel

    2005-01-01

    DNA barcoding is a promising approach to the diagnosis of biological diversity in which DNA sequences serve as the primary key for information retrieval. Most existing software for evolutionary analysis of DNA sequences was designed for phylogenetic analyses and, hence, those algorithms do not offer appropriate solutions for the rapid, but precise analyses needed for DNA barcoding, and are also unable to process the often large comparative datasets. We developed a flexible software tool for DNA taxonomy, named TaxI. This program calculates sequence divergences between a query sequence (taxon to be barcoded) and each sequence of a dataset of reference sequences defined by the user. Because the analysis is based on separate pairwise alignments, this software is also able to work with sequences characterized by multiple insertions and deletions that are difficult to align in large sequence sets (i.e. thousands of sequences) by multiple alignment algorithms because of computational restrictions. Here, we demonstrate the utility of this approach with two datasets of fish larvae and juveniles from Lake Constance and juvenile land snails under different models of sequence evolution. Sets of ribosomal 16S rRNA sequences, characterized by multiple indels, performed as well as or better than cox1 sequence sets in assigning sequences to species, demonstrating the suitability of rRNA genes for DNA barcoding. PMID:16214755
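
    The query-versus-reference comparison described in this record can be illustrated with a small sketch. It is not TaxI's code: the Needleman-Wunsch scoring values, the uncorrected p-distance, and the names global_align, p_distance and rank_references are all assumptions chosen for illustration. Each reference is aligned to the query separately, so no multiple alignment is needed.

```python
from typing import Dict, List, Tuple


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> Tuple[str, str]:
    """Needleman-Wunsch global alignment with a linear gap penalty (assumed scores)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a: List[str] = []
    out_b: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def p_distance(a: str, b: str) -> float:
    """Fraction of mismatched sites among aligned, gap-free columns."""
    aln_a, aln_b = global_align(a, b)
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def rank_references(query: str, references: Dict[str, str]) -> List[Tuple[str, float]]:
    """Compare the query with each reference separately; closest reference first."""
    scored = [(name, p_distance(query, seq)) for name, seq in references.items()]
    return sorted(scored, key=lambda item: item[1])
```

    Calling rank_references(query, {"sp_a": seq_a, "sp_b": seq_b}) returns the references ordered by divergence from the query. Because gapped columns are ignored in this sketch, indels do not contribute to the distance; a real tool may treat them differently and may apply a model of sequence evolution instead of the raw p-distance.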

  10. The evolution of neuropeptide signalling: insights from echinoderms.

    PubMed

    Semmens, Dean C; Elphick, Maurice R

    2017-09-01

    Neuropeptides are evolutionarily ancient mediators of neuronal signalling that regulate a wide range of physiological processes and behaviours in animals. Neuropeptide signalling has been investigated extensively in vertebrates and protostomian invertebrates, which include the ecdysozoans Drosophila melanogaster (Phylum Arthropoda) and Caenorhabditis elegans (Phylum Nematoda). However, until recently, an understanding of evolutionary relationships between neuropeptide signalling systems in vertebrates and protostomes has been impaired by a lack of genome/transcriptome sequence data from non-ecdysozoan invertebrates. The echinoderms, a deuterostomian phylum that includes sea urchins, sea cucumbers and starfish, have been particularly important in providing new insights into neuropeptide evolution. Sequencing of the genome of the sea urchin Strongylocentrotus purpuratus (Class Echinoidea) enabled discovery of (i) the first invertebrate thyrotropin-releasing hormone-type precursor, (ii) the first deuterostomian pedal peptide/orcokinin-type precursors and (iii) NG peptides, the 'missing link' between neuropeptide S in tetrapod vertebrates and crustacean cardioactive peptide in protostomes. More recently, sequencing of the neural transcriptome of the starfish Asterias rubens (Class Asteroidea) enabled identification of 40 neuropeptide precursors, including the first kisspeptin and melanin-concentrating hormone-type precursors to be identified outside of the chordates. Furthermore, the characterization of a corazonin-type neuropeptide signalling system in A. rubens has provided important new insights into the evolution of gonadotropin-releasing hormone-related neuropeptides. Looking forward, the discovery of multiple neuropeptide signalling systems in echinoderms provides opportunities to investigate how these systems are used to regulate physiological and behavioural processes in the unique context of a decentralized, pentaradial bauplan. © The Author 2017. Published by Oxford University Press.

  11. Pulsed-field gel electrophoresis and multi locus sequence typing for characterizing genotype variability of Yersinia ruckeri isolated from farmed fish in France.

    PubMed

    Calvez, Ségolène; Fournel, Catherine; Douet, Diane-Gaëlle; Daniel, Patrick

    2015-06-23

    Yersinia ruckeri is a pathogen that has an impact on aquaculture worldwide. The disease caused by this bacterial species, yersiniosis or redmouth disease, generates substantial economic losses due to the associated mortality and veterinary costs. For predicting outbreaks and improving control strategies, it is important to characterize the population structure of the bacteria. The phenotypic and genetic homogeneities described previously indicate a clonal population structure as observed in other fish bacteria. In this study, the pulsed-field gel electrophoresis (PFGE) and multi locus sequence typing (MLST) methods were used to describe a population of isolates from outbreaks on French fish farms. For the PFGE analysis, two enzymes (NotI and AscI) were used separately and together. Results from combining the enzymes showed the great homogeneity of the outbreak population with a similarity > 80.0% but a high variability within the cluster (cut-off value = 80.0%) with a total of 43 pulsotypes described and an index of diversity = 0.93. The dominant pulsotypes described with NotI (PtN4 and PtN7) have already been described in other European countries (Finland, Germany, Denmark, Spain and Italy). The MLST approach showed two dominant sequence types (ST31 and ST36), an epidemic structure of the French Y. ruckeri population and a preferentially clonal evolution for rainbow trout isolates. Our results point to multiple types of selection pressure on the Y. ruckeri population attributable to geographical origin, ecological niche specialization and movements of farmed fish.
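
    The record reports an index of diversity of 0.93 across 43 pulsotypes but does not say how the index was computed. A common choice for typing methods is Simpson's index of diversity in the Hunter-Gaston form, D = 1 - [1/(N(N-1))] Σ n_j(n_j - 1), where N is the number of isolates and n_j the number of isolates of type j. The sketch below assumes that formula; the function name simpson_diversity is hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable


def simpson_diversity(types: Iterable[Hashable]) -> float:
    """Simpson's index of diversity (Hunter-Gaston form) for a list of type labels.

    Returns the probability that two isolates drawn at random without
    replacement belong to different types.
    """
    counts = Counter(types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two isolates are required")
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))
```

    The input is one label per isolate, for example a list of pulsotype names. The index is 0 when all isolates share one type and 1 when every isolate has its own type.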

  12. A sequence-specific transcription activator motif and powerful synthetic variants that bind Mediator using a fuzzy protein interface.

    PubMed

    Warfield, Linda; Tuttle, Lisa M; Pacheco, Derek; Klevit, Rachel E; Hahn, Steven

    2014-08-26

    Although many transcription activators contact the same set of coactivator complexes, the mechanism and specificity of these interactions have been unclear. For example, do intrinsically disordered transcription activation domains (ADs) use sequence-specific motifs, or do ADs of seemingly different sequence have common properties that encode activation function? We find that the central activation domain (cAD) of the yeast activator Gcn4 functions through a short, conserved sequence-specific motif. Optimizing the residues surrounding this short motif by inserting additional hydrophobic residues creates very powerful ADs that bind the Mediator subunit Gal11/Med15 with high affinity via a "fuzzy" protein interface. In contrast to Gcn4, the activity of these synthetic ADs is not strongly dependent on any one residue of the AD, and this redundancy is similar to that of some natural ADs in which few if any sequence-specific residues have been identified. The additional hydrophobic residues in the synthetic ADs likely allow multiple faces of the AD helix to interact with the Gal11 activator-binding domain, effectively forming a fuzzier interface than that of the wild-type cAD.

  13. Phylogeny of 54 representative strains of species in the family Pasteurellaceae as determined by comparison of 16S rRNA sequences.

    PubMed Central

    Dewhirst, F E; Paster, B J; Olsen, I; Fraser, G J

    1992-01-01

    Virtually complete 16S rRNA sequences were determined for 54 representative strains of species in the family Pasteurellaceae. Of these strains, 15 were Pasteurella, 16 were Actinobacillus, and 23 were Haemophilus. A phylogenetic tree was constructed based on sequence similarity, using the Neighbor-Joining method. Fifty-three of the strains fell within four large clusters. The first cluster included the type strains of Haemophilus influenzae, H. aegyptius, H. aphrophilus, H. haemolyticus, H. paraphrophilus, H. segnis, and Actinobacillus actinomycetemcomitans. This cluster also contained A. actinomycetemcomitans FDC Y4, ATCC 29522, ATCC 29523, and ATCC 29524 and H. aphrophilus NCTC 7901. The second cluster included the type strains of A. seminis and Pasteurella aerogenes and H. somnus OVCG 43826. The third cluster was composed of the type strains of Pasteurella multocida, P. anatis, P. avium, P. canis, P. dagmatis, P. gallinarum, P. langaa, P. stomatis, P. volantium, H. haemoglobinophilus, H. parasuis, H. paracuniculus, H. paragallinarum, and A. capsulatus. This cluster also contained Pasteurella species A CCUG 18782, Pasteurella species B CCUG 19974, Haemophilus taxon C CAPM 5111, H. parasuis type 5 Nagasaki, P. volantium (H. parainfluenzae) NCTC 4101, and P. trehalosi NCTC 10624. The fourth cluster included the type strains of Actinobacillus lignieresii, A. equuli, A. pleuropneumoniae, A. suis, A. ureae, H. parahaemolyticus, H. parainfluenzae, H. paraphrohaemolyticus, H. ducreyi, and P. haemolytica. This cluster also contained Actinobacillus species strain CCUG 19799 (Bisgaard taxon 11), A. suis ATCC 15557, H. ducreyi ATCC 27722 and HD 35000, Haemophilus minor group strain 202, and H. parainfluenzae ATCC 29242. The type strain of P. pneumotropica branched alone to form a fifth group. The branching of the Pasteurellaceae family tree was quite complex. The four major clusters contained multiple subclusters. 
    The clusters contained both rapidly and slowly evolving strains (indicated by differing numbers of base changes incorporated into the 16S rRNA sequence relative to outgroup organisms). While the results presented a clear picture of the phylogenetic relationships, the complexity of the branching will make division of the family into genera a difficult and somewhat subjective task. We do not suggest any taxonomic changes at this time. PMID:1548238

  14. Using reconfigurable hardware to accelerate multiple sequence alignment with ClustalW.

    PubMed

    Oliver, Tim; Schmidt, Bertil; Nathan, Darran; Clemens, Ralf; Maskell, Douglas

    2005-08-15

    Aligning hundreds of sequences using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. We present a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware. This results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA.

  15. Mango: multiple alignment with N gapped oligos.

    PubMed

    Zhang, Zefeng; Lin, Hao; Li, Ming

    2008-06-01

    Multiple sequence alignment is a classical and challenging task. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art works suffer from the "once a gap, always a gap" phenomenon. Is there a radically new way to do multiple sequence alignment? In this paper, we introduce a novel and orthogonal multiple sequence alignment method, using both multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole and tries to build the alignment vertically, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds have proved significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks, showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0, and Kalign 2.0. We have further demonstrated the scalability of MANGO on very large datasets of repeat elements. MANGO can be downloaded at http://www.bioinfo.org.cn/mango/ and is free for academic usage.
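
    The contrast the record draws between spaced seeds and consecutive k-mers can be shown with a toy example. This is only an illustration of the general idea, not MANGO's algorithm: the seed pattern and the names spaced_key, index_spaced_seeds and seed_hits are assumptions. In a seed string, "1" marks a position that must match and "0" a position that is ignored.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def spaced_key(seq: str, start: int, seed: str) -> str:
    """Characters of seq at the '1' positions of the seed, starting at `start`."""
    return "".join(seq[start + k] for k, flag in enumerate(seed) if flag == "1")


def index_spaced_seeds(seq: str, seed: str) -> Dict[str, List[int]]:
    """Map each spaced key to the list of start positions where it occurs."""
    index: Dict[str, List[int]] = defaultdict(list)
    for start in range(len(seq) - len(seed) + 1):
        index[spaced_key(seq, start, seed)].append(start)
    return index


def seed_hits(a: str, b: str, seed: str) -> List[Tuple[int, int]]:
    """All (i, j) pairs where a at i and b at j agree on the seed's '1' positions."""
    index = index_spaced_seeds(a, seed)
    hits: List[Tuple[int, int]] = []
    for j in range(len(b) - len(seed) + 1):
        for i in index.get(spaced_key(b, j, seed), []):
            hits.append((i, j))
    return hits
```

    With the seed "1101", the windows "ACGT" and "ACTT" produce a hit because the third position is ignored, whereas the contiguous seed "1111" reports none. That tolerance to a mismatch inside the window is why spaced seeds tend to be more sensitive than consecutive k-mers of the same weight.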

  16. Optimized scheduling technique of null subcarriers for peak power control in 3GPP LTE downlink.

    PubMed

    Cho, Soobum; Park, Sang Kyu

    2014-01-01

    Orthogonal frequency division multiple access (OFDMA) is a key multiple access technique for the long term evolution (LTE) downlink. However, high peak-to-average power ratio (PAPR) can cause the degradation of power efficiency. The well-known PAPR reduction technique, dummy sequence insertion (DSI), can be a realistic solution because of its structural simplicity. However, the large usage of subcarriers for the dummy sequences may decrease the transmitted data rate in the DSI scheme. In this paper, a novel DSI scheme is applied to the LTE system. Firstly, we obtain the null subcarriers in single-input single-output (SISO) and multiple-input multiple-output (MIMO) systems, respectively; then, optimized dummy sequences are inserted into the obtained null subcarrier. Simulation results show that Walsh-Hadamard transform (WHT) sequence is the best for the dummy sequence and the ratio of 16 to 20 for the WHT and randomly generated sequences has the maximum PAPR reduction performance. The number of near optimal iteration is derived to prevent exhausted iterations. It is also shown that there is no bit error rate (BER) degradation with the proposed technique in LTE downlink system.
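
    As a rough illustration of the quantity being minimized, the sketch below computes the PAPR of one OFDM symbol and picks, from a list of candidate dummy sequences, the one that gives the lowest PAPR when written into the null subcarriers. It is a simplified sketch under stated assumptions: the function names, the plain (non-oversampled) inverse DFT, and the brute-force search are illustrative and are not the scheduling technique or the WHT-based sequences evaluated in the paper.

```python
import cmath
import math


def idft(freq):
    """Plain inverse DFT: frequency-domain subcarriers -> time-domain samples."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr_db(freq):
    """Peak-to-average power ratio of the time-domain symbol, in dB."""
    power = [abs(x) ** 2 for x in idft(freq)]
    return 10 * math.log10(max(power) / (sum(power) / len(power)))


def insert_dummy(data, null_idx, dummy):
    """Copy the symbol and write the dummy values into the null subcarriers."""
    out = list(data)
    for i, d in zip(null_idx, dummy):
        out[i] = d
    return out


def best_dummy(data, null_idx, candidates):
    """Brute-force choice of the candidate dummy sequence with the lowest PAPR."""
    return min(candidates, key=lambda d: papr_db(insert_dummy(data, null_idx, d)))
```

    Including the all-zero candidate in `candidates` guarantees that the selected symbol is never worse than leaving the null subcarriers empty.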

  17. Optimized Scheduling Technique of Null Subcarriers for Peak Power Control in 3GPP LTE Downlink

    PubMed Central

    Park, Sang Kyu

    2014-01-01

    Orthogonal frequency division multiple access (OFDMA) is a key multiple access technique for the long term evolution (LTE) downlink. However, high peak-to-average power ratio (PAPR) can cause the degradation of power efficiency. The well-known PAPR reduction technique, dummy sequence insertion (DSI), can be a realistic solution because of its structural simplicity. However, the large usage of subcarriers for the dummy sequences may decrease the transmitted data rate in the DSI scheme. In this paper, a novel DSI scheme is applied to the LTE system. Firstly, we obtain the null subcarriers in single-input single-output (SISO) and multiple-input multiple-output (MIMO) systems, respectively; then, optimized dummy sequences are inserted into the obtained null subcarrier. Simulation results show that Walsh-Hadamard transform (WHT) sequence is the best for the dummy sequence and the ratio of 16 to 20 for the WHT and randomly generated sequences has the maximum PAPR reduction performance. The number of near optimal iteration is derived to prevent exhausted iterations. It is also shown that there is no bit error rate (BER) degradation with the proposed technique in LTE downlink system. PMID:24883376

  18. The Evolution of Mobile DNAs: When Will Transposons Create Phylogenies That Look As If There Is a Master Gene?

    PubMed Central

    Brookfield, John F. Y.; Johnson, Louise J.

    2006-01-01

    Some families of mammalian interspersed repetitive DNA, such as the Alu SINE sequence, appear to have evolved by the serial replacement of one active sequence with another, consistent with there being a single source of transposition: the “master gene.” Alternative models, in which multiple source sequences are simultaneously active, have been called “transposon models.” Transposon models differ in the proportion of elements that are active and in whether inactivation occurs at the moment of transposition or later. Here we examine the predictions of various types of transposon model regarding the patterns of sequence variation expected at an equilibrium between transposition, inactivation, and deletion. Under the master gene model, all bifurcations in the true tree of elements occur in a single lineage. We show that this property will also hold approximately for transposon models in which most elements are inactive and where at least some of the inactivation events occur after transposition. Such tree shapes are therefore not conclusive evidence for a single source of transposition. PMID:16790583

  19. XPAT: a toolkit to conduct cross-platform association studies with heterogeneous sequencing datasets.

    PubMed

    Yu, Yao; Hu, Hao; Bohlender, Ryan J; Hu, Fulan; Chen, Jiun-Sheng; Holt, Carson; Fowler, Jerry; Guthery, Stephen L; Scheet, Paul; Hildebrandt, Michelle A T; Yandell, Mark; Huff, Chad D

    2018-04-06

    High-throughput sequencing data are increasingly being made available to the research community for secondary analyses, providing new opportunities for large-scale association studies. However, heterogeneity in target capture and sequencing technologies often introduce strong technological stratification biases that overwhelm subtle signals of association in studies of complex traits. Here, we introduce the Cross-Platform Association Toolkit, XPAT, which provides a suite of tools designed to support and conduct large-scale association studies with heterogeneous sequencing datasets. XPAT includes tools to support cross-platform aware variant calling, quality control filtering, gene-based association testing and rare variant effect size estimation. To evaluate the performance of XPAT, we conducted case-control association studies for three diseases, including 783 breast cancer cases, 272 ovarian cancer cases, 205 Crohn disease cases and 3507 shared controls (including 1722 females) using sequencing data from multiple sources. XPAT greatly reduced Type I error inflation in the case-control analyses, while replicating many previously identified disease-gene associations. We also show that association tests conducted with XPAT using cross-platform data have comparable performance to tests using matched platform data. XPAT enables new association studies that combine existing sequencing datasets to identify genetic loci associated with common diseases and other complex traits.

  20. Using Next Generation Sequencing for Multiplexed Trait-Linked Markers in Wheat

    PubMed Central

    Bernardo, Amy; Wang, Shan; St. Amand, Paul; Bai, Guihua

    2015-01-01

    With the advent of next generation sequencing (NGS) technologies, single nucleotide polymorphisms (SNPs) have become the major type of marker for genotyping in many crops. However, the availability of SNP markers for important traits of bread wheat ( Triticum aestivum L.) that can be effectively used in marker-assisted selection (MAS) is still limited and SNP assays for MAS are usually uniplex. A shift from uniplex to multiplex assays will allow the simultaneous analysis of multiple markers and increase MAS efficiency. We designed 33 locus-specific markers from SNP or indel-based marker sequences that linked to 20 different quantitative trait loci (QTL) or genes of agronomic importance in wheat and analyzed the amplicon sequences using an Ion Torrent Proton Sequencer and a custom allele detection pipeline to determine the genotypes of 24 selected germplasm accessions. Among the 33 markers, 27 were successfully multiplexed and 23 had 100% SNP call rates. Results from analysis of "kompetitive allele-specific PCR" (KASP) and sequence tagged site (STS) markers developed from the same loci fully verified the genotype calls of 23 markers. The NGS-based multiplexed assay developed in this study is suitable for rapid and high-throughput screening of SNPs and some indel-based markers in wheat. PMID:26625271

  1. DAMe: a toolkit for the initial processing of datasets with PCR replicates of double-tagged amplicons for DNA metabarcoding analyses.

    PubMed

    Zepeda-Mendoza, Marie Lisandra; Bohmann, Kristine; Carmona Baez, Aldo; Gilbert, M Thomas P

    2016-05-03

    DNA metabarcoding is an approach for identifying multiple taxa in an environmental sample using specific genetic loci and taxa-specific primers. When combined with high-throughput sequencing it enables the taxonomic characterization of large numbers of samples in a relatively time- and cost-efficient manner. One recent laboratory development is the addition of 5'-nucleotide tags to both primers producing double-tagged amplicons and the use of multiple PCR replicates to filter erroneous sequences. However, there is currently no available toolkit for the straightforward analysis of datasets produced in this way. We present DAMe, a toolkit for the processing of datasets generated by double-tagged amplicons from multiple PCR replicates derived from an unlimited number of samples. Specifically, DAMe can be used to (i) sort amplicons by tag combination, (ii) evaluate PCR replicates dissimilarity, and (iii) filter sequences derived from sequencing/PCR errors, chimeras, and contamination. This is attained by calculating the following parameters: (i) sequence content similarity between the PCR replicates from each sample, (ii) reproducibility of each unique sequence across the PCR replicates, and (iii) copy number of the unique sequences in each PCR replicate. We showcase the insights that can be obtained using DAMe prior to taxonomic assignment, by applying it to two real datasets that vary in their complexity regarding number of samples, sequencing libraries, PCR replicates, and used tag combinations. Finally, we use a third mock dataset to demonstrate the impact and importance of filtering the sequences with DAMe. DAMe allows the user-friendly manipulation of amplicons derived from multiple samples with PCR replicates built in a single or multiple sequencing libraries. 
    It allows the user to: (i) collapse amplicons into unique sequences and sort them by tag combination while retaining the sample identifier and copy number information, (ii) identify sequences carrying unused tag combinations, (iii) evaluate the comparability of PCR replicates of the same sample, and (iv) filter tagged amplicons from a number of PCR replicates using parameters of minimum length, copy number, and reproducibility across the PCR replicates. This enables an efficient analysis of complex datasets, and ultimately increases the ease of handling datasets from large-scale studies.
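
    The filtering step described above (minimum length, copy number, and reproducibility across PCR replicates) can be sketched as follows. This is a minimal illustration, not DAMe's code or command-line interface: the function name, the parameter names, and the input layout (one dictionary of sequence to copy number per replicate) are all assumptions.

```python
from collections import Counter


def filter_sequences(replicates, min_length=0, min_copies=2, min_replicates=2):
    """Keep sequences that are long enough and well supported across replicates.

    replicates: list of dicts, one per PCR replicate, mapping
                unique sequence -> copy number in that replicate.
    """
    support = Counter()
    for rep in replicates:
        for seq, copies in rep.items():
            # A replicate supports a sequence only if it has enough copies there.
            if len(seq) >= min_length and copies >= min_copies:
                support[seq] += 1
    return {seq for seq, n in support.items() if n >= min_replicates}


reps = [
    {"AAA": 5, "CCC": 1},
    {"AAA": 3, "GGG": 4},
    {"AAA": 1, "GGG": 2, "CCC": 9},
]
kept = filter_sequences(reps, min_copies=2, min_replicates=2)
```

    Here "CCC" is dropped because only one replicate contains it at two or more copies, which is the kind of non-reproducible sequence the abstract attributes to PCR or sequencing error.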

  2. CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment.

    PubMed

    Chen, Xi; Wang, Chen; Tang, Shanjiang; Yu, Ce; Zou, Quan

    2017-06-24

    Multiple sequence alignment (MSA) is a classic and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, MSA parallelization becomes necessary to keep its running time at an acceptable level. Although there is a lot of work on MSA problems, existing approaches are either insufficient or contain implicit assumptions that limit their generality. First, the properties of users' sequences, including the sizes of datasets and the lengths of sequences, can take arbitrary values and are generally unknown before submission, which previous work unfortunately ignores. Second, the center star strategy is suited for aligning similar sequences, but its first stage, center sequence selection, is highly time-consuming and requires further optimization. Moreover, given the heterogeneous CPU/GPU platform, prior studies consider MSA parallelization on GPU devices only, leaving the CPUs idle during the computation. Co-run computation, however, can maximize the utilization of the computing resources by running the workload on both CPU and GPU simultaneously. This paper presents CMSA, a robust and efficient MSA system for large-scale datasets on the heterogeneous CPU/GPU platform. It performs and optimizes multiple sequence alignment automatically for users' submitted sequences without any assumptions. CMSA adopts the co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of its center sequence selection process from O(mn^2) to O(mn). The experimental results show that CMSA achieves up to an 11× speedup and outperforms the state-of-the-art software. CMSA focuses on multiple similar RNA/DNA sequence alignment and proposes a novel bitmap-based algorithm to improve the center star strategy.
    We can conclude that harvesting the high performance of modern GPUs is a promising approach to accelerate multiple sequence alignment. Besides, adopting the co-run computation model can maximize the entire system utilization significantly. The source code is available at https://github.com/wangvsa/CMSA.
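
    For orientation, the center sequence selection stage of the center star strategy can be sketched in its naive form: compute all pairwise distances and pick the sequence with the smallest total distance to the others. This is the textbook baseline, not CMSA's bitmap-based algorithm; the use of unit-cost edit distance and the function names are assumptions.

```python
def edit_distance(a, b):
    """Unit-cost edit distance using two rolling rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def center_sequence(seqs):
    """Index of the sequence with the smallest summed distance to all others."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return min(range(len(seqs)), key=totals.__getitem__)


seqs = ["ACGT", "ACGA", "ACGT", "TCGT"]
center = center_sequence(seqs)
```

    Every pair of sequences is compared with a quadratic-time dynamic program here, which is why this stage dominates the running time and is the part the abstract says CMSA optimizes.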

  3. Diversity of Group I and II Clostridium botulinum Strains from France Including Recently Identified Subtypes.

    PubMed

    Mazuet, Christelle; Legeay, Christine; Sautereau, Jean; Ma, Laurence; Bouchier, Christiane; Bouvet, Philippe; Popoff, Michel R

    2016-06-13

    In France, human botulism is mainly food-borne intoxication, whereas infant botulism is rare. A total of 99 group I and II Clostridium botulinum strains including 59 type A (12 historical isolates [1947-1961], 43 from France [1986-2013], 3 from other countries, and 1 collection strain), 31 type B (3 historical, 23 recent isolates, 4 from other countries, and 1 collection strain), and 9 type E (5 historical, 3 isolates, and 1 collection strain) were investigated by botulinum locus gene sequencing and multilocus sequence typing analysis. Historical C. botulinum A strains mainly belonged to subtype A1 and sequence type (ST) 1, whereas recent strains exhibited a wide genetic diversity: subtype A1 in orfX or ha locus, A1(B), A1(F), A2, A2b2, A5(B2'), A5(B3'), as well as the recently identified A7 and A8 subtypes, and were distributed into 25 STs. Clostridium botulinum A1(B) was the most frequent subtype from food-borne botulism and food. Group I C. botulinum type B strains in France were mainly subtype B2 (14 out of 20 historical and recent strains) and were divided into 19 STs. Food-borne botulism resulting from ham consumption during the recent period was due to group II C. botulinum B4. Type E botulism is rare in France; 5 historical strains and 1 recent strain were subtype E3. A subtype E12 was recently identified from an unusual ham contamination. Clostridium botulinum strains from human botulism in France showed a wide genetic diversity and seem to result not from a single evolutionary lineage but from multiple and independent genetic rearrangements. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  4. Use of PCR with Sequence-specific Primers for High-Resolution Human Leukocyte Antigen Typing of Patients with Narcolepsy

    PubMed Central

    Woo, Hye In; Joo, Eun Yeon; Lee, Kyung Wha

    2012-01-01

    Background: Narcolepsy is a neurologic disorder characterized by excessive daytime sleepiness, symptoms of abnormal rapid eye movement (REM) sleep, and a strong association with HLA-DRB1*1501, -DQA1*0102, and -DQB1*0602. Here, we investigated the clinico-physical characteristics of Korean patients with narcolepsy, their HLA types, and the clinical utility of high-resolution PCR with sequence-specific primers (PCR-SSP) as a simple typing method for identifying DRB1*15/16, DQA1, and DQB1 alleles. Methods: The study population consisted of 67 consecutively enrolled patients having unexplained daytime sleepiness and diagnosed narcolepsy based on clinical and neurological findings. Clinical data and the results of the multiple sleep latency test and polysomnography were reviewed, and HLA typing was performed using both high-resolution PCR-SSP and sequence-based typing (SBT). Results: The 44 narcolepsy patients with cataplexy displayed significantly higher frequencies of DRB1*1501 (Pc=0.003), DQA1*0102 (Pc=0.001), and DQB1*0602 (Pc=0.014) than the patients without cataplexy. Among patients carrying DRB1*1501-DQB1*0602 or DQA1*0102, the frequencies of a mean REM sleep latency of less than 20 min in nocturnal polysomnography and clinical findings, including sleep paralysis and hypnagogic hallucination, were significantly higher. SBT and PCR-SSP showed 100% concordance for high-resolution typing of DRB1*15/16 alleles and DQA1 and DQB1 loci. Conclusions: The clinical characteristics and somnographic findings of narcolepsy patients were associated with specific HLA alleles, including DRB1*1501, DQA1*0102, and DQB1*0602. Application of high-resolution PCR-SSP, a reliable and simple method, for both allele- and locus-specific HLA typing of DRB1*15/16, DQA1, and DQB1 would be useful for characterizing clinical status among subjects with narcolepsy. PMID:22259780

  5. Learning of goal-relevant and -irrelevant complex visual sequences in human V1.

    PubMed

    Rosenthal, Clive R; Mallik, Indira; Caballero-Gaudes, Cesar; Sereno, Martin I; Soto, David

    2018-06-12

    Learning and memory are supported by a network involving the medial temporal lobe and linked neocortical regions. Emerging evidence indicates that primary visual cortex (i.e., V1) may contribute to recognition memory, but this has been tested only with a single visuospatial sequence as the target memorandum. The present study used functional magnetic resonance imaging to investigate whether human V1 can support the learning of multiple, concurrent complex visual sequences involving discontinuous (second-order) associations. Two peripheral, goal-irrelevant but structured sequences of orientated gratings appeared simultaneously in fixed locations of the right and left visual fields alongside a central, goal-relevant sequence that was in the focus of spatial attention. Pseudorandom sequences were introduced at multiple intervals during the presentation of the three structured visual sequences to provide an online measure of sequence-specific knowledge at each retinotopic location. We found that a network involving the precuneus and V1 was involved in learning the structured sequence presented at central fixation, whereas right V1 was modulated by repeated exposure to the concurrent structured sequence presented in the left visual field. The same result was not found in left V1. These results indicate for the first time that human V1 can support the learning of multiple concurrent sequences involving complex discontinuous inter-item associations, even peripheral sequences that are goal-irrelevant. Copyright © 2018. Published by Elsevier Inc.

  6. Phylo-mLogo: an interactive and hierarchical multiple-logo visualization tool for alignment of many sequences

    PubMed Central

    Shih, Arthur Chun-Chieh; Lee, DT; Peng, Chin-Lin; Wu, Yu-Wei

    2007-01-01

    Background: When several hundred or thousands of sequences, such as epidemic virus sequences or homologous/orthologous sequences of some big gene families, are aligned to reconstruct their epidemiological history or phylogenies, analyzing and visualizing the alignment results has become a new challenge for computational biologists. Although there are several tools available for visualization of very long sequence alignments, few of them are applicable to the alignments of many sequences. Results: A multiple-logo alignment visualization tool, called Phylo-mLogo, is presented in this paper. Phylo-mLogo calculates the variabilities and homogeneities of alignment sequences by base frequencies or entropies. Unlike traditional representations of sequence logos, Phylo-mLogo not only displays the global logo patterns of the whole alignment of multiple sequences, but also shows the local homologous logos for each clade hierarchically. In addition, Phylo-mLogo allows the user to focus only on the analysis of some important, structurally or functionally constrained sites in the alignment selected by the user or by built-in automatic calculation. Conclusion: With Phylo-mLogo, the user can symbolically and hierarchically visualize hundreds of aligned sequences simultaneously and easily check the changes of their amino acid sites when analyzing many homologous/orthologous or influenza virus sequences. More information on Phylo-mLogo can be found at URL . PMID:17319966
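
    The per-column variability measure mentioned above (entropy of residue frequencies) can be sketched in a few lines. This is a generic Shannon-entropy calculation, not Phylo-mLogo's implementation; the function name and the decision to count gap characters as ordinary symbols are assumptions.

```python
import math
from collections import Counter


def column_entropies(alignment):
    """Shannon entropy (bits) of each column of an equal-length alignment."""
    entropies = []
    for column in zip(*alignment):
        counts = Counter(column)   # gaps, if present, count as a symbol here
        total = len(column)
        h = -sum(c / total * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


alignment = ["ACGT", "ACGA", "ACCA", "ATCA"]
whole = column_entropies(alignment)
# A per-clade view is the same calculation on a subset of rows.
clade = column_entropies(alignment[:2])
```

    A fully conserved column scores 0 bits, and a column split evenly between two residues scores 1 bit, so comparing the whole-alignment values with the per-clade values points to sites that vary between clades but are conserved within one.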

  7. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities.

    PubMed

    Troshin, Peter V; Postis, Vincent Lg; Ashworth, Denise; Baldwin, Stephen A; McPherson, Michael J; Barton, Geoffrey J

    2011-03-07

    Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  8. Evidence of Divergent Amino Acid Usage in Comparative Analyses of R5- and X4-Associated HIV-1 Vpr Sequences

    PubMed Central

    Antell, Gregory C.; Zhong, Wen; Kercher, Katherine; Passic, Shendra; Williams, Jean; Liu, Yucheng; James, Tony; Jacobson, Jeffrey M.; Szep, Zsofia

    2017-01-01

    Vpr is an HIV-1 accessory protein that plays numerous roles during viral replication, some of which are cell type dependent. To test the hypothesis that HIV-1 tropism extends beyond the envelope into the vpr gene, studies were performed to identify the associations between coreceptor usage and Vpr variation in HIV-1-infected patients. Colinear HIV-1 Env-V3 and Vpr amino acid sequences were obtained from the LANL HIV-1 sequence database and from well-suppressed patients in the Drexel/Temple Medicine CNS AIDS Research and Eradication Study (CARES) Cohort. Genotypic classification of Env-V3 sequences as X4 (CXCR4-utilizing) or R5 (CCR5-utilizing) was used to group colinear Vpr sequences. To reveal the sequences associated with a specific coreceptor usage genotype, Vpr amino acid sequences were assessed for amino acid diversity and Jensen-Shannon divergence between the two groups. Five amino acid alphabets were used to comprehensively examine the impact of amino acid substitutions involving side chains with similar physicochemical properties. Positions 36, 37, 41, 89, and 96 of Vpr were characterized by statistically significant divergence across multiple alphabets when X4 and R5 sequence groups were compared. In addition, consensus amino acid switches were found at positions 37 and 41 in comparisons of the R5 and X4 sequence populations. These results suggest an evolutionary link between Vpr and gp120 in HIV-1-infected patients. PMID:28620613
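
    The position-wise Jensen-Shannon divergence between two sequence groups can be sketched as below. This shows only the standard definition (base-2 logarithm, so values lie between 0 and 1); the function names, the 20-letter alphabet string, and the toy sequences are assumptions, and the study's statistical testing and reduced alphabets are not reproduced.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def distribution(residues, alphabet=AMINO_ACIDS):
    """Relative frequency of each alphabet letter; other symbols are ignored."""
    counts = Counter(residues)
    total = sum(counts[a] for a in alphabet)  # assumes at least one valid residue
    return [counts[a] / total for a in alphabet]


def js_divergence(p, q):
    """Jensen-Shannon divergence of two distributions, in bits."""
    m = [(x + y) / 2 for x, y in zip(p, q)]

    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def position_divergence(group_a, group_b, pos):
    """Divergence between two groups of aligned sequences at one column."""
    p = distribution([s[pos] for s in group_a])
    q = distribution([s[pos] for s in group_b])
    return js_divergence(p, q)


r5 = ["AR", "AR", "AK"]   # toy stand-ins for R5-associated sequences
x4 = ["AK", "AK", "AK"]   # toy stand-ins for X4-associated sequences
scores = [position_divergence(r5, x4, i) for i in range(2)]
```

    Repeating the calculation after mapping residues onto a reduced alphabet, for example by grouping similar side chains under one symbol, would show whether a divergent position reflects a change in physicochemical class or only a conservative substitution.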

  9. Long-range barcode labeling-sequencing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Feng; Zhang, Tao; Singh, Kanwar K.

    Methods for sequencing single large DNA molecules by clonal multiple displacement amplification using barcoded primers. Sequences are binned based on barcode sequences and sequenced using a microdroplet-based method for sequencing large polynucleotide templates to enable assembly of haplotype-resolved complex genomes and metagenomes.

  10. Abundant and diverse clustered regularly interspaced short palindromic repeat spacers in Clostridium difficile strains and prophages target multiple phage types within this pathogen.

    PubMed

    Hargreaves, Katherine R; Flores, Cesar O; Lawley, Trevor D; Clokie, Martha R J

    2014-08-26

    Clostridium difficile is an important human-pathogenic bacterium causing antibiotic-associated nosocomial infections worldwide. Mobile genetic elements and bacteriophages have helped shape C. difficile genome evolution. In many bacteria, phage infection may be controlled by a form of bacterial immunity called the clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas) system. This uses acquired short nucleotide sequences (spacers) to target homologous sequences (protospacers) in phage genomes. C. difficile carries multiple CRISPR arrays, and in this paper we examine the relationships between the host- and phage-carried elements of the system. We detected multiple matches between spacers and regions in 31 C. difficile phage and prophage genomes. A subset of the spacers was located in prophage-carried CRISPR arrays. The CRISPR spacer profiles generated suggest that related phages would have similar host ranges. Furthermore, we show that C. difficile strains of the same ribotype could either have similar or divergent CRISPR contents. Both synonymous and nonsynonymous mutations in the protospacer sequences were identified, as well as differences in the protospacer adjacent motif (PAM), which could explain how phages escape this system. This paper illustrates how the distribution and diversity of CRISPR spacers in C. difficile, and its prophages, could modulate phage predation for this pathogen and impact upon its evolution and pathogenicity. Clostridium difficile is a significant bacterial human pathogen which undergoes continual genome evolution, resulting in the emergence of new virulent strains. Phages are major facilitators of genome evolution in other bacterial species, and we use sequence analysis-based approaches in order to examine whether the CRISPR/Cas system could control these interactions across divergent C. difficile strains. 
    The presence of spacer sequences in prophages that are homologous to phage genomes raises an extra level of complexity in this predator-prey microbial system. Our results demonstrate that the impact of phage infection in this system is widespread and that the CRISPR/Cas system is likely to be an important aspect of the evolutionary dynamics in C. difficile. Copyright © 2014 Hargreaves et al.

  11. High-speed multiple sequence alignment on a reconfigurable platform.

    PubMed

    Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf

    2006-01-01

    Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences by popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.
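
    A linear systolic array can speed up this dynamic program because all cells on one anti-diagonal of the table depend only on the two previous anti-diagonals, so they can be computed at the same time. The sketch below walks the table in that order in software. It is an illustration of the access pattern only: unit-cost edit distance is an assumed stand-in for the distance actually used, and nothing here models the FPGA design.

```python
def edit_distance_wavefront(a, b):
    """Unit-cost edit distance, filling the DP table one anti-diagonal at a time."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    # Anti-diagonal k holds every interior cell with i + j == k.
    for k in range(2, m + n + 1):
        # Cells on one anti-diagonal are independent of each other; in
        # hardware each would be handled by its own processing element.
        for i in range(max(1, k - n), min(m, k - 1) + 1):
            j = k - i
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[m][n]
```

    With one processing element per cell of an anti-diagonal, the number of sequential steps falls from the product of the two sequence lengths to roughly their sum.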

  12. MiR-191 Regulates Primary Human Fibroblast Proliferation and Directly Targets Multiple Oncogenes

    PubMed Central

    Polioudakis, Damon; Abell, Nathan S.; Iyer, Vishwanath R.

    2015-01-01

    miRNAs play a central role in numerous pathologies including multiple cancer types. miR-191 has predominantly been studied as an oncogene, but the role of miR-191 in the proliferation of primary cells is not well characterized, and the miR-191 targetome has not been experimentally profiled. Here we utilized RNA induced silencing complex immunoprecipitations as well as gene expression profiling to construct a genome wide miR-191 target profile. We show that miR-191 represses proliferation in primary human fibroblasts, identify multiple proto-oncogenes as novel miR-191 targets, including CDK9, NOTCH2, and RPS6KA3, and present evidence that miR-191 extensively mediates target expression through coding sequence (CDS) pairing. Our results provide a comprehensive genome wide miR-191 target profile, and demonstrate miR-191’s regulation of primary human fibroblast proliferation. PMID:25992613

  13. Institutional Protocol to Manage Consanguinity Detected by Genetic Testing in Pregnancy in a Minor

    PubMed Central

    Chen, Laura P.; Beck, Anita E.; Tsuchiya, Karen D.; Chow, Penny M.; Mirzaa, Ghayda M.; Wiester, Rebecca T.

    2015-01-01

    Single-nucleotide polymorphism arrays and other types of genetic tests have the potential to detect first-degree consanguinity and uncover parental rape in cases of minor teenage pregnancy. We present 2 cases in which genetic testing identified parental rape of a minor teenager. In case 1, single-nucleotide polymorphism array in a patient with multiple developmental abnormalities demonstrated multiple long stretches of homozygosity, revealing parental rape of a teenage mother. In case 2, a vague maternal sexual assault history and diagnosis of Pompe disease by direct gene sequencing identified parental rape of a minor. Given the medical, legal, and ethical implications of such revelations, a protocol was developed at our institution to manage consanguinity identified via genetic testing. PMID:25687148

  14. Human Neoplasms Elicit Multiple Specific Immune Responses in the Autologous Host

    NASA Astrophysics Data System (ADS)

    Sahin, Ugur; Tureci, Ozlem; Schmitt, Holger; Cochlovius, Bjorn; Johannes, Thomas; Schmits, Rudolf; Stenner, Frank; Luo, Guorong; Schobert, Ingrid; Pfreundschuh, Michael

    1995-12-01

    Expression of cDNA libraries from human melanoma, renal cancer, astrocytoma, and Hodgkin disease in Escherichia coli and screening for clones reactive with high-titer IgG antibodies in autologous patient serum led to the discovery of at least four antigens with a restricted expression pattern in each tumor. Besides antigens known to elicit T-cell responses, such as MAGE-1 and tyrosinase, numerous additional antigens that were overexpressed or specifically expressed in tumors of the same type were identified. Sequence analyses suggest that many of these molecules, besides being the target of a specific immune response, might be of relevance for tumor growth. Antibodies to a given antigen were usually confined to patients with the same tumor type. The unexpected frequency of human tumor antigens, which can be readily defined at the molecular level by the serological analysis of autologous tumor cDNA expression cloning, indicates that human neoplasms elicit multiple specific immune responses in the autologous host and provides diagnostic and therapeutic approaches to human cancer.

  15. Multiple templates-based homology modeling enhances structure quality of AT1 receptor: validation by molecular dynamics and antagonist docking.

    PubMed

    Sokkar, Pandian; Mohandass, Shylajanaciyar; Ramachandran, Murugesan

    2011-07-01

    We present a comparative account of 3D structures of the human type-1 receptor (AT1) for angiotensin II (AngII), modeled using three different methodologies. AngII activates a wide spectrum of signaling responses via the AT1 receptor that mediates physiological control of blood pressure and diverse pathological actions in cardiovascular, renal, and other cell types. Availability of a 3D model of the AT1 receptor would significantly enhance the development of new drugs for cardiovascular diseases. However, the low sequence similarity of available templates to the AT1 receptor complicates straightforward homology modeling, and hence there is a need to evaluate different modeling methodologies in order to use the models for sensitive applications such as rational drug design. Three models were generated for the AT1 receptor by (1) homology modeling with bovine rhodopsin as template, (2) homology modeling with multiple templates, and (3) threading using the I-TASSER web server. Molecular dynamics (MD) simulation (15 ns) of the models in an explicit membrane-water system, Ramachandran plot analysis, and molecular docking with antagonists led to the conclusion that multiple template-based homology modeling outperforms the other methodologies for AT1 modeling.

  16. Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation.

    PubMed

    Szatkiewicz, Jin P; Wang, WeiBo; Sullivan, Patrick F; Wang, Wei; Sun, Wei

    2013-02-01

    Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental biases in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth-based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth-based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.

  17. System, method and apparatus for generating phrases from a database

    NASA Technical Reports Server (NTRS)

    McGreevy, Michael W. (Inventor)

    2004-01-01

    Phrase generation is a method of generating sequences of terms, such as phrases, that may occur within a database of subsets containing sequences of terms, such as text. A database is provided and a relational model of the database is created. A query is then input. The query includes a term or a sequence of terms or multiple individual terms or multiple sequences of terms or combinations thereof. Next, several sequences of terms that are contextually related to the query are assembled from contextual relations in the model of the database. The sequences of terms are then sorted and output. Phrase generation can also be an iterative process used to produce sequences of terms from a relational model of a database.

  18. Symmetric convolution of asymmetric multidimensional sequences using discrete trigonometric transforms.

    PubMed

    Foltz, T M; Welsh, B M

    1999-01-01

    This paper uses the fact that the discrete Fourier transform diagonalizes a circulant matrix to provide an alternate derivation of the symmetric convolution-multiplication property for discrete trigonometric transforms. Derived in this manner, the symmetric convolution-multiplication property extends easily to multiple dimensions using the notion of block circulant matrices and generalizes to multidimensional asymmetric sequences. The symmetric convolution of multidimensional asymmetric sequences can then be accomplished by taking the product of the trigonometric transforms of the sequences and then applying an inverse trigonometric transform to the result. An example is given of how this theory can be used for applying a two-dimensional (2-D) finite impulse response (FIR) filter with nonlinear phase which models atmospheric turbulence.

  19. Bisulfite-independent analysis of CpG island methylation enables genome-scale stratification of single cells.

    PubMed

    Han, Lin; Wu, Hua-Jun; Zhu, Haiying; Kim, Kun-Yong; Marjani, Sadie L; Riester, Markus; Euskirchen, Ghia; Zi, Xiaoyuan; Yang, Jennifer; Han, Jasper; Snyder, Michael; Park, In-Hyun; Irizarry, Rafael; Weissman, Sherman M; Michor, Franziska; Fan, Rong; Pan, Xinghua

    2017-06-02

    Conventional DNA bisulfite sequencing has been extended to single cell level, but the coverage consistency is insufficient for parallel comparison. Here we report a novel method for genome-wide CpG island (CGI) methylation sequencing for single cells (scCGI-seq), combining methylation-sensitive restriction enzyme digestion and multiple displacement amplification for selective detection of methylated CGIs. We applied this method to analyzing single cells from two types of hematopoietic cells, K562 and GM12878, and small populations of fibroblasts and induced pluripotent stem cells. The method detected 21 798 CGIs (76% of all CGIs) per cell, and the number of CGIs consistently detected from all 16 profiled single cells was 20 864 (72.7%), with 12 961 promoters covered. This coverage represents a substantial improvement over results obtained using single cell reduced representation bisulfite sequencing, with a 66-fold increase in the fraction of consistently profiled CGIs across individual cells. Single cells of the same type were more similar to each other than to other types, but also displayed epigenetic heterogeneity. The method was further validated by comparing the CpG methylation pattern, methylation profile of CGIs/promoters and repeat regions and 41 classes of known regulatory markers to the ENCODE data. Although not every minor methylation difference between cells is detectable, scCGI-seq provides a solid tool for unsupervised stratification of a heterogeneous cell population. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Methods for MHC genotyping in non-model vertebrates.

    PubMed

    Babik, W

    2010-03-01

    Genes of the major histocompatibility complex (MHC) are considered a paradigm of adaptive evolution at the molecular level and as such are frequently investigated by evolutionary biologists and ecologists. Accurate genotyping is essential for understanding of the role that MHC variation plays in natural populations, but may be extremely challenging. Here, I discuss the DNA-based methods currently used for genotyping MHC in non-model vertebrates, as well as techniques likely to find widespread use in the future. I also highlight the aspects of MHC structure that are relevant for genotyping, and detail the challenges posed by the complex genomic organization and high sequence variation of MHC loci. Special emphasis is placed on designing appropriate PCR primers, accounting for artefacts and the problem of genotyping alleles from multiple, co-amplifying loci, a strategy which is frequently necessary due to the structure of the MHC. The suitability of typing techniques is compared in various research situations, strategies for efficient genotyping are discussed and areas of likely progress in future are identified. This review addresses the well established typing methods such as the Single Strand Conformation Polymorphism (SSCP), Denaturing Gradient Gel Electrophoresis (DGGE), Reference Strand Conformational Analysis (RSCA) and cloning of PCR products. In addition, it includes the intriguing possibility of direct amplicon sequencing followed by the computational inference of alleles and also next generation sequencing (NGS) technologies; the latter technique may, in the future, find widespread use in typing complex multilocus MHC systems. © 2009 Blackwell Publishing Ltd.

  1. Enterovirus Migration Patterns between France and Tunisia.

    PubMed

    Othman, Ines; Mirand, Audrey; Slama, Ichrak; Mastouri, Maha; Peigue-Lafeuille, Hélène; Aouni, Mahjoub; Bailly, Jean-Luc

    2015-01-01

    The enterovirus (EV) types echovirus (E-) 5, E-9, and E-18, and coxsackievirus (CV-) A9 are infrequently reported in human diseases and their epidemiologic features are poorly defined. Virus transmission patterns between countries have been estimated with phylogenetic data derived from the 1D/VP1 and 3CD gene sequences of a sample of 74 strains obtained in France (2000-2012) and Tunisia (2011-2013) and from the publicly available sequences. The EV types (E-5, E-9, and E-18) exhibited a lower worldwide genetic diversity (respective number of genogroups: 4, 5, and 3) in comparison to CV-A9 (n = 10). The phylogenetic trees estimated with both 1D/VP1 and 3CD sequence data showed variations in the number of co-circulating lineages over the last 20 years among the four EV types. Despite the low number of genogroups in E-18, the virus exhibited the highest number of recombinant 3CD lineages (n = 10) versus 4 (E-5) to 8 (E-9). The phylogenies provided evidence of multiple transportation events between France and Tunisia involving E-5, E-9, E-18, and CV-A9 strains. Virus spread events between France and 17 other countries in five continents had high probabilities of occurrence, as did those between Tunisia and two European countries other than France. All transportation events were supported by BF values > 10. Inferring the source of virus transmission from phylogenetic data may provide insights into the patterns of sporadic and epidemic diseases caused by EVs.

  2. Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

    PubMed Central

    2012-01-01

    Background: Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results: Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species; however, higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions: The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678

  3. An epidemiological survey of bovine Babesia and Theileria parasites in cattle, buffaloes, and sheep in Egypt.

    PubMed

    Elsify, Ahmed; Sivakumar, Thillaiampalam; Nayel, Mohammed; Salama, Akram; Elkhtam, Ahmed; Rizk, Mohamed; Mosaab, Omar; Sultan, Khaled; Elsayed, Shimaa; Igarashi, Ikuo; Yokoyama, Naoaki

    2015-02-01

    Cattle, buffaloes, and sheep are the main sources of meat and milk in Egypt, but their productivity is thought to be greatly reduced by hemoprotozoan parasitic diseases. In this study, we analyzed the infection rates of Babesia bovis, Babesia bigemina, Theileria annulata, and Theileria orientalis, using parasite-specific PCR assays in blood-DNA samples sourced from cattle (n=439), buffaloes (n=50), and sheep (n=105) reared in Menoufia, Behera, Giza, and Sohag provinces of Egypt. In cattle, the positive rates of B. bovis, B. bigemina, T. annulata, and T. orientalis were 3.18%, 7.97%, 9.56%, and 0.68%, respectively. On the other hand, B. bovis and T. orientalis were the only parasites detected in buffaloes and each of these parasites was only found in two individual DNA samples (both 2%), while one (0.95%) and two (1.90%) of the sheep samples were positive for B. bovis and B. bigemina, respectively. Sequence analysis showed that the B. bovis Rhoptry Associated Protein-1 and the B. bigemina Apical Membrane Antigen-1 genes were highly conserved among the samples, with 99.3-100% and 95.3-100% sequence identity values, respectively. In contrast, the Egyptian T. annulata merozoite surface antigen-1 gene sequences were relatively diverse (87.8-100% identity values), dispersing themselves across several clades in the phylogenetic tree containing sequences from other countries. Additionally, the T. orientalis Major Piroplasm Surface Protein (MPSP) gene sequences were classified as types 1 and 2. This is the first report of T. orientalis in Egypt, and of type 2 MPSP in buffaloes. Detection of MPSP type 2, which is considered a relatively virulent genotype, suggests that T. orientalis infection may have veterinary and economic significance in Egypt. In conclusion, the present study, which analyzed multiple species of Babesia and Theileria parasites in different livestock animals, may shed additional light on the epidemiology of hemoprotozoan parasites in Egypt. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  4. Multiple regulatory mechanisms of hepatocyte growth factor expression in malignant cells with a short poly(dA) sequence in the HGF gene promoter.

    PubMed

    Sakai, Kazuko; Takeda, Masayuki; Okamoto, Isamu; Nakagawa, Kazuhiko; Nishio, Kazuto

    2015-01-01

    Hepatocyte growth factor (HGF) expression is a poor prognostic factor in various types of cancer. Expression levels of HGF have been reported to be regulated by shorter poly(dA) sequences in the promoter region. In the present study, the poly(dA) mononucleotide tract in various types of human cancer cell lines was examined and compared with the HGF expression levels in those cells. Short deoxyadenosine repeat sequences were detected in five of the 55 cell lines used in the present study. The H69, IM95, CCK-81, Sui73 and H28 cells exhibited a truncated poly(dA) sequence in which the number of poly(dA) repeats was reduced by ≥5 bp. Two of the cell lines exhibited high HGF expression, determined by reverse transcription quantitative polymerase chain reaction and enzyme-linked immunosorbent assay. The CCK-81, Sui73 and H28 cells with shorter poly(dA) sequences exhibited low HGF expression. The cause of the suppression of HGF expression in the CCK-81, Sui73 and H28 cells was clarified by two approaches, suppression by methylation and single nucleotide polymorphisms in the HGF gene. Exposure to 5-Aza-dC, an inhibitor of DNA methyltransferase 1, induced an increased expression of HGF in the CCK-81 cells, but not in the other cells. Single-nucleotide polymorphism (SNP) rs72525097 in intron 1 was detected in the Sui73 and H28 cells. Taken together, it was found that the defect of poly(dA) in the HGF promoter was present in various types of cancer, including lung, stomach, colorectal, pancreas and mesothelioma. The present study proposes the negative regulation mechanisms by methylation and SNP in intron 1 of HGF for HGF expression in cancer cells with short poly(dA).

  5. Identification of Prostate Cancer-Specific microDNAs

    DTIC Science & Technology

    2016-02-01

    circular DNA by rolling circle amplification (RCA) and then amplified DNA fragments were subject to deep sequencing. Deep sequencing of the...demonstrate the existence of microDNAs in prostate cancer. We adopted multiple displacement amplification (MDA) with random 2 primers for enriched...prostate cancer cells through multiple displacement amplification and next generation sequencing.

  6. mtDNA control-region sequence variation suggests multiple independent origins of an "Asian-specific" 9-bp deletion in sub-Saharan Africans.

    PubMed Central

    Soodyall, H.; Vigilant, L.; Hill, A. V.; Stoneking, M.; Jenkins, T.

    1996-01-01

    The intergenic COII/tRNA(Lys) 9-bp deletion in human mtDNA, which is found at varying frequencies in Asia, Southeast Asia, Polynesia, and the New World, was also found in 81 of 919 sub-Saharan Africans. Using mtDNA control-region sequence data from a subset of 41 individuals with the deletion, we identified 22 unique mtDNA types associated with the deletion in Africa. A comparison of the unique mtDNA types from sub-Saharan Africans and Asians with the 9-bp deletion revealed that sub-Saharan Africans and Asians have sequence profiles that differ in the locations and frequencies of variant sites. Both phylogenetic and mismatch-distribution analysis suggest that 9-bp deletion arose independently in sub-Saharan Africa and Asia and that the deletion has arisen more than once in Africa. Within Africa, the deletion was not found among Khoisan peoples and was rare to absent in western and southwestern African populations, but it did occur in Pygmy and Negroid populations from central Africa and in Malawi and southern African Bantu-speakers. The distribution of the 9-bp deletion in Africa suggests that the deletion could have arisen in central Africa and was then introduced to southern Africa via the recent "Bantu expansion." PMID:8644719

  7. Aspergillus Section Fumigati Typing by PCR-Restriction Fragment Polymorphism

    PubMed Central

    Staab, Janet F.; Balajee, S. Arunmozhi; Marr, Kieren A.

    2009-01-01

    Recent studies have shown that there are multiple clinically important members of the Aspergillus section Fumigati that are difficult to distinguish on the basis of morphological features (e.g., Aspergillus fumigatus, A. lentulus, and Neosartorya udagawae). Identification of these organisms may be clinically important, as some species vary in their susceptibilities to antifungal agents. In a prior study, we utilized multilocus sequence typing to describe A. lentulus as a species distinct from A. fumigatus. The sequence data show that the gene encoding β-tubulin, benA, has high interspecies variability at intronic regions but is conserved among isolates of the same species. These data were used to develop a PCR-restriction fragment length polymorphism (PCR-RFLP) method that rapidly and accurately distinguishes A. fumigatus, A. lentulus, and N. udagawae, three major species within the section Fumigati that have previously been implicated in disease. Digestion of the benA amplicon with BccI generated unique banding patterns; the results were validated by screening a collection of clinical strains and by in silico analysis of the benA sequences of Aspergillus spp. deposited in the GenBank database. PCR-RFLP of benA is a simple method for the identification of clinically important, similar morphotypes of Aspergillus spp. within the section Fumigati. PMID:19403766
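
    The in silico side of a PCR-RFLP check like the one described above can be sketched as a simple restriction digest of an amplicon sequence. This is a minimal sketch under stated assumptions: the two amplicon strings are invented toy sequences (not benA), and the enzyme is EcoRI (G^AATTC) as a stand-in, because the BccI recognition site and cut positions are not given in the abstract.

```python
def digest(seq, site, cut_offset):
    """Fragment lengths from cutting a linear sequence at every occurrence of site.

    cut_offset is the number of bases into the site at which the cut falls.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    # Fragment boundaries: sequence start, each internal cut, sequence end.
    bounds = [0] + [c for c in cuts if 0 < c < len(seq)] + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

def banding_pattern(seq, site, cut_offset):
    """Fragment lengths sorted largest first, as they would be read off a gel."""
    return sorted(digest(seq, site, cut_offset), reverse=True)

# Stand-in enzyme: EcoRI cuts G^AATTC, one base into its site.
SITE, OFFSET = "GAATTC", 1

# Hypothetical toy amplicons: the second has lost one site through a single change.
amplicon_a = "AAGAATTCAAGAATTCAA"
amplicon_b = "AAGAATTCAAGAGTTCAA"

pattern_a = banding_pattern(amplicon_a, SITE, OFFSET)
pattern_b = banding_pattern(amplicon_b, SITE, OFFSET)
```

    Two amplicons are distinguishable by this approach whenever their sorted fragment lengths differ, which is what a species-specific banding pattern amounts to.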

  8. Aspergillus section Fumigati typing by PCR-restriction fragment polymorphism.

    PubMed

    Staab, Janet F; Balajee, S Arunmozhi; Marr, Kieren A

    2009-07-01

    Recent studies have shown that there are multiple clinically important members of the Aspergillus section Fumigati that are difficult to distinguish on the basis of morphological features (e.g., Aspergillus fumigatus, A. lentulus, and Neosartorya udagawae). Identification of these organisms may be clinically important, as some species vary in their susceptibilities to antifungal agents. In a prior study, we utilized multilocus sequence typing to describe A. lentulus as a species distinct from A. fumigatus. The sequence data show that the gene encoding beta-tubulin, benA, has high interspecies variability at intronic regions but is conserved among isolates of the same species. These data were used to develop a PCR-restriction fragment length polymorphism (PCR-RFLP) method that rapidly and accurately distinguishes A. fumigatus, A. lentulus, and N. udagawae, three major species within the section Fumigati that have previously been implicated in disease. Digestion of the benA amplicon with BccI generated unique banding patterns; the results were validated by screening a collection of clinical strains and by in silico analysis of the benA sequences of Aspergillus spp. deposited in the GenBank database. PCR-RFLP of benA is a simple method for the identification of clinically important, similar morphotypes of Aspergillus spp. within the section Fumigati.

  9. A novel BLAST-Based Relative Distance (BBRD) method can effectively group members of protein arginine methyltransferases and suggest their evolutionary relationship.

    PubMed

    Wang, Yi-Chun; Wang, Jing-Doo; Chen, Chin-Han; Chen, Yi-Wen; Li, Chuan

    2015-03-01

    We developed a novel BLAST-Based Relative Distance (BBRD) method by Pearson's correlation coefficient to avoid the problems of tedious multiple sequence alignment and complicated outgroup selection. We showed its application on reconstructing reliable phylogeny for nucleotide and protein sequences as exemplified by the fmr-1 gene and dihydrolipoamide dehydrogenase, respectively. We then used BBRD to resolve 124 protein arginine methyltransferases (PRMTs) that are homologues of nine mammalian PRMTs. The tree placed the uncharacterized PRMT9 with PRMT7 in the same clade, outside of all the Type I PRMTs including PRMT1 and its vertebrate paralogue PRMT8, PRMT3, PRMT6, PRMT2 and PRMT4. The PRMT7/9 branch then connects with the type II PRMT5. Some non-vertebrates contain different PRMTs without high sequence homology with the mammalian PRMTs. For example, in the case of Drosophila arginine methyltransferase (DART) and Trypanosoma brucei methyltransferases (TbPRMTs) in the analyses, the BBRD program grouped them with specific clades and thus suggested their evolutionary relationships. The BBRD method thus provided a great tool to construct a reliable tree for members of protein families through evolution. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. Applying Agrep to r-NSA to solve multiple sequences approximate matching.

    PubMed

    Ni, Bing; Wong, Man-Hon; Lam, Chi-Fai David; Leung, Kwong-Sak

    2014-01-01

    This paper addresses the approximate matching problem in a database consisting of multiple DNA sequences, where the proposed approach applies Agrep to a new truncated suffix array, r-NSA. The construction time of the structure is linear to the database size, and the computations of indexing a substring in the structure are constant. The number of characters processed in applying Agrep is analysed theoretically, and the theoretical upper-bound can approximate closely the empirical number of characters, which is obtained through enumerating the characters in the actual structure built. Experiments are carried out using (synthetic) random DNA sequences, as well as (real) genome sequences including Hepatitis-B Virus and X-chromosome. Experimental results show that, compared to the straight-forward approach that applies Agrep to multiple sequences individually, the proposed approach solves the matching problem in much shorter time. The speed-up of our approach depends on the sequence patterns, and for highly similar homologous genome sequences, which are the common cases in real-life genomes, it can be up to several orders of magnitude.
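
    For orientation, the baseline that this abstract compares against (searching each sequence in the database individually) can be sketched with a classic dynamic-programming approximate matcher. The code below is not Agrep and does not build the r-NSA index; it swaps in Sellers' column-by-column edit-distance algorithm, and the database contents and names are hypothetical.

```python
def approx_find(pattern, text, max_errors):
    """End positions (exclusive) in text where pattern matches within max_errors edits."""
    m = len(pattern)
    # col[i] = smallest edit distance between pattern[:i] and any substring of
    # the text ending at the current position.
    col = list(range(m + 1))
    hits = []
    for pos, ch in enumerate(text, start=1):
        prev_diag = col[0]  # col[0] stays 0, so a match may start anywhere
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # extra character in the text
                         col[i - 1] + 1)    # missing character in the text
            prev_diag = old
        if col[m] <= max_errors:
            hits.append(pos)
    return hits

def search_database(pattern, database, max_errors):
    """Run the matcher on every sequence separately (the straightforward approach)."""
    return {name: approx_find(pattern, seq, max_errors)
            for name, seq in database.items()}

# Hypothetical toy database of DNA sequences.
database = {
    "seq1": "TTACGTTT",
    "seq2": "TTAGGTTT",
    "seq3": "TTTTTTTT",
}
exact = search_database("ACGT", database, 0)
one_error = search_database("ACGT", database, 1)
```

    The cost of this baseline grows with the total length of all sequences for every query, which is the overhead a shared index over similar sequences is meant to avoid.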

  11. The utility of multiple molecular methods including whole genome sequencing as tools to differentiate Escherichia coli O157:H7 outbreaks.

    PubMed

    Berenger, Byron M; Berry, Chrystal; Peterson, Trevor; Fach, Patrick; Delannoy, Sabine; Li, Vincent; Tschetter, Lorelee; Nadon, Celine; Honish, Lance; Louie, Marie; Chui, Linda

    2015-01-01

    A standardised method for determining Escherichia coli O157:H7 strain relatedness using whole genome sequencing or virulence gene profiling is not yet established. We sought to assess the capacity of either high-throughput polymerase chain reaction (PCR) of 49 virulence genes, core-genome single nt variants (SNVs) or k-mer clustering to discriminate between outbreak-associated and sporadic E. coli O157:H7 isolates. Three outbreaks and multiple sporadic isolates from the province of Alberta, Canada were included in the study. Two of the outbreaks occurred concurrently in 2014 and one occurred in 2012. Pulsed-field gel electrophoresis (PFGE) and multilocus variable-number tandem repeat analysis (MLVA) were employed as comparator typing methods. The virulence gene profiles of isolates from the 2012 and 2014 Alberta outbreak events and contemporary sporadic isolates were mostly identical; therefore the set of virulence genes chosen in this study were not discriminatory enough to distinguish between outbreak clusters. Concordant with PFGE and MLVA results, core genome SNV and k-mer phylogenies clustered isolates from the 2012 and 2014 outbreaks as distinct events. k-mer phylogenies demonstrated increased discriminatory power compared with core SNV phylogenies. Prior to the widespread implementation of whole genome sequencing for routine public health use, issues surrounding cost, technical expertise, software standardisation, and data sharing/comparisons must be addressed.
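
    The idea behind k-mer clustering can be illustrated with a small alignment-free comparison. This is a minimal sketch, assuming Jaccard distance on k-mer sets and single-linkage grouping at a fixed threshold; the abstract does not state which distance or clustering method the authors used, and the isolate names and sequences are invented.

```python
def kmers(seq, k):
    """Set of all overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard_distance(a, b, k):
    """1 minus the shared fraction of k-mers between two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)

def cluster(seqs, k, threshold):
    """Single-linkage grouping: link any two isolates within the distance threshold."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard_distance(seqs[a], seqs[b], k) <= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical toy isolates; real genomes would be millions of bases long.
isolates = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAG",
    "iso3": "TTTTGGGGCC",
}
groups = cluster(isolates, k=3, threshold=0.5)
```

    With these toy inputs the two near-identical isolates fall into one group and the unrelated one stays alone, which mirrors how outbreak isolates would separate from sporadic ones when the distance threshold is well chosen.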

  12. The distribution of a phage-related insertion sequence element in the cyanobacterium, Microcystis aeruginosa.

    PubMed

    Kuno, Sotaro; Yoshida, Takashi; Kamikawa, Ryoma; Hosoda, Naohiko; Sako, Yoshihiko

    2010-01-01

    The cyanophage Ma-LMM01, specifically-infecting Microcystis aeruginosa, has an insertion sequence (IS) element that we named IS607-cp showing high nucleotide similarity to a counterpart in the genome of the cyanobacterium Cyanothece sp. We tested 21 strains of M. aeruginosa for the presence of IS607-cp using PCR and detected the element in strains NIES90, NIES112, NIES604, and RM6. Thermal asymmetric interlaced PCR (TAIL-PCR) revealed each of these strains has multiple copies of IS607-cp. Some of the ISs were classified into three types based on their inserted positions; IS607-cp-1 is common in strains NIES90, NIES112 and NIES604, whereas IS607-cp-2 and IS607-cp-3 are specific to strains NIES90 and RM6, respectively. This multiplicity may reflect the replicative transposition of IS607-cp. The sequence of IS607-cp in Ma-LMM01 showed robust affinity to those found in M. aeruginosa and Cyanothece spp. in a phylogenetic tree inferred from counterparts of various bacteria. This suggests the transfer of IS607-cp between the cyanobacterium and its cyanophage. We discuss the potential role of Ma-LMM01-related phages as donors of IS elements that may mediate the transfer of IS607-cp; and thereby partially contribute to the genome plasticity of M. aeruginosa.

  13. Genomic Dissection of an Icelandic Epidemic of Respiratory Disease in Horses and Associated Zoonotic Cases

    PubMed Central

    Björnsdóttir, Sigríður; Harris, Simon R.; Svansson, Vilhjálmur; Gunnarsson, Eggert; Sigurðardóttir, Ólöf G.; Gammeljord, Kristina; Steward, Karen F.; Newton, J. Richard; Robinson, Carl; Charbonneau, Amelia R. L.

    2017-01-01

    Iceland is free of the major infectious diseases of horses. However, in 2010 an epidemic of respiratory disease of unknown cause spread through the country’s native horse population of 77,000. Microbiological investigations ruled out known viral agents but identified the opportunistic pathogen Streptococcus equi subsp. zooepidemicus (S. zooepidemicus) in diseased animals. We sequenced the genomes of 257 isolates of S. zooepidemicus to differentiate epidemic from endemic strains. We found that although multiple endemic clones of S. zooepidemicus were present, one particular clone, sequence type 209 (ST209), was likely to have been responsible for the epidemic. Concurrent with the epidemic, ST209 was also recovered from a human case of septicemia, highlighting the pathogenic potential of this strain. Epidemiological investigation revealed that the incursion of this strain into one training yard during February 2010 provided a nidus for the infection of multiple horses that then transmitted the strain to farms throughout Iceland. This study represents the first time that whole-genome sequencing has been used to investigate an epidemic on a national scale to identify the likely causative agent and the link to an associated zoonotic infection. Our data highlight the importance of national biosecurity to protect vulnerable populations of animals and also demonstrate the potential impact of S. zooepidemicus transmission to other animals, including humans. PMID:28765219

  14. Exome sequencing of a colorectal cancer family reveals shared mutation pattern and predisposition circuitry along tumor pathways.

    PubMed

    Suleiman, Suleiman H; Koko, Mahmoud E; Nasir, Wafaa H; Elfateh, Ommnyiah; Elgizouli, Ubai K; Abdallah, Mohammed O E; Alfarouk, Khalid O; Hussain, Ayman; Faisal, Shima; Ibrahim, Fathelrahamn M A; Romano, Maurizio; Sultan, Ali; Banks, Lawrence; Newport, Melanie; Baralle, Francesco; Elhassan, Ahmed M; Mohamed, Hiba S; Ibrahim, Muntaser E

    2015-01-01

    The molecular basis of cancer and of its multiple phenotypes is not yet fully understood. Next Generation Sequencing promises new insight into the role of genetic interactions in shaping the complexity of cancer. Aiming to outline the differences in mutation patterns between familial colorectal cancer cases and controls, we analyzed whole exomes of cancer tissues and control samples from an extended colorectal cancer pedigree, providing one of the first data sets of exome sequencing of cancer in an African population against a background of large effective size typically with excess of variants. Tumors showed an hMSH2 loss-of-function SNV consistent with Lynch syndrome. Sets of genes harboring insertions-deletions in tumor tissues revealed, however, significant GO enrichment, a feature that was not seen in control samples, suggesting that ordered insertions-deletions are central to tumorigenesis in this type of cancer. Network analysis identified multiple hub genes of centrality. ELAVL1/HuR showed remarkable centrality, interacting specially with genes harboring non-synonymous SNVs, thus reinforcing the proposition of targeted mutagenesis in cancer pathways. A likely explanation for such a mutation pattern is DNA/RNA editing, suggested here by a nucleotide transition-to-transversion ratio that significantly departed from expected values (p-value 5e-6). NFKB1 also showed significant centrality along with ELAVL1, raising the suspicion of viral etiology given the known interaction between oncogenic viruses and these proteins.
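
    The transition-to-transversion ratio mentioned in this abstract is straightforward to compute from a list of single-nucleotide variants. The sketch below shows only the counting step, on invented variants; the statistical comparison against an expected value reported by the authors is not reproduced here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(ref, alt):
    """True for purine-to-purine or pyrimidine-to-pyrimidine changes (A<->G, C<->T)."""
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)

def ts_tv_ratio(snvs):
    """Ratio of transitions to transversions over (ref, alt) base pairs."""
    ts = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = sum(1 for ref, alt in snvs if ref != alt and not is_transition(ref, alt))
    return ts / tv if tv else float("inf")

# Hypothetical variant calls as (reference base, alternate base).
example_snvs = [("A", "G"), ("C", "T"), ("T", "C"), ("A", "C"), ("G", "T")]
ratio = ts_tv_ratio(example_snvs)
```

    Here three of the five toy variants are transitions and two are transversions, so the ratio is 3 / 2.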

  15. A novel high-resolution multilocus sequence typing of Giardia intestinalis Assemblage A isolates reveals zoonotic transmission, clonal outbreaks and recombination.

    PubMed

    Ankarklev, Johan; Lebbad, Marianne; Einarsson, Elin; Franzén, Oscar; Ahola, Harri; Troell, Karin; Svärd, Staffan G

    2018-06-01

    Molecular epidemiology and genotyping studies of the parasitic protozoan Giardia intestinalis have proven difficult due to multiple factors, such as low discriminatory power in the commonly used genotyping loci, which has hampered molecular analyses of outbreak sources, zoonotic transmission and virulence types. Here we have focused on assemblage A Giardia and developed a high-resolution assemblage-specific multilocus sequence typing (MLST) method. Analyses of sequenced G. intestinalis assemblage A genomes from different sub-assemblages identified a set of six genetic loci with high genetic variability. DNA samples from both humans (n = 44) and animals (n = 18) that harbored Giardia assemblage A infections, were PCR amplified (557-700 bp products) and sequenced at the six novel genetic loci. Bioinformatic analyses showed five to ten-fold higher levels of polymorphic sites than what was previously found among assemblage A samples using the classic genotyping loci. Phylogenetically, a division of two major clusters in assemblage A became apparent, separating samples of human and animal origin. A subset of human samples (n = 9) from a documented Giardia outbreak in a Swedish day-care center, showed full complementarity at nine genetic loci (the six new and the standard BG, TPI and GDH loci), strongly suggesting one source of infection. Furthermore, three samples of human origin displayed MLST profiles that were phylogenetically more closely related to MLST profiles from animal derived samples, suggesting zoonotic transmission. These new genotyping loci enabled us to detect events of recombination between different assemblage A isolates but also between assemblage A and E isolates. In summary, we present a novel and expanded MLST strategy with significantly improved sensitivity for molecular analyses of virulence types, zoonotic potential and source tracking for assemblage A Giardia. Copyright © 2018. Published by Elsevier B.V.

  16. Multiple ESBL-Producing Escherichia coli Sequence Types Carrying Quinolone and Aminoglycoside Resistance Genes Circulating in Companion and Domestic Farm Animals in Mwanza, Tanzania, Harbor Commonly Occurring Plasmids

    PubMed Central

    Seni, Jeremiah; Falgenhauer, Linda; Simeo, Nabina; Mirambo, Mariam M.; Imirzalioglu, Can; Matee, Mecky; Rweyemamu, Mark; Chakraborty, Trinad; Mshana, Stephen E.

    2016-01-01

    The increased presence of extended-spectrum beta-lactamase (ESBL)-producing bacteria in humans, animals, and their surrounding environments is of global concern. Currently there is limited information on ESBL presence in rural farming communities worldwide. We performed a cross-sectional study in Mwanza, Tanzania, involving 600 companion and domestic farm animals between August and September 2014. Rectal swab/cloaca specimens were processed to identify ESBL-producing Enterobacteriaceae. We detected 130 (21.7%) animals carrying ESBL-producing bacteria, the highest carriage being among dogs and pigs [39.2% (51/130) and 33.1% (43/130), respectively]. The majority of isolates were Escherichia coli [93.3% (125/134)], and exotic breed type [OR (95%CI) = 2.372 (1.460–3.854), p-value < 0.001] was found to be a predictor of ESBL carriage among animals. Whole-genome sequences of 25 ESBL-producing E. coli were analyzed for phylogenetic relationships using multi-locus sequence typing (MLST) and core genome comparisons. Fourteen different sequence types were detected, of which ST617 (7/25), ST2852 (3/25), and ST1303 (3/25) were the most abundant. All isolates harbored the blaCTX-M-15 allele, 22/25 carried strA and strB, 12/25 aac(6′)-Ib-cr, and 11/25 qnrS1. Antibiotic resistance was associated with IncF, IncY, as well as non-typable plasmids. Eleven isolates carried pPGRT46-related plasmids, previously reported from isolates in Nigeria. Five isolates had plasmids exhibiting 85–99% homology to pCA28, previously detected in isolates from the US. Our findings indicate a pan-species distribution of ESBL-producing E. coli clonal groups in farming communities and provide evidence for plasmids harboring antibiotic resistances of regional and international impact. PMID:26904015

  17. Identifying micro-inversions using high-throughput sequencing reads.

    PubMed

    He, Feifei; Li, Yang; Tang, Yu-Hang; Ma, Jian; Zhu, Huaiqiu

    2016-01-11

    The identification of inversions of DNA segments shorter than read length (e.g., 100 bp), defined as micro-inversions (MIs), remains challenging for next-generation sequencing reads. It is acknowledged that MIs are important genomic variation and may play roles in causing genetic disease. However, current alignment methods are generally insensitive to detect MIs. Here we develop a novel tool, MID (Micro-Inversion Detector), to identify MIs in human genomes using next-generation sequencing reads. The algorithm of MID is designed based on a dynamic programming path-finding approach. What makes MID different from other variant detection tools is that MID can handle small MIs and multiple breakpoints within an unmapped read. Moreover, MID improves reliability in low coverage data by integrating multiple samples. Our evaluation demonstrated that MID outperforms Gustaf, which can currently detect inversions from 30 bp to 500 bp. To our knowledge, MID is the first method that can efficiently and reliably identify MIs from unmapped short next-generation sequencing reads. MID is reliable on low coverage data, which is suitable for large-scale projects such as the 1000 Genomes Project (1KGP). MID identified previously unknown MIs from the 1KGP that overlap with genes and regulatory elements in the human genome. We also identified MIs in cancer cell lines from Cancer Cell Line Encyclopedia (CCLE). Therefore our tool is expected to be useful to improve the study of MIs as a type of genetic variant in the human genome. The source code can be downloaded from: http://cqb.pku.edu.cn/ZhuLab/MID .
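
    To make the notion of a micro-inversion concrete, the sketch below checks whether a read differs from a same-length reference window by exactly one reverse-complemented segment. This is a naive illustration under strong assumptions (no sequencing errors, no indels, a single inversion) and is not the dynamic programming algorithm used by MID; all function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string over the alphabet A, C, G, T, N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq))


def find_micro_inversion(ref, read):
    """Return (start, end) of a single inverted segment, or None.

    The segment is the half-open interval [start, end) of `ref` whose
    reverse complement appears in `read` at the same position.
    """
    if len(ref) != len(read) or ref == read:
        return None
    # The first and last mismatching positions bound the inverted segment.
    diffs = [i for i, (a, b) in enumerate(zip(ref, read)) if a != b]
    start, end = diffs[0], diffs[-1] + 1
    if read[start:end] == reverse_complement(ref[start:end]):
        return (start, end)
    return None


# Hypothetical example: the segment CCGT at positions 3..6 is inverted.
print(find_micro_inversion("AAACCGTTTT", "AAAACGGTTT"))
```

    Real reads contain errors and may carry several breakpoints, which is why a path-finding formulation such as the one described in the abstract is needed in practice.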

  18. A genomic library-based amplification approach (GL-PCR) for the mapping of multiple IS6110 insertion sites and strain differentiation of Mycobacterium tuberculosis.

    PubMed

    Namouchi, Amine; Mardassi, Helmi

    2006-11-01

    Evidence suggests that insertion of the IS6110 element is not without consequence to the biology of Mycobacterium tuberculosis complex strains. Thus, mapping of multiple IS6110 insertion sites in the genome of biomedically relevant clinical isolates would result in a better understanding of the role of this mobile element, particularly with regard to transmission, adaptability and virulence. In the present paper, we describe a versatile strategy, referred to as GL-PCR, that amplifies IS6110-flanking sequences based on the construction of a genomic library. M. tuberculosis chromosomal DNA is fully digested with HincII and then ligated into a plasmid vector between T7 and T3 promoter sequences. The ligation reaction product is transformed into Escherichia coli and selective PCR amplification targeting both 5' and 3' IS6110-flanking sequences is performed on the plasmid library DNA. For this purpose, four separate PCR reactions are performed, each combining an outward primer specific for one IS6110 end with either T7 or T3 primer. Determination of the nucleotide sequence of the PCR products generated from a single ligation reaction allowed mapping of 21 out of the 24 IS6110 copies of two 12-banded M. tuberculosis strains, yielding an overall sensitivity of 87.5%. Furthermore, by simply comparing the migration pattern of GL-PCR-generated products, the strategy proved to be as valuable as IS6110 RFLP for molecular typing of M. tuberculosis complex strains. Importantly, GL-PCR was able to discriminate between strains differing by a single IS6110 band.

  19. A teat papillomatosis case in a Damascus goat (Shami goat) in Hatay province, Turkey: a new putative papillomavirus?

    PubMed

    Dogan, Fırat; Dorttas, Selvi Deniz; Bilge Dagalp, Seval; Ataseven, Veysel Soydal; Alkan, Feray

    2018-06-01

    Papillomaviruses (PVs) are epitheliotropic viruses that cause benign proliferative lesions in the skin (warts or papillomas) and mucous membranes of their natural hosts. Recently, new PVs have been found in many animal species. The most common current approach for identifying novel PV types is based on PCR, using various consensus or degenerate primers (broad-range primers), designed on the basis of the multiple alignment of nucleotide or amino acid sequences of a large number of different human papillomaviruses (HPV). PVs have been classified according to the sequence similarity of one of their capsid proteins, L1, without taking into account other regions of the genome and without considering the phenotypic characteristics of the viral infection. In this study, we performed molecular detection and typing of a PV in a goat with teat papillomatosis. Firstly, PCR was performed using the FAP59/FAP64 and MY09/MY11 primer pairs for the L1 gene region. The PV DNA was found to be positive only with the FAP59/FAP64 primer pair. PV DNA was then tested with three primer sets in four different combinations (L2Bf/FAP64, L2Bf/L1Br, FAP59/FAP64, L1Bf/LCRBr) for the gene region encoding the L1, L2 and LCR proteins. The goat teat papilloma sample was amplified using FAP59/FAP64 primers and two primer pairs (L2Bf/FAP64 and L2Bf/L1Br). We obtained products matching approximately 604 bp of the L1 region of the virus. PV DNA was used for typing using sequence analysis/PCR with some type-specific primers for bovids, caprids and cervids. The results of the sequence analysis suggested one new putative PV type with sequence identity ranging from 46.45 to 80.09% to other known papillomaviruses, including Capra hircus papillomavirus (ChPV-2), bovine papillomavirus (BPV) 6, 7, 10, 11 and 12, Rangifer tarandus papillomavirus 3 (RtPV-3) and BPV-7Z (Alpine wild ruminant papillomavirus; Cervus elaphus papillomavirus). We therefore propose that this is the first identification of a new putative type, MG523274 (HTY-goat-TR2016), in a goat with teat papillomatosis. It is essential to identify PV types in different animal species and investigate their prevalence/distribution and clinical consequences in order to develop appropriate prophylactic and/or therapeutic procedures and to determine the interspecies transmission potential and evolution of PVs.

  20. Bayesian phylogeny of sucrose transporters: ancient origins, differential expansion and convergent evolution in monocots and dicots

    PubMed Central

    Peng, Duo; Gu, Xi; Xue, Liang-Jiao; Leebens-Mack, James H.; Tsai, Chung-Jui

    2014-01-01

    Sucrose transporters (SUTs) are essential for the export and efficient movement of sucrose from source leaves to sink organs in plants. The angiosperm SUT family was previously classified into three or four distinct groups, Types I, II (subgroup IIB), and III, with dicot-specific Type I and monocot-specific Type IIB functioning in phloem loading. To shed light on the underlying drivers of SUT evolution, Bayesian phylogenetic inference was undertaken using 41 sequenced plant genomes, including seven basal lineages at key evolutionary junctures. Our analysis supports four phylogenetically and structurally distinct SUT subfamilies, originating from two ancient groups (AG1 and AG2) that diverged early during terrestrial colonization. In both AG1 and AG2, multiple intron acquisition events in the progenitor vascular plant established the gene structures of modern SUTs. Tonoplastic Type III and plasmalemmal Type II represent evolutionarily conserved descendants of AG1 and AG2, respectively. Type I and Type IIB were previously thought to evolve after the dicot-monocot split. We show, however, that divergence of Type I from Type III SUT predated basal angiosperms, likely associated with evolution of vascular cambium and phloem transport. Type I SUT was subsequently lost in monocots along with vascular cambium, and independent evolution of Type IIB coincided with modified monocot vasculature. Both Type I and Type IIB underwent lineage-specific expansion. In multiple unrelated taxa, the newly-derived SUTs exhibit biased expression in reproductive tissues, suggesting a functional link between phloem loading and reproductive fitness. Convergent evolution of Type I and Type IIB for SUT function in phloem loading and reproductive organs supports the idea that differential vascular development in dicots and monocots is a strong driver for SUT family evolution in angiosperms. PMID:25429293

  1. Comparative genomics provides new insights into the diversity, physiology, and sexuality of the only industrially exploited tremellomycete: Phaffia rhodozyma

    DOE PAGES

    Bellora, Nicolas; Moline, Martin; David-Palma, Marcia; ...

    2016-11-09

    The class Tremellomycete (Agaricomycotina) encompasses more than 380 fungi. Although there are a few edible Tremella spp., the only species with current biotechnological use is the astaxanthin-producing yeast Phaffia rhodozyma (Cystofilobasidiales). Besides astaxanthin, a carotenoid pigment with potent antioxidant activity and great value for aquaculture and pharmaceutical industries, P. rhodozyma possesses multiple exceptional traits of fundamental and applied interest. The aim of this study was to obtain and analyze two new genome sequences of representative strains from the northern (CBS 7918 T, the type strain) and southern hemispheres (CRUB 1149) and compare them to a previously published genome sequence (strain CBS 6938). Furthermore, photoprotection and antioxidant related genes, as well as genes involved in sexual reproduction, were analyzed.

  2. Comparative genomics provides new insights into the diversity, physiology, and sexuality of the only industrially exploited tremellomycete: Phaffia rhodozyma

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bellora, Nicolas; Moline, Martin; David-Palma, Marcia

    The class Tremellomycete (Agaricomycotina) encompasses more than 380 fungi. Although there are a few edible Tremella spp., the only species with current biotechnological use is the astaxanthin-producing yeast Phaffia rhodozyma (Cystofilobasidiales). Besides astaxanthin, a carotenoid pigment with potent antioxidant activity and great value for aquaculture and pharmaceutical industries, P. rhodozyma possesses multiple exceptional traits of fundamental and applied interest. The aim of this study was to obtain and analyze two new genome sequences of representative strains from the northern (CBS 7918 T, the type strain) and southern hemispheres (CRUB 1149) and compare them to a previously published genome sequence (strain CBS 6938). Furthermore, photoprotection and antioxidant related genes, as well as genes involved in sexual reproduction, were analyzed.

  3. Whole exome sequencing reveals concomitant mutations of multiple FA genes in individual Fanconi anemia patients

    PubMed Central

    2014-01-01

    Background: Fanconi anemia (FA) is a rare inherited genetic syndrome with highly variable clinical manifestations. Fifteen genetic subtypes of FA have been identified. Traditional complementation tests for grouping studies have been used generally in FA patients and in stepwise methods to identify the FA type, which can result in incomplete genetic information from FA patients. Methods: We diagnosed five pediatric patients with FA based on clinical manifestations, and we performed exome sequencing of peripheral blood specimens from these patients and their family members. The related sequencing data were then analyzed by bioinformatics, and the FANC gene mutations identified by exome sequencing were confirmed by PCR re-sequencing. Results: Homozygous and compound heterozygous mutations of FANC genes were identified in all of the patients. The FA subtypes of the patients included FANCA, FANCM and FANCD2. Interestingly, four FA patients harbored multiple mutations in at least two FA genes, and some of these mutations have not been previously reported. These patients’ clinical manifestations were vastly different from each other, as were their treatment responses to androstanazol and prednisone. This finding suggests that heterozygous mutation(s) in FA genes could also have diverse biological and/or pathophysiological effects on FA patients or FA gene carriers. Interestingly, we were not able to identify de novo mutations in the genes implicated in DNA repair pathways when the sequencing data of patients were compared with those of their parents. Conclusions: Our results indicate that Chinese FA patients and carriers might have higher and more complex mutation rates in FANC genes than have been conventionally recognized. Testing of the fifteen FANC genes in FA patients and their family members should be a regular clinical practice to determine the optimal care for the individual patient, to counsel the family and to obtain a better understanding of FA pathophysiology. PMID:24885126

  4. Whole exome sequencing reveals concomitant mutations of multiple FA genes in individual Fanconi anemia patients.

    PubMed

    Chang, Lixian; Yuan, Weiping; Zeng, Huimin; Zhou, Quanquan; Wei, Wei; Zhou, Jianfeng; Li, Miaomiao; Wang, Xiaomin; Xu, Mingjiang; Yang, Fengchun; Yang, Yungui; Cheng, Tao; Zhu, Xiaofan

    2014-05-15

    Fanconi anemia (FA) is a rare inherited genetic syndrome with highly variable clinical manifestations. Fifteen genetic subtypes of FA have been identified. Traditional complementation tests for grouping studies have been used generally in FA patients and in stepwise methods to identify the FA type, which can result in incomplete genetic information from FA patients. We diagnosed five pediatric patients with FA based on clinical manifestations, and we performed exome sequencing of peripheral blood specimens from these patients and their family members. The related sequencing data were then analyzed by bioinformatics, and the FANC gene mutations identified by exome sequencing were confirmed by PCR re-sequencing. Homozygous and compound heterozygous mutations of FANC genes were identified in all of the patients. The FA subtypes of the patients included FANCA, FANCM and FANCD2. Interestingly, four FA patients harbored multiple mutations in at least two FA genes, and some of these mutations have not been previously reported. These patients' clinical manifestations were vastly different from each other, as were their treatment responses to androstanazol and prednisone. This finding suggests that heterozygous mutation(s) in FA genes could also have diverse biological and/or pathophysiological effects on FA patients or FA gene carriers. Interestingly, we were not able to identify de novo mutations in the genes implicated in DNA repair pathways when the sequencing data of patients were compared with those of their parents. Our results indicate that Chinese FA patients and carriers might have higher and more complex mutation rates in FANC genes than have been conventionally recognized. Testing of the fifteen FANC genes in FA patients and their family members should be a regular clinical practice to determine the optimal care for the individual patient, to counsel the family and to obtain a better understanding of FA pathophysiology.

  5. SnipViz: a compact and lightweight web site widget for display and dissemination of multiple versions of gene and protein sequences.

    PubMed

    Jaschob, Daniel; Davis, Trisha N; Riffle, Michael

    2014-07-23

    As high throughput sequencing continues to grow more commonplace, the need to disseminate the resulting data via web applications continues to grow. Particularly, there is a need to disseminate multiple versions of related gene and protein sequences simultaneously--whether they represent alleles present in a single species, variations of the same gene among different strains, or homologs among separate species. Often this is accomplished by displaying all versions of the sequence at once in a manner that is not intuitive or space-efficient and does not facilitate human understanding of the data. Web-based applications needing to disseminate multiple versions of sequences would benefit from a drop-in module designed to effectively disseminate these data. SnipViz is a client-side software tool designed to disseminate multiple versions of related gene and protein sequences on web sites. SnipViz has a space-efficient, interactive, and dynamic interface for navigating, analyzing and visualizing sequence data. It is written using standard World Wide Web technologies (HTML, Javascript, and CSS) and is compatible with most web browsers. SnipViz is designed as a modular client-side web component and may be incorporated into virtually any web site and be implemented without any programming. SnipViz is a drop-in client-side module for web sites designed to efficiently visualize and disseminate gene and protein sequences. SnipViz is open source and is freely available at https://github.com/yeastrc/snipviz.

  6. Generating Models of Surgical Procedures using UMLS Concepts and Multiple Sequence Alignment

    PubMed Central

    Meng, Frank; D’Avolio, Leonard W.; Chen, Andrew A.; Taira, Ricky K.; Kangarloo, Hooshang

    2005-01-01

    Surgical procedures can be viewed as a process composed of a sequence of steps performed on, by, or with the patient’s anatomy. This sequence is typically the pattern followed by surgeons when generating surgical report narratives for documenting surgical procedures. This paper describes a methodology for semi-automatically deriving a model of conducted surgeries, utilizing a sequence of derived Unified Medical Language System (UMLS) concepts for representing surgical procedures. A multiple sequence alignment was computed from a collection of such sequences and was used for generating the model. These models have the potential of being useful in a variety of informatics applications such as information retrieval and automatic document generation. PMID:16779094

  7. Optimized, unequal pulse spacing in multiple echo sequences improves refocusing in magnetic resonance.

    PubMed

    Jenista, Elizabeth R; Stokes, Ashley M; Branca, Rosa Tamara; Warren, Warren S

    2009-11-28

    A recent quantum computing paper (G. S. Uhrig, Phys. Rev. Lett. 98, 100504 (2007)) analytically derived optimal pulse spacings for a multiple spin echo sequence designed to remove decoherence in a two-level system coupled to a bath. The spacings in what has been called a "Uhrig dynamic decoupling (UDD) sequence" differ dramatically from the conventional, equal pulse spacing of a Carr-Purcell-Meiboom-Gill (CPMG) multiple spin echo sequence. The UDD sequence was derived for a model that is unrelated to magnetic resonance, but was recently shown theoretically to be more general. Here we show that the UDD sequence has theoretical advantages for magnetic resonance imaging of structured materials such as tissue, where diffusion in compartmentalized and microstructured environments leads to fluctuating fields on a range of different time scales. We also show experimentally, both in excised tissue and in a live mouse tumor model, that optimal UDD sequences produce different T2-weighted contrast than do CPMG sequences with the same number of pulses and total delay, with substantial enhancements in most regions. This permits improved characterization of low-frequency spectral density functions in a wide range of applications.
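
    The difference between the two pulse schedules can be sketched numerically. The abstract does not state the timing formulas; the sketch below assumes the commonly quoted forms, with the j-th of n refocusing pulses applied at t_j = T (2j - 1) / (2n) for CPMG and at t_j = T sin^2(pi j / (2n + 2)) for UDD, where T is the total delay. Function names are hypothetical.

```python
import math


def cpmg_times(n, total):
    """Equally spaced refocusing-pulse times for an n-pulse CPMG train."""
    return [total * (2 * j - 1) / (2 * n) for j in range(1, n + 1)]


def udd_times(n, total):
    """Refocusing-pulse times for an n-pulse UDD train (assumed standard form)."""
    return [total * math.sin(math.pi * j / (2 * n + 2)) ** 2
            for j in range(1, n + 1)]


# Compare both schedules for the same number of pulses and total delay.
n_pulses, total_delay = 4, 1.0
for name, times in (("CPMG", cpmg_times(n_pulses, total_delay)),
                    ("UDD", udd_times(n_pulses, total_delay))):
    print(name, [round(t, 4) for t in times])
```

    Under these formulas the two schedules coincide for one or two pulses and diverge from three pulses onward, with the UDD pulses bunching toward the start and end of the delay.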

  8. Texture analysis of common renal masses in multiple MR sequences for prediction of pathology

    NASA Astrophysics Data System (ADS)

    Hoang, Uyen N.; Malayeri, Ashkan A.; Lay, Nathan S.; Summers, Ronald M.; Yao, Jianhua

    2017-03-01

    This pilot study performs texture analysis on multiple magnetic resonance (MR) images of common renal masses for differentiation of renal cell carcinoma (RCC). Bounding boxes are drawn around each mass on one axial slice in T1 delayed sequence to use for feature extraction and classification. All sequences (T1 delayed, venous, arterial, pre-contrast phases, T2, and T2 fat saturated sequences) are co-registered and texture features are extracted from each sequence simultaneously. Random forest is used to construct models to classify lesions on 96 normal regions, 87 clear cell RCCs, 8 papillary RCCs, and 21 renal oncocytomas; ground truths are verified through pathology reports. The highest performance is seen in random forest model when data from all sequences are used in conjunction, achieving an overall classification accuracy of 83.7%. When using data from one single sequence, the overall accuracies achieved for T1 delayed, venous, arterial, and pre-contrast phase, T2, and T2 fat saturated were 79.1%, 70.5%, 56.2%, 61.0%, 60.0%, and 44.8%, respectively. This demonstrates promising results of utilizing intensity information from multiple MR sequences for accurate classification of renal masses.

  9. Swarm intelligence in bioinformatics: methods and implementations for discovering patterns of multiple sequences.

    PubMed

    Cui, Zhihua; Zhang, Yi

    2014-02-01

    As a promising and innovative research field, bioinformatics has attracted increasing attention recently. Beneath the enormous number of open problems in this field, one fundamental issue is about the accurate and efficient computational methodology that can deal with tremendous amounts of data. In this paper, we survey some applications of swarm intelligence to discover patterns of multiple sequences. To provide a deep insight, ant colony optimization, particle swarm optimization, artificial bee colony and artificial fish swarm algorithm are selected, and their applications to multiple sequence alignment and motif detecting problem are discussed.

  10. Score distributions of gapped multiple sequence alignments down to the low-probability tail

    NASA Astrophysics Data System (ADS)

    Fieth, Pascal; Hartmann, Alexander K.

    2016-08-01

    Assessing the significance of alignment scores of optimally aligned DNA or amino acid sequences can be achieved via the knowledge of the score distribution of random sequences. But this requires obtaining the distribution in the biologically relevant high-scoring region, where the probabilities are exponentially small. For gapless local alignments of infinitely long sequences this distribution is known analytically to follow a Gumbel distribution. Distributions for gapped local alignments and global alignments of finite lengths can only be obtained numerically. To obtain results for the small-probability region, specific statistical mechanics-based rare-event algorithms can be applied. In previous studies, this was achieved for pairwise alignments. They showed that, contrary to results from previous simple sampling studies, strong deviations from the Gumbel distribution occur in case of finite sequence lengths. Here we extend the studies to multiple sequence alignments with gaps, which are much more relevant for practical applications in molecular biology. We study the distributions of scores over a large range of the support, reaching probabilities as small as 10^-160, for global and local (sum-of-pair scores) multiple alignments. We find that even after suitable rescaling, eliminating the sequence-length dependence, the distributions for multiple alignment differ from the pairwise alignment case. Furthermore, we also show that the previously discussed Gaussian correction to the Gumbel distribution needs to be refined, also for the case of pairwise alignments.
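
    The sum-of-pairs score mentioned above can be illustrated with a minimal sketch: every column of a multiple alignment is scored by summing a pairwise score over all pairs of rows. The scoring values (match +1, mismatch -1, linear gap cost -2, gap-gap pairs scored 0) are assumptions chosen for illustration and are not the scoring scheme of the study.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of aligned symbols; '-' denotes a gap."""
    if a == "-" and b == "-":
        return 0  # assumption: two aligned gaps contribute nothing
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sum_of_pairs(alignment):
    """Sum-of-pairs score of a multiple alignment given as equal-length rows."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows must have the same length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            total += pair_score(a, b)
    return total


# Hypothetical three-row alignment.
print(sum_of_pairs(["ACG-", "AC-T", "ACGT"]))
```

    Estimating a score distribution then amounts to evaluating the optimal value of such a score over many random sequence sets, which is where the rare-event sampling described in the abstract becomes necessary for the far tail.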

  11. Homozygosity Mapping and Whole Exome Sequencing to Detect SLC45A2 and G6PC3 Mutations in a Single Patient with Oculocutaneous Albinism and Neutropenia

    PubMed Central

    Cullinane, Andrew R.; Vilboux, Thierry; O’Brien, Kevin; Curry, James A.; Maynard, Dawn M.; Carlson-Donohoe, Hannah; Ciccone, Carla; Markello, Thomas C.; Gunay-Aygun, Meral; Huizing, Marjan; Gahl, William A.

    2011-01-01

    We evaluated a 32 year-old woman whose oculocutaneous albinism, bleeding diathesis, neutropenia, and history of recurrent infections prompted consideration of the diagnosis of Hermansky-Pudlak syndrome type 2 (HPS-2). This was ruled out due to the presence of platelet delta granules and absence of AP3B1 mutations. Since parental consanguinity suggested an autosomal recessive mode of inheritance, we employed homozygosity mapping, followed by whole exome sequencing, to identify two candidate disease-causing genes, SLC45A2 and G6PC3. Conventional di-deoxy sequencing confirmed pathogenic mutations in SLC45A2, associated with oculocutaneous albinism type 4 (OCA-4), and G6PC3, associated with neutropenia. The substantial reduction of SLC45A2 protein in the patient’s melanocytes caused the mis-localization of tyrosinase from melanosomes to the plasma membrane and also led to the incorporation of tyrosinase into exosomes and secretion into the culture medium, explaining the hypopigmentation in OCA-4. Our patient’s G6PC3 mRNA expression level was also reduced, leading to increased apoptosis of her fibroblasts under ER stress. This report describes the first North American patient with OCA-4, the first culture of human OCA-4 melanocytes, and the use of homozygosity mapping followed by whole exome sequencing to identify disease-causing mutations in multiple genes in a single affected individual. PMID:21677667

  12. Informational structure of genetic sequences and nature of gene splicing

    NASA Astrophysics Data System (ADS)

    Trifonov, E. N.

    1991-10-01

    Only about 1/20 of DNA of higher organisms codes for proteins, by means of classical triplet code. The rest of DNA sequences is largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules, while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus, being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of the Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications is to resolve such conflicts. New data are presented strongly indicating that the gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.

  13. A mammary cell-specific enhancer in mouse mammary tumor virus DNA is composed of multiple regulatory elements including binding sites for CTF/NFI and a novel transcription factor, mammary cell-activating factor.

    PubMed Central

    Mink, S; Härtig, E; Jennewein, P; Doppler, W; Cato, A C

    1992-01-01

    Mouse mammary tumor virus (MMTV) is a milk-transmitted retrovirus involved in the neoplastic transformation of mouse mammary gland cells. The expression of this virus is regulated by mammary cell type-specific factors, steroid hormones, and polypeptide growth factors. Sequences for mammary cell-specific expression are located in an enhancer element in the extreme 5' end of the long terminal repeat region of this virus. This enhancer, when cloned in front of the herpes simplex thymidine kinase promoter, endows the promoter with mammary cell-specific response. Using functional and DNA-protein-binding studies with constructs mutated in the MMTV long terminal repeat enhancer, we have identified two main regulatory elements necessary for the mammary cell-specific response. These elements consist of binding sites for a transcription factor in the family of CTF/NFI proteins and the transcription factor mammary cell-activating factor (MAF) that recognizes the sequence G Pu Pu G C/G A A G G/T. Combinations of CTF/NFI- and MAF-binding sites or multiple copies of either one of these binding sites but not solitary binding sites mediate mammary cell-specific expression. The functional activities of these two regulatory elements are enhanced by another factor that binds to the core sequence ACAAAG. Interdigitated binding sites for CTF/NFI, MAF, and/or the ACAAAG factor are also found in the 5' upstream regions of genes encoding whey milk proteins from different species. These findings suggest that mammary cell-specific regulation is achieved by a concerted action of factors binding to multiple regulatory sites. PMID:1328867
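
    The MAF recognition sequence G Pu Pu G C/G A A G G/T, where Pu stands for a purine (A or G), translates directly into a regular expression. The sketch below scans the forward strand of a DNA string for that motif and for the ACAAAG core; the test sequence is made up for illustration, and searching only one strand is a simplifying assumption.

```python
import re

# Lookahead patterns so that overlapping occurrences are also reported.
MAF_SITE = re.compile(r"(?=G[AG][AG]G[CG]AAG[GT])")
ACAAAG_CORE = re.compile(r"(?=ACAAAG)")


def find_sites(pattern, seq):
    """Return 0-based start positions of all matches on the forward strand."""
    return [m.start() for m in pattern.finditer(seq.upper())]


# Hypothetical sequence containing one MAF-like site and one ACAAAG core.
seq = "TTGAAGCAAGGTTACAAAGCC"
print(find_sites(MAF_SITE, seq))
print(find_sites(ACAAAG_CORE, seq))
```

    A fuller scan would also search the reverse complement, since binding sites can occur on either strand.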

  14. Model for turbidite-to-contourite continuum and multiple process transport in deep marine settings: examples in the rock record

    NASA Astrophysics Data System (ADS)

    Stanley, Daniel Jean

    1993-01-01

    Petrological analysis of geological sections in St. Croix in the Caribbean, the Niesenflysch in Switzerland and the Annot Sandstone in the French Maritime Alps sheds light on multiple process transport in deep marine settings. A model depicting a turbidite-to-contourite continuum of stratal types is applied to these three rock units. Recognition of a diverse suite of bedforms, coupled with analysis of paleocurrents, helps to better interpret depositional origin and basin paleogeography. The St. Croix strata record emplacement by gravity flows and, subsequently, by bottom currents flowing parallel to the base of slope; these sediments accumulated on a lower slope apron. A Niesenflysch section in the Swiss Alps west of Adelboden includes turbidites which were deposited at fairly regular intervals beyond the base of slope, in a setting more distal than that of the St. Croix sequences. Most of these turbidites appear to have been partially reworked by bottom currents related to basin circulation or to density flows from the basin margins. In the Annot Sandstone, reworked turbidites (termed transitional variants) and packets of entirely rippled strata are observed in submarine fan and slope sequences in the Peira-Cava area. In contrast to those in St. Croix and the Niesenflysch, the current-emplaced deposits of the Annot Sandstone are directly associated with fan-valley deposits. Such rippled strata in channels are deposits of gravity flow origin which were subsequently reworked downslope by currents generated by successive gravity flows; they also occur on levees by overbank flow. Consideration of multiple process transport is of special help to interpret sections which are poorly exposed, or which can be examined in cores, or which are located in sequences that have been highly deformed structurally.

  15. Curcumin activates human glutathione S-transferase P1 expression through antioxidant response element.

    PubMed

    Nishinaka, Toru; Ichijo, Yusuke; Ito, Maki; Kimura, Masayoshi; Katsuyama, Masato; Iwata, Kazumi; Miura, Takeshi; Terada, Tomoyuki; Yabe-Nishimura, Chihiro

    2007-05-15

    Curcumin is a plant-derived diferuloylmethane compound extracted from Curcuma longa, possessing antioxidative and anticarcinogenic properties. Antioxidants and oxidative stress are known to induce the expression of certain classes of detoxification enzymes. Since the upregulation of detoxifying enzymes affects the drug metabolism and cell defense system, it is important to understand the gene regulation by such agents. In this study, we demonstrated that curcumin could induce the expression of human glutathione S-transferase P1 (GSTP1). In HepG2 cells treated with 20 μM curcumin, the level of GSTP1 mRNA was significantly increased. In luciferase reporter assays, curcumin augmented the promoter activity of a reporter construct carrying 336 bp upstream of the 5'-flanking region of the GSTP1 gene. Mutation analyses revealed that the region including antioxidant response element (ARE), which overlaps AP1 in sequence, was essential to the response to curcumin. While the introduction of a wild-type Nrf2 expression construct augmented the promoter activity of the GSTP1 gene, co-expression of a dominant-negative Nrf2 abolished the responsiveness to curcumin. In addition, curcumin activated the expression of the luciferase gene from a reporter construct carrying multiple ARE consensus sequences but not one with multiple AP1 sites. In a gel mobility shift assay with an oligonucleotide with GSTP1 ARE, an increase in the amount of the binding complex was observed in the nuclear extracts of curcumin-treated HepG2 cells. These results suggested that ARE is the primary sequence for the curcumin-induced transactivation of the GSTP1 gene. The induction of GSTP1 may be one of the mechanisms underlying the multiple actions of curcumin.

  16. Analysis of human immunodeficiency virus type 1 Vif gene sequences among men who have sex with men in Heilongjiang province of China.

    PubMed

    Shao, Bing; Li, Hang; Liu, Sheng-Yuan; Li, Wen-Jing; Huang, Chao-Qun; Lin, Yuan-Long; Wang, Fu-Xiang; Wang, Bin-You

    2013-05-01

    This study aimed to identify the currently prevalent subtypes and to study the genetic variation of HIV-1 strains in men who have sex with men (MSM) residing in Heilongjiang province, China. We analyzed the characteristics of the nucleotide sequences and the corresponding deduced protein of Vif of HIV-1 strains isolated from 17 drug-naive HIV-1-seropositive MSM. Subtypes B (17.65%) and B' (Thailand B) (11.76%), CRF07_BC (47.06%), and CRF01_AE (23.53%) were identified. Phylogenetic analysis showed that there was a close relationship between our strains and those from the same MSM population in Hebei province, which is geographically close to Heilongjiang. Most of the documented Vif functional motifs are well conserved in the majority of our analyzed sequences. Taken together, our results suggest that there might be multiple introductions of HIV in Heilongjiang MSM and frequent sexual communications with other geographically nearby MSM populations.

  17. Efficient generation of complete sequences of MDR-encoding plasmids by rapid assembly of MinION barcoding sequencing data.

    PubMed

    Li, Ruichao; Xie, Miaomiao; Dong, Ning; Lin, Dachuan; Yang, Xuemei; Wong, Marcus Ho Yin; Chan, Edward Wai-Chi; Chen, Sheng

    2018-03-01

    Multidrug resistance (MDR)-encoding plasmids are considered major molecular vehicles responsible for transmission of antibiotic resistance genes among bacteria of the same or different species. Delineating the complete sequences of such plasmids could provide valuable insight into the evolution and transmission mechanisms underlying bacterial antibiotic resistance development. However, due to the presence of multiple repeats of mobile elements, complete sequencing of MDR plasmids remains technically complicated, expensive, and time-consuming. Here, we demonstrate a rapid and efficient approach to obtaining multiple MDR plasmid sequences through the use of the MinION nanopore sequencing platform, which is incorporated in a portable device. By assembling the long sequencing reads generated by a single MinION run according to a rapid barcoding sequencing protocol, we obtained the complete sequences of 20 plasmids harbored by multiple bacterial strains. Importantly, single long reads covering a plasmid end-to-end were recorded, indicating that de novo assembly may be unnecessary if the single reads exhibit high accuracy. This workflow represents a convenient and cost-effective approach for systematic assessment of MDR plasmids responsible for treatment failure of bacterial infections, offering the opportunity to perform detailed molecular epidemiological studies to probe the evolutionary and transmission mechanisms of MDR-encoding elements.

  18. dCITE: Measuring Necessary Cladistic Information Can Help You Reduce Polytomy Artefacts in Trees.

    PubMed

    Wise, Michael J

    2016-01-01

    Biologists regularly create phylogenetic trees to better understand the evolutionary origins of their species of interest, and often use genomes as their data source. However, as more and more incomplete genomes are published, in many cases it may not be possible to compute genome-based phylogenetic trees due to large gaps in the assembled sequences. In addition, comparison of complete genomes may not even be desirable due to the presence of horizontally acquired and homologous genes. A decision must therefore be made about which gene, or gene combinations, should be used to compute a tree. Deflated Cladistic Information based on Total Entropy (dCITE) is proposed as an easily computed metric for measuring the cladistic information in multiple sequence alignments representing a range of taxa, without the need to first compute the corresponding trees. dCITE scores can be used to rank candidate genes or decide whether input sequences provide insufficient cladistic information, making artefactual polytomies more likely. The dCITE method can be applied to protein, nucleotide or encoded phenotypic data, so can be used to select which data-type is most appropriate, given the choice. In a series of experiments the dCITE method was compared with related measures. Then, as a practical demonstration, the ideas developed in the paper were applied to a dataset representing species from the order Campylobacterales; trees based on sequence combinations, selected on the basis of their dCITE scores, were compared with a tree constructed to mimic Multi-Locus Sequence Typing (MLST) combinations of fragments. We see that the greater the dCITE score the more likely it is that the computed phylogenetic tree will be free of artefactual polytomies. Secondly, cladistic information saturates, beyond which little additional cladistic information can be obtained by adding additional sequences. Finally, sequences with high cladistic information produce more consistent trees for the same taxa.
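
    The abstract above describes an entropy-based score computed directly from a multiple sequence alignment but does not give the dCITE formula. The sketch below is therefore not dCITE itself. It only illustrates the general idea of summing per-column Shannon entropy over an alignment. The function names, the gap character "-" and the choice to ignore gaps are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of one alignment column.

    Gap characters ("-") are ignored; this is an illustrative assumption,
    not something stated in the abstract.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # H = sum over residues of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def total_entropy(alignment):
    """Sum of per-column entropies for an alignment of equal-length strings."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) walks the alignment column by column
    return sum(column_entropy(col) for col in zip(*alignment))


if __name__ == "__main__":
    # Hypothetical toy alignment: four taxa, four columns
    toy = ["ACGT", "ACGA", "ACCA", "AC-A"]
    print(round(total_entropy(toy), 4))
```

    A fully conserved column contributes zero and a column split evenly between two residues contributes one bit, so the total rises with the number of variable sites. How dCITE deflates or normalises such a total is left to the paper.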

  19. dCITE: Measuring Necessary Cladistic Information Can Help You Reduce Polytomy Artefacts in Trees

    PubMed Central

    2016-01-01

    Biologists regularly create phylogenetic trees to better understand the evolutionary origins of their species of interest, and often use genomes as their data source. However, as more and more incomplete genomes are published, in many cases it may not be possible to compute genome-based phylogenetic trees due to large gaps in the assembled sequences. In addition, comparison of complete genomes may not even be desirable due to the presence of horizontally acquired and homologous genes. A decision must therefore be made about which gene, or gene combinations, should be used to compute a tree. Deflated Cladistic Information based on Total Entropy (dCITE) is proposed as an easily computed metric for measuring the cladistic information in multiple sequence alignments representing a range of taxa, without the need to first compute the corresponding trees. dCITE scores can be used to rank candidate genes or decide whether input sequences provide insufficient cladistic information, making artefactual polytomies more likely. The dCITE method can be applied to protein, nucleotide or encoded phenotypic data, so can be used to select which data-type is most appropriate, given the choice. In a series of experiments the dCITE method was compared with related measures. Then, as a practical demonstration, the ideas developed in the paper were applied to a dataset representing species from the order Campylobacterales; trees based on sequence combinations, selected on the basis of their dCITE scores, were compared with a tree constructed to mimic Multi-Locus Sequence Typing (MLST) combinations of fragments. We see that the greater the dCITE score the more likely it is that the computed phylogenetic tree will be free of artefactual polytomies. Secondly, cladistic information saturates, beyond which little additional cladistic information can be obtained by adding additional sequences. Finally, sequences with high cladistic information produce more consistent trees for the same taxa. PMID:27898695

  20. Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.

    The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. Results: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. Availability: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plugin 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu
