Wong, Gerard; Leckie, Christopher; Gorringe, Kylie L; Haviv, Izhak; Campbell, Ian G; Kowalczyk, Adam
2010-04-15
High-density single nucleotide polymorphism (SNP) genotyping arrays are efficient and cost-effective platforms for the detection of copy number variation (CNV). To ensure accuracy in probe synthesis and to minimize production costs, short oligonucleotide probe sequences are used. The use of short probe sequences limits the specificity of binding targets in the human genome. The specificity of these short probeset sequences has yet to be fully analysed against a normal reference human genome. Sequence similarity can artificially elevate or suppress copy number measurements, and hence reduce the reliability of affected probe readings. For the purpose of detecting narrow CNVs reliably down to the width of a single probeset, sequence similarity is an important issue that needs to be addressed. We surveyed the Affymetrix Human Mapping SNP arrays for probeset sequence similarity against the reference human genome. Utilizing sequence similarity results, we identified a collection of fine-scaled putative CNVs between genders from autosomal probesets whose sequence matches various loci on the sex chromosomes. To detect these variations, we utilized our statistical approach, Detecting REcurrent Copy number change using rank-order Statistics (DRECS), and showed that its performance was superior and more stable than the t-test in detecting CNVs. Through the application of DRECS on the HapMap population datasets with multi-matching probesets filtered, we identified biologically relevant SNPs in aberrant regions across populations with known association to physical traits, such as height, covered by the span of a single probe. This provided empirical confirmation of the existence of naturally occurring narrow CNVs as well as the sensitivity of the Affymetrix SNP array technology in detecting them. The MATLAB implementation of DRECS is available at http://ww2.cs.mu.oz.au/~gwong/DRECS/index.html.
Protein-protein interaction network-based detection of functionally similar proteins within species.
Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli
2012-07-01
Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identifying these proteins is important for understanding biological functions, the evolution of protein families, the progression of co-evolution, convergent evolution, and other aspects that cannot be obtained by detecting functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After denoting protein-protein interaction networks using graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a modified shortest-path method to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment were identified. By analyzing the results, we found that it is sometimes difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent. Copyright © 2012 Wiley Periodicals, Inc.
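The subgraph-splitting step described above can be illustrated with a minimal sketch. It assumes that the "1-hop method" means taking each protein together with its direct interaction partners and the interactions among them; the abstract does not define the method further, so this reading, the function names, and the toy network are all illustrative assumptions rather than the authors' implementation.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def one_hop_subgraph(adj, center):
    """Return (nodes, edges) of the subgraph induced by `center` and its direct neighbours.

    Assumption: this induced-subgraph reading of "1-hop" is ours, not the paper's.
    """
    nodes = {center} | adj[center]
    edges = {frozenset((u, v)) for u in nodes for v in adj[u] if v in nodes}
    return nodes, edges


# Hypothetical toy interaction network (protein names are placeholders).
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]
adj = build_graph(toy_edges)
nodes, sub_edges = one_hop_subgraph(adj, "P3")
```

In this toy network the subgraph around P3 contains P1, P2, P3 and P4, and the P4-P5 interaction is excluded because P5 is two hops away. Comparing such subgraphs between proteins is the part the abstract attributes to a modified shortest-path method, which is not sketched here.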
Community detection in sequence similarity networks based on attribute clustering
Chowdhary, Janamejaya; Loeffler, Frank E.; Smith, Jeremy C.
2017-07-24
Networks are powerful tools for the presentation and analysis of interactions in multi-component systems. A commonly studied mesoscopic feature of networks is their community structure, which arises from grouping together similar nodes into one community and dissimilar nodes into separate communities. In this paper, the community structure of protein sequence similarity networks is determined with a new method: Attribute Clustering Dependent Communities (ACDC). Sequence similarity has hitherto typically been quantified by the alignment score or its expectation value. However, pair alignments with the same score or expectation value cannot thus be differentiated. To overcome this deficiency, the method constructs, for pair alignments, an extended alignment metric, the link attribute vector, which includes the score and other alignment characteristics. Rescaling components of the attribute vectors qualitatively identifies a systematic variation of sequence similarity within protein superfamilies. The problem of community detection is then mapped to clustering the link attribute vectors, selection of an optimal subset of links and community structure refinement based on the partition density of the network. ACDC-predicted communities are found to be in good agreement with gold standard sequence databases for which the "ground truth" community structures (or families) are known. ACDC is therefore a community detection method for sequence similarity networks based entirely on pair similarity information. A serial implementation of ACDC is available from https://cmb.ornl.gov/resources/developments
Community detection in sequence similarity networks based on attribute clustering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chowdhary, Janamejaya; Loeffler, Frank E.; Smith, Jeremy C.
Networks are powerful tools for the presentation and analysis of interactions in multi-component systems. A commonly studied mesoscopic feature of networks is their community structure, which arises from grouping together similar nodes into one community and dissimilar nodes into separate communities. In this paper, the community structure of protein sequence similarity networks is determined with a new method: Attribute Clustering Dependent Communities (ACDC). Sequence similarity has hitherto typically been quantified by the alignment score or its expectation value. However, pair alignments with the same score or expectation value cannot thus be differentiated. To overcome this deficiency, the method constructs, for pair alignments, an extended alignment metric, the link attribute vector, which includes the score and other alignment characteristics. Rescaling components of the attribute vectors qualitatively identifies a systematic variation of sequence similarity within protein superfamilies. The problem of community detection is then mapped to clustering the link attribute vectors, selection of an optimal subset of links and community structure refinement based on the partition density of the network. ACDC-predicted communities are found to be in good agreement with gold standard sequence databases for which the "ground truth" community structures (or families) are known. ACDC is therefore a community detection method for sequence similarity networks based entirely on pair similarity information. A serial implementation of ACDC is available from https://cmb.ornl.gov/resources/developments
Dhir, Somdutta; Pacurar, Mircea; Franklin, Dino; Gáspári, Zoltán; Kertész-Farkas, Attila; Kocsor, András; Eisenhaber, Frank; Pongor, Sándor
2010-11-01
SBASE is a project initiated to detect known domain types and predict domain architectures using sequence similarity searching (Simon et al., Protein Seq Data Anal, 5: 39-42, 1992; Pongor et al., Nucl. Acids Res. 21:3111-3115, 1992). The current approach uses a curated collection of domain sequences, the SBASE domain library, and standard similarity search algorithms, followed by postprocessing based on simple statistics of the domain similarity network (http://hydra.icgeb.trieste.it/sbase/). It is especially useful in detecting rare, atypical examples of known domain types which are sometimes missed even by more sophisticated methodologies. This approach does not require multiple alignment or machine learning techniques, and can be a useful complement to other domain detection methodologies. This article gives an overview of the project history as well as of the concepts and principles developed within the project.
Karaboga, D; Aslan, S
2016-04-27
The great majority of biological sequences share significant similarity with other sequences as a result of evolutionary processes, and identifying these sequence similarities is one of the most challenging problems in bioinformatics. In this paper, we present a discrete artificial bee colony (ABC) algorithm, which is inspired by the intelligent foraging behavior of real honey bees, for the detection of highly conserved residue patterns or motifs within sequences. Experimental studies on three different data sets showed that the proposed discrete model, by adhering to the fundamental scheme of the ABC algorithm, produced competitive or better results than other metaheuristic motif discovery techniques.
Domain similarity based orthology detection.
Bitard-Feildel, Tristan; Kemena, Carsten; Greenwood, Jenny M; Bornberg-Bauer, Erich
2015-05-13
Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows quadratically. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of the search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. porthoDom is implemented in Python and C++ and is available under the GNU GPL version 3 licence at http://www.bornberglab.org/pages/porthoda.
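To make the idea of a cosine score on domain content concrete, here is a minimal sketch. It assumes each protein is given as a list of domain identifiers and that the score is the ordinary cosine between domain count vectors; the exact weighting used in porthoDom is not stated in the abstract, and the Pfam-style identifiers below are placeholders chosen for illustration.

```python
from collections import Counter
from math import sqrt


def domain_cosine(domains_a, domains_b):
    """Cosine similarity between the domain count vectors of two proteins.

    Assumption: plain counts are used as vector components; the published
    measure may weight domains differently.
    """
    counts_a, counts_b = Counter(domains_a), Counter(domains_b)
    # Only domains present in both proteins contribute to the dot product.
    dot = sum(counts_a[d] * counts_b[d] for d in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a protein without annotated domains matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical domain strings for two proteins.
protein_a = ["PF00069", "PF00017", "PF00017"]
protein_b = ["PF00069", "PF00017"]
score = domain_cosine(protein_a, protein_b)
```

Because each protein is reduced to a short list of identifiers, a score like this is far cheaper than an amino-acid alignment, which is the property the abstract relies on when grouping proteins before the ortholog search.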
Orthology Detection Combining Clustering and Synteny for Very Large Datasets
Lechner, Marcus; Hernandez-Rosales, Maribel; Doerr, Daniel; Wieseke, Nicolas; Thévenin, Annelyse; Stoye, Jens; Hartmann, Roland K.; Prohaska, Sonja J.; Stadler, Peter F.
2014-01-01
The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets. PMID:25137074
An early illness recognition framework using a temporal Smith Waterman algorithm and NLP.
Hajihashemi, Zahra; Popescu, Mihail
2013-01-01
In this paper we propose a framework for detecting health patterns based on non-wearable sensor sequence similarity and natural language processing (NLP). In TigerPlace, an aging-in-place facility in Columbia, MO, we deployed 47 sensor networks together with a nursing electronic health record (EHR) system to provide early illness recognition. The proposed framework utilizes sensor sequence similarity and NLP on EHR nursing comments to automatically notify the physician when health problems are detected. The reported methodology is inspired by genomic sequence annotation using similarity algorithms such as Smith Waterman (SW). Similarly, for each sensor sequence, we associate health concepts extracted from the nursing notes using Metamap, an NLP tool provided by the Unified Medical Language System (UMLS). Since sensor sequences, unlike genomic ones, have an associated time dimension, we propose a temporal variant of SW (TSW) to account for time. The main challenges presented by our framework are finding the most suitable time sequence similarity and aggregating the retrieved UMLS concepts. On a pilot dataset from three TigerPlace residents, with a total of 1685 sensor days and 626 nursing records, we obtained an average precision of 0.64 and a recall of 0.37.
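For readers unfamiliar with the Smith Waterman algorithm mentioned above, the following is a minimal sketch of the standard local alignment score with a linear gap penalty. It is not the temporal variant (TSW) proposed in the paper, whose details are not given in the abstract; the scoring parameters are arbitrary assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Works on strings or on lists of hashable symbols (e.g. discretised sensor
    events). The scoring values are illustrative assumptions.
    """
    cols = len(b) + 1
    prev = [0] * cols  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at zero is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared segment "ACG" gives three matches at 2 points each.
example_score = smith_waterman_score("TTACGTT", "GGACGGG")
```

Only two rows of the matrix are kept because the sketch returns the score alone; recovering the aligned segment itself would require storing the full matrix and tracing back from the best cell.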
Chamings, Anthony; Nelson, Tiffanie M; Vibin, Jessy; Wille, Michelle; Klaassen, Marcel; Alexandersen, Soren
2018-04-13
We evaluated the presence of coronaviruses by PCR in 918 Australian wild bird samples collected during 2016-17. Coronaviruses were detected in 141 samples (15.3%) from species of ducks, shorebirds and herons and from multiple sampling locations. Sequencing of selected positive samples found mainly gammacoronaviruses, but also some deltacoronaviruses. The detection rate of coronaviruses was improved by using multiple PCR assays, as no single assay could detect all coronavirus positive samples. Sequencing of the relatively conserved Orf1 PCR amplicons found that Australian duck gammacoronaviruses were similar to duck gammacoronaviruses around the world. Some sequenced shorebird gammacoronaviruses belonged to Charadriiformes lineages, but others were more closely related to duck gammacoronaviruses. Australian duck and heron deltacoronaviruses belonged to lineages with other duck and heron deltacoronaviruses, but were almost 20% different in nucleotide sequence to other deltacoronavirus sequences available. Deltacoronavirus sequences from shorebirds formed a lineage with a deltacoronavirus from a ruddy turnstone detected in the United States. Given that Australian duck gammacoronaviruses are highly similar to those found in other regions, and Australian ducks rarely come into contact with migratory Palearctic duck species, we hypothesise that migratory shorebirds are the important vector for moving wild bird coronaviruses into and out of Australia.
Thomas, Lindsay H; Seryodkin, Ivan V; Goodrich, John M; Miquelle, Dale G; Birtles, Richard J; Lewis, John C M
2016-07-01
We collected 69 ticks from nine free-ranging Amur tigers (Panthera tigris altaica) between 2002 and 2011 and investigated them for tick-borne pathogens. DNA was extracted using alkaline digestion and PCR was performed to detect apicomplexan organisms. Partial 18S rDNA amplification products were obtained from 14 ticks from four tigers, of which 13 yielded unambiguous nucleotide sequence data. Comparative sequence analysis revealed all 13 partial 18S rDNA sequences were most similar to those belonging to strains of Hepatozoon felis (>564/572 base-pair identity, >99% sequence similarity). Although this tick-borne protozoan pathogen has been detected in wild felids from many parts of the world, this is the first record from the Russian Far East.
Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin
2016-06-15
Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space have demonstrated the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence-structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM-HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods. Our program is freely available for download from http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: xin.gao@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
King, Brian R; Aburdene, Maurice; Thompson, Alex; Warres, Zach
2014-01-01
Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
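As background for the DSP view of DNA described above, the sketch below computes a discrete Fourier power spectrum from binary indicator sequences, one per base. This is a common starting point for Fourier-based sequence analysis and is offered only as an assumption about the general setting; it is not the ICD transformation itself, which the abstract does not define.

```python
import cmath


def indicator(seq, base):
    """Binary indicator sequence: 1.0 where seq has `base`, else 0.0."""
    return [1.0 if ch == base else 0.0 for ch in seq]


def dft(x):
    """Naive O(n^2) discrete Fourier transform, adequate for short examples."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def power_spectrum(seq):
    """Sum of squared DFT magnitudes over the four base indicator sequences."""
    spectra = [dft(indicator(seq, base)) for base in "ACGT"]
    return [sum(abs(s[k]) ** 2 for s in spectra) for k in range(len(seq))]


# A toy sequence with period 4 concentrates power at multiples of n/4.
spectrum = power_spectrum("ACGTACGTACGT")
```

For this toy input the power sits at indices 0, 3, 6 and 9 and vanishes elsewhere, reflecting the period of 4. An alignment-free comparison method would then derive a fixed-length signature from such coefficients so that sequences of different lengths can be compared.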
Computational mining for hypothetical patterns of amino acid side chains in protein data bank (PDB)
NASA Astrophysics Data System (ADS)
Ghani, Nur Syatila Ab; Firdaus-Raih, Mohd
2018-04-01
The three-dimensional structure of a protein can provide insights regarding its function. Functional relationships between proteins can be inferred from fold and sequence similarities. In certain cases, sequence or fold comparison fails to establish homology between proteins with similar mechanisms. Since structure is more conserved than sequence, a constellation of functional residues can be similarly arranged among proteins with a similar mechanism. Local structural similarity searches are able to detect such constellations of amino acids among distinct proteins, which can be useful for annotating proteins of unknown function. Detecting such patterns of amino acids on a large scale can expand the repertoire of important 3D motifs, since the currently known 3D motifs cannot keep pace with the ever-increasing number of uncharacterized proteins to be annotated. Here, a computational platform for the automated detection of 3D motifs is described. A fuzzy-pattern searching algorithm derived from IMagine an Amino Acid 3D Arrangement search EnGINE (IMAAAGINE) was implemented to develop an automated method for searching for hypothetical patterns of amino acid side chains in the Protein Data Bank (PDB), without the need for prior knowledge of the related sequence or structure of the pattern of interest. We present an example of such a search: the detection of a hypothetical pattern derived from the known C2H2 structural motif of zinc fingers. The conservation of particular patterns of amino acid side chains in unrelated proteins is highlighted. This approach can complement available structure- and sequence-based platforms and may contribute to improving functional association between proteins.
String Mining in Bioinformatics
NASA Astrophysics Data System (ADS)
Abouelhoda, Mohamed; Ghanem, Moustafa
Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying biological sequences (DNA, RNA, and proteins) at the level of their linear structure. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or more sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
PaPrBaG: A machine learning approach for the detection of novel pathogens from NGS data
NASA Astrophysics Data System (ADS)
Deneke, Carlus; Rentzsch, Robert; Renard, Bernhard Y.
2017-01-01
The reliable detection of novel bacterial pathogens from next-generation sequencing data is a key challenge for microbial diagnostics. Current computational tools usually rely on sequence similarity and often fail to detect novel species when closely related genomes are unavailable or missing from the reference database. Here we present the machine learning based approach PaPrBaG (Pathogenicity Prediction for Bacterial Genomes). PaPrBaG overcomes genetic divergence by training on a wide range of species with known pathogenicity phenotype. To that end we compiled a comprehensive list of pathogenic and non-pathogenic bacteria with human host, using various genome metadata in conjunction with a rule-based protocol. A detailed comparative study reveals that PaPrBaG has several advantages over sequence similarity approaches. Most importantly, it always provides a prediction, whereas other approaches discard a large number of sequencing reads with low similarity to currently known reference genomes. Furthermore, PaPrBaG remains reliable even at very low genomic coverages. Combining PaPrBaG with existing approaches further improves prediction results.
Blood-Borne Candidatus Borrelia algerica in a Patient with Prolonged Fever in Oran, Algeria
Fotso Fotso, Aurélien; Angelakis, Emmanouil; Mouffok, Nadjet; Drancourt, Michel; Raoult, Didier
2015-01-01
To improve the knowledge base of Borrelia in north Africa, we tested 257 blood samples collected from febrile patients in Oran, Algeria, between January and December 2012 for Borrelia species using flagellin gene polymerase chain reaction sequencing. A sequence indicative of a new Borrelia sp. named Candidatus Borrelia algerica was detected in one blood sample. Further multispacer sequence typing indicated this Borrelia sp. had 97% similarity with Borrelia crocidurae, Borrelia duttonii, and Borrelia recurrentis. In silico comparison of Candidatus B. algerica spacer sequences with those of Borrelia hispanica and Borrelia garinii revealed 94% and 89% similarity, respectively. Candidatus B. algerica is a new relapsing fever Borrelia sp. detected in Oran. Further studies may help predict its epidemiological importance. PMID:26416117
Targeted Re-Sequencing Emulsion PCR Panel for Myopathies: Results in 94 Cases.
Punetha, Jaya; Kesari, Akanchha; Uapinyoying, Prech; Giri, Mamta; Clarke, Nigel F; Waddell, Leigh B; North, Kathryn N; Ghaoui, Roula; O'Grady, Gina L; Oates, Emily C; Sandaradura, Sarah A; Bönnemann, Carsten G; Donkervoort, Sandra; Plotz, Paul H; Smith, Edward C; Tesi-Rocha, Carolina; Bertorini, Tulio E; Tarnopolsky, Mark A; Reitter, Bernd; Hausmanowa-Petrusewicz, Irena; Hoffman, Eric P
2016-05-27
Molecular diagnostics in the genetic myopathies often requires testing of the largest and most complex transcript units in the human genome (DMD, TTN, NEB). Iteratively targeting single genes for sequencing has traditionally entailed high costs and long turnaround times. Exome sequencing has begun to supplant single targeted genes, but there are concerns regarding coverage and needed depth of the very large and complex genes that frequently cause myopathies. We aimed to evaluate the efficiency of next-generation sequencing technologies in providing molecular diagnostics for patients with previously undiagnosed myopathies. We tested a targeted re-sequencing approach, using a 45 gene emulsion PCR myopathy panel, with subsequent sequencing on the Illumina platform in 94 undiagnosed patients. We compared the targeted re-sequencing approach to exome sequencing for 10 of these patients. We detected likely pathogenic mutations in 33 out of 94 patients, a molecular diagnostic rate of approximately 35%. The remaining patients showed variants of unknown significance (35/94 patients) or no mutations detected in the 45 genes tested (26/94 patients). Mutation detection rates were similar for targeted re-sequencing and whole-exome sequencing; however, exome sequencing showed better distribution of reads and fewer exon dropouts. Given that costs of highly parallel re-sequencing and whole exome sequencing are similar, and that exome sequencing now takes considerably less laboratory processing time than targeted re-sequencing, we recommend exome sequencing as the standard approach for molecular diagnostics of myopathies.
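The patient counts reported above can be checked with a few lines of arithmetic. The sketch uses only the numbers given in the abstract; the category labels are paraphrases and the variable names are illustrative.

```python
total_patients = 94
outcome_counts = {
    "likely pathogenic mutation": 33,
    "variant of unknown significance": 35,
    "no mutation detected": 26,
}

# The three outcome groups should account for every patient in the cohort.
assert sum(outcome_counts.values()) == total_patients

# Percentage of the cohort in each group, rounded to one decimal place.
outcome_rates = {
    label: round(100 * count / total_patients, 1)
    for label, count in outcome_counts.items()
}
```

The first group corresponds to the diagnostic rate of approximately 35% quoted in the abstract, and the three groups together cover all 94 patients.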
Goonesekere, Nalin Cw
2009-01-01
The large numbers of protein sequences generated by whole genome sequencing projects require rapid and accurate methods of annotation. The detection of homology through computational sequence analysis is a powerful tool in determining the complex evolutionary and functional relationships that exist between proteins. Homology search algorithms employ amino acid substitution matrices to detect similarity between protein sequences. The substitution matrices in common use today are constructed using sequences aligned without reference to protein structure. Here we present amino acid substitution matrices constructed from the alignment of a large number of protein domain structures from the Structural Classification of Proteins (SCOP) database. We show that when incorporated into the homology search algorithms BLAST and PSI-BLAST, the structure-based substitution matrices enhance the efficacy of detecting remote homologs.
FASH: A web application for nucleotides sequence search.
Veksler-Lublinksy, Isana; Barash, Danny; Avisar, Chai; Troim, Einav; Chew, Paul; Kedem, Klara
2008-05-27
FASH (Fourier Alignment Sequence Heuristics) is a web application, based on the Fast Fourier Transform, for finding remote homologs within a long nucleic acid sequence. Given a query sequence and a long text sequence (e.g., the human genome), FASH detects subsequences within the text that are remotely similar to the query. FASH offers an alternative approach to Blast/Fasta for querying long RNA/DNA sequences. FASH differs from these other approaches in that it does not depend on the existence of contiguous seed sequences in its initial detection phase. The FASH web server is user friendly and very easy to operate. FASH can be accessed at https://fash.bgu.ac.il:8443/fash/default.jsp (secured website).
Comprehensive comparison of three commercial human whole-exome capture platforms.
Asan; Xu, Yu; Jiang, Hui; Tyler-Smith, Chris; Xue, Yali; Jiang, Tao; Wang, Jiawei; Wu, Mingzhi; Liu, Xiao; Tian, Geng; Wang, Jun; Wang, Jian; Yang, Huangming; Zhang, Xiuqing
2011-09-28
Exome sequencing, which allows the global analysis of protein coding sequences in the human genome, has become an effective and affordable approach to detecting causative genetic mutations in diseases. Currently, there are several commercial human exome capture platforms; however, the relative performances of these have not been characterized sufficiently to know which is best for a particular study. We comprehensively compared three platforms: NimbleGen's Sequence Capture Array and SeqCap EZ, and Agilent's SureSelect. We assessed their performance in a variety of ways, including number of genes covered and capture efficacy. Differences that may impact on the choice of platform were that Agilent SureSelect covered approximately 1,100 more genes, while NimbleGen provided better flanking sequence capture. Although all three platforms achieved similar capture specificity of targeted regions, the NimbleGen platforms showed better uniformity of coverage and greater genotype sensitivity at 30- to 100-fold sequencing depth. All three platforms showed similar power in exome SNP calling, including medically relevant SNPs. Compared with genotyping and whole-genome sequencing data, the three platforms achieved a similar accuracy of genotype assignment and SNP detection. Importantly, all three platforms showed similar levels of reproducibility, GC bias and reference allele bias. We demonstrate key differences between the three platforms, particularly advantages of solutions over array capture and the importance of a large gene target set.
Maruyama, Sandra Regina; Castro-Jorge, Luiza Antunes; Ribeiro, José Marcos Chaves; Gardinassi, Luiz Gustavo; Garcia, Gustavo Rocha; Brandão, Lucinda Giampietro; Rodrigues, Aline Rezende; Okada, Marcos Ituo; Abrão, Emiliana Pereira; Ferreira, Beatriz Rossetti; da Fonseca, Benedito Antonio Lopes; de Miranda-Santos, Isabel Kinney Ferreira
2013-01-01
Transcripts similar to those that encode the nonstructural (NS) proteins NS3 and NS5 from flaviviruses were found in a salivary gland (SG) complementary DNA (cDNA) library from the cattle tick Rhipicephalus microplus. Tick extracts were cultured with cells to enable the isolation of viruses capable of replicating in cultured invertebrate and vertebrate cells. Deep sequencing of the viral RNA isolated from culture supernatants provided the complete coding sequences for the NS3 and NS5 proteins and their molecular characterisation confirmed similarity with the NS3 and NS5 sequences from other flaviviruses. Despite this similarity, phylogenetic analyses revealed that this potentially novel virus may be a highly divergent member of the genus Flavivirus. Interestingly, we detected the divergent NS3 and NS5 sequences in ticks collected from several dairy farms widely distributed throughout three regions of Brazil. This is the first report of flavivirus-like transcripts in R. microplus ticks. This novel virus is a potential arbovirus because it replicated in arthropod and mammalian cells; furthermore, it was detected in a cDNA library from tick SGs and therefore may be present in tick saliva. It is important to determine whether and by what means this potential virus is transmissible and to monitor the virus as a potential emerging tick-borne zoonotic pathogen. PMID:24626302
Equine infectious anemia virus in naturally infected horses from the Brazilian Pantanal.
Cursino, Andreia Elisa; Vilela, Ana Paula Pessoa; Franco-Luiz, Ana Paula Moreira; de Oliveira, Jaquelline Germano; Nogueira, Márcia Furlan; Júnior, João Pessoa Araújo; de Aguiar, Daniel Moura; Kroon, Erna Geessien
2018-05-11
Equine infectious anemia (EIA) has a worldwide distribution, and is widespread in Brazil. The Brazilian Pantanal has a high prevalence, compromising equine performance and, indirectly, the livestock industry, since horses are used for cattle management. Although EIA is routinely diagnosed by the agar gel immunodiffusion test (AGID), this serological assay has some limitations; PCR-based detection methods have the potential to overcome them and to complement the tests currently used. Considering the limited number of equine infectious anemia virus (EIAV) sequences available in public databases and the great genome variability, studies of EIAV detection and molecular characterization remain important. In this study we detected EIAV proviral DNA from 23 peripheral blood mononuclear cell (PBMC) samples of naturally infected horses from the Brazilian Pantanal using a semi-nested PCR (sn-PCR). The serological profile of the animals was also evaluated by AGID and ELISA for gp90 and p26. Furthermore, the EIAV PCR-amplified DNA was sequenced and phylogenetically analyzed. Here we describe the first EIAV sequences of the 5' LTR of the tat gene in naturally infected horses from Brazil, which showed 91% similarity to EIAV reference sequences. The Brazilian EIAV sequences also showed variable nucleotide similarities among themselves, ranging from 93.5% to 100%. Phylogenetic analysis showed that Brazilian EIAV sequences grouped in a separate clade relative to other reference sequences. Thus this molecular detection and characterization may provide information about EIAV circulation in Brazilian territories and improve phylogenetic inferences.
Cloud-based MOTIFSIM: Detecting Similarity in Large DNA Motif Data Sets.
Tran, Ngoc Tam L; Huang, Chun-Hsi
2017-05-01
We developed a cloud-based version of MOTIFSIM on the Amazon Web Services (AWS) cloud. The tool extends our web-based tool, version 2.0, which was developed based on a novel algorithm for detecting similarity in multiple DNA motif data sets. This cloud-based version further allows researchers to exploit the computing resources available from AWS to detect similarity in multiple large-scale DNA motif data sets produced by next-generation sequencing technology. The tool is highly scalable, as AWS resources are expandable.
Estimation of pairwise sequence similarity of mammalian enhancers with word neighbourhood counts.
Göke, Jonathan; Schulz, Marcel H; Lasserre, Julia; Vingron, Martin
2012-03-01
The identity of cells and tissues is to a large degree governed by transcriptional regulation. A major part is accomplished by the combinatorial binding of transcription factors at regulatory sequences, such as enhancers. Even though binding of transcription factors is sequence-specific, estimating the sequence similarity of two functionally similar enhancers is very difficult. However, a similarity measure for regulatory sequences is crucial to detect and understand functional similarities between two enhancers and will facilitate large-scale analyses like clustering, prediction and classification of genome-wide datasets. We present the standardized alignment-free sequence similarity measure N2, a flexible framework that is defined for word neighbourhoods. We explore the usefulness of adding reverse complement words as well as words including mismatches into the neighbourhood. On simulated enhancer sequences as well as functional enhancers in mouse development, N2 is shown to outperform previous alignment-free measures. N2 is flexible, faster than competing methods and less susceptible to single sequence noise and the occurrence of repetitive sequences. Experiments on the mouse enhancers reveal that enhancers active in different tissues can be separated by pairwise comparison using N2. N2 represents an improvement over previous alignment-free similarity measures without compromising speed, which makes it a good candidate for large-scale sequence comparison of regulatory sequences. The software is part of the open-source C++ library SeqAn (www.seqan.de) and a compiled version can be downloaded at http://www.seqan.de/projects/alf.html. Supplementary data are available at Bioinformatics online.
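To make the idea of an alignment-free, word-count comparison concrete, the following is a minimal sketch and not the N2 measure itself: it counts k-mers, optionally adds each word's reverse complement to the counts, and compares two sequences by cosine similarity. The function names, the choice of cosine similarity, and the omission of N2's standardization and mismatch neighbourhoods are all assumptions made for illustration.

```python
import math
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def word_counts(seq, k, with_revcomp=True):
    """Count all k-mers of seq; optionally also count each k-mer's reverse complement."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        counts[word] += 1
        if with_revcomp:
            counts[revcomp(word)] += 1
    return counts


def cosine_similarity(c1, c2):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(v * c2[w] for w, v in c1.items())
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)
```

With reverse-complement words included, a sequence and its reverse complement have identical count vectors, which is the strand-independence that adding reverse complement words is meant to provide.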
Complete Genome Sequence of Porcine Parvovirus 2 Recovered from Swine Sera
Kluge, M.; Franco, A. C.; Giongo, A.; Valdez, F. P.; Saddi, T. M.; Brito, W. M. E. D.; Roehe, P. M.
2016-01-01
A complete genomic sequence of porcine parvovirus 2 (PPV-2) was detected by viral metagenome analysis on swine sera. A phylogenetic analysis of this genome reveals that it is highly similar to previously reported North American PPV-2 genomes. The complete PPV-2 sequence is 5,426 nucleotides long. PMID:26823583
mrtailor: a tool for PDB-file preparation for the generation of external restraints.
Gruene, Tim
2013-09-01
Model building starting from, for example, a molecular-replacement solution with low sequence similarity introduces model bias, which can be difficult to detect, especially at low resolution. The program mrtailor removes low-similarity regions from a template PDB file according to sequence similarity between the target sequence and the template sequence and maps the target sequence onto the PDB file. The modified PDB file can be used to generate external restraints for low-resolution refinement with reduced model bias and can be used as a starting point for model building and refinement. The program can call ProSMART [Nicholls et al. (2012), Acta Cryst. D68, 404-417] directly in order to create external restraints suitable for REFMAC5 [Murshudov et al. (2011), Acta Cryst. D67, 355-367]. Both a command-line version and a GUI exist.
Gibbs motif sampling: detection of bacterial outer membrane protein repeats.
Neuwald, A. F.; Liu, J. S.; Lawrence, C. E.
1995-01-01
The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggests that they play important functional roles. PMID:8520488
Pseudouridines have context-dependent mutation and stop rates in high-throughput sequencing.
Zhou, Katherine I; Clark, Wesley C; Pan, David W; Eckwahl, Matthew J; Dai, Qing; Pan, Tao
2018-05-11
The abundant RNA modification pseudouridine (Ψ) has been mapped transcriptome-wide by chemically modifying pseudouridines with carbodiimide and detecting the resulting reverse transcription stops in high-throughput sequencing. However, these methods have limited sensitivity and specificity, in part due to the use of reverse transcription stops. We sought to use mutations rather than just stops in sequencing data to identify pseudouridine sites. Here, we identify reverse transcription conditions that allow read-through of carbodiimide-modified pseudouridine (CMC-Ψ), and we show that pseudouridines in carbodiimide-treated human ribosomal RNA have context-dependent mutation and stop rates in high-throughput sequencing libraries prepared under these conditions. Furthermore, accounting for the context-dependence of mutation and stop rates can enhance the detection of pseudouridine sites. Similar approaches could contribute to the sequencing-based detection of many RNA modifications.
Genome and Transcriptome Sequencing of the Ostreid herpesvirus 1 From Tomales Bay, California
NASA Astrophysics Data System (ADS)
Burge, C. A.; Langevin, S.; Closek, C. J.; Roberts, S. B.; Friedman, C. S.
2016-02-01
Mass mortalities of larval and seed bivalve molluscs attributed to the Ostreid herpesvirus 1 (OsHV-1) occur globally. OsHV-1 was fully sequenced and characterized as a member of the Family Malacoherpesviridae. Multiple strains of OsHV-1 exist and may vary in virulence, e.g. OsHV-1 µvar. For most global variants of OsHV-1, sequence data is limited to PCR-based sequencing of segments, including two recent genomes. In the United States, OsHV-1 is limited to detection in adjacent embayments in California, Tomales and Drakes bays. Limited DNA sequence data of OsHV-1 infecting oysters in Tomales Bay indicates the virus detected in Tomales Bay is similar but not identical to any one global variant of OsHV-1. In order to better understand both strain variation and virulence of OsHV-1 infecting oysters in Tomales Bay, we used genomic and transcriptomic sequencing. Meta-genomic sequencing (Illumina MiSeq) was conducted from infected oysters (n=4 per year) collected in 2003, 2007, and 2014, where full OsHV-1 genome sequences and low overall microbial diversity were achieved from highly infected oysters. Increased microbial diversity was detected in three of four samples sequenced from 2003, where qPCR-based genome copy numbers of OsHV-1 were lower. Expression analysis (SOLiD RNA sequencing) of OsHV-1 genes expressed in oyster larvae at 24 hours post exposure revealed a nearly complete transcriptome, with several highly expressed genes, which are similar to recent transcriptomic analyses of other OsHV-1 variants. Taken together, our results indicate that genome and transcriptome sequencing may be powerful tools in understanding both strain variation and virulence of non-culturable marine viruses.
Local alignment of two-base encoded DNA sequence
Homer, Nils; Merriman, Barry; Nelson, Stanley F
2009-01-01
Background: DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results: We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion: The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
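The reason decoding and alignment are best done together can be seen in a small sketch of two-base encoding. It assumes the widely used four-colour di-base scheme, in which each colour is the XOR of the 2-bit codes of adjacent bases (A=0, C=1, G=2, T=3); the abstract does not spell out the encoding, so this scheme and the function names are assumptions. The sketch covers only encoding and naive decoding, not the alignment algorithm described above.

```python
# 2-bit base codes; under this scheme the colour of a base pair is the XOR of the codes.
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASE = "ACGT"


def encode_colors(seq):
    """Encode a base sequence as a list of colours, one per adjacent base pair."""
    codes = [_CODE[b] for b in seq.upper()]
    return [x ^ y for x, y in zip(codes, codes[1:])]


def decode_colors(first_base, colors):
    """Deterministically decode colours back into bases, given the known first base."""
    code = _CODE[first_base.upper()]
    bases = [_BASE[code]]
    for color in colors:
        code ^= color  # next base code = previous base code XOR colour
        bases.append(_BASE[code])
    return "".join(bases)


colors = encode_colors("ATGGC")       # [3, 1, 0, 3]
clean = decode_colors("A", colors)    # recovers "ATGGC"

# A single colour measurement error changes every base decoded after it.
noisy = list(colors)
noisy[1] = 2
corrupted = decode_colors("A", noisy)  # "ATCCG"
```

Because one colour error propagates to all downstream bases under naive decoding, decoding first and aligning afterwards would turn a single measurement error into a long run of apparent mismatches, which is the situation a joint decode-and-align method avoids.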
Complete Genome Sequence of Porcine Parvovirus 2 Recovered from Swine Sera.
Campos, F S; Kluge, M; Franco, A C; Giongo, A; Valdez, F P; Saddi, T M; Brito, W M E D; Roehe, P M
2016-01-28
A complete genomic sequence of porcine parvovirus 2 (PPV-2) was detected by viral metagenome analysis on swine sera. A phylogenetic analysis of this genome reveals that it is highly similar to previously reported North American PPV-2 genomes. The complete PPV-2 sequence is 5,426 nucleotides long. Copyright © 2016 Campos et al.
Spliced RNA of woodchuck hepatitis virus.
Ogston, C W; Razman, D G
1992-07-01
Polymerase chain reaction was used to investigate RNA splicing in liver of woodchucks infected with woodchuck hepatitis virus (WHV). Two spliced species were detected, and the splice junctions were sequenced. The larger spliced RNA has an intron of 1300 nucleotides, and the smaller spliced sequence shows an additional downstream intron of 1104 nucleotides. We did not detect singly spliced sequences from which the smaller intron alone was removed. Control experiments showed that spliced sequences are present in both RNA and DNA in infected liver, showing that the viral reverse transcriptase can use spliced RNA as template. Spliced sequences were detected also in virion DNA prepared from serum. The upstream intron produces a reading frame that fuses the core to the polymerase polypeptide, while the downstream intron causes an inframe deletion in the polymerase open reading frame. Whereas the splicing patterns in WHV are superficially similar to those reported recently in hepatitis B virus, we detected no obvious homology in the coding capacity of spliced RNAs from these two viruses.
Campion, S R; Ameen, A S; Lai, L; King, J M; Munzenmaier, T N
2001-08-15
This report describes the application of a simple computational tool, AAPAIR.TAB, for the systematic analysis of the cysteine-rich EGF, Sushi, and Laminin motif/sequence families at the two-amino acid level. Automated dipeptide frequency/bias analysis detects preferences in the distribution of amino acids in established protein families, by determining which "ordered dipeptides" occur most frequently in comprehensive motif-specific sequence data sets. Graphic display of the dipeptide frequency/bias data revealed family-specific preferences for certain dipeptides, but more importantly detected a shared preference for employment of the ordered dipeptides Gly-Tyr (GY) and Gly-Phe (GF) in all three protein families. The dipeptide Asn-Gly (NG) also exhibited high-frequency and bias in the EGF and Sushi motif families, whereas Asn-Thr (NT) was distinguished in the Laminin family. Evaluation of the distribution of dipeptides identified by frequency/bias analysis subsequently revealed the highly restricted localization of the G(F/Y) and N(G/T) sequence elements at two separate sites of extreme conservation in the consensus sequence of all three sequence families. The similar employment of the high-frequency/bias dipeptides in three distinct protein sequence families was further correlated with the concurrence of these shared molecular determinants at similar positions within the distinctive scaffolds of three structurally divergent, but similarly employed, motif modules.
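The kind of ordered-dipeptide counting described above can be sketched in a few lines. This is not AAPAIR.TAB: the function names are hypothetical, and the bias score used here (observed count divided by the count expected from single-residue frequencies) is an assumed definition, since the abstract does not state how bias is computed.

```python
from collections import Counter


def dipeptide_counts(sequences):
    """Count every ordered dipeptide (overlapping residue pairs) across the sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts


def dipeptide_bias(sequences):
    """Observed/expected ratio per dipeptide; expected assumes independent residues."""
    residues = Counter()
    for seq in sequences:
        residues.update(seq.upper())
    total_residues = sum(residues.values())
    observed = dipeptide_counts(sequences)
    total_pairs = sum(observed.values())
    bias = {}
    for pair, obs in observed.items():
        freq_a = residues[pair[0]] / total_residues
        freq_b = residues[pair[1]] / total_residues
        bias[pair] = obs / (freq_a * freq_b * total_pairs)
    return bias
```

A ratio well above 1 marks a dipeptide that occurs more often than the residue composition alone would predict; applied to a motif-specific data set, pairs such as GY or GF would be expected to stand out if they are preferred as the abstract reports.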
CLAST: CUDA implemented large-scale alignment search tool.
Yano, Masahiro; Mori, Hiroshi; Akiyama, Yutaka; Yamada, Takuji; Kurokawa, Ken
2014-12-11
Metagenomics is a powerful methodology to study microbial communities, but it is highly dependent on nucleotide sequence similarity searching against sequence databases. Metagenomic analyses with next-generation sequencing technologies produce enormous numbers of reads from microbial communities, and many reads are derived from microbes whose genomes have not yet been sequenced, limiting the usefulness of existing sequence similarity search tools. Therefore, there is a clear need for a sequence similarity search tool that can rapidly detect weak similarity in large datasets. We developed a tool, which we named CLAST (CUDA implemented large-scale alignment search tool), that enables analyses of millions of reads and thousands of reference genome sequences, and runs on NVIDIA Fermi architecture graphics processing units. CLAST has four main advantages over existing alignment tools. First, CLAST was capable of identifying sequence similarities ~80.8 times faster than BLAST and 9.6 times faster than BLAT. Second, CLAST executes global alignment as the default (local alignment is also an option), enabling CLAST to assign reads to taxonomic and functional groups based on evolutionarily distant nucleotide sequences with high accuracy. Third, CLAST does not need a preprocessed sequence database like Burrows-Wheeler Transform-based tools, and this enables CLAST to incorporate large, frequently updated sequence databases. Fourth, CLAST requires <2 GB of main memory, making it possible to run CLAST on a standard desktop computer or server node. CLAST achieved very high speed (similar to the Burrows-Wheeler Transform-based Bowtie 2 for long reads) and sensitivity (equal to BLAST, BLAT, and FR-HIT) without the need for extensive database preprocessing or a specialized computing platform. 
Our results demonstrate that CLAST has the potential to be one of the most powerful and realistic approaches to analyze the massive amount of sequence data from next-generation sequencing technologies.
Detection of Plasmodium sp. in capybara.
dos Santos, Leonilda Correia; Curotto, Sandra Mara Rotter; de Moraes, Wanderlei; Cubas, Zalmir Silvino; Costa-Nascimento, Maria de Jesus; de Barros Filho, Ivan Roque; Biondo, Alexander Welker; Kirchgatter, Karin
2009-07-07
In the present study, we microscopically and molecularly surveyed blood samples from 11 captive capybaras (Hydrochaeris hydrochaeris) from the Sanctuary Zoo for Plasmodium sp. infection. One animal was positive on blood smear by light microscopy. Polymerase chain reaction was carried out using a nested genus-specific protocol, which uses oligonucleotides from conserved sequences flanking a variable sequence region in the small subunit ribosomal RNA (ssrRNA) of all Plasmodium organisms. This revealed three positive animals. Products from two samples were purified and sequenced. The results showed less than 1% divergence between the two capybara sequences. When compared with GenBank sequences, a 55% similarity was obtained to Toxoplasma gondii and a higher similarity (73-77.2%) was found to ssrRNAs from Plasmodium species that infect reptiles, birds, rodents, and humans. The most similar Plasmodium sequence was from Plasmodium mexicanum, which infects lizards of North America, where around 78% identity was found. This work is the first report of Plasmodium in capybaras, and due to the low similarity with other Plasmodium species, we suggest it is a new species, which could in the future be named "Plasmodium hydrochaeri".
NASA Astrophysics Data System (ADS)
Tibbetts, Clark; Lichanska, Agnieszka M.; Borsuk, Lisa A.; Weslowski, Brian; Morris, Leah M.; Lorence, Matthew C.; Schafer, Klaus O.; Campos, Joseph; Sene, Mohamadou; Myers, Christopher A.; Faix, Dennis; Blair, Patrick J.; Brown, Jason; Metzgar, David
2010-04-01
High-density resequencing microarrays support simultaneous detection and identification of multiple viral and bacterial pathogens. Because detection and identification using RPM is based upon multiple specimen-specific target pathogen gene sequences generated in the individual test, the test results enable both a differential diagnostic analysis and epidemiological tracking of detected pathogen strains and variants from one specimen to the next. The RPM assay enables detection and identification of pathogen sequences that share as little as 80% sequence similarity to prototype target gene sequences represented as detector tiles on the array. This capability enables the RPM to detect and identify previously unknown strains and variants of a detected pathogen, as in sentinel cases associated with an infectious disease outbreak. We illustrate this capability using assay results from testing influenza A virus vaccines configured with strains that were first defined years after the design of the RPM microarray. Results are also presented from RPM-Flu testing of three specimens independently confirmed to be positive for the 2009 Novel H1N1 outbreak strain of influenza virus.
Anomaly Detection in Large Sets of High-Dimensional Symbol Sequences
NASA Technical Reports Server (NTRS)
Budalakoti, Suratna; Srivastava, Ashok N.; Akella, Ram; Turkov, Eugene
2006-01-01
This paper addresses the problem of detecting and describing anomalies in large sets of high-dimensional symbol sequences. The approach taken uses unsupervised clustering of sequences using the normalized longest common subsequence (LCS) as a similarity measure, followed by detailed analysis of outliers to detect anomalies. As the LCS measure is expensive to compute, the first part of the paper discusses existing algorithms, such as the Hunt-Szymanski algorithm, that have low time-complexity. We then discuss why these algorithms often do not work well in practice and present a new hybrid algorithm for computing the LCS that, in our tests, outperforms the Hunt-Szymanski algorithm by a factor of five. The second part of the paper presents new algorithms for outlier analysis that provide comprehensible indicators as to why a particular sequence was deemed to be an outlier. The algorithms provide a coherent description to an analyst of the anomalies in the sequence, compared to more normal sequences. The algorithms we present are general and domain-independent, so we discuss applications in related areas such as anomaly detection.
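As a point of reference for the similarity measure named above, here is a minimal sketch of a normalized longest common subsequence. It uses the textbook quadratic dynamic program rather than the Hunt-Szymanski or hybrid algorithms the paper discusses, and the normalization by the geometric mean of the two lengths is an assumption, since the abstract does not give the formula.

```python
import math


def lcs_length(a, b):
    """Length of the longest common subsequence (classic O(len(a) * len(b)) DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs(a, b):
    """LCS length scaled to [0, 1] by the geometric mean of the sequence lengths."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / math.sqrt(len(a) * len(b))
```

Both functions accept any indexable sequences of comparable symbols (strings, lists of event codes), so a pairwise matrix of `normalized_lcs` values could feed an off-the-shelf clustering routine; the quadratic cost per pair is what motivates the faster algorithms described in the abstract.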
2010-01-01
Background: Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is available. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods: A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results: A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84, were detected for the first time in G. lucidum. Thirteen potential SSR-motif microsatellite loci were also identified. Conclusion: The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644
Marton, Szilvia; Ihász, Katalin; Lengyel, György; Farkas, Szilvia L; Dán, Ádám; Paulus, Petra; Bányai, Krisztián; Fehér, Enikő
2015-03-01
Circoviruses of pigs and birds are established pathogens; however, the exact role of other, recently described circoviruses and circovirus-like viruses remains to be elucidated. The aim of this study was the detection of circoviruses in neglected host species, including honey bees, exotic reptiles and free-living amoebae, by widely used broad-spectrum polymerase chain reaction (PCR) assays specific for the replication initiation protein coding gene of these viruses. The majority of sequences obtained from honey bees were highly similar to canine and porcine circoviruses, or were distantly related to dragonfly cycloviruses. Other rep sequences detected in some honey bees, reptiles and amoebae showed similarities to various rep sequences deposited in GenBank. Back-to-back PCR primers designed for the amplification of whole viral genomes failed to work, which suggested the existence of integrated rep-like elements in many samples. Rolling circle amplification and exonuclease treatment confirmed the absence of small circular DNA genomes in the specimens analysed. In the case of honey bees, Varroa mite DNA contamination might be a source of the identified endogenous rep-like elements. The reptile and amoebae rep-like sequences were nearly identical with each other and with sequences detected in chimpanzee feces, raising the possibility that detection of novel or unusual rep-like elements in some host species might originate from the microbial community of the host. Our results indicate that caution is needed when broad-spectrum rep gene specific polymerase chain reaction is chosen for laboratory diagnosis of circovirus infections.
NASA Astrophysics Data System (ADS)
Bergen, K.; Yoon, C. E.; OReilly, O. J.; Beroza, G. C.
2015-12-01
Recent improvements in computational efficiency for waveform correlation-based detections achieved by new methods such as Fingerprint and Similarity Thresholding (FAST) promise to allow large-scale blind search for similar waveforms in long-duration continuous seismic data. Waveform similarity search applied to datasets of months to years of continuous seismic data will identify significantly more events than traditional detection methods. With the anticipated increase in number of detections and associated increase in false positives, manual inspection of the detection results will become infeasible. This motivates the need for new approaches to process the output of similarity-based detection. We explore data mining techniques for improved detection post-processing. We approach this by considering similarity-detector output as a sparse similarity graph with candidate events as vertices and similarities as weighted edges. Image processing techniques are leveraged to define candidate events and combine results individually processed at multiple stations. Clustering and graph analysis methods are used to identify groups of similar waveforms and assign a confidence score to candidate detections. Anomaly detection and classification are applied to waveform data for additional false detection removal. A comparison of methods will be presented and their performance will be demonstrated on a suspected induced and non-induced earthquake sequence.
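The graph view of similarity-detector output can be illustrated with a small sketch: candidate events are vertices, pairwise similarities are weighted edges, and groups of similar waveforms are taken to be the connected components that remain after weak edges are dropped. The threshold, the use of connected components, and all names here are assumptions for illustration; the abstract does not specify which clustering or graph analysis methods were used.

```python
from collections import defaultdict


def cluster_detections(edges, min_similarity):
    """Group candidate events linked by edges with similarity >= min_similarity.

    edges: iterable of (event_a, event_b, similarity) tuples.
    Returns a list of sorted groups, largest group first. Events that have
    no edge above the threshold are not returned.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, similarity in edges:
        if similarity >= min_similarity:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = defaultdict(set)
    for event in list(parent):
        groups[find(event)].add(event)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))
```

Group size is one simple proxy for confidence under these assumptions: an event that matches many others above the threshold is less likely to be a false detection than an isolated pair.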
Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine
2011-03-10
Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de.
Bradshaw, Charles Richard; Surendranath, Vineeth; Henschel, Robert; Mueller, Matthias Stefan; Habermann, Bianca Hermine
2011-01-01
Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs) are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10), a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. 
The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in proteins based on weak sequence similarity. Our predictions open up new avenues for biological and medical studies. Genome-wide HMMerThread domains are available at http://vm1-hmmerthread.age.mpg.de. PMID:21423752
Entropic fluctuations in DNA sequences
NASA Astrophysics Data System (ADS)
Thanos, Dimitrios; Li, Wentian; Provata, Astero
2018-03-01
The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that despite this reduction of information, LSE allows the extraction of meaningful information related to the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
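The block-wise entropy described in this abstract can be illustrated with a minimal sketch. It assumes that the entropy of a block is computed from single-base frequencies and that blocks do not overlap; the abstract does not state either choice, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def block_entropy(block):
    """Shannon entropy (in bits) of the base frequencies in one block."""
    counts = Counter(block)
    n = len(block)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def local_shannon_entropy(seq, block_size):
    """Entropy of consecutive, non-overlapping blocks along a sequence.

    Assumption: non-overlapping blocks; a trailing partial block is dropped.
    """
    return [
        block_entropy(seq[i:i + block_size])
        for i in range(0, len(seq) - block_size + 1, block_size)
    ]


if __name__ == "__main__":
    # A homopolymer block followed by a block containing all four bases.
    print(local_shannon_entropy("AAAAACGT", 4))
```

Under these assumptions, a long tandem repeat yields nearly identical values in every block, which matches the low-fluctuation signal the abstract associates with centromeric regions.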
A new method to improve network topological similarity search: applied to fold recognition
Lhota, John; Hauptman, Ruth; Hart, Thomas; Ng, Clara; Xie, Lei
2015-01-01
Motivation: Similarity search is the foundation of bioinformatics. It plays a key role in establishing structural, functional and evolutionary relationships between biological sequences. Although the power of the similarity search has increased steadily in recent years, a high percentage of sequences remain uncharacterized in the protein universe. Thus, new similarity search strategies are needed to efficiently and reliably infer the structure and function of new sequences. The existing paradigm for studying protein sequence, structure, function and evolution has been established based on the assumption that the protein universe is discrete and hierarchical. Cumulative evidence suggests that the protein universe is continuous. As a result, conventional sequence homology search methods may not be able to detect novel structural, functional and evolutionary relationships between proteins from weak and noisy sequence signals. To overcome the limitations in existing similarity search methods, we propose a new algorithmic framework—Enrichment of Network Topological Similarity (ENTS)—to improve the performance of large scale similarity searches in bioinformatics. Results: We apply ENTS to a challenging unsolved problem: protein fold recognition. Our rigorous benchmark studies demonstrate that ENTS considerably outperforms state-of-the-art methods. As the concept of ENTS can be applied to any similarity metric, it may provide a general framework for similarity search on any set of biological entities, given their representation as a network. Availability and implementation: Source code freely available upon request. Contact: lxie@iscb.org PMID:25717198
Burgess, Diane; Freeling, Michael
2014-01-01
In vertebrates, conserved noncoding elements (CNEs) are functionally constrained sequences that can show striking conservation over >400 million years of evolutionary distance and frequently are located megabases away from target developmental genes. Conserved noncoding sequences (CNSs) in plants are much shorter, and it has been difficult to detect conservation among distantly related genomes. In this article, we show not only that CNS sequences can be detected throughout the eudicot clade of flowering plants, but also that a subset of 37 CNSs can be found in all flowering plants (diverging ∼170 million years ago). These CNSs are functionally similar to vertebrate CNEs, being highly associated with transcription factor and development genes and enriched in transcription factor binding sites. Some of the most highly conserved sequences occur in genes encoding RNA binding proteins, particularly the RNA splicing–associated SR genes. Differences in sequence conservation between plants and animals are likely to reflect differences in the biology of the organisms, with plants being much more able to tolerate genomic deletions and whole-genome duplication events due, in part, to their far greater fecundity compared with vertebrates. PMID:24681619
Buschmann, Tilo; Zhang, Rong; Brash, Douglas E; Bystrykh, Leonid V
2014-08-07
DNA barcodes are short unique sequences used to label DNA or RNA-derived samples in multiplexed deep sequencing experiments. During the demultiplexing step, barcodes must be detected and their position identified. In some cases (e.g., with PacBio SMRT), the position of the barcode and DNA context is not well defined. Many reads start inside the genomic insert so that adjacent primers might be missed. The matter is further complicated by coincidental similarities between barcode sequences and reference DNA. Therefore, a robust strategy is required in order to detect barcoded reads and avoid a large number of false positives or negatives. For mass inference problems such as this one, false discovery rate (FDR) methods are powerful and balanced solutions. Since existing FDR methods cannot be applied to this particular problem, we present an adapted FDR method that is suitable for the detection of barcoded reads as well as suggest possible improvements. In our analysis, barcode sequences showed high rates of coincidental similarities with the Mus musculus reference DNA. This problem became more acute when the length of the barcode sequence decreased and the number of barcodes in the set increased. The method presented in this paper controls the tail area-based false discovery rate to distinguish between barcoded and unbarcoded reads. This method helps to establish the highest acceptable minimal distance between reads and barcode sequences. In a proof of concept experiment we correctly detected barcodes in 83% of the reads with a precision of 89%. Sensitivity improved to 99% at 99% precision when the adjacent primer sequence was incorporated in the analysis. The analysis was further improved using a paired end strategy. Following an analysis of the data for sequence variants induced in the Atp1a1 gene of C57BL/6 murine melanocytes by ultraviolet light and conferring resistance to ouabain, we found no evidence of cross-contamination of DNA material between samples.
Our method offers a proper quantitative treatment of the problem of detecting barcoded reads in a noisy sequencing environment. It is based on the false discovery rate statistics that allows a proper trade-off between sensitivity and precision to be chosen.
Yilmaz, Huseyin; Altan, Eda; Cizmecigil, Utku Y; Gurel, Aydin; Ozturk, Gulay Yuzbasioglu; Bamac, Ozge Erdogan; Aydin, Ozge; Britton, Paul; Monne, Isabella; Cetinkaya, Burhan; Morgan, Kenton L; Faburay, Bonto; Richt, Juergen A; Turan, Nuri
2016-09-01
The avian coronavirus infectious bronchitis virus (AvCoV-IBV) is recognized as an important global pathogen because new variants are a continuous threat to the poultry industry worldwide. This study investigates the genetic origin and diversity of AvCoV-IBV by analysis of the S1 sequence derived from 49 broiler flocks and 14 layer flocks in different regions of Turkey. AvCoV-IBV RNA was detected in 41 (83.6%) broiler flocks and nine (64.2%) of the layer flocks by TaqMan real-time RT-PCR. In addition, AvCoV-IBV RNA was detected in the tracheas 27/30 (90%), lungs 31/49 (62.2%), caecal tonsils 7/22 (31.8%), and kidneys 4/49 (8.1%) of broiler flocks examined. Pathologic lesions, hemorrhages, and mononuclear infiltrations were predominantly observed in tracheas and to a lesser extent in the lungs and a few in kidneys. A phylogenetic tree based on partial S1 sequences of the detected AvCoV-IBVs (including isolates) revealed that 1) viruses detected in five broiler flocks were similar to the IBV vaccines Ma5, H120, M41; 2) viruses detected in 24 broiler flocks were similar to those previously reported from Turkey and to Israel variant-2 strains; 3) viruses detected in seven layer flocks were different from those found in any of the broiler flocks but similar to viruses previously reported from Iran, India, and China (similar to Israel variant-1 and 4/91 serotypes); and 4) that the AvCoV-IBV, Israeli variant-2 strain, found to be circulating in Turkey appears to be undergoing molecular evolution. In conclusion, genetically different AvCoV-IBV strains, including vaccine-like strains, based on their partial S1 sequence, are circulating in broiler and layer chicken flocks in Turkey and the Israeli variant-2 strain is undergoing evolution.
Molecular Evidence of Chlamydia-Like Organisms in the Feces of Myotis daubentonii Bats.
Hokynar, K; Vesterinen, E J; Lilley, T M; Pulliainen, A T; Korhonen, S J; Paavonen, J; Puolakkainen, M
2017-01-15
Chlamydia-like organisms (CLOs) are recently identified members of the Chlamydiales order. CLOs share intracellular lifestyles and biphasic developmental cycles, and they have been detected in environmental samples as well as in various hosts such as amoebae and arthropods. In this study, we screened bat feces for the presence of CLOs by molecular analysis. Using pan-Chlamydiales PCR targeting the 16S rRNA gene, Chlamydiales DNA was detected in 54% of the specimens. PCR amplification, sequencing, and phylogenetic analysis of the 16S rRNA and 23S rRNA genes were used to classify positive specimens and infer their phylogenetic relationships. Most sequences matched best with Rhabdochlamydia species or uncultured Chlamydia sequences identified in ticks. Another set of sequences matched best with sequences of the Chlamydia genus or uncultured Chlamydiales from snakes. To gain evidence of whether CLOs in bat feces are merely diet borne, we analyzed insects trapped from the same location where the bats foraged. Interestingly, the CLO sequences resembling Rhabdochlamydia spp. were detected in insect material as well, but the other set of CLO sequences was not, suggesting that this set might not originate from prey. Thus, bats represent another potential host for Chlamydiales and could harbor novel, previously unidentified members of this order. Several pathogenic viruses are known to colonize bats, and recent analyses indicate that bats are also reservoir hosts for bacterial genera. Chlamydia-like organisms (CLOs) have been detected in several animal species. CLOs have high 16S rRNA sequence similarity to Chlamydiaceae and exhibit similar intracellular lifestyles and biphasic developmental cycles. Our study describes the frequent occurrence of CLO DNA in bat feces, suggesting an expanding host species spectrum for the Chlamydiales. As bats can acquire various infectious agents through their diet, prey insects were also studied.
We identified CLO sequences in bats that matched best with sequences in prey insects but also CLO sequences not detected in prey insects. This suggests that a portion of CLO DNA present in bat feces is not prey borne. Furthermore, some sequences from bat droppings not originating from their diet might well represent novel, previously unidentified members of the Chlamydiales order. Copyright © 2016 American Society for Microbiology.
REPPER—repeats and their periodicities in fibrous proteins
Gruber, Markus; Söding, Johannes; Lupas, Andrei N.
2005-01-01
REPPER (REPeats and their PERiodicities) is an integrated server that detects and analyzes regions with short gapless repeats in protein sequences or alignments. It finds periodicities by Fourier Transform (FTwin) and internal similarity analysis (REPwin). FTwin assigns numerical values to amino acids that reflect certain properties, for instance hydrophobicity, and gives information on corresponding periodicities. REPwin uses self-alignments and displays repeats that reveal significant internal similarities. Both programs use a sliding window to ensure that different periodic regions within the same protein are detected independently. FTwin and REPwin are complemented by secondary structure prediction (PSIPRED) and coiled coil prediction (COILS), making the server a versatile analysis tool for sequences of fibrous proteins. REPPER is available at . PMID:15980460
Mikkelsen, Martin; Frank-Hansen, Rune; Hansen, Anders J; Morling, Niels
2014-09-01
Results of sequencing of whole mitochondrial genome, HV1 and HV2 DNA with the second generation system (SGS) Roche 454 GS Junior were compared with results of Sanger sequencing and SNP typing with SNaPshot single base extension detected with MALDI-TOF and capillary electrophoresis. We investigated the performance of the software analysis of the data, reproducibility, ability to sequence homopolymeric regions, detection of mixtures and heteroplasmy as well as the implications of the depth of coverage. We found full reproducibility between samples sequenced twice with SGS. We found close to full concordance between the mtDNA sequences of 26 samples obtained with (1) the 454 SGS method using a depth of coverage above 100 and (2) Sanger sequencing and SNP typing. The discrepancies were primarily observed in homopolymeric regions. The 454 SGS method was able to sequence 95% of the reads correctly in homopolymers up to 4 bases, and up to 6 bases could be sequenced with similar success if the results were carefully, visually inspected. The 454 technology was able to detect mixtures or heteroplasmy of approximately 10%. We detected previously unreported heteroplasmy in the GM9947A component of the NIST human mitochondrial DNA SRM-2392 standard reference material. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
An unbiased study of debris discs around A-type stars with Herschel
NASA Astrophysics Data System (ADS)
Thureau, N. D.; Greaves, J. S.; Matthews, B. C.; Kennedy, G.; Phillips, N.; Booth, M.; Duchêne, G.; Horner, J.; Rodriguez, D. R.; Sibthorpe, B.; Wyatt, M. C.
2014-12-01
The Herschel DEBRIS (Disc Emission via a Bias-free Reconnaissance in the Infrared/Submillimetre) survey brings us a unique perspective on the study of debris discs around main-sequence A-type stars. Bias-free by design, the survey offers a remarkable data set with which to investigate the cold disc properties. The statistical analysis of the 100 and 160 μm data for 86 main-sequence A stars yields a lower than previously found debris disc rate. Considering better than 3σ excess sources, we find a detection rate ≥24 ± 5 per cent at 100 μm which is similar to the debris disc rate around main-sequence F/G/K-spectral type stars. While the 100 and 160 μm excesses slowly decline with time, debris discs with large excesses are found around some of the oldest A stars in our sample, evidence that the debris phenomenon can survive throughout the length of the main sequence (∼1 Gyr). Debris discs are predominantly detected around the youngest and hottest stars in our sample. Stellar properties such as metallicity are found to have no effect on the debris disc incidence. Debris discs are found around A stars in single systems and multiple systems at similar rates. While tight and wide binaries (<1 and >100 au, respectively) host debris discs with a similar frequency and global properties, no intermediate separation debris systems were detected in our sample.
Evaluation of nearest-neighbor methods for detection of chimeric small-subunit rRNA sequences
NASA Technical Reports Server (NTRS)
Robison-Cox, J. F.; Bateson, M. M.; Ward, D. M.
1995-01-01
Detection of chimeric artifacts formed when PCR is used to retrieve naturally occurring small-subunit (SSU) rRNA sequences may rely on demonstrating that different sequence domains have different phylogenetic affiliations. We evaluated the CHECK_CHIMERA method of the Ribosomal Database Project and another method which we developed, both based on determining nearest neighbors of different sequence domains, for their ability to discern artificially generated SSU rRNA chimeras from authentic Ribosomal Database Project sequences. The reliability of both methods decreases when the parental sequences which contribute to chimera formation are more than 82 to 84% similar. Detection is also complicated by the occurrence of authentic SSU rRNA sequences that behave like chimeras. We developed a naive statistical test based on CHECK_CHIMERA output and used it to evaluate previously reported SSU rRNA chimeras. Application of this test also suggests that chimeras might be formed by retrieving SSU rRNAs as cDNA. The amount of uncertainty associated with nearest-neighbor analyses indicates that such tests alone are insufficient and that better methods are needed.
Silveira, Júlia A G; Rabelo, Elida M L; Lacerda, Ana C R; Borges, Paulo A L; Tomás, Walfrido M; Pellegrin, Aiesca O; Tomich, Renata G P; Ribeiro, Múcio F B
2013-06-01
Hemoparasites were surveyed in 60 free-living pampas deer Ozotoceros bezoarticus from the central area of the Pantanal, known as Nhecolândia, State of Mato Grosso do Sul, Brazil, through the analysis of nested PCR assays and nucleotide sequencing. Blood samples were tested for Babesia/Theileria, Anaplasma spp., and Trypanosoma spp. using nPCR assays and sequencing of the 18S rRNA, msp4, ITS, and cathepsin L genes. The identity of each sequence was confirmed by comparison with sequences from GenBank using BLAST software. Forty-six (77%) pampas deer were positive for at least one hemoparasite, according to PCR assays. Co-infection occurred in 13 (22%) animals. Based on the sequencing results, 29 (48%) tested positive for A. marginale. Babesia/Theileria were detected in 23 (38%) samples, and according to the sequencing results 52% (12/23) of the samples were similar to T. cervi, 13% (3/23) were similar to Babesia bovis, and 9% (2/23) were similar to B. bigemina. No samples were amplified with the primers for T. vivax, while 11 (18%) were amplified with the ITS primers for T. evansi. The results showed pampas deer to be co-infected with several hemoparasites, including species that may cause serious disease in cattle. Pampas deer is an endangered species in Brazil, and the consequences of these infections to their health are poorly understood. Copyright © 2013 Elsevier GmbH. All rights reserved.
Rapid comparison of protein binding site surfaces with Property Encoded Shape Distributions (PESD)
Das, Sourav; Kokardekar, Arshad
2009-01-01
Patterns in shape and property distributions on the surface of binding sites are often conserved across functional proteins without significant conservation of the underlying amino-acid residues. To explore similarities of these sites from the viewpoint of a ligand, a sequence and fold-independent method was created to rapidly and accurately compare binding sites of proteins represented by property-mapped triangulated Gauss-Connolly surfaces. Within this paradigm, signatures for each binding site surface are produced by calculating their property-encoded shape distributions (PESD), a measure of the probability that a particular property will be at a specific distance to another on the molecular surface. Similarity between the signatures can then be treated as a measure of similarity between binding sites. As postulated, the PESD method rapidly detected high levels of similarity in binding site surface characteristics even in cases where there was very low similarity at the sequence level. In a screening experiment involving each member of the PDBBind 2005 dataset as a query against the rest of the set, PESD was able to retrieve a binding site with identical E.C. (Enzyme Commission) numbers as the top match in 79.5% of cases. The ability of the method to detect similarity in binding sites with low sequence conservation was compared with state-of-the-art binding site comparison methods. PMID:19919089
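The idea of a property-encoded distance signature can be sketched as follows. This is only an illustration under stated assumptions, not the published PESD procedure: surface points are given as hypothetical (coordinate, property label) pairs, straight-line distances stand in for whatever surface distances the method uses, raw counts stand in for probabilities, and signatures are compared with a plain L1 distance.

```python
from itertools import combinations
from math import dist


def shape_signature(points, bin_width=1.0, n_bins=10):
    """Histogram of pairwise distances, one histogram per property pair.

    points: list of ((x, y, z), property_label) tuples (hypothetical input).
    Distances beyond the last bin are counted in the last bin.
    """
    signature = {}
    for (p, prop_a), (q, prop_b) in combinations(points, 2):
        key = tuple(sorted((prop_a, prop_b)))
        bin_index = min(int(dist(p, q) / bin_width), n_bins - 1)
        signature.setdefault(key, [0] * n_bins)[bin_index] += 1
    return signature


def signature_distance(sig_a, sig_b):
    """L1 distance between two signatures; missing histograms count as zeros."""
    total = 0
    for key in set(sig_a) | set(sig_b):
        a = sig_a.get(key, [])
        b = sig_b.get(key, [])
        length = max(len(a), len(b))
        a = a + [0] * (length - len(a))
        b = b + [0] * (length - len(b))
        total += sum(abs(x - y) for x, y in zip(a, b))
    return total


if __name__ == "__main__":
    site = [((0, 0, 0), "hydrophobic"), ((1.5, 0, 0), "polar"), ((0, 3.2, 0), "polar")]
    print(shape_signature(site))
```

Because the signature depends only on distances and property labels, it is independent of residue order and fold, which is the feature the abstract emphasizes.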
How good are indirect tests at detecting recombination in human mtDNA?
White, Daniel James; Bryant, David; Gemmell, Neil John
2013-07-08
Empirical proof of human mitochondrial DNA (mtDNA) recombination in somatic tissues was obtained in 2004; however, a lack of irrefutable evidence exists for recombination in human mtDNA at the population level. Our inability to demonstrate convincingly a signal of recombination in population data sets of human mtDNA sequence may be due, in part, to the ineffectiveness of current indirect tests. Previously, we tested some well-established indirect tests of recombination (linkage disequilibrium vs. distance using D' and r(2), Homoplasy Test, Pairwise Homoplasy Index, Neighborhood Similarity Score, and Max χ(2)) on sequence data derived from the only empirically confirmed case of human mtDNA recombination thus far and demonstrated that some methods were unable to detect recombination. Here, we assess the performance of these six well-established tests and explore what characteristics specific to human mtDNA sequence may affect their efficacy by simulating sequence under various parameters with levels of recombination (ρ) that vary around an empirically derived estimate for human mtDNA (population parameter ρ = 5.492). No test performed infallibly under any of our scenarios, and error rates varied across tests, whereas detection rates increased substantially with ρ values > 5.492. Under a model of evolution that incorporates parameters specific to human mtDNA, including rate heterogeneity, population expansion, and ρ = 5.492, successful detection rates are limited to a range of 7-70% across tests with an acceptable level of false-positive results: the neighborhood similarity score incompatibility test performed best overall under these parameters. Population growth seems to have the greatest impact on recombination detection probabilities across all models tested, likely due to its impact on sequence diversity. The implications of our findings on our current understanding of mtDNA recombination in humans are discussed.
How Good Are Indirect Tests at Detecting Recombination in Human mtDNA?
White, Daniel James; Bryant, David; Gemmell, Neil John
2013-01-01
Empirical proof of human mitochondrial DNA (mtDNA) recombination in somatic tissues was obtained in 2004; however, a lack of irrefutable evidence exists for recombination in human mtDNA at the population level. Our inability to demonstrate convincingly a signal of recombination in population data sets of human mtDNA sequence may be due, in part, to the ineffectiveness of current indirect tests. Previously, we tested some well-established indirect tests of recombination (linkage disequilibrium vs. distance using D′ and r2, Homoplasy Test, Pairwise Homoplasy Index, Neighborhood Similarity Score, and Max χ2) on sequence data derived from the only empirically confirmed case of human mtDNA recombination thus far and demonstrated that some methods were unable to detect recombination. Here, we assess the performance of these six well-established tests and explore what characteristics specific to human mtDNA sequence may affect their efficacy by simulating sequence under various parameters with levels of recombination (ρ) that vary around an empirically derived estimate for human mtDNA (population parameter ρ = 5.492). No test performed infallibly under any of our scenarios, and error rates varied across tests, whereas detection rates increased substantially with ρ values > 5.492. Under a model of evolution that incorporates parameters specific to human mtDNA, including rate heterogeneity, population expansion, and ρ = 5.492, successful detection rates are limited to a range of 7−70% across tests with an acceptable level of false-positive results: the neighborhood similarity score incompatibility test performed best overall under these parameters. Population growth seems to have the greatest impact on recombination detection probabilities across all models tested, likely due to its impact on sequence diversity. The implications of our findings on our current understanding of mtDNA recombination in humans are discussed. PMID:23665874
Analysis of xylem formation in pine by cDNA sequencing
NASA Technical Reports Server (NTRS)
Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.;
1998-01-01
Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.
Okamoto, Nobuhiko; Nakashima, Mitsuko; Tsurusaki, Yoshinori; Miyake, Noriko; Saitsu, Hirotomo; Matsumoto, Naomichi
2013-01-01
Next-generation sequencing (NGS) combined with enrichment of target genes enables highly efficient and low-cost sequencing of multiple genes for genetic diseases. The aim of this study was to validate the accuracy and sensitivity of our method for comprehensive mutation detection in autism spectrum disorder (ASD). We assessed the performance of the bench-top Ion Torrent PGM and Illumina MiSeq platforms as optimized solutions for mutation detection, using microdroplet PCR-based enrichment of 62 ASD associated genes. Ten patients with known mutations were sequenced using NGS to validate the sensitivity of our method. The overall read quality was better with MiSeq, largely because of the increased indel-related error associated with PGM. The sensitivity of SNV detection was similar between the two platforms, suggesting they are both suitable for SNV detection in the human genome. Next, we used these methods to analyze 28 patients with ASD, and identified 22 novel variants in genes associated with ASD, with one mutation detected by MiSeq only. Thus, our results support the combination of target gene enrichment and NGS as a valuable molecular method for investigating rare variants in ASD. PMID:24066114
Mining for class-specific motifs in protein sequence classification
2013-01-01
Background In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function, initially, harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weightage to the n-grams that appear in fewer classes and vice-versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. 
Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is very generic; it can thus be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets have a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes with a potential application to detecting proteome-specific motifs of different organisms. PMID:23496846
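The harvesting and dampening steps described in this abstract can be sketched roughly as below. The abstract does not give the dampening formula, so the IDF-like weight used here is an assumption, the BLOSUM62-based merging of similar n-grams is omitted, and all function names are hypothetical.

```python
from collections import Counter
from math import log


def harvest_ngrams(sequences, n_min=4, n_max=8):
    """Count every n-gram of length n_min..n_max in a list of sequences."""
    counts = Counter()
    for seq in sequences:
        for n in range(n_min, n_max + 1):
            for i in range(len(seq) - n + 1):
                counts[seq[i:i + n]] += 1
    return counts


def discriminative_ngrams(classes, threshold, n_min=4, n_max=8):
    """Return, per class, the n-grams whose dampened frequency reaches threshold.

    classes: dict mapping class name -> list of protein sequences.
    Assumed dampening: log(1 + num_classes / classes_containing_ngram),
    which favors n-grams that occur in fewer classes.
    """
    per_class = {
        name: harvest_ngrams(seqs, n_min, n_max) for name, seqs in classes.items()
    }
    num_classes = len(per_class)
    class_presence = Counter()
    for counts in per_class.values():
        class_presence.update(counts.keys())  # one vote per class per n-gram

    selected = {}
    for name, counts in per_class.items():
        scored = {}
        for gram, count in counts.items():
            score = count * log(1 + num_classes / class_presence[gram])
            if score >= threshold:
                scored[gram] = score
        selected[name] = scored
    return selected


if __name__ == "__main__":
    toy = {"nuclear": ["KRKRKR", "KRKRAA"], "other": ["AAAAGG"]}
    print(discriminative_ngrams(toy, threshold=3.0, n_min=4, n_max=4))
```

In this toy input only the 4-gram KRKR is frequent enough in one class to pass the threshold; mapping such n-grams back onto the sequences would be the next step toward the contiguous motifs the abstract describes.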
Cankar, Katarina; Chauvensy-Ancel, Valérie; Fortabat, Marie-Noelle; Gruden, Kristina; Kobilinsky, André; Zel, Jana; Bertheau, Yves
2008-05-15
Detection of nonauthorized genetically modified organisms (GMOs) has always presented an analytical challenge because the complete sequence data needed to detect them are generally unavailable, although sequence similarity to known GMOs can be expected. A new approach, differential quantitative polymerase chain reaction (PCR), for detection of nonauthorized GMOs is presented here. This method is based on the presence of several common elements (e.g., promoter, genes of interest) in different GMOs. A statistical model was developed to study the difference between the number of molecules of such a common sequence and the number of molecules identifying the approved GMO (as determined by border-fragment-based PCR) and the donor organism of the common sequence. When this difference differs statistically from zero, the presence of a nonauthorized GMO can be inferred. The interest and scope of such an approach were tested on a case study of different proportions of genetically modified maize events, with the P35S promoter as the Cauliflower Mosaic Virus common sequence. The presence of a nonauthorized GMO was successfully detected in the mixtures analyzed and in the presence of Cauliflower Mosaic Virus (donor organism of the P35S promoter). This method could be easily transposed to other common GMO sequences and other species and is applicable to other detection areas such as microbiology.
A Cyber-Attack Detection Model Based on Multivariate Analyses
NASA Astrophysics Data System (ADS)
Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi
In the present paper, we propose a novel cyber-attack detection model that applies two multivariate-analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and cluster analysis. We quantify the observed qualitative audit event sequences via quantification method IV, and collect similar audit event sequences into the same groups by cluster analysis. Simulation experiments show that our model can improve cyber-attack detection accuracy in some realistic cases where normal and attack activities are intermingled.
Virus Identification in Unknown Tropical Febrile Illness Cases Using Deep Sequencing
Balmaseda, Angel; Harris, Eva; DeRisi, Joseph L.
2012-01-01
Dengue virus is an emerging infectious agent that infects an estimated 50–100 million people annually worldwide, yet current diagnostic practices cannot detect an etiologic pathogen in ∼40% of dengue-like illnesses. Metagenomic approaches to pathogen detection, such as viral microarrays and deep sequencing, are promising tools to address emerging and non-diagnosable disease challenges. In this study, we used the Virochip microarray and deep sequencing to characterize the spectrum of viruses present in human sera from 123 Nicaraguan patients presenting with dengue-like symptoms but testing negative for dengue virus. We utilized a barcoding strategy to simultaneously deep sequence multiple serum specimens, generating on average over 1 million reads per sample. We then implemented a stepwise bioinformatic filtering pipeline to remove the majority of human and low-quality sequences to improve the speed and accuracy of subsequent unbiased database searches. By deep sequencing, we were able to detect virus sequence in 37% (45/123) of previously negative cases. These included 13 cases with Human Herpesvirus 6 sequences. Other samples contained sequences with similarity to sequences from viruses in the Herpesviridae, Flaviviridae, Circoviridae, Anelloviridae, Asfarviridae, and Parvoviridae families. In some cases, the putative viral sequences were virtually identical to known viruses, and in others they diverged, suggesting that they may derive from novel viruses. These results demonstrate the utility of unbiased metagenomic approaches in the detection of known and divergent viruses in the study of tropical febrile illness. PMID:22347512
Sequence information gain based motif analysis.
Maynou, Joan; Pairó, Erola; Marco, Santiago; Perera, Alexandre
2015-11-09
The detection of regulatory regions in candidate sequences is essential for the understanding of the regulation of a particular gene and the mechanisms involved. This paper proposes a novel methodology based on information theoretic metrics for finding regulatory sequences in promoter regions. This methodology (SIGMA) has been tested on genomic sequence data for Homo sapiens and Mus musculus. SIGMA has been compared with different publicly available alternatives for motif detection, such as MEME/MAST, Biostrings (Bioconductor package), MotifRegressor, and previous work such as Qresiduals projections or information-theoretic detectors. Comparative results, in the form of Receiver Operating Characteristic curves, show that, in 70% of the studied Transcription Factor Binding Sites, the SIGMA detector performs better and behaves more robustly than the methods compared, while having a similar computational time. The performance of SIGMA can be explained by its parametric simplicity in the modelling of the non-linear co-variability in the binding motif positions. Sequence Information Gain based Motif Analysis is a generalisation of a non-linear model of the cis-regulatory sequences detection based on Information Theory. This generalisation allows us to detect transcription factor binding sites with maximum performance disregarding the covariability observed in the positions of the training set of sequences. SIGMA is freely available to the public at http://b2slab.upc.edu.
Wang, Chun Guo; Chen, Xiao Qiang; Li, Hui; Zhao, Qian Cheng; Sun, De Ling; Song, Wen Qin
2008-02-01
Analysis of ISSR (Inter-Simple Sequence Repeat) and DDRT-PCR (Differential Display Reverse Transcriptase Polymerase Chain Reaction) was performed between the cytoplasmic male sterility cauliflower line ogura-A and its corresponding maintainer line ogura-B. In total, 306 detectable bands were obtained by ISSR using thirty oligonucleotide primers. Commonly, six to twelve bands were produced per primer. Among all these primers, only the amplification of primer ISSR3 was polymorphic: an 1100 bp specific band, named ISSR3(1100), was detected only in the maintainer line. Analysis of this sequence indicated that ISSR3(1100) was highly homologous with the corresponding sequences of the mitochondrial genomes of Brassica napus and Arabidopsis thaliana, which suggested that ISSR3(1100) may derive from the mitochondrial genome of cauliflower. To carry out DDRT-PCR analysis, three anchor primers and fifteen random primers were combined. In total, 1122 bands from 1000 bp to 50 bp were detected. However, only four bands, named ogura-A205, ogura-A383, ogura-B307 and ogura-B352, were confirmed to be differentially displayed between the two lines. This result was further confirmed by reverse Northern dot blotting analysis. Among these four bands, ogura-A205 and ogura-A383 were expressed only in the cytoplasmic male sterility line, while ogura-B307 and ogura-B352 were detected only in the maintainer line. Analysis of these sequences indicated that this was the first time these four sequences had been reported in cauliflower. Interestingly, ogura-A205 and ogura-B307 did not exhibit any similarity to reported sequences in other species; more investigation is required to obtain further information. ogura-A383 and ogura-B352 were also new sequences; they showed high similarity to the corresponding chloroplast sequences of Arabidopsis thaliana and Brassica rapa subsp. pekinensis. We therefore speculate that these two sequences may derive from the chloroplast genome. 
All these results obtained in this study offer new and significant information to investigate the molecular mechanism of cytoplasmic male sterility and fertile maintenance in cauliflower.
RAPTR-SV: a hybrid method for the detection of structural variants
USDA-ARS's Scientific Manuscript database
Motivation: Identification of Structural Variants (SV) in sequence data results in a large number of false positive calls using existing software, which overburdens subsequent validation. Results: Simulations using RAPTR-SV and another software package that uses a similar algorithm for SV detection...
The Swiss-Army-Knife Approach to the Nearly Automatic Analysis for Microearthquake Sequences.
NASA Astrophysics Data System (ADS)
Kraft, T.; Simon, V.; Tormann, T.; Diehl, T.; Herrmann, M.
2017-12-01
Many Swiss earthquake sequences have been studied using relative location techniques, which often made it possible to constrain the active fault planes and shed light on the tectonic processes that drove the seismicity. Yet, in the majority of cases the number of located earthquakes was too small to infer the details of the space-time evolution of the sequences, or their statistical properties. Therefore, it has mostly been impossible to resolve clear patterns in the seismicity of individual sequences, which are needed to improve our understanding of the mechanisms behind them. Here we present a nearly automatic workflow that combines well-established seismological analysis techniques and significantly improves the completeness of detected and located earthquakes of a sequence. We start from the manually timed routine catalog of the Swiss Seismological Service (SED), which contains the larger events of a sequence. From these well-analyzed earthquakes we dynamically assemble a template set and perform a matched filter analysis on the station with the best SNR for the sequence and a recording history of at least 10-15 years, our typical analysis period. This usually allows us to detect events several orders of magnitude below the SED catalog detection threshold. The waveform similarity of the events is then further exploited to derive accurate and consistent magnitudes. The enhanced catalog is then analyzed statistically to derive high-resolution time-lines of the a- and b-value and consequently the occurrence probability of larger events. Many of the detected events are strong enough to be located using double-differences. No further manual interaction is needed; we simply time-shift the arrival-time pattern of the detecting template to the associated detection. Waveform similarity assures a good approximation of the expected arrival-times, which we use to calculate event-pair arrival-time differences by cross correlation. 
After an SNR and cycle-skipping quality check these are directly fed into hypoDD. Using this procedure we usually improve the number of well-relocated events by a factor of 2-5. We demonstrate the successful application of the workflow using the example of natural sequences in Switzerland and present first results of the advanced analysis that was possible with the enhanced catalogs.
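The matched filter step described in this abstract can be sketched as a normalized cross-correlation of one template waveform slid along a continuous trace. This is a minimal single-channel illustration, not the authors' workflow: the function names, the 0.9 threshold and the synthetic waveforms are all assumptions.

```python
import numpy as np

def normalized_cross_correlation(trace, template):
    """Slide the template along the trace; return one correlation value per offset."""
    trace = np.asarray(trace, dtype=float)
    template = np.asarray(template, dtype=float)
    m = len(template)
    t = template - template.mean()
    t_norm = np.sqrt(np.sum(t ** 2))
    cc = np.zeros(len(trace) - m + 1)
    for i in range(len(cc)):
        w = trace[i:i + m]
        w = w - w.mean()                      # remove the window mean
        denom = t_norm * np.sqrt(np.sum(w ** 2))
        cc[i] = np.dot(w, t) / denom if denom > 0 else 0.0
    return cc

def detect_events(trace, template, threshold=0.9):
    """Return sample offsets where the correlation reaches the (assumed) threshold."""
    cc = normalized_cross_correlation(trace, template)
    return [int(i) for i in np.flatnonzero(cc >= threshold)], cc

# Synthetic example: one large event and one 20x smaller copy buried in weak noise.
rng = np.random.default_rng(0)
template = np.hanning(50) * np.sin(2 * np.pi * np.arange(50) / 10)
trace = 0.001 * rng.standard_normal(1000)
trace[200:250] += template          # "catalog" event
trace[600:650] += 0.05 * template   # much smaller event with the same waveform
picks, cc = detect_events(trace, template)
```

Because the correlation is normalized, the small event scores almost as highly as the large one, which is why template matching can find events well below a catalog's amplitude-based detection threshold.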
Observations of a hydrofracture induced earthquake sequence in Harrison County Ohio in 2014
NASA Astrophysics Data System (ADS)
Friberg, P. A.; Brudzinski, M. R.; Currie, B. S.; Skoumal, R.
2015-12-01
On October 7, 2014, a Mw 1.9 earthquake was detected and located using the IRIS Earthscope Transportable Array stations in Ohio. The earthquake was located at a depth of ~3 km near the interface of the Paleozoic sedimentary rocks with the crystalline Precambrian basement. The location is within a few kilometers laterally of a 2013 earthquake sequence that was linked to hydraulic fracturing (HF) operations on three wells in Harrison County (Friberg et al, 2014). Using the Mw 1.9 event as a template in a multi-component cross correlation detector on station O53A, over 1000 matching detections were revealed between September 26 - October 17, 2014. These detections were all coincident in time with HF operations on 3 nearby (< 1km away) horizontally drilled wells (Tarbert 1H, 3H, and 5H) in the Utica formation (~2.4 km depth). The HF operations at two of the wells (1H and 5H) were coincident with the majority of the detected events. The final well (3H) stimulated in the series produced only about 20 identified events. In addition to the coincident timing with nearby HF operations, the time-clustered nature of the detections was similar to the 2013 sequence and two other Ohio HF induced sequences in 2014 (Skoumal et al, 2015). All of the other HF induced earthquake sequences in Ohio were related to operations in the Utica formation. Interestingly, this sequence of earthquakes did not follow a simple Gutenberg-Richter magnitude frequency relationship and was deficient in positive magnitude events; the magnitude 1.9 was preceded by a magnitude 1.7, and only half a dozen events slightly above magnitude 0.0. The majority of the events detected were below magnitude 0.0, with some as low as magnitude -2.0. While the majority of detections are too small to locate, the high similarity in waveform character indicates they are spatially close to the magnitude 1.9 event. 
Furthermore, gradual shifts in P phase arrival relative to S phases indicate events are moving away from the station with progressive HF stages. Given the orientation of the wells relative to the station, the migration of events away from the station with progressive stages is also supportive of this being an induced sequence.
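For context on the Gutenberg-Richter relationship mentioned above (log10 N = a - b M), a commonly used way to estimate the b-value from a list of magnitudes is the Aki maximum-likelihood estimator with Utsu's binning correction. The abstract does not say which estimator, if any, was applied, so the sketch below is an assumption, and the function name and toy magnitudes are hypothetical.

```python
import math

def estimate_b_value(magnitudes, completeness_mag, bin_width=0.0):
    """Maximum-likelihood b-value (Aki estimator, Utsu bin correction).

    b = log10(e) / (mean(M) - (Mc - bin_width / 2)), using only M >= Mc.
    """
    mags = [m for m in magnitudes if m >= completeness_mag]
    if len(mags) < 2:
        raise ValueError("need at least two events at or above the completeness magnitude")
    mean_mag = sum(mags) / len(mags)
    denom = mean_mag - (completeness_mag - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected completeness magnitude")
    return math.log10(math.e) / denom

# Toy, made-up magnitudes (not data from the study).
toy_mags = [-1.5, -1.2, -0.9, -0.8, -0.3, 0.1, 0.2, 1.7, 1.9]
b = estimate_b_value(toy_mags, completeness_mag=-1.5)
```

A catalog that is deficient in larger events relative to this relationship would show up as an excess of small magnitudes compared with the straight line implied by the fitted a- and b-values.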
Tsuchiya, Mariko; Amano, Kojiro; Abe, Masaya; Seki, Misato; Hase, Sumitaka; Sato, Kengo; Sakakibara, Yasubumi
2016-06-15
Deep sequencing of the transcripts of regulatory non-coding RNA generates footprints of post-transcriptional processes. After obtaining sequence reads, the short reads are mapped to a reference genome, and specific mapping patterns, called read mapping profiles, can be detected that are distinct from random non-functional degradation patterns. These patterns reflect the maturation processes that lead to the production of shorter RNA sequences. Recent next-generation sequencing studies have revealed not only the typical maturation process of miRNAs but also the various processing mechanisms of small RNAs derived from tRNAs and snoRNAs. We developed an algorithm termed SHARAKU to align two read mapping profiles of next-generation sequencing outputs for non-coding RNAs. In contrast with previous work, SHARAKU incorporates the primary and secondary sequence structures into an alignment of read mapping profiles to allow for the detection of common processing patterns. Using a benchmark simulated dataset, SHARAKU exhibited superior performance to previous methods for correctly clustering the read mapping profiles with respect to 5'-end processing and 3'-end processing from degradation patterns and in detecting similar processing patterns in deriving the shorter RNAs. Further, using experimental data of small RNA sequencing for the common marmoset brain, SHARAKU succeeded in identifying the significant clusters of read mapping profiles for similar processing patterns of small derived RNA families expressed in the brain. The source code of our program SHARAKU is available at http://www.dna.bio.keio.ac.jp/sharaku/, and the simulated dataset used in this work is available at the same link. Accession code: The sequence data from the whole RNA transcripts in the hippocampus of the left brain used in this work is available from the DNA DataBank of Japan (DDBJ) Sequence Read Archive (DRA) under the accession number DRA004502. 
yasu@bio.keio.ac.jp Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Abe, Niichiro; Maehara, Tomofumi
2013-06-01
The public health importance of Kudoa infection in fish remains unclear. Recently in Japan a Kudoa species, K. septempunctata, was newly implicated as a causative agent of unidentified food poisoning related to the consumption of raw olive flounder. Other marine fishery products are also suspected as causative raw foods of unidentified food poisoning. In this study, we detected kudoid parasites in sliced raw muscle tissues of a young Pacific bluefin and an adult yellowfin tuna. No cyst or pseudocyst was evident in muscles macroscopically, but pseudocysts were detected in both samples histologically. One substitution (within 1100 bp overlap) and ten substitutions (within 753 bp overlap) were found respectively between the partial sequences of 18S and 28S rDNAs from both isolates. Nucleotide sequence similarity searching of 18S and 28S rDNAs from both isolates showed the highest identity with those of K. neothunni from tuna. Based on the spore morphology, the mode of parasitism, and the nucleotide sequence similarity, these isolates from a Pacific bluefin and a yellowfin tuna were identified as K. neothunni. Phylogenetic analysis of the 28S rDNA sequence revealed that K. neothunni is classifiable into two genotypes: one from Pacific bluefin and the other from yellowfin tuna. Recently, an unidentified kudoid parasite morphologically and genetically similar to K. neothunni was detected in stocked tuna samples in unidentified food poisoning cases in Japan. The possibility exists that K. neothunni, especially from the Pacific bluefin tuna, causes food poisoning, as does K. septempunctata.
Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons
Haas, Brian J.; Gevers, Dirk; Earl, Ashlee M.; Feldgarden, Mike; Ward, Doyle V.; Giannoukos, Georgia; Ciulla, Dawn; Tabbaa, Diana; Highlander, Sarah K.; Sodergren, Erica; Methé, Barbara; DeSantis, Todd Z.; Petrosino, Joseph F.; Knight, Rob; Birren, Bruce W.
2011-01-01
Bacterial diversity among environmental samples is commonly assessed with PCR-amplified 16S rRNA gene (16S) sequences. Perceived diversity, however, can be influenced by sample preparation, primer selection, and formation of chimeric 16S amplification products. Chimeras are hybrid products between multiple parent sequences that can be falsely interpreted as novel organisms, thus inflating apparent diversity. We developed a new chimera detection tool called Chimera Slayer (CS). CS detects chimeras with greater sensitivity than previous methods, performs well on short sequences such as those produced by the 454 Life Sciences (Roche) Genome Sequencer, and can scale to large data sets. By benchmarking CS performance against sequences derived from a controlled DNA mixture of known organisms and a simulated chimera set, we provide insights into the factors that affect chimera formation such as sequence abundance, the extent of similarity between 16S genes, and PCR conditions. Chimeras were found to reproducibly form among independent amplifications and contributed to false perceptions of sample diversity and the false identification of novel taxa, with less-abundant species exhibiting chimera rates exceeding 70%. Shotgun metagenomic sequences of our mock community appear to be devoid of 16S chimeras, supporting a role for shotgun metagenomics in validating novel organisms discovered in targeted sequence surveys. PMID:21212162
Analysis of sequence repeats of proteins in the PDB.
Mary Rajathei, David; Selvaraj, Samuel
2013-12-01
Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain a similar overall fold. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
Singasa, Kanokwan; Songserm, Taweesak; Lertwatcharasarakul, Preeda; Arunvipas, Pipat
2017-10-01
Bovine coronavirus (BCoV) is involved mainly in enteric infections in cattle. This study reports the first molecular detection of BCoV in a diarrhea outbreak in dairy cows in the Central Region, Thailand. BCoV was detected in bloody diarrheic cattle feces by nested PCR. Agarose gel electrophoresis showed that three of the 25 diarrheic fecal samples yielded the desired 488-base-pair amplicons, and sequencing confirmed that they contained BCoV. Sequence alignment indicated that the nucleotide and amino acid sequences of the three TWD isolates from Thailand were highly homologous to each other (the amino acid at position 39 was proline in TWD1 and TWD3 but serine in TWD2) and closely related to the OK-0514-3 strain (a virulent respiratory strain; RBCoV). The amino acid sequence identities among TWD1, TWD2, TWD3, and the OK-0514-3 strain were 96.0 to 96.6%, with changes at T3I, H65N, D87G, H127Y, and Q136R. In addition, the phylogenetic tree of the hypervariable region of the S1 subunit of the BCoV spike glycoprotein gene, built using 54 sequences, was composed of three major clades and showed that, by evolutionary distance, TWD1, TWD2, and TWD3 grouped together and were most similar to the OK-0514-3 strain (98.2 to 98.5% similarity). Further study will develop an ELISA for serologic detection of winter dysentery disease.
Holm, Liisa; Laakso, Laura M
2016-07-08
The Dali server (http://ekhidna2.biocenter.helsinki.fi/dali) is a network service for comparing protein structures in 3D. In favourable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing sequences. The Dali server has been running in various places for over 20 years and is used routinely by crystallographers on newly solved structures. The latest update of the server provides enhanced analytics for the study of sequence and structure conservation. The server performs three types of structure comparisons: (i) Protein Data Bank (PDB) search compares one query structure against those in the PDB and returns a list of similar structures; (ii) pairwise comparison compares one query structure against a list of structures specified by the user; and (iii) all against all structure comparison returns a structural similarity matrix, a dendrogram and a multidimensional scaling projection of a set of structures specified by the user. Structural superimpositions are visualized using the Java-free WebGL viewer PV. The structural alignment view is enhanced by sequence similarity searches against Uniprot. The combined structure-sequence alignment information is compressed to a stack of aligned sequence logos. In the stack, each structure is structurally aligned to the query protein and represented by a sequence logo. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
First results of the SONS survey: submillimetre detections of debris discs
NASA Astrophysics Data System (ADS)
Panić, O.; Holland, W. S.; Wyatt, M. C.; Kennedy, G. M.; Matthews, B. C.; Lestrade, J. F.; Sibthorpe, B.; Greaves, J. S.; Marshall, J. P.; Phillips, N. M.; Tottle, J.
2013-10-01
New detections of debris discs at submillimetre wavelengths present highly valuable complementary information to prior observations of these sources at shorter wavelengths. Characterization of discs through spectral energy distribution modelling including the submillimetre fluxes is essential for our basic understanding of disc mass and temperature, and presents a starting point for further studies using millimetre interferometric observations. In the framework of the ongoing SCUBA-2 Observations of Nearby Stars, the instrument SCUBA-2 on the James Clerk Maxwell Telescope was used to provide measurements of 450 and 850 μm fluxes towards a large sample of nearby main-sequence stars with debris discs detected previously at shorter wavelengths. We present the first results from the ongoing survey, concerning 850 μm detections and 450 μm upper limits towards 10 stars, the majority of which are detected at submillimetre wavelengths for the first time. One, or possibly two, of these new detections is likely a background source. We fit the spectral energy distributions of the star+disc systems with a blackbody emission approach and derive characteristic disc temperatures. We use these temperatures to convert the observed fluxes to disc masses. We obtain a range of disc masses from 0.001 to 0.1 M⊕, values similar to the prior dust mass measurements towards debris discs. There is no evidence for evolution in dust mass with age on the main sequence, and indeed the upper envelope remains relatively flat at ≈0.5 M⊕ at all ages. The inferred disc masses are lower than those from disc detections around pre-main-sequence stars, which may indicate a depletion of solid mass. This may also be due to a change in disc opacity, though limited sensitivity means that it is not yet known what fraction of pre-main-sequence stars have discs with dust masses similar to debris disc levels. New, high-sensitivity detections are a path towards investigating the trends in dust mass evolution.
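The conversion from an observed submillimetre flux to a dust mass mentioned above is commonly done under the optically thin assumption, M = F_ν d² / (κ_ν B_ν(T)). The sketch below applies that standard relation; the opacity value of 1.7 cm² g⁻¹ at 850 μm and the example flux, distance and temperature are assumptions chosen for illustration and are not taken from the survey.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
K_B = 1.380649e-23      # Boltzmann constant, J/K
PARSEC = 3.0857e16      # metres per parsec
JANSKY = 1e-26          # W m^-2 Hz^-1 per Jy
M_EARTH = 5.972e24      # kg

def planck_nu(nu, temp_k):
    """Blackbody specific intensity B_nu(T) in W m^-2 Hz^-1 sr^-1."""
    return 2.0 * H * nu ** 3 / C ** 2 / math.expm1(H * nu / (K_B * temp_k))

def dust_mass_kg(flux_jy, distance_pc, temp_k, wavelength_m=850e-6, kappa_m2_per_kg=0.17):
    """Optically thin dust mass: M = F_nu * d^2 / (kappa_nu * B_nu(T)).

    kappa_m2_per_kg = 0.17 (i.e. 1.7 cm^2/g) is an ASSUMED opacity.
    """
    nu = C / wavelength_m
    flux_si = flux_jy * JANSKY
    distance_m = distance_pc * PARSEC
    return flux_si * distance_m ** 2 / (kappa_m2_per_kg * planck_nu(nu, temp_k))

# Hypothetical disc: 10 mJy at 850 micron, 10 pc away, 50 K characteristic temperature.
mass_earth = dust_mass_kg(0.010, 10.0, 50.0) / M_EARTH
```

With these illustrative inputs the result falls inside the 0.001 to 0.1 Earth-mass range quoted in the abstract; the mass scales linearly with flux and with the square of the distance, and inversely with the assumed opacity.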
Neuwald, Andrew F
2009-08-01
The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical. This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences. A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu. Supplementary data are available at Bioinformatics online.
Detection of a new bat gammaherpesvirus in the Philippines.
Watanabe, Shumpei; Ueda, Naoya; Iha, Koichiro; Masangkay, Joseph S; Fujii, Hikaru; Alviola, Phillip; Mizutani, Tetsuya; Maeda, Ken; Yamane, Daisuke; Walid, Azab; Kato, Kentaro; Kyuwa, Shigeru; Tohya, Yukinobu; Yoshikawa, Yasuhiro; Akashi, Hiroomi
2009-08-01
A new bat herpesvirus was detected in the spleen of an insectivorous bat (Hipposideros diadema, family Hipposideridae) collected on Panay Island, the Philippines. PCR analyses were performed using COnsensus-DEgenerate Hybrid Oligonucleotide Primers (CODEHOPs) targeting the herpesvirus DNA polymerase (DPOL) gene. Although we obtained PCR products with CODEHOPs, direct sequencing using the primers was not possible because of the high degree of degeneracy. Direct sequencing technology developed in our rapid determination system of viral RNA sequences (RDV) was applied in this study, and a partial DPOL nucleotide sequence was determined. In addition, a partial gB gene nucleotide sequence was also determined using the same strategy. We connected the partial gB and DPOL sequences with long-distance PCR, and a 3741-bp nucleotide fragment, including the 3' part of the gB gene and the 5' part of the DPOL gene, was finally determined. Phylogenetic analysis showed that the sequence was novel and most similar to those of the subfamily Gammaherpesvirinae.
Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S; Williams, Steven A
2016-03-01
The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.
Provencher, Cathy; LaPointe, Gisèle; Sirois, Stéphane; Van Calsteren, Marie-Rose; Roy, Denis
2003-01-01
A primer design strategy named CODEHOP (consensus-degenerate hybrid oligonucleotide primer) for amplification of distantly related sequences was used to detect the priming glycosyltransferase (GT) gene in strains of the Lactobacillus casei group. Each hybrid primer consisted of a short 3′ degenerate core based on four highly conserved amino acids and a longer 5′ consensus clamp region based on six sequences of the priming GT gene products from exopolysaccharide (EPS)-producing bacteria. The hybrid primers were used to detect the priming GT gene of 44 commercial isolates and reference strains of Lactobacillus rhamnosus, L. casei, Lactobacillus zeae, and Streptococcus thermophilus. The priming GT gene was detected in the genome of both non-EPS-producing (EPS−) and EPS-producing (EPS+) strains of L. rhamnosus. The sequences of the cloned PCR products were similar to those of the priming GT gene of various gram-negative and gram-positive EPS+ bacteria. Specific primers designed from the L. rhamnosus RW-9595M GT gene were used to sequence the end of the priming GT gene in selected EPS+ strains of L. rhamnosus. Phylogenetic analysis revealed that Lactobacillus spp. form a distinctive group apart from other lactic acid bacteria for which GT genes have been characterized to date. Moreover, the sequences show a divergence existing among strains of L. rhamnosus with respect to the terminal region of the priming GT gene. Thus, the PCR approach with consensus-degenerate hybrid primers designed with CODEHOP is a practical approach for the detection of similar genes containing conserved motifs in different bacterial genomes. PMID:12788729
Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.
Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai
2015-12-01
The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method on protein comparison is more efficient than commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant feature for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
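As a rough illustration of the transform named above, the Ramanujan sum c_q(n) is the sum of cos(2πkn/q) over all k in 1..q that are coprime to q, and a finite-length Ramanujan Fourier coefficient is commonly approximated by projecting a numeric signal onto c_q. The sketch below assumes that textbook definition; it is not the fast algorithm of the paper, and the mapping from amino acids to numbers is left to the caller.

```python
from math import cos, gcd, pi

def ramanujan_sum(q: int, n: int) -> int:
    """c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1."""
    total = sum(cos(2 * pi * k * n / q) for k in range(1, q + 1) if gcd(k, q) == 1)
    return round(total)  # Ramanujan sums are always integers

def euler_phi(q: int) -> int:
    """Euler's totient: how many k in 1..q are coprime to q."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)

def rft_coefficient(signal: list[float], q: int) -> float:
    """Finite approximation: (1 / (phi(q) * N)) * sum_{n=1..N} x(n) * c_q(n)."""
    n_samples = len(signal)
    acc = sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1))
    return acc / (euler_phi(q) * n_samples)
```

Because the sums for different q are orthogonal over a full period, a signal that repeats with period 3 contributes to the q = 3 coefficient and not to q = 2.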
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment
2013-01-01
Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
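The idea of comparing segments through p-mer frequency counts, rather than aligning them, can be sketched as follows. The abstract does not state which distance is computed between count profiles, so the Manhattan distance used here is an assumption, as are the function names and the default p = 3.

```python
from collections import Counter

def pmer_counts(segment: str, p: int = 3) -> Counter:
    """Count every overlapping p-mer in a sequence segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))

def pmer_distance(a: str, b: str, p: int = 3) -> int:
    """Manhattan distance between the p-mer count profiles of two segments.

    Assumed stand-in for the paper's distance measure: segments that differ
    by few edits share most p-mers, so their profiles stay close.
    """
    counts_a, counts_b = pmer_counts(a, p), pmer_counts(b, p)
    return sum(abs(counts_a[k] - counts_b[k]) for k in counts_a.keys() | counts_b.keys())
```

Building a profile takes time linear in the segment length, which is consistent with the linear scaling reported in the abstract; the clustering of segments into quasi-alignments is not shown here.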
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.
Nagar, Anurag; Hahsler, Michael
2013-01-01
Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.
A reference human genome dataset of the BGISEQ-500 sequencer.
Huang, Jie; Liang, Xinming; Xuan, Yuankai; Geng, Chunyu; Li, Yuxiang; Lu, Haorong; Qu, Shoufang; Mei, Xianglin; Chen, Hongbo; Yu, Ting; Sun, Nan; Rao, Junhua; Wang, Jiahao; Zhang, Wenwei; Chen, Ying; Liao, Sha; Jiang, Hui; Liu, Xin; Yang, Zhaopeng; Mu, Feng; Gao, Shangxian
2017-05-01
BGISEQ-500 is a new desktop sequencer developed by BGI. Using DNA nanoball and combinational probe anchor synthesis developed from Complete Genomics™ sequencing technologies, it generates short reads at a large scale. Here, we present the first human whole-genome sequencing dataset of BGISEQ-500. The dataset was generated by sequencing the widely used cell line HG001 (NA12878) in two sequencing runs of paired-end 50 bp (PE50) and two sequencing runs of paired-end 100 bp (PE100). We also include examples of the raw images from the sequencer for reference. Finally, we identified variations using this dataset, estimated the accuracy of the variations, and compared it to that of the variations identified from similar amounts of publicly available HiSeq2500 data. We found similar single nucleotide polymorphism (SNP) detection accuracy for the BGISEQ-500 PE100 data (false positive rate [FPR] = 0.00020%, sensitivity = 96.20%) compared to the PE150 HiSeq2500 data (FPR = 0.00017%, sensitivity = 96.60%), and better SNP detection accuracy than the PE50 data (FPR = 0.0006%, sensitivity = 94.15%). But for insertions and deletions (indels), we found lower accuracy for BGISEQ-500 data (FPR = 0.00069% and 0.00067% for PE100 and PE50 respectively, sensitivity = 88.52% and 70.93%) than the HiSeq2500 data (FPR = 0.00032%, sensitivity = 96.28%). Our dataset can serve as the reference dataset, providing basic information not just for future development, but also for all research and applications based on the new sequencing platform. © The Authors 2017. Published by Oxford University Press.
Biodiversity of air-borne microorganisms at Halley Station, Antarctica.
Pearce, David A; Hughes, K A; Lachlan-Cope, T; Harangozo, S A; Jones, A E
2010-03-01
A study of air-borne microbial biodiversity over an isolated scientific research station on an ice-shelf in continental Antarctica was undertaken to establish the potential source of microbial colonists. The study aimed to assess: (1) whether microorganisms were likely to have a local (research station) or distant (marine or terrestrial) origin, (2) the effect of changes in sea ice extent on microbial biodiversity and (3) the potential human impact on the environment. Air samples were taken above Halley Research Station during the austral summer and austral winter over a 2-week period. Overall, a low microbial biodiversity was detected, which included many sequence replicates. No significant patterns were detected in the aerial biodiversity between the austral summer and the austral winter. In common with other environmental studies, particularly in the polar regions, many of the sequences obtained were from as yet uncultivated organisms. Very few marine sequences were detected irrespective of the distance to open water, and around one-third of sequences detected were similar to those identified in human studies, though both of these might reflect prevailing wind conditions. The detected aerial microorganisms were markedly different from those obtained in earlier studies over the Antarctic Peninsula in the maritime Antarctic.
Mucosal and Cutaneous Human Papillomaviruses Detected in Raw Sewages
La Rosa, Giuseppina; Fratini, Marta; Accardi, Luisa; D'Oro, Graziana; Della Libera, Simonetta; Muscillo, Michele; Di Bonito, Paola
2013-01-01
Epitheliotropic viruses can find their way into sewage. The aim of the present study was to investigate the occurrence, distribution, and genetic diversity of Human Papillomaviruses (HPVs) in urban wastewaters. Sewage samples were collected from treatment plants distributed throughout Italy. The DNA extracted from these samples was analyzed by PCR using five PV-specific sets of primers targeting the L1 (GP5/GP6, MY09/MY11, FAP59/64, SKF/SKR) and E1 regions (PM-A/PM-B), according to the protocols previously validated for the detection of mucosal and cutaneous HPV genotypes. PCR products underwent sequencing analysis and the sequences were aligned to reference genomes from the Papillomavirus Episteme database. Phylogenetic analysis was then performed to assess the genetic relationships among the different sequences and between the sequences of the samples and those of the prototype strains. A broad spectrum of sequences related to mucosal and cutaneous HPV types was detected in 81% of the sewage samples analyzed. Surprisingly, sequences related to the anogenital HPV6 and 11 were detected in 19% of the samples, and sequences related to the “high risk” oncogenic HPV16 were identified in two samples. Sequences related to HPV9, HPV20, HPV25, HPV76, HPV80, HPV104, HPV110, HPV111, HPV120 and HPV145 beta Papillomaviruses were detected in 76% of the samples. In addition, similarity searches and phylogenetic analysis of some sequences suggest that they could belong to putative new genotypes of the beta genus. In this study, for the first time, the presence of HPV viruses strongly related to human cancer is reported in sewage samples. Our data increases the knowledge of HPV genomic diversity and suggests that virological analysis of urban sewage can provide key information useful in supporting epidemiological studies. PMID:23341898
Molecular evidence for piroplasms in wild Reeves' muntjac (Muntiacus reevesi) in China.
Yang, Ji-fei; Li, You-quan; Liu, Zhi-jie; Liu, Jun-long; Guan, Gui-quan; Chen, Ze; Luo, Jian-xun; Wang, Xiao-long; Yin, Hong
2014-10-01
DNA from liver samples of 17 free-ranging wild Reeves' muntjac (Muntiacus reevesi) was used for PCR amplification of the piroplasm 18S rRNA gene. Of 17 samples, 14 (82.4%) showed a specific PCR product, which was cloned and sequenced. BLAST analysis of the sequences obtained showed similarities to Babesia sp., Theileria capreoli, Theileria uilenbergi and Theileria sp. BO302-SE. Phylogenetic analysis showed that the Babesia sp. detected in the present study was distantly separated from known Babesia species of wild and domestic animals. Six sequences showed 100% similarity to T. capreoli while five sequences were separated from all known Theileria species and constituted an independent clade with Theileria sp. BO302-SE derived from roe deer in Italy; two sequences were close to T. uilenbergi with 97% similarity. This is the first description of hemoparasite infection in free-ranging wild Reeves' muntjac in China. Our results indicate that wild Reeves' muntjac may play an important reservoir role for hemoparasites. Crown Copyright © 2014. Published by Elsevier Ireland Ltd. All rights reserved.
Directional genomic hybridization for chromosomal inversion discovery and detection.
Ray, F Andrew; Zimmerman, Erin; Robinson, Bruce; Cornforth, Michael N; Bedford, Joel S; Goodwin, Edwin H; Bailey, Susan M
2013-04-01
Chromosomal rearrangements are a source of structural variation within the genome that figure prominently in human disease, where the importance of translocations and deletions is well recognized. In principle, inversions (reversals in the orientation of DNA sequences within a chromosome) should have similar detrimental potential. However, the study of inversions has been hampered by traditional approaches used for their detection, which are not particularly robust. Even with significant advances in whole genome approaches, changes in the absolute orientation of DNA remain difficult to detect routinely. Consequently, our understanding of inversions is still surprisingly limited, as is our appreciation for their frequency and involvement in human disease. Here, we introduce the directional genomic hybridization methodology of chromatid painting (a whole new way of looking at structural features of the genome) that can be employed with high resolution on a cell-by-cell basis, and demonstrate its basic capabilities for genome-wide discovery and targeted detection of inversions. Bioinformatics enabled development of sequence- and strand-specific directional probe sets, which when coupled with single-stranded hybridization, greatly improved the resolution and ease of inversion detection. We highlight examples of the far-ranging applicability of this cytogenomics-based approach, which include confirmation of the alignment of the human genome database and evidence that individuals themselves share similar sequence directionality, as well as use in comparative and evolutionary studies for any species whose genome has been sequenced. In addition to applications related to basic mechanistic studies, the information obtainable with strand-specific hybridization strategies may ultimately enable novel gene discovery, thereby benefitting the diagnosis and treatment of a variety of human disease states and disorders including cancer, autism, and idiopathic infertility.
DNA-DNA hybridization was used to compare the Pseudomonas strain LB400 genes for polychlorinated biphenyl (PCB) degradation with those from seven other PCB-degrading strains. Significant hybridization was detected to the genome of Alcaligenes eutrophus H850, a strain similar to L...
Rahman, Arfatur; Sahrin, Mahfuza; Afrin, Sadia; Earley, Keith; Ahmed, Shahriar; Rahman, S M Mazidur; Banu, Sayera
2016-01-01
GeneXpert MTB/RIF (Xpert) and Genotype MTBDRplus (DRplus) are two World Health Organization (WHO) endorsed probe-based molecular drug susceptibility testing (DST) methods for rapid diagnosis of drug-resistant tuberculosis. Both methods target the same 81 bp Rifampicin Resistance Determining Region (RRDR) of the bacterial RNA polymerase β subunit gene (rpoB) for detection of Rifampicin (RIF) resistance-associated mutations using DNA probes. The probes of the two assays therefore correspond to each other, and similar probe binding is expected. We analyzed 92 sputum specimens by Xpert, DRplus and the LJ proportion method (LJ-DST). We compared the molecular DSTs with the gold standard LJ-DST, and assessed the level of agreement between the two molecular methods for detection of RIF resistance-associated mutations. The 81 bp RRDR region of the rpoB gene of cases discrepant between the two molecular methods was sequenced by Sanger sequencing. The agreement of Xpert and DRplus with LJ-DST for detection of RIF susceptibility was found to be 93.5% and 92.4%, respectively. We also found 92.4% overall agreement of the two molecular methods for the detection of RIF susceptibility. A total of 84 out of 92 samples (91.3%) had agreement on the molecular locus of RRDR mutation by DRplus and Xpert. Sanger sequencing of the 81 bp RRDR revealed that Xpert probes detected seven of eight discrepant cases correctly and DRplus was erroneous in all eight cases. Although the overall concordance with LJ-DST was similar for both the Xpert and DRplus assays, Xpert demonstrated more accuracy in the detection of RIF susceptibility for discrepant isolates compared with DRplus. This observation would be helpful for the improvement of probe-based detection of drug resistance-associated mutations, especially rpoB mutations in M. tuberculosis.
Cloning and characterization of two novel DNases from Streptococcus pyogenes.
Hasegawa, Tadao; Torii, Keizo; Hashikawa, Shinnosuke; Iinuma, Yoshitsugu; Ohta, Michio
2002-06-01
The proteins in the culture supernatant (exoproteins) from Streptococcus pyogenes serotype M1 were separated by two-dimensional gel electrophoresis, and their N-terminal amino acid sequences were determined. The amino acid sequences were compared to sequences in the S. pyogenes genome database. The coding sequence showed similarity to sequences of two genes, mf2-v (mf2 variant) and mf3, which had sequence similarity to genes encoding mitogenic factor (MF); MF has DNase activity. The recombinant genes were expressed in Escherichia coli and the proteins were synthesized. Mf2-v and Mf3 had DNase activity. The activity of Mf2-v was localized to the C-terminal half of the protein. The mf3 gene was shown to be present in most clinically isolated strains of S. pyogenes tested, and the mf2 gene was detected in 20% of the isolates. The products of the mf2 and mf3 genes in clinically isolated S. pyogenes strains were thus shown to be DNases.
Tuning Selectivity of Fluorescent Carbon Nanotube-Based Neurotransmitter Sensors.
Mann, Florian A; Herrmann, Niklas; Meyer, Daniel; Kruss, Sebastian
2017-06-28
Detection of neurotransmitters is an analytical challenge and essential to understand neuronal networks in the brain and associated diseases. However, most methods do not provide sufficient spatial, temporal, or chemical resolution. Near-infrared (NIR) fluorescent single-walled carbon nanotubes (SWCNTs) have been used as building blocks for sensors/probes that detect catecholamine neurotransmitters, including dopamine. This approach provides a high spatial and temporal resolution, but it is not understood if these sensors are able to distinguish dopamine from similar catecholamine neurotransmitters, such as epinephrine or norepinephrine. In this work, the organic phase (DNA sequence) around SWCNTs was varied to create sensors with different selectivity and sensitivity for catecholamine neurotransmitters. Most DNA-functionalized SWCNTs responded to catecholamine neurotransmitters, but both dissociation constants (Kd) and limits of detection were highly dependent on functionalization (sequence). Kd values span a range of 2.3 nM (SWCNT-(GC)15 + norepinephrine) to 9.4 μM (SWCNT-(AT)15 + dopamine) and limits of detection are mostly in the single-digit nM regime. Additionally, sensors of different SWCNT chirality show different fluorescence increases. Moreover, certain sensors (e.g., SWCNT-(GT)10) distinguish between different catecholamines, such as dopamine and norepinephrine at low concentrations (50 nM). These results show that SWCNTs functionalized with certain DNA sequences are able to discriminate between catecholamine neurotransmitters or to detect them in the presence of interfering substances of similar structure. Such sensors will be useful to measure and study neurotransmitter signaling in complex biological settings.
Sequence Alignment to Predict Across Species Susceptibility ...
Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev
Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.
Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H
2017-04-15
Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. 
Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. Copyright © 2017 Sinclair et al.
Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification
Sinclair, Robert M.; Ravantti, Janne J.
2017-01-01
ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. 
Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. PMID:28122979
Precise genotyping and recombination detection of Enterovirus
2015-01-01
Enteroviruses (EV) with different genotypes cause diverse infectious diseases in humans and mammals. A correct EV typing result is crucial for effective medical treatment and disease control; however, the emergence of novel viral strains has impaired the performance of available diagnostic tools. Here, we present a web-based tool, named EVIDENCE (EnteroVirus In DEep conception, http://symbiont.iis.sinica.edu.tw/evidence), for EV genotyping and recombination detection. We introduce the idea of using mixed-ranking scores to evaluate the fitness of prototypes based on relatedness and on the genome regions of interest. Using phylogenetic methods, the most possible genotype is determined based on the closest neighbor among the selected references. To detect possible recombination events, EVIDENCE calculates the sequence distance and phylogenetic relationship among sequences of all sliding windows scanning over the whole genome. Detected recombination events are plotted in an interactive figure for viewing of fine details. In addition, all EV sequences available in GenBank were collected and revised using the latest classification and nomenclature of EV in EVIDENCE. These sequences are built into the database and are retrieved in an indexed catalog, or can be searched for by keywords or by sequence similarity. EVIDENCE is the first web-based tool containing pipelines for genotyping and recombination detection, with updated, built-in, and complete reference sequences to improve sensitivity and specificity. The use of EVIDENCE can accelerate genotype identification, aiding clinical diagnosis and enhancing our understanding of EV evolution. PMID:26678286
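The sliding-window scan mentioned above can be sketched as a per-window distance between an aligned query and each reference; a change in the closest reference along the genome is one hint of possible recombination. This is only an illustrative sketch under simple assumptions (pre-aligned, equal-length sequences and a plain proportion-of-differences distance). It does not reproduce EVIDENCE's scoring or its phylogenetic step, and all names and window sizes are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def sliding_window_distances(query: str, ref: str, window: int, step: int):
    """Return (window_start, distance) pairs scanning along the alignment."""
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start, p_distance(query[start:start + window], ref[start:start + window]))
        for start in range(0, len(query) - window + 1, step)
    ]

def closest_reference_per_window(query: str, refs: dict[str, str], window: int, step: int):
    """For each window, name the reference with the smallest distance to the query."""
    result = []
    for start in range(0, len(query) - window + 1, step):
        chunk = query[start:start + window]
        best = min(refs, key=lambda name: p_distance(chunk, refs[name][start:start + window]))
        result.append((start, best))
    return result
```

In practice the window and step would be chosen relative to genome length, and ties between references are broken here simply by dictionary order.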
Merkel, Daniel; Brinkmann, Eckard; Kämmer, Joerg C; Köhler, Miriam; Wiens, Daniel; Derwahl, Karl-Michael
2015-09-01
The electronic colorization of grayscale B-mode sonograms using various color schemes aims to enhance the adaptability and practicability of B-mode sonography in daylight conditions. The purpose of this study was to determine the diagnostic effectiveness and importance of colorized B-mode sonography. Fifty-three video sequences of sonographic examinations of the liver were digitized and subsequently colorized in 2 different color combinations (yellow-brown and blue-white). The set of 53 images consisted of 33 with isoechoic masses, 8 with obvious lesions of the liver (hypoechoic or hyperechoic), and 12 with inconspicuous reference images of the liver. The video sequences were combined in a random order and edited into half-hour video clips. Isoechoic liver lesions were successfully detected in 58% of the yellow-brown video sequences and in 57% of the grayscale video sequences (P = .74, not significant). Fifty percent of the isoechoic liver lesions were successfully detected in the blue-white video sequences, as opposed to a 55% detection rate in the corresponding grayscale video sequences (P = .11, not significant). In 2 subgroups, significantly more liver lesions were detected with grayscale sonography compared to blue-white sonography. Yellow-brown-colorized B-mode sonography appears to be similarly effective for detection of isoechoic parenchymal liver lesions as traditional grayscale sonography. Blue-white colorization in B-mode sonography is probably not as effective as grayscale sonography, although a statistically significant disadvantage was shown only in the subgroup of hyperechoic liver lesions. © 2015 by the American Institute of Ultrasound in Medicine.
Adhesive Proteins of Stalked and Acorn Barnacles Display Homology with Low Sequence Similarities
Jonker, Jaimie-Leigh; Abram, Florence; Pires, Elisabete; Varela Coelho, Ana; Grunwald, Ingo; Power, Anne Marie
2014-01-01
Barnacle adhesion underwater is an important phenomenon to understand for the prevention of biofouling and potential biotechnological innovations, yet so far, identifying what makes barnacle glue proteins ‘sticky’ has proved elusive. Examination of a broad range of species within the barnacles may be instructive to identify conserved adhesive domains. We add to extensive information from the acorn barnacles (order Sessilia) by providing the first protein analysis of a stalked barnacle adhesive, Lepas anatifera (order Lepadiformes). It was possible to separate the L. anatifera adhesive into at least 10 protein bands using SDS-PAGE. Intense bands were present at approximately 30, 70, 90 and 110 kilodaltons (kDa). Mass spectrometry for protein identification was followed by de novo sequencing which detected 52 peptides of 7–16 amino acids in length. None of the peptides matched published or unpublished transcriptome sequences, but some amino acid sequence similarity was apparent between L. anatifera and closely-related Dosima fascicularis. Antibodies against two acorn barnacle proteins (ab-cp-52k and ab-cp-68k) showed cross-reactivity in the adhesive glands of L. anatifera. We also analysed the similarity of adhesive proteins across several barnacle taxa, including Pollicipes pollicipes (a stalked barnacle in the order Scalpelliformes). Sequence alignment of published expressed sequence tags clearly indicated that P. pollicipes possesses homologues for the 19 kDa and 100 kDa proteins in acorn barnacles. Homology aside, sequence similarity in amino acid and gene sequences tended to decline as taxonomic distance increased, with minimum similarities of 18–26%, depending on the gene. The results indicate that some adhesive proteins (e.g. 100 kDa) are more conserved within barnacles than others (20 kDa). PMID:25295513
Pagnuco, Inti Anabela; Revuelta, María Victoria; Bondino, Hernán Gabriel; Brun, Marcel; Ten Have, Arjen
2018-01-01
Protein superfamilies can be divided into subfamilies of proteins with different functional characteristics. Their sequences can be classified hierarchically, which is part of sequence function assignation. Typically, there are no clear subfamily hallmarks that would allow pattern-based function assignation by which this task is mostly achieved based on the similarity principle. This is hampered by the lack of a score cut-off that is both sensitive and specific. HMMER Cut-off Threshold Tool (HMMERCTTER) adds a reliable cut-off threshold to the popular HMMER. Using a high quality superfamily phylogeny, it clusters a set of training sequences such that the cluster-specific HMMER profiles show cluster or subfamily member detection with 100% precision and recall (P&R), thereby generating a specific threshold as inclusion cut-off. Profiles and thresholds are then used as classifiers to screen a target dataset. Iterative inclusion of novel sequences to groups and the corresponding HMMER profiles results in high sensitivity while specificity is maintained by imposing 100% P&R self detection. In three presented case studies of protein superfamilies, classification of large datasets with 100% precision was achieved with over 95% recall. Limits and caveats are presented and explained. HMMERCTTER is a promising protein superfamily sequence classifier provided high quality training datasets are used. It provides a decision support system that aids in the difficult task of sequence function assignation in the twilight zone of sequence similarity. All relevant data and source codes are available from the Github repository at the following URL: https://github.com/BBCMdP/HMMERCTTER.
Pagnuco, Inti Anabela; Revuelta, María Victoria; Bondino, Hernán Gabriel; Brun, Marcel
2018-01-01
Background: Protein superfamilies can be divided into subfamilies of proteins with different functional characteristics. Their sequences can be classified hierarchically, which is part of sequence function assignation. Typically, there are no clear subfamily hallmarks that would allow pattern-based function assignation by which this task is mostly achieved based on the similarity principle. This is hampered by the lack of a score cut-off that is both sensitive and specific. Results: HMMER Cut-off Threshold Tool (HMMERCTTER) adds a reliable cut-off threshold to the popular HMMER. Using a high quality superfamily phylogeny, it clusters a set of training sequences such that the cluster-specific HMMER profiles show cluster or subfamily member detection with 100% precision and recall (P&R), thereby generating a specific threshold as inclusion cut-off. Profiles and thresholds are then used as classifiers to screen a target dataset. Iterative inclusion of novel sequences to groups and the corresponding HMMER profiles results in high sensitivity while specificity is maintained by imposing 100% P&R self detection. In three presented case studies of protein superfamilies, classification of large datasets with 100% precision was achieved with over 95% recall. Limits and caveats are presented and explained. Conclusions: HMMERCTTER is a promising protein superfamily sequence classifier provided high quality training datasets are used. It provides a decision support system that aids in the difficult task of sequence function assignation in the twilight zone of sequence similarity. All relevant data and source codes are available from the Github repository at the following URL: https://github.com/BBCMdP/HMMERCTTER. PMID:29579071
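The inclusion cut-off described in this abstract can be illustrated with a minimal sketch. It assumes each training sequence already has a profile score and a known cluster membership; the function name `perfect_cutoff` and the example scores are hypothetical, and this is not the HMMERCTTER implementation. A cut-off giving 100% precision and recall exists only when every member scores above every non-member.

```python
def perfect_cutoff(member_scores, nonmember_scores):
    """Return a score cut-off that separates members from non-members
    with 100% precision and recall, or None if no such cut-off exists.

    Sequences scoring >= the returned value are treated as cluster members.
    (Illustrative helper; names and inputs are assumptions.)
    """
    if not member_scores:
        raise ValueError("at least one member score is required")
    lowest_member = min(member_scores)
    # With no non-members, any cut-off at or below the lowest member works.
    highest_other = max(nonmember_scores) if nonmember_scores else float("-inf")
    if lowest_member > highest_other:
        return lowest_member
    return None  # scores overlap: perfect separation is impossible


# Hypothetical scores: members clearly above non-members.
print(perfect_cutoff([120.5, 98.0, 110.2], [40.1, 75.3]))
# Hypothetical scores: one member falls below a non-member.
print(perfect_cutoff([120.5, 60.0], [75.3]))
```

Returning the lowest member score is one possible choice; any value between the two groups would separate them equally well on the training set.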
Saijuntha, Weerachai; Sithithaworn, Paiboon; Duenngai, Kunyarat; Kiatsopit, Nadda; Andrews, Ross H; Petney, Trevor N
2011-03-01
Multilocus enzyme electrophoresis (MEE) and DNA sequencing of the mitochondrial cytochrome c oxidase subunit 1 (CO1) gene were used to genetically compare four species of echinostomes of human health importance. Fixed genetic differences among adults of Echinostoma revolutum, Echinostoma malayanum, Echinoparyphium recurvatum and Hypoderaeum conoideum were detected at 51-75% of the enzyme loci examined, while interspecific differences in CO1 sequence were detected at 16-32 (8-16%) of the 205 alignment positions. The results of the MEE analyses also revealed fixed genetic differences between E. revolutum from Thailand and Lao PDR at five (19%) of 27 loci, which could either represent genetic variation between geographically separated populations of a single species, or the existence of a cryptic (i.e. genetically distinct but morphologically similar) species. However, there was no support for the existence of cryptic species within E. revolutum based on the CO1 sequence between the two geographical areas sampled. Genetic variation in CO1 sequence was also detected among E. malayanum from three different species of snail intermediate host. Separate phylogenetic analyses of the MEE and DNA sequence data revealed that the two species of Echinostoma (E. revolutum and E. malayanum) did not form a monophyletic clade. These results, together with the large number of morphologically similar species with inadequate descriptions, poor specific diagnoses and extensive synonymy, suggest that the morphological characters used for species taxonomy of echinostomes in South-East Asia should be reconsidered according to the concordance of biology, morphology and molecular classification. Copyright © 2010 Elsevier B.V. All rights reserved.
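The percentages quoted in this abstract follow directly from the stated counts. The short check below recomputes them; the helper name `percent` is introduced here only for illustration.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# CO1: 16 to 32 differing positions out of 205 alignment positions.
low = percent(16, 205)
high = percent(32, 205)
# MEE: 5 fixed differences out of 27 loci (Thailand vs Lao PDR).
loci = percent(5, 27)

print(round(low), round(high), round(loci))
```

Rounded to whole numbers, the results agree with the 8-16% and 19% reported in the abstract.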
Reichert, Miriam; Morelli, John N; Runge, Val M; Tao, Ai; von Ritschl, Ruediger; von Ritschl, Andreas; Padua, Abraham; Dix, James E; Marra, Michael J; Schoenberg, Stefan O; Attenberger, Ulrike I
2013-01-01
The aim of this study was to compare the detection of brain metastases at 3 T using a 32-channel head coil with 2 different 3-dimensional (3D) contrast-enhanced sequences, a T1-weighted fast spin-echo-based (SPACE; sampling perfection with application-optimized contrasts using different flip angle evolutions) sequence and a conventional magnetization-prepared rapid gradient-echo (MP-RAGE) sequence. Seventeen patients with 161 brain metastases were examined prospectively using both SPACE and MP-RAGE sequences on a 3-T magnetic resonance system. Eight healthy volunteers were similarly examined for determination of signal-to-noise ratio (SNR) values. Parameters were adjusted to equalize acquisition times between the sequences (3 minutes and 30 seconds). The order in which sequences were performed was randomized. Two blinded board-certified neuroradiologists evaluated the number of detectable metastatic lesions with each sequence relative to a criterion standard reading conducted at the Gamma Knife facility by a neuroradiologist with access to all clinical and imaging data. In the volunteer assessment with SPACE and MP-RAGE, SNR (10.3 ± 0.8 vs 7.7 ± 0.7) and contrast-to-noise ratio (0.8 ± 0.2 vs 0.5 ± 0.1) were statistically significantly greater with the SPACE sequence (P < 0.05). Overall, lesion detection was markedly improved with the SPACE sequence (99.1% of lesions for reader 1 and 96.3% of lesions for reader 2) compared with the MP-RAGE sequence (73.6% of lesions for reader 1 and 68.5% of lesions for reader 2; P < 0.01). A 3D T1-weighted fast spin echo sequence (SPACE) improves detection of metastatic lesions relative to 3D T1-weighted gradient-echo-based scan (MP-RAGE) imaging when implemented with a 32-channel head coil at identical scan acquisition times (3 minutes and 30 seconds).
Contour Tracking with a Spatio-Temporal Intensity Moment.
Demi, Marcello
2016-06-01
Standard edge detection operators such as the Laplacian of Gaussian and the gradient of Gaussian can be used to track contours in image sequences. When using edge operators, a contour determined on one frame of the sequence is simply used as a starting contour to locate the nearest contour on the subsequent frame. However, the strategy used to look for the nearest edge points may not work when tracking contours of non-isolated gray level discontinuities. In these cases, strategies derived from the optical flow equation, which look for similar gray level distributions, appear to be more appropriate since these can work with a lower frame rate than that needed for strategies based on pure edge detection operators. However, an optical flow strategy tends to propagate the localization errors through the sequence, and an additional edge detection procedure is essential to compensate for such a drawback. In this paper, a spatio-temporal intensity moment is proposed which integrates the two basic functions of edge detection and tracking.
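The nearest-edge strategy mentioned in this abstract can be sketched as follows. The sketch assumes edge points on the next frame have already been extracted by some edge operator; the function name `track_contour` and the coordinates are hypothetical, and this is not the spatio-temporal intensity moment proposed in the paper.

```python
import math


def track_contour(prev_contour, edge_points):
    """Move each contour point from the previous frame to the nearest
    edge point detected on the next frame (simple nearest-edge tracking)."""
    if not edge_points:
        raise ValueError("no edge points detected on the next frame")
    tracked = []
    for px, py in prev_contour:
        nearest = min(
            edge_points,
            key=lambda e: math.hypot(e[0] - px, e[1] - py),
        )
        tracked.append(nearest)
    return tracked


# Hypothetical contour on frame t and edge points found on frame t+1.
previous = [(0, 0), (5, 5)]
edges_next = [(1, 0), (4, 6), (10, 10)]
print(track_contour(previous, edges_next))
```

When several unrelated edges lie close together, the nearest edge point may belong to the wrong discontinuity, which is the failure case the abstract attributes to non-isolated gray level discontinuities.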
Detecting Earthquakes over a Seismic Network using Single-Station Similarity Measures
NASA Astrophysics Data System (ADS)
Bergen, Karianne J.; Beroza, Gregory C.
2018-03-01
New blind waveform-similarity-based detection methods, such as Fingerprint and Similarity Thresholding (FAST), have shown promise for detecting weak signals in long-duration, continuous waveform data. While blind detectors are capable of identifying similar or repeating waveforms without templates, they can also be susceptible to false detections due to local correlated noise. In this work, we present a set of three new methods that allow us to extend single-station similarity-based detection over a seismic network; event-pair extraction, pairwise pseudo-association, and event resolution complete a post-processing pipeline that combines single-station similarity measures (e.g. FAST sparse similarity matrix) from each station in a network into a list of candidate events. The core technique, pairwise pseudo-association, leverages the pairwise structure of event detections in its network detection model, which allows it to identify events observed at multiple stations in the network without modeling the expected move-out. Though our approach is general, we apply it to extend FAST over a sparse seismic network. We demonstrate that our network-based extension of FAST is both sensitive and maintains a low false detection rate. As a test case, we apply our approach to two weeks of continuous waveform data from five stations during the foreshock sequence prior to the 2014 Mw 8.2 Iquique earthquake. Our method identifies nearly five times as many events as the local seismicity catalog (including 95% of the catalog events), and less than 1% of these candidate events are false detections.
Detection of herpes simplex virus-specific DNA sequences in latently infected mice and in humans.
Efstathiou, S; Minson, A C; Field, H J; Anderson, J R; Wildy, P
1986-02-01
Herpes simplex virus-specific DNA sequences have been detected by Southern hybridization analysis in both central and peripheral nervous system tissues of latently infected mice. We have detected virus-specific sequences corresponding to the junction fragment but not the genomic termini, an observation first made by Rock and Fraser (Nature [London] 302:523-525, 1983). This "endless" herpes simplex virus DNA is both qualitatively and quantitatively stable in mouse neural tissue analyzed over a 4-month period. In addition, examination of DNA extracted from human trigeminal ganglia has shown herpes simplex virus DNA to be present in an "endless" form similar to that found in the mouse model system. Further restriction enzyme analysis of latently infected mouse brainstem and human trigeminal DNA has shown that this "endless" herpes simplex virus DNA is present in all four isomeric configurations.
Detection of herpes simplex virus-specific DNA sequences in latently infected mice and in humans.
Efstathiou, S; Minson, A C; Field, H J; Anderson, J R; Wildy, P
1986-01-01
Herpes simplex virus-specific DNA sequences have been detected by Southern hybridization analysis in both central and peripheral nervous system tissues of latently infected mice. We have detected virus-specific sequences corresponding to the junction fragment but not the genomic termini, an observation first made by Rock and Fraser (Nature [London] 302:523-525, 1983). This "endless" herpes simplex virus DNA is both qualitatively and quantitatively stable in mouse neural tissue analyzed over a 4-month period. In addition, examination of DNA extracted from human trigeminal ganglia has shown herpes simplex virus DNA to be present in an "endless" form similar to that found in the mouse model system. Further restriction enzyme analysis of latently infected mouse brainstem and human trigeminal DNA has shown that this "endless" herpes simplex virus DNA is present in all four isomeric configurations. PMID:3003377
Choi, In-Wook; Kim, Hwang-Yong; Quan, Juan-Hua; Ryu, Jae-Gee; Sun, Rubing; Lee, Young-Ha
2015-10-01
Fascioliasis, a food-borne trematode zoonosis, is a disease primarily of cattle and sheep and occasionally of humans. Water dropwort (Oenanthe javanica), an aquatic perennial herb, is a common second intermediate host of Fasciola, and the fresh stems and leaves are widely used as a seasoning in the Korean diet. However, no information regarding Fasciola species contamination in water dropwort is available. Here, we collected 500 samples of water dropwort in 3 areas in Korea during February and March 2015, and contamination of the water dropwort with Fasciola species was monitored by DNA sequencing analysis of the Fasciola hepatica and Fasciola gigantica specific mitochondrial cytochrome c oxidase subunit 1 (cox1) and nuclear ribosomal internal transcribed spacer 2 (ITS-2). Among the 500 samples assessed, F. hepatica cox1 and ITS-2 markers were detected in 2 samples, and F. hepatica contamination was confirmed by sequencing analysis. The nucleotide sequences of cox1 PCR products from the 2 F. hepatica-contaminated samples were 96.5% identical to the F. hepatica cox1 sequences in GenBank, whereas F. gigantica cox1 sequences were 46.8% similar to the sequence detected in the cox1-positive samples. However, F. gigantica cox1 and ITS-2 markers were not detected by PCR in the 500 samples of water dropwort. Collectively, this survey of water dropwort contamination with Fasciola species detected a very low prevalence of F. hepatica contamination in the samples.
Gupta, Radhey S; Khadka, Bijendra
2016-02-01
Homologs showing high degree of sequence similarity to the three subunits of the protochlorophyllide oxidoreductase enzyme complex (viz. BchL, BchN, and BchB), which carries out a central role in chlorophyll-bacteriochlorophyll (Bchl) biosynthesis, are uniquely found in photosynthetic organisms. The results of BLAST searches and homology modeling presented here show that proteins exhibiting a high degree of sequence and structural similarity to the BchB and BchN proteins are also present in organisms from the high G+C Gram-positive phylum of Actinobacteria, specifically in members of the genus Rubrobacter (R. xylanophilus and R. radiotolerans). The results presented exclude the possibility that the observed BLAST hits are for subunits of the nitrogenase complex or the chlorin reductase complex. The branching in phylogenetic trees and the sequence characteristics of the Rubrobacter BchB/BchN homologs indicate that these homologs are distinct from those found in other photosynthetic bacteria and that they may represent ancestral forms of the BchB/BchN proteins. Although a homolog showing high degree of sequence similarity to the BchL protein was not detected in Rubrobacter, another protein, belonging to the ParA/Soj/MinD family, present in these bacteria, exhibits high degree of structural similarity to the BchL. In addition to the BchB/BchN homologs, Rubrobacter species also contain homologs showing high degree of sequence similarity to different subunits of magnesium chelatase (BchD, BchH, and BchI) as well as proteins showing significant similarity to the BchP and BchG proteins. Interestingly, no homologs corresponding to the BchX, BchY, and BchZ proteins were detected in the Rubrobacter species. These results provide the first suggestive evidence that some form of photosynthesis either exists or was anciently present within the phylum Actinobacteria (high G+C Gram-positive) in members of the genus Rubrobacter. The significance of these results concerning the origin of the Bchl-based photosynthesis is also discussed.
Quantifying the relationship between sequence and three-dimensional structure conservation in RNA
2010-01-01
Background: In recent years, the number of available RNA structures has grown rapidly, reflecting the increased interest in RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results: Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which has allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. Discussion: The computational analysis presented here quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction. PMID:20550657
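The 60% sequence identity figure in this abstract depends on how identity is computed from an alignment. The sketch below shows one common convention, counting matches over columns where neither sequence has a gap. The choice of denominator is an assumption (the abstract does not state which definition the authors used), and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length,
    computed over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical pairwise RNA alignment: 4 matches over 5 ungapped columns.
pid = percent_identity("ACGU-A", "ACGAUA")
print(pid, pid < 60.0)
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values for gapped alignments.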
Wen, B; Rikihisa, Y; Fuerst, P A; Chaichanasiriwithaya, W
1995-04-01
Ehrlichia risticii is the causative agent of Potomac horse fever. Variations among the major antigens of different local E. risticii strains have been detected previously. To further assess genetic variability in this species or species complex, the sequences of the 16S rRNA genes of several isolates obtained from sick horses diagnosed as having Potomac horse fever were determined. The sequences of six isolates obtained from Ohio and three isolates obtained from Kentucky were amplified by PCR. Three groups of sequences were identified. The sequences of five of the Ohio isolates were identical to the sequence of the type strain of E. risticii, the Illinois strain. The sequence of one Ohio isolate, isolate 081, was unique; this sequence differed in 10 nucleotides from the sequence of the type strain (level of similarity, 99.3%). The sequences of the three Kentucky isolates were identical to each other, but differed by five bases from the sequence of the type strain (level of similarity, 99.6%). The levels of sequence similarity of isolate 081, the Kentucky isolates, and the type strain to the next most closely related Ehrlichia sp., Ehrlichia sennetsu, were 99.3, 99.2, and 99.2%, respectively. On the basis of the distinct antigenic profiles and the levels of 16S rRNA sequence divergence, isolate 081 is as divergent from the type strain of E. risticii as E. sennetsu is. Therefore, we suggest that strain 081 and the Kentucky isolates may represent two new distinct Ehrlichia species.
A New Single-Step PCR Assay for the Detection of the Zoonotic Malaria Parasite Plasmodium knowlesi
Lucchi, Naomi W.; Poorak, Mitra; Oberstaller, Jenna; DeBarry, Jeremy; Srinivasamoorthy, Ganesh; Goldman, Ira; Xayavong, Maniphet; da Silva, Alexandre J.; Peterson, David S.; Barnwell, John W.; Kissinger, Jessica; Udhayakumar, Venkatachalam
2012-01-01
Background: Recent studies in Southeast Asia have demonstrated substantial zoonotic transmission of Plasmodium knowlesi to humans. Microscopically, P. knowlesi exhibits several stage-dependent morphological similarities to P. malariae and P. falciparum. These similarities often lead to misdiagnosis of P. knowlesi as either P. malariae or P. falciparum and PCR-based molecular diagnostic tests are required to accurately detect P. knowlesi in humans. The most commonly used PCR test has been found to give false positive results, especially with a proportion of P. vivax isolates. To address the need for more sensitive and specific diagnostic tests for the accurate diagnosis of P. knowlesi, we report development of a new single-step PCR assay that uses novel genomic targets to accurately detect this infection. Methodology and Significant Findings: We have developed a bioinformatics approach to search the available malaria parasite genome database for the identification of suitable DNA sequences relevant for molecular diagnostic tests. Using this approach, we have identified multi-copy DNA sequences distributed in the P. knowlesi genome. We designed and tested several novel primers specific to new target sequences in a single-tube, non-nested PCR assay and identified one set of primers that accurately detects P. knowlesi. We show that this primer set has 100% specificity for the detection of P. knowlesi using three different strains (Nuri, H, and Hackeri), and one human case of malaria caused by P. knowlesi. This test did not show cross reactivity with any of the four human malaria parasite species including 11 different strains of P. vivax as well as 5 additional species of simian malaria parasites. Conclusions: The new PCR assay based on novel P. knowlesi genomic sequence targets was able to accurately detect P. knowlesi. Additional laboratory and field-based testing of this assay will be necessary to further validate its utility for clinical diagnosis of P. knowlesi. PMID:22363751
A Novel Center Star Multiple Sequence Alignment Algorithm Based on Affine Gap Penalty and K-Band
NASA Astrophysics Data System (ADS)
Zou, Quan; Shan, Xiao; Jiang, Yi
Multiple sequence alignment is one of the most important topics in computational biology, but existing methods cannot yet handle large datasets. With the development of copy-number variant (CNV) and single nucleotide polymorphism (SNP) research, many researchers want to align large numbers of similar sequences to detect CNVs and SNPs. In this paper, we propose a novel multiple sequence alignment algorithm based on affine gap penalty and k-band. It aligns more quickly and accurately, which will be helpful for mining CNVs and SNPs. Experiments demonstrate the performance of our algorithm.
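The two ideas named in this abstract, a k-band and a center star, can be illustrated with a simplified sketch. It swaps in a unit-cost edit distance instead of the affine gap penalty used in the paper, restricts the dynamic programme to cells within k of the diagonal, and picks as center the sequence with the smallest total distance to the others. The function names and example sequences are hypothetical, and this is not the authors' algorithm.

```python
def banded_edit_distance(a, b, k):
    """Unit-cost edit distance computed only on cells with |i - j| <= k.
    Returns None when the length difference alone exceeds the band."""
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None
    inf = float("inf")
    # Row 0: only the first k + 1 cells lie inside the band.
    prev = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(max(0, i - k), min(m, i + k) + 1):
            if j == 0:
                cur[j] = i
                continue
            substitute = prev[j - 1] + (a[i - 1] != b[j - 1])
            cur[j] = min(substitute, prev[j] + 1, cur[j - 1] + 1)
        prev = cur
    return prev[m]


def center_star_index(seqs, k):
    """Index of the sequence with the smallest summed banded distance
    to all other sequences (the 'center' of the star)."""
    best_idx, best_total = None, float("inf")
    for i, s in enumerate(seqs):
        total = 0
        for j, t in enumerate(seqs):
            if i == j:
                continue
            d = banded_edit_distance(s, t, k)
            total += float("inf") if d is None else d
        if total < best_total:
            best_idx, best_total = i, total
    return best_idx


# Hypothetical set of similar sequences.
reads = ["ACGT", "ACGA", "ACGT", "TCGT"]
print(center_star_index(reads, k=2))
```

The band makes each pairwise comparison cost roughly proportional to k times the sequence length, which is the reason banding suits sets of highly similar sequences. The banded value equals the true edit distance whenever that distance is at most k.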
Discovering the Unknown: Improving Detection of Novel Species and Genera from Short Reads
Rosen, Gail L.; Polikar, Robi; Caseiro, Diamantino A.; ...
2011-01-01
High-throughput sequencing technologies enable metagenome profiling, simultaneous sequencing of multiple microbial species present within an environmental sample. Since metagenomic data includes sequence fragments (“reads”) from organisms that are absent from any database, new algorithms must be developed for the identification and annotation of novel sequence fragments. Homology-based techniques have been modified to detect novel species and genera, but composition-based methods have not been adapted. We develop a detection technique that can discriminate between “known” and “unknown” taxa, which can be used with composition-based methods, as well as a hybrid method. Unlike previous studies, we rigorously evaluate all algorithms for their ability to detect novel taxa. First, we show that the integration of a detector with a composition-based method performs significantly better than homology-based methods for the detection of novel species and genera, with best performance at finer taxonomic resolutions. Most importantly, we evaluate all the algorithms by introducing an “unknown” class and show that the modified version of PhymmBL has similar or better overall classification performance than the other modified algorithms, especially for the species-level and ultrashort reads. Finally, we evaluate the performance of several algorithms on a real acid mine drainage dataset.
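Composition-based methods, as contrasted with homology-based ones in this abstract, characterise a read by its oligonucleotide content and need no alignment. The sketch below computes a normalised k-mer frequency profile for a single read; the function name and example read are hypothetical, and this is only the feature-extraction step, not the detector evaluated in the paper.

```python
from collections import Counter


def kmer_profile(read, k):
    """Relative frequency of each overlapping k-mer in a read."""
    if k <= 0:
        raise ValueError("k must be positive")
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    total = sum(counts.values())
    if total == 0:
        return {}  # read shorter than k: no k-mers to count
    return {kmer: c / total for kmer, c in counts.items()}


# Hypothetical short read; 'AC' occurs twice among five 2-mers.
profile = kmer_profile("ACGTAC", 2)
print(profile)
```

A classifier would compare such a profile against profiles learned from reference genomes. A read whose best match is still poor could then be assigned to an "unknown" class, although the scoring rule for that decision is not specified here.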
Ehrmann, M A; Vogel, R E
2001-11-01
An insertion sequence has been identified in the genome of Lactobacillus sanfranciscensis DSM 20451T as a segment of 1351 nucleotides containing 37-bp imperfect terminal inverted repeats. The sequence of this element encodes two out-of-phase, overlapping open reading frames, orfA and orfB, from which three putative proteins are produced. OrfAB is a transframe protein produced by -1 translational frameshifting between orfA and orfB that is presumed to be the transposase. The large orfAB of this element encodes a 342 amino acid protein that displays similarities with transposases encoded by bacterial insertion sequences belonging to the IS3 family. In L. sanfranciscensis type strain DSM 20451T, multiple truncated IS elements were identified. Inverse PCR was used to analyze target sites of four of these elements, but apart from their highly AT-rich character, no sequence specificity has been identified so far. Moreover, no flanking direct repeats were identified. Multiple copies of IS153 were detected by hybridization in other strains of L. sanfranciscensis. Resulting hybridization patterns were shown to differentiate between organisms at strain level rather than a probe targeted against the 16S rDNA. With a PCR-based approach, IS153 or highly similar sequences were detected in L. acidophilus, L. casei, L. malefermentans, L. plantarum, L. hilgardii, L. collinoides, L. farciminis, L. sakei, L. salivarius, and L. reuteri, as well as in Enterococcus faecium, Pediococcus acidilactici and P. pentosaceus.
Chaudhary, A.; Haack, S.K.; Duris, J.W.; Marsh, T.L.
2009-01-01
Studies of sulfidic springs have provided new insights into microbial metabolism, groundwater biogeochemistry, and geologic processes. We investigated Great Sulphur Spring on the western shore of Lake Erie and evaluated the phylogenetic affiliations of 189 bacterial and 77 archaeal 16S rRNA gene sequences from three habitats: the spring origin (11-m depth), bacterial-algal mats on the spring pond surface, and whitish filamentous materials from the spring drain. Water from the spring origin was cold, pH 6.3, and anoxic (H2, 5.4 nM; CH4, 2.70 μM) with concentrations of S2- (0.03 mM), SO42- (14.8 mM), Ca2+ (15.7 mM), and HCO3- (4.1 mM) similar to those in groundwater from the local aquifer. No archaeal and few bacterial sequences were >95% similar to sequences of cultivated organisms. Bacterial sequences were largely affiliated with sulfur-metabolizing or chemolithotrophic taxa in Beta-, Gamma-, Delta-, and Epsilonproteobacteria. Epsilonproteobacteria sequences similar to those obtained from other sulfidic environments and a new clade of Cyanobacteria sequences were particularly abundant (16% and 40%, respectively) in the spring origin clone library. Crenarchaeota sequences associated with archaeal-bacterial consortia in whitish filaments at a German sulfidic spring were detected only in a similar habitat at Great Sulphur Spring. This study expands the geographic distribution of many uncultured Archaea and Bacteria sequences to the Laurentian Great Lakes, indicates possible roles for epsilonproteobacteria in local aquifer chemistry and karst formation, documents new oscillatorioid Cyanobacteria lineages, and shows that uncultured, cold-adapted Crenarchaeota sequences may comprise a significant part of the microbial community of some sulfidic environments. Copyright © 2009, American Society for Microbiology. All Rights Reserved.
Single reaction, real time RT-PCR detection of all known avian and human metapneumoviruses.
Lemaitre, E; Allée, C; Vabret, A; Eterradossi, N; Brown, P A
2018-01-01
Current molecular methods for the detection of avian and human metapneumovirus (AMPV, HMPV) are specifically targeted towards each virus species or individual subgroups of these. Here a broad range SYBR Green I real time RT-PCR was developed which amplified a highly conserved fragment of sequence in the N open reading frame. This method was sufficiently efficient and specific in detecting all MPVs. Its validation according to the NF U47-600 norm for the four AMPV subgroups estimated low limits of detection between 1000 and 10 copies/μL, similar to detection levels described previously for real time RT-PCRs targeting specific subgroups. RNA viruses present a challenge for the design of durable molecular diagnostic tests due to the rate of change in their genome sequences, which can vary substantially in different areas and over time. The fact that the regions of sequence for primer hybridization in the described method have remained sufficiently conserved since the AMPV and HMPV diverged should give the best chance of continued detection of current subgroups and of potential unknown or future emerging MPV strains. Copyright © 2017 Elsevier B.V. All rights reserved.
Searching Remote Homology with Spectral Clustering with Symmetry in Neighborhood Cluster Kernels
Maulik, Ujjwal; Sarkar, Anasua
2013-01-01
Remote homology detection among proteins utilizing only the unlabelled sequences is a central problem in comparative genomics. The existing cluster kernel methods based on neighborhoods and profiles and the Markov clustering algorithms are currently the most popular methods for protein family recognition. The deviation from random walks with inflation or dependency on hard threshold in similarity measure in those methods requires an enhancement for homology detection among multi-domain proteins. We propose to combine spectral clustering with neighborhood kernels in Markov similarity for enhancing sensitivity in detecting homology independent of “recent” paralogs. The spectral clustering approach with new combined local alignment kernels more effectively exploits the unsupervised protein sequences globally reducing inter-cluster walks. When combined with the corrections based on modified symmetry based proximity norm deemphasizing outliers, the technique proposed in this article outperforms other state-of-the-art cluster kernels among all twelve implemented kernels. The comparison with the state-of-the-art string and mismatch kernels also show the superior performance scores provided by the proposed kernels. Similar performance improvement also is found over an existing large dataset. Therefore the proposed spectral clustering framework over combined local alignment kernels with modified symmetry based correction achieves superior performance for unsupervised remote homolog detection even in multi-domain and promiscuous domain proteins from Genolevures database families with better biological relevance. Source code available upon request. Contact: sarkar@labri.fr. PMID:23457439
Albitar, Adam; Ma, Wanlong; DeDios, Ivan; Estella, Jeffrey; Ahn, Inhye; Farooqui, Mohammed; Wiestner, Adrian; Albitar, Maher
2017-03-14
Patients with chronic lymphocytic leukemia (CLL) who develop resistance to Bruton tyrosine kinase (BTK) inhibitors are typically positive for mutations in BTK or phospholipase c gamma 2 (PLCγ2). We developed a high sensitivity (HS) assay utilizing wild-type blocking polymerase chain reaction achieved via bridged and locked nucleic acids. We used this high sensitivity assay in combination with Sanger sequencing and next generation sequencing (NGS) and tested cellular DNA and cell-free DNA (cfDNA) from patients with CLL treated with the BTK inhibitor, ibrutinib. We also tested ibrutinib-naïve patients with CLL. HS testing achieved 100x greater sensitivity than Sanger. HS Sanger sequencing was capable of detecting < 1 mutant allele in a background of 1000 wild-type alleles (1:1000). Similar sensitivity was achieved with HS NGS. No BTK or PLCγ2 mutations were detected in any of the 44 ibrutinib-naïve CLL patients. We demonstrate that without the HS testing 56% of samples positive for BTK mutations and 85% of samples positive for PLCγ2 mutations would have been missed. With the use of HS, we were able to detect multiple mutant clones in the same sample in 37.5% of patients; most would have been missed without HS testing. We also demonstrate that with HS sequencing, plasma cfDNA is more reliable than cellular DNA in detecting mutations. Our studies indicate that wild-type blocking and HS sequencing are necessary for proper and early detection of BTK or PLCγ2 mutations in monitoring patients treated with BTK inhibitors. Furthermore, cfDNA from plasma is a very reliable sample type for testing.
Patarca, R; Dorta, B; Ramirez, J L
1982-01-01
As part of a project pertaining to the organization of ribosomal genes in Kinetoplastidae, we have created a data base for published sequences of ribosomal nucleic acids, with information in Spanish. As a first step in their processing, we have written a computer program which introduces the new feature of determining the length of the fragments produced after single or multiple digestion with any of the known restriction enzymes. With this information we have detected conserved SAU 3A sites: (i) at the 5' end of the 5.8S rRNA and at the 3' end of the small subunit rRNA, both included in similar larger sequences; (ii) in the 5.8S rRNA of vertebrates (a second one), which is not present in lower eukaryotes, showing a clear evolutionary divergence; and (iii) at the 5' terminus of the small subunit rRNA, included in a larger conserved sequence. The possible biological importance of these sequences is discussed. PMID:6278402
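The fragment-length calculation described in this abstract can be illustrated with a minimal sketch. It is not the authors' 1982 program, whose details the abstract does not give. It assumes a linear molecule and a small hypothetical enzyme table (`ENZYMES`) holding each recognition sequence and the cut offset within it; Sau3AI cuts immediately before the G of GATC and EcoRI cuts between G and AATTC.

```python
import re

# Illustrative enzyme table (an assumption, not taken from the abstract):
# recognition sequence (5'->3') and cut offset from the start of the site.
ENZYMES = {
    "Sau3AI": ("GATC", 0),    # cuts ^GATC
    "EcoRI": ("GAATTC", 1),   # cuts G^AATTC
}

def cut_positions(seq, enzyme):
    """Return 0-based cut positions of one enzyme on the top strand."""
    site, offset = ENZYMES[enzyme]
    # A lookahead is used so that overlapping sites are also reported.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq.upper())]

def fragment_lengths(seq, enzymes):
    """Fragment lengths after single or multiple digestion of a linear sequence."""
    cuts = sorted({p for e in enzymes for p in cut_positions(seq, e)})
    cuts = [p for p in cuts if 0 < p < len(seq)]  # a cut at either end yields no fragment
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

if __name__ == "__main__":
    demo = "AAGATCTTTGAATTCGG"  # made-up 17-nt sequence
    print(fragment_lengths(demo, ["Sau3AI"]))
    print(fragment_lengths(demo, ["Sau3AI", "EcoRI"]))
```

Pooling the cut positions of all enzymes before computing the boundaries is what makes a multiple digestion a simple extension of the single-enzyme case.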
Oka, Tomoichiro; Doan, Yen Hai; Shimoike, Takashi; Haga, Kei; Takizawa, Takenori
2017-12-01
Sapoviruses (SaVs) are enteric viruses and have been detected in various mammals. They are divided into multiple genogroups and genotypes based on the entire major capsid protein (VP1) encoding region sequences. In this study, we determined the first complete genome sequences of two genogroup V, genotype 3 (GV.3) SaV strains detected from swine fecal samples, in combination with Illumina MiSeq sequencing of the libraries prepared from viral RNA and PCR products. The lengths of the viral genome (7494 nucleotides [nt] excluding polyA tail) and short 5'-untranslated region (14 nt) as well as two predicted open reading frames are similar to those of other SaVs. The amino acid differences between the two porcine SaVs are most frequent in the central region of the VP1-encoding region. A stem-loop structure which was predicted in the first 41 nt of the 5'-terminal region of GV.3 SaVs and the other available complete genome sequences of SaVs may have a critical role in viral genome replication. Our study provides complete genome sequences of rarely reported GV.3 SaV strains and highlights the common 5'-terminal genomic feature of SaVs detected from different mammalian species.
Rabausch, U.; Juergensen, J.; Ilmberger, N.; Böhnke, S.; Fischer, S.; Schubach, B.; Schulte, M.
2013-01-01
The functional detection of novel enzymes other than hydrolases from metagenomes is limited since only a very few reliable screening procedures are available that allow the rapid screening of large clone libraries. For the discovery of flavonoid-modifying enzymes in genome and metagenome clone libraries, we have developed a new screening system based on high-performance thin-layer chromatography (HPTLC). This metagenome extract thin-layer chromatography analysis (META) allows the rapid detection of glycosyltransferase (GT) and also other flavonoid-modifying activities. The developed screening method is highly sensitive, and an amount of 4 ng of modified flavonoid molecules can be detected. This novel technology was validated against a control library of 1,920 fosmid clones generated from a single Bacillus cereus isolate and then used to analyze more than 38,000 clones derived from two different metagenomic preparations. Thereby we identified two novel UDP glycosyltransferase (UGT) genes. The metagenome-derived gtfC gene encoded a 52-kDa protein, and the deduced amino acid sequence was weakly similar to sequences of putative UGTs from Fibrisoma and Dyadobacter. GtfC mediated the transfer of different hexose moieties and exhibited high activities on flavones, flavonols, flavanones, and stilbenes and also accepted isoflavones and chalcones. From the control library we identified a novel macroside glycosyltransferase (MGT) with a calculated molecular mass of 46 kDa. The deduced amino acid sequence was highly similar to sequences of MGTs from Bacillus thuringiensis. Recombinant MgtB transferred the sugar residue from UDP-glucose effectively to flavones, flavonols, isoflavones, and flavanones. Moreover, MgtB exhibited high activity on larger flavonoid molecules such as tiliroside. PMID:23686272
Virk, Ramandeep Kaur; Tambyah, Paul Anantharajah; Inoue, Masafumi; Lim, Elizabeth Ai-Sim; Chan, Ka-Wei; Chua, Catherine; Tan, Boon-Huan
2014-01-01
Southeast Asia is believed to be a potential locus for the emergence of novel influenza strains, and therefore accurate sentinel surveillance in the region is critical. Limited information exists on sentinel surveillance of influenza-like illness (ILI) in young adults in Singapore in a University campus setting. The objective of the present study was to determine the proportion of ILI caused by influenza A and B viruses in a university cohort in Singapore. We conducted a prospective surveillance study from May through October 2007, at the National University of Singapore (NUS). Basic demographic information and nasopharyngeal swabs were collected from students and staff with ILI. Reverse-transcriptase PCR (RT-PCR) and viral isolation were employed to detect influenza viruses. Sequencing of hemagglutinin (HA) and neuraminidase (NA) genes of some representative isolates was also performed. Overall proportions of influenza A and B virus infections were 47/266 (18%) and 9/266 (3%) respectively. The predominant subtype was A/H3N2 (55%) and the rest were A/H1N1 (45%). The overall sensitivity difference for detection of influenza A viruses using RT-PCR and viral isolation was 53%. Phylogenetic analyses of HA and NA gene sequences of Singapore strains showed identities higher than 98% within both the genes. The strains were more similar to strains included in the WHO vaccine recommendation for the following year (2008). Genetic markers of oseltamivir resistance were not detected in any of the sequenced Singapore isolates. HA and NA gene sequences of Singapore strains were similar to vaccine strains for the upcoming influenza season. No drug resistance was found. Sentinel surveillance on university campuses should make use of molecular methods to better detect emerging and re-emerging influenza viral threats.
Detection and phylogenetic analysis of hepatitis E viruses from mongooses in Okinawa, Japan.
Nidaira, Minoru; Takahashi, Kazuaki; Ogura, Go; Taira, Katsuya; Okano, Shou; Kudaka, Jun; Itokazu, Kiyomasa; Mishiro, Shunji; Nakamura, Masaji
2012-12-01
Hepatitis E virus (HEV) infection has previously been reported in wild mongooses on Okinawa Island; to date however, only one HEV RNA sequence has been identified in a mongoose. Hence, this study was performed to detect HEV RNA in 209 wild mongooses on Okinawa Island. Six (2.9%) samples tested positive for HEV RNA. Phylogenetic analysis revealed that 6 HEV RNAs belonged to genotype 3 and were classified into groups A and B. In group B, mongoose-derived HEV sequences were very similar to mongoose HEV previously detected on Okinawa Island, as well as to those of a pig. This investigation emphasized the possibility that the mongoose is a reservoir animal for HEV on Okinawa Island.
Detecting earthquakes over a seismic network using single-station similarity measures
NASA Astrophysics Data System (ADS)
Bergen, Karianne J.; Beroza, Gregory C.
2018-06-01
New blind waveform-similarity-based detection methods, such as Fingerprint and Similarity Thresholding (FAST), have shown promise for detecting weak signals in long-duration, continuous waveform data. While blind detectors are capable of identifying similar or repeating waveforms without templates, they can also be susceptible to false detections due to local correlated noise. In this work, we present a set of three new methods that allow us to extend single-station similarity-based detection over a seismic network; event-pair extraction, pairwise pseudo-association, and event resolution complete a post-processing pipeline that combines single-station similarity measures (e.g. FAST sparse similarity matrix) from each station in a network into a list of candidate events. The core technique, pairwise pseudo-association, leverages the pairwise structure of event detections in its network detection model, which allows it to identify events observed at multiple stations in the network without modeling the expected moveout. Though our approach is general, we apply it to extend FAST over a sparse seismic network. We demonstrate that our network-based extension of FAST is both sensitive and maintains a low false detection rate. As a test case, we apply our approach to 2 weeks of continuous waveform data from five stations during the foreshock sequence prior to the 2014 Mw 8.2 Iquique earthquake. Our method identifies nearly five times as many events as the local seismicity catalogue (including 95 per cent of the catalogue events), and less than 1 per cent of these candidate events are false detections.
The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families.
Yooseph, Shibu; Sutton, Granger; Rusch, Douglas B; Halpern, Aaron L; Williamson, Shannon J; Remington, Karin; Eisen, Jonathan A; Heidelberg, Karla B; Manning, Gerard; Li, Weizhong; Jaroszewski, Lukasz; Cieplak, Piotr; Miller, Christopher S; Li, Huiying; Mashiyama, Susan T; Joachimiak, Marcin P; van Belle, Christopher; Chandonia, John-Marc; Soergel, David A; Zhai, Yufeng; Natarajan, Kannan; Lee, Shaun; Raphael, Benjamin J; Bafna, Vineet; Friedman, Robert; Brenner, Steven E; Godzik, Adam; Eisenberg, David; Dixon, Jack E; Taylor, Susan S; Strausberg, Robert L; Frazier, Marvin; Venter, J Craig
2007-03-01
Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.
[Soil propagule bank of ectomycorrhizal fungi in natural forest of Pinus bungeana].
Zhao, Nan Xing; Han, Qi Sheng; Huang, Jian
2017-12-01
To conserve and restore the forest of Pinus bungeana, we investigated the soil propagule bank of ectomycorrhizal (ECM) fungi in a severely disturbed natural forest of P. bungeana in Shaanxi Province, China. We used a seedling-bioassay method to bait the ECM fungal propagules in the soils collected from the forest site. ECM was identified by combining morphotyping with ITS-PCR-sequencing. We obtained 73 unique sequences from the ECM associated with P. bungeana seedlings, and assigned them to 12 ECM fungal OTUs at a sequence similarity threshold of 97%. The rarefaction curve showed that almost all ECM fungi in the propagule bank were detected. The most frequent OTU (80%) showed poor similarity (75%) with existing sequences in the online database, which suggested it might be a new species. Cenococcum geophilum, Tomentella sp. and Tuber sp. were common species in the propagule bank. Although C. geophilum and Tomentella sp. were frequently detected in other soil propagule banks of pine forest, the most frequent OTU was not assigned to a known genus or family, which indicated the host specificity of ECM propagule banks associated with P. bungeana. This result confirmed the importance of the special ECM propagule banks associated with P. bungeana for natural forest restoration.
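Assigning sequences to OTUs at a 97% similarity threshold can be sketched as a greedy centroid clustering. The abstract does not name the clustering tool the authors used, so this is only an illustration. It assumes the sequences are already aligned to equal length, so that identity is the fraction of matching columns; the function names are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(seqs, threshold=0.97):
    """Greedy clustering: each sequence joins the first OTU whose centroid
    it matches at or above the threshold, otherwise it founds a new OTU."""
    centroids = []
    otus = []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                otus[i].append(s)
                break
        else:
            centroids.append(s)
            otus.append([s])
    return otus

if __name__ == "__main__":
    base = "ACGT" * 25               # made-up 100-nt reference
    near = "TT" + base[2:]           # 2 mismatches, 98% identity
    far = "T" * 10 + base[10:]       # 8 mismatches, 92% identity
    print([len(o) for o in greedy_otus([base, near, far])])
```

Greedy clustering depends on input order, which is why tools of this kind typically sort sequences (for example by abundance or length) before clustering.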
Salis, R. K.; Bruder, A.; Piggott, J. J.; Summerfield, T. C.; Matthaei, C. D.
2017-01-01
Disentangling the individual and interactive effects of multiple stressors on microbial communities is a key challenge to our understanding and management of ecosystems. Advances in molecular techniques allow studying microbial communities in situ and with high taxonomic resolution. However, the taxonomic level which provides the best trade-off between our ability to detect multiple-stressor effects versus the goal of studying entire communities remains unknown. We used outdoor mesocosms simulating small streams to investigate the effects of four agricultural stressors (nutrient enrichment, the nitrification inhibitor dicyandiamide (DCD), fine sediment and flow velocity reduction) on stream bacteria (phyla, orders, genera, and species represented by Operational Taxonomic Units with 97% sequence similarity). Community composition was assessed using amplicon sequencing (16S rRNA gene, V3-V4 region). DCD was the most pervasive stressor, affecting evenness and most abundant taxa, followed by sediment and flow velocity. Stressor pervasiveness was similar across taxonomic levels and lower levels did not perform better in detecting stressor effects. Community coverage decreased from 96% of all sequences for abundant phyla to 28% for species. Order-level responses were generally representative of responses of corresponding genera and species, suggesting that this level may represent the best compromise between stressor sensitivity and coverage of bacterial communities. PMID:28327636
Shafiee, Mohammad Javad; Chung, Audrey G; Khalvati, Farzad; Haider, Masoom A; Wong, Alexander
2017-10-01
While lung cancer is the second most diagnosed form of cancer in men and women, a sufficiently early diagnosis can be pivotal in patient survival rates. Imaging-based, or radiomics-driven, detection methods have been developed to aid diagnosticians, but largely rely on hand-crafted features that may not fully encapsulate the differences between cancerous and healthy tissue. Recently, the concept of discovery radiomics was introduced, where custom abstract features are discovered from readily available imaging data. We propose an evolutionary deep radiomic sequencer discovery approach based on evolutionary deep intelligence. Motivated by patient privacy concerns and the idea of operational artificial intelligence, the evolutionary deep radiomic sequencer discovery approach organically evolves increasingly more efficient deep radiomic sequencers that produce significantly more compact yet similarly descriptive radiomic sequences over multiple generations. As a result, this framework improves operational efficiency and enables diagnosis to be run locally at the radiologist's computer while maintaining detection accuracy. We evaluated the evolved deep radiomic sequencer (EDRS) discovered via the proposed evolutionary deep radiomic sequencer discovery framework against state-of-the-art radiomics-driven and discovery radiomics methods using clinical lung CT data with pathologically proven diagnostic data from the LIDC-IDRI dataset. The EDRS shows improved sensitivity (93.42%), specificity (82.39%), and diagnostic accuracy (88.78%) relative to previous radiomics approaches.
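For reference, the three reported measures follow from the counts in a binary confusion matrix. The sketch below uses the standard definitions; the counts in the example are invented for illustration and are not taken from the LIDC-IDRI evaluation.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall fraction correct
    return sensitivity, specificity, accuracy

if __name__ == "__main__":
    # Hypothetical counts: 100 malignant and 100 benign cases.
    sens, spec, acc = diagnostic_metrics(tp=90, fn=10, tn=80, fp=20)
    print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

Accuracy is a prevalence-weighted mix of sensitivity and specificity, so it always lies between the two.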
Detecting and treating breast cancer resistance to EGFR inhibitors
Moonlee, Sun-Young; Bissell, Mina J.; Furuta, Saori; Meier, Roland; Kenny, Paraic A.
2016-04-05
The application describes therapeutic compositions and methods for treating cancer. For example, therapeutic compositions and methods related to inhibition of FAM83A (family with sequence similarity 83) are provided. The application also describes methods for diagnosing cancer resistance to EGFR inhibitors. For example, a method of diagnosing cancer resistance to EGFR inhibitors by detecting increased FAM83A levels is described.
Use of vectors in sequence analysis.
Ishikawa, T; Yamamoto, K; Yoshikura, H
1987-10-01
Applications of the vector diagram, a new type of representation of protein structure, in homology search of various proteins including oncogene products are presented. The method takes account of various kinds of information concerning the properties of amino acids, such as Chou and Fasman's probability data. The method can detect conformational similarities of proteins which may not be detected by the conventional programs.
Evaluating the protein coding potential of exonized transposable element sequences
Piriyapongsa, Jittima; Rutledge, Mark T; Patel, Sanil; Borodovsky, Mark; Jordan, I King
2007-01-01
Background: Transposable element (TE) sequences, once thought to be merely selfish or parasitic members of the genomic community, have been shown to contribute a wide variety of functional sequences to their host genomes. Analysis of complete genome sequences has turned up numerous cases where TE sequences have been incorporated as exons into mRNAs, and it is widely assumed that such 'exonized' TEs encode protein sequences. However, the extent to which TE-derived sequences actually encode proteins is unknown and a matter of some controversy. We have tried to address this outstanding issue from two perspectives: (i) by evaluating ascertainment biases related to the search methods used to uncover TE-derived protein coding sequences (CDS) and (ii) through a probabilistic codon-frequency based analysis of the protein coding potential of TE-derived exons. Results: We compared the ability of three classes of sequence similarity search methods to detect TE-derived sequences among data sets of experimentally characterized proteins: (1) a profile-based hidden Markov model (HMM) approach, (2) BLAST methods and (3) RepeatMasker. Profile-based methods are more sensitive and more selective than the other methods evaluated. However, the application of profile-based search methods to the detection of TE-derived sequences among well-curated experimentally characterized protein data sets did not turn up many more cases than had been previously detected and nowhere near as many cases as recent genome-wide searches have. We observed that the different search methods used were complementary in the sense that they yielded largely non-overlapping sets of hits and differed in their ability to recover known cases of TE-derived CDS. The probabilistic analysis of TE-derived exon sequences indicates that these sequences have low protein coding potential on average. In particular, non-autonomous TEs that do not encode protein sequences, such as Alu elements, are frequently exonized but unlikely to encode protein sequences. Conclusion: The exaptation of the numerous TE sequences found in exons as bona fide protein coding sequences may prove to be far less common than has been suggested by the analysis of complete genomes. We hypothesize that many exonized TE sequences actually function as post-transcriptional regulators of gene expression, rather than coding sequences, which may act through a variety of double stranded RNA related regulatory pathways. Indeed, their relatively high copy numbers and similarity to sequences dispersed throughout the genome suggest that exonized TE sequences could serve as master regulators with a wide scope of regulatory influence. Reviewers: This article was reviewed by Itai Yanai, Kateryna D. Makova, Melissa Wilson (nominated by Kateryna D. Makova) and Cedric Feschotte (nominated by John M. Logsdon Jr.). PMID:18036258
32-channel single photon counting module for ultrasensitive detection of DNA sequences
NASA Astrophysics Data System (ADS)
Gudkov, Georgiy; Dhulla, Vinit; Borodin, Anatoly; Gavrilov, Dmitri; Stepukhovich, Andrey; Tsupryk, Andrey; Gorbovitski, Boris; Gorfinkel, Vera
2006-10-01
We continue our work on the design and implementation of multi-channel single photon detection systems for highly sensitive detection of ultra-weak fluorescence signals, for high-performance, multi-lane DNA sequencing instruments. A fiberized, 32-channel single photon detection (SPD) module based on a single photon avalanche diode (SPAD), model C30902S-DTC, from Perkin Elmer Optoelectronics (PKI) has been designed and implemented. The unavailability of high performance, large area SPAD arrays and our desire to design high performance photon counting systems drive us to use individual diodes. Slight modifications in our quenching circuit have doubled the linear range of our system from 1 MHz to 2 MHz, which is the upper limit for these devices, and the maximum saturation count rate has increased to 14 MHz. The detector module comprises a single board computer PC-104 that enables data visualization, recording, processing, and transfer. Features of this system include a very low dark count (300-1000 counts/s), robust, efficient and simple data collection and processing, ease of connectivity to any other application with similar requirements, and performance similar to that of the best commercially available single photon counting module (SPCM from PKI).
Sakai, Hiroaki; Kanamori, Hiroyuki; Arai-Kichise, Yuko; Shibata-Hatta, Mari; Ebana, Kaworu; Oono, Youko; Kurita, Kanako; Fujisawa, Hiroko; Katagiri, Satoshi; Mukai, Yoshiyuki; Hamada, Masao; Itoh, Takeshi; Matsumoto, Takashi; Katayose, Yuichi; Wakasa, Kyo; Yano, Masahiro; Wu, Jianzhong
2014-01-01
Having a deep genetic structure evolved during its domestication and adaptation, the Asian cultivated rice (Oryza sativa) displays considerable physiological and morphological variations. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath by using the advanced next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35 139 expressed loci annotated by RNA-Seq analysis. We detected 2 787 250 single-nucleotide polymorphisms (SNPs) and 7393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2 216 251 SNPs and 3780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. We detected at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences in the Kasalath genome in comparison with the Nipponbare reference genome. Mapping of the publicly available NGS short reads from 50 rice accessions proved the necessity and the value of using the Kasalath whole-genome sequence as an additional reference to capture the sequence polymorphisms that cannot be discovered by using the Nipponbare sequence alone. PMID:24578372
Interactive web-based identification and visualization of transcript shared sequences.
Azhir, Alaleh; Merino, Louis-Henri; Nauen, David W
2018-05-12
We have developed TraC (Transcript Consensus), a web-based tool for detecting and visualizing shared sequences among two or more mRNA transcripts such as splice variants. Results including exon-exon boundaries are returned in a highly intuitive, data-rich, interactive plot that permits users to explore the similarities and differences of multiple transcript sequences. The online tool (http://labs.pathology.jhu.edu/nauen/trac/) is free to use. The source code is freely available for download (https://github.com/nauenlab/TraC). Copyright © 2018 Elsevier Inc. All rights reserved.
Validation of a next-generation sequencing assay for clinical molecular oncology.
Cottrell, Catherine E; Al-Kateb, Hussam; Bredemeyer, Andrew J; Duncavage, Eric J; Spencer, David H; Abel, Haley J; Lockwood, Christina M; Hagemann, Ian S; O'Guin, Stephanie M; Burcea, Lauren C; Sawyer, Christopher S; Oschwald, Dayna M; Stratman, Jennifer L; Sher, Dorie A; Johnson, Mark R; Brown, Justin T; Cliften, Paul F; George, Bijoy; McIntosh, Leslie D; Shrivastava, Savita; Nguyen, Tudung T; Payton, Jacqueline E; Watson, Mark A; Crosby, Seth D; Head, Richard D; Mitra, Robi D; Nagarajan, Rakesh; Kulkarni, Shashikant; Seibert, Karen; Virgin, Herbert W; Milbrandt, Jeffrey; Pfeifer, John D
2014-01-01
Currently, oncology testing includes molecular studies and cytogenetic analysis to detect genetic aberrations of clinical significance. Next-generation sequencing (NGS) allows rapid analysis of multiple genes for clinically actionable somatic variants. The WUCaMP assay uses targeted capture for NGS analysis of 25 cancer-associated genes to detect mutations at actionable loci. We present clinical validation of the assay and a detailed framework for design and validation of similar clinical assays. Deep sequencing of 78 tumor specimens (≥ 1000× average unique coverage across the capture region) achieved high sensitivity for detecting somatic variants at low allele fraction (AF). Validation revealed sensitivities and specificities of 100% for detection of single-nucleotide variants (SNVs) within coding regions, compared with SNP array sequence data (95% CI = 83.4-100.0 for sensitivity and 94.2-100.0 for specificity) or whole-genome sequencing (95% CI = 89.1-100.0 for sensitivity and 99.9-100.0 for specificity) of HapMap samples. Sensitivity for detecting variants at an observed 10% AF was 100% (95% CI = 93.2-100.0) in HapMap mixes. Analysis of 15 masked specimens harboring clinically reported variants yielded concordant calls for 13/13 variants at AF of ≥ 15%. The WUCaMP assay is a robust and sensitive method to detect somatic variants of clinical significance in molecular oncology laboratories, with reduced time and cost of genetic analysis allowing for strategic patient management. Copyright © 2014 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
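The intervals quoted above pair a 100% point estimate with a lower bound below 100%, which is what an exact binomial interval gives when every trial succeeds. The abstract does not say which interval method the authors used. As an assumption for illustration, the sketch below uses the Clopper-Pearson interval, whose lower bound reduces to (α/2)^(1/n) when all n trials are successes. The sample size in the example is the 13/13 concordant calls mentioned in the abstract, for which no interval is reported there.

```python
def exact_lower_bound_all_successes(n, alpha=0.05):
    """Two-sided Clopper-Pearson lower bound for a proportion when all n trials succeed.

    With x = n successes the bound has the closed form (alpha/2) ** (1/n);
    the upper bound is 1.0.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return (alpha / 2) ** (1.0 / n)

if __name__ == "__main__":
    for n in (13, 50, 100):
        lo = exact_lower_bound_all_successes(n)
        print(f"n={n}: 95% CI = {lo:.1%} to 100.0%")
```

The bound approaches 100% only as n grows, so a perfect observed result on a small validation set remains compatible with a noticeably lower true sensitivity.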
Prediction of molecular mimicry candidates in human pathogenic bacteria.
Doxey, Andrew C; McConkey, Brendan J
2013-08-15
Molecular mimicry of host proteins is a common strategy adopted by bacterial pathogens to interfere with and exploit host processes. Despite the availability of pathogen genomes, few studies have attempted to predict virulence-associated mimicry relationships directly from genomic sequences. Here, we analyzed the proteomes of 62 pathogenic and 66 non-pathogenic bacterial species, and screened for the top pathogen-specific or pathogen-enriched sequence similarities to human proteins. The screen identified approximately 100 potential mimicry relationships including well-characterized examples among the top-scoring hits (e.g., RalF, internalin, yopH, and others), with about 1/3 of predicted relationships supported by existing literature. Examination of homology to virulence factors, statistically enriched functions, and comparison with literature indicated that the detected mimics target key host structures (e.g., extracellular matrix, ECM) and pathways (e.g., cell adhesion, lipid metabolism, and immune signaling). The top-scoring and most widespread mimicry pattern detected among pathogens consisted of elevated sequence similarities to ECM proteins including collagens and leucine-rich repeat proteins. Unexpectedly, analysis of the pathogen counterparts of these proteins revealed that they have evolved independently in different species of bacterial pathogens from separate repeat amplifications. Thus, our analysis provides evidence for two classes of mimics: complex proteins such as enzymes that have been acquired by eukaryote-to-pathogen horizontal transfer, and simpler repeat proteins that have independently evolved to mimic the host ECM. Ultimately, computational detection of pathogen-specific and pathogen-enriched similarities to host proteins provides insights into potentially novel mimicry-mediated virulence mechanisms of pathogenic bacteria.
Prediction of molecular mimicry candidates in human pathogenic bacteria
Doxey, Andrew C; McConkey, Brendan J
2013-01-01
Molecular mimicry of host proteins is a common strategy adopted by bacterial pathogens to interfere with and exploit host processes. Despite the availability of pathogen genomes, few studies have attempted to predict virulence-associated mimicry relationships directly from genomic sequences. Here, we analyzed the proteomes of 62 pathogenic and 66 non-pathogenic bacterial species, and screened for the top pathogen-specific or pathogen-enriched sequence similarities to human proteins. The screen identified approximately 100 potential mimicry relationships including well-characterized examples among the top-scoring hits (e.g., RalF, internalin, yopH, and others), with about 1/3 of predicted relationships supported by existing literature. Examination of homology to virulence factors, statistically enriched functions, and comparison with literature indicated that the detected mimics target key host structures (e.g., extracellular matrix, ECM) and pathways (e.g., cell adhesion, lipid metabolism, and immune signaling). The top-scoring and most widespread mimicry pattern detected among pathogens consisted of elevated sequence similarities to ECM proteins including collagens and leucine-rich repeat proteins. Unexpectedly, analysis of the pathogen counterparts of these proteins revealed that they have evolved independently in different species of bacterial pathogens from separate repeat amplifications. Thus, our analysis provides evidence for two classes of mimics: complex proteins such as enzymes that have been acquired by eukaryote-to-pathogen horizontal transfer, and simpler repeat proteins that have independently evolved to mimic the host ECM. Ultimately, computational detection of pathogen-specific and pathogen-enriched similarities to host proteins provides insights into potentially novel mimicry-mediated virulence mechanisms of pathogenic bacteria. PMID:23715053
Complete genome sequence of a potyvirus infecting yam beans (Pachyrhizus spp.) in Peru.
Fuentes, Segundo; Heider, Bettina; Tasso, Ruby Carolina; Romero, Elisa; Zum Felde, Thomas; Kreuze, Jan Frederik
2012-04-01
In 2010, yam beans in a field trial in Peru showed viral disease symptoms. Graft-transmission and positive ELISA results using potyvirus-specific antibodies suggested that the symptoms could be the result of a potyviral infection. Small interfering RNAs (siRNAs) were extracted from one of the samples and sent for high-throughput sequencing. The full genome of a new potyvirus could be assembled from the resulting siRNA sequences, and it was sufficiently different from other sequences to be considered a member of a new species, which we have designated Yam bean mosaic virus (YBMV). Sequence similarity suggests that YBMV has also been detected in yam beans in Indonesia.
2017-01-01
We present a sensor that exploits the phenomenon of upconversion luminescence to detect the presence of specific sequences of small oligonucleotides such as miRNAs among others. The sensor is based on NaYF4:Yb,Er@SiO2 nanoparticles functionalized with ssDNA that contain azide groups on the 3′ ends. In the presence of a target sequence, interstrand ligation is possible via the click-reaction between one azide of the upconversion probe and a DBCO-ssDNA-biotin probe present in the solution. As a result of this specific and selective process, biotin is covalently attached to the surface of the upconversion nanoparticles. The presence of biotin on the surface of the nanoparticles allows their selective capture on a streptavidin-coated support, giving a luminescent signal proportional to the amount of target strands present in the test samples. With the aim of studying the analytical properties of the sensor, total RNA samples were extracted from healthy mosquitoes and were spiked-in with a specific target sequence at different concentrations. The result of these experiments revealed that the sensor was able to detect 10⁻¹⁷ moles per well (100 fM) of the target sequence in mixtures containing 100 ng of total RNA per well. A similar limit of detection was found for spiked human serum samples, demonstrating the suitability of the sensor for detecting specific sequences of small oligonucleotides under real conditions. In contrast, in the presence of noncomplementary sequences or sequences having mismatches, the luminescent signal was negligible or conspicuously reduced. PMID:28332400
Eco-epidemiology of Novel Bartonella Genotypes from Parasitic Flies of Insectivorous Bats.
Sándor, Attila D; Földvári, Mihály; Krawczyk, Aleksandra I; Sprong, Hein; Corduneanu, Alexandra; Barti, Levente; Görföl, Tamás; Estók, Péter; Kováts, Dávid; Szekeres, Sándor; László, Zoltán; Hornok, Sándor; Földvári, Gábor
2018-04-29
Bats are important zoonotic reservoirs for many pathogens worldwide. Although their highly specialized ectoparasites, bat flies (Diptera: Hippoboscoidea), can transmit Bartonella bacteria including human pathogens, their eco-epidemiology is unexplored. Here, we analyzed the prevalence and diversity of Bartonella strains sampled from 10 bat fly species from 14 European bat species. We found high prevalence of Bartonella spp. in most bat fly species with wide geographical distribution. Bat species explained most of the variance in Bartonella distribution with the highest prevalence of infected flies recorded in species living in dense groups exclusively in caves. Bat gender but not bat fly gender was also an important factor with the more mobile male bats giving more opportunity for the ectoparasites to access several host individuals. We detected high diversity of Bartonella strains (18 sequences, 7 genotypes, in 9 bat fly species) comparable with tropical assemblages of bat-bat fly association. Most genotypes are novel (15 out of 18 recorded strains have a similarity of 92-99%, with three sequences having 100% similarity to Bartonella spp. sequences deposited in GenBank) with currently unknown pathogenicity; however, 4 of these sequences are similar (up to 92% sequence similarity) to Bartonella spp. with known zoonotic potential. The high prevalence and diversity of Bartonella spp. suggests a long shared evolution of these bacteria with bat flies and bats providing excellent study targets for the eco-epidemiology of host-vector-pathogen cycles.
Núñez-Vivanco, Gabriel; Valdés-Jiménez, Alejandro; Besoaín, Felipe; Reyes-Parada, Miguel
2016-01-01
Since the structure of proteins is more conserved than the sequence, the identification of conserved three-dimensional (3D) patterns among a set of proteins, can be important for protein function prediction, protein clustering, drug discovery and the establishment of evolutionary relationships. Thus, several computational applications to identify, describe and compare 3D patterns (or motifs) have been developed. Often, these tools consider a 3D pattern as that described by the residues surrounding co-crystallized/docked ligands available from X-ray crystal structures or homology models. Nevertheless, many of the protein structures stored in public databases do not provide information about the location and characteristics of ligand binding sites and/or other important 3D patterns such as allosteric sites, enzyme-cofactor interaction motifs, etc. This makes necessary the development of new ligand-independent methods to search and compare 3D patterns in all available protein structures. Here we introduce Geomfinder, an intuitive, flexible, alignment-free and ligand-independent web server for detailed estimation of similarities between all pairs of 3D patterns detected in any two given protein structures. We used around 1100 protein structures to form pairs of proteins which were assessed with Geomfinder. In these analyses each protein was considered in only one pair (e.g. in a subset of 100 different proteins, 50 pairs of proteins can be defined). 
Thus: (a) Geomfinder detected identical pairs of 3D patterns in a series of monoamine oxidase-B structures, which corresponded to the effectively similar ligand binding sites at these proteins; (b) we identified structural similarities among pairs of protein structures which are targets of compounds such as acarbose, benzamidine, adenosine triphosphate and pyridoxal phosphate; these similar 3D patterns are not detected using sequence-based methods; (c) the detailed evaluation of three specific cases showed the versatility of Geomfinder, which was able to discriminate between similar and different 3D patterns related to binding sites of common substrates in a range of diverse proteins. Geomfinder allows the detection of similar 3D patterns between any pair of protein structures, regardless of the divergence between their amino acid sequences. Although the software is not intended for simultaneous multiple comparisons in a large number of proteins, it can be particularly useful in cases such as the structure-based design of multitarget drugs, where a detailed analysis of 3D pattern similarities between a few selected protein targets is essential.
Taboada, Eduardo; Grant, Christopher C. R.; Blakeston, Connie; Pollari, Frank; Marshall, Barbara; Rahn, Kris; MacKinnon, Joanne; Daignault, Danielle; Pillai, Dylan; Ng, Lai-King
2012-01-01
Campylobacter spp. may be responsible for unreported outbreaks of food-borne disease. The detection of these outbreaks is made more difficult by the fact that appropriate methods for detecting clusters of Campylobacter have not been well defined. We have compared the characteristics of five molecular typing methods on Campylobacter jejuni and C. coli isolates obtained from human and nonhuman sources during sentinel site surveillance during a 3-year period. Comparative genomic fingerprinting (CGF) appears to be one of the optimal methods for the detection of clusters of cases, and it could be supplemented by the sequencing of the flaA gene short variable region (flaA SVR sequence typing), with or without subsequent multilocus sequence typing (MLST). Different methods may be optimal for uncovering different aspects of source attribution. Finally, the use of several different molecular typing or analysis methods for comparing individuals within a population reveals much more about that population than a single method. Similarly, comparing several different typing methods reveals a great deal about differences in how the methods group individuals within the population. PMID:22162562
Fantin, Yuri S.; Neverov, Alexey D.; Favorov, Alexander V.; Alvarez-Figueroa, Maria V.; Braslavskaya, Svetlana I.; Gordukova, Maria A.; Karandashova, Inga V.; Kuleshov, Konstantin V.; Myznikova, Anna I.; Polishchuk, Maya S.; Reshetov, Denis A.; Voiciehovskaya, Yana A.; Mironov, Andrei A.; Chulanov, Vladimir P.
2013-01-01
Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing. PMID:23382983
Sequence-structure relationships in RNA loops: establishing the basis for loop homology modeling.
Schudoma, Christian; May, Patrick; Nikiforova, Viktoria; Walther, Dirk
2010-01-01
The specific function of RNA molecules frequently resides in their seemingly unstructured loop regions. We performed a systematic analysis of RNA loops extracted from experimentally determined three-dimensional structures of RNA molecules. A comprehensive loop-structure data set was created and organized into distinct clusters based on structural and sequence similarity. We detected clear evidence of the hallmark of homology present in the sequence-structure relationships in loops. Loops differing by <25% in sequence identity fold into very similar structures. Thus, our results support the application of homology modeling for RNA loop model building. We established a threshold that may guide the sequence divergence-based selection of template structures for RNA loop homology modeling. Of all possible sequences that are, under the assumption of isosteric relationships, theoretically compatible with actual sequences observed in RNA structures, only a small fraction is contained in the Rfam database of RNA sequences and classes implying that the actual RNA loop space may consist of a limited number of unique loop structures and conserved sequences. The loop-structure data sets are made available via an online database, RLooM. RLooM also offers functionalities for the modeling of RNA loop structures in support of RNA engineering and design efforts.
Jose, Jency; Jalali, S K; Shivalingaswamy, T M; Kumar, N K Krishna; Bhatnagar, R; Bandyopadhyay, A
2013-06-01
A PCR-based method for detecting viral DNA of the nucleopolyhedroviruses (NPVs) of three lepidopterans, Spodoptera litura, Amsacta albistriga and Helicoverpa armigera, was developed by targeting the late expression factor-8 (lef-8) gene of the three NPVs with specific primers. Amplicons of 689, 699 and 665 bp were obtained, respectively, and the nucleotide sequences were submitted to GenBank, where accession numbers were obtained. The lef-8 sequences of S. litura NPV and H. armigera NPV matched those of their respective references in the GenBank database, thereby confirming their identity; the sequence of A. albistriga NPV, however, was the first such sequence submitted to the GenBank database. Sequence similarity analysis of the three NPV lef-8 genes sequenced in the present study revealed no significant similarity between them; however, A. albistriga NPV and S. litura NPV were found to be closely related. CLUSTAL alignment of the sequences revealed general relatedness among the NPV lef-8 genes. The study confirmed that the lef-8 gene can be used for quick and correct discriminatory identification of insect viruses.
Gentle Masking of Low-Complexity Sequences Improves Homology Search
Frith, Martin C.
2011-01-01
Detection of sequences that are homologous, i.e. descended from a common ancestor, is a fundamental task in computational biology. This task is confounded by low-complexity tracts (such as atatatatatat), which arise frequently and independently, causing strong similarities that are not homologies. There has been much research on identifying low-complexity tracts, but little research on how to treat them during homology search. We propose to find homologies by aligning sequences with “gentle” masking of low-complexity tracts. Gentle masking means that the match score involving a masked letter is min(0, S), where S is the unmasked score. Gentle masking slightly but noticeably improves the sensitivity of homology search (compared to “harsh” masking), without harming specificity. We show examples in three useful homology search problems: detection of NUMTs (nuclear copies of mitochondrial DNA), recruitment of metagenomic DNA reads to reference genomes, and pseudogene detection. Gentle masking is currently the best way to treat low-complexity tracts during homology search. PMID:22205972
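The scoring rule for gentle masking can be sketched in a few lines. This is a minimal illustration, not the author's implementation: it assumes that masked letters are written in lowercase (a common soft-masking convention), that the gentle score is min(0, S) for an unmasked score S, and it uses a hypothetical +1/-1 match/mismatch scheme without gaps.

```python
def base_score(a: str, b: str, match: int = 1, mismatch: int = -1) -> int:
    """Unmasked substitution score S (hypothetical +1/-1 scheme)."""
    return match if a.upper() == b.upper() else mismatch


def gentle_score(a: str, b: str) -> int:
    """Score a letter pair; lowercase marks a masked (low-complexity) letter.

    Assumed rule: a pair involving a masked letter scores min(0, S), so it
    can never add to an alignment score but can still subtract from it.
    """
    s = base_score(a, b)
    if a.islower() or b.islower():
        return min(0, s)
    return s


def ungapped_score(x: str, y: str) -> int:
    """Sum of gentle scores over an ungapped, position-by-position pairing."""
    return sum(gentle_score(a, b) for a, b in zip(x, y))


if __name__ == "__main__":
    # The masked 'atat' tract contributes nothing, so only ACGT is rewarded.
    print(ungapped_score("ACGTatat", "ACGTATAT"))
```

Under this rule an alignment may pass through a masked tract without penalty for matching letters, but the tract alone cannot produce a positive-scoring hit.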
First detection of avian lineage H7N2 in Felis catus
USDA-ARS's Scientific Manuscript database
In December 2016, influenza A (H7N2) was first detected among cats in the New York City shelter system with subsequent widespread transmission. The sequence of the first clinical isolate, A/feline/New York/16-040082-1/2016(H7N2), and its genetic similarity to the live bird market lineage of H7N2 low...
Identification and analysis of multigene families by comparison of exon fingerprints.
Brown, N P; Whittaker, A J; Newell, W R; Rawlings, C J; Beck, S
1995-06-02
Gene families are often recognised by sequence homology using similarity searching to find relationships, however, genomic sequence data provides gene architectural information not used by conventional search methods. In particular, intron positions and phases are expected to be relatively conserved features, because mis-splicing and reading frame shifts should be selected against. A fast search technique capable of detecting possible weak sequence homologies apparent at the intron/exon level of gene organization is presented for comparing spliceosomal genes and gene fragments. FINEX compares strings of exons delimited by intron/exon boundary positions and intron phases (exon fingerprint) using a global dynamic programming algorithm with a combined intron phase identity and exon size dissimilarity score. Exon fingerprints are typically two orders of magnitude smaller than their nucleic acid sequence counterparts giving rise to fast search times: a ranked search against a library of 6755 fingerprints for a typical three exon fingerprint completes in under 30 seconds on an ordinary workstation, while a worst case largest fingerprint of 52 exons completes in just over one minute. The short "sequence" length of exon fingerprints in comparisons is compensated for by the large exon alphabet compounded of intron phase types and a wide range of exon sizes, the latter contributing the most information to alignments. FINEX performs better in some searches than conventional methods, finding matches with similar exon organization, but low sequence homology. A search using a human serum albumin finds all members of the multigene family in the FINEX database at the top of the search ranking, despite very low amino acid percentage identities between family members. The method should complement conventional sequence searching and alignment techniques, offering a means of identifying otherwise hard to detect homologies where genomic data are available.
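The idea of aligning exon fingerprints by global dynamic programming can be sketched as follows. This is an illustrative reconstruction, not the FINEX code: the representation of an exon as a (length, intron phase) pair, the scoring weights, and the gap penalty are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: (exon length in bp, phase of the flanking intron 0/1/2).
Exon = Tuple[int, int]


def exon_pair_score(a: Exon, b: Exon) -> float:
    """Combine intron phase identity with exon size dissimilarity (assumed weights).

    Exon lengths are assumed to be positive.
    """
    phase_term = 1.0 if a[1] == b[1] else -1.0
    size_term = abs(a[0] - b[0]) / max(a[0], b[0])  # 0.0 for equal sizes
    return phase_term - size_term


def align_fingerprints(f: List[Exon], g: List[Exon], gap: float = -1.0) -> float:
    """Global (Needleman-Wunsch style) alignment score of two exon fingerprints."""
    n, m = len(f), len(g)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + exon_pair_score(f[i - 1], g[j - 1]),
                dp[i - 1][j] + gap,  # exon of f left unpaired
                dp[i][j - 1] + gap,  # exon of g left unpaired
            )
    return dp[n][m]


if __name__ == "__main__":
    query = [(120, 0), (85, 1), (200, 2)]
    print(align_fingerprints(query, query))
```

Because a fingerprint holds one entry per exon rather than one per nucleotide, the dynamic programming table is very small, which is consistent with the fast search times described above.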
Genome-wide signatures of convergent evolution in echolocating mammals
Parker, Joe; Tsagkogeorga, Georgia; Cotton, James A.; Liu, Yuan; Provero, Paolo; Stupka, Elia; Rossiter, Stephen J.
2013-01-01
Evolution is typically thought to proceed through divergence of genes, proteins, and ultimately phenotypes [1-3]. However, similar traits might also evolve convergently in unrelated taxa due to similar selection pressures [4,5]. Adaptive phenotypic convergence is widespread in nature, and recent results from a handful of genes have suggested that this phenomenon is powerful enough to also drive recurrent evolution at the sequence level [6-9]. Where homoplasious substitutions do occur these have long been considered the result of neutral processes. However, recent studies have demonstrated that adaptive convergent sequence evolution can be detected in vertebrates using statistical methods that model parallel evolution [9,10], although the extent to which sequence convergence between genera occurs across genomes is unknown. Here we analyse genomic sequence data in mammals that have independently evolved echolocation and show for the first time that convergence is not a rare process restricted to a handful of loci but is instead widespread, continuously distributed and commonly driven by natural selection acting on a small number of sites per locus. Systematic analyses of convergent sequence evolution in 805,053 amino acids within 2,326 orthologous coding gene sequences compared across 22 mammals (including four new bat genomes) revealed signatures consistent with convergence in nearly 200 loci. Strong and significant support for convergence among bats and the dolphin was seen in numerous genes linked to hearing or deafness, consistent with an involvement in echolocation. Surprisingly we also found convergence in many genes linked to vision: the convergent signal of many sensory genes was robustly correlated with the strength of natural selection. This first attempt to detect genome-wide convergent sequence evolution across divergent taxa reveals the phenomenon to be much more pervasive than previously recognised. PMID:24005325
Sharma, Monika; Devi, Kangjam Rekha; Sehgal, Rakesh; Narain, Kanwar; Mahanta, Jagadish; Malla, Nancy
2014-01-01
Taenia solium taeniasis/cysticercosis is a major public health problem in developing countries. This study reports genotypic analysis of T. solium cysticerci collected from two different endemic areas of North (Chandigarh) and North East India (Dibrugarh) by sequencing of the mitochondrial cytochrome c oxidase subunit 1 (cox1) gene. The variation in cox1 sequences of samples collected from these two geographical regions, located 2585 km apart, was minimal. Alignment of the nucleotide sequences with those of different species of Taenia showed similarity with the Asian genotype of T. solium. Among 50 isolates, 6 variant nucleotide positions (0.37% of total length) were detected. These results suggest that the populations in these geographical areas are homogeneous. Copyright © 2013 Elsevier B.V. All rights reserved.
A conserved segmental duplication within ELA.
Brinkmeyer-Langford, C L; Murphy, W J; Childers, C P; Skow, L C
2010-12-01
The assembled genomic sequence of the horse major histocompatibility complex (MHC) (equine lymphocyte antigen, ELA) is very similar to the homologous human HLA, with the notable exception of a large segmental duplication at the boundary of ELA class I and class III that is absent in HLA. The segmental duplication consists of a ∼ 710 kb region of at least 11 repeated blocks: 10 blocks each contain an MHC class I-like sequence and the helicase domain portion of a BAT1-like sequence, and the remaining unit contains the full-length BAT1 gene. Similar genomic features were found in other Perissodactyls, indicating an ancient origin, which is consistent with phylogenetic analyses. Reverse-transcriptase PCR (RT-PCR) of mRNA from peripheral white blood cells of healthy and chronically or acutely infected horses detected transcription from predicted open reading frames in several of the duplicated blocks. This duplication is not present in the sequenced MHCs of most other mammals, although a similar feature at the same relative position is present in the feline MHC (FLA). Striking sequence conservation throughout Perissodactyl evolution is consistent with a functional role for at least some of the genes included within this segmental duplication. © 2010 The Authors, Journal compilation © 2010 Stichting International Foundation for Animal Genetics.
Coxiella Detection in Ticks from Wildlife and Livestock in Malaysia
Khoo, Jing-Jing; Lim, Fang-Shiang; Chen, Fezshin; Phoon, Wai-Hong; Khor, Chee-Sieng; Pike, Brian L.; Chang, Li-Yen
2016-01-01
Recent studies have shown that ticks harbor Coxiella-like bacteria, which are potentially tick-specific endosymbionts. We recently described the detection of Coxiella-like bacteria and possibly Coxiella burnetii in ticks collected from rural areas in Malaysia. In the present study, we collected ticks, including Haemaphysalis bispinosa, Haemaphysalis hystricis, Dermacentor compactus, Dermacentor steini, and Amblyomma sp. from wildlife and domesticated goats from four different locations in Malaysia. Coxiella 16S rRNA genomic sequences were detected by PCR in 89% of ticks tested. Similarity analysis and phylogenetic analyses of the 16S rRNA and rpoB partial sequences were performed for 10 representative samples selected based on the tick species, sex, and location. The findings here suggested the presence of C. burnetii in two samples, each from D. steini and H. hystricis. The sequences of both samples clustered with published C. burnetii sequences. The remaining eight tick samples were shown to harbor 16S rRNA sequences of Coxiella-like bacteria, which clustered phylogenetically according to the respective tick host species. The findings presented here added to the growing evidence of the association between Coxiella-like bacteria and ticks across species and geographical boundaries. The importance of C. burnetii found in ticks in Malaysia warrants further investigation. PMID:27763821
Chai, Huan-Na; Du, Yu-Zhou; Qiu, Bao-Li; Zhai, Bao-Ping
2011-01-01
Wolbachia are a group of intracellular inherited endosymbiotic bacteria infecting a wide range of insects. In this study the infection status of Wolbachia (Rickettsiales: Rickettsiaceae) was measured in the Asiatic rice leafroller, Cnaphalocrocis medinalis (Guenée) (Lepidoptera: Pyralidae), from twenty locations in China by sequencing wsp, ftsZ and 16S rDNA genes. The results showed high infection rates of Wolbachia in C. medinalis populations. Wolbachia was detected in all geographically separate populations; the average infection rate was ∼ 62.5%, and the highest rates were 90% in Wenzhou and Yangzhou populations. The Wolbachia detected in different C. medinalis populations were 100% identical to each other when wsp, ftsZ, and 16S rDNA sequences were compared, with all sequences belonging to the Wolbachia B supergroup. Based on wsp, ftsZ and 16S rDNA sequences of Wolbachia, three phylogenetic trees of similar pattern emerged. This analysis indicated the possibility of inter-species and intra-species horizontal transmission of Wolbachia in different arthropods in related geographical regions. The migration route of C. medinalis in mainland China was also discussed since large differentiation had been found between the wsp sequences of Chinese and Thai populations. PMID:22233324
Improving performance of DS-CDMA systems using chaotic complex Bernoulli spreading codes
NASA Astrophysics Data System (ADS)
Farzan Sabahi, Mohammad; Dehghanfard, Ali
2014-12-01
The most important goal of a spread spectrum communication system is to protect communication signals against interference and against exploitation of information by unintended listeners. Low probability of detection and low probability of intercept are two important parameters for increasing the performance of such a system. In Direct Sequence Code Division Multiple Access (DS-CDMA) systems, these properties are achieved by multiplying the data by spreading sequences. Chaotic sequences, with their particular properties, have numerous applications in constructing spreading codes. The use of a one-dimensional Bernoulli chaotic sequence as a spreading code has been proposed previously in the literature. The main feature of this sequence is its negative auto-correlation at a lag of 1, which, with proper design, increases the efficiency of communication systems based on these codes. Employing complex chaotic sequences as spreading sequences has also been discussed in several papers. In this paper, the use of two-dimensional Bernoulli chaotic sequences as spreading codes is proposed. The performance of multi-user synchronous and asynchronous DS-CDMA systems is evaluated by applying these sequences under Additive White Gaussian Noise (AWGN) and fading channels. Simulation results indicate improved performance in comparison with conventional spreading codes such as Gold codes, as well as with similar complex chaotic spreading sequences. Like one-dimensional Bernoulli chaotic sequences, the proposed sequences also have negative auto-correlation. In addition, the proposed method makes it possible to construct complex sequences with lower average cross-correlation.
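The basic direct-sequence mechanism, multiplying each data symbol by a spreading code and recovering it by correlation, can be sketched as below. This is a toy model under stated assumptions: it uses two fixed, mutually orthogonal ±1 codes as stand-ins (not the chaotic Bernoulli sequences of the paper), a noiseless synchronous channel, and hypothetical function names.

```python
from typing import List


def spread(bits: List[int], code: List[int]) -> List[int]:
    """Multiply every data symbol (+1/-1) by the whole spreading code."""
    return [b * c for b in bits for c in code]


def despread(signal: List[int], code: List[int]) -> List[int]:
    """Correlate each code-length block with the code and take the sign."""
    n = len(code)
    decisions = []
    for k in range(0, len(signal), n):
        corr = sum(s * c for s, c in zip(signal[k:k + n], code))
        decisions.append(1 if corr >= 0 else -1)
    return decisions


def lag1_autocorrelation(seq: List[int]) -> float:
    """Average product of neighbouring chips (negative if chips tend to alternate)."""
    return sum(a * b for a, b in zip(seq, seq[1:])) / (len(seq) - 1)


if __name__ == "__main__":
    code_a = [1, -1, 1, -1]   # stand-in codes, orthogonal to each other
    code_b = [1, 1, -1, -1]
    bits_a = [1, -1, 1]
    bits_b = [-1, -1, 1]
    # Synchronous two-user channel: the chip streams simply add.
    channel = [x + y for x, y in zip(spread(bits_a, code_a), spread(bits_b, code_b))]
    print(despread(channel, code_a), despread(channel, code_b))
    print(lag1_autocorrelation(code_a))
```

In a real evaluation the codes would be generated from the chaotic map and the channel would include noise and fading; the correlation receiver shown here stays the same.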
Scalable Parallel Methods for Analyzing Metagenomics Data at Extreme Scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daily, Jeffrey A.
2015-05-01
The field of bioinformatics and computational biology is currently experiencing a data revolution. The exciting prospect of making fundamental biological discoveries is fueling the rapid development and deployment of numerous cost-effective, high-throughput next-generation sequencing technologies. The result is that the DNA and protein sequence repositories are being bombarded with new sequence information. Databases are continuing to report a Moore’s law-like growth trajectory in their database sizes, roughly doubling every 18 months. In what seems to be a paradigm-shift, individual projects are now capable of generating billions of raw sequence data that need to be analyzed in the presence of already annotated sequence information. While it is clear that data-driven methods, such as sequencing homology detection, are becoming the mainstay in the field of computational life sciences, the algorithmic advancements essential for implementing complex data analytics at scale have mostly lagged behind. Sequence homology detection is central to a number of bioinformatics applications including genome sequencing and protein family characterization. Given millions of sequences, the goal is to identify all pairs of sequences that are highly similar (or “homologous”) on the basis of alignment criteria. While there are optimal alignment algorithms to compute pairwise homology, their deployment for large-scale is currently not feasible; instead, heuristic methods are used at the expense of quality. In this dissertation, we present the design and evaluation of a parallel implementation for conducting optimal homology detection on distributed memory supercomputers. Our approach uses a combination of techniques from asynchronous load balancing (viz. work stealing, dynamic task counters), data replication, and exact-matching filters to achieve homology detection at scale. 
Results for a collection of 2.56M sequences show parallel efficiencies of ~75-100% on up to 8K cores, representing a time-to-solution of 33 seconds. We extend this work with a detailed analysis of single-node sequence alignment performance using the latest CPU vector instruction set extensions. Preliminary results reveal that current sequence alignment algorithms are unable to fully utilize widening vector registers.
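The combination of an exact-matching filter with optimal pairwise alignment can be illustrated serially. This is a single-process sketch under stated assumptions, not the dissertation's parallel implementation: the shared k-mer test stands in for the exact-matching filter, the scoring parameters and threshold are hypothetical, and load balancing and data replication are omitted entirely.

```python
from itertools import combinations
from typing import List, Tuple


def shares_kmer(a: str, b: str, k: int = 3) -> bool:
    """Exact-matching filter: do the two sequences share any substring of length k?"""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[j:j + k] in kmers for j in range(len(b) - k + 1))


def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Optimal local alignment score with linear gap penalties."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


def homologous_pairs(seqs: List[str], k: int = 3, threshold: int = 8) -> List[Tuple[int, int]]:
    """Align only the pairs that pass the filter; keep those scoring at or above the threshold."""
    hits = []
    for (i, a), (j, b) in combinations(enumerate(seqs), 2):
        if shares_kmer(a, b, k) and smith_waterman(a, b) >= threshold:
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    print(homologous_pairs(["ACGTACGT", "TTACGTACGA", "GGGGCCCC"]))
```

The filter is cheap relative to the quadratic-time alignment, so it removes most unrelated pairs before the expensive step; in a distributed setting the surviving pairs would become the tasks that are balanced across cores.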
Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis
Conceição, Inês C.; Long, Anthony D.; Gruber, Jonathan D.; Beldade, Patrícia
2011-01-01
Background: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. Methodology/Principal Findings: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). Conclusions: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2) the high conservation of non-coding sequence around the genes wingless and Ecdysone receptor, both involved in multiple developmental processes including wing pattern formation. PMID:21909358
Öncü, Ceren; Brinkmann, Annika; Günay, Filiz; Kar, Sırrı; Öter, Kerem; Sarıkaya, Yasemen; Nitsche, Andreas; Linton, Yvonne-Marie; Alten, Bülent; Ergünay, Koray
2018-01-01
Mosquitoes are involved in the transmission and maintenance of several viral diseases with significant health impact. Biosurveillance efforts have also revealed insect-specific viruses, observed to cocirculate with pathogenic strains. This report describes the findings of flavivirus and rhabdovirus screening, performed in eastern Thrace and the Aegean region of Anatolia during 2016, including and expanding on locations with previously-documented virus activity. A mosquito cohort of 1545 individuals comprising 14 species was collected and screened in 108 pools via generic and specific amplification and direct metagenomics by next generation sequencing. Seven mosquito pools (6.4%) were positive in the flavivirus screening. West Nile virus lineage 1 clade 1a sequences were characterized in a pool of Culex pipiens sensu lato specimens, providing the initial virus detection in the Aegean region following the 2010 outbreak. In an Anopheles maculipennis sensu lato pool, sequences closely-related to Anopheles flaviviruses were obtained, with similarities to several African and Australian strains of this new insect-specific flavivirus clade. In pools comprising Uranotaenia unguiculata (n=3), Cx. pipiens s.l. (n=1) and Aedes caspius (n=1) mosquitoes, sequences of a novel flavivirus, distantly-related to Flavivirus AV2011, identified previously in Spain and Turkey, were characterized. Moreover, DNA forms of the novel flavivirus were detected in two Ur. unguiculata pools. These sequences were highly-similar to the sequences amplified from viral RNA, with undisrupted reading frames, suggesting the occurrence of viral DNA forms in natural conditions within mosquito hosts. Rhabdovirus screening revealed sequences of a recently-described novel virus, named the Merida-like virus Turkey (MERDLVT) in 5 Cx. pipiens s.l. pools (4.6%). Partial L and N gene sequences of MERDLVT were well-conserved among strains, with evidence for geographical clustering in phylogenetic analyses. 
Metagenomics provided the near-full genomic sequence in a specimen, revealing an identical genome organization and limited divergence from the prototype MERDLVT isolate. Copyright © 2017 Elsevier B.V. All rights reserved.
Polynucleobacter bacteria in the brackish-water species Euplotes harpa (Ciliata Hypotrichia).
Vannini, Claudia; Petroni, Giulio; Verni, Franco; Rosati, Giovanna
2005-01-01
We have found a Polynucleobacter bacterium in the cytoplasm of Euplotes harpa, a species living in a brackish-water habitat, with a cirral pattern not corresponding to that of the freshwater Euplotes species known to harbor this type of bacteria. The symbiont has been found in three strains of the species, obtained by clonal cultures from ciliates collected in different geographic regions. The 16S rRNA gene sequence of this bacterium identifies it as a member of the beta-proteobacterial genus Polynucleobacter. This sequence shares a high similarity value (98.4-98.5%) with P. necessarius, the type species of the genus, and is associated with 16S rRNA gene sequences of environmental clones and bacterial strains included in the Polynucleobacter cluster (>95%). An oligonucleotide probe was designed to corroborate the assignment of the retrieved sequence to the symbiont and to detect similar bacteria rapidly. Antibiotic experiments showed that the elimination of the bacteria stops the reproductive cycle in E. harpa, as has been shown for the freshwater Euplotes species.
Palanga, Essowè; Martin, Darren P; Galzi, Serge; Zabré, Jean; Bouda, Zakaria; Neya, James Bouma; Sawadogo, Mahamadou; Traore, Oumar; Peterschmitt, Michel; Roumagnac, Philippe; Filloux, Denis
2017-07-01
The full-length genome sequences of two novel poleroviruses found infecting cowpea plants, cowpea polerovirus 1 (CPPV1) and cowpea polerovirus 2 (CPPV2), were determined using overlapping RT-PCR and RACE-PCR. Whereas the 5845-nt CPPV1 genome was most similar to chickpea chlorotic stunt virus (73% identity), the 5945-nt CPPV2 genome was most similar to phasey bean mild yellow virus (86% identity). The CPPV1 and CPPV2 genomes both have a typical polerovirus genome organization. Phylogenetic analysis of the inferred P1-P2 and P3 amino acid sequences confirmed that CPPV1 and CPPV2 are indeed poleroviruses. Four apparently unique recombination events were detected within a dataset of 12 full polerovirus genome sequences, including two events in the CPPV2 genome. Based on the current species demarcation criteria for the family Luteoviridae, we tentatively propose that CPPV1 and CPPV2 should be considered members of novel polerovirus species.
Jang, Yeongseon; Jang, Seokyoon; Min, Mihee; Hong, Joo-Hyun; Lee, Hanbyul; Lee, Hwanhwi; Lim, Young Woon; Kim, Jae-Jin
2015-10-01
In this study, three different methods (fruiting body collection, mycelial isolation, and 454 sequencing) were implemented to determine the diversity of wood-inhabiting basidiomycetes from dead Manchurian fir (Abies holophylla). The three methods recovered similar species richness (26 species from fruiting bodies, 32 species from mycelia, and 32 species from 454 sequencing), but Fisher's alpha, Shannon-Wiener, and Simpson's diversity indices of the fungal communities indicated that fruiting body collection and mycelial isolation yielded higher diversity than 454 sequencing. In total, 75 wood-inhabiting basidiomycetes were detected. The most frequently observed species were Heterobasidion orientale (fruiting body collection), Bjerkandera adusta (mycelial isolation), and Trichaptum fusco-violaceum (454 sequencing). Only two species, Hymenochaete yasudae and Hypochnicium karstenii, were detected by all three methods. This result indicated that Manchurian fir harbors a diverse basidiomycetous fungal community and that multiple methods should be used for a complete estimate of fungal diversity. Further studies are required to understand their ecology in the context of forest ecosystems.
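The Shannon-Wiener and Simpson indices mentioned in the abstract above can be illustrated with a minimal sketch. The abundance counts below are hypothetical and are not taken from the study; the functions use the common textbook definitions H' = -Σ p_i ln p_i and 1 - Σ p_i², which may differ from the exact variants the authors computed.

```python
import math


def shannon_wiener(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over observed species."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_diversity(counts):
    """Simpson's diversity index in the form 1 - sum(p_i ** 2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical abundance counts (illustration only, not data from the study).
even_community = [10, 10, 10, 10]    # four equally common species
skewed_community = [37, 1, 1, 1]     # same richness, one dominant species

for name, counts in [("even", even_community), ("skewed", skewed_community)]:
    print(name, round(shannon_wiener(counts), 3), round(simpson_diversity(counts), 3))
```

Both hypothetical communities have the same richness (four species), yet the evenly distributed one scores higher on both indices, which is how methods with similar species counts can still differ in measured diversity.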
Dong, Chongmei; Vincent, Kate; Sharp, Peter
2009-12-04
TILLING (Targeting Induced Local Lesions IN Genomes) is a powerful tool for reverse genetics, combining traditional chemical mutagenesis with high-throughput PCR-based mutation detection to discover induced mutations that alter protein function. The most popular mutation detection method for TILLING is a mismatch cleavage assay using the endonuclease CelI. For this method, locus-specific PCR is essential. Most wheat genes are present as three similar sequences with high homology in exons and low homology in introns. Locus-specific primers can usually be designed in introns. However, it is sometimes difficult to design locus-specific PCR primers in a conserved region with high homology among the three homoeologous genes, or in a gene lacking introns, or if information on introns is not available. Here we describe a mutation detection method which combines High Resolution Melting (HRM) analysis of mixed PCR amplicons containing three homoeologous gene fragments and sequence analysis using Mutation Surveyor software, aimed at simultaneous detection of mutations in three homoeologous genes. We demonstrate that High Resolution Melting (HRM) analysis can be used in mutation scans in mixed PCR amplicons containing three homoeologous gene fragments. Combining HRM scanning with sequence analysis using Mutation Surveyor is sensitive enough to detect a single nucleotide mutation in the heterozygous state in a mixed PCR amplicon containing three homoeoloci. The method was tested and validated in an EMS (ethylmethane sulfonate)-treated wheat TILLING population, screening mutations in the carboxyl terminal domain of the Starch Synthase II (SSII) gene. Selected identified mutations of interest can be further analysed by cloning to confirm the mutation and determine the genomic origin of the mutation. Polyploidy is common in plants. Conserved regions of a gene often represent functional domains and have high sequence similarity between homoeologous loci. 
The method described here is a useful alternative to locus-specific based methods for screening mutations in conserved functional domains of homoeologous genes. This method can also be used for SNP (single nucleotide polymorphism) marker development and eco-TILLING in polyploid species.
Gasser, R B; Rossi, L; Zhu, X
1999-11-01
The sequence of the second internal transcribed spacer of ribosomal DNA was determined for four species of Nematodirus (Nematodirus rupicaprae, Nematodirus oiratianus, Nematodirus davtiani alpinus and Nematodirus europaeus) from roe deer or alpine chamois. The second internal transcribed spacer of the four species varied in length from 228 to 236 bp, and the G + C contents ranged from 41 to 44%. While no intraspecific sequence variation was detected among multiple samples representing three of the taxa, sequence differences of 5.9-9.7% were detected among the four species. Nematodirus davtiani alpinus and N. rupicaprae were genetically most similar (94.1%), followed by N. oiratianus, N. europaeus and N. rupicaprae (91.1-91.5%), whereas N. oiratianus was genetically most different from N. davtiani alpinus. The interspecific sequence differences were exploited for the delineation of the four species by PCR-based restriction fragment length polymorphism (using two enzymes) and single-strand conformation polymorphism. The results have implications for diagnosis, epidemiology and for studying the systematics of the Nematodirinae.
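The G + C content and pairwise similarity figures quoted above are simple summary statistics, sketched below. The sequences are made-up placeholders, not the Nematodirus spacer sequences, and the gap handling (a gap against a base counts as a difference, columns with gaps in both sequences are skipped) is an assumption, since the abstract does not state the convention used.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: columns where both sequences have a gap are skipped;
    a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments (placeholders, not real ITS-2 data).
seq_a = "ACGTACGTAC"
seq_b = "ACGTACGAAC"
print(gc_content(seq_a), percent_identity(seq_a, seq_b))
```

A sequence difference is then simply 100 minus the percent identity, which is how a 94.1% similarity corresponds to a 5.9% difference in the abstract.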
Bidin, M; Lojkić, I; Bidin, Z; Tiljar, M; Majnarić, D
2011-12-01
Phylogenetic diversity of parvovirus detected in commercial chicken and turkey flocks is described. Nine chicken and six turkey flocks from Croatian farms were tested for parvovirus presence. Intestinal samples from one turkey and seven chicken flocks were found positive, and were sequenced. Natural parvovirus infection was more frequently detected in chickens than in turkeys examined in this study. Sequence analysis of 400-nucleotide fragments of the nonstructural gene (NS) showed that our sequences had more similarity with chicken parvovirus (ChPV) (92.3%-99.7%) than turkey parvovirus (TuPV) (89.5%-98.9%) strains. Phylogenetic analysis grouped our sequences in two clades. A higher prevalence of ChPV than TuPV was also found in the tested flocks. The necropsy findings suggested a malabsorption syndrome followed by a preascitic condition. Further research of parvovirus infection, pathogenesis, and the possibility of its association with poult enteritis and mortality syndrome (PEMS) and runting and stunting syndrome (RSS) is needed to clarify its significance as an agent of enteric disease.
Prevalence of pathogenic bacteria in Ixodes ricinus ticks in Central Bohemia.
Klubal, Radek; Kopecky, Jan; Nesvorna, Marta; Sparagano, Olivier A E; Thomayerova, Jana; Hubert, Jan
2016-01-01
Bacteria associated with the tick Ixodes ricinus were assessed in specimens unattached or attached to the skin of cats, dogs and humans, collected in the Czech Republic. The bacteria were detected by PCR in 97 of 142 pooled samples including 204 ticks, i.e. 1-7 ticks per sample, collected at the same time from one host. A fragment of the bacterial 16S rRNA gene was amplified, cloned and sequenced from 32 randomly selected samples. The most frequent sequences were those related to Candidatus Midichloria mitochondrii (71% of cloned sequences), followed by Diplorickettsia (13%), Spiroplasma (3%), Rickettsia (3%), Pasteurella (3%), Morganella (3%), Pseudomonas (2%), Bacillus (1%), Methylobacterium (1%) and Phyllobacterium (1%). The phylogenetic analysis of Spiroplasma 16S rRNA gene sequences showed two groups related to Spiroplasma eriocheiris and Spiroplasma melliferum, respectively. Using group-specific primers, the following potentially pathogenic bacteria were detected: Borrelia (in 20% of the 142 samples), Rickettsia (12%), Spiroplasma (5%), Diplorickettsia (5%) and Anaplasma (2%). In total, 68% of I. ricinus samples (97/142) contained detectable bacteria and 13% contained two or more putative pathogenic groups. The prevalence of tick-borne bacteria was similar to the observations in other European countries.
Yokoyama, Naoaki; Sivakumar, Thillaiampalam; Tuvshintulga, Bumduuren; Hayashida, Kyoko; Igarashi, Ikuo; Inoue, Noboru; Long, Phung Thang; Lan, Dinh Thi Bich
2015-03-01
The genes that encode merozoite surface antigens (MSAs) in Babesia bovis are genetically diverse. In this study, we analyzed the genetic diversity of B. bovis MSA-1, MSA-2b, and MSA-2c genes in Vietnamese cattle and water buffaloes. Blood DNA samples from 258 cattle and 49 water buffaloes reared in the Thua Thien Hue province of Vietnam were screened with a B. bovis-specific diagnostic PCR assay. The B. bovis-positive DNA samples (23 cattle and 16 water buffaloes) were then subjected to PCR assays to amplify the MSA-1, MSA-2b, and MSA-2c genes. Sequencing analyses showed that the Vietnamese MSA-1 and MSA-2b sequences are genetically diverse, whereas MSA-2c is relatively conserved. The nucleotide identity values for these MSA gene sequences were similar in the cattle and water buffaloes. Consistent with the sequencing data, the Vietnamese MSA-1 and MSA-2b sequences were dispersed across several clades in the corresponding phylogenetic trees, whereas the MSA-2c sequences occurred in a single clade. Cattle- and water-buffalo-derived sequences also often clustered together on the phylogenetic trees. The Vietnamese MSA-1, MSA-2b, and MSA-2c sequences were then screened for recombination with automated methods. Of the seven recombination events detected, five and two were associated with the MSA-2b and MSA-2c recombinant sequences, respectively, whereas no MSA-1 recombinants were detected among the sequences analyzed. Recombination between the sequences derived from cattle and water buffaloes was very common, and the resultant recombinant sequences were found in both host animals. These data indicate that the genetic diversity of the MSA sequences does not differ between cattle and water buffaloes in Vietnam. They also suggest that recombination between the B. bovis MSA sequences in both cattle and water buffaloes might contribute to the genetic variation in these genes in Vietnam. Copyright © 2015 Elsevier B.V. All rights reserved.
Santamaría, Johanna; López, Liliana; Soto, Carlos Yesid
2011-01-01
Grassland-based production systems use ∼26% of land surface on earth. However, there are no evaluations of these systems as a source of antibiotic pollution. This study was conducted to evaluate the presence, diversity, and distribution of tetracycline resistance genes in the grasslands of the Colombian Andes, where administration of antibiotics to animals is limited to treat disease and growth promoters are not included in animals’ diet. Animal (ruminal fluid and feces) and environmental (soil and water) samples were collected from different dairy cattle farms and evaluated by PCR for the genes tet(M), tet(O), tetB(P), tet(Q), tet(W), tet(S), tet(T), otr(A), which encode ribosomal protection proteins (RPPs), and the genes tet(A), tet(B), tet(D), tet(H), tet(J), and tet(Z), encoding efflux pumps. A wide distribution and high frequency for genes tet(W) and tet(Q) were found in both sample types. Genes tet(O) and tetB(P), detected in high frequencies in feces, were detected in low frequencies or not detected at all in the environment. Other genes encoding RPPs, such as tet(M), tet(S), and tet(T), were detected at very low frequencies and restricted distributions. Genes encoding efflux pumps were not common in this region, and only two of them, tet(B) and tet(Z), were detected. DGGE–PCR followed by comparative sequence analysis of tet(W) and tet(Q) showed that the sequences detected in animals did not differ from those coming from soil and water. Finally, the farms sampled in this study showed more than 50% similarity in relation to the tet genes detected. In conclusion, there was a remarkable presence of tet genes in these production systems and, although not all genes detected in animal reservoirs were detected in the environment, there is a predominant distribution of tet(W) and tet(Q) in both animal and environmental reservoirs. Sequence similarity analysis suggests the transmission of these genes from animals to the environment. PMID:22174707
Walter, Vonn; Patel, Nirali M.; Eberhard, David A.; Hayward, Michele C.; Salazar, Ashley H.; Jo, Heejoon; Soloway, Matthew G.; Wilkerson, Matthew D.; Parker, Joel S.; Yin, Xiaoying; Zhang, Guosheng; Siegel, Marni B.; Rosson, Gary B.; Earp, H. Shelton; Sharpless, Norman E.; Gulley, Margaret L.; Weck, Karen E.
2015-01-01
The recent FDA approval of the MiSeqDx platform provides a unique opportunity to develop targeted next generation sequencing (NGS) panels for human disease, including cancer. We have developed a scalable, targeted panel-based assay termed UNCseq, which involves a NGS panel of over 200 cancer-associated genes and a standardized downstream bioinformatics pipeline for detection of single nucleotide variations (SNV) as well as small insertions and deletions (indel). In addition, we developed a novel algorithm, NGScopy, designed for samples with sparse sequencing coverage to detect large-scale copy number variations (CNV), similar to human SNP Array 6.0 as well as small-scale intragenic CNV. Overall, we applied this assay to 100 snap-frozen lung cancer specimens lacking same-patient germline DNA (07–0120 tissue cohort) and validated our results against Sanger sequencing, SNP Array, and our recently published integrated DNA-seq/RNA-seq assay, UNCqeR, where RNA-seq of same-patient tumor specimens confirmed SNV detected by DNA-seq, if RNA-seq coverage depth was adequate. In addition, we applied the UNCseq assay on an independent lung cancer tumor tissue collection with available same-patient germline DNA (11–1115 tissue cohort) and confirmed mutations using assays performed in a CLIA-certified laboratory. We conclude that UNCseq can identify SNV, indel, and CNV in tumor specimens lacking germline DNA in a cost-efficient fashion. PMID:26076459
Malecka, Kamila; Michalczuk, Lech; Radecka, Hanna; Radecki, Jerzy
2014-10-09
A DNA biosensor for detection of specific oligonucleotide sequences of Plum Pox Virus (PPV) in plant extracts and buffer is proposed. The working principle of the genosensor is based on the ion-channel mechanism. The NH2-ssDNA probe was deposited onto a glassy carbon electrode surface to form an amide bond between the carboxyl group of the oxidized electrode surface and the amino group of the ssDNA probe. The analytical signals generated as a result of hybridization were registered by Osteryoung square wave voltammetry in the presence of [Fe(CN)6]3-/4- as a redox marker. The 22-mer and 42-mer complementary ssDNA sequences derived from PPV and DNA samples from plants infected with PPV were used as targets. Similar detection limits of 2.4 pM (31.0 pg/mL) and 2.3 pM (29.5 pg/mL) in the concentration range 1-8 pM were observed in the presence of the 22-mer ssDNA and 42-mer complementary ssDNA sequences of PPV, respectively. The genosensor was capable of discriminating between samples consisting of extracts from healthy plants and leaf extracts from infected plants in the concentration range 10-50 pg/mL. The detection limit was 12.8 pg/mL. The genosensor displayed good selectivity and sensitivity. The 20-mer partially complementary DNA sequences with four complementary bases and DNA samples from healthy plants used as negative controls generated low signal.
Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F.; Alzate, Juan F.; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor
2015-01-01
Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose naming the endogenous sequence Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggest that AcV1 could be a new member of the Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant. PMID:25855242
Kuhnert, Peter; Scholten, Edzard; Haefner, Stefan; Mayor, Désirée; Frey, Joachim
2010-01-01
Gram-negative, coccoid, non-motile bacteria that are catalase-, urease- and indole-negative, facultatively anaerobic and oxidase-positive were isolated from the bovine rumen using an improved selective medium for members of the Pasteurellaceae. All strains produced significant amounts of succinic acid under anaerobic conditions with glucose as substrate. Phenotypic characterization and multilocus sequence analysis (MLSA) using 16S rRNA, rpoB, infB and recN genes were performed on seven independent isolates. All four genes showed high sequence similarity to their counterparts in the genome sequence of the patent strain MBEL55E, but less than 95 % 16S rRNA gene sequence similarity to any other species of the Pasteurellaceae. Genetically these strains form a very homogeneous group in individual as well as combined phylogenetic trees, clearly separated from other genera of the family from which they can also be separated based on phenotypic markers. Genome relatedness as deduced from the recN gene showed high interspecies similarities, but again low similarity to any of the established genera of the family. No toxicity towards bovine, human or fish cells was observed and no RTX toxin genes were detected in members of the new taxon. Based on phylogenetic clustering in the MLSA analysis, the low genetic similarity to other genera and the phenotypic distinction, we suggest to classify these bovine rumen isolates as Basfia succiniciproducens gen. nov., sp. nov. The type strain is JF4016(T) (=DSM 22022(T) =CCUG 57335(T)).
Yao, Juan; Zhang, Zhang; Deng, Zhenghua; Wang, Youqiang; Guo, Yongcan
2017-10-23
An isothermal, enzyme-free, ultra-specific and ultra-sensitive protocol for electrochemical detection of miRNAs is proposed based on the toehold-mediated strand displacement reaction (SDR) and non-enzymatic catalytic hairpin reaction (CHA) recycling. The SDR was first triggered only in the presence of target miRNA and this process also affects other miRNA interferences having similar target sequences, thus guaranteeing a high discrimination factor, so that it could be used in rare content miRNA detection with various amounts of interferences having similar target sequences. The output protector strand then triggered enzyme-free CHA amplification and generated plenty of hairpin self-assembly products. This process in turn shifts the SDR equilibrium to the right and generates large amounts of protector output to ensure analysis sensitivity. Compared with traditional CHA, our proposed method greatly improved the signal to noise ratio and showed excellent performance in rare miRNA detection with miRNA analogue interference. Under the optimal experimental conditions and using square wave voltammetry, the established biosensor could detect target miRNA-21 down to 30 fM (S/N = 3) with a dynamic range from 100 fM to 2 nM, and discriminate rare target miRNA-21 from mismatched miRNA with high selectivity. This method holds great promise in miRNA detection from human cancer cell lines and would be a versatile and powerful tool for clinical molecular diagnostics.
Hacker, Catrina M; Meschke, Emily X; Biederman, Irving
2018-03-20
Familiar objects, specified by name, can be identified with high accuracy when embedded in a rapidly presented sequence of images at rates exceeding 10 images/s. Not only can target objects be detected at such brief presentation rates, they can also be detected under high uncertainty, where their classification is defined negatively, e.g., "Not a Tool." The identification of a familiar speaker's voice declines precipitously when uncertainty is increased from one to a mere handful of possible speakers. Is the limitation imposed by uncertainty, i.e., the number of possible individuals, a general characteristic of processes for person individuation such that the identifiability of a familiar face would undergo a similar decline with uncertainty? Specifically, could the presence of an unnamed celebrity, thus any celebrity, be detected when presented in a rapid sequence of unfamiliar faces? If so, could the celebrity be identified? Despite the markedly greater physical similarity of faces compared to objects that are, say, not tools, the presence of a celebrity could be detected with moderately high accuracy (∼75%) at rates exceeding 7 faces/s. False alarms were exceedingly rare as almost all the errors were misses. Detection accuracy by moderate congenital prosopagnosics was lower than controls, but still well above chance. Given the detection of the presence of a celebrity, all subjects were almost always able to identify that celebrity, providing no role for a covert familiarity signal outside of awareness. Copyright © 2018 Elsevier Ltd. All rights reserved.
Chloroplast Genome Evolution in Early Diverged Leptosporangiate Ferns
Kim, Hyoung Tae; Chung, Myong Gi; Kim, Ki-Joong
2014-01-01
In this study, the chloroplast (cp) genome sequences from three early diverged leptosporangiate ferns were completed and analyzed in order to understand the evolution of the genome of the fern lineages. The complete cp genome sequence of Osmunda cinnamomea (Osmundales) was 142,812 base pairs (bp). The cp genome structure was similar to that of eusporangiate ferns. The gene/intron losses that frequently occurred in the cp genome of leptosporangiate ferns were not found in the cp genome of O. cinnamomea. In addition, putative RNA editing sites in the cp genome were rare in O. cinnamomea, even though the sites were frequently predicted to be present in leptosporangiate ferns. The complete cp genome sequence of Diplopterygium glaucum (Gleicheniales) was 151,007 bp and has a 9.7 kb inversion between the trnL-CAA and trnV-GCA genes when compared to O. cinnamomea. Several repeated sequences were detected around the inversion break points. The complete cp genome sequence of Lygodium japonicum (Schizaeales) was 157,142 bp and a deletion of the rpoC1 intron was detected. This intron loss was shared by all of the studied species of the genus Lygodium. The GC contents and the effective numbers of codons (ENCs) in ferns varied significantly when compared to seed plants. The ENC values of the early diverged leptosporangiate ferns showed intermediate levels between eusporangiate and core leptosporangiate ferns. However, our phylogenetic tree based on all of the cp gene sequences clearly indicated that the cp genome similarities between O. cinnamomea (Osmundales) and eusporangiate ferns are symplesiomorphies, rather than synapomorphies. Therefore, our data are in agreement with the view that Osmundales is a distinct early diverged lineage in the leptosporangiate ferns. PMID:24823358
Chloroplast genome evolution in early diverged leptosporangiate ferns.
Kim, Hyoung Tae; Chung, Myong Gi; Kim, Ki-Joong
2014-05-01
In this study, the chloroplast (cp) genome sequences from three early diverged leptosporangiate ferns were completed and analyzed in order to understand the evolution of the genome of the fern lineages. The complete cp genome sequence of Osmunda cinnamomea (Osmundales) was 142,812 base pairs (bp). The cp genome structure was similar to that of eusporangiate ferns. The gene/intron losses that frequently occurred in the cp genome of leptosporangiate ferns were not found in the cp genome of O. cinnamomea. In addition, putative RNA editing sites in the cp genome were rare in O. cinnamomea, even though the sites were frequently predicted to be present in leptosporangiate ferns. The complete cp genome sequence of Diplopterygium glaucum (Gleicheniales) was 151,007 bp and has a 9.7 kb inversion between the trnL-CAA and trnV-GCA genes when compared to O. cinnamomea. Several repeated sequences were detected around the inversion break points. The complete cp genome sequence of Lygodium japonicum (Schizaeales) was 157,142 bp and a deletion of the rpoC1 intron was detected. This intron loss was shared by all of the studied species of the genus Lygodium. The GC contents and the effective numbers of codons (ENCs) in ferns varied significantly when compared to seed plants. The ENC values of the early diverged leptosporangiate ferns showed intermediate levels between eusporangiate and core leptosporangiate ferns. However, our phylogenetic tree based on all of the cp gene sequences clearly indicated that the cp genome similarities between O. cinnamomea (Osmundales) and eusporangiate ferns are symplesiomorphies, rather than synapomorphies. Therefore, our data are in agreement with the view that Osmundales is a distinct early diverged lineage in the leptosporangiate ferns.
Identification of Functionally Related Enzymes by Learning-to-Rank Methods.
Stock, Michiel; Fober, Thomas; Hüllermeier, Eyke; Glinca, Serghei; Klebe, Gerhard; Pahikkala, Tapio; Airola, Antti; De Baets, Bernard; Waegeman, Willem
2014-01-01
Enzyme sequences and structures are routinely used in the biological sciences as queries to search for functionally related enzymes in online databases. To this end, one usually departs from some notion of similarity, comparing two enzymes by looking for correspondences in their sequences, structures or surfaces. For a given query, the search operation results in a ranking of the enzymes in the database, from very similar to dissimilar enzymes, while information about the biological function of annotated database enzymes is ignored. In this work, we show that rankings of that kind can be substantially improved by applying kernel-based learning algorithms. This approach enables the detection of statistical dependencies between similarities of the active cleft and the biological function of annotated enzymes. This is in contrast to search-based approaches, which do not take annotated training data into account. Similarity measures based on the active cleft are known to outperform sequence-based or structure-based measures under certain conditions. We consider the Enzyme Commission (EC) classification hierarchy for obtaining annotated enzymes during the training phase. The results of a set of sizeable experiments indicate a consistent and significant improvement for a set of similarity measures that exploit information about small cavities in the surface of enzymes.
NASA Astrophysics Data System (ADS)
Meng, X.; Peng, Z.; Deng, S.; Castro, R. R.
2015-12-01
The 2010 Mw7.2 El Mayor-Cucapah earthquake occurred southwest of the Pacific-North America plate boundary in north Baja California. It was preceded by an intensive foreshock sequence, and was followed by numerous aftershocks both on and off the mainshock rupture zone, hence providing us with a great opportunity to study the physical mechanisms of foreshock and aftershock triggering. In our previously published work (Meng and Peng, GJI, 2014), we focused on the seismicity rate changes around the Salton Sea Geothermal Field (SSGF) and along the San Jacinto Fault (SJF) following the mainshock. Based on a recently developed matched filter technique, we were able to detect up to 20 times more events than listed in the SCSN catalog. We found that the seismicity rates near SSGF and SJF both increased significantly immediately following the mainshock. However, the seismicity rate near SSGF, where static Coulomb stress decreased, dropped below the pre-mainshock level after ~50 days. On the other hand, the seismicity rate near SJF, where static Coulomb stress increased, remained high until the end of our detection time window. Such a pattern indicates that both static and dynamic triggering may coexist, but dominate at different time scales. Motivated by this success, we shift our focus to the foreshock and aftershock sequence of the El Mayor-Cucapah event. We utilize available seismic stations immediately north of the US-Mexico border and a few stations within Mexico to conduct a similar detection from ~40 days before to 40 days after the mainshock. We aim to obtain a complete foreshock sequence and investigate its spatio-temporal evolution before the mainshock. Moreover, we plan to study similar patterns for aftershocks and the corresponding triggering mechanisms. Updated results will be presented at the meeting.
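The matched filter technique referred to above slides a known event waveform (a template) along continuous data and flags windows that correlate strongly with it. The sketch below is a single-channel illustration with synthetic numbers; the function names, the fixed threshold of 0.8 and the toy waveform are assumptions for illustration. Published matched filter studies typically stack correlations over many stations and channels and set the threshold from the noise statistics of the correlation trace, which is not reproduced here.

```python
import math


def normalized_cross_correlation(template, trace):
    """Correlation coefficient between the template and each window of the trace."""
    n = len(template)
    t_mean = sum(template) / n
    t_dev = [v - t_mean for v in template]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    cc = []
    for start in range(len(trace) - n + 1):
        window = trace[start:start + n]
        w_mean = sum(window) / n
        w_dev = [v - w_mean for v in window]
        w_norm = math.sqrt(sum(d * d for d in w_dev))
        if t_norm == 0 or w_norm == 0:
            cc.append(0.0)  # flat window or flat template: no correlation defined
        else:
            cc.append(sum(a * b for a, b in zip(t_dev, w_dev)) / (t_norm * w_norm))
    return cc


def detect(template, trace, threshold=0.8):
    """Return start indices of windows whose correlation meets the threshold."""
    cc = normalized_cross_correlation(template, trace)
    return [i for i, c in enumerate(cc) if c >= threshold]


# Synthetic example: a half-amplitude copy of the template hidden at sample 20.
template = [0.0, 1.0, -1.0, 2.0, -2.0, 1.0, 0.0]
trace = [0.0] * 20 + [0.5 * v for v in template] + [0.0] * 10
print(detect(template, trace))
```

Because the correlation coefficient is normalized, the half-amplitude copy still correlates almost perfectly with the template, which is why this kind of detector can recover small events that an amplitude-based catalog misses.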
Brown, S M; Crouch, M L
1990-01-01
We have isolated and characterized cDNA clones of a gene family (P2) expressed in Oenothera organensis pollen. This family contains approximately six to eight family members and is expressed at high levels only in pollen. The predicted protein sequence from a near full-length cDNA clone shows that the protein products of these genes are at least 38,000 daltons. We identified the protein encoded by one of the cDNAs in this family by using antibodies to beta-galactosidase/pollen cDNA fusion proteins. Immunoblot analysis using these antibodies identifies a family of proteins of approximately 40 kilodaltons that is present in mature pollen, indicating that these mRNAs are not stored solely for translation after pollen germination. These proteins accumulate late in pollen development and are not detectable in other parts of the plant. Although not present in unpollinated or self-pollinated styles, the 40-kilodalton to 45-kilodalton antigens are detectable in extracts from cross-pollinated styles, suggesting that the proteins are present in pollen tubes growing through the style during pollination. The proteins are also present in pollen tubes growing in vitro. Both nucleotide and amino acid sequences are similar to the published sequences for cDNAs encoding the enzyme polygalacturonase, which suggests that the P2 gene family may function in depolymerizing pectin during pollen development, germination, and tube growth. Cross-hybridizing RNAs and immunoreactive proteins were detected in pollen from a wide variety of plant species, which indicates that the P2 family of polygalacturonase-like genes are conserved and may be expressed in the pollen from many angiosperms. PMID:2152116
Wenchuan Event Detection And Localization Using Waveform Correlation Coupled With Double Difference
NASA Astrophysics Data System (ADS)
Slinkard, M.; Heck, S.; Schaff, D. P.; Young, C. J.; Richards, P. G.
2014-12-01
The well-studied Wenchuan aftershock sequence triggered by the May 12, 2008, Ms 8.0 mainshock offers an ideal test case for evaluating the effectiveness of using waveform correlation coupled with double difference relocation to detect and locate events in a large aftershock sequence. We use Sandia's SeisCorr detector to process 3 months of data recorded by permanent IRIS and temporary ASCENT stations, using templates from events listed in a global catalog to find similar events in the raw data stream. Then we take the detections and relocate them using the double difference method. We explore both the performance that can be expected when using just a small number of stations and the benefits of reprocessing a well-studied sequence such as this one using waveform correlation to find even more events. We benchmark our results against previously published results describing relocations of regional catalog data. Before starting this project, we had examples where, with just a few stations at far-regional distances, waveform correlation combined with double difference did an impressive job of detecting and locating events with precision at the level of a few hundred or even tens of meters.
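The template-matching step described above can be illustrated with a minimal sketch. It is not the SeisCorr implementation; it only shows the general idea of sliding a template along a data stream and flagging windows whose normalized cross-correlation exceeds a threshold. The function names, the threshold of 0.8, and the synthetic samples are all assumptions made for illustration.

```python
import math

def normalized_cross_correlation(template, window):
    """Pearson correlation between two equal-length sample sequences."""
    n = len(template)
    mean_t = sum(template) / n
    mean_w = sum(window) / n
    num = sum((t - mean_t) * (w - mean_w) for t, w in zip(template, window))
    den_t = math.sqrt(sum((t - mean_t) ** 2 for t in template))
    den_w = math.sqrt(sum((w - mean_w) ** 2 for w in window))
    if den_t == 0.0 or den_w == 0.0:
        return 0.0  # flat segment: correlation undefined, treat as no match
    return num / (den_t * den_w)

def detect(template, stream, threshold=0.8):
    """Return (offset, correlation) pairs where the template matches the stream."""
    n = len(template)
    hits = []
    for start in range(len(stream) - n + 1):
        cc = normalized_cross_correlation(template, stream[start:start + n])
        if cc >= threshold:  # hypothetical detection threshold
            hits.append((start, cc))
    return hits

# Synthetic example: a scaled copy of the template buried in a quiet stream.
template = [0.0, 1.0, -1.0, 0.5, 0.0]
stream = [0.0] * 10 + [2.0 * x for x in template] + [0.0] * 10
hits = detect(template, stream)
```

Because the correlation is normalized, the scaled copy of the template is still detected; this amplitude independence is one reason correlation detectors can find events smaller than the catalogued template event.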
Carvalho, Carina Luísa; Silva, Sara; Gouveia, Paz; Costa, Margarida; Duarte, Elsa Leclerc; Henriques, Ana Margarida; Barros, Sílvia Santos; Luís, Tiago; Ramos, Fernanda; Fagulha, Teresa; Fevereiro, Miguel; Duarte, Margarida Dias
2017-12-01
We report the detection of rabbit haemorrhagic disease virus 2 (RHDV2) in the Madeira archipelago, Portugal. Viral circulation was confirmed by RT-qPCR and vp60 sequencing. Epidemiological data revealed that the outbreak began in October 2016 in Porto Santo, affecting wild and domestic rabbits. It was then detected three months later on the island of Madeira. Five haplotypes were identified and an overall genetic similarity of 99.54 to 99.89% was observed between the two viral populations. Unique single nucleotide polymorphisms were recognised in the Madeira archipelago strains, two of which result in amino acid substitutions at positions 480 and 570 in the VP60 protein. Phylogenetic investigation by Maximum Likelihood showed that all the vp60 sequences from the Madeira archipelago group together with high bootstrap support. The analysis also showed that the Madeira archipelago strains are closely related to the strains detected in the south of mainland Portugal in 2016, suggesting a possible introduction from the mainland. The epidemiological data and high genetic similarity indicate a common source for the Porto Santo and Madeira RHDV2 outbreaks. Human activity related to hunting was most probably at the origin of the Madeira outbreak.
Lin, Lulu; Wang, Peikun; Yang, Yongli; Li, Haijuan; Huang, Teng; Wei, Ping
2017-12-01
Since 2014, cases of hemangioma associated with avian leukosis virus subgroup J (ALV-J) have been emerging in commercial chickens in Guangxi. In this study, four strains of ALV-J, named GX14HG01, GX14HG04, GX14LT07, and GX14ZS14, were isolated from chickens with clinical hemangioma in 2014 by DF-1 cell culture and then identified by ELISA detection of the ALV group-specific antigen p27, subtype-specific PCR, and indirect immunofluorescence assay (IFA) with an ALV-J-specific monoclonal antibody. The complete genomes of the isolates were sequenced, and it was found that gag and pol were relatively conserved, while env was variable, especially the gp85 gene. Homology analysis of the env gene sequences showed that the env gene of all four isolates had higher similarity to the hemangioma (HE)-type reference strains than to the myeloid leukosis (ML)-type strains; moreover, the HE-type strains' specific deletion of a 205-bp sequence covering the rTM and DR1 in the 3'UTR fragment was also found in the four isolates. Further analysis of the sequences of the env gene subunits revealed an interesting finding: the gp85 of isolates GX14ZS14 and GX14HG04 had a higher similarity to HPRS-103 and a much lower similarity to the HE-type reference strains, resulting in GX14ZS14, GX14HG04, and HPRS-103 being clustered in the same branch, while gp37 had higher similarity to the HE-type reference strains than to HPRS-103, resulting in GX14ZS14, GX14HG04, and the HE-type reference strains being clustered in the same branch. The results suggest that isolates GX14ZS14 and GX14HG04 may be recombinants of the foreign strain HPRS-103 and the local epidemic HE-type strains of ALV-J.
Belak, Zachery R; Ovsenek, Nicholas; Eskiw, Christopher H
2018-05-23
Yin-Yang 1 (YY1) is a highly conserved transcription factor possessing RNA-binding activity. A putative YY1 homologue was previously identified in the developmental model organism Strongylocentrotus purpuratus (the purple sea urchin) by genomic sequencing. We identified a high degree of sequence similarity with YY1 homologues of vertebrate origin which shared 100% protein sequence identity over the DNA- and RNA-binding zinc-finger region with high similarity in the N-terminal transcriptional activation domain. SpYY1 demonstrated identical DNA- and RNA-binding characteristics between Xenopus laevis and S. purpuratus indicating that it maintains similar functional and biochemical properties across widely divergent deuterostome species. SpYY1 binds to the consensus YY1 DNA element, and also to U-rich RNA sequences. Although we detected SpYY1 RNA-binding activity in ova lysates and observed cytoplasmic localization, SpYY1 was not associated with maternal mRNA in ova. SpYY1 expressed in Xenopus oocytes was excluded from the nucleus and associated with maternally expressed cytoplasmic mRNA molecules. These data demonstrate the existence of an YY1 homologue in S. purpuratus with similar structural and biochemical features to those of the well-studied vertebrate YY1; however, the data reveal major differences in the biological role of YY1 in the regulation of maternally expressed mRNA in the two species.
Hepatitis A and E Viruses in Wastewaters, in River Waters, and in Bivalve Molluscs in Italy.
Iaconelli, M; Purpari, G; Della Libera, S; Petricca, S; Guercio, A; Ciccaglione, A R; Bruni, R; Taffon, S; Equestre, M; Fratini, M; Muscillo, M; La Rosa, Giuseppina
2015-12-01
Several studies have reported the detection of hepatitis A (HAV) and E (HEV) virus in sewage waters, indicating a possibility of contamination of aquatic environments. The objective of the present study was to assess the occurrence of HAV and HEV in different water environments, following the route of contamination from raw sewage through treated effluent to the surface waters receiving wastewater discharges. Bivalve molluscan shellfish samples were also analyzed, as sentinels of marine pollution. Samples were tested by nested RT-PCR in the VP1/2A junction for HAV, and in the ORF1 and ORF2 regions for HEV. Hepatitis A RNA was detected in 12 water samples: 7/21 (33.3%) raw sewage samples, 3/21 (14.3%) treated sewage samples, and 2/27 (7.4%) river water samples. Five sequences were classified as genotype IA, while the remaining 7 sequences belonged to genotype IB. In bivalves, HAV was detected in 13/56 samples (23.2%), 12 genotype IB and one genotype IA. Whether the presence of HAV in the matrices tested indicates the potential for waterborne and foodborne transmission is unknown, since infectivity of the virus was not demonstrated. HEV was detected in one raw sewage sample and in one river sample, both belonging to genotype 3. Sequences were similar to sequences detected previously in Italy in patients with autochthonous HEV (no travel history) and in animals (swine). To our knowledge, this is the first detection of HEV in river waters in Italy, suggesting that surface water can be a potential source for exposure.
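The detection rates quoted in this abstract follow directly from the reported counts. The short sketch below recomputes them as a consistency check; the counts are taken from the abstract, while the function and variable names are illustrative.

```python
def detection_rate(positives, tested):
    """Percentage of positive samples, rounded to one decimal place."""
    return round(100.0 * positives / tested, 1)

# (positives, samples tested) for HAV in each water matrix, as reported.
hav_water = {
    "raw sewage": (7, 21),
    "treated sewage": (3, 21),
    "river water": (2, 27),
}

rates = {matrix: detection_rate(p, n) for matrix, (p, n) in hav_water.items()}
total_positive_water = sum(p for p, _ in hav_water.values())
bivalve_rate = detection_rate(13, 56)
```

The three water matrices sum to the 12 HAV-positive water samples stated in the text.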
Molecular analysis of an oyster-related norovirus outbreak.
Nenonen, Nancy P; Hannoun, Charles; Olsson, Margareta B; Bergström, Tomas
2009-06-01
Contaminated raw oysters were implicated in a severe outbreak of norovirus (NoV) gastroenteritis affecting 30 restaurant guests. To define the outbreak source by using molecular methods to characterize NoV strains detected in patient and oyster samples. Molecular epidemiological studies based on nucleotide sequencing and phylogenetic analyses of patient and oyster NoV strains, and comparison to background dataset. NoV genotype (G) I.1 was detected in the one patient stool analyzed by in-house TaqMan real time RT-PCR and classical nested RT-PCR targeting NoV RNA-dependent polymerase (RdRp, 285 nt), and by nested RT-PCR targeting RdRp-capsid-poly(A)-3' (3085 nt). Patient strain showed ≥99% similarity (285 nt) with three NoV strains detected in two of five oysters examined by classical nested RT-PCR (RdRp). A third oyster tested positive for NoV GII.3. Phylogenetic analysis showed clustering of patient and oyster strains related to this outbreak with GI.1 strains from previous local outbreaks, and mussel studies. Sequence data revealed ≥99% similarity (285 nt) between NoV GI.1 strains detected in patient stool and suspect oysters, linking the contaminated oysters to the outbreak. Identification of human NoV GI and GII strains in oysters indicated contamination of human fecal origin, presumably from inappropriate storage in the harbor. Comparative long-fragment analysis of the patient strain revealed 99% similarity (3085 nt) with NoV GI.1 strains detected in previous outbreaks and environmental mussel studies from West Sweden, 87% with M87661 (Norwalk68) and 96% with L23828 (SRSV-KY-89/89/J). These results indicated considerable genomic stability of NoV GI.1 strains over time.
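To give a sense of what a similarity figure over a 285-nt fragment means in practice, the sketch below computes percent identity between two aligned sequences. The abstract does not state how similarity was calculated, so treating it as ungapped identity over aligned columns is an assumption, and the sequences are synthetic placeholders rather than norovirus data.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns, ignoring columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Hypothetical 285-nt reference and a variant with two substitutions.
reference = "ACGT" * 71 + "A"
variant = "CA" + reference[2:]
identity = percent_identity(reference, variant)
```

Under this definition, two mismatches in 285 nt give about 99.3% identity and three give about 98.9%, so a threshold of 99% on a fragment of this length tolerates at most two substitutions.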
Distant plant homologues: don't throw out the baby.
Gardiner, John; Overall, Robyn; Marc, Jan
2012-03-01
Plants and metazoans share many similarities in terms of conserved proteins. Antibodies have been used extensively to detect remote homologues, many of which are yet to be identified conclusively. Genome sequencing and the creation of novel sequence or structure comparison programs have assisted greatly in the identification of distant protein homologues. The continuing development of new software algorithms and the combining of bioinformatics with proteomics offer hope that remaining homologues will be soon identified. Copyright © 2011 Elsevier Ltd. All rights reserved.
Geldenhuys, Marike; Mortlock, Marinda; Weyer, Jacqueline; Bezuidt, Oliver; Seamark, Ernest C. J.; Kearney, Teresa; Gleasner, Cheryl; Erkkila, Tracy H.; Cui, Helen; Markotter, Wanda
2018-01-01
Species within the Neoromicia bat genus are abundant and widely distributed in Africa. It is common for these insectivorous bats to roost in anthropogenic structures in urban regions. Additionally, Neoromicia capensis have previously been identified as potential hosts for Middle East respiratory syndrome (MERS)-related coronaviruses. This study aimed to ascertain the gastrointestinal virome of these bats, as viruses excreted in fecal material or which may be replicating in rectal or intestinal tissues have the greatest opportunities of coming into contact with other hosts. Samples were collected in five regions of South Africa over eight years. Initial virome composition was determined by viral metagenomic sequencing by pooling samples and enriching for viral particles. Libraries were sequenced on the Illumina MiSeq and NextSeq500 platforms, producing a combined 37 million reads. Bioinformatics analysis of the high throughput sequencing data detected the full genome of a novel species of the Circoviridae family, and also identified sequence data from the Adenoviridae, Coronaviridae, Herpesviridae, Parvoviridae, Papillomaviridae, Phenuiviridae, and Picornaviridae families. Metagenomic sequencing data was insufficient to determine the viral diversity of certain families due to the fragmented coverage of genomes and lack of suitable sequencing depth, as some viruses were detected from the analysis of reads-data only. Follow up conventional PCR assays targeting conserved gene regions for the Adenoviridae, Coronaviridae, and Herpesviridae families were used to confirm metagenomic data and generate additional sequences to determine genetic diversity. The complete coding genome of a MERS-related coronavirus was recovered with additional amplicon sequencing on the MiSeq platform. The new genome shared 97.2% overall nucleotide identity to a previous Neoromicia-associated MERS-related virus, also from South Africa. 
Conventional PCR analysis detected diverse adenovirus and herpesvirus sequences that were widespread throughout Neoromicia populations in South Africa. Furthermore, similar adenovirus sequences were detected within these populations throughout several years. With the exception of the coronaviruses, the study represents the first report of sequence data from several viral families within a Southern African insectivorous bat genus; highlighting the need for continued investigations in this regard. PMID:29579103
Mazloom, Amin R; Džakula, Željko; Oeth, Paul; Wang, Huiquan; Jensen, Taylor; Tynan, John; McCullough, Ron; Saldivar, Juan-Sebastian; Ehrich, Mathias; van den Boom, Dirk; Bombard, Allan T; Maeder, Margo; McLennan, Graham; Meschino, Wendy; Palomaki, Glenn E; Canick, Jacob A; Deciu, Cosmin
2013-06-01
Whole-genome sequencing of circulating cell free (ccf) DNA from maternal plasma has enabled noninvasive prenatal testing for common autosomal aneuploidies. The purpose of this study was to extend the detection to include common sex chromosome aneuploidies (SCAs): [47,XXX], [45,X], [47,XXY], and [47,XYY] syndromes. Massively parallel sequencing was performed on ccf DNA isolated from the plasma of 1564 pregnant women with known fetal karyotype. A classification algorithm for SCA detection was constructed and trained on this cohort. Another study of 411 maternal samples from women with blinded-to-laboratory fetal karyotypes was then performed to determine the accuracy of the classification algorithm. In the training cohort, the new algorithm had a detection rate (DR) of 100% (95%CI: 82.3%, 100%), a false positive rate (FPR) of 0.1% (95%CI: 0%, 0.3%), and nonreportable rate of 6% (95%CI: 4.9%, 7.4%) for SCA determination. The blinded validation yielded similar results: DR of 96.2% (95%CI: 78.4%, 99.8%), FPR of 0.3% (95%CI: 0%, 1.8%), and nonreportable rate of 5% (95%CI: 3.2%, 7.7%) for SCA determination. Noninvasive prenatal identification of the most common sex chromosome aneuploidies is possible using ccf DNA and massively parallel sequencing with a high DR and a low FPR. © 2013 John Wiley & Sons, Ltd.
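The detection rates above are reported with 95% confidence intervals, but the abstract does not say how those intervals were computed. The sketch below shows one common choice, the exact (Clopper-Pearson) binomial interval, implemented with the standard library only. The method is an assumption, and the counts in the usage lines are hypothetical rather than taken from the study.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _solve_decreasing(f, target):
    """Bisection on [0, 1] for a function f that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(successes, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for a proportion."""
    if successes == 0:
        lower = 0.0
    else:
        # Lower bound: p with P(X >= successes | p) = alpha / 2.
        lower = _solve_decreasing(
            lambda p: binom_cdf(successes - 1, n, p), 1 - alpha / 2)
    if successes == n:
        upper = 1.0
    else:
        # Upper bound: p with P(X <= successes | p) = alpha / 2.
        upper = _solve_decreasing(
            lambda p: binom_cdf(successes, n, p), alpha / 2)
    return lower, upper

# Hypothetical counts: every one of 19 affected cases detected.
lower_all, upper_all = clopper_pearson(19, 19)
# Hypothetical counts: no events observed in 10 trials.
lower_none, upper_none = clopper_pearson(0, 10)
```

When every case is detected, the lower bound reduces to the closed form (alpha/2)^(1/n). This is why a 100% detection rate on a small number of affected cases still carries a lower limit well below 100%.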
[Detection and diversity analysis of rumen methanogens in the co-cultures with anaerobic fungi].
Cheng, Yan-fen; Mao, Sheng-yong; Pei, Cai-xia; Liu, Jian-xin; Zhu, Wei-yun
2006-12-01
Rumen methanogen diversity in co-cultures with anaerobic fungi from goat rumen was analyzed. Mixed cultures of anaerobic fungi and methanogens were obtained from goat rumen using anaerobic fungal medium with the addition of penicillin and streptomycin and then subcultured 62 times by transferring cultures every 3-4 d. Total DNA from the original rumen fluid and subcultured fungal cultures was used for PCR/DGGE and RFLP analysis. 16S rDNA of clones corresponding to representative OTUs was sequenced. Results showed that the diversity index (Shannon index) of the methanogens generated from DGGE profiles fell from 1.32 in rumen fluid to 0.99 in the fungal culture after 45 subcultures, with the lowest similarity of DGGE profiles at 34.7%. The Shannon index increased from 0.99 after 45 subcultures to 1.15 after 62 subcultures, with the lowest similarity at 89.2%. A total of 5 OTUs were obtained from 69 clones using RFLP analysis, and six clones representing the 5 OTUs were sequenced. Of the 5 OTUs, three had cloned 16S rDNA sequences most closely related to uncultured archaeal symbiont PA202, each with a similarity of 95%, but were not closely related to any identified culturable methanogen. The remaining two OTUs had cloned 16S rDNA sequences sharing the same closest relative, uncultured rumen methanogen 956, each with a similarity of 97%. The 16S rDNA sequences of these two OTUs were also 97% similar to the closest identified culturable methanogen, Methanobrevibacter sp. NT7. In conclusion, diverse yet unidentified rumen methanogen species exist in the co-cultures with anaerobic fungi isolated from the goat rumen.
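The Shannon index used above summarizes how many bands or OTUs are present and how evenly they are represented. A minimal sketch of the calculation follows. The abstract does not state the logarithm base or how DGGE band intensities were quantified, so the natural logarithm and the example abundances are assumptions.

```python
import math

def shannon_index(abundances):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over non-zero abundances."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    h = 0.0
    for a in abundances:
        if a > 0:
            p = a / total
            h -= p * math.log(p)
    return h

# Hypothetical relative band intensities for two DGGE lanes.
even_lane = [1, 1, 1, 1]        # four equally intense bands
skewed_lane = [10, 1, 1, 1]     # one dominant band
h_even = shannon_index(even_lane)
h_skewed = shannon_index(skewed_lane)
```

For a fixed number of bands the index is largest when all bands are equally intense, so a drop in the index can reflect either lost bands or increased dominance by a few.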
Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L
2011-06-02
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
Najm, Nour-Addeen; Meyer-Kayser, Elisabeth; Hoffmann, Lothar; Herb, Ingrid; Fensterer, Veronika; Pfister, Kurt; Silaghi, Cornelia
2014-06-01
Wild canines which are closely related to dogs constitute a potential reservoir for haemoparasites by both hosting tick species that infest dogs and harbouring tick-transmitted canine haemoparasites. In this study, the prevalence of Babesia spp. and Theileria spp. was investigated in German red foxes (Vulpes vulpes) and their ticks. DNA extracts of 261 spleen samples and 1953 ticks included 4 tick species: Ixodes ricinus (n=870), I. canisuga (n=585), I. hexagonus (n=485), and Dermacentor reticulatus (n=13) were examined for the presence of Babesia/Theileria spp. by a conventional PCR targeting the 18S rRNA gene. One hundred twenty-one out of 261 foxes (46.4%) were PCR-positive. Out of them, 44 samples were sequenced, and all sequences had 100% similarity to Theileria annae. Similarly, sequencing was carried out for 65 out of 118 PCR-positive ticks. Theileria annae DNA was detected in 61.5% of the sequenced samples, Babesia microti DNA was found in 9.2%, and Babesia venatorum in 7.6% of the sequenced samples. The foxes were most positive in June and October, whereas the peak of tick positivity was in October. Furthermore, the positivity of the ticks was higher for I. canisuga in comparison to the other tick species and for nymphs in comparison to adults. The high prevalence of T. annae DNA in red foxes in this study suggests a reservoir function of those animals for T. annae. To our knowledge, this is the first report of T. annae in foxes from Germany as well as the first detection of T. annae and B. microti in the fox tick I. canisuga. Detection of DNA of T. annae and B. microti in three tick species collected from foxes adds new potential vectors for these two pathogens and suggests a potential role of the red fox in their natural endemic cycles. Copyright © 2014 Elsevier GmbH. All rights reserved.
Hu, Lujun; Wang, Linlin; Lu, Wenwei; Zhao, Jianxin; Zhang, Hao; Chen, Wei
2017-01-01
A whole-bacterium-based SELEX (Systematic Evolution of Ligands by Exponential Enrichment) procedure was adopted in this study for the selection of an ssDNA aptamer that binds to Bifidobacterium bifidum. After 12 rounds of selection targeted against B. bifidum, 30 sequences were obtained and divided into seven families according to primary sequence homology and similarity of secondary structure. Four FAM (fluorescein amidite) labeled aptamer sequences from different families were selected for further characterization by flow cytometric analysis. The results reveal that the aptamer sequence CCFM641-5 demonstrated high affinity and specificity for B. bifidum compared with the other sequences tested, and the estimated Kd value was 10.69 ± 0.89 nM. Additionally, sequence truncation experiments of the aptamer CCFM641-5 led to the conclusion that the 5′-primer and 3′-primer binding sites were essential for aptamer-target binding. In addition, the possible component of the target B. bifidum, bound by the aptamer CCFM641-5, was identified as a membrane protein by treatment with proteinase. Furthermore, to prove the potential application of the aptamer CCFM641-5, a colorimetric bioassay of the sandwich-type structure was used to detect B. bifidum. The assay had a linear range of 10⁴ to 10⁷ cfu/mL (R² = 0.9834). Therefore, the colorimetric bioassay appears to be a promising method for the detection of B. bifidum based on the aptamer CCFM641-5. PMID:28441340
It's all relative: ranking the diversity of aquatic bacterial communities.
Shaw, Allison K; Halpern, Aaron L; Beeson, Karen; Tran, Bao; Venter, J Craig; Martiny, Jennifer B H
2008-09-01
The study of microbial diversity patterns is hampered by the enormous diversity of microbial communities and the lack of resources to sample them exhaustively. For many questions about richness and evenness, however, one only needs to know the relative order of diversity among samples rather than total diversity. We used 16S libraries from the Global Ocean Survey to investigate the ability of 10 diversity statistics (including rarefaction, non-parametric, parametric, curve extrapolation and diversity indices) to assess the relative diversity of six aquatic bacterial communities. Overall, we found that the statistics yielded remarkably similar rankings of the samples for a given sequence similarity cut-off. This correspondence, despite the different underlying assumptions of the statistics, suggests that diversity statistics are a useful tool for ranking samples of microbial diversity. In addition, sequence similarity cut-off influenced the diversity ranking of the samples, demonstrating that diversity statistics can also be used to detect differences in phylogenetic structure among microbial communities. Finally, a subsampling analysis suggests that further sequencing from these particular clone libraries would not have substantially changed the richness rankings of the samples.
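Rarefaction, one of the statistics named above, compares samples of unequal size by asking how many OTUs would be expected in a random subsample of a common depth. The sketch below uses the standard hypergeometric formula for expected richness. The OTU counts are hypothetical and are not taken from the Global Ocean Survey libraries.

```python
import math

def rarefied_richness(otu_counts, n):
    """Expected number of OTUs in a random subsample of n sequences."""
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must be between 0 and the library size")
    denom = math.comb(total, n)
    # For each OTU, 1 - P(OTU is absent from the subsample).
    return sum(1 - math.comb(total - c, n) / denom for c in otu_counts if c > 0)

# Hypothetical clone libraries of different sizes.
library_a = [50, 30, 15, 5]          # 100 clones, 4 OTUs, uneven
library_b = [4] * 10                 # 40 clones, 10 OTUs, even
depth = 20                           # common subsampling depth (assumed)
ranking = sorted(
    {"A": rarefied_richness(library_a, depth),
     "B": rarefied_richness(library_b, depth)}.items(),
    key=lambda item: item[1],
    reverse=True,
)
```

Only the order of the rarefied values is needed for ranking, which is the point made in the abstract: relative diversity can be compared without estimating total diversity.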
Fukuda, Shinji; Sasaki, Yukie; Seno, Masato
2008-01-01
We developed a two-step isothermal amplification assay system, which achieved the detection of norovirus (NoV) genomes in oysters with a sensitivity similar to that of reverse transcription-seminested PCR. The time taken for the amplification of NoV genomes from RNA extracts was shortened to about 3 h. PMID:18456857
Tasaki, E; Hirayama, J; Tazumi, A; Hayashi, K; Hara, Y; Ueno, H; Moore, J E; Millar, B C; Matsuda, M
2012-02-01
A novel clustered regularly-interspaced short palindromic repeats (CRISPRs) locus [7,500 base pairs (bp) in length] occurred in the urease-positive thermophilic Campylobacter (UPTC) Japanese isolate, CF89-12. The 7,500 bp locus consisted of the 5'-methylaminomethyl-2-thiouridylate methyltransferase gene, putative (P) CRISPR associated (p-Cas), putative open reading frames, Cas1 and Cas2, leader sequence region (146 bp), 12 CRISPRs consensus sequence repeats (each 36 bp) separated by a non-repetitive unique spacer region of similar length (26-31 bp) and the phosphatidyl glycerophosphatase A gene. When the CRISPRs loci in the UPTC CF89-12 and five C. jejuni isolates were compared with one another, all six isolates contained p-Cas, Cas1 and Cas2 within the loci. Four to 12 CRISPRs consensus sequence repeats separated by a non-repetitive unique spacer region occurred in the six isolates, and the nucleotide sequences of those repeats showed approximately 92-100% similarity with each other. However, no sequence similarity occurred in the unique spacer regions among these isolates. The putative σ(70) transcriptional promoter and the hypothetical ρ-independent terminator structures for the CRISPRs and Cas were detected. No in vivo transcription of p-Cas, Cas1 and Cas2 was confirmed in the UPTC cells.
Unusual Intron Conservation near Tissue-Regulated Exons Found by Splicing Microarrays
Sugnet, Charles W; Srinivasan, Karpagam; Clark, Tyson A; O'Brien, Georgeann; Cline, Melissa S; Wang, Hui; Williams, Alan; Kulp, David; Blume, John E; Haussler, David; Ares, Manuel
2006-01-01
Alternative splicing contributes to both gene regulation and protein diversity. To discover broad relationships between regulation of alternative splicing and sequence conservation, we applied a systems approach, using oligonucleotide microarrays designed to capture splicing information across the mouse genome. In a set of 22 adult tissues, we observe differential expression of RNA containing at least two alternative splice junctions for about 40% of the 6,216 alternative events we could detect. Statistical comparisons identify 171 cassette exons whose inclusion or skipping is different in brain relative to other tissues and another 28 exons whose splicing is different in muscle. A subset of these exons is associated with unusual blocks of intron sequence whose conservation in vertebrates rivals that of protein-coding exons. By focusing on sets of exons with similar regulatory patterns, we have identified new sequence motifs implicated in brain and muscle splicing regulation. Of note is a motif that is strikingly similar to the branchpoint consensus but is located downstream of the 5′ splice site of exons included in muscle. Analysis of three paralogous membrane-associated guanylate kinase genes reveals that each contains a paralogous tissue-regulated exon with a similar tissue inclusion pattern. While the intron sequences flanking these exons remain highly conserved among mammalian orthologs, the paralogous flanking intron sequences have diverged considerably, suggesting unusually complex evolution of the regulation of alternative splicing in multigene families. PMID:16424921
Lowe, Chinn-Woan; Thiriot, Joseph D.; Heder, Michael J.; March, Jordon K.; Drake, David S.; Lew, Cynthia S.; Bunnell, Annette J.; Moore, Emily S.; O'Neill, Kim L.; Robison, Richard A.
2016-01-01
The Burkholderia pseudomallei complex classically consisted of B. mallei, B. pseudomallei, and B. thailandensis, but has now expanded to include B. oklahomensis, B. humptydooensis, and three unassigned Burkholderia clades. Methods for detecting and differentiating the B. pseudomallei complex have been the topic of recent research due to phenotypic and genotypic similarities of these species. B. mallei and B. pseudomallei are recognized as CDC Tier 1 select agents, and are the causative agents of glanders and melioidosis, respectively. Although B. thailandensis and B. oklahomensis are generally avirulent, both display phenotypic characteristics similar to those of B. pseudomallei. B. humptydooensis and the Burkholderia clades are genetically similar to the B. pseudomallei complex, and are not associated with disease. Optimal identification of these species remains problematic, and PCR-based methods can resolve issues with B. pseudomallei complex detection and differentiation. Currently, no PCR assay is available that detects the major species of the B. pseudomallei complex. A real-time PCR assay in a multiplex single-tube format was developed to simultaneously detect and differentiate B. mallei, B. pseudomallei, and B. thailandensis, and a common sequence found in B. pseudomallei, B. mallei, B. thailandensis, and B. oklahomensis. A total of 309 Burkholderia isolates and 5 other bacterial species were evaluated. The assay was 100% sensitive and specific, demonstrated sensitivity beyond culture and GC methods for the isolates tested, and is completed in about an hour with a detection limit between 2.6 pg and 48.9 pg of gDNA. Bioinformatic analyses also showed the assay is likely 100% specific and sensitive for all 84 fully sequenced B. pseudomallei, B. mallei, B. thailandensis, and B. oklahomensis strains currently available in GenBank. For these reasons, this assay could be a rapid and sensitive tool in the detection and differentiation of those species of the B. pseudomallei complex with recognized clinical and practical significance. PMID:27736903
Lowe, Chinn-Woan; Satterfield, Benjamin A; Nelson, Daniel B; Thiriot, Joseph D; Heder, Michael J; March, Jordon K; Drake, David S; Lew, Cynthia S; Bunnell, Annette J; Moore, Emily S; O'Neill, Kim L; Robison, Richard A
2016-01-01
The Burkholderia pseudomallei complex classically consisted of B. mallei, B. pseudomallei, and B. thailandensis, but has now expanded to include B. oklahomensis, B. humptydooensis, and three unassigned Burkholderia clades. Methods for detecting and differentiating the B. pseudomallei complex have been the topic of recent research due to phenotypic and genotypic similarities of these species. B. mallei and B. pseudomallei are recognized as CDC Tier 1 select agents, and are the causative agents of glanders and melioidosis, respectively. Although B. thailandensis and B. oklahomensis are generally avirulent, both display phenotypic characteristics similar to those of B. pseudomallei. B. humptydooensis and the Burkholderia clades are genetically similar to the B. pseudomallei complex, and are not associated with disease. Optimal identification of these species remains problematic, and PCR-based methods can resolve issues with B. pseudomallei complex detection and differentiation. Currently, no PCR assay is available that detects the major species of the B. pseudomallei complex. A real-time PCR assay in a multiplex single-tube format was developed to simultaneously detect and differentiate B. mallei, B. pseudomallei, and B. thailandensis, and a common sequence found in B. pseudomallei, B. mallei, B. thailandensis, and B. oklahomensis. A total of 309 Burkholderia isolates and 5 other bacterial species were evaluated. The assay was 100% sensitive and specific, demonstrated sensitivity beyond culture and GC methods for the isolates tested, and is completed in about an hour with a detection limit between 2.6 pg and 48.9 pg of gDNA. Bioinformatic analyses also showed the assay is likely 100% specific and sensitive for all 84 fully sequenced B. pseudomallei, B. mallei, B. thailandensis, and B. oklahomensis strains currently available in GenBank. For these reasons, this assay could be a rapid and sensitive tool in the detection and differentiation of those species of the B. pseudomallei complex with recognized clinical and practical significance.
Cheng, Ji-Hong; Liu, Wen-Chun; Chang, Ting-Tsung; Hsieh, Sun-Yuan; Tseng, Vincent S
2017-10-01
Many studies have suggested that deletions in the hepatitis B virus (HBV) genome are associated with the development of progressive liver diseases, ultimately even resulting in hepatocellular carcinoma (HCC). Among the methods for detecting deletions from next-generation sequencing (NGS) data, few consider the characteristics of viruses, such as high evolution rates and high divergence among different HBV genomes. Sequencing highly divergent HBV genomes with NGS technology outputs millions of reads. Thus, detecting exact breakpoints of deletions in these large and complex data incurs very high computational cost. We propose a novel analytical method named VirDelect (Virus Deletion Detect), which uses split-read alignment to detect exact breakpoints and a diversity variable to account for high divergence in single-end read data, such that the computational cost can be reduced without losing accuracy. We use four simulated read datasets and two real paired-end read datasets of HBV genome sequences to verify the accuracy of VirDelect by score functions. The experimental results show that VirDelect outperforms the state-of-the-art method Pindel in terms of accuracy score on all simulated datasets, and VirDelect had only two base errors even on the real datasets. VirDelect is also shown to deliver high accuracy in analyzing single-end read data as well as paired-end data. VirDelect can serve as an accurate and efficient bioinformatics tool for physiologists, and is applicable to further analysis of genomes with characteristics similar to HBV in genome length and high divergence. The software program of VirDelect can be downloaded at https://sourceforge.net/projects/virdelect/. Copyright © 2017. Published by Elsevier Inc.
Unconventional P-35S sequence identified in genetically modified maize
Al-Hmoud, Nisreen; Al-Husseini, Nawar; Ibrahim-Alobaide, Mohammed A; Kübler, Eric; Farfoura, Mahmoud; Alobydi, Hytham; Al-Rousan, Hiyam
2014-01-01
The Cauliflower Mosaic Virus 35S promoter sequence, CaMV P-35S, is one of several commonly used genetic targets to detect genetically modified maize and is found in most GMOs. In this research we report the finding of an alternative P-35S sequence and its incidence in GM maize marketed in Jordan. The primer pair normally used to amplify a 123 bp DNA fragment of the CaMV P-35S promoter in GMOs also amplified a previously undetected alternative sequence of CaMV P-35S in GM maize samples which we term V3. The amplified V3 sequence comprises 386 base pairs and was not found in the standard wild-type maize, MON810 and MON 863 GM maize. The identified GM maize samples carrying the V3 sequence were found free of CaMV when compared with CaMV infected brown mustard sample. The data of sequence alignment analysis of the V3 genetic element showed 90% similarity with the matching P-35S sequence of the cauliflower mosaic virus isolate CabbB-JI and 99% similarity with matching P-35S sequences found in several binary plant vectors, of which the binary vector locus JQ693018 is one example. The current study showed an increase of 44% in the incidence of the identified 386 bp sequence in GM maize sold in Jordan’s markets during the period 2009 and 2012. PMID:24495911
Nagano, Daisuke; Sivakumar, Thillaiampalam; De De Macedo, Alane Caine Costa; Inpankaew, Tawin; Alhassan, Andy; Igarashi, Ikuo; Yokoyama, Naoaki
2013-11-01
In the present study, we screened blood DNA samples obtained from cattle bred in Brazil (n=164) and Ghana (n=80) for Babesia bovis using a diagnostic PCR assay and found prevalences of 14.6% and 46.3%, respectively. Subsequently, the genetic diversity of B. bovis in Thailand, Brazil and Ghana was analyzed, based on the DNA sequence of merozoite surface antigen-1 (MSA-1). In Thailand, MSA-1 sequences were relatively conserved and found in a single clade of the phylogram, while Brazilian MSA-1 sequences showed high genetic diversity and were dispersed across three different clades. In contrast, the sequences from Ghanaian samples were detected in two different clades, one of which contained only a single Ghanaian sequence. The identities among the MSA-1 sequences from Thailand, Brazil and Ghana were 99.0-100%, 57.5-99.4% and 60.3-100%, respectively, while the similarities among the deduced MSA-1 amino acid sequences within the respective countries were 98.4-100%, 59.4-99.7% and 58.7-100%, respectively. These observations suggested that the genetic diversity of B. bovis based on MSA-1 sequences was higher in Brazil and Ghana than in Thailand. The current data highlight the importance of conducting extensive studies on the genetic diversity of B. bovis before designing immune control strategies in each surveyed country.
Galbadrakh, Bulgan; Lee, Kyung-Eun; Park, Hyun-Seok
2012-12-01
Grammatical inference methods are expected to find grammatical structures hidden in biological sequences. One hopes that studies of grammar serve as an appropriate tool for theory formation. Thus, we have developed JSequitur for automatically generating the grammatical structure of biological sequences in an inference framework of string compression algorithms. Our original motivation was to find any grammatical traits of several cancer genes that can be detected by string compression algorithms. Through this research, we could not find any meaningful unique traits of the cancer genes yet, but we could observe some interesting traits in regards to the relationship among gene length, similarity of sequences, the patterns of the generated grammar, and compression rate.
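The idea of inferring a grammar by string compression can be illustrated with a small Python sketch. It does not reproduce JSequitur or the Sequitur algorithm; as a simpler stand-in it uses offline digram replacement (in the style of Re-Pair), repeatedly turning the most frequent adjacent symbol pair into a new rule. The function names, rule names (R1, R2, ...), and the toy sequence are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List, Tuple

Rules = Dict[str, Tuple[str, str]]


def infer_grammar(sequence: str) -> Tuple[List[str], Rules]:
    """Re-Pair-style sketch: replace the most frequent adjacent pair with a rule."""
    symbols = list(sequence)
    rules: Rules = {}
    while True:
        pairs = Counter(zip(symbols, symbols[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # no pair repeats, nothing left to compress
            break
        name = f"R{len(rules) + 1}"  # hypothetical rule naming scheme
        rules[name] = pair
        rewritten, i = [], 0
        while i < len(symbols):
            # Replace non-overlapping occurrences of the pair, left to right.
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                rewritten.append(name)
                i += 2
            else:
                rewritten.append(symbols[i])
                i += 1
        symbols = rewritten
    return symbols, rules


def expand(symbols, rules: Rules) -> str:
    """Expand start symbols back into the original string."""
    return "".join(expand(rules[s], rules) if s in rules else s for s in symbols)


toy = "ACGTACGTACGT"  # toy sequence, not real gene data
start, grammar = infer_grammar(toy)
print(start, grammar)
print(expand(start, grammar) == toy)
```

A crude compression rate for such a grammar could be taken as the number of symbols in the start string plus the rule bodies, divided by the original sequence length; how JSequitur defines its rate is not stated in the abstract.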
Goordial, J; Altshuler, Ianina; Hindson, Katherine; Chan-Yam, Kelly; Marcolefas, Evangelos; Whyte, Lyle G
2017-01-01
Significant progress is being made in the development of the next generation of low cost life detection instrumentation with much smaller size, mass and energy requirements. Here, we describe in situ life detection and sequencing in the field in soils overlying ice wedges in polygonal permafrost terrain on Axel Heiberg Island, located in the Canadian high Arctic (79°26'N), an analog to the polygonal permafrost terrain observed on Mars. The life detection methods used here include (1) the cryo-iPlate for culturing microorganisms using diffusion of in situ nutrients into semi-solid media, (2) a Microbial Activity Microassay (MAM) plate (BIOLOG Ecoplate) for detecting viable extant microorganisms through a colourimetric assay, and (3) the Oxford Nanopore MinION for nucleic acid detection and sequencing of environmental samples and the products of MAM plate and cryo-iPlate. We obtained 39 microbial isolates using the cryo-iPlate, which included several putatively novel strains based on the 16S rRNA gene, including a Pedobacter sp. (96% closest similarity in GenBank) which we partially genome sequenced using the MinION. The MAM plate successfully identified an active community capable of L-serine metabolism, which was used for metagenomic sequencing with the MinION to identify the active and enriched community. A metagenome on environmental ice wedge soil samples was completed, with base calling and uplink/downlink carried out via satellite internet. Validation of MinION sequencing using the Illumina MiSeq platform was consistent with the results obtained with the MinION. The instrumentation and technology utilized here are pre-existing, low cost, low mass, low volume, and offer the prospect of equipping micro-rovers and micro-penetrators with aggressive astrobiological capabilities. 
Since potentially habitable astrobiology targets have been identified (RSLs on Mars, near subsurface water ice on Mars, the plumes and oceans of Europa and Enceladus), future astrobiology missions will certainly target these areas and there is a need for direct life detection instrumentation.
NASA Astrophysics Data System (ADS)
Akhir, Nor Azurah Mat; Nadzirin, Nurul; Mohamed, Rahmah; Firdaus-Raih, Mohd
2015-09-01
Hypothetical proteins of bacterial pathogens represent a large number of novel biological mechanisms which could belong to essential pathways in the bacteria. They lack functional characterization mainly due to the inability of sequence homology based methods to detect functional relationships in the absence of detectable sequence similarity. The dataset derived from this study showed 550 candidates conserved in genomes that have pathogenicity information and present only in the Burkholderiales order. The dataset has been narrowed down to taxonomic clusters. Ten proteins were selected for ORF amplification; seven of them were successfully amplified, and only four proteins were successfully expressed. These proteins will be great candidates for determining the true function via structural biology.
Deutscher, Ania T; Burke, Catherine M; Darling, Aaron E; Riegler, Markus; Reynolds, Olivia L; Chapman, Toni A
2018-05-05
Gut microbiota affects tephritid (Diptera: Tephritidae) fruit fly development, physiology, behavior, and thus the quality of flies mass-reared for the sterile insect technique (SIT), a target-specific, sustainable, environmentally benign form of pest management. The Queensland fruit fly, Bactrocera tryoni (Tephritidae), is a significant horticultural pest in Australia and can be managed with SIT. Little is known about the impacts that laboratory-adaptation (domestication) and mass-rearing have on the tephritid larval gut microbiome. Read lengths of previous fruit fly next-generation sequencing (NGS) studies have limited the resolution of microbiome studies, and the diversity within populations is often overlooked. In this study, we used a new near full-length (> 1300 nt) 16S rRNA gene amplicon NGS approach to characterize gut bacterial communities of individual B. tryoni larvae from two field populations (developing in peaches) and three domesticated populations (mass- or laboratory-reared on artificial diets). Near full-length 16S rRNA gene sequences were obtained for 56 B. tryoni larvae. OTU clustering at 99% similarity revealed that gut bacterial diversity was low and significantly lower in domesticated larvae. Bacteria commonly associated with fruit (Acetobacteraceae, Enterobacteriaceae, and Leuconostocaceae) were detected in wild larvae, but were largely absent from domesticated larvae. However, Asaia, an acetic acid bacterium not frequently detected within adult tephritid species, was detected in larvae of both wild and domesticated populations (55 out of 56 larval gut samples). Larvae from the same single peach shared a similar gut bacterial profile, whereas larvae from different peaches collected from the same tree had different gut bacterial profiles. Clustering of the Asaia near full-length sequences at 100% similarity showed that the wild flies from different locations had different Asaia strains. Variation in the gut bacterial communities of B. tryoni larvae depends on diet, domestication, and horizontal acquisition. Bacterial variation in wild larvae suggests that more than one bacterial species can perform the same functional role; however, Asaia could be an important gut bacterium in larvae and warrants further study. A greater understanding of the functions of the bacteria detected in larvae could lead to increased fly quality and performance as part of the SIT.
Poly A- transcripts expressed in HeLa cells.
Wu, Qingfa; Kim, Yeong C; Lu, Jian; Xuan, Zhenyu; Chen, Jun; Zheng, Yonglan; Zhou, Tom; Zhang, Michael Q; Wu, Chung-I; Wang, San Ming
2008-07-30
Transcripts expressed in eukaryotes are classified as poly A+ transcripts or poly A- transcripts based on the presence or absence of the 3' poly A tail. Most transcripts identified so far are poly A+ transcripts, whereas the poly A- transcripts remain largely unknown. We developed the TRD (Total RNA Detection) system for transcript identification. The system detects the transcripts through the following steps: 1) depleting the abundant ribosomal and small-size transcripts; 2) synthesizing cDNA without regard to the status of the 3' poly A tail; 3) applying the 454 sequencing technology for massive 3' EST collection from the cDNA; and 4) determining the genome origins of the detected transcripts by mapping the sequences to the human genome reference sequences. Using this system, we characterized the cytoplasmic transcripts from HeLa cells. Of the 13,467 distinct 3' ESTs analyzed, 24% are poly A-, 36% are poly A+, and 40% are bimorphic with poly A+ features but without the 3' poly A tail. Most of the poly A- 3' ESTs do not match known transcript sequences; they have a similar distribution pattern in the genome as the poly A+ and bimorphic 3' ESTs, and their mapped intergenic regions are evolutionarily conserved. Experiments confirmed the authenticity of the detected poly A- transcripts. Our study provides the first large-scale sequence evidence for the presence of poly A- transcripts in eukaryotes. The abundance of the poly A- transcripts highlights the need for comprehensive identification of these transcripts for decoding the transcriptome, annotating the genome and studying biological relevance of the poly A- transcripts.
Campillo-Brocal, Jonatan C; Chacón-Verdú, María Dolores; Lucas-Elío, Patricia; Sánchez-Amat, Antonio
2015-03-24
L-Amino acid oxidases (LAOs) have been generally described as flavoproteins that oxidize amino acids releasing the corresponding ketoacid, ammonium and hydrogen peroxide. The generation of hydrogen peroxide gives to these enzymes antimicrobial characteristics. They are involved in processes such as biofilm development and microbial competition. LAOs are of great biotechnological interest in different applications such as the design of biosensors, biotransformations and biomedicine. The marine bacterium Marinomonas mediterranea synthesizes LodA, the first known LAO that contains a quinone cofactor. LodA is encoded in an operon that contains a second gene coding for LodB, a protein required for the post-translational modification generating the cofactor. Recently, GoxA, a quinoprotein with sequence similarity to LodA but with a different enzymatic activity (glycine oxidase instead of lysine-ε-oxidase) has been described. The aim of this work has been to study the distribution of genes similar to lodA and/or goxA in sequenced microbial genomes and to get insight into the evolution of this novel family of proteins through phylogenetic analysis. Genes encoding LodA-like proteins have been detected in several bacterial classes. However, they are absent in Archaea and detected only in a small group of fungi of the class Agaromycetes. The vast majority of the genes detected are in a genome region with a nearby lodB-like gene suggesting a specific interaction between both partner proteins. Sequence alignment of the LodA-like proteins allowed the detection of several conserved residues. All of them showed a Cys and a Trp that aligned with the residues that are forming part of the cysteine tryptophilquinone (CTQ) cofactor in LodA. Phylogenetic analysis revealed that LodA-like proteins can be clustered in different groups. Interestingly, LodA and GoxA are in different groups, indicating that those groups are related to the enzymatic activity of the proteins detected. 
Genome mining has revealed for the first time the broad distribution of LodA-like proteins containing a CTQ cofactor in many different microbial groups. This study provides a platform to explore the potentially novel enzymatic activities of the proteins detected, the mechanisms of post-translational modifications involved in their synthesis, as well as their biological relevance.
Tropical Archaea: Diversity associated with the surface microlayer of corals
Kellogg, C.A.
2004-01-01
Recent 16S rDNA studies have focused on detecting uncultivated bacteria associated with Caribbean reef corals in an effort to address the ecological roles of coral-associated microbes. Reports of Archaea associated with fishes and marine invertebrates raised the question of whether Archaea might also be part of the coral-associated microbial community. DNA analysis of mucus from 3 reef-building species of Caribbean corals, Montastraea annularis complex, Diploria strigosa and D. labyrinthiformis in the US Virgin Islands yielded 34 groups of archaeal 16S ribotypes (defined at the level of 97% similarity). The majority (75%) was most closely matched by BLAST searches to sequences derived from marine water column samples, whereas the remaining ribotypes were most similar to sequences isolated from anoxic environments (15%) and hydrothermal vents (9%). Unlike previous 16S studies of coral-associated Bacteria, the results do not suggest specific associations between particular archaeal sequences and individual coral species. Marine Archaea (Groups I, II and III) in addition to Thermoplasma-like, methanogen, and marine benthic crenarchaeote phylotypes, were detected in the mucus of tropical corals. The finding of sequences from coral-associated Archaea that are closely related to strict and facultative anaerobes, as well as to uncultivated Archaea from other types of anoxic environments, suggests that anaerobic micro-niches may exist in coral mucus layers. Archaea, with their unique biogeochemical capabilities, broaden the scope of possible interactions between corals and their associated microbial communities.
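Grouping sequences into ribotypes at a similarity cutoff such as 97% can be sketched as a greedy clustering step. The Python sketch below is only an illustration under strong assumptions: sequences are taken to be pre-aligned and of equal length, identity is the fraction of matching positions, and each sequence joins the first cluster whose representative it matches. The abstract does not say which clustering procedure was actually used, and the function names and toy sequences are hypothetical.

```python
from typing import List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: List[str], threshold: float = 0.97) -> List[List[int]]:
    """Assign each sequence to the first cluster whose representative it matches."""
    clusters: List[List[int]] = []  # each cluster holds indices into seqs
    for i, seq in enumerate(seqs):
        for members in clusters:
            # The first member of a cluster acts as its representative.
            if identity(seq, seqs[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])  # no representative was close enough
    return clusters


# Toy 100-nt sequences differing from the first at 2 and 4 positions.
toy = ["A" * 100, "A" * 98 + "CC", "A" * 96 + "CCCC"]
print(greedy_cluster(toy, threshold=0.97))
```

Greedy clustering of this kind depends on input order, so the number of groups it reports can differ from that of other clustering procedures applied at the same cutoff.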
Analysis of koi herpesvirus latency in wild common carp and ornamental koi in Oregon, USA.
Xu, Jia-Rong; Bently, Jennifer; Beck, Linda; Reed, Aimee; Miller-Morgan, Tim; Heidel, Jerry R; Kent, Michael L; Rockey, Daniel D; Jin, Ling
2013-02-01
Koi herpesvirus (KHV) infection is associated with high mortalities in both common carp (Cyprinus carpio carpio) and koi carp (Cyprinus carpio koi) worldwide. Although acute infection has been reported in both domestic and wild common carp, the status of KHV latent infection is largely unknown in wild common carp. To investigate whether KHV latency is present in wild common carp, the distribution of KHV latent infection was investigated in two geographically distinct populations of wild common carp in Oregon, as well as in koi from an Oregon-based commercial supplier. Latent KHV infection was demonstrated in white blood cells from each of these populations. Although KHV isolated from acute infections has two distinct genetic groups, Asian and European, KHV detected in wild carp has not been genetically characterized. DNA sequences from ORF 25 to 26 that are unique between Asian and European were investigated in this study. KHV from captive koi and some wild common carp were found to have ORF-25-26 sequences similar to KHV-J (Asian), while the majority of KHV DNA detected in wild common carp has similarity to KHV-U/-I (European). In addition, DNA sequences from IL-10, and TNFR were sequenced and compared with no differences found, which suggests immune suppressor genes of KHV are conserved between KHV in wild common carp and koi, and is consistent with KHV-U, -I, -J. Copyright © 2012 Elsevier B.V. All rights reserved.
A travel time forecasting model based on change-point detection method
NASA Astrophysics Data System (ADS)
LI, Shupeng; GUANG, Xiaoping; QIAN, Yongsheng; ZENG, Junwei
2017-06-01
Travel time parameters obtained from road traffic sensor data play an important role in traffic management practice. In this paper, a travel time forecasting model based on change-point detection is proposed for urban road traffic sensor data. A first-order differential operation is used to preprocess the actual loop data; a change-point detection algorithm is designed to classify the sequence of a large number of travel time data items into several patterns; then a travel time forecasting model is established based on the autoregressive integrated moving average (ARIMA) model. By computer simulation, different control parameters are chosen for adaptive change-point search over the travel time series, which is divided into several sections of similar state. Then a linear weight function is used to fit the travel time sequence and to forecast travel time. The results show that the model has high accuracy in travel time forecasting.
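As a rough illustration of the preprocessing and segmentation steps described in this abstract, the following Python sketch takes the first-order difference of a travel-time series and starts a new section wherever the absolute difference exceeds a threshold. The threshold rule, the function names, and the sample data are assumptions for illustration; the abstract does not specify the actual change-point algorithm or its control parameters, and the ARIMA forecasting step is not shown.

```python
from typing import List, Tuple


def first_difference(series: List[float]) -> List[float]:
    """Return the first-order difference x[t] - x[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def change_points(series: List[float], threshold: float) -> List[int]:
    """Indices t where |x[t] - x[t-1]| exceeds the threshold (assumed rule)."""
    diffs = first_difference(series)
    return [i + 1 for i, d in enumerate(diffs) if abs(d) > threshold]


def segments(series: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Split the series into half-open [start, end) sections of similar state."""
    bounds = [0] + change_points(series, threshold) + [len(series)]
    return [(s, e) for s, e in zip(bounds, bounds[1:]) if e > s]


# Hypothetical travel times (seconds) for one road link.
travel_times = [60.0, 61.0, 59.5, 90.0, 91.5, 89.0, 62.0, 61.0]
print(segments(travel_times, threshold=10.0))  # [(0, 3), (3, 6), (6, 8)]
```

Here the threshold plays the role of a control parameter: a smaller value yields more, shorter sections, and a larger value merges neighbouring states.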
Detection and isolation of novel rhizopine-catabolizing bacteria from the environment
Gardener; de Bruijn FJ
1998-12-01
Microbial rhizopine-catabolizing (Moc) activity was detected in serial dilutions of soil and rhizosphere washes. The activity observed generally ranged between 10^6 and 10^7 catabolic units per g, and the numbers of nonspecific culture-forming units were found to be approximately 10 times higher. A diverse set of 37 isolates was obtained by enrichment on scyllo-inosamine-containing media. However, none of the bacteria that were isolated were found to contain DNA sequences homologous to the known mocA, mocB, and mocC genes of Sinorhizobium meliloti L5-30. Twenty-one of the isolates could utilize an SI preparation as the sole carbon and nitrogen source for growth. Partial sequencing of 16S ribosomal DNAs (rDNAs) amplified from these strains indicated that five distinct bacterial genera (Arthrobacter, Sinorhizobium, Pseudomonas, Aeromonas, and Alcaligenes) were represented in this set. Only 6 of these 21 isolates could catabolize 3-O-methyl-scyllo-inosamine under standard assay conditions. Two of these, strains D1 and R3, were found to have 16S rDNA sequences very similar to those of Sinorhizobium meliloti. However, these strains are not symbiotically effective on Medicago sativa, and DNA sequences homologous to the nodB and nodC genes were not detected in strains D1 and R3 by Southern hybridization analysis.
Detection and Isolation of Novel Rhizopine-Catabolizing Bacteria from the Environment
Gardener, Brian B. McSpadden; de Bruijn, Frans J.
1998-01-01
Microbial rhizopine-catabolizing (Moc) activity was detected in serial dilutions of soil and rhizosphere washes. The activity observed generally ranged between 10^6 and 10^7 catabolic units per g, and the numbers of nonspecific culture-forming units were found to be approximately 10 times higher. A diverse set of 37 isolates was obtained by enrichment on scyllo-inosamine-containing media. However, none of the bacteria that were isolated were found to contain DNA sequences homologous to the known mocA, mocB, and mocC genes of Sinorhizobium meliloti L5-30. Twenty-one of the isolates could utilize an SI preparation as the sole carbon and nitrogen source for growth. Partial sequencing of 16S ribosomal DNAs (rDNAs) amplified from these strains indicated that five distinct bacterial genera (Arthrobacter, Sinorhizobium, Pseudomonas, Aeromonas, and Alcaligenes) were represented in this set. Only 6 of these 21 isolates could catabolize 3-O-methyl-scyllo-inosamine under standard assay conditions. Two of these, strains D1 and R3, were found to have 16S rDNA sequences very similar to those of Sinorhizobium meliloti. However, these strains are not symbiotically effective on Medicago sativa, and DNA sequences homologous to the nodB and nodC genes were not detected in strains D1 and R3 by Southern hybridization analysis. PMID:9835587
Molecular detection of viral agents in free-ranging and captive neotropical felids in Brazil.
Furtado, Mariana M; Taniwaki, Sueli A; de Barros, Iracema N; Brandão, Paulo E; Catão-Dias, José L; Cavalcanti, Sandra; Cullen, Laury; Filoni, Claudia; Jácomo, Anah T de Almeida; Jorge, Rodrigo S P; Silva, Nairléia Dos Santos; Silveira, Leandro; Ferreira Neto, José S
2017-09-01
We describe molecular testing for felid alphaherpesvirus 1 (FHV-1), carnivore protoparvovirus 1 (CPPV-1), feline calicivirus (FCV), alphacoronavirus 1 (feline coronavirus [FCoV]), feline leukemia virus (FeLV), feline immunodeficiency virus (FIV), and canine distemper virus (CDV) in whole blood samples of 109 free-ranging and 68 captive neotropical felids from Brazil. Samples from 2 jaguars (Panthera onca) and 1 oncilla (Leopardus tigrinus) were positive for FHV-1; 2 jaguars, 1 puma (Puma concolor), and 1 jaguarundi (Herpailurus yagouaroundi) tested positive for CPPV-1; and 1 puma was positive for FIV. Based on comparison of 103 nucleotides of the UL24-UL25 gene, the FHV-1 sequences were 99-100% similar to the FHV-1 strain of domestic cats. Nucleotide sequences of CPPV-1 were closely related to sequences detected in other wild carnivores, comparing 294 nucleotides of the VP1 gene. The FIV nucleotide sequence detected in the free-ranging puma, based on comparison of 444 nucleotides of the pol gene, grouped with other lentiviruses described in pumas, and had 82.4% identity with a free-ranging puma from Yellowstone Park and 79.5% with a captive puma from Brazil. Our data document the circulation of FHV-1, CPPV-1, and FIV in neotropical felids in Brazil.
Pistachio (Pistacia vera L.) is a new natural host of Hop stunt viroid.
Elleuch, Amine; Hamdi, Imen; Ellouze, Olfa; Ghrab, Mohamed; Fkahfakh, Hatem; Drira, Noureddine
2013-10-01
Besides hop, Hop stunt viroid (HpSVd) infects many woody species including grapevine, citrus, peach, plum, apricot, almond, pomegranate, mulberry and jujube. Here, we report the first detection of HpSVd in pistachio (Pistacia vera L.). Samples corresponding to 16 pistachio cultivars were obtained from a nearby almond collection. From these samples, low molecular weight RNAs were extracted for double polyacrylamide gel electrophoresis, northern-blot analysis and reverse transcription polymerase chain reaction assays. HpSVd was detected in 4 of the 16 pistachio cultivars in the first year and in 6 in the second, and was also detected in the almond collection. Examination of the nucleotide sequences of pistachio and almond isolates revealed 13 new sequence variants. Sequences from pistachio shared 92-96% similarity with the first reported HpSVd sequence (GenBank X00009), and multiple alignment and phylogenetic analyses showed that one pistachio isolate (HpSVdPis67Jabari) clustered with the plum group, whereas all the others clustered with the hop, and the recombinants plum-citrus and plum-Hop/cit3 groups. By identifying pistachio as a new natural host, we confirm that HpSVd is a ubiquitous and genetically variable viroid that infects many different fruit trees cultivated worldwide.
Abundant aftershock sequence of the 2015 Mw7.5 Hindu Kush intermediate-depth earthquake
NASA Astrophysics Data System (ADS)
Li, Chenyu; Peng, Zhigang; Yao, Dongdong; Guo, Hao; Zhan, Zhongwen; Zhang, Haijiang
2018-05-01
The 2015 Mw7.5 Hindu Kush earthquake occurred at a depth of 213 km beneath the Hindu Kush region of Afghanistan. While many early aftershocks were missing from the global earthquake catalogues, this sequence was recorded continuously by eight broad-band stations within 500 km. Here we use a waveform matching technique to systematically detect earthquakes around the main shock. More than 3000 events are detected within 35 d after the main shock, as compared with 42 listed in the Advanced National Seismic System catalogue (or 196 in the International Seismological Centre catalogue). The aftershock sequence generally follows the Omori's law with a decay constant p = 0.92. We also apply the recently developed double-pair double-difference technique to relocate all detected aftershocks. Most of them are located to the west of the hypocentre of the main shock, consistent with the westward propagation of the main-shock rupture. The aftershocks outline a nearly vertical southward dipping plane, which matches well with one of the nodal planes of the main shock. We conclude that the aftershock sequence of this intermediate-depth earthquake shares many similarities with those for shallow earthquakes and infer that there are some common mechanisms responsible for shallow and intermediate-depth earthquakes.
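The decay law mentioned in this abstract can be made concrete with the modified Omori (Omori-Utsu) relation n(t) = K / (t + c)^p, where n(t) is the aftershock rate at time t after the main shock. The Python sketch below evaluates the rate and its integral, the expected cumulative number of aftershocks. Only p = 0.92 comes from the abstract; the values of K and c are hypothetical placeholders, and the abstract does not state which exact form of the law was fitted.

```python
import math


def omori_rate(t_days: float, k: float, c: float, p: float) -> float:
    """Modified Omori (Omori-Utsu) aftershock rate n(t) = K / (t + c)^p."""
    return k / (t_days + c) ** p


def omori_cumulative(t_days: float, k: float, c: float, p: float) -> float:
    """Expected number of aftershocks in [0, t], the integral of n(t)."""
    if math.isclose(p, 1.0):
        return k * (math.log(t_days + c) - math.log(c))
    return k * ((t_days + c) ** (1.0 - p) - c ** (1.0 - p)) / (1.0 - p)


P = 0.92            # decay exponent reported in the abstract
K, C = 100.0, 0.1   # hypothetical productivity and time offset (days)

for day in (1, 10, 35):
    rate = omori_rate(day, K, C, P)
    total = omori_cumulative(day, K, C, P)
    print(f"day {day}: rate {rate:.2f} per day, cumulative {total:.1f}")
```

A value of p slightly below 1, as reported here, means the rate decays a little more slowly than the classic 1/t form.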
Flanking sequence determination and event-specific detection of genetically modified wheat B73-6-1.
Xu, Junyi; Cao, Jijuan; Cao, Dongmei; Zhao, Tongtong; Huang, Xin; Zhang, Piqiao; Luan, Fengxia
2013-05-01
In order to establish a specific identification method for genetically modified (GM) wheat, exogenous insert DNA and flanking sequence between exogenous fragment and recombinant chromosome of GM wheat B73-6-1 were successfully acquired by means of conventional polymerase chain reaction (PCR) and thermal asymmetric interlaced (TAIL)-PCR strategies. Newly acquired exogenous fragment covered the full-length sequence of transformed genes such as transformed plasmid and corresponding functional genes including marker uidA, herbicide-resistant bar, ubiquitin promoter, and high-molecular-weight gluten subunit. The flanking sequence between insert DNA revealed high similarity with Triticum turgidum A gene (GenBank: AY494981.1). A specific PCR detection method for GM wheat B73-6-1 was established on the basis of primers designed according to the flanking sequence. This specific PCR method was validated by GM wheat, GM corn, GM soybean, GM rice, and non-GM wheat. The specifically amplified target band was observed only in GM wheat B73-6-1. This method is of high specificity, high reproducibility, rapid identification, and excellent accuracy for the identification of GM wheat B73-6-1.
Torres-Cruz, Terry J.; Billingsley Tobias, Terri L.; Almatruk, Maryam; ...
2017-08-08
Illumina amplicon sequencing of soil in a temperate pine forest in the southeastern United States detected an abundant, nitrogen (N)-responsive fungal genotype of unknown phylogenetic affiliation. Two isolates with ribosomal sequences consistent with that genotype were subsequently obtained. Examination of records in GenBank revealed that a genetically similar fungus had been isolated previously as an endophyte of moss in a pine forest in the southwestern United States. The three isolates were characterized using morphological, genomic, and multilocus molecular data (18S, internal transcribed spacer [ITS], and 28S rRNA sequences). Phylogenetic and maximum likelihood phylogenomic reconstructions revealed that the taxon represents a novel lineage in Mucoromycotina, only preceded by Calcarisporiella, the earliest diverging lineage in the subphylum. Sequences for the novel taxon are frequently detected in environmental sequencing studies, and it is currently part of UNITE’s dynamic list of most wanted fungi. The fungus is dimorphic, grows best at room temperature, and is associated with a wide variety of bacteria. In this paper, a new monotypic genus, Bifiguratus, is proposed, typified by Bifiguratus adelaidae.
Iterative dictionary construction for compression of large DNA data sets.
Kuruppu, Shanika; Beresford-Smith, Bryan; Conway, Thomas; Zobel, Justin
2012-01-01
Genomic repositories increasingly include individual as well as reference sequences, which tend to share long identical and near-identical strings of nucleotides. However, the sequential processing used by most compression algorithms, and the volumes of data involved, mean that these long-range repetitions are not detected. An order-insensitive, disk-based dictionary construction method can detect this repeated content and use it to compress collections of sequences. We explore a dictionary construction method that improves repeat identification in large DNA data sets. Our adaptation, COMRAD, of an existing disk-based method identifies exact repeated content in collections of sequences with similarities within and across the set of input sequences. COMRAD compresses the data over multiple passes, which is an expensive process, but allows COMRAD to compress large data sets within reasonable time and space. COMRAD allows for random access to individual sequences and subsequences without decompressing the whole data set. COMRAD has no competitor in terms of the size of data sets that it can compress (extending to many hundreds of gigabytes) and, even for smaller data sets, the results are competitive compared to alternatives; as an example, 39 S. cerevisiae genomes compressed to 0.25 bits per base.
[Two novel pathogenic mutations of GAN gene identified in a patient with giant axonal neuropathy].
Wang, Juan; Ma, Qingwen; Cai, Qin; Liu, Yanna; Wang, Wei; Ren, Zhaorui
2016-06-01
To explore the disease-causing mutations in a patient suspected of having giant axonal neuropathy (GAN). Target sequence capture sequencing was used to screen potential mutations in genomic DNA extracted from a peripheral blood sample of the patient. Sanger sequencing was applied to confirm the detected mutations. The mutations were checked among 400 GAN alleles from 200 healthy individuals by Sanger sequencing. The function of the mutations was predicted by bioinformatics analysis. The patient was identified as a compound heterozygote carrying two novel pathogenic GAN mutations, i.e., c.778G>T (p.Glu260Ter) and c.277G>A (p.Gly93Arg). Sanger sequencing confirmed that the c.778G>T (p.Glu260Ter) mutation was inherited from his father, while c.277G>A (p.Gly93Arg) was inherited from his mother. The same mutations were not found in the 200 healthy individuals. Bioinformatics analysis predicted that the two mutations probably caused functional abnormality of gigaxonin. Two novel GAN mutations were detected in a patient with GAN. Both mutations are pathogenic and can cause abnormalities of gigaxonin structure and function, leading to pathogenesis of GAN. The results may also offer valuable information for similar diseases.
Single-primer fluorescent sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ruth, J.L.; Morgan, C.A.; Middendorf, L.R.
Modified linker arm oligonucleotides complementary to standard M13 priming sites were synthesized, labelled with either one, two, or three fluoresceins, and purified by reverse-phase HPLC. When used as primers in standard dideoxy M13 sequencing with ³²P-dNTPs, normal autoradiographic patterns were obtained. To eliminate the radioactivity, direct on-line fluorescence detection was achieved by the use of a scanning 10 mW Argon laser emitting 488 nm light. Fluorescent bands were detected directly in standard 0.2 or 0.35 mm thick polyacrylamide gels at a distance of 24 cm from the loading wells by a photomultiplier tube filtered at 520 nm. Horizontal and temporal location of each band was displayed by computer as a band in real time, providing visual appearance similar to normal 4-lane autoradiograms. Using a single primer labelled with two fluoresceins, sequences of between 500 and 600 bases have been read in a single loading with better than 98% accuracy; up to 400 bases can be read reproducibly with no errors. More than 50 sequences have been determined by this method. This approach requires only 1-2 µg of cloned template, and produces continuous sequence data at about one band per minute.
Okuda, A; Imagawa, M; Maeda, Y; Sakai, M; Muramatsu, M
1989-10-05
We have recently identified a typical enhancer, termed GPEI, located about 2.5 kilobases upstream from the transcription initiation site of the rat glutathione transferase P gene. Analyses of 5' and 3' deletion mutants revealed that the cis-acting sequence of GPEI contains a phorbol 12-O-tetradecanoate 13-acetate responsive element (TRE)-like sequence. For maximal activity, however, GPEI required an adjacent upstream sequence of about 19 base pairs in addition to the TRE-like sequence. With the DNA binding gel-shift assay, we detected protein(s) that specifically bind to the TRE-like sequence of the GPEI fragment, possibly the c-jun.c-fos complex or a similar protein complex. The sequence immediately upstream of the TRE-like sequence did not have any activity by itself, but augmented the activity of the TRE-like sequence about 5-fold.
The Neandertal genome and ancient DNA authenticity
Green, Richard E; Briggs, Adrian W; Krause, Johannes; Prüfer, Kay; Burbano, Hernán A; Siebauer, Michael; Lachmann, Michael; Pääbo, Svante
2009-01-01
Recent advances in high-throughput DNA sequencing have made genome-scale analyses of genomes of extinct organisms possible. With these new opportunities come new difficulties in assessing the authenticity of the DNA sequences retrieved. We discuss how these difficulties can be addressed, particularly with regard to analyses of the Neandertal genome. We argue that only direct assays of DNA sequence positions in which Neandertals differ from all contemporary humans can serve as a reliable means to estimate human contamination. Indirect measures, such as the extent of DNA fragmentation, nucleotide misincorporations, or comparison of derived allele frequencies in different fragment size classes, are unreliable. Fortunately, interim approaches based on mtDNA differences between Neandertals and current humans, detection of male contamination through Y chromosomal sequences, and repeated sequencing from the same fossil to detect autosomal contamination allow initial large-scale sequencing of Neandertal genomes. This will result in the discovery of fixed differences in the nuclear genome between Neandertals and current humans that can serve as future direct assays for contamination. For analyses of other fossil hominins, which may become possible in the future, we suggest a similar ‘boot-strap’ approach in which interim approaches are applied until sufficient data for more definitive direct assays are acquired. PMID:19661919
Use of four next-generation sequencing platforms to determine HIV-1 coreceptor tropism.
Archer, John; Weber, Jan; Henry, Kenneth; Winner, Dane; Gibson, Richard; Lee, Lawrence; Paxinos, Ellen; Arts, Eric J; Robertson, David L; Mimms, Larry; Quiñones-Mateu, Miguel E
2012-01-01
HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels, suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.
Citrus and Prunus copia-like retrotransposons.
Asíns, M J; Monforte, A J; Mestre, P F; Carbonell, E A
1999-08-01
Many of the world's most important citrus cultivars ("Washington Navel", satsumas, clementines) have arisen through somatic mutation. This phenomenon occurs fairly often in the various species and varieties of the genus. The presence of copia-like retrotransposons has been investigated in fruit trees, especially citrus, by using a PCR assay designed to detect copia-like reverse transcriptase (RT) sequences. Amplification products from a genotype of each of the following species, Citrus sinensis, Citrus grandis, Citrus clementina, Prunus armeniaca and Prunus amygdalus, were cloned and some of them sequenced. Southern-blot hybridization using RT clones as probes showed that multiple copies are integrated throughout the citrus genome, while only 1-3 copies are detected in the P. armeniaca genome, which is in accordance with the Citrus and Prunus genome sizes. Sequence analysis of RT clones allowed a search for homologous sequences within three gene banks. The most similar ones correspond to RT domains of copia-like retrotransposons from unrelated plant species. Cluster analysis of these sequences has shown a great heterogeneity among RT domains cloned from the same genotype. This finding supports the hypothesis that horizontal transmission of retrotransposons has occurred in the past. The species presenting an RT sequence most similar to citrus RT clones is Gnetum montanum, a gymnosperm whose distribution area coincides with two of the main centers of origin of Citrus spp. A new C-methylated restriction DNA fragment containing an RT sequence is present in navel sweet oranges, but not in the Valencia oranges from which the former originated, suggesting that retrotransposon activity might be, at least in part, involved in the genetic variability among sweet orange cultivars.
Given that retrotransposons are quite abundant throughout the citrus genome, their activity should be investigated thoroughly before commercializing any transgenic citrus plant where the transgene(s) is part of a viral genome in order to avoid its possible recombination with an active retroelement. Focusing on other strategies to control virus diseases is recommended in citrus.
Sequence analysis of Leukemia DNA
NASA Astrophysics Data System (ADS)
Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa
2018-03-01
Cancer is a deadly disease, and leukemia, better known as blood cancer, is one form of it. Cancer cells can be detected by analysing DNA in a laboratory test. This study focused on local alignment of leukemia and non-leukemia DNA sequences obtained from NCBI, using the Smith-Waterman algorithm, which was introduced by T. F. Smith and M. S. Waterman in 1981. The algorithm finds the most similar region of a pair of sequences by assigning a negative score to each unequal base pair (mismatch) and a positive score to each equal base pair (match). The maximum positive score marks the end of the alignment, and the minimum value marks its start. This study will use sequences of leukemia and 3 sequences of non-leukemia.
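The scoring and traceback described in this abstract can be sketched as follows. This is a minimal illustration of Smith-Waterman local alignment, not the authors' implementation: the function name smith_waterman, the score values (match = 2, mismatch = -1), and the linear gap penalty (gap = -2) are assumptions, since the abstract states only that matches score positively and mismatches negatively.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    The score values and the linear gap penalty are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Scores are floored at zero, which is what makes the alignment local.
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum score (alignment end) to a zero cell (alignment start).
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("GGGACGTCCC", "TTACGTAA")
```

Here the two toy sequences share the substring ACGT, so the returned alignment is that shared region. For real leukemia and non-leukemia sequences the same function applies, although the quadratic memory use of the full matrix becomes a practical limit on long inputs.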
Virome comparisons in wild-diseased and healthy captive giant pandas.
Zhang, Wen; Yang, Shixing; Shan, Tongling; Hou, Rong; Liu, Zhijian; Li, Wang; Guo, Lianghua; Wang, Yan; Chen, Peng; Wang, Xiaochun; Feng, Feifei; Wang, Hua; Chen, Chao; Shen, Quan; Zhou, Chenglin; Hua, Xiuguo; Cui, Li; Deng, Xutao; Zhang, Zhihe; Qi, Dunwu; Delwart, Eric
2017-08-07
The giant panda (Ailuropoda melanoleuca) is a vulnerable mammal herbivore living wild in central China. Viral infections have become a potential threat to the health of these endangered animals, but limited information related to these infections is available. Using a viral metagenomic approach, we surveyed viruses in the feces, nasopharyngeal secretions, blood, and different tissues from a wild giant panda that died from an unknown disease, a healthy wild giant panda, and 46 healthy captive animals. The previously uncharacterized complete or near complete genomes of four viruses from three genera in Papillomaviridae family, six viruses in a proposed new Picornaviridae genus (Aimelvirus), two unclassified viruses related to posaviruses in Picornavirales order, 19 anelloviruses in four different clades of Anelloviridae family, four putative circoviruses, and 15 viruses belonging to the recently described Genomoviridae family were sequenced. Reflecting the diet of giant pandas, numerous insect virus sequences related to the families Iflaviridae, Dicistroviridae, Iridoviridae, Baculoviridae, Polydnaviridae, and subfamily Densovirinae and plant virus sequences related to the families Tombusviridae, Partitiviridae, Secoviridae, Geminiviridae, Luteoviridae, Virgaviridae, and Rhabdoviridae; genus Umbravirus, Alphaflexiviridae, and Phycodnaviridae were also detected in fecal samples. A small number of insect virus sequences were also detected in the nasopharyngeal secretions of healthy giant pandas and lung tissues from the dead wild giant panda. Although the viral families present in the sick giant panda were also detected in the healthy ones, a higher proportion of papillomavirus, picornavirus, and anellovirus reads were detected in the diseased panda. This viral survey increases our understanding of eukaryotic viruses in giant pandas and provides a baseline for comparison to viruses detected in future infectious disease outbreaks.
The similar viral families detected in sick and healthy giant pandas indicate that these viruses result in commensal infections in most immuno-competent animals.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zemla, A; Lang, D; Kostova, T
2010-11-29
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory - still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein.
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
Improved localisation for 2-hydroxyglutarate detection at 3T using long-TE semi-LASER.
Berrington, Adam; Voets, Natalie L; Plaha, Puneet; Larkin, Sarah J; Mccullagh, James; Stacey, Richard; Yildirim, Muhammed; Schofield, Christopher J; Jezzard, Peter; Cadoux-Hudson, Tom; Ansorge, Olaf; Emir, Uzay E
2016-06-01
2-hydroxyglutarate (2-HG) has emerged as a biomarker of tumour cell IDH mutations that may enable the differential diagnosis of glioma patients. At 3 Tesla, detection of 2-HG with magnetic resonance spectroscopy is challenging because of metabolite signal overlap and a spectral pattern modulated by slice selection and chemical shift displacement. Using density matrix simulations and phantom experiments, an optimised semi-LASER scheme (TE = 110 ms) improves localisation of the 2-HG spin system considerably compared to an existing PRESS sequence. This results in a visible 2-HG peak in the in vivo spectra at 1.9 ppm in the majority of IDH mutated tumours. Detected concentrations of 2-HG were similar using both sequences, although the use of semi-LASER generated narrower confidence intervals. Signal overlap with glutamate and glutamine, as measured by pairwise fitting correlation, was reduced. Lactate was readily detectable across glioma patients using the method presented here (mean CRLB: (10±2)%). Together with more robust 2-HG detection, long TE semi-LASER offers the potential to investigate tumour metabolism and stratify patients in vivo at 3T.
MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes
Verneau, Jonathan; Levasseur, Anthony; Raoult, Didier; La Scola, Bernard; Colson, Philippe
2016-01-01
The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such big data is difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant amount of non-annotated sequences, representing a ‘dark matter.’ We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set and applied this tool to the detection of sequences matching large and giant DNA viral members of the proposed order Megavirales or virophages. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and 5 virophages, were collected. The pipeline was built from scripts written in Python and named MG-Digger. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate hundreds of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome.
The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about the presence and prevalence of giant viruses in the environment and the human body. PMID:27065984
Manswr, Basim; Ball, Christopher; Forrester, Anne; Chantrey, Julian; Ganapathy, Kannan
2018-08-01
Sequence variability in the S1 gene determines the genotype of infectious bronchitis virus (IBV) strains. A single RT-PCR assay was developed to amplify and sequence the full S1 gene for six classical and variant IBVs (M41, D274, 793B, IS/885/00, IS/1494/06 and Q1) enriched in allantoic fluid (AF) or the same AF inoculated onto Flinders Technology Association (FTA) cards. Representative strains from each genotype were grown in specific-pathogen-free eggs and RNA was extracted from AF. Full S1 gene amplification was achieved using primer A and primer 22.51. Products were sequenced using primers A, 1050+, 1380+ and SX3+ to obtain short sequences covering the full gene. Following serial dilutions of AF, detection limits of the partial assay were higher than those of the full S1 gene. Partial S1 sequences exhibited higher-than-average nucleotide similarity percentages (79%; 352 bp) compared to full S1 sequences (77%; 1756 bp), suggesting that full S1 analysis allows greater strain differentiation. For IBV detection from AF-inoculated FTA cards, four serotypes were incubated for up to 21 days at three temperatures, 4°C, room temperature (approximately 24°C) and 40°C. RNA was extracted and tested with partial and full S1 protocols. Through partial sequencing, all IBVs were successfully detected at all sampling points and storage temperatures. In contrast, using full S1 sequencing it was not possible to amplify the gene beyond 14 days or when stored at 40°C. Data presented show that for full S1 sequencing, a substantial amount of RNA is needed. Field samples collected onto FTA cards are unlikely to yield such quantity or quality. AF: allantoic fluid; CD50: ciliostatic dose 50; FTA: Flinders Technology Association; IB: infectious bronchitis; IBV: infectious bronchitis virus.
First detection of canine parvovirus type 2b from diarrheic dogs in Himachal Pradesh.
Sharma, Shalini; Dhar, Prasenjit; Thakur, Aneesh; Sharma, Vivek; Sharma, Mandeep
2016-09-01
The present study was conducted to detect the presence of canine parvovirus (CPV) among diarrheic dogs in Himachal Pradesh and to identify the most prevalent antigenic variant of CPV based on molecular typing and sequence analysis of the VP2 gene. A total of 102 fecal samples were collected from clinical cases of diarrhea or hemorrhagic gastroenteritis from CPV vaccinated or non-vaccinated dogs. Samples were tested using CPV-specific polymerase chain reaction (PCR) targeting the VP2 gene, multiplex PCR for detection of CPV-2a and CPV-2b antigenic variants, and a PCR for the detection of CPV-2c. A CPV-2b isolate was cultured on Madin-Darby canine kidney (MDCK) cell lines and sequenced using the VP2 structural protein gene. Multiple alignment and phylogenetic analysis were done using ClustalW and MEGA6, and the phylogeny was inferred using the Neighbor-Joining method. No sample was found positive for the original CPV strain usually present in the vaccine. However, about 50% (52 out of 102) of the samples were found to be positive with the CPV-2ab PCR assay that detects newer variants of CPV circulating in the field. In addition, the multiplex PCR assay that identifies both CPV-2ab and CPV-2b revealed that CPV-2b was the major antigenic variant present in the affected dogs. A PCR positive isolate of CPV-2b was adapted to grow in MDCK cells and produced a characteristic cytopathic effect after the 5th passage. In multiple sequence alignment, the VP2 structural gene of the CPV-2b isolate used in the study (Accession number HG004610) was found to be similar to other sequenced isolates in the NCBI sequence database and showed 98-99% homology. This study reports the first detection of CPV-2b in dogs with hemorrhagic gastroenteritis in Himachal Pradesh and absence of other antigenic types of CPV. Further, the CPV-specific PCR assay can be used for rapid confirmation of circulating virus strains under field conditions.
2013-01-01
Background: In April 2009, public health surveillance detected an increased number of influenza-like illnesses in Mexico City’s hospitals. The etiological agent was subsequently determined to be a novel influenza A (H1N1) triple reassortant that spread worldwide. The purpose of the present study was to demonstrate that molecular detection of pandemic influenza A (H1N1) 2009 strains is possible in archival material such as paraffin-embedded lung samples. Methods: In order to detect A (H1N1) virus sequences in archived biological samples, eight paraffin-embedded lung samples from patients who died of pneumonia and respiratory failure were tested for influenza A (H1N1) Neuraminidase (NA) RNA using in situ RT-PCR. Results: We detected NA transcripts in 100% of the previously diagnosed A (H1N1)-positive samples as a cytoplasmic signal. No expression was detected by in situ RT-PCR in two Influenza-like Illness A (H1N1)-negative patients using standard protocols nor in a non-related cervical cell line. In situ relative transcription levels correlated with those obtained when in vitro RT-PCR assays were performed. Partial sequences of the NA gene from A (H1N1)-positive patients were obtained by the in situ RT-PCR-sequencing method. Sequence analysis showed 98% similarity with influenza viruses reported previously in other places. Conclusions: We have successfully amplified specific influenza A (H1N1) NA sequences using stored clinical material; results suggest that this strategy could be useful when clinical RNA samples are limited in quantity or of poor quality. Here, we provide a very sensitive method that specifically detects the neuraminidase viral RNA in lung samples from patients who died from pneumonia caused by the Influenza A (H1N1) outbreak in Mexico City. PMID:23327529
Innate Immune Complexity in the Purple Sea Urchin: Diversity of the Sp185/333 System
Smith, L. Courtney
2012-01-01
The California purple sea urchin, Strongylocentrotus purpuratus, is a long-lived echinoderm with a complex and sophisticated innate immune system. There are several large gene families that function in immunity in this species including the Sp185/333 gene family that has ∼50 (±10) members. The family shows intriguing sequence diversity and encodes a broad array of diverse yet similar proteins. The genes have two exons of which the second encodes the mature protein and has repeats and blocks of sequence called elements. Mosaics of element patterns plus single nucleotide polymorphism-based variants of the elements result in significant sequence diversity among the genes yet maintain similar structure among the members of the family. Sequence of a bacterial artificial chromosome insert shows a cluster of six, tightly linked Sp185/333 genes that are flanked by GA microsatellites. The sequences between the GA microsatellites, in which the Sp185/333 genes and flanking regions are located, are much more similar to each other than are the sequences outside the microsatellites, suggesting processes such as gene conversion, recombination, or duplication. However, close linkage does not correspond with greater sequence similarity compared to randomly cloned and sequenced genes that are unlikely to be linked. There are three segmental duplications that are bounded by GAT microsatellites and include three almost identical genes plus flanking regions. RNA editing is detectable throughout the mRNAs based on comparisons to the genes, which, in combination with putative post-translational modifications to the proteins, results in broad arrays of Sp185/333 proteins that differ among individuals. The mature proteins have an N-terminal glycine-rich region, a central RGD motif, and a C-terminal histidine-rich region. The Sp185/333 proteins are localized to the cell surface and are found within vesicles in subsets of polygonal and small phagocytes.
The coelomocyte proteome shows full-length and truncated proteins, including some with missense sequence. Current results suggest that both native Sp185/333 proteins and a recombinant protein bind bacteria and are likely important in sea urchin innate immunity. PMID:22566951
Detection and characterization of hepatitis A virus circulating in Egypt.
Hamza, Hazem; Abd-Elshafy, Dina Nadeem; Fayed, Sayed A; Bahgat, Mahmoud Mohamed; El-Esnawy, Nagwa Abass; Abdel-Mobdy, Emam
2017-07-01
Hepatitis A virus (HAV) still poses a considerable problem worldwide. In the current study, hepatitis A virus was recovered from wastewater samples collected from three wastewater treatment plants over one year. Using RT-PCR, HAV was detected in 43 out of 68 samples (63.2%) representing both inlet and outlet. Eleven positive samples were subjected to sequencing targeting the VP1-2A junction region. Phylogenetic analysis revealed that all samples belonged to subgenotype IB with few substitutions at the amino acid level. The complete sequence of one isolate (HAV/Egy/BI-11/2015) showed that the similarity at the amino acid level was not reflected at the nucleotide level. However, the deduced amino acid sequence derived from the complete nucleotide sequence showed distinct substitutions in the 2B, 2C, and 3A regions. Recombination analysis revealed a recombination event between X75215 (subgenotype IA) and AF268396 (subgenotype IB) involving a portion of the 2B nonstructural protein coding region (nucleotides 3757-3868), assuming that the sequence characterized here is an actual recombinant. Despite the role of recombination in picornavirus evolution, its involvement in HAV evolution has rarely been reported, possibly because few complete HAV sequences are available. To our knowledge, this represents the first characterized complete sequence of an Egyptian isolate and the described recombination event provides an important update on the circulating HAV strains in Egypt.
Integrative network alignment reveals large regions of global network similarity in yeast and human.
Kuchaiev, Oleksii; Przulj, Natasa
2011-05-15
High-throughput methods for detecting molecular interactions have produced large sets of biological network data with much more yet to come. Analogous to sequence alignment, efficient and reliable network alignment methods are expected to improve our understanding of biological systems. Unlike sequence alignment, network alignment is computationally intractable. Hence, devising efficient network alignment heuristics is currently a foremost challenge in computational biology. We introduce a novel network alignment algorithm, called Matching-based Integrative GRAph ALigner (MI-GRAAL), which can integrate any number and type of similarity measures between network nodes (e.g. proteins), including, but not limited to, any topological network similarity measure, sequence similarity, functional similarity and structural similarity. Hence, we resolve the ties in similarity measures and find a combination of similarity measures yielding the largest contiguous (i.e. connected) and biologically sound alignments. MI-GRAAL exposes the largest functional, connected regions of protein-protein interaction (PPI) network similarity to date: surprisingly, it reveals that 77.7% of proteins in the baker's yeast high-confidence PPI network participate in such a subnetwork that is fully contained in the human high-confidence PPI network. This is the first demonstration that species as diverse as yeast and human contain such large, continuous regions of global network similarity. We apply MI-GRAAL's alignments to predict functions of un-annotated proteins in yeast, human and bacteria, validating our predictions in the literature. Furthermore, using network alignment scores for PPI networks of different herpes viruses, we reconstruct their phylogenetic relationship. This is the first time that phylogeny is exactly reconstructed from purely topological alignments of PPI networks. Supplementary files and MI-GRAAL executables: http://bio-nets.doc.ic.ac.uk/MI-GRAAL/.
2012-01-01
Background: The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results: We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions: The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742
Diversity in VP3, NSP3, and NSP4 of rotavirus B detected from Japanese cattle.
Hayashi-Miyamoto, Michiko; Murakami, Toshiaki; Minami-Fukuda, Fujiko; Tsuchiaka, Shinobu; Kishimoto, Mai; Sano, Kaori; Naoi, Yuki; Asano, Keigo; Ichimaru, Toru; Haga, Kei; Omatsu, Tsutomu; Katayama, Yukie; Oba, Mami; Aoki, Hiroshi; Shirai, Junsuke; Ishida, Motohiko; Katayama, Kazuhiko; Mizutani, Tetsuya; Nagai, Makoto
2017-04-01
Bovine rotavirus B (RVB) is an etiological agent of diarrhea mostly in adult cattle. Currently, a few sequences of viral protein (VP)1, 2, 4, 6, and 7 and nonstructural protein (NSP)1, 2, and 5 of bovine RVB are available in the DDBJ/EMBL/GenBank databases, and none have been reported for VP3, NSP3, and NSP4. In order to fill this gap in the genetic characterization of bovine RVB strains, we used a metagenomics approach and sequenced and analyzed the complete coding sequences (CDS) of VP3, NSP3, and NSP4 genes, as well as the partial or complete CDS of other genes of RVBs detected from Japanese cattle. VP3, NSP3, and NSP4 of bovine RVBs shared low nucleotide sequence identities (63.3-64.9% for VP3, 65.9-68.2% for NSP3, and 52.6-56.2% for NSP4) with those of murine, human, and porcine RVBs, suggesting that bovine RVBs belong to a novel genotype. Furthermore, significantly low amino acid sequence identities were observed for NSP4 (36.1-39.3%) between bovine RVBs and the RVBs of other species. In contrast, hydrophobic plot analysis of NSP4 revealed profiles similar to those of RVBs of other species and rotavirus A (RVA) strains. Phylogenetic analyses of all gene segments revealed that bovine RVB strains formed a cluster that branched distantly from other RVBs. These results suggest that bovine RVBs have evolved independently from other RVBs but in a similar manner to other rotaviruses. These findings provide insights into the evolution and diversity of RVB strains. Copyright © 2017 Elsevier B.V. All rights reserved.
Bjørnsgaard Aas, Anders; Davey, Marie Louise; Kauserud, Håvard
2017-07-01
The formation of chimeric sequences can create significant methodological bias in PCR-based DNA metabarcoding analyses. During mixed-template amplification of barcoding regions, chimera formation is frequent and well documented. However, profiling of fungal communities typically uses the more variable rDNA region ITS. Due to a larger research community, tools for chimera detection have been developed mainly for the 16S/18S markers. However, these tools are widely applied to the ITS region without verification of their performance. We examined the rate of chimera formation during amplification and 454 sequencing of the ITS2 region from fungal mock communities of different complexities. We evaluated the chimera detecting ability of two common chimera-checking algorithms: perseus and uchime. Large proportions of the chimeras reported were false positives. No false negatives were found in the data set. Verified chimeras accounted for only 0.2% of the total ITS2 reads, which is considerably less than what is typically reported in 16S and 18S metabarcoding analyses. Verified chimeric 'parent sequences' had significantly higher per cent identity to one another than to random members of the mock communities. Community complexity increased the rate of chimera formation. GC content was higher around the verified chimeric break points, potentially facilitating chimera formation through base pair mismatching in the neighbouring regions of high similarity in the chimeric region. We conclude that the hypervariable nature of the ITS region seems to buffer the rate of chimera formation in comparison with other, less variable barcoding regions, due to shorter regions of high sequence similarity. © 2016 John Wiley & Sons Ltd.
Ramlal, Shylaja; Mondal, Bhairab; Lavu, Padma Sudharani; N, Bhavanashri; Kingston, Joseph
2018-01-16
In the present study, a high-throughput whole-cell SELEX method was applied successfully to select specific aptamers against whole cells of Staphylococcus aureus, a potent food-poisoning bacterium. A total of ten rounds of SELEX and three rounds of intermittent counter-SELEX were performed to obtain specific aptamers. The resulting oligonucleotide pool was cloned, sequenced and then grouped into different families based on primary sequence homology and secondary structure similarity. FITC-labeled sequences from different families were selected for further characterization via flow cytometry analysis. The dissociation constant (Kd) values of the specific, higher-affinity binders ranged from 34 to 128 nM. Binding assays to assess the selectivity of aptamers RAB10, RAB20, RAB28 and RAB35 demonstrated high affinity for S. aureus and low binding affinity for other bacteria. To demonstrate the potential use of the aptamers, a sensitive dual-labeled sandwich detection system was developed using aptamers RAB10 and RAB35, with a detection limit of 10^2 CFU/mL. Furthermore, detection from a mixed cell population and spiked samples emphasized the robustness as well as the applicability of the developed method. Altogether, the established assay could be a reliable detection tool for the routine investigation of Staphylococcus aureus in samples from food and clinical sources. Copyright © 2017. Published by Elsevier B.V.
Wang, Fuan; Freage, Lina; Orbach, Ron; Willner, Itamar
2013-09-03
The progressive development of amplified DNA sensors and aptasensors using replication/nicking enzymes/DNAzyme machineries is described. The sensing platforms are based on the tailoring of a DNA template on which the recognition of the target DNA or the formation of the aptamer-substrate complex triggers the autonomous isothermal replication/nicking processes and the displacement of a Mg(2+)-dependent DNAzyme that catalyzes the generation of a fluorophore-labeled nucleic acid acting as the readout signal for the analyses. Three different DNA sensing configurations are described, where in the ultimate configuration the target sequence is incorporated into a nucleic acid blocker structure associated with the sensing template. The target-triggered isothermal autonomous replication/nicking process on the modified template results in the formation of the Mg(2+)-dependent DNAzyme tethered to a free strand consisting of the target sequence. This activates additional template units for the nucleic acid self-replication process, resulting in the ultrasensitive detection of the target DNA (detection limit 1 aM). Similarly, amplified aptamer-based sensing platforms for cocaine are developed along these concepts. The modification of the cocaine-detection template by the addition of a nucleic acid sequence that enables the autonomous secondary coupled activation of a polymerization/nicking machinery and DNAzyme generation path leads to an improved analysis of cocaine (detection limit 10 nM).
Detection and molecular identification of leishmania RNA virus (LRV) in Iranian Leishmania species.
Hajjaran, Homa; Mahdi, Maryam; Mohebali, Mehdi; Samimi-Rad, Katayoun; Ataei-Pirkooh, Angila; Kazemi-Rad, Elham; Naddaf, Saied Reza; Raoofian, Reza
2016-12-01
Leishmania RNA virus (LRV) was first detected in members of the subgenus Leishmania (Viannia), and later, the virulence and metastasis of the New World species were attributed to this virus. The data on the presence of LRV in Old World species are confined to Leishmania major and a few Leishmania aethiopica isolates. The aim of this study was to survey the presence of LRV in various Iranian Leishmania species originating from patients and animal reservoir hosts. Genomic nucleic acids were extracted from 50 cultured isolates belonging to the species Leishmania major, Leishmania tropica, and Leishmania infantum. A partial sequence of the viral RNA-dependent RNA polymerase (RdRp) gene was amplified, sequenced and compared with appropriate sequences from the GenBank database. We detected the virus in two parasite specimens: an isolate of L. infantum derived from a visceral leishmaniasis (VL) patient who was unresponsive to meglumine antimoniate treatment, and an L. major isolate originating from a great gerbil, Rhombomys opimus. The Iranian LRV sequences showed the highest similarities to an Old World L. major LRV2 and were genetically distant from LRV1 isolates detected in New World Leishmania parasites. We could not attribute treatment failure in VL patient to the presence of LRV due to the limited number of specimens analyzed. Further studies with inclusion of more clinical samples are required to elucidate the potential role of LRVs in pathogenesis or treatment failure of Old World leishmaniasis.
Detection of a novel circovirus in mute swans (Cygnus olor) by using nested broad-spectrum PCR.
Halami, M Y; Nieper, H; Müller, H; Johne, R
2008-03-01
Circoviruses are the causative agents of acute and chronic diseases in several animal species. Clinical symptoms of circovirus infections range from depression and diarrhoea to immunosuppression and feather disorders in birds. Eleven different members of the genus Circovirus are known so far, which infect pigs and birds in a species-specific manner. Here, a nested PCR was developed for the detection of a broad range of different circoviruses in clinical samples. Using this assay, a novel circovirus was detected in mute swans (Cygnus olor) found dead in Germany in 2006. Sequence analysis of the swan circovirus (SwCV) genome, amplified by multiply primed rolling-circle amplification and PCR, indicates that SwCV is a distinct virus most closely related to the goose circovirus (73.2% genome sequence similarity). Sequence variations between SwCV genomes derived from two different individuals were high (15.5% divergence) and mainly confined to the capsid protein-encoding region. By PCR testing of 32 samples derived from swans found dead in two different regions of Germany, detection rates of 20.0 and 77.3% were determined, thus indicating a high incidence of SwCV infection. The clinical significance of SwCV infection, however, needs to be investigated further.
Identification of cancer-specific motifs in mimotope profiles of serum antibody repertoire.
Gerasimov, Ekaterina; Zelikovsky, Alex; Măndoiu, Ion; Ionov, Yurij
2017-06-07
For fighting cancer, earlier detection is crucial. Circulating auto-antibodies produced by the patient's own immune system after exposure to cancer proteins are promising biomarkers for the early detection of cancer. Since an antibody recognizes not the whole antigen but 4-7 critical amino acids within the antigenic determinant (epitope), the whole proteome can be represented by a random peptide phage display library. This opens the possibility of developing an early cancer detection test based on a set of peptide sequences identified by comparing cancer patients' and healthy donors' global peptide profiles of antibody specificities. Due to the enormous number of peptide sequences contained in global peptide profiles generated by next-generation sequencing, a large number of cancer and control sera is required to identify cancer-specific peptides with a high degree of statistical significance. To decrease the number of peptides in profiles generated by next-generation sequencing without losing cancer-specific sequences, we generated the profiles using a phage library enriched by panning on a pool of cancer sera. To further decrease the complexity of the profiles, we used computational methods to transform the list of peptides constituting the mimotope profiles into a list of motifs formed by similar peptide sequences. We have shown that the amino-acid order is meaningful in mimotope motifs, since they contain significantly more peptides than motifs among peptides in which the amino acids are randomly permuted. Also, the single-sample motifs differ significantly from motifs in peptides drawn from multiple samples. Finally, multiple cancer-specific motifs have been identified.
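The permutation control described in the abstract above can be illustrated with a minimal sketch. The abstract does not define how motifs are built, so this example assumes a deliberately crude stand-in: two peptides belong to a common "motif" if they share an exact k-mer. The function names, the choice of k = 4, and the toy peptides are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def kmers(peptide, k):
    """Return the set of distinct k-mers in one peptide."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def shared_kmer_peptides(peptides, k=4):
    """Count peptides sharing at least one exact k-mer with another peptide.

    Assumption: exact k-mer sharing is a simplified proxy for motif membership.
    """
    counts = Counter()
    for pep in peptides:
        counts.update(kmers(pep, k))  # each peptide counts a k-mer once
    return sum(any(counts[km] > 1 for km in kmers(pep, k)) for pep in peptides)


def permuted(peptides, rng):
    """Shuffle residues within each peptide, preserving its composition."""
    return ["".join(rng.sample(list(pep), len(pep))) for pep in peptides]


# Toy peptides (hypothetical): the first two share the 4-mer "ACDE".
peptides = ["ACDEFGH", "WWACDEY", "KLMNPQR"]
observed = shared_kmer_peptides(peptides, k=4)

# Null distribution from order-scrambled peptides with a fixed seed.
rng = random.Random(0)
null = [shared_kmer_peptides(permuted(peptides, rng), k=4) for _ in range(100)]
print(observed, sum(null) / len(null))
```

If amino-acid order carries signal, the observed count should sit above most of the counts obtained from the permuted peptides; on real data one would compare against many more permutations and a proper motif-finding step.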
Leroux, Robin A; Dutton, Peter H; Abreu-Grobois, F Alberto; Lagueux, Cynthia J; Campbell, Cathi L; Delcroix, Eric; Chevalier, Johan; Horrocks, Julia A; Hillis-Starr, Zandy; Troëng, Sebastian; Harrison, Emma; Stapleton, Seth
2012-01-01
Management of the critically endangered hawksbill turtle in the Wider Caribbean (WC) has been hampered by knowledge gaps regarding stock structure. We carried out a comprehensive stock structure re-assessment of 11 WC hawksbill rookeries using longer mtDNA sequences, larger sample sizes (N = 647), and additional rookeries compared to previous surveys. Additional variation detected by 740 bp sequences between populations allowed us to differentiate populations such as Barbados-Windward and Guadeloupe (F_ST = 0.683, P < 0.05) that appeared genetically indistinguishable based on shorter 380 bp sequences. POWSIM analysis showed that longer sequences improved power to detect population structure and that when N < 30, increasing the variation detected was as effective in increasing power as increasing sample size. Geographic patterns of genetic variation suggest a model of periodic long-distance colonization coupled with region-wide dispersal and subsequent secondary contact within the WC. Mismatch analysis results for individual clades suggest a general population expansion in the WC following a historic bottleneck about 100 000-300 000 years ago. We estimated an effective female population size (N_ef) of 6000-9000 for the WC, similar to the current estimated numbers of breeding females, highlighting the importance of these regional rookeries to maintaining genetic diversity in hawksbills. Our results provide a basis for standardizing future work to 740 bp sequence reads and establish a more complete baseline for determining stock boundaries in this migratory marine species. Finally, our findings illustrate the value of maintaining an archive of specimens for re-analysis as new markers become available.
Boutte, Julien; Aliaga, Benoît; Lima, Oscar; Ferreira de Carvalho, Julie; Ainouche, Abdelkader; Macas, Jiri; Rousseau-Gueutin, Mathieu; Coriton, Olivier; Ainouche, Malika; Salmon, Armel
2015-01-01
Gene and whole-genome duplications are widespread in plant nuclear genomes, resulting in sequence heterogeneity. Identification of duplicated genes may be particularly challenging in highly redundant genomes, especially when there are no diploid parents as a reference. Here, we developed a pipeline to detect the different copies in the ribosomal RNA gene family in the hexaploid grass Spartina maritima from next-generation sequencing (Roche-454) reads. The heterogeneity of the different domains of the highly repeated 45S unit was explored by identifying single nucleotide polymorphisms (SNPs) and assembling reads based on shared polymorphisms. SNPs were validated using comparisons with Illumina sequence data sets and by cloning and Sanger (re)sequencing. Using this approach, 29 validated polymorphisms and 11 validated haplotypes were reported (out of 34 and 20, respectively, that were initially predicted by our program). The rDNA domains of S. maritima have similar lengths as those found in other Poaceae, apart from the 5′-ETS, which is approximately two-times longer in S. maritima. Sequence homogeneity was encountered in coding regions and both internal transcribed spacers (ITS), whereas high intragenomic variability was detected in the intergenic spacer (IGS) and the external transcribed spacer (ETS). Molecular cytogenetic analysis by fluorescent in situ hybridization (FISH) revealed the presence of one pair of 45S rDNA signals on the chromosomes of S. maritima instead of three expected pairs for a hexaploid genome, indicating loss of duplicated homeologous loci through the diploidization process. The procedure developed here may be used at any ploidy level and using different sequencing technologies. PMID:26530424
[Hepatitis C virus: sequence homology of a European isolate and divergence from the prototype].
Seelig, R; Seelig, H P; Renz, M
1991-08-01
The polymerase chain reaction (PCR) detected specific hepatitis C viral (HCV) RNA sequences in liver biopsies from two patients with chronic hepatitis, in the tissue of a liver implant, in plasma from four chronic non-A, non-B hepatitis (NANBH) patients and, for the first time, in an infectious anti-D-immunoglobulin preparation. A comparison of the viral sequences coding for a region of the nonstructural NS3 protein from the liver tissues revealed only a very small degree of sequence divergence at the cDNA as well as at the amino acid level (between 0 and 5%). The sequence similarities of the RNA isolated from plasma of the four chronic NANBH patients and the anti-D-immunoglobulin preparation were in part somewhat lower but altogether also high (between 90 and 100%). In contrast, all eight cDNA and amino acid sequences exhibited a significantly higher degree of divergence in comparison with the HCV prototype sequence (between 29 and 32%) than among themselves (between 0 and 10%). This unexpectedly high sequence similarity of the eight European isolates and their low homology to the North American prototype sequence is indicative of the existence of different types of HCV. This will be important not only for epidemiological studies but also for the development of effective diagnostic procedures and vaccines. Concerning the pathogenesis of NANBH, a double infection or a helper mechanism has to be considered: in addition to the C virus, sequences of another virus particle were found in the infectious IgG preparation as well as in the liver biopsies.
Nishiyama, Minako; Yamamoto, Shuichi; Kurosawa, Norio
2013-08-01
Ibusuki hot spring is located on the coastline of Kagoshima Bay, Japan. The hot spring water is characterized by high salinity, high temperature, and neutral pH. The hot spring is covered by the sea during high tide, which leads to severe fluctuations in several environmental variables. A combination of molecular- and culture-based techniques was used to determine the bacterial and archaeal diversity of the hot spring. A total of 48 thermophilic bacterial strains were isolated from two sites (Site 1: 55.6°C; Site 2: 83.1°C) and they were categorized into six groups based on their 16S rRNA gene sequence similarity. Two groups (including 32 isolates) demonstrated low sequence similarity with published species, suggesting that they might represent novel taxa. The 148 clones from the Site 1 bacterial library included 76 operational taxonomy units (OTUs; 97% threshold), while 132 clones from the Site 2 bacterial library included 31 OTUs. Proteobacteria, Bacteroidetes, and Firmicutes were frequently detected in both clone libraries. The clones were related to thermophilic, mesophilic and psychrophilic bacteria. Approximately half of the sequences in bacterial clone libraries shared <92% sequence similarity with their closest sequences in a public database, suggesting that the Ibusuki hot spring may harbor a unique and novel bacterial community. By contrast, 77 clones from the Site 2 archaeal library contained only three OTUs, most of which were affiliated with Thaumarchaeota.
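The grouping of clones into OTUs at a 97% threshold, as reported in the abstract above, can be sketched as follows. The abstract does not say which clustering algorithm or software was used, so this is a minimal greedy, seed-based sketch under stated assumptions: sequences are already aligned to equal length, identity is the fraction of matching columns, and each sequence joins the first OTU whose seed it matches at or above the threshold. The function names and toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def greedy_otus(sequences, threshold=97.0):
    """Greedy seed-based clustering (an assumption, not the study's method).

    Each sequence is assigned to the first OTU whose seed sequence it matches
    at >= threshold percent identity; otherwise it seeds a new OTU.
    """
    seeds = []
    otus = []
    for seq in sequences:
        for i, seed in enumerate(seeds):
            if percent_identity(seq, seed) >= threshold:
                otus[i].append(seq)
                break
        else:  # no existing seed was close enough
            seeds.append(seq)
            otus.append([seq])
    return otus


# Toy 100-column "alignments" (hypothetical data).
clones = ["A" * 100, "A" * 98 + "CC", "A" * 90 + "C" * 10]
otus = greedy_otus(clones, threshold=97.0)
print(len(otus), [len(members) for members in otus])
```

Greedy clustering depends on input order, which is one reason dedicated OTU-picking tools sort sequences (for example by abundance or length) before clustering.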
Primary Human Immunodeficiency Virus Type 1 (HIV-1) Infection during HIV-1 Gag Vaccination
Balamurugan, Arumugam; Lewis, Martha J.; Kitchen, Christina M. R.; Robertson, Michael N.; Shiver, John W.; Daar, Eric S.; Pitt, Jacqueline; Ali, Ayub; Ng, Hwee L.; Currier, Judith S.; Yang, Otto O.
2008-01-01
Vaccination for human immunodeficiency virus type 1 (HIV-1) remains an elusive goal. Whether an unsuccessful vaccine might not only fail to provoke detectable immune responses but also could actually interfere with subsequent natural immunity upon HIV-1 infection is unknown. We performed detailed assessment of an HIV-1 gag DNA vaccine recipient (subject 00015) who was previously uninfected but sustained HIV-1 infection before completing a vaccination trial and another contemporaneously acutely infected individual (subject 00016) with the same strain of HIV-1. Subject 00015 received the vaccine at weeks 0, 4, and 8 and was found to have been acutely HIV-1 infected around the time of the third vaccination. Subject 00016 was a previously HIV-1-seronegative sexual contact who had symptoms of acute HIV-1 infection approximately 2 weeks earlier than subject 00015 and demonstrated subsequent seroconversion. Both individuals reached an unusually low level of chronic viremia (<1,000 copies/ml) without treatment. Subject 00015 had no detectable HIV-1-specific cytotoxic T-lymphocyte (CTL) responses until a borderline response was noted at the time of the third vaccination. The magnitude and breadth of Gag-specific CTL responses in subject 00015 were similar to those of subject 00016 during early chronic infection. Viral sequences from gag, pol, and nef confirmed the common source of HIV-1 between these individuals. The diversity and divergence of sequences in subjects 00015 and 00016 were similar, indicating similar immune pressure on these proteins (including Gag). As a whole, the data suggested that while the gag DNA vaccine did not prime detectable early CTL responses in subject 00015, vaccination did not appreciably impair his ability to contain viremia at levels similar to those in subject 00016. PMID:18199650
Hammondia heydorni oocysts in the faeces of a greyhound in New Zealand.
Ellis, J T; Pomroy, W E
2003-02-01
To identify oocysts found in faecal material of a greyhound. Polymerase chain reaction (PCR) and DNA sequencing were used to study genomic DNA isolated from oocysts purified from faeces of a greyhound. Database searches with the DNA sequences obtained showed they were derived from Hammondia heydorni. A species-specific PCR was developed to detect H. heydorni DNA. Light microscopy in conjunction with PCR and DNA sequencing definitively identified the presence of H. heydorni oocysts in faeces of a greyhound. This study confirms the presence of H. heydorni in New Zealand and indicates the need to correctly identify similar oocysts from dogs, rather than assume they are Neospora caninum.
Fowler, Elizabeth V; Peters, Jennifer M; Gatton, Michelle L; Chen, Nanhua; Cheng, Qin
2002-03-01
In Plasmodium falciparum a highly polymorphic multi-copy gene family, var, encodes the variant surface antigen P. falciparum erythrocyte membrane protein 1 (PfEMP1), which has an important role in cytoadherence and immune evasion. Using previously described universal PCR primers for the first Duffy binding-like domain (DBLalpha) of var we analysed the DBLalpha repertoires of Dd2 (originally from Thailand) and eight isolates from the Solomon Islands (n=4), Philippines (n=2), Papua New Guinea (n=1) and Africa (n=1). We found 15-32 unique DBLalpha sequence types among these isolates and estimated detectable DBLalpha repertoire sizes ranging from 33-38 to 52-57 copies per genome. Our data suggest that var gene repertoires generally consist of 40-50 copies per genome. Eighteen DBLalpha sequences appeared in more than one Asia-Pacific isolate with the number of sequences shared between any two isolates ranging from 0 to 6 (mean=2.0 +/-1.6). At the amino acid level DBLalpha sequence similarity within isolates ranged from 45.2 +/- 7.1 to 50.2 +/- 6.9%, and was not significantly different from the DBLalpha amino acid sequence similarity among isolates (P>0.1). Comparisons with published sequences also revealed little overlap among DBLalpha sequences from different regions. High DBLalpha sequence diversity and minimal overlap among these isolates suggest that the global var gene repertoire is immense, and may potentially be selected for by the host's protective immune response to the var gene products, PfEMP1.
Detection and sequencing of rotavirus among Sudanese children.
Magzoub, Magzoub Abbas; Bilal, Naser Eldin; Bilal, Jalal Ali; Alzohairy, Mohammad Abdulrahman; Elamin, Bahaeldin Khalid; Gasim, Gasim Ibrahim
2017-01-01
Diarrheal diseases are a major public health problem worldwide, particularly in developing countries. The current study was conducted to detect and characterize group A rotavirus among children admitted with gastroenteritis to pediatric hospitals in Sudan. A total of 755 stool samples were collected from Sudanese children less than 5 years of age presenting with acute gastroenteritis during the period from April to September 2010. Enzyme-linked immunosorbent assay (ELISA) was used to detect rotavirus antigens. Ribonucleic acid (RNA) was extracted from rotavirus-positive stool samples using the QIAamp® Viral RNA Mini Kit. The Omniscript® Reverse Transcription kit was used to convert RNA to complementary deoxyribonucleic acid (cDNA). The cDNAs were used as templates for VP4-P (P for protease-sensitive) and VP7-G (G for glycoprotein) genotyping of rotavirus using nested PCR and sequencing. Out of the 755 stool samples from children with acute gastroenteritis, 121 were positive for rotavirus A. Among the 24 samples that were sequenced, the predominant VP7 G type was G1 (83.3%), followed by G9 (16.7%). Among these samples, only one VP4 genotype, P[8], was detected. In conclusion, the predominant VP7 G type was G1, followed by G9, whereas only one VP4 genotype was detected, which showed similarity to a P[8] GenBank strain. It appears that the recently approved rotavirus vaccines in Sudan are well matched to the rotavirus genotypes identified in this study, though more studies are needed.
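The proportions in the abstract above can be checked with a short worked calculation. The counts 121 and 755 are stated in the abstract; the G1 and G9 counts (20 and 4 of the 24 sequenced samples) are not stated and are inferred here from the reported percentages, so they should be read as an assumption.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Stated in the abstract: 121 rotavirus A positives among 755 stool samples.
detection_rate = percent(121, 755)

# Assumed counts, inferred from the reported 83.3% and 16.7% of 24 sequenced samples.
g1_share = percent(20, 24)
g9_share = percent(4, 24)

print(detection_rate, g1_share, g9_share)
```

Under these assumptions the overall detection rate works out to about 16% of the sampled children, a figure the abstract leaves implicit.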
Santos, Paula Fernanda Gonçalves Dos; Costa, Elis Regina Dalla; Ramalho, Daniela M; Rossetti, Maria Lucia; Barcellos, Regina Bones; Nunes, Luciana de Souza; Esteves, Leonardo Souza; Rodenbusch, Rodrigo; Anthony, Richard M; Bergval, Indra; Sengstake, Sarah; Viveiros, Miguel; Kritski, Afrânio; Oliveira, Martha M
2017-06-01
To cope with the emergence of multidrug-resistant tuberculosis (MDR-TB), new molecular methods that can routinely be used to screen for a wide range of drug resistance related genetic markers in the Mycobacterium tuberculosis genome are urgently needed. This study evaluated the performance of multiplex ligation-dependent probe amplification (MLPA) against Genotype® MTBDRplus to detect resistance to isoniazid (INHr) and rifampicin (RIFr). 96 culture isolates characterised for identification, drug susceptibility testing (DST) and sequencing of rpoB, katG, and inhA genes were evaluated by the MLPA and Genotype® MTBDRplus assays. With sequencing as a reference standard, sensitivity (SE) to detect INHr was 92.8% and 85.7%, and specificity (SP) was 100% and 97.5%, for MLPA and Genotype® MTBDRplus, respectively. In relation to RIFr, SE was 87.5% and 100%, and SP was 100% and 98.8%, respectively. Kappa value was identical between Genotype® MTBDRplus and MLPA compared with the standard DST and sequencing for detection of INHr [0.83 (0.75-0.91)] and RIFr [0.93 (0.88-0.98)]. Compared to Genotype® MTBDRplus, MLPA showed similar sensitivity to detect INH and RIF resistance. The results obtained by the MLPA and Genotype® MTBDRplus assays indicate that both molecular tests can be used for the rapid detection of drug-resistant TB with high accuracy. MLPA has the added value of providing information on the circulating M. tuberculosis lineages.
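The sensitivity, specificity and kappa statistics reported in the abstract above all derive from a 2×2 table of test result against reference standard. A minimal sketch of that calculation follows. The abstract does not give the underlying counts, so the numbers used here are purely hypothetical and chosen only to make the arithmetic easy to follow; the function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, Cohen's kappa) for a 2x2 table.

    tp/fn: reference-resistant isolates called resistant/susceptible by the test.
    fp/tn: reference-susceptible isolates called resistant/susceptible by the test.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / total  # observed agreement with the reference
    # Agreement expected by chance from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts, NOT taken from the study.
se, sp, kappa = diagnostic_metrics(tp=45, fp=2, fn=5, tn=48)
print(f"SE={se:.1%} SP={sp:.1%} kappa={kappa:.2f}")
```

Kappa corrects raw agreement for chance, which is why it is reported alongside sensitivity and specificity when two assays are compared against the same reference.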
Diversity of Basidiomycetes in Michigan Agricultural Soils▿
Lynch, Michael D. J.; Thorn, R. Greg
2006-01-01
We analyzed the communities of soil basidiomycetes in agroecosystems that differ in tillage history at the Kellogg Biological Station Long-Term Ecological Research site near Battle Creek, Michigan. The approach combined soil DNA extraction through a bead-beating method modified to increase recovery of fungal DNA, PCR amplification with basidiomycete-specific primers, cloning and restriction fragment length polymorphism screening of mixed PCR products, and sequencing of unique clones. Much greater diversity was detected than was anticipated in this habitat on the basis of culture-based methods or surveys of fruiting bodies. With “species” defined as organisms yielding PCR products with ≥99% identity in the 5′ 650 bases of the nuclear large-subunit ribosomal DNA, 241 “species” were detected among 409 unique basidiomycete sequences recovered. Almost all major clades of basidiomycetes from basidiomycetous yeasts and other heterobasidiomycetes through polypores and euagarics (gilled mushrooms and relatives) were represented, with a majority from the latter clade. Only 24 of 241 “species” had 99% or greater sequence similarity to named reference sequences in GenBank, and several clades with multiple “species” could not be identified at the genus level by phylogenetic comparisons with named sequences. The total estimated “species” richness for this 11.2-ha site was 367 “species” of basidiomycetes. Since >99% of the study area has not been sampled, the accuracy of our diversity estimate is uncertain. Replication in time and space is required to detect additional diversity and the underlying community structure. PMID:16950900
Al-Habsi, Khalid; Yang, Rongchang; Ryan, Una; Jacobson, Caroline; Miller, David W
2017-02-15
Uninucleated Entamoeba cysts measuring 7.3×7.7μm were detected in faecal samples collected from wild Rangeland goats (Capra hircus) after arrival at a commercial goat depot near Geraldton, Western Australia at a prevalence of 6.4% (8/125). Sequences were obtained at the 18S rRNA (n=8) and actin (n=5) loci following PCR amplification. At the 18S locus, phylogenetic analysis grouped the isolates closest with an E. bovis isolate (FN666250) from a sheep from Sweden with 99% similarity. At the actin locus, no E. bovis sequences were available, and the isolates shared 94.0% genetic similarity with E. suis from a pig in Western Japan. This is the first report to describe the morphology and molecular characterisation of Entamoeba from Rangeland goats in Western Australia and the first study to produce actin sequences from E. bovis-like Entamoeba sp. Copyright © 2017 Elsevier B.V. All rights reserved.
Xavier, Crislaine; Cabral-de-Mello, Diogo Cavalcanti; de Moura, Rita Cássia
2014-12-01
Cytogenetic studies of the Neotropical beetle genus Dichotomius (Scarabaeinae, Coleoptera) have shown dynamism for centromeric constitutive heterochromatin sequences. In the present work we studied the chromosomes and isolated repetitive sequences of Dichotomius schiffleri aiming to contribute to the understanding of coleopteran genome/chromosomal organization. Dichotomius schiffleri presented a conserved karyotype and heterochromatin distribution in comparison to other species of the genus with 2n = 18, biarmed chromosomes, and pericentromeric C-positive blocks. Similarly to heterochromatin distributional patterns, the highly and moderately repetitive DNA fraction (C0t-1 DNA) was detected in pericentromeric areas, contrasting with the euchromatic mapping of an isolated TE (named DsmarMITE). After structural analyses, the DsmarMITE was classified as a non-autonomous element of the type miniature inverted-repeat transposable element (MITE) with terminal inverted repeats similar to Mariner elements of insects from different orders. The euchromatic distribution for DsmarMITE indicates that it does not play a part in the dynamics of constitutive heterochromatin sequences.
Detection and identification of Rickettsia species in Ixodes tick populations from Estonia.
Katargina, Olga; Geller, Julia; Ivanova, Anna; Värv, Kairi; Tefanova, Valentina; Vene, Sirkka; Lundkvist, Åke; Golovljova, Irina
2015-09-01
A total of 1640 ticks collected in different geographical parts of Estonia were screened for the presence of Rickettsia species DNA by real-time PCR. DNA of Rickettsia was detected in 83 out of 1640 questing ticks with an overall prevalence of 5.1%. The majority of the ticks infected by rickettsiae were Ixodes ricinus (74 of 83), while 9 of the 83 positive ticks were Ixodes persulcatus. For rickettsial species identification, a part of the citrate synthase gltA gene was sequenced. The majority of the positive samples were identified as Rickettsia helvetica (81 out of 83) and two of the samples were identified as Rickettsia monacensis and Candidatus R. tarasevichiae, respectively. Genetic characterization based on the partial gltA gene showed that the Estonian sequences within the R. helvetica, R. monacensis and Candidatus R. tarasevichiae species demonstrated 100% similarity with sequences deposited in GenBank, originating from Rickettsia species distributed over large territories from Europe to Asia. Copyright © 2015 Elsevier GmbH. All rights reserved.
Molecular prevalence and genetic diversity of bovine Theileria orientalis in Myanmar.
Bawm, Saw; Shimizu, Kohei; Hirota, Jun-Ichi; Tosa, Yusuke; Htun, Lat Lat; Maw, Ni Ni; Thein, Myint; Kato, Hirotomo; Sakurai, Tatsuya; Katakura, Ken
2014-08-01
Theileria orientalis is a causative agent of benign theileriosis in cattle and distributed in mainly Asian countries. In the present study, we examined the prevalence of T. orientalis infection by PCR based on the major piroplasm surface protein gene (MPSP) sequences in cattle in Myanmar, followed by phylogenetic analysis of the MPSP genes. The MPSP gene was amplified in 258 of 713 (36.2%) cattle blood DNA samples collected from five cities in different geographical regions of Myanmar. Phylogenetic analysis of MPSP sequences from 54 T. orientalis-positive DNA samples revealed the presence of six allelic genotypes, including Types 1, 3, 4, 5, 7, and N-3. Types 5 and 7 were the predominant types detected. Sequences of the MPSP genes detected in Myanmar were closely related to those from Thailand, Vietnam or Mongolia. These findings suggest that movement of animals carrying T. orientalis parasites between Southeast Asian countries could be a reason for the similar genotype distribution of the parasites in Myanmar. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Expression of glutathione peroxidase I gene in selenium-deficient rats.
Reddy, A P; Hsu, B L; Reddy, P S; Li, N Q; Thyagaraju, K; Reddy, C C; Tam, M F; Tu, C P
1988-01-01
We have characterized a cDNA, pGPX1211, encoding rat glutathione peroxidase I. The selenocysteine in the protein corresponded to a TGA codon in the coding region of the cDNA, similar to earlier findings in mouse and human genes, and in a gene encoding the formate dehydrogenase from E. coli, another selenoenzyme. The rat GSH peroxidase I has a calculated subunit molecular weight of 22,155 daltons and shares 95% and 86% sequence homology with the mouse and human subunits, respectively. The 3'-noncoding sequence (greater than 930 bp) in pGPX1211 is much longer than that of the human sequences. We found that glutathione peroxidase I mRNA, but not the polypeptide, was expressed under nutritional stress of selenium deficiency where no glutathione peroxidase I activity can be detected. The failure to detect any apoprotein for glutathione peroxidase I under selenium deficiency, together with results published from other laboratories, supports the proposal that selenium may be incorporated into glutathione peroxidase I co-translationally. PMID:2838821
Rajendran, Senthilnathan; Jothi, Arunachalam
2018-05-16
The three-dimensional structure of a protein depends on the interactions between its amino acid residues. These interactions are in turn influenced by various biophysical properties of the amino acids. There are several examples of proteins that share the same fold but are very dissimilar at the sequence level. For proteins to share a common fold, some crucial interactions should be maintained despite insignificant sequence similarity. Since the interactions arise from the biophysical properties of the amino acids, we should be able to detect descriptive patterns for folds at the property level. Accordingly, the main focus of our research is to analyze such proteins and to characterize them in terms of their biophysical properties. Protein structures with sequence similarity of less than 40% were selected for ten different subfolds from three different mainfolds (according to CATH classification) and were used for this analysis. We used the normalized values of 49 physicochemical, energetic and conformational properties of amino acids. We characterized the folds based on the average biophysical property values. We also observed a fold-specific correlational behavior of biophysical properties despite very low sequence similarity in our data. We further trained three different binary classification models (Naive Bayes-NB, Support Vector Machines-SVM and Bayesian Generalized Linear Model-BGLM) that could discriminate mainfolds based on the biophysical properties. We also show that, among the three generated models, the BGLM classifier was able to discriminate protein sequences in the all-beta category with 81.43% accuracy and all-alpha and alpha-beta proteins with 83.37% accuracy. Copyright © 2018 Elsevier Ltd. All rights reserved.
Poly A- Transcripts Expressed in HeLa Cells
Lu, Jian; Xuan, Zhenyu; Chen, Jun; Zheng, Yonglan; Zhou, Tom; Zhang, Michael Q.; Wu, Chung-I; Wang, San Ming
2008-01-01
Background: Transcripts expressed in eukaryotes are classified as poly A+ transcripts or poly A- transcripts based on the presence or absence of the 3′ poly A tail. Most transcripts identified so far are poly A+ transcripts, whereas the poly A- transcripts remain largely unknown. Methodology/Principal Findings: We developed the TRD (Total RNA Detection) system for transcript identification. The system detects the transcripts through the following steps: 1) depleting the abundant ribosomal and small-size transcripts; 2) synthesizing cDNA without regard to the status of the 3′ poly A tail; 3) applying the 454 sequencing technology for massive 3′ EST collection from the cDNA; and 4) determining the genome origins of the detected transcripts by mapping the sequences to the human genome reference sequences. Using this system, we characterized the cytoplasmic transcripts from HeLa cells. Of the 13,467 distinct 3′ ESTs analyzed, 24% are poly A-, 36% are poly A+, and 40% are bimorphic with poly A+ features but without the 3′ poly A tail. Most of the poly A- 3′ ESTs do not match known transcript sequences; they have a similar distribution pattern in the genome as the poly A+ and bimorphic 3′ ESTs, and their mapped intergenic regions are evolutionarily conserved. Experiments confirmed the authenticity of the detected poly A- transcripts. Conclusion/Significance: Our study provides the first large-scale sequence evidence for the presence of poly A- transcripts in eukaryotes. The abundance of the poly A- transcripts highlights the need for comprehensive identification of these transcripts for decoding the transcriptome, annotating the genome and studying biological relevance of the poly A- transcripts. PMID:18665230
Blouin, Arnaud G; Chooi, Kar Mun; Warren, Ben; Napier, Kathryn R; Barrero, Roberto A; MacDiarmid, Robin M
2018-05-01
A novel virus, with characteristics of viruses classified within the genus Vitivirus, was identified from a sample of Vitis vinifera cv. Chardonnay in New Zealand. The virus was detected with high throughput sequencing (small RNA and total RNA) and its sequence was confirmed by Sanger sequencing. Its genome is 7507 nt long (excluding the polyA tail) with an organisation similar to that described for other classifiable members of the genus Vitivirus. The closest relative of the virus is grapevine virus E (GVE) with 65% aa identity in ORF1 (65% nt identity) and 63% aa identity in the coat protein (66% nt identity). The relationship with GVE was confirmed with phylogenetic analysis, showing the new virus branching with GVE, Agave tequilina leaf virus and grapevine virus G (GVG). A limited survey revealed the presence of this virus in multiple plants from the same location where the newly described GVG was discovered, and in most cases both viruses were detected as co-infections. The genetic characteristics of this virus suggest it represents an isolate of a new species within the genus Vitivirus and following the current nomenclature, we propose the name "Grapevine virus I".
Yamagata, A; Kato, J; Hirota, R; Kuroda, A; Ikeda, T; Takiguchi, N; Ohtake, H
1999-06-01
Two plasmids were discovered in the ammonia-oxidizing bacterium Nitrosomonas sp. strain ENI-11, which was isolated from activated sludge. The plasmids, designated pAYS and pAYL, were relatively small, being approximately 1.9 kb long. They were cryptic plasmids, having no detectable plasmid-linked antibiotic resistance or heavy metal resistance markers. The complete nucleotide sequences of pAYS and pAYL were determined, and their physical maps were constructed. There existed two major open reading frames, ORF1 in pAYS and ORF2 in pAYL, each of which was more than 500 bp long. The predicted product of ORF2 was 28% identical to part of the replication protein of a Bacillus plasmid, pBAA1. However, no significant similarity to any known protein sequences was detected with the predicted product of ORF1. pAYS and pAYL had a highly homologous region, designated HHR, of 262 bp. The overall identity was 98% between the two nucleotide sequences. Interestingly, HHR-homologous sequences were also detected in the genomes of ENI-11 and the plasmidless strain Nitrosomonas europaea IFO14298. Deletion analysis of pAYS and pAYL indicated that HHR, together with either ORF1 or ORF2, was essential for plasmid maintenance in ENI-11. To our knowledge, pAYS and pAYL are the first plasmids found in the ammonia-oxidizing autotrophic bacteria.
Pathogenic bacteria in sewage treatment plants as revealed by 454 pyrosequencing.
Ye, Lin; Zhang, Tong
2011-09-01
This study applied 454 high-throughput pyrosequencing to analyze potentially pathogenic bacteria in activated sludge from 14 municipal wastewater treatment plants (WWTPs) across four countries (China, U.S., Canada, and Singapore), plus the influent and effluent of one of the 14 WWTPs. A total of 370,870 16S rRNA gene sequences with average length of 207 bps were obtained and all of them were assigned to corresponding taxonomic ranks by using RDP classifier and MEGAN. It was found that the most abundant potentially pathogenic bacteria in the WWTPs were affiliated with the genera of Aeromonas and Clostridium. Aeromonas veronii, Aeromonas hydrophila, and Clostridium perfringens were species most similar to the potentially pathogenic bacteria found in this study. Some sequences highly similar (>99%) to Corynebacterium diphtheriae were found in the influent and activated sludge samples from a saline WWTP. Overall, the percentage of the sequences closely related (>99%) to known pathogenic bacteria sequences was about 0.16% of the total sequences. Additionally, a platform-independent Java application (BAND) was developed for graphical visualization of the data of microbial abundance generated by high-throughput pyrosequencing. The approach demonstrated in this study could examine most of the potentially pathogenic bacteria simultaneously instead of one-by-one detection by other methods.
Satellite DNA Sequences in Canidae and Their Chromosome Distribution in Dog and Red Fox.
Vozdova, Miluse; Kubickova, Svatava; Cernohorska, Halina; Fröhlich, Jan; Rubes, Jiri
2016-01-01
Satellite DNA is a characteristic component of mammalian centromeric heterochromatin, and a comparative analysis of its evolutionary dynamics can be used for phylogenetic studies. We analysed satellite and satellite-like DNA sequences available in NCBI for 4 species of the family Canidae (red fox, Vulpes vulpes, VVU; domestic dog, Canis familiaris, CFA; arctic fox, Vulpes lagopus, VLA; raccoon dog, Nyctereutes procyonoides procyonoides, NPR) by comparative sequence analysis, which revealed 86-90% intraspecies and 76-79% interspecies similarity. Comparative fluorescence in situ hybridisation in the red fox and dog showed signals of the red fox satellite probe in canine and vulpine autosomal centromeres, on VVUY, B chromosomes, and in the distal parts of VVU9q and VVU10p which were shown to contain nucleolus organiser regions. The CFA satellite probe stained autosomal centromeres only in the dog. The CFA satellite-like DNA did not show any significant sequence similarity with the satellite DNA of any species analysed and was localised to the centromeres of 9 canine chromosome pairs. No significant heterochromatin block was detected on the B chromosomes of the red fox. Our results show extensive heterogeneity of satellite sequences among Canidae and prove close evolutionary relationships between the red and arctic fox. © 2017 S. Karger AG, Basel.
Font, María Isabel; Rubio, Luis; Martínez-Culebras, Pedro Vicente; Jordá, Concepción
2007-09-01
The population structure and genetic variation of two begomoviruses: tomato yellow leaf curl Sardinia virus (TYLCSV) and tomato yellow leaf curl virus (TYLCV) in tomato crops of Spain were studied from 1997 until 2001. Restriction digestion of a genomic region comprised of the CP coat protein gene (CPR) of 358 TYLC virus isolates enabled us to classify them into 14 haplotypes. Nucleotide sequences of two genomic regions: CPR, and the surrounding intergenic region (SIR) were determined for at least two isolates per haplotype. SIR was more variable than CPR and showed multiple recombination events whereas no recombination was detected within CPR. In all geographic regions except Murcia, the population was, or evolved to be composed of one predominant haplotype with a low genetic diversity (<0.0180). In Murcia, two successive changes of the predominant haplotype were observed in the best studied population. Phylogenetic analysis showed that the TYLCSV sequences determined clustered with sequences obtained from the GenBank of other TYLCSV Spanish isolates which were clearly separated from TYLCSV Italian isolates. Most of our TYLCV sequences were similar to those of isolates from Japan and Portugal, and the sequences obtained from TYLCV isolates from the Canary island of Lanzarote were similar to those of Caribbean TYLCV isolates.
Brittnacher, Mitchell J; Heltshe, Sonya L; Hayden, Hillary S; Radey, Matthew C; Weiss, Eli J; Damman, Christopher J; Zisman, Timothy L; Suskind, David L; Miller, Samuel I
2016-01-01
Comparative analysis of gut microbiomes in clinical studies of human diseases typically relies on identification and quantification of species or genes. In addition to exploring specific functional characteristics of the microbiome and potential significance of species diversity or expansion, microbiome similarity is also calculated to study change in response to therapies directed at altering the microbiome. Established ecological measures of similarity can be constructed from species abundances; however, methods for calculating these commonly used ecological measures of similarity directly from whole genome shotgun (WGS) metagenomic sequence are lacking. We present an alignment-free method for calculating similarity of WGS metagenomic sequences that is analogous to the Bray-Curtis index for species, implemented by the General Utility for Testing Sequence Similarity (GUTSS) software application. This method was applied to intestinal microbiomes of healthy young children to measure developmental changes toward an adult microbiome during the first 3 years of life. We also calculate similarity of donor and recipient microbiomes to measure establishment, or engraftment, of donor microbiota in fecal microbiota transplantation (FMT) studies focused on mild to moderate Crohn's disease. We show how a relative index of similarity to donor can be calculated as a measure of change in a patient's microbiome toward that of the donor in response to FMT. Because clinical efficacy of the transplant procedure cannot be fully evaluated without analysis methods to quantify actual FMT engraftment, we developed a method for detecting change in the gut microbiome that is independent of species identification and database bias, is sensitive to changes in relative abundance of the microbial constituents, and can be formulated as an index for correlating engraftment success with clinical measures of disease. More generally, this method may be applied to clinical evaluation of human microbiomes and provide potential diagnostic determination of individuals who may be candidates for specific therapies directed at alteration of the microbiome.
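To make the idea of a Bray-Curtis-like, alignment-free similarity concrete, the sketch below computes the standard Bray-Curtis similarity over k-mer counts of two sequences. This is only an illustration under stated assumptions: the abstract does not describe how GUTSS derives its features, so the use of k-mer counts, the value of k, and the function names are hypothetical and should not be read as the GUTSS implementation.

```python
from collections import Counter


def kmer_counts(seq: str, k: int = 4) -> Counter:
    """Count overlapping k-mers in a sequence (assumed feature; not from the source)."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def bray_curtis_similarity(a: Counter, b: Counter) -> float:
    """Bray-Curtis similarity: 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b))."""
    shared = sum(min(a[key], b[key]) for key in set(a) | set(b))
    return 2 * shared / (sum(a.values()) + sum(b.values()))


# Toy sequences for illustration; real input would be pooled metagenomic reads.
donor = kmer_counts("ACGTACGTTGCA", k=3)
recipient = kmer_counts("ACGTACGAAGCA", k=3)
print(bray_curtis_similarity(donor, recipient))
```

The value ranges from 0 (no shared k-mers) to 1 (identical count profiles), which mirrors how the species-based index behaves on abundance tables.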
Active bacterial community structure along vertical redox gradients in Baltic Sea sediment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jansson, Janet; Edlund, Anna; Hardeman, Fredrik
Community structures of active bacterial populations were investigated along a vertical redox profile in coastal Baltic Sea sediments by terminal-restriction fragment length polymorphism (T-RFLP) and clone library analysis. According to correspondence analysis of T-RFLP results and sequencing of cloned 16S rRNA genes, the microbial community structures at three redox depths (179 mV, -64 mV and -337 mV) differed significantly. The bacterial communities in the community DNA differed from those in bromodeoxyuridine (BrdU)-labeled DNA, indicating that the growing members of the community that incorporated BrdU were not necessarily the most dominant members. The structures of the actively growing bacterial communities were most strongly correlated to organic carbon followed by total nitrogen and redox potentials. Bacterial identification by sequencing of 16S rRNA genes from clones of BrdU-labeled DNA and DNA from reverse transcription PCR (rt-PCR) showed that bacterial taxa involved in nitrogen and sulfur cycling were metabolically active along the redox profiles. Several sequences had low similarities to previously detected sequences indicating that novel lineages of bacteria are present in Baltic Sea sediments. Also, a high number of different 16S rRNA gene sequences representing different phyla were detected at all sampling depths.
Detection and decay rates of prey and prey symbionts in the gut of a predator through metagenomics.
Paula, Débora P; Linard, Benjamin; Andow, David A; Sujii, Edison R; Pires, Carmen S S; Vogler, Alfried P
2015-07-01
DNA methods are useful to identify ingested prey items from the gut of predators, but reliable detection is hampered by low amounts of degraded DNA. PCR-based methods can retrieve minute amounts of starting material but suffer from amplification biases and cross-reactions with the predator and related species genomes. Here, we use PCR-free direct shotgun sequencing of total DNA isolated from the gut of the harlequin ladybird Harmonia axyridis at five time points after feeding on a single pea aphid Acyrthosiphon pisum. Sequence reads were matched to three reference databases: Insecta mitogenomes of 587 species, including H. axyridis sequenced here; A. pisum nuclear genome scaffolds; and scaffolds and complete genomes of 13 potential bacterial symbionts. Immediately after feeding, multicopy mtDNA of A. pisum was detected in tens of reads, while hundreds of matches to nuclear scaffolds were detected. Aphid nuclear DNA and mtDNA decayed at similar rates (0.281 and 0.11 h⁻¹, respectively), and the detectability periods were 32.7 and 23.1 h. Metagenomic sequencing also revealed thousands of reads of the obligate Buchnera aphidicola and facultative Regiella insecticola aphid symbionts, which showed exponential decay rates significantly faster than aphid DNA (0.694 and 0.80 h⁻¹, respectively). However, the facultative aphid symbionts Hamiltonella defensa, Arsenophonus spp. and Serratia symbiotica showed an unexpected temporary increase in population size by 1-2 orders of magnitude in the predator guts before declining. Metagenomics is a powerful tool that can reveal complex relationships and the dynamics of interactions among predators, prey and their symbionts. © 2014 John Wiley & Sons Ltd.
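The decay rates quoted above can be read through a simple first-order model, N(t) = N0 · exp(−k·t). The sketch below assumes that model (the abstract says only that decay was exponential) and shows how a rate constant in h⁻¹ translates into a half-life and a remaining fraction. It does not attempt to reproduce the reported detectability periods, whose definition is not given in the abstract; the function names are illustrative.

```python
import math


def remaining_fraction(rate_per_hour: float, hours: float) -> float:
    """Fraction of the initial signal left after `hours`, assuming first-order decay."""
    return math.exp(-rate_per_hour * hours)


def half_life(rate_per_hour: float) -> float:
    """Time in hours for the signal to halve: ln(2) / k."""
    return math.log(2) / rate_per_hour


# Rate constants as reported in the abstract (h^-1).
for label, k in [("aphid nuclear DNA", 0.281), ("Buchnera aphidicola", 0.694)]:
    print(f"{label}: half-life {half_life(k):.2f} h, "
          f"fraction left after 6 h {remaining_fraction(k, 6):.3f}")
```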
Simmons, Greg; Clarke, Daniel; McKee, Jeff; Young, Paul; Meers, Joanne
2014-01-01
Gibbon ape leukaemia virus (GALV) and koala retrovirus (KoRV) share a remarkably close sequence identity despite the fact that they occur in distantly related mammals on different continents. It has previously been suggested that infection of their respective hosts may have occurred as a result of a species jump from another, as yet unidentified vertebrate host. To investigate possible sources of these retroviruses in the Australian context, DNA samples were obtained from 42 vertebrate species and screened using PCR in order to detect proviral sequences closely related to KoRV and GALV. Four proviral partial sequences totalling 2880 bases which share a strong similarity with KoRV and GALV were detected in DNA from a native Australian rodent, the grassland melomys, Melomys burtoni. We have designated this novel gammaretrovirus Melomys burtoni retrovirus (MbRV). The concatenated nucleotide sequence of MbRV shares 93% identity with the corresponding sequence from GALV-SEATO and 83% identity with KoRV. The geographic ranges of the grassland melomys and of the koala partially overlap. Thus a species jump by MbRV from melomys to koalas is conceivable. However the genus Melomys does not occur in mainland South East Asia and so it appears most likely that another as yet unidentified host was the source of GALV.
Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song
2012-08-01
Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.
Liberti, D; Marais, A; Svanella-Dumas, L; Dulucq, M J; Alioto, D; Ragozzino, A; Rodoni, B; Candresse, T
2005-04-01
A trichovirus closely related to Apple chlorotic leaf spot virus (ACLSV) was detected in symptomatic apricot and Japanese plum from Italy. The Sus2 isolate of this agent cross-reacted with anti-ACLSV polyclonal reagents but was not detected by broad-specificity anti-ACLSV monoclonal antibodies. It had particles with typical trichovirus morphology but, contrary to ACLSV, was unable to infect Chenopodium quinoa and C. amaranticolor. The sequence of its genome (7,494 nucleotides [nt], missing only approximately 30 to 40 nt of the 5' terminal sequence) and the partial sequence of another isolate were determined. The new virus has a genomic organization similar to that of ACLSV, with three open reading frames coding for a replication-associated protein (RNA-dependent RNA polymerase), a movement protein, and a capsid protein, respectively. However, it had only approximately 65 to 67% nucleotide identity with sequenced isolates of ACLSV. The differences in serology, host range, genome sequence, and phylogenetic reconstructions for all viral proteins support the idea that this agent should be considered a new virus, for which the name Apricot pseudo-chlorotic leaf spot virus (APCLSV) is proposed. APCLSV shows substantial sequence variability and has been recovered from various Prunus sources coming from seven countries, an indication that it is likely to have a wide geographical distribution.
Diversity of viruses detected by deep sequencing in pigs from a common background
USDA-ARS's Scientific Manuscript database
The trial was successful in identifying a number of viruses in the feces of the pigs demonstrating the application of this technology to determine the background noise in the animals. The findings in this study are similar to the fecal virome in pigs from a typical commercial swine farm in the Unite...
Li, Zheng; Wang, Shu; Gui, Xiao-Ling; Chang, Xiao-Bei; Gong, Zhen-Hui
2013-01-01
Mature pepper (Capsicum sp.) fruits come in a variety of colors, including red, orange, yellow, brown, and white. To better understand the genetic and regulatory relationships between the yellow fruit phenotype and the capsanthin-capsorubin synthase gene (Ccs), we examined 156 Capsicum varieties, most of which were collected from Northwest Chinese landraces. A new ccs variant was identified in the yellow fruit cultivar CK7. Cluster analysis revealed that CK7, which belongs to the C. annuum species, has low genetic similarity to other yellow C. annuum varieties. In the coding sequence of this ccs allele, we detected a premature stop codon derived from a C to G change, as well as a downstream frame-shift caused by a 1-bp nucleotide deletion. In addition, the expression of the gene was detected in mature CK7 fruit. Furthermore, the promoter sequences of Ccs from some pepper varieties were examined, and we detected a 176-bp tandem repeat sequence in the promoter region. In all C. annuum varieties examined in this study, the repeat number was three, compared with four in two C. chinense accessions. The sequence similarity ranged from 84.8% to 97.7% among the four types of repeats, and some putative cis-elements were also found in every repeat. This suggests that the transcriptional regulation of Ccs expression is complex. Based on the analysis of the novel C. annuum mutation reported here, along with the studies of three mutation types in yellow C. annuum and C. chinense accessions, we suggest that the mechanism leading to the production of yellow color fruit may not be as complex as that leading to orange fruit production. PMID:23637942
Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin
2017-01-21
RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Algorithms relying on sequence-based features therefore usually have limited prediction performance. Hence, integrating RNA structure features is critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations, RNA-TVcurve (triple vector curve representation), of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The web server requires RNA primary sequences as input, while the corresponding secondary structures are optional. For primary sequences alone, the web server can compute the secondary structures using the free energy minimization algorithm of the RNAfold tool from the Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNA structure comparison.
The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, are available at: http://ml.jlu.edu.cn/tvcurve/ .
2014-01-01
Background: Huanglongbing (HLB) or citrus greening is a devastating disease of citrus. The gram-negative bacterium Candidatus Liberibacter asiaticus (Las) belonging to the α-proteobacteria is responsible for HLB in North America as well as in Asia. Currently, there is no cure for this disease. Early detection and quarantine of Las-infected trees are important management strategies used to prevent HLB from invading HLB-free citrus producing regions. Quantitative real-time PCR (qRT-PCR) based molecular diagnostic assays have been routinely used in the detection and diagnosis of Las. The oligonucleotide primer pairs based on conserved genes or regions, which include 16S rDNA and the β-operon, have been widely employed in the detection of Las by qRT-PCR. The availability of the whole genome sequence of Las now allows the design of primers beyond the conserved regions for the detection of Las explicitly. Results: We took a complementary approach by systematically screening the genes in a genome-wide fashion, to identify the unique signatures that are only present in Las by an exhaustive sequence-based similarity search against the nucleotide sequence database. Our search resulted in 34 probable unique signatures. Furthermore, by designing the primer pair specific to the identified signatures, we showed that most of our primer sets are able to detect Las from the infected plant and psyllid materials collected from the USA and China by qRT-PCR. Overall, 18 primer pairs of the 34 are found to be highly specific to Las with no cross reactivity to the closely related species Ca. L. americanus (Lam) and Ca. L. africanus (Laf). Conclusions: We have designed qRT-PCR primers based on Las-specific genes. Among them, 18 are suitable for the detection of Las from Las-infected plant and psyllid samples. The repertoire of primers that we have developed and characterized in this study enhanced the qRT-PCR based molecular diagnosis of HLB. PMID:24533511
Seven-Tesla Magnetization Transfer Imaging to Detect Multiple Sclerosis White Matter Lesions.
Chou, I-Jun; Lim, Su-Yin; Tanasescu, Radu; Al-Radaideh, Ali; Mougin, Olivier E; Tench, Christopher R; Whitehouse, William P; Gowland, Penny A; Constantinescu, Cris S
2018-03-01
Fluid-attenuated inversion recovery (FLAIR) imaging at 3 Tesla (T) field strength is the most sensitive modality for detecting white matter lesions in multiple sclerosis. While 7T FLAIR is effective in detecting cortical lesions, it has not been fully optimized for visualization of white matter lesions and thus has not been used for delineating lesions in quantitative magnetic resonance imaging (MRI) studies of the normal appearing white matter in multiple sclerosis. Therefore, we aimed to evaluate the sensitivity of 7T magnetization-transfer-weighted (MTw) images in the detection of white matter lesions compared with 3T-FLAIR. Fifteen patients with clinically isolated syndrome, 6 with multiple sclerosis, and 10 healthy participants were scanned with 7T 3-dimensional (D) MTw and 3T-2D-FLAIR sequences on the same day. White matter lesions visible on either sequence were delineated. Of 662 lesions identified on 3T-2D-FLAIR images, 652 were detected on 7T-3D-MTw images (sensitivity, 98%; 95% confidence interval, 97% to 99%). The Spearman correlation coefficient between lesion loads estimated by the two sequences was .910. The intrarater and interrater reliability for 7T-3D-MTw images was good with an intraclass correlation coefficient (ICC) of 98.4% and 81.8%, which is similar to that for 3T-2D-FLAIR images (ICC 96.1% and 96.7%). Seven-Tesla MTw sequences detected most of the white matter lesions identified by FLAIR at 3T. This suggests that 7T-MTw imaging is a robust alternative for detecting demyelinating lesions in addition to 3T-FLAIR. Future studies need to compare the roles of optimized 7T-FLAIR and of 7T-MTw imaging. © 2017 The Authors. Journal of Neuroimaging published by Wiley Periodicals, Inc. on behalf of American Society of Neuroimaging.
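The sensitivity figure reported above can be approximately reproduced from the lesion counts alone. The following is a minimal sketch, assuming a Wilson score interval; the abstract does not state which interval method the authors used, so the function name and the choice of method are illustrative only.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract: 652 of 662 FLAIR lesions seen on 7T MTw.
detected, reference = 652, 662
sensitivity = detected / reference
low, high = wilson_interval(detected, reference)
print(f"sensitivity = {sensitivity:.1%}, 95% CI = {low:.1%} to {high:.1%}")
```

Under this assumption the interval rounds to the same whole-percent bounds as those quoted in the abstract, although a different interval method could give slightly different decimals.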
Object detection in cinematographic video sequences for automatic indexing
NASA Astrophysics Data System (ADS)
Stauder, Jurgen; Chupeau, Bertrand; Oisel, Lionel
2003-06-01
This paper presents an object detection framework applied to cinematographic post-processing of video sequences. Post-processing is done after production and before editing. At the beginning of each shot of a video, a slate (also called clapperboard) is shown. The slate contains notably an electronic audio timecode that is necessary for audio-visual synchronization. This paper presents an object detection framework to detect slates in video sequences for automatic indexing and post-processing. It is based on five steps. The first two steps aim to drastically reduce the video data to be analyzed. They ensure a high recall rate but have low precision. The first step detects images at the beginning of a shot possibly showing a slate, while the second step searches in these images for candidate regions with a color distribution similar to slates. The objective is to not miss any slate while eliminating long parts of video without slate appearance. The third and fourth steps are statistical classification and pattern matching to detect and precisely locate slates in candidate regions. These steps ensure a high recall rate and high precision. The objective is to detect slates with very few false alarms to minimize interactive corrections. In a last step, electronic timecodes are read from slates to automate audio-visual synchronization. The presented slate detector has a recall rate of 89% and a precision of 97.5%. By temporal integration, much more than 89% of shots in dailies are detected. By timecode coherence analysis, the precision can be raised too. Issues for future work are to accelerate the system to be faster than real-time and to extend the framework for several slate types.
Comparative Genetic Analyses of Human Rhinovirus C (HRV-C) Complete Genome from Malaysia.
Khaw, Yam Sim; Chan, Yoke Fun; Jafar, Faizatul Lela; Othman, Norlijah; Chee, Hui Yee
2016-01-01
Human rhinovirus-C (HRV-C) has been implicated in more severe illnesses than HRV-A and HRV-B, however, the limited number of HRV-C complete genomes (complete 5' and 3' non-coding region and open reading frame sequences) has hindered the in-depth genetic study of this virus. This study aimed to sequence seven complete HRV-C genomes from Malaysia and compare their genetic characteristics with the 18 published HRV-Cs. Seven Malaysian HRV-C complete genomes were obtained with newly redesigned primers. The seven genomes were classified as HRV-C6, C12, C22, C23, C26, C42, and pat16 based on the VP4/VP2 and VP1 pairwise distance threshold classification. Five of the seven Malaysian isolates, namely, 3430-MY-10/C22, 8713-MY-10/C23, 8097-MY-11/C26, 1570-MY-10/C42, and 7383-MY-10/pat16 are the first newly sequenced complete HRV-C genomes. All seven Malaysian isolates genomes displayed nucleotide similarity of 63-81% among themselves and 63-96% with other HRV-Cs. Malaysian HRV-Cs had similar putative immunogenic sites, putative receptor utilization and potential antiviral sites as other HRV-Cs. The genomic features of Malaysian isolates were similar to those of other HRV-Cs. Negative selections were frequently detected in HRV-Cs complete coding sequences indicating that these sequences were under functional constraint. The present study showed that HRV-Cs from Malaysia have diverse genetic sequences but share conserved genomic features with other HRV-Cs. This genetic information could provide further aid in the understanding of HRV-C infection.
Lashkari, Mohammadreza; Manzari, Shahab; Sahragard, Ahad; Malagnini, Valeria; Boykin, Laura M; Hosseini, Reza
2014-07-01
The Asian citrus psyllid, Diaphorina citri Kuwayama (Hemiptera: Liviidae), is one of the most serious pests of citrus in the world, because it transmits the pathogen that causes citrus greening disease. To determine genetic variation among geographic populations of D. citri, microsatellite markers, mitochondrial gene cytochrome oxidase I (mtCOI) and the Wolbachia-Diaphorina, wDi, gene wsp sequence data were used to characterize Iranian and Pakistani populations. Also, a Bayesian phylogenetic technique was utilized to elucidate the relationships among the sequences data in this study and all mtCOI and wsp sequence data available in GenBank and the Wolbachia database. Microsatellite markers revealed significant genetic differentiation among Iranian populations, as well as between Iranian and Pakistani populations (FST = 0.0428, p < 0.01). Within Iran, the Sistan-Baluchestan population is significantly different from the Hormozgan (Fareghan) and Fars populations. By contrast, mtCOI data revealed two polymorphic sites separating the sequences from Iran and Pakistan. Global phylogenetic analyses showed that D. citri populations in Iran, India, Saudi Arabia, Brazil, Mexico, Florida and Texas (USA) are similar. Wolbachia, wDi, wsp sequences were similar among Iranian populations, but different between Iranian and Pakistani populations. The South West Asia (SWA) group is the most likely source of the introduced Iranian populations of D. citri. This assertion is also supported by the sequence similarity of the Wolbachia, wDi, strains from the Florida, USA and Iranian D. citri. These results should be considered when looking for biological controls in either country. © 2013 Society of Chemical Industry.
Comparative Genetic Analyses of Human Rhinovirus C (HRV-C) Complete Genome from Malaysia
Khaw, Yam Sim; Chan, Yoke Fun; Jafar, Faizatul Lela; Othman, Norlijah; Chee, Hui Yee
2016-01-01
Human rhinovirus-C (HRV-C) has been implicated in more severe illnesses than HRV-A and HRV-B, however, the limited number of HRV-C complete genomes (complete 5′ and 3′ non-coding region and open reading frame sequences) has hindered the in-depth genetic study of this virus. This study aimed to sequence seven complete HRV-C genomes from Malaysia and compare their genetic characteristics with the 18 published HRV-Cs. Seven Malaysian HRV-C complete genomes were obtained with newly redesigned primers. The seven genomes were classified as HRV-C6, C12, C22, C23, C26, C42, and pat16 based on the VP4/VP2 and VP1 pairwise distance threshold classification. Five of the seven Malaysian isolates, namely, 3430-MY-10/C22, 8713-MY-10/C23, 8097-MY-11/C26, 1570-MY-10/C42, and 7383-MY-10/pat16 are the first newly sequenced complete HRV-C genomes. All seven Malaysian isolates genomes displayed nucleotide similarity of 63–81% among themselves and 63–96% with other HRV-Cs. Malaysian HRV-Cs had similar putative immunogenic sites, putative receptor utilization and potential antiviral sites as other HRV-Cs. The genomic features of Malaysian isolates were similar to those of other HRV-Cs. Negative selections were frequently detected in HRV-Cs complete coding sequences indicating that these sequences were under functional constraint. The present study showed that HRV-Cs from Malaysia have diverse genetic sequences but share conserved genomic features with other HRV-Cs. This genetic information could provide further aid in the understanding of HRV-C infection. PMID:27199901
Ujike, Makoto; Ejima, Miho; Anraku, Akane; Shimabukuro, Kozue; Obuchi, Masatsugu; Kishida, Noriko; Hong, Xu; Takashita, Emi; Fujisaki, Seiichiro; Yamashita, Kazuyo; Horikawa, Hiroshi; Kato, Yumiko; Oguchi, Akio; Fujita, Nobuyuki; Tashiro, Masato
2011-01-01
To monitor and characterize oseltamivir-resistant (OR) pandemic (H1N1) 2009 virus with the H275Y mutation, we analyzed 4,307 clinical specimens from Japan by neuraminidase (NA) sequencing or inhibition assay; 61 OR pandemic (H1N1) 2009 viruses were detected. NA inhibition assay and M2 sequencing indicated that OR pandemic (H1N1) 2009 virus was resistant to M2 inhibitors, but sensitive to zanamivir. Full-genome sequencing showed OR and oseltamivir-sensitive (OS) viruses had high sequence similarity, indicating that domestic OR virus was derived from OS pandemic (H1N1) 2009 virus. Hemagglutination inhibition test demonstrated that OR and OS pandemic (H1N1) 2009 viruses were antigenically similar to the A/California/7/2009 vaccine strain. Of 61 case-patients with OR viruses, 45 received oseltamivir as treatment, and 10 received it as prophylaxis, which suggests that most cases emerged sporadically from OS pandemic (H1N1) 2009, due to selective pressure. No evidence of sustained spread of OR pandemic (H1N1) 2009 was found in Japan; however, 2 suspected incidents of human-to-human transmission were reported. PMID:21392439
Kosushkin, S A; Borodulina, O R; Solov'eva, E N; Grechko, V V
2008-01-01
We have isolated and characterised sequences of a SINE family specific for squamate reptiles from a genome of lacertid lizard that we called Squam1. Copies are 360-390 bp in length and share a significant similarity with tRNA gene sequence on its 5'-end. This family was also detected by us in DNA of representatives of varanids, iguanids (anolis), gekkonids, and snakes. No signs of it were found in DNA of mammals, birds, amphibians, and crocodiles. Detailed analysis of primary structure of the retroposons obtained by us from genomic libraries or GenBank sequences was carried out. Most taxa possess 2-3 subfamilies of the SINE in their genomes with specific diagnostic features in their primary structure. Individual variability of copies in different families is about 85% and is just slightly lower on the genera level. Comparison of consensus sequences on family level reveals a high degree of structural similarity with a number of specific apomorphic features which makes it a useful marker of phylogeny for this group of reptiles. Snakes do not show specific affinity to varanids when compared to other lizards, as it was suggested earlier.
Coffey, Lark L; Page, Brady L; Greninger, Alexander L; Herring, Belinda L; Russell, Richard C; Doggett, Stephen L; Haniotis, John; Wang, Chunlin; Deng, Xutao; Delwart, Eric L
2014-01-05
Viral metagenomics characterizes known and identifies unknown viruses based on sequence similarities to any previously sequenced viral genomes. A metagenomics approach was used to identify virus sequences in Australian mosquitoes causing cytopathic effects in inoculated mammalian cell cultures. Sequence comparisons revealed strains of Liao Ning virus (Reovirus, Seadornavirus), previously detected only in China, livestock-infecting Stretch Lagoon virus (Reovirus, Orbivirus), two novel dimarhabdoviruses, named Beaumont and North Creek viruses, and two novel orthobunyaviruses, named Murrumbidgee and Salt Ash viruses. The novel virus proteomes diverged by ≥ 50% relative to their closest previously genetically characterized viral relatives. Deep sequencing also generated genomes of Warrego and Wallal viruses, orbiviruses linked to kangaroo blindness, whose genomes had not been fully characterized. This study highlights viral metagenomics in concert with traditional arbovirus surveillance to characterize known and new arboviruses in field-collected mosquitoes. Follow-up epidemiological studies are required to determine whether the novel viruses infect humans. © 2013 Elsevier Inc. All rights reserved.
A microRNA detection system based on padlock probes and rolling circle amplification
Jonstrup, Søren Peter; Koch, Jørn; Kjems, Jørgen
2006-01-01
The differential expression and the regulatory roles of microRNAs (miRNAs) are being studied intensively these years. Their minute size of only 19–24 nucleotides and strong sequence similarity among related species call for enhanced methods for reliable detection and quantification. Moreover, miRNA expression is generally restricted to a limited number of specific cells within an organism and therefore requires highly sensitive detection methods. Here we present a simple and reliable miRNA detection protocol based on padlock probes and rolling circle amplification. It can be performed without specialized equipment and is capable of measuring the content of specific miRNAs in a few nanograms of total RNA. PMID:16888321
A microRNA detection system based on padlock probes and rolling circle amplification.
Jonstrup, Søren Peter; Koch, Jørn; Kjems, Jørgen
2006-09-01
The differential expression and the regulatory roles of microRNAs (miRNAs) are being studied intensively these years. Their minute size of only 19-24 nucleotides and strong sequence similarity among related species call for enhanced methods for reliable detection and quantification. Moreover, miRNA expression is generally restricted to a limited number of specific cells within an organism and therefore requires highly sensitive detection methods. Here we present a simple and reliable miRNA detection protocol based on padlock probes and rolling circle amplification. It can be performed without specialized equipment and is capable of measuring the content of specific miRNAs in a few nanograms of total RNA.
Noise-robust speech recognition through auditory feature detection and spike sequence decoding.
Schafer, Phillip B; Jin, Dezhe Z
2014-03-01
Speech recognition in noisy conditions is a major challenge for computer systems, but the human brain performs it routinely and accurately. Automatic speech recognition (ASR) systems that are inspired by neuroscience can potentially bridge the performance gap between humans and machines. We present a system for noise-robust isolated word recognition that works by decoding sequences of spikes from a population of simulated auditory feature-detecting neurons. Each neuron is trained to respond selectively to a brief spectrotemporal pattern, or feature, drawn from the simulated auditory nerve response to speech. The neural population conveys the time-dependent structure of a sound by its sequence of spikes. We compare two methods for decoding the spike sequences--one using a hidden Markov model-based recognizer, the other using a novel template-based recognition scheme. In the latter case, words are recognized by comparing their spike sequences to template sequences obtained from clean training data, using a similarity measure based on the length of the longest common sub-sequence. Using isolated spoken digits from the AURORA-2 database, we show that our combined system outperforms a state-of-the-art robust speech recognizer at low signal-to-noise ratios. Both the spike-based encoding scheme and the template-based decoding offer gains in noise robustness over traditional speech recognition methods. Our system highlights potential advantages of spike-based acoustic coding and provides a biologically motivated framework for robust ASR development.
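The template-based decoding described above compares a spike sequence with stored templates through the length of their longest common subsequence (LCS). The sketch below illustrates that idea only. The normalisation by the longer sequence, the integer neuron labels, and the names `lcs_length`, `lcs_similarity`, and `recognize` are assumptions made for illustration, not details taken from the paper.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """LCS length scaled to [0, 1]; this normalisation is an assumption."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def recognize(spikes, templates):
    """Return the word whose template is most similar to the spike sequence."""
    return max(templates, key=lambda word: lcs_similarity(spikes, templates[word]))


# Hypothetical data: each integer is the label of the neuron that fired.
templates = {"one": [3, 1, 4, 1, 5], "two": [2, 7, 1, 8, 2]}
noisy_spikes = [3, 1, 4, 5]  # the "one" template with one spike missing
print(recognize(noisy_spikes, templates))
```

Because a subsequence tolerates missing or inserted spikes, a measure of this kind degrades gradually when noise deletes or adds individual spikes, which is consistent with the robustness motivation given in the abstract.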
Current challenges in genome annotation through structural biology and bioinformatics.
Furnham, Nicholas; de Beer, Tjaart A P; Thornton, Janet M
2012-10-01
With the huge volume of genomic sequences being generated from high-throughput sequencing projects, the requirement for providing accurate and detailed annotations of gene products has never been greater. It is proving to be a huge challenge for computational biologists to use as much information as possible from experimental data to provide annotations for genome data of unknown function. A central component to this process is to use experimentally determined structures, which provide a means to detect homology that is not discernible from just the sequence and permit the consequences of genomic variation to be realized at the molecular level. In particular, structures also form the basis of many bioinformatics methods for improving the detailed functional annotations of enzymes in combination with similarities in sequence and chemistry. Copyright © 2012. Published by Elsevier Ltd.
[Structural organization of 5S ribosomal DNA of Rosa rugosa].
Tynkevych, Iu O; Volkov, R A
2014-01-01
In order to clarify molecular organization of the genomic region encoding 5S rRNA in diploid species Rosa rugosa several 5S rDNA repeated units were cloned and sequenced. Analysis of the obtained sequences revealed that only one length variant of 5S rDNA repeated units, which contains intact promoter elements in the intergenic spacer region (IGS) and appears to be transcriptionally active is present in the genome. Additionally, a limited number of 5S rDNA pseudogenes lacking a portion of coding sequence and the complete IGS was detected. A high level of sequence similarity (from 93.7 to 97.5%) between the IGS of major 5S rDNA variants of East Asian R. rugosa and North American R. nitida was found indicating comparatively recent divergence of these species.
Detection of porcine circovirus type 2 in pigs imported from Indonesia.
Manokaran, Gayathri; Lin, Yueh-Nuo; Soh, Moi-Lien; Lim, Elizabeth Ai-Sim; Lim, Chee-Wee; Tan, Boon-Huan
2008-11-25
We have detected the presence of porcine circovirus (PCV) type 2 in Indonesian pigs imported to Singapore for food consumption. A total of three viral isolates were identified, and to genetically characterise them further, their full genomes were sequenced. Each genome showed a typical organization of PCV type 2, with the three isolates sharing similar genome lengths of 1767 nucleotide (nt) at high nt identities of 99.8-100%, further indicating that the viral isolates were quite homogeneous. Sequence analysis further revealed that the ORF2 genes contain the nt sequence CCCCGC (from nt position 262 to 267) that was previously reported to be associated with PCV type 2, group 1C. The phylogenetic tree was constructed for the ORF2 genes, and the PCV type 2 isolates distributed into two distinctive groups. The Indonesian PCV type 2 clustered tightly with one China isolate, accession number AY035820, as a sub-cluster in group 1C. The sequence and phylogenetic analyses both confirmed that the three Indonesian PCV type 2 isolates belong to group 1C, and that the genetic changes for the three Indonesian isolates were very stable, possibly due to the low-scale evolution.
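The group assignment above rests partly on finding a fixed motif at a fixed position in the ORF2 gene. A check of that kind can be sketched as follows, assuming 1-based inclusive nucleotide coordinates; the helper name `has_motif_at` is hypothetical and the sequence is a synthetic placeholder, not a real PCV type 2 ORF2 sequence.

```python
def has_motif_at(sequence, motif, start, end):
    """Check for a motif at 1-based, inclusive positions start..end."""
    if end - start + 1 != len(motif):
        raise ValueError("coordinate span does not match motif length")
    return sequence[start - 1:end].upper() == motif.upper()


# Synthetic placeholder: 261 filler bases, the motif, then more filler.
toy_orf2 = "A" * 261 + "CCCCGC" + "T" * 100

print(has_motif_at(toy_orf2, "CCCCGC", 262, 267))  # motif present at 262-267
print(has_motif_at(toy_orf2, "CCCCGC", 1, 6))      # absent at the start
```

Positions 262 to 267 span six nucleotides, which matches the six-base motif quoted in the abstract.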
Gleave, A P; Taylor, R K; Morris, B A; Greenwood, D R
1995-09-15
Janthinobacterium lividum secretes a major 56-kDa chitinase and a minor 69-kDa chitinase. A chitinase gene was defined on a 3-kb fragment of clone pRKT10, by virtue of fluorescent colonies in the presence of 4-methylumbelliferyl-beta-D-N,N',N"-chitotrioside. Nucleotide sequencing revealed a 1998-bp open reading frame with the potential to encode a 69,716-Da protein with amino acid sequences similar to those in other chitinases, suggesting it encodes the minor chitinase (Chi69). Chitinase activity of Escherichia coli (pRKT10) lysates was detected mainly in the periplasmic fraction and immunoblotting detected a 70-kDa protein in this fraction. Chi69 has an N-terminal secretory leader peptide preceding two probable chitin-binding domains and a catalytic domain. These functional domains are separated by linker regions of proline-threonine repeats. Amino acid sequencing of cyanogen bromide cleavage-derived peptides from the major 56-kDa chitinase suggested that Chi69 may be a precursor of Chi56. In addition, an N-terminally truncated version of Chi69 retained chitinase activity, as expected if in vivo processing of Chi69 generates Chi56.
Congruence analysis of point clouds from unstable stereo image sequences
NASA Astrophysics Data System (ADS)
Jepping, C.; Bethmann, F.; Luhmann, T.
2014-06-01
This paper deals with the correction of exterior orientation parameters of stereo image sequences over deformed free-form surfaces without control points. Such imaging situation can occur, for example, during photogrammetric car crash test recordings where onboard high-speed stereo cameras are used to measure 3D surfaces. As a result of such measurements 3D point clouds of deformed surfaces are generated for a complete stereo sequence. The first objective of this research focusses on the development and investigation of methods for the detection of corresponding spatial and temporal tie points within the stereo image sequences (by stereo image matching and 3D point tracking) that are robust enough for a reliable handling of occlusions and other disturbances that may occur. The second objective of this research is the analysis of object deformations in order to detect stable areas (congruence analysis). For this purpose a RANSAC-based method for congruence analysis has been developed. This process is based on the sequential transformation of randomly selected point groups from one epoch to another by using a 3D similarity transformation. The paper gives a detailed description of the congruence analysis. The approach has been tested successfully on synthetic and real image data.
Viau, Roberto A; Hujer, Andrea M; Marshall, Steven H; Perez, Federico; Hujer, Kristine M; Briceño, David F; Dul, Michael; Jacobs, Michael R; Grossberg, Richard; Toltzis, Philip; Bonomo, Robert A
2012-05-01
Klebsiella pneumoniae isolates harboring the K. pneumoniae carbapenemase gene (bla(KPC)) are creating a significant healthcare threat in both acute and long-term care facilities (LTCFs). As part of a study conducted in 2004 to determine the risk of stool colonization with extended-spectrum cephalosporin-resistant gram-negative bacteria, 12 isolates of K. pneumoniae that exhibited nonsusceptibility to extended-spectrum cephalosporins were detected. All were gastrointestinal carriage isolates that were not associated with infection. Reassessment of the carbapenem minimum inhibitory concentrations using revised 2011 Clinical Laboratory Standards Institute breakpoints uncovered carbapenem resistance. To further investigate, a DNA microarray assay, PCR-sequencing of bla genes, immunoblotting, repetitive-sequence-based PCR (rep-PCR) and multilocus sequence typing (MLST) were performed. The DNA microarray detected bla(KPC) in all 12 isolates, and bla(KPC-3) was identified by PCR amplification and sequencing of the amplicon. In addition, a bla(SHV-11) gene was detected in all isolates. Immunoblotting revealed "low-level" production of the K. pneumoniae carbapenemase, and rep-PCR indicated that all bla(KPC-3)-positive K. pneumoniae strains were genetically related (≥98% similar). According to MLST, all isolates belonged to sequence type 36. This sequence type has not been previously linked with bla(KPC) carriage. Plasmids from 3 representative isolates readily transferred the bla(KPC-3) to Escherichia coli J-53 recipients. Our findings reveal the "silent" dissemination of bla(KPC-3) as part of Tn4401b on a mobile plasmid in Northeast Ohio nearly a decade ago and establish the first report, to our knowledge, of K. pneumoniae containing bla(KPC-3) in an LTCF caring for neurologically impaired children and young adults.
Siebert, Stefan; Robinson, Mark D; Tintori, Sophia C; Goetz, Freya; Helm, Rebecca R; Smith, Stephen A; Shaner, Nathan; Haddock, Steven H D; Dunn, Casey W
2011-01-01
We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. 
Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through workflow choice and deeper reference sequencing.
Siebert, Stefan; Robinson, Mark D.; Tintori, Sophia C.; Goetz, Freya; Helm, Rebecca R.; Smith, Stephen A.; Shaner, Nathan; Haddock, Steven H. D.; Dunn, Casey W.
2011-01-01
We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through workflow choice and deeper reference sequencing. PMID:21829563
Microbial detection with low molecular weight RNA.
Kourentzi, K D; Fox, G E; Willson, R C
2001-12-01
The need to monitor microorganisms in the environment has increased interest in assays based on hybridization probes that target nucleic acids (e.g., rRNA). We report the development of liquid-phase assays for specific bacterial 5S rRNA sequences or similarly sized artificial RNAs (aRNAs) using molecular beacon technology. These beacons fluoresce only in the presence of specific target sequences, rendering as much as a 27-fold fluorescence enhancement. The assays can be used with both crude cell lysates and purified total RNA preparations. Minimal sample preparation (e.g., heating to promote leakage from cells) is sufficient to detect many Gram-negative bacteria. Using this approach it was possible to detect an aRNA-labeled Escherichia coli strain in the presence of a large background of an otherwise identical E. coli strain. Finally, by using a longer wavelength carboxytetramethylrhodamine beacon it was possible to reduce the fraction of the signal due to cellular autofluorescence to below 0.5%.
Radiation resistance of biological reagents for in situ life detection.
Carr, Christopher E; Rowedder, Holli; Vafadari, Cyrus; Lui, Clarissa S; Cascio, Ethan; Zuber, Maria T; Ruvkun, Gary
2013-01-01
Life on Mars, if it exists, may share a common ancestry with life on Earth derived from meteoritic transfer of microbes between the planets. One means to test this hypothesis is to isolate, detect, and sequence nucleic acids in situ on Mars, then search for similarities to known common features of life on Earth. Such an instrument would require biological and chemical components, such as polymerase and fluorescent dye molecules. We show that reagents necessary for detection and sequencing of DNA survive several analogues of the radiation expected during a 2-year mission to Mars, including proton (H-1), heavy ion (Fe-56, O-18), and neutron bombardment. Some reagents have reduced performance or fail at higher doses. Overall, our findings suggest it is feasible to utilize space instruments with biological components, particularly for mission durations of up to several years in environments without large accumulations of charged particles, such as the surface of Mars, and have implications for the meteoritic transfer of microbes between planets.
Ai, Jing-Wen; Li, Yang; Cheng, Qi; Cui, Peng; Wu, Hong-Long; Xu, Bin; Zhang, Wen-Hong
2018-06-01
A 45-year-old man who complained of continuous fever and multiple hepatic masses was admitted to our hospital. Repeated MRI manifestations were similar, yet the radiological reports suggested contradictory diagnoses, pointing to either infection or malignancy. Pathologic examination of the liver tissue showed no direct evidence of either infection or tumor. We performed next-generation sequencing on the liver tissue and peripheral blood to further investigate the possible etiology. High-throughput sequencing was performed on the liver lesion tissue using the BGISEQ-100 platform, and the data were mapped to the Microbial Genome Databases after filtering out low-quality data and human reads. We identified a total of 299 sequencing reads of Mycobacterium tuberculosis (M. tuberculosis) complex sequences from the liver tissue, including 8,229 of 4,424,435 of the M. tuberculosis nucleotide sequences, and Mycobacterium africanum, Mycobacterium bovis, and Mycobacterium canettii were also detected due to the 99.9% sequence identity among these strains. No specific Mycobacterium tuberculosis nucleotide sequence was detected in the peripheral blood sample. The patient's symptoms resolved quickly after anti-tuberculosis treatment, and repeated Ziehl-Neelsen staining of the liver tissue finally identified small numbers of positive bacilli. The diagnosis was difficult to establish before next-generation sequencing because of contradictory radiological results and negative pathological findings. More sensitive diagnostic methods are urgently needed. This is the first case report of hepatic tuberculosis confirmed by next-generation sequencing, and it highlights the promising potential of next-generation sequencing in the diagnosis of hepatic lesions of unknown etiology. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Endosymbiotic Microbiota of the Bamboo Pseudococcid Antonina crawii (Insecta, Homoptera)
Fukatsu, Takema; Nikoh, Naruo
2000-01-01
We characterized the intracellular symbiotic microbiota of the bamboo pseudococcid Antonina crawii by performing a molecular phylogenetic analysis in combination with in situ hybridization. Almost the entire length of the bacterial 16S rRNA gene was amplified and cloned from A. crawii whole DNA. Restriction fragment length polymorphism analysis revealed that the clones obtained included three distinct types of sequences. Nucleotide sequences of the three types were determined and subjected to a molecular phylogenetic analysis. The first sequence was a member of the γ subdivision of the division Proteobacteria (γ-Proteobacteria) to which no sequences in the database were closely related, although the sequences of endosymbionts of other homopterans, such as psyllids and aphids, were distantly related. The second sequence was a β-Proteobacteria sequence and formed a monophyletic group with the sequences of endosymbionts from other pseudococcids. The third sequence exhibited a high level of similarity to sequences of Spiroplasma spp. from ladybird beetles and a tick. Localization of the endosymbionts was determined by using tissue sections of A. crawii and in situ hybridization with specific oligonucleotide probes. The γ- and β-Proteobacteria symbionts were packed in the cytoplasm of the same mycetocytes (or bacteriocytes) and formed a large mycetome (or bacteriome) in the abdomen. The spiroplasma symbionts were also present intracellularly in various tissues at a low density. We observed that the anterior poles of developing eggs in the ovaries were infected by the γ- and β-Proteobacteria symbionts in a systematic way, which ensured vertical transmission. Five representative pseudococcids were examined by performing diagnostic PCR experiments with specific primers; the β-Proteobacteria symbiont was detected in all five pseudococcids, the γ-Proteobacteria symbiont was found in three, and the spiroplasma symbiont was detected only in A. crawii. PMID:10653730
Gucciardo, Sébastian; Wisniewski, Jean-Pierre; Brewin, Nicholas J; Bornemann, Stephen
2007-01-01
The cDNAs encoding three germin-like proteins (PsGER1, PsGER2a, and PsGER2b) were isolated from Pisum sativum. The coding sequence of PsGER1 transiently expressed in tobacco leaves gave a protein with superoxide dismutase activity but no detectable oxalate oxidase activity according to in-gel activity stains. The transient expression of wheat germin gf-2.8 oxalate oxidase showed oxalate oxidase but no superoxide dismutase activity under the same conditions. The superoxide dismutase activity of PsGER1 was resistant to high temperature, denaturation by detergent, and high concentrations of hydrogen peroxide. In salt-stressed pea roots, a heat-resistant superoxide dismutase activity was observed with an electrophoretic mobility similar to that of the PsGER1 protein, but this activity was below the detection limit in non-stressed or H₂O₂-stressed pea roots. Oxalate oxidase activity was not detected in either pea roots or nodules. Following in situ hybridization in developing pea nodules, PsGER1 transcript was detected in expanding cells just proximal to the meristematic zone and also in the epidermis, but to a lesser extent. PsGER1 is the first known germin-like protein with superoxide dismutase activity to be associated with nodules. It shared protein sequence identity with the N-terminal sequence of a putative plant receptor for rhicadhesin, a bacterial attachment protein. However, its primary location in nodules suggests functional roles other than as a rhicadhesin receptor required for the first stage of bacterial attachment to root hairs.
Martin, Lauren; Damaso, Natalie; Mills, DeEtta
2016-10-01
Molecular methods for the detection of mammalian coat color phenotypes have expanded greatly within the past decade. Many phenotypes are associated with a single nucleotide polymorphism (SNP) in the genetic sequence. Traditionally, these mutations are detected through sequencing, hybridization assays or mini-sequencing. However, these techniques can be expensive and tedious. Previously, capillary electrophoresis-single strand conformation polymorphism (CE-SSCP) analysis using the F-108 polymer was able to distinguish SNPs for the melanocortin-1 receptor (mc1r) coat color gene in horses (Equus caballus) that differed by one nucleotide substitution. The objective of this study was to expand the detection of coat color SNPs in horses. The genes for solute carrier family 45 member 2 (slc45a2/matp), type III receptor protein-tyrosine kinase (kit) and mc1r were analyzed using CE-SSCP with the F-108 polymer, and the results were compared to mini-sequencing with the SNaPshot™ kit. The F-108 polymer reproducibly resolved homozygous and heterozygous individuals for the mc1r and kit markers, but was unable to resolve heterozygous individuals for slc45a2 at 38°C. The need for temperatures <15°C, the SNP position being close to the 5'-end, and conformational structures/free energy with similar values resulted in the inability to resolve the secondary structures. Despite this limitation, the CE-SSCP method could be used to provide a rapid phenotypic description for equine forensic investigations. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Hide, Geoff; Hughes, Jacqueline M; McNuff, Robert
2003-01-01
Background: The rapid expansion in the availability of genome and DNA sequence information has opened up new possibilities for the development of methods for detecting free-living protozoa in environmental samples. The protozoan Blepharisma japonicum was used to investigate a rapid and simple detection system based on polymerase chain reaction (PCR) amplification from organisms immobilised on FTA paper. Results: Using primers designed from the α-tubulin genes of Blepharisma, specific and sensitive detection to the equivalent of a single Blepharisma cell could be achieved. Similar detection levels were found using water samples, containing Blepharisma, which were dried onto Whatman FTA paper. Conclusion: This system has potential as a sensitive, convenient detection system for Blepharisma and could be applied to other protozoan organisms. PMID:14516472
Improved localisation for 2-hydroxyglutarate detection at 3T using long-TE semi-LASER
Berrington, Adam; Voets, Natalie L.; Plaha, Puneet; Larkin, Sarah J.; Mccullagh, James; Stacey, Richard; Yildirim, Muhammed; Schofield, Christopher J.; Jezzard, Peter; Cadoux-Hudson, Tom; Ansorge, Olaf; Emir, Uzay E.
2016-01-01
2-hydroxyglutarate (2-HG) has emerged as a biomarker of tumour cell IDH mutations that may enable the differential diagnosis of glioma patients. At 3 Tesla, detection of 2-HG with magnetic resonance spectroscopy is challenging because of metabolite signal overlap and a spectral pattern modulated by slice selection and chemical shift displacement. Using density matrix simulations and phantom experiments, an optimised semi-LASER scheme (TE = 110 ms) improves localisation of the 2-HG spin system considerably compared to an existing PRESS sequence. This results in a visible 2-HG peak in the in vivo spectra at 1.9 ppm in the majority of IDH-mutated tumours. Detected concentrations of 2-HG were similar using both sequences, although the use of semi-LASER generated narrower confidence intervals. Signal overlap with glutamate and glutamine, as measured by pairwise fitting correlation, was reduced. Lactate was readily detectable across glioma patients using the method presented here (mean CRLB: (10±2)%). Together with more robust 2-HG detection, long-TE semi-LASER offers the potential to investigate tumour metabolism and stratify patients in vivo at 3T. PMID:27547821
Torchetti, Enza Maria; Navarro, Beatriz; Di Serio, Francesco
2012-12-01
The spread of viroids belonging to the genus Pospiviroid (family Pospiviroidae), recorded recently in ornamentals and vegetables in several European countries, calls for fast, efficient and sensitive detection methods. Based on bioinformatics analyses of sequence identity among all pospiviroids, a digoxigenin-labeled polyprobe (POSPIprobe) was developed that, when tested by dot-blot and Northern-blot hybridization, detected Potato spindle tuber viroid, Citrus exocortis viroid, Columnea latent viroid, Mexican papita viroid, Tomato planta macho viroid, Tomato apical stunt viroid, Pepper chat fruit viroid and Chrysanthemum stunt viroid. The end-point detection limits of the POSPIprobe ranged from 5⁻² to 5⁻⁴, and from 5⁻¹ to 5⁻³ for nucleic acid preparations obtained by phenol extraction and silica-capture, respectively, similar to those of single probes. Based on sequence identity, the POSPIprobe is expected to detect also the two pospiviroid species not tested in this study (Tomato chlorotic dwarf viroid and Iresine viroid-1). Dot-blot assays with the POSPIprobe were validated by testing 68 samples from tomato, chrysanthemum and argyranthemum infected by different pospiviroids as revealed by RT-PCR, thus confirming the potential of this polyprobe for quarantine, certification and survey programs. Copyright © 2012 Elsevier B.V. All rights reserved.
The property distance index PD predicts peptides that cross-react with IgE antibodies
Ivanciuc, Ovidiu; Midoro-Horiuti, Terumi; Schein, Catherine H.; Xie, Liping; Hillman, Gilbert R.; Goldblum, Randall M.; Braun, Werner
2009-01-01
Similarities in the sequence and structure of allergens can explain clinically observed cross-reactivities. Distinguishing sequences that bind IgE in patient sera can be used to identify potentially allergenic protein sequences and aid in the design of hypo-allergenic proteins. The property distance index PD, incorporated in our Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/), may identify potentially cross-reactive segments of proteins, based on their similarity to known IgE epitopes. We sought to obtain experimental validation of the PD index as a quantitative predictor of IgE cross-reactivity, by designing peptide variants with predetermined PD scores relative to three linear IgE epitopes of Jun a 1, the dominant allergen from mountain cedar pollen. For each of the three epitopes, 60 peptides were designed with increasing PD values (decreasing physicochemical similarity) to the starting sequence. The peptides synthesized on a derivatized cellulose membrane were probed with sera from patients who were allergic to Jun a 1, and the experimental data were interpreted with a PD classification method. Peptides with low PD values relative to a given epitope were more likely to bind IgE from the sera than were those with PD values larger than 6. Control sequences, with PD values between 18 and 20 to all the three epitopes, did not bind patient IgE, thus validating our procedure for identifying negative control peptides. The PD index is a statistically validated method to detect discrete regions of proteins that have a high probability of cross-reacting with IgE from allergic patients. PMID:18950868
Getachew, Yitbarek; Zakaria, Zunita; Abdul Aziz, Saleha
2013-01-01
Vancomycin-resistant enterococci (VRE) have been reported to be present in humans, chickens, and pigs in Malaysia. In the present study, representative samples of VRE isolated from these populations were examined for similarities and differences by using the multilocus sequence typing (MLST) method. Housekeeping genes of Enterococcus faecium (n = 14) and Enterococcus faecalis (n = 11) isolates were sequenced and analyzed using the MLST databases eBURST and goeBURST. We found five sequence types (STs) of E. faecium and six STs of E. faecalis existing in Malaysia. Enterococcus faecium isolates belonging to ST203, ST17, ST55, ST79, and ST29 were identified, and E. faecium ST203 was the most common among humans. The MLST profiles of E. faecium from humans in this study were similar to the globally reported nosocomial-related strain lineage belonging to clonal complex 17 (CC17). Isolates from chickens and pigs have few similarities to those from humans, except for one isolate from a chicken, which was identified as ST203. E. faecalis isolates were more diverse and were identified as ST4, ST6, ST87, ST108, ST274, and ST244, which were grouped as specific to the three hosts. E. faecalis, belonging to the high-risk CC2 and CC87, were detected among isolates from humans. In conclusion, even though one isolate from a chicken was found clonal to that of humans, the MLST analysis of E. faecium and E. faecalis supports the findings of others who suggest VRE to be predominantly host specific and that clinically important strains are found mainly among humans. The infrequent detection of a human VRE clone in a chicken may in fact suggest a reverse transmission of VRE from humans to animals. PMID:23666337
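The grouping step behind the MLST analysis in the preceding abstract can be illustrated with a minimal sketch. It assumes the common eBURST-style convention that sequence types (STs) differing at no more than one of seven housekeeping loci are linked, and that chains of such links form a group. The ST labels and allele numbers below are hypothetical toy values, not profiles from the study or from any MLST database.

```python
from itertools import combinations


def locus_differences(profile_a, profile_b):
    """Count the loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


def single_locus_variant_groups(profiles):
    """Group STs connected by chains of single-locus variants (union-find)."""
    parent = {st: st for st in profiles}

    def find(st):
        # Follow parent links to the group representative, compressing the path.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    for a, b in combinations(profiles, 2):
        if locus_differences(profiles[a], profiles[b]) <= 1:
            parent[find(a)] = find(b)

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), set()).add(st)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical seven-locus allelic profiles (toy data for illustration only).
toy_profiles = {
    "ST_A": (1, 1, 1, 1, 1, 1, 1),
    "ST_B": (1, 1, 1, 1, 1, 1, 2),   # differs from ST_A at one locus
    "ST_C": (1, 1, 1, 1, 1, 3, 2),   # differs from ST_B at one locus
    "ST_D": (5, 6, 7, 8, 9, 10, 11),  # unrelated profile
}

groups = single_locus_variant_groups(toy_profiles)
```

In this toy example ST_A, ST_B and ST_C fall into one group because each is one locus away from a neighbour, while ST_D remains a singleton.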
Conserved small mRNA with an unique, extended Shine-Dalgarno sequence
Hahn, Julia; Migur, Anzhela; von Boeselager, Raphael Freiherr; Kubatova, Nina; Kubareva, Elena; Schwalbe, Harald
2017-01-01
Up to now, very small protein-coding genes have remained unrecognized in sequenced genomes. We identified an mRNA of 165 nucleotides (nt), which is conserved in Bradyrhizobiaceae and encodes a polypeptide with 14 amino acid residues (aa). The small mRNA harboring a unique Shine-Dalgarno sequence (SD) with a length of 17 nt was localized predominantly in the ribosome-containing P100 fraction of Bradyrhizobium japonicum USDA 110. Strong interaction between the mRNA and 30S ribosomal subunits was demonstrated by their co-sedimentation in a sucrose density gradient. Using translational fusions with egfp, we detected weak translation and found that it is impeded by both the extended SD and the GTG start codon (instead of ATG). Biophysical characterization (CD- and NMR-spectroscopy) showed that the synthesized polypeptide remained unstructured in physiological buffer. Replacement of the start codon by a stop codon increased the stability of the transcript, strongly suggesting additional posttranscriptional regulation at the ribosome. Therefore, the small gene was named rreB (ribosome-regulated expression in Bradyrhizobiaceae). Assuming that the unique ribosome binding site (RBS) is a hallmark of rreB homologs or similarly regulated genes, we looked for similar putative RBS in bacterial genomes and detected regions with at least 16 nt complementarity to the 3′-end of 16S rRNA upstream of sORFs in Caulobacterales, Rhizobiales, Rhodobacterales and Rhodospirillales. In the Rhodobacter/Roseobacter lineage of α-proteobacteria the corresponding gene (rreR) is conserved and encodes an 18 aa protein. This shows how specific RBS features can be used to identify new genes with presumably similar control of expression at the RNA level. PMID:27834614
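The search for regions complementary to the 3′-end of 16S rRNA, mentioned in the preceding abstract, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it finds the longest stretch of an mRNA region that forms uninterrupted Watson-Crick pairs with an rRNA tail, ignoring G-U wobble pairs. The example tail is the well-known E. coli anti-Shine-Dalgarno end, used here only as an assumption for illustration, and the mRNA fragment is hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (alphabet A, C, G, U)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def longest_complementary_run(mrna, rrna_tail):
    """Longest mRNA substring that pairs contiguously with the rRNA tail.

    Antiparallel pairing means the mRNA stretch must equal a substring of the
    reverse complement of the tail, so this reduces to a longest-common-substring
    search done with a rolling dynamic-programming row.
    """
    target = reverse_complement(rrna_tail)
    best_len, best_end = 0, 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(mrna) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if mrna[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return mrna[best_end - best_len:best_end]


# Illustrative inputs: E. coli-like 16S rRNA 3' tail and a hypothetical mRNA fragment.
rrna_tail = "ACCUCCUUA"
mrna_fragment = "GCUAAGGAGGUAUCAUG"
sd_like = longest_complementary_run(mrna_fragment, rrna_tail)
```

A genome-wide screen of the kind described would apply such a function to the region upstream of each candidate start codon and keep hits at or above a chosen length, for example the 16 nt threshold stated in the abstract.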
Molecular characterization of Hepatozoon spp. infection in endangered Indian wild felids and canids.
Pawar, Rahul Mohanchandra; Poornachandar, Anantula; Srinivas, Pasham; Rao, Kancharapu Ramachandra; Lakshmikantan, Uthandaraman; Shivaji, Sisinthy
2012-05-25
Hepatozoon species are parasites that infect a wide variety of domestic and wild animals. The objective of this study was to perform the molecular detection and characterization of Hepatozoon spp. in Asiatic lion, Indian tiger, Indian leopard, Indian wild dog, Indian domestic dog and cat based on partial 18S rRNA gene sequences from Hepatozoon spp. in the naturally infected animals. Hepatozoon spp. could be detected by PCR in blood samples of 5 out of 9 Asiatic lions, 2 out of 5 Indian tigers, 2 out of 4 Indian leopards, 2 out of 2 Indian wild dogs, 2 out of 4 domestic cats and 2 out of 3 domestic dogs. Sequencing of PCR amplicons and BLAST analysis of partial 18S rRNA gene sequences indicated that the Hepatozoon spp. in Asiatic lion, Bengal tiger, Indian leopard and domestic cat was Hepatozoon felis (98-99% similarity) and in the Indian wild and domestic dog the phylogenetic neighbour was Hepatozoon canis (97-100% similarity). Presence of H. felis and H. canis in both domestic and wild animals suggested that they are not host specific and the same parasite causes infection in domestic and wild felids and canids in India and from different parts of the world. To our knowledge, this is the first report on detection and molecular characterization of H. felis infection in Asiatic lions, Indian tigers, Indian leopards and H. canis in Indian wild dog. Hepatozoon spp. may be a potential pathogen and an opportunistic parasite in immuno-compromised animals and could thus represent a threat to endangered Indian wild felids and canids. Copyright © 2011 Elsevier B.V. All rights reserved.
Krisinger, J; Jeung, E B; Simmen, R C; Leung, P C
1995-01-01
The expression of Calbindin-D9k (CaBP-9k) in the pig uterus and placenta was measured by Northern blot analysis and reverse transcription polymerase chain reaction (PCR), respectively. Progesterone (P4) administration to ovariectomized pigs decreased CaBP-9k mRNA levels. Expression of endometrial CaBP-9k mRNA was high on pregnancy Days 10-12 and below the detection limit on Days 15 and 18. On Day 60, expression could be detected at low levels. In myometrium and placenta, CaBP-9k mRNA expression was not detectable by Northern analysis using total RNA. Reverse-transcribed RNA from both tissues demonstrated the presence of CaBP-9k transcripts by means of PCR. The partial CaBP-9k gene was amplified by PCR and cloned to determine the sequence of intron A. In contrast to the rat CaBP-9k gene, the pig gene does not contain a functional estrogen response element (ERE) within this region. A similar ERE-like sequence located at the identical location was examined by gel retardation analysis and failed to bind the estradiol receptor. A similar disruption of this ERE-like sequence has been described in the human CaBP-9k gene, which is not expressed at any level in placenta, myometrium, or endometrium. It is concluded that the pig CaBP-9k gene is regulated in these reproductive tissues in a manner distinct from that in rat and human tissues. The regulation is probably due to a regulatory region outside of intron A, which in the rat gene contains the key cis element for uterine expression of the CaBP-9k gene.
Xiao, Jian; Cao, Hongyuan; Chen, Jun
2017-09-15
Next generation sequencing technologies have enabled the study of the human microbiome through direct sequencing of microbial DNA, resulting in an enormous amount of microbiome sequencing data. One unique characteristic of microbiome data is the phylogenetic tree that relates all the bacterial species. Closely related bacterial species have a tendency to exhibit a similar relationship with the environment or disease. Thus, incorporating the phylogenetic tree information can potentially improve the detection power for microbiome-wide association studies, where hundreds or thousands of tests are conducted simultaneously to identify bacterial species associated with a phenotype of interest. Despite much progress in multiple testing procedures such as false discovery rate (FDR) control, methods that take into account the phylogenetic tree are largely limited. We propose a new FDR control procedure that incorporates the prior structure information and apply it to microbiome data. The proposed procedure is based on a hierarchical model, where a structure-based prior distribution is designed to utilize the phylogenetic tree. By borrowing information from neighboring bacterial species, we are able to improve the statistical power of detecting associated bacterial species while controlling the FDR at desired levels. When the phylogenetic tree is mis-specified or non-informative, our procedure achieves power similar to that of traditional procedures that do not take into account the tree structure. We demonstrate the performance of our method through extensive simulations and real microbiome datasets. We identified far more alcohol-drinking associated bacterial species than traditional methods. R package StructFDR is available from CRAN. Contact: chen.jun2@mayo.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.
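For orientation, the kind of traditional, tree-agnostic FDR procedure that the preceding abstract uses as a baseline can be sketched with the Benjamini-Hochberg step-up rule. This is an assumption about what "traditional procedures" may include; it is not the structure-based method of the paper and does not reproduce the StructFDR package. The p-values in the example are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Step-up rule: sort the p-values, find the largest rank k (1-based) with
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Made-up p-values, one per tested taxon (illustration only).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
toy_calls = benjamini_hochberg(toy_p, alpha=0.05)
```

Each test is treated independently of its neighbours here, which is exactly the information a tree-based prior is meant to add.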
Application of Broad-Spectrum Resequencing Microarray for Genotyping Rhabdoviruses
Dacheux, Laurent; Berthet, Nicolas; Dissard, Gabriel; Holmes, Edward C.; Delmas, Olivier; Larrous, Florence; Guigon, Ghislaine; Dickinson, Philip; Faye, Ousmane; Sall, Amadou A.; Old, Iain G.; Kong, Katherine; Kennedy, Giulia C.; Manuguerra, Jean-Claude; Cole, Stewart T.; Caro, Valérie; Gessain, Antoine; Bourhy, Hervé
2010-01-01
The rapid and accurate identification of pathogens is critical in the control of infectious disease. To this end, we analyzed the capacity for viral detection and identification of a newly described high-density resequencing microarray (RMA), termed PathogenID, which was designed for multiple pathogen detection using database similarity searching. We focused on one of the largest and most diverse viral families described to date, the family Rhabdoviridae. We demonstrate that this approach has the potential to identify both known and related viruses for which precise sequence information is unavailable. In particular, we demonstrate that a strategy based on consensus sequence determination for analysis of RMA output data enabled successful detection of viruses exhibiting up to 26% nucleotide divergence with the closest sequence tiled on the array. Using clinical specimens obtained from rabid patients and animals, this method also shows a high species level concordance with standard reference assays, indicating that it is amenable for the development of diagnostic assays. Finally, 12 animal rhabdoviruses which were currently unclassified, unassigned, or assigned as tentative species within the family Rhabdoviridae were successfully detected. These new data allowed an unprecedented phylogenetic analysis of 106 rhabdoviruses and further suggest that the principles and methodology developed here may be used for the broad-spectrum surveillance and the broader-scale investigation of biodiversity in the viral world. PMID:20610710
Han, Lin; Wu, Hua-Jun; Zhu, Haiying; Kim, Kun-Yong; Marjani, Sadie L.; Riester, Markus; Euskirchen, Ghia; Zi, Xiaoyuan; Yang, Jennifer; Han, Jasper; Snyder, Michael; Park, In-Hyun; Irizarry, Rafael; Weissman, Sherman M.
2017-01-01
Conventional DNA bisulfite sequencing has been extended to single cell level, but the coverage consistency is insufficient for parallel comparison. Here we report a novel method for genome-wide CpG island (CGI) methylation sequencing for single cells (scCGI-seq), combining methylation-sensitive restriction enzyme digestion and multiple displacement amplification for selective detection of methylated CGIs. We applied this method to analyzing single cells from two types of hematopoietic cells, K562 and GM12878, and small populations of fibroblasts and induced pluripotent stem cells. The method detected 21 798 CGIs (76% of all CGIs) per cell, and the number of CGIs consistently detected from all 16 profiled single cells was 20 864 (72.7%), with 12 961 promoters covered. This coverage represents a substantial improvement over results obtained using single cell reduced representation bisulfite sequencing, with a 66-fold increase in the fraction of consistently profiled CGIs across individual cells. Single cells of the same type were more similar to each other than to other types, but also displayed epigenetic heterogeneity. The method was further validated by comparing the CpG methylation pattern, methylation profile of CGIs/promoters and repeat regions and 41 classes of known regulatory markers to the ENCODE data. Although not every minor methylation difference between cells is detectable, scCGI-seq provides a solid tool for unsupervised stratification of a heterogeneous cell population. PMID:28126923
Takeda, Itaru; Umemura, Myco; Koike, Hideaki; Asai, Kiyoshi; Machida, Masayuki
2014-08-01
Despite their biological importance, a significant number of genes for secondary metabolite biosynthesis (SMB) remain undetected due largely to the fact that they are highly diverse and are not expressed under a variety of cultivation conditions. Several software tools including SMURF and antiSMASH have been developed to predict fungal SMB gene clusters by finding core genes encoding polyketide synthase, nonribosomal peptide synthetase and dimethylallyltryptophan synthase as well as several others typically present in the cluster. In this work, we have devised a novel comparative genomics method to identify SMB gene clusters that is independent of motif information of the known SMB genes. The method detects SMB gene clusters by searching for a similar order of genes and their presence in nonsyntenic blocks. With this method, we were able to identify many known SMB gene clusters with the core genes in the genomic sequences of 10 filamentous fungi. Furthermore, we have also detected SMB gene clusters without core genes, including the kojic acid biosynthesis gene cluster of Aspergillus oryzae. By varying the detection parameters of the method, a significant difference in the sequence characteristics was detected between the genes residing inside the clusters and those outside the clusters. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Human Coronavirus NL63 Molecular Epidemiology and Evolutionary Patterns in Rural Coastal Kenya.
Kiyuka, Patience K; Agoti, Charles N; Munywoki, Patrick K; Njeru, Regina; Bett, Anne; Otieno, James R; Otieno, Grieven P; Kamau, Everlyn; Clark, Taane G; van der Hoek, Lia; Kellam, Paul; Nokes, D James; Cotten, Matthew
2018-05-05
Human coronavirus NL63 (HCoV-NL63) is a globally endemic pathogen causing mild and severe respiratory tract infections with reinfections occurring repeatedly throughout a lifetime. Nasal samples were collected in coastal Kenya through community-based and hospital-based surveillance. HCoV-NL63 was detected with multiplex real-time reverse transcription PCR, and positive samples were targeted for nucleotide sequencing of the spike (S) protein. Additionally, paired samples from 25 individuals with evidence of repeat HCoV-NL63 infection were selected for whole-genome virus sequencing. HCoV-NL63 was detected in 1.3% (75/5573) of child pneumonia admissions. Two HCoV-NL63 genotypes circulated in Kilifi between 2008 and 2014. Full genome sequences formed a monophyletic clade closely related to contemporary HCoV-NL63 from other global locations. An unexpected pattern of repeat infections was observed with some individuals showing higher viral titers during their second infection. Similar patterns for 2 other endemic coronaviruses, HCoV-229E and HCoV-OC43, were observed. Repeat infections by HCoV-NL63 were not accompanied by detectable genotype switching. In this coastal Kenya setting, HCoV-NL63 exhibited low prevalence in hospital pediatric pneumonia admissions. Clade persistence with low genetic diversity suggests limited immune selection, and absence of detectable clade switching in reinfections indicates initial exposure was insufficient to elicit a protective immune response.
Molecular detection of Toxoplasma gondii in snakes.
Nasiri, Vahid; Teymurzadeh, Shohreh; Karimi, Gholamreza; Nasiri, Mehdi
2016-10-01
Toxoplasma gondii, an obligate intracellular protozoan parasite, is responsible for one of the most common zoonotic parasitic diseases in almost all warm-blooded vertebrates worldwide, and it is estimated that about one-third of the world's human population is chronically infected with this parasite. Little is known about the circulation of T. gondii in snakes, and this study aimed, for the first time, to evaluate the rate of infection of snakes with this parasite using PCR methods. The brains of 68 snakes that were collected between May 2012 and September 2015 and died in captivity, where they were kept for venom collection, were examined for the presence of this parasite. DNA was extracted and nested PCR was carried out with two pairs of primers to detect the 344 bp fragment of the T. gondii GRA6 gene. Five positive nested-PCR products were directly sequenced in the forward and reverse directions by Sequetech Company (Mountain View, CA). The T. gondii GRA6 gene was detected in 55 (80.88%) of 68 snake brains. Sequencing of the GRA6 gene revealed 98-100% similarity with T. gondii sequences deposited in GenBank. To our knowledge, this is the first study of molecular detection of T. gondii in snakes, and our findings show a high frequency of this organism among them. Copyright © 2016 Elsevier Inc. All rights reserved.
Winata, Patrick; Williams, Marissa; McGowan, Eileen; Nassif, Najah; van Zandwijk, Nico; Reid, Glen
2017-11-17
MicroRNAs are frequently downregulated in cancer, and restoring expression has tumour suppressive activity in tumour cells. Our recent phase I clinical trial investigated microRNA-based therapy in patients with malignant pleural mesothelioma. Treatment with TargomiRs, microRNA mimics with novel sequence packaged in EGFR antibody-targeted bacterial minicells, revealed clear signs of clinical activity. In order to detect delivery of microRNA mimics to tumour cells in future clinical trials, we tested hydrolysis probe-based assays specific for the sequence of the novel mimics in transfected mesothelioma cell lines using RT-qPCR. The custom assays efficiently and specifically amplified the consensus mimics. However, we found that these assays gave a signal when total RNA from untransfected and control mimic-transfected cells was used as template. Further investigation revealed that the reverse transcription step using stem-loop primers appeared to introduce substantial non-specific amplification with either total RNA or synthetic RNA templates. This suggests that reverse transcription using stem-loop primers suffers from an intrinsic lack of specificity for the detection of highly similar microRNAs in the same family, especially when analysing total RNA. These results suggest that RT-qPCR is unlikely to be an effective means to detect delivery of microRNA mimic-based drugs to tumour cells in patients.
Benetka, V; Leschnik, M; Affenzeller, N; Möstl, K
2011-04-09
Austrian field cases of canine distemper (14 dogs, one badger [Meles meles] and one stone marten [Martes foina]) from 2002 to 2007 were investigated and the case histories were summarised briefly. Phylogenetic analysis of fusion (F) and haemagglutinin (H) gene sequences revealed different canine distemper virus (CDV) lineages circulating in Austria. The majority of CDV strains detected from 2002 to 2004 were well embedded in the European lineage. One Austrian canine sample detected in 2003, with a high similarity to Hungarian sequences from 2005 to 2006, could be assigned to the Arctic group (phocine distemper virus type 2-like). The two canine sequences from 2007 formed a clearly distinct group flanked by sequences detected previously in China and the USA on an intermediate position between the European wildlife and the Asia-1 cluster. The Austrian wildlife strains (2006 and 2007) could be assigned to the European wildlife group and were most closely related to, yet clearly different from, the 2007 canine samples. To elucidate the epidemiological role of Austrian wildlife in the transmission of the disease to dogs and vice versa, H protein residues related to receptor and host specificity (residues 530 and 549) were analysed. All samples showed the amino acids expected for their host of origin, with the exception of a canine sequence from 2007, which had an intermediate position between wildlife and canine viral strains. In the period investigated, canine strains circulating in Austria could be assigned to four different lineages reflecting both a high diversity and probably different origins of virus introduction to Austria in different years.
Soares, René Arderius; Passaglia, Luciane Maria Pereira
2010-10-01
Bradyrhizobium elkanii is successfully used in the formulation of commercial inoculants and, together with B. japonicum, it fully supplies the plant's nitrogen demands. Despite the similarity between the B. japonicum and B. elkanii species, several studies have demonstrated genetic and physiological differences between them. In this work, Representational Difference Analysis (RDA) was used for genomic comparison between B. elkanii SEMIA 587, a crop inoculant strain, and B. japonicum USDA 110, a reference strain. Two hundred sequences were obtained. Of these, 46 sequences belonged exclusively to the genome of the B. elkanii strain, and 154 showed similarity to sequences from the B. japonicum genome. Of the 46 sequences with no similarity to sequences from B. japonicum, 39 showed no similarity to sequences in public databases and seven showed similarity to sequences of genes coding for known proteins. These seven sequences were divided into three groups: similar to sequences from other Bradyrhizobium strains, similar to sequences from other nitrogen-fixing bacteria, and similar to sequences from non-nitrogen-fixing bacteria. These new sequences could be used as DNA markers in order to investigate the rates of genetic material gain and loss in natural Bradyrhizobium strains.
Zelinsky, G J
2001-02-01
Search, memory, and strategy constraints on change detection were analyzed in terms of oculomotor variables. Observers viewed a repeating sequence of three displays (Scene 1 → Mask → Scene 2 → Mask ...) and indicated the presence or absence of a changing object between Scenes 1 and 2. Scenes depicted real-world objects arranged on a surface. Manipulations included set size (one, three, or nine items) and the orientation of the changing objects (similar or different). Eye movements increased with the number of potentially changing objects in the scene, with this set size effect suggesting a relationship between change detection and search. A preferential fixation analysis determined that memory constraints are better described by the operation comparing the pre- and postchange objects than as a capacity limitation, and a scanpath analysis revealed a change detection strategy relying on the peripheral encoding and comparison of display items. These findings support a signal-in-noise interpretation of change detection in which the signal varies with the similarity of the changing objects and the noise is determined by the distractor objects and scene background.
Genetic diversity of merozoite surface antigens in Babesia bovis detected from Sri Lankan cattle.
Sivakumar, Thillaiampalam; Okubo, Kazuhiro; Igarashi, Ikuo; de Silva, Weligodage Kumarawansa; Kothalawala, Hemal; Silva, Seekkuge Susil Priyantha; Vimalakumar, Singarayar Caniciyas; Meewewa, Asela Sanjeewa; Yokoyama, Naoaki
2013-10-01
Babesia bovis, the causative agent of severe bovine babesiosis, is endemic in Sri Lanka. The live attenuated vaccine (K-strain), which was introduced in the early 1990s, has been used to immunize cattle populations in endemic areas of the country. The present study was undertaken to determine the genetic diversity of merozoite surface antigens (MSAs) in B. bovis isolates from Sri Lankan cattle, and to compare the gene sequences obtained from such isolates against those of the K-strain. Forty-four bovine blood samples isolated from different geographical regions of Sri Lanka and judged to be B. bovis-positive by PCR screening were used to amplify MSAs (MSA-1, MSA-2c, MSA-2a1, MSA-2a2, and MSA-2b), AMA-1, and 12D3 genes from parasite DNA. Although the AMA-1 and 12D3 gene sequences were highly conserved among the Sri Lankan isolates, the MSA gene sequences from the same isolates were highly diverse. Sri Lankan MSA-1, MSA-2c, MSA-2a1, MSA-2a2, and MSA-2b sequences clustered within 5, 2, 4, 1, and 9 different clades in the gene phylograms, respectively, while the minimum similarity values among the deduced amino acid sequences of these genes were 36.8%, 68.7%, 80.3%, 100%, and 68.3%, respectively. In the phylograms, none of the Sri Lankan sequences fell within clades containing the respective K-strain sequences. Additionally, the similarity values for MSA-1 and MSA-2c were 40-61.8% and 90.9-93.2% between the Sri Lankan isolates and the K-strain, respectively, while the K-strain MSA-2a/b sequence shared 64.5-69.8%, 69.3%, and 70.5-80.3% similarities with the Sri Lankan MSA-2a1, MSA-2a2, and MSA-2b sequences, respectively. The present study has shown that genetic diversity among MSAs of Sri Lankan B. bovis isolates is very high, and that the sequences of field isolates diverged genetically from the K-strain. Copyright © 2013 Elsevier B.V. All rights reserved.
The Influence of Similarity on Visual Working Memory Representations
Lin, Po-Han; Luck, Steven J.
2007-01-01
In verbal memory, similarity between items in memory often leads to interference and impaired memory performance. The present study sought to determine whether analogous interference effects would be observed in visual working memory by varying the similarity of the to-be-remembered objects in a color change-detection task. Instead of leading to interference and impaired performance, increased similarity among the items being held in memory led to improved performance. Moreover, when two similar colors were presented along with one dissimilar color, memory performance was better for the similar colors than for the dissimilar color. Similarity produced better performance even when the objects were presented sequentially and even when memory for the first item in the sequence was tested. These findings show that similarity does not lead to interference between representations in visual working memory. Instead, similarity may lead to improved task performance, possibly due to increased stability or precision of the memory representations during maintenance. PMID:19430536
Knowledge-based model building of proteins: concepts and examples.
Bajorath, J.; Stenkamp, R.; Aruffo, A.
1993-01-01
We describe how to build protein models from structural templates. Methods to identify structural similarities between proteins in cases of significant, moderate to low, or virtually absent sequence similarity are discussed. The detection and evaluation of structural relationships is emphasized as a central aspect of protein modeling, distinct from the more technical aspects of model building. Computational techniques to generate and complement comparative protein models are also reviewed. Two examples, P-selectin and gp39, are presented to illustrate the derivation of protein model structures and their use in experimental studies. PMID:7505680
Liebl, Hans; Heilmeier, Ursula; Lee, Sonia; Nardo, Lorenzo; Patsch, Janina; Schuppert, Christopher; Han, Misung; Rondak, Ina-Christine; Banerjee, Suchandrima; Koch, Kevin; Link, Thomas M.; Krug, Roland
2014-01-01
PURPOSE To assess lesion detection and artifact size reduction of a MAVRIC-SEMAC hybrid sequence (MAVRIC-SL) compared to standard sequences at 1.5T and 3T in porcine knee specimens with metal hardware. METHODS Artificial cartilage and bone lesions of defined size were created in the proximity of titanium and steel screws with 2.5 mm diameter in 12 porcine knee specimens and were imaged at 1.5T and 3T MRI with MAVRIC-SL PD and STIR, standard FSE T2 PD and STIR and fat-saturated T2 FSE sequences. Three radiologists blinded to the lesion locations assessed lesion detection rates on randomized images for each sequence using ROC. Artifact length and width were measured. RESULTS Metal artifact sizes were largest in the presence of steel screws at 3T (FSE T2 FS: 28.7 cm²) and 1.5T (16.03 cm²). MAVRIC-SL PD and STIR reduced artifact sizes at both 3T (1.43 cm²; 2.46 cm²) and 1.5T (1.16 cm²; 1.59 cm²) compared to FS T2 FSE sequences (27.57 cm²; 13.20 cm²). At 3T, ROC derived AUC values using MAVRIC-SL sequences were significantly higher compared to standard sequences (MAVRIC-PD: 0.87, versus FSE-T2-FS: 0.73 (p=0.025); MAVRIC-STIR: 0.9 versus T2-STIR: 0.78 (p=0.001) and versus FSE-T2-FS: 0.73 (p=0.026)). Similar values were observed at 1.5T. Comparison of 3T and 1.5T showed no significant differences (MAVRIC-SL PD: p=0.382; MAVRIC-SL STIR: p=0.071). CONCLUSION MAVRIC-SL sequences provided superior lesion detection and reduced metal artifact size at both 1.5T and 3T compared to conventionally used FSE sequences. No significant disadvantage was found comparing MAVRIC-SL at 3T and 1.5T, though metal artifacts at 3T were larger. PMID:24912802
[Prokaryotic expression of recombinant prochymosin gene and its antiserum preparation].
Li, Xin-ping; Liu, Huan-huan; Pu, Yan; Zhang, Fu-chun; Li, Yi-jie
2012-07-01
The aim was to optimize the codons of the prochymosin (pCHY) gene, express the gene in Escherichia coli (E. coli), prepare its antiserum, and specifically detect chymosin protein. According to the codon usage bias of E. coli, the prochymosin gene sequence was synthesized based on the conserved sequences of the prochymosin genes from bovine, lamb and camel, and then cloned into the plasmids pET-30a and pcDNA3-AAT-COMP-C3d3 (pcD-ACC), respectively. pET-30a-pCHY was expressed in E. coli BL21(DE3) after IPTG induction to serve as the detection antigen. RT-PCR was used to detect prochymosin mRNA expression in the livers of mice injected with pcDNA3-AAT-COMP-pCHY-C3d3 (pACCC) by the hydrodynamics-based transfection method. To prepare the prochymosin antiserum, pACCC and GST-pCHY proteins were used to immunize New Zealand rabbits in accordance with a DNA prime-protein boost strategy. Antibody levels were tested by ELISA. Western blotting showed that the molecular weight of the His-pCHY protein was about 55 000, similar to the expected molecular size. ELISA demonstrated that the titer of the prochymosin antiserum was high. Based on codon optimization, we obtained high-titer prochymosin antiserum through the DNA vaccine vector pcD-ACC combined with a DNA prime-protein boost strategy, similar to that obtained with a protein vaccine.
Pulliam Holoman, Tracey R.; Elberson, Margaret A.; Cutter, Leah A.; May, Harold D.; Sowers, Kevin R.
1998-01-01
Defined microbial communities were developed by combining selective enrichment with molecular monitoring of total community genes coding for 16S rRNAs (16S rDNAs) to identify potential polychlorinated biphenyl (PCB)-dechlorinating anaerobes that ortho dechlorinate 2,3,5,6-tetrachlorobiphenyl. In enrichment cultures that contained a defined estuarine medium, three fatty acids, and sterile sediment, a Clostridium sp. was predominant in the absence of added PCB, but undescribed species in the δ subgroup of the class Proteobacteria, the low-G+C gram-positive subgroup, the Thermotogales subgroup, and a single species with sequence similarity to the deeply branching species Dehalococcoides ethenogenes were more predominant during active dechlorination of the PCB. Species with high sequence similarities to Methanomicrobiales and Methanosarcinales archaeal subgroups were predominant in both dechlorinating and nondechlorinating enrichment cultures. Deletion of sediment from PCB-dechlorinating enrichment cultures reduced the rate of dechlorination and the diversity of the community. Substitution of sodium acetate for the mixture of three fatty acids increased the rate of dechlorination, further reduced the community diversity, and caused a shift in the predominant species that included restriction fragment length polymorphism patterns not previously detected. Although PCB-dechlorinating cultures were methanogenic, inhibition of methanogenesis and elimination of the archaeal community by addition of bromoethanesulfonic acid only slightly inhibited dechlorination, indicating that the archaea were not required for ortho dechlorination of the congener. Deletion of Clostridium spp. from the community profile by addition of vancomycin only slightly reduced dechlorination. However, addition of sodium molybdate, an inhibitor of sulfate reduction, inhibited dechlorination and deleted selected species from the community profiles of the class Bacteria. 
With the exception of one 16S rDNA sequence that had the highest sequence similarity to the obligate perchloroethylene-dechlorinating Dehalococcoides, the 16S rDNA sequences associated with PCB ortho dechlorination had high sequence similarities to the δ, low-G+C gram-positive, and Thermotogales subgroups, which all include sulfur-, sulfate-, and/or iron(III)-respiring bacterial species. PMID:9726883
Putative Novel Genotype of Avian Hepatitis E Virus, Hungary, 2010
Bányai, Krisztián; Tóth, Ádám György; Ivanics, Éva; Glávits, Róbert; Szentpáli-Gavallér, Katalin
2012-01-01
To explore the genetic diversity of avian hepatitis E virus strains, we characterized the near-complete genome of a strain detected in 2010 in Hungary, uncovering moderate genome sequence similarity with reference strains. Public health implications related to consumption of eggs or meat contaminated by avian hepatitis E virus, or to poultry handling, require thorough investigation. PMID:22840214
A deep learning method for lincRNA detection using auto-encoder algorithm.
Yu, Ning; Yu, Zeng; Pan, Yi
2017-12-06
RNA sequencing technique (RNA-seq) enables scientists to develop novel data-driven methods for discovering more unidentified lincRNAs. Meanwhile, knowledge-based technologies are experiencing a potential revolution ignited by new deep learning methods. By scanning newly found RNA-seq data sets, scientists have found that: (1) the expression of lincRNAs appears to be regulated, that is, relevance exists along the DNA sequences; (2) lincRNAs contain some conserved patterns/motifs tethered together by non-conserved regions. These two lines of evidence provide the rationale for adopting knowledge-based deep learning methods in lincRNA detection. Similar to coding region transcription, non-coding regions are split at transcriptional sites. However, regulatory RNAs rather than messenger RNAs are generated. That is, the transcribed RNAs participate in biological processes as regulatory units instead of generating proteins. Identifying these transcriptional regions from non-coding regions is the first step towards lincRNA recognition. The auto-encoder method achieves 100% and 92.4% prediction accuracy on transcription sites over the putative data sets. The experimental results also show the excellent performance of the predictive deep neural network on the lincRNA data sets compared with support vector machine and traditional neural network. In addition, it is validated through the newly discovered lincRNA data set, and one unreported transcription site is found by feeding the whole annotated sequences through the deep learning machine, which indicates that the deep learning method has extensive ability for lincRNA prediction. The transcriptional sequences of lincRNAs are collected from the annotated human DNA genome data. Subsequently, a two-layer deep neural network is developed for lincRNA detection, which adopts the auto-encoder algorithm and utilizes different encoding schemes to obtain the best performance over intergenic DNA sequence data. 
Driven by those newly annotated lincRNA data, deep learning methods based on the auto-encoder algorithm can exert their capability in knowledge learning in order to capture the useful features and the information correlation along DNA genome sequences for lincRNA detection. To our knowledge, this is the first application to adopt deep learning techniques for identifying lincRNA transcription sequences.
Diverse molecular signatures for ribosomally ‘active’ Perkinsea in marine sediments
2014-01-01
Background Perkinsea are a parasitic lineage within the eukaryotic superphylum Alveolata. Recent studies making use of environmental small sub-unit ribosomal RNA gene (SSU rDNA) sequencing methodologies have detected a significant diversity and abundance of Perkinsea-like phylotypes in freshwater environments. In contrast only a few Perkinsea environmental sequences have been retrieved from marine samples and only two groups of Perkinsea have been cultured and morphologically described and these are parasites of marine molluscs or marine protists. These two marine groups form separate and distantly related phylogenetic clusters, composed of closely related lineages on SSU rDNA trees. Here, we test the hypothesis that Perkinsea are a hitherto under-sampled group in marine environments. Using 454 diversity ‘tag’ sequencing we investigate the diversity and distribution of these protists in marine sediments and water column samples taken from the Deep Chlorophyll Maximum (DCM) and sub-surface using both DNA and RNA as the source template and sampling four European offshore locations. Results We detected the presence of 265 sequences branching with known Perkinsea, the majority of them recovered from marine sediments. Moreover, 27% of these sequences were sampled from RNA derived cDNA libraries. Phylogenetic analyses classify a large proportion of these sequences into 38 cluster groups (including 30 novel marine cluster groups), which share less than 97% sequence similarity suggesting this diversity encompasses a range of biologically and ecologically distinct organisms. Conclusions These results demonstrate that the Perkinsea lineage is considerably more diverse than previously detected in marine environments. This wide diversity of Perkinsea-like protists is largely retrieved in marine sediment with a significant proportion detected in RNA derived libraries suggesting this diversity represents ribosomally ‘active’ and intact cells. 
Given the phylogenetic range of hosts infected by known Perkinsea parasites, these data suggest that Perkinsea either play a significant but hitherto unrecognized role as parasites in marine sediments and/or members of this group are present in the marine sediment possibly as part of the ‘seed bank’ microbial community. PMID:24779375
Proels, Reinhard K; Roitsch, Thomas
2006-03-01
Very few CACTA transposon-like sequences have been described in Solanaceae species. Sequence information has been restricted to partial transposase (TPase)-like fragments, and no target gene of CACTA-like transposon insertion has been described in tomato to date. In this manuscript, we report on a CACTA transposon-like insertion in intron I of tomato (Lycopersicon esculentum) invertase gene Lin5 and TPase-like sequences of several Solanaceae species. Consensus primers deduced from the TPase region of the tomato CACTA transposon-like element allowed the amplification of similar sequences from various Solanaceae species of different subfamilies including Solaneae (Solanum tuberosum), Cestreae (Nicotiana tabacum) and Datureae (Datura stramonium). This demonstrates the ubiquitous presence of CACTA-like elements in Solanaceae genomes. The obtained partial sequences are highly conserved, and allow further detection and detailed analysis of CACTA-like transposons throughout Solanaceae species. CACTA-like transposon sequences make possible the evaluation of their use for genome analysis, functional studies of genes and the evolutionary relationships between plant species.
The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza.
Qian, Jun; Song, Jingyuan; Gao, Huanhuan; Zhu, Yingjie; Xu, Jiang; Pang, Xiaohui; Yao, Hui; Sun, Chao; Li, Xian'en; Li, Chuyuan; Liu, Juyan; Xu, Haibin; Chen, Shilin
2013-01-01
Salvia miltiorrhiza is an important medicinal plant with great economic and medicinal value. The complete chloroplast (cp) genome sequence of Salvia miltiorrhiza, the first sequenced member of the Lamiaceae family, is reported here. The genome is 151,328 bp in length and exhibits a typical quadripartite structure of the large (LSC, 82,695 bp) and small (SSC, 17,555 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,539 bp). It contains 114 unique genes, including 80 protein-coding genes, 30 tRNAs and four rRNAs. The genome structure, gene order, GC content and codon usage are similar to the typical angiosperm cp genomes. Four forward, three inverted and seven tandem repeats were detected in the Salvia miltiorrhiza cp genome. Simple sequence repeat (SSR) analysis among the 30 asterid cp genomes revealed that most SSRs are AT-rich, which contribute to the overall AT richness of these cp genomes. Additionally, fewer SSRs are distributed in the protein-coding sequences compared to the non-coding regions, indicating an uneven distribution of SSRs within the cp genomes. Entire cp genome comparison of Salvia miltiorrhiza and three other Lamiales cp genomes showed a high degree of sequence similarity and a relatively high divergence of intergenic spacers. Sequence divergence analysis discovered the ten most divergent and ten most conserved genes as well as their length variation, which will be helpful for phylogenetic studies in asterids. Our analysis also supports that both regional and functional constraints affect gene sequence evolution. Further, phylogenetic analysis demonstrated a sister relationship between Salvia miltiorrhiza and Sesamum indicum. The complete cp genome sequence of Salvia miltiorrhiza reported in this paper will facilitate population, phylogenetic and cp genetic engineering studies of this medicinal plant.
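The region lengths reported above can be checked against the stated genome size: in a quadripartite chloroplast genome the total is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The short sketch below only re-adds the numbers quoted in the abstract; the variable names are illustrative and not taken from the paper.

```python
# Region lengths in base pairs, as quoted in the abstract.
LSC = 82_695   # large single-copy region
SSC = 17_555   # small single-copy region
IR = 25_539    # one inverted repeat; the genome carries two copies

# Quadripartite structure: LSC + SSC + 2 * IR.
total_length = LSC + SSC + 2 * IR
print(total_length)  # 151328, matching the reported 151,328 bp

# Unique gene tally: protein-coding genes + tRNAs + rRNAs.
unique_genes = 80 + 30 + 4
print(unique_genes)  # 114, matching the reported count
```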
Improved detection and relocation of micro-earthquakes applied to the Sea of Marmara
NASA Astrophysics Data System (ADS)
Tary, J. B.; Evangelia, B.; Géli, L.; Lomax, A.
2016-12-01
The Sea of Marmara is located at the western end of the North Anatolian Fault (NAF). This part of the NAF is considered a seismic gap, lying between the Izmit and Duzce earthquakes to the East and the Ganos earthquake to the West. Improved detection and location of seismicity in the Sea of Marmara is important for defining the seismic hazard in this area. On July 25, 2011, a Mw 5 earthquake occurred below the Western High in the western part of the Sea of Marmara. This earthquake and its aftershock sequence were recorded by a network of 10 ocean bottom seismometers (Ifremer) as well as seafloor observatories (KOERI). The OBSs were deployed from mid-April 2011 to the end of July 2011. The aftershock sequence is characterized by deep seismicity (10-15 km) around the main shock and shallow seismicity. Some of the shallow seismicity could be located at a similar depth as gas-prone sediment layers below the Western High. The exact causes of these shallow aftershocks are still unclear. To better define this aftershock sequence, we use the matched filter technique with a selection of aftershocks as templates to dig out child events from the continuous data streams. The templates are cross-correlated with the continuous data for stations with absolute time picks. The cross-correlation coefficients are then summed over all stations and components, and we then compute the median absolute deviation (MAD) of the summed time series. Signals are detected when the summed cross-correlation time series exceeds a given number of times the MAD. Using a conservative detection threshold, we obtain a 10-fold increase in the number of events. The newly detected events are then relocated using the double-difference technique. With these newly detected events, we investigate the nucleation phase of the main shock and the aftershock sequence, as well as the possible triggering of the shallow aftershocks by the deeper seismicity.
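The detection step described above (correlate templates with continuous data, sum over channels, threshold at a multiple of the MAD) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `normalized_cc` and `detect`, the default threshold of 8 times the MAD, and the requirement that all templates and all traces share a common length and time alignment are assumptions made here for brevity.

```python
import math
from statistics import median

def normalized_cc(template, trace):
    """Pearson correlation of the template with every same-length window of the trace."""
    n = len(template)
    t_mean = sum(template) / n
    t_dev = [x - t_mean for x in template]
    t_norm = math.sqrt(sum(d * d for d in t_dev))
    out = []
    for i in range(len(trace) - n + 1):
        window = trace[i:i + n]
        w_mean = sum(window) / n
        w_dev = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(d * d for d in w_dev))
        if t_norm == 0 or w_norm == 0:
            out.append(0.0)  # flat window or template: define correlation as zero
        else:
            out.append(sum(a * b for a, b in zip(t_dev, w_dev)) / (t_norm * w_norm))
    return out

def detect(templates, traces, n_mad=8.0):
    """Sum correlations over channels and flag samples above n_mad times the MAD.

    n_mad is a hypothetical default; the abstract only says "a given number".
    """
    per_channel = [normalized_cc(t, d) for t, d in zip(templates, traces)]
    stack = [sum(values) for values in zip(*per_channel)]
    med = median(stack)
    mad = median(abs(v - med) for v in stack)
    hits = [i for i, v in enumerate(stack) if v > n_mad * mad]
    return hits, stack
```

Here `templates` and `traces` are lists with one entry per station component. In practice neighbouring samples around one event all exceed the threshold, so detections would normally be grouped and only the local maximum kept.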
Limited Variation in BK Virus T-Cell Epitopes Revealed by Next-Generation Sequencing
Sahoo, Malaya K.; Tan, Susanna K.; Chen, Sharon F.; Kapusinszky, Beatrix; Concepcion, Katherine R.; Kjelson, Lynn; Mallempati, Kalyan; Farina, Heidi M.; Fernández-Viña, Marcelo; Tyan, Dolly; Grimm, Paul C.; Anderson, Matthew W.; Concepcion, Waldo
2015-01-01
BK virus (BKV) infection causing end-organ disease remains a formidable challenge to the hematopoietic cell transplant (HCT) and kidney transplant fields. As BKV-specific treatments are limited, immunologic-based therapies may be a promising and novel therapeutic option for transplant recipients with persistent BKV infection. Here, we describe a whole-genome, deep-sequencing methodology and bioinformatics pipeline that identify BKV variants across the genome and at BKV-specific HLA-A2-, HLA-B0702-, and HLA-B08-restricted CD8 T-cell epitopes. BKV whole genomes were amplified using long-range PCR with four inverse primer sets, and fragmentation libraries were sequenced on the Ion Torrent Personal Genome Machine (PGM). An error model and variant-calling algorithm were developed to accurately identify rare variants. A total of 65 samples from 18 pediatric HCT and kidney recipients with quantifiable BKV DNAemia underwent whole-genome sequencing. Limited genetic variation was observed. The median number of amino acid variants identified per sample was 8 (range, 2 to 37; interquartile range, 10), with the majority of variants (77%) detected at a frequency of <5%. When normalized for length, there was no statistical difference in the median number of variants across all genes. Similarly, the predominant virus population within samples harbored T-cell epitopes similar to the reference BKV strain that was matched for the BKV genotype. Despite the conservation of epitopes, low-level variants in T-cell epitopes were detected in 77.7% (14/18) of patients. Understanding epitope variation across the whole genome provides insight into the virus-immune interface and may help guide the development of protocols for novel immunologic-based therapies. PMID:26202116
Sols, Ignasi; DuBrow, Sarah; Davachi, Lila; Fuentemilla, Lluís
2017-11-20
Although everyday experiences unfold continuously over time, shifts in context, or event boundaries, can influence how those events come to be represented in memory [1-4]. Specifically, mnemonic binding across sequential representations is more challenging at context shifts, such that successful temporal associations are more likely to be formed within than across contexts [1, 2, 5-9]. However, in order to preserve a subjective sense of continuity, it is important that the memory system bridge temporally adjacent events, even if they occur in seemingly distinct contexts. Here, we used pattern similarity analysis to scalp electroencephalographic (EEG) recordings during a sequential learning task [2, 3] in humans and showed that the detection of event boundaries triggered a rapid memory reinstatement of the just-encoded sequence episode. Memory reactivation was detected rapidly (∼200-800 ms from the onset of the event boundary) and was specific to context shifts that were preceded by an event sequence with episodic content. Memory reinstatement was not observed during the sequential encoding of events within an episode, indicating that memory reactivation was induced specifically upon context shifts. Finally, the degree of neural similarity between neural responses elicited during sequence encoding and at event boundaries correlated positively with participants' ability to later link across sequences of events, suggesting a critical role in binding temporally adjacent events in long-term memory. Current results shed light onto the neural mechanisms that promote episodic encoding not only for information within the event, but also, importantly, in the ability to link across events to create a memory representation of continuous experience. Copyright © 2017 Elsevier Ltd. All rights reserved.
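As a rough illustration of what a pattern similarity analysis computes, the sketch below correlates the across-electrode voltage pattern at each encoding time point with the pattern at each time point after an event boundary. It is a generic sketch, assuming Pearson correlation over electrodes as the similarity measure; the abstract does not specify the measure or preprocessing, and the names `pearson` and `similarity_matrix` are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length patterns (e.g. one value per electrode)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    return 0.0 if denom == 0 else sum(x * y for x, y in zip(dev_a, dev_b)) / denom

def similarity_matrix(encoding, boundary):
    """Rows: encoding time points; columns: boundary time points.

    Each input is a list of patterns, one pattern (list of electrode values) per time point.
    """
    return [[pearson(e, b) for b in boundary] for e in encoding]
```

High values in such a matrix at boundary time points would indicate that the boundary response resembles the pattern recorded while the preceding items were encoded.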
Longkumer, Toshisangba; Kamireddy, Swetha; Muthyala, Venkateswar Reddy; Akbarpasha, Shaikh; Pitchika, Gopi Krishna; Kodetham, Gopinath; Ayaluru, Murali; Siddavattam, Dayananda
2013-01-01
While analyzing plasmids of Acinetobacter sp. DS002 we have detected a circular DNA molecule pTS236, which upon further investigation is identified as the genome of a phage. The phage genome has shown sequence similarity to the recently discovered Sphinx 2.36 DNA sequence co-purified with the Transmissible Spongiform Encephalopathy (TSE) particles isolated from infected brain samples collected from diverse geographical regions. As in Sphinx 2.36, the phage genome also codes for three proteins. One of them codes for RepA and is shown to be involved in replication of pTS236 through rolling circle (RC) mode. The other two translationally coupled ORFs, orf106 and orf96, code for coat proteins of the phage. Although an orf96 homologue was not previously reported in Sphinx 2.36, a closer examination of DNA sequence of Sphinx 2.36 revealed its presence downstream of orf106 homologue. TEM images and infection assays revealed existence of phage AbDs1 in Acinetobacter sp. DS002. PMID:23867905
Yin, Yan-hui; Li, Bi-chun; Wei, Guang-hui; Zhu, Cai-ye; Li, Wei; Zhang, Ya-ni; Du, Li-xin; Cao, Wen-guang
2012-05-01
The aim of this study was to clone the heart-type fatty acid binding protein (H-FABP) gene of Xuhuai goat, to explore it bioinformatically, and to analyze its subcellular localization using enhanced green fluorescent protein (EGFP). The results showed that the coding sequence (CDS) of the Xuhuai goat H-FABP gene was 402 bp long, encoding 133 amino acids (GenBank accession number AY466498.1). The H-FABP cDNA coding sequence was compared with the corresponding regions of human, chicken, brown rat, cow, wild boar, donkey, and zebrafish. The similarities were 89%, 76%, 85%, 84%, 93%, 91%, and 70%, respectively. For the corresponding amino acid sequences, the similarities were 90%, 79%, 88%, 97%, 95%, 94%, and 72%, respectively. No signal peptide region was found in the H-FABP protein, suggesting that H-FABP might be a nonsecreted protein. H-FABP expression was detected in vitro by reverse transcription-polymerase chain reaction (RT-PCR), and the EGFP-H-FABP fusion protein was localized to the cytoplasm. The gene could also be transiently and permanently expressed in mice.
Adékambi, Toïdi; Foucault, Cedric; La Scola, Bernard; Drancourt, Michel
2006-01-01
Neurological infections due to rapidly growing mycobacteria (RGM) have rarely been reported. We recently investigated two unrelated immunocompetent patients, one with community-acquired lymphocytic meningitis and the other with cerebral thrombophlebitis. Mycobacterium mucogenicum was isolated in pure culture and detected by PCR sequencing of cerebrospinal fluid samples. Both patients eventually died. The two isolates exhibited an overlapping antimicrobial susceptibility pattern. They were susceptible in vitro to tetracyclines, macrolides, quinolones, amikacin, imipenem, cefoxitin, and trimethoprim-sulfamethoxazole and resistant to ceftriaxone. They shared 100% 16S rRNA gene sequence similarity with M. mucogenicum ATCC 49650T over 1,482 bp. Their partial rpoB sequences shared 97.8% and 98.1% similarity with M. mucogenicum ATCC 49650T, suggesting that the two isolates were representative of two sequevars of M. mucogenicum species. This case report should make clinicians aware that M. mucogenicum, an RGM frequently isolated from tap water or from respiratory specimens and mostly without clinical significance, can even be encountered in the central nervous system of immunocompetent patients. PMID:16517863
T-Reg Comparator: an analysis tool for the comparison of position weight matrices
Roepcke, Stefan; Grossmann, Steffen; Rahmann, Sven; Vingron, Martin
2005-01-01
T-Reg Comparator is a novel software tool designed to support research into transcriptional regulation. Sequence motifs representing transcription factor binding sites are usually encoded as position weight matrices. The user inputs a set of such weight matrices or binding site sequences and our program matches them against the T-Reg database, which is presently built on data from the Transfac [E. Wingender (2004) In Silico Biol., 4, 55–61] and Jaspar [A. Sandelin, W. Alkema, P. Engstrom, W. W. Wasserman and B. Lenhard (2004) Nucleic Acids Res., 32, D91–D94]. Our tool delivers a detailed report on similarities between user-supplied motifs and motifs in the database. Apart from simple one-to-one relationships, T-Reg Comparator is also able to detect similarities between submatrices. In addition, we provide a user interface to a program for sequence scanning with weight matrices. Typical areas of application for T-Reg Comparator are motif and regulatory module finding and annotation of regulatory genomic regions. T-Reg Comparator is available at . PMID:15980506
T-Reg Comparator: an analysis tool for the comparison of position weight matrices.
Roepcke, Stefan; Grossmann, Steffen; Rahmann, Sven; Vingron, Martin
2005-07-01
T-Reg Comparator is a novel software tool designed to support research into transcriptional regulation. Sequence motifs representing transcription factor binding sites are usually encoded as position weight matrices. The user inputs a set of such weight matrices or binding site sequences and our program matches them against the T-Reg database, which is presently built on data from the Transfac [E. Wingender (2004) In Silico Biol., 4, 55-61] and Jaspar [A. Sandelin, W. Alkema, P. Engstrom, W. W. Wasserman and B. Lenhard (2004) Nucleic Acids Res., 32, D91-D94]. Our tool delivers a detailed report on similarities between user-supplied motifs and motifs in the database. Apart from simple one-to-one relationships, T-Reg Comparator is also able to detect similarities between submatrices. In addition, we provide a user interface to a program for sequence scanning with weight matrices. Typical areas of application for T-Reg Comparator are motif and regulatory module finding and annotation of regulatory genomic regions. T-Reg Comparator is available at http://treg.molgen.mpg.de.
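The abstract above mentions sequence scanning with weight matrices. As a rough illustration of that general idea only, the following Python sketch builds a log-odds weight matrix from base counts and scores every window of a sequence. It is not T-Reg Comparator's implementation and does not cover the matrix-to-matrix comparison that is the tool's main purpose; the count matrix, the uniform background of 0.25, the pseudocount, and all function names are hypothetical choices made for the example.

```python
import math

def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into a log-odds weight matrix."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def scan(sequence, pwm):
    """Return (offset, score) for every window of the sequence (A/C/G/T only)."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]

# Hypothetical count matrix for a 4-bp motif resembling "TATA".
COUNTS = [
    {"T": 9, "A": 1},
    {"A": 9, "T": 1},
    {"T": 9, "C": 1},
    {"A": 10},
]

def best_hit(sequence, counts=COUNTS):
    """Return the highest-scoring (offset, score) window."""
    return max(scan(sequence, log_odds_pwm(counts)), key=lambda hit: hit[1])
```

Calling `best_hit("GGTATAGG")` returns the offset of the window that best matches the hypothetical motif together with its log-odds score; a real application would also scan the reverse strand and apply a score threshold.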
Anthrax Toxin-Expressing Bacillus cereus Isolated from an Anthrax-Like Eschar.
Marston, Chung K; Ibrahim, Hisham; Lee, Philip; Churchwell, George; Gumke, Megan; Stanek, Danielle; Gee, Jay E; Boyer, Anne E; Gallegos-Candela, Maribel; Barr, John R; Li, Han; Boulay, Darbi; Cronin, Li; Quinn, Conrad P; Hoffmaster, Alex R
2016-01-01
Bacillus cereus isolates have been described harboring Bacillus anthracis toxin genes, most notably B. cereus G9241, and capable of causing severe and fatal pneumonias. This report describes the characterization of a B. cereus isolate, BcFL2013, associated with a naturally occurring cutaneous lesion resembling an anthrax eschar. Similar to G9241, BcFL2013 is positive for the B. anthracis pXO1 toxin genes, has a multi-locus sequence type of 78, and a pagA sequence type of 9. Whole genome sequencing confirms the similarity to G9241. In addition to the chromosome having an average nucleotide identity of 99.98% when compared to G9241, BcFL2013 harbors three plasmids with varying homology to the G9241 plasmids (pBCXO1, pBC210 and pBFH_1). This is also the first report to include serologic testing of patient specimens associated with this type of B. cereus infection which resulted in the detection of anthrax lethal factor toxemia, a quantifiable serum antibody response to protective antigen (PA), and lethal toxin neutralization activity.
Ponting, C P; Mott, R; Bork, P; Copley, R R
2001-12-01
Sequence database searching methods, such as BLAST, are invaluable for predicting molecular function on the basis of sequence similarities among single regions of proteins. Searches of whole databases, however, are not optimized to detect multiple homologous regions within a single polypeptide. Here we have used the prospero algorithm to perform self-comparisons of all predicted Drosophila melanogaster gene products. Predicted repeats, and their homologs from all species, were analyzed further to detect hitherto unappreciated evolutionary relationships. Results included the identification of novel tandem repeats in the human X-linked retinitis pigmentosa type-2 gene product, repeated segments in cystinosin, associated with a defect in cystine transport, and 'nested' homologous domains in dysferlin, whose gene is mutated in limb girdle muscular dystrophy. Novel signaling domain families were found that may regulate the microtubule-based cytoskeleton and ubiquitin-mediated proteolysis, respectively. Two families of glycosyl hydrolases were shown to contain internal repetitions that hint at their evolution via a piecemeal, modular approach. In addition, three examples of fruit fly genes were detected with tandem exons that appear to have arisen via internal duplication. These findings demonstrate how completely sequenced genomes can be exploited to further understand the relationships between molecular structure, function, and evolution.
Molecular detection and characterization of Theileria species in the Philippines.
Belotindos, Lawrence P; Lazaro, Jonathan V; Villanueva, Marvin A; Mingala, Claro N
2014-09-01
Theileriosis is a tick-borne disease of domestic and wild animals that causes devastating economic losses in livestock in tropical and subtropical regions. Theileriosis is not yet documented in the Philippines, in contrast to babesiosis and anaplasmosis, which are considered major tick-borne diseases that infect livestock in the country and contribute major losses to the livestock industry. The study aimed to detect Theileria sp. at the genus level in blood samples of cattle using a polymerase chain reaction (PCR) assay. Specifically, it determined the phylogenetic relationship of Theileria species affecting cattle in the Philippines to other Theileria sp. registered in GenBank. A total of 292 cattle blood samples collected from various provinces were used. Theileria sp. was detected in 43 of the 292 cattle blood samples using a PCR assay targeting the major piroplasm surface protein (MPSP) gene. DNA sequences showed high similarity (90-99%) between the Theileria sp. isolates reported in GenBank and the Philippine isolates of Theileria. Phylogenetic tree construction using nucleotide sequences classified the Philippine isolates of Theileria as benign. However, nucleotide polymorphism was observed in the new isolate based on nucleotide sequence alignment, suggesting that the new isolate may be a new species of Theileria.
Tattiyapong, Muncharee; Sivakumar, Thillaiampalam; Takemae, Hitoshi; Simking, Pacharathon; Jittapalapong, Sathaporn; Igarashi, Ikuo; Yokoyama, Naoaki
2016-07-01
Babesia bovis, an intraerythrocytic protozoan parasite, causes severe clinical disease in cattle worldwide. The genetic diversity of parasite antigens often results in different immune profiles in infected animals, hindering efforts to develop immune control methodologies against the B. bovis infection. In this study, we analyzed the genetic diversity of the merozoite surface antigen-1 (msa-1) gene using 162 B. bovis-positive blood DNA samples sourced from cattle populations reared in different geographical regions of Thailand. The identity scores shared among 93 msa-1 gene sequences isolated by PCR amplification were 43.5-100%, and the similarity values among the translated amino acid sequences were 42.8-100%. Of 23 total clades detected in our phylogenetic analysis, Thai msa-1 gene sequences occurred in 18 clades; seven among them were composed of sequences exclusively from Thailand. To investigate differential antigenicity of isolated MSA-1 proteins, we expressed and purified eight recombinant MSA-1 (rMSA-1) proteins, including an rMSA-1 from B. bovis Texas (T2Bo) strain and seven rMSA-1 proteins based on the Thai msa-1 sequences. When these antigens were analyzed in a western blot assay, anti-T2Bo cattle serum strongly reacted with the rMSA-1 from T2Bo, as well as with three other rMSA-1 proteins that shared 54.9-68.4% sequence similarity with T2Bo MSA-1. In contrast, no or weak reactivity was observed for the remaining rMSA-1 proteins, which shared low sequence similarity (35.0-39.7%) with T2Bo MSA-1. While demonstrating the high genetic diversity of the B. bovis msa-1 gene in Thailand, the present findings suggest that the genetic diversity results in antigenicity variations among the MSA-1 antigens of B. bovis in Thailand. Copyright © 2016 Elsevier B.V. All rights reserved.
Genetic characterization of a new astrovirus detected in dogs suffering from diarrhoea.
Toffan, Anna; Jonassen, Christine Monceyron; De Battisti, Cristian; Schiavon, Eliana; Kofstad, Tone; Capua, Ilaria; Cattoli, Giovanni
2009-10-20
Astroviruses have been described in several animal species, frequently associated with diarrhoea, especially in young animals. In dogs, astrovirus-like particles have been observed sporadically and very little is known about their epidemiology and characteristics. In this paper, we describe the detection of astrovirus-like particles in symptomatic puppies. Furthermore, for the first time in this species, the presumptive identification made by electron microscopy was confirmed by genetic analysis of the viral RNA conducted directly on the clinical specimens. Genetic sequences of ORF2 (2443 nt), encoding the capsid protein, and a partial sequence of ORF1b (346 nt), encoding the viral polymerase, identified the viruses as members of the family Astroviridae. The phylogenetic analysis clearly clustered canine astroviruses in the genus Mamastrovirus. The closest similarities were found with a cluster comprising human, porcine and feline astroviruses, based on the ORF2 sequences available. Based on the species definition for astroviruses and on the data obtained in this study, we suggest a new species of astrovirus - canine astrovirus, CaAstV - to be included in the genus Mamastrovirus.
Kuiper, Melanie W.; Valster, Rinske M.; Wullings, Bart A.; Boonstra, Harry; Smidt, Hauke; van der Kooij, Dick
2006-01-01
A real-time PCR-based method targeting the 18S rRNA gene was developed for the quantitative detection of Hartmannella vermiformis, a free-living amoeba which is a potential host for Legionella pneumophila in warm water systems and cooling towers. The detection specificity was validated using genomic DNA of the closely related amoeba Hartmannella abertawensis as a negative control and sequence analysis of amplified products from environmental samples. Real-time PCR detection of serially diluted DNA extracted from H. vermiformis was linear for microscopic cell counts between 1.14 × 10^-1 and 1.14 × 10^4 cells per PCR. The genome of H. vermiformis harbors multiple copies of the 18S rRNA gene, and an average number (with standard error) of 1,330 ± 127 copies per cell was derived from real-time PCR calibration curves for cell suspensions and plasmid DNA. No significant differences were observed between the 18S rRNA gene copy numbers for trophozoites and cysts of strain ATCC 50237 or between the copy numbers for this strain and strain KWR-1. The developed method was applied to water samples (200 ml) collected from a variety of lakes and rivers serving as sources for drinking water production in The Netherlands. Detectable populations were found in 21 of the 28 samples, with concentrations ranging from 5 to 75 cells/liter. A high degree of similarity (≥98%) was observed between sequences of clones originating from the different surface waters and between these clones and the reference strains. Hence, H. vermiformis, which is highly similar to strains serving as hosts for L. pneumophila, is a common component of the microbial community in fresh surface water. PMID:16957190
DOE Office of Scientific and Technical Information (OSTI.GOV)
Claffey, K.P.; Herrera, V.L.; Brecher, P.
1987-12-01
A fatty acid binding protein (FABP) has been identified and characterized in rat heart, but the function and regulation of this protein are unclear. In this study the cDNA for rat heart FABP was cloned from a lambda gt11 library. Sequencing of the cDNA showed an open reading frame coding for a protein with 133 amino acids and a calculated size of 14,776 daltons. Several differences were found between the sequence determined from the cDNA and that reported previously by protein sequencing techniques. Northern blot analysis using rat heart FABP cDNA as a probe established the presence of an abundant mRNA in rat heart about 0.85 kilobases in length. This mRNA was detected, but was not abundant, in fetal heart tissue. Tissue distribution studies showed a similar mRNA species in red, but not white, skeletal muscle. In general, the mRNA tissue distribution was similar to that of the protein detected by Western immunoblot analysis, suggesting that heart FABP expression may be regulated at the transcriptional level. S1 nuclease mapping studies confirmed that the mRNA hybridized to rat heart FABP cDNA was identical in heart and red skeletal muscle throughout the entire open reading frame. The structural differences between heart FABP and other members of this multigene family may be related to the functional requirements of oxidative muscle for fatty acids as a fuel source.
[Rapid prenatal genetic diagnosis of a fetus with a high risk for Morquio A syndrome].
Guo, Yi-bin; Ai, Yang; Zhao, Yan; Tang, Jia; Jiang, Wei-ying; Du, Min-lian; Ma, Hua-mei; Zhong, Yan-fang
2012-04-01
To provide rapid and accurate prenatal genetic diagnosis for a fetus with high risk of Morquio A syndrome. Based on ascertained etiology of the proband and genotypes of the parents, particular mutations of the GALNS gene were screened at 10th gestational week with amplification refractory mutation system (ARMS), denaturing high performance liquid chromatography (DHPLC), and direct DNA sequencing. DHPLC screening has identified abnormal double peaks in the PCR products of exons 1 and 10, whilst only a single peak was detected in normal controls. Amplification of ARMS specific primers derived a specific product for the fetus's gene, whilst no similar product was detected in normal controls. Sequencing of PCR products confirmed that exons 1 and 10 of the GALNS gene from the fetus contained a heterozygous paternal c.106-111 del (p.L36-L37 del) deletion and a heterozygous maternal c.1097 T>C (p.L366P) missense mutation, which resulted in a compound heterozygote status. The fetus was diagnosed with Morquio A syndrome and a genotype similar to the proband. Termination of the pregnancy was recommended. Combined ARMS, DHPLC and DNA sequencing are effective for rapid and accurate prenatal diagnosis for fetus with a high risk for Morquio A syndrome. Such methods are particularly suitable for early diagnosis when pathogenesis is clear. Furthermore, combined ARMS and DHPLC are suitable for rapid processing of large numbers of samples for the identification of new mutations.
Chen, Shuo; Ning, Jia; Zhao, Xihai; Wang, Jinnan; Zhou, Zechen; Yuan, Chun; Chen, Huijun
2017-02-01
To propose a fast simultaneous noncontrast angiography and intraplaque hemorrhage (fSNAP) sequence for carotid artery imaging. The proposed fSNAP sequence uses a low-resolution reference acquisition for phase-sensitive reconstruction to speed up the scan, and an inversion recovery acquisition with arbitrary k-space filling order to generate similar contrast to conventional SNAP. Four healthy volunteers and eight patients were recruited to test the performance of fSNAP in vivo. Lumen area quantification, muscle-blood CNR, IPH-blood CNR, lumen SNR, standard deviation, and intraplaque hemorrhage (IPH) detection accuracy were compared between fSNAP and SNAP. By using a low-resolution reference acquisition with 1/4 matrix size of the full-resolution reference scan, the scan time of fSNAP was 37.5% less than that of SNAP. High agreement in lumen area measurement (ICC = 0.97, 95% CI: 0.96-0.99) and IPH detection (Kappa = 1) was found between fSNAP and SNAP. Also, no significant difference was found for muscle-blood CNR (P = 0.25), IPH-blood CNR (P = 0.35), lumen SNR (P = 0.60), and standard deviation (P = 0.46) between the two techniques. The feasibility of fSNAP was validated. fSNAP can improve the imaging efficiency with similar performance to SNAP on carotid artery imaging. Magn Reson Med 77:753-758, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Barati, Ali; Razmi, Gholamreza
2018-05-15
Canine hepatozoonosis, caused by H. canis, is a tick-borne disease in domestic and wild dogs that is transmitted by ingestion of Rhipicephalus sanguineus ticks. The aim of the study was to detect H. canis in stray dogs in Iran using blood smear examination and molecular techniques. From October 2014 to September 2015, 150 EDTA blood samples were collected from stray dogs in the northeast region of Iran. Blood smears were microscopically examined for the presence of Hepatozoon gamonts; whole blood was evaluated by PCR, with subsequent sequencing and phylogenetic analysis. Hepatozoon spp. gamonts were observed in the neutrophils of 5/150 (3.3%) blood smears, whereas Hepatozoon spp. 18S rDNA was detected in 12/150 (8.0%) blood samples from stray dogs. There was good agreement between the microscopy and PCR methods (Kappa = 0.756). The highest rate of infection was detected in the summer (p<0.05). The frequency of Hepatozoon spp. infection did not differ significantly by gender or age (p>0.05). The alignment analysis of the sequenced samples showed ≥99% similarity with other nucleotide sequences of Hepatozoon spp. in GenBank. The phylogenetic tree also revealed that the nucleotide sequences in this study clustered in the H. canis clade and differed from the H. felis and H. americanum clades. According to the results, it is concluded that H. canis infection is present among dogs in the northeastern region of Iran.
Li, Tian-Cheng; Yoshizaki, Sayaka; Kataoka, Michiyo; Ami, Yasushi; Suzaki, Yuriko; Doan, Yen Hai; Haga, Kei; Ishii, Koji; Takeda, Naokazu; Wakita, Takaji
2017-07-01
A novel cluster of five ferret hepatitis E virus (HEV) strains was detected from nine laboratory ferrets (Mustela putorius furo) imported from a ferret farm in the U.S. Our detection of ferret HEV RNA and anti-HEV antibodies, and alanine aminotransferase (ALT) value assessment indicated that all of the 9 ferrets were infected with ferret HEV, and that the infection exhibited three patterns: sub-clinical infection (n=2), acute hepatitis (n=6) and persistent infection (n=1). Next-generation sequence analyses of the entire genome sequences of the five strains revealed that their nucleotide sequence identities ranged from 99.5% to 99.9%, indicating that genetically similar ferret HEVs had been circulating at this U.S. ferret farm. In contrast, the strains shared 82% and 89% nucleotide sequence identities with other ferret HEV strains isolated from the Netherlands (JN998607) and the U.S. (AB890374), suggesting that these strains form a novel cluster of ferret HEV with diverse genomes depending on the region of their hosts. Particles with a diameter of ~35 nm at a density of 1.201 g/cm^3 were observed in the fecal specimens by electron microscopy. There was no evidence that the particles were associated with the cell membrane. The ferret HEV RNA was not constantly detected in urine, suggesting that the excretion of ferret HEV into urine is not a common feature of HEV infection. Copyright © 2017 Elsevier B.V. All rights reserved.
Yamagata, Akira; Kato, Junichi; Hirota, Ryuichi; Kuroda, Akio; Ikeda, Tsukasa; Takiguchi, Noboru; Ohtake, Hisao
1999-01-01
Two plasmids were discovered in the ammonia-oxidizing bacterium Nitrosomonas sp. strain ENI-11, which was isolated from activated sludge. The plasmids, designated pAYS and pAYL, were relatively small, being approximately 1.9 kb long. They were cryptic plasmids, having no detectable plasmid-linked antibiotic resistance or heavy metal resistance markers. The complete nucleotide sequences of pAYS and pAYL were determined, and their physical maps were constructed. There existed two major open reading frames, ORF1 in pAYS and ORF2 in pAYL, each of which was more than 500 bp long. The predicted product of ORF2 was 28% identical to part of the replication protein of a Bacillus plasmid, pBAA1. However, no significant similarity to any known protein sequences was detected with the predicted product of ORF1. pAYS and pAYL had a highly homologous region, designated HHR, of 262 bp. The overall identity was 98% between the two nucleotide sequences. Interestingly, HHR-homologous sequences were also detected in the genomes of ENI-11 and the plasmidless strain Nitrosomonas europaea IFO14298. Deletion analysis of pAYS and pAYL indicated that HHR, together with either ORF1 or ORF2, was essential for plasmid maintenance in ENI-11. To our knowledge, pAYS and pAYL are the first plasmids found in the ammonia-oxidizing autotrophic bacteria. PMID:10348848
High prevalence of human parvovirus 4 infection in HBV and HCV infected individuals in Shanghai.
Yu, Xuelian; Zhang, Jing; Hong, Liang; Wang, Jiayu; Yuan, Zhengan; Zhang, Xi; Ghildyal, Reena
2012-01-01
Human parvovirus 4 (PARV4) has been detected in blood and diverse tissue samples from HIV/AIDS patients who are injecting drug users. Although B19 virus, the best characterized human parvovirus, has been shown to co-infect patients with hepatitis B or hepatitis C virus (HBV, HCV) infection, the association of PARV4 with HBV or HCV infections is still unknown. The aim of this study was to characterise the association of viruses belonging to PARV4 genotype 1 and 2 with chronic HBV and HCV infection in Shanghai. Serum samples of healthy controls, HCV infected subjects and HBV infected subjects were retrieved from Shanghai Center for Disease Control and Prevention (SCDC) Sample Bank. Parvovirus-specific nested-PCR was performed and results confirmed by sequencing. Sequences were compared with reference sequences obtained from GenBank to derive phylogenetic trees. The frequency of parvovirus molecular detection was 16-22%, 33% and 41% in healthy controls, HCV infected and HBV infected subjects respectively, with PARV4 being the only parvovirus detected. HCV infected and HBV infected subjects had a significantly higher PARV4 prevalence than the healthy population. No statistical difference was found in PARV4 prevalence between HBV and HCV infected subjects. PARV4 sequence divergence within study groups was similar in healthy subjects and in HBV or HCV infected subjects. Our data clearly demonstrate that PARV4 infection is strongly associated with HCV and HBV infection in Shanghai but may not cause increased disease severity.
Kutyavin, Igor V.
2010-01-01
The article describes a new technology for real-time polymerase chain reaction (PCR) detection of nucleic acids. Similar to Taqman, this new method, named Snake, utilizes the 5′-nuclease activity of Thermus aquaticus (Taq) DNA polymerase that cleaves dual-labeled Förster resonance energy transfer (FRET) probes and generates a fluorescent signal during PCR. However, the mechanism of the probe cleavage in Snake is different. In this assay, PCR amplicons fold into stem–loop secondary structures. Hybridization of FRET probes to one of these structures leads to the formation of optimal substrates for the 5′-nuclease activity of Taq. The stem–loop structures in the Snake amplicons are introduced by the unique design of one of the PCR primers, which carries a special 5′-flap sequence. It was found that at a certain length of these 5′-flap sequences the folded Snake amplicons have very little, if any, effect on PCR yield but benefit many aspects of the detection process, particularly the signal productivity. Unlike Taqman, the Snake system favors the use of short FRET probes with improved fluorescence background. The head-to-head comparison study of Snake and Taqman revealed that these two technologies have more differences than similarities with respect to their responses to changes in PCR protocol, e.g. the variations in primer concentration, annealing time, PCR asymmetry. The optimal PCR protocol for Snake has been identified. The technology’s real-time performance was compared to a number of conventional assays including Taqman, 3′-MGB-Taqman, Molecular Beacon and Scorpion primers. The test trial showed that Snake supersedes the conventional assays in the signal productivity and detection of sequence variations as small as single nucleotide polymorphisms. Due to the assay’s cost-effectiveness and simplicity of design, the technology is anticipated to quickly replace all known conventional methods currently used for real-time nucleic acid detection. PMID:19969535
Borozan, Ivan; Watt, Stuart; Ferretti, Vincent
2015-05-01
Alignment-based sequence similarity searches, while accurate for some types of sequences, can produce incorrect results when used on more divergent but functionally related sequences that have undergone the sequence rearrangements observed in many bacterial and viral genomes. Here, we propose a classification model that exploits the complementary nature of alignment-based and alignment-free similarity measures with the aim to improve the accuracy with which DNA and protein sequences are characterized. Our model classifies sequences using a combined sequence similarity score calculated by adaptively weighting the contribution of different sequence similarity measures. Weights are determined independently for each sequence in the test set and reflect the discriminatory ability of individual similarity measures in the training set. Because the similarity between some sequences is determined more accurately with one type of measure rather than another, our classifier allows different sets of weights to be associated with different sequences. Using five different similarity measures, we show that our model significantly improves the classification accuracy over the current composition- and alignment-based models, when predicting the taxonomic lineage for both short viral sequence fragments and complete viral sequences. We also show that our model can be used effectively for the classification of reads from a real metagenome dataset as well as protein sequences. All the datasets and the code used in this study are freely available at https://collaborators.oicr.on.ca/vferretti/borozan_csss/csss.html. Contact: ivan.borozan@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Borozan, Ivan; Watt, Stuart; Ferretti, Vincent
2015-01-01
Motivation: Alignment-based sequence similarity searches, while accurate for some types of sequences, can produce incorrect results when used on more divergent but functionally related sequences that have undergone the sequence rearrangements observed in many bacterial and viral genomes. Here, we propose a classification model that exploits the complementary nature of alignment-based and alignment-free similarity measures with the aim to improve the accuracy with which DNA and protein sequences are characterized. Results: Our model classifies sequences using a combined sequence similarity score calculated by adaptively weighting the contribution of different sequence similarity measures. Weights are determined independently for each sequence in the test set and reflect the discriminatory ability of individual similarity measures in the training set. Because the similarity between some sequences is determined more accurately with one type of measure rather than another, our classifier allows different sets of weights to be associated with different sequences. Using five different similarity measures, we show that our model significantly improves the classification accuracy over the current composition- and alignment-based models, when predicting the taxonomic lineage for both short viral sequence fragments and complete viral sequences. We also show that our model can be used effectively for the classification of reads from a real metagenome dataset as well as protein sequences. Availability and implementation: All the datasets and the code used in this study are freely available at https://collaborators.oicr.on.ca/vferretti/borozan_csss/csss.html. Contact: ivan.borozan@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25573913
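The abstract above describes a combined similarity score formed by weighting several similarity measures. The following Python sketch illustrates only that general idea as a weighted average followed by an arg-max over candidate classes. The exact formula used by the authors and the way their weights are derived from the training set are not given in the abstract, so the weighted-average form, the measure names, the example values, and the function names here are all assumptions.

```python
def combined_score(scores, weights):
    """Weighted average of per-measure similarity scores (assumed to lie in [0, 1])."""
    total = sum(weights[measure] for measure in scores)
    if total == 0:
        raise ValueError("weights must not all be zero")
    return sum(weights[measure] * scores[measure] for measure in scores) / total

def classify(query_scores, weights):
    """Pick the class label with the highest combined score.

    query_scores maps class label -> {measure name -> similarity to that class}.
    weights maps measure name -> weight chosen for this particular query.
    """
    return max(query_scores, key=lambda label: combined_score(query_scores[label], weights))

# Hypothetical similarities of one query sequence to two candidate classes.
QUERY = {
    "class_A": {"alignment": 0.9, "kmer": 0.2},
    "class_B": {"alignment": 0.5, "kmer": 0.8},
}
```

With these made-up numbers, weights that favour the alignment-based measure select `class_A`, while weights that favour the alignment-free measure select `class_B`, which shows why per-sequence weights can change the outcome.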
Zhang, Shu; Sui, Zhenghong; Chang, Lianpeng; Kang, Kyoungho; Ma, Jinhua; Kong, Fanna; Zhou, Wei; Wang, Jinguo; Guo, Liliang; Geng, Huili; Zhong, Jie; Ma, Qingxia
2014-03-10
In this article, high-throughput de novo transcriptomic sequencing was performed in Alexandrium catenella, which provided the first view of the gene repertoire in this dinoflagellate based on next-generation sequencing (NGS) technologies. A total of 118,304 unigenes were identified with an average length of 673 bp (base pairs). Of these unigenes, 77,936 (65.9%) were annotated with known proteins based on sequence similarities, among which 24,149 and 22,956 unigenes were assigned to gene ontology categories (GO) and clusters of orthologous groups (COGs), respectively. Furthermore, 16,467 unigenes were mapped onto 322 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG). We also detected 1143 simple sequence repeats (SSRs), in which the tri-nucleotide repeat motif (69.3%) was the most abundant. The genetic features derived from the transcriptome dataset and their significance are discussed. All four core nucleosomal histones and linker histones were detected, in addition to the unigenes involved in histone modifications. A total of 190 unigenes were identified as being involved in the endocytosis pathway, and clathrin-dependent endocytosis was suggested to play a role in the heterotrophy of A. catenella. A conserved 22-nt spliced leader (SL) was identified in 21 unigenes, which suggested the existence of trans-splicing processing of mRNA in A. catenella. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
Lefèvre, Emilie; Bardot, Corinne; Noël, Christophe; Carrias, Jean-François; Viscogliosi, Eric; Amblard, Christian; Sime-Ngando, Télesphore
2007-01-01
This study presents an original 18S rRNA PCR survey of the freshwater picoeukaryote community, and was designed to detect unidentified heterotrophic picoflagellates (size range 0.6-5 µm) which are prevalent throughout the year within the heterotrophic flagellate assemblage in Lake Pavin. Four clone libraries were constructed from samples collected in two contrasting zones in the lake. Statistical analyses suggested that sequence retrieval was representative of the in situ picoplankton diversity. The two sampling zones exhibited similar diversity patterns but shared only about 5% of the operational taxonomic units (OTUs). Phylogenetic analysis clustered our sequences into three taxonomic groups: Alveolates (30% of OTUs), Fungi (23%) and Cercozoa (19%). Fungi thus substantially contributed to the detected diversity, as was additionally supported by direct microscopic observations of fungal zoospores and sporangia. A large fraction of the sequences belonged to parasites, including Alveolate sequences affiliated to the genus Perkinsus known as zooparasites, and chytrids that include host-specific parasitic fungi of various freshwater phytoplankton species, primarily diatoms. Phylogenetic analysis revealed five novel clades that probably include typical freshwater environmental sequences. Overall, from the unsuspected fungal diversity unveiled, we think that fungal zooflagellates have been misidentified as phagotrophic nanoflagellates in previous studies. This is in agreement with a recent experimental demonstration that zoospore-producing fungi and parasitic activity may play an important role in aquatic food webs.
Nong, Guang; Chow, Virginia; Schmidt, Liesbeth M; Dickson, Don W; Preston, James F
2007-08-01
Pasteuria species are endospore-forming obligate bacterial parasites of soil-inhabiting nematodes and water-inhabiting cladocerans, e.g. water fleas, and are closely related to Bacillus spp. by 16S rRNA gene sequence. As naturally occurring bacteria, biotypes of Pasteuria penetrans are attractive candidates for the biocontrol of various Meloidogyne spp. (root-knot nematodes). Failure to culture these bacteria outside their hosts has prevented isolation of genomic DNA in quantities sufficient for identification of genes associated with host recognition and virulence. We have applied multiple-strand displacement amplification (MDA) to generate DNA for comparative genomics of biotypes exhibiting different host preferences. Using the genome of Bacillus subtilis as a paradigm, MDA allowed quantitative detection and sequencing of 12 marker genes from 2000 cells. Meloidogyne spp. infected with P. penetrans P20 or B4 contained single nucleotide polymorphisms (SNPs) in the spoIIAB gene that did not change the amino acid sequence, or that substituted amino acids with similar chemical properties. Individual nematodes infected with P. penetrans P20 or B4 contained SNPs in the spoIIAB gene sequenced in MDA-generated products. Detection of SNPs in the spoIIAB gene in a nematode indicates infection by more than one genotype, supporting the need to sequence genomes of Pasteuria spp. derived from single spore isolates.
Fuentes-Ramírez, Alicia; Jiménez-Soto, Mauricio; Castro, Ruth; Romero-Zuñiga, Juan José; Dolz, Gaby
2017-01-01
One hundred and fifty-two blood samples from non-human primates in thirteen rescue centers in Costa Rica were analyzed to determine the presence of Plasmodium species using thick blood smears, semi-nested multiplex polymerase chain reaction (SnM-PCR) for species differentiation, and cloning and sequencing for confirmation. Using thick blood smears, two samples were determined to contain the Plasmodium malariae parasite; with SnM-PCR, a total of five (3.3%) samples were positive for P. malariae. Cloning and sequencing confirmed both smear-positive samples as P. malariae. One sample amplified a larger and conserved region of 18S rDNA for the genus Plasmodium and sequencing confirmed the results obtained microscopically and through SnM-PCR tests. Sequencing and construction of a phylogenetic tree of this sample revealed that the P. malariae/P. brasilianum parasite (GenBank KU999995) found in a howler monkey (Alouatta palliata) is identical to that recently reported in humans in Costa Rica. The SnM-PCR detected P. malariae/P. brasilianum parasite in different non-human primate species in captivity and in various regions of the southern Atlantic and Pacific coast of Costa Rica. The similarity of the sequences of parasites found in humans and a monkey suggests that monkeys may be acting as reservoirs of P. malariae/P. brasilianum, for which reason it is important to include them in control and eradication programs.
Quaranfil, Johnston Atoll, and Lake Chad viruses are novel members of the family Orthomyxoviridae.
Presti, Rachel M; Zhao, Guoyan; Beatty, Wandy L; Mihindukulasuriya, Kathie A; da Rosa, Amelia P A Travassos; Popov, Vsevolod L; Tesh, Robert B; Virgin, Herbert W; Wang, David
2009-11-01
Arboviral infections are an important cause of emerging infections due to the movements of humans, animals, and hematophagous arthropods. Quaranfil virus (QRFV) is an unclassified arbovirus originally isolated from children with mild febrile illness in Quaranfil, Egypt, in 1953. It has subsequently been isolated in multiple geographic areas from ticks and birds. We used high-throughput sequencing to classify QRFV as a novel orthomyxovirus. The genome of this virus is comprised of multiple RNA segments; five were completely sequenced. Proteins with limited amino acid similarity to conserved domains in polymerase (PA, PB1, and PB2) and hemagglutinin (HA) genes from known orthomyxoviruses were predicted to be present in four of the segments. The fifth sequenced segment shared no detectable similarity to any protein and is of uncertain function. The end-terminal sequences of QRFV are conserved between segments and are different from those of the known orthomyxovirus genera. QRFV is known to cross-react serologically with two other unclassified viruses, Johnston Atoll virus (JAV) and Lake Chad virus (LKCV). The complete open reading frames of PB1 and HA were sequenced for JAV, while a fragment of PB1 of LKCV was identified by mass sequencing. QRFV and JAV PB1 and HA shared 80% and 70% amino acid identity to each other, respectively; the LKCV PB1 fragment shared 83% amino acid identity with the corresponding region of QRFV PB1. Based on phylogenetic analyses, virion ultrastructural features, and the unique end-terminal sequences identified, we propose that QRFV, JAV, and LKCV comprise a novel genus of the family Orthomyxoviridae.
Quaranfil, Johnston Atoll, and Lake Chad Viruses Are Novel Members of the Family Orthomyxoviridae
Presti, Rachel M.; Zhao, Guoyan; Beatty, Wandy L.; Mihindukulasuriya, Kathie A.; Travassos da Rosa, Amelia P. A.; Popov, Vsevolod L.; Tesh, Robert B.; Virgin, Herbert W.; Wang, David
2009-01-01
Arboviral infections are an important cause of emerging infections due to the movements of humans, animals, and hematophagous arthropods. Quaranfil virus (QRFV) is an unclassified arbovirus originally isolated from children with mild febrile illness in Quaranfil, Egypt, in 1953. It has subsequently been isolated in multiple geographic areas from ticks and birds. We used high-throughput sequencing to classify QRFV as a novel orthomyxovirus. The genome of this virus is comprised of multiple RNA segments; five were completely sequenced. Proteins with limited amino acid similarity to conserved domains in polymerase (PA, PB1, and PB2) and hemagglutinin (HA) genes from known orthomyxoviruses were predicted to be present in four of the segments. The fifth sequenced segment shared no detectable similarity to any protein and is of uncertain function. The end-terminal sequences of QRFV are conserved between segments and are different from those of the known orthomyxovirus genera. QRFV is known to cross-react serologically with two other unclassified viruses, Johnston Atoll virus (JAV) and Lake Chad virus (LKCV). The complete open reading frames of PB1 and HA were sequenced for JAV, while a fragment of PB1 of LKCV was identified by mass sequencing. QRFV and JAV PB1 and HA shared 80% and 70% amino acid identity to each other, respectively; the LKCV PB1 fragment shared 83% amino acid identity with the corresponding region of QRFV PB1. Based on phylogenetic analyses, virion ultrastructural features, and the unique end-terminal sequences identified, we propose that QRFV, JAV, and LKCV comprise a novel genus of the family Orthomyxoviridae. PMID:19726499
Molecular characterization of a wild poliovirus type 3 epidemic in The Netherlands (1992 and 1993).
Mulders, M N; van Loon, A M; van der Avoort, H G; Reimerink, J H; Ras, A; Bestebroer, T M; Drebot, M A; Kew, O M; Koopmans, M P
1995-01-01
An outbreak of poliomyelitis due to wild poliovirus type 3 (PV3) occurred in an unvaccinated community in The Netherlands between September 1992 and February 1993. The outbreak involved 71 patients. The aim of this study was to characterize the virus at the molecular level and to analyze the molecular evolution of the epidemic virus. Molecular analysis was carried out by sequencing the VP1/2A junction region (150 nucleotides) of 50 PV3 strains isolated in association with this outbreak and the entire VP1 gene of 14 strains. In addition, the sequence of the VP1/2A junction region of strains from geographical regions endemic for PV3 (Egypt, India, and Central Asia) was analyzed and compared with the nucleotide sequence of the epidemic strain from The Netherlands. The earliest isolate was obtained from river water sampled 3 weeks before diagnosis of the first poliomyelitis patient and was found by VP1/2A sequence analysis to be genetically identical to the strain isolated from the first patient. Sequence divergence among the strains from the epidemic in The Netherlands was less than 2%. The closest genetic similarity (97.3%) was found with an Indian isolate (New Delhi, December 1991), indicating the likely source of the virus. A more than 99% sequence similarity was found in the VP1/2A region. Finally, the sequence information was used to design primers for the specific and highly sensitive molecular detection of PV3 strains during the epidemic. PMID:8586711
Zhu, Haisheng; Liu, Jianting; Wen, Qingfang; Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng
2017-01-01
Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar 'Fusi-3'. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1-6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism.
Sakthivelkumar, S; Ramaraj, P; Veeramani, V; Janarthanan, S
2015-09-01
The aim of the present study was to determine whether genetic variability exists among populations of Culex quinquefasciatus, information that would be a valuable tool in the management of mosquito control programmes. Populations of Cx. quinquefasciatus collected at different locations in Tamil Nadu were analyzed for genetic variation based on nucleotide sequences of the 28S rDNA D2 region. A high degree of genetic polymorphism was detected in the predicted secondary structures of the D2 region of 28S rDNA in spite of high nucleotide sequence similarity. The findings based on secondary structure using rDNA sequences suggested the existence of a complex genotypic diversity in the Cx. quinquefasciatus population collected at different locations of Tamil Nadu, India. This complexity of genetic diversity in a single mosquito population collected at different locations is considered an important issue because of its possible influence on the vector potential of these mosquitoes.
Wong, Danny Ka-Ho; Tsoi, Ottilia; Huang, Fung-Yu; Seto, Wai-Kay; Fung, James; Lai, Ching-Lung
2014-01-01
Nucleoside/nucleotide analogue treatment of chronic hepatitis B virus (HBV) infection is hampered by the emergence of drug resistance mutations. Conventional PCR sequencing cannot detect minor variants of <20%. We developed a modified co-amplification at lower denaturation temperature-PCR (COLD-PCR) method for the detection of HBV minority drug resistance mutations. The critical denaturation temperature for COLD-PCR was determined to be 78°C. Sensitivity of COLD-PCR sequencing was determined using serially diluted plasmids containing mixed proportions of HBV reverse transcriptase (rt) wild-type and mutant sequences. Conventional PCR sequencing detected mutations only if they existed in ≥25%, whereas COLD-PCR sequencing detected mutations when they existed in 5 to 10% of the viral population. The performance of COLD-PCR was compared to conventional PCR sequencing and a line probe assay (LiPA) using 215 samples obtained from 136 lamivudine- or telbivudine-treated patients with virological breakthrough. Among these 215 samples, drug resistance mutations were detected in 155 (72%), 148 (69%), and 113 samples (53%) by LiPA, COLD-PCR, and conventional PCR sequencing, respectively. Nineteen (9%) samples had mutations detectable by COLD-PCR but not LiPA, while 26 (12%) samples had mutations detectable by LiPA but not COLD-PCR, indicating both methods were comparable (P = 0.371). COLD-PCR was more sensitive than conventional PCR sequencing. Thirty-five (16%) samples had mutations detectable by COLD-PCR but not conventional PCR sequencing, while none had mutations detected by conventional PCR sequencing but not COLD-PCR (P < 0.0001). COLD-PCR sequencing is a simple method which is comparable to LiPA and superior to conventional PCR sequencing in detecting minor lamivudine/telbivudine resistance mutations. PMID:24951803
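The paired comparisons in this abstract rest on discordant sample counts (19 vs 26, and 35 vs 0). The abstract does not name the statistical test used; the sketch below assumes an exact (binomial) McNemar test, which is a common choice for paired detection data, and the helper name `mcnemar_exact` is hypothetical. Under that assumption the discordant counts give P values consistent with those reported.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from the two discordant counts.

    b = samples positive by method 1 only, c = samples positive by method 2 only.
    Under the null hypothesis each discordant sample is equally likely to fall
    in either group, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    k = min(b, c)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# COLD-PCR only vs LiPA only (assumed to be the comparison behind P = 0.371)
p_lipa = mcnemar_exact(19, 26)
# COLD-PCR only vs conventional PCR sequencing only (reported as P < 0.0001)
p_conventional = mcnemar_exact(35, 0)

print(f"COLD-PCR vs LiPA: P = {p_lipa:.3f}")
print(f"COLD-PCR vs conventional sequencing: P = {p_conventional:.2e}")
```

Only the discordant pairs enter the calculation; the samples on which both methods agree do not affect the result.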
Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.
Borodovsky, M; Rudd, K E; Koonin, E V
1994-01-01
The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
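The overlap percentages quoted in this abstract follow directly from the three counts it reports (354 GeneMark predictions, 208 BLAST hits, 182 supported by both). A short check, with the helper name `pct` chosen here purely for illustration:

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


genemark_orfs = 354   # putative expressed ORFs predicted by GeneMark
blast_orfs = 208      # ORFs with significant BLASTX/TBLASTN similarity
both = 182            # ORFs supported by both GeneMark and BLAST
conserved = 73        # putative new genes in ancient conserved families

print(pct(both, genemark_orfs))       # share of GeneMark predictions confirmed by BLAST
print(pct(both, blast_orfs))          # share of BLAST hits confirmed by GeneMark
print(pct(conserved, genemark_orfs))  # share of GeneMark predictions in conserved families
print(genemark_orfs - both)           # GeneMark-only predictions
print(blast_orfs - both)              # BLAST-only hits
```

The two difference lines give the ORFs supported by only one of the two methods, which is the group the abstract argues is still likely to contain real genes.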
SARNAclust: Semi-automatic detection of RNA protein binding motifs from immunoprecipitation data
Dotu, Ivan; Adamson, Scott I.; Coleman, Benjamin; Fournier, Cyril; Ricart-Altimiras, Emma; Eyras, Eduardo
2018-01-01
RNA-protein binding is critical to gene regulation, controlling fundamental processes including splicing, translation, localization and stability, and aberrant RNA-protein interactions are known to play a role in a wide variety of diseases. However, molecular understanding of RNA-protein interactions remains limited; in particular, identification of RNA motifs that bind proteins has long been challenging, especially when such motifs depend on both sequence and structure. Moreover, although RNA binding proteins (RBPs) often contain more than one binding domain, algorithms capable of identifying more than one binding motif simultaneously have not been developed. In this paper we present a novel pipeline to determine binding peaks in crosslinking immunoprecipitation (CLIP) data, to discover multiple possible RNA sequence/structure motifs among them, and to experimentally validate such motifs. At the core is a new semi-automatic algorithm SARNAclust, the first unsupervised method to identify and deconvolve multiple sequence/structure motifs simultaneously. SARNAclust computes similarity between sequence/structure objects using a graph kernel, providing the ability to isolate the impact of specific features through the bulge graph formalism. Application of SARNAclust to synthetic data shows its capability of clustering 5 motifs at once with a V-measure value of over 0.95, while GraphClust achieves only a V-measure of 0.083 and RNAcontext cannot detect any of the motifs. When applied to existing eCLIP sets, SARNAclust finds known motifs for SLBP and HNRNPC and novel motifs for several other RBPs such as AGGF1, AKAP8L and ILF3. We demonstrate an experimental validation protocol, a targeted Bind-n-Seq-like high-throughput sequencing approach that relies on RNA inverse folding for oligo pool design, that can validate the components within the SLBP motif. Finally, we use this protocol to experimentally interrogate the SARNAclust motif predictions for protein ILF3. 
Our results support a newly identified partially double-stranded UUUUUGAGA motif similar to that known for the splicing factor HNRNPC. PMID:29596423
Tlapakova, Tereza; Krylov, Vladimir; Macha, Jaroslav
2005-01-01
Two paralogous mitochondrial malate dehydrogenase 2 (Mdh2) genes of Xenopus laevis have been cloned and sequenced, revealing 95% identity. Fluorescence in-situ hybridization (FISH) combined with tyramide amplification discriminates both genes; Mdh2a was localized into chromosome q3 and Mdh2b into chromosome q8. One kb cDNA probes detect both genes with 85% accuracy. The remaining signals were on the paralogous counterpart. Introns interrupt coding sequences at the same nucleotide as defined for mouse. Restriction polymorphism has been detected in the first intron of Mdh2a, while the individual variability in intron 6 of Mdh2b gene is represented by an insertion of incomplete retrotransposon L1Xl. Rates of nucleotide substitutions indicate that both genes are under similar evolutionary constraints. X. laevis Mdh2 genes can be used as markers for physical mapping and linkage analysis.
Detection of Nipah virus RNA in fruit bat (Pteropus giganteus) from India.
Yadav, Pragya D; Raut, Chandrashekhar G; Shete, Anita M; Mishra, Akhilesh C; Towner, Jonathan S; Nichol, Stuart T; Mourya, Devendra T
2012-09-01
The study deals with the survey of different bat populations (Pteropus giganteus, Cynopterus sphinx, and Megaderma lyra) in India for highly pathogenic Nipah virus (NiV), Reston Ebola virus, and Marburg virus. Bats (n = 140) from two states in India (Maharashtra and West Bengal) were tested for IgG (serum samples) against these viruses and for virus RNAs. Only NiV RNA was detected in a liver homogenate of P. giganteus captured in Myanaguri, West Bengal. Partial sequence analysis of nucleocapsid, glycoprotein, fusion, and phosphoprotein genes showed similarity with the NiV sequences from earlier outbreaks in India. A serum sample of this bat was also positive by enzyme-linked immunosorbent assay for NiV-specific IgG. This is the first report on confirmation of Nipah viral RNA in Pteropus bat from India and suggests the possible role of this species in transmission of NiV in India.
Chirkov, Sergei; Ivanov, Peter; Sheveleva, Anna
2013-06-01
Atypical isolates of plum pox virus (PPV) were discovered in naturally infected sour cherry in urban ornamental plantings in Moscow, Russia. The isolates were detected by polyclonal double antibody sandwich ELISA and RT-PCR using universal primers specific for the 3'-non-coding and coat protein (CP) regions of the genome but failed to be recognized by triple antibody sandwich ELISA with the universal monoclonal antibody 5B and by RT-PCR using primers specific for PPV strains D, M, C and W. Sequence analysis of the CP genes of nine isolates revealed 99.2-100% within-group identity and 62-85% identity to conventional PPV strains. Phylogenetic analysis showed that the atypical isolates represent a group that is distinct from the known PPV strains. Alignment of the N-terminal amino acid sequences of CP demonstrated their close similarity to those of a new tentative PPV strain, CR.
Characterisation of a collagen gene subfamily from the potato cyst nematode Globodera pallida.
Gray, L J; Curtis, R H; Jones, J T
2001-01-24
We have isolated two full-length genomic DNA sequences, which encode the cuticle collagen proteins GP-COL-1 and GP-COL-2, from the potato cyst nematode Globodera pallida. A third, partial collagen gene ORF termed gp-col-t (t = truncated) has also been isolated and appears to represent an unexpressed pseudogene. The gp-col-1 and gp-col-2 genes both contain three short (<97 bp) introns which disrupt coding regions predicted to specify proteins with molecular weights of 33 and 32.7 kDa, respectively. All three sequences show high similarity to each other and to the previously isolated G. pallida cDNA clone gp-col-8. The conserved pattern of cysteine residues and non-(Gly-X-Y)(n) region sequence similarity observed in all four G. pallida genes suggests that these molecules form part of the same subfamily of collagens. Southern analysis indicates that this subfamily is likely to contain further members. The G. pallida collagen sequences show striking similarity to twelve genes from Caenorhabditis elegans which collectively represent the recently classified Group 1a collagen subfamily. No data exist on the function of this subfamily in C. elegans. gp-col-1 and gp-col-2 are developmentally regulated, with transcripts of both genes detected in adult virgin and gravid females but not in pre-parasitic second stage juveniles. A similar expression pattern is observed for the Group 1a collagen lemmi 5 from Meloidogyne incognita, perhaps indicating a generic link between subfamily and function during the various changes in cuticular structure which accompany nematode growth and reproduction. Immunochemical studies indicate that the GP-COL-1 protein is specifically located in the hypodermis of G. pallida adult females.
Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T
1993-01-01
Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conservative module. PMID:8497043
Group I introns are widespread in archaea.
Nawrocki, Eric P; Jones, Thomas A; Eddy, Sean R
2018-05-18
Group I catalytic introns have been found in bacterial, viral, organellar, and some eukaryotic genomes, but not in archaea. All known archaeal introns are bulge-helix-bulge (BHB) introns, with the exception of a few group II introns. It has been proposed that BHB introns arose from extinct group I intron ancestors, much like eukaryotic spliceosomal introns are thought to have descended from group II introns. However, group I introns have little sequence conservation, making them difficult to detect with standard sequence similarity searches. Taking advantage of recent improvements in a computational homology search method that accounts for both conserved sequence and RNA secondary structure, we have identified 39 group I introns in a wide range of archaeal phyla, including examples of group I introns and BHB introns in the same host gene.
Method for identifying and quantifying nucleic acid sequence aberrations
Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.
1998-01-01
A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.
Method for identifying and quantifying nucleic acid sequence aberrations
Lucas, J.N.; Straume, T.; Bogen, K.T.
1998-07-21
A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.
2012-01-01
Background: A detailed knowledge about spatial and temporal gene expression is important for understanding both the function of genes and their evolution. For the vast majority of species, transcriptomes are still largely uncharacterized and even in those where substantial information is available it is often in the form of partially sequenced transcriptomes. With the development of next generation sequencing, a single experiment can now simultaneously identify the transcribed part of a species genome and estimate levels of gene expression. Results: mRNA from actively growing needles of Norway spruce (Picea abies) was sequenced using next generation sequencing technology. In total, close to 70 million fragments with a length of 76 bp were sequenced resulting in 5 Gbp of raw data. A de novo assembly of these reads, together with publicly available expressed sequence tag (EST) data from Norway spruce, was used to create a reference transcriptome. Of the 38,419 PUTs (putative unique transcripts) longer than 150 bp in this reference assembly, 83.5% show similarity to ESTs from other spruce species and of the remaining PUTs, 3,704 show similarity to protein sequences from other plant species, leaving 4,167 PUTs with limited similarity to currently available plant proteins. By predicting coding frames and comparing not only the Norway spruce PUTs, but also PUTs from the close relatives Picea glauca and Picea sitchensis to both Pinus taeda and Taxus mairei, we obtained estimates of synonymous and non-synonymous divergence among conifer species. In addition, we detected close to 15,000 SNPs of high quality and estimated gene expression differences between samples collected under dark and light conditions. Conclusions: Our study yielded a large number of single nucleotide polymorphisms as well as estimates of gene expression on transcriptome scale.
In agreement with a recent study we find that the synonymous substitution rate per year (0.6 × 10⁻⁹ and 1.1 × 10⁻⁹) is an order of magnitude smaller than values reported for angiosperm herbs. However, if one takes generation time into account, most of this difference disappears. The estimates of the dN/dS ratio (non-synonymous over synonymous divergence) reported here are in general much lower than 1 and only a few genes showed a ratio larger than 1. PMID:23122049
Tzou, Philip L; Ariyaratne, Pramila; Varghese, Vici; Lee, Charlie; Rakhmanaliev, Elian; Villy, Carolin; Yee, Meiqi; Tan, Kevin; Michel, Gerd; Pinsky, Benjamin A; Shafer, Robert W
2018-06-01
The ability of next-generation sequencing (NGS) technologies to detect low frequency HIV-1 drug resistance mutations (DRMs) not detected by dideoxynucleotide Sanger sequencing has potential advantages for improved patient outcomes. We compared the performance of an in vitro diagnostic (IVD) NGS assay, the Sentosa SQ HIV genotyping assay for HIV-1 genotypic resistance testing, with Sanger sequencing on 138 protease/reverse transcriptase (RT) and 39 integrase sequences. The NGS assay used a 5% threshold for reporting low-frequency variants. The level of complete plus partial nucleotide sequence concordance between Sanger sequencing and NGS was 99.9%. Among the 138 protease/RT sequences, a mean of 6.4 DRMs was identified by both Sanger and NGS, a mean of 0.5 DRM was detected by NGS alone, and a mean of 0.1 DRM was detected by Sanger sequencing alone. Among the 39 integrase sequences, a mean of 1.6 DRMs was detected by both Sanger sequencing and NGS and a mean of 0.15 DRM was detected by NGS alone. Compared with Sanger sequencing, NGS estimated higher levels of resistance to one or more antiretroviral drugs for 18.2% of protease/RT sequences and 5.1% of integrase sequences. There was little evidence for technical artifacts in the NGS sequences, but the G-to-A hypermutation was detected in three samples. In conclusion, the IVD NGS assay evaluated in this study was highly concordant with Sanger sequencing. At the 5% threshold for reporting minority variants, NGS appeared to attain a modestly increased sensitivity for detecting low-frequency DRMs without compromising sequence accuracy. Copyright © 2018 American Society for Microbiology.
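The 5% reporting threshold described above can be illustrated with a minimal sketch. The `Variant` class, its field names, and the read frequencies below are hypothetical illustrations, not data from the study or part of the assay's software; whether the assay treats the threshold as inclusive is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """One called mutation (hypothetical record layout for illustration)."""
    mutation: str      # amino-acid change, e.g. "M184V"
    frequency: float   # fraction of reads carrying the mutation, 0.0 to 1.0


# The abstract states a 5% threshold for reporting low-frequency variants.
REPORT_THRESHOLD = 0.05


def reportable(variants, threshold=REPORT_THRESHOLD):
    """Keep variants at or above the threshold (inclusive bound is assumed)."""
    return [v for v in variants if v.frequency >= threshold]


# Made-up frequencies: one majority variant, one minority variant above
# the threshold, and one below it.
calls = [
    Variant("K103N", 0.62),
    Variant("M184V", 0.07),
    Variant("K65R", 0.02),
]

reported = [v.mutation for v in reportable(calls)]
print(reported)
```

In this sketch the variant at 7% would be reported while the one at 2% would be dropped; Sanger sequencing would plausibly miss both, which is the kind of difference the comparison in the abstract measures.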
Takahashi, Hajime; Kimura, Bon; Yoshikawa, Miwako; Fujii, Tateo
2003-05-01
The use of molecular tools for early and rapid detection of gram-negative histamine-producing bacteria is important for preventing the accumulation of histamine in fish products. To date, no molecular detection or identification system for gram-negative histamine-producing bacteria has been developed. A molecular method that allows the rapid detection of gram-negative histamine producers by PCR and simultaneous differentiation by single-strand conformation polymorphism (SSCP) analysis using the amplification product of the histidine decarboxylase genes (hdc) was developed. A collection of 37 strains of histamine-producing bacteria (8 reference strains from culture collections and 29 isolates from fish) and 470 strains of non-histamine-producing bacteria isolated from fish were tested. Histamine production of bacteria was determined by paper chromatography and confirmed by high-performance liquid chromatography. Among 37 strains of histamine-producing bacteria, all histidine-decarboxylating gram-negative bacteria produced a PCR product, except for a strain of Citrobacter braakii. In contrast, none of the non-histamine-producing strains (470 strains) produced an amplification product. Specificity of the amplification was further confirmed by sequencing the 0.7-kbp amplification product. A phylogenetic tree of the isolates constructed using newly determined sequences of partial hdc was similar to the phylogenetic tree generated from 16S ribosomal DNA sequences. In all of the fish samples tested, histamine accumulation occurred when PCR amplification of hdc was positive, and the presence of powerful histamine producers was confirmed by subsequent SSCP identification. The potential application of the PCR-SSCP method as a rapid monitoring tool is discussed.
Paweletz, Cloud P; Sacher, Adrian G; Raymond, Chris K; Alden, Ryan S; O'Connell, Allison; Mach, Stacy L; Kuang, Yanan; Gandhi, Leena; Kirschmeier, Paul; English, Jessie M; Lim, Lee P; Jänne, Pasi A; Oxnard, Geoffrey R
2016-02-15
Tumor genotyping is a powerful tool for guiding non-small cell lung cancer (NSCLC) care; however, comprehensive tumor genotyping can be logistically cumbersome. To facilitate genotyping, we developed a next-generation sequencing (NGS) assay using a desktop sequencer to detect actionable mutations and rearrangements in cell-free plasma DNA (cfDNA). An NGS panel was developed targeting 11 driver oncogenes found in NSCLC. Targeted NGS was performed using a novel methodology that maximizes on-target reads, and minimizes artifact, and was validated on DNA dilutions derived from cell lines. Plasma NGS was then blindly performed on 48 patients with advanced, progressive NSCLC and a known tumor genotype, and explored in two patients with incomplete tumor genotyping. NGS could identify mutations present in DNA dilutions at ≥ 0.4% allelic frequency with 100% sensitivity/specificity. Plasma NGS detected a broad range of driver and resistance mutations, including ALK, ROS1, and RET rearrangements, HER2 insertions, and MET amplification, with 100% specificity. Sensitivity was 77% across 62 known driver and resistance mutations from the 48 cases; in 29 cases with common EGFR and KRAS mutations, sensitivity was similar to droplet digital PCR. In two cases with incomplete tumor genotyping, plasma NGS rapidly identified a novel EGFR exon 19 deletion and a missed case of MET amplification. Blinded to tumor genotype, this plasma NGS approach detected a broad range of targetable genomic alterations in NSCLC with no false positives including complex mutations like rearrangements and unexpected resistance mutations such as EGFR C797S. Through use of widely available vacutainers and a desktop sequencing platform, this assay has the potential to be implemented broadly for patient care and translational research. ©2015 American Association for Cancer Research.
Paweletz, Cloud P.; Sacher, Adrian G.; Raymond, Chris K.; Alden, Ryan S.; O'Connell, Allison; Mach, Stacy L.; Kuang, Yanan; Gandhi, Leena; Kirschmeier, Paul; English, Jessie M.; Lim, Lee P.; Jänne, Pasi A.; Oxnard, Geoffrey R.
2015-01-01
Purpose Tumor genotyping is a powerful tool for guiding non-small cell lung cancer (NSCLC) care; however, comprehensive tumor genotyping can be logistically cumbersome. To facilitate genotyping, we developed a next-generation sequencing (NGS) assay using a desktop sequencer to detect actionable mutations and rearrangements in cell-free plasma DNA (cfDNA). Experimental Design An NGS panel was developed targeting 11 driver oncogenes found in NSCLC. Targeted NGS was performed using a novel methodology that maximizes on-target reads, and minimizes artifact, and was validated on DNA dilutions derived from cell lines. Plasma NGS was then blindly performed on 48 patients with advanced, progressive NSCLC and a known tumor genotype, and explored in two patients with incomplete tumor genotyping. Results NGS could identify mutations present in DNA dilutions at ≥0.4% allelic frequency with 100% sensitivity/specificity. Plasma NGS detected a broad range of driver and resistance mutations, including ALK, ROS1, and RET rearrangements, HER2 insertions, and MET amplification, with 100% specificity. Sensitivity was 77% across 62 known driver and resistance mutations from the 48 cases; in 29 cases with common EGFR and KRAS mutations, sensitivity was similar to droplet digital PCR. In two cases with incomplete tumor genotyping, plasma NGS rapidly identified a novel EGFR exon 19 deletion and a missed case of MET amplification. Conclusion Blinded to tumor genotype, this plasma NGS approach detected a broad range of targetable genomic alterations in NSCLC with no false positives including complex mutations like rearrangements and unexpected resistance mutations such as EGFR C797S. Through use of widely available vacutainers and a desktop sequencing platform, this assay has the potential to be implemented broadly for patient care and translational research. PMID:26459174
Lijavetzky, Diego; Cabezas, José Antonio; Ibáñez, Ana; Rodríguez, Virginia; Martínez-Zapater, José M
2007-01-01
Background Single-nucleotide polymorphisms (SNPs) are the most abundant type of DNA sequence polymorphisms. Their higher availability and stability when compared to simple sequence repeats (SSRs) provide enhanced possibilities for genetic and breeding applications such as cultivar identification, construction of genetic maps, the assessment of genetic diversity, the detection of genotype/phenotype associations, or marker-assisted breeding. In addition, the efficiency of these activities can be improved thanks to the ease with which SNP genotyping can be automated. Expressed sequence tags (EST) sequencing projects in grapevine are allowing for the in silico detection of multiple putative sequence polymorphisms within and among a reduced number of cultivars. In parallel, the sequence of the grapevine cultivar Pinot Noir is also providing thousands of polymorphisms present in this highly heterozygous genome. Still, the general application of those SNPs requires further validation since their use could be restricted to those specific genotypes. Results In order to develop a large SNP set of wide application in grapevine we followed a systematic re-sequencing approach in a group of 11 grape genotypes corresponding to ancient unrelated cultivars as well as wild plants. Using this approach, we have sequenced 230 gene fragments, which represents the analysis of over 1 Mb of grape DNA sequence. This analysis has allowed the discovery of 1573 SNPs with an average of one SNP every 64 bp (one SNP every 47 bp in non-coding regions and every 69 bp in coding regions). Nucleotide diversity in grape (π = 0.0051) was found to be similar to values observed in highly polymorphic plant species such as maize. The average number of haplotypes per gene sequence was estimated as six, with three haplotypes representing over 83% of the analyzed sequences. 
Short-range linkage disequilibrium (LD) studies within the analyzed sequences indicate the existence of a rapid decay of LD within the selected grapevine genotypes. To validate the use of the detected polymorphisms in genetic mapping, cultivar identification and genetic diversity studies we have used the SNPlex™ genotyping technology in a sample of grapevine genotypes and segregating progenies. Conclusion These results provide accurate values for nucleotide diversity in coding sequences and a first estimate of short-range LD in grapevine. Using SNPlex™ genotyping we have shown the application of a set of discovered SNPs as molecular markers for cultivar identification, linkage mapping and genetic diversity studies. Thus, the combination of a highly efficient re-sequencing approach and the SNPlex™ high throughput genotyping technology provides a powerful tool for grapevine genetic analysis. PMID:18021442
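The SNP density figures in this abstract can be cross-checked with a short calculation. This is a sketch under two assumptions that the abstract does not state explicitly: that "one SNP every 64 bp" is measured over the aligned length of one genotype, and that "over 1 Mb" counts the sequence from all 11 genotypes.

```python
# Figures taken from the abstract.
snps = 1573          # SNPs discovered
bp_per_snp = 64      # average spacing: one SNP every 64 bp
genotypes = 11       # grape genotypes re-sequenced
fragments = 230      # gene fragments sequenced

# Assumption: density is computed over the aligned length per genotype.
aligned_bp = snps * bp_per_snp

# Assumption: the "over 1 Mb" total sums the aligned length over all genotypes.
total_bp = aligned_bp * genotypes

# Implied average fragment length under the same assumption.
mean_fragment_bp = aligned_bp / fragments

print(f"aligned length per genotype: {aligned_bp} bp")
print(f"total across genotypes:      {total_bp} bp")
print(f"mean fragment length:        {mean_fragment_bp:.0f} bp")
```

Under these assumptions the numbers agree with one another: about 100 kb per genotype and about 1.1 Mb in total, which fits the stated "over 1 Mb".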
França, Luís; Simões, Catarina; Taborda, Marco; Diogo, Catarina; da Costa, Milton S.
2015-01-01
Over a period of ten months a total of 5618 cord blood units (CBU) were screened for microbial contamination under routine conditions. The antibiotic resistance profile for all isolates was also examined using ATB strips. The detection rate for culture positive units was 7.5%, corresponding to 422 samples. 16S rRNA sequence analysis and identification with API test system were used to identify the culturable aerobic, microaerophilic and anaerobic bacteria from CBUs. From these samples we recovered 485 isolates (84 operational taxonomic units, OTUs) assigned to the classes Bacteroidia, Actinobacteria, Clostridia, Bacilli, Betaproteobacteria and primarily to the Gammaproteobacteria. Sixty-nine OTUs, corresponding to 447 isolates, showed 16S rRNA sequence similarities above 99.0% with known cultured bacteria. However, 14 OTUs had 16S rRNA sequence similarities between 95 and 99% in support of genus level identification and one OTU with 16S rRNA sequence similarity of 90.3% supporting a family level identification only. The phenotypic identification formed 29 OTUs that could be identified to the species level and 9 OTUs that could be identified to the genus level by API test system. We failed to obtain identification for 14 OTUs, while 32 OTUs comprised organisms producing mixed identifications. Forty-two OTUs covered species not included in the API system databases. The API test system Rapid ID 32 Strep and Rapid ID 32 E showed the highest proportion of identifications to the species level, the lowest ratio of unidentified results and the highest agreement to the results of 16S rRNA assignments. Isolates affiliated to the Bacilli and Bacteroidia showed the highest antibiotic multi-resistance indices and microorganisms of the Clostridia displayed the most antibiotic sensitive phenotypes. PMID:26512991
França, Luís; Simões, Catarina; Taborda, Marco; Diogo, Catarina; da Costa, Milton S
2015-01-01
Over a period of ten months a total of 5618 cord blood units (CBU) were screened for microbial contamination under routine conditions. The antibiotic resistance profile for all isolates was also examined using ATB strips. The detection rate for culture positive units was 7.5%, corresponding to 422 samples. 16S rRNA sequence analysis and identification with API test system were used to identify the culturable aerobic, microaerophilic and anaerobic bacteria from CBUs. From these samples we recovered 485 isolates (84 operational taxonomic units, OTUs) assigned to the classes Bacteroidia, Actinobacteria, Clostridia, Bacilli, Betaproteobacteria and primarily to the Gammaproteobacteria. Sixty-nine OTUs, corresponding to 447 isolates, showed 16S rRNA sequence similarities above 99.0% with known cultured bacteria. However, 14 OTUs had 16S rRNA sequence similarities between 95 and 99% in support of genus level identification and one OTU with 16S rRNA sequence similarity of 90.3% supporting a family level identification only. The phenotypic identification formed 29 OTUs that could be identified to the species level and 9 OTUs that could be identified to the genus level by API test system. We failed to obtain identification for 14 OTUs, while 32 OTUs comprised organisms producing mixed identifications. Forty-two OTUs covered species not included in the API system databases. The API test system Rapid ID 32 Strep and Rapid ID 32 E showed the highest proportion of identifications to the species level, the lowest ratio of unidentified results and the highest agreement to the results of 16S rRNA assignments. Isolates affiliated to the Bacilli and Bacteroidia showed the highest antibiotic multi-resistance indices and microorganisms of the Clostridia displayed the most antibiotic sensitive phenotypes.
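The similarity bands reported in this abstract can be expressed as a small classification rule. This is a sketch, not the authors' procedure: the function name is hypothetical, treating similarities above 99.0% as species-level matches is an assumption (the abstract only says they matched known cultured bacteria), and the family-level band below 95% is extrapolated from the single OTU at 90.3%.

```python
def identification_level(similarity_pct):
    """Map a 16S rRNA similarity (percent) to the deepest supported rank.

    Bands follow the abstract: above 99.0%, 95 to 99%, and below 95%.
    The 'species' label for the top band is an assumption.
    """
    if similarity_pct > 99.0:
        return "species"
    if similarity_pct >= 95.0:
        return "genus"
    return "family"


# Consistency checks on counts stated in the abstract.
otus_by_band = {"species": 69, "genus": 14, "family": 1}
total_otus = sum(otus_by_band.values())

# Detection rate: 422 culture-positive units out of 5618 screened.
detection_rate_pct = round(100 * 422 / 5618, 1)

print(identification_level(90.3), total_otus, detection_rate_pct)
```

The band counts add up to the 84 OTUs reported, and 422 of 5618 units reproduces the stated 7.5% detection rate.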
Nath, B Surendra; Gupta, S K; Bajpai, A K
2012-12-01
The life cycle, spore morphology, pathogenicity, tissue specificity, mode of transmission and small subunit rRNA (SSU-rRNA) gene sequence analysis of the five new microsporidian isolates viz., NIWB-11bp, NIWB-12n, NIWB-13md, NIWB-14b and NIWB-15mb identified from the silkworm, Bombyx mori, have been studied along with the type species, NIK-1s_mys. The life cycles of the microsporidians identified followed sequential developmental stages similar to the general developmental cycle of the genus Nosema. The spores showed considerable variations in their shape, length and width. The pathogenicity observed was dose-dependent and differed among the microsporidian isolates; NIWB-15mb was found to be more virulent than the other isolates. All of the microsporidians were found to infect most of the tissues examined and showed gonadal infection and transovarial transmission in the infected silkworms. The SSU-rRNA sequence-based phylogenetic tree placed NIWB-14b, NIWB-12n and NIWB-11bp in a separate branch along with other Nosema species and Nosema bombycis, while NIWB-15mb and NIWB-13md together formed another cluster along with other Nosema species. NIK-1s_mys revealed a signature sequence similar to the standard type species, N. bombycis, indicating that NIK-1s_mys is similar to N. bombycis. Based on phylogenetic relationships, branch length information based on genetic distance and nucleotide differences, we conclude that the microsporidian isolates identified are distinctly different from the other known species and belong to the genus Nosema. This SSU-rRNA gene sequence analysis method was found to be a more useful approach for detecting different and closely related microsporidians of this economically important domestic insect.
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs.
Powell, Bradford C; Hutchison, Clyde A
2006-01-19
Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes.
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs
Powell, Bradford C; Hutchison, Clyde A
2006-01-01
Background Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes. PMID:16423288
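The first step described above, collecting every ORF of at least 30 codons, can be sketched as a six-frame scan. This is an illustrative sketch only, not the authors' pipeline: the abstract does not say how an ORF was delimited, so the code assumes an ORF runs from an ATG to the next in-frame stop codon, counts codons excluding the stop, and uses the standard stop codons TAA, TAG and TGA. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Scan all six reading frames for ATG-to-stop ORFs.

    Returns tuples (strand, start, end, n_codons). Offsets are 0-based
    positions on the scanned strand, end is exclusive and includes the
    stop codon, and n_codons excludes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, i + 3, n_codons))
                    start = None
    return orfs


# Toy sequences: ATG plus 29 (or 28) alanine codons, then a stop.
long_enough = "ATG" + "GCT" * 29 + "TAA"   # 30 codons before the stop
too_short = "ATG" + "GCT" * 28 + "TAA"     # 29 codons before the stop

print(find_orfs(long_enough))
print(find_orfs(too_short))
```

The translated peptides from such ORFs would then be the input to the clustering step; that step is not sketched here because the abstract gives no detail on how the COGs were built.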
Mutation Scanning in Wheat by Exon Capture and Next-Generation Sequencing.
King, Robert; Bird, Nicholas; Ramirez-Gonzalez, Ricardo; Coghill, Jane A; Patil, Archana; Hassani-Pak, Keywan; Uauy, Cristobal; Phillips, Andrew L
2015-01-01
Targeted Induced Local Lesions in Genomes (TILLING) is a reverse genetics approach to identify novel sequence variation in genomes, with the aims of investigating gene function and/or developing useful alleles for breeding. Despite recent advances in wheat genomics, most current TILLING methods are low to medium in throughput, being based on PCR amplification of the target genes. We performed a pilot-scale evaluation of TILLING in wheat by next-generation sequencing through exon capture. An oligonucleotide-based enrichment array covering ~2 Mbp of wheat coding sequence was used to carry out exon capture and sequencing on three mutagenised lines of wheat containing previously-identified mutations in the TaGA20ox1 homoeologous genes. After testing different mapping algorithms and settings, candidate SNPs were identified by mapping to the IWGSC wheat Chromosome Survey Sequences. Where sequence data for all three homoeologues were found in the reference, mutant calls were unambiguous; however, where the reference lacked one or two of the homoeologues, captured reads from these genes were mis-mapped to other homoeologues, resulting either in dilution of the variant allele frequency or assignment of mutations to the wrong homoeologue. Competitive PCR assays were used to validate the putative SNPs and estimate cut-off levels for SNP filtering. At least 464 high-confidence SNPs were detected across the three mutagenized lines, including the three known alleles in TaGA20ox1, indicating a mutation rate of ~35 SNPs per Mb, similar to that estimated by PCR-based TILLING. This demonstrates the feasibility of using exon capture for genome re-sequencing as a method of mutation detection in polyploid wheat, but accurate mutation calling will require an improved genomic reference with more comprehensive coverage of homoeologues.
Trung, Le Quang; VAN Puyvelde, Karolien; Triest, Ludwig
2008-03-01
Consensus primers, based on exon sequences of the cyp73 gene family coding for cinnamate 4-hydroxylase (C4H) of the lignin biosynthesis pathway, were designed for the tetraploid willow species Salix alba and Salix fragilis. Diagnostic alleles at species level were observed among introns of three cyp73 genes and allowed unambiguous detection of the first generation and introgressed hybrids in populations. Progeny analysis of a female S. alba with a male introgressed hybrid confirmed the codominant inheritance of each intron. Sequences of the diagnostic alleles of both species were similar to those found in the hybrids. © 2007 The Authors.
A novel Enterovirus 96 circulating in China causes hand, foot, and mouth disease.
Xu, Yi; Sun, Yisuo; Ma, Jinmin; Zhou, Shuru; Fang, Wei; Ye, Jiawei; Tan, Limei; Ji, Jingkai; Luo, Dan; Li, Liqiang; Li, Jiandong; Fang, Chunxiao; Pei, Na; Shi, Shuo; Liu, Xin; Jiang, Hui; Gong, Sitang; Xu, Xun
2017-06-01
Enterovirus 96 (EV-96) is a recently described member of the species Enterovirus C and is associated with paralysis and myelitis. In this study, using metagenomic sequencing, we identified a new enterovirus 96 strain (EV-96-SZ/GD/CHN/2014) as the sole pathogen causing hand, foot, and mouth disease (HFMD). A genomic comparison showed that EV-96-SZ/GD/CHN/2014 is most similar to the EV-96-05517 strain (85% identity), which has also been detected in Guangdong Province. This is the first time that metagenomic sequencing has been used to identify an EV-96 strain shown to be associated with HFMD.
Phylogeny and differentiation of reptilian and amphibian ranaviruses detected in Europe.
Stöhr, Anke C; López-Bueno, Alberto; Blahak, Silvia; Caeiro, Maria F; Rosa, Gonçalo M; Alves de Matos, António Pedro; Martel, An; Alejo, Alí; Marschang, Rachel E
2015-01-01
Ranaviruses in amphibians and fish are considered emerging pathogens and several isolates have been extensively characterized in different studies. Ranaviruses have also been detected in reptiles with increasing frequency, but the role of reptilian hosts is still unclear and only limited sequence data has been provided. In this study, we characterized a number of ranaviruses detected in wild and captive animals in Europe based on sequence data from six genomic regions (major capsid protein (MCP), DNA polymerase (DNApol), ribonucleoside diphosphate reductase alpha and beta subunit-like proteins (RNR-α and -β), viral homolog of the alpha subunit of eukaryotic initiation factor 2, eIF-2α (vIF-2α) genes and microsatellite region). A total of ten different isolates from reptiles (tortoises, lizards, and a snake) and four ranaviruses from amphibians (anurans, urodeles) were included in the study. Furthermore, the complete genome sequences of three reptilian isolates were determined and a new PCR for rapid classification of the different variants of the genomic arrangement was developed. All ranaviruses showed slight variations on the partial nucleotide sequences from the different genomic regions (92.6-100%). Some very similar isolates could be distinguished by the size of the band from the microsatellite region. Three of the lizard isolates had a truncated vIF-2α gene; the other ranaviruses had full-length genes. In the phylogenetic analyses of concatenated sequences from different genes (3223 nt/10287 aa), the reptilian ranaviruses were often more closely related to amphibian ranaviruses than to each other, and most clustered together with previously detected ranaviruses from the same geographic region of origin. Comparative analyses show that among the closely related amphibian-like ranaviruses (ALRVs) described to date, three recently split and independently evolving distinct genetic groups can be distinguished. 
These findings underline the wide host range of ranaviruses and the emergence of pathogen pollution via animal trade of ectothermic vertebrates.
Phylogeny and Differentiation of Reptilian and Amphibian Ranaviruses Detected in Europe
Stöhr, Anke C.; López-Bueno, Alberto; Blahak, Silvia; Caeiro, Maria F.; Rosa, Gonçalo M.; Alves de Matos, António Pedro; Martel, An; Alejo, Alí; Marschang, Rachel E.
2015-01-01
Ranaviruses in amphibians and fish are considered emerging pathogens and several isolates have been extensively characterized in different studies. Ranaviruses have also been detected in reptiles with increasing frequency, but the role of reptilian hosts is still unclear and only limited sequence data has been provided. In this study, we characterized a number of ranaviruses detected in wild and captive animals in Europe based on sequence data from six genomic regions (major capsid protein (MCP), DNA polymerase (DNApol), ribonucleoside diphosphate reductase alpha and beta subunit-like proteins (RNR-α and -β), viral homolog of the alpha subunit of eukaryotic initiation factor 2, eIF-2α (vIF-2α) genes and microsatellite region). A total of ten different isolates from reptiles (tortoises, lizards, and a snake) and four ranaviruses from amphibians (anurans, urodeles) were included in the study. Furthermore, the complete genome sequences of three reptilian isolates were determined and a new PCR for rapid classification of the different variants of the genomic arrangement was developed. All ranaviruses showed slight variations on the partial nucleotide sequences from the different genomic regions (92.6–100%). Some very similar isolates could be distinguished by the size of the band from the microsatellite region. Three of the lizard isolates had a truncated vIF-2α gene; the other ranaviruses had full-length genes. In the phylogenetic analyses of concatenated sequences from different genes (3223 nt/10287 aa), the reptilian ranaviruses were often more closely related to amphibian ranaviruses than to each other, and most clustered together with previously detected ranaviruses from the same geographic region of origin. Comparative analyses show that among the closely related amphibian-like ranaviruses (ALRVs) described to date, three recently split and independently evolving distinct genetic groups can be distinguished. 
These findings underline the wide host range of ranaviruses and the emergence of pathogen pollution via animal trade of ectothermic vertebrates. PMID:25706285
The Discovery, Distribution, and Evolution of Viruses Associated with Drosophila melanogaster
Webster, Claire L.; Waldron, Fergal M.; Robertson, Shaun; Crowson, Daisy; Ferrari, Giada; Quintana, Juan F.; Brouqui, Jean-Michel; Bayne, Elizabeth H.; Longdon, Ben; Buck, Amy H.; Lazzaro, Brian P.; Akorli, Jewelna; Haddrill, Penelope R.; Obbard, Darren J.
2015-01-01
Drosophila melanogaster is a valuable invertebrate model for viral infection and antiviral immunity, and is a focus for studies of insect-virus coevolution. Here we use a metagenomic approach to identify more than 20 previously undetected RNA viruses and a DNA virus associated with wild D. melanogaster. These viruses not only include distant relatives of known insect pathogens but also novel groups of insect-infecting viruses. By sequencing virus-derived small RNAs, we show that the viruses represent active infections of Drosophila. We find that the RNA viruses differ in the number and properties of their small RNAs, and we detect both siRNAs and a novel miRNA from the DNA virus. Analysis of small RNAs also allows us to identify putative viral sequences that lack detectable sequence similarity to known viruses. By surveying >2,000 individually collected wild adult Drosophila we show that more than 30% of D. melanogaster carry a detectable virus, and more than 6% carry multiple viruses. However, despite a high prevalence of the Wolbachia endosymbiont—which is known to be protective against virus infections in Drosophila—we were unable to detect any relationship between the presence of Wolbachia and the presence of any virus. Using publicly available RNA-seq datasets, we show that the community of viruses in Drosophila laboratories is very different from that seen in the wild, but that some of the newly discovered viruses are nevertheless widespread in laboratory lines and are ubiquitous in cell culture. By sequencing viruses from individual wild-collected flies we show that some viruses are shared between D. melanogaster and D. simulans. Our results provide an essential evolutionary and ecological context for host–virus interaction in Drosophila, and the newly reported viral sequences will help develop D. melanogaster further as a model for molecular and evolutionary virus research. PMID:26172158
Searching for δ Scuti-type pulsation and characterising northern pre-main-sequence field stars
NASA Astrophysics Data System (ADS)
Díaz-Fraile, D.; Rodríguez, E.; Amado, P. J.
2014-08-01
Context. Pre-main-sequence (PMS) stars are objects evolving from the birthline to the zero-age main sequence (ZAMS). Given a mass range near the ZAMS, the temperatures and luminosities of PMS and main-sequence stars are very similar. Moreover, their evolutionary tracks intersect one another, causing some ambiguity in the determination of their evolutionary status. In this context, the detection and study of pulsations in PMS stars is crucial for differentiating between both types of stars by obtaining information about their interiors via asteroseismic techniques. Aims: A photometric variability study of a sample of northern field stars, which were previously classified as either PMS or Herbig Ae/Be objects, has been undertaken with the purpose of detecting δ Scuti-type pulsations. Determination of physical parameters for these stars has also been carried out to locate them on the Hertzsprung-Russell diagram and check the instability strip for this type of pulsator. Methods: Multichannel photomultiplier and CCD time series photometry in the uvby Strömgren and BVI Johnson bands were obtained during four consecutive years from 2007 to 2010. The light curves have been analysed, and a variability criterion has been established. Among the objects classified as variable stars, we have selected those which present periodicities above 4 d⁻¹, which was established as the lower limit for δ Scuti-type pulsations in this investigation. Finally, these variable stars have been placed in a colour-magnitude diagram using the physical parameters derived with the collected uvbyβ Strömgren-Crawford photometry. Results: Five PMS δ Scuti- and three probable β Cephei-type stars have been detected. Two additional PMS δ Scuti stars are also confirmed in this work. Moreover, three new δ Scuti- and two γ Doradus-type stars have been detected among the main-sequence objects used as comparison or check stars.
Identification of Shifts and Trends in Hydrometric Data in Canada Based on Several Detection Tests
NASA Astrophysics Data System (ADS)
Lauzon, N.; Lence, B. J.
2004-05-01
This work proposes new detection tests based on the Kohonen neural network and on fuzzy c-means for the identification of shifts and trends in data sequences. Annual mean and maximum flow sequences are considered as an application case, for they have often been considered for the study of shifts and trends in hydrologic data. In recent years, several studies for the identification of trends have been accomplished with North American hydrometric data, often making use of only one detection test. The assumption here is that one cannot rely on only one test, and consequently several are employed in this work. A total of eight tests are considered, four for shifts and four for trends. Four of these tests, two for shifts and two for trends, are conventional statistical tests that are regularly employed, while the other four are developed based on the Kohonen neural network and on fuzzy c-means. Data from a group of 40 hydrometric stations across Canada are assessed for the detection of shifts and trends in time periods of 30, 40 and 50 years. While the results obtained confirm the conclusions of previous studies performed on similar groups of data, they also indicate that the tests may behave differently from one another. For example, one test may detect a trend in a given sequence while the other tests do not, or vice versa. Thus, the strategy of using several tests ensures not only that they may confirm each other's diagnostics but also that they may complement each other in the case of divergent diagnostics, with the possibility of improving the final conclusion on the detection of shifts and trends. Using artificial intelligence techniques for the construction of detection tests also constitutes a departure from the use of statistics, and a discussion in this work on complementary studies (i.e. detection on multivariate cases) highlights the possibility of enhanced performance by the artificial intelligence-based tests compared with conventional detection tests.
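The abstract above does not name its four conventional tests. As a purely illustrative sketch of what a conventional trend test on an annual flow sequence can look like, the following Python code implements the Mann-Kendall statistic, a widely used rank-based trend test. Choosing Mann-Kendall is an assumption made here for illustration, not something the abstract states, and the variance formula used ignores ties.

```python
import math

def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test.

    Assumption: no tie correction is applied to the variance, so the
    Z value is only approximate when the series contains equal values.
    """
    n = len(series)
    s = 0
    # S counts concordant minus discordant pairs over all i < j.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis of no trend (no ties).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z
```

A |Z| above roughly 1.96 would suggest a trend at the 5% level under the normal approximation. Running several such tests on the same sequence and comparing their verdicts is the strategy the abstract argues for.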
Aoki, Narumi; Tsutsumi, Kadzuyo; Deshimaru, Masanobu; Terada, Shigeyuki
2008-02-01
An antihemorrhagic protein has been isolated from the serum of Chinese mamushi (Gloydius blomhoffi brevicaudus) by using a combination of ethanol precipitation and reverse-phase high-performance liquid chromatography (HPLC) on a C8 column. This protein, designated Chinese mamushi serum factor (cMSF), suppressed mamushi venom-induced hemorrhage in a dose-dependent manner. It had no effect on trypsin, chymotrypsin, thermolysin, and papain but inhibited the proteinase activities of several snake venom metalloproteinases (SVMPs) including hemorrhagic enzymes isolated from the venoms of mamushi and habu (Trimeresurus flavoviridis). A similar protein (Japanese MSF, jMSF) with antihemorrhagic activity has also been purified from the sera of Japanese mamushi (G. blomhoffi). The N-terminal 70 and 51 residues of the intact cMSF and jMSF were directly analyzed; a similarity between the sequences of the two MSFs and that of the antihemorrhagic protein (HSF) from habu serum was noticed. To obtain the complete amino acid sequences of MSFs, cDNAs encoding these proteins were cloned from the liver mRNA of Chinese and Japanese vipers based on their N-terminal amino acid sequences. The mature forms of both MSFs consisted of 305 amino acids with a 19-residue signal sequence, and a unique 17-residue deletion was detected in their His-rich domains.
Seok, Yoonmi; Bae, Il Kwon; Jeong, Seok Hoon; Kim, Soo Hyun; Lee, Hyukmin; Lee, Kyungwon
2011-12-01
To investigate the epidemiological traits of Pseudomonas aeruginosa clinical isolates producing metallo-β-lactamases (MBLs) in Korea. A total of 386 non-duplicate P. aeruginosa clinical isolates were collected from Korea in 2009. Detection of MBL genes was performed by PCR. The genetic organization of class 1 integrons carrying the MBL gene cassette was investigated by PCR mapping and sequencing. The epidemiological relationships of the isolates were investigated by multilocus sequence typing and PFGE. Of 386 P. aeruginosa isolates, 30 (7.8%) isolates carried the bla(IMP-6) gene and 1 (0.3%) isolate carried the bla(VIM-2) gene. A probe specific for the bla(IMP-6) gene was hybridized to an ∼950 kbp I-CeuI-macrorestriction fragment from all 30 isolates and a probe specific for the bla(VIM-2) gene also hybridized to an ∼500 kbp I-CeuI-macrorestriction fragment from 1 isolate (BDC10). All 31 MBL-producing isolates shared an identical sequence type (ST), ST235, and they carried the same bla(OXA-50) allelic type, bla(OXA-50g). All MBL-producing isolates showed similar XbaI-macrorestriction patterns (similarity >85%), irrespective of MBL genotype. P. aeruginosa ST235 carrying the chromosomally located bla(IMP-6) gene is widely disseminated in Korea.
Hatamoto, Masashi; Kimura, Masafumi; Sato, Takafumi; Koizumi, Masato; Takahashi, Masanobu; Kawakami, Shuji; Araki, Nobuo; Yamaguchi, Takashi
2014-01-01
Denitrifying anaerobic methane oxidizing (DAMO) microorganisms were enriched from paddy field soils using continuous-flow and batch cultures fed with nitrate or nitrite as a sole electron acceptor. After several months of cultivation, the continuous-flow cultures using nitrite showed remarkable simultaneous methane oxidation and nitrite reduction, and DAMO bacteria belonging to phylum NC10 were enriched. A maximum volumetric nitrite consumption rate of 70.4±3.4 mg-N·L⁻¹·day⁻¹ was achieved with a very short hydraulic retention time of 2.1 hours. In the culture, about 68% of total microbial cells were bacteria and no archaeal cells were detected by fluorescence in situ hybridization. In the nitrate-fed continuous-flow cultures, 58% of total microbial cells were bacteria while archaeal cells accounted for 7% of total cell numbers. Phylogenetic analysis of pmoA gene sequences showed that enriched DAMO bacteria in the continuous-flow cultivation had over 98% sequence similarity to DAMO bacteria in the inoculum. In contrast, for batch culture, the enriched pmoA gene sequences had 89-91% sequence similarity to DAMO bacteria in the inoculum. These results indicate that electron acceptor and cultivation method strongly affect the microbial community structures of DAMO consortia.
Panosyan, Hovik; Birkeland, Nils-Kåre
2014-11-01
The phylogenetic diversity of the prokaryotic community thriving in the Arzakan hot spring in Armenia was studied using molecular and culture-based methods. A sequence analysis of 16S rRNA gene clone libraries demonstrated the presence of a diversity of microorganisms belonging to the Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Firmicutes, Bacteroidetes phyla, and Cyanobacteria. Proteobacteria was the dominant group, representing 52% of the bacterial clones. Denaturing gradient gel electrophoresis profiles of the bacterial 16S rRNA gene fragments also indicated the abundance of Proteobacteria, Bacteroidetes, and Cyanobacteria populations. Most of the sequences were most closely related to uncultivated microorganisms and shared less than 96% similarity with their closest matches in GenBank, indicating that this spring harbors a unique community of novel microbial species or genera. The majority of the sequences of an archaeal 16S rRNA gene library, generated from a methanogenic enrichment, were close relatives of members of the genus Methanoculleus. Aerobic endospore-forming bacteria mainly belonging to Bacillus and Geobacillus were detected only by culture-dependent methods. Three isolates were successfully obtained having 99, 96, and 96% 16S rRNA gene sequence similarities to Arcobacter sp., Methylocaldum sp., and Methanoculleus sp., respectively. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Lu, Jing; Guo, Xue; Zhang, Yong; Li, Hui; Liu, Leng; Zeng, Hanri; Fang, Ling; Mo, Yanling; Yoshida, Hiromu; Yi, Lina; Liu, Tao; Rutherford, Shannon; Xu, Wenbo; Ke, Changwen
2015-01-01
An aseptic meningitis outbreak occurred in Luoding City of Guangdong, China, in 2012, and echovirus type 30 (ECHO30) was identified as the major causative pathogen. Environmental surveillance indicated that ECHO30 was detected in the sewage of a neighboring city, Guangzhou, from 2010 to 2012 and also in Luoding City sewage samples (6/43, 14%) collected after the outbreak. In order to track the potential origin of the outbreak viral strains, we sequenced the VP1 genes of 29 viral strains from clinical patients and environmental samples. Sequence alignments and phylogenetic analyses based on VP1 gene sequences revealed that virus strains isolated from the sewage of Guangzhou and Luoding cities matched well the clinical strains from the outbreak, with high nucleotide sequence similarity (98.5% to 100%) and similar cluster distribution. Five ECHO30 clinical strains were clustered with the Guangdong environmental strains but diverged from strains from other regions, suggesting that this subcluster of viruses most likely originated from the circulating virus in Guangdong rather than having been more recently imported from other regions. These findings underscore the importance of long-term, continuous environmental surveillance and genetic analysis to monitor circulating enteroviruses. PMID:25616804
Passos-Castilho, Ana Maria; Granato, Celso Francisco Hernandes
Hepatitis E virus is responsible for acute and chronic liver infections worldwide. Swine hepatitis E virus has been isolated in Brazil, and a probable zoonotic transmission has been described, although data are still scarce. The aim of this study was to investigate the frequency of hepatitis E virus infection in pigs from a small-scale farm in the rural area of Paraná State, South Brazil. Fecal samples were collected from 170 pigs and screened for hepatitis E virus RNA using a duplex real-time RT-PCR targeting a highly conserved 70-nt sequence within overlapping parts of ORF2 and ORF3 as well as a 113-nt sequence of ORF2. Positive samples with high viral loads were subjected to direct sequencing and phylogenetic analysis. Hepatitis E virus RNA was detected in 34 (20.0%) of the 170 pigs following positive results in at least one set of screening real-time RT-PCR primers and probes. The swine hepatitis E virus strains clustered with the hepatitis E virus genotype 3b reference sequences in the phylogenetic analysis and showed close similarity to human hepatitis E virus isolates previously reported in Brazil. Copyright © 2017 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.
Solomon, Isaac H; Spera, Kristyn M; Ryan, Sophia L; Helgager, Jeffrey; Andrici, Juliana; Zaki, Sherif R; Vaitkevicius, Henrikas; Leon, Kristoffer E; Wilson, Michael R; DeRisi, Joseph L; Koo, Sophia; Smirnakis, Stelios M; De Girolami, Umberto
2018-06-01
Powassan virus is a rare but increasingly recognized cause of severe neurological disease. To highlight the diagnostic challenges and neuropathological findings in a fatal case of Powassan encephalitis caused by deer tick virus (lineage II) in a patient with follicular lymphoma receiving rituximab, with nonspecific anti-GAD65 antibodies, who was initially seen with fever and orchiepididymitis. Comparison of clinical, radiological, histological, and laboratory findings, including immunohistochemistry, real-time polymerase chain reaction, antibody detection, and unbiased sequencing assays, in a single case report (first seen in December 2016) at an academic medical center. Infection with Powassan virus. Results of individual assays compared retrospectively. In a 63-year-old man with fatal Powassan encephalitis, serum and cerebrospinal fluid IgM antibodies were not detected via standard methods, likely because of rituximab exposure. Neuropathological findings were extensive, including diffuse leptomeningeal and parenchymal lymphohistiocytic infiltration, microglial proliferation, marked neuronal loss, and white matter microinfarctions most severely involving the cerebellum, thalamus, and basal ganglia. Diagnosis was made after death by 3 independent methods, including demonstration of Powassan virus antigen in brain biopsy and autopsy tissue, detection of viral RNA in serum and cerebrospinal fluid by targeted real-time polymerase chain reaction, and detection of viral RNA in cerebrospinal fluid by unbiased sequencing. Extensive testing for other etiologies yielded negative results, including mumps virus owing to prodromal orchiepididymitis. Low-titer anti-GAD65 antibodies identified in serum, suggestive of limbic encephalitis, were not detected in cerebrospinal fluid. 
Owing to the rarity of Powassan encephalitis, a high degree of suspicion is required to make the diagnosis, particularly in an immunocompromised patient, in whom antibody-based assays may be falsely negative. Unbiased sequencing assays have the potential to detect uncommon infectious agents and may prove useful in similar scenarios.
Genetic characterization of a novel astrovirus in Pekin ducks.
Liao, Qinfeng; Liu, Ning; Wang, Xiaoyan; Wang, Fumin; Zhang, Dabing
2015-06-01
Three divergent groups of duck astroviruses (DAstVs), namely DAstV-1, DAstV-2 (formerly duck hepatitis virus type 3) and DAstV-3 (isolate CPH), and other avastroviruses are known to infect domestic ducks. To provide more data regarding the molecular epidemiology of astroviruses in domestic ducks, we examined the prevalence of astroviruses in 136 domestic duck samples collected from four different provinces of China. Nineteen goose samples were also included. Using an astrovirus-specific reverse transcription-PCR assay, two groups of astroviruses were detected in our samples. A group of astroviruses detected in Pekin ducks, Shaoxing ducks and Landes geese was highly similar to the newly discovered DAstV-3. More interestingly, a novel group of avastroviruses, which we named DAstV-4, was detected in Pekin ducks. Following full-length sequencing and sequence analysis, the variation between DAstV-4 and other avastroviruses in terms of the lengths of the genome and its internal components was highlighted. Sequence identity and phylogenetic analyses based on the amino acid sequences of the three open reading frames (ORFs) clearly demonstrated that DAstV-4 was highly divergent from all other avastroviruses. Further analyses showed that DAstV-4 shared low levels of genome identity (50-58%) and high levels of mean amino acid genetic distance in the ORF2 sequences (0.520-0.801) with other avastroviruses, suggesting that DAstV-4 may represent an additional avastrovirus species, although the taxonomic relationship of DAstV-4 to DAstV-3 remains to be resolved. The present work contributes to the understanding of the epidemiology, ecology and taxonomy of astroviruses in ducks. Copyright © 2015 Elsevier B.V. All rights reserved.
Coleman, Russell E; Hochberg, Lisa P; Swanson, Katherine I; Lee, John S; McAvin, James C; Moulton, John K; Eddington, David O; Groebner, Jennifer L; O'Guinn, Monica L; Putnam, John L
2009-05-01
Sand flies collected between April 2003 and November 2004 at Tallil Air Base, Iraq, were evaluated for the presence of Leishmania parasites using a combination of a real-time Leishmania-generic polymerase chain reaction (PCR) assay and sequencing of a 360-bp fragment of the glucose-6-phosphate-isomerase (GPI) gene. A total of 2,505 pools containing 26,574 sand flies were tested using the real-time PCR assay. Leishmania DNA was initially detected in 536 pools; however, after extensive retesting with the real-time PCR assay, a total of 456 pools were considered positive and 80 were considered indeterminate. A total of 532 samples were evaluated for Leishmania GPI by sequencing, to include 439 PCR-positive samples, 80 PCR-indeterminate samples, and 13 PCR-negative samples. Leishmania GPI was detected in 284 samples that were sequenced, to include 281 (64%) of the PCR-positive samples and 3 (4%) of the PCR-indeterminate samples. Of the 284 sequences identified as Leishmania, 261 (91.9%) were L. tarentolae, 18 (6.3%) were L. donovani-complex parasites, 3 (1.1%) were L. tropica, and 2 were similar to both L. major and L. tropica. Minimum field infection rates were 0.09% for L. donovani-complex parasites, 0.02% for L. tropica, and 0.01% for the L. major/tropica-like parasite. Subsequent sequencing of a 600-bp region of the "Hyper" gene of 12 of the L. donovani-complex parasites showed that all 12 parasites were L. infantum. These data suggest that L. infantum was the primary leishmanial threat to U.S. military personnel deployed to Tallil Air Base. The implications of these findings are discussed.
Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica
2016-02-18
The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding-VD and posterior decoding-PD) and the here described PD/VD protocol were successfully applied on 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) annotated by Pfam profile search (PfamA) as GDSL. Based on these analyses we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain a better insight into the evolutionary relationship of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte, Selaginella moellendorffii. 
Finally, we provide a general motif-HMM scanner which is easily accessible through the graphical user interface ( http://compbio.math.hr/ ). Our results show that scanning with a carefully parameterized motif-HMM is an effective approach for annotation of protein families with low sequence similarity and conserved motifs. The results of this study expand current knowledge and provide new insights into the evolution of the large GDSL-lipase family in land plants.
Collier, Ashley; Carr, Steven M
2018-03-29
Claims have long been made as to the survival to the present day of descendants of the Newfoundland Beothuk, a group generally accepted to have become extinct with the death of the last known member, Shanawdithit, in 1829. Interest has recently been revived by the availability of commercial genetic testing, which some claim can assign living individuals to specific Native American groups. We compare complete mitogenome sequences (16569 bp) from aDNA of eight distinct Beothuk lineages, including Shanawdithit's uncle Nonosabasut and his wife Demasduit, with three Newfoundland Mi'kmaq lineages and 21 other living Native Americans drawn from GenBank. A Newfoundland Mi'kmaq lineage in Haplogroup A is more similar to three Native Americans (1-3 SNPs) than to the most closely related Beothuk (24 SNPs). Nonosabasut in Haplogroup X is identical to a non-Beothuk Native American. Demasduit in Haplogroup C differs from three other Native Americans by 1-4 substitutions. Within a 2168 bp region of the HVS sequences available from living Mi'kmaq of the Miawpukek First Nation in Newfoundland, lineages in Haplogroups C, X, and A differ by 1, 4, and 8 substitutions, from the most similar Beothuk, and are more similar to other Native Americans. MtDNA genome sequences in living persons identical or similar to those of Beothuk do not necessarily indicate Beothuk ancestry. Mi'kmaq lineages cannot at this time be associated with any Beothuk lineages more closely than those of other Native Americans.
Nasanit, Rujikan; Tangwong-O-Thai, Apirat; Tantirungkij, Manee; Limtong, Savitree
2015-12-01
The diversity of epiphytic yeasts from sugarcane (Saccharum officinarum Linn.) phyllospheres in Thailand was investigated by a culture-independent method based on the analysis of the D1/D2 domains of the large subunit rRNA gene sequences. Forty-five samples of sugarcane leaf were collected randomly from ten provinces in Thailand. A total of 1342 clones were obtained from 45 clone libraries. Of these, 426 clones (31.7%) were closely related to yeast strains in the GenBank database, and they were clustered into 31 operational taxonomic units (OTUs) with a similarity threshold of 99%. All OTU sequences were classified in phylum Basidiomycota and were closely related to 11 yeast species in seven genera: Cryptococcus flavus, Hannaella coprosmaensis, Rhodotorula taiwanensis, Jaminaea angkoreiensis, Malassezia restricta, Pseudozyma antarctica, Pseudozyma aphidis, Pseudozyma hubeiensis, Pseudozyma prolifica, Pseudozyma shanxiensis, and Sporobolomyces vermiculatus. The most predominant yeasts detected belonged to Ustilaginales with 89.4% relative frequency, and the prevalent yeast genus was Pseudozyma. However, the majority of clones could not be identified as known yeast species, and these sequences may represent new yeast taxa. In addition, the OTU that was closely related to P. prolifica was commonly detected in the sugarcane phyllosphere. Copyright © 2015 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
A generalized global alignment algorithm.
Huang, Xiaoqiu; Chao, Kun-Mao
2003-01-22
Homologous sequences are sometimes similar over some regions but different over other regions. Homologous sequences have a much lower global similarity if the different regions are much longer than the similar regions. We present a generalized global alignment algorithm for comparing sequences with intermittent similarities, an ordered list of similar regions separated by different regions. A generalized global alignment model is defined to handle sequences with intermittent similarities. A dynamic programming algorithm is designed to compute an optimal general alignment in time proportional to the product of sequence lengths and in space proportional to the sum of sequence lengths. The algorithm is implemented as a computer program named GAP3 (Global Alignment Program Version 3). The generalized global alignment model is validated by experimental results produced with GAP3 on both DNA and protein sequences. The GAP3 program extends the ability of standard global alignment programs to recognize homologous sequences of lower similarity. The GAP3 program is freely available for academic use at http://bioinformatics.iastate.edu/aat/align/align.html.
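To make the time and space bounds mentioned above concrete, here is a minimal Python sketch of a standard global alignment score (Needleman-Wunsch with a linear gap penalty) computed with two rolling rows. It is not the generalized model or the GAP3 program: it has no notion of unaligned "difference" blocks, it returns only the score rather than the alignment, and the scoring parameters `match`, `mismatch` and `gap` are hypothetical values chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of an optimal standard global alignment of a and b.

    Time is proportional to len(a) * len(b); only two rows of the
    dynamic programming matrix are kept, so memory grows linearly
    with len(b). Scoring values are illustrative assumptions.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
        prev = curr
    return prev[len(b)]
```

For example, `global_alignment_score("ACGT", "AGT")` aligns three matching bases and one gap. Under this plain model, a long dissimilar region is penalised base by base, which is the situation the generalized model described above is designed to handle differently.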
Li, Jonathan Z; Chapman, Brad; Charlebois, Patrick; Hofmann, Oliver; Weiner, Brian; Porter, Alyssa J; Samuel, Reshmi; Vardhanabhuti, Saran; Zheng, Lu; Eron, Joseph; Taiwo, Babafemi; Zody, Michael C; Henn, Matthew R; Kuritzkes, Daniel R; Hide, Winston; Wilson, Cara C; Berzins, Baiba I; Acosta, Edward P; Bastow, Barbara; Kim, Peter S; Read, Sarah W; Janik, Jennifer; Meres, Debra S; Lederman, Michael M; Mong-Kryspin, Lori; Shaw, Karl E; Zimmerman, Louis G; Leavitt, Randi; De La Rosa, Guy; Jennings, Amy
2014-01-01
The impact of raltegravir-resistant HIV-1 minority variants (MVs) on raltegravir treatment failure is unknown. Illumina sequencing offers greater throughput than 454, but sequence analysis tools for viral sequencing are needed. We evaluated Illumina and 454 for the detection of HIV-1 raltegravir-resistant MVs. A5262 was a single-arm study of raltegravir and darunavir/ritonavir in treatment-naïve patients. Pre-treatment plasma was obtained from 5 participants with raltegravir resistance at the time of virologic failure. A control library was created by pooling integrase clones at predefined proportions. Multiplexed sequencing was performed with Illumina and 454 platforms at comparable costs. Illumina sequence analysis was performed with the novel snp-assess tool and 454 sequencing was analyzed with V-Phaser. Illumina sequencing resulted in significantly higher sequence coverage and a 0.095% limit of detection. Illumina accurately detected all MVs in the control library at ≥0.5% and 7/10 MVs expected at 0.1%. 454 sequencing failed to detect any MVs at 0.1% and made 5 false positive calls. For MVs detected in the patient samples by both 454 and Illumina, the correlation in the detected variant frequencies was high (R² = 0.92, P<0.001). Illumina sequencing detected 2.4-fold more nucleotide MVs and 2.9-fold more amino acid MVs than 454. The only raltegravir-resistant MV detected was an E138K mutation in one participant, found by Illumina sequencing but not by 454. In participants of A5262 with raltegravir resistance at virologic failure, baseline raltegravir-resistant MVs were rarely detected. At comparable costs to 454 sequencing, Illumina demonstrated greater depth of coverage, increased sensitivity for detecting HIV MVs, and fewer false positive variant calls.
High resolution identity testing of inactivated poliovirus vaccines
Mee, Edward T.; Minor, Philip D.; Martin, Javier
2015-01-01
Background: Definitive identification of poliovirus strains in vaccines is essential for quality control, particularly where multiple wild-type and Sabin strains are produced in the same facility. Sequence-based identification provides the ultimate in identity testing and would offer several advantages over serological methods. Methods: We employed random RT-PCR and high throughput sequencing to recover full-length genome sequences from monovalent and trivalent poliovirus vaccine products at various stages of the manufacturing process. Results: All expected strains were detected in previously characterised products and the method permitted identification of strains comprising as little as 0.1% of sequence reads. Highly similar Mahoney and Sabin 1 strains were readily discriminated on the basis of specific variant positions. Analysis of a product known to contain incorrect strains demonstrated that the method correctly identified the contaminants. Conclusion: Random RT-PCR and shotgun sequencing provided high resolution identification of vaccine components. In addition to the recovery of full-length genome sequences, the method could also be easily adapted to the characterisation of minor variant frequencies and distinction of closely related products on the basis of distinguishing consensus and low frequency polymorphisms. PMID:26049003
A Multicenter Study To Evaluate the Performance of High-Throughput Sequencing for Virus Detection
Ng, Siemon H. S.; Vandeputte, Olivier; Aljanahi, Aisha; Deyati, Avisek; Cassart, Jean-Pol; Charlebois, Robert L.; Taliaferro, Lanyn P.
2017-01-01
ABSTRACT The capability of high-throughput sequencing (HTS) for detection of known and unknown viruses makes it a powerful tool for broad microbial investigations, such as evaluation of novel cell substrates that may be used for the development of new biological products. However, like any new assay, regulatory applications of HTS need method standardization. Therefore, our three laboratories initiated a study to evaluate performance of HTS for potential detection of viral adventitious agents by spiking model viruses in different cellular matrices to mimic putative materials for manufacturing of biologics. Four model viruses were selected based upon different physical and biochemical properties and commercial availability: human respiratory syncytial virus (RSV), Epstein-Barr virus (EBV), feline leukemia virus (FeLV), and human reovirus (REO). Additionally, porcine circovirus (PCV) was tested by one laboratory. Independent samples were prepared for HTS by spiking intact viruses or extracted viral nucleic acids, singly or mixed, into different HeLa cell matrices (resuspended whole cells, cell lysate, or total cellular RNA). Data were obtained using different sequencing platforms (Roche 454, Illumina HiSeq1500 or HiSeq2500). Bioinformatic analyses were performed independently by each laboratory using available tools, pipelines, and databases. The results showed that comparable virus detection was obtained in the three laboratories regardless of sample processing, library preparation, sequencing platform, and bioinformatic analysis: between 0.1 and 3 viral genome copies per cell were detected for all of the model viruses used. This study highlights the potential for using HTS for sensitive detection of adventitious viruses in complex biological samples containing cellular background. 
IMPORTANCE Recent high-throughput sequencing (HTS) investigations have resulted in unexpected discoveries of known and novel viruses in a variety of sample types, including research materials, clinical materials, and biological products. Therefore, HTS can be a powerful tool for supplementing current methods for demonstrating the absence of adventitious or unwanted viruses in biological products, particularly when using a new cell line. However, HTS is a complex technology with different platforms, which needs standardization for evaluation of biologics. This collaborative study was undertaken to investigate detection of different virus types using two different HTS platforms. The results of the independently performed studies demonstrated a similar sensitivity of virus detection, regardless of the different sample preparation and processing procedures and bioinformatic analyses done in the three laboratories. Comparable HTS detection of different virus types supports future development of reference virus materials for standardization and validation of different HTS platforms. PMID:28932815
Clifford, Jacob; Adami, Christoph
2015-09-02
Transcription factor binding to the surface of DNA regulatory regions is one of the primary mechanisms regulating gene expression levels. A probabilistic approach to modeling protein-DNA interactions at the sequence level is through position weight matrices (PWMs), which estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
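The positional-independence assumption behind an ordinary (unconditional) PWM can be sketched in a few lines of Python. The code below builds a log-odds matrix from aligned example sites and scores a candidate sequence as a sum of per-position terms. The function names, the pseudocount of 1 and the uniform background are hypothetical choices made for illustration, and the conditional PWMs of the study, which depend on flanking motifs, are not implemented here.

```python
import math

ALPHABET = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=None):
    """Log2-odds PWM from equal-length aligned sites.

    Assumptions: uppercase A/C/G/T only, a uniform background unless
    one is supplied, and a simple additive pseudocount.
    """
    background = background or {base: 0.25 for base in ALPHABET}
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({
            base: math.log2((counts[base] / total) / background[base])
            for base in ALPHABET
        })
    return pwm

def score_site(pwm, seq):
    """Sum of per-position log-odds (positional independence)."""
    return sum(column[base] for column, base in zip(pwm, seq))
```

A conditional variant in the spirit of the abstract could be approximated by calling `build_pwm` separately on sites pooled by flanking context (for example, with and without a nearby partner site) and comparing the resulting scores, though the study's actual estimation procedure may differ.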
Okamura, Yukio; Watanabe, Yuichiro
2006-01-01
Fluorescence resonance energy transfer (FRET) occurs when two fluorophores are in close proximity, and the emission energy of a donor fluorophore is transferred to excite an acceptor fluorophore. Using such fluorescently labeled oligonucleotides as FRET probes makes possible the specific detection of RNA molecules even if similar sequences are present in the environment. A higher ratio of signal to background fluorescence is required for more sensitive probe detection. We found that double-labeled donor probes labeled with BODIPY dye resulted in a remarkable increase in fluorescence intensity compared to single-labeled donor probes used in conventional FRET. Application of this double-labeled donor system can improve a variety of FRET techniques.
Han, Lin; Wu, Hua-Jun; Zhu, Haiying; Kim, Kun-Yong; Marjani, Sadie L; Riester, Markus; Euskirchen, Ghia; Zi, Xiaoyuan; Yang, Jennifer; Han, Jasper; Snyder, Michael; Park, In-Hyun; Irizarry, Rafael; Weissman, Sherman M; Michor, Franziska; Fan, Rong; Pan, Xinghua
2017-06-02
Conventional DNA bisulfite sequencing has been extended to single cell level, but the coverage consistency is insufficient for parallel comparison. Here we report a novel method for genome-wide CpG island (CGI) methylation sequencing for single cells (scCGI-seq), combining methylation-sensitive restriction enzyme digestion and multiple displacement amplification for selective detection of methylated CGIs. We applied this method to analyzing single cells from two types of hematopoietic cells, K562 and GM12878, and small populations of fibroblasts and induced pluripotent stem cells. The method detected 21 798 CGIs (76% of all CGIs) per cell, and the number of CGIs consistently detected from all 16 profiled single cells was 20 864 (72.7%), with 12 961 promoters covered. This coverage represents a substantial improvement over results obtained using single cell reduced representation bisulfite sequencing, with a 66-fold increase in the fraction of consistently profiled CGIs across individual cells. Single cells of the same type were more similar to each other than to other types, but also displayed epigenetic heterogeneity. The method was further validated by comparing the CpG methylation pattern, methylation profile of CGIs/promoters and repeat regions and 41 classes of known regulatory markers to the ENCODE data. Although not every minor methylation difference between cells is detectable, scCGI-seq provides a solid tool for unsupervised stratification of a heterogeneous cell population. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Babbar, Anshu; Itzek, Andreas; Pieper, Dietmar H; Nitsche-Schmitz, D Patric
2018-03-12
Streptococcus dysgalactiae subsp. equisimilis (SDSE), belonging to the group C and G streptococci, are human pathogens reported to cause clinical manifestations similar to infections caused by Streptococcus pyogenes. To scrutinize the distribution of genes coding for S. pyogenes virulence factors in SDSE, 255 isolates were collected from humans infected with SDSE in Vellore, a region in southern India with a high incidence of SDSE infections. Initial evaluation indicated SDSE isolates comprising 82.35% group G and 17.64% group C. A multiplex PCR system was used to detect 21 genes encoding virulence-associated factors of S. pyogenes, such as superantigens, DNases, proteinases, and other immune modulatory toxins. As validated by DNA sequencing of the PCR products, sequences homologous to speC, speG, speH, speI, speL, ssa and smeZ of the family of superantigen coding genes and for DNases like sdaD and sdc were detected in the SDSE collection. Furthermore, there was high abundance (48.12% in group G and 86.6% in group C SDSE) of scpA, the gene coding for C5a peptidase, in these isolates. Higher abundance of S. pyogenes virulence factor genes was observed in SDSE of Lancefield group C as compared to group G, even though the incidence rates in the former were lower. This study not only substantiates detection of S. pyogenes virulence factor genes in whole genome sequenced SDSE but also makes a significant contribution towards the understanding of SDSE and its increasing virulence potential.
Dridi, Bédis; Henry, Mireille; El Khéchine, Amel; Raoult, Didier; Drancourt, Michel
2009-01-01
Background The low and variable prevalence of Methanobrevibacter smithii and Methanosphaera stadtmanae DNA in human stool contrasts with the paramount role of these methanogenic Archaea in digestion processes. We hypothesized that this contrast is a consequence of the inefficiencies of current protocols for archaeon DNA extraction. We developed a new protocol for the extraction and PCR-based detection of M. smithii and M. stadtmanae DNA in human stool. Methodology/Principal Findings Stool specimens collected from 700 individuals were filtered, mechanically lysed twice, and incubated overnight with proteinase K prior to DNA extraction using a commercial DNA extraction kit. Total DNA was used as a template for quantitative real-time PCR targeting M. smithii and M. stadtmanae 16S rRNA and rpoB genes. Amplification of 16S rRNA and rpoB yielded positive detection of M. smithii in 95.7% and M. stadtmanae in 29.4% of specimens. Sequencing of 16S rRNA gene PCR products from 30 randomly selected specimens (15 for M. smithii and 15 for M. stadtmanae) yielded a sequence similarity of 99–100% using the reference M. smithii ATCC 35061 and M. stadtmanae DSM 3091 sequences. Conclusions/Significance In contrast to previous reports, these data indicate a high prevalence of the methanogens M. smithii and M. stadtmanae in the human gut, with the former being an almost ubiquitous inhabitant of the intestinal microbiome. PMID:19759898
Ramey, Andy M.; Pearce, John M.; Reeves, A.B.; Franson, J. Christian; Petersen, Margaret R.; Ip, Hon S.
2011-01-01
Avian influenza virus (AIV) prevalence and sequence data were analyzed for Steller's eiders (Polysticta stelleri) to assess the role of this species in transporting virus genes between continents and maintaining a regional viral reservoir with sympatric northern pintails (Anas acuta). AIV prevalence was 0.2% at Izembek Lagoon and 3.9% at Nelson Lagoon for Steller's eiders and 11.2% for northern pintails at Izembek Lagoon. Phylogenetic analysis of 13 AIVs from Steller's eiders revealed that 4.9% of genes were of Eurasian origin. Seven subtypes were detected, including two also observed in northern pintails. No AIV strains were highly similar (> 99%) at all gene segments between species; however, highly similar individual genes were detected. The proportion of highly similar genes was greater within rather than between species. Steller's eiders likely transport AIV genes between continents through long-distance migratory movements. Differences in AIV prevalence, subtype distribution, and the proportion of highly similar genes suggest limited AIV exchange between Steller's eiders and northern pintails at Alaska Peninsula coastal lagoons during autumn.
MRPrimerV: a database of PCR primers for RNA virus detection
Kim, Hyerin; Kang, NaNa; An, KyuHyeon; Kim, Doyun; Koo, JaeHyung; Kim, Min-Soo
2017-01-01
Many infectious diseases are caused by viral infections, and in particular by RNA viruses such as MERS, Ebola and Zika. To understand viral disease, detection and identification of these viruses are essential. Although PCR is widely used for rapid virus identification due to its low cost and high sensitivity and specificity, very few online database resources have compiled PCR primers for RNA viruses. To effectively detect viruses, the MRPrimerV database (http://MRPrimerV.com) contains 152 380 247 PCR primer pairs for detection of 1818 viruses, covering 7144 coding sequences (CDSs), representing 100% of the RNA viruses in the most up-to-date NCBI RefSeq database. Due to rigorous similarity testing against all human and viral sequences, every primer in MRPrimerV is highly target-specific. Because MRPrimerV ranks CDSs by the penalty scores of their best primer, users need only use the first primer pair for a single-phase PCR or the first two primer pairs for two-phase PCR. Moreover, MRPrimerV provides the list of genome neighbors that can be detected using each primer pair, covering 22 192 variants of 532 RefSeq RNA viruses. We believe that the public availability of MRPrimerV will facilitate viral metagenomics studies aimed at evaluating the variability of viruses, as well as other scientific tasks. PMID:27899620
Sequence-similar, structure-dissimilar protein pairs in the PDB.
Kosloff, Mickey; Kolodny, Rachel
2008-05-01
It is often assumed that in the Protein Data Bank (PDB), two proteins with similar sequences will also have similar structures. Accordingly, it has proved useful to develop subsets of the PDB from which "redundant" structures have been removed, based on a sequence-based criterion for similarity. Similarly, when predicting protein structure using homology modeling, if a template structure for modeling a target sequence is selected by sequence alone, this implicitly assumes that all sequence-similar templates are equivalent. Here, we show that this assumption is often not correct and that standard approaches to create subsets of the PDB can lead to the loss of structurally and functionally important information. We have carried out sequence-based structural superpositions and geometry-based structural alignments of a large number of protein pairs to determine the extent to which sequence similarity ensures structural similarity. We find many examples where two proteins that are similar in sequence have structures that differ significantly from one another. The source of the structural differences usually has a functional basis. The number of such protein pairs that are identified and the magnitude of the dissimilarity depend on the approach that is used to calculate the differences; in particular, sequence-based structure superpositioning will identify a larger number of structurally dissimilar pairs than geometry-based structural alignments. When two sequences can be aligned in a statistically meaningful way, sequence-based structural superpositioning provides a meaningful measure of structural differences. This approach and geometry-based structure alignments reveal somewhat different information and one or the other might be preferable in a given application. Our results suggest that in some cases, notably homology modeling, the common use of nonredundant datasets, culled from the PDB based on sequence, may mask important structural and functional information.
We have established a database of sequence-similar, structurally dissimilar protein pairs that will help address this problem (http://luna.bioc.columbia.edu/rachel/seqsimstrdiff.htm).
Geisler, Christoph
2018-02-07
Adventitious viral contamination in cell substrates used for biologicals production is a major safety concern. A powerful new approach that can be used to identify adventitious viruses is a combination of bioinformatics tools with massively parallel sequencing technology. Typically, this involves mapping or BLASTN searching individual reads against viral nucleotide databases. Although extremely sensitive for known viruses, this approach can easily miss viruses that are too dissimilar to viruses in the database. Moreover, it is computationally intensive and requires reference cell genome databases. To avoid these drawbacks, we set out to develop an alternative approach. We reasoned that searching genome and transcriptome assemblies for adventitious viral contaminants using TBLASTN with a compact viral protein database covering extant viral diversity as the query could be fast and sensitive without a requirement for high performance computing hardware. We tested our approach on Spodoptera frugiperda Sf-RVN, a recently isolated insect cell line, to determine if it was contaminated with one or more adventitious viruses. We used Illumina reads to assemble the Sf-RVN genome and transcriptome and searched them for adventitious viral contaminants using TBLASTN with our viral protein database. We found no evidence of viral contamination, which was substantiated by the fact that our searches otherwise identified diverse sequences encoding virus-like proteins. These sequences included Maverick, R1 LINE, and errantivirus transposons, all of which are common in insect genomes. We also identified previously described as well as novel endogenous viral elements similar to ORFs encoded by diverse insect viruses. 
Our results demonstrate that TBLASTN searching of massively parallel sequencing (MPS) assemblies with a compact, manually curated viral protein database is more sensitive for adventitious virus detection than BLASTN, as we identified various sequences that encoded virus-like proteins but had no similarity to viral sequences at the nucleotide level. Moreover, searches were fast without requiring high performance computing hardware. Our study also documents the enhanced biosafety profile of Sf-RVN as compared to other Sf cell lines, and supports the notion that Sf-RVN is highly suitable for the production of safe biologicals.
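Why a protein-level search can find what a nucleotide-level search misses can be sketched with a toy example. This is not TBLASTN and does not use the authors' database: the two short sequences below are hypothetical and were chosen so that they encode the same peptide through different synonymous codons. The only external fact relied on is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA string in frame 0, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Hypothetical sequences (assumption): same peptide, different codon usage.
seq_a = "ATGCTGAGCCGT"
seq_b = "ATGTTATCAAGA"

print(translate(seq_a), translate(seq_b))
print("nucleotide identity:", identity(seq_a, seq_b))
print("protein identity:", identity(translate(seq_a), translate(seq_b)))
```

Because the comparison is made after translation, divergence at synonymous codon positions does not reduce the protein-level match, which is consistent with the abstract's observation of virus-like protein-coding sequences that had no similarity at the nucleotide level.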
Method for phosphorothioate antisense DNA sequencing by capillary electrophoresis with UV detection.
Froim, D; Hopkins, C E; Belenky, A; Cohen, A S
1997-11-01
The progress of antisense DNA therapy demands development of reliable and convenient methods for sequencing short single-stranded oligonucleotides. A method of phosphorothioate antisense DNA sequencing analysis using UV detection coupled to capillary electrophoresis (CE) has been developed based on a modified chain termination sequencing method. The proposed method reduces the sequencing cost since it uses affordable CE-UV instrumentation and requires no labeling with minimal sample processing before analysis. Cycle sequencing with ThermoSequenase generates quantities of sequencing products that are readily detectable by UV. Discrimination of undesired components from sequencing products in the reaction mixture, previously accomplished by fluorescent or radioactive labeling, is now achieved by bringing concentrations of undesired components below the UV detection range which yields a 'clean', well defined sequence. UV detection coupled with CE offers additional conveniences for sequencing since it can be accomplished with commercially available CE-UV equipment and is readily amenable to automation. PMID:9336449
Katayama, Takahiro; Yasukawa, Hiro
2008-10-01
It has been reported that Dictyostelium discoideum encodes four silent information regulator 2 (Sir2) proteins (Sir2A-D) showing sequence similarity to human homologues of Sir2 (SIRT1-3). Further screening in a database revealed that D. discoideum encodes an additional Sir2 homologue (Sir2E). The amino acid sequence of Sir2E is not similar to those of SIRTs but is similar to those of proteins encoded by Giardia lamblia, Cryptosporidium hominis and Cryptosporidium parvum. Fluorescence of Sir2E-green fluorescent protein fusion protein was detected in the D. discoideum nucleus, indicating that Sir2E is a nuclear localizing protein. Reverse transcription-polymerase chain reaction and whole-mount in situ hybridization analyses showed that D. discoideum expressed sir2E in amoebae in the growth phase and in prestalk cells in the developmental phase. D. discoideum overexpressing sir2E grew faster than the wild type. These results indicate that Sir2E plays important roles both in the growth phase and developmental phase of D. discoideum.
Buttò, Stefano; Fiorelli, Valeria; Tripiciano, Antonella; Ruiz-Alvarez, Maria J; Scoglio, Arianna; Ensoli, Fabrizio; Ciccozzi, Massimo; Collacchi, Barbara; Sabbatucci, Michela; Cafaro, Aurelio; Guzmán, Carlos A; Borsetti, Alessandra; Caputo, Antonella; Vardas, Eftyhia; Colvin, Mark; Lukwiya, Matthew; Rezza, Giovanni; Ensoli, Barbara
2003-10-15
We determined immune cross-recognition and the degree of Tat conservation in patients infected by local human immunodeficiency virus (HIV) type 1 strains. The data indicated a similar prevalence of total and epitope-specific anti-Tat IgG in 578 serum samples from HIV-infected Italian (n=302), Ugandan (n=139), and South African (n=137) subjects, using the same B clade Tat protein that is being used in vaccine trials. In particular, anti-Tat antibodies were detected in 13.2%, 10.8%, and 13.9% of HIV-1-infected individuals from Italy, Uganda, and South Africa, respectively. Sequence analysis results indicated a high similarity of Tat from the different circulating viruses with BH-10 Tat, particularly in the 1-58 amino acid region, which contains most of the immunogenic epitopes. These data indicate an effective cross-recognition of a B-clade laboratory strain-derived Tat protein vaccine by individuals infected with different local viruses, owing to the high similarity of Tat epitopes.
Bellerophon: A program to detect chimeric sequences in multiple sequence alignments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip
2003-12-23
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
Intrusion Detection in Control Systems using Sequence Characteristics
NASA Astrophysics Data System (ADS)
Kiuchi, Mai; Onoda, Takashi
Intrusion detection is considered effective in control systems. Sequences of the control application behavior observed in the communication, such as the order in which control devices are operated, are important in control systems. However, most intrusion detection systems do not effectively reflect application-layer sequences in their detection rules. In our previous work, we considered utilizing sequences for intrusion detection in control systems, and demonstrated the usefulness of sequences for intrusion detection. However, manually writing the detection rules for a large system can be difficult, which makes machine learning methods attractive. Also, in the case of control systems, there have been very few observed cyber attacks, so we have very little knowledge of the attack data that should be used to train the intrusion detection system. In this paper, we use an approach based on a CRF (Conditional Random Field) that takes the sequence of the system into account; it can therefore reflect the characteristics of control system sequences in the intrusion detection system, and it does not need knowledge of attack data to construct the detection rules.
Content-based video retrieval by example video clip
NASA Astrophysics Data System (ADS)
Dimitrova, Nevenka; Abdel-Mottaleb, Mohamed
1997-01-01
This paper presents a novel approach for video retrieval from a large archive of MPEG or Motion JPEG compressed video clips. We introduce a retrieval algorithm that takes a video clip as a query and searches the database for clips with similar contents. Video clips are characterized by a sequence of representative frame signatures, which are constructed from DC coefficients and motion information (`DC+M' signatures). The similarity between two video clips is determined by using their respective signatures. This method facilitates retrieval of clips for the purpose of video editing, broadcast news retrieval, or copyright violation detection.
Nandasena, Kemanthi G; O'Hara, Graham W; Tiwari, Ravi P; Willems, Anne; Howieson, John G
2007-05-01
Biserrula pelecinus L. is a pasture legume species that forms a highly specific nitrogen-fixing symbiotic interaction with a group of bacteria that belong to Mesorhizobium. These mesorhizobia have >98.8 % sequence similarity to Mesorhizobium ciceri and Mesorhizobium loti for the 16S rRNA gene (1440 bp) and >99.3 % sequence similarity to M. ciceri for the dnaK gene (300 bp), and strain WSM1271 has 100 % sequence similarity to M. ciceri for GSII (600 bp). Strain WSM1271 had 85 % relatedness to M. ciceri LMG 14989(T) and 50 % relatedness to M. loti LMG 6125(T) when DNA-DNA hybridization was performed. WSM1271 also had a similar cellular fatty acid profile to M. ciceri. These results are strong evidence that the Biserrula mesorhizobia and M. ciceri belong to the same group of bacteria. Significant differences were revealed between the Biserrula mesorhizobia and M. ciceri in growth conditions, antibiotic resistance and carbon source utilization. The G+C content of the DNA of WSM1271 was 62.7 mol%, compared to 63-64 mol% for M. ciceri. The Biserrula mesorhizobia contained a plasmid (approximately 500 bp), but the symbiotic genes were detected on a mobile symbiosis island and considerable variation was present in the symbiotic genes of Biserrula mesorhizobia and M. ciceri. There was <78.6 % sequence similarity for nodA and <66.9 % for nifH between Biserrula mesorhizobia and M. ciceri. Moreover, the Biserrula mesorhizobia did not nodulate the legume host of M. ciceri, Cicer arietinum, and M. ciceri did not nodulate B. pelecinus. These significant differences observed between Biserrula mesorhizobia and M. ciceri warrant the proposal of a novel biovar for Biserrula mesorhizobia within M. ciceri. The name Mesorhizobium ciceri biovar biserrulae is proposed, with strain WSM1271 (=LMG 23838=HAMBI 2942) as the reference strain.
Biodiversity of the microbial mat of the Garga hot spring.
Rozanov, Alexey Sergeevich; Bryanskaya, Alla Victorovna; Ivanisenko, Timofey Vladimirovich; Malup, Tatyana Konstantinovna; Peltek, Sergey Evgenievich
2017-12-28
Microbial mats are a good model system for ecological and evolutionary analysis of microbial communities. There are more than 20 alkaline hot springs on the banks of the Barguzin river inflows. Water temperature reaches 75 °C and pH is usually 8.0-9.0. The formation of microbial mats is observed in all hot springs. Microbial communities of hot springs of the Baikal rift zone are poorly studied. Garga is the biggest hot spring in this area. In this study, we investigated bacterial and archaeal diversity of the Garga hot spring (Baikal rift zone, Russia) using 16S rRNA metagenomic sequencing. We studied two types of microbial communities: (i) small white biofilms on rocks in the points with the highest temperature (75 °C) and (ii) continuous thick phototrophic microbial mats observed at temperatures below 70 °C. Archaea (mainly Crenarchaeota; 19.8% of the total sequences) were detected only in the small biofilms. The high abundance of Archaea in the sample from hot springs of the Baikal rift zone supplemented our knowledge of the distribution of Archaea. Most archaeal sequences had low similarity to known Archaea. In the microbial mats, primary products were formed by cyanobacteria of the genus Leptolyngbya. Heterotrophic microorganisms were mostly represented by Actinobacteria and Proteobacteria in all studied samples of the microbial mats. Planctomycetes, Chloroflexi, and Chlorobi were abundant in the middle layer of the microbial mats, while heterotrophic microorganisms represented mostly by Firmicutes (Clostridia, strict anaerobes) dominated in the bottom part. Besides prokaryotes, we detected some species of algae through their chloroplast 16S rRNA sequences.
Metagenomic analysis of microbial communities of the microbial mat of Garga hot spring showed that the three studied points sampled at 70 °C, 55 °C, and 45 °C had similar species composition. Cyanobacteria of the genus Leptolyngbya dominated in the upper layer of the microbial mat. Chloroflexi and Chlorobi were less abundant and were mostly observed in the middle part of the microbial mat. We detected domains of heterotrophic organisms in high abundance (Proteobacteria, Firmicutes, Verrucomicrobia, Planctomycetes, Bacteroidetes, Actinobacteria, Thermi), according to metabolic properties of known relatives, which can form complete cycles of carbon, sulphur, and nitrogen in the microbial mat. The studied microbial mats evolved in early stages of biosphere formation. They can live autonomously, providing full cycles of substances and preventing poisoning by metabolic waste products.
Hasiów-Jaroszewska, Beata; Komorowska, Beata
2013-10-01
Diagnostic methods distinguish different Pepino mosaic virus (PepMV) genotypes, but they do not detect sequence variation in particular gene segments. The necrotic and non-necrotic isolates (pathotypes) of PepMV share 99% sequence similarity. These isolates differ from each other at one nucleotide site in the triple gene block 3. In this study, a combination of real-time reverse transcription polymerase chain reaction and high resolution melting curve analysis of triple gene block 3 was developed for simultaneous detection and differentiation of PepMV pathotypes. The triple gene block 3 region carrying a transition A → G was amplified using two primer pairs from twelve virus isolates, and was subjected to high resolution melting curve analysis. The results showed two distinct melting curve profiles, one for each pathotype. The results also indicated that the high resolution melting method could readily differentiate between necrotic and non-necrotic PepMV pathotypes. Copyright © 2013 Elsevier B.V. All rights reserved.
Molecular spectrum of somaclonal variation in regenerated rice revealed by whole-genome sequencing.
Miyao, Akio; Nakagome, Mariko; Ohnuma, Takako; Yamagata, Harumi; Kanamori, Hiroyuki; Katayose, Yuichi; Takahashi, Akira; Matsumoto, Takashi; Hirochika, Hirohiko
2012-01-01
Somaclonal variation is a phenomenon that results in the phenotypic variation of plants regenerated from cell culture. One of the causes of somaclonal variation in rice is the transposition of retrotransposons. However, many aspects of the mechanisms that result in somaclonal variation remain undefined. To detect genome-wide changes in regenerated rice, we analyzed the whole-genome sequences of three plants independently regenerated from cultured cells originating from a single seed stock. Many single-nucleotide polymorphisms (SNPs) and insertions and deletions (indels) were detected in the genomes of the regenerated plants. The transposition of only Tos17 among 43 transposons examined was detected in the regenerated plants. Therefore, the SNPs and indels contribute to the somaclonal variation in regenerated rice in addition to the transposition of Tos17. The observed molecular spectrum was similar to that of the spontaneous mutations in Arabidopsis thaliana. However, the base change ratio was estimated to be 1.74 × 10(-6) base substitutions per site per regeneration, which is 248-fold greater than the spontaneous mutation rate of A. thaliana.
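The fold-change statement above implies a reference rate that can be recovered by simple division. The sketch below only rearranges the two figures quoted in the abstract; the resulting value is an inference from those figures, not a number stated in the text, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
rice_rate = 1.74e-6   # base substitutions per site per regeneration
fold_change = 248     # rice rate relative to the A. thaliana spontaneous rate

# Implied A. thaliana spontaneous mutation rate (inferred, not quoted).
implied_reference_rate = rice_rate / fold_change

print(f"{implied_reference_rate:.2e}")  # about 7.02e-09 per site
```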
Exploration of Deinococcus-Thermus molecular diversity by novel group-specific PCR primers
Theodorakopoulos, Nicolas; Bachar, Dipankar; Christen, Richard; Alain, Karine; Chapon, Virginie
2013-01-01
The deeply branching Deinococcus-Thermus lineage is recognized as one of the most extremophilic phyla of bacteria. In previous studies, the presence of Deinococcus-related bacteria in the hot arid Tunisian desert of Tataouine was demonstrated through combined molecular and culture-based approaches. Similarly, Thermus-related bacteria have been detected in Tunisian geothermal springs. The present work was conducted to explore the molecular diversity within the Deinococcus-Thermus phylum in these extreme environments. A set of specific primers was designed in silico on the basis of 16S rRNA gene sequences, validated for the specific detection of reference strains, and used for the polymerase chain reaction (PCR) amplification of metagenomic DNA retrieved from the Tataouine desert sand and Tunisian hot spring water samples. These analyses have revealed the presence of previously undescribed Deinococcus-Thermus bacterial sequences within these extreme environments. The primers designed in this study thus represent a powerful tool for the rapid detection of Deinococcus-Thermus in environmental samples and could also be applicable to clarify the biogeography of the Deinococcus-Thermus phylum. PMID:23996915
Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin
2015-01-01
Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by a NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide of relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937
NASA Astrophysics Data System (ADS)
Melbourne, J.; Soifer, B. T.; Desai, Vandana; Pope, Alexandra; Armus, Lee; Dey, Arjun; Bussmann, R. S.; Jannuzi, B. T.; Alberts, Stacey
2012-05-01
Dust-obscured galaxies (DOGs) are a subset of high-redshift (z ≈ 2) optically-faint ultra-luminous infrared galaxies (ULIRGs, e.g., L_IR > 10^12 L_⊙). We present new far-infrared photometry, at 250, 350, and 500 μm (observed-frame), from the Herschel Space Telescope for a large sample of 113 DOGs with spectroscopically measured redshifts. Approximately 60% of the sample are detected in the far-IR. The Herschel photometry allows the first robust determinations of the total infrared luminosities of a large sample of DOGs, confirming their high IR luminosities, which range from 10^11.6 L_⊙
Ni, W; Le Guiner, C; Gernoux, G; Penaud-Budloo, M; Moullier, P; Snyder, R O
2011-07-01
Legitimate uses of gene transfer technology can benefit from sensitive detection methods to determine vector biodistribution in pre-clinical studies and in human clinical trials, and similar methods can detect illegitimate gene transfer to provide sports-governing bodies with the ability to maintain fairness. Real-time PCR assays were developed to detect a performance-enhancing transgene (erythropoietin, EPO) and backbone sequences in the presence of endogenous cellular sequences. In addition to developing real-time PCR assays, the steps involved in DNA extraction, storage and transport were investigated. By real-time PCR, the vector transgene is distinguishable from the genomic DNA sequence because of the absence of introns, and the vector backbone can be identified by heterologous gene expression control elements. After performance of the assays was optimized, cynomolgus macaques received a single dose by intramuscular (IM) injection of plasmid DNA, a recombinant adeno-associated viral vector serotype 1 (rAAV1) or a rAAV8 vector expressing cynomolgus macaque EPO. Macaques received a high plasmid dose intended to achieve a significant, but not life-threatening, increase in hematocrit. rAAV vectors were used at low doses to achieve a small increase in hematocrit and to determine the limit of sensitivity for detecting rAAV sequences by single-step PCR. DNA extracted from white blood cells (WBCs) was tested to determine whether WBCs can be collaterally transfected by plasmid or transduced by rAAV vectors in this context, and can be used as a surrogate marker for gene doping. We demonstrate that IM injection of a conventional plasmid and rAAV vectors results in the presence of DNA that can be detected at high levels in blood before rapid elimination, and that rAAV genomes can persist for several months in WBCs.
He, Liming; Liu, Fang; Karuppiah, Valliappan; Ren, Yi; Li, Zhiyong
2014-05-01
To date, knowledge of the eukaryotic communities associated with sponges remains limited compared with that of prokaryotic communities. As with prokaryotes, it could be hypothesized that sponge holobionts have phylogenetically diverse eukaryotic symbionts and that the eukaryotic community structures probably differ among sponge holobionts. To test this hypothesis, the communities of eukaryota associated with 11 species of South China Sea sponges were compared using 454 pyrosequencing of the V4 region of the 18S ribosomal ribonucleic acid gene. Consequently, 135 and 721 unique operational taxonomic units (OTUs) of fungi and protists were obtained at 97 % sequence similarity, respectively. These sequences were assigned to 2 phyla of fungi (Ascomycota and Basidiomycota) and 9 phyla of protists including 5 algal phyla (Chlorophyta, Haptophyta, Streptophyta, Rhodophyta, and Stramenopiles) and 4 protozoal phyla (Alveolata, Cercozoa, Haplosporidia, and Radiolaria) including 47 orders (12 fungi, 35 protists). Entorrhizales of fungi and 18 orders of protists were detected in marine sponges for the first time. In particular, Tilletiales of fungi and Chlorocystidales of protists were detected for the first time in marine habitats. Although Ascomycota, Alveolata, and Radiolaria were detected in all 11 sponge species, sponge holobionts have different fungal and protistan communities according to OTU comparison and principal component analysis at the order level. This study provided the first insights into the fungal and protistan communities associated with different marine sponge holobionts using pyrosequencing, thus further extending the knowledge of sponge-associated eukaryotic diversity.
Species-Specific TT Viruses and Cross-Species Infection in Nonhuman Primates
Okamoto, Hiroaki; Fukuda, Masako; Tawara, Akio; Nishizawa, Tsutomu; Itoh, Yukio; Hayasaka, Ikuo; Tsuda, Fumio; Tanaka, Takeshi; Miyakawa, Yuzo; Mayumi, Makoto
2000-01-01
Viruses resembling human TT virus (TTV) were searched for in sera from nonhuman primates by PCR with primers deduced from well-conserved areas in the untranslated region. TTV DNA was detected in 102 (98%) of 104 chimpanzees, 9 (90%) of 10 Japanese macaques, 4 (100%) of 4 red-bellied tamarins, 5 (83%) of 6 cotton-top tamarins, and 5 (100%) of 5 douroucoulis tested. Analysis of the amplification products of 90 to 106 nucleotides revealed TTV DNA sequences specific for each species, with a decreasing similarity to human TTV in the order of chimpanzee, Japanese macaque, and tamarin/douroucouli TTVs. Full-length viral sequences were amplified by PCR with inverted nested primers deduced from the untranslated region of TTV DNA from each species. All animal TTVs were found to be circular with a genomic length at 3.5 to 3.8 kb, which was comparable to or slightly shorter than human TTV. Sequences closely similar to human TTV were determined by PCR with primers deduced from a coding region (N22 region) and were detected in 49 (47%) of the 104 chimpanzees; they were not found in any animals of the other species. Sequence analysis of the N22 region (222 to 225 nucleotides) of chimpanzee TTV DNAs disclosed four genetic groups that differed by 36.1 to 50.2% from one another; they were 35.0 to 52.8% divergent from any of the 16 genotypes of human TTV. Of the 104 chimpanzees, only 1 was viremic with human TTV of genotype 1a. It was among the 53 chimpanzees which had been used in transmission experiments with human hepatitis viruses. Antibody to TTV of genotype 1a was detected significantly more frequently in the chimpanzees that had been used in transmission experiments than in those that had not (8 of 28 [29%] and 3 of 35 [9%], respectively; P = 0.038). These results indicate that species-specific TTVs are prevalent in nonhuman primates and that human TTV can cross-infect chimpanzees. PMID:10627523
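The antibody comparison reported above (8 of 28 versus 3 of 35; P = 0.038) can be checked with a short calculation. The abstract does not name the statistical test used; the sketch below assumes an uncorrected Pearson chi-square test on the 2x2 table, which happens to give a value close to the one reported. The function name is illustrative only.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper-tail probability for 1 degree
    of freedom, computed as erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts from the abstract: antibody-positive / antibody-negative chimpanzees.
# Used in transmission experiments: 8 positive, 20 negative (8 of 28).
# Not used in transmission experiments: 3 positive, 32 negative (3 of 35).
chi2, p = chi_square_2x2(8, 20, 3, 32)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

With counts this small, Fisher's exact test or a continuity-corrected chi-square would give a somewhat larger P value, so the agreement here only suggests, and does not confirm, which test the authors applied.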
Duarte, Gabriela Frois; Rosado, Alexandre Soares; Seldin, Lucy; de Araujo, Welington; van Elsas, Jan Dirk
2001-01-01
The selective effects of sulfur-containing hydrocarbons, with respect to changes in bacterial community structure and selection of desulfurizing organisms and genes, were studied in soil. Samples taken from a polluted field soil (A) along a concentration gradient of sulfurous oil and from soil microcosms treated with dibenzothiophene (DBT)-containing petroleum (FSL soil) were analyzed. Analyses included plate counts of total bacteria and of DBT utilizers, molecular community profiling via soil DNA-based PCR-denaturing gradient gel electrophoresis (PCR-DGGE), and detection of genes that encode enzymes involved in the desulfurization of hydrocarbons, i.e., dszA, dszB, and dszC. Data obtained from the A soil showed no discriminating effects of oil levels on the culturable bacterial numbers on either medium used. Generally, counts of DBT degraders were 10- to 100-fold lower than the total culturable counts. However, PCR-DGGE showed that the numbers of bands detected in the molecular community profiles decreased with increasing oil content of the soil. Analysis of the sequences of three prominent bands of the profiles generated with the highly polluted soil samples suggested that the underlying organisms were related to Actinomyces sp., Arthrobacter sp., and a bacterium of uncertain affiliation. dszA, dszB, and dszC genes were present in all A soil samples, whereas a range of unpolluted soils gave negative results in this analysis. Results from the study of FSL soil revealed minor effects of the petroleum-DBT treatment on culturable bacterial numbers and clear effects on the DBT-utilizing communities. The molecular community profiles were largely stable over time in the untreated soil, whereas they showed a progressive change over time following treatment with DBT-containing petroleum. Direct PCR assessment revealed the presence of dszB-related signals in the untreated FSL soil and the apparent selection of dszA- and dszC-related sequences by the petroleum-DBT treatment.
PCR-DGGE applied to sequential enrichment cultures in DBT-containing sulfur-free basal salts medium prepared from the A and treated FSL soils revealed the selection of up to 10 distinct bands. Sequencing a subset of these bands provided evidence for the presence of organisms related to Pseudomonas putida, a Pseudomonas sp., Stenotrophomonas maltophilia, and Rhodococcus erythropolis. Several of 52 colonies obtained from the A and FSL soils on agar plates with DBT as the sole sulfur source produced bands that matched the migration of bands selected in the enrichment cultures. Evidence for the presence of dszB in 12 strains was obtained, whereas dszA and dszC genes were found in only 7 and 6 strains, respectively. Most of the strains carrying dszA or dszC were classified as R. erythropolis related, and all revealed the capacity to desulfurize DBT. A comparison of 37 dszA sequences, obtained via PCR from the A and FSL soils, from enrichments of these soils, and from isolates, revealed the great similarity of all sequences to the canonical (R. erythropolis strain IGTS8) dszA sequence and a large degree of internal conservation. The 37 sequences recovered were grouped in three clusters. One group, consisting of 30 sequences, was minimally 98% related to the IGTS8 sequence, a second group of 2 sequences was slightly different, and a third group of 5 sequences was 95% similar. The first two groups contained sequences obtained from both soil types and enrichment cultures (including isolates), but the last consisted of sequences obtained directly from the polluted A soil. PMID:11229891
Martínez-Castilla, León P.; Rodríguez-Sotres, Rogelio
2010-01-01
Background. Despite the remarkable progress of bioinformatics, how the primary structure of a protein leads to a three-dimensional fold, and in turn determines its function, remains an elusive question. Alignments of sequences with known function can be used to identify proteins with the same or similar function with high success. However, identification of function-related and structure-related amino acid positions is only possible after a detailed study of every protein. Folding pattern diversity seems to be much narrower than sequence diversity, and the amino acid sequences of natural proteins have evolved under a selective pressure comprising structural and functional requirements acting in parallel. Principal Findings. The approach described in this work begins by generating a large number of amino acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449–460], a program with notable robustness in the assignment of amino acids to a known three-dimensional structure. The resulting sequence sets showed no conservation of amino acids at active sites or protein-protein interfaces. Hidden Markov models built from the resulting sequence sets were used to search sequence databases. Surprisingly, the sequences that the models retrieved from the database belonged to proteins with the same or a very similar function. Given an appropriate cutoff, the rate of false positives was zero. According to our results, this protocol, here referred to as Rd.HMM, detects fine structural details in the folding patterns that seem to be tightly linked to the fitness of a structural framework for a specific biological function. Conclusion. Because the sequence of the native protein used to create the Rd.HMM model was always amongst the top hits, the procedure is a reliable tool to score, very accurately, the quality and appropriateness of computer-modeled 3D-structures, without the need for spectroscopy data.
However, Rd.HMM is very sensitive to the conformational features of the models' backbone. PMID:20830209
Fermi Establishes Classical Novae as a Distinct Class of Gamma-ray Sources
NASA Technical Reports Server (NTRS)
Ackermann, M.; Ajello, M.; Albert, A.; Baldini, L.; Ballet, J.; Bastieri, D.; Bellazzini, R.; Bissaldi, E.; Blandford, R. D.; Bloom, E. D.;
2014-01-01
A classical nova results from runaway thermonuclear explosions on the surface of a white dwarf that accretes matter from a low-mass main-sequence stellar companion. In 2012 and 2013, three novae were detected in gamma rays and stood in contrast to the first gamma-ray detected nova V407 Cygni 2010, which belongs to a rare class of symbiotic binary systems. Despite likely differences in the compositions and masses of their white dwarf progenitors, the three classical novae are similarly characterized as soft spectrum transient gamma-ray sources detected over 2-3 week durations. The gamma-ray detections point to unexpected high-energy particle acceleration processes linked to the mass ejection from thermonuclear explosions in an unanticipated class of Galactic gamma-ray sources.
Fermi establishes classical novae as a distinct class of gamma-ray sources
Cheung, C. C.
2014-07-31
A classical nova results from runaway thermonuclear explosions on the surface of a white dwarf that accretes matter from a low-mass main-sequence stellar companion. In 2012 and 2013, three novae were detected in γ rays and stood in contrast to the first γ-ray detected nova V407 Cygni 2010, which belongs to a rare class of symbiotic binary systems. Despite likely differences in the compositions and masses of their white dwarf progenitors, the three classical novae are similarly characterized as soft spectrum transient γ-ray sources detected over 2-3 week durations. The γ-ray detections point to unexpected high-energy particle acceleration processes linked to the mass ejection from thermonuclear explosions in an unanticipated class of Galactic γ-ray sources.
Seo, Joo-Hyun; Park, Jihyang; Kim, Eun-Mi; Kim, Juhan; Joo, Keehyoung; Lee, Jooyoung; Kim, Byung-Gee
2014-02-01
Sequence subgrouping for a given sequence set can enable various informative tasks such as the functional discrimination of sequence subsets and the functional inference of unknown sequences. Because an identity threshold for sequence subgrouping may vary according to the given sequence set, it is highly desirable to construct a robust subgrouping algorithm which automatically identifies an optimal identity threshold and generates subgroups for a given sequence set. To this end, an automatic sequence subgrouping method named 'Subgrouping Automata' (SA) was constructed. First, the tree analysis module analyzes the structure of the tree and calculates all possible subgroups at each node. The sequence similarity analysis module calculates the average sequence similarity for all subgroups at each node. The representative sequence generation module finds a representative sequence for each subgroup using profile analysis and self-scoring. For all nodes, average sequence similarities are calculated, and 'Subgrouping Automata' searches for the node showing the statistically maximum increase in sequence similarity using Student's t-value. The node showing the maximum t-value, which gives the most significant difference in average sequence similarity between two adjacent nodes, is taken as the optimum subgrouping node in the phylogenetic tree. Further analysis showed that the optimum subgrouping node from SA prevents under-subgrouping and over-subgrouping. Copyright © 2013. Published by Elsevier Ltd.
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Divergence and evolution of homologous regions of Bombyx mori nuclear polyhedrosis virus.
Majima, K; Kobara, R; Maeda, S
1993-01-01
Homologous regions (hrs) (hr1, hr2-left, hr2-right, hr3, hr4-left, hr4-right, and hr5) similar to those found in the Autographa californica nuclear polyhedrosis virus (AcNPV) genome were found in the Bombyx mori NPV (BmNPV) genome. The BmNPV hrs contained two to eight repeats of a homologous nucleotide sequence which were on average about 75 bp long. All of these homologous sequence repeats contained a 26-bp-long palindrome motif with an EcoRI or EcoRI-like site at its core. The consensus sequence of the BmNPV hrs showed 95% conservation with respect to those found in AcNPV. Nucleotide sequence analysis indicated that hr2-left and hr2-right of BmNPV evolved from an ancestor similar to hr2 of AcNPV by inversion, cleavage, and ligation. The polarities of the BmNPV and AcNPV hrs were conserved except for that of hr4-left. Within hr4-right of BmNPV, four repeats of a previously undescribed palindrome motif were found. Bmhr5D, a BmNPV mutant which lacked hr5, replicated at a rate similar to that of wild-type BmNPV in BmN cells and silkworm larvae, indicating that hr5 was not essential for viral replication. After ten passages of Bmhr5D in BmN cells, no detectable changes in its genome were observed by restriction endonuclease analysis. The evolution and divergence of the BmNPV genome are also discussed. PMID:8230471
Viau, Roberto A.; Hujer, Andrea M.; Marshall, Steven H.; Perez, Federico; Hujer, Kristine M.; Briceño, David F.; Dul, Michael; Jacobs, Michael R.; Grossberg, Richard; Toltzis, Philip
2012-01-01
Background. Klebsiella pneumoniae isolates harboring the K. pneumoniae carbapenemase gene (blaKPC) are creating a significant healthcare threat in both acute and long-term care facilities (LTCFs). As part of a study conducted in 2004 to determine the risk of stool colonization with extended-spectrum cephalosporin-resistant gram-negative bacteria, 12 isolates of K. pneumoniae that exhibited nonsusceptibility to extended-spectrum cephalosporins were detected. All were gastrointestinal carriage isolates that were not associated with infection. Methods. Reassessment of the carbapenem minimum inhibitory concentrations using revised 2011 Clinical Laboratory Standards Institute breakpoints uncovered carbapenem resistance. To further investigate, a DNA microarray assay, PCR-sequencing of bla genes, immunoblotting, repetitive-sequence-based PCR (rep-PCR) and multilocus sequence typing (MLST) were performed. Results. The DNA microarray detected blaKPC in all 12 isolates, and blaKPC-3 was identified by PCR amplification and sequencing of the amplicon. In addition, a blaSHV-11 gene was detected in all isolates. Immunoblotting revealed “low-level” production of the K. pneumoniae carbapenemase, and rep-PCR indicated that all blaKPC-3-positive K. pneumoniae strains were genetically related (≥98% similar). According to MLST, all isolates belonged to sequence type 36. This sequence type has not been previously linked with blaKPC carriage. Plasmids from 3 representative isolates readily transferred the blaKPC-3 to Escherichia coli J-53 recipients. Conclusions. Our findings reveal the “silent” dissemination of blaKPC-3 as part of Tn4401b on a mobile plasmid in Northeast Ohio nearly a decade ago and establish the first report, to our knowledge, of K. pneumoniae containing blaKPC-3 in an LTCF caring for neurologically impaired children and young adults. PMID:22492318
Stability of Tandem Repeats in the Drosophila Melanogaster HSR-Omega Nuclear RNA
Hogan, N. C.; Slot, F.; Traverse, K. L.; Garbe, J. C.; Bendena, W. G.; Pardue, M. L.
1995-01-01
The Drosophila melanogaster Hsr-omega locus produces a nuclear RNA containing >5 kb of tandem repeat sequences. These repeats are unique to Hsr-omega and show concerted evolution similar to that seen with classical satellite DNAs. In D. melanogaster the monomer is ~280 bp. Sequences of 19 1/2 monomers differ by 8 +/- 5% (mean +/- SD), when all pairwise comparisons are considered. Differences are single nucleotide substitutions and 1-3 nucleotide deletions/insertions. Changes appear to be randomly distributed over the repeat unit. Outer repeats do not show the decrease in monomer homogeneity that might be expected if homogeneity is maintained by recombination. However, just outside the last complete repeat at each end, there are a few fragments of sequence similar to the monomer. The sequences in these flanking regions are not those predicted for sequences decaying in the absence of recombination. Instead, the fragmentation of the sequence homology suggests that flanking regions have undergone more severe disruptions, possibly during an insertion or amplification event. Hsr-omega alleles differing in the number of repeats are detected and appear to be stable over a few thousand generations; however, both increases and decreases in repeat numbers have been observed. The new alleles appear to be as stable as their predecessors. No alleles with less than ~5 kb or more than ~16 kb of repeats were seen in any stocks examined. The evidence that there is a limit on the minimum number of repeats is consistent with the suggestion that these repeats are important in the function of the unusual Hsr-omega nuclear RNA. PMID:7540581
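The "mean +/- SD over all pairwise comparisons" statistic quoted above can be illustrated with a small calculation. This is a minimal sketch, not the authors' procedure: it assumes the monomers are already aligned to equal length, counts a gap character as a difference, and uses three short made-up sequences rather than real Hsr-omega repeats.

```python
from itertools import combinations
from statistics import mean, stdev

def percent_difference(a, b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)

def pairwise_summary(seqs):
    """Mean and sample SD of percent difference over all sequence pairs."""
    diffs = [percent_difference(a, b) for a, b in combinations(seqs, 2)]
    return mean(diffs), stdev(diffs)

# Hypothetical aligned monomers (toy data for illustration only).
toy_monomers = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
m, sd = pairwise_summary(toy_monomers)
print(f"{m:.1f} +/- {sd:.1f} %")
```

Note that pairwise values computed this way are not independent of one another, so the SD describes the spread of the comparisons rather than a standard error.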
The Gap Procedure: for the identification of phylogenetic clusters in HIV-1 sequence data.
Vrbik, Irene; Stephens, David A; Roger, Michel; Brenner, Bluma G
2015-11-04
In the context of infectious disease, sequence clustering can be used to provide important insights into the dynamics of transmission. Cluster analysis is usually performed using a phylogenetic approach whereby clusters are assigned on the basis of sufficiently small genetic distances and high bootstrap support (or posterior probabilities). The computational burden involved in this phylogenetic threshold approach is a major drawback, especially when a large number of sequences are being considered. In addition, this method requires a skilled user to specify the appropriate threshold values which may vary widely depending on the application. This paper presents the Gap Procedure, a distance-based clustering algorithm for the classification of DNA sequences sampled from individuals infected with the human immunodeficiency virus type 1 (HIV-1). Our heuristic algorithm bypasses the need for phylogenetic reconstruction, thereby supporting the quick analysis of large genetic data sets. Moreover, this fully automated procedure relies on data-driven gaps in sorted pairwise distances to infer clusters, thus no user-specified threshold values are required. The clustering results obtained by the Gap Procedure on both real and simulated data, closely agree with those found using the threshold approach, while only requiring a fraction of the time to complete the analysis. Apart from the dramatic gains in computational time, the Gap Procedure is highly effective in finding distinct groups of genetically similar sequences and obviates the need for subjective user-specified values. The clusters of genetically similar sequences returned by this procedure can be used to detect patterns in HIV-1 transmission and thereby aid in the prevention, treatment and containment of the disease.
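The idea of inferring clusters from gaps in sorted pairwise distances can be sketched in a few lines. The abstract does not give the algorithm's details, so the code below is a simplified illustration under stated assumptions rather than the published Gap Procedure: it uses the proportion of differing sites as the distance, takes the midpoint of the single largest gap as the cutoff, and joins sequences by single linkage. All function names and the toy sequences are hypothetical.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def gap_cutoff(distances):
    """Midpoint of the largest gap between consecutive sorted distances."""
    d = sorted(distances)
    if len(d) < 2:
        raise ValueError("need at least two pairwise distances")
    _, i = max((d[k + 1] - d[k], k) for k in range(len(d) - 1))
    return (d[i] + d[i + 1]) / 2

def gap_clusters(seqs):
    """Group sequence indices whose distance falls below the gap cutoff."""
    n = len(seqs)
    dist = {(i, j): p_distance(seqs[i], seqs[j])
            for i, j in combinations(range(n), 2)}
    cutoff = gap_cutoff(dist.values())

    parent = list(range(n))  # union-find forest for single linkage

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for (i, j), d in dist.items():
        if d < cutoff:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy aligned sequences (made up): two obviously distinct groups.
toy = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAATT", "CCCCCCCCCC", "CCCCCCCCCG"]
print(gap_clusters(toy))
```

No threshold is supplied by the user here; the cutoff comes from the data, which is the property the abstract emphasizes. A single global gap is a strong simplification and would likely behave poorly on data with several nested levels of divergence.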
ApiEST-DB: analyzing clustered EST data of the apicomplexan parasites.
Li, Li; Crabtree, Jonathan; Fischer, Steve; Pinney, Deborah; Stoeckert, Christian J; Sibley, L David; Roos, David S
2004-01-01
ApiEST-DB (http://www.cbil.upenn.edu/paradbs-servlet/) provides integrated access to publicly available EST data from protozoan parasites in the phylum Apicomplexa. The database currently incorporates a total of nearly 100,000 ESTs from several parasite species of clinical and/or veterinary interest, including Eimeria tenella, Neospora caninum, Plasmodium falciparum, Sarcocystis neurona and Toxoplasma gondii. To facilitate analysis of these data, EST sequences were clustered and assembled to form consensus sequences for each organism, and these assemblies were then subjected to automated annotation via similarity searches against protein and domain databases. The underlying relational database infrastructure, Genomics Unified Schema (GUS), enables complex biologically based queries, facilitating validation of gene models, identification of alternative splicing, detection of single nucleotide polymorphisms, identification of stage-specific genes and recognition of phylogenetically conserved and phylogenetically restricted sequences.
Al Hammadi, Zulaikha M; Chu, Daniel K W; Eltahir, Yassir M; Al Hosani, Farida; Al Mulla, Mariam; Tarnini, Wasim; Hall, Aron J; Perera, Ranawaka A P M; Abdelkhalek, Mohamed M; Peiris, J S M; Al Muhairi, Salama S; Poon, Leo L M
2015-12-01
In May 2015 in United Arab Emirates, asymptomatic Middle East respiratory syndrome coronavirus infection was identified through active case finding in 2 men with exposure to infected dromedaries. Epidemiologic and virologic findings suggested zoonotic transmission. Genetic sequences for viruses from the men and camels were similar to those for viruses recently detected in other countries.
Rapid Identification of Micro-Organisms.
1985-08-26
mixed cell populations to which this technology has been applied, although many similarities exist as well. In most applications of flow cytometry, it...specific nucleic acid sequences detectable with DNA probes, are applicable only to organisms previously known to and available to the laboratory workers...peak of phycoerythrin, and the 585/593 nm yellow emission from He-Ne lasers now in development is well suited for excitation of phycocyanin. Any of the
Chen, Mindong; Wang, Bin; Zhang, Qianrong; Xue, Zhuzheng
2017-01-01
Fresh-cut luffa (Luffa cylindrica) fruits commonly undergo browning. However, little is known about the molecular mechanisms regulating this process. We used the RNA-seq technique to analyze the transcriptomic changes occurring during the browning of fresh-cut fruits from luffa cultivar ‘Fusi-3’. Over 90 million high-quality reads were assembled into 58,073 Unigenes, and 60.86% of these were annotated based on sequences in four public databases. We detected 35,282 Unigenes with significant hits to sequences in the NCBInr database, and 24,427 Unigenes encoded proteins with sequences that were similar to those of known proteins in the Swiss-Prot database. Additionally, 20,546 and 13,021 Unigenes were similar to existing sequences in the Eukaryotic Orthologous Groups of proteins and Kyoto Encyclopedia of Genes and Genomes databases, respectively. Furthermore, 27,301 Unigenes were differentially expressed during the browning of fresh-cut luffa fruits (i.e., after 1–6 h). Moreover, 11 genes from five gene families (i.e., PPO, PAL, POD, CAT, and SOD) identified as potentially associated with enzymatic browning as well as four WRKY transcription factors were observed to be differentially regulated in fresh-cut luffa fruits. With the assistance of rapid amplification of cDNA ends technology, we obtained the full-length sequences of the 15 Unigenes. We also confirmed these Unigenes were expressed by quantitative real-time polymerase chain reaction analysis. This study provides a comprehensive transcriptome sequence resource, and may facilitate further studies aimed at identifying genes affecting luffa fruit browning for the exploitation of the underlying mechanism. PMID:29145430
König, Stephan; Wubet, Tesfaye; Dormann, Carsten F.; Hempel, Stefan; Renker, Carsten; Buscot, François
2010-01-01
Large-scale (temporal and/or spatial) molecular investigations of the diversity and distribution of arbuscular mycorrhizal fungi (AMF) require considerable sampling efforts and high-throughput analysis. To facilitate such efforts, we have developed a TaqMan real-time PCR assay to detect and identify AMF in environmental samples. First, we screened the diversity in clone libraries, generated by nested PCR, of the nuclear ribosomal DNA internal transcribed spacer (ITS) of AMF in environmental samples. We then generated probes and forward primers based on the detected sequences, enabling AMF sequence type-specific detection in TaqMan multiplex real-time PCR assays. In comparisons to conventional clone library screening and Sanger sequencing, the TaqMan assay approach provided similar accuracy but higher sensitivity with cost and time savings. The TaqMan assays were applied to analyze the AMF community composition within plots of a large-scale plant biodiversity manipulation experiment, the Jena Experiment, primarily designed to investigate the interactive effects of plant biodiversity on element cycling and trophic interactions. The results show that environmental variables hierarchically shape AMF communities and that the sequence type spectrum is strongly affected by previous land use and disturbance, which appears to favor disturbance-tolerant members of the genus Glomus. The AMF species richness of disturbance-associated communities can be largely explained by richness of plant species and plant functional groups, while plant productivity and soil parameters appear to have only weak effects on the AMF community. PMID:20418424
Li, Fagen; Zhou, Changpin; Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming
2015-01-01
Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of the E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning the QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa.
Characterization of c-Ki-ras and N-ras oncogenes in aflatoxin B1-induced rat liver tumors
DOE Office of Scientific and Technical Information (OSTI.GOV)
McMahon, G.; Davis, E.F.; Huber, L.J.
c-Ki-ras and N-ras oncogenes have been characterized in aflatoxin B1-induced hepatocellular carcinomas. Detection of different protooncogene and oncogene sequences and estimation of their frequency distribution were accomplished by polymerase chain reaction, cloning, and plaque screening methods. Two c-Ki-ras oncogene sequences were identified in DNA from liver tumors that contained nucleotide changes absent in DNA from livers of untreated control rats. Sequence changes involving G·C to T·A or G·C to A·T nucleotide substitutions in codon 12 were scored in three of eight tumor-bearing animals. Distributions of c-Ki-ras sequences in tumors and normal liver DNA indicated that the observed nucleotide changes were consistent with those expected to result from direct mutagenesis of the germ-line protooncogene by aflatoxin B1. N-ras oncogene sequences were identified in DNA from two of eight tumors. Three N-ras gene regions were identified, one of which was shown to be associated with an oncogene containing a putative activating amino acid residing at codon 13. All three N-ras sequences, including the region detected in N-ras oncogenes, were present at similar frequencies in DNA samples from control livers as well as liver tumors. The presence of a potential germ-line oncogene may be related to the sensitivity of the Fischer rat strain to liver carcinogenesis by aflatoxin B1 and other chemical carcinogens.
Discovery of a Bovine Enterovirus in Alpaca
McClenahan, Shasta D.; Scherba, Gail; Borst, Luke; Fredrickson, Richard L.; Krause, Philip R.; Uhlenhaut, Christine
2013-01-01
A cytopathic virus was isolated using Madin-Darby bovine kidney (MDBK) cells from lung tissue of alpaca that died of a severe respiratory infection. To identify the virus, the infected cell culture supernatant was enriched for virus particles and a generic, PCR-based method was used to amplify potential viral sequences. Genomic sequence data of the alpaca isolate was obtained and compared with sequences of known viruses. The new alpaca virus sequence was most similar to recently designated Enterovirus species F, previously bovine enterovirus (BEVs), viruses that are globally prevalent in cattle, although they appear not to cause significant disease. Because bovine enteroviruses have not been previously reported in U.S. alpaca, we suspect that this type of infection is fairly rare, and in this case appeared not to spread beyond the original outbreak. The capsid sequence of the detected virus had greatest homology to Enterovirus F type 1 (indicating that the virus should be considered a member of serotype 1), but the virus had greater homology in 2A protease sequence to type 3, suggesting that it may have been a recombinant. Identifying pathogens that infect a new host species for the first time can be challenging. As the disease in a new host species may be quite different from that in the original or natural host, the pathogen may not be suspected based on the clinical presentation, delaying diagnosis. Although this virus replicated in MDBK cells, existing standard culture and molecular methods could not identify it. In this case, a highly sensitive generic PCR-based pathogen-detection method was used to identify this pathogen. PMID:23950875
Li, Fagen; Zhou, Changpin; Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming
2015-01-01
Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of the E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10–56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning the QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and related taxa. PMID:26695430
Identification and characterization of mobile genetic elements LINEs from Brassica genome.
Nouroz, Faisal; Noreen, Shumaila; Khan, Muhammad Fiaz; Ahmed, Shehzad; Heslop-Harrison, J S Pat
2017-09-05
Among transposable elements (TEs), the LTR retrotransposons are the most abundant in plant genomes, followed by non-LTR retrotransposons, the latter being represented by LINEs and SINEs. Computational and molecular approaches were used for the characterization of Brassica LINEs, their diversity and phylogenetic relationships. Four autonomous and four non-autonomous LINE families were identified and characterized from Brassica. Most of the autonomous LINEs displayed two open reading frames, ORF1 and ORF2, where ORF1 is a gag protein domain, while ORF2 encodes an endonuclease (EN) and a reverse transcriptase (RT). Three of four families encoded an additional RNase H (RH) domain in the pol gene, common to 'R' and 'I' types of LINEs. The PCR analyses based on LINE RT fragments indicate their high diversity and widespread occurrence in the 40 Brassica cultivars tested. Database searches revealed homology to LINE sequences in the closely related genus Arabidopsis, indicating their origin from common ancestors predating their separation. The alignment of 58 LINE RT sequences from Brassica, Arabidopsis and other plants depicted 4 conserved domains (domains II-V) showing similarity to previously detected domains. Based on RT alignment of Brassica and 3 known LINEs from monocots, Brassicaceae LINEs clustered in a separate clade, further resolving 4 Brassica-Arabidopsis specific families in 2 sub-clades. High similarities were observed in RT sequences among members of the same family, while low homology was detected among members of different families. The investigation led to the characterization of Brassica-specific LINE families and their diversity across Brassica species and their cultivars. Copyright © 2017 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Chien-Yuan; Li, Quanzi; Tunlaya-Anukit, Sermsawat
2016-03-11
Class III peroxidases are members of a large plant-specific sequence-heterogeneous protein family. Several sequence-conserved homologs have been associated with lignin polymerization in Arabidopsis thaliana, Oryza sativa, Nicotiana tabacum, Zinnia elegans, Picea abies, and Pinus sylvestris. In Populus trichocarpa, a model species for studies of wood formation, the peroxidases involved in lignin biosynthesis have not yet been identified. To do this, we retrieved sequences of all PtrPOs from Peroxibase and conducted RNA-seq to identify candidates. Transcripts from 42 PtrPOs were detected in stem differentiating xylem (SDX) and four of them are the most xylem-abundant (PtrPO12, PtrPO21, PtrPO42, and PtrPO64). PtrPO21 shows xylem-specific expression similar to that of genes encoding the monolignol biosynthetic enzymes. Using protein cleavage-isotope dilution mass spectrometry, PtrPO21 is detected only in the cell wall fraction and not in the soluble fraction. Downregulated transgenics of PtrPO21 have a lignin reduction of ~20% with subunit composition (S/G ratio) similar to wild type. The transgenics show a growth reduction and reddish color of stem wood. The modulus of elasticity (MOE) of the stems of the downregulated PtrPO21-line 8 can be reduced to ~60% of wild type. Differentially expressed gene (DEG) analysis of PtrPO21 downregulated transgenics identified a significant overexpression of PtPrx35, suggesting a compensatory effect within the peroxidase family. No significant changes in the expression of the 49 P. trichocarpa laccases (PtrLACs) were observed.
Positive selection on MHC class II DRB and DQB genes in the bank vole (Myodes glareolus).
Scherman, Kristin; Råberg, Lars; Westerdahl, Helena
2014-05-01
The major histocompatibility complex (MHC) class IIB genes show considerable sequence similarity between loci. The MHC class II DQB and DRB genes are known to exhibit a high level of polymorphism, most likely maintained by parasite-mediated selection. Studies of the MHC in wild rodents have focused on DRB, whilst DQB has been given much less attention. Here, we characterised DQB genes in Swedish bank voles Myodes glareolus, using full-length transcripts. We then designed primers that specifically amplify exon 2 from DRB (202 bp) and DQB (205 bp) and investigated molecular signatures of natural selection on DRB and DQB alleles. The presence of two separate gene clusters was confirmed using BLASTN and phylogenetic analysis, where our seven transcripts clustered according to either DQB or DRB homologues. These gene clusters were again confirmed on exon 2 data from 454-amplicon sequencing. Our DRB primers amplify a similar number of alleles per individual as previously published DRB primers, though our reads are longer. Traditional dN/dS analyses of DRB sequences in the bank vole have not found a conclusive signal of positive selection. Using a more advanced substitution model (the Kumar method) we found positive selection in the peptide binding region (PBR) of both DRB and DQB genes. Maximum likelihood models of codon substitutions detected positively selected sites located in the PBR of both DQB and DRB. Interestingly, these analyses detected at least twice as many positively selected sites in DQB as in DRB, suggesting that DQB has been under stronger positive selection than DRB over evolutionary time.
Frías-De-León, María Guadalupe; Ramírez-Bárcenas, José Antonio; Rodríguez-Arellanes, Gabriela; Velasco-Castrejón, Oscar; Taylor, Maria Lucia; Reyes-Montes, María Del Rocío
2017-03-01
Histoplasmosis is considered the most important systemic mycosis in Mexico, and its diagnosis requires fast and reliable methodologies. The present study evaluated the usefulness of PCR using Hcp100 and 1281-1283 (220) molecular markers in detecting Histoplasma capsulatum in occupational and recreational outbreaks. Seven clinical serum samples of infected individuals from three different histoplasmosis outbreaks were processed by enzyme-linked immunosorbent assay (ELISA) to titre anti-H. capsulatum antibodies and to extract DNA. Fourteen environmental samples were also processed for H. capsulatum isolation and DNA extraction. Both clinical and environmental DNA samples were analysed by PCR with Hcp100 and 1281-1283 (220) markers. Antibodies to H. capsulatum were detected by ELISA in all serum samples using specific antigens, and in six of these samples, the PCR products of both molecular markers were amplified. Four environmental samples amplified one of the two markers, but only one sample amplified both markers and an isolate of H. capsulatum was cultured from this sample. All PCR products were sequenced, and the sequences for each marker were analysed using the Basic Local Alignment Search Tool (BLASTn), which revealed 95-98% and 98-100% similarity with the reference sequences deposited in GenBank for Hcp100 and 1281-1283 (220), respectively. Both molecular markers proved to be useful in studying histoplasmosis outbreaks because they are matched for pathogen detection in either clinical or environmental samples.
Exome-wide Sequencing Shows Low Mutation Rates and Identifies Novel Mutated Genes in Seminomas.
Cutcutache, Ioana; Suzuki, Yuka; Tan, Iain Beehuat; Ramgopal, Subhashini; Zhang, Shenli; Ramnarayanan, Kalpana; Gan, Anna; Lee, Heng Hong; Tay, Su Ting; Ooi, Aikseng; Ong, Choon Kiat; Bolthouse, Jonathan T; Lane, Brian R; Anema, John G; Kahnoski, Richard J; Tan, Patrick; Teh, Bin Tean; Rozen, Steven G
2015-07-01
Testicular germ cell tumors are the most common cancer diagnosed in young men, and seminomas are the most common type of these cancers. There have been no exome-wide examinations of genes mutated in seminomas or of overall rates of nonsilent somatic mutations in these tumors. The objective was to analyze somatic mutations in seminomas to determine which genes are affected and to determine rates of nonsilent mutations. Eight seminomas and matched normal samples were surgically obtained from eight patients. DNA was extracted from tissue samples and exome sequenced on massively parallel Illumina DNA sequencers. Single-nucleotide polymorphism chip-based copy number analysis was also performed to assess copy number alterations. The DNA sequencing read data were analyzed to detect somatic mutations including single-nucleotide substitutions and short insertions and deletions. The detected mutations were validated by independent sequencing and further checked for subclonality. The rate of nonsynonymous somatic mutations averaged 0.31 mutations/Mb. We detected nonsilent somatic mutations in 96 genes that were not previously known to be mutated in seminomas, of which some may be driver mutations. Many of the mutations appear to have been present in subclonal populations. In addition, two genes, KIT and KRAS, were affected in two tumors each with mutations that were previously observed in other cancers and are presumably oncogenic. Our study, the first report on exome sequencing of seminomas, detected somatic mutations in 96 new genes, several of which may be targetable drivers. Furthermore, our results show that seminoma mutation rates are five times higher than previously thought, but are nevertheless low compared to other common cancers. Similar low rates are seen in other cancers that also have excellent rates of remission achieved with chemotherapy. We examined the DNA sequences of seminomas, the most common type of testicular germ cell cancer. 
Our study identified 96 new genes in which mutations occurred during seminoma development, some of which might contribute to cancer development or progression. The study also showed that the rates of DNA mutations during seminoma development are higher than previously thought, but still lower than for other common solid-organ cancers. Such low rates are also observed among other cancers that, like seminomas, show excellent rates of disease remission after chemotherapy. Copyright © 2015 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Najm, Nour-Addeen; Meyer-Kayser, Elisabeth; Hoffmann, Lothar; Pfister, Kurt; Silaghi, Cornelia
2014-07-01
In this study, the prevalence of Hepatozoon spp. in red foxes (Vulpes vulpes) and their ticks from Germany, as well as molecular characterizations and phylogenetic relationship to other Hepatozoon spp. were investigated. DNA extracts of 261 spleen samples and 1,953 ticks were examined for the presence of Hepatozoon spp. by a conventional polymerase chain reaction (PCR) targeting the 18S rRNA gene. The ticks included four tick species: Ixodes ricinus, Ixodes canisuga, Ixodes hexagonus and Dermacentor reticulatus. A total of 118/261 foxes (45.2%) and 148/1,953 ticks (7.5%) were Hepatozoon PCR-positive. Amplicons from 36 positive foxes and 41 positive ticks were sequenced. All sequences obtained from foxes and 39/41 from ticks had a 99% similarity to Hepatozoon canis, whereas two ticks' sequences had a 99% identity to Hepatozoon sp. The obtained Hepatozoon sequences in this study were phylogenetically related to other Hepatozoon sequences detected in other countries, which may represent strain variants. The high prevalence of H. canis DNA in red foxes in this study supports the suggested role of those animals in distribution of this parasite. Furthermore, detection of DNA of H. canis in foxes and all examined tick species collected from those foxes allows speculating about previously undescribed potential vectors for H. canis and suggests a potential role of the red fox in its natural endemic cycles.
Fuentes-Ramírez, Alicia; Jiménez-Soto, Mauricio; Castro, Ruth; Romero-Zuñiga, Juan José
2017-01-01
One hundred and fifty-two blood samples from non-human primates at thirteen rescue centers in Costa Rica were analyzed to determine the presence of Plasmodium species using thick blood smears, semi-nested multiplex polymerase chain reaction (SnM-PCR) for species differentiation, and cloning and sequencing for confirmation. Using thick blood smears, two samples were determined to contain the Plasmodium malariae parasite. With SnM-PCR, a total of five (3.3%) samples were positive for P. malariae; cloning and sequencing confirmed both smear samples as P. malariae. One sample amplified a larger and conserved region of 18S rDNA for the genus Plasmodium and sequencing confirmed the results obtained microscopically and through SnM-PCR tests. Sequencing and construction of a phylogenetic tree of this sample revealed that the P. malariae/P. brasilianum parasite (GenBank KU999995) found in a howler monkey (Alouatta palliata) is identical to that recently reported in humans in Costa Rica. The SnM-PCR detected the P. malariae/P. brasilianum parasite in different non-human primate species in captivity and in various regions of the southern Atlantic and Pacific coast of Costa Rica. The similarity of the sequences of parasites found in humans and a monkey suggests that monkeys may be acting as reservoirs of P. malariae/P. brasilianum, so it is important to include them in control and eradication programs. PMID:28125696
Zhu, X Q; Chilton, N B; Gasser, R B
1998-05-01
This study evaluated the use of a commercially available DNA intercalating agent (Resolver Gold) in agarose gels for the direct detection of sequence variation in ribosomal DNA (rDNA). This agent binds preferentially to AT sequence motifs in DNA. Regions of nuclear rDNA, known to provide genetic markers for the identification of species of parasitic ascarid nematodes (order Ascaridida), were amplified by polymerase chain reaction (PCR) and subjected to electrophoresis in standard agarose gels versus gels supplemented with Resolver Gold. Individual taxa examined could not be distinguished reliably based on the size of their amplicons in standard agarose gels, whereas they could be readily delineated based on mobility using Resolver Gold-supplemented gels. The latter was achieved because of differences (approximately 0.1-8.2%) in the AT content of the fragments among different taxa, which were associated with significant interspecific differences (approximately 11-39%) in the rDNA sequences employed. There was a tendency for fragments with higher AT content to migrate slower in supplemented agarose gels compared with those of lower AT content. The results indicate the usefulness of this electrophoretic approach to rapidly screen for sequence variability within or among PCR-amplified rDNA fragments of similar sizes but differing AT contents. Although evaluated on rDNA of parasites, the approach has potential to be applied to a range of genes of different groups of infectious organisms.
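The AT-content comparison described in this abstract can be illustrated with a minimal sketch. It assumes plain DNA strings as input; the function names `at_content` and `at_difference` and the two toy fragments are hypothetical and are not taken from the study, which reports measured differences of approximately 0.1-8.2% between taxa.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T among the unambiguous bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # ignore N, gaps, etc.
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)


def at_difference(seq_a: str, seq_b: str) -> float:
    """Absolute difference in AT content (percentage points) between two fragments."""
    return abs(at_content(seq_a) - at_content(seq_b))


if __name__ == "__main__":
    # Hypothetical toy fragments of equal length, not real rDNA sequences.
    fragment_a = "ATATGCGCAT"
    fragment_b = "ATATATGCAT"
    print(at_content(fragment_a), at_content(fragment_b))
    print(at_difference(fragment_a, fragment_b))
```

Fragments of similar size but different AT content are the case in which the abstract reports that supplemented gels separate amplicons that standard agarose gels do not.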
Aslett, Denise; Haas, Joseph; Hyman, Michael
2011-09-01
Biodegradation of the gasoline oxygenates methyl tertiary-butyl ether (MTBE) and ethyl tertiary-butyl ether (ETBE) can cause tertiary butyl alcohol (TBA) to accumulate in gasoline-impacted environments. One remediation option for TBA-contaminated groundwater involves oxygenated granulated activated carbon (GAC) reactors that have been self-inoculated by indigenous TBA-degrading microorganisms in groundwater extracted from contaminated aquifers. Identification of these organisms is important for understanding the range of TBA-metabolizing organisms in nature and for determining whether self-inoculation of similar reactors is likely to occur at other sites. In this study ¹³C-DNA-stable isotope probing (SIP) was used to identify TBA-utilizing organisms in samples of self-inoculated BioGAC reactors operated at sites in New York and California. Based on 16S rRNA nucleotide sequences, all TBA-utilizing organisms identified were members of the Burkholderiales order of the β-proteobacteria. Organisms similar to Cupriavidus and Methylibium were observed in both reactor samples while organisms similar to Polaromonas and Rhodoferax were unique to the reactor sample from New York. Organisms similar to Hydrogenophaga and Paucibacter strains were only detected in the reactor sample from California. We also analyzed our samples for the presence of several genes previously implicated in TBA oxidation by pure cultures of bacteria. Genes Mpe_B0532, B0541, B0555, and B0561 were all detected in ¹³C-metagenomic DNA from both reactors and deduced amino acid sequences suggested these genes all encode highly conserved enzymes. One gene (Mpe_B0555) encodes a putative phthalate dioxygenase-like enzyme that may be particularly appropriate for determining the potential for TBA oxidation in contaminated environmental samples.
Identification of snake arenaviruses in live boas and pythons in a zoo in Germany.
Aqrawi, T; Stöhr, A C; Knauf-Witzens, T; Krengel, A; Heckers, K O; Marschang, R E
2015-01-01
Recent studies have described the detection and characterisation of new, snake-specific arenaviruses in boas and pythons with inclusion body disease (IBD). The objective of this study was to detect arenaviral RNA in live snakes and to determine if these were associated with IBD in all cases. Samples for arenavirus detection in live animals were compared. Detected viruses were compared in order to understand their genetic variability. Esophageal swabs and whole blood were collected from a total of 28 boas and pythons. Samples were tested for arenaviral RNA by RT-PCR. Blood smears from all animals were examined for the presence of inclusion bodies. Internal tissues from animals that died or were euthanized during the study were examined for inclusions and via RT-PCR for arenaviral RNA. All PCR products were sequenced and the genomic sequences phylogenetically analysed. Nine live animals were found to be arenavirus-positive. Two additional snakes tested positive following necropsy. Five new arenaviruses were detected and identified. The detected viruses were named "Boa Arenavirus Deutschland (Boa Av DE) numbers 1-4" and one virus detected in a python (Morelia viridis) was named "Python Av DE1". Results from sequence analyses revealed considerable similarities to a portion of the glycoprotein genes of recently identified boid snake arenaviruses. Both oral swabs and whole blood can be used for the detection of arenaviruses in snakes. In most cases, but not in all, the presence of arenaviral RNA correlated with the presence of inclusions in the tissues of infected animals. There was evidence that some animals may be able to clear arenavirus infection without development of IBD. This is the first detection of arenaviruses in live snakes. The detection of arenaviruses in live snakes is of importance for both disease detection and prevention and for use in quarantine situations.
The findings in this study support the theory that arenaviruses are the cause of IBD, but indicate that in some cases it may be possible for animals to clear arenavirus infections without developing IBD.
Kamoun, Choumouss; Payen, Thibaut; Hua-Van, Aurélie; Filée, Jonathan
2013-10-11
Insertion Sequences (ISs) and their non-autonomous derivatives (MITEs) are important components of prokaryotic genomes inducing duplication, deletion, rearrangement or lateral gene transfers. Although ISs and MITEs are relatively simple and basic genetic elements, their detection remains a difficult task due to their remarkable sequence diversity. With the advent of high-throughput genome and metagenome sequencing technologies, the development of fast, reliable and sensitive methods of IS and MITE detection has become an important challenge. So far, almost all studies dealing with prokaryotic transposons have used classical BLAST-based detection methods against reference libraries. Here we introduce alternative methods of detection either taking advantage of the structural properties of the elements (de novo methods) or using an additional library-based method using profile HMM searches. In this study, we have developed three different workflows dedicated to IS and MITE detection: the first two use de novo methods detecting either repeated sequences or the presence of Inverted Repeats; the third one uses 28 in-house transposase alignment profiles with HMM search methods. We have compared the respective performances of each method using a reference dataset of 30 archaeal and 30 bacterial genomes in addition to simulated and real metagenomes. Compared to a BLAST-based method using ISFinder as library, de novo methods significantly improve IS and MITE detection. For example, in the 30 archaeal genomes, we discovered 30 new elements (+20%) in addition to the 141 multi-copy elements already detected by the BLAST approach. Many of the new elements correspond to ISs belonging to unknown or highly divergent families. The total number of MITEs has even doubled with the discovery of elements displaying very limited sequence similarities with their respective autonomous partners (mainly in the Inverted Repeats of the elements).
Concerning metagenomes, with the exception of short-read data (<300 bp), for which both techniques seem equally limited, profile HMM searches considerably improve the detection of transposase-encoding genes (up to +50%) while generating a low level of false positives compared to BLAST-based methods. Compared to classical BLAST-based methods, the sensitivity of the de novo and profile HMM methods developed in this study allows a better and more reliable detection of transposons in prokaryotic genomes and metagenomes. We believe that future studies involving IS and MITE identification in genomic data should combine at least one de novo and one library-based method, with optimal results obtained by running the two de novo methods in addition to a library-based search. For metagenomic data, profile HMM searches should be favored; a BLAST-based step is only useful for the final annotation into groups and families.
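One of the structural properties mentioned in this abstract, the presence of Inverted Repeats at the ends of an element, can be illustrated with a small sketch. This is not the authors' workflow: the function name `terminal_inverted_repeat_length`, the `max_mismatches` parameter, and the toy sequence are assumptions made for illustration, and a real de novo detector would also have to locate candidate element boundaries within a genome.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T expected)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(element: str, max_mismatches: int = 0) -> int:
    """Length of the terminal inverted repeat of a candidate element.

    The 5' end is compared base by base with the reverse complement of the
    3' end. Scanning stops once more than max_mismatches mismatches have been
    seen, and the reported repeat always ends on a matching position.
    """
    seq = element.upper()
    rc = reverse_complement(seq)
    mismatches = 0
    best = 0
    for i in range(len(seq) // 2):  # the two arms cannot overlap
        if seq[i] == rc[i]:
            best = i + 1
        else:
            mismatches += 1
            if mismatches > max_mismatches:
                break
    return best


if __name__ == "__main__":
    # Hypothetical element: a 7 bp inverted repeat flanking an 8 bp core.
    candidate = "GGCATTC" + "AAAAAAAA" + "GAATGCC"
    print(terminal_inverted_repeat_length(candidate))
```

Comparing the sequence with its own reverse complement keeps the check independent of any reference library, which is the point of a de novo approach; tolerating a few mismatches is a design choice that trades specificity for sensitivity to diverged repeats.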
Tardif, Steve; Brady, Heidi A.; Breazeale, Kelly R.; Bi, Ming; Thompson, Leslie D.; Bruemmer, Jason E.; Bailey, Laura B.; Hardy, Daniel M.
2009-01-01
Zonadhesin is a rapidly evolving protein in the sperm acrosome that confers species specificity to sperm-zona pellucida adhesion. Though structural variation in zonadhesin likely contributes to its species-specific function, the protein has not previously been characterized in organisms capable of interbreeding. Here we compared properties of zonadhesin in several animals, including the horse (Equus caballus), donkey (E. asinus), and Grevy's zebra (E. grevyi) to determine if variation in zonadhesin correlates with ability of gametes to cross-fertilize. Zonadhesin localized to the apical acrosomes of spermatozoa from all three Equus species, similar to its localization in other animals. Likewise, in horse and donkey testis, zonadhesin was detected only in germ cells, first in the acrosomal granule of round spermatids and then in the developing acrosomes of elongating spermatids. Among non-Equus species, D3-domain polypeptides of mature, processed zonadhesin varied markedly in size and detergent solubility. However, zonadhesin D3-domain polypeptides in horse, donkey, and zebra spermatozoa exhibited identical electrophoretic mobility and detergent solubility. Equus zonadhesin D3-polypeptides (p110/p80 doublet) were most similar in size to porcine and bovine zonadhesin D3-polypeptides (p105). Sequence comparisons revealed that the horse zonadhesin precursor's domain content and arrangement are similar to those of zonadhesin from other large animals. Partial sequences of horse and donkey zonadhesin were much more similar to each other (>99% identity) than they were to orthologous sequences of human, pig, rabbit, and mouse zonadhesin (52%–72% identity). We conclude that conservation of zonadhesin D3-polypeptide properties correlates with ability of Equus species to interbreed. PMID:19794156
Model-free aftershock forecasts constructed from similar sequences in the past
NASA Astrophysics Data System (ADS)
van der Elst, N.; Page, M. T.
2017-12-01
The basic premise behind aftershock forecasting is that sequences in the future will be similar to those in the past. Forecast models typically use empirically tuned parametric distributions to approximate past sequences, and project those distributions into the future to make a forecast. While parametric models do a good job of describing average outcomes, they are not explicitly designed to capture the full range of variability between sequences, and can suffer from over-tuning of the parameters. In particular, parametric forecasts may produce a high rate of "surprises" - sequences that land outside the forecast range. Here we present a non-parametric forecast method that cuts out the parametric "middleman" between training data and forecast. The method is based on finding past sequences that are similar to the target sequence, and evaluating their outcomes. We quantify similarity as the Poisson probability that the observed event count in a past sequence reflects the same underlying intensity as the observed event count in the target sequence. Event counts are defined in terms of differential magnitude relative to the mainshock. The forecast is then constructed from the distribution of past sequence outcomes, weighted by their similarity. We compare the similarity forecast with the Reasenberg and Jones (RJ95) method, for a set of 2807 global aftershock sequences of M≥6 mainshocks. We implement a sequence-specific RJ95 forecast using a global average prior and Bayesian updating, but do not propagate epistemic uncertainty. The RJ95 forecast is somewhat more precise than the similarity forecast: 90% of observed sequences fall within a factor of two of the median RJ95 forecast value, whereas the fraction is 85% for the similarity forecast. However, the surprise rate is much higher for the RJ95 forecast; 10% of observed sequences fall in the upper 2.5% of the (Poissonian) forecast range. The surprise rate is less than 3% for the similarity forecast.
The similarity forecast may be useful to emergency managers and non-specialists when confidence or expertise in parametric forecasting may be lacking. The method makes over-tuning impossible, and minimizes the rate of surprises. At the least, this forecast constitutes a useful benchmark for more precisely tuned parametric forecasts.
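The weighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exact form of the similarity (here, the Poisson probability of a past count given a mean equal to the target count), the function names, and the toy numbers are all assumptions made for illustration.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of observing k events when the mean is mu."""
    if mu == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def similarity_weights(target_count, past_counts):
    """Normalized weights for past sequences.

    Assumed form: weight is the Poisson pmf of the past early count,
    taking the target's early count as the mean intensity.
    """
    raw = [poisson_pmf(n, target_count) for n in past_counts]
    total = sum(raw)
    if total == 0:
        raise ValueError("no past sequence is similar to the target")
    return [w / total for w in raw]

def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight reaches q."""
    pairs = sorted(zip(values, weights))
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q:
            return value
    return pairs[-1][0]

# Hypothetical toy data: early event counts and later outcomes of past sequences.
past_counts = [2, 5, 6, 20]
past_outcomes = [1, 4, 7, 30]
weights = similarity_weights(5, past_counts)
median_forecast = weighted_quantile(past_outcomes, weights, 0.5)
upper_forecast = weighted_quantile(past_outcomes, weights, 0.975)
```

In this toy case the past sequence with 20 early events receives a negligible weight, so its large outcome barely influences the forecast range.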
Wilkins, David; Lu, Xiao-Ying; Shen, Zhiyong; Chen, Jiapeng
2014-01-01
Methanogenic archaea play a key role in biogas-producing anaerobic digestion and yet remain poorly taxonomically characterized. This is in part due to the limitations of low-throughput Sanger sequencing of a single (16S rRNA) gene, which in the past may have undersampled methanogen diversity. In this study, archaeal communities from three sludge digesters in Hong Kong and one wastewater digester in China were examined using high-throughput pyrosequencing of the methyl coenzyme M reductase (mcrA) and 16S rRNA genes. Methanobacteriales, Methanomicrobiales, and Methanosarcinales were detected in each digester, indicating that both hydrogenotrophic and acetoclastic methanogenesis was occurring. Two sludge digesters had similar community structures, likely due to their similar design and feedstock. Taxonomic classification of the mcrA genes suggested that these digesters were dominated by acetoclastic methanogens, particularly Methanosarcinales, while the other digesters were dominated by hydrogenotrophic Methanomicrobiales. The proposed euryarchaeotal order Methanomassiliicoccales and the uncultured WSA2 group were detected with the 16S rRNA gene, and potential mcrA genes for these groups were identified. 16S rRNA gene sequencing also recovered several crenarchaeotal groups potentially involved in the initial anaerobic digestion processes. Overall, the two genes produced different taxonomic profiles for the digesters, while greater methanogen richness was detected using the mcrA gene, supporting the use of this functional gene as a complement to the 16S rRNA gene to better assess methanogen diversity. A significant positive correlation was detected between methane production and the abundance of mcrA transcripts in digesters treating sludge and wastewater samples, supporting the mcrA gene as a biomarker for methane yield. PMID:25381241
Plasmin on adherent cells: from microvesiculation to apoptosis
Doeuvre, Loïc; Plawinski, Laurent; Goux, Didier; Vivien, Denis; Anglés-Cano, Eduardo
2010-01-01
SYNOPSIS Cell activation by stressors is characterised by a sequence of detectable phenotypic cell changes. The strength of a given stimulus induces modifications in the activity of membrane phospholipid transporters and calpains, which leads to phosphatidylserine exposure, membrane blebbing and the release of microparticles (nanoscale membrane vesicles). This vesiculation could be considered as a warning signal that may be followed, if the stimulus is maintained, by cell detachment-induced apoptosis. In this study, plasminogen incubated onto adherent cells is activated into plasmin by constitutively expressed tPA or uPA. Plasmin formed on the cellular membrane then induces a unique response characterized by membrane blebbing and vesiculation. Hitherto unknown for plasmin, these membrane changes are similar to those induced by thrombin on platelets. If plasmin formation evolves, matrix proteins are then degraded, cells lose attachment and enter the apoptotic process, characterized by DNA fragmentation and electron microscopy features. This sequence of events was experimentally documented at all these stages. Since other proteolytic or inflammatory stimuli may evoke similar responses by distinct adherent cells, this sequence can be applied to distinguish activated adherent cells from cells entering the apoptotic process. This is a major definition crucial to the identification of mediators, inhibitors and potential therapeutic agents. PMID:20846121
Peddayelachagiri, Bhavani V.; Paul, Soumya; Nagaraj, Sowmya; Gogoi, Madhurjya; Sripathy, Murali H.; Batra, Harsh V.
2016-01-01
Accurate identification of pathogens with biowarfare importance requires detection tools that specifically differentiate them from near-neighbor species. Burkholderia pseudomallei, the causative agent of the fatal disease melioidosis, is one such biothreat agent whose differentiation from its near-neighbor species is always a challenge. This is because of its phenotypic similarity with other Burkholderia species, which have a widespread geographical distribution with shared environmental niches. Melioidosis is a major public health concern in endemic regions including Southeast Asia and northern Australia. In India, the disease is still considered to be emerging. Prevalence surveys of this saprophytic bacterium in the environment are under-reported in the country. A major challenge in this case is the specific identification and differentiation of B. pseudomallei from the growing list of species of the Burkholderia genus. The objectives of this study included examining the prevalence of B. pseudomallei and near-neighbor species in the coastal region of South India and development of a novel detection tool for specific identification and differentiation of Burkholderia species. Briefly, we analyzed soil and water samples collected from the Malabar coastal region of Kerala, South India for prevalence of B. pseudomallei. The presumptive Burkholderia isolates were identified using a recA PCR assay. The recA PCR assay identified 22 of the total 40 presumptive isolates as Burkholderia strains (22.72% and 77.27% B. pseudomallei and non-pseudomallei Burkholderia respectively). In order to identify each isolate screened, we performed recA and 16S rDNA sequencing. This two-gene sequencing revealed that the presumptive isolates included B. pseudomallei, non-pseudomallei Burkholderia as well as non-Burkholderia strains. Furthermore, a gene termed D-beta hydroxybutyrate dehydrogenase (bdha) was studied both in silico and in vitro for accurate detection of the Burkholderia genus.
When evaluated on the Burkholderia isolates of this study, the optimized bdha-based PCR assay was found to be highly specific (100%), and a clear detection sensitivity of 10 pg/μl of purified gDNA was recorded. Interspecies nucleotide sequence variation of bdha, as per in silico analysis, ranged from 8 to 29% within the target stretch of 730 bp, highlighting the potential utility of the bdha sequencing method in specific detection of Burkholderia species. Further, sequencing of the 730 bp bdha PCR amplicon of each Burkholderia strain isolated could differentiate the species, and the data were comparable with recA sequence data of the strains. All sequencing results obtained were submitted to the NCBI database. Bayesian phylogenetic analysis of bdha in comparison with recA and 16S rDNA showed that the bdha gene provided comparable identification of Burkholderia species. PMID:27632353
A highly divergent Puumala virus lineage in southern Poland.
Rosenfeld, Ulrike M; Drewes, Stephan; Ali, Hanan Sheikh; Sadowska, Edyta T; Mikowska, Magdalena; Heckel, Gerald; Koteja, Paweł; Ulrich, Rainer G
2017-05-01
Puumala virus (PUUV) represents one of the most important hantaviruses in Central Europe. Phylogenetic analyses of PUUV strains indicate a strong genetic structuring of this hantavirus. Recently, PUUV sequences were identified in the natural reservoir, the bank vole (Myodes glareolus), collected in the northern part of Poland. The objective of this study was to evaluate the presence of PUUV in bank voles from southern Poland. A total of 72 bank voles were trapped in 2009 at six sites in this part of Poland. RT-PCR and IgG-ELISA analyses detected three PUUV positive voles at one trapping site. The PUUV-infected animals were identified by cytochrome b gene analysis to belong to the Carpathian and Eastern evolutionary lineages of bank vole. The novel PUUV S, M and L segment nucleotide sequences showed the closest similarity to sequences of the Russian PUUV lineage from Latvia, but were highly divergent to those previously found in northern Poland, Slovakia and Austria. In conclusion, the detection of a highly divergent PUUV lineage in southern Poland indicates the necessity of further bank vole monitoring in this region allowing rational public health measures to prevent human infections.
Vedler, Eve; Heinaru, Eeva; Jutkina, Jekaterina; Viggor, Signe; Koressaar, Triinu; Remm, Maido; Heinaru, Ain
2013-12-01
A set of phenol-degrading strains of a collection of bacteria isolated from Baltic Sea surface water was screened for the presence of two key catabolic genes coding for phenol hydroxylases and catechol 2,3-dioxygenases. The multicomponent phenol hydroxylase (LmPH) gene was detected in 70 out of 92 strains studied, and 41 strains among these LmPH(+) phenol-degraders were found to exhibit catechol 2,3-dioxygenase (C23O) activity. Comparative phylogenetic analyses of LmPH and C23O sequences from 56 representative strains were performed. The studied strains were mostly affiliated to the genera Pseudomonas and Acinetobacter. However, the study also widened the range of phenol-degraders by including the genus Limnobacter. Furthermore, using a next generation sequencing approach, the LmPH genes of Limnobacter strains were found to be the most prevalent ones in the microbial community of the Baltic Sea surface water. Four different Limnobacter strains having almost identical 16S rRNA gene sequences (99%) and similar physiological properties formed separate phylogenetic clusters of LmPH and C23O genes in the respective phylogenetic trees. Copyright © 2013 Elsevier GmbH. All rights reserved.
Akhoundi, Mohammad; Cannet, Arnaud; Loubatier, Céline; Berenger, Jean-Michel; Izri, Arezki; Marty, Pierre; Delaunay, Pascal
2016-01-01
Wolbachia symbionts are maternally inherited intracellular bacteria that have been detected in numerous insects including bed bugs. The objective of this study, the first epidemiological study in Europe, was to screen Wolbachia infection among Cimex lectularius collected in the field, using PCR targeting the surface protein gene (wsp), and to compare obtained Wolbachia strains with those reported from laboratory colonies of C. lectularius as well as other Wolbachia groups. For this purpose, 284 bed bug specimens were caught and studied from eight different regions of France including the suburbs of Paris, Bouches-du-Rhône, Lot-et-Garonne, and five localities in Alpes-Maritimes. Among the samples, 166 were adults and the remaining 118 were considered nymphs. In all, 47 out of 118 nymphs (40%) and 61 out of 166 adults (37%) were found positive on wsp screening. Among the positive cases, 10 samples were selected randomly for sequencing. The sequences had 100% homology with wsp sequences belonging to the F-supergroup strains of Wolbachia. Therefore, we confirm the similarity of Wolbachia strains detected in this epidemiological study to Wolbachia spp. reported from laboratory colonies of C. lectularius. PMID:27492563
HPV detection rate in saliva may depend on the immune system efficiency.
Adamopoulou, Maria; Vairaktaris, Eleftherios; Panis, Vassilis; Nkenke, Emeka; Neukam, Friedreich W; Yapijakis, Christos
2008-01-01
Human papilloma virus (HPV) has been established as a major etiological factor of anogenital cancer. In addition, HPV has also been implicated in oral carcinogenesis but its detection rates appear to be highly variable, depending on the patient population tested, the molecular methodology used, as well as the type of oral specimen investigated. For example, saliva is an oral fluid that may play a role in HPV transmission, although the detection rates of the virus are lower than in tissue. Recent evidence has indicated that HPV-related pathology is increased in the oral cavity of human immunodeficiency virus (HIV)-positive individuals. In order to investigate whether the presence of different HPV types in saliva depends on immune system efficiency, oral fluid samples of patients with oral cancer and without any known immune deficiency were compared with those of HIV-positive individuals. Saliva samples were collected from 68 patients with oral squamous cell carcinoma and 34 HIV seropositive individuals. HPV DNA sequences were detected by L1 consensus polymerase chain reaction (PCR), followed by restriction fragment length polymorphism (RFLP) analysis and DNA sequencing for HPV typing. HPV DNA was detected in 7/68 (10.3%) of the oral cancer patients and in 12/34 (35.3%) of the HIV-positive individuals, a highly significant difference (p = 0.006; odds ratio 4.753; 95% confidence interval 1.698-13.271). Among HPV-positive samples, the prevalence of HPV types associated with high oncogenic risk was similar in oral cancer and HIV-positive cases (71.4% and 66.7%, respectively). In both groups, the most common HPV type was high-risk 16 (50% and 42.8%, respectively). Although a similar pattern of HPV high-risk types was detected in oral cancer and HIV-positive cases, the quantitative detection of HPV in saliva significantly depended on immune system efficiency.
Furthermore, the significantly increased detection rates of HPV in saliva of HIV-positive individuals may be associated with high risk for development of HPV-related oral lesions, including malignancy.
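As a worked check of the reported odds ratio, the counts in this abstract (12 of 34 HIV-positive individuals and 7 of 68 oral cancer patients with detectable HPV DNA) can be placed in a 2×2 table. The sketch below uses the standard log-odds (Wald) interval, which is an assumption: the abstract does not say which interval method was used, so the bounds computed here need not match the reported interval exactly.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: positives and negatives in the group of interest
    c, d: positives and negatives in the comparison group
    """
    odds_ratio = (a * d) / (b * c)
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * standard_error)
    upper = math.exp(log_or + z * standard_error)
    return odds_ratio, lower, upper

# HIV-positive: 12 HPV-positive, 22 HPV-negative (34 total)
# Oral cancer:   7 HPV-positive, 61 HPV-negative (68 total)
odds_ratio, lower, upper = odds_ratio_with_ci(12, 22, 7, 61)
print(f"OR = {odds_ratio:.3f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

The point estimate is (12 × 61) / (22 × 7) = 732 / 154, which rounds to the 4.753 given in the abstract.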
Dialog detection in narrative video by shot and face analysis
NASA Astrophysics Data System (ADS)
Kroon, B.; Nesvadba, J.; Hanjalic, A.
2007-01-01
The proliferation of captured personal and broadcast content in personal consumer archives necessitates comfortable access to stored audiovisual content. Intuitive retrieval and navigation solutions require however a semantic level that cannot be reached by generic multimedia content analysis alone. A fusion with film grammar rules can help to boost the reliability significantly. The current paper describes the fusion of low-level content analysis cues including face parameters and inter-shot similarities to segment commercial content into film grammar rule-based entities and subsequently classify those sequences into so-called shot reverse shots, i.e. dialog sequences. Moreover shot reverse shot specific mid-level cues are analyzed augmenting the shot reverse shot information with dialog specific descriptions.
Kang, Hae Ji; Bennett, Shannon N.; Dizney, Laurie; Sumibcay, Laarni; Arai, Satoru; Ruedas, Luis A.; Song, Jin-Won; Yanagihara, Richard
2009-01-01
A genetically distinct hantavirus, designated Oxbow virus (OXBV), was detected in tissues of an American shrew mole (Neurotrichus gibbsii), captured in Gresham, Oregon, in September 2003. Pairwise analysis of full-length S- and M- and partial L-segment nucleotide and amino acid sequences of OXBV indicated low sequence similarity with rodent-borne hantaviruses. Phylogenetic analyses using maximum-likelihood and Bayesian methods, and host-parasite evolutionary comparisons, showed that OXBV and Asama virus, a hantavirus recently identified from the Japanese shrew mole (Urotrichus talpoides), were related to soricine shrew-borne hantaviruses from North America and Eurasia, respectively, suggesting parallel evolution associated with cross-species transmission. PMID:19394994
Ikram, Najmul; Qadir, Muhammad Abdul; Afzal, Muhammad Tanvir
2018-01-01
Sequence similarity is a commonly used measure to compare proteins. With the increasing use of ontologies, semantic (function) similarity is gaining importance. The correlation between these measures has been applied in the evaluation of new semantic similarity methods, and in protein function prediction. In this research, we investigate the relationship between the two similarity methods. The results suggest the absence of a strong correlation between sequence and semantic similarities. There is a large number of proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, the term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity behave distinctly. These findings are significant for future work on protein comparison, and will help in better understanding the semantic similarity between proteins.
Deshpande, Lalitagauri M; Ashcraft, Deborah S; Kahn, Heather P; Pankey, George; Jones, Ronald N; Farrell, David J; Mendes, Rodrigo E
2015-10-01
Two linezolid-resistant Enterococcus faecium isolates (MICs, 8 μg/ml) from unique patients of a medical center in New Orleans were included in this study. Isolates were initially investigated for the presence of mutations in the V domain of 23S rRNA genes and L3, L4, and L22 ribosomal proteins, as well as cfr. Isolates were subjected to pulsed-field gel electrophoresis (just one band difference), and one representative strain was submitted to whole-genome sequencing. Gene location was also determined by hybridization, and cfr genes were cloned and expressed in a Staphylococcus aureus background. The two isolates had one out of six 23S rRNA alleles mutated (G2576T), had wild-type L3, L4, and L22 sequences, and were positive for a cfr-like gene. The sequence of the protein encoded by the cfr-like gene was most similar (99.7%) to that found in Peptoclostridium difficile, which shared only 74.9% amino acid identity with the proteins encoded by genes previously identified in staphylococci and non-faecium enterococci and was, therefore, denominated Cfr(B). When expressed in S. aureus, the protein conferred a resistance profile similar to that of Cfr. Two copies of cfr(B) were chromosomally located and embedded in a Tn6218 similar to the cfr-carrying transposon described in P. difficile. This study reports the first detection of cfr genes in E. faecium clinical isolates in the United States and characterization of a new cfr variant, cfr(B). cfr(B) has been observed in mobile genetic elements in E. faecium and P. difficile, suggesting potential for dissemination. However, further analysis is necessary to assess the resistance levels conferred by cfr(B) when expressed in enterococci. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan
2017-01-01
PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV), from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome, with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity, and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.
First molecular evidence of Hepatozoon canis infection in red foxes and golden jackals from Hungary.
Farkas, Róbert; Solymosi, Norbert; Takács, Nóra; Hornyák, Ákos; Hornok, Sándor; Nachum-Biala, Yaarit; Baneth, Gad
2014-07-02
Recently, Hepatozoon canis infection has been detected among shepherd, hunting and stray dogs in the southern part of Hungary, which is considered to be free of Rhipicephalus sanguineus sensu lato and close to the border with Croatia. The aim of this study was to acquire information on the possibility that red foxes and/or golden jackals could play a role in the appearance and spread of H. canis in Hungary. A conventional PCR was used to amplify a 666 bp long fragment of the Hepatozoon 18S rRNA gene from blood samples collected from 334 foxes shot in 231 locations in 16 counties and 15 golden jackals shot in 9 locations in two southwestern counties close to Croatia. A second PCR assay was performed in some of the samples positive by the first PCR to amplify a larger segment (approximately 1500 bp) of the 18S rRNA gene of Hepatozoon spp. for further phylogenetic analysis. Hepatozoon infection was detected in canids shot in 30 locations and 9 counties. Altogether 26 foxes (8.0%, 95% CI: 5-11%) and 9 jackals (60%, 95% CI: 33-81%) were PCR positive. Hepatozoon canis sequences were obtained from 12 foxes and 7 jackals. DNA sequences from 16 animals were 99-100% similar to H. canis from Croatian foxes or dogs while two of the sequences were 99% similar to an Italian fox. Half (13/26) of the infected red foxes and all golden jackals were shot in the two southwestern counties. This is the first report on molecular evidence of H. canis in red foxes (Vulpes vulpes) and golden jackals (Canis aureus) from Hungary, which is considered free from the tick vector of H. canis, R. sanguineus. Although no R. sanguineus sensu lato had been found on infected or non-infected wild canids, the detection of autochthonous canine hepatozoonosis in Hungary might imply that the range of R. sanguineus sensu lato has reached this country.
Mohan, Sudesh B; Schmid, Markus; Jetten, Mike; Cole, Jeff
2004-09-01
Degenerate primers to detect nrfA were designed by aligning six nrfA sequences including Escherichia coli K-12, Sulfurospirillum deleyianum and Wolinella succinogenes. These primers amplified a 490 bp fragment of nrfA. The ability of these primers to detect nrfA was tested with chromosomal DNA isolated from a variety of bacteria: they could distinguish between bacteria in which the gene is known to be present or absent. The positive reference organisms spanned the various classes of Proteobacteria, suggesting that these primers are probably generic. The primer pair F1 and R1 was also used successfully to analyse nrfA diversity from community DNA isolated from a sulphate reducing bioreactor, and from two established Anammox reactors (for anaerobic ammonia oxidation, in which nitrite is reduced by ammonia to dinitrogen gas). The nrfA clones isolated from these three sources grouped with the Bacteroidetes phylum. The nrfA primers also amplified 570 bp fragments from the Anammox community DNA. These fragments encoded a protein with four haem-binding motifs typical of a c-type cytochrome, but were unrelated to the NrfA nitrite reductase. A BLAST search failed to reveal similarity to any known proteins. However, similarity was found to one sequence, which was annotated as rapC (response regulator aspartate phosphatase), in the genome of the planctomycete Rhodopirellula baltica. These sequences possibly belong to a new class of c-type cytochrome that might be specific to members of the order Planctomycetales. The data are consistent with the proposal that cytochrome c nitrite reductases, present in the periplasm of Gram-negative bacteria, are widely distributed in many different environments where they provide a short circuit in the biological nitrogen cycle by reducing nitrite directly to ammonia.
Kvitt, H; Ucko, M; Colorni, A; Batargias, C; Zlotkin, A; Knibb, W
2002-04-05
A PCR protocol for the rapid diagnosis of fish 'pasteurellosis' based on 16S rRNA gene sequences was developed. The procedure combines low annealing temperature that detects low titers of Photobacterium damselae but also related species, and high annealing temperature for the specific identification of P. damselae directly from infected fish. The PCR protocol was validated on 19 piscine isolates of P. damselae ssp. piscicida from different geographic regions (Japan, Italy, Spain, Greece and Israel), on spontaneously infected sea bream Sparus aurata and sea bass Dicentrarchus labrax, and on closely related American Type Culture Collection (ATCC) reference strains. PCR using high annealing temperature (64 °C) discriminated between P. damselae and closely related reference strains, including P. histaminum. Sixteen isolates of P. damselae ssp. piscicida, 2 P. damselae ssp. piscicida reference strains and 1 P. damselae ssp. damselae reference strain were subjected to Amplified Fragment Length Polymorphism (AFLP) analysis, and a similarity matrix was produced. Accordingly, the Japanese isolates of P. damselae ssp. piscicida were distinguished from the Mediterranean/European isolates at a cut-off value of 83% similarity. A further subclustering at a cut-off value of 97% allowed discrimination between the Israeli P. damselae ssp. piscicida isolates and the other Mediterranean/European isolates. The combination of PCR direct amplification and AFLP provides a 2-step procedure, where P. damselae is rapidly identified at genus level on the basis of its 16S rRNA gene sequence and then grouped into distinct clusters on the basis of AFLP polymorphisms. The first step of direct amplification is highly sensitive and has immediate practical consequences, offering fish farmers a rapid diagnosis, while the AFLP is more specific and detects intraspecific variation which, in our study, also reflected geographic correspondence.
Because of its superior discriminative properties, AFLP can be an important tool for epidemiological and taxonomic studies of this highly homogeneous genus.
Suzuki, Shun'ichi; Takenaka, Yasuhiro; Onishi, Norimasa; Yokozeki, Kenzo
2005-08-01
A DNA fragment from Microbacterium liquefaciens AJ 3912, containing the genes responsible for the conversion of 5-substituted-hydantoins to alpha-amino acids, was cloned in Escherichia coli and sequenced. Seven open reading frames (hyuP, hyuA, hyuH, hyuC, ORF1, ORF2, and ORF3) were identified on the 7.5 kb fragment. The deduced amino acid sequence encoded by the hyuA gene included the N-terminal amino acid sequence of the hydantoin racemase from M. liquefaciens AJ 3912. The hyuA, hyuH, and hyuC genes were heterologously expressed in E. coli; their presence corresponded with the detection of hydantoin racemase, hydantoinase, and N-carbamoyl alpha-amino acid amido hydrolase enzymatic activities respectively. The deduced amino acid sequences of hyuP were similar to those of the allantoin (5-ureido-hydantoin) permease from Saccharomyces cerevisiae, suggesting that hyuP protein might function as a hydantoin transporter.
CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats.
Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine
2007-07-01
Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved regions (DR) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacer) of usually viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader and assumed to be a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) get the flanking sequences to determine the leader; (iv) blast spacers against Genbank database and (v) check if the DR is found elsewhere in prokaryotic sequenced genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
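The repeat-and-spacer structure described in this abstract can be illustrated with a toy detector. This is only a sketch and not CRISPRFinder's algorithm: it assumes exact repeat matches of a fixed length, and the function name, parameters, defaults, and toy sequence are all hypothetical.

```python
from collections import defaultdict

def find_crispr_like(seq, k=23, min_spacer=20, max_spacer=60, min_repeats=3):
    """Report exact k-mers that recur with spacer-sized gaps between copies.

    Returns a list of (repeat, start_positions, spacers) tuples.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    hits = []
    for kmer, starts in positions.items():
        best, chain = [], [starts[0]]
        for start in starts[1:]:
            gap = start - (chain[-1] + k)  # length of the candidate spacer
            if min_spacer <= gap <= max_spacer:
                chain.append(start)
            else:
                if len(chain) > len(best):
                    best = chain
                chain = [start]
        if len(chain) > len(best):
            best = chain
        if len(best) >= min_repeats:
            spacers = [seq[a + k:b] for a, b in zip(best, best[1:])]
            hits.append((kmer, best, spacers))
    return hits

# Toy sequence with a short hypothetical repeat so the example stays readable.
repeat = "GTTTTAGA"
toy = "TT" + repeat + "ACGTA" + repeat + "CCGGT" + repeat + "GG"
hits = find_crispr_like(toy, k=8, min_spacer=4, max_spacer=6, min_repeats=3)
```

On real genomes a detector would also need to tolerate mismatches within repeats and to try a range of repeat lengths, which this sketch leaves out.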
DOE Office of Scientific and Technical Information (OSTI.GOV)
Foltz, R.; Wilson, G.; DeGroot, A.
We study the slope, intercept, and scatter of the color–magnitude and color–mass relations for a sample of 10 infrared red-sequence-selected clusters at z ∼ 1. The quiescent galaxies in these clusters formed the bulk of their stars above z ≳ 3 with an age spread Δt ≳ 1 Gyr. We compare UVJ color–color and spectroscopic-based galaxy selection techniques, and find a 15% difference in the galaxy populations classified as quiescent by these methods. We compare the color–magnitude relations from our red-sequence selected sample with X-ray- and photometric-redshift-selected cluster samples of similar mass and redshift. Within uncertainties, we are unable to detect any difference in the ages and star formation histories of quiescent cluster members in clusters selected by different methods, suggesting that the dominant quenching mechanism is insensitive to cluster baryon partitioning at z ∼ 1.
Liew, Pauline Woanying; Jong, Bor Chyan
2008-05-01
Two culture-independent methods, namely ribosomal DNA libraries and denaturing gradient gel electrophoresis (DGGE), were adopted to examine the microbial community of a Malaysian light crude oil. In this study, both 16S and 18S rDNAs were PCR-amplified from bulk DNA of crude oil samples, cloned, and sequenced. Analyses of restriction fragment length polymorphism (RFLP) and phylogenetics clustered the 16S and 18S rDNA sequences into seven and six groups, respectively. The ribosomal DNA sequences obtained showed sequence similarity between 90 to 100% to those available in the GenBank database. The closest relatives documented for the 16S rDNAs include member species of Thermoincola and Rhodopseudomonas, whereas the closest fungal relatives include Acremonium, Ceriporiopsis, Xeromyces, Lecythophora, and Candida. Others were affiliated to uncultured bacteria and uncultured ascomycete. The 16S rDNA library demonstrated predomination by a single uncultured bacterial type by >80% relative abundance. The predomination was confirmed by DGGE analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Melbourne, J.; Soifer, B. T.; Desai, Vandana
Dust-obscured galaxies (DOGs) are a subset of high-redshift (z Almost-Equal-To 2) optically-faint ultra-luminous infrared galaxies (ULIRGs, e.g., L{sub IR} > 10{sup 12} L{sub Sun} ). We present new far-infrared photometry, at 250, 350, and 500 {mu}m (observed-frame), from the Herschel Space Telescope for a large sample of 113 DOGs with spectroscopically measured redshifts. Approximately 60% of the sample are detected in the far-IR. The Herschel photometry allows the first robust determinations of the total infrared luminosities of a large sample of DOGs, confirming their high IR luminosities, which range from 10{sup 11.6} L{sub Sun} 10{sup 13} L{sub Sun }. Themore » rest-frame near-IR (1-3 {mu}m) spectral energy distributions (SEDs) of the Herschel-detected DOGs are predictors of their SEDs at longer wavelengths. DOGs with 'power-law' SEDs in the rest-frame near-IR show observed-frame 250/24 {mu}m flux density ratios similar to the QSO-like local ULIRG, Mrk 231. DOGs with a stellar 'bump' in their rest-frame near-IR show observed-frame 250/24 {mu}m flux density ratios similar to local star-bursting ULIRGs like NGC 6240. None show 250/24 {mu}m flux density ratios similar to extreme local ULIRG, Arp 220; though three show 350/24 {mu}m flux density ratios similar to Arp 220. For the Herschel-detected DOGs, accurate estimates (within {approx}25%) of total IR luminosity can be predicted from their rest-frame mid-IR data alone (e.g., from Spitzer observed-frame 24 {mu}m luminosities). Herschel-detected DOGs tend to have a high ratio of infrared luminosity to rest-frame 8 {mu}m luminosity (the IR8 = L{sub IR}(8-1000 {mu}m)/{nu}L{sub {nu}}(8 {mu}m) parameter of Elbaz et al.). Instead of lying on the z = 1-2 'infrared main sequence' of star-forming galaxies (like typical LIRGs and ULIRGs at those epochs) the DOGs, especially large fractions of the bump sources, tend to lie in the starburst sequence. 
While Herschel-detected DOGs are similar to scaled-up versions of local ULIRGs in terms of 250/24 μm flux density ratio and IR8, they tend to have cooler far-IR dust temperatures (20-40 K for DOGs versus 40-50 K for local ULIRGs) as measured by the rest-frame 80/115 μm flux density ratios (e.g., observed-frame 250/350 μm ratios at z = 2). DOGs that are not detected by Herschel appear to have lower observed-frame 250/24 μm ratios than the detected sample, either because of warmer dust temperatures, lower IR luminosities, or both.
Transcription Factor Map Alignment of Promoter Regions
Blanco, Enrique; Messeguer, Xavier; Smith, Temple F; Guigó, Roderic
2006-01-01
We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because the promoter regions of co-expressed genes often show no discernible sequence conservation. In our approach, therefore, we have not directly compared the nucleotide sequences of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels, which we refer to here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm on a small, but well-curated, collection of human–mouse orthologous gene pairs. Results on this dataset, as well as on an independent, much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements that cannot be detected by typical sequence alignments. PMID:16733547
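The idea of aligning sequences of binding-factor labels rather than nucleotides can be illustrated with a minimal sketch. This is not the authors' algorithm, which is adapted from restriction enzyme map alignment and has its own optimized parameters. It is a plain Needleman-Wunsch global alignment over label lists, and the function name align_label_sequences, the scoring values and the example labels are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]


def align_label_sequences(
    a: List[str],
    b: List[str],
    match: int = 2,      # assumed reward for identical labels
    mismatch: int = -1,  # assumed penalty for different labels
    gap: int = -1,       # assumed penalty for an unaligned site
) -> Tuple[int, List[Pair]]:
    """Globally align two lists of TF labels (simplified illustration)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # pair the two sites
                score[i - 1][j] + gap,      # site in a left unaligned
                score[i][j - 1] + gap,      # site in b left unaligned
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                pairs.append((a[i - 1], b[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    # Hypothetical label sequences for two orthologous promoters.
    human = ["SP1", "TBP", "CEBP", "NFKB"]
    mouse = ["SP1", "CEBP", "NFKB"]
    total, alignment = align_label_sequences(human, mouse)
    print(total)
    for left, right in alignment:
        print(left or "-", right or "-")
```

A fuller treatment would also need to use the positions and scores of the predicted sites, which this sketch ignores.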
Method and apparatus for biological sequence comparison
Marr, T.G.; Chang, W.I.
1997-12-23
A method and apparatus are disclosed for comparing biological sequences from a known source of sequences with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or that have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short-fragment best matches for the block provides an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.
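The block-division step can be sketched as follows. The abstract does not state the block size or overlap, so this sketch assumes blocks of twice the minimum alignment length, stepped by one minimum length, which guarantees that every window of the minimum length lies wholly inside at least one block. The name overlapping_blocks and the parameter min_len are hypothetical.

```python
from typing import List, Tuple


def overlapping_blocks(seq: str, min_len: int) -> List[Tuple[int, str]]:
    """Split seq into overlapping blocks, returned as (start, block) pairs.

    Assumption (not from the source): each block is 2 * min_len long and
    consecutive blocks start min_len apart, so any window of min_len
    residues is fully contained in some block.
    """
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    block_len = 2 * min_len
    # The last window of length min_len starts at len(seq) - min_len.
    last_start = max(len(seq) - min_len, 0)
    return [
        (start, seq[start:start + block_len])
        for start in range(0, last_start + 1, min_len)
    ]


if __name__ == "__main__":
    for start, block in overlapping_blocks("ACGTACGTAC", 3):
        print(start, block)
```

Each block would then be passed to the filtering comparison described in the abstract, which is not reproduced here.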
Method and apparatus for biological sequence comparison
Marr, Thomas G.; Chang, William I-Wei
1997-01-01
A method and apparatus for comparing biological sequences from a known source of sequences with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or that have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short-fragment best matches for the block provides an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence.
Moustafa, Mohamed Abdallah Mohamed; Lee, Kyunglee; Taylor, Kyle; Nakao, Ryo; Sashika, Mariko; Shimozuru, Michito; Tsubota, Toshio
2015-12-01
A previously undescribed Anaplasma species (herein referred to as AP-sd) has been detected in sika deer, cattle and ticks in Japan. Despite being highly similar to some strains of A. phagocytophilum, AP-sd has never been detected in humans. Its ambiguous epidemiology and the lack of tools for its specific detection make it difficult to understand and interpret the prevalence of this Anaplasma species. We developed a method for specific detection, and examined AP-sd prevalence in Hokkaido wildlife. Our study included 250 sika deer (Cervus nippon yesoensis), 13 brown bears (Ursus arctos yesoensis) and 252 rodents, comprising 138 Apodemus speciosus, 45 Apodemus argenteus, 42 Myodes rufocanus and 27 Myodes rutilus, all collected on Hokkaido island, northern Japan, from 2010 to 2015. Segments of 770 bp and 382 bp of the 16S rRNA and gltA genes, respectively, were amplified by nested PCR. Results were confirmed by cloning and sequencing of the positive PCR products. A reverse line blot hybridization (RLB) assay based on the 16S rRNA gene was then developed for the specific detection of AP-sd. The prevalence of AP-sd by nested PCR in sika deer was 51% (128/250). We detected this Anaplasma sp. for the first time in wild brown bears and rodents, with a prevalence of 15% (2/13) and 2.4% (6/252), respectively. The sequences of the 16S rRNA and gltA gene amplicons were divergent from the selected A. phagocytophilum sequences in GenBank. A newly designed AP-sd-specific probe for RLB has enabled us to specifically detect this Anaplasma species. Besides sika deer and cattle, wild brown bears and rodents were identified as potential reservoir hosts for AP-sd. This study provides a high-throughput molecular method that specifically detects AP-sd, and which can be used to investigate its ecology and its potential as a threat to humans in Japan. Copyright © 2015 Elsevier B.V. All rights reserved.
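The reported prevalences follow directly from the positive and tested counts given in the abstract, and the four rodent species counts sum to the stated rodent total. The short check below reproduces that arithmetic. The dictionary layout and the helper name prevalence_percent are illustrative choices, and only the counts themselves come from the abstract.

```python
# (positive, tested) counts as stated in the abstract.
counts = {
    "sika deer": (128, 250),
    "brown bears": (2, 13),
    "rodents": (6, 252),
}

# Rodent species counts as stated in the abstract.
rodent_species = {
    "Apodemus speciosus": 138,
    "Apodemus argenteus": 45,
    "Myodes rufocanus": 42,
    "Myodes rutilus": 27,
}


def prevalence_percent(positive: int, tested: int) -> float:
    """Return prevalence as a percentage of animals tested."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * positive / tested


if __name__ == "__main__":
    for host, (positive, tested) in counts.items():
        pct = prevalence_percent(positive, tested)
        print(f"{host}: {positive}/{tested} = {pct:.1f}%")
    print("rodents sampled:", sum(rodent_species.values()))
```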
2009-01-01
Background: Tardigrades represent an animal phylum with extraordinary resistance to environmental stress. Results: To gain insights into their stress-specific adaptation potential, major clusters of related and similar proteins are identified, specific functional clusters are delineated by comparing all tardigrades and individual species (Milnesium tardigradum, Hypsibius dujardini, Echiniscus testudo, Tulinus stephaniae, Richtersius coronifer), and functional elements in tardigrade mRNAs are analysed. We find that 39.3% of the total sequences clustered in 58 clusters of more than 20 proteins. Among these are ten tardigrade-specific as well as a number of stress-specific protein clusters. Tardigrade-specific functional adaptations include strong protein, DNA and redox protection, maintenance and protein recycling. Specific regulatory elements, such as lox P DICE elements, regulate tardigrade mRNA stability, whereas 14 other RNA elements of higher eukaryotes are not found. Further features of tardigrade-specific adaptation are rapidly identified by sequence and/or pattern search on the web tool tardigrade analyzer (http://waterbear.bioapps.biozentrum.uni-wuerzburg.de). The workbench offers nucleotide pattern analysis for promoter and regulatory element detection (tardigrade specific; nrdb) as well as rapid COG search for function assignments, including species-specific repositories of all analysed data. Conclusion: Different protein clusters and regulatory elements implicated in tardigrade stress adaptations are analysed, including unpublished tardigrade sequences. PMID:19821996
2014-01-01
The Bactrian camel (Camelus bactrianus) and the dromedary (Camelus dromedarius) are among the last species to have been domesticated, around 3000–6000 years ago. During domestication, strong artificial (anthropogenic) selection has shaped livestock, creating a huge number of phenotypes and breeds. Hence, domestic animals represent a unique resource for understanding the genetic basis of phenotypic variation and adaptation. In keeping with its late domestication, the Bactrian camel is also among the last livestock animals to have its genome sequenced and deciphered. As no genomic data were available until recently, we generated a de novo assembly by shotgun sequencing of a single male Bactrian camel. We obtained 1.6 Gb of genomic sequence, which corresponds to more than half of the Bactrian camel’s genome. The aim of this study was to identify heterozygous single-nucleotide polymorphisms (SNPs) and to estimate population parameters and nucleotide diversity based on an individual camel. With an average 6.6-fold coverage, we detected over 116 000 heterozygous SNPs and recorded a genome-wide nucleotide diversity similar to that of other domesticated ungulates. More than 20 000 (85%) dromedary expressed sequence tags successfully aligned to our genomic draft. Our results provide a template for future association studies targeting economically relevant traits and for identifying changes underlying the process of camel domestication and environmental adaptation. PMID:23454912