Although recent technological advances in DNA sequencing and computational biology now allow scientists to compare entire microbial genomes, comparisons of closely related bacterial species and individual isolates by whole-genome sequencing approaches remain prohibitively expens...
Complete Genome Sequence of a Street Rabies Virus Isolated from a Dog in Nigeria
Zhou, Ming; Zhou, Zutao; Kia, Grace S. N.; Gnanadurai, Clement W.; Leyson, Christina M.; Umoh, Jarlath U.; Kwaga, Jacob P.; Kazeem, Haruna M.
2013-01-01
A canine rabies virus (RABV) was isolated from a trade dog in Nigeria. Its entire genome was sequenced and found to be closely related to canine RABVs circulating in Africa. Sequence comparison indicates that the virus is closely related to the Africa 2 RABV lineage. The virus is now termed DRV-NG11. PMID:23469344
IDENTIFICATION OF AVIAN-SPECIFIC FECAL METAGENOMIC SEQUENCES USING GENOME FRAGMENT ENRICHMENTS
Sequence analysis of microbial genomes has provided biologists the opportunity to compare genetic differences between closely related microorganisms. While random sequencing has also been used to study natural microbial communities, metagenomic comparisons via sequencing analysis...
eShadow: A tool for comparing closely related sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ovcharenko, Ivan; Boffelli, Dario; Loots, Gabriela G.
2004-01-15
Primate sequence comparisons are difficult to interpret due to the high degree of sequence similarity shared between such closely related species. Recently, a novel method, phylogenetic shadowing, has been pioneered for predicting functional elements in the human genome through the analysis of multiple primate sequence alignments. We have expanded this theoretical approach to create a computational tool, eShadow, for the identification of elements under selective pressure in multiple sequence alignments of closely related genomes, such as in comparisons of human to primate or mouse to rat DNA. This tool integrates two different statistical methods and allows for the dynamic visualization of the resulting conservation profile. eShadow also includes a versatile optimization module capable of training the underlying Hidden Markov Model to differentially predict functional sequences. This module grants the tool high flexibility in the analysis of multiple sequence alignments and in comparing sequences with different divergence rates. Here, we describe the eShadow comparative tool and its potential uses for analyzing both multiple nucleotide and protein alignments to predict putative functional elements. The eShadow tool is publicly available at http://eshadow.dcode.org/
Dynamics of actin evolution in dinoflagellates.
Kim, Sunju; Bachvaroff, Tsvetan R; Handy, Sara M; Delwiche, Charles F
2011-04-01
Dinoflagellates have unique nuclei and intriguing genome characteristics with very high DNA content making complete genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species provide broad context. We selected the actin gene family as a highly expressed conserved gene previously studied in dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis acuminata and D. caudata, including full length and partial cDNA sequences as well as partial genomic amplicons. For these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively similar and in nucleotide trees, the sequences formed two bushy clades corresponding to the two species. In comparisons within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed clades containing sequences from both species. One type included the most similar sequences in between-species comparisons with as few as 12 nucleotide differences between species. The second type included the most divergent sequences in comparisons between and within species with up to 93 nucleotide differences between sequences. In all the sequences, most variation occurred in synonymous sites or the 5' UnTranslated Region (UTR), although there was still limited amino acid variation between most sequences.
Several potential pseudogenes were found (approximately 10% of all sequences depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall, variation in the actin gene family fits best with the "birth and death" model of evolution based on recent duplications, pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so that actin may be too conserved to be useful for phylogenetic estimation of closely related species.
USDA-ARS's Scientific Manuscript database
Salmonella enterica are a versatile group of bacteria with a wide range in virulence potential. To facilitate genome comparisons across this virulence spectrum, we present eight complete closed genome sequences of four S. enterica serotypes (Anatum, Montevideo, Typhimurium, and Newport) isolated fro...
Complete genome sequence of the plant pathogen Erwinia amylovora strain ATCC 49946
USDA-ARS's Scientific Manuscript database
Erwinia amylovora causes the economically important disease fire blight that affects rosaceous plants, especially pear and apple. Here we report the complete genome sequence and annotation of strain ATCC 49946. The analysis of the sequence and its comparison with sequenced genomes of closely related...
Phylogenetic shadowing of primate sequences to find functional regions of the human genome.
Boffelli, Dario; McAuliffe, Jon; Ovcharenko, Dmitriy; Lewis, Keith D; Ovcharenko, Ivan; Pachter, Lior; Rubin, Edward M
2003-02-28
Nonhuman primates represent the most relevant model organisms to understand the biology of Homo sapiens. The recent divergence and associated overall sequence conservation between individual members of this taxon have nonetheless largely precluded the use of primates in comparative sequence studies. We used sequence comparisons of an extensive set of Old World and New World monkeys and hominoids to identify functional regions in the human genome. Analysis of these data enabled the discovery of primate-specific gene regulatory elements and the demarcation of the exons of multiple genes. Much of the information content of the comprehensive primate sequence comparisons could be captured with a small subset of phylogenetically close primates. These results demonstrate the utility of intraprimate sequence comparisons to discover common mammalian as well as primate-specific functional elements in the human genome, which are unattainable through the evaluation of more evolutionarily distant species.
How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity
NASA Technical Reports Server (NTRS)
Fox, G. E.; Wisotzkey, J. D.; Jurtshuk, P. Jr
1992-01-01
16S rRNA (genes coding for rRNA) sequence comparisons were conducted with the following three psychrophilic strains: Bacillus globisporus W25T (T = type strain) and Bacillus psychrophilus W16AT, and W5. These strains exhibited more than 99.5% sequence identity and within experimental uncertainty could be regarded as identical. Their close taxonomic relationship was further documented by phenotypic similarities. In contrast, previously published DNA-DNA hybridization results have convincingly established that these strains do not belong to the same species if current standards are used. These results emphasize the important point that effective identity of 16S rRNA sequences is not necessarily a sufficient criterion to guarantee species identity. Thus, although 16S rRNA sequences can be used routinely to distinguish and establish relationships between genera and well-resolved species, very recently diverged species may not be recognizable.
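The "more than 99.5% sequence identity" figure above is a percent-identity measure over an alignment. A minimal sketch of such a calculation follows, assuming the two sequences are already aligned to equal length and that columns containing a gap are skipped; the function name and the gap convention are illustrative assumptions, not the procedure used by the authors.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal-length strings using '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (one of several possible conventions)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Under this convention, a hypothetical 1,500-column alignment with 7 mismatches still scores above 99.5% identity, which illustrates why such a threshold can fail to separate very recently diverged species.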
Suarez, David L.; Perdue, Michael L.; Cox, Nancy; Rowe, Thomas; Bender, Catherine; Huang, Jing; Swayne, David E.
1998-01-01
Genes of an influenza A (H5N1) virus from a human in Hong Kong isolated in May 1997 were sequenced and found to be all avian-like (K. Subbarao et al., Science 279:393–395, 1998). Gene sequences of this human isolate were compared to those of a highly pathogenic chicken H5N1 influenza virus isolated from Hong Kong in April 1997. Sequence comparisons of all eight RNA segments from the two viruses show greater than 99% sequence identity between them. However, neither isolate’s gene sequence was closely (>95% sequence identity) related to any other gene sequences found in the GenBank database. Phylogenetic analysis demonstrated that the nucleotide sequences of at least four of the eight RNA segments clustered with Eurasian origin avian influenza viruses. The hemagglutinin gene phylogenetic analysis also included the sequences from an additional three human and two chicken H5N1 virus isolates from Hong Kong, and the isolates separated into two closely related groups. However, no single amino acid change separated the chicken origin and human origin isolates, but they all contained multiple basic amino acids at the hemagglutinin cleavage site, which is associated with a highly pathogenic phenotype in poultry. In experimental intravenous inoculation studies with chickens, all seven viruses were highly pathogenic, killing most birds within 24 h. All infected chickens had virtually identical pathologic lesions, including moderate to severe diffuse edema and interstitial pneumonitis. Viral nucleoprotein was most frequently demonstrated in vascular endothelium, macrophages, heterophils, and cardiac myocytes. Asphyxiation from pulmonary edema and generalized cardiovascular collapse were the most likely pathogenic mechanisms responsible for illness and death. 
In summary, a small number of changes in hemagglutinin gene sequences defined two closely related subgroups, with both subgroups having human and chicken members, among the seven viruses examined from Hong Kong, and all seven viruses were highly pathogenic in chickens and caused similar lesions in experimental inoculations. PMID:9658115
USDA-ARS's Scientific Manuscript database
Rhesus macaques are a widely used model system for the study of vaccines, infectious diseases, and microbial pathogenesis. Their value as a model lies in their close evolutionary relationship to humans, which, in theory, allows them to serve as a close approximation of the human immune system. Howev...
Welch, Andreanna J; Collins, Katherine; Ratan, Aakrosh; Drautz-Moses, Daniela I; Schuster, Stephan C; Lindqvist, Charlotte
2016-06-01
These data are presented in support of a plastid phylogenomic analysis of the recent radiation of the Hawaiian endemic mints (Lamiaceae), and their close relatives in the genus Stachys, "The quest to resolve recent radiations: Plastid phylogenomics of extinct and endangered Hawaiian endemic mints (Lamiaceae)" [1]. Here we describe the chloroplast genome sequences for 12 mint taxa. Data presented include summaries of gene content and length for these taxa, structural comparison of the mint chloroplast genomes with published sequences from other species in the order Lamiales, and comparisons of variability among three Hawaiian taxa vs. three outgroup taxa. Finally, we provide a list of 108 primer pairs targeting the most variable regions within this group and designed specifically for amplification of DNA extracted from degraded herbarium material.
Design and implementation of a database for Brucella melitensis genome annotation.
De Hertogh, Benoît; Lahlimi, Leïla; Lambert, Christophe; Letesson, Jean-Jacques; Depiereux, Eric
2008-03-18
The genome sequences of three Brucella biovars and of some species close to Brucella sp. have become available, leading to new relationship analysis. Moreover, the automatic genome annotation of the pathogenic bacterium Brucella melitensis has been manually corrected by a consortium of experts, leading to 899 modifications of start site predictions among the 3198 open reading frames (ORFs) examined. This new annotation, coupled with the results of automatic annotation tools applied to the complete genome sequence of B. melitensis (including BLASTs to 9 genomes close to Brucella), provides numerous data sets related to predicted functions, biochemical properties and phylogenetic comparisons. To make these results available, alphaPAGe, a functional auto-updatable database of the corrected genome sequence of B. melitensis, has been built, using the entity-relationship (ER) approach and a multi-purpose database structure. A friendly graphical user interface has been designed, and users can retrieve different kinds of information through three levels of queries: (1) the basic search uses classical keywords or sequence identifiers; (2) the original advanced search engine allows users to combine (by using logical operators) numerous criteria: (a) keywords (textual comparison) related to the pCDS's function, family domains and cellular localization; (b) physico-chemical characteristics (numerical comparison) such as isoelectric point or molecular weight and structural criteria such as the nucleic length or the number of transmembrane helices (TMH); (c) similarity scores with Escherichia coli and 10 species phylogenetically close to B. melitensis; (3) complex queries can be performed by using a SQL field, which allows all queries respecting the database's structure. The database is publicly available through a Web server at the following URL: http://www.fundp.ac.be/urbm/bioinfo/aPAGe.
Conservation in the face of diversity: multistrain analysis of an intracellular bacterium
USDA-ARS's Scientific Manuscript database
Comparisons of multiple strains revealed that A. marginale has a closed-core genome with few highly plastic regions, which include the msp2 and msp3 genes, as well as the aaap locus. Comparison of the Florida and St. Maries genome sequences found that SNPs comprise 0.8% of the longer Florida genome,...
Jeong, Man-Ki; Soh, Ho Young; Wi, Jin Hee; Suh, Hae-Lip
2018-01-01
Notomastus koreanus sp. n., collected from the sublittoral muddy bottom of Korean waters, is described as a new species. The Korean new species closely resembles N. torquatus Hutchings & Rainer, 1979 in the chaetal arrangement and the details of abdominal segments, but differs in the position of genital pores and the absence of eyes. DNA sequences (mtCOI, 16S rRNA, and histone H3) of the new species were compared with all the available sequences of Notomastus species in the GenBank database. Three genes showed significant genetic differences between the new species and its congeners (COI: 51.2%, 16S: 38.1-47.3%, H3: 3.7-9.3%). This study also includes a comprehensive comparison of the new Korean Notomastus species with its most closely similar species, based on the morphological and genetic results.
Liu, Peipei; Lu, Hao; Li, Shuang; Moureau, Gregory; Deng, Yong-Qiang; Wang, Yongyue; Zhang, Lijiao; Jiang, Tao; de Lamballerie, Xavier; Qin, Cheng-Feng; Gould, Ernest A; Su, Jingliang; Gao, George F
2012-10-01
Duck egg-drop syndrome virus (DEDSV) is a newly emerging pathogenic flavivirus causing avian diseases in China. The infection occurs in laying ducks and is characterized by a severe drop in egg production with a fatality rate of 5-15 %. The virus was found to be most closely related to Tembusu virus (TMUV), an isolate from mosquitoes in South-east Asia. Here, we have sequenced and characterized the full-length genomes of seven DEDSV strains, including the 5'- and 3'-non-coding regions (NCRs). We also report for the first time the ORF sequences of TMUV and Sitiawan virus (STWV), another closely related flavivirus isolated from diseased chickens. We analysed the phylogenetic and antigenic relationships of DEDSV in relation to the Asian viruses TMUV and STWV, and other representative flaviviruses. Our results confirm the close relationship between DEDSV and TMUV/STWV and we discuss their probable evolutionary origins. We have also characterized the cleavage sites, potential glycosylation sites and unique motifs/modules of these viruses. Additionally, conserved sequences in both 5'- and 3'-NCRs were identified and the predicted secondary structures of the terminal sequences were studied. Antigenic cross-reactivity comparisons of DEDSV with related pathogenic flaviviruses identified a surprisingly close relationship with dengue virus (DENV) and raised the question of whether or not DEDSV may pose a potential infectious threat to man. Importantly, DEDSV can be efficiently recognized by a broadly cross-reactive flavivirus mAb, 2A10G6, derived against DENV. The significance of these studies is discussed in the context of the emergence, evolution, epidemiology, antigenicity and pathogenicity of the newly emergent DEDSV.
The point mutation process in proteins
NASA Technical Reports Server (NTRS)
Schwartz, R. M.; Dayhoff, M. O.
1978-01-01
An optimized scoring matrix for residue-by-residue comparisons of distantly related protein sequences has been developed. The scoring matrix is based on observed exchanges and mutabilities of amino acids in 1572 closely related sequences derived from a cross-section of protein groups. Very few superimposed or parallel mutations are included in the data. The scoring matrix is most useful for demonstrating the relatedness of proteins between 65 and 85% different.
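A scoring matrix built from observed exchanges is commonly expressed as log-odds values: the logarithm of how often a residue pair is observed in related sequences relative to how often it would occur by chance. The sketch below illustrates that idea only; the function name, the scale factor of 10 with base-10 logarithms, and all frequencies are assumptions chosen for illustration, not values from the 1572 sequences described above.

```python
import math


def log_odds_score(pair_freq: float, freq_a: float, freq_b: float,
                   scale: float = 10.0) -> int:
    """Rounded log-odds score: scale * log10(observed / expected by chance).

    pair_freq      -- observed frequency of the aligned residue pair (hypothetical)
    freq_a, freq_b -- background frequencies of the two residues (hypothetical)
    """
    if min(pair_freq, freq_a, freq_b) <= 0:
        raise ValueError("frequencies must be positive")
    expected = freq_a * freq_b  # chance co-occurrence under independence
    return round(scale * math.log10(pair_freq / expected))


if __name__ == "__main__":
    # Invented numbers: a pair seen twice as often as chance scores positive,
    # a pair seen half as often as chance scores negative.
    print(log_odds_score(0.02, 0.1, 0.1))
    print(log_odds_score(0.005, 0.1, 0.1))
```

Positive scores mark exchanges observed more often than chance would predict, and negative scores mark exchanges that are disfavoured.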
BLAST and FASTA similarity searching for multiple sequence alignment.
Pearson, William R
2014-01-01
BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry (homology). The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
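The point about expectation values and database size can be made concrete with the standard relation between a bit score S' and the expectation value, E = m * n * 2^(-S'), where m is the query length and n is the total database length. The sketch below is illustrative only: the function name and the example lengths are assumptions, and the edge-effect (effective length) corrections applied by real search programs are not modelled.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance alignments scoring at least `bit_score`.

    Uses E = m * n * 2**(-S'), with m = query_len and n = db_len.
    Effective-length corrections are deliberately omitted (assumption).
    """
    if query_len <= 0 or db_len <= 0:
        raise ValueError("lengths must be positive")
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Hypothetical numbers: the same 50-bit alignment for a 300-residue query,
    # searched against a large database and against a much smaller one.
    large = expect_value(50, 300, 10**9)
    small = expect_value(50, 300, 10**6)
    print(large, small)
```

For a fixed score, the expectation value shrinks in proportion to the database size, which is one way to read the advice above about searching smaller comprehensive databases.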
Desjardin, Dennis E; Hemmes, Don E; Perry, Brian A
2014-01-01
Pseudobaeospora wipapatiae is described as new based on material collected in alien wet habitats on the island of Hawaii. Unique features of this beautiful species include deep ruby-colored basidiomes with two-spored basidia, amyloid cheilocystidia and a hymeniderm pileipellis with abundant pileocystidia that is initially deep ruby in KOH then changes to lilac gray. Phylogenetic analysis of nuclear large ribosomal subunit sequence data suggest a close relationship between Pseudobaeospora and Tricholoma. BLAST comparisons of internal transcribed spacer and 5.8S nuclear ribosomal subunit regions sequence data reveal greatest similarity with existing sequences of Pseudobaeospora species. A comprehensive description, color photograph, illustrations of salient micromorphological features and comparisons with phenetically similar taxa are provided. © 2014 by The Mycological Society of America.
Brassica ASTRA: an integrated database for Brassica genomic research.
Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David
2005-01-01
Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
Breaking the computational barriers of pairwise genome comparison.
Torreno, Oscar; Trelles, Oswaldo
2015-08-11
Conventional pairwise sequence comparison software algorithms are being used to process much larger datasets than they were originally designed for. This can result in processing bottlenecks that limit software capabilities or prevent full use of the available hardware resources. Overcoming the barriers that limit the efficient computational analysis of large biological sequence datasets by retrofitting existing algorithms or by creating new applications represents a major challenge for the bioinformatics community. We have developed C libraries for pairwise sequence comparison within diverse architectures, ranging from commodity systems to high performance and cloud computing environments. Exhaustive tests were performed using different datasets of closely- and distantly-related sequences that span from small viral genomes to large mammalian chromosomes. The tests demonstrated that our solution is capable of generating high quality results with a linear-time response and controlled memory consumption, with performance comparable to or faster than the current state-of-the-art methods. We have addressed the problem of pairwise and all-versus-all comparison of large sequences in general, greatly increasing the limits on input data size. The approach described here is based on a modular out-of-core strategy that uses secondary storage to avoid reaching memory limits during the identification of High-scoring Segment Pairs (HSPs) between the sequences under comparison. Software engineering concepts were applied to avoid intermediate result re-calculation, to minimise the performance impact of input/output (I/O) operations and to modularise the process, thus enhancing application flexibility and extendibility. Our computationally-efficient approach allows tasks such as the massive comparison of complete genomes, evolutionary event detection, the identification of conserved synteny blocks and inter-genome distance calculations to be performed more effectively.
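High-scoring segment pairs are typically grown from short exact matches (seeds) shared by the two sequences. The sketch below shows only that generic seeding step, held entirely in memory; it is not the authors' C library or their out-of-core strategy, and the function name and default word size are assumptions made for illustration.

```python
from collections import defaultdict


def shared_kmer_hits(seq_a: str, seq_b: str, k: int = 4) -> list:
    """Return sorted (pos_a, pos_b) pairs where both sequences share an exact k-mer.

    Hits with the same value of pos_a - pos_b lie on one diagonal and are
    the usual candidates for extension into longer segment pairs.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Index every k-mer of seq_a by its start positions.
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)
    # Look up every k-mer of seq_b in that index.
    hits = []
    for j in range(len(seq_b) - k + 1):
        for i in index.get(seq_b[j:j + k], ()):
            hits.append((i, j))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    print(shared_kmer_hits("ACGTAC", "TACGTA", k=4))
```

An in-memory index like this grows with the sequence length, which is the kind of memory limit that motivates keeping intermediate results on secondary storage for chromosome-scale inputs.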
Weigel, B J; Burgett, S G; Chen, V J; Skatrud, P L; Frolik, C A; Queener, S W; Ingolia, T D
1988-01-01
beta-Lactam antibiotics such as penicillins and cephalosporins are synthesized by a wide variety of microbes, including procaryotes and eucaryotes. Isopenicillin N synthetase catalyzes a key reaction in the biosynthetic pathway of penicillins and cephalosporins. The genes encoding this protein have previously been cloned from the filamentous fungi Cephalosporium acremonium and Penicillium chrysogenum and characterized. We have extended our analysis to the isopenicillin N synthetase genes from the fungus Aspergillus nidulans and the gram-positive procaryote Streptomyces lipmanii. The isopenicillin N synthetase genes from these organisms have been cloned and sequenced, and the proteins encoded by the open reading frames were expressed in Escherichia coli. Active isopenicillin N synthetase enzyme was recovered from extracts of E. coli cells prepared from cells containing each of the genes in expression vectors. The four isopenicillin N synthetase genes studied are closely related. Pairwise comparison of the DNA sequences showed between 62.5 and 75.7% identity; comparison of the predicted amino acid sequences showed between 53.9 and 80.6% identity. The close homology of the procaryotic and eucaryotic isopenicillin N synthetase genes suggests horizontal transfer of the genes during evolution. PMID:3045077
BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons.
Alikhan, Nabil-Fareed; Petty, Nicola K; Ben Zakour, Nouri L; Beatson, Scott A
2011-08-08
Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons automatically.
There is a clear need for a user-friendly program that can produce genome comparisons for a large number of prokaryote genomes with an emphasis on rapidly utilising unfinished or unassembled genome data. Here we present BRIG, a cross-platform application that enables the interactive generation of comparative genomic images via a simple graphical-user interface. BRIG is freely available for all operating systems at http://sourceforge.net/projects/brig/. PMID:21824423
Nabavi, Reza; Conneely, Brendan; McCarthy, Elaine; Good, Barbara; Shayan, Parviz; DE Waal, Theo
2014-09-01
Accurate identification of sheep nematodes is a critical point in epidemiological studies and monitoring of drug resistance in flocks. However, due to a close morphological similarity between the eggs and larval stages of many of these nematodes, such identification is not a trivial task. There are a number of studies showing that molecular targets in ribosomal DNA (internal transcribed spacers 1 and 2 and the intergenic spacer) are suitable for accurate identification of sheep bursate nematodes. The objective of the present study was to compare the ITS1, ITS2 and IGS regions of common Iranian bursate nematodes in order to choose the best target for specific identification methods. The first and second internal transcribed spacers (ITS1 and ITS2) and intergenic spacer (IGS) of the ribosomal DNA (rDNA) of 5 common Iranian bursate nematodes of sheep were sequenced. The sequences of some non-Iranian isolates were used for comparison in order to evaluate the variation in sequence homology between geographically different nematode populations. Comparison of the ITS1 and ITS2 sequences of Iranian nematodes showed greatest similarity among Teladorsagia circumcincta and Marshallagia marshalli of 94% and 88%, respectively, while Trichostrongylus colubriformis and M. marshalli showed the highest homology (99%) in the IGS sequences. Comparison of the spacer sequences of Iranian with non-Iranian isolates showed significantly higher variation in Haemonchus contortus compared to the other species. Both the ITS1 and ITS2 sequences are convenient targets for species-specific identification of Iranian bursate nematodes. On the other hand, the IGS region may be a less suitable molecular target.
Aditiawati, Pingkan; Yohandini, Heni; Madayanti, Fida; Akhmaloka
2009-01-01
Microbial communities in an acidic hot spring, namely Kawah Hujan B, at Kamojang geothermal field, West Java, Indonesia, were examined using culture-dependent and culture-independent strategies. Chemical analysis of the hot spring water showed characteristics of acidic-sulfate geothermal activity, with high sulfate concentrations and low pH values (pH 1.8 to 1.9). The microbial community present in the spring was characterized by 16S rRNA gene analysis combined with denaturing gradient gel electrophoresis (DGGE). The majority of the sequences recovered by the culture-independent method were closely related to the Crenarchaeota and Proteobacteria phyla. However, detailed comparison among the members of Crenarchaeota showed some sequence variation compared to the published data, especially in the hypervariable and variable regions. In addition, the sequences did not belong to any particular genus. Meanwhile, the 16S rDNA sequences from culture-dependent samples were mostly close to Firmicutes and gamma Proteobacteria. PMID:19440252
Makarchenko, Eugenyi A; Makarchenko, Marina A; Semenchenko, Alexander A
2015-08-14
Illustrated descriptions of the adult male, pupa and fourth instar larva, as well as DNA barcoding, of Hydrobaenus majus sp. nov. in comparison with the closely related species H. sikhotealinensis Makarchenko et Makarchenko from the Russian Far East are provided. The species-specificity of H. majus sp. nov. COI sequences is analyzed and the sequences are presented as diagnostic characters (molecular markers) of H. majus and H. sikhotealinensis.
Naser, Sabri M; Vancanneyt, Marc; Hoste, Bart; Snauwaert, Cindy; Swings, Jean
2006-07-01
The applicability of a multilocus sequence analysis (MLSA)-based identification system for lactobacilli was evaluated. Two housekeeping genes that code for the phenylalanyl-tRNA synthase alpha-subunit (pheS) and RNA polymerase alpha-subunit (rpoA) were sequenced and analysed for members of the Lactobacillus salivarius species group. The type strains of Lactobacillus acidipiscis and Lactobacillus cypricasei were investigated further using a third gene that encodes the alpha-subunit of ATP synthase (atpA). The MLSA data revealed close relatedness between L. acidipiscis and L. cypricasei, with 99.8-100 % pheS, rpoA and atpA gene sequence similarities. Comparison of the 16S rRNA gene sequences of the type strains of the two species confirmed the close relatedness (99.8 % gene sequence similarity) between the two taxa. Similar phenotypes and high DNA-DNA binding values in the range of 84 to 97.5 % confirmed that L. acidipiscis and L. cypricasei are synonymous species. On the basis of the present study, it is proposed that Lactobacillus cypricasei is a later heterotypic synonym of Lactobacillus acidipiscis.
Behera, Bijay Kumar; Baisvar, Vishwamitra Singh; Kumari, Kavita; Rout, Ajaya Kumar; Pakrashi, Sudip; Paria, Prasenjet; Rao, A R; Rai, Anil
2017-03-01
In the present study, the complete mitochondrial genome sequence of Anabas testudineus is reported using a PGM sequencer (Ion Torrent, Life Technologies, La Jolla, CA). The complete mitogenome of the climbing perch, A. testudineus, which is 16 603 bp in length, was obtained by de novo assembly of genomic reads using the Torrent Mapping Alignment Program (TMAP). The mitogenome of A. testudineus is composed of 13 protein-coding genes, two rRNAs, and 22 tRNAs. Here, 20 tRNA genes showed the typical cloverleaf model, and the D-loop as the control region, along with gene order and organization, was closely similar to Osphronemidae and most other Perciformes fish mitogenomes in the NCBI databases. The mitogenome in the present study has 99% similarity to the complete mitogenome sequence of the earlier reported A. testudineus. The phylogenetic analysis of Anabantidae showed that their mitogenomes are closely related to each other. The complete mitogenome sequence of A. testudineus would be helpful in understanding the population genetics, phylogenetics, and evolution of Anabantidae.
Comparative Sequence Analysis of Multidrug-Resistant IncA/C Plasmids from Salmonella enterica.
Hoffmann, Maria; Pettengill, James B; Gonzalez-Escalona, Narjol; Miller, John; Ayers, Sherry L; Zhao, Shaohua; Allard, Marc W; McDermott, Patrick F; Brown, Eric W; Monday, Steven R
2017-01-01
Determinants of multidrug resistance (MDR) are often encoded on mobile elements, such as plasmids, transposons, and integrons, which have the potential to transfer among foodborne pathogens, as well as to other virulent pathogens, increasing the threats these traits pose to human and veterinary health. Our understanding of MDR among Salmonella has been limited by the lack of closed plasmid genomes for comparisons across resistance phenotypes, due to difficulties in effectively separating the DNA of these high-molecular weight, low-copy-number plasmids from chromosomal DNA. To resolve this problem, we demonstrate an efficient protocol for isolating, sequencing and closing IncA/C plasmids from Salmonella sp. using single molecule real-time sequencing on a Pacific Biosciences (Pacbio) RS II Sequencer. We obtained six Salmonella enterica isolates from poultry, representing six different serovars, each exhibiting the MDR-Ampc resistance profile. Salmonella plasmids were obtained using a modified mini preparation and transformed with Escherichia coli DH10Br. A Qiagen Large-Construct kit™ was used to recover highly concentrated and purified plasmid DNA that was sequenced using PacBio technology. These six closed IncA/C plasmids ranged in size from 104 to 191 kb and shared a stable, conserved backbone containing 98 core genes, with only six differences among those core genes. The plasmids encoded a number of antimicrobial resistance genes, including those for quaternary ammonium compounds and mercury. We then compared our six IncA/C plasmid sequences: first with 14 IncA/C plasmids derived from S. enterica available at the National Center for Biotechnology Information (NCBI), and then with an additional 38 IncA/C plasmids derived from different taxa. These comparisons allowed us to build an evolutionary picture of how antimicrobial resistance may be mediated by this common plasmid backbone. 
Our project provides detailed genetic information about resistance genes in plasmids, advances in plasmid sequencing, and phylogenetic analyses, and important insights about how MDR evolution occurs across diverse serotypes from different animal sources, particularly in agricultural settings where antimicrobial drug use practices vary.
Conversation Analysis--A Discourse Approach to Teaching Oral English Skills
ERIC Educational Resources Information Center
Wu, Yan
2013-01-01
This paper explores a pedagogical approach to teaching oral English: Conversation Analysis. First, features of spoken language are described in comparison to written language. Second, Conversation Analysis theory is elaborated in terms of adjacency pairs, turn-taking, repairs, sequences, openings and closings, and feedback. Third, under the…
Bruce, A. Gregory; Thouless, Margaret E.; Haines, Anthony S.; Pallen, Mark J.; Grundhoff, Adam
2015-01-01
ABSTRACT Two rhadinovirus lineages have been identified in Old World primates. The rhadinovirus 1 (RV1) lineage consists of human herpesvirus 8, Kaposi's sarcoma-associated herpesvirus (KSHV), and closely related rhadinoviruses of chimpanzees, gorillas, macaques and other Old World primates. The RV2 rhadinovirus lineage is distinct and consists of closely related viruses from the same Old World primate species. Rhesus macaque rhadinovirus (RRV) is the RV2 prototype, and two RRV isolates, 26-95 and 17577, were sequenced. We determined that the pig-tailed macaque RV2 rhadinovirus, MneRV2, is highly associated with lymphomas in macaques with simian AIDS. To further study the role of rhadinoviruses in the development of lymphoma, we sequenced the complete genome of MneRV2 and identified 87 protein coding genes and 17 candidate microRNAs (miRNAs). A strong genome colinearity and sequence homology were observed between MneRV2 and RRV26-95, although the open reading frame (ORF) encoding the KSHV ORFK15 homolog was disrupted in RRV26-95. Comparison with MneRV2 revealed several genomic anomalies in RRV17577 that were not present in other rhadinovirus genomes, including an N-terminal duplication in ORF4 and a recombinative exchange of more distantly related homologs of the ORF22/ORF47 interacting glycoprotein genes. The comparison with MneRV2 has revealed novel genes and important conservation of protein coding domains and transcription initiation, termination, and splicing signals, which have added to our knowledge of RV2 rhadinovirus genetics. Further comparisons with KSHV and other RV1 rhadinoviruses will provide important avenues for dissecting the biology, evolution, and pathology of these closely related tumor-inducing viruses in humans and other Old World primates. IMPORTANCE This work provides the sequence characterization of MneRV2, the pig-tailed macaque homolog of rhesus rhadinovirus (RRV). 
MneRV2 and RRV belong to the rhadinovirus 2 (RV2) rhadinovirus lineage of Old World primates and are distinct but related to Kaposi's sarcoma-associated herpesvirus (KSHV), the etiologic agent of Kaposi's sarcoma. Pig-tailed macaques provide important models of human disease, and our previous studies have indicated that MneRV2 plays a causal role in AIDS-related lymphomas in macaques. Delineation of the MneRV2 sequence has allowed a detailed characterization of the genome structure, and evolutionary comparisons with RRV and KSHV have identified conserved promoters, splice junctions, and novel genes. This comparison provides insight into RV2 rhadinovirus biology and sets the groundwork for more intensive next-generation (Next-Gen) transcript and genetic analysis of this class of tumor-inducing herpesvirus. This study supports the use of MneRV2 in pig-tailed macaques as an important model for studying rhadinovirus biology, transmission and pathology. PMID:25609822
Nowrousian, Minou; Würtz, Christian; Pöggeler, Stefanie; Kück, Ulrich
2004-03-01
One of the most challenging parts of large scale sequencing projects is the identification of functional elements encoded in a genome. Recently, studies of genomes of up to six different Saccharomyces species have demonstrated that a comparative analysis of genome sequences from closely related species is a powerful approach to identify open reading frames and other functional regions within genomes [Science 301 (2003) 71, Nature 423 (2003) 241]. Here, we present a comparison of selected sequences from Sordaria macrospora to their corresponding Neurospora crassa orthologous regions. Our analysis indicates that due to the high degree of sequence similarity and conservation of overall genomic organization, S. macrospora sequence information can be used to simplify the annotation of the N. crassa genome.
A measure of the denseness of a phylogenetic network. [by sequenced proteins from extant species]
NASA Technical Reports Server (NTRS)
Holmquist, R.
1978-01-01
An objective measure of phylogenetic denseness is developed to examine various phylogenetic criteria: alpha- and beta-hemoglobin, myoglobin, cytochrome c, and the parvalbumin family. Attention is given to the number of nucleotide replacements separating homologous sequences, and to the topology of the network (in other words, to the qualitative nature of the network as defined by how closely the studied species are related). Applications include quantitative comparisons of species origin, relation, and rates of evolution.
The complete genomic sequence of a tentative new polerovirus identified in barley in South Korea.
Zhao, Fumei; Lim, Seungmo; Yoo, Ran Hee; Igori, Davaajargal; Kim, Sang-Min; Kwak, Do Yeon; Kim, Sun Lim; Lee, Bong Choon; Moon, Jae Sun
2016-07-01
The complete nucleotide sequence of a new barley polerovirus, tentatively named barley virus G (BVG), which was isolated in Gimje, South Korea, has been determined using an RNA sequencing technique combined with polymerase chain reaction methods. The viral genomic RNA of BVG is 5,620 nucleotides long and contains six typical open reading frames commonly observed in other poleroviruses. Sequence comparisons revealed that BVG is most closely related to maize yellow dwarf virus-RMV, with the highest amino acid identities being less than 90 % for all of the corresponding proteins. These results suggested that BVG is a member of a new species in the genus Polerovirus.
Fujisaki, K; Hagihara, F; Kaido, M; Mise, K; Okuno, T
2003-01-01
Spring beauty latent virus (SBLV), a bromovirus, systemically and efficiently infected Arabidopsis thaliana, whereas the well-studied bromoviruses brome mosaic virus (BMV) and cowpea chlorotic mottle virus (CCMV) did not infect and poorly infected A. thaliana, respectively. We constructed biologically active cDNA clones of SBLV genomic RNAs and determined their complete nucleotide sequences. Interestingly, SBLV RNA3 contains both the box B motif in the intercistronic region, as does BMV, and the subgenomic promoter-like sequence in the 5' noncoding region, as does CCMV. Sequence comparisons of SBLV, BMV, CCMV, and broad bean mottle virus demonstrated that SBLV is closely related to BMV and CCMV.
Castejon, Maria; Menéndez, Maria Carmen; Comas, Iñaki; Vicente, Ana; Garcia, Maria J
2018-06-01
Bacterial whole-genome sequences contain informative features of their evolutionary pathways. Comparison of whole-genome sequences has become the method of choice for classification of prokaryotes, thus allowing the identification of bacteria from an evolutionary perspective, and providing data to resolve some current controversies. Currently, controversy exists about the assignment of members of the Mycobacterium avium complex, as is the case for Mycobacterium yongonense and 'Mycobacterium indicus pranii'. These two mycobacteria, closely related to Mycobacterium intracellulare on the basis of standard phenotypic and single-gene sequence comparisons, were not considered members of that species on the basis of some particular differences displayed by a single strain. Whole-genome sequence comparison procedures, namely the average nucleotide identity and the genome distance, showed that those two mycobacteria should be considered members of the species M. intracellulare. The results were confirmed with other supplementary whole-genome comparison methods. According to the data provided, Mycobacterium yongonense and 'Mycobacterium indicus pranii' should be renamed and included as members of M. intracellulare. This study highlights the problems caused when a novel species is accepted on the basis of a single strain, as was the case for M. yongonense. Based mainly on whole-genome sequence analysis, we conclude that M. yongonense should be reclassified as a subspecies of Mycobacterium intracellulare, as Mycobacterium intracellulare subsp. yongonense, and 'Mycobacterium indicus pranii' classified in the same subspecies as the type strain of Mycobacterium intracellulare, as Mycobacterium intracellulare subsp. intracellulare.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, Patrick S. G.; Carniel, E.; Larimer, Frank W
2004-09-01
Yersinia pestis, the causative agent of plague, is a highly uniform clone that diverged recently from the enteric pathogen Yersinia pseudotuberculosis. Despite their close genetic relationship, they differ radically in their pathogenicity and transmission. Here, we report the complete genomic sequence of Y. pseudotuberculosis IP32953 and its use for detailed genome comparisons with available Y. pestis sequences. Analyses of identified differences across a panel of Yersinia isolates from around the world reveal 32 Y. pestis chromosomal genes that, together with the two Y. pestis-specific plasmids, to our knowledge, represent the only new genetic material in Y. pestis acquired since the divergence from Y. pseudotuberculosis. In contrast, 149 other pseudogenes (doubling the previous estimate) and 317 genes absent from Y. pestis were detected, indicating that as many as 13% of Y. pseudotuberculosis genes no longer function in Y. pestis. Extensive insertion sequence-mediated genome rearrangements and reductive evolution through massive gene loss, resulting in elimination and modification of preexisting gene expression pathways, appear to be more important than acquisition of genes in the evolution of Y. pestis. These results provide a sobering example of how a highly virulent epidemic clone can suddenly emerge from a less virulent, closely related progenitor.
Hahn, Lars; Leimeister, Chris-André; Ounit, Rachid; Lonardi, Stefano; Morgenstern, Burkhard
2016-10-01
Many algorithms for sequence analysis rely on word matching or word statistics. Often, these approaches can be improved if binary patterns representing match and don't-care positions are used as a filter, such that only those positions of words are considered that correspond to the match positions of the patterns. The performance of these approaches, however, depends on the underlying patterns. Herein, we show that the overlap complexity of a pattern set that was introduced by Ilie and Ilie is closely related to the variance of the number of matches between two evolutionarily related sequences with respect to this pattern set. We propose a modified hill-climbing algorithm to optimize pattern sets for database searching, read mapping and alignment-free sequence comparison of nucleic-acid sequences; our implementation of this algorithm is called rasbhari. Depending on the application at hand, rasbhari can either minimize the overlap complexity of pattern sets, maximize their sensitivity in database searching or minimize the variance of the number of pattern-based matches in alignment-free sequence comparison. We show that, for database searching, rasbhari generates pattern sets with slightly higher sensitivity than existing approaches. In our Spaced Words approach to alignment-free sequence comparison, pattern sets calculated with rasbhari led to more accurate estimates of phylogenetic distances than the randomly generated pattern sets that we previously used. Finally, we used rasbhari to generate patterns for short read classification with CLARK-S. Here too, the sensitivity of the results could be improved, compared to the default patterns of the program. We integrated rasbhari into Spaced Words; the source code of rasbhari is freely available at http://rasbhari.gobics.de/.
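The filtering idea described above, where a binary pattern of match and don't-care positions decides which characters of each word are compared, can be illustrated with a minimal sketch. This is not code from rasbhari or Spaced Words; the function names `spaced_words` and `shared_spaced_words` and the example pattern are assumptions chosen for illustration only, and no pattern optimization is attempted.

```python
from collections import Counter


def spaced_words(seq, pattern):
    """Extract spaced words from seq under a binary pattern.

    In pattern, '1' is a match position (character kept) and
    '0' is a don't-care position (character skipped).
    Hypothetical helper, not part of any published tool.
    """
    keep = [i for i, c in enumerate(pattern) if c == "1"]
    span = len(pattern)
    return [
        "".join(seq[start + i] for i in keep)
        for start in range(len(seq) - span + 1)
    ]


def shared_spaced_words(a, b, pattern):
    """Count spaced words common to a and b (multiset intersection)."""
    counts_a = Counter(spaced_words(a, pattern))
    counts_b = Counter(spaced_words(b, pattern))
    return sum((counts_a & counts_b).values())


if __name__ == "__main__":
    a, b = "ACGTAC", "ATGTTC"
    # Contiguous 3-mers share nothing; the spaced pattern tolerates
    # the substitutions that fall on its don't-care position.
    print(shared_spaced_words(a, b, "111"))
    print(shared_spaced_words(a, b, "101"))
```

With these two toy sequences, the contiguous pattern `111` finds no shared words, while `101` still recovers matches because both substitutions can fall on the don't-care position. How well this works in general depends on the chosen pattern set, which is the quantity the abstract says is being optimized.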
Anton, Brian P.; Mongodin, Emmanuel F.; Agrawal, Sonia; Fomenkov, Alexey; Byrd, Devon R.; Roberts, Richard J.; Raleigh, Elisabeth A.
2015-01-01
We report the complete sequence of ER2796, a laboratory strain of Escherichia coli K-12 that is completely defective in DNA methylation. Because of its lack of any native methylation, it is extremely useful as a host into which heterologous DNA methyltransferase genes can be cloned and the recognition sequences of their products deduced by Pacific Biosciences Single-Molecule Real Time (SMRT) sequencing. The genome was itself sequenced from a long-insert library using the SMRT platform, resulting in a single closed contig devoid of methylated bases. Comparison with K-12 MG1655, the first E. coli K-12 strain to be sequenced, shows an essentially co-linear relationship with no major rearrangements despite many generations of laboratory manipulation. The comparison revealed a total of 41 insertions and deletions, and 228 single base pair substitutions. In addition, the long-read approach facilitated the surprising discovery of four gene conversion events, three involving rRNA operons and one between two cryptic prophages. Such events thus contribute both to genomic homogenization and to bacteriophage diversification. As one of relatively few laboratory strains of E. coli to be sequenced, the genome also reveals the sequence changes underlying a number of classical mutant alleles including those affecting the various native DNA methylation systems. PMID:26010885
The folding mechanisms of two closely related proteins in the intracellular lipid binding protein family, human bile acid binding protein (hBABP) and rat bile acid binding protein (rBABP), were examined. These proteins are 77% identical (93% similar) in sequence. Both of these singl...
Characterization of perch rhabdovirus (PRV) in farmed grayling Thymallus thymallus.
Gadd, Tuija; Viljamaa-Dirks, Satu; Holopainen, Riikka; Koski, Perttu; Jakava-Viljanen, Miia
2013-10-11
Two Finnish fish farms experienced elevated mortality rates in farmed grayling Thymallus thymallus fry during the summer months, most typically in July. The mortalities occurred during several years and were connected with a few neurological disorders and peritonitis. Virological investigation detected an infection with an unknown rhabdovirus. Based on the entire glycoprotein (G) and partial RNA polymerase (L) gene sequences, the virus was classified as a perch rhabdovirus (PRV). Pairwise comparisons of the G and L gene regions of grayling isolates revealed that all isolates were very closely related, with 99 to 100% nucleotide identity, which suggests the same origin of infection. Phylogenetic analysis demonstrated that they were closely related to the strain isolated from perch Perca fluviatilis and sea trout Salmo trutta trutta caught from the Baltic Sea. The entire G gene sequences revealed that all Finnish grayling isolates, and both the perch and sea trout isolates, were most closely related to a PRV isolated in France in 2004. According to the partial L gene sequences, all of the Finnish grayling isolates were most closely related to the Danish isolate DK5533 from pike. The genetic analysis of entire G gene and partial L gene sequences showed that the Finnish brown trout isolate ka907_87 shared only approximately 67 and 78% identity, respectively, with our grayling isolates. The grayling isolates were also analysed by an immunofluorescence antibody test. This is the first report of a PRV causing disease in grayling in Finland.
Evolution of long centromeres in fire ants.
Huang, Yu-Ching; Lee, Chih-Chi; Kao, Chia-Yi; Chang, Ni-Chen; Lin, Chung-Chi; Shoemaker, DeWayne; Wang, John
2016-09-15
Centromeres are essential for accurate chromosome segregation, yet sequence conservation is low even among closely related species. Centromere drive predicts rapid turnover because some centromeric sequences may compete better than others during female meiosis. In addition to sequence composition, longer centromeres may have a transmission advantage. We report the first observations of extremely long centromeres, covering on average 34 % of the chromosomes, in the red imported fire ant Solenopsis invicta. By comparison, cytological examination of Solenopsis geminata revealed typical small centromeric constrictions. Bioinformatics and molecular analyses identified CenSol, the major centromeric satellite DNA repeat. We found that CenSol sequences are very similar between the two species but the CenSol copy number in S. invicta is much greater than that in S. geminata. In addition, centromere expansion in S. invicta is not correlated with the duplication of CenH3. Comparative analyses revealed that several closely related fire ant species also possess long centromeres. Our results are consistent with a model of simple runaway centromere expansion due to centromere drive. We suggest expanded centromeres may be more prevalent in hymenopteran insects, which use haplodiploid sex determination, than previously considered.
Montoya-Ruiz, Carolina; Cajimat, Maria N B; Milazzo, Mary Louise; Diaz, Francisco J; Rodas, Juan David; Valbuena, Gustavo; Fulhorst, Charles F
2015-07-01
The results of a previous study suggested that Cherrie's cane rat (Zygodontomys cherriei) is the principal host of Necoclí virus (family Bunyaviridae, genus Hantavirus) in Colombia. Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences in this study confirmed that Necoclí virus is phylogenetically closely related to Maporal virus, which is principally associated with the delicate pygmy rice rat (Oligoryzomys delicatus) in western Venezuela. In pairwise comparisons, nonidentities between the complete amino acid sequence of the nucleocapsid protein of Necoclí virus and the complete amino acid sequences of the nucleocapsid proteins of other hantaviruses were ≥8.7%. Likewise, nonidentities between the complete amino acid sequence of the glycoprotein precursor of Necoclí virus and the complete amino acid sequences of the glycoprotein precursors of other hantaviruses were ≥11.7%. Collectively, the unique association of Necoclí virus with Z. cherriei in Colombia, results of the Bayesian analyses of complete nucleocapsid protein gene sequences and complete glycoprotein precursor gene sequences, and results of the pairwise comparisons of amino acid sequences strongly support the notion that Necoclí virus represents a novel species in the genus Hantavirus. Further work is needed to determine whether Calabazo virus (a hantavirus associated with Z. brevicauda cherriei in Panama) and Necoclí virus are conspecific.
Primate-specific evolution of an LDLR enhancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Qian-Fei; Prabhakar, Shyam; Wang, Qianben
2005-12-01
Sequence changes in regulatory regions have often been invoked to explain phenotypic divergence among species, but molecular examples of this have been difficult to obtain. In this study we identified an anthropoid primate-specific sequence element that contributed to the regulatory evolution of the low-density lipoprotein receptor. Using a combination of close and distant species genomic sequence comparisons coupled with in vivo and in vitro studies, we found that a functional cholesterol-sensing sequence motif arose and was fixed within a pre-existing enhancer in the common ancestor of anthropoid primates. Our study demonstrates one molecular mechanism by which ancestral mammalian regulatory elements can evolve to perform new functions in the primate lineage leading to human.
The complete chloroplast genome of Aconitum chiisanense Nakai (Ranunculaceae).
Lim, Chae Eun; Kim, Goon-Bo; Baek, Seunghoon; Han, Su-Min; Yu, Hee-Ju; Mun, Jeong-Hwan
2017-01-01
We determined the complete chloroplast DNA sequence of Aconitum chiisanense Nakai, a rare Aconitum species endemic to Korea. The chloroplast genome is 155 934 bp in length and contains 4 rRNA, 30 tRNA, and 78 protein-coding genes. Phylogenetic analysis revealed that the chloroplast genome of A. chiisanense is closely related to that of A. barbatum var. puberulum. Sequence comparison with other Ranunculaceae chloroplasts identified a unique deletion in the rps16 gene of A. chiisanense chloroplast DNA that can serve as a molecular marker for species identification.
ERIC Educational Resources Information Center
Williams, Joanna P.; Kao, Jenny C.; Pao, Lisa S.; Ordynans, Jill G.; Atkins, J. Grant; Cheng, Rong; DeBonis, Daniel
2016-01-01
We developed and evaluated an intervention that teaches reading comprehension via expository text structure training to second graders in urban public schools at risk for academic failure. Fifty lessons on 5 basic text structures (sequence, comparison, causation, description, and problem-solution) were embedded in a social studies curriculum that…
2011-01-01
Background Because biotechnological uses of bacteriophage gene products as alternatives to conventional antibiotics will require a thorough understanding of their genomic context, we sequenced and analyzed the genomes of four closely related phages isolated from Clostridium perfringens, an important agricultural and human pathogen. Results Phage whole-genome tetra-nucleotide signatures and proteomic tree topologies correlated closely with host phylogeny. Comparisons of our phage genomes to 26 others revealed three shared COGs; of particular interest within this core genome was an endolysin (PF01520, an N-acetylmuramoyl-L-alanine amidase) and a holin (PF04531). Comparative analyses of the evolutionary history and genomic context of these common phage proteins revealed two important results: 1) strongly significant host-specific sequence variation within the endolysin, and 2) a protein domain architecture apparently unique to our phage genomes in which the endolysin is located upstream of its associated holin. Endolysin sequences from our phages were one of two very distinct genotypes distinguished by variability within the putative enzymatically-active domain. The shared or core genome was comprised of genes with multiple sequence types belonging to five pfam families, and genes belonging to 12 pfam families, including the holin genes, which were nearly identical. Conclusions Significant genomic diversity exists even among closely-related bacteriophages. Holins and endolysins represent conserved functions across divergent phage genomes and, as we demonstrate here, endolysins can have significant variability and host-specificity even among closely-related genomes. Endolysins in our phage genomes may be subject to different selective pressures than the rest of the genome. These findings may have important implications for potential biotechnological applications of phage gene products. PMID:21631945
An improved model for whole genome phylogenetic analysis by Fourier transform.
Yin, Changchuan; Yau, Stephen S-T
2015-10-07
DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into the frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor, and the 2D mapping reduces the nucleotide composition bias in the distance measure, thereby improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra have the same dimensionality in the Fourier frequency space, and the Euclidean distances between the full Fourier power spectra of the DNA sequences are then used as the dissimilarity metrics. The improved DFT method, with computational performance increased by the 2D numerical representation, is applicable to DNA sequences of any length range. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets.
The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
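The pipeline described in this abstract (numerical mapping, DFT, length equalization, Euclidean distance) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the actual 2D mapping or the even scaling algorithm, so `NUC_TO_POINT` is a hypothetical mapping and `stretch` uses plain linear interpolation as a stand-in. All function names are invented for the example.

```python
import cmath
import math

# Hypothetical 2D mapping (an assumption, not the paper's): each nucleotide
# becomes a point in the plane, stored as a complex number.
NUC_TO_POINT = {"A": 1 + 0j, "T": -1 + 0j, "C": 0 + 1j, "G": 0 - 1j}


def power_spectrum(seq):
    """Return the DFT power spectrum |X[k]|^2 of the mapped sequence."""
    x = [NUC_TO_POINT[base] for base in seq.upper()]
    n = len(x)
    if n == 0:
        raise ValueError("sequence must not be empty")
    spectrum = []
    for k in range(n):
        # Direct O(n^2) DFT; fine for short demo sequences.
        xk = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(xk) ** 2)
    return spectrum


def stretch(spectrum, target_len):
    """Resample a spectrum to target_len points by linear interpolation.

    Stand-in for the paper's even scaling step, whose details are not
    given in the abstract.
    """
    n = len(spectrum)
    if n == target_len:
        return list(spectrum)
    if n == 1:
        return [spectrum[0]] * target_len
    out = []
    for i in range(target_len):
        pos = i * (n - 1) / (target_len - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1 - frac) * spectrum[lo] + frac * spectrum[hi])
    return out


def dft_distance(seq_a, seq_b):
    """Euclidean distance between length-equalized power spectra."""
    target = max(len(seq_a), len(seq_b))
    pa = stretch(power_spectrum(seq_a), target)
    pb = stretch(power_spectrum(seq_b), target)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

A pairwise matrix of `dft_distance` values could then be passed to any hierarchical clustering routine. Real genomes would need an FFT rather than the direct sum used here, and probably some normalization of spectra across lengths, which this sketch omits.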
Behera, Bijay Kumar; Kumari, Kavita; Baisvar, Vishwamitra Singh; Rout, Ajaya Kumar; Pakrashi, Sudip; Paria, Prasenjet; Jena, J K
2017-01-01
In the present study, the complete mitochondrial genome sequence of Labeo gonius is reported using a PGM sequencer (Ion Torrent). The complete mitogenome of L. gonius, 16,614 bp in length, was obtained by de novo assembly of genomic reads using the Torrent Mapping Alignment Program (TMAP). The mitogenome comprises 13 protein-coding genes, 22 tRNAs, 2 rRNA genes, and a D-loop control region; its gene order and organization are similar to those of most other fish mitogenomes in the NCBI databases. The mitogenome in the present study has 99% similarity to the complete mitogenome sequence of Labeo fimbriatus, as reported earlier. The phylogenetic analysis of Cypriniformes showed that their mitogenomes are closely related to each other. The complete mitogenome sequence of L. gonius should be helpful in understanding the population genetics, phylogenetics, and evolution of Indian carps.
Kayansamruaj, Pattanapon; Pirarat, Nopadon; Kondo, Hidehiro; Hirono, Ikuo; Rodkhum, Channarong
2015-12-01
Streptococcus agalactiae, or Group B streptococcus (GBS), is a highly virulent pathogen in aquatic animals, causing huge mortalities worldwide. In Thailand, the serotype Ia, β-hemolytic GBS, belonging to sequence type (ST) 7 of clonal complex (CC) 7, was found to be the major cause of streptococcosis outbreaks in fish farms. In this study, we performed an in silico genomic comparison, aiming to investigate the phylogenetic relationship between the pathogenic fish strains of Thai ST7 and other ST7 from different hosts and geographical origins. In general, the genomes of Thai ST7 strains are closely related to other fish ST7s, as the core genome is shared by 92-95% of any individual fish ST7 genome. Among the fish ST7 genomes, we observed only small dissimilarities, based on the analysis of clustered regularly interspaced short palindromic repeats (CRISPRs), surface protein markers, insertion sequence (IS) elements and putative virulence genes. The phylogenetic tree based on single nucleotide polymorphisms (SNPs) of the core genome sequences clearly categorized the ST7 strains according to their geographical and host origins, with the human ST7 being genetically distant from other fish ST7 strains. A pan-genome analysis of ST7 strains detected a 48-kb gene island specifically in the Thai ST7 isolates. The orientations and predicted amino acid sequences of the genes in the island closely matched those of Tn5252, a streptococcal conjugative transposon, in GBS 2603V/R serotype V, Streptococcus pneumoniae and Streptococcus suis. Thus, it was presumed that Thai ST7 acquired this Tn5252 homologue from related streptococci. The close phylogenetic relationship between the fish ST7 strains suggests that these strains were derived from a common ancestor and have diverged in different geographical regions and in different hosts. Copyright © 2015 Elsevier B.V. All rights reserved.
Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F; Abbazia, Patrick; Ababio, Amma; Adam, Naazneen
2015-01-01
The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. DOI: http://dx.doi.org/10.7554/eLife.06416.001 PMID:25919952
Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F
2015-04-28
The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery.
Novel Hepatozoon in vertebrates from the southern United States.
Allen, Kelly E; Yabsley, Michael J; Johnson, Eileen M; Reichard, Mason V; Panciera, Roger J; Ewing, Sidney A; Little, Susan E
2011-08-01
Novel Hepatozoon spp. sequences collected from previously unrecognized vertebrate hosts in North America were compared with documented Hepatozoon 18S rRNA sequences in an effort to examine phylogenetic relationships between the different Hepatozoon organisms found cycling in nature. An approximately 500-base pair fragment of 18S rDNA common to Hepatozoon spp. and some other apicomplexans was amplified and sequenced from the tissues or blood of 16 vertebrate host species from the southern United States, including 1 opossum (Didelphis virginiana), 2 bobcats (Lynx rufus), 1 domestic cat (Felis catus), 3 coyotes (Canis latrans), 1 gray fox (Urocyon cinereoargenteus), 4 raccoons (Procyon lotor), 1 pet boa constrictor (Boa constrictor imperator), 1 swamp rabbit (Sylvilagus aquaticus), 1 cottontail rabbit (Sylvilagus floridanus), 4 woodrats (Neotoma fuscipes and Neotoma micropus), 3 white-footed mice (Peromyscus leucopus), 8 cotton rats (Sigmodon hispidus), 1 cotton mouse (Peromyscus gossypinus), 1 eastern grey squirrel (Sciurus carolinensis), and 1 woodchuck (Marmota monax). Phylogenetic analyses and comparison with sequences in the existing database revealed distinct groups of Hepatozoon spp., with clusters formed by sequences obtained from scavengers and carnivores (opossum, raccoons, canids, and felids) and those obtained from rodents. Surprisingly, Hepatozoon spp. sequences from wild rabbits were most closely related to sequences obtained from carnivores (97.2% identical), and the sequence from the boa constrictor was most closely related to the rodent cluster (97.4% identical). These data are consistent with recent work identifying prey-predator transmission cycles in Hepatozoon spp. and suggest this pattern may be more common than previously recognized.
Chen, Tsute; Siddiqui, Huma; Olsen, Ingar
2017-01-01
Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563
The mitogenome of Onchocerca volvulus from the Brazilian Amazonia focus.
Crainey, James L; Silva, Túllio R R da; Encinas, Fernando; Marín, Michel A; Vicente, Ana Carolina P; Luz, Sérgio L B
2016-01-01
We report here the first complete mitochondrial genome of Onchocerca volvulus from a focus outside of Africa. An O. volvulus mitogenome from the Brazilian Amazonia focus was obtained using a combination of high-throughput and Sanger sequencing technologies. Comparisons made between this mitochondrial genome and publicly available mitochondrial sequences identified 46 variant nucleotide positions and suggested that our Brazilian mitogenome is more closely related to Cameroon-origin mitochondria than West African-origin mitochondria. As well as providing insights into the origins of Latin American onchocerciasis, the Brazilian Amazonia focus mitogenome may also have value as an epidemiological resource.
Zhu, Ruo-Lin; Zhang, Qi-Ya
2014-04-01
Paralichthys olivaceus rhabdovirus (PORV), which is associated with high mortality rates in flounder, was isolated in China in 2005. Here, we provide an annotated sequence record of PORV, the genome of which comprises 11,182 nucleotides and contains six genes in the order 3'-N-P-M-G-NV-L-5'. Phylogenetic analysis based on glycoprotein sequences of PORV and other rhabdoviruses showed that PORV clusters with viral haemorrhagic septicemia virus (VHSV), genus Novirhabdovirus, family Rhabdoviridae. Further phylogenetic analysis of the combined amino acid sequences of six proteins of PORV and VHSV strains showed that PORV clusters with Korean strains and is closely related to Asian strains, all of which were isolated from flounder. In a comparison in which the sequences of the six proteins were combined, PORV shared the highest identity (98.3 %) with VHSV strain KJ2008 from Korea.
Osmundson, Todd W.; Robert, Vincent A.; Schoch, Conrad L.; Baker, Lydia J.; Smith, Amy; Robich, Giovanni; Mizzan, Luca; Garbelotto, Matteo M.
2013-01-01
Despite recent advances spearheaded by molecular approaches and novel technologies, species description and DNA sequence information are significantly lagging for fungi compared to many other groups of organisms. Large scale sequencing of vouchered herbarium material can aid in closing this gap. Here, we describe an effort to obtain broad ITS sequence coverage of the approximately 6000 macrofungal-species-rich herbarium of the Museum of Natural History in Venice, Italy. Our goals were to investigate issues related to large sequencing projects, develop heuristic methods for assessing the overall performance of such a project, and evaluate the prospects of such efforts to reduce the current gap in fungal biodiversity knowledge. The effort generated 1107 sequences submitted to GenBank, including 416 previously unrepresented taxa and 398 sequences exhibiting a best BLAST match to an unidentified environmental sequence. Specimen age and taxon affected sequencing success, and subsequent work on failed specimens showed that an ITS1 mini-barcode greatly increased sequencing success without greatly reducing the discriminating power of the barcode. Similarity comparisons and nonmetric multidimensional scaling ordinations based on pairwise distance matrices proved to be useful heuristic tools for validating the overall accuracy of specimen identifications, flagging potential misidentifications, and identifying taxa in need of additional species-level revision. Comparison of within- and among-species nucleotide variation showed a strong increase in species discriminating power at 1–2% dissimilarity, and identified potential barcoding issues (same sequence for different species and vice-versa). All sequences are linked to a vouchered specimen, and results from this study have already prompted revisions of species-sequence assignments in several taxa. PMID:23638077
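The within- versus among-species comparison mentioned in this abstract can be illustrated with a small sketch. It assumes pre-aligned sequences of equal length and uses an uncorrected p-distance. The function names and the scoring rule in `discrimination` are illustrative choices, not the heuristics used in the study.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def split_distances(records):
    """Split pairwise distances into within-species and among-species lists.

    records is a list of (species_name, aligned_sequence) tuples.
    """
    within, among = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (within if sp1 == sp2 else among).append(p_distance(s1, s2))
    return within, among


def discrimination(records, threshold):
    """Fraction of pairs on the expected side of the threshold.

    A within-species pair counts as correct when its distance is at or below
    the threshold; an among-species pair counts when it is above.
    """
    within, among = split_distances(records)
    total = len(within) + len(among)
    if total == 0:
        raise ValueError("need at least two records")
    correct = sum(d <= threshold for d in within) + sum(d > threshold for d in among)
    return correct / total
```

Evaluating `discrimination` over a range of thresholds (for example 0.005 to 0.05) would show where the score rises, which is the kind of pattern the abstract reports at 1-2% dissimilarity.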
Bell, Stephanie A; Pusterla, Nicola; Balasuriya, Udeni B R; Mapes, Samantha M; Nyberg, Nicole L; MacLachlan, N James
2008-07-27
Equids are commonly infected by herpesviruses, but isolation of herpesviruses from mules has apparently not been previously reported. Furthermore, the genomic relationships among the various equid herpesviruses are poorly characterized. We describe the isolation and preliminary characterization of a mule gammaherpesvirus tentatively identified as asinine herpesvirus-2 (AHV-2; also designated equid herpesvirus-7 (EHV-7)) from the nasal secretions (NS) of a healthy mule in northern California. The virus was initially identified by transmission electron microscopic examination of lysates of cell culture inoculated with NS collected from the mule. A 913 nucleotide sequence of the DNA polymerase gene was amplified using degenerate primers, and comparison of this sequence with those of various other herpesviruses showed that the mule herpesvirus was most closely related to EHV-2 (AHV-2 sequences were not available for comparison). The sequence of a shorter portion (166 nucleotides) of the mule herpesvirus DNA polymerase gene was identical to that of the published sequence of an asinine gammaherpesvirus, previously designated as AHV-4-3 (AY054992). AHV-2 was detected by real-time polymerase chain reaction assay in the NS of approximately 8% of a cohort of 114 healthy mules and 13 donkeys.
Gubser, Caroline; Smith, Geoffrey L
2002-04-01
Camelpox virus (CMPV) and variola virus (VAR) are orthopoxviruses (OPVs) that share several biological features and cause high mortality and morbidity in their single host species. The sequence of a virulent CMPV strain was determined; it is 202182 bp long, with inverted terminal repeats (ITRs) of 6045 bp and has 206 predicted open reading frames (ORFs). As for other poxviruses, the genes are tightly packed with little non-coding sequence. Most genes within 25 kb of each terminus are transcribed outwards towards the terminus, whereas genes within the centre of the genome are transcribed from either DNA strand. The central region of the genome contains genes that are highly conserved in other OPVs and 87 of these are conserved in all sequenced chordopoxviruses. In contrast, genes towards either terminus are more variable and encode proteins involved in host range, virulence or immunomodulation. In some cases, these are broken versions of genes found in other OPVs. The relationship of CMPV to other OPVs was analysed by comparisons of DNA and predicted protein sequences, repeats within the ITRs and arrangement of ORFs within the terminal regions. Each comparison gave the same conclusion: CMPV is the closest known virus to variola virus, the cause of smallpox.
Kämpfer, Peter; Falsen, Enevold; Busse, Hans-Jürgen
2008-01-01
Pseudomonas mephitica CCUG 2513(T) has been reinvestigated to clarify its taxonomic position. 16S rRNA gene sequence comparisons demonstrated that this strain clusters phylogenetically closely with Janthinobacterium lividum (99.8% sequence similarity to the type strain). Investigation of fatty acid patterns, polar lipid profiles, polyamine patterns and quinone systems supported this delineation. Substrate utilization profiles and biochemical characteristics displayed no differences from the type strain of J. lividum, CCUG 2344(T). Therefore, the reclassification of Pseudomonas mephitica as a later heterotypic synonym of Janthinobacterium lividum is proposed, based upon the estimated phylogenetic position derived from 16S rRNA gene sequence data and chemotaxonomic and biochemical data.
Zhang, Wenwei; Cheng, Zhuomin; Xu, Lei; Wu, Maosen; Waterhouse, Peter; Zhou, Guanghe; Li, Shifang
2009-01-01
The complete nucleotide sequence of the ssRNA genome of a Chinese GPV isolate of barley yellow dwarf virus (BYDV) was determined. It comprised 5673 nucleotides, and the deduced genome organization resembled that of members of the genus Polerovirus. It was most closely related to cereal yellow dwarf virus-RPV (77% nt identity over the entire genome; coat protein amino acid identity 79%). The GPV isolate also differs in vector specificity from other BYDV strains. Biological properties, phylogenetic analyses and detailed sequence comparisons suggest that GPV should be considered a member of a new species within the genus, and the name Wheat yellow dwarf virus-GPV is proposed.
Gagny, B; Rossignol, M; Silar, P
1997-12-01
We have cloned and sequenced the gene encoding the translation elongation factor eEF1A from two filamentous fungi, Podospora curvicolla and Sordaria macrospora. These fungi are close relatives of Podospora anserina and also show senescence syndromes. Comparison of the sequences of the deduced proteins with that of P. anserina reveals that the three proteins differ in several positions. Replacement of the P. anserina gene by either of the two exogenous genes does not entail any modification in P. anserina physiology; the longevity of the fungus is not affected. No alteration of in vivo translational accuracy was detected; however, the exogenous proteins nonetheless promoted a modification of the resistance to the aminoglycoside antibiotic paromomycin. These data suggest that optimization of life span between these closely related fungi has likely not been performed during evolution through modifications of eEF1A activity, despite the fact that mutations in this factor can drastically affect longevity. Copyright 1997 Academic Press.
Khamrin, Pattara; Okitsu, Shoko; Ushijima, Hiroshi; Maneekarn, Niwat
2013-07-01
Epidemiological surveillance of human bocavirus (HBoV) was conducted on fecal specimens collected from hospitalized children with diarrhea in Chiang Mai, Thailand in 2011. By partial sequence analysis of the VP1 gene, an unusual strain of HBoV (CMH-S011-11) was initially identified as HBoV4. The complete genome sequence of CMH-S011-11 was determined and analyzed further to clarify whether it was a recombinant strain or a new HBoV variant. Analysis of the complete genome sequence revealed that the coding sequence from NS1 through NP1 to VP1/VP2 was 4795 nucleotides long. Interestingly, the nucleotide sequence of the NS1 gene of CMH-S011-11 was most closely related to the HBoV2 reference strains detected in Pakistan, which contradicted the initial genotyping result based on the partial VP1 region in the previous study. In addition, comparison of the NP1 nucleotide sequence of CMH-S011-11 with those of other HBoV1-4 reference strains also revealed a high level of sequence identity with HBoV2. On the other hand, the nucleotide sequence of the VP1/VP2 gene of CMH-S011-11 was most closely related to those of HBoV4 reference strains detected in Nigeria. The overall full-length sequence analysis revealed that CMH-S011-11 grouped within the HBoV4 species, but was located in a separate branch from other HBoV4 prototype strains. Recombination analysis revealed that CMH-S011-11 was the result of recombination between HBoV2 and HBoV4 strains, with the breakpoint located near the start codon of VP2. Copyright © 2013 Elsevier B.V. All rights reserved.
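The pattern reported here, where one part of the genome is closest to one parental lineage and another part to a second lineage, is the kind of signal a sliding-window identity scan can expose. The sketch below is a generic illustration under stated assumptions: the three sequences are already aligned to the same length, and the window and step sizes are arbitrary. The abstract does not say which recombination tool the authors used, and all names here are hypothetical.

```python
def identity(a, b):
    """Fraction of aligned positions that match, skipping columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def window_profile(query, ref_a, ref_b, window, step):
    """Return (start, identity_to_a, identity_to_b) for each window."""
    if not (len(query) == len(ref_a) == len(ref_b)):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        profile.append((
            start,
            identity(query[start:end], ref_a[start:end]),
            identity(query[start:end], ref_b[start:end]),
        ))
    return profile


def crossover_windows(profile):
    """Window starts at which the closer reference changes."""
    switches = []
    previous = None
    for start, id_a, id_b in profile:
        if id_a == id_b:
            continue  # ambiguous window, make no call
        closer = "a" if id_a > id_b else "b"
        if previous is not None and closer != previous:
            switches.append(start)
        previous = closer
    return switches
```

A switch reported by `crossover_windows` only suggests a candidate breakpoint region. Confirming recombination would need a dedicated statistical test and phylogenetic support on each side of the putative breakpoint.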
Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu
2017-03-01
Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability.
Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu
2017-01-01
Aim: Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. Materials and Methods: The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. Results: The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Conclusion: Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability. PMID:28435199
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.
Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li
2007-06-01
The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a 239-amino-acid precursor of an approximately 27 kDa protein. Analysis of the amino acid sequence revealed the conserved cysteine, histidine and asparagine residues, the occluding loop pattern, the hemoglobinase motif and the glutamine of the oxyanion hole that are characteristic of cathepsin B-like proteases (CBL). Comparison of the predicted amino acid sequences showed that the protein shared 33.5-58.7% identity with cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA is a member of the papain family.
Harnessing Whole Genome Sequencing in Medical Mycology.
Cuomo, Christina A
2017-01-01
Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.
Improving pairwise comparison of protein sequences with domain co-occurrence
Gascuel, Olivier
2018-01-01
Comparing and aligning protein sequences is an essential task in bioinformatics. More specifically, local alignment tools like BLAST are widely used for identifying conserved protein sub-sequences, which likely correspond to protein domains or functional motifs. However, to limit the number of false positives, these tools are used with stringent sequence-similarity thresholds and hence can miss several hits, especially for species that are phylogenetically distant from reference organisms. A solution to this problem is then to integrate additional contextual information to the procedure. Here, we propose to use domain co-occurrence to increase the sensitivity of pairwise sequence comparisons. Domain co-occurrence is a strong feature of proteins, since most protein domains tend to appear with a limited number of other domains on the same protein. We propose a method to take this information into account in a typical BLAST analysis and to construct new domain families on the basis of these results. We used Plasmodium falciparum as a case study to evaluate our method. The experimental findings showed an increase of 14% of the number of significant BLAST hits and an increase of 25% of the proteome area that can be covered with a domain. Our method identified 2240 new domains for which, in most cases, no model of the Pfam database could be linked. Moreover, our study of the quality of the new domains in terms of alignment and physicochemical properties show that they are close to that of standard Pfam domains. Source code of the proposed approach and supplementary data are available at: https://gite.lirmm.fr/menichelli/pairwise-comparison-with-cooccurrence PMID:29293498
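The notion of domain co-occurrence used in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation (their source code is at the URL given in the abstract); it only counts how often two domains appear on the same protein. The function name `cooccurrence_counts`, the protein identifiers and the domain labels are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(proteins):
    """Count, for every unordered pair of domains, the number of
    proteins on which both domains occur (hypothetical helper)."""
    counts = Counter()
    for domains in proteins.values():
        # Use a sorted set so each pair is counted once per protein
        # and always appears in the same order.
        for pair in combinations(sorted(set(domains)), 2):
            counts[pair] += 1
    return counts


# Toy annotation: protein identifier -> list of domain labels (made up).
toy_proteins = {
    "protA": ["D1", "D2"],
    "protB": ["D1", "D2", "D3"],
    "protC": ["D3"],
}

counts = cooccurrence_counts(toy_proteins)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

A table like this could then be used to favour a weak hit to domain D2 on a protein that already carries D1; how that evidence is weighted is specific to the published method and is not reproduced here.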
Ashen, Jon B.; Goff, Lynda J.
2000-01-01
The phylogenetic relationships of bacterial symbionts from three gall-bearing species in the marine red algal genus Prionitis (Rhodophyta) were inferred from 16S rDNA sequence analysis and compared to host phylogeny also inferred from sequence comparisons (nuclear ribosomal internal-transcribed-spacer region). Gall formation has been described previously on two species of Prionitis, P. lanceolata (from central California) and P. decipiens (from Peru). This investigation reports gall formation on a third related host, Prionitis filiformis. Phylogenetic analyses based on sequence comparisons place the bacteria as a single lineage within the Roseobacter grouping of the α subclass of the division Proteobacteria (99.4 to 98.25% sequence identity among phylotypes). Comparison of symbiont and host molecular phylogenies confirms the presence of three gall-bearing algal lineages and is consistent with the hypothesis that these red seaweeds and their bacterial symbionts are coevolving. The species specificity of these associations was investigated in nature by whole-cell hybridization of gall bacteria and in the laboratory by using cross-inoculation trials. Whole-cell in situ hybridization confirmed that a single bacterial symbiont phylotype is present in galls on each host. In laboratory trials, bacterial symbionts were incapable of inducing galls on alternate hosts (including two non-gall-bearing species). Symbiont-host specificity in Prionitis gall formation indicates an effective ecological separation between these closely related symbiont phylotypes and provides an example of a biological context in which to consider the organismic significance of 16S rDNA sequence variation. PMID:10877801
Differences in expression of retinal pigment epithelium mRNA between normal canines
2004-01-01
Abstract A reference database of differences in mRNA expression in normal healthy canine retinal pigment epithelium (RPE) has been established. This database identifies non-informative differences in mRNA expression that can be used in screening canine RPE for mutations associated with clinical effects on vision. Complementary DNA (cDNA) pools were prepared from mRNA harvested from RPE, amplified by PCR, and used in a subtractive hybridization protocol (representational differential analysis) to identify differences in RPE mRNA expression between canines. The effect of relatedness of the test canines on the frequency of occurrence of differences was evaluated by using 2 unrelated canines for comparison with 2 female sibling canines of blue heeler/bull terrier lineage. Differentially expressed cDNA species were cloned, sequenced, and identified by comparison to public database entries. The most frequently observed differentially expressed sequence from the unrelated canine comparison was cDNA with 21 base pairs (bp) identical to the human epithelial membrane protein 1 gene (present in 8 of 20 clones). Different clones from the same-sex sibling RPE contained repetitions of several short sequence motifs including the human epithelial membrane protein 1 (4 of 25 clones). Other prevalent differences between sibling RPE included sequences similar to a chicken genetic marker sequence motif (5 of 25), and 6 clones with homology to porcine major histocompatibility loci. In addition to identifying several repetitively occurring, noninformative, differentially expressed RPE mRNA species, the findings confirm that fewer differences occurred between siblings, highlighting the importance of using closely related subjects in representational difference analysis studies. PMID:15352545
Wolff, G; Burger, G; Lang, B F; Kück, U
1993-01-01
The mitochondrial DNA from the colourless alga Prototheca wickerhamii contains two mosaic genes, as was revealed by complete sequencing of the circular extranuclear genome. The genes for the large subunit of the ribosomal RNA (LSUrRNA) and for subunit I of the cytochrome oxidase (coxI) carry two and three intronic sequences, respectively. On the basis of their canonical nucleotide sequences they can be classified as group I introns. Phylogenetic comparisons of the coxI protein sequences allow us to conclude that the P. wickerhamii mtDNA is much more closely related to higher plant mtDNAs than to those of the chlorophyte alga C. reinhardtii. The comparison of the intron sequences revealed several unusual features: (1) The P. wickerhamii introns are structurally related to mitochondrial introns from various ascomycetous fungi. (2) Phylogenetic analyses indicate a close relationship between fungal and algal intronic sequences. (3) The P. wickerhamii introns are located at positions within the structural genes which can be considered as preferred intron insertion sites in homologous mitochondrial genes from fungi or liverwort. In all cases, the sequences adjacent to the insertion sites are very well conserved over large evolutionary distances. Our finding of highly similar introns in fungi and algae is consistent with the idea that introns were already present in the bacterial ancestors of present-day mitochondria and evolved concomitantly with the organelles. PMID:7680126
Gonzalez, P; Barroso, G; Labarère, J
1998-10-05
The Basidiomycota Agrocybe aegerita (Aa) mitochondrial cox1 gene (6790 nucleotides), encoding a protein of 527 aa (58377 Da), is split by four large subgroup IB introns possessing site-specific endonucleases assumed to be involved in intron mobility. When compared to other fungal COX1 proteins, the Aa protein is closely related to the COX1 protein of the Basidiomycota Schizophyllum commune (Sc). This clade reveals a relationship with the studied Ascomycota ones, with the exception of Schizosaccharomyces pombe (Sp), which ranges in an out-group position compared with both higher fungi divisions. When comparison is extended to other kingdoms, fungal COX1 sequences are found to be more related to algae and plant ones (more than 57.5% aa similarity) than to animal sequences (53.6% aa similarity), contrasting with the previously established close relationship between fungi and animals, based on comparisons of nuclear genes. The four Aa cox1 introns are homologous to Ascomycota or algae cox1 introns sharing the same location within the exonic sequences. The percentages of identity of the intronic nucleotide sequences suggest a possible acquisition by lateral transfers of ancestral copies or of their derived sequences. These identities extend over the whole intronic sequences, arguing in favor of a transfer of the complete intron rather than a transfer limited to the encoded ORF. The intron i4 shares 74% identity, at the nucleotide level, with the Podospora anserina (Pa) intron i14, and up to 90.5% aa similarity between the encoded proteins, i.e. the highest values reported to date between introns of two phylogenetically distant species. This low divergence argues for a recent lateral transfer between the two species. In contrast, the low sequence identities (below 36%) observed between Aa i1 and the homologous Sp i1 or Prototheca wickerhamii (Pw) i1 suggest a long evolution time after the separation of these sequences. The introns i2 and i3 possessed intermediate percentages of identity with their homologous Ascomycota introns. This is the first report of the complete nucleotide sequence and molecular organization of a mitochondrial cox1 gene of any member of the Basidiomycota division.
Prevotella timonensis sp. nov., isolated from a human breast abscess.
Glazunova, Olga O; Launay, Thierry; Raoult, Didier; Roux, Véronique
2007-04-01
Gram-negative anaerobic rods were isolated from a human breast abscess. Based on genotypic and phenotypic characteristics, the novel strain belonged to the genus Prevotella. Phylogenetic analysis based on 16S rRNA gene sequence comparisons showed that it was closely related to Prevotella buccalis (94 % 16S rRNA gene sequence similarity), Prevotella salivae (90 %) and Prevotella oris (89.1 %). The major cellular fatty acid was C(14 : 0) (19.5 %). The new isolate represents a novel species in the genus Prevotella, for which the name Prevotella timonensis sp. nov. is proposed. The type strain is strain 4401737(T) (=CIP 108522(T)=CCUG 50105(T)).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boore, Jeffrey L.; Staton, Joseph
We have determined the sequence of about half (7470 nts) of the mitochondrial genome of the sipunculid Phascolopsis gouldii, the first representative of this phylum to be so studied. All of the 19 identified genes are transcribed from the same DNA strand. The arrangement of these genes is remarkably similar to that of the oligochaete annelid Lumbricus terrestris. Comparison of both the inferred amino acid sequences and the gene arrangements of a variety of diverse metazoan taxa reveals that the phylum Sipuncula is more closely related to Annelida than to Mollusca. This requires reinterpretation of the homology of several embryological features and of patterns of animal body plan evolution.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Souza, B; Stoutland, P; Derbise, A
2004-01-24
Yersinia pestis, the causative agent of plague, is a highly uniform clone that diverged recently from the enteric pathogen Yersinia pseudotuberculosis. Despite their close genetic relationship, they differ radically in their pathogenicity and transmission. Here we report the complete genomic sequence of Y. pseudotuberculosis IP32953 and its use for detailed genome comparisons to available Y. pestis sequences. Analyses of identified differences across a panel of Yersinia isolates from around the world reveal 32 Y. pestis chromosomal genes that, together with the two Y. pestis-specific plasmids, represent the only new genetic material in Y. pestis acquired since the divergence from Y. pseudotuberculosis. In contrast, 149 new pseudogenes (doubling the previous estimate) and 317 genes absent from Y. pestis were detected, indicating that as many as 13% of Y. pseudotuberculosis genes no longer function in Y. pestis. Extensive IS-mediated genome rearrangements and reductive evolution through massive gene loss, resulting in elimination and modification of pre-existing gene expression pathways, appear to be more important than acquisition of new genes in the evolution of Y. pestis. These results provide a sobering example of how a highly virulent epidemic clone can suddenly emerge from a less virulent, closely related progenitor.
Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Bitaraf-Sani, Morteza
2015-01-01
Very little is known about LHR and FSHR genes of domestic dromedary camels. The main objective of this study was to determine and analyze partial genomic regions of FSHR and LHR genes in dromedary camels for the first time. To this end, a total of 50 DNA samples belonging to dromedary camels raised in Iran were sent for sequencing (25 samples of each gene). We compared the nucleotide sequences of Camelus dromedarius with corresponding sequences of previously published FSHR and LHR genes in bactrian camels and other species. According to the data, the same nucleotide variation was identified in both regions of the two camel species. The alignment of deduced protein sequences of the two different species revealed an amino acid variation at the FSHR region. No evidence of amino acid variation was observed, however, in LHR sequences. Phylogenetic analysis indicated that both camel species had a close relationship and clustered together in a separate branch. This was further confirmed by genetic distance values illustrating significant sequence identity between Camelus dromedarius and Camelus bactrianus. Interestingly, sequence comparisons revealed heterozygote patterns in FSHR sequences isolated from dromedary camels of Iran. In comparison to other species, this camel contains three amino acid substitutions at positions 5, 67, and 105 in the FSHR coding region. These substitutions are found exclusively in camels and can be considered species specific. The results of our study can be used for hormone functionality research (FSHR and LHR) as well as reproduction-linked polymorphisms and breeding programs. PMID:27844002
Buti, Matteo; Moretto, Marco; Barghini, Elena; Mascagni, Flavia; Natali, Lucia; Brilli, Matteo; Lomsadze, Alexandre; Sonego, Paolo; Giongo, Lara; Alonge, Michael; Velasco, Riccardo; Varotto, Claudio; Šurbanovski, Nada; Borodovsky, Mark; Ward, Judson A; Engelen, Kristof; Cavallini, Andrea; Cestaro, Alessandro; Sargent, Daniel James
2018-01-01
Abstract Background The genus Potentilla is closely related to that of Fragaria, the economically important strawberry genus. Potentilla micrantha is a species that does not develop berries but shares numerous morphological and ecological characteristics with Fragaria vesca. These similarities make P. micrantha an attractive choice for comparative genomics studies with F. vesca. Findings In this study, the P. micrantha genome was sequenced and annotated, and RNA-Seq data from the different developmental stages of flowering and fruiting were used to develop a set of gene predictions. A 327 Mbp sequence and annotation of the genome of P. micrantha was developed, spanning 2674 sequence contigs with an N50 size of 335,712 and estimated to cover 80% of the total genome size of the species. The genus Potentilla has a characteristically larger genome size than Fragaria, but the recovered sequence scaffolds were remarkably collinear at the micro-syntenic level with the genome of F. vesca, its closest sequenced relative. A total of 33,602 genes were predicted, and 95.1% of bench-marking universal single-copy orthologous genes were complete within the presented sequence. Thus, we argue that the majority of the gene-rich regions of the genome have been sequenced. Conclusions Comparisons of RNA-Seq data from the stages of floral and fruit development revealed genes differentially expressed between P. micrantha and F. vesca. The data presented are a valuable resource for future studies of berry development in Fragaria and the Rosaceae and they also shed light on the evolution of genome size and organization in this family. PMID:29659812
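The N50 statistic quoted in the abstract above is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal Python sketch of that standard definition follows; the function name `n50` and the toy contig lengths are illustrative assumptions and are not data from the P. micrantha assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and summed until the
    running total reaches half of the assembly size; the length of the
    contig that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy contig lengths in base pairs (made up for illustration).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # total is 300; 100 + 80 = 180 >= 150, so N50 is 80
```

N50 describes contiguity only; it says nothing about correctness or completeness, which is why the abstract also reports the fraction of complete single-copy orthologues.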
Mercury BLASTP: Accelerating Protein Sequence Alignment
Jacob, Arpith; Lancaster, Joseph; Buhler, Jeremy; Harris, Brandon; Chamberlain, Roger D.
2008-01-01
Large-scale protein sequence comparison is an important but compute-intensive task in molecular biology. BLASTP is the most popular tool for comparative analysis of protein sequences. In recent years, an exponential increase in the size of protein sequence databases has required either exponentially more running time or a cluster of machines to keep pace. To address this problem, we have designed and built a high-performance FPGA-accelerated version of BLASTP, Mercury BLASTP. In this paper, we describe the architecture of the portions of the application that are accelerated in the FPGA, and we also describe the integration of these FPGA-accelerated portions with the existing BLASTP software. We have implemented Mercury BLASTP on a commodity workstation with two Xilinx Virtex-II 6000 FPGAs. We show that the new design runs 11-15 times faster than software BLASTP on a modern CPU while delivering close to 99% identical results. PMID:19492068
A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower
DOE Office of Scientific and Technical Information (OSTI.GOV)
Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.
2006-01-20
Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnLtrnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Eusterids II, that of Panax ginseng (Araliaceae, 156,318 bps, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. A later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at a fine level and of RNA editing by comparison to available EST sequences. In addition, since both of these genomes are from crop plants, their complete genome sequences will facilitate development of chloroplast genetic engineering technology, as in recent studies from Daniell's lab. Knowing the exact sequence from spacer regions is crucial for introducing transgenes into the chloroplast genome.
Using pyrosequencing to shed light on deep mine microbial ecology
Edwards, Robert A; Rodriguez-Brito, Beltran; Wegley, Linda; Haynes, Matthew; Breitbart, Mya; Peterson, Dean M; Saar, Martin O; Alexander, Scott; Alexander, E Calvin; Rohwer, Forest
2006-01-01
Background Contrasting biological, chemical and hydrogeological analyses highlights the fundamental processes that shape different environments. Generating and interpreting the biological sequence data was a costly and time-consuming process in defining an environment. Here we have used pyrosequencing, a rapid and relatively inexpensive sequencing technology, to generate environmental genome sequences from two sites in the Soudan Mine, Minnesota, USA. These sites were adjacent to each other, but differed significantly in chemistry and hydrogeology. Results Comparisons of the microbes and the subsystems identified in the two samples highlighted important differences in metabolic potential in each environment. The microbes were performing distinct biochemistry on the available substrates, and subsystems such as carbon utilization, iron acquisition mechanisms, nitrogen assimilation, and respiratory pathways separated the two communities. Although the correlation between much of the microbial metabolism occurring and the geochemical conditions from which the samples were isolated could be explained, the reason for the presence of many pathways in these environments remains to be determined. Despite being physically close, these two communities were markedly different from each other. In addition, the communities were also completely different from other microbial communities sequenced to date. Conclusion We anticipate that pyrosequencing will be widely used to sequence environmental samples because of the speed, cost, and technical advantages. Furthermore, subsystem comparisons rapidly identify the important metabolisms employed by the microbes in different environments. PMID:16549033
Li, S.-F.; Xu, J.-W.; Yang, Q.-L.; Wang, C.H.; Chen, Q.; Chapman, D.C.; Lu, G.
2009-01-01
Based upon morphological characters, Silver carp Hypophthalmichthys molitrix and bighead carp Hypophthalmichthys nobilis (or Aristichthys nobilis) have been classified into either the same genus or two distinct genera. Consequently, the taxonomic relationship of the two species at the generic level remains equivocal. This issue is addressed by sequencing complete mitochondrial genomes of H. molitrix and H. nobilis, comparing their mitogenome organization, structure and sequence similarity, and conducting a comprehensive phylogenetic analysis of cyprinid species. As with other cyprinid fishes, the mitogenomes of the two species were structurally conserved, containing 37 genes including 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA (tRNAs) genes and a putative control region (D-loop). Sequence similarity between the two mitogenomes varied in different genes or regions, being highest in the tRNA genes (98.8%), lowest in the control region (89.4%) and intermediate in the protein-coding genes (94.2%). Analyses of the sequence comparison and phylogeny using concatenated protein sequences support the view that the two species belong to the genus Hypophthalmichthys. Further studies using nuclear markers and involving more closely related species, and the systematic combination of traditional biology and molecular biology are needed in order to confirm this conclusion. © 2009 The Fisheries Society of the British Isles.
Meadows, J R S; Kijas, J W
2009-02-01
The male-specific region of the ovine Y chromosome (MSY) remains poorly characterized, yet sequence variants from this region have the potential to reveal the wild progenitor of domestic sheep or examples of domestic and wild paternal introgression. The 5' promoter region of the sex-determining gene SRY was re-sequenced using a subset of wild sheep including bighorn (Ovis canadensis), thinhorn (Ovis dalli spp.), urial (Ovis vignei), argali (Ovis ammon), mouflon (Ovis musimon) and domestic sheep (Ovis aries). Seven novel SNPs (oY2-oY8) were revealed; these were polymorphic between but not within species. Re-sequencing and fragment analysis was applied to the MSY microsatellite SRYM18. It contains a complex compound repeat structure and sequencing of three novel size fragments revealed that a pentanucleotide element remained fixed, whilst a dinucleotide element displayed variability within species. Comparison of the sequence between species revealed that urial and argali sheep grouped more closely to the mouflon and domestic breeds than the pachyceriforms (bighorn and thinhorn). SNP and microsatellite data were combined to define six previously undetected haplotypes. Analysis revealed the mouflon as the only species to share a haplotype with domestic sheep, consistent with its status as a feral domesticate that has undergone male-mediated exchange with domestic animals. A comparison of the remaining wild species and domestic sheep revealed that O. aries is free from signatures of wild sheep introgression.
Králová-Hromadová, Ivica; Štefka, Jan; Bazsalovicsová, Eva; Bokorová, Silvia; Oros, Mikuláš
2013-10-01
Atractolytocestus tenuicollis (Li, 1964) Xi, Wang, Wu, Gao et Nie, 2009 is a monozoic, non-segmented tapeworm of the order Caryophyllidea, parasitizing exclusively common carp (Cyprinus carpio L.). In the current work, the first molecular data, in particular complete ribosomal internal transcribed spacer 2 (ITS2) and partial mitochondrial cytochrome c oxidase subunit I (cox1) on A. tenuicollis from Niushan Lake, Wuhan, China, are provided. In order to evaluate molecular interrelationships within Atractolytocestus, the data on A. tenuicollis were compared with relevant data on two other congeners, Atractolytocestus huronensis and Atractolytocestus sagittatus. Divergent intragenomic copies (ITS2 paralogues) were detected in the ITS2 ribosomal spacer of A. tenuicollis; the same phenomenon has previously been observed also in two other congeners. ITS2 structure of A. tenuicollis was very similar to that of A. huronensis from Slovakia, USA and UK; overall pairwise sequence identity was 91.7-95.2%. On the other hand, values of sequence identity between A. tenuicollis and A. sagittatus were lower, 69.7-70.9%. Cox1 sequences, analysed in five A. tenuicollis individuals, were 100% identical and no intraspecific variation was observed. Comparison of A. tenuicollis cox1 with respective sequences of two other Atractolytocestus species showed that the mitochondrial haplotype found in Chinese A. tenuicollis is structurally specific (haplotype 4; Ha4) and differs from all so far determined Atractolytocestus haplotypes (Ha1 and Ha2 for A. huronensis; Ha3 for A. sagittatus). Pairwise sequence identity between A. tenuicollis cox1 haplotype and remaining three haplotypes followed the same pattern as in ITS2. The nucleotide and amino acid (aa) sequence comparison with A. huronensis Ha1 and Ha2 revealed higher sequence identity, 90.3-90.8% (96.9% in aa), while lower values were achieved between A. tenuicollis haplotype and Ha3 of Japanese A. sagittatus: 75.2% (81.9% in aa). The phylogenetic analyses using cox1, ITS2 and combined cox1 + ITS2 sequences revealed close genetic interrelationship between A. tenuicollis and A. huronensis. Independently of a type of analysis and DNA region used, the topology of obtained trees was always identical; A. tenuicollis formed a separate clade with A. huronensis forming a closely related sister group.
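Pairwise sequence identity values such as those reported above are computed from aligned sequences. The Python sketch below shows one common convention, assumed here for illustration: columns where either sequence has a gap are ignored, and identity is the share of the remaining columns that match. The abstract does not state which convention the authors used, and the function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: columns containing a gap in either sequence
    are skipped and do not count towards the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (made up, not Atractolytocestus data).
print(round(percent_identity("ACGTAC", "ACGTTC"), 1))  # 5 of 6 columns match
print(percent_identity("AC-T", "ACGT"))                # gap column is skipped
```

Other conventions divide by the full alignment length or by the shorter sequence, which yields lower values for gapped alignments, so identity figures from different studies are comparable only when the denominator is the same.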
Molecular detection of viral agents in free-ranging and captive neotropical felids in Brazil.
Furtado, Mariana M; Taniwaki, Sueli A; de Barros, Iracema N; Brandão, Paulo E; Catão-Dias, José L; Cavalcanti, Sandra; Cullen, Laury; Filoni, Claudia; Jácomo, Anah T de Almeida; Jorge, Rodrigo S P; Silva, Nairléia Dos Santos; Silveira, Leandro; Ferreira Neto, José S
2017-09-01
We describe molecular testing for felid alphaherpesvirus 1 (FHV-1), carnivore protoparvovirus 1 (CPPV-1), feline calicivirus (FCV), alphacoronavirus 1 (feline coronavirus [FCoV]), feline leukemia virus (FeLV), feline immunodeficiency virus (FIV), and canine distemper virus (CDV) in whole blood samples of 109 free-ranging and 68 captive neotropical felids from Brazil. Samples from 2 jaguars (Panthera onca) and 1 oncilla (Leopardus tigrinus) were positive for FHV-1; 2 jaguars, 1 puma (Puma concolor), and 1 jaguarundi (Herpailurus yagouaroundi) tested positive for CPPV-1; and 1 puma was positive for FIV. Based on comparison of 103 nucleotides of the UL24-UL25 gene, the FHV-1 sequences were 99-100% similar to the FHV-1 strain of domestic cats. Nucleotide sequences of CPPV-1 were closely related to sequences detected in other wild carnivores, comparing 294 nucleotides of the VP1 gene. The FIV nucleotide sequence detected in the free-ranging puma, based on comparison of 444 nucleotides of the pol gene, grouped with other lentiviruses described in pumas, and had 82.4% identity with a free-ranging puma from Yellowstone Park and 79.5% with a captive puma from Brazil. Our data document the circulation of FHV-1, CPPV-1, and FIV in neotropical felids in Brazil.
2014-01-01
Background: Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, and is additionally supported by analysis in this paper, that such maps preserve functional co-localization of the protein structure space. Methods: Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results: We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership. Conclusions: This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matthews, T. David; Schmieder, Robert; Silva, Genivaldo G. Z.; ...
2015-06-03
The Salmonella enterica serovars Enteritidis, Dublin, and Gallinarum are closely related but differ in virulence and host range. To identify the genetic elements responsible for these differences and to better understand how these serovars are evolving, we sequenced the genomes of Enteritidis strain LK5 and Dublin strain SARB12 and compared these genomes to the publicly available Enteritidis P125109, Dublin CT 02021853 and Dublin SD3246 genome sequences. We also compared the publicly available Gallinarum genome sequences from biotype Gallinarum 287/91 and Pullorum RKS5078. Using bioinformatic approaches, we identified single nucleotide polymorphisms, insertions, deletions, and differences in prophage and pseudogene content between strains belonging to the same serovar. Through our analysis we also identified several prophage cargo genes and pseudogenes that affect virulence and may contribute to a host-specific, systemic lifestyle. These results strongly argue that the Enteritidis, Dublin and Gallinarum serovars of Salmonella enterica evolve by acquiring new genes through horizontal gene transfer, followed by the formation of pseudogenes. As a result, the loss of genes necessary for a gastrointestinal lifestyle ultimately leads to a systemic lifestyle and niche exclusion in the host-specific serovars.
Waddell, Evan J.; Elliott, Terran J.; Sani, Rajesh K.; Vahrenkamp, Jefferey M.; Roggenthen, William M.; Anderson, Cynthia M.; Bang, Sookie S.
2013-01-01
Molecular characterization of subsurface microbial communities in the former Homestake gold mine, South Dakota, was carried out by 16S rDNA sequence analysis using a water sample and a weathered soil–like sample. Geochemical analyses indicated that both samples were high in sulfur, rich in nitrogen and salt, but with significantly different metal concentrations. Microbial diversity comparisons unexpectedly revealed three distinct operational taxonomic units (OTUs) belonging to the archaeal phylum Thaumarchaeota typically identified from marine environments, and one OTU to a potentially novel phylum that falls sister to Thaumarchaeota. To our knowledge this is only the second report of Thaumarchaeota in a terrestrial environment. The majority of the clones from Archaea sequence libraries fell into two closely related OTUs and grouped most closely to an ammonia–oxidizing, carbon–fixing and halophilic thaumarchaeote genus, Nitrosopumilus. The two samples showed neither Euryarchaeota nor Crenarchaeota members that were often identified from other subsurface terrestrial ecosystems. Bacteria OTUs containing the highest percentage of sequences were related to sulfur-oxidizing bacteria of the orders Chromatiales and Thiotrichales. Community members of Bacteria from individual Homestake ecosystems were heterogeneous and distinctive to each community with unique phylotypes identified within each sample. PMID:20662386
Matthews, T. David; Schmieder, Robert; Silva, Genivaldo G. Z.; Busch, Julia; Cassman, Noriko; Dutilh, Bas E.; Green, Dawn; Matlock, Brian; Heffernan, Brian; Olsen, Gary J.; Farris Hanna, Leigh; Schifferli, Dieter M.; Maloy, Stanley; Dinsdale, Elizabeth A.; Edwards, Robert A.
2015-01-01
The Salmonella enterica serovars Enteritidis, Dublin, and Gallinarum are closely related but differ in virulence and host range. To identify the genetic elements responsible for these differences and to better understand how these serovars are evolving, we sequenced the genomes of Enteritidis strain LK5 and Dublin strain SARB12 and compared these genomes to the publicly available Enteritidis P125109, Dublin CT 02021853 and Dublin SD3246 genome sequences. We also compared the publicly available Gallinarum genome sequences from biotype Gallinarum 287/91 and Pullorum RKS5078. Using bioinformatic approaches, we identified single nucleotide polymorphisms, insertions, deletions, and differences in prophage and pseudogene content between strains belonging to the same serovar. Through our analysis we also identified several prophage cargo genes and pseudogenes that affect virulence and may contribute to a host-specific, systemic lifestyle. These results strongly argue that the Enteritidis, Dublin and Gallinarum serovars of Salmonella enterica evolve by acquiring new genes through horizontal gene transfer, followed by the formation of pseudogenes. The loss of genes necessary for a gastrointestinal lifestyle ultimately leads to a systemic lifestyle and niche exclusion in the host-specific serovars. PMID:26039056
Simmonds-Gordon, R N; Collins-Fairclough, A M; Stewart, C S; Roye, M E
2014-10-01
Jatropha gossypifolia is a weed that is commonly found with yellow mosaic symptoms growing along the roadside and in close proximity to cultivated crops in many farming communities in Jamaica. For the first time, the complete genome sequence of a new begomovirus, designated jatropha mosaic virus-[Jamaica:Spanish Town:2004] (JMV-[JM:ST:04]), was determined from field-infected J. gossypifolia in the western hemisphere. DNA-A nucleotide sequence comparisons showed closest identity (84 %) to two tobacco-infecting viruses from Cuba, tobacco mottle leaf curl virus-[Cuba:Sancti Spiritus:03] (TbMoLCV-[CU:SS:03]) and tobacco leaf curl Cuba virus-[Cuba:Taguasco:2005] (TbLCuCUV-[CU:Tag:05]), and two weed-infecting viruses from Cuba and Jamaica, Rhynchosia rugose golden mosaic virus-[Cuba:Camaguey:171:2009] (RhRGMV- [CU:Cam:171:09]) and Wissadula golden mosaic St. Thomas virus-[Jamaica:Albion:2005] (WGMSTV-[JM:Alb:05]). Phylogenetic analysis revealed that JMV-[JM:ST:04] is most closely related to tobacco and tomato viruses from Cuba and WGMSTV-[JM:Alb:05], a common malvaceous-weed-infecting virus from eastern Jamaica, and that it is distinct from begomoviruses infecting Jatropha species in India and Nigeria.
Statistical method to compare massive parallel sequencing pipelines.
Elsensohn, M H; Leblay, N; Dimassi, S; Campan-Fournier, A; Labalme, A; Roucher-Boulez, F; Sanlaville, D; Lesca, G; Bardel, C; Roy, P
2017-03-01
Today, sequencing is frequently carried out by Massive Parallel Sequencing (MPS), which drastically cuts sequencing time and expenses. Nevertheless, Sanger sequencing remains the main validation method to confirm the presence of variants. The analysis of MPS data involves the development of several bioinformatic tools, academic or commercial. We present here a statistical method to compare MPS pipelines and test it in a comparison between an academic (BWA-GATK) and a commercial pipeline (TMAP-NextGENe®), with and without reference to a gold standard (here, Sanger sequencing), on a panel of 41 genes in 43 epileptic patients. This method used the number of variants to fit log-linear models for pairwise agreements between pipelines. To assess the heterogeneity of the margins and the odds ratios of agreement, four log-linear models were used: a full model, a homogeneous-margin model, a model with a single odds ratio for all patients, and a model with a single intercept. Then a log-linear mixed model was fitted considering the biological variability as a random effect. Among the 390,339 base-pairs sequenced, TMAP-NextGENe® and BWA-GATK found, on average, 2253.49 and 1857.14 variants (single nucleotide variants and indels), respectively. Against the gold standard, the pipelines had similar sensitivities (63.47% vs. 63.42%) and close but significantly different specificities (99.57% vs. 99.65%; p < 0.001). Same-trend results were obtained when only single nucleotide variants were considered (99.98% specificity and 76.81% sensitivity for both pipelines). The method thus allows pipeline comparison and selection. It is generalizable to all types of MPS data and all pipelines.
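The sensitivities and specificities reported against the gold standard follow the usual confusion-matrix definitions. The sketch below shows only that arithmetic; it does not reproduce the log-linear or mixed models described in the abstract, and the counts used are invented for illustration.

```python
def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: variants called by the pipeline and confirmed by the gold standard
    fn: gold-standard variants the pipeline missed
    tn: positions correctly called as non-variant
    fp: pipeline calls not confirmed by the gold standard
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, not taken from the study.
sens, spec = sensitivity_specificity(tp=63, fp=5, tn=995, fn=37)
print(f"sensitivity={sens:.2%} specificity={spec:.2%}")
```

Because true negatives vastly outnumber variants in a sequenced panel, specificities close to 100% can still differ significantly between pipelines, as the abstract reports.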
Azospirillum canadense sp. nov., a nitrogen-fixing bacterium isolated from corn rhizosphere.
Mehnaz, Samina; Weselowski, Brian; Lazarovits, George
2007-03-01
A free-living diazotrophic strain, DS2(T), was isolated from corn rhizosphere. Polyphasic taxonomy was performed including morphological characterization, Biolog analysis, and 16S rRNA, cpn60 and nifH gene sequence analyses. 16S rRNA gene sequence analysis indicated that strain DS2(T) was closely related to the genus Azospirillum (96% similarity). Chemotaxonomic characteristics (DNA G+C content 67.9 mol%; Q-10 quinone system; major fatty acid 18:1ω7c) were also similar to those of the genus Azospirillum. In all the analyses, including phenotypic characterization using Biolog analysis and comparison of cellular fatty acids, this isolate was found to be different from the closely related species Azospirillum lipoferum, Azospirillum oryzae and Azospirillum brasilense. On the basis of these results, a novel species is proposed for this nitrogen-fixing strain. The name Azospirillum canadense sp. nov. is suggested with the type strain DS2(T) (=NCCB 100108(T)=LMG 23617(T)).
Subbiah, Madhuri; Xiao, Sa; Collins, Peter L.; Samal, Siba K
2009-01-01
The complete RNA genome sequence of avian paramyxovirus (APMV) serotype 2, strain Yucaipa isolated from chicken has been determined. With genome size of 14,904 nucleotides (nt), strain Yucaipa is consistent with the “rule of six” and is the smallest virus reported to date among the members of subfamily Paramyxovirinae. The genome contains six non-overlapping genes in the order 3′-N-P/V-M-F-HN-L-5′. The genes are flanked on either side by highly-conserved transcription start and stop signals and have intergenic sequences varying in length from 3 to 23 nt. The genome contains a 55 nt leader sequence at 3′ end and a 154 nt trailer sequence at 5′ end. Alignment and phylogenetic analysis of the predicted amino acid sequences of strain Yucaipa proteins with the cognate proteins of viruses of all of the five genera of family Paramyxoviridae showed that APMV-2 strain Yucaipa is more closely related to APMV-6 than APMV-1. PMID:18603323
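The "rule of six" mentioned above states that the genome length of these viruses is an exact multiple of six nucleotides. A minimal check is sketched below; the function name is illustrative.

```python
def obeys_rule_of_six(genome_length_nt: int) -> bool:
    """True if the genome length is an exact multiple of six nucleotides."""
    return genome_length_nt % 6 == 0


# Genome length reported for APMV-2 strain Yucaipa: 14,904 nt = 6 * 2484.
print(obeys_rule_of_six(14904))
```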
Van Damme, Els J.M.; Charels, Diana; Roy, Soma; Tierens, Koenraad; Barre, Annick; Martins, José C.; Rougé, Pierre; Van Leuven, Fred; Does, Mirjam; Peumans, Willy J.
1999-01-01
We isolated SN-HLPf (Sambucus nigra hevein-like fruit protein), a hevein-like chitin-binding protein, from mature elderberry fruits. Cloning of the corresponding gene demonstrated that SN-HLPf is synthesized as a chimeric precursor consisting of an N-terminal chitin-binding domain corresponding to the mature elderberry protein and an unrelated C-terminal domain. Sequence comparisons indicated that the N-terminal domain of this precursor has high sequence similarity with the N-terminal domain of class I PR-4 (pathogenesis-related) proteins, whereas the C terminus is most closely related to that of class V chitinases. On the basis of these sequence homologies the gene encoding SN-HLPf can be considered a hybrid between a PR-4 and a class V chitinase gene. PMID:10198114
Liu, Ruifang; Koyanagi, Kanako O; Chen, Sunlu; Kishima, Yuji
2012-12-01
In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
Amoikon, Tiemele Laurent Simon; Grondin, Cécile; Djéni, Théodore N'Dédé; Jacques, Noémie; Casaregola, Serge
2018-05-21
Analysis of yeasts isolated from various biotopes in French Guiana led to the identification of two strains isolated from flowers and designated CLIB 1634T and CLIB 1707T. Comparison of the D1/D2 domain of the large subunit (LSU D1/D2) rRNA gene sequences of CLIB 1634T and CLIB 1707T to those in the GenBank database revealed that these strains belong to the Starmerella clade. Strain CLIB 1634T was shown to diverge from the closely related Starmerella apicola type strain CBS 2868T with a sequence divergence of 1.34 and 1.30%, in the LSU D1/D2 rRNA gene and internal transcribed spacer (ITS) sequences respectively. Strain CLIB 1634T and Candida apicola CBS 2868T diverged by 3.81 and 14.96% at the level of the protein-coding gene partial sequences EF-1α and RPB2, respectively. CLIB 1707T was found to have sequence divergence of 3.88 and 9.16% in the LSU D1/D2 rRNA gene and ITS, respectively, from that of the most closely related species Starmerella ratchasimensis type strain CBS 10611T. The species Starmerella reginensis f.a., sp. nov. and Starmerella kourouensis f.a., sp. nov. are proposed to accommodate strains CLIB 1634T (=CBS 15247T) and CLIB 1707T (=CBS 15257T), respectively.
Searching for evidence of selection in avian DNA barcodes.
Kerr, Kevin C R
2011-11-01
The barcode of life project has assembled a tremendous number of mitochondrial cytochrome c oxidase I (COI) sequences. Although these sequences were gathered to develop a DNA-based system for species identification, it has been suggested that further biological inferences may also be derived from this wealth of data. Recurrent selective sweeps have been invoked as an evolutionary mechanism to explain limited intraspecific COI diversity, particularly in birds, but this hypothesis has not been formally tested. In this study, I collated COI sequences from previous barcoding studies on birds and tested them for evidence of selection. Using this expanded data set, I re-examined the relationships between intraspecific diversity and interspecific divergence and sampling effort, respectively. I employed the McDonald-Kreitman test to test for neutrality in sequence evolution between closely related pairs of species. Because amino acid sequences were generally constrained between closely related pairs, I also included broader intra-order comparisons to quantify patterns of protein variation in avian COI sequences. Lastly, using 22 published whole mitochondrial genomes, I compared the evolutionary rate of COI against the other 12 protein-coding mitochondrial genes to assess intragenomic variability. I found no conclusive evidence of selective sweeps. Most evidence pointed to an overall trend of strong purifying selection and functional constraint. The COI protein did vary across the class Aves, but to a very limited extent. COI was the least variable gene in the mitochondrial genome, suggesting that other genes might be more informative for probing factors constraining mitochondrial variation within species. © 2011 Blackwell Publishing Ltd.
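The McDonald-Kreitman test named above compares nonsynonymous and synonymous changes that are fixed between species with those that are polymorphic within species, using a 2x2 contingency table. The sketch below assumes the test is evaluated with a two-sided Fisher's exact test and summarised by the neutrality index; the counts are invented, and the study's own implementation may differ.

```python
from math import comb


def neutrality_index(dn: int, ds: int, pn: int, ps: int) -> float:
    """NI = (Pn/Ps) / (Dn/Ds). Values above 1 suggest an excess of
    nonsynonymous polymorphism (consistent with purifying selection);
    values below 1 suggest an excess of fixed nonsynonymous differences.
    Raises ZeroDivisionError if ps or dn is zero."""
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: fixed (Dn, Ds) and polymorphic (Pn, Ps) sites.
dn, ds, pn, ps = 7, 17, 2, 42
print(neutrality_index(dn, ds, pn, ps))
print(fisher_exact_two_sided(dn, ds, pn, ps))
```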
Yu, Xiaoyu; Reva, Oleg N
2018-01-01
Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the computational complexity of current phylogenomic procedures, inappropriateness of standard phylogenetic tools to process genome-wide data, and lack of reliable substitution models which correlates with alignment-free phylogenomic approaches deter microbiologists from using these opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence alignment, not to mention the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allowed a fast and reliable inference of phylogenetic trees. These were congruent with the corresponding whole genome super-matrix trees in terms of tree topology when compared with other known approaches including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at http://swphylo.bi.up.ac.za/. Applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Distinguishing between closely related taxonomic units may be enforced by providing the program with alignments of marker protein sequences, e.g., GyrA. PMID:29511354
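The core idea of an alignment-free comparison based on tetranucleotide frequencies can be sketched as follows. This is only an illustration: the plain relative-frequency normalisation and the Euclidean distance are stand-in assumptions, and the published OUP method and its Web tool may normalise and compare patterns differently.

```python
from itertools import product
from math import sqrt

# All 256 tetranucleotides in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq: str) -> list[float]:
    """Relative frequency of each 4-mer, counted with a sliding window.
    Windows containing characters other than A, C, G, T are ignored."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no valid tetranucleotides")
    return [counts[k] / total for k in KMERS]


def euclidean_distance(u: list[float], v: list[float]) -> float:
    """Distance between two frequency vectors (assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy sequences, not real genomes.
genome_a = "ACGTACGTTTGACCAGTACGATCG"
genome_b = "ACGTTCGTTTGACGAGTACGTTCG"
fa = tetranucleotide_frequencies(genome_a)
fb = tetranucleotide_frequencies(genome_b)
print(euclidean_distance(fa, fb))
```

A matrix of such pairwise distances could then be passed to a standard distance-based tree-building method; that step is not shown here.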
Substrate-Driven Mapping of the Degradome by Comparison of Sequence Logos
Fuchs, Julian E.; von Grafenstein, Susanne; Huber, Roland G.; Kramer, Christian; Liedl, Klaus R.
2013-01-01
Sequence logos are frequently used to illustrate substrate preferences and specificity of proteases. Here, we employed the compiled substrates of the MEROPS database to introduce a novel metric for comparison of protease substrate preferences. The constructed similarity matrix of 62 proteases can be used to intuitively visualize similarities in protease substrate readout via principal component analysis and construction of protease specificity trees. Since our new metric is solely based on substrate data, we can engraft the protease tree including proteolytic enzymes of different evolutionary origin. Thereby, our analyses confirm pronounced overlaps in substrate recognition not only between proteases closely related on sequence basis but also between proteolytic enzymes of different evolutionary origin and catalytic type. To illustrate the applicability of our approach we analyze the distribution of targets of small molecules from the ChEMBL database in our substrate-based protease specificity trees. We observe a striking clustering of annotated targets in tree branches even though these grouped targets do not necessarily share similarity on protein sequence level. This highlights the value and applicability of knowledge acquired from peptide substrates in drug design of small molecules, e.g., for the prediction of off-target effects or drug repurposing. Consequently, our similarity metric allows mapping of the degradome and its associated drug target network via comparison of known substrate peptides. The substrate-driven view of protein-protein interfaces is not limited to the field of proteases but can be applied to any target class where a sufficient amount of known substrate data is available. PMID:24244149
Whole-genome comparative analysis of three phytopathogenic Xylella fastidiosa strains.
Bhattacharyya, Anamitra; Stilwagen, Stephanie; Ivanova, Natalia; D'Souza, Mark; Bernal, Axel; Lykidis, Athanasios; Kapatral, Vinayak; Anderson, Iain; Larsen, Niels; Los, Tamara; Reznik, Gary; Selkov, Eugene; Walunas, Theresa L; Feil, Helene; Feil, William S; Purcell, Alexander; Lassez, Jean-Louis; Hawkins, Trevor L; Haselkorn, Robert; Overbeek, Ross; Predki, Paul F; Kyrpides, Nikos C
2002-09-17
Xylella fastidiosa (Xf) causes wilt disease in plants and is responsible for major economic and crop losses globally. Owing to the public importance of this phytopathogen we embarked on a comparative analysis of the complete genome of Xf pv citrus and the partial genomes of two recently sequenced strains of this species: Xf pv almond and Xf pv oleander, which cause leaf scorch in almond and oleander plants, respectively. We report a reanalysis of the previously sequenced Xf 9a5c (CVC, citrus) strain and the two "gapped" Xf genomes revealing ORFs encoding critical functions in pathogenicity and conjugative transfer. Second, a detailed whole-genome functional comparison was based on the three sequenced Xf strains, identifying the unique genes present in each strain, in addition to those shared between strains. Third, an "in silico" cellular reconstruction of these organisms was made, based on a comparison of their core functional subsystems that led to a characterization of their conjugative transfer machinery, identification of potential differences in their adhesion mechanisms, and highlighting of the absence of a classical quorum-sensing mechanism. This study demonstrates the effectiveness of comparative analysis strategies in the interpretation of genomes that are closely related.
Ricard, Guénola; McEwan, Neil R; Dutilh, Bas E; Jouany, Jean-Pierre; Macheboeuf, Didier; Mitsumori, Makoto; McIntosh, Freda M; Michalowski, Tadeusz; Nagamine, Takafumi; Nelson, Nancy; Newbold, Charles J; Nsabimana, Eli; Takenaka, Akio; Thomas, Nadine A; Ushida, Kazunari; Hackstein, Johannes HP; Huynen, Martijn A
2006-01-01
Background: The horizontal transfer of expressed genes from Bacteria into Ciliates which live in close contact with each other in the rumen (the foregut of ruminants) was studied using ciliate Expressed Sequence Tags (ESTs). More than 4000 ESTs were sequenced from representatives of the two major groups of rumen Ciliates: the order Entodiniomorphida (Entodinium simplex, Entodinium caudatum, Eudiplodinium maggii, Metadinium medium, Diploplastron affine, Polyplastron multivesiculatum and Epidinium ecaudatum) and the order Vestibuliferida, previously called Holotricha (Isotricha prostoma, Isotricha intestinalis and Dasytricha ruminantium). Results: A comparison of the sequences with the completely sequenced genomes of Eukaryotes and Prokaryotes, followed by large-scale construction and analysis of phylogenies, identified 148 ciliate genes that specifically cluster with genes from the Bacteria and Archaea. The phylogenetic clustering with bacterial genes, coupled with the absence of close relatives of these genes in the Ciliate Tetrahymena thermophila, indicates that they have been acquired via Horizontal Gene Transfer (HGT) after the colonization of the gut by the rumen Ciliates. Conclusion: Among the HGT candidates, we found an over-representation (>75%) of genes involved in metabolism, specifically in the catabolism of complex carbohydrates, a rich food source in the rumen. We propose that the acquisition of these genes has greatly facilitated the Ciliates' colonization of the rumen, providing evidence for the role of HGT in the adaptation to new niches. PMID:16472398
Rodriguez-R, Luis M; Gunturu, Santosh; Harvey, William T; Rosselló-Mora, Ramon; Tiedje, James M; Cole, James R; Konstantinidis, Konstantinos T
2018-06-14
The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called 'Clade Project') to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at http://microbial-genomes.org/.
Ricard, Guénola; McEwan, Neil R; Dutilh, Bas E; Jouany, Jean-Pierre; Macheboeuf, Didier; Mitsumori, Makoto; McIntosh, Freda M; Michalowski, Tadeusz; Nagamine, Takafumi; Nelson, Nancy; Newbold, Charles J; Nsabimana, Eli; Takenaka, Akio; Thomas, Nadine A; Ushida, Kazunari; Hackstein, Johannes H P; Huynen, Martijn A
2006-02-10
The horizontal transfer of expressed genes from Bacteria into Ciliates which live in close contact with each other in the rumen (the foregut of ruminants) was studied using ciliate Expressed Sequence Tags (ESTs). More than 4000 ESTs were sequenced from representatives of the two major groups of rumen Ciliates: the order Entodiniomorphida (Entodinium simplex, Entodinium caudatum, Eudiplodinium maggii, Metadinium medium, Diploplastron affine, Polyplastron multivesiculatum and Epidinium ecaudatum) and the order Vestibuliferida, previously called Holotricha (Isotricha prostoma, Isotricha intestinalis and Dasytricha ruminantium). A comparison of the sequences with the completely sequenced genomes of Eukaryotes and Prokaryotes, followed by large-scale construction and analysis of phylogenies, identified 148 ciliate genes that specifically cluster with genes from the Bacteria and Archaea. The phylogenetic clustering with bacterial genes, coupled with the absence of close relatives of these genes in the Ciliate Tetrahymena thermophila, indicates that they have been acquired via Horizontal Gene Transfer (HGT) after the colonization of the gut by the rumen Ciliates. Among the HGT candidates, we found an over-representation (>75%) of genes involved in metabolism, specifically in the catabolism of complex carbohydrates, a rich food source in the rumen. We propose that the acquisition of these genes has greatly facilitated the Ciliates' colonization of the rumen, providing evidence for the role of HGT in the adaptation to new niches.
Ranjard, Lionel; Brothier, Elisabeth; Nazaret, Sylvie
2000-01-01
Two major emerging bands (a 350-bp band and a 650-bp band) within the RISA (ribosomal intergenic spacer analysis) profile of a soil bacterial community spiked with Hg(II) were selected for further identification of the populations involved in the response of the community to the added metal. The bands were cut out from polyacrylamide gels, cloned, characterized by restriction analysis, and sequenced for phylogenetic affiliation of dominant clones. The sequences were the intergenic spacer between the rrs and rrl genes and the first 130 nucleotides of the rrl gene. Comparison of sequences derived from the 350-bp band to the GenBank database permitted us to identify the bacteria as being mostly close relatives to low G+C firmicutes (Clostridium-like genera), while the 650-bp band permitted us to identify the bacteria as being mostly close relatives to β-proteobacteria (Ralstonia-like genera). Oligonucleotide probes specific for the identified dominant bacteria were designed and hybridized with the RISA profiles derived from the control and spiked communities. These studies confirmed the contribution of these populations to the community response to the metal. Hybridization of the RISA profiles from subcommunities (bacterial pools associated with different soil microenvironments) also permitted us to characterize the distribution and the dynamics of these populations at a microscale level following mercury spiking. PMID:11097911
Nishito, Yukari; Osana, Yasunori; Hachiya, Tsuyoshi; Popendorf, Kris; Toyoda, Atsushi; Fujiyama, Asao; Itaya, Mitsuhiro; Sakakibara, Yasubumi
2010-04-16
Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although re-sequencing whole genomes of several laboratory domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, the assembly of the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly with our aim to assemble one fully connected scaffold from short reads around 35 bp in length. We applied a comparative genome assembly method, which combines de novo assembly and reference guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for gamma-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacks a polyketide synthesis operon, has a disrupted plipastatin production operon, and possesses previously unidentified transposases. The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B. subtilis natto harbors but B. subtilis 168 lacks. Multiple genome-level comparisons among five closely related Bacillus species were also carried out. The determined genome sequence of B. subtilis natto and gene annotations are available from the Natto genome browser http://natto-genome.org/.
Jones, Christopher M; Stres, Blaz; Rosenquist, Magnus; Hallin, Sara
2008-09-01
Denitrification is a facultative respiratory pathway in which nitrite (NO2-), nitric oxide (NO), and nitrous oxide (N2O) are successively reduced to nitrogen gas (N2), effectively closing the nitrogen cycle. The ability to denitrify is widely dispersed among prokaryotes, and this polyphyletic distribution has raised the possibility of horizontal gene transfer (HGT) having a substantial role in the evolution of denitrification. Comparisons of 16S rRNA and denitrification gene phylogenies in recent studies support this possibility; however, these results remain speculative as they are based on visual comparisons of phylogenies from partial sequences. We reanalyzed publicly available nirS, nirK, norB, and nosZ partial sequences using Bayesian and maximum likelihood phylogenetic inference. Concomitant analysis of denitrification genes with 16S rRNA sequences from the same organisms showed substantial differences between the trees, which were supported by examining the posterior probability of monophyletic constraints at different taxonomic levels. Although these differences suggest HGT of denitrification genes, the presence of structural variants for nirK, norB, and nosZ makes it difficult to distinguish HGT from other evolutionary events. Additional analysis using phylogenetic networks and likelihood ratio tests of phylogenies based on full-length sequences retrieved from genomes also revealed significant differences in tree topologies among denitrification and 16S rRNA gene phylogenies, with the exception of the nosZ gene phylogeny within the data set of the nirK-harboring genomes. However, inspection of codon usage and G + C content plots from complete genomes gave no evidence for recent HGT. Instead, the close proximity of denitrification gene copies in the genomes of several denitrifying bacteria suggests duplication. Although HGT cannot be ruled out as a factor in the evolution of denitrification genes, our analysis suggests that other phenomena, such as gene duplication/divergence and lineage sorting, may have differently influenced the evolution of each denitrification gene.
Assembly and comparison of two closely related Brassica napus genomes.
Bayer, Philipp E; Hurgobin, Bhavna; Golicz, Agnieszka A; Chan, Chon-Kit Kenneth; Yuan, Yuxuan; Lee, HueyTyng; Renton, Michael; Meng, Jinling; Li, Ruiyuan; Long, Yan; Zou, Jun; Bancroft, Ian; Chalhoub, Boulos; King, Graham J; Batley, Jacqueline; Edwards, David
2017-12-01
As an increasing number of plant genome sequences become available, it is clear that gene content varies between individuals, and the challenge arises to predict the gene content of a species. However, genome comparison is often confounded by variation in assembly and annotation. Differentiating between true gene absence and variation in assembly or annotation is essential for the accurate identification of conserved and variable genes in a species. Here, we present the de novo assembly of the B. napus cultivar Tapidor and comparison with an improved assembly of the Brassica napus cultivar Darmor-bzh. Both cultivars were annotated using the same method to allow comparison of gene content. We identified genes unique to each cultivar and differentiated these from artefacts due to variation in the assembly and annotation. We demonstrate that using a common annotation pipeline can result in different gene predictions, even for closely related cultivars, and that repeat regions which collapse during assembly impact whole genome comparison. After accounting for differences in assembly and annotation, we demonstrate that the genome of Darmor-bzh contains a greater number of genes than the genome of Tapidor. Our results are the first step towards comparison of the true differences between B. napus genomes and highlight the potential sources of error in future production of a B. napus pangenome. © 2017 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Mhc class II B gene evolution in East African cichlid fishes.
Figueroa, F; Mayer, W E; Sültmann, H; O'hUigin, C; Tichy, H; Satta, Y; Takezaki, N; Takahata, N; Klein, J
2000-06-01
A distinctive feature of essential major histocompatibility complex (Mhc) loci is their polymorphism characterized by large genetic distances between alleles and long persistence times of allelic lineages. Since the lineages often span several successive speciations, we investigated the behavior of the Mhc alleles during or close to the speciation phase. We sequenced exon 2 of the class II B locus 4 from 232 East African cichlid fishes representing 32 related species. The divergence times of the (sub)species ranged from 6,000 to 8.4 million years. Two types of evolutionary analysis were used to elucidate the pattern of exon 2 sequence divergence. First, phylogenetic methods were applied to reconstruct the most likely evolutionary pathways leading from the last common ancestor of the set to the extant sequences, and to assess the probable mechanisms involved in allelic diversification. Second, pairwise comparisons of sequences were carried out to detect differences seemingly incompatible with origin by nonparallel point mutations. The analysis revealed point mutations to be the most important mechanism behind allelic divergences, with recombination playing only an auxiliary part. Comparison of sequences from related species revealed evidence of random allelic (lineage) losses apparently associated with speciation. Sharing of identical alleles could be demonstrated between species that diverged 2 million years ago. The phylogeny of the exon was incongruent with that of the flanking introns, indicating either a high degree of convergent evolution at the peptide-binding region-encoding sites, or intron homogenization.
Mead, David A.; Lucas, Susan; Copeland, Alex; Lapidus, Alla; Cheng, Jan-Feng; Bruce, David C.; Goodwin, Lynne A.; Pitluck, Sam; Chertkov, Olga; Zhang, Xiaojing; Detter, John C.; Han, Cliff S.; Tapia, Roxanne; Land, Miriam; Hauser, Loren J.; Chang, Yun-juan; Kyrpides, Nikos C.; Ivanova, Natalia N.; Ovchinnikova, Galina; Woyke, Tanja; Brumm, Catherine; Hochstein, Rebecca; Schoenfeld, Thomas; Brumm, Phillip
2012-01-01
Paenibacillus sp. Y412MC10 was one of a number of organisms isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. The isolate was initially classified as a Geobacillus sp. Y412MC10 based on its isolation conditions and similarity to other organisms isolated from hot springs at Yellowstone National Park. Comparison of 16S rRNA sequences within the Bacillales indicated that Geobacillus sp. Y412MC10 clustered with Paenibacillus species, and the organism was most closely related to Paenibacillus lautus. Lucigen Corp. prepared genomic DNA and the genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute. The genome sequence was deposited at the NCBI in October 2009 (NC_013406). The genome of Paenibacillus sp. Y412MC10 consists of one circular chromosome of 7,121,665 bp with an average G+C content of 51.2%. Comparison to other Paenibacillus species shows the organism lacks nitrogen fixation, antibiotic production and social interaction genes reported in other paenibacilli. The Y412MC10 genome shows a high level of synteny and homology to the draft sequence of Paenibacillus sp. HGF5, an organism from the Human Microbiome Project (HMP) Reference Genomes. This, combined with genomic CAZyme analysis, suggests an intestinal, rather than environmental origin for Y412MC10. PMID:23408395
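An average G+C content such as the 51.2% quoted above is simply the share of G and C bases among the unambiguous bases of a sequence. The sketch below shows that calculation on a toy string; the function name and the choice to ignore ambiguous symbols such as N are assumptions made for illustration, not a description of the pipeline used for this genome.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among the unambiguous bases (A, C, G, T) of a sequence.

    Ambiguous symbols such as 'N' are ignored, which is an assumption
    of this sketch rather than a universal convention.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Toy example: four of the eight bases are G or C.
toy_sequence = "GGCCAATT"
print(f"G+C content: {gc_content(toy_sequence):.1f}%")
```

For a whole genome, the same function would be applied to the concatenated chromosome sequence read from a FASTA file.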
Wang, Yongkang; Song, Xiaodan; Li, Xiaorong; Yang, Sang-tian; Zou, Xiang
2017-01-04
The aim of this study was to explore the genome sequence of Aureobasidium pullulans CCTCC M2012223, analyze the key genes related to the biosynthesis of important metabolites, and provide a genetic background for metabolic engineering. The complete genome of A. pullulans CCTCC M2012223 was sequenced on the Illumina HiSeq high-throughput sequencing platform. Fragment assembly, gene prediction, functional annotation, and GO/COG clustering were then analyzed in comparison with those of five other A. pullulans varieties. The complete genome sequence of A. pullulans CCTCC M2012223 was 30,756,831 bp with an average GC content of 47.49%, and 9452 genes were predicted. Genome-wide analysis showed that A. pullulans CCTCC M2012223 had the largest genome assembly size. Protein sequences involved in the pullulan and polymalic acid pathways were highly conserved in all six A. pullulans varieties. Although A. pullulans CCTCC M2012223 and A. pullulans var. melanogenum are closely related, some point mutations and insertions occurred in protein sequences involved in melanin biosynthesis. The genome of A. pullulans CCTCC M2012223 was annotated and genes involved in the melanin, pullulan and polymalic acid pathways were compared, which provides a theoretical basis for genetic modification of metabolic pathways in A. pullulans.
Sequencing a piece of history: complete genome sequence of the original Escherichia coli strain
Dunne, Karl A; Chaudhuri, Roy R; Rossiter, Amanda E; Beriotto, Irene; Browning, Douglas F; Squire, Derrick; Cunningham, Adam F; Cole, Jeffrey A; Loman, Nicholas
2017-01-01
In 1885, Theodor Escherich first described the Bacillus coli commune, which was subsequently renamed Escherichia coli. We report the complete genome sequence of this original strain (NCTC 86). The 5 144 392 bp circular chromosome encodes the genes for 4805 proteins, which include antigens, virulence factors, antimicrobial-resistance factors and secretion systems, of a commensal organism from the pre-antibiotic era. It is located in the E. coli A subgroup and is closely related to E. coli K-12 MG1655. E. coli strain NCTC 86 and the non-pathogenic K-12, C, B and HS strains share a common backbone that is largely co-linear. The exception is a large 2 803 932 bp inversion that spans the replication terminus from gmhB to clpB. Comparison with E. coli K-12 reveals 41 regions of difference (577 351 bp) distributed across the chromosome. For example, and contrary to current dogma, E. coli NCTC 86 includes a nine gene sil locus that encodes a silver-resistance efflux pump acquired before the current widespread use of silver nanoparticles as an antibacterial agent, possibly resulting from the widespread use of silver utensils and currency in Germany in the 1800s. In summary, phylogenetic comparisons with other E. coli strains confirmed that the original strain isolated by Escherich is most closely related to the non-pathogenic commensal strains. It is more distant from the root than the pathogenic organisms E. coli 042 and O157 : H7; therefore, it is not an ancestral state for the species. PMID:28663823
Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E
1996-10-03
We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Pyrococcus furiosus (Pf), a hyperthermophilic archaeon. Sequence analysis of the Pf gene indicated an open reading frame specifying a protein of 485 amino acids (aa) with a calculated M(r) of 52900. Canonical Archaea promoter elements, Box A and Box B, are located -49 and -17 nucleotides (nt), respectively, upstream of the putative start codon. The sequence of the putative active-site region conforms to the IMPDH signature motif and contains a putative active-site cysteine. Phylogenetic relationships derived by using all available IMPDH sequences are consistent with trees developed for other molecules; they do not precisely resolve the history of Pf IMPDH but indicate a close similarity to bacterial IMPDH proteins. The phylogenetic analysis indicates that a gene duplication occurred prior to the division between rodents and humans, accounting for the Type I and II isoforms identified in mice and humans.
Chang, D D; Clayton, D A
1986-01-01
Transcription of the heavy strand of mouse mitochondrial DNA starts from two closely spaced, distinct sites located in the displacement loop region of the genome. We report here an analysis of regulatory sequences required for faithful transcription from these two sites. Data obtained from in vitro assays demonstrated that a 51-base-pair region, encompassing nucleotides -40 to +11 of the downstream start site, contains sufficient information for accurate transcription from both start sites. Deletion of the 3' flanking sequences, including one or both start sites to -17, resulted in the initiation of transcription by the mitochondrial RNA polymerase from alternative sites within vector DNA sequences. This feature places the mouse heavy-strand promoter uniquely among other known mitochondrial promoters, all of which absolutely require cognate start sites for transcription. Comparison of the heavy-strand promoter with those of other vertebrate mitochondrial DNAs revealed a remarkably high rate of sequence divergence among species. PMID:3785226
Louis, Ed
2011-01-01
In the early days of the yeast genome sequencing project, gene annotation was in its infancy and suffered the problem of many false positive annotations as well as missed genes. The lack of other sequences for comparison also prevented the annotation of conserved, functional sequences that were not coding. We are now in an era of comparative genomics where many closely related as well as more distantly related genomes are available for direct sequence and synteny comparisons allowing for more probable predictions of genes and other functional sequences due to conservation. We also have a plethora of functional genomics data which helps inform gene annotation for previously uncharacterised open reading frames (ORFs)/genes. For Saccharomyces cerevisiae this has resulted in a continuous updating of the gene and functional sequence annotations in the reference genome helping it retain its position as the best characterized eukaryotic organism's genome. A single reference genome for a species does not accurately describe the species and this is quite clear in the case of S. cerevisiae where the reference strain is not ideal for brewing or baking due to missing genes. Recent surveys of numerous isolates, from a variety of sources, using a variety of technologies have revealed a great deal of variation amongst isolates with genome sequence surveys providing information on novel genes, undetectable by other means. We now have a better understanding of the extant variation in S. cerevisiae as a species as well as some idea of how much we are missing from this understanding. As with gene annotation, comparative genomics enhances the discovery and description of genome variation and is providing us with the tools for understanding genome evolution, adaptation and selection, and underlying genetics of complex traits.
Lee, Michael D.; Walworth, Nathan G.; Sylvan, Jason B.; Edwards, Katrina J.; Orcutt, Beth N.
2015-01-01
Areas of exposed basalt along mid-ocean ridges and at seafloor outcrops serve as conduits of fluid flux into and out of a subsurface ocean, and microbe–mineral interactions can influence alteration reactions at the rock–water interface. Located on the eastern flank of the East Pacific Rise, Dorado Outcrop is a site of low-temperature (<20°C) hydrothermal venting and represents a new end-member in the current survey of seafloor basalt biomes. Consistent with prior studies, a survey of 16S rRNA gene sequence diversity using universal primers targeting the V4 hypervariable region revealed much greater richness and diversity on the seafloor rocks than in surrounding seawater. Overall, Gamma-, Alpha-, and Deltaproteobacteria, and Thaumarchaeota dominated the sequenced communities, together making up over half of the observed diversity, though bacterial sequences were more abundant than archaeal in all samples. The most abundant bacterial reads were closely related to the obligate chemolithoautotrophic, sulfur-oxidizing Thioprofundum lithotrophicum, suggesting carbon and sulfur cycling as dominant metabolic pathways in this system. Representatives of Thaumarchaeota were detected in relatively high abundance on the basalts in comparison to bottom water, possibly indicating ammonia oxidation. In comparison to other sequence datasets from globally distributed seafloor basalts, this study reveals many overlapping and cosmopolitan phylogenetic groups and also suggests that substrate age correlates with community structure. PMID:26779122
Metzger, Julia; Tonda, Raul; Beltran, Sergi; Agueda, Lídia; Gut, Marta; Distl, Ottmar
2014-07-04
Domestication has shaped the horse and led to many different types. Some have been under strong human selection, while others developed in close relationship with nature. The aim of our study was to perform next-generation sequencing of breed and non-breed horses to provide insight into genetic influences on selective forces. Whole-genome sequencing of five horses of four different populations revealed 10,193,421 single nucleotide polymorphisms (SNPs) and 1,361,948 insertion/deletion polymorphisms (indels). In comparison to horse variant databases and previous reports, we were able to identify 3,394,883 novel SNPs and 868,525 novel indels. We analyzed the distribution of individual variants and found significant enrichment of private mutations in coding regions of genes involved in primary metabolic processes, anatomical structures, morphogenesis and cellular components in non-breed horses and, in contrast, of private mutations in genes affecting cell communication, lipid metabolic process, neurological system process, muscle contraction, ion transport, developmental processes of the nervous system and ectoderm in breed horses. Our next-generation sequencing data constitute an important first step for the characterization of non-breed in comparison to breed horses and provide a large number of novel variants for future analyses. Functional annotations suggest specific variants that could play a role in the characterization of breed or non-breed horses.
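The variant counts reported above can be turned into proportions with simple arithmetic. The short sketch below uses only the four counts quoted in the abstract; the variable and function names are chosen here for illustration. It shows that roughly a third of the SNPs and nearly two thirds of the indels were reported as novel.

```python
# Counts taken directly from the abstract above.
total_snps = 10_193_421
novel_snps = 3_394_883
total_indels = 1_361_948
novel_indels = 868_525


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


snp_novel_pct = percent(novel_snps, total_snps)
indel_novel_pct = percent(novel_indels, total_indels)

print(f"novel SNPs:   {snp_novel_pct:.1f}% of all SNPs")
print(f"novel indels: {indel_novel_pct:.1f}% of all indels")
```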
2009-01-01
Background: Parthenium argentatum (guayule) is an industrial crop that produces latex, which was recently commercialized as a source of latex rubber safe for people with Type I latex allergy. The complete plastid genome of P. argentatum was sequenced. The sequence provides important information useful for genetic engineering strategies. Comparison to the sequences of plastid genomes from three other members of the Asteraceae, Lactuca sativa, Guizotia abyssinica and Helianthus annuus, revealed details of the evolution of the four genomes. Chloroplast-specific DNA barcodes were developed for identification of Parthenium species and lines. Results: The complete plastid genome of P. argentatum is 152,803 bp. Based on the overall comparison of individual protein coding genes with those in L. sativa, G. abyssinica and H. annuus, we demonstrate that the P. argentatum chloroplast genome sequence is most closely related to that of H. annuus. Similar to chloroplast genomes in G. abyssinica, L. sativa and H. annuus, the plastid genome of P. argentatum has a large 23 kb inversion with a smaller 3.4 kb inversion within the large inversion. Using the matK and psbA-trnH spacer chloroplast DNA barcodes, three of the four Parthenium species tested, P. tomentosum, P. hysterophorus and P. schottii, can be differentiated from P. argentatum. In addition, we identified lines within P. argentatum. Conclusion: The genome sequence of the P. argentatum chloroplast will enrich the sequence resources of plastid genomes in commercial crops. The availability of the complete plastid genome sequence may facilitate transformation efficiency by using the precise sequence of endogenous flanking sequences and regulatory elements in chloroplast transformation vectors. The DNA barcoding study forms the foundation for genetic identification of commercially significant lines of P. argentatum that are important for producing latex. PMID:19917140
Cajimat, Maria N. B.; Milazzo, Mary Louise; Borchert, Jeff N.; Abbott, Ken D.; Bradley, Robert D.; Fulhorst, Charles F.
2008-01-01
The results of analyses of glycoprotein precursor and nucleocapsid protein gene sequences indicated that an arenavirus isolated from a Mexican woodrat (Neotoma mexicana) captured in Arizona is a strain of a novel species (proposed name Skinner Tank virus) and that arenaviruses isolated from Mexican woodrats captured in Colorado, New Mexico, and Utah are strains of Whitewater Arroyo virus or species phylogenetically closely related to Whitewater Arroyo virus. Pairwise comparisons of glycoprotein precursor sequences and nucleocapsid protein sequences revealed a high level of divergence among the viruses isolated from the Mexican woodrats captured in Colorado, New Mexico, and Utah and the Whitewater Arroyo virus prototype strain AV 9310135, which originally was isolated from a white-throated woodrat (Neotoma albigula) captured in New Mexico. Conceptually, the viruses from Colorado, New Mexico, and Utah and strain AV 9310135 could be grouped together in a species complex in the family Arenaviridae, genus Arenavirus. PMID:18304671
Grose, Julianne H.; Casjens, Sherwood R.
2014-01-01
Bacteriophages are the predominant biological entity on the planet. The recent explosion of sequence information has made estimates of their diversity possible. We describe the genomic comparison of 337 fully sequenced tailed phages isolated on 18 genera and 31 species of bacteria in the Enterobacteriaceae. These phages were largely unambiguously grouped into 56 diverse clusters (32 lytic and 24 temperate) that have syntenic similarity over >50% of the genomes within each cluster, but substantially less sequence similarity between clusters. Most clusters naturally break into sets of more closely related subclusters, 78% of which are correlated with their host genera. The largest groups of related phages are superclusters united by genome synteny to lambda (81 phages) and T7 (51 phages). This study forms a robust framework for understanding diversity and evolutionary relationships of existing tailed phages, for relating newly discovered phages and for determining host/phage relationships. PMID:25240328
The complete genome sequence of freesia mosaic virus and its relationship to other potyviruses.
Choi, H I; Lim, H R; Song, Y S; Kim, M J; Choi, S H; Song, Y S; Bae, S C; Ryu, K H
2010-07-01
We have completed the genomic sequence of a potyvirus, freesia mosaic virus (FreMV), and compared it to those of other known potyviruses. The full-length genome sequence of FreMV consists of 9,489 nucleotides. It contains one open reading frame, with an AUG start codon and a UAA stop codon, that encodes a large polyprotein of 3,077 amino acids, as is typical of potyviruses. The polyprotein of FreMV-Kr gives rise to eleven proteins (P1, HC-pro, P3, PIPO, 6K1, CI, 6K2, VPg, NIa, NIb and CP), and putative cleavage sites of each protein were identified by sequence comparison to those of other known potyviruses. Phylogenetic analysis of the polyprotein revealed that FreMV-Kr was most closely related to PeMoV and was related to BtMV, BaRMV and PeLMV, which belong to the BCMV subgroup. This is the first information on the complete genome structure of FreMV, and the sequence information clearly supports the status of FreMV as a member of a distinct species in the genus Potyvirus.
NASA Technical Reports Server (NTRS)
Villanueva, E.; Delihas, N.; Luehrsen, K. R.; Fox, G. E.; Gibson, J.
1985-01-01
The complete nucleotide sequences of 5S ribosomal RNAs from Rhodocyclus gelatinosa, Rhodobacter sphaeroides, and Pseudomonas cepacia were determined. Comparisons of these 5S RNA sequences show that rather than being phylogenetically related to one another, the two photosynthetic bacterial 5S RNAs share more sequence and signature homology with the RNAs of two nonphotosynthetic strains. Rhodobacter sphaeroides is specifically related to Paracoccus denitrificans and Rc. gelatinosa is related to Ps. cepacia. These results support earlier 16S ribosomal RNA studies and add two important groups to the 5S RNA data base. Unique 5S RNA structural features previously found in P. denitrificans are present also in the 5S RNA of Rb. sphaeroides; these provide the basis for subdivisional signatures. The immediate consequence of obtaining these new sequences is that it is possible to clarify the phylogenetic origins of the plant mitochondrion. In particular, a close phylogenetic relationship is found between the plant mitochondria and members of the alpha subdivision of the purple photosynthetic bacteria, namely, Rb. sphaeroides, P. denitrificans, and Rhodospirillum rubrum.
Alatoom, Adnan A.; Cazanave, Charles J.; Cunningham, Scott A.; Ihde, Sherry M.
2012-01-01
We evaluated the Bruker Biotyper matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) mass spectrometry for identification of 92 clinical isolates of Corynebacterium species in comparison to identification using rpoB or 16S rRNA gene sequencing. Eighty isolates (87%) yielded a score of ≥1.700, and all of these were correctly identified to the species level with the exception of Corynebacterium aurimucosum being misidentified as the closely related Corynebacterium minutissimum. PMID:22075579
Khayhan, Kantarawee; Hagen, Ferry; Norkaew, Treepradab; Puengchan, Tanpalang; Boekhout, Teun; Sriburee, Pojana
2017-04-01
The pathogenic yeast Cryptococcus gattii was isolated from a tree hollow of a Castanopsis argyrophylla King ex Hook.f. (Fagaceae) in Chiang Mai, Thailand. Molecular characterization with amplified fragment length polymorphism analysis and multi-locus sequence typing showed that this isolate belonged to genotype AFLP4/VGI representing C. gattii sensu stricto. Subsequent comparison of the environmental isolate with those from clinical samples from Thailand showed that they grouped closely together in a single cluster.
Voelker, T A; Staswick, P; Chrispeels, M J
1986-12-01
Phytohemagglutinin (PHA), the seed lectin of the common bean, Phaseolus vulgaris, is encoded by two highly homologous, tandemly linked genes, dlec1 and dlec2, which are coordinately expressed at high levels in developing cotyledons. Their respective transcripts translate into closely related polypeptides, PHA-E and PHA-L, constituents of the tetrameric lectin which accumulates at high levels in developing seeds. In the bean cultivar Pinto UI111, PHA-E is not detectable, and PHA-L accumulates at very reduced levels. To investigate the cause of the Pinto phenotype, we cloned and sequenced the two PHA genes of Pinto, called Pdlec1 and Pdlec2, and determined the abundance of their respective mRNAs in developing cotyledons. Both genes are more than 90% homologous to the normal PHA genes found in other cultivars. Pdlec1 carries a 1-bp frameshift mutation close to the 5' end of its coding sequence. Only very truncated polypeptides could be made from its mRNA. The gene Pdlec2 encodes a polypeptide, which resembles PHA-L and its predicted amino acid sequence agrees with the available Pinto PHA amino acid sequence data. Analysis of the mRNA of developing cotyledons revealed that the Pdlec1 message is reduced 600-fold, and Pdlec2 mRNA is reduced 20-fold with respect to mRNA levels in normal cultivars. A comparison of the sequences which are upstream from the coding sequence shows that Pdlec2 has a 100-bp deletion compared to the other genes (dlec1, dlec2 and Pdlec1). This deletion which contains a large tandem repeat may be responsible for the low level of expression of Pdlec2. The very low expression of Pdlec1 is as yet unexplained.
Hesse, Cedar N; Torres-Cruz, Terry J; Tobias, Terri Billingsley; Al-Matruk, Maryam; Porras-Alfaro, Andrea; Kuske, Cheryl R
Soil fungal communities are responsible for carbon and nitrogen (N) cycling. The high complexity of the soil fungal community and the high proportion of taxonomically unidentifiable sequences confound ecological interpretations in field studies because physiological information is lacking for many organisms known only by their rRNA sequences. This situation forces experimental comparisons to be made at broader taxonomic ranks where functions become difficult to infer. The objective of this study was to determine OTU (operational taxonomic unit) level responses of the soil fungal community to N enrichment in a temperate pine forest experiment and to use the sequencing data to guide culture efforts of novel N-responsive fungal taxa. Replicate samples from four soil horizons (up to 10 cm depth) were obtained from ambient, enriched CO2 and N-fertilization plots. Through a fungal large subunit rRNA gene (LSU) sequencing survey, we identified two novel fungal clades that were abundant in our soil sampling (representing up to 27% of the sequences in some samples) and responsive to changes in soil N. The two N-responsive taxa with no predicted taxonomic association were targeted for isolation and culturing from specific soil samples where their sequences were abundant. Representatives of both OTUs were successfully cultured using a filtration approach. One taxon (OTU6) was most closely related to Saccharomycotina; the second taxon (OTU69) was most closely related to Mucoromycotina. Both taxa likely represent novel species. This study shows how observation of specific OTU-level responses to altered N status in a large rRNA gene field survey provided the impetus to design targeted culture approaches for isolation of novel N-responsive fungal taxa.
Multidrug-Resistant Candida haemulonii and C. auris, Tel Aviv, Israel.
Ben-Ami, Ronen; Berman, Judith; Novikov, Ana; Bash, Edna; Shachor-Meyouhas, Yael; Zakin, Shiri; Maor, Yasmin; Tarabia, Jalal; Schechner, Vered; Adler, Amos; Finn, Talya
2017-02-01
Candida auris and C. haemulonii are closely related, multidrug-resistant emerging fungal pathogens that are not readily distinguishable with phenotypic assays. We studied C. auris and C. haemulonii clinical isolates from 2 hospitals in central Israel. C. auris was isolated in 5 patients with nosocomial bloodstream infection, and C. haemulonii was found as a colonizer of leg wounds at a peripheral vascular disease clinic. Liberal use of topical miconazole and close contact among patients were implicated in C. haemulonii transmission. C. auris exhibited higher thermotolerance, virulence in a mouse infection model, and ATP-dependent drug efflux activity than C. haemulonii. Comparison of ribosomal DNA sequences found that C. auris strains from Israel were phylogenetically distinct from isolates from East Asia, South Africa and Kuwait, whereas C. haemulonii strains from different countries were closely interrelated. Our findings highlight the pathogenicity of C. auris and underscore the need to limit its spread.
Locating Sequence on FPC Maps and Selecting a Minimal Tiling Path
Engler, Friedrich W.; Hatfield, James; Nelson, William; Soderlund, Carol A.
2003-01-01
This study discusses three software tools: the first two aid in integrating sequence with an FPC physical map, and the third automatically selects a minimal tiling path given genomic draft sequence and BAC end sequences. The first tool, FSD (FPC Simulated Digest), takes a sequenced clone and adds it back to the map based on a fingerprint generated by an in silico digest of the clone. This allows verification of sequenced clone positions and the integration of sequenced clones that were not originally part of the FPC map. The second tool, BSS (Blast Some Sequence), takes a query sequence and positions it on the map based on sequence associated with the clones in the map. BSS has multiple uses as follows: (1) When the query is a file of marker sequences, they can be added as electronic markers. (2) When the query is draft sequence, the results of BSS can be used to close gaps in a sequenced clone or the physical map. (3) When the query is a sequenced clone and the target is BAC end sequences, one may select the next clone for sequencing using both sequence comparison results and map location. (4) When the query is whole-genome draft sequence and the target is BAC end sequences, the results can be used to select many clones for a minimal tiling path at once. The third tool, pickMTP, automates the majority of this last usage of BSS. Results are presented using the rice FPC map, BAC end sequences, and whole-genome shotgun from Syngenta. PMID:12915486
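The in silico digest that FSD relies on can be illustrated with a minimal sketch. This is not the FSD implementation: the function name `in_silico_digest`, the toy clone sequence, and the choice of the HindIII site (AAGCTT, cut after the first A) are assumptions made for illustration, and the clone is treated as a linear molecule.

```python
def in_silico_digest(sequence, site="AAGCTT", cut_offset=1):
    """Return the sorted fragment lengths obtained by cutting a linear
    sequence at every occurrence of a restriction site.

    site and cut_offset are illustrative assumptions: AAGCTT with an
    offset of 1 models HindIII, which cuts after the first A.
    """
    seq = sequence.upper()
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    # Fragment boundaries are the two sequence ends plus every cut position.
    boundaries = [0] + cut_positions + [len(seq)]
    return sorted(b - a for a, b in zip(boundaries, boundaries[1:]) if b > a)


# Hypothetical 24-base "clone" containing two HindIII sites.
clone = "GGGGAAGCTTCCCCCCAAGCTTTT"
fingerprint = in_silico_digest(clone)
print(fingerprint)
```

The sorted list of fragment sizes plays the role of the simulated fingerprint; a real comparison against gel-derived fingerprints would also need a size tolerance, which this sketch leaves out.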
Transcriptome Analysis and Comparison of Marmota monax and Marmota himalayana.
Liu, Yanan; Wang, Baoju; Wang, Lu; Vikash, Vikash; Wang, Qin; Roggendorf, Michael; Lu, Mengji; Yang, Dongliang; Liu, Jia
2016-01-01
The Eastern woodchuck (Marmota monax) is a classical animal model for studying hepatitis B virus (HBV) infection and hepatocellular carcinoma (HCC) in humans. Recently, we found that Marmota himalayana, an Asian animal species closely related to Marmota monax, is susceptible to woodchuck hepatitis virus (WHV) infection and can be used as a new mammalian model for HBV infection. However, the lack of genomic sequence information of both Marmota models strongly limited their application breadth and depth. To address this major obstacle of the Marmota models, we utilized Illumina RNA-Seq technology to sequence the cDNA libraries of liver and spleen samples of two Marmota monax and four Marmota himalayana. In total, over 13 billion nucleotide bases were sequenced and approximately 1.5 billion clean reads were obtained. Following assembly, 106,496 consensus sequences of Marmota monax and 78,483 consensus sequences of Marmota himalayana were detected. For functional annotation, in total 73,603 Unigenes of Marmota monax and 78,483 Unigenes of Marmota himalayana were identified using different databases (NR, NT, Swiss-Prot, KEGG, COG, GO). The Unigenes were aligned by blastx to protein databases to decide the coding DNA sequences (CDS) and in total 41,247 CDS of Marmota monax and 34,033 CDS of Marmota himalayana were predicted. The single nucleotide polymorphisms (SNPs) and the simple sequence repeats (SSRs) were also analyzed for all Unigenes obtained. Moreover, a large-scale transcriptome comparison was performed and revealed a high similarity in transcriptome sequences between the two marmota species. Our study provides an extensive amount of novel sequence information for Marmota monax and Marmota himalayana. 
This information may serve as a valuable genomics resource for further molecular, developmental and comparative evolutionary studies, as well as for the identification and characterization of functional genes that are involved in WHV infection and HCC development in the woodchuck model. PMID:27806133
Suh, Sung-Oui; Zhou, Jianlong
2010-07-01
Seven yeast strains were isolated from the body surface and galleries of Xyloterinus politus, the ambrosia beetle that attacks black oak trees. Based on rDNA sequence comparisons and other taxonomic characteristics, five of the strains were identified as members of the species Saccharomycopsis microspora, Wickerhamomyces hampshirensis and Candida mycetangii, which have been reported previously as being associated with insects. The remaining two yeast strains were proposed as representatives of two novel species, Candida xyloterini sp. nov. (type strain ATCC 62898(T)=CBS 11547(T)) and Candida palmyrensis sp. nov. (type strain ATCC 62899(T)=CBS 11546(T)). C. xyloterini sp. nov. is a close sister taxon to Ogataea dorogensis and assimilates methanol as a sole carbon source but lacks ascospores. On the other hand, C. palmyrensis sp. nov. is phylogenetically distinct from any other ambrosia yeast reported so far. The species was placed near Candida sophiae-reginae and Candida beechii based on DNA sequence analyses, but neither of these were close sister taxa to C. palmyrensis sp. nov.
Genome data from a sixteenth century pig illuminate modern breed relationships
Ramírez, O; Burgos-Paz, W; Casas, E; Ballester, M; Bianco, E; Olalde, I; Santpere, G; Novella, V; Gut, M; Lalueza-Fox, C; Saña, M; Pérez-Enciso, M
2015-01-01
Ancient DNA (aDNA) provides direct evidence of historical events that have modeled the genome of modern individuals. In livestock, resolving the differences between the effects of initial domestication and of subsequent modern breeding is not straightforward without aDNA data. Here, we have obtained shotgun genome sequence data from a sixteenth century pig from Northeastern Spain (Montsoriu castle); the ancient pig was obtained from an extremely well-preserved and diverse assemblage. In addition, we provide the sequence of three new modern genomes from an Iberian pig, Spanish wild boar and a Guatemalan Creole pig. Comparison with both mitochondrial and autosomal genome data shows that the ancient pig is closely related to extant Iberian pigs and to European wild boar. Although the ancient sample was clearly domestic, admixture with wild boar also occurred, according to the D-statistics. The close relationship between Iberian, European wild boar and the ancient pig confirms that Asian introgression in modern Iberian pigs has not existed or has been negligible. In contrast, the Guatemalan Creole pig clusters apart from the Iberian pig genome, likely due to introgression from international breeds. PMID:25204303
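The D-statistic cited in the ancient pig abstract above is commonly computed from counts of ABBA and BABA site patterns across four lineages (P1, P2, P3 and an outgroup). The sketch below is a toy version under stated assumptions, not the study's pipeline: it takes one aligned haploid sequence per lineage, the function name and the example sequences are hypothetical, and significance testing (usually a block jackknife) is omitted.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Toy ABBA-BABA D-statistic from four aligned haploid sequences.

    D = (nABBA - nBABA) / (nABBA + nBABA), where the outgroup allele is
    taken as ancestral (A) and the other allele as derived (B).
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if len({a, b, c, o}) != 2:
            continue  # only biallelic sites are informative
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    total = abba + baba
    return (abba - baba) / total if total else 0.0


# Hypothetical four-site alignment: three ABBA sites and one BABA site.
d = d_statistic("AGCA", "CTAG", "CTCG", "AGAA")
print(d)
```

A D value near zero is consistent with no excess allele sharing, while a clearly positive or negative value points to gene flow between P3 and one of the other two lineages.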
Automated Finishing with Autofinish
Gordon, David; Desmarais, Cindy; Green, Phil
2001-01-01
Currently, the genome sequencing community is producing shotgun sequence data at a very high rate, but finishing (collecting additional directed sequence data to close gaps and improve the quality of the data) is not matching that rate. One reason for the difference is that shotgun sequencing is highly automated but finishing is not: Most finishing decisions, such as which directed reads to obtain and which specialized sequencing techniques to use, are made by people. If finishing rates are to increase to match shotgun sequencing rates, most finishing decisions also must be automated. The Autofinish computer program (which is part of the Consed computer software package) does this by automatically choosing finishing reads. Autofinish is able to suggest most finishing reads required for completion of each sequencing project, greatly reducing the amount of human attention needed. Autofinish sometimes completely finishes the project, with no human decisions required. It cannot solve the most complex problems, so we recommend that Autofinish be allowed to suggest reads for the first three rounds of finishing, and if the project still is not finished completely, a human finisher complete the work. We compared this Autofinish-Hybrid method of finishing against a human finisher in five different projects with a variety of shotgun depths by finishing each project twice—once with each method. This comparison shows that the Autofinish-Hybrid method saves many hours over a human finisher alone, while using roughly the same number and type of reads and closing gaps at roughly the same rate. Autofinish currently is in production use at several large sequencing centers. It is designed to be adaptable to the finishing strategy of the lab—it can finish using some or all of the following: resequencing reads, reverses, custom primer walks on either subclone templates or whole clone templates, PCR, or minilibraries. 
Autofinish has been used for finishing cDNA, genomic clones, and whole bacterial genomes (see http://www.phrap.org). PMID:11282977
Gilchrist, Anthony Stuart; Shearman, Deborah C A; Frommer, Marianne; Raphael, Kathryn A; Deshpande, Nandan P; Wilkins, Marc R; Sherwin, William B; Sved, John A
2014-12-20
The tephritid fruit flies include a number of economically important pests of horticulture, with a large accumulated body of research on their biology and control. Amongst the Tephritidae, the genus Bactrocera, containing over 400 species, presents various species groups of potential utility for genetic studies of speciation, behaviour or pest control. In Australia, there exists a triad of closely-related, sympatric Bactrocera species which do not mate in the wild but which, despite distinct morphologies and behaviours, can be force-mated in the laboratory to produce fertile hybrid offspring. To exploit the opportunities offered by genomics, such as the efficient identification of genetic loci central to pest behaviour and to the earliest stages of speciation, investigators require genomic resources for future investigations. We produced a draft de novo genome assembly of Australia's major tephritid pest species, Bactrocera tryoni. The male genome (650-700 Mbp) includes approximately 150 Mb of interspersed repetitive DNA sequences and 60 Mb of satellite DNA. Assessment using conserved core eukaryotic sequences indicated 98% completeness. Over 16,000 MAKER-derived gene models showed a large degree of overlap with other Dipteran reference genomes. The sequence of the ribosomal RNA transcribed unit was also determined. Unscaffolded assemblies of B. neohumeralis and B. jarvisi were then produced; comparison with B. tryoni showed that the species are more closely related than any Drosophila species pair. The similarity of the genomes was exploited to identify 4924 potentially diagnostic indels between the species, all of which occur in non-coding regions. This first draft B. tryoni genome resembles other dipteran genomes in terms of size and putative coding sequences. 
For all three species included in this study, we have identified a comprehensive set of non-redundant repetitive sequences, including the ribosomal RNA unit, and have quantified the major satellite DNA families. These genetic resources will facilitate the further investigations of genetic mechanisms responsible for the behavioural and morphological differences between these three species and other tephritids. We have also shown how whole genome sequence data can be used to generate simple diagnostic tests between very closely-related species where only one of the species is scaffolded.
2013-01-01
Background The Brassica B genome is known to carry several important traits, yet there have been limited analyses of its underlying genome structure, especially in comparison to the closely related A and C genomes. A bacterial artificial chromosome (BAC) library of Brassica nigra was developed and screened with 17 genes from a 222 kb region of A. thaliana that had been well characterised in both the Brassica A and C genomes. Results Fingerprinting of 483 apparently non-redundant clones defined physical contigs for the corresponding regions in B. nigra. The target region is duplicated in A. thaliana and six homologous contigs were found in B. nigra resulting from the whole genome triplication event shared by the Brassiceae tribe. BACs representative of each region were sequenced to elucidate the level of microscale rearrangements across the Brassica species divide. Conclusions Although the B genome species separated from the A/C lineage some 6 Mya, comparisons between the three paleopolyploid Brassica genomes revealed extensive conservation of gene content and sequence identity. The level of fractionation or gene loss varied across genomes and genomic regions; however, the greatest loss of genes was observed to be common to all three genomes. One large-scale chromosomal rearrangement differentiated the B genome suggesting such events could contribute to the lack of recombination observed between B genome species and those of the closely related A/C lineage. PMID:23586706
Comparative and functional genomics provide insights into the pathogenicity of dermatophytic fungi
2011-01-01
Background Millions of humans and animals suffer from superficial infections caused by a group of highly specialized filamentous fungi, the dermatophytes, which exclusively infect keratinized host structures. To provide broad insights into the molecular basis of the pathogenicity-associated traits, we report the first genome sequences of two closely phylogenetically related dermatophytes, Arthroderma benhamiae and Trichophyton verrucosum, both of which induce highly inflammatory infections in humans. Results 97% of the 22.5 megabase genome sequences of A. benhamiae and T. verrucosum are unambiguously alignable and collinear. To unravel dermatophyte-specific virulence-associated traits, we compared sets of potentially pathogenicity-associated proteins, such as secreted proteases and enzymes involved in secondary metabolite production, with those of closely related onygenales (Coccidioides species) and the mould Aspergillus fumigatus. The comparisons revealed expansion of several gene families in dermatophytes and disclosed the peculiarities of the dermatophyte secondary metabolite gene sets. Secretion of proteases and other hydrolytic enzymes by A. benhamiae was proven experimentally by a global secretome analysis during keratin degradation. Molecular insights into the interaction of A. benhamiae with human keratinocytes were obtained for the first time by global transcriptome profiling. Given that A. benhamiae is able to undergo mating, a detailed comparison of the genomes further unraveled the genetic basis of sexual reproduction in this species. Conclusions Our results enlighten the genetic basis of fundamental and putatively virulence-related traits of dermatophytes, advancing future research on these medically important pathogens. PMID:21247460
2011-01-01
Background Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Malus and Prunus had 784 markers in common, and Malus and Fragaria had 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae. PMID:21226921
DNA Barcodes of Asian Houbara Bustard (Chlamydotis undulata macqueenii)
Arif, Ibrahim A.; Khan, Haseeb A.; Williams, Joseph B.; Shobrak, Mohammad; Arif, Waad I.
2012-01-01
Populations of Houbara Bustards have dramatically declined in recent years. Captive breeding and reintroduction programs have had limited success in reviving population numbers and thus new technological solutions involving molecular methods are essential for the long term survival of this species. In this study, we sequenced the 694 bp segment of COI gene of the four specimens of Asian Houbara Bustard (Chlamydotis undulata macqueenii). We also compared these sequences with earlier published barcodes of 11 individuals comprising different families of the orders Gruiformes, Ciconiiformes, Podicipediformes and Crocodylia (out group). The pair-wise sequence comparison showed a total of 254 variable sites across all the 15 sequences from different taxa. Three of the four specimens of Houbara Bustard had an identical sequence of COI gene and one individual showed a single nucleotide difference (G > A transition at position 83). Within the bustard family (Otididae), comparison among the three species (Asian Houbara Bustard, Great Bustard (Otis tarda) and the Little Bustard (Tetrax tetrax)), representing three different genera, showed 116 variable sites. For another family (Rallidae), the intra-family variable sites among the individuals of four different genera were found to be 146. The COI genetic distances among the 15 individuals varied from 0.000 to 0.431. Phylogenetic analysis using 619 bp nucleotide segment of COI clearly discriminated all the species representing different genera, families and orders. All the four specimens of Houbara Bustard formed a single clade and are clearly separated from other two individuals of the same family (Otis tarda and Tetrax tetrax). The nucleotide sequence of partial segment of COI gene effectively discriminated the closely related species. This is the first study reporting the barcodes of Houbara Bustard and would be helpful in future molecular studies, particularly for the conservation of this threatened bird in Saudi Arabia. 
PMID:22408462
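The variable-site counts and pairwise genetic distances reported in the Houbara Bustard barcoding abstract above can be illustrated with a short sketch. The abstract does not state which distance model was used, so the uncorrected p-distance below is a simpler stand-in chosen for illustration; the function names and toy sequences are hypothetical, and the sequences are assumed to be pre-aligned and of equal length.

```python
def variable_sites(aligned_seqs):
    """Return 0-based column indices at which the aligned sequences differ."""
    return [i for i, column in enumerate(zip(*aligned_seqs))
            if len(set(column)) > 1]


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: the proportion of aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(x != y for x, y in zip(seq_a, seq_b))
    return differences / len(seq_a)


# Hypothetical 8-base barcode fragments; the first two are identical.
barcodes = ["ACGTACGT", "ACGTACGT", "ACATACGT", "TCGTACGA"]
sites = variable_sites(barcodes)
print(sites)
print(p_distance(barcodes[0], barcodes[1]))
print(p_distance(barcodes[0], barcodes[3]))
```

Identical specimens give a distance of 0, matching the lower bound of the range quoted in the abstract; model-corrected distances would be larger than p-distances for divergent pairs.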
Kumar, Rakesh; Mandal, B; Geetanjali, A S; Jain, R K; Jaiwal, P K
2010-08-01
Watermelon bud necrosis virus (WBNV), a member of the genus Tospovirus, family Bunyaviridae is an important viral pathogen in watermelon cultivation in India. The complete genome sequence properties of WBNV are not available. In the present study, the complete M RNA sequence and the genome organisation of a WBNV isolate infecting watermelon in Delhi (WBNV-wDel) were determined. The M RNA was 4,794 nucleotides (nt) long and potentially coded for a movement protein (NSm) of 34.22 kDa (307 amino acids) on the viral sense strand and a Gn/Gc glycoprotein precursor of 127.15 kDa (1,121 amino acids) on the complementary strand. The two open reading frames were separated by an intergenic region of 402 nt. The 5' and 3' untranslated regions were 55 and 47 nt long, respectively, containing complementary termini typical of tospoviruses. WBNV-wDel was most closely related (79.1% identity) to Groundnut bud necrosis virus, an important tospovirus that occurs in several crops in India, and was different (63.3-75.2% identity) from the other cucurbit-infecting tospoviruses known to occur in Taiwan and Japan. Sequence analysis of NSm and Gn/Gc revealed phylogenetic incongruence between WBNV-wDel and another isolate originating from central India (WBNV-Wm-Som isolate). The Wm-Som isolate showed evolutionary divergence from the wDel isolate in the Gn/Gc protein (74.6% identity) potentially due to recombination with the other tospoviruses that are known to occur in India. This is the first report of a comparison of complete sequences of M RNA of WBNV.
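The segment lengths quoted in the WBNV abstract above can be cross-checked with simple arithmetic. The sketch assumes that each open reading frame ends in one stop codon that is not counted in the amino acid length and that the two reading frames do not overlap; the variable names are illustrative only.

```python
def coding_length_nt(amino_acids):
    """Nucleotides in an ORF: three per amino acid plus one stop codon."""
    return (amino_acids + 1) * 3


utr_5_prime = 55                     # nt, from the abstract
utr_3_prime = 47                     # nt, from the abstract
intergenic_region = 402              # nt, from the abstract
nsm_orf = coding_length_nt(307)      # NSm, 307 amino acids
gn_gc_orf = coding_length_nt(1121)   # Gn/Gc precursor, 1,121 amino acids

total_nt = utr_5_prime + nsm_orf + intergenic_region + gn_gc_orf + utr_3_prime
print(nsm_orf, gn_gc_orf, total_nt)
```

Under these assumptions the parts add up to the 4,794 nt reported for the M RNA, so the figures in the abstract are internally consistent.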
Evolutionary distances in the twilight zone--a rational kernel approach.
Schwarz, Roland F; Fletcher, William; Förster, Frank; Merget, Benjamin; Wolf, Matthias; Schultz, Jörg; Markowetz, Florian
2010-12-31
Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets.
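As a point of reference for the alignment-free string comparisons mentioned in the abstract above, the sketch below computes a k-mer spectrum kernel and the distance it induces. This is not the transducer-based metric the authors propose: it is a generic baseline, chosen here because a spectrum kernel is an inner product of k-mer counts and is therefore positive semi-definite. The function names, the default k = 3 and the example sequences are assumptions.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k=3):
    """Inner product of the k-mer count vectors of x and y."""
    counts_x, counts_y = kmer_counts(x, k), kmer_counts(y, k)
    return sum(counts_x[w] * counts_y[w] for w in counts_x)


def kernel_distance(x, y, k=3):
    """Distance induced by the kernel: sqrt(K(x,x) + K(y,y) - 2 K(x,y))."""
    squared = (spectrum_kernel(x, x, k) + spectrum_kernel(y, y, k)
               - 2 * spectrum_kernel(x, y, k))
    return sqrt(max(squared, 0.0))  # guard against tiny negative rounding


# Hypothetical sequences differing by one substitution.
seq_x, seq_y = "ACGTACGT", "ACGTTCGT"
print(spectrum_kernel(seq_x, seq_y))
print(kernel_distance(seq_x, seq_y))
```

Such a score ignores substitution models and indels entirely, which is the limitation the abstract attributes to alignment-free methods in general.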
Klein, Donald A.; Flores, Romeo M.; Venot, Christophe; Gabbert, Kendra; Schmidt, Raleigh; Stricker, Gary D.; Pruden, Amy; Mandernack, Kevin
2008-01-01
Coalbed methane regeneration is of increasing interest, and is gaining global attention with respect to enhancement of gas recovery. The objective of this study is to determine if there are differences in methanogen nucleic acid sequences associated with low rank coals from the Powder River Basin, Wyoming, in comparison with sequences that can be recovered from coal bed-associated produced waters. Based on results obtained to date, the sequences from the coals appear to be associated with putatively deep-rooted thermophilic autotrophic methanogens, whereas the sequences from the waters are associated with thermophilic autotrophic and heterotrophic methanogens. The recovered sequences associated with coal thus appear to be both phylogenetically and functionally distinct from those that are more closely associated with the produced water. To be able to relate such recovered sequences to organisms that might be present and possibly active in these environments, it is suggested that direct observation, followed by isolation and single cell-based physiological/molecular analyses, be used to characterize methanogenic consortia possibly associated with coals and/or produced waters. It is also important to characterize the microenvironment where these microbes might be found, in both ecological and geological contexts, to be able to develop effective, ecologically relevant coalbed methane regeneration processes.
Murat, Claude; Zampieri, Elisa; Vallino, Marta; Daghino, Stefania; Perotto, Silvia; Bonfante, Paola
2011-05-01
Characterization of genomic variation among different microbial species, or different strains of the same species, is a field of significant interest with a wide range of potential applications. We have investigated the genomic variation in mycorrhizal fungal genomes through genomic suppressive subtractive hybridization. The comparison was between phylogenetically distant and close truffle species (Tuber spp.), and between isolates of the ericoid mycorrhizal fungus Oidiodendron maius featuring different degrees of metal tolerance. In the interspecies experiment, almost all the sequences that were identified in the Tuber melanosporum genome and absent in Tuber borchii and Tuber indicum corresponded to transposable elements. In the intraspecies comparison, some specific sequences corresponded to regions coding for enzymes, among them a glutathione synthetase known to be involved in metal tolerance. This approach is a quick and rather inexpensive tool to develop molecular markers for mycorrhizal fungi tracking and barcoding, to identify functional genes and to investigate the genome plasticity, adaptation and evolution. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
Arlt, Martin F.; Ozdemir, Alev Cagla; Birkeland, Shanda R.; Lyons, Robert H.; Glover, Thomas W.; Wilson, Thomas E.
2011-01-01
Copy-number variants (CNVs) are a major source of genetic variation in human health and disease. Previous studies have implicated replication stress as a causative factor in CNV formation. However, existing data are technically limited in the quality of comparisons that can be made between human CNVs and experimentally induced variants. Here, we used two high-resolution strategies—single nucleotide polymorphism (SNP) arrays and mate-pair sequencing—to compare CNVs that occur constitutionally to those that arise following aphidicolin-induced DNA replication stress in the same human cells. Although the optimized methods provided complementary information, sequencing was more sensitive to small variants and provided superior structural descriptions. The majority of constitutional and all aphidicolin-induced CNVs appear to be formed via homology-independent mechanisms, while aphidicolin-induced CNVs were of a larger median size than constitutional events even when mate-pair data were considered. Aphidicolin thus appears to stimulate formation of CNVs that closely resemble human pathogenic CNVs and the subset of larger nonhomologous constitutional CNVs. PMID:21212237
1985-01-01
We have determined the DNA sequence of a gene encoding a thymus leukemia (TL) antigen in the BALB/c mouse, and have more definitively mapped the cloned BALB/c Tla-region class I gene clusters. Analysis of the sequence shows that the Tla gene is less closely related to the H-2 genes than H-2 genes are to one another or to Qa-2,3-region genes. The Tla gene, 17.3A, contains an apparent gene conversion. Comparison of the BALB/c Tla genes with those from C57BL shows that BALB/c has more Tla-region class I genes, and that one of the genes absent in C57BL is gene 17.3A. PMID:3894562
Berstein, R M; Schluter, S F; Shen, S; Marchalonis, J J
1996-04-16
All immunoglobulins and T-cell receptors throughout phylogeny share regions of highly conserved amino acid sequence. To identify possible primitive immunoglobulins and immunoglobulin-like molecules, we utilized 3' RACE (rapid amplification of cDNA ends) and a highly conserved constant region consensus amino acid sequence to isolate a new immunoglobulin class from the sandbar shark Carcharhinus plumbeus. The immunoglobulin, termed IgW, in its secreted form consists of 782 amino acids and is expressed in both the thymus and the spleen. The molecule overall most closely resembles mu chains of the skate and human and a new putative antigen binding molecule isolated from the nurse shark (NAR). The full-length IgW chain has a variable region resembling human and shark heavy-chain (VH) sequences and a novel joining segment containing the WGXGT motif characteristic of H chains. However, unlike any other H-chain-type molecule, it contains six constant (C) domains. The first C domain contains the cysteine residue characteristic of C mu1 that would allow dimerization with a light (L) chain. The fourth and sixth domains also contain comparable cysteines that would enable dimerization with other H chains or homodimerization. Comparison of the sequences of IgW V and C domains shows homology greater than that found in comparisons among VH and C mu or VL, or CL thereby suggesting that IgW may retain features of the primordial immunoglobulin in evolution.
McCourt, Clare M; McArt, Darragh G; Mills, Ken; Catherwood, Mark A; Maxwell, Perry; Waugh, David J; Hamilton, Peter; O'Sullivan, Joe M; Salto-Tellez, Manuel
2013-01-01
Next Generation Sequencing (NGS) has the potential of becoming an important tool in clinical diagnosis and therapeutic decision-making in oncology owing to its enhanced sensitivity in DNA mutation detection, fast turnaround of samples in comparison to current gold standard methods and the potential to sequence a large number of cancer-driving genes at one time. We aim to test the diagnostic accuracy of current NGS technology in the analysis of mutations that represent current standard-of-care, and its reliability to generate concomitant information on other key genes in human oncogenesis. Thirteen clinical samples (8 lung adenocarcinomas, 3 colon carcinomas and 2 malignant melanomas) already genotyped for EGFR, KRAS and BRAF mutations by current standard-of-care methods (Sanger Sequencing and q-PCR), were analysed for detection of mutations in the same three genes using two NGS platforms and an additional 43 genes with one of these platforms. The results were analysed using closed platform-specific proprietary bioinformatics software as well as open third party applications. Our results indicate that the existing format of the NGS technology performed well in detecting the clinically relevant mutations stated above but may not be reliable for a broader unsupervised analysis of the wider genome in its current design. Our study represents a diagnostically led validation of the major strengths and weaknesses of this technology before consideration for diagnostic use.
Two Different Rickettsial Bacteria Invading Volvox carteri
Kawafune, Kaoru; Hongoh, Yuichi; Hamaji, Takashi; Sakamoto, Tomoaki; Kurata, Tetsuya; Hirooka, Shunsuke; Miyagishima, Shin-ya; Nozaki, Hisayoshi
2015-01-01
Background: Bacteria of the family Rickettsiaceae are principally associated with arthropods. Recently, endosymbionts of the Rickettsiaceae have been found in non-phagotrophic cells of the volvocalean green algae Carteria cerasiformis, Pleodorina japonica, and Volvox carteri. Such endosymbionts were present in only C. cerasiformis strain NIES-425 and V. carteri strain UTEX 2180, of various strains of Carteria and V. carteri examined, suggesting that rickettsial endosymbionts may have been transmitted to only a few algal strains very recently. However, in preliminary work, we detected a sequence similar to that of a rickettsial gene in the nuclear genome of V. carteri strain EVE. Methodology/Principal Findings: Here we explored the origin of the rickettsial gene-like sequences in the endosymbiont-lacking V. carteri strain EVE, by performing comparative analyses on 13 strains of V. carteri. By reference to our ongoing genomic sequence of rickettsial endosymbionts in C. cerasiformis strain NIES-425 cells, we confirmed that an approximately 9-kbp DNA sequence encompassing a region similar to that of four rickettsial genes was present in the nuclear genome of V. carteri strain EVE. Phylogenetic analyses, and comparisons of the synteny of rickettsial gene-like sequences from various strains of V. carteri, indicated that the rickettsial gene-like sequences in the nuclear genome of V. carteri strain EVE were closely related to rickettsial gene sequences of P. japonica, rather than those of V. carteri strain UTEX 2180. Conclusion/Significance: At least two different rickettsial organisms may have invaded the V. carteri lineage, one of which may be the direct ancestor of the endosymbiont of V. carteri strain UTEX 2180, whereas the other may be closely related to the endosymbiont of P. japonica. Endosymbiotic gene transfer from the latter rickettsial organism may have occurred in an ancestor of V. carteri. Thus, the rickettsiae may be widely associated with V. carteri, and likely have often been lost during host evolution. PMID:25671568
Vibration-Rotation Bands of HF and DF
1977-09-23
12a. Comparison of Observed and Calculated Line Positions of HF, Δv = 1 Sequence ... 12b. Comparison of Observed and...Calculated Line Positions of HF, Δv = 2 Sequence ... 12c. Comparison of Observed and Calculated Line Positions of HF, Δv = 3...Sequence ... 12d. Comparison of Observed and Calculated Line Positions of HF, Δv = 4 Sequence ...
Complete genome sequences of Geobacillus sp. WCH70, a thermophilic strain isolated from wood compost
Brumm, Phillip; Land, Miriam L.; Mead, David
2016-04-27
Geobacillus sp. WCH70 was one of several thermophilic organisms isolated from hot composts in the Middleton, WI area. Comparison of 16S rRNA sequences showed the strain may be a new species, and is most closely related to G. galactosidasius and G. toebii. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2009 (CP001638). The genome of Geobacillus species WCH70 consists of one circular chromosome of 3,893,306 bp with an average G + C content of 43 %, and two circular plasmids of 33,899 and 10,287 bp with an average G + C content of 40 %. Among sequenced organisms, Geobacillus sp. WCH70 shares highest Average Nucleotide Identity (86 %) with G. thermoglucosidasius strains, as well as similar genome organization. Geobacillus sp. WCH70 appears to be a highly adaptable organism, with an exceptionally high 125 annotated transposons in the genome. The organism also possesses four predicted restriction-modification systems not found in other Geobacillus species.
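As a rough illustration of the quantities reported in this abstract, the Python sketch below sums the three replicon sizes and shows how a G + C percentage is computed from a sequence. The function name gc_content, the replicon labels, and the toy sequence are assumptions made for illustration; they do not come from the original record.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C content of a nucleotide sequence as a percentage."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


# Replicon sizes quoted in the abstract; the dictionary keys are hypothetical labels.
REPLICON_SIZES_BP = {
    "chromosome": 3_893_306,
    "plasmid_1": 33_899,
    "plasmid_2": 10_287,
}

# Total genome size is the sum of the chromosome and both plasmids.
total_genome_bp = sum(REPLICON_SIZES_BP.values())
print(f"Total genome size: {total_genome_bp:,} bp")

# Toy sequence (not real WCH70 data) to show the G + C calculation.
print(f"G + C of toy sequence: {gc_content('ATGCGC'):.1f} %")
```

The total here is plain arithmetic on the published sizes; the G + C figures in the abstract would require the deposited sequence itself (accession CP001638) to reproduce.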
de Oliveira Ceita, Geruza; Vilas-Boas, Laurival Antônio; Castilho, Marcelo Santos; Carazzolle, Marcelo Falsarella; Pirovani, Carlos Priminho; Selbach-Schnadelbach, Alessandra; Gramacho, Karina Peres; Ramos, Pablo Ivan Pereira; Barbosa, Luciana Veiga; Pereira, Gonçalo Amarante Guimarães; Góes-Neto, Aristóteles
2014-10-01
The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches' broom disease of cocoa, causes countless damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied and the lanosterol 14α-demethylase gene (ERG11) that encodes the main enzyme of this pathway and is a target for fungicides was cloned, characterized molecularly and its phylogeny analyzed. ERG11 genomic DNA and cDNA were characterized and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea.
de Oliveira Ceita, Geruza; Vilas-Boas, Laurival Antônio; Castilho, Marcelo Santos; Carazzolle, Marcelo Falsarella; Pirovani, Carlos Priminho; Selbach-Schnadelbach, Alessandra; Gramacho, Karina Peres; Ramos, Pablo Ivan Pereira; Barbosa, Luciana Veiga; Pereira, Gonçalo Amarante Guimarães; Góes-Neto, Aristóteles
2014-01-01
The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches’ broom disease of cocoa, causes countless damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied and the lanosterol 14α-demethylase gene (ERG11) that encodes the main enzyme of this pathway and is a target for fungicides was cloned, characterized molecularly and its phylogeny analyzed. ERG11 genomic DNA and cDNA were characterized and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea. PMID:25505843
Lim, P O; Sears, B B
1992-01-01
The families within the class Mollicutes are distinguished by their morphologies, nutritional requirements, and abilities to metabolize certain compounds. Biosystematic classification of the plant-pathogenic mycoplasmalike organisms (MLOs) has been difficult because these organisms have not been cultured in vitro, and hence their nutritional requirements have not been determined nor have physiological characterizations been possible. To investigate the evolutionary relationship of the MLOs to other members of the class Mollicutes, a segment of a ribosomal protein operon was cloned and sequenced from an aster yellows-type MLO which is pathogenic for members of the genus Oenothera and from Acholeplasma laidlawii. The deduced amino acid sequence data from the rpl22 and rps3 genes indicate that the MLOs are more closely related to A. laidlawii than to animal mycoplasmas, confirming previous results from 16S rRNA sequence comparisons. This conclusion is also supported by the finding that the UGA codon is not read as a tryptophan codon in the MLO and A. laidlawii, in contrast to its usage in Mycoplasma capricolum. PMID:1556079
L'vov, D K; Al'khovskiĭ, S V; Shchelkanov, M Iu; Shchetinin, A M; Deriabin, P G; Aristova, V A; Gitel'man, A K; Samokhvalov, E I; Botikov, A G
2014-01-01
The Tyulek virus (TLKV) was isolated from the ticks Argas vulgaris Filippova, 1961 (Argasidae), collected from burrow biotopes in a multispecies bird colony in the Aksu river floodplain near Tyulek village (northern part of Chu Valley, Kyrgyzstan). Recently, TLKV was assigned to the Quaranfil group (including the Quaranfil virus (QRFV), Johnston Atoll virus (JAV), and Lake Chad virus), which forms the novel genus Quaranjavirus in the Orthomyxoviridae family. In this work, the complete genome sequence of TLKV (GenBank ID KJ438647-8) was determined using next-generation sequencing (Illumina platform). Comparison of deduced amino acid sequences shows a close relationship of TLKV with QRFV and JAV (86% and 84% identity for PB1 and about 70% for PB2 and PA, respectively). The identity level of TLKV and QRFV in the outer glycoprotein GP is 72% and 80% for nucleotide and amino acid sequences, respectively. The phylogenetic analysis showed that TLKV belongs to the genus Quaranjavirus in the family Orthomyxoviridae.
Luedin, Samuel M; Pothier, Joël F; Danza, Francesco; Storelli, Nicola; Frigaard, Niels-Ulrik; Wittwer, Matthias; Tonolla, Mauro
2018-01-01
"Thiodictyon syntrophicum" sp. nov. strain Cad16T is a photoautotrophic purple sulfur bacterium belonging to the family Chromatiaceae in the class Gammaproteobacteria. The type strain Cad16T was isolated from the chemocline of the alpine meromictic Lake Cadagno in Switzerland. Strain Cad16T represents a key species within this sulfur-driven bacterial ecosystem with respect to carbon fixation. The 7.74-Mbp genome of strain Cad16T has been sequenced and annotated. It encodes 6237 predicted protein sequences and 59 RNA sequences. Phylogenetic comparison based on 16S rRNA revealed that Thiodictyon elegans strain DSM 232T is the most closely related species. Genes involved in sulfur oxidation, central carbon metabolism and transmembrane transport were found. Notably, clusters of genes encoding the photosynthetic machinery and pigment biosynthesis are found on the 0.48 Mb plasmid pTs485. We provide a detailed insight into the Cad16T genome and analyze it in the context of the microbial ecosystem of Lake Cadagno.
Boosting antibody developability through rational sequence optimization.
Seeliger, Daniel; Schulz, Patrick; Litzenburger, Tobias; Spitz, Julia; Hoerer, Stefan; Blech, Michaela; Enenkel, Barbara; Studts, Joey M; Garidel, Patrick; Karow, Anne R
2015-01-01
The application of monoclonal antibodies as commercial therapeutics poses substantial demands on stability and properties of an antibody. Therapeutic molecules that exhibit favorable properties increase the success rate in development. However, it is not yet fully understood how the protein sequence of an antibody translates into favorable in vitro molecule properties. In this work, computational design strategies based on heuristic sequence analysis were used to systematically modify an antibody that exhibited a tendency to precipitate in vitro. The resulting series of closely related antibodies showed improved stability as assessed by biophysical methods and long-term stability experiments. As a notable observation, expression levels also improved in comparison with the wild-type candidate. The methods employed to optimize the protein sequences, as well as the biophysical data used to determine the effect on stability under conditions commonly used in the formulation of therapeutic proteins, are described. Together, the experimental and computational data led to consistent conclusions regarding the effect of the introduced mutations. Our approach exemplifies how computational methods can be used to guide antibody optimization for increased stability.
Lam, Kelly Y C; Chan, Gallant K L; Xin, Gui-Zhong; Xu, Hong; Ku, Chuen-Fai; Chen, Jian-Ping; Yao, Ping; Lin, Huang-Quan; Dong, Tina T X; Tsim, Karl W K
2015-12-15
Cordyceps sinensis is an endoparasitic fungus widely used as a tonic and medicinal food in the practice of traditional Chinese medicine (TCM). In historical usage, Cordyceps refers specifically to the species C. sinensis. However, a number of closely related species are also named Cordyceps and are commonly sold as C. sinensis. Substitutes and adulterants of C. sinensis are often introduced into the herbal market, either intentionally or accidentally, which seriously compromises therapeutic effects and can even lead to life-threatening poisoning. Here, we aim to identify Cordyceps by DNA sequencing technology. Two different DNA-based approaches were compared. The internal transcribed spacer (ITS) sequences and the random amplified polymorphic DNA (RAPD)-sequence characterized amplified region (SCAR) were developed here to authenticate different species of Cordyceps. Both approaches generally enabled discrimination of C. sinensis from others. The application of the two methods, which support each other, increases the reliability of identification. For better reproducibility and faster analysis, the SCAR markers derived from the RAPD results provide a new method for quick authentication of Cordyceps.
Complete genome sequences of Geobacillus sp. WCH70, a thermophilic strain isolated from wood compost
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brumm, Phillip; Land, Miriam L.; Mead, David
Geobacillus sp. WCH70 was one of several thermophilic organisms isolated from hot composts in the Middleton, WI area. Comparison of 16S rRNA sequences showed the strain may be a new species, and is most closely related to G. galactosidasius and G. toebii. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2009 (CP001638). The genome of Geobacillus species WCH70 consists of one circular chromosome of 3,893,306 bp with an average G + C content of 43 %, and two circular plasmids of 33,899 and 10,287 bp with an average G + C content of 40 %. Among sequenced organisms, Geobacillus sp. WCH70 shares highest Average Nucleotide Identity (86 %) with G. thermoglucosidasius strains, as well as similar genome organization. Geobacillus sp. WCH70 appears to be a highly adaptable organism, with an exceptionally high 125 annotated transposons in the genome. The organism also possesses four predicted restriction-modification systems not found in other Geobacillus species.
Akins, R A; Grant, D M; Stohl, L L; Bottorff, D A; Nargang, F E; Lambowitz, A M
1988-11-05
The Mauriceville and Varkud mitochondrial plasmids of Neurospora are closely related, closed circular DNAs (3.6 and 3.7 kb, respectively; 1 kb = 10^3 bases or base-pairs), whose characteristics suggest relationships to mitochondrial DNA introns and retrotransposons. Here, we characterized the structure of the Varkud plasmid, determined its complete nucleotide sequence and mapped its major transcripts. The Mauriceville and Varkud plasmids have more than 97% positional identity. Both plasmids contain a 710 amino acid open reading frame that encodes a reverse transcriptase-like protein. The amino acid sequence of this open reading frame is strongly conserved between the two plasmids (701/710 amino acids) as expected for a functionally important protein. Both plasmids have a 0.4 kb region that contains five PstI palindromes and a direct repeat of approximately 160 base-pairs. Comparison of sequences in this region suggests that the Varkud plasmid has diverged less from a common ancestor than has the Mauriceville plasmid. Two major transcripts of the Varkud plasmid were detected by Northern hybridization experiments: a full-length linear RNA of 3.7 kb and an additional prominent transcript of 4.9 kb, 1.2 kb longer than monomer plasmid. Remarkably, we find that the 4.9 kb transcript is a hybrid RNA consisting of the full-length 3.7 kb Varkud plasmid transcript plus a 5' leader of 1.2 kb that is derived from the 5' end of the mitochondrial small rRNA. This and other findings suggest that the Varkud plasmid, like certain RNA viruses, has a mechanism for joining heterologous RNAs to the 5' end of its major transcript, and that, under some circumstances, nucleotide sequences in mitochondria may be recombined at the RNA level.
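The conservation figure quoted in this abstract (701 of 710 amino acids) can be turned into a percentage with simple arithmetic. The Python sketch below is illustrative only: the function names percent_identity and count_matches and the toy aligned strings are assumptions, not the method used in the original study.

```python
def percent_identity(matches: int, aligned_length: int) -> float:
    """Return identity as a percentage of aligned positions."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length


def count_matches(seq_a: str, seq_b: str) -> int:
    """Count identical, non-gap positions in two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")


# Figures taken from the abstract: 701 identical residues out of 710.
orf_identity = percent_identity(701, 710)
print(f"ORF amino acid identity: {orf_identity:.1f}%")

# Toy pre-aligned fragments (hypothetical, not plasmid data).
toy_a = "MKV-LSA"
toy_b = "MKVQLTA"
print(count_matches(toy_a, toy_b), "matching positions out of", len(toy_a))
```

Here identity is taken over all aligned columns; other conventions (for example, excluding gapped columns from the denominator) would give slightly different values.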
The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus)
Ming, Ray; Hou, Shaobin; Feng, Yun; Yu, Qingyi; Dionne-Laporte, Alexandre; Saw, Jimmy H.; Senin, Pavel; Wang, Wei; Ly, Benjamin V.; Lewis, Kanako L. T.; Salzberg, Steven L.; Feng, Lu; Jones, Meghan R.; Skelton, Rachel L.; Murray, Jan E.; Chen, Cuixia; Qian, Wubin; Shen, Junguo; Du, Peng; Eustice, Moriah; Tong, Eric; Tang, Haibao; Lyons, Eric; Paull, Robert E.; Michael, Todd P.; Wall, Kerr; Rice, Danny W.; Albert, Henrik; Wang, Ming-Li; Zhu, Yun J.; Schatz, Michael; Nagarajan, Niranjan; Acob, Ricelle A.; Guan, Peizhu; Blas, Andrea; Wai, Ching Man; Ackerman, Christine M.; Ren, Yan; Liu, Chao; Wang, Jianmei; Wang, Jianping; Na, Jong-Kuk; Shakirov, Eugene V.; Haas, Brian; Thimmapuram, Jyothi; Nelson, David; Wang, Xiyin; Bowers, John E.; Gschwend, Andrea R.; Delcher, Arthur L.; Singh, Ratnesh; Suzuki, Jon Y.; Tripathi, Savarni; Neupane, Kabi; Wei, Hairong; Irikura, Beth; Paidi, Maya; Jiang, Ning; Zhang, Wenli; Presting, Gernot; Windsor, Aaron; Navajas-Pérez, Rafael; Torres, Manuel J.; Feltus, F. Alex; Porter, Brad; Li, Yingjun; Burroughs, A. Max; Luo, Ming-Cheng; Liu, Lei; Christopher, David A.; Mount, Stephen M.; Moore, Paul H.; Sugimura, Tak; Jiang, Jiming; Schuler, Mary A.; Friedman, Vikki; Mitchell-Olds, Thomas; Shippen, Dorothy E.; dePamphilis, Claude W.; Palmer, Jeffrey D.; Freeling, Michael; Paterson, Andrew H.; Gonsalves, Dennis; Wang, Lei; Alam, Maqsudul
2010-01-01
Papaya, a fruit crop cultivated in tropical and subtropical regions, is known for its nutritional benefits and medicinal applications. Here we report a 3× draft genome sequence of ‘SunUp’ papaya, the first commercial virus-resistant transgenic fruit tree to be sequenced. The papaya genome is three times the size of the Arabidopsis genome, but contains fewer genes, including significantly fewer disease-resistance gene analogues. Comparison of the five sequenced genomes suggests a minimal angiosperm gene set of 13,311. A lack of recent genome duplication, atypical of other angiosperm genomes sequenced so far, may account for the smaller papaya gene number in most functional groups. Nonetheless, striking amplifications in gene number within particular functional groups suggest roles in the evolution of tree-like habit, deposition and remobilization of starch reserves, attraction of seed dispersal agents, and adaptation to tropical daylengths. Transgenesis at three locations is closely associated with chloroplast insertions into the nuclear genome, and with topoisomerase I recognition sites. Papaya offers numerous advantages as a system for fruit-tree functional genomics, and this draft genome sequence provides the foundation for revealing the basis of Carica's distinguishing morpho-physiological, medicinal and nutritional properties. PMID:18432245
The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus).
Ming, Ray; Hou, Shaobin; Feng, Yun; Yu, Qingyi; Dionne-Laporte, Alexandre; Saw, Jimmy H; Senin, Pavel; Wang, Wei; Ly, Benjamin V; Lewis, Kanako L T; Salzberg, Steven L; Feng, Lu; Jones, Meghan R; Skelton, Rachel L; Murray, Jan E; Chen, Cuixia; Qian, Wubin; Shen, Junguo; Du, Peng; Eustice, Moriah; Tong, Eric; Tang, Haibao; Lyons, Eric; Paull, Robert E; Michael, Todd P; Wall, Kerr; Rice, Danny W; Albert, Henrik; Wang, Ming-Li; Zhu, Yun J; Schatz, Michael; Nagarajan, Niranjan; Acob, Ricelle A; Guan, Peizhu; Blas, Andrea; Wai, Ching Man; Ackerman, Christine M; Ren, Yan; Liu, Chao; Wang, Jianmei; Wang, Jianping; Na, Jong-Kuk; Shakirov, Eugene V; Haas, Brian; Thimmapuram, Jyothi; Nelson, David; Wang, Xiyin; Bowers, John E; Gschwend, Andrea R; Delcher, Arthur L; Singh, Ratnesh; Suzuki, Jon Y; Tripathi, Savarni; Neupane, Kabi; Wei, Hairong; Irikura, Beth; Paidi, Maya; Jiang, Ning; Zhang, Wenli; Presting, Gernot; Windsor, Aaron; Navajas-Pérez, Rafael; Torres, Manuel J; Feltus, F Alex; Porter, Brad; Li, Yingjun; Burroughs, A Max; Luo, Ming-Cheng; Liu, Lei; Christopher, David A; Mount, Stephen M; Moore, Paul H; Sugimura, Tak; Jiang, Jiming; Schuler, Mary A; Friedman, Vikki; Mitchell-Olds, Thomas; Shippen, Dorothy E; dePamphilis, Claude W; Palmer, Jeffrey D; Freeling, Michael; Paterson, Andrew H; Gonsalves, Dennis; Wang, Lei; Alam, Maqsudul
2008-04-24
Papaya, a fruit crop cultivated in tropical and subtropical regions, is known for its nutritional benefits and medicinal applications. Here we report a 3x draft genome sequence of 'SunUp' papaya, the first commercial virus-resistant transgenic fruit tree to be sequenced. The papaya genome is three times the size of the Arabidopsis genome, but contains fewer genes, including significantly fewer disease-resistance gene analogues. Comparison of the five sequenced genomes suggests a minimal angiosperm gene set of 13,311. A lack of recent genome duplication, atypical of other angiosperm genomes sequenced so far, may account for the smaller papaya gene number in most functional groups. Nonetheless, striking amplifications in gene number within particular functional groups suggest roles in the evolution of tree-like habit, deposition and remobilization of starch reserves, attraction of seed dispersal agents, and adaptation to tropical daylengths. Transgenesis at three locations is closely associated with chloroplast insertions into the nuclear genome, and with topoisomerase I recognition sites. Papaya offers numerous advantages as a system for fruit-tree functional genomics, and this draft genome sequence provides the foundation for revealing the basis of Carica's distinguishing morpho-physiological, medicinal and nutritional properties.
Chromobacterium sphagni sp. nov., an insecticidal bacterium isolated from Sphagnum bogs.
Blackburn, Michael B; Farrar, Robert R; Sparks, Michael E; Kuhar, Daniel; Mitchell, Ashaki; Gundersen-Rindal, Dawn E
2017-09-01
Sixteen isolates of Gram-reaction-negative, motile, violet-pigmented bacteria were isolated from Sphagnum bogs in West Virginia and Maine, USA. 16S rRNA gene sequences and fatty acid analysis revealed a high degree of relatedness among the isolates, and genome sequencing of two isolates, IIBBL 14B-1T and IIBBL 37-2 (from West Virginia and Maine, respectively), revealed highly similar genomic sequences. The average nucleotide identity (gANI) calculated for these two isolates was found to be in excess of 99 %, but did not exceed 88 % when comparing either isolate with genomic sequences of Chromobacterium violaceum ATCC 12472T, C. haemolyticum DSM 19808T, C. piscinae ND17, C. subtsugae PRAA4-1T, C. vaccinii MWU205T or C. amazonense CBMAI 310T. Collectively, gANI and 16S rRNA gene sequence comparisons suggested that isolates IIBBL 14B-1T and IIBBL 37-2 were most closely related to C. subtsugae, but represented a distinct species. We propose the name Chromobacterium sphagni sp. nov. for this taxon; the type strain is IIBBL 14B-1T (=NRRL B-67130T=JCM 31882T).
Hernández-Orts, Jesús S; Smales, Lesley R; Pinacho-Pinacho, Carlos D; García-Varela, Martín; Presswell, Bronwen
2017-02-01
The polymorphid acanthocephalan, Corynosoma hannae Zdzitowiecki, 1984 is characterised on the basis of newly collected material from a New Zealand sea lion, Phocarctos hookeri (Gray), and long-nosed fur seal, Arctophoca forsteri (Lesson) (definitive hosts), and from Stewart Island shags, Leucocarbo chalconotus (Gray), spotted shags, Phalacrocorax punctatus (Sparrman) and yellow-eyed penguins, Megadyptes antipodes (Hombron & Jacquinot) (non-definitive hosts) from New Zealand. Specimens are described in detail and scanning electron micrographs for C. hannae are provided. Additionally, cystacanths of C. hannae are reported and described for the first time from the body cavity and mesenteries of New Zealand brill, Colistium guntheri (Hutton) and from New Zealand sole, Peltorhamphus novaezeelandiae Günther from Kaka Point, Otago in New Zealand. Partial sequence data for the mitochondrial cytochrome c oxidase 1 gene (cox1) for adults, immature specimens and cystacanths of C. hannae were obtained. Phylogenetic analyses of the newly-generated sequences and for available cox1 sequences of Corynosoma spp. revealed a close relationship between C. hannae and C. australe Johnston, 1937, both species infecting pinnipeds in the Southern Hemisphere. However, a morphological comparison of the species suggests that C. hannae most closely resembles C. evae Zdzitowiecki, 1984 and C. semerme (Forssell, 1904), the latter of which occurs in pinnipeds in the Northern Hemisphere. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Vaughn, J C; Mason, M T; Sper-Whitis, G L; Kuhlman, P; Palmer, J D
1995-11-01
We present phylogenetic evidence that a group I intron in an angiosperm mitochondrial gene arose recently by horizontal transfer from a fungal donor species. A 1,716-bp fragment of the mitochondrial coxI gene from the angiosperm Peperomia polybotrya was amplified via the polymerase chain reaction and sequenced. Comparison to other coxI genes revealed a 966-bp group I intron, which, based on homology with the related yeast coxI intron aI4, potentially encodes a 279-amino-acid site-specific DNA endonuclease. This intron, which is believed to function as a ribozyme during its own splicing, is not present in any of 19 coxI genes examined from other diverse vascular plant species. Phylogenetic analysis of intron origin was carried out using three different tree-generating algorithms, and on a variety of nucleotide and amino acid data sets from the intron and its flanking exon sequences. These analyses show that the Peperomia coxI gene intron and exon sequences are of fundamentally different evolutionary origin. The Peperomia intron is more closely related to several fungal mitochondrial introns, two of which are located at identical positions in coxI, than to identically located coxI introns from the land plant Marchantia and the green alga Prototheca. Conversely, the exon sequence of this gene is, as expected, most closely related to other angiosperm coxI genes. These results, together with evidence suggestive of co-conversion of exonic markers immediately flanking the intron insertion site, lead us to conclude that the Peperomia coxI intron probably arose by horizontal transfer from a fungal donor, using the double-strand-break repair pathway. The donor species may have been one of the symbiotic mycorrhizal fungi that live in close obligate association with most plants.
Equilibrium, stability, and orbital evolution of close binary systems
NASA Technical Reports Server (NTRS)
Lai, Dong; Rasio, Frederic A.; Shapiro, Stuart L.
1994-01-01
We present a new analytic study of the equilibrium and stability properties of close binary systems containing polytropic components. Our method is based on the use of ellipsoidal trial functions in an energy variational principle. We consider both synchronized and nonsynchronized systems, constructing the compressible generalizations of the classical Darwin and Darwin-Riemann configurations. Our method can be applied to a wide variety of binary models where the stellar masses, radii, spins, entropies, and polytropic indices are all allowed to vary over wide ranges and independently for each component. We find that both secular and dynamical instabilities can develop before a Roche limit or contact is reached along a sequence of models with decreasing binary separation. High incompressibility always makes a given binary system more susceptible to these instabilities, but the dependence on the mass ratio is more complicated. As simple applications, we construct models of double degenerate systems and of low-mass main-sequence star binaries. We also discuss the orbital evolution of close binary systems under the combined influence of fluid viscosity and secular angular momentum losses from processes like gravitational radiation. We show that the existence of global fluid instabilities can have a profound effect on the terminal evolution of coalescing binaries. The validity of our analytic solutions is examined by means of detailed comparisons with the results of recent numerical fluid calculations in three dimensions.
Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species
Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha
2011-01-01
Computational genomics is an important tool for understanding the distribution of features such as simple sequence repeats (SSRs) across closely related genomes, which gives valuable information regarding genetic variation. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human-pathogenic Brucella species, which are involved in a range of pathological disorders. Computational analysis of the SSRs in Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetra-nucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that overrepresented tri-nucleotide SSRs in genomic and coding regions might be responsible for generating functional variation in expressed proteins, which in turn may lead to differences in pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins among Brucella genomes. Abbreviations: SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
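As a rough illustration of the kind of SSR screen the abstract above describes, the following Python sketch finds perfect tandem repeats with a regular expression and tallies them by repeat-unit length. The function names, the minimum of three copies and the maximum unit length of six are assumptions made for this example; they are not the thresholds or software used in the study.

```python
import re
from collections import Counter


def find_ssrs(seq, min_repeats=3, max_unit=6):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    min_repeats and max_unit are illustrative defaults, not values
    taken from the study.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # A unit of unit_len bases followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif
            # (for example "AAA" is a run of the mono-nucleotide "A").
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits


def count_by_unit_length(hits):
    """Tally SSRs by repeat-unit length (1 = mono-, 3 = tri-nucleotide, ...)."""
    return Counter(len(unit) for _, unit, _ in hits)


example_hits = find_ssrs("GGATGATGATGCC")
example_counts = count_by_unit_length(example_hits)
```

Comparing such counts with those from shuffled sequences of the same base composition would be one simple way to ask whether a repeat class is over- or underrepresented, although the abstract does not say which random model was used.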
Galinier, Richard; van Beurden, Steven; Amilhat, Elsa; Castric, Jeannette; Schoehn, Guy; Verneau, Olivier; Fazio, Géraldine; Allienne, Jean-François; Engelsma, Marc; Sasal, Pierre; Faliex, Elisabeth
2012-06-01
Eel virus European X (EVEX) was first isolated from diseased European eel Anguilla anguilla in Japan at the end of the seventies. The virus was tentatively classified into the Rhabdoviridae family on the basis of morphology and serological cross reactivity. This family of viruses is organized into six genera and currently comprises approximately 200 members, many of which are still unassigned because of the lack of molecular data. This work presents the morphological, biochemical and genetic characterizations of EVEX, and proposes a taxonomic classification for this virus. We provide its complete genome sequence, plus a comprehensive sequence comparison between isolates from different geographical origins. The genome encodes the five classical structural proteins plus an overlapping open reading frame in the phosphoprotein gene, coding for a putative C protein. Phylogenetic comparison with other rhabdoviruses indicates that EVEX is most closely related to the Vesiculovirus genus and shares the highest identity with trout rhabdovirus 903/87. Copyright © 2012 Elsevier B.V. All rights reserved.
Application of a mitochondrial DNA control region frequency database for UK domestic cats.
Ottolini, Barbara; Lall, Gurdeep Matharu; Sacchini, Federico; Jobling, Mark A; Wetton, Jon H
2017-03-01
DNA variation in 402bp of the mitochondrial control region flanked by repeat sequences RS2 and RS3 was evaluated by Sanger sequencing in 152 English domestic cats, in order to determine the significance of matching DNA sequences between hairs found with a victim's body and the suspect's pet cat. Whilst 95% of English cats possessed one of the twelve globally widespread mitotypes, four new variants were observed, the most common of which (2% frequency) was shared with the evidential samples. No significant difference in mitotype frequency was seen between 32 individuals from the locality of the crime and 120 additional cats from the rest of England, suggesting a lack of local population structure. However, significant differences were observed in comparison with frequencies in other countries, including the closely neighbouring Netherlands, highlighting the importance of appropriate genetic databases when determining the evidential significance of mitochondrial DNA evidence. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Falk, K.; Batts, W.N.; Kvellestad, A.; Kurath, G.; Wiik-Nielsen, J.; Winton, J.R.
2008-01-01
Atlantic salmon paramyxovirus (ASPV) was isolated in 1995 from gills of farmed Atlantic salmon suffering from proliferative gill inflammation. The complete genome sequence of ASPV was determined, revealing a genome 16,968 nucleotides in length consisting of six non-overlapping genes coding for the nucleo- (N), phospho- (P), matrix- (M), fusion- (F), haemagglutinin-neuraminidase- (HN) and large polymerase (L) proteins in the order 3′-N-P-M-F-HN-L-5′. The various conserved features related to virus replication found in most paramyxoviruses were also found in ASPV. These include: conserved and complementary leader and trailer sequences, tri-nucleotide intergenic regions and highly conserved transcription start and stop signal sequences. The P gene expression strategy of ASPV was like that of the respiro-, morbilli- and henipaviruses, which express the P and C proteins from the primary transcript and edit a portion of the mRNA to encode V and W proteins. Sequence similarities among various features related to virus replication, pairwise comparisons of all deduced ASPV protein sequences with homologous regions from other members of the family Paramyxoviridae, and phylogenetic analyses of these amino acid sequences suggested that ASPV was a novel member of the sub-family Paramyxovirinae, most closely related to the respiroviruses. © 2008 Elsevier B.V. All rights reserved.
Dessimoz, Christophe; Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro
2011-09-01
Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references.
Zoller, Stefan; Manousaki, Tereza; Qiu, Huan; Meyer, Axel; Kuraku, Shigehiro
2011-01-01
Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references. PMID:21712341
2014-01-01
Background Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published. Results A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage, and nine copies of the rRNA gene operons. It has a low G + C content of 31.1%. Illumina, 454, and Illumina/454 hybrid assemblies were generated and then compared to the draft and PacBio assemblies using summary statistics, CGAL, QUAST and REAPR bioinformatics tools and comparative genomic approaches. Assemblies based upon shorter read DNA technologies were confounded by the large number of repeats and their size, which in the case of the rRNA gene operons were ~5 kb. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) systems among biotechnologically relevant Clostridia were classified and related to plasmid content and prophages. Potential associations between plasmid content and CRISPR systems may have implications for historical industrial scale Acetone-Butanol-Ethanol (ABE) fermentation failures and future large scale bacterial fermentations. While C. autoethanogenum contains an active CRISPR system, no such system is present in the closely related Clostridium ljungdahlii DSM 13528. A common prophage inserted into the Arg-tRNA shared between the strains suggests a common ancestor. However, C. ljungdahlii contains several additional putative prophages and it has more than double the amount of prophage DNA compared to C. autoethanogenum. 
Other differences include important metabolic genes for central metabolism (such as an additional hydrogenase and the absence of a phosphoenolpyruvate synthase) and substrate utilization pathways (mannose and aromatics utilization) that might explain phenotypic differences between C. autoethanogenum and C. ljungdahlii. Conclusions Single molecule sequencing will be increasingly used to produce finished microbial genomes. The complete genome will facilitate comparative genomics and functional genomics and support future comparisons between Clostridia and studies that examine the evolution of plasmids, bacteriophage and CRISPR systems. PMID:24655715
Reranking candidate gene models with cross-species comparison for improved gene prediction
Liu, Qian; Crammer, Koby; Pereira, Fernando CN; Roos, David S
2008-01-01
Background Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models. PMID:18854050
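The reranking step described in the abstract above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `Candidate` fields, the linear weighted sum and the default weights are hypothetical choices for this example, and they are not the global scoring rules or the Evigan interface used by the authors.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One of the K best gene models for a locus (hypothetical fields)."""
    name: str
    finder_score: float         # score from the original gene finder
    ortholog_identity: float    # coding-sequence identity to a putative ortholog, 0..1
    splice_site_overlap: float  # fraction of splice sites shared with the ortholog, 0..1


def rerank(candidates, w_finder=1.0, w_identity=1.0, w_splice=1.0):
    """Sort candidates by a weighted sum of finder score and comparative evidence.

    The linear combination is an assumption made for illustration only.
    """
    def combined(c):
        return (
            w_finder * c.finder_score
            + w_identity * c.ortholog_identity
            + w_splice * c.splice_site_overlap
        )

    return sorted(candidates, key=combined, reverse=True)


# Model A has the better finder score, but model B agrees more closely
# with the putative ortholog and overtakes it after reranking.
models = [
    Candidate("model_A", finder_score=0.9, ortholog_identity=0.4, splice_site_overlap=0.5),
    Candidate("model_B", finder_score=0.8, ortholog_identity=0.95, splice_site_overlap=1.0),
]
best = rerank(models)[0]
```

Because the function needs only a ranked list of candidates plus per-candidate evidence, further evidence types could be added as extra fields and weights, which mirrors the adaptability the abstract points to.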
Offerman, Kristy; Carulei, Olivia; van der Walt, Anelda Philine; Douglass, Nicola; Williamson, Anna-Lise
2014-06-12
Two novel avipoxviruses from South Africa have been sequenced, one from a Feral Pigeon (Columba livia) (FeP2) and the other from an African penguin (Spheniscus demersus) (PEPV). We present a purpose-designed bioinformatics pipeline for analysis of next generation sequence data of avian poxviruses and compare the different avipoxviruses sequenced to date with specific emphasis on their evolution and gene content. The FeP2 (282 kbp) and PEPV (306 kbp) genomes encode 271 and 284 open reading frames respectively and are more closely related to one another (94.4%) than to either fowlpox virus (FWPV) (85.3% and 84.0% respectively) or Canarypox virus (CNPV) (62.0% and 63.4% respectively). Overall, FeP2, PEPV and FWPV have syntenic gene arrangements; however, major differences exist throughout their genomes. The most striking difference between FeP2 and the FWPV-like avipoxviruses is a large deletion of ~16 kbp from the central region of the genome of FeP2 deleting a cc-chemokine-like gene, two Variola virus B22R orthologues, an N1R/p28-like gene and a V-type Ig domain family gene. FeP2 and PEPV both encode orthologues of vaccinia virus C7L and Interleukin 10. PEPV contains a 77 amino acid long orthologue of Ubiquitin sharing 97% amino acid identity to human ubiquitin. The genome sequences of FeP2 and PEPV have greatly added to the limited repository of genomic information available for the Avipoxvirus genus. In the comparison of FeP2 and PEPV to existing sequences, FWPV and CNPV, we have established insights into African avipoxvirus evolution. Our data supports the independent evolution of these South African avipoxviruses from a common ancestral virus to FWPV and CNPV.
Hornok, Sándor; Wang, Yuanzhi; Otranto, Domenico; Keskin, Adem; Lia, Riccardo Paolo; Kontschán, Jenő; Takács, Nóra; Farkas, Róbert; Sándor, Attila D
2016-12-15
Haemaphysalis erinacei is one of the few ixodid tick species for which valid names of subspecies exist. Despite their disputed taxonomic status in the literature, these subspecies have not yet been compared with molecular methods. The aim of the present study was to investigate the phylogenetic relationships of H. erinacei subspecies, in the context of the first finding of this tick species in Romania. After morphological identification, DNA was extracted from five adults of H. e. taurica (from Romania and Turkey), four adults of H. e. erinacei (from Italy) and 17 adults of H. e. turanica (from China). From these samples fragments of the cytochrome c oxidase subunit 1 (cox1) and 16S rRNA genes were amplified via PCR and sequenced. Results showed that cox1 and 16S rRNA gene sequence divergences between H. e. taurica from Romania and H. e. erinacei from Italy were below 2%. However, the sequence divergences between H. e. taurica from Romania and H. e. turanica from China were high (up to 7.3% difference for the 16S rRNA gene), exceeding the reported level of sequence divergence between closely related tick species. At the same time, two adults of H. e. taurica from Turkey had higher 16S rRNA gene similarity to H. e. turanica from China (up to 97.5%) than to H. e. taurica from Romania (96.3%), but phylogenetically clustered more closely to H. e. taurica than to H. e. turanica. This is the first finding of H. erinacei in Romania, and the first (although preliminary) phylogenetic comparison of H. erinacei subspecies. Phylogenetic analyses did not support that the three H. erinacei subspecies evaluated here are of equal taxonomic rank, because the genetic divergence between H. e. turanica from China and H. e. taurica from Romania exceeded the usual level of sequence divergence between closely related tick species, suggesting that they might represent different species. Therefore, the taxonomic status of the subspecies of H. erinacei needs to be revised based on a larger number of specimens collected throughout its geographical range.
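Percent divergence figures like those quoted in the abstract above are often computed as an uncorrected pairwise distance (p-distance) over aligned sequences. The sketch below shows that calculation in Python; it assumes the two sequences are already aligned to equal length and that gapped columns are skipped, and the abstract does not state whether the authors used this measure or a model-corrected distance.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected pairwise distance between two aligned sequences.

    Columns where either sequence has a gap are ignored. Returns the
    fraction of compared columns that differ.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differences / compared


# One mismatch over ten aligned columns gives a divergence of 10%.
divergence = p_distance("ACGTACGTAC", "ACGTTCGTAC")
percent_similarity = 100 * (1 - divergence)
```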
González, Víctor M; Aventín, Núria; Centeno, Emilio; Puigdomènech, Pere
2014-12-17
Plant NBS-LRR resistance genes tend to be found in clusters, which have been shown to be hot spots of genome variability. In melon, half of the 81 predicted NBS-LRR genes group in nine clusters, and a 1 Mb region on linkage group V contains the highest density of R-genes and presence/absence gene polymorphisms found in the melon genome. This region is known to contain the locus of Vat, an agronomically important gene that confers resistance to aphids. However, the presence of duplications makes the sequencing and annotation of R-gene clusters difficult, usually resulting in multi-gapped sequences with higher than average errors. A 1-Mb sequence that contains the largest NBS-LRR gene cluster found in melon was improved using a strategy that combines Illumina paired-end mapping and PCR-based gap closing. Unknown sequence was decreased by 70% while about 3,000 SNPs and small indels were corrected. As a result, the annotations of 18 of a total of 23 NBS-LRR genes found in this region were modified, including additional coding sequences, amino acid changes, correction of splicing boundaries, or fusion of ORFs in common transcription units. A phylogeny analysis of the R-genes and their comparison with syntenic sequences in other cucurbits point to a pattern of local gene amplifications since the diversification of cucurbits from other families, and through speciation within the family. A candidate Vat gene is proposed based on the sequence similarity between a reported Vat gene from a Korean melon cultivar and a sequence fragment previously absent in the unrefined sequence. A sequence refinement strategy allowed substantial improvement of a 1 Mb fragment of the melon genome and the re-annotation of the largest cluster of NBS-LRR gene homologues found in melon. 
Analysis of the cluster revealed that resistance genes have been produced by sequence duplication in adjacent genome locations since the divergence of cucurbits from other close families, and through the process of speciation within the family. A candidate Vat gene was also identified using sequence previously unavailable, which demonstrates the advantages of genome assembly refinements when analyzing complex regions such as those containing clusters of highly similar genes.
Karamian, Mehdi; Kuhls, Katrin; Hemmati, Mina; Ghatee, Mohammad Amin
2016-06-01
Iran has been identified as being among the countries with the highest number of cutaneous leishmaniasis (CL) cases. South Khorasan province in East Iran is an emerging focus of CL. Species identification of sixty clinical samples by ITS1 PCR-RFLP presented evidence for the dominance of Leishmania tropica (90%) in this region. Analysis of the ITS1 sequence of 19 L. tropica isolates revealed seven closely related sequence types. In addition, ITS1 sequences available in GenBank from other Iranian regions were compiled for comparison with the studied isolates. Iranian L. tropica was distributed in two main clusters. All East Iranian sequence types were grouped with strains from foci in the Southeast and Central regions in cluster A, showing highly similar sequences. The highest similarity was observed between most L. tropica from East and all isolates from Southeast regions and from Savojbolagh county in Central Iran. Southwest L. tropica was shown to be paraphyletic as the isolates were distributed in both clusters A and B. All Northeastern L. tropica were part of cluster B; however, they showed significant heterogeneity and were distributed in different subclusters. Distribution of L. tropica populations was to some extent congruent with genetic lineages of Phlebotomus sergenti in Iran and may be evidence for parasite-vector co-evolution. Southeast-East L. tropica was also similar to strains from Herat province in Afghanistan at the East border of Iran. This is the first comprehensive study on population structure of L. tropica in Iran that provides a guideline for appropriate sampling for further molecular based epidemiological studies. Copyright © 2016 Elsevier B.V. All rights reserved.
Hamilton, P T; Reeve, J N
1985-01-01
DNA fragments cloned from the methanogenic archaebacterium Methanobrevibacter smithii which complement mutations in the purE and proC genes of E. coli have been sequenced. Sequence analyses, transposon mutagenesis and expression in E. coli minicells indicate that purE and proC complementations result from the synthesis of M. smithii polypeptides with molecular weights of 36,697 and 27,836 respectively. The encoding genes appear to be located in operons. The M. smithii genome contains 69% A/T basepairs (bp) which is reflected in unusual codon usages and intergenic regions containing approximately 85% A/T bp. An insertion element, designated ISM1, was found within the cloned M. smithii DNA located adjacent to the proC complementing region. ISM1 is 1381 bp in length, has 29 bp terminal inverted repeat sequences and contains one major ORF encoded in 87% of the ISM1 sequence. ISM1 is mobile, present in approximately 10 copies per genome and integration duplicates 8 bp at the site of insertion. The duplicated sequences show homology with sequences within the 29 bp terminal repeat sequence of ISM1. Comparison of our data with sequences from halophilic archaebacteria suggests that 5'GAANTTTCA and 5'TTTTAATATAAA may be consensus promoter sequences for archaebacteria. These sequences closely resemble the consensus sequences which precede Drosophila heat-shock genes (Pelham 1982; Davidson et al. 1983). Methanogens appear to employ the eubacterial system of mRNA: 16SrRNA hybridization to ensure initiation of translation; the consensus ribosome binding sequence is 5'AGGTGA.
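To show how candidate consensus sequences such as those quoted in the abstract above could be located in a DNA sequence, the sketch below translates a motif containing the wildcard N into a regular expression and reports every start position, including overlapping ones. The function names and the test sequence are made up for illustration; only the motif 5'GAANTTTCA comes from the abstract, and the search covers the given strand only.

```python
import re

# Minimal IUPAC table: the four bases plus N (any base). Other ambiguity
# codes are deliberately left out of this sketch.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def motif_to_pattern(motif):
    """Translate a motif such as 'GAANTTTCA' into a regular expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq, motif):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A zero-width lookahead lets the scan report overlapping occurrences.
    pattern = re.compile("(?=" + motif_to_pattern(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Hypothetical test sequence containing one occurrence of the motif.
positions = find_motif("CCGAAGTTTCATT", "GAANTTTCA")
```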
A brief period of eyes-closed rest enhances motor skill consolidation.
Humiston, Graelyn B; Wamsley, Erin J
2018-06-05
Post-training sleep benefits both declarative and procedural memory consolidation. However, recent research suggests that eyes-closed waking rest may provide a similar benefit. Brokaw et al. (2016), for example, recently demonstrated that verbal declarative memory improved more following a 15 min period of waking rest, in comparison to 15 min of active wake. Here, we used the same procedures to test whether procedural memory similarly benefits from waking rest. Participants were trained on the Motor Sequence Task (MST), followed by a 15 min retention interval during which they either rested with their eyes closed or completed a distractor task. Rest significantly enhanced MST performance, mirroring the effect observed in Brokaw et al. (2016) and demonstrating that waking rest benefits the early stages of procedural memory. An additional group of participants tested 4 h later displayed no effect of rest. Overall, these results suggest that the early MST performance "boost" described in prior studies may depend on post-learning state. Copyright © 2018 Elsevier Inc. All rights reserved.
Genome sequences of two closely related strains of Escherichia coli K-12 GM4792.
Zhang, Yan-Cong; Zhang, Yan; Zhu, Bi-Ru; Zhang, Bo-Wen; Ni, Chuan; Zhang, Da-Yong; Huang, Ying; Pang, Erli; Lin, Kui
2015-01-01
Escherichia coli lab strains K-12 GM4792 Lac(+) and GM4792 Lac(-) carry opposite lactose markers, which are useful for distinguishing evolved lines as they produce different colored colonies. The two closely related strains are chosen as ancestors for our ongoing studies of experimental evolution. Here, we describe the genome sequences, annotation, and features of GM4792 Lac(+) and GM4792 Lac(-). GM4792 Lac(+) has a 4,622,342-bp long chromosome with 4,061 protein-coding genes and 83 RNA genes. Similarly, the genome of GM4792 Lac(-) consists of a 4,621,656-bp chromosome containing 4,043 protein-coding genes and 74 RNA genes. Genome comparison analysis reveals that the differences between GM4792 Lac(+) and GM4792 Lac(-) are minimal and limited to only the targeted lac region. Moreover, a previous study on competitive experimentation indicates the two strains are identical or nearly identical in survivability except for lactose utilization in a nitrogen-limited environment. Therefore, at both a genetic and a phenotypic level, GM4792 Lac(+) and GM4792 Lac(-), with opposite neutral markers, are ideal systems for future experimental evolution studies.
NASA Technical Reports Server (NTRS)
Dutta, Soumyo; Way, David W.
2017-01-01
Mars 2020, the next planned U.S. rover mission to land on Mars, is based on the design of the successful 2012 Mars Science Laboratory (MSL) mission. Mars 2020 retains most of the entry, descent, and landing (EDL) sequences of MSL, including the closed-loop entry guidance scheme based on the Apollo guidance algorithm. However, unlike MSL, Mars 2020 will trigger the parachute deployment and descent sequence on a range trigger rather than the previously used velocity trigger. This difference will greatly reduce the landing ellipse sizes. Additionally, the relative contribution of each model to the total ellipse size has changed greatly due to the switch to the range trigger. This paper considers the effect on trajectory dispersions due to changing the trigger schemes and the contributions of these various models to trajectory and EDL performance.
Qiu, T; Lu, R H; Zhang, J; Zhu, Z Y
2001-07-01
The complete nucleotide sequence of M6 gene of grass carp hemorrhage virus (GCHV) was determined. It is 2039 nucleotides in length and contains a single large open reading frame that could encode a protein of 648 amino acids with predicted molecular mass of 68.7 kDa. Amino acid sequence comparison revealed that the protein encoded by GCHV M6 is closely related to the protein mu1 of mammalian reovirus. The M6 gene, encoding the major outer-capsid protein, was expressed using the pET fusion protein vector in Escherichia coli and detected by Western blotting using chicken anti-GCHV immunoglobulin (IgY). The result indicates that the protein encoded by M6 may share a putative Asn-42-Pro-43 proteolytic cleavage site with mu1.
Phylogeny and evolution of the auks (subfamily Alcinae) based on mitochondrial DNA sequences
Moum, Truls; Johansen, Steinar; Erikstad, Kjell Einar; Piatt, John F.
1994-01-01
The genetic divergence and phylogeny of the auks was assessed by mitochondrial DNA sequence comparisons in a study using 19 of the 22 auk species and two outgroup representatives. We compared more than 500 nucleotides from each of two mitochondrial genes encoding 12S rRNA and the NADH dehydrogenase subunit 6. Divergence times were estimated from transversional substitutions. The dovekie (Alle alle) is related to the razorbill (Alca torda) and the murres (Uria spp). Furthermore, the Xantus's murrelet (Synthliboramphus hypoleucus) and the ancient (Synthliboramphus antiquus) and Japanese murrelets (Synthliboramphus wumizusume) are genetically distinct members of the same main lineage, whereas brachyramphine and synthliboramphine murrelets are not closely related. An early adaptive radiation of six main species groups of auks seems to trace back to Middle Miocene. Later speciation probably involved ecological differentiations and geographical isolations.
Schmidt, Olga; Hausmann, Axel; Cancian de Araujo, Bruno; Sutrisno, Hari; Peggie, Djunijanti; Schmidt, Stefan
2017-01-01
Here we present a general collecting and preparation protocol for DNA barcoding of Lepidoptera as part of large-scale rapid biodiversity assessment projects, and a comparison with alternative preserving and vouchering methods. About 98% of the sequenced specimens processed using the present collecting and preparation protocol yielded sequences with more than 500 base pairs. The study is based on the first outcomes of the Indonesian Biodiversity Discovery and Information System (IndoBioSys). IndoBioSys is a German-Indonesian research project that is conducted by the Museum für Naturkunde in Berlin and the Zoologische Staatssammlung München, in close cooperation with the Research Center for Biology - Indonesian Institute of Sciences (RCB-LIPI, Bogor).
Voelker, Toni A.; Staswick, Paul; Chrispeels, Maarten J.
1986-01-01
Phytohemagglutinin (PHA), the seed lectin of the common bean, Phaseolus vulgaris, is encoded by two highly homologous, tandemly linked genes, dlec1 and dlec2, which are coordinately expressed at high levels in developing cotyledons. Their respective transcripts translate into closely related polypeptides, PHA-E and PHA-L, constituents of the tetrameric lectin which accumulates at high levels in developing seeds. In the bean cultivar Pinto UI111, PHA-E is not detectable, and PHA-L accumulates at very reduced levels. To investigate the cause of the Pinto phenotype, we cloned and sequenced the two PHA genes of Pinto, called Pdlec1 and Pdlec2, and determined the abundance of their respective mRNAs in developing cotyledons. Both genes are more than 90% homologous to the normal PHA genes found in other cultivars. Pdlec1 carries a 1-bp frameshift mutation close to the 5' end of its coding sequence. Only very truncated polypeptides could be made from its mRNA. The gene Pdlec2 encodes a polypeptide, which resembles PHA-L and its predicted amino acid sequence agrees with the available Pinto PHA amino acid sequence data. Analysis of the mRNA of developing cotyledons revealed that the Pdlec1 message is reduced 600-fold, and Pdlec2 mRNA is reduced 20-fold with respect to mRNA levels in normal cultivars. A comparison of the sequences which are upstream from the coding sequence shows that Pdlec2 has a 100-bp deletion compared to the other genes (dlec1, dlec2 and Pdlec1). This deletion which contains a large tandem repeat may be responsible for the low level of expression of Pdlec2. The very low expression of Pdlec1 is as yet unexplained. PMID:16453730
Yamamoto, Toshio; Nagasaki, Hideki; Yonemaru, Jun-ichi; Ebana, Kaworu; Nakajima, Maiko; Shibaya, Taeko; Yano, Masahiro
2010-04-27
To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7× the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity. Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process.
We also found several genomic regions decreasing genetic diversity which might be caused by a recent human selection in rice breeding. The definition of pedigree haplotypes by means of genome-wide SNPs will facilitate next-generation breeding of rice and other crops.
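As a rough check on the coverage figure quoted above, the sketch below divides the reported 5.89 Gb of sequence by the reported 15.7-fold depth to recover the genome size that the statement implies. The function name is hypothetical, and this is only illustrative arithmetic, not the authors' mapping pipeline.

```python
def implied_genome_size_bp(total_bases, fold_coverage):
    """Genome size implied by a sequencing yield and its stated fold coverage."""
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_bases / fold_coverage


# Figures taken from the abstract: 5.89 Gb of sequence at 15.7x coverage.
size_bp = implied_genome_size_bp(5.89e9, 15.7)
print(f"{size_bp / 1e6:.0f} Mb")  # prints "375 Mb"
```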
Alignment-free genome tree inference by learning group-specific distance metrics.
Patil, Kaustubh R; McHardy, Alice C
2013-01-01
Understanding the evolutionary relationships between organisms is vital for their in-depth study. Gene-based methods are often used to infer such relationships, which are not without drawbacks. One can now attempt to use genome-scale information, because of the ever increasing number of genomes available. This opportunity also presents a challenge in terms of computational efficiency. Two fundamentally different methods are often employed for sequence comparisons, namely alignment-based and alignment-free methods. Alignment-free methods rely on the genome signature concept and provide a computationally efficient way that is also applicable to nonhomologous sequences. The genome signature contains evolutionary signal as it is more similar for closely related organisms than for distantly related ones. We used genome-scale sequence information to infer taxonomic distances between organisms without additional information such as gene annotations. We propose a method to improve genome tree inference by learning specific distance metrics over the genome signature for groups of organisms with similar phylogenetic, genomic, or ecological properties. Specifically, our method learns a Mahalanobis metric for a set of genomes and a reference taxonomy to guide the learning process. By applying this method to more than a thousand prokaryotic genomes, we showed that, indeed, better distance metrics could be learned for most of the 18 groups of organisms tested here. Once a group-specific metric is available, it can be used to estimate the taxonomic distances for other sequenced organisms from the group. This study also presents a large scale comparison between 10 methods--9 alignment-free and 1 alignment-based.
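To make the genome signature and the Mahalanobis form concrete, here is a minimal sketch. It assumes a dinucleotide (k = 2) signature and takes the metric matrix M as given; the learning step described in the abstract is not shown, and all function names are illustrative rather than the authors' code.

```python
from itertools import product
from math import sqrt


def genome_signature(seq, k=2):
    """Normalized k-mer frequencies in a fixed (lexicographic) k-mer order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing N or other ambiguity codes
            counts[kmer] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[km] / total for km in kmers]


def mahalanobis(x, y, M):
    """sqrt((x - y)^T M (x - y)), with M given as a list of rows."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(m_ij * d_j for m_ij, d_j in zip(row, d)) for row in M]
    return sqrt(sum(d_i * md_i for d_i, md_i in zip(d, Md)))


def identity(n):
    """Identity matrix; with this M the distance reduces to Euclidean."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
```

With the identity matrix the distance is plain Euclidean distance between signatures; a learned, group-specific M would reweight and correlate the k-mer dimensions.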
Mao, Yaping; Wang, Jigui; Hou, Qiang; Xi, Ji; Zhang, Xiaomei; Bian, Dawei; Yu, Yongle; Wang, Xi; Liu, Weiquan
2016-06-01
A virus isolated from mink showing clinical signs of enteritis was identified as a high virulent mink enteritis parvovirus (MEV) based on its biological characteristics in vivo and in vitro. Mink, challenged with this strain named MEV-LHV, exhibited severe pathological lesions as compared to those challenged with attenuated strain MEV-L. MEV-LHV also showed higher infection and replication efficiencies in vitro than MEV-L. Sequence of the complete genome of MEV-LHV was determined and analyzed in comparison with those in GenBank, which revealed that MEV-LHV shared high homology with virulent strain MEV SD12/01, whereas MEV-L was closely related to Abashiri and vaccine strain MEVB, and belonged to a different branch of the phylogenetic tree. The genomes of the two strains differed by insertions and deletions in their palindromic termini and specific unique mutations (especially VP2 300) in coding sequences which may be involved in viral replication and pathogenicity. The results of this study provide a better understanding of the biological and genomic characteristics of MEV and identify certain regions and sites that may be involved in viral replication and pathogenicity.
Support vector machine multiuser receiver for DS-CDMA signals in multipath channels.
Chen, S; Samingan, A K; Hanzo, L
2001-01-01
The problem of constructing an adaptive multiuser detector (MUD) is considered for direct sequence code division multiple access (DS-CDMA) signals transmitted through multipath channels. The emerging learning technique, called support vector machines (SVM), is proposed as a method of obtaining a nonlinear MUD from a relatively small training data block. Computer simulation is used to study this SVM MUD, and the results show that it can closely match the performance of the optimal Bayesian one-shot detector. Comparisons with an adaptive radial basis function (RBF) MUD trained by an unsupervised clustering algorithm are discussed.
F4TCNQ-Induced Exciton Quenching Studied by Using in-situ Photoluminescence Measurements
NASA Astrophysics Data System (ADS)
Zhu, Jian; Lu, Min; Wu, Bo; Hou, Xiao-Yuan
2012-09-01
The role of F4TCNQ as an exciton quenching material in thin organic light-emitting films is investigated by means of in situ photoluminescence measurements. C60 was used as another quenching material in the experiment for comparison, with Alq3 as a common organic light-emitting material. The effect of the growth sequence of the materials on quenching was also examined. It is found that the radius of Förster energy transfer between F4TCNQ and Alq3 is close to 0 nm and Dexter energy transfer dominates in the quenching process.
Olijnyk, Helmut
2005-01-12
Lattice vibrations in high-pressure phases of Y, Gd and Lu were studied by Raman spectroscopy. The observed phonon frequencies decrease towards the transitions to the dhcp and fcc phases. There is evidence that the entire structural sequence [Formula: see text] under pressure for the individual regular rare-earth metals and along the lanthanide series at ambient pressure involves softening of certain acoustic and optical phonon modes and of the elastic shear modulus C(44). Comparison is made to transitions between close-packed lattices in other metals, and possible correlations to s-d electron transfer are discussed.
Lattice dynamics of the lanthanides: Samarium at high pressure
NASA Astrophysics Data System (ADS)
Olijnyk, H.; Jephcoat, A. P.
2005-02-01
Sm was studied by Raman spectroscopy at pressures up to 20 GPa. The Raman-active phonon modes, both of the Sm-type phase and the dhcp phase, show a frequency decrease as pressure increases. There is evidence that the entire structural sequence hcp → Sm-type → dhcp → fcc under pressure for the individual regular lanthanides is associated with softening of certain acoustic and optical-phonon modes as well as elastic anomalies. Comparison is made to corresponding transitions between close-packed lattices in other metals and possible relations to the lanthanide's electronic structure are addressed.
Makeyev, Aleksandr V.; Erdenechimeg, Lkhamsuren; Mungunsukh, Ognoon; Roth, Jutta J.; Enkhmandakh, Badam; Ruddle, Frank H.; Bayarsaihan, Dashzeveg
2004-01-01
Williams–Beuren syndrome (also known as Williams syndrome) is caused by a deletion of a 1.55- to 1.84-megabase region from chromosome band 7q11.23. GTF2IRD1 and GTF2I, located within this critical region, encode proteins of the TFII-I family with multiple helix–loop–helix domains known as I repeats. In the present work, we characterize a third member, GTF2IRD2, which has sequence and structural similarity to the GTF2I and GTF2IRD1 paralogs. The ORF encodes a protein with several features characteristic of regulatory factors, including two I repeats, two leucine zippers, and a single Cys-2/His-2 zinc finger. The genomic organization of human, baboon, rat, and mouse genes is well conserved. Our exon-by-exon comparison has revealed that GTF2IRD2 is more closely related to GTF2I than to GTF2IRD1 and apparently is derived from the GTF2I sequence. The comparison of GTF2I and GTF2IRD2 genes revealed two distinct regions of homology, indicating that the helix–loop–helix domain structure of the GTF2IRD2 gene has been generated by two independent genomic duplications. We speculate that GTF2I is derived from GTF2IRD1 as a result of local duplication and the further evolution of its structure was associated with its functional specialization. Comparison of genomic sequences surrounding GTF2IRD2 genes in mice and humans allows refinement of the centromeric breakpoint position of the primate-specific inversion within the Williams–Beuren syndrome critical region. PMID:15243160
Makeyev, Aleksandr V; Erdenechimeg, Lkhamsuren; Mungunsukh, Ognoon; Roth, Jutta J; Enkhmandakh, Badam; Ruddle, Frank H; Bayarsaihan, Dashzeveg
2004-07-27
Williams-Beuren syndrome (also known as Williams syndrome) is caused by a deletion of a 1.55- to 1.84-megabase region from chromosome band 7q11.23. GTF2IRD1 and GTF2I, located within this critical region, encode proteins of the TFII-I family with multiple helix-loop-helix domains known as I repeats. In the present work, we characterize a third member, GTF2IRD2, which has sequence and structural similarity to the GTF2I and GTF2IRD1 paralogs. The ORF encodes a protein with several features characteristic of regulatory factors, including two I repeats, two leucine zippers, and a single Cys-2/His-2 zinc finger. The genomic organization of human, baboon, rat, and mouse genes is well conserved. Our exon-by-exon comparison has revealed that GTF2IRD2 is more closely related to GTF2I than to GTF2IRD1 and apparently is derived from the GTF2I sequence. The comparison of GTF2I and GTF2IRD2 genes revealed two distinct regions of homology, indicating that the helix-loop-helix domain structure of the GTF2IRD2 gene has been generated by two independent genomic duplications. We speculate that GTF2I is derived from GTF2IRD1 as a result of local duplication and the further evolution of its structure was associated with its functional specialization. Comparison of genomic sequences surrounding GTF2IRD2 genes in mice and humans allows refinement of the centromeric breakpoint position of the primate-specific inversion within the Williams-Beuren syndrome critical region.
KEPLER ECLIPSING BINARIES WITH DELTA SCUTI/GAMMA DORADUS PULSATING COMPONENTS. I. KIC 9851944
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guo, Zhao; Gies, Douglas R.; Matson, Rachel A.
2016-07-20
KIC 9851944 is a short-period (P = 2.16 days) eclipsing binary in the Kepler field of view. By combining the analysis of Kepler photometry and phase-resolved spectra from Kitt Peak National Observatory and Lowell Observatory, we determine the atmospheric and physical parameters of both stars. The two components have very different radii (2.27 R⊙, 3.19 R⊙) but close masses (1.76 M⊙, 1.79 M⊙) and effective temperatures (7026, 6902 K), indicating different evolutionary stages. The hotter primary is still on the main sequence (MS), while the cooler and larger secondary star has evolved to the post-MS, burning hydrogen in a shell. A comparison with coeval evolutionary models shows that it requires solar metallicity and a higher mass ratio to fit the radii and temperatures of both stars simultaneously. Both components show δ Scuti-type pulsations, which we interpret as p-modes and p and g mixed modes. After a close examination of the evolution of δ Scuti pulsational frequencies, we make a comparison of the observed frequencies with those calculated from MESA/GYRE.
Kowalczyk, Marek; Sekuła, Andrzej; Mleczko, Piotr; Olszowy, Zofia; Kujawa, Anna; Zubek, Szymon; Kupiec, Tomasz
2015-01-01
Aim To assess the usefulness of a DNA-based method for identifying mushroom species for application in forensic laboratory practice. Methods Two hundred twenty-one samples of clinical forensic material (dried mushrooms, food remains, stomach contents, feces, etc) were analyzed. ITS2 region of nuclear ribosomal DNA (nrDNA) was sequenced and the sequences were compared with reference sequences collected from the National Center for Biotechnology Information gene bank (GenBank). Sporological identification of mushrooms was also performed for 57 samples of clinical material. Results Of 221 samples, positive sequencing results were obtained for 152 (69%). The highest percentage of positive results was obtained for samples of dried mushrooms (96%) and food remains (91%). Comparison with GenBank sequences enabled identification of all samples at least at the genus level. Most samples (90%) were identified at the level of species or a group of closely related species. Sporological and molecular identification were consistent at the level of species or genus for 30% of analyzed samples. Conclusion Molecular analysis identified a larger number of species than sporological method. It proved to be suitable for analysis of evidential material (dried hallucinogenic mushrooms) in forensic genetic laboratories as well as to complement classical methods in the analysis of clinical material. PMID:25727040
Lu, Bingxin; Leong, Hon Wai
2016-02-01
Genomic islands (GIs) are clusters of functionally related genes acquired by lateral genetic transfer (LGT), and they are present in many bacterial genomes. GIs are extremely important for bacterial research, because they not only promote genome evolution but also contain genes that enhance adaption and enable antibiotic resistance. Many methods have been proposed to predict GI. But most of them rely on either annotations or comparisons with other closely related genomes. Hence these methods cannot be easily applied to new genomes. As the number of newly sequenced bacterial genomes rapidly increases, there is a need for methods to detect GI based solely on sequences of a single genome. In this paper, we propose a novel method, GI-SVM, to predict GIs given only the unannotated genome sequence. GI-SVM is based on one-class support vector machine (SVM), utilizing composition bias in terms of k-mer content. From our evaluations on three real genomes, GI-SVM can achieve higher recall compared with current methods, without much loss of precision. Besides, GI-SVM allows flexible parameter tuning to get optimal results for each genome. In short, GI-SVM provides a more sensitive method for researchers interested in a first-pass detection of GI in newly sequenced genomes.
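The composition-bias idea can be illustrated without the classifier itself. The sketch below computes k-mer frequencies in sliding windows and scores each window by its Euclidean distance from the genome-wide composition. This deviation score is a deliberately simple stand-in chosen for illustration; it is not the one-class SVM that GI-SVM uses, and the function names, window size and k are assumptions.

```python
from itertools import product
from math import sqrt


def kmer_freqs(seq, k=2):
    """Normalized k-mer frequencies of seq in lexicographic k-mer order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # ignore k-mers with ambiguous bases
            counts[kmer] += 1
    total = sum(counts.values()) or 1
    return [counts[km] / total for km in kmers]


def window_scores(genome, window=40, step=40, k=2):
    """Return (start, score) pairs; higher scores mean more atypical composition."""
    background = kmer_freqs(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        local = kmer_freqs(genome[start:start + window], k)
        dist = sqrt(sum((a - b) ** 2 for a, b in zip(local, background)))
        scores.append((start, dist))
    return scores


# Toy genome (hypothetical): an AT-rich background with a GC-rich insert at 400-439.
toy = "AT" * 200 + "GC" * 20 + "AT" * 200
most_atypical = max(window_scores(toy), key=lambda pair: pair[1])
print(most_atypical[0])  # prints 400
```

In a one-class SVM setting the per-window frequency vectors would serve as the feature vectors, and the model would flag windows that fall outside the learned region of typical composition.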
Kowalczyk, Marek; Sekuła, Andrzej; Mleczko, Piotr; Olszowy, Zofia; Kujawa, Anna; Zubek, Szymon; Kupiec, Tomasz
2015-02-01
To assess the usefulness of a DNA-based method for identifying mushroom species for application in forensic laboratory practice. Two hundred twenty-one samples of clinical forensic material (dried mushrooms, food remains, stomach contents, feces, etc) were analyzed. ITS2 region of nuclear ribosomal DNA (nrDNA) was sequenced and the sequences were compared with reference sequences collected from the National Center for Biotechnology Information gene bank (GenBank). Sporological identification of mushrooms was also performed for 57 samples of clinical material. Of 221 samples, positive sequencing results were obtained for 152 (69%). The highest percentage of positive results was obtained for samples of dried mushrooms (96%) and food remains (91%). Comparison with GenBank sequences enabled identification of all samples at least at the genus level. Most samples (90%) were identified at the level of species or a group of closely related species. Sporological and molecular identification were consistent at the level of species or genus for 30% of analyzed samples. Molecular analysis identified a larger number of species than sporological method. It proved to be suitable for analysis of evidential material (dried hallucinogenic mushrooms) in forensic genetic laboratories as well as to complement classical methods in the analysis of clinical material.
Sellem, C. H.; d'Aubenton-Carafa, Y.; Rossignol, M.; Belcour, L.
1996-01-01
The mitochondrial genome of 23 wild-type strains belonging to three different species of the filamentous fungus Podospora was examined. Among the 15 optional sequences identified are two intronic reading frames, nad1-i4-orf1 and cox1-i7-orf2. We show that the presence of these sequences was strictly correlated with tightly clustered nucleotide substitutions in the adjacent exon. This correlation applies to the presence or absence of closely related open reading frames (ORFs), found at the same genetic locations, in all the Pyrenomycete genera examined. The recent gain of these optional ORFs in the evolution of the genus Podospora probably accounts for such sequence differences. In the homoplasmic progeny from heteroplasmons constructed between Podospora strains differing by the presence of these optional ORFs, nad1-i4-orf1 and cox1-i7-orf2 appeared highly invasive. Sequence comparisons in the nad1-i4 intron of various strains of the Pyrenomycete family led us to propose a scenario of its evolution that includes several events of loss and gain of intronic ORFs. These results strongly reinforce the idea that group I intronic ORFs are mobile elements and that their transfer, and concomitant modification of the adjacent exon, could participate in the modular evolution of mitochondrial genomes. PMID:8725226
Sellem, C H; d'Aubenton-Carafa, Y; Rossignol, M; Belcour, L
1996-06-01
The mitochondrial genome of 23 wild-type strains belonging to three different species of the filamentous fungus Podospora was examined. Among the 15 optional sequences identified are two intronic reading frames, nad1-i4-orf1 and cox1-i7-orf2. We show that the presence of these sequences was strictly correlated with tightly clustered nucleotide substitutions in the adjacent exon. This correlation applies to the presence or absence of closely related open reading frames (ORFs), found at the same genetic locations, in all the Pyrenomycete genera examined. The recent gain of these optional ORFs in the evolution of the genus Podospora probably accounts for such sequence differences. In the homoplasmic progeny from heteroplasmons constructed between Podospora strains differing by the presence of these optional ORFs, nad1-i4-orf1 and cox1-i7-orf2 appeared highly invasive. Sequence comparisons in the nad1-i4 intron of various strains of the Pyrenomycete family led us to propose a scenario of its evolution that includes several events of loss and gain of intronic ORFs. These results strongly reinforce the idea that group I intronic ORFs are mobile elements and that their transfer, and concomitant modification of the adjacent exon, could participate in the modular evolution of mitochondrial genomes.
Samuel, Arthur S.; Kumar, Sachin; Madhuri, Subbiah; Collins, Peter L.; Samal, Siba K.
2009-01-01
The complete genome consensus sequence was determined for avian paramyxovirus (APMV) serotype 9 prototype strain PMV-9/domestic Duck/New York/22/78. The genome is 15,438 nucleotides (nt) long and encodes six non-overlapping genes in the order of 3′-N-P/V/W-M-F-HN-L-5′ with intergenic regions of 0–30 nt. The genome length follows the “rule of six” and contains a 55-nt leader sequence at the 3′ end and a 47-nt trailer sequence at the 5′ end. The cleavage site of the F protein is I-R-E-G-R-I↓F, which does not conform to the conventional cleavage site of the ubiquitous cellular protease furin. The virus required exogenous protease for in vitro replication and grew only in a few established cell lines, indicating a restricted host range. Alignment and phylogenetic analysis of the predicted amino acid sequences of APMV-9 proteins with the cognate proteins of viruses of all five genera of family Paramyxoviridae showed that APMV-9 is more closely related to APMV-1 than to other APMVs. The mean death time in embryonated chicken eggs was found to be more than 120 h, indicating APMV-9 to be avirulent for chickens. PMID:19185593
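The "rule of six" referred to above requires the genome length to be a multiple of six nucleotides. A minimal check, using the 15,438-nt length reported in the abstract (the function name is illustrative):

```python
def follows_rule_of_six(genome_length_nt):
    """True when the genome length is an exact multiple of six nucleotides."""
    return genome_length_nt % 6 == 0


print(follows_rule_of_six(15438))  # prints True, since 15438 = 6 * 2573
```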
Birth and death of genes linked to chromosomal inversion
Furuta, Yoshikazu; Kawai, Mikihiko; Yahara, Koji; Takahashi, Noriko; Handa, Naofumi; Tsuru, Takeshi; Oshima, Kenshiro; Yoshida, Masaru; Azuma, Takeshi; Hattori, Masahira; Uchiyama, Ikuo; Kobayashi, Ichizo
2011-01-01
The birth and death of genes is central to adaptive evolution, yet the underlying genome dynamics remain elusive. The availability of closely related complete genome sequences helps to follow changes in gene contents and clarify their relationship to overall genome organization. Helicobacter pylori, bacteria in our stomach, are known for their extreme genome plasticity through mutation and recombination and will make a good target for such an analysis. In comparing their complete genome sequences, we found that gain and loss of genes (loci) for outer membrane proteins, which mediate host interaction, occurred at breakpoints of chromosomal inversions. Sequence comparison there revealed a unique mechanism of DNA duplication: DNA duplication associated with inversion. In this process, a DNA segment at one chromosomal locus is copied and inserted, in an inverted orientation, into a distant locus on the same chromosome, while the entire region between these two loci is also inverted. Recognition of this and three more inversion modes, which occur through reciprocal recombination between long or short sequence similarity or adjacent to a mobile element, allowed reconstruction of synteny evolution through inversion events in this species. These results will guide the interpretation of extensive DNA sequencing results for understanding long- and short-term genome evolution in various organisms and in cancer cells. PMID:21212362
Host Cell Virus Entry Mediated by Australian Bat Lyssavirus Envelope G glycoprotein
2013-10-24
39 Figure 7. Comparison of the amino acid sequences of Saccolaimus and Pteropus ABLV G mature protein... sequence analysis revealed that the PCR products were identical. Sequence comparisons of the ABLV N and other lyssavirus N proteins showed that ABLV...Saccolaimus flaviventris) (129). Nucleoprotein sequence comparisons revealed that the Saccolaimus N protein shared 96% amino acid homology with the Pteropus
Nucleotide sequences of two genomic DNAs encoding peroxidase of Arabidopsis thaliana.
Intapruk, C; Higashimura, N; Yamamoto, K; Okada, N; Shinmyo, A; Takano, M
1991-02-15
The peroxidase (EC 1.11.1.7)-encoding gene of Arabidopsis thaliana was screened from a genomic library using a cDNA encoding a neutral isozyme of horseradish, Armoracia rusticana, peroxidase (HRP) as a probe, and two positive clones were isolated. From the comparison with the sequences of the HRP-encoding genes, we concluded that two clones contained peroxidase-encoding genes, and they were named prxCa and prxEa. Both genes consisted of four exons and three introns; the introns had consensus nucleotides, GT and AG, at the 5' and 3' ends, respectively. The lengths of each putative exon of the prxEa gene were the same as those of the HRP-basic-isozyme-encoding gene, prxC3, and coded for 349 amino acids (aa) with a sequence homology of 89% to that encoded by prxC3. The prxCa gene was very close to the HRP-neutral-isozyme-encoding gene, prxC1b, and coded for 354 aa with 91% homology to that encoded by prxC1b. The aa sequence homology was 64% between the two peroxidases encoded by prxCa and prxEa.
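The GT...AG intron boundary consensus described above is easy to test programmatically. The sketch below checks only the two terminal dinucleotides of a candidate intron; the example sequences are made up for illustration and are not taken from prxCa or prxEa.

```python
def has_gt_ag_boundaries(intron):
    """True if the intron starts with GT (5' end) and ends with AG (3' end)."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


# Hypothetical intron sequences, for illustration only.
print(has_gt_ag_boundaries("GTAAGTCTTTCAG"))  # prints True
print(has_gt_ag_boundaries("ATAAGTCTTTCAG"))  # prints False
```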
Whiteduck-Léveillée, Kerri; Whiteduck-Léveillée, Jenni; Cloutier, Michel; Tambong, James T; Xu, Renlin; Topp, Edward; Arts, Michael T; Chao, Jerry; Adam, Zaky; Lévesque, C André; Lapen, David R; Villemur, Richard; Khan, Izhar U H
2016-03-01
A study on the taxonomic classification of Arcobacter species was performed on the cultures isolated from various fecal sources where an Arcobacter strain AF1078(T) from human waste septic tank near Ottawa, Ontario, Canada was characterized using a polyphasic approach. Genetic investigations including 16S rRNA, atpA, cpn60, gyrA, gyrB and rpoB gene sequences of strain AF1078(T) are unique in comparison with other arcobacters. Phylogenetic analysis based on the 16S rRNA gene sequence revealed that the strain is most closely related to Arcobacter lanthieri and Arcobacter cibarius. Analyses of atpA, cpn60, gyrA, gyrB and rpoB gene sequences suggested that strain AF1078(T) formed a phylogenetic lineage independent of other species in the genus. Whole-genome sequence, DNA-DNA hybridization, fatty acid profile and phenotypic analysis further supported the conclusion that strain AF1078(T) represents a novel Arcobacter species, for which the name Arcobacter faecis sp. nov. is proposed, with type strain AF1078(T) (=LMG 28519(T); CCUG 66484(T)). Crown Copyright © 2015. Published by Elsevier GmbH. All rights reserved.
Zygosaccharomyces favi sp. nov., an obligate osmophilic yeast species from bee bread and honey.
Čadež, Neža; Fülöp, László; Dlauchy, Dénes; Péter, Gábor
2015-03-01
Five yeast strains representing a hitherto undescribed yeast species were isolated from bee bread and honey in Hungary. They are obligate osmophilic, i.e. they are unable to grow in/on high water activity culture media. Following isogamous conjugation, they form 1-4 spheroid or subspheroid ascospores in persistent asci. The analysis of the sequences of their large subunit rRNA gene D1/D2 domain placed the new species in the Zygosaccharomyces clade. In terms of pairwise sequence similarity, Zygosaccharomyces gambellarensis is the most closely related species. Comparisons of D1/D2, internal transcribed spacer and translation elongation factor-1α (EF-1α) gene sequences of the five strains with that of the type strain of Z. gambellarensis revealed that they represent a new yeast species. The name Zygosaccharomyces favi sp. nov. (type strain: NCAIM Y.01994(T) = CBS 13653(T) = NRRL Y-63719(T) = ZIM 2551(T)) is proposed for this new yeast species, which based on phenotype can be distinguished from related Zygosaccharomyces species by its obligate osmophilic nature. Some intragenomic sequence variability, mainly indels, was detected among the ITS copies of the strains of the new species.
Wang, Xiao-Wei; Zhao, Qiong-Yi; Luan, Jun-Bo; Wang, Yu-Jun; Yan, Gen-Hong; Liu, Shu-Sheng
2012-10-04
Genomic divergence between invasive and native species may provide insight into the molecular basis underlying specific characteristics that drive the invasion and displacement of closely related species. In this study, we sequenced the transcriptome of an indigenous species, Asia II 3, of the Bemisia tabaci complex and compared its genetic divergence with the transcriptomes of two invasive whitefly species, Middle East Asia Minor 1 (MEAM1) and Mediterranean (MED), respectively. More than 16 million reads of 74 base pairs in length were obtained for the Asia II 3 species using the Illumina sequencing platform. These reads were assembled into 52,535 distinct sequences (mean size: 466 bp) and 16,596 sequences were annotated with an E-value above 10⁻⁵. Protein family comparisons revealed obvious diversification among the transcriptomes of these species suggesting species-specific adaptations during whitefly evolution. On the contrary, substantial conservation of the whitefly transcriptomes was also evident, despite their differences. The overall divergence of coding sequences between the orthologous gene pairs of Asia II 3 and MEAM1 is 1.73%, which is comparable to the average divergence of Asia II 3 and MED transcriptomes (1.84%) and much higher than that of MEAM1 and MED (0.83%). This is consistent with the previous phylogenetic analyses and crossing experiments suggesting these are distinct species. We also identified hundreds of highly diverged genes and compiled sequence identity data into gene functional groups and found the most divergent gene classes are Cytochrome P450, Glutathione metabolism and Oxidative phosphorylation. These results strongly suggest that the divergence of genes related to metabolism might be the driving force of the MEAM1 and Asia II 3 differentiation.
We also analyzed single nucleotide polymorphisms within the orthologous gene pairs of indigenous and invasive whiteflies, which are helpful for investigating associations between allelic variation and phenotypes. Our data present the most comprehensive sequences for the indigenous whitefly species Asia II 3. The extensive comparisons of Asia II 3, MEAM1 and MED transcriptomes will serve as an invaluable resource for revealing the genetic basis of whitefly invasion and the molecular mechanisms underlying their biological differences.
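Divergence figures such as the 1.73% quoted above are typically derived from aligned orthologous sequences. The sketch below shows one simple way to compute percent divergence from a pairwise alignment, counting mismatches over ungapped columns. This is an assumed, simplified definition for illustration; the abstract does not state how the authors calculated their values, and the function name is hypothetical.

```python
def percent_divergence(aligned_a, aligned_b, gap="-"):
    """Percent of mismatching positions among columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


# Toy alignment (hypothetical): one mismatch in ten ungapped columns.
print(percent_divergence("ACGTACGTAC", "ACGTACGTAT"))  # prints 10.0
```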
2012-01-01
Background Genomic divergence between invasive and native species may provide insight into the molecular basis underlying specific characteristics that drive the invasion and displacement of closely related species. In this study, we sequenced the transcriptome of an indigenous species, Asia II 3, of the Bemisia tabaci complex and compared its genetic divergence with the transcriptomes of two invasive whitefly species, Middle East Asia Minor 1 (MEAM1) and Mediterranean (MED), respectively. Results More than 16 million reads of 74 base pairs in length were obtained for the Asia II 3 species using the Illumina sequencing platform. These reads were assembled into 52,535 distinct sequences (mean size: 466 bp) and 16,596 sequences were annotated with an E-value above 10⁻⁵. Protein family comparisons revealed obvious diversification among the transcriptomes of these species suggesting species-specific adaptations during whitefly evolution. On the contrary, substantial conservation of the whitefly transcriptomes was also evident, despite their differences. The overall divergence of coding sequences between the orthologous gene pairs of Asia II 3 and MEAM1 is 1.73%, which is comparable to the average divergence of Asia II 3 and MED transcriptomes (1.84%) and much higher than that of MEAM1 and MED (0.83%). This is consistent with the previous phylogenetic analyses and crossing experiments suggesting these are distinct species. We also identified hundreds of highly diverged genes and compiled sequence identity data into gene functional groups and found the most divergent gene classes are Cytochrome P450, Glutathione metabolism and Oxidative phosphorylation. These results strongly suggest that the divergence of genes related to metabolism might be the driving force of the MEAM1 and Asia II 3 differentiation.
We also analyzed single nucleotide polymorphisms within the orthologous gene pairs of indigenous and invasive whiteflies, which are helpful for investigating associations between allelic variation and phenotypes. Conclusions Our data present the most comprehensive sequences for the indigenous whitefly species Asia II 3. The extensive comparisons of Asia II 3, MEAM1 and MED transcriptomes will serve as an invaluable resource for revealing the genetic basis of whitefly invasion and the molecular mechanisms underlying their biological differences. PMID:23036081
Nguyen, Tuan; Ruan, Zheng; Oruganty, Krishnadev; Kannan, Natarajan
2015-01-01
Mitogen activated protein kinases (MAPKs) form a closely related family of kinases that control critical pathways associated with cell growth and survival. Although MAPKs have been extensively characterized at the biochemical, cellular, and structural level, an integrated evolutionary understanding of how MAPKs differ from other closely related protein kinases is currently lacking. Here, we perform statistical sequence comparisons of MAPKs and related protein kinases to identify sequence and structural features associated with MAPK functional divergence. We show, for the first time, that virtually all MAPK-distinguishing sequence features, including an unappreciated short insert segment in the β4-β5 loop, physically couple distal functional sites in the kinase domain to the D-domain peptide docking groove via the C-terminal flanking tail (C-tail). The coupling mediated by MAPK-specific residues confers an allosteric regulatory mechanism unique to MAPKs. In particular, the regulatory αC-helix conformation is controlled by a MAPK-conserved salt bridge interaction between an arginine in the αC-helix and an acidic residue in the C-tail. The salt-bridge interaction is modulated in unique ways in individual sub-families to achieve regulatory specificity. Our study is consistent with a model in which the C-tail co-evolved with the D-domain docking site to allosterically control MAPK activity. Our study provides testable mechanistic hypotheses for biochemical characterization of MAPK-conserved residues and new avenues for the design of allosteric MAPK inhibitors. PMID:25799139
The ancestry and affiliations of Kennewick Man.
Rasmussen, Morten; Sikora, Martin; Albrechtsen, Anders; Korneliussen, Thorfinn Sand; Moreno-Mayar, J Víctor; Poznik, G David; Zollikofer, Christoph P E; de León, Marcia Ponce; Allentoft, Morten E; Moltke, Ida; Jónsson, Hákon; Valdiosera, Cristina; Malhi, Ripan S; Orlando, Ludovic; Bustamante, Carlos D; Stafford, Thomas W; Meltzer, David J; Nielsen, Rasmus; Willerslev, Eske
2015-07-23
Kennewick Man, referred to as the Ancient One by Native Americans, is a male human skeleton discovered in Washington state (USA) in 1996 and initially radiocarbon dated to 8,340-9,200 calibrated years before present (BP). His population affinities have been the subject of scientific debate and legal controversy. Based on an initial study of cranial morphology it was asserted that Kennewick Man was neither Native American nor closely related to the claimant Plateau tribes of the Pacific Northwest, who claimed ancestral relationship and requested repatriation under the Native American Graves Protection and Repatriation Act (NAGPRA). The morphological analysis was important to judicial decisions that Kennewick Man was not Native American and that therefore NAGPRA did not apply. Instead of repatriation, additional studies of the remains were permitted. Subsequent craniometric analysis affirmed Kennewick Man to be more closely related to circumpacific groups such as the Ainu and Polynesians than he is to modern Native Americans. In order to resolve Kennewick Man's ancestry and affiliations, we have sequenced his genome to ∼1× coverage and compared it to worldwide genomic data including for the Ainu and Polynesians. We find that Kennewick Man is closer to modern Native Americans than to any other population worldwide. Among the Native American groups for whom genome-wide data are available for comparison, several seem to be descended from a population closely related to that of Kennewick Man, including the Confederated Tribes of the Colville Reservation (Colville), one of the five tribes claiming Kennewick Man. We revisit the cranial analyses and find that, as opposed to genome-wide comparisons, it is not possible on that basis to affiliate Kennewick Man to specific contemporary groups. We therefore conclude based on genetic comparisons that Kennewick Man shows continuity with Native North Americans over at least the last eight millennia.
The genome sequence of the model ascomycete fungus Podospora anserina.
Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne Gj; Henrissat, Bernard; Khoury, Riyad El; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe
2008-01-01
The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed sequence tag collection. Similar to higher eukaryotes, the P. anserina transcription/splicing machinery generates numerous non-conventional transcripts. Comparison of the P. anserina genome and orthologous gene set with the one of its close relatives, Neurospora crassa, shows that synteny is poorly conserved, the main result of evolution being gene shuffling in the same chromosome. The P. anserina genome contains fewer repeated sequences and has evolved new genes by duplication since its separation from N. crassa, despite the presence of the repeat induced point mutation mechanism that mutates duplicated sequences. We also provide evidence that frequent gene loss took place in the lineages leading to P. anserina and N. crassa. P. anserina contains a large and highly specialized set of genes involved in utilization of natural carbon sources commonly found in its natural biotope. It includes genes potentially involved in lignin degradation and efficient cellulose breakdown. The features of the P. anserina genome indicate a highly dynamic evolution since the divergence of P. anserina and N. crassa, leading to the ability of the former to use specific complex carbon sources that match its needs in its natural biotope.
Soo Shin, Jane Hae
2017-01-01
Guanine-rich (G-rich) homopurine–homopyrimidine nucleotide sequences can block transcription with an efficiency that depends upon their orientation, composition and length, as well as the presence of negative supercoiling or breaks in the non-template DNA strand. We report that a G-rich sequence in the non-template strand reduces the yield of T7 RNA polymerase transcription by more than an order of magnitude when positioned close (9 bp) to the promoter, in comparison to that for a distal (∼250 bp) location of the same sequence. This transcription blockage is much less pronounced for a C-rich sequence, and is not significant for an A-rich sequence. Remarkably, the blockage is not pronounced if transcription is performed in the presence of RNase H, which specifically digests the RNA strands within RNA–DNA hybrids. The blockage also becomes less pronounced upon reduced RNA polymerase concentration. Based upon these observations and those from control experiments, we conclude that the blockage is primarily due to the formation of stable RNA–DNA hybrids (R-loops), which inhibit successive rounds of transcription. Our results could be relevant to transcription dynamics in vivo (e.g. transcription ‘bursting’) and may also have practical implications for the design of expression vectors. PMID:28498974
Belotserkovskii, Boris P; Soo Shin, Jane Hae; Hanawalt, Philip C
2017-06-20
Guanine-rich (G-rich) homopurine-homopyrimidine nucleotide sequences can block transcription with an efficiency that depends upon their orientation, composition and length, as well as the presence of negative supercoiling or breaks in the non-template DNA strand. We report that a G-rich sequence in the non-template strand reduces the yield of T7 RNA polymerase transcription by more than an order of magnitude when positioned close (9 bp) to the promoter, in comparison to that for a distal (∼250 bp) location of the same sequence. This transcription blockage is much less pronounced for a C-rich sequence, and is not significant for an A-rich sequence. Remarkably, the blockage is not pronounced if transcription is performed in the presence of RNase H, which specifically digests the RNA strands within RNA-DNA hybrids. The blockage also becomes less pronounced upon reduced RNA polymerase concentration. Based upon these observations and those from control experiments, we conclude that the blockage is primarily due to the formation of stable RNA-DNA hybrids (R-loops), which inhibit successive rounds of transcription. Our results could be relevant to transcription dynamics in vivo (e.g. transcription 'bursting') and may also have practical implications for the design of expression vectors. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
New Insights into Asian Prunus Viruses in the Light of NGS-Based Full Genome Sequencing.
Marais, Armelle; Faure, Chantal; Candresse, Thierry
2016-01-01
Double stranded RNAs were purified from five Prunus sources of Asian origin and submitted to 454 pyrosequencing after a random, whole genome amplification. Four complete genomes of Asian prunus virus 1 (APV1), APV2 and APV3 were reconstructed from the sequencing reads, as well as four additional, near-complete genome sequences. Phylogenetic analyses confirmed the close relationships of these three viruses and the taxonomical position previously proposed for APV1, the only APV so far completely sequenced. The genetic distances in the respective polymerase and coat protein genes as well as their gene products suggest that APV2 should be considered as a distinct viral species in the genus Foveavirus, even if the amino acid identity levels in the polymerase are very close to the species demarcation criteria for the family Betaflexiviridae. However, the situation is more complex for APV1 and APV3, for which opposite conclusions are obtained depending on the gene (polymerase or coat protein) analyzed. Phylogenetic and recombination analyses suggest that recombination events may have been involved in the evolution of APV. Moreover, genome comparisons show that the unusually long 3' non-coding region (3' NCR) is highly variable and a hot spot for indel polymorphisms. In particular, two APV3 variants differing only in their 3' NCR were identified in a single Prunus source, with 3' NCRs of 214-312 nt, a size similar to that observed in other foveaviruses, but 567-850 nt smaller than in other APV3 isolates. Overall, this study provides critical genome information of these viruses, frequently associated with Prunus materials, even though their precise role as pathogens remains to be elucidated.
One Bacterial Cell, One Complete Genome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos
2010-04-26
While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva allowed us to assess that this bacterium is polyploid with genome copies ranging from approximately 200–900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyses.
Nørskov-Lauritsen, Niels; Overballe, Merete D.; Kilian, Mogens
2009-01-01
To obtain more information on the much-debated definition of prokaryotic species, we investigated the borders of Haemophilus influenzae by comparative analysis of H. influenzae reference strains with closely related bacteria including strains assigned to Haemophilus haemolyticus, cryptic genospecies biotype IV, and the never formally validated species “Haemophilus intermedius”. Multilocus sequence phylogeny based on six housekeeping genes separated a cluster encompassing the type and the reference strains of H. influenzae from 31 more distantly related strains. Comparison of 16S rRNA gene sequences supported this delineation but was obscured by a conspicuously high number of polymorphic sites in many of the strains that did not belong to the core group of H. influenzae strains. The division was corroborated by the differential presence of genes encoding H. influenzae adhesion and penetration protein, fuculokinase, and Cu,Zn-superoxide dismutase, whereas immunoglobulin A1 protease activity or the presence of the iga gene was of limited discriminatory value. The existence of porphyrin-synthesizing strains (“H. intermedius”) closely related to H. influenzae was confirmed. Several chromosomally encoded hemin biosynthesis genes were identified, and sequence analysis showed these genes to represent an ancestral genotype rather than recent transfers from, e.g., Haemophilus parainfluenzae. Strains previously assigned to H. haemolyticus formed several separate lineages within a distinct but deeply branching cluster, intermingled with strains of “H. intermedius” and cryptic genospecies biotype IV. Although H. influenzae is phenotypically more homogenous than some other Haemophilus species, the genetic diversity and multicluster structure of strains traditionally associated with H. influenzae make it difficult to define the natural borders of that species. PMID:19060144
New Insights into Asian Prunus Viruses in the Light of NGS-Based Full Genome Sequencing
Marais, Armelle; Faure, Chantal; Candresse, Thierry
2016-01-01
Double stranded RNAs were purified from five Prunus sources of Asian origin and submitted to 454 pyrosequencing after a random, whole genome amplification. Four complete genomes of Asian prunus virus 1 (APV1), APV2 and APV3 were reconstructed from the sequencing reads, as well as four additional, near-complete genome sequences. Phylogenetic analyses confirmed the close relationships of these three viruses and the taxonomical position previously proposed for APV1, the only APV so far completely sequenced. The genetic distances in the respective polymerase and coat protein genes as well as their gene products suggest that APV2 should be considered as a distinct viral species in the genus Foveavirus, even if the amino acid identity levels in the polymerase are very close to the species demarcation criteria for the family Betaflexiviridae. However, the situation is more complex for APV1 and APV3, for which opposite conclusions are obtained depending on the gene (polymerase or coat protein) analyzed. Phylogenetic and recombination analyses suggest that recombination events may have been involved in the evolution of APV. Moreover, genome comparisons show that the unusually long 3’ non-coding region (3' NCR) is highly variable and a hot spot for indel polymorphisms. In particular, two APV3 variants differing only in their 3’ NCR were identified in a single Prunus source, with 3' NCRs of 214–312 nt, a size similar to that observed in other foveaviruses, but 567–850 nt smaller than in other APV3 isolates. Overall, this study provides critical genome information of these viruses, frequently associated with Prunus materials, even though their precise role as pathogens remains to be elucidated. PMID:26741704
Zienius, D; Lelešius, R; Kavaliauskis, H; Stankevičius, A; Šalomskas, A
2016-01-01
The aim of the present study was to detect canine parvovirus (CPV) in faecal samples from clinically ill domestic dogs by polymerase chain reaction (PCR), followed by partial sequencing of the VP2 gene and molecular characterization of the strains circulating in Lithuania. Eleven faecal samples from clinically ill, antigen-test-positive dogs, collected during 2014-2015, were investigated by PCR. The phylogenetic investigations indicated that the Lithuanian CPV VP2 partial sequences (3025-3706 cds) were closely related and showed 99.0-99.9% identity. All Lithuanian sequences were associated with one phylogroup but grouped in different clusters. Ten of the investigated Lithuanian CPV VP2 sequences were closely associated with the CPV-2a antigenic variant (99.4% nt identity). Five CPV VP2 sequences from Lithuania were related to CPV-2a but were rather divergent (6.8 nt differences). Only one CPV VP2 sequence from Lithuania was associated (99.3% nt identity) with CPV-2b VP2 sequences from France, Italy, USA and Korea. Four of the eleven investigated Lithuanian dogs with symptoms of CPV infection had been vaccinated with a CPV-2 vaccine, but their VP2 sequences were phylogenetically distant from the VP2 sequences of CPV vaccine strains (11.5-15.8 nt differences). Ten Lithuanian CPV VP2 sequences showed monophyletic relations with geographically close samples, but five of them were rather divergent (1.0% lower sequence similarity). One Lithuanian CPV VP2 sequence was closely related to the CPV-2b antigenic variant. All the Lithuanian CPV VP2 partial sequences were conserved and only distantly related phylogenetically to the most commonly used CPV vaccine strains.
Hausmann, Axel; Cancian de Araujo, Bruno; Sutrisno, Hari; Peggie, Djunijanti; Schmidt, Stefan
2017-01-01
Here we present a general collecting and preparation protocol for DNA barcoding of Lepidoptera as part of large-scale rapid biodiversity assessment projects, and a comparison with alternative preserving and vouchering methods. About 98% of the sequenced specimens processed using the present collecting and preparation protocol yielded sequences with more than 500 base pairs. The study is based on the first outcomes of the Indonesian Biodiversity Discovery and Information System (IndoBioSys). IndoBioSys is a German-Indonesian research project that is conducted by the Museum für Naturkunde in Berlin and the Zoologische Staatssammlung München, in close cooperation with the Research Center for Biology – Indonesian Institute of Sciences (RCB-LIPI, Bogor). PMID:29134041
Hosseini, A; Koohi Habibi, M; Izadpanah, K; Mosahebi, G H; Rubies-Autonell, C; Ratti, C
2010-10-01
Bermuda grass plants with mosaic symptoms have been found in many parts of Iran. No serological correlation was observed between two isolates of this filamentous virus and any of the members of the family Potyviridae that were tested. Aphid transmission was demonstrated at low efficiency for isolates of this virus, whereas no transmission through seed was observed. A DNA fragment corresponding to the 3' end of the viral genome of these two isolates from Iran and one isolate from Italy was amplified and sequenced. A BLAST search showed that these isolates are more closely related to Spartina mottle virus (SpMV) than to any other virus in the family Potyviridae. Specific serological assays confirmed the phylogenetic analysis. Sequence and phylogenetic analysis suggested that these isolates could be considered as divergent strains of SpMV in the proposed genus Sparmovirus.
Nessen, Merel A; van der Zwaan, Dennis J; Grevers, Sander; Dalebout, Hans; Staats, Martijn; Kok, Esther; Palmblad, Magnus
2016-05-11
Proteomics methodology has seen increased application in food authentication, including tandem mass spectrometry of targeted species-specific peptides in raw, processed, or mixed food products. We have previously described an alternative principle that uses untargeted data acquisition and spectral library matching, essentially spectral counting, to compare and identify samples without the need for genomic sequence information in food species populations. Here, we present an interlaboratory comparison demonstrating how a method based on this principle performs in a realistic context. We also increasingly challenge the method by using data from different types of mass spectrometers, by trying to distinguish closely related and commercially important flatfish, and by analyzing heavily contaminated samples. The method was found to be robust in different laboratories, and 94-97% of the analyzed samples were correctly identified, including all processed and contaminated samples.
NMR-based diffusion pore imaging by double wave vector measurements.
Kuder, Tristan Anselm; Laun, Frederik Bernd
2013-09-01
One main interest of nuclear magnetic resonance (NMR) diffusion experiments is the investigation of boundaries such as cell membranes hindering the diffusion process. NMR diffusion measurements allow collecting the signal from the whole sample. This mainly eliminates the problem of vanishing signal at increasing resolution. It has been a longstanding question if, in principle, the exact shape of closed pores can be determined by NMR diffusion measurements. In this work, we present a method using short diffusion gradient pulses only, which is able to reveal the shape of arbitrary closed pores without relying on a priori knowledge. In comparison to former approaches, the method has reduced demands on relaxation times due to faster convergence to the diffusion long-time limit and allows for a more flexible NMR sequence design, because, e.g., stimulated echoes can be used. Copyright © 2012 Wiley Periodicals, Inc.
Stelis zootrophionoides (Orchidaceae: Pleurothallidinae), a New Species from Mexico
Ramos-Castro, Sergio E.; Castañeda-Zárate, Miguel; Solano-Gómez, Rodolfo; Salazar, Gerardo A.
2012-01-01
Background Stelis (Orchidaceae) encompasses approximately 1100 species of epiphytic orchids distributed throughout the Neotropics, with the highest diversity in Andean South America. Sixty-two species were recorded previously in Mexico. Methods We formally describe here Stelis zootrophionoides as a new species from Chiapas, Mexico. To determine its systematic position, we conducted a morphological comparison with other members of Pleurothallidinae and a phylogenetic analysis of nucleotide sequences from the plastid matK/trnK and trnL/trnF regions, as well as the nuclear ribosomal ITS region for 52 species of Pleurothallidinae. Sequences of 49 species were downloaded from GenBank and those of three species, including the new taxon, were newly generated for this work. The new species is described and illustrated; notes on its ecological preferences and a comparison with closely related species are presented. Conclusions The new species, known only from one location and apparently restricted to the cloud forest in the central highlands of Chiapas, Mexico, is considered a rare species. This small epiphyte is unique among the Mexican species of Stelis by the combination of dark purple flowers with the distal third of the dorsal sepal adhered to the apices of the lateral sepals, which are partially united into a bifid synsepal, leaving two lateral window-like openings, and sagittate labellum. Stelis jalapensis, known from southern Mexico and Guatemala, also has the apices of the sepals adhered to each other, but it is distinguished by its larger flowers with lanceolate, acute dorsal sepal, completely fused lateral sepals (i.e. the synsepal is not bifid), and oblong-elliptic labellum. The phylogenetic analysis shows that S. zootrophionoides is closely related to other Mexican Stelis and corroborates previous suggestions that fused sepal apices have arisen independently in different lineages of Pleurothallidinae. PMID:23144987
Shakoori, Farah R; Tasneem, Fareeda; Al-Ghanim, K; Mahboob, S; Al-Misned, F; Jahan, Nusrat; Shakoori, Abdul Rauf
2014-12-01
Besides cytological and molecular applications, Paramecium is being used in water quality assessment and for determination of saprobic levels. An unambiguous identification of these unicellular eukaryotes is not only essential, but its ecological diversity must also be explored in the local environment. 18S rRNA genes of all the strains of Paramecium species isolated from waste water were amplified, cloned and sequenced. Phylogenetic comparison of the nucleotide sequences of these strains with 23 closely related Paramecium species from GenBank Database enabled identification of Paramecium multimicronucleatum and Paramecium jenningsi. Some isolates did not show significant close association with other Paramecium species, and because of their unique position in the phylogenetic tree, they were considered new to the field. In the present report, these isolates are being designated as Paramecium caudatum pakistanicus. In this article, secondary structure of 18S rRNA has also been analyzed as an additional and perhaps more reliable topological marker for species discrimination and for determining possible phylogenetic relationship between the ciliate species. On the basis of comparison of secondary structure of 18S rRNA of various isolated Paramecium strains, and among Paramecium caudatum pakistanicus, Tetrahymena thermophila, Drosophila melanogaster, and Homo sapiens, it can be deduced that variable regions are more helpful in differentiating the species at interspecific level rather than at intraspecific level. It was concluded that V3 was the least variable region in all the organisms, V2 and V7 were the longest expansion segments of D. melanogaster and there was continuous mutational bias towards G.C base pairing in H. sapiens. © 2014 Wiley Periodicals, Inc.
Wentz, Travis G.; Muruvanda, Tim; Thirunavukkarasu, Nagarajan; Hoffmann, Maria; Allard, Marc W.; Hodge, David R.; Pillai, Segaran P.; Hammack, Thomas S.; Brown, Eric W.
2017-01-01
Clostridial neurotoxins, including botulinum and tetanus neurotoxins, are among the deadliest known bacterial toxins. Until recently, the horizontal mobility of this toxin gene family appeared to be limited to the genus Clostridium. We report here the closed genome sequence of Chryseobacterium piperi, a Gram-negative bacterium containing coding sequences with homology to clostridial neurotoxin family proteins. PMID:29192076
Wu, Y.; Zheng, J.; Robbins, R. T.
2007-01-01
A population of Xiphinema hunaniense Wang and Wu, 1992 with all four juvenile stages was found in the rhizosphere of Pinus sp. in Hangzhou, Zhejiang, China. Morphometrics of 18 females and 35 juveniles of this population are given herein. Detailed morphology and morphometrics of the four juvenile stages are provided. Further comparisons based on morphometrics of the population with previous studies of the females and the first-stage juveniles of X. hunaniense with X. radicicola are given, and morphological variation in X. hunaniense populations is discussed. A revised polytomous key code of Loof and Luc (1990) for X. hunaniense identification is provided, i.e., A1- B4- C4- D4/5- E1- F2(3)- G2- H2-I3- J4- K2- L1. In addition, the sequence of the D2 and D3 expansion region of the 28S rRNA gene was analyzed and compared with sequences of closely related species downloaded from the NCBI database. Cluster analysis of sequences confirmed and supported the species identifications. PMID:19259473
Variability Studies of Two Prunus-Infecting Fabaviruses with the Aid of High-Throughput Sequencing
Sarkisova, Tatiana; Lenz, Ondřej; Přibylová, Jaroslava; Špak, Josef; Lotos, Leonidas; Beta, Christina; Katsiani, Asimina; Candresse, Thierry
2018-01-01
During their lifetime, perennial woody plants are expected to face multiple infection events. Furthermore, multiple genotypes of individual virus species may co-infect the same host. This may eventually lead to a situation where plants harbor complex communities of viral species/strains. Using high-throughput sequencing, we describe co-infection of sweet and sour cherry trees with diverse genomic variants of two closely related viruses, namely prunus virus F (PrVF) and cherry virus F (CVF). Both viruses are most homologous to members of the Fabavirus genus (Secoviridae family). The comparison of CVF and PrVF RNA2 genomic sequences suggests that the two viruses may significantly differ in their expression strategy. Indeed, similar to comoviruses, the smaller genomic segment of PrVF, RNA2, may be translated in two collinear proteins while CVF likely expresses only the shorter of these two proteins. Linked with the observation that identity levels between the coat proteins of these two viruses are significantly below the family species demarcation cut-off, these findings support the idea that CVF and PrVF represent two separate Fabavirus species. PMID:29670059
Seck, E H; Diop, A; Armstrong, N; Delerce, J; Fournier, P-E; Raoult, D; Khelaifia, S
2018-05-01
Bacillus salis strain ES3T (= CSUR P1478 = DSM 100598) is the type strain of B. salis sp. nov. It is an aerobic, Gram-positive, moderately halophilic, motile and spore-forming bacterium. It was isolated from commercial table salt as part of a broad culturomics study aiming to maximize the culture conditions for the in-depth exploration of halophilic bacteria in salty food. Here we describe the phenotypic characteristics of this isolate, its complete genome sequence and annotation, together with a comparison with closely related bacteria. Phylogenetic analysis based on 16S rRNA gene sequences indicated 97.5% similarity with Bacillus aquimaris, the closest species. The 8 329 771 bp long genome (one chromosome, no plasmids) exhibits a G+C content of 39.19%. It is composed of 18 scaffolds with 29 contigs. Of the 8303 predicted genes, 8109 were protein-coding genes and 194 were RNAs. A total of 5778 genes (71.25%) were assigned a putative function.
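The G+C content and gene-count percentages reported above are simple ratios. As a minimal Python sketch (function names are hypothetical, the sequence is a toy string rather than the B. salis genome, and the sketch assumes that ambiguous bases such as N are excluded from the denominator), they can be reproduced as follows:

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases of a sequence."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    at = sequence.count("A") + sequence.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def percentage(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 2)


if __name__ == "__main__":
    # Toy sequence: 4 G/C bases out of 8 unambiguous bases.
    print(gc_content("GGCCATATNN"))
    # Counts taken from the abstract: genes with a putative function
    # relative to protein-coding genes.
    print(percentage(5778, 8109))
```

With the counts from the abstract, 5778 of 8109 protein-coding genes gives 71.25%, which indicates that the reported percentage is relative to protein-coding genes rather than to all 8303 predicted genes.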
Johanne Hansen, Mie; Strøm Braaten, Mira; Miki Bojesen, Anders; Christensen, Henrik; Sonne, Christian; Dietz, Rune; Frost Bertelsen, Mads
2015-10-01
Thirty-three suspected strains of the family Pasteurellaceae isolated from the oral cavity of polar and brown bears were characterized by genotypic and phenotypic tests. Phylogenetic analysis of partial 16S rRNA gene and rpoB sequences showed that the investigated isolates formed two closely related monophyletic groups, representing two novel species of a new genus. Based on 16S rRNA gene sequence comparison Bibersteinia trehalosi was the closest related species with a validly published name, with 95.4 % similarity to the polar bear group and 94.4 % similarity to the brown bear group. Otariodibacter oris was the closest related species based on rpoB sequence comparison with a similarity of 89.8 % with the polar bear group and 90 % with the brown bear group. The new genus could be separated from existing genera of the family Pasteurellaceae by three to ten phenotypic characters, and the two novel species could be separated from each other by two phenotypic characters. It is proposed that the strains should be classified as representatives of a new genus, Ursidibacter gen. nov., with two novel species: the type species Ursidibacter maritimus sp. nov., isolated from polar bears (type strain Pb43106T = CCUG 65144T = DSM 28137T, DNA G+C content 39.3 mol%), and Ursidibacter arcticus sp. nov., isolated from brown bears (type strain Bamse61T = CCUG 65145T = DSM 28138T).
Monteiro, Rose A; Balsanelli, Eduardo; Tuleski, Thalita; Faoro, Helison; Cruz, Leonardo M; Wassem, Roseli; de Baura, Valter A; Tadra-Sfeir, Michelle Z; Weiss, Vinícius; DaRocha, Wanderson D; Muller-Santos, Marcelo; Chubatsu, Leda S; Huergo, Luciano F; Pedrosa, Fábio O; de Souza, Emanuel M
2012-05-01
Herbaspirillum rubrisubalbicans M1 causes the mottled stripe disease in sugarcane cv. B-4362. Inoculation of this cultivar with Herbaspirillum seropedicae SmR1 does not produce disease symptoms. A comparison of the genomic sequences of these closely related species may permit a better understanding of contrasting phenotype such as endophytic association and pathogenic life style. To achieve this goal, we constructed suppressive subtractive hybridization (SSH) libraries to identify DNA fragments present in one species and absent in the other. In a parallel approach, partial genomic sequence from H. rubrisubalbicans M1 was directly compared in silico with the H. seropedicae SmR1 genome. The genomic differences between the two organisms revealed by SSH suggested that lipopolysaccharide and adhesins are potential molecular factors involved in the different phenotypic behavior. The cluster wss probably involved in cellulose biosynthesis was found in H. rubrisubalbicans M1. Expression of this gene cluster was increased in H. rubrisubalbicans M1 cells attached to the surface of maize root, and knockout of wssD gene led to decrease in maize root surface attachment and endophytic colonization. The production of cellulose could be responsible for the maize attachment pattern of H. rubrisubalbicans M1 that is capable of outcompeting H. seropedicae SmR1. © 2012 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
ERIC Educational Resources Information Center
Caglayan, Günhan
2016-01-01
A Steiner chain is defined as the sequence of n circles that are all tangent to two given non-intersecting circles. A closed chain, in particular, is one in which every circle in the sequence is tangent to the previous and next circles of the chain. In a closed Steiner chain the first and the "n"th circles of the chain are also tangent…
Genome sequence and rapid evolution of the rice pathogen Xanthomonas oryzae pv. oryzae PXO99A
Salzberg, Steven L; Sommer, Daniel D; Schatz, Michael C; Phillippy, Adam M; Rabinowicz, Pablo D; Tsuge, Seiji; Furutani, Ayako; Ochiai, Hirokazu; Delcher, Arthur L; Kelley, David; Madupu, Ramana; Puiu, Daniela; Radune, Diana; Shumway, Martin; Trapnell, Cole; Aparna, Gudlur; Jha, Gopaljee; Pandey, Alok; Patil, Prabhu B; Ishihara, Hiromichi; Meyer, Damien F; Szurek, Boris; Verdier, Valerie; Koebnik, Ralf; Dow, J Maxwell; Ryan, Robert P; Hirata, Hisae; Tsuyumu, Shinji; Won Lee, Sang; Ronald, Pamela C; Sonti, Ramesh V; Van Sluys, Marie-Anne; Leach, Jan E; White, Frank F; Bogdanove, Adam J
2008-01-01
Background: Xanthomonas oryzae pv. oryzae causes bacterial blight of rice (Oryza sativa L.), a major disease that constrains production of this staple crop in many parts of the world. We report here on the complete genome sequence of strain PXO99A and its comparison to two previously sequenced strains, KACC10331 and MAFF311018, which are highly similar to one another. Results: The PXO99A genome is a single circular chromosome of 5,240,075 bp, considerably longer than the genomes of the other strains (4,941,439 bp and 4,940,217 bp, respectively), and it contains 5083 protein-coding genes, including 87 not found in KACC10331 or MAFF311018. PXO99A contains a greater number of virulence-associated transcription activator-like effector genes and has at least ten major chromosomal rearrangements relative to KACC10331 and MAFF311018. PXO99A contains numerous copies of diverse insertion sequence elements, members of which are associated with 7 out of 10 of the major rearrangements. A rapidly-evolving CRISPR (clustered regularly interspersed short palindromic repeats) region contains evidence of dozens of phage infections unique to the PXO99A lineage. PXO99A also contains a unique, near-perfect tandem repeat of 212 kilobases close to the replication terminus. Conclusion: Our results provide striking evidence of genome plasticity and rapid evolution within Xanthomonas oryzae pv. oryzae. The comparisons point to sources of genomic variation and candidates for strain-specific adaptations of this pathogen that help to explain the extraordinary diversity of Xanthomonas oryzae pv. oryzae genotypes and races that have been isolated from around the world. PMID:18452608
Whistler, Cheryl A; Hall, Jeffrey A; Xu, Feng; Ilyas, Saba; Siwakoti, Puskar; Cooper, Vaughn S; Jones, Stephen H
2015-06-01
Vibrio parahaemolyticus sequence type 36 (ST36) strains that are native to the Pacific Ocean have recently caused multistate outbreaks of gastroenteritis linked to shellfish harvested from the Atlantic Ocean. Whole-genome comparisons of 295 genomes of V. parahaemolyticus, including several traced to northeastern U.S. sources, were used to identify diagnostic loci, one putatively encoding an endonuclease (prp), and two others potentially conferring O-antigenic properties (cps and flp). The combination of all three loci was present in only one clade of closely related strains of ST36, ST59, and one additional unknown sequence type. However, each locus was also identified outside this clade, with prp and flp occurring in only two nonclade isolates and cps in four. Based on the distribution of these loci in sequenced genomes, prp identified clade strains with >99% accuracy, but the addition of one more locus increased accuracy to 100%. Oligonucleotide primers targeting prp and cps were combined in a multiplex PCR method that defines species using the tlh locus and determines the presence of both the tdh and trh hemolysin-encoding genes, which are also present in ST36. Application of the method in vitro to a collection of 94 clinical isolates collected over a 4-year period in three northeastern U.S. states and 87 environmental isolates revealed that the prp and cps amplicons were detected only in clinical isolates identified as belonging to the ST36 clade and in no environmental isolates from the region. The assay should improve detection and surveillance, thereby reducing infections. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
2014-01-01
Background: Foxtail millet (Setaria italica (L.) Beauv.) is an important gramineous grain-food and forage crop. It is grown worldwide for human and livestock consumption. Its small genome and diploid nature have led to foxtail millet fast becoming a novel model for investigating plant architecture, drought tolerance and C4 photosynthesis of grain and bioenergy crops. Therefore, cost-effective, reliable and highly polymorphic molecular markers covering the entire genome are required for diversity, mapping and functional genomics studies in this model species. Results: A total of 5,020 highly repetitive microsatellite motifs were isolated from the released genome of the genotype 'Yugu1' by sequence scanning. Based on sequence comparison between S. italica and S. viridis, a set of 788 SSR primer pairs were designed. Of these primers, 733 produced reproducible amplicons and were polymorphic among 28 Setaria genotypes selected from diverse geographical locations. The number of alleles detected by these SSR markers ranged from 2 to 16, with an average polymorphism information content of 0.67. The result obtained by neighbor-joining cluster analysis of 28 Setaria genotypes, based on Nei’s genetic distance of the SSR data, showed that these SSR markers are highly polymorphic and effective. Conclusions: A large set of highly polymorphic SSR markers were successfully and efficiently developed based on genomic sequence comparison between different genotypes of the genus Setaria. The large number of new SSR markers and their placement on the physical map represent a valuable resource for studying diversity, constructing genetic maps, functional gene mapping, QTL exploration and molecular breeding in foxtail millet and its closely related species. PMID:24472631
Zhang, Shuo; Tang, Chanjuan; Zhao, Qiang; Li, Jing; Yang, Lifang; Qie, Lufeng; Fan, Xingke; Li, Lin; Zhang, Ning; Zhao, Meicheng; Liu, Xiaotong; Chai, Yang; Zhang, Xue; Wang, Hailong; Li, Yingtao; Li, Wen; Zhi, Hui; Jia, Guanqing; Diao, Xianmin
2014-01-28
Foxtail millet (Setaria italica (L.) Beauv.) is an important gramineous grain-food and forage crop. It is grown worldwide for human and livestock consumption. Its small genome and diploid nature have led to foxtail millet fast becoming a novel model for investigating plant architecture, drought tolerance and C4 photosynthesis of grain and bioenergy crops. Therefore, cost-effective, reliable and highly polymorphic molecular markers covering the entire genome are required for diversity, mapping and functional genomics studies in this model species. A total of 5,020 highly repetitive microsatellite motifs were isolated from the released genome of the genotype 'Yugu1' by sequence scanning. Based on sequence comparison between S. italica and S. viridis, a set of 788 SSR primer pairs were designed. Of these primers, 733 produced reproducible amplicons and were polymorphic among 28 Setaria genotypes selected from diverse geographical locations. The number of alleles detected by these SSR markers ranged from 2 to 16, with an average polymorphism information content of 0.67. The result obtained by neighbor-joining cluster analysis of 28 Setaria genotypes, based on Nei's genetic distance of the SSR data, showed that these SSR markers are highly polymorphic and effective. A large set of highly polymorphic SSR markers were successfully and efficiently developed based on genomic sequence comparison between different genotypes of the genus Setaria. The large number of new SSR markers and their placement on the physical map represent a valuable resource for studying diversity, constructing genetic maps, functional gene mapping, QTL exploration and molecular breeding in foxtail millet and its closely related species.
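The abstract reports an average polymorphism information content (PIC) of 0.67 but does not say how it was computed. The sketch below uses the commonly used definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i is the frequency of allele i at one marker. The function name `pic` and the example allele counts are illustrative assumptions, not data from the study.

```python
from itertools import combinations


def pic(allele_counts):
    """Polymorphism information content for a single marker.

    allele_counts: observed counts (or frequencies) of each allele.
    This follows the widely used definition; the abstract does not
    confirm which formula the authors applied.
    """
    total = sum(allele_counts)
    p = [c / total for c in allele_counts]
    # Probability that two random alleles are identical.
    homozygosity = sum(x * x for x in p)
    # Correction term for uninformative matings between heterozygotes.
    cross = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical marker with four equally frequent alleles.
    print(pic([7, 7, 7, 7]))
```

A marker with a single allele gives a PIC of 0, and the value rises toward 1 as alleles become more numerous and more evenly distributed, which is why a mean of 0.67 is read as highly polymorphic.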
Lewis, William H; Sendra, Kacper M; Embley, T Martin; Esteban, Genoveva F
2018-01-01
Many anaerobic ciliated protozoa contain organelles of mitochondrial ancestry called hydrogenosomes. These organelles generate molecular hydrogen that is consumed by methanogenic Archaea, living in endosymbiosis within many of these ciliates. Here we describe a new species of anaerobic ciliate, Trimyema finlayi n. sp., by using silver impregnation and microscopy to conduct a detailed morphometric analysis. Comparisons with previously published morphological data for this species, as well as the closely related species, Trimyema compressum, demonstrated that despite them being similar, both the mean cell size and the mean number of somatic kineties are lower for T. finlayi than for T. compressum, which suggests that they are distinct species. This was also supported by analysis of the 18S rRNA genes from these ciliates, the sequences of which are 97.5% identical (6 substitutions, 1479 compared bases), and in phylogenetic analyses these sequences grouped with other 18S rRNA genes sequenced from previous isolates of the same respective species. Together these data provide strong evidence that T. finlayi is a novel species of Trimyema, within the class Plagiopylea. Various microscopic techniques demonstrated that T. finlayi n. sp. contains polymorphic endosymbiotic methanogens, and analysis of the endosymbionts' 16S rRNA gene showed that they belong to the genus Methanocorpusculum, which was confirmed using fluorescence in situ hybridization with specific probes. Despite the degree of similarity and close relationship between these ciliates, T. compressum contains endosymbiotic methanogens from a different genus, Methanobrevibacter. In phylogenetic analyses of 16S rRNA genes, the Methanocorpusculum endosymbiont of T. finlayi n. sp. grouped with sequences from Methanomicrobia, including the endosymbiont of an earlier isolate of the same species, 'Trimyema sp.,' which was sampled approximately 22 years earlier, at a distant (∼400 km) geographical location.
Identification of the same endosymbiont species in the two separate isolates of T. finlayi n. sp. provides evidence for spatial and temporal stability of the Methanocorpusculum-T. finlayi n. sp. endosymbiosis. T. finlayi n. sp. and T. compressum provide an example of two closely related anaerobic ciliates that have endosymbionts from different methanogen genera, suggesting that the endosymbionts have not co-speciated with their hosts.
Sharma, Sanjeev Kumar; Bolser, Daniel; de Boer, Jan; Sønderkær, Mads; Amoros, Walter; Carboni, Martin Federico; D’Ambrosio, Juan Martín; de la Cruz, German; Di Genova, Alex; Douches, David S.; Eguiluz, Maria; Guo, Xiao; Guzman, Frank; Hackett, Christine A.; Hamilton, John P.; Li, Guangcun; Li, Ying; Lozano, Roberto; Maass, Alejandro; Marshall, David; Martinez, Diana; McLean, Karen; Mejía, Nilo; Milne, Linda; Munive, Susan; Nagy, Istvan; Ponce, Olga; Ramirez, Manuel; Simon, Reinhard; Thomson, Susan J.; Torres, Yerisf; Waugh, Robbie; Zhang, Zhonghua; Huang, Sanwen; Visser, Richard G. F.; Bachem, Christian W. B.; Sagredo, Boris; Feingold, Sergio E.; Orjeda, Gisella; Veilleux, Richard E.; Bonierbale, Merideth; Jacobs, Jeanne M. E.; Milbourne, Dan; Martin, David Michael Alan; Bryan, Glenn J.
2013-01-01
The genome of potato, a major global food crop, was recently sequenced. The work presented here details the integration of the potato reference genome (DM) with a new sequence-tagged site marker−based linkage map and other physical and genetic maps of potato and the closely related species tomato. Primary anchoring of the DM genome assembly was accomplished by the use of a diploid segregating population, which was genotyped with several types of molecular genetic markers to construct a new ~936 cM linkage map comprising 2469 marker loci. In silico anchoring approaches used genetic and physical maps from the diploid potato genotype RH89-039-16 (RH) and tomato. This combined approach has allowed 951 superscaffolds to be ordered into pseudomolecules corresponding to the 12 potato chromosomes. These pseudomolecules represent 674 Mb (~93%) of the 723 Mb genome assembly and 37,482 (~96%) of the 39,031 predicted genes. The superscaffold order and orientation within the pseudomolecules are closely collinear with independently constructed high density linkage maps. Comparisons between marker distribution and physical location reveal regions of greater and lesser recombination, as well as regions exhibiting significant segregation distortion. The work presented here has led to a greatly improved ordering of the potato reference genome superscaffolds into chromosomal “pseudomolecules”. PMID:24062527
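The rounded percentages quoted in this abstract can be reproduced from the reported totals. The short check below is only an arithmetic illustration; the helper name `percent` is an assumption introduced here, and the input figures are the ones stated in the abstract.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Figures as reported in the abstract.
anchored_mb, assembly_mb = 674, 723
anchored_genes, predicted_genes = 37482, 39031

seq_pct = percent(anchored_mb, assembly_mb)
gene_pct = percent(anchored_genes, predicted_genes)

if __name__ == "__main__":
    print(f"assembly anchored: {seq_pct:.1f}%")
    print(f"genes anchored:    {gene_pct:.1f}%")
```

Both values round to the approximate figures given in the text (about 93% of the assembly and about 96% of the predicted genes).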
ORBITAL SOLUTIONS FOR TWO YOUNG, LOW-MASS SPECTROSCOPIC BINARIES IN OPHIUCHUS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rosero, V.; Prato, L.; Wasserman, L. H.
2011-01-15
We report the orbital parameters for ROXR1 14 and RX J1622.7-2325Nw, two young, low-mass, and double-lined spectroscopic binaries recently discovered in the Ophiuchus star-forming region. Accurate orbital solutions were determined from over a dozen high-resolution spectra taken with the Keck II and Gemini South telescopes. These objects are T Tauri stars with mass ratios close to unity and periods of ~5 and ~3 days, respectively. In particular, RX J1622.7-2325Nw shows a non-circularized orbit with an eccentricity of 0.30, higher than any other short-period pre-main-sequence (PMS) spectroscopic binary known to date. We speculate that the orbit of RX J1622.7-2325Nw has not yet circularized because of the perturbing action of a ~1'' companion, itself a close visual pair. A comparison of known young spectroscopic binaries (SBs) and main-sequence (MS) SBs in the eccentricity-period plane shows an indistinguishable distribution of the two populations, implying that orbital circularization occurs in the first 1 Myr of a star's lifetime. With the results presented in this paper we increase by ~4% the small sample of PMS spectroscopic binary stars with known orbital elements.
Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.
Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai
2015-12-01
The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method on protein comparison is more efficient than commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant feature for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
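To make the idea of a Ramanujan Fourier transform concrete, here is a minimal sketch of the textbook definition, in which a numeric signal is projected onto Ramanujan sums c_q(n). It is not the fast algorithm the abstract describes, and the step that turns amino acids into numbers (as in the Resonant Recognition Model) is deliberately left out. The function names and the finite-length averaging are assumptions made for illustration.

```python
from math import cos, gcd, pi


def coprime_residues(q):
    """Integers a in 1..q with gcd(a, q) == 1; their count is Euler's phi(q)."""
    return [a for a in range(1, q + 1) if gcd(a, q) == 1]


def ramanujan_sum(q, n):
    """c_q(n) = sum of cos(2*pi*a*n/q) over a coprime to q (always an integer)."""
    return round(sum(cos(2 * pi * a * n / q) for a in coprime_residues(q)))


def rft(signal, q_max):
    """Naive RFT coefficients X_1..X_q_max of a finite real signal.

    Uses X_q ~ (1 / (N * phi(q))) * sum_{n=1..N} x(n) * c_q(n), i.e. the
    limiting average in the definition truncated to the N available samples.
    """
    n_len = len(signal)
    coeffs = []
    for q in range(1, q_max + 1):
        phi = len(coprime_residues(q))
        acc = sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1))
        coeffs.append(acc / (n_len * phi))
    return coeffs


if __name__ == "__main__":
    # A signal with period 3 concentrates its energy in the q = 3 coefficient.
    signal = [ramanujan_sum(3, n) for n in range(1, 13)]
    print(rft(signal, 4))
```

Because the basis functions c_q(n) take only integer values, each coefficient needs no complex arithmetic once the sums are tabulated, which is one reason such transforms are attractive for periodicity analysis.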
Sun, Genlou; Komatsuda, Takao
2010-08-01
It is well known that Elymus arose through hybridization between representatives of different genera. Cytogenetic analyses show that all its members include the St genome in combination with one or more of four other genomes, the H, Y, P, and W genomes. The origins of the H, P, and W genomes are known, but not for the Y genome. We analyzed the single copy nuclear gene coding for elongation factor G (EF-G) from 28 accessions of polyploid Elymus species and 45 accessions of diploid Triticeae species in order to investigate origin of the Y genome and its relationship to other genomes in the tribe Triticeae. Sequence comparisons among the St, H, Y, P, W, and E genomes detected genome-specific polymorphisms at 66 nucleotide positions. The St and Y genomes are relatively dissimilar. The phylogeny of the Y genome sequences was investigated for the first time. They were most similar to the W genome sequences. The Y genome sequences were placed in two different groups. These two groups were included in an unresolved clade that included the W and E sequences as well as sequences from many annual species. The H genomes sequences were in a clade with the F, P, and Ns genome sequences as sister groups. These two clades were more closely related to each other and to the L and Xp genomes than they were to the St genome sequences. These data support the hypothesis that the Y genome evolved in a diploid species and has a different origin from the St genome. Copyright 2010 Elsevier Inc. All rights reserved.
Glynn, Neil C; Comstock, Jack C; Sood, Sushma G; Dang, Phat M; Chaparro, Jose X
2008-01-01
Resistance gene analogues (RGAs) have been isolated from many crops and offer potential in breeding for disease resistance through marker-assisted selection, either as closely linked or as perfect markers. Many R-gene sequences contain kinase domains, and indeed kinase genes have been reported as being proximal to R-genes, making kinase analogues an additionally promising target. The first step towards utilizing RGAs as markers for disease resistance is isolation and characterization of the sequences. Sugarcane clone US01-1158 was identified as resistant to yellow leaf caused by the sugarcane yellow leaf virus (SCYLV) and moderately resistant to rust caused by Puccinia melanocephala Sydow & Sydow. Degenerate primers that had previously proved useful for isolating RGAs and kinase analogues in wheat and soybean were used to amplify DNA from sugarcane (Saccharum spp.) clone US-01-1158. Sequences generated from 1512 positive clones were assembled into 134 contigs of between two and 105 sequences. Comparison of the contig consensuses with the NCBI sequence database using BLASTx showed that 20 had sequence homology to nuclear binding site and leucine rich repeat (NBS-LRR) RGAs, and eight to kinase genes. Alignment of the deduced amino acid sequences with similar sequences from the NCBI database allowed the identification of several conserved domains. The alignment and resulting phenetic tree showed that many of the sequences had greater similarity to sequences from other species than to one another. The use of degenerate primers is a useful method for isolating novel sugarcane RGA and kinase gene analogues. Further studies are needed to evaluate the role of these genes in disease resistance.
Almond, N; Jenkins, A; Heath, A B; Kitchin, P
1993-05-01
Three cynomolgus macaques were immunized with recombinant envelope protein preparations derived from simian immunodeficiency virus (SIV). Although humoral and cellular responses were elicited by the immunization regime, all macaques became infected upon challenge with 10 MID50 of the 11/88 virus challenge stock of SIVmac251-32H. The polymerase chain reaction was used to amplify proviral SIV gp120 sequences present in the blood of both immunized and control macaques at 2 months post-infection. A comparison of the predominant sequences found in the region from V2 to V5 of gp120 failed to differentiate provirus recovered from either immunized or control animals. A detailed investigation of sequences obtained from the hypervariable V1 region identified a mixture of sequences in both immunized and control macaques. Some sequences were identical to those previously detected in the virus challenge stock, whereas others had not been detected previously. Phenogram analysis of the new V1 sequences found in immunized animals revealed that they were quite distinct from those from the virus challenge stock and that they included alterations to potential N-linked glycosylation sites. In contrast, new sequence variants recovered from the control animals were closely related to sequences from the virus challenge stock. The difference in diversity of new V1 sequences recovered from immunized and control macaques was highly significant (P < 0.001). Thus, the presence of pre-existing immune responses to SIV envelope protein is associated with greater genetic change in the V1 region of gp120. These data are discussed in relation to the epitopes of SIV gp120 that may confer protection from in vivo challenge.
Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu
2009-01-01
Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593
The sponge microbiome project.
Moitinho-Silva, Lucas; Nielsen, Shaun; Amir, Amnon; Gonzalez, Antonio; Ackermann, Gail L; Cerrano, Carlo; Astudillo-Garcia, Carmen; Easson, Cole; Sipkema, Detmer; Liu, Fang; Steinert, Georg; Kotoulas, Giorgos; McCormack, Grace P; Feng, Guofang; Bell, James J; Vicente, Jan; Björk, Johannes R; Montoya, Jose M; Olson, Julie B; Reveillaud, Julie; Steindler, Laura; Pineda, Mari-Carmen; Marra, Maria V; Ilan, Micha; Taylor, Michael W; Polymenakou, Paraskevi; Erwin, Patrick M; Schupp, Peter J; Simister, Rachel L; Knight, Rob; Thacker, Robert W; Costa, Rodrigo; Hill, Russell T; Lopez-Legentil, Susanna; Dailianis, Thanos; Ravasi, Timothy; Hentschel, Ute; Li, Zhiyong; Webster, Nicole S; Thomas, Torsten
2017-10-01
Marine sponges (phylum Porifera) are a diverse, phylogenetically deep-branching clade known for forming intimate partnerships with complex communities of microorganisms. To date, 16S rRNA gene sequencing studies have largely utilised different extraction and amplification methodologies to target the microbial communities of a limited number of sponge species, severely limiting comparative analyses of sponge microbial diversity and structure. Here, we provide an extensive and standardised dataset that will facilitate sponge microbiome comparisons across large spatial, temporal, and environmental scales. Samples from marine sponges (n = 3569 specimens), seawater (n = 370), marine sediments (n = 65) and other environments (n = 29) were collected from different locations across the globe. This dataset incorporates at least 268 different sponge species, including several yet unidentified taxa. The V4 region of the 16S rRNA gene was amplified and sequenced from extracted DNA using standardised procedures. Raw sequences (total of 1.1 billion sequences) were processed and clustered with (i) a standard protocol using QIIME closed-reference picking resulting in 39 543 operational taxonomic units (OTU) at 97% sequence identity, (ii) a de novo clustering using Mothur resulting in 518 246 OTUs, and (iii) a new high-resolution Deblur protocol resulting in 83 908 unique bacterial sequences. Abundance tables, representative sequences, taxonomic classifications, and metadata are provided. This dataset represents a comprehensive resource of sponge-associated microbial communities based on 16S rRNA gene sequences that can be used to address overarching hypotheses regarding host-associated prokaryotes, including host specificity, convergent evolution, environmental drivers of microbiome structure, and the sponge-associated rare biosphere. © The Authors 2017. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Hirano, Teruyuki; Sato, Bun'ei; Masuda, Kento; Benomar, Othman Michel; Takeda, Yoichi; Omiya, Masashi; Harakawa, Hiroki
2016-10-01
Tidal interactions are a key process for understanding the evolutionary history of close-in exoplanets. However, tidal theories still carry large uncertainties in their predictions of the damping timescales of stellar obliquity and semi-major axis. We have worked on a search for transiting giant planets around evolved stars, for which few close-in planets have been discovered. It has been reported that evolved stars lack close-in planets, which is often attributed to the tidal evolution and/or engulfment of close-in planets by the hosts. Meanwhile, Kepler has detected a certain fraction of transiting planet candidates around evolved stars. Confirming the planetary nature of these candidates is especially important since the comparison between the occurrence rates of close-in planets around main sequence stars and evolved stars provides a unique opportunity to discuss the final stage of close-in planets. With the aim of confirming KOI planet candidates around evolved stars, we measured precision radial velocities (RVs) for evolved stars with transiting planet candidates using Subaru/HDS. We also developed a new code which simultaneously models and fits the observed RVs and phase-curve variations in the Kepler data (e.g., transits, stellar ellipsoidal variations, and planet emission/reflected light). As a result of applying the global fit to KOI giants/subgiants, we confirmed two giant planets around evolved stars (Kepler-91 and KOI-1894), and found that KOI-977 is more likely a false positive.
Assessing the Robustness of Complete Bacterial Genome Segmentations
NASA Astrophysics Data System (ADS)
Devillers, Hugo; Chiapello, Hélène; Schbath, Sophie; El Karoui, Meriem
Comparison of closely related bacterial genomes has revealed the presence of highly conserved sequences forming a "backbone" that is interrupted by numerous, less conserved, DNA fragments. Segmentation of bacterial genomes into backbone and variable regions is particularly useful to investigate bacterial genome evolution. Several software tools have been designed to compare complete bacterial chromosomes and a few online databases store pre-computed genome comparisons. However, very few statistical methods are available to evaluate the reliability of these software tools and to compare the results obtained with them. To fill this gap, we have developed two local scores to measure the robustness of bacterial genome segmentations. Our method uses a simulation procedure based on random perturbations of the compared genomes. The scores presented in this paper are simple to implement and our results show that they make it easy to discriminate between robust and non-robust bacterial genome segmentations when using aligners such as MAUVE and MGA.
Variational Dirac-Hartree-Fock calculation of the Breit interaction
NASA Astrophysics Data System (ADS)
Goldman, S. P.
1988-04-01
The calculation of the retarded version of the Breit interaction in the context of the VDHF method is discussed. With the use of Slater-type basis functions, all the terms involved can be calculated in closed form. The results are expressed as an expansion in powers of one-electron energy differences and linear combinations of hypergeometric functions. Convergence is fast and high accuracy is obtained with a small number of terms in the expansion even for high values of the nuclear charge. An added advantage is that the lowest order cancellations occurring in the retardation terms are accounted for exactly a priori. A comparison of the number of terms in the total expansion needed for an accuracy of 12 significant digits in the total energy, as well as a comparison of the results with and without retardation and in the local potential approximation, are presented for the carbon isoelectronic sequence.
The yak genome and adaptation to life at high altitude.
Qiu, Qiang; Zhang, Guojie; Ma, Tao; Qian, Wubin; Wang, Junyi; Ye, Zhiqiang; Cao, Changchang; Hu, Quanjun; Kim, Jaebum; Larkin, Denis M; Auvil, Loretta; Capitanu, Boris; Ma, Jian; Lewin, Harris A; Qian, Xiaoju; Lang, Yongshan; Zhou, Ran; Wang, Lizhong; Wang, Kun; Xia, Jinquan; Liao, Shengguang; Pan, Shengkai; Lu, Xu; Hou, Haolong; Wang, Yan; Zang, Xuetao; Yin, Ye; Ma, Hui; Zhang, Jian; Wang, Zhaofeng; Zhang, Yingmei; Zhang, Dawei; Yonezawa, Takahiro; Hasegawa, Masami; Zhong, Yang; Liu, Wenbin; Zhang, Yan; Huang, Zhiyong; Zhang, Shengxiang; Long, Ruijun; Yang, Huanming; Wang, Jian; Lenstra, Johannes A; Cooper, David N; Wu, Yi; Wang, Jun; Shi, Peng; Wang, Jian; Liu, Jianquan
2012-07-01
Domestic yaks (Bos grunniens) provide meat and other necessities for Tibetans living at high altitude on the Qinghai-Tibetan Plateau and in adjacent regions. Comparison between yak and the closely related low-altitude cattle (Bos taurus) is informative in studying animal adaptation to high altitude. Here, we present the draft genome sequence of a female domestic yak generated using Illumina-based technology at 65-fold coverage. Genomic comparisons between yak and cattle identify an expansion in yak of gene families related to sensory perception and energy metabolism, as well as an enrichment of protein domains involved in sensing the extracellular environment and hypoxic stress. Positively selected and rapidly evolving genes in the yak lineage are also found to be significantly enriched in functional categories and pathways related to hypoxia and nutrition metabolism. These findings may have important implications for understanding adaptation to high altitude in other animal species and for hypoxia-related diseases in humans.
Erwinia iniecta sp. nov., isolated from Russian wheat aphid (Diuraphis noxia).
Campillo, Tony; Luna, Emily; Portier, Perrine; Fischer-Le Saux, Marion; Lapitan, Nora; Tisserat, Ned A; Leach, Jan E
2015-10-01
Short, Gram-negative-staining, rod-shaped bacteria were isolated from crushed bodies of Russian wheat aphid [Diuraphis noxia (Kurdjumov)] and artificial diets after Russian wheat aphid feeding. Based on multilocus sequence analysis involving the 16S rRNA, atpD, infB, gyrB and rpoB genes, these bacterial isolates constitute a novel clade in the genus Erwinia, and were most closely related to Erwinia toletana. Representative distinct strains within this clade were used for comparisons with related species of Erwinia. Phenotypic comparisons using four distinct strains and average nucleotide identity (ANI) measurements using two distinct draft genomes revealed that these strains form a novel species within the genus Erwinia. The name Erwinia iniecta sp. nov. is proposed, and strain B120T ( = CFBP 8182T = NCCB 100485T) was designated the type strain. Erwinia iniecta sp. nov. was not pathogenic to plants. However, virulence to the Russian wheat aphid was observed.
Permanent draft genomes of the two Rhodopirellula europaea strains 6C and SH398.
Richter-Heitmann, Tim; Richter, Michael; Klindworth, Anna; Wegner, Carl-Eric; Frank, Carsten S; Glöckner, Frank Oliver; Harder, Jens
2014-02-01
The genomes of two Rhodopirellula europaea strains were sequenced as permanent drafts to study the genomic diversity within this genus, especially in comparison with the closed genome of the type strain Rhodopirellula baltica SH1(T). The isolates are part of a larger study to infer the biogeography of Rhodopirellula species in European marine waters, as well as to amend the genus description of R. baltica. This genomics resource article is the second of a series of five publications describing a total of eight new permanent draft genomes of Rhodopirellula species. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Olijnyk, Helmut
2005-01-01
Lattice vibrations in high-pressure phases of Y, Gd and Lu were studied by Raman spectroscopy. The observed phonon frequencies decrease towards the transitions to the dhcp and fcc phases. There is evidence that the entire structural sequence hcp → Sm-type → dhcp → fcc under pressure for the individual regular rare-earth metals and along the lanthanide series at ambient pressure involves softening of certain acoustic and optical phonon modes and of the elastic shear modulus C44. Comparison is made to transitions between close-packed lattices in other metals, and possible correlations to s-d electron transfer are discussed.
Bacteriophages of Gordonia spp. Display a Spectrum of Diversity and Genetic Relationships.
Pope, Welkin H; Mavrich, Travis N; Garlena, Rebecca A; Guerrero-Bustamante, Carlos A; Jacobs-Sera, Deborah; Montgomery, Matthew T; Russell, Daniel A; Warner, Marcie H; Hatfull, Graham F
2017-08-15
The global bacteriophage population is large, dynamic, old, and highly diverse genetically. Many phages are tailed and contain double-stranded DNA, but these remain poorly characterized genomically. A collection of over 1,000 phages infecting Mycobacterium smegmatis reveals the diversity of phages of a common bacterial host, but their relationships to phages of phylogenetically proximal hosts are not known. Comparative sequence analysis of 79 phages isolated on Gordonia shows these also to be diverse and that the phages can be grouped into 14 clusters of related genomes, with an additional 14 phages that are "singletons" with no closely related genomes. One group of six phages is closely related to Cluster A mycobacteriophages, but the other Gordonia phages are distant relatives and share only 10% of their genes with the mycobacteriophages. The Gordonia phage genomes vary in genome length (17.1 to 103.4 kb), percentage of GC content (47 to 68.8%), and genome architecture and contain a variety of features not seen in other phage genomes. Like the mycobacteriophages, the highly mosaic Gordonia phages demonstrate a spectrum of genetic relationships. We show this is a general property of bacteriophages and suggest that any barriers to genetic exchange are soft and readily violable. IMPORTANCE Despite the numerical dominance of bacteriophages in the biosphere, there is a dearth of complete genomic sequences. Current genomic information reveals that phages are highly diverse genomically and have mosaic architectures formed by extensive horizontal genetic exchange. Comparative analysis of 79 phages of Gordonia shows them to not only be highly diverse, but to present a spectrum of relatedness. Most are distantly related to phages of the phylogenetically proximal host Mycobacterium smegmatis, although one group of Gordonia phages is more closely related to mycobacteriophages than to the other Gordonia phages. Phage genome sequence space remains largely unexplored, but further isolation and genomic comparison of phages targeted at related groups of hosts promise to reveal pathways of bacteriophage evolution. Copyright © 2017 Pope et al.
NASA Astrophysics Data System (ADS)
Et-Touhami, M.; Et-Touhami, M.; Olsen, P. E.; Puffer, J.
2001-05-01
Previously very sparse biostratigraphic data suggested that the Early Mesozoic tholeiitic effusive and intrusive magmatism in the various basins of the Maghreb occurred over a long time (Ladinian-Hettangian). However, a detailed comparison of the stratigraphy underlying, interbedded with, and overlying the basalts in these basins shows not only remarkable similarities with each other, but also with sequences in the latest Triassic and earliest Jurassic of eastern North America. There, the sequences have been shown to be cyclical, controlled by Milankovitch-type climate cycles; the same seems to be true in at least part of the Maghreb. Thus, the Moroccan basins have cyclical sequences surrounding and interbedded with one or two basaltic units. In the Argana and Khemisset basins the Tr-J boundary is identified by palynology to be below the lowest basalt, and the remarkably close lithological similarity of the pre-basalt sequence in the other Moroccan basins to that of the North American basins - especially the Fundy basin - suggests a tight correlation in time. Likewise, the strata above the lowest basalt in Morocco show a similar pattern to what is seen above the lowest basalt formation in eastern North America, as do the overlying sequences. Furthermore, geochemistry on basalts in the Argana, Bou Fekrane, Khemisset, and Iouawen basins indicates they are high-Ti quartz-normative tholeiites as are the Orange Mountain Basalt (Newark basin) and the North Mountain Basalt (Fundy basin). The remarkable lithostratigraphic similarity across the Maghreb of these strata suggests contemporaneous and synchronous eruption over a time span of less than 200 ky, based on Milankovitch calibration, and within a ~20 ky interval after the Triassic-Jurassic boundary. Differences with previous interpretations of the biostratigraphy can be rationalized as a result of: 1, an over-reliance on comparisons with northern European palynology; 2, over-interpretation of poorly preserved fossils; and 3, rarity of early Jurassic non-marine ostracode assemblages.
Najm, Nour-Addeen; Meyer-Kayser, Elisabeth; Hoffmann, Lothar; Herb, Ingrid; Fensterer, Veronika; Pfister, Kurt; Silaghi, Cornelia
2014-06-01
Wild canines which are closely related to dogs constitute a potential reservoir for haemoparasites by both hosting tick species that infest dogs and harbouring tick-transmitted canine haemoparasites. In this study, the prevalence of Babesia spp. and Theileria spp. was investigated in German red foxes (Vulpes vulpes) and their ticks. DNA extracts of 261 spleen samples and of 1953 ticks belonging to 4 species, Ixodes ricinus (n=870), I. canisuga (n=585), I. hexagonus (n=485), and Dermacentor reticulatus (n=13), were examined for the presence of Babesia/Theileria spp. by a conventional PCR targeting the 18S rRNA gene. One hundred twenty-one out of 261 foxes (46.4%) were PCR-positive. Out of them, 44 samples were sequenced, and all sequences had 100% similarity to Theileria annae. Similarly, sequencing was carried out for 65 out of 118 PCR-positive ticks. Theileria annae DNA was detected in 61.5% of the sequenced samples, Babesia microti DNA was found in 9.2%, and Babesia venatorum in 7.6% of the sequenced samples. The foxes were most positive in June and October, whereas the peak of tick positivity was in October. Furthermore, the positivity of the ticks was higher for I. canisuga in comparison to the other tick species and for nymphs in comparison to adults. The high prevalence of T. annae DNA in red foxes in this study suggests a reservoir function of those animals for T. annae. To our knowledge, this is the first report of T. annae in foxes from Germany as well as the first detection of T. annae and B. microti in the fox tick I. canisuga. Detection of DNA of T. annae and B. microti in three tick species collected from foxes adds new potential vectors for these two pathogens and suggests a potential role of the red fox in their natural endemic cycles. Copyright © 2014 Elsevier GmbH. All rights reserved.
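As a worked check of the fox prevalence figure, the short sketch below recomputes 121 PCR-positive animals out of 261. The Wilson score interval is an addition for illustration only; the abstract reports no confidence interval, and the function names are hypothetical.

```python
from math import sqrt

def prevalence(positive, total):
    """Prevalence as a percentage."""
    return 100.0 * positive / total

def wilson_interval(positive, total, z=1.96):
    """Wilson score interval (in percent) for a binomial proportion.
    z=1.96 corresponds to an approximate 95% interval."""
    p = positive / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return 100 * (centre - half), 100 * (centre + half)

# Counts taken from the abstract: 121 PCR-positive foxes out of 261.
fox = prevalence(121, 261)
low, high = wilson_interval(121, 261)
print(f"fox prevalence: {fox:.1f}% (approx. 95% interval {low:.1f}-{high:.1f}%)")
```

The point estimate rounds to the 46.4% given in the abstract; the interval simply conveys how much uncertainty a sample of this size carries.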
The Complete Genome Sequence of the Plant Growth-Promoting Bacterium Pseudomonas sp. UW4
Duan, Jin; Jiang, Wei; Cheng, Zhenyu; Heikkila, John J.; Glick, Bernard R.
2013-01-01
The plant growth-promoting bacterium (PGPB) Pseudomonas sp. UW4, previously isolated from the rhizosphere of common reeds growing on the campus of the University of Waterloo, promotes plant growth in the presence of different environmental stresses, such as flooding, high concentrations of salt, cold, heavy metals, drought and phytopathogens. In this work, the genome sequence of UW4 was obtained by pyrosequencing and the gaps between the contigs were closed by directed PCR. The P. sp. UW4 genome contains a single circular chromosome that is 6,183,388 bp with a 60.05% G+C content. The bacterial genome contains 5,423 predicted protein-coding sequences that occupy 87.2% of the genome. Nineteen genomic islands (GIs) were predicted and thirty one complete putative insertion sequences were identified. Genes potentially involved in plant growth promotion such as indole-3-acetic acid (IAA) biosynthesis, trehalose production, siderophore production, acetoin synthesis, and phosphate solubilization were determined. Moreover, genes that contribute to the environmental fitness of UW4 were also observed including genes responsible for heavy metal resistance such as nickel, copper, cadmium, zinc, molybdate, cobalt, arsenate, and chromate. Whole-genome comparison with other completely sequenced Pseudomonas strains and phylogeny of four concatenated “housekeeping” genes (16S rRNA, gyrB, rpoB and rpoD) of 128 Pseudomonas strains revealed that UW4 belongs to the fluorescens group, jessenii subgroup. PMID:23516524
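The summary statistics quoted for the UW4 chromosome can be related to each other with a few lines of arithmetic. The sketch below uses only the figures from the abstract; the derived values (coding bases, mean coding-sequence length, gene density) are back-of-envelope estimates rather than reported results, and `gc_content` is a hypothetical helper showing how a G+C percentage is computed from a sequence.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    gc = sum(base in "GC" for base in seq)
    return 100.0 * gc / len(seq)

# Figures reported in the abstract.
genome_bp = 6_183_388
cds_count = 5_423
coding_fraction = 0.872

# Derived estimates (not stated in the abstract).
coding_bp = genome_bp * coding_fraction
mean_cds_bp = coding_bp / cds_count
cds_per_kb = cds_count / (genome_bp / 1000)

print(f"coding bases: about {coding_bp:,.0f} bp")
print(f"mean CDS length: about {mean_cds_bp:.0f} bp")
print(f"gene density: about {cds_per_kb:.2f} CDS per kb")
```

A mean coding-sequence length just under 1 kb and a density just under one gene per kb are consistent with the 87.2% coding fraction reported.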
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sellem, C.H.; Rossignol, M.; Belcour, L.
1996-06-01
The mitochondrial genome of 23 wild-type strains belonging to three different species of the filamentous fungus Podospora was examined. Among the 15 optional sequences identified are two intronic reading frames, nad1-i4-orf1 and cox1-i7-orf2. We show that the presence of these sequences was strictly correlated with tightly clustered nucleotide substitutions in the adjacent exon. This correlation applies to the presence or absence of closely related open reading frames (ORFs), found at the same genetic locations, in all the Pyrenomycete genera examined. The recent gain of these optional ORFs in the evolution of the genus Podospora probably accounts for such sequence differences. In the homoplasmic progeny from heteroplasmons constructed between Podospora strains differing by the presence of these optional ORFs, nad1-i4-orf1 and cox1-i7-orf2 appeared highly invasive. Sequence comparisons in the nad1-i4 intron of various strains of the Pyrenomycete family led us to propose a scenario of its evolution that includes several events of loss and gain of intronic ORFs. These results strongly reinforce the idea that group I intronic ORFs are mobile elements and that their transfer, and concomitant modification of the adjacent exon, could participate in the modular evolution of mitochondrial genomes. 46 refs., 5 figs., 2 tabs.
Structure of the highly repeated, long interspersed DNA family (LINE or L1Rn) of the rat.
D'Ambrosio, E; Waitzkin, S D; Witney, F R; Salemme, A; Furano, A V
1986-01-01
We present the DNA sequence of a 6.7-kilobase member of the rat long interspersed repeated DNA family (LINE or L1Rn). This member (LINE 3) is flanked by a perfect 14-base-pair (bp) direct repeat and is a full-length, or close-to-full-length, member of this family. LINE 3 contains an approximately 100-bp A-rich right end, a number of long (greater than 400-bp) open reading frames, and a ca. 200-bp G + C-rich (ca. 60%) cluster near each terminus. Comparison of the LINE 3 sequence with the sequence of about one-half of another member, which we also present, as well as restriction enzyme analysis of the genomic copies of this family, indicates that in length and overall structure LINE 3 is quite typical of the 40,000 or so other genomic members of this family which would account for as much as 10% of the rat genome. Therefore, the rat LINE family is relatively homogeneous, which contrasts with the heterogeneous LINE families in primates and mice. Transcripts corresponding to the entire LINE sequence are abundant in the nuclear RNA of rat liver. The characteristics of the rat LINE family are discussed with respect to the possible function and evolution of this family of DNA sequences. PMID:3023845
Chen, Caihui; Zheng, Yongjie; Liu, Sian; Zhong, Yongda; Wu, Yanfang; Li, Jiang; Xu, Li-An; Xu, Meng
2017-01-01
Cinnamomum camphora, a member of the Lauraceae family, is a valuable aromatic and timber tree that is indigenous to the south of China and Japan. All parts of Cinnamomum camphora have secretory cells containing different volatile chemical compounds that are utilized as herbal medicines and essential oils. Here, we report the complete sequencing of the chloroplast genome of Cinnamomum camphora using Illumina technology. The chloroplast genome of Cinnamomum camphora is 152,570 bp in length and characterized by a relatively conserved quadripartite structure containing a large single copy region of 93,705 bp, a small single copy region of 19,093 bp and two inverted repeat (IR) regions of 19,886 bp. Overall, the genome contained 123 coding regions, of which 15 were repeated in the IR regions. An analysis of chloroplast sequence divergence revealed that the small single copy region was highly variable among the different genera in the Lauraceae family. A total of 40 repeat structures and 83 simple sequence repeats were detected in both the coding and non-coding regions. A phylogenetic analysis indicated that Calycanthus is most closely related to Lauraceae, both being members of Laurales, which forms a sister group to Magnoliids. The complete sequence of the chloroplast of Cinnamomum camphora will aid in in-depth taxonomical studies of the Lauraceae family in the future. The genetic sequence information will also have valuable applications for chloroplast genetic engineering.
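The quadripartite structure described above implies a simple consistency check: the large single copy region, the small single copy region, and the two inverted repeats should add up to the total genome length. The sketch below performs that check with the figures from the abstract; the variable names are illustrative only.

```python
# Region lengths reported in the abstract (bp).
lsc_bp = 93_705          # large single copy region
ssc_bp = 19_093          # small single copy region
ir_bp = 19_886           # each of the two inverted repeats
reported_total_bp = 152_570

# A quadripartite plastid genome is LSC + SSC + 2 x IR.
computed_total_bp = lsc_bp + ssc_bp + 2 * ir_bp

# Share of the genome occupied by each region, in percent.
shares = {
    "LSC": 100.0 * lsc_bp / computed_total_bp,
    "SSC": 100.0 * ssc_bp / computed_total_bp,
    "IR (both copies)": 100.0 * 2 * ir_bp / computed_total_bp,
}

print(computed_total_bp == reported_total_bp)
for region, pct in shares.items():
    print(f"{region}: {pct:.1f}%")
```

The region lengths sum exactly to the reported 152,570 bp, so the figures in the abstract are internally consistent.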
Pöggeler, S; Risch, S; Kück, U; Osiewacz, H D
1997-10-01
Homokaryons from the homothallic ascomycete Sordaria macrospora are able to enter the sexual pathway and to form fertile fruiting bodies. To analyze the molecular basis of homothallism and to elucidate the role of mating-products during fruiting body development, we cloned and sequenced the entire S. macrospora mating-type locus. Comparison of the Sordaria mating-type locus with mating-type idiomorphs from the heterothallic ascomycetes Neurospora crassa and Podospora anserina revealed that sequences from both idiomorphs (A/a and mat-/mat+, respectively) are contiguous in S. macrospora. DNA sequencing of the S. macrospora mating-type region allowed the identification of four open reading frames (ORFs), which were termed Smt-a1, SmtA-1, SmtA-2 and SmtA-3. While Smt-a1, SmtA-1, and SmtA-2 show strong sequence similarities with the corresponding N. crassa mating-type ORFs, SmtA-3 has a chimeric character. It comprises sequences that are similar to the A and a mating-type idiomorph from N. crassa. To determine functionality of the S. macrospora mating-type genes, we show that all ORFs are transcriptionally expressed. Furthermore, we transformed the S. macrospora mating-type genes into mat- and mat+ strains of the closely related heterothallic fungus P. anserina. The transformation experiments show that mating-type genes from S. macrospora induce fruiting body formation in P. anserina.
Boyd, Bret M; Allen, Julie M; de Crécy-Lagard, Valérie; Reed, David L
2014-09-11
The obligate-heritable endosymbionts of insects possess some of the smallest known bacterial genomes. This is likely due to loss of genomic material during symbiosis. The mode and rate of this erosion may change over evolutionary time: faster in newly formed associations and slower in long-established ones. The endosymbionts of human and anthropoid primate lice present a unique opportunity to study genome erosion in newly established (or young) symbionts. This is because we have a detailed phylogenetic history of these endosymbionts with divergence dates for closely related species. This allows for genome evolution to be studied in detail and rates of change to be estimated in a phylogenetic framework. Here, we sequenced the genome of the chimpanzee louse endosymbiont (Candidatus Riesia pediculischaeffi) and compared it with the closely related genome of the human body louse endosymbiont. From this comparison, we found evidence for recent genome erosion leading to gene loss in these endosymbionts. Although gene loss was detected, it was not significantly greater than in older endosymbionts from aphids and ants. Additionally, we searched for genes associated with B-vitamin synthesis in the two louse endosymbiont genomes because these endosymbionts are believed to synthesize essential B vitamins absent in the louse's diet. All of the expected genes were present, except those involved in thiamin synthesis. We failed to find genes encoding for proteins involved in the biosynthesis of thiamin or any complete exogenous means of salvaging thiamin, suggesting there is an undescribed mechanism for the salvage of thiamin. Finally, genes encoding for the pantothenate de novo biosynthesis pathway were located on a plasmid in both taxa along with a heat shock protein. Movement of these genes onto a plasmid may be functionally and evolutionarily significant, potentially increasing production and guarding against the deleterious effects of mutation. 
These data add to a growing resource of obligate endosymbiont genomes and to our understanding of the rate and mode of genome erosion in obligate animal-associated bacteria. Ultimately sequencing additional louse p-endosymbiont genomes will provide a model system for studying genome evolution in obligate host associated bacteria. Copyright © 2014 Boyd et al.
Alu repeats: A source for the genesis of primate microsatellites
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arcot, S.S.; Batzer, M.A.; Wang, Zhenyuan
1995-09-01
As a result of their abundance, relatively uniform distribution, and high degree of polymorphism, microsatellites and minisatellites have become valuable tools in genetic mapping, forensic identity testing, and population studies. In recent years, a number of microsatellite repeats have been found to be associated with Alu interspersed repeated DNA elements. The association of an Alu element with a microsatellite repeat could result from the integration of an Alu element within a preexisting microsatellite repeat. Alternatively, Alu elements could have a direct role in the origin of microsatellite repeats. Errors introduced during reverse transcription of the primary transcript derived from an Alu "master" gene or the accumulation of random mutations in the middle A-rich regions and oligo(dA)-rich tails of Alu elements after insertion and subsequent expansion and contraction of these sequences could result in the genesis of a microsatellite repeat. We have tested these hypotheses by a direct evolutionary comparison of the sequences of some recent Alu elements that are found only in humans and are absent from nonhuman primates, as well as some older Alu elements that are present at orthologous positions in a number of nonhuman primates. The origin of "young" Alu insertions, absence of sequences that resemble microsatellite repeats at the orthologous loci in chimpanzees, and the gradual expansion of microsatellite repeats in some old Alu repeats at orthologous positions within the genomes of a number of nonhuman primates suggest that Alu elements are a source for the genesis of primate microsatellite repeats. 48 refs., 5 figs., 3 tabs.
Lipovich, Leonard; Hou, Zhuo-Cheng; Jia, Hui; Sinkler, Christopher; McGowen, Michael; Sterner, Kirstin N; Weckle, Amy; Sugalski, Amara B; Pipes, Lenore; Gatti, Domenico L; Mason, Christopher E; Sherwood, Chet C; Hof, Patrick R; Kuzawa, Christopher W; Grossman, Lawrence I; Goodman, Morris; Wildman, Derek E
2016-02-01
The human brain and human cognitive abilities are strikingly different from those of other great apes despite relatively modest genome sequence divergence. However, little is presently known about the interspecies divergence in gene structure and transcription that might contribute to these phenotypic differences. To date, most comparative studies of gene structure in the brain have examined humans, chimpanzees, and macaque monkeys. To add to this body of knowledge, we analyze here the brain transcriptome of the western lowland gorilla (Gorilla gorilla gorilla), an African great ape species that is phylogenetically closely related to humans, but with a brain that is approximately one-third the size. Manual transcriptome curation from a sample of the planum temporale region of the neocortex revealed 12 protein-coding genes and one noncoding-RNA gene with exons in the gorilla unmatched by public transcriptome data from the orthologous human loci. These interspecies gene structure differences accounted for a total of 134 amino acids in proteins found in the gorilla that were absent from protein products of the orthologous human genes. Proteins varying in structure between human and gorilla were involved in immunity and energy metabolism, suggesting their relevance to phenotypic differences. This gorilla neocortical transcriptome comprises an empirical, not homology- or prediction-driven, resource for orthologous gene comparisons between human and gorilla. These findings provide a unique repository of the sequences and structures of thousands of genes transcribed in the gorilla brain, pointing to candidate genes that may contribute to the traits distinguishing humans from other closely related great apes. © 2015 Wiley Periodicals, Inc.
Actinomyces timonensis sp. nov., isolated from a human clinical osteo-articular sample.
Renvoise, Aurélie; Raoult, Didier; Roux, Véronique
2010-07-01
Gram-positive, non-spore-forming rods were isolated from a human osteo-articular sample (strain 7400942(T)). Based on cellular morphology and the results of biochemical analysis, this strain was tentatively identified as a novel species of the genus Actinomyces. Phylogenetic analysis based on 16S rRNA gene sequence comparisons showed that the bacterium was closely related to the type strain of Actinomyces denticolens (96.9 % 16S rRNA gene sequence similarity). A comparison of biochemical traits showed that strain 7400942(T) was distinct from A. denticolens in a number of characteristics, i.e. in contrast with A. denticolens, strain 7400942(T) was negative for nitrate reduction and for beta-galactosidase, alpha-glucosidase and alanine arylamidase activities, it was positive for acid production from N-acetylglucosamine, melezitose and glycogen, and it was negative for acid production from turanose. Matrix-assisted laser-desorption/ionization time-of-flight MS protein analysis confirmed that strain 7400942(T) represents a novel species, as scores obtained for its spectra were significant (>2.2) only with strain 7400942(T). On the basis of phenotypic data and phylogenetic inference, it is proposed that this strain should be designated Actinomyces timonensis sp. nov.; the type strain is strain 7400942(T) (=CSUR P35(T)=CCUG 55928(T)).
The ultraviolet morphology of evolved populations
NASA Astrophysics Data System (ADS)
Chávez, Miguel
2009-04-01
In this paper I present a summary of the recent investigations we have developed at the Stellar Atmospheres and Populations Research Group (GrAPEs-for its designation in Spanish) at INAOE and collaborators in Italy. These investigations have aimed at providing updated stellar tools for the analysis of the UV spectra of a variety of stellar aggregates, mainly evolved ones. The sequence of material here presented roughly corresponds to the steps we have identified as mandatory to properly establish the applicability of synthetic populations in the analyses of observational data of globular clusters and more complex aged aggregates. The sequence is composed of four main stages, namely, (a) the creation of a theoretical stellar data base that we have called UVBLUE, (b) the comparison of such data base with observational stellar data, (c) the calculation of a set of synthetic spectral energy distributions (SEDs) of simple stellar populations (SSPs) and their validation through a comparison with observations of a sample of galactic globular clusters (GGCs), (d) construction of models for dating local ellipticals and distant red-envelope galaxies. Most of the work relies on the analysis of absorption line spectroscopic indices. The global results are more than satisfactory in the sense that theoretical indices closely follow the overall trends with chemical composition depicted by their empirical counterparts (stars and GGCs).
Chayote mosaic virus, a New Tymovirus Infecting Cucurbitaceae.
Bernal, J J; Jiménez, I; Moreno, M; Hord, M; Rivera, C; Koenig, R; Rodríguez-Cerezo, E
2000-10-01
Chayote mosaic virus (ChMV) is a putative tymovirus isolated from chayote crops in Costa Rica. ChMV was characterized at the host range, serological, and molecular levels. ChMV was transmitted mechanically and induced disease symptoms mainly in Cucurbitaceae hosts. Asymptomatic infections were detected in other host families. Serologically, ChMV is related to the Andean potato latent virus (APLV) and the Eggplant mosaic virus (EMV), both members of the genus Tymovirus infecting solanaceous hosts in the Caribbean Basin and South America. The sequence of the genomic RNA of ChMV was determined and its genetic organization was typical of tymoviruses. Comparisons with other tymoviral sequences showed that ChMV was a new member of the genus Tymovirus. The phylogenetic analyses of the coat protein gene were consistent with serological comparisons and positioned ChMV within a cluster of tymoviruses infecting mainly cucurbit or solanaceous hosts, including APLV and EMV. Phylogenetic analyses of the replicase protein gene confirmed the close relationship of ChMV and EMV. Our results suggest that ChMV is related to two tymoviruses (APLV and EMV) of proximal geographical provenance but with different natural host ranges. ChMV is the first cucurbit-infecting tymovirus to be fully characterized at the genomic level.
High quality de novo sequencing and assembly of the Saccharomyces arboricolus genome
2013-01-01
Background Comparative genomics is a formidable tool to identify functional elements throughout a genome. In the past ten years, studies in the budding yeast Saccharomyces cerevisiae and a set of closely related species have been instrumental in showing the benefit of analyzing patterns of sequence conservation. Increasing the number of closely related genome sequences makes the comparative genomics approach more powerful and accurate. Results Here, we report the genome sequence and analysis of Saccharomyces arboricolus, a yeast species recently isolated in China, that is closely related to S. cerevisiae. We obtained high quality de novo sequence and assemblies using a combination of next generation sequencing technologies, established the phylogenetic position of this species and considered its phenotypic profile under multiple environmental conditions in the light of its gene content and phylogeny. Conclusions We suggest that the genome of S. arboricolus will be useful in future comparative genomics analysis of the Saccharomyces sensu stricto yeasts. PMID:23368932
Xochelli, Aliki; Agathangelidis, Andreas; Kavakiotis, Ioannis; Minga, Evangelia; Sutton, Lesley Ann; Baliakas, Panagiotis; Chouvarda, Ioanna; Giudicelli, Véronique; Vlahavas, Ioannis; Maglaveras, Nikos; Bonello, Lisa; Trentin, Livio; Tedeschi, Alessandra; Panagiotidis, Panagiotis; Geisler, Christian; Langerak, Anton W; Pospisilova, Sarka; Jelinek, Diane F; Oscier, David; Chiorazzi, Nicholas; Darzentas, Nikos; Davi, Fred; Ghia, Paolo; Rosenquist, Richard; Hadzidimitriou, Anastasia; Belessi, Chrysoula; Lefranc, Marie-Paule; Stamatopoulos, Kostas
2015-01-01
Next-generation sequencing studies in Homo sapiens have identified novel immunoglobulin heavy variable (IGHV) genes and alleles necessitating changes in the international ImMunoGeneTics information system (IMGT) GENE-DB and reference directories of IMGT/V-QUEST. In chronic lymphocytic leukaemia (CLL), the somatic hypermutation (SHM) status of the clonotypic rearranged IGHV gene is strongly associated with patient outcome. Correct determination of this parameter strictly depends on the comparison of the nucleotide sequence of the clonotypic rearranged IGHV gene with that of the closest germline counterpart. Consequently, changes in the reference directories could, in principle, affect the correct interpretation of the IGHV mutational status in CLL. To this end, we analyzed 8066 productive IG heavy chain (IGH) rearrangement sequences from our consortium both before and after the latest update of the IMGT/V-QUEST reference directory. Differences were identified in 405 cases (5 % of the cohort). In 291/405 sequences (71.9 %), changes concerned only the IGHV gene or allele name, whereas a change in the percent germline identity (%GI) was noted in 114/405 (28.1 %) sequences; in 50/114 (43.8 %) sequences, changes in the %GI led to a change in the mutational set. In conclusion, recent changes in the IMGT reference directories affected the interpretation of SHM in a sizeable number of IGH rearrangement sequences from CLL patients. This indicates that both physicians and researchers should consider a re-evaluation of IG sequence data, especially for those IGH rearrangement sequences that, to date, have a GI close to 98 %, where caution is warranted.
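Why sequences with a GI close to 98 % are the sensitive ones can be seen from a minimal sketch of the calculation. It assumes the widely used 98 % cutoff (identity at or above the cutoff is classed as unmutated) and two pre-aligned sequences of equal length; it is not the IMGT/V-QUEST algorithm, and the function names and toy sequences are hypothetical.

```python
def percent_germline_identity(rearranged, germline):
    """Percent identity over aligned columns, ignoring columns with a gap ('-')."""
    if len(rearranged) != len(germline):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(r, g) for r, g in zip(rearranged, germline)
                if r != "-" and g != "-"]
    matches = sum(r == g for r, g in compared)
    return 100.0 * matches / len(compared)

def shm_status(gi, cutoff=98.0):
    """Classify by the assumed cutoff: >= cutoff is 'unmutated'."""
    return "unmutated" if gi >= cutoff else "mutated"

def with_substitutions(seq, positions):
    """Return seq with a substituted base at each given position (toy helper)."""
    bases = list(seq)
    for i in positions:
        bases[i] = "T" if bases[i] != "T" else "A"
    return "".join(bases)

# Toy 100-nt germline; the same rearranged sequence differs from two
# hypothetical reference versions by 2 or by 3 nucleotides.
germline = "ACGT" * 25
gi_two = percent_germline_identity(with_substitutions(germline, [0, 10]), germline)
gi_three = percent_germline_identity(with_substitutions(germline, [0, 10, 20]), germline)

print(gi_two, shm_status(gi_two))
print(gi_three, shm_status(gi_three))
```

In this toy case a single extra mismatch against the reference moves the sequence across the cutoff, which is the kind of borderline reclassification a reference directory update can produce.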
The genome sequence of the model ascomycete fungus Podospora anserina
Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne GJ; Henrissat, Bernard; Khoury, Riyad EL; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe
2008-01-01
Background The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. Results We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed sequence tag collection. Similar to higher eukaryotes, the P. anserina transcription/splicing machinery generates numerous non-conventional transcripts. Comparison of the P. anserina genome and orthologous gene set with the one of its close relatives, Neurospora crassa, shows that synteny is poorly conserved, the main result of evolution being gene shuffling in the same chromosome. The P. anserina genome contains fewer repeated sequences and has evolved new genes by duplication since its separation from N. crassa, despite the presence of the repeat induced point mutation mechanism that mutates duplicated sequences. We also provide evidence that frequent gene loss took place in the lineages leading to P. anserina and N. crassa. P. anserina contains a large and highly specialized set of genes involved in utilization of natural carbon sources commonly found in its natural biotope. It includes genes potentially involved in lignin degradation and efficient cellulose breakdown. Conclusion The features of the P. anserina genome indicate a highly dynamic evolution since the divergence of P. anserina and N. crassa, leading to the ability of the former to use specific complex carbon sources that match its needs in its natural biotope. PMID:18460219
Sequence similarities and evolutionary relationships of microbial, plant and animal alpha-amylases.
Janecek, S
1994-09-01
Amino acid sequence comparison of 37 alpha-amylases from microbial, plant and animal sources was performed to identify their mutual sequence similarities in addition to the five already described conserved regions. These sequence regions were examined from structure/function and evolutionary perspectives. An unrooted evolutionary tree of alpha-amylases was constructed on a subset of 55 residues from the alignment of sequence similarities along with conserved regions. The most important new information extracted from the tree was as follows: (a) the close evolutionary relationship of Alteromonas haloplanctis alpha-amylase (thermolabile enzyme from an antarctic psychrotroph) with the already known group of homologous alpha-amylases from streptomycetes, Thermomonospora curvata, insects and mammals, and (b) the remarkable 40.1% identity between starch-saccharifying Bacillus subtilis alpha-amylase and the enzyme from the ruminal bacterium Butyrivibrio fibrisolvens, an alpha-amylase with an unusually large polypeptide chain (943 residues in the mature enzyme). Due to a very high degree of similarity, the whole amino acid sequences of three groups of alpha-amylases, namely (a) fungi and yeasts, (b) plants, and (c) A. haloplanctis, streptomycetes, T. curvata, insects and mammals, were aligned independently and their unrooted distance trees were calculated using these alignments. Possible rooting of the trees was also discussed. Based on the knowledge of the location of the five disulfide bonds in the structure of pig pancreatic alpha-amylase, the possible disulfide bridges were established for each of these groups of homologous alpha-amylases.
Brylinski, Michal; Konieczny, Leszek; Kononowicz, Andrzej; Roterman, Irena
2008-03-21
The well-known sequence-comparison procedure implemented in ClustalW was applied to structure comparison. A consensus sequence and a consensus structure were defined for proteins belonging to the serpin family. The structure of the early-stage intermediate was the object of the similarity search. High values of W(sequence) were found to accord with high values of W(structure), making structure comparison possible using common criteria for sequence and structure. Since the early-stage structural form is generated within a limited conformational sub-space that does not include the beta-structure (this structure is mediated by the C7eq structural form), it is particularly important that the C7eq structural form may be treated as the seed for the beta-structure present in the final native structure of the protein. The applicability of the ClustalW procedure to structure comparison unifies these two comparisons.
Su, Fei; Ou, Hong-Yu; Tao, Fei; Tang, Hongzhi; Xu, Ping
2013-12-27
With genomic sequences of many closely related bacterial strains made available by deep sequencing, it is now possible to investigate trends in prokaryotic microevolution. Positive selection is a sub-process of microevolution, in which a particular mutation is favored, causing the allele frequency to continuously shift in one direction. Wide scanning of prokaryotic genomes has shown that positive selection at the molecular level is much more frequent than expected. Genes with significant positive selection may play key roles in bacterial adaption to different environmental pressures. However, selection pressure analyses are computationally intensive and awkward to configure. Here we describe an open access web server, which is designated as PSP (Positive Selection analysis for Prokaryotic genomes) for performing evolutionary analysis on orthologous coding genes, specially designed for rapid comparison of dozens of closely related prokaryotic genomes. Remarkably, PSP facilitates functional exploration at the multiple levels by assignments and enrichments of KO, GO or COG terms. To illustrate this user-friendly tool, we analyzed Escherichia coli and Bacillus cereus genomes and found that several genes, which play key roles in human infection and antibiotic resistance, show significant evidence of positive selection. PSP is freely available to all users without any login requirement at: http://db-mml.sjtu.edu.cn/PSP/. PSP ultimately allows researchers to do genome-scale analysis for evolutionary selection across multiple prokaryotic genomes rapidly and easily, and identify the genes undergoing positive selection, which may play key roles in the interactions of host-pathogen and/or environmental adaptation.
Analysis of the Isolated SecA DEAD Motor Suggests a Mechanism for Chemical-Mechanical Coupling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nithianantham, Stanley; Shilton, Brian H
2011-09-28
The preprotein cross-linking domain and C-terminal domains of Escherichia coli SecA were removed to create a minimal DEAD motor, SecA-DM. SecA-DM hydrolyzes ATP and has the same affinity for ADP as full-length SecA. The crystal structure of SecA-DM in complex with ADP was solved and shows the DEAD motor in a closed conformation. Comparison with the structure of the E. coli DEAD motor in an open conformation (Protein Data Bank ID 2FSI) indicates main-chain conformational changes in two critical sequences corresponding to Motif III and Motif V of the DEAD helicase family. The structures that the Motif III and Motif V sequences adopt in the DEAD motor open conformation are incompatible with the closed conformation. Therefore, when the DEAD motor makes the transition from open to closed, Motif III and Motif V are forced to change their conformations, which likely functions to regulate passage through the transition state for ATP hydrolysis. The transition state for ATP hydrolysis for the SecA DEAD motor was modeled based on the conformation of the Vasa helicase in complex with adenylyl imidodiphosphate and RNA (Protein Data Bank ID 2DB3). A mechanism for chemical-mechanical coupling emerges, where passage through the transition state for ATP hydrolysis is hindered by the conformational changes required in Motif III and Motif V, and may be promoted by binding interactions with the preprotein substrate and/or other translocase domains and subunits.
Contacts between the factor TUF and RPG sequences.
Vignais, M L; Huet, J; Buhler, J M; Sentenac, A
1990-08-25
The yeast TUF factor binds specifically to RPG-like sequences involved in multiple functions at enhancers, silencers, and telomeres. We have characterized the interaction of TUF with its optimal binding sequence, rpg-1 (1-ACACCCATACATTT-14), using a gel DNA-binding assay in combination with methylation protection and mutagenesis experiments. As many as 10 base pairs appear to be engaged in factor binding. Analysis of a collection of 30 different RPG mutants demonstrated the importance of 8 base pairs at positions 2, 3, 4, 5, 6, 7, 10, and 12 and the critical role of the central GC pair at position 5. Methylation protection data on four different natural sites confirmed a close contact at positions 4, 5, 6, and 10 and suggested additional contacts at base pairs 8, 12, and 13. The derived consensus sequence was RCAAYCCRYNCAYY. A quantitative band shift analysis was used to determine the equilibrium dissociation constant for the complex of TUF and its optimal binding site rpg-1. The specific dissociation constant (Ks) was found to be 1.3 x 10^-11 M. The comparison of the Ks value with the dissociation constant obtained for nonspecific DNA sites (Kns = 8.7 x 10^-6 M) shows the high binding selectivity of TUF for its specific RPG target.
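The selectivity implied by the two dissociation constants quoted above can be made explicit with a short calculation. The sketch below only takes the ratio of the nonspecific to the specific constant; the function and variable names are illustrative, and expressing selectivity as a simple ratio is an assumption of this example rather than a statement of how the authors quantified it.

```python
import math

# Dissociation constants as quoted in the abstract (mol/L).
K_SPECIFIC = 1.3e-11     # TUF bound to its optimal site rpg-1
K_NONSPECIFIC = 8.7e-6   # TUF bound to nonspecific DNA sites


def selectivity_ratio(k_specific: float, k_nonspecific: float) -> float:
    """Return how many times tighter the specific site is bound."""
    if k_specific <= 0 or k_nonspecific <= 0:
        raise ValueError("dissociation constants must be positive")
    return k_nonspecific / k_specific


ratio = selectivity_ratio(K_SPECIFIC, K_NONSPECIFIC)
print(f"selectivity ratio: {ratio:.2e}")
print(f"orders of magnitude: {math.log10(ratio):.1f}")
```

On these figures the specific site is bound several hundred thousand times more tightly than nonspecific DNA, which is what the abstract summarizes as high binding selectivity.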
The Reference Genome of the Halophytic Plant Eutrema salsugineum
Yang, Ruolin; Jarvis, David E.; Chen, Hao; Beilstein, Mark A.; Grimwood, Jane; Jenkins, Jerry; Shu, ShengQiang; Prochnik, Simon; Xin, Mingming; Ma, Chuang; Schmutz, Jeremy; Wing, Rod A.; Mitchell-Olds, Thomas; Schumaker, Karen S.; Wang, Xiangfeng
2013-01-01
Halophytes are plants that can naturally tolerate high concentrations of salt in the soil, and their tolerance to salt stress may occur through various evolutionary and molecular mechanisms. Eutrema salsugineum is a halophytic species in the Brassicaceae that can naturally tolerate multiple types of abiotic stresses that typically limit crop productivity, including extreme salinity and cold. It has been widely used as a laboratorial model for stress biology research in plants. Here, we present the reference genome sequence (241 Mb) of E. salsugineum at 8× coverage sequenced using the traditional Sanger sequencing-based approach with comparison to its close relative Arabidopsis thaliana. The E. salsugineum genome contains 26,531 protein-coding genes and 51.4% of its genome is composed of repetitive sequences that mostly reside in pericentromeric regions. Comparative analyses of the genome structures, protein-coding genes, microRNAs, stress-related pathways, and estimated translation efficiency of proteins between E. salsugineum and A. thaliana suggest that halophyte adaptation to environmental stresses may occur via a global network adjustment of multiple regulatory mechanisms. The E. salsugineum genome provides a resource to identify naturally occurring genetic alterations contributing to the adaptation of halophytic plants to salinity and that might be bioengineered in related crop species. PMID:23518688
van Keulen, H; Campbell, S R; Erlandsen, S L; Jarroll, E L
1991-06-01
In an attempt to study Giardia at the DNA sequence level, the rRNA genes of three species, Giardia duodenalis, Giardia ardeae and Giardia muris were cloned and restriction enzyme maps were constructed. The rDNA repeats of these Giardia show completely different restriction enzyme recognition patterns. The size of the rDNA repeat ranges from approximately 5.6 kb in G. duodenalis to 7.6 kb in both G. muris and G. ardeae. These size differences are mainly attributable to the variation in length of the spacer. Minor differences exist among these Giardia in the sizes of their small subunit rRNA and the internal transcribed spacer between small and large subunit rRNA. The genetic maps were constructed by sequence analysis of the DNA around the 5' and 3' ends of the mature rRNA genes and between the rRNA covering the 5.8S rRNA gene and internal transcribed spacer. Comparison of the 5.8S rDNA and 3' end of large subunit rDNA from these three Giardia species showed considerable sequence variation, but the rDNA sequences of G. duodenalis and G. ardeae appear more closely related to each other than to G. muris.
Zhu, Xiao-Feng; Zhang, Dian-Peng; Yang, Sen; Zhang, Qing-Wen
2017-03-01
Three yeast strains designated as S44, XF1 and XF2, respectively, were isolated from Scolytus scheryrewi Semenov of apricot tree in Shule County, Xinjiang, China, and were demonstrated to be a new member of the genus Candida by sequence comparisons of 26S rRNA gene D1/D2 domain and internal transcribed spacer (ITS) region. BLASTn alignments on NCBI showed that the similarity of 26S rRNA gene sequences of S44 (type strain) to all sequences of other Candida yeasts was very low (≦93 %). The phylogenetic tree based on the 26S rRNA gene D1/D2 domain and ITS region sequences revealed that the strain S44 is closely related to C. blattae, C. dosseyi, C. pruni, C. asparagi, C. fructus and C. musae. However, the strain S44 is distinguished from these Candida species by the physiological characteristics. Moreover, the strain S44 formed typical pseudohyphae when grown on cornmeal agar at 25 °C for 7 days, but did not form ascospores in sporulation medium for 3-4 weeks. Therefore, the name Candida xinjiangensis is proposed for the novel species, with S44 (=KCTC T 27747) as the type strain.
Streptococcus ovuberis sp. nov., isolated from a subcutaneous abscess in the udder of a sheep.
Zamora, Leydis; Pérez-Sancho, Marta; Fernández-Garayzábal, Jose Francisco; Orden, Jose Antonio; Domínguez-Bernal, Gustavo; de la Fuente, Ricardo; Domínguez, Lucas; Vela, Ana Isabel
2017-11-01
One unidentified, Gram-stain-positive, catalase-negative coccus-shaped organism was recovered from a subcutaneous abscess of the udder of a sheep and subjected to a polyphasic taxonomic analysis. Based on cellular morphology and biochemical criteria, the isolate was tentatively assigned to the genus Streptococcus, although the organism did not appear to match any recognized species. 16S rRNA gene sequence comparison studies confirmed its identification as a member of the genus Streptococcus and showed that the nearest phylogenetic relatives of the unknown coccus corresponded to Streptococcus moroccensis and Streptococcus cameli (95.9 % 16S rRNA gene sequence similarity). The sodA sequence analysis showed less than 89.3 % sequence similarity with the currently recognized species of the genus Streptococcus. The novel bacterial isolate was distinguished from close relatives of the genus Streptococcus by using biochemical tests. A mass spectrometry profile was also obtained for the novel isolate using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). Based on both phenotypic and phylogenetic findings, it is proposed that the unknown bacterium be classified as a representative of a novel species of the genus Streptococcus, Streptococcus ovuberis sp. nov. The type strain of Streptococcus ovuberis sp. nov. is VB15-00779 T (=CECT 9179 T =CCUG 69612 T ).
Pseudomonas aestus sp. nov., a plant growth-promoting bacterium isolated from mangrove sediments.
Vasconcellos, Rafael L F; Santos, Suikinai Nobre; Zucchi, Tiago Domingues; Silva, Fábio Sérgio Paulino; Souza, Danilo Tosta; Melo, Itamar Soares
2017-10-01
Strain CMAA 1215 T , a Gram-reaction-negative, aerobic, catalase-positive, polarly flagellated, motile, rod-shaped (0.5-0.8 × 1.3-1.9 µm) bacterium, was isolated from mangrove sediments, Cananéia Island, Brazil. Analysis of the 16S rRNA gene sequences showed that strain CMAA 1215 T forms a distinct phyletic line within the Pseudomonas putida subclade, being closely related to P. plecoglossicida ATCC 700383 T , P. monteilii NBRC 103158 T , and P. taiwanensis BCRC 17751 T , with sequence similarities of 98.86, 98.73, and 98.71%, respectively. Genomic comparisons of strain CMAA 1215 T with its closest phylogenetic type strains using average nucleotide identity (ANI) and DNA:DNA relatedness approaches revealed values of 84.3-85.3% and 56.0-63.0%, respectively. A multilocus sequence analysis (MLSA) of concatenated 16S rRNA, gyrB and rpoB gene sequences placed the novel species within the Pseudomonas putida subcluster, where it formed a new phylogenetic lineage. The phenotypic, physiological, biochemical, and genetic characteristics support the assignment of CMAA 1215 T to the genus Pseudomonas, representing a novel species. The name Pseudomonas aestus sp. nov. is proposed, with CMAA 1215 T (=NRRL B-653100 T = CBMAI 1962 T ) as the type strain.
Bhatia, S; Singh Negi, M; Lakshmikumaran, M
1996-11-01
EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families, RF 'A' and RF 'B.' The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species, namely B. nigra, B. campestris, and B. oleracea, was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other as the repeat families in both showed high sequence homology between each other. Second, the repeat elements in both the species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea. Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
Genome Sequencing of Steroid Producing Bacteria Using Ion Torrent Technology and a Reference Genome.
Sola-Landa, Alberto; Rodríguez-García, Antonio; Barreiro, Carlos; Pérez-Redondo, Rosario
2017-01-01
Next-generation sequencing technology has greatly eased bacterial genome sequencing, and several tens of thousands of genomes have been sequenced during the last 10 years. Most genome projects are published as draft versions; however, for certain applications the complete genome sequence is required. In this chapter, we describe the strategy that allowed the complete genome sequencing of Mycobacterium neoaurum NRRL B-3805, an industrial strain exploited for steroid production, using Ion Torrent sequencing reads and the genome of a close strain as the reference. This protocol can be applied to analyze the genetic variations between closely related strains; for example, to elucidate the point mutations between a parental strain and a random mutagenesis-derived mutant.
Evolutionary analyses of hedgehog and Hoxd-10 genes in fish species closely related to the zebrafish
Zardoya, Rafael; Abouheif, Ehab; Meyer, Axel
1996-01-01
The study of development has relied primarily on the isolation of mutations in genes with specific functions in development and on the comparison of their expression patterns in normal and mutant phenotypes. Comparative evolutionary analyses can complement these approaches. Phylogenetic analyses of Sonic hedgehog (Shh) and Hoxd-10 genes from 18 cyprinid fish species closely related to the zebrafish provide novel insights into the functional constraints acting on Shh. Our results confirm and extend those gained from expression and crystalline structure analyses of this gene. Unexpectedly, exon 1 of Shh is found to be almost invariant even in third codon positions among these morphologically divergent species suggesting that this exon encodes for a functionally important domain of the hedgehog protein. This is surprising because the main functional domain of Shh had been thought to be that encoded by exon 2. Comparisons of Shh and Hoxd-10 gene sequences and of resulting gene trees document higher evolutionary constraints on the former than on the latter. This might be indicative of more general evolutionary patterns in networks of developmental regulatory genes interacting in a hierarchical fashion. The presence of four members of the hedgehog gene family in cyprinid fishes was documented and their homologies to known hedgehog genes in other vertebrates were established. PMID:8917540
Covell, Christine L; Sidani, Souraya; Ritchie, Judith A
2012-06-01
The sequence used for collecting quantitative and qualitative data in concurrent mixed-methods research may influence participants' responses. Empirical evidence is needed to determine whether the order of data collection in concurrent mixed-methods research biases participants' responses to closed and open-ended questions. The aim of this study was to examine the influence of the quantitative-qualitative sequence on responses to closed and open-ended questions when assessing the same variables or aspects of a phenomenon simultaneously within the same study phase. A descriptive cross-sectional, concurrent mixed-methods design was used to collect quantitative (survey) and qualitative (interview) data. The setting was a large multi-site health care centre in Canada. A convenience sample of 50 registered nurses was selected and participated in the study. Participants were randomly assigned to one of two sequences for data collection, quantitative-qualitative or qualitative-quantitative. Independent t-tests were performed to compare the two groups' responses to the survey items. Directed content analysis was used to compare the participants' responses to the interview questions. The sequence of data collection did not greatly affect the participants' responses to the closed-ended questions (survey items) or the open-ended questions (interview questions). The sequencing of data collection, when using both survey and semi-structured interviews, may not bias participants' responses to closed or open-ended questions. Additional research is required to confirm these findings.
ERIC Educational Resources Information Center
Limongelli, Carla; Sciarrone, Filippo; Temperini, Marco; Vaste, Giulia
2011-01-01
LS-Lab provides automatic support to comparison/evaluation of the Learning Object Sequences produced by different Curriculum Sequencing Algorithms. Through this framework a teacher can verify the correspondence between the behaviour of different sequencing algorithms and her pedagogical preferences. In fact the teacher can compare algorithms…
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.
Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark
2003-07-04
The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
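The idea of a phylogenetic footprint can be sketched in a few lines of Python: given sequences that are already aligned, report the runs of columns that are identical in every species. This is a deliberately simplified illustration and not the method used in the study; the function name, the minimum run length, and the three toy sequences are all assumptions of this example.

```python
def conserved_blocks(alignment, min_len=6):
    """Return (start, end, motif) for runs of fully conserved columns.

    `alignment` is a list of equal-length, pre-aligned sequences.
    Columns containing a gap ('-') never count as conserved.
    `end` is exclusive, as in Python slicing.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    blocks = []
    start = None
    # Iterate one position past the end so a trailing run is flushed.
    for i in range(length + 1):
        conserved = False
        if i < length:
            column = {seq[i].upper() for seq in alignment}
            conserved = len(column) == 1 and "-" not in column
        if conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                blocks.append((start, i, alignment[0][start:i].upper()))
            start = None
    return blocks


# Toy upstream regions from three hypothetical species (invented data).
toy_alignment = [
    "AATTGCACGTGAGGCT",
    "CATAGCACGTGATCCA",
    "GTTCGCACGTGACGAT",
]
footprints = conserved_blocks(toy_alignment, min_len=6)
print(footprints)
```

Requiring a minimum run length filters out isolated columns that match by chance; real analyses would additionally weight conservation by the evolutionary distance between the species compared.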
Small gene family encoding an eggshell (chorion) protein of the human parasite Schistosoma mansoni
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobek, L.A.; Rekosh, D.M.; Lo Verde, P.T.
1988-08-01
The authors isolated six independent genomic clones encoding schistosome chorion or eggshell proteins from a Schistosoma mansoni genomic library. A linkage map of five of the clones spanning 35 kilobase pairs (kbp) of the S. mansoni genome was constructed. The region contained two eggshell protein genes closely linked, separated by 7.5 kbp of intergenic DNA. The two genes of the cluster were arranged in the same orientation, that is, they were transcribed from the same strand. The sixth clone probably represents a third copy of the eggshell gene that is not contained within the 35-kbp region. The 5' end of the mRNA transcribed from these genes was defined by primer extension directly off the RNA. The ATCAT cap site sequence was homologous to a silkmoth chorion PuTCATT cap site sequence, where Pu indicates any purine. DNA sequence analysis showed that there were no introns in these genes. The DNA sequences of the three genes were very homologous to each other and to a cDNA clone, pSMf61-46, differing only in three or four nucleotides. A multiple TATA box was located at positions -23 to -31, and a CAAAT sequence was located at -52 upstream of the eggshell transcription unit. Comparison of sequences in regions further upstream with silkmoth and Drosophila sequences revealed very short elements that were shared. One such element, TCACGT, recently shown to be an essential cis-regulatory element for silkmoth chorion gene promoter function, was found at a similar position in all three organisms.
Equid herpesvirus 8: Complete genome sequence and association with abortion in mares
Garvey, Marie; Suárez, Nicolás M.; Kerr, Karen; Hector, Ralph; Moloney-Quinn, Laura; Arkins, Sean; Davison, Andrew J.
2018-01-01
Equid herpesvirus 8 (EHV-8), formerly known as asinine herpesvirus 3, is an alphaherpesvirus that is closely related to equid herpesviruses 1 and 9 (EHV-1 and EHV-9). The pathogenesis of EHV-8 is relatively little studied and to date has only been associated with respiratory disease in donkeys in Australia and horses in China. A single EHV-8 genome sequence has been generated for strain Wh in China, but is apparently incomplete and contains frameshifts in two genes. In this study, the complete genome sequences of four EHV-8 strains isolated in Ireland between 2003 and 2015 were determined by Illumina sequencing. Two of these strains were isolated from cases of abortion in horses, and were misdiagnosed initially as EHV-1, and two were isolated from donkeys, one with neurological disease. The four genome sequences are very similar to each other, exhibiting greater than 98.4% nucleotide identity, and their phylogenetic clustering together demonstrated that genomic diversity is not dependent on the host. Comparative genomic analysis revealed 24 of the 76 predicted protein sequences are completely conserved among the Irish EHV-8 strains. Evolutionary comparisons indicate that EHV-8 is phylogenetically closer to EHV-9 than it is to EHV-1. In summary, the first complete genome sequences of EHV-8 isolates from two host species over a twelve year period are reported. The current study suggests that EHV-8 can cause abortion in horses. The potential threat of EHV-8 to the horse industry and the possibility that donkeys may act as reservoirs of infection warrant further investigation. PMID:29414990
Lv, Qiang; Chen, Ming; Xu, Haiyan; Song, Yuqin; Sun, Zhihong; Dan, Tong; Sun, Tiansong
2013-07-04
Using the 16S rRNA, dnaA, murC and pyrG gene sequences, we identified the phylogenetic relationships among closely related Leuconostoc citreum species. Seven Leu. citreum strains originally isolated from sourdough were characterized by PCR methods to amplify the dnaA, murC and pyrG gene sequences, which were determined to assess their suitability as phylogenetic markers. We then estimated the genetic distances and constructed phylogenetic trees from the 16S rRNA gene and the three housekeeping genes mentioned above, in combination with the corresponding published sequences. Comparison of the phylogenetic trees showed that the topologies of the three housekeeping gene trees were consistent with that of the 16S rRNA gene tree. The homology among closely related Leu. citreum species differed between the dnaA, murC, pyrG and 16S rRNA gene sequences, ranging from 75.5% to 97.2%, 50.2% to 99.7%, 65.0% to 99.8% and 98.5% to 100%, respectively. The phylogenetic relationships inferred from the three housekeeping gene sequences were highly consistent with the results from the 16S rRNA gene sequence, while the genetic distances of these housekeeping genes were much higher than those of the 16S rRNA gene. Consequently, the dnaA, murC and pyrG genes are suitable for the classification and identification of closely related Leu. citreum species.
Bondre, Vijay P; Sankararaman, Vasudha; Andhare, Vijaysinh; Tupekar, Manisha; Sapkal, Gajanan N
2016-11-01
Human herpes simplex virus 1 (HSV-1) is the most common cause of sporadic encephalitis in humans that contributes to >10 per cent of the encephalitis cases occurring worldwide. Availability of limited full genome sequences from a small number of isolates resulted in poor understanding of host and viral factors responsible for variable clinical outcome. In this study genetic relationship, extent and source of recombination using full-length genome sequence derived from a newly isolated HSV-1 isolate was studied in comparison with those sampled from patients with varied clinical outcome. Full genome sequence of HSV-1 isolated from cerebrospinal fluid (CSF) of a patient with acute encephalitis syndrome (AES) by inoculation in baby hamster kidney-21 (BHK-21) cells was determined using next-generation sequencing (NGS) technology. Phylogenetic analysis of the newly generated sequence in comparison with 33 additional full-length genomes defined genetic relationship with worldwide distributed strains. The bootscan and similarity plot analysis defined recombination crossovers and similarities between newly isolated Indian HSV-1 with six Asian and a total of 34 worldwide isolated strains. Mapping of 376,332 reads amplified from HSV-1 DNA by NGS generated full-length genome of 151,024 bp from newly isolated Indian HSV-1. Phylogenetic analysis classified worldwide distributed strains into three major evolutionary lineages correlating to their geographic distribution. Lineage 1 containing strains were isolated from America and Europe; lineage 2 contained all the strains from Asian countries along with the North American KOS and RE strains whereas the South African isolates were distributed into two groups under lineage 3. Recombination analysis confirmed events of recombination in Indian HSV-1 genome resulting from mixing of different strains evolved in Asian countries. 
Our results showed that the full-length genome sequence generated from an Indian HSV-1 isolate shared close genetic relationship with the American KOS and Chinese CR38 strains which belonged to the Asian genetic lineage. Recombination analysis of Indian isolate demonstrated multiple recombination crossover points throughout the genome. This full-length genome sequence amplified from the Indian isolate would be helpful to study HSV evolution, genetic basis of differential pathogenesis, host-virus interactions and viral factors contributing towards differential clinical outcome in human infections.
Bondre, Vijay P.; Sankararaman, Vasudha; Andhare, Vijaysinh; Tupekar, Manisha; Sapkal, Gajanan N.
2016-01-01
Background & objectives: Human herpes simplex virus 1 (HSV-1) is the most common cause of sporadic encephalitis in humans that contributes to >10 per cent of the encephalitis cases occurring worldwide. Availability of limited full genome sequences from a small number of isolates resulted in poor understanding of host and viral factors responsible for variable clinical outcome. In this study genetic relationship, extent and source of recombination using full-length genome sequence derived from a newly isolated HSV-1 isolate was studied in comparison with those sampled from patients with varied clinical outcome. Methods: Full genome sequence of HSV-1 isolated from cerebrospinal fluid (CSF) of a patient with acute encephalitis syndrome (AES) by inoculation in baby hamster kidney-21 (BHK-21) cells was determined using next-generation sequencing (NGS) technology. Phylogenetic analysis of the newly generated sequence in comparison with 33 additional full-length genomes defined genetic relationship with worldwide distributed strains. The bootscan and similarity plot analysis defined recombination crossovers and similarities between newly isolated Indian HSV-1 with six Asian and a total of 34 worldwide isolated strains. Results: Mapping of 376,332 reads amplified from HSV-1 DNA by NGS generated full-length genome of 151,024 bp from newly isolated Indian HSV-1. Phylogenetic analysis classified worldwide distributed strains into three major evolutionary lineages correlating to their geographic distribution. Lineage 1 containing strains were isolated from America and Europe; lineage 2 contained all the strains from Asian countries along with the North American KOS and RE strains whereas the South African isolates were distributed into two groups under lineage 3. Recombination analysis confirmed events of recombination in Indian HSV-1 genome resulting from mixing of different strains evolved in Asian countries. 
Interpretation & conclusions: Our results showed that the full-length genome sequence generated from an Indian HSV-1 isolate shared close genetic relationship with the American KOS and Chinese CR38 strains which belonged to the Asian genetic lineage. Recombination analysis of Indian isolate demonstrated multiple recombination crossover points throughout the genome. This full-length genome sequence amplified from the Indian isolate would be helpful to study HSV evolution, genetic basis of differential pathogenesis, host-virus interactions and viral factors contributing towards differential clinical outcome in human infections. PMID:28361829
Murdock, Andrew G
2008-05-01
Closely related outgroups are optimal for rooting phylogenetic trees; however, such ideal outgroups are not always available. A phylogeny of the marattioid ferns (Marattiaceae), an ancient lineage with no close relatives, was reconstructed using nucleotide sequences of multiple chloroplast regions (rps4 + rps4-trnS spacer, trnS-trnG spacer + trnG intron, rbcL, atpB), from 88 collections, selected to cover the broadest possible range of morphologies and geographic distributions within the extant taxa. Because marattioid ferns are phylogenetically isolated from other lineages, and internal branches are relatively short, rooting was problematic. Root placement was strongly affected by long-branch attraction under maximum parsimony and by model choice under maximum likelihood. A multifaceted approach to rooting was employed to isolate the sources of bias and produce a consensus root position. In a statistical comparison of all possible root positions with three different outgroups, most root positions were not significantly less optimal than the maximum likelihood root position, including the consensus root position. This phylogeny has several important taxonomic implications for marattioid ferns: Marattia in the broad sense is paraphyletic; the Hawaiian endemic Marattia douglasii is most closely related to tropical American taxa; and Angiopteris is monophyletic only if Archangiopteris and Macroglossum are included.
What can comparative genomics tell us about species concepts in the genus Aspergillus?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rokas, Antonis; Payne, Gary; Federova, Natalie D.
2007-12-15
Understanding the nature of species' boundaries is a fundamental question in evolutionary biology. The availability of genomes from several species of the genus Aspergillus allows us for the first time to examine the demarcation of fungal species at the whole-genome level. Here, we examine four case studies, two of which involve intraspecific comparisons, whereas the other two deal with interspecific genomic comparisons between closely related species. These four comparisons reveal significant variation in the nature of species boundaries across Aspergillus. For example, comparisons between A. fumigatus and Neosartorya fischeri (the teleomorph of A. fischerianus) and between A. oryzae and A. flavus suggest that measures of sequence similarity and species-specific genes are significantly higher for the A. fumigatus - N. fischeri pair. Importantly, the values obtained from the comparison between A. oryzae and A. flavus are remarkably similar to those obtained from an intra-specific comparison of A. fumigatus strains, giving support to the proposal that A. oryzae represents a distinct ecotype of A. flavus and not a distinct species. We argue that genomic data can aid Aspergillus taxonomy by serving as a source of novel and unprecedented amounts of comparative data, as a resource for the development of additional diagnostic tools, and finally as a knowledge database about the biological differences between strains and species.
Rideout, Jai Ram; He, Yan; Navas-Molina, Jose A; Walters, William A; Ursell, Luke K; Gibbons, Sean M; Chase, John; McDonald, Daniel; Gonzalez, Antonio; Robbins-Pianka, Adam; Clemente, Jose C; Gilbert, Jack A; Huse, Susan M; Zhou, Hong-Wei; Knight, Rob; Caporaso, J Gregory
2014-01-01
We present a performance-optimized algorithm, subsampled open-reference OTU picking, for assigning marker gene (e.g., 16S rRNA) sequences generated on next-generation sequencing platforms to operational taxonomic units (OTUs) for microbial community analysis. This algorithm provides benefits over de novo OTU picking (clustering can be performed largely in parallel, reducing runtime) and closed-reference OTU picking (all reads are clustered, not only those that match a reference database sequence with high similarity). Because more of our algorithm can be run in parallel relative to "classic" open-reference OTU picking, it makes open-reference OTU picking tractable on massive amplicon sequence data sets (though on smaller data sets, "classic" open-reference OTU clustering is often faster). We illustrate that here by applying it to the first 15,000 samples sequenced for the Earth Microbiome Project (1.3 billion V4 16S rRNA amplicons). To the best of our knowledge, this is the largest OTU picking run ever performed, and we estimate that our new algorithm runs in less than 1/5 the time than would be required of "classic" open reference OTU picking. We show that subsampled open-reference OTU picking yields results that are highly correlated with those generated by "classic" open-reference OTU picking through comparisons on three well-studied datasets. An implementation of this algorithm is provided in the popular QIIME software package, which uses uclust for read clustering. All analyses were performed using QIIME's uclust wrappers, though we provide details (aided by the open-source code in our GitHub repository) that will allow implementation of subsampled open-reference OTU picking independently of QIIME (e.g., in a compiled programming language, where runtimes should be further reduced). Our analyses should generalize to other implementations of these OTU picking algorithms. 
Finally, we present a comparison of parameter settings in QIIME's OTU picking workflows and make recommendations on settings for these free parameters to optimize runtime without reducing the quality of the results. These optimized parameters can vastly decrease the runtime of uclust-based OTU picking in QIIME.
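The workflow described in this abstract can be illustrated with a toy sketch. It assumes the commonly described four-step form of subsampled open-reference OTU picking: closed-reference assignment, de novo clustering of a random subsample of the failures, closed-reference assignment of all failures against the new centroids, and de novo clustering of whatever still fails. The `identity` function, the greedy clustering, the OTU name prefixes, and every parameter value below are illustrative assumptions; this is not the QIIME or uclust implementation.

```python
import random

def identity(a, b):
    """Toy similarity: fraction of matching positions (equal-length reads only)."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)

def closed_reference(reads, refs, threshold):
    """Assign each read to the first reference it matches; collect the failures."""
    hits, failures = {}, []
    for read in reads:
        match = next((rid for rid, seq in refs.items()
                      if identity(read, seq) >= threshold), None)
        if match is None:
            failures.append(read)
        else:
            hits.setdefault(match, []).append(read)
    return hits, failures

def de_novo(reads, threshold, prefix):
    """Greedy centroid clustering: a read that matches no centroid founds a new OTU."""
    centroids, clusters = {}, {}
    for read in reads:
        match = next((cid for cid, seq in centroids.items()
                      if identity(read, seq) >= threshold), None)
        if match is None:
            match = f"{prefix}{len(centroids)}"
            centroids[match] = read
        clusters.setdefault(match, []).append(read)
    return centroids, clusters

def subsampled_open_reference(reads, refs, threshold=0.9, fraction=0.5, seed=0):
    # Step 1: closed-reference assignment against the supplied references.
    otus, failures = closed_reference(reads, refs, threshold)
    # Step 2: cluster a random subsample of the failures de novo.
    rng = random.Random(seed)
    k = max(1, int(len(failures) * fraction)) if failures else 0
    new_refs, _ = de_novo(rng.sample(failures, k), threshold, "New.Ref")
    # Step 3: closed-reference assignment of all failures against the new centroids.
    hits, still_failed = closed_reference(failures, new_refs, threshold)
    for otu, members in hits.items():
        otus.setdefault(otu, []).extend(members)
    # Step 4: cluster whatever still fails de novo.
    _, leftovers = de_novo(still_failed, threshold, "New.Clean")
    otus.update(leftovers)
    return otus

toy_reads = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "CCCCCCCCCG", "GGGGGGGGGG"]
toy_refs = {"ref0": "AAAAAAAAAA"}
toy_otus = subsampled_open_reference(toy_reads, toy_refs)
```

In this sketch, steps 1 and 3 treat each read independently, so they are the parts that could be distributed across workers; only the de novo steps are inherently sequential.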
Recognition of Yeast Species from Gene Sequence Comparisons
USDA-ARS's Scientific Manuscript database
This review discusses recognition of yeast species from gene sequence comparisons, which have been responsible for doubling the number of known yeasts over the past decade. The resolution provided by various single gene sequences is examined for both ascomycetous and basidiomycetous species, and th...
Sequence analysis and expression of the M1 and M2 matrix protein genes of hirame rhabdovirus (HIRRV)
Nishizawa, T.; Kurath, G.; Winton, J.R.
1997-01-01
We have cloned and sequenced a 2318 nucleotide region of the genomic RNA of hirame rhabdovirus (HIRRV), an important viral pathogen of Japanese flounder Paralichthys olivaceus. This region comprises approximately two-thirds of the 3' end of the nucleocapsid protein (N) gene and the complete matrix protein (M1 and M2) genes with the associated intergenic regions. The partial N gene sequence was 812 nucleotides in length with an open reading frame (ORF) that encoded the carboxyl-terminal 250 amino acids of the N protein. The M1 and M2 genes were 771 and 700 nucleotides in length, respectively, with ORFs encoding proteins of 227 and 193 amino acids. The M1 gene sequence contained an additional small ORF that could encode a highly basic, arginine-rich protein of 25 amino acids. Comparisons of the N, M1, and M2 gene sequences of HIRRV with the corresponding sequences of the fish rhabdoviruses, infectious hematopoietic necrosis virus (IHNV) or viral hemorrhagic septicemia virus (VHSV) indicated that HIRRV was more closely related to IHNV than to VHSV, but was clearly distinct from either. The putative consensus gene termination sequence for IHNV and VHSV, AGAYAG(A)(7), was present in the N-M1, M1-M2, and M2-G intergenic regions of HIRRV as were the putative transcription initiation sequences YGGCAC and AACA. An Escherichia coli expression system was used to produce recombinant proteins from the M1 and M2 genes of HIRRV. These were the same size as the authentic M1 and M2 proteins and reacted with anti-HIRRV rabbit serum in western blots. These reagents can be used for further study of the fish immune response and to test novel control methods.
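Searching a sequence for the degenerate motifs named in this abstract can be sketched as follows. The sketch assumes the standard IUPAC reading of Y as C or T, and it assumes that "(A)(7)" denotes a run of seven A residues. The function names and the toy sequence are hypothetical, and nothing here reflects the authors' actual analysis.

```python
import re

# Subset of the IUPAC degenerate nucleotide codes, as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def motif_to_regex(motif):
    """Translate a degenerate motif such as YGGCAC into a regex string."""
    return "".join(IUPAC[base] for base in motif)

def find_motif(sequence, motif):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Assumed reading of AGAYAG(A)(7): AGAYAG followed by seven A residues.
TERMINATION = "AGAYAG" + "A" * 7

# Toy sequence invented for illustration; it is not HIRRV sequence data.
toy = "CCAGATAGAAAAAAATTGGCACGG"
termination_hits = find_motif(toy, TERMINATION)
initiation_hits = find_motif(toy, "YGGCAC")
```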
The limits of protein sequence comparison?
Pearson, William R; Sierk, Michael L
2010-01-01
Modern sequence alignment algorithms are used routinely to identify homologous proteins, proteins that share a common ancestor. Homologous proteins always share similar structures and often have similar functions. Over the past 20 years, sequence comparison has become both more sensitive, largely because of profile-based methods, and more reliable, because of more accurate statistical estimates. As sequence and structure databases become larger, and comparison methods become more powerful, reliable statistical estimates will become even more important for distinguishing similarities that are due to homology from those that are due to analogy (convergence). The newest sequence alignment methods are more sensitive than older methods, but more accurate statistical estimates are needed for their full power to be realized. PMID:15919194
Whistle sequences in wild killer whales (Orcinus orca).
Riesch, Rüdiger; Ford, John K B; Thomsen, Frank
2008-09-01
Combining different stereotyped vocal signals into specific sequences increases the range of information that can be transferred between individuals. The temporal emission pattern and the behavioral context of vocal sequences have been described in detail for a variety of birds and mammals. Yet, in cetaceans, the study of vocal sequences is just in its infancy. Here, we provide a detailed analysis of sequences of stereotyped whistles in killer whales off Vancouver Island, British Columbia. A total of 1140 whistle transitions in 192 whistle sequences recorded from resident killer whales were analyzed using common spectrographic analysis techniques. In addition to the stereotyped whistles described by Riesch et al. [(2006). "Stability and group specificity of stereotyped whistles in resident killer whales, Orcinus orca, off British Columbia," Anim. Behav. 71, 79-91], we found a new and rare stereotyped whistle (W7) as well as two whistle elements, which are closely linked to whistle sequences: (1) stammers and (2) bridge elements. Furthermore, the frequency of occurrence of 12 different stereotyped whistle types within the sequences was not randomly distributed and the transition patterns between whistles were also nonrandom. Finally, whistle sequences were closely tied to close-range behavioral interactions (in particular among males). Hence, we conclude that whistle sequences in wild killer whales are complex signal series and propose that they are most likely emitted by single individuals.
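Transition patterns of the kind analyzed in this abstract are often summarized as first-order transition counts. The sketch below shows one minimal way to tabulate them; the whistle labels and the toy sequences are invented for illustration, and the abstract does not say that the authors computed their statistics this way.

```python
from collections import Counter

def transition_counts(sequences):
    """Count adjacent (from, to) pairs across all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts

def transition_probabilities(counts):
    """Normalize counts so the transitions out of each type sum to 1."""
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}

# Hypothetical whistle-type labels; not data from the study.
toy_sequences = [["W1", "W2", "W1", "W2"], ["W1", "W2", "W3"]]
toy_counts = transition_counts(toy_sequences)
toy_probs = transition_probabilities(toy_counts)
```

A matrix like this could then be compared against the transition frequencies expected if whistle types followed each other at random, which is the kind of question the abstract's "nonrandom" finding addresses.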
Hornok, Sándor; Estrada-Peña, Agustín; Kontschán, Jenő; Plantard, Olivier; Kunz, Bernd; Mihalca, Andrei D; Thabah, Adora; Tomanović, Snežana; Burazerović, Jelena; Takács, Nóra; Görföl, Tamás; Estók, Péter; Tu, Vuong Tan; Szőke, Krisztina; Fernández de Mera, Isabel G; de la Fuente, José; Takahashi, Mamoru; Yamauchi, Takeo; Takano, Ai
2015-09-17
Phylogeographical studies allow precise genetic comparison of specimens, which were collected over large geographical ranges and belong to the same or closely related animal species. These methods have also been used to compare ticks of veterinary-medical importance. However, relevant data are missing in the case of ixodid ticks of bats, despite (1) the vast geographical range of both Ixodes vespertilionis and Ixodes simplex, and (2) the considerable uncertainty in their taxonomy, which is currently unresolvable by morphological clues. In the present study 21 ticks were selected from collections or were freshly removed from bats or cave walls in six European and four Asian countries. The DNA was extracted and PCRs were performed to amplify part of the cytochrome oxidase I (COI), 16S and 12S rDNA genes, followed by sequencing for identification and molecular-phylogenetic comparison. No morphological differences were observed between Ixodes vespertilionis specimens from Spain and from other parts of Europe, but corresponding genotypes had only 94.6 % COI sequence identity. An I. vespertilionis specimen collected in Vietnam was different both morphologically and genetically (i.e. with only 84.1 % COI sequence identity in comparison with I. vespertilionis from Europe). Two ticks (collected in Vietnam and in Japan) formed a monophyletic clade and shared morphological features with I. ariadnae, recently described and hitherto only reported in Europe. In addition, two Asiatic specimens of I. simplex were shown to differ markedly from European genotypes of the same species. Phylogenetic relationships of ticks showed similar clustering patterns with those of their associated bat host species. Although all three ixodid bat tick species evaluated in the present study appear to be widespread in Eurasia, they exhibit pronounced genetic differences. Data of this study also reflect that I. vespertilionis may represent a species complex.
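Identity figures such as the 94.6 % and 84.1 % COI values above are typically computed over aligned sequences. The sketch below shows one simple definition, assuming the two sequences are already aligned to equal length and that columns containing a gap are skipped. Other definitions exist, and the abstract does not state which one the authors used.

```python
def percent_identity(a, b):
    """Percent of identical residues over gap-free columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are excluded from the comparison
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared

# Toy aligned fragments invented for illustration; not COI data from the study.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```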
Medzihradszky, K F; Gibson, B W; Kaur, S; Yu, Z H; Medzihradszky, D; Burlingame, A L; Bass, N M
1992-02-01
The primary structure of a fatty-acid-binding protein (FABP) isolated from the liver of the nurse shark (Ginglymostoma cirratum) was determined by high-performance tandem mass spectrometry (employing multichannel array detection) and Edman degradation. Shark liver FABP consists of 132 amino acids with an acetylated N-terminal valine. The chemical molecular mass of the intact protein determined by electrospray ionization mass spectrometry (Mr = 15124 +/- 2.5) was in good agreement with that calculated from the amino acid sequence (Mr = 15121.3). The amino acid sequence of shark liver FABP displays significantly greater similarity to the FABP expressed in mammalian heart, peripheral nerve myelin and adipose tissue (61-53% sequence similarity) than to the FABP expressed in mammalian liver (22% similarity). Phylogenetic trees derived from the comparison of the shark liver FABP amino acid sequence with the members of the mammalian fatty-acid/retinoid-binding protein gene family indicate the initial divergence of an ancestral gene into two major subfamilies: one comprising the genes for mammalian liver FABP and gastrotropin, the other comprising the genes for mammalian cellular retinol-binding proteins I and II, cellular retinoic-acid-binding protein, myelin P2 protein, adipocyte FABP, heart FABP and shark liver FABP, the latter having diverged from the ancestral gene that ultimately gave rise to the present day mammalian heart-FABP, adipocyte FABP and myelin P2 protein sequences. The sequence for intestinal FABP from the rat could be assigned to either subfamily, depending on the approach used for phylogenetic tree construction, but clearly diverged at a relatively early evolutionary time point. Indeed, sequences proximately ancestral or closely related to mammalian intestinal FABP, liver FABP, gastrotropin and the retinoid-binding group of proteins appear to have arisen prior to the divergence of shark liver FABP and should therefore also be present in elasmobranchs. 
The presence in shark liver of an FABP which differs substantially in primary structure from mammalian liver FABP, while being closely related to the FABP expressed in mammalian heart muscle, peripheral nerve myelin and adipocytes, opens a further dimension regarding the question of the existence of structure-dependent and tissue-specific specialization of FABP function in lipid metabolism.
Genetic Comparison of B. Anthracis and its Close Relatives Using AFLP and PCR Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jackson, P.J.; Hill, K.K.; Laker, M.T.
1999-02-01
Amplified Fragment Length Polymorphism (AFLP) analysis allows a rapid, relatively simple analysis of a large portion of a microbial genome, providing information about the species and its phylogenetic relationship to other microbes (Vos, et al., 1995). The method simply surveys the genome for length and sequence polymorphisms. The pattern identified can be used for comparison to the genomes of other species. Unlike other methods, it does not rely on analysis of a single genetic locus that may bias the interpretation of results and it does not require any prior knowledge of the targeted organism. Moreover, a standard set of reagents can be applied to any species without using species-specific information or molecular probes. The authors are using AFLPs to rapidly identify different bacterial species. A comparison of AFLP profiles generated from a large battery of B. anthracis strains shows very little variability among different isolates (Keim, et al., 1997). By contrast, there is a significant difference between AFLP profiles generated for any B. anthracis strain and even the most closely related Bacillus species. Sufficient variability is apparent among all known microbial species to allow phylogenetic analysis based on large numbers of genetically unlinked loci. These striking differences among AFLP profiles allow unambiguous identification of previously identified species and phylogenetic placement of newly characterized isolates relative to known species based on a large number of independent genetic loci. Data generated thus far show that the method provides phylogenetic analyses that are consistent with other widely accepted phylogenetic methods. However, AFLP analysis provides a more detailed analysis of the targets and samples a much larger portion of the genome. Consequently, it provides an inexpensive, rapid means of characterizing microbial isolates to further differentiate among strains and closely related microbial species. 
Such information cannot be rapidly generated by other means. AFLP sample analysis quickly generates a very large amount of molecular information about microbial genomes. However, this information cannot be analyzed rapidly using manual methods. The authors are developing a large archive of electronic AFLP signatures that is being used to identify isolates collected from medical, veterinary, forensic and environmental samples. They are also developing the computational packages necessary to rapidly and unambiguously analyze the AFLP profiles and conduct a phylogenetic comparison of these data relative to information already in the database. They will use this archive and the associated algorithms to determine the species identity of previously uncharacterized isolates and place them phylogenetically relative to other microbes based on their AFLP signatures. This study provides significant new information about microbes with environmental, veterinary and medical significance. This information can be used in further studies to understand the relationships among these species and the factors that distinguish them from one another. It should also allow identification of unique factors that contribute to important microbial traits including pathogenicity and virulence. They are also using AFLP data to identify, isolate and sequence DNA fragments that are unique to particular microbial species and strains. The fragment patterns and sequence information provide insights into the complexity and organization of bacterial genomes relative to one another. They also provide the information necessary for development of species-specific PCR primers that can be used to interrogate complex samples for the presence of B. anthracis, other microbial pathogens or their remnants.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curry, J J; Gallagher, D W; Modarres, M
Appendices are presented concerning isolation condenser makeup; vapor suppression system; station air system; reactor building closed cooling water system; turbine building secondary closed water system; service water system; emergency service water system; fire protection system; emergency ac power; dc power system; event probability estimation; methodology of accident sequence quantification; and assignment of dominant sequences to release categories.
33 CFR 117.1035 - Columbia River.
Code of Federal Regulations, 2010 CFR
2010-07-01
... the remote control stations located at the ends of the bridge. Operation of the bridge shall be as... visually inspect the waterway for marine traffic approaching the bridge. The closing sequence shall not be activated until after marine traffic has cleared the bridge. (3) When the closing sequence is activated, the...
33 CFR 117.1035 - Columbia River.
Code of Federal Regulations, 2011 CFR
2011-07-01
... the remote control stations located at the ends of the bridge. Operation of the bridge shall be as... visually inspect the waterway for marine traffic approaching the bridge. The closing sequence shall not be activated until after marine traffic has cleared the bridge. (3) When the closing sequence is activated, the...
A report on the outbreak of Zika virus on Easter Island, South Pacific, 2014.
Tognarelli, J; Ulloa, S; Villagra, E; Lagos, J; Aguayo, C; Fasce, R; Parra, B; Mora, J; Becerra, N; Lagos, N; Vera, L; Olivares, B; Vilches, M; Fernández, J
2016-03-01
Zika virus (ZIKV) is an emerging mosquito-borne flavivirus circulating in Asia and Africa. In 2013, a large outbreak was reported on the archipelago of French Polynesia. In this study, we report the detection and molecular characterization of Zika virus for the first time in Chile from an outbreak among the inhabitants of Easter Island. A total of 89 samples from patients suspected of having ZIKV infection were collected between January and May 2014. Molecular diagnosis of the virus was performed by RT-PCR followed by sequencing of the region containing the NS5 gene. The viral nucleic acid sequence was compared with those of other Zika virus strains using the MEGA software. Fifty-one samples were found positive for ZIKV by RT-PCR analysis. Further analysis of the NS5 gene revealed that the ZIKV strains identified in Easter Island were most closely related to those found in French Polynesia (99.8 to 99.9% nt and 100% aa sequence identity). These results strongly suggest that the transmission pathway leading to the introduction of Zika virus on Easter Island has its origin in French Polynesia.
A comprehensive molecular cytogenetic analysis of chromosome rearrangements in gibbons
Capozzi, Oronzo; Carbone, Lucia; Stanyon, Roscoe R.; Marra, Annamaria; Yang, Fengtang; Whelan, Christopher W.; de Jong, Pieter J.; Rocchi, Mariano; Archidiacono, Nicoletta
2012-01-01
Chromosome rearrangements in small apes are up to 20 times more frequent than in most mammals. Because of their complexity, the full extent of chromosome evolution in these hominoids is not yet fully documented. However, previous work with array painting, BAC-FISH, and selective sequencing in two of the four karyomorphs has shown that high-resolution methods can precisely define chromosome breakpoints and map the complex flow of evolutionary chromosome rearrangements. Here we use these tools to precisely define the rearrangements that have occurred in the remaining two karyomorphs, genera Symphalangus (2n = 50) and Hoolock (2n = 38). This research provides the most comprehensive insight into the evolutionary origins of the chromosome rearrangements involved in transforming the small ape genome. Bioinformatics analyses of the human–gibbon synteny breakpoints revealed association with transposable elements and segmental duplications, providing some insight into the mechanisms that might have promoted rearrangements in small apes. In the near future, the comparison of gibbon genome sequences will provide novel insights to test hypotheses concerning the mechanisms of chromosome evolution. The precise definition of synteny block boundaries and orientation, chromosomal fusions, and centromere repositioning events presented here will facilitate genome sequence assembly for these close relatives of humans. PMID:22892276
Grzes, M; Nowacka-Woszuk, J; Szczerbal, I; Czerwinska, J; Gracz, J; Switonski, M
2009-01-01
The gene encoding myostatin (MSTN), due to its crucial function for growth of skeletal muscle mass, is an important candidate for muscularity. In this study we analyzed the nucleotide sequence and FISH localization of this gene in 4 canids, including 3 farm species. The nucleotide sequence of the MSTN coding fragment turned out to be highly conserved, since its identity among the studied species was very high and varied between 99.4 and 99.7%. Only 1, widely spread, silent single nucleotide polymorphism (SNP) was found in exon 1 of the Chinese raccoon dog. The MSTN gene was localized close to the centromere in one-armed chromosomes of the dog (37q11) and bi-armed chromosomes of the red fox (16p11) and arctic fox (10q11), with an exception of the Chinese raccoon dog chromosome (2q14-q21). This chromosome is orthologous to 3 canine chromosomes and thus the MSTN was found more interstitially. Our results are in agreement with the hypothesis that karyotypes of the canids evolved mainly through centric fusion/fission events, while tandem fusions occurred rarely. (c) 2009 S. Karger AG, Basel.
Hayashi, Kei; Mohanta, Uday K; Ohari, Yuma; Neeraja, Tambireddy; Singh, T Shantikumar; Sugiyama, Hiromu; Itagaki, Tadashi
2016-12-01
The aim of this study was to analyze the phylogenetic relationship between Explanatum explanatum populations in India and other countries of the Indian subcontinent. Seventy liver amphistomes collected from four localities in India were identified as E. explanatum based on the nucleotide sequences of ribosomal ITS2. The flukes were then analyzed phylogenetically based on the nucleotide sequence of the mitochondrial gene nad1 in comparison with flukes from Bangladesh and Nepal. In the resulting phylogenetic tree, the nad1 haplotypes from India were divided into four clades, and the flukes showing the haplotypes of clades A and C were predominant in India. The haplotypes of the clades A and C have also been detected in Bangladesh and Nepal, and therefore, it seems they occur commonly throughout the Indian subcontinent. The results of AMOVA suggested that gene flow was likely to occur between E. explanatum populations in these countries. These countries are geographically close and have been historically and culturally connected to each other, and therefore, the movements of host ruminants among these countries might have been involved in the migration of the flukes and their gene flow.
Exploring the Sequence-based Prediction of Folding Initiation Sites in Proteins.
Raimondi, Daniele; Orlando, Gabriele; Pancsa, Rita; Khan, Taushif; Vranken, Wim F
2017-08-18
Protein folding is a complex process that can lead to disease when it fails. Especially poorly understood are the very early stages of protein folding, which are likely defined by intrinsic local interactions between amino acids close to each other in the protein sequence. We here present EFoldMine, a method that predicts, from the primary amino acid sequence of a protein, which amino acids are likely involved in early folding events. The method is based on early folding data from hydrogen deuterium exchange (HDX) data from NMR pulsed labelling experiments, and uses backbone and sidechain dynamics as well as secondary structure propensities as features. The EFoldMine predictions give insights into the folding process, as illustrated by a qualitative comparison with independent experimental observations. Furthermore, on a quantitative proteome scale, the predicted early folding residues tend to become the residues that interact the most in the folded structure, and they are often residues that display evolutionary covariation. The connection of the EFoldMine predictions with both folding pathway data and the folded protein structure suggests that the initial statistical behavior of the protein chain with respect to local structure formation has a lasting effect on its subsequent states.
MHC class I diversity in chimpanzees and bonobos.
Maibach, Vincent; Hans, Jörg B; Hvilsom, Christina; Marques-Bonet, Tomas; Vigilant, Linda
2017-10-01
Major histocompatibility complex (MHC) class I genes are critically involved in the defense against intracellular pathogens. MHC diversity comparisons among samples of closely related taxa may reveal traces of past or ongoing selective processes. The bonobo and chimpanzee are the closest living evolutionary relatives of humans and last shared a common ancestor some 1 mya. However, little is known concerning MHC class I diversity in bonobos or in central chimpanzees, the most numerous and genetically diverse chimpanzee subspecies. Here, we used a long-read sequencing technology (PacBio) to sequence the classical MHC class I genes A, B, C, and A-like in 20 and 30 wild-born bonobos and chimpanzees, respectively, with a main focus on central chimpanzees to assess and compare diversity in those two species. We describe in total 21 and 42 novel coding region sequences for the two species, respectively. In addition, we found evidence for a reduced MHC class I diversity in bonobos as compared to central chimpanzees as well as to western chimpanzees and humans. The reduced bonobo MHC class I diversity may be the result of a selective process in their evolutionary past since their split from chimpanzees.
Morphological and phylogenetic comparisons amongst powdery mildews on Catalpa in the UK.
Cook, Roger T A; Henricot, Béatrice; Henrici, Alick; Beales, Paul
2006-06-01
Three species of powdery mildew, Erysiphe elevata, E. catalpae, and Neoerysiphe galeopsidis were identified on Catalpa species in England in 2004. A new disease record, N. galeopsidis was the first Catalpa mildew to appear (in June), but it was later out-competed by E. elevata that caused the most serious damage. Both mildews also attacked C. speciosa, C. xerubescens and a new host, xChitalpa tashkentensis, a Chilopsis xCatalpa hybrid. No powdery mildew was detected on C. bungei, C. ovata, or C. fargesii. Identifications of the pathogens using morphological data were fully supported by DNA analysis yielding characteristic rDNA ITS sequences. The sequences placed E. catalpae within the E. aquilegiae clade. The sequences for E. elevata from southern England and France closely matched those from Hungary and North America, confirming the recent spread of this pathogen from the USA. It eventually overran N. galeopsidis and its sudden appearance in the UK could be due to greater aggressiveness and to the production of more ascomata especially during autumns with delayed leaf fall as in 2001. A further species, Oidium hiratae (i.e. Podosphaera sp.), though described from a 1978 UK collection on C. bignonioides, was not detected in the field.
rVISTA 2.0: Evolutionary Analysis of Transcription Factor Binding Sites
DOE Office of Scientific and Technical Information (OSTI.GOV)
Loots, G G; Ovcharenko, I
2004-01-28
Identifying and characterizing the patterns of DNA cis-regulatory modules represents a challenge that has the potential to reveal the regulatory language the genome uses to dictate transcriptional dynamics. Several studies have demonstrated that regulatory modules are under positive selection and therefore are often conserved between related species. Using this evolutionary principle we have created a comparative tool, rVISTA, for analyzing the regulatory potential of noncoding sequences. The rVISTA tool combines transcription factor binding site (TFBS) predictions, sequence comparisons and cluster analysis to identify noncoding DNA regions that are highly conserved and present in a specific configuration within an alignment. Here we present the newly developed version 2.0 of the rVISTA tool that can process alignments generated by both zPicture and PipMaker alignment programs or use pre-computed pairwise alignments of seven vertebrate genomes available from the ECR Browser. The rVISTA web server is closely interconnected with the TRANSFAC database, allowing users to either search for matrices present in the TRANSFAC library collection or search for user-defined consensus sequences. The rVISTA tool is publicly available at http://rvista.dcode.org/.
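The "user-defined consensus sequences" mentioned in the rVISTA abstract can be illustrated with a minimal sketch. It assumes the consensus is written in standard IUPAC nucleotide ambiguity codes and scans only the given strand of one sequence. It is not rVISTA's implementation, which also uses TRANSFAC matrices, conservation within an alignment and clustering. The function name find_consensus_sites is hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def find_consensus_sites(sequence, consensus):
    """Return 0-based start positions where the consensus matches (hypothetical helper).

    A zero-width lookahead is used so that overlapping sites are all reported.
    """
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence.upper())]

# Example: a GATA-like consensus (WGATAR) in a short made-up sequence.
sites = find_consensus_sites("CCAGATAAGG", "WGATAR")
```

In a comparative setting one would keep only the sites that fall inside conserved alignment blocks; that filtering step is left out here.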
Martin, Donald S; Wright, André-Denis G; Barta, John R; Desser, Sherwin S
2002-06-01
Phylogenetic relationships within the kinetoplastid flagellates were inferred from comparisons of small-subunit ribosomal RNA gene sequences. These included 5 new gene sequences, Trypanosoma fallisi (2,239 bp), Trypanosoma chattoni (2,180 bp), Trypanosoma mega (2,211 bp), Trypanosoma neveulemairei (2,197 bp), and Trypanosoma ranarum (2,203 bp). Trees produced using maximum-parsimony and distance-matrix methods (least-squares, neighbor-joining, and maximum-likelihood), supported by strong bootstrap and quartet-puzzle analyses, indicated that the trypanosomes are a monophyletic group that divides into 2 major lineages, the salivarian trypanosomes and the nonsalivarian trypanosomes. The nonsalivarian trypanosomes further divide into 2 lineages, 1 containing trypanosomes of birds, mammals, and reptiles and the other containing trypanosomes of fish, reptiles, and anurans. Among the giant trypanosomes, T. chattoni is clearly shown to be distantly related to all the other anuran trypanosome species. Trypanosoma mega is closely associated with T. fallisi and T. ranarum, whereas T. neveulemairei and Trypanosoma rotatorium are sister taxa. The branching order of the anuran trypanosomes suggests that some toad trypanosomes may have evolved by host switching from frogs to toads.
Test equality in binary data for a 4 × 4 crossover trial under a Latin-square design.
Lui, Kung-Jong; Chang, Kuang-Chao
2016-10-15
When there are four or more treatments under comparison, the use of a crossover design with a complete set of treatment-receipt sequences in binary data is of limited use because of too many treatment-receipt sequences. Thus, we may consider use of a 4 × 4 Latin square to reduce the number of treatment-receipt sequences when comparing three experimental treatments with a control treatment. Under a distribution-free random effects logistic regression model, we develop simple procedures for testing non-equality between any of the three experimental treatments and the control treatment in a crossover trial with dichotomous responses. We further derive interval estimators in closed forms for the relative effect between treatments. To evaluate the performance of these test procedures and interval estimators, we employ Monte Carlo simulation. We use the data taken from a crossover trial using a 4 × 4 Latin-square design for studying four treatments to illustrate the use of test procedures and interval estimators developed here. Copyright © 2016 John Wiley & Sons, Ltd.
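To make the reduction in treatment-receipt sequences concrete: four treatments can be ordered in 24 ways, whereas a 4 × 4 Latin square uses only 4 sequences in which every treatment appears once per sequence and once per period. The sketch below builds a cyclic Latin square. The abstract does not say which square the authors used, so the cyclic construction and the labels are assumptions made for illustration only.

```python
from itertools import permutations

def cyclic_latin_square(treatments):
    """Row r is the treatment list rotated left by r positions.

    Rows are treatment-receipt sequences; columns are periods.
    """
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

# Hypothetical labels: one control (C) and three experimental treatments.
labels = ["C", "E1", "E2", "E3"]
square = cyclic_latin_square(labels)

n_complete = len(list(permutations(labels)))  # 24 sequences in the complete set
n_latin = len(square)                         # 4 sequences in the Latin square
```

A cyclic square is the simplest construction, but every treatment is always preceded by the same treatment, so it is not balanced for carry-over effects; a trial concerned with carry-over would pick a different square.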
NASA Astrophysics Data System (ADS)
Corcoran, Martin M.; Phad, Ganesh E.; Bernat, Néstor Vázquez; Stahl-Hennig, Christiane; Sumida, Noriyuki; Persson, Mats A. A.; Martin, Marcel; Hedestam, Gunilla B. Karlsson
2016-12-01
Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, result in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross library comparisons and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of the Indian and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between Balb/c and C57BL6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool.
A Comprehensive Curation Shows the Dynamic Evolutionary Patterns of Prokaryotic CRISPRs.
Mai, Guoqin; Ge, Ruiquan; Sun, Guoquan; Meng, Qinghan; Zhou, Fengfeng
2016-01-01
Motivation. Clustered regularly interspaced short palindromic repeat (CRISPR) is a genetic element with active regulation roles for foreign invasive genes in the prokaryotic genomes and has been engineered to work with the CRISPR-associated sequence (Cas) gene Cas9 as one of the modern genome editing technologies. Due to inconsistent definitions, the existing CRISPR detection programs seem to have missed some weak CRISPR signals. Results. This study manually curates all the currently annotated CRISPR elements in the prokaryotic genomes and proposes 95 updates to the annotations. A new definition is proposed to cover all the CRISPRs. The comprehensive comparison of CRISPR numbers on the taxonomic levels of both domains and genus shows high variations for closely related species even in the same genus. The detailed investigation of how CRISPRs are evolutionarily manipulated in the 8 completely sequenced species in the genus Thermoanaerobacter demonstrates that transposons act as a frequent tool for splitting long CRISPRs into shorter ones along a long evolutionary history.
Corcoran, Martin M.; Phad, Ganesh E.; Bernat, Néstor Vázquez; Stahl-Hennig, Christiane; Sumida, Noriyuki; Persson, Mats A.A.; Martin, Marcel; Hedestam, Gunilla B. Karlsson
2016-01-01
Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, result in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross library comparisons and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of the Indian and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between Balb/c and C57BL6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool. PMID:27995928
Nguyen, Khuong B.; Shapiro-Ilan, David I.; Fuxa, James R.; Wood, Bruce W.; Bertolotti, Maria A.; Adams, Byron J.
2006-01-01
Two Steinernema isolates found in Louisiana and Mississippi were later identified as isolates of S. rarum. DNA sequences of ITS regions of the United States isolates are identical with sequences of Argentinean S. rarum strains Samiento and Noetinger and differ by two bases from the Arroyo Cabral isolate from Córdoba, Argentina. SEM observations revealed several new structures in the isolates from the US: female face views have a hexagonal-star perioral disc and eye-shaped lips; some females do not have cephalic papillae; lateral fields of infective juveniles are variable; there are two openings observed close to the posterior edge of the cloaca. Virulence of the US isolates to Anthonomus grandis, Diaprepes abbreviatus, Solenopsis invicta, Coptotermes formosanus, Agrotis ipsilon, Spodoptera frugiperda, and Trichoplusia ni and reproductive potential were evaluated in comparison with other heterorhabditid and steinernematid nematodes. Results such as particularly high virulence to S. frugiperda indicate that the biocontrol potential of the new S. rarum strains merits further study. PMID:19259427
Suárez-Castillo, Edna C; Medina-Ortíz, Wanda E; Roig-López, José L; García-Arrarás, José E
2004-06-09
We report the characterization of an ependymin-related gene (EpenHg) from a regenerating intestine cDNA library of the sea cucumber Holothuria glaberrima. This finding is remarkable because no ependymin sequence has ever been reported from invertebrates. Database comparisons of the conceptual translation of the EpenHg gene reveal 63% similarity (47% identity) with mammalian ependymin-related proteins (MERPs) and close relationship with the frog and piscine ependymins. We also report the partial sequences of ependymin representatives from another species of sea cucumber and from a sea urchin species. Conventional and real-time reverse transcriptase polymerase chain reaction (RT-PCRs) show that the gene is expressed in several echinoderm tissues, including esophagus, mesenteries, gonads, respiratory trees, hemal system, tentacles and body wall. Moreover, the ependymin product in the intestine is overexpressed during sea cucumber intestinal regeneration. The discovery of ependymins in echinoderms, a group well known for their regenerative capacities, can give us an insight on the evolution and roles of ependymin molecules.
Analyses of pig genomes provide insight into porcine demography and evolution
Groenen, Martien A. M.; Archibald, Alan L.; Uenishi, Hirohide; Tuggle, Christopher K.; Takeuchi, Yasuhiro; Rothschild, Max F.; Rogel-Gaillard, Claire; Park, Chankyu; Milan, Denis; Megens, Hendrik-Jan; Li, Shengting; Larkin, Denis M.; Kim, Heebal; Frantz, Laurent A. F.; Caccamo, Mario; Ahn, Hyeonju; Aken, Bronwen L.; Anselmo, Anna; Anthon, Christian; Auvil, Loretta; Badaoui, Bouabid; Beattie, Craig W.; Bendixen, Christian; Berman, Daniel; Blecha, Frank; Blomberg, Jonas; Bolund, Lars; Bosse, Mirte; Botti, Sara; Bujie, Zhan; Bystrom, Megan; Capitanu, Boris; Silva, Denise Carvalho; Chardon, Patrick; Chen, Celine; Cheng, Ryan; Choi, Sang-Haeng; Chow, William; Clark, Richard C.; Clee, Christopher; Crooijmans, Richard P. M. A.; Dawson, Harry D.; Dehais, Patrice; De Sapio, Fioravante; Dibbits, Bert; Drou, Nizar; Du, Zhi-Qiang; Eversole, Kellye; Fadista, João; Fairley, Susan; Faraut, Thomas; Faulkner, Geoffrey J.; Fowler, Katie E.; Fredholm, Merete; Fritz, Eric; Gilbert, James G. R.; Giuffra, Elisabetta; Gorodkin, Jan; Griffin, Darren K.; Harrow, Jennifer L.; Hayward, Alexander; Howe, Kerstin; Hu, Zhi-Liang; Humphray, Sean J.; Hunt, Toby; Hornshøj, Henrik; Jeon, Jin-Tae; Jern, Patric; Jones, Matthew; Jurka, Jerzy; Kanamori, Hiroyuki; Kapetanovic, Ronan; Kim, Jaebum; Kim, Jae-Hwan; Kim, Kyu-Won; Kim, Tae-Hun; Larson, Greger; Lee, Kyooyeol; Lee, Kyung-Tai; Leggett, Richard; Lewin, Harris A.; Li, Yingrui; Liu, Wansheng; Loveland, Jane E.; Lu, Yao; Lunney, Joan K.; Ma, Jian; Madsen, Ole; Mann, Katherine; Matthews, Lucy; McLaren, Stuart; Morozumi, Takeya; Murtaugh, Michael P.; Narayan, Jitendra; Nguyen, Dinh Truong; Ni, Peixiang; Oh, Song-Jung; Onteru, Suneel; Panitz, Frank; Park, Eung-Woo; Park, Hong-Seog; Pascal, Geraldine; Paudel, Yogesh; Perez-Enciso, Miguel; Ramirez-Gonzalez, Ricardo; Reecy, James M.; Zas, Sandra Rodriguez; Rohrer, Gary A.; Rund, Lauretta; Sang, Yongming; Schachtschneider, Kyle; Schraiber, Joshua G.; Schwartz, John; Scobie, Linda; Scott, Carol; Searle, 
Stephen; Servin, Bertrand; Southey, Bruce R.; Sperber, Goran; Stadler, Peter; Sweedler, Jonathan V.; Tafer, Hakim; Thomsen, Bo; Wali, Rashmi; Wang, Jian; Wang, Jun; White, Simon; Xu, Xun; Yerle, Martine; Zhang, Guojie; Zhang, Jianguo; Zhang, Jie; Zhao, Shuhong; Rogers, Jane; Churcher, Carol; Schook, Lawrence B.
2013-01-01
For 10,000 years pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe and Asia. Wild pigs emerged in South East Asia and subsequently spread across Eurasia. Our results reveal a deep phylogenetic split between European and Asian wild boars ~1 million years ago, and a selective sweep analysis indicates selection on genes involved in RNA processing and regulation. Genes associated with immune response and olfaction exhibit fast evolution. Pigs have the largest repertoire of functional olfactory receptor genes, reflecting the importance of smell in this scavenging animal. The pig genome sequence provides an important resource for further improvements of this important livestock species, and our identification of many putative disease-causing variants extends the potential of the pig as a biomedical model. PMID:23151582
Lu, Xin; Liang, Weili; Wang, Yunduan; Xu, Jialiang
2014-01-01
Vibrio fluvialis is an important food-borne pathogen that causes diarrheal illness and sometimes extraintestinal infections in humans. In this study, we sequenced the genome of a clinical V. fluvialis strain and determined its phylogenetic relationships with other Vibrio species by comparative genomic analysis. We found that the closest relationship was between V. fluvialis and V. furnissii, followed by those with V. cholerae and V. mimicus. Moreover, based on genome comparisons and gene complementation experiments, we revealed genetic mechanisms of the biochemical tests that differentiate V. fluvialis from closely related species. Importantly, we identified a variety of genes encoding potential virulence factors, including multiple hemolysins, transcriptional regulators, and environmental survival and adaptation apparatuses, and the type VI secretion system, which is indicative of complex regulatory pathways modulating pathogenesis in this organism. The availability of V. fluvialis genome sequences may promote our understanding of pathogenic mechanisms for this emerging pathogen. PMID:24441165
Chen, Yawen; Shen, Xuemei; Peng, Huasong; Hu, Hongbo; Wang, Wei; Zhang, Xuehong
2015-01-01
Pseudomonas chlororaphis HT66, a plant growth-promoting rhizobacterium that produces phenazine-1-carboxamide with high yield, was compared with three genome-sequenced P. chlororaphis strains, GP72, 30–84 and O6. The genome sizes of the four strains vary from 6.66 to 7.30 Mb. Comparisons of predicted coding sequences indicated 4833 conserved genes in 5869–6455 protein-encoding genes. Phylogenetic analysis showed that the four strains are closely related to each other. Its competitive colonization indicates that P. chlororaphis can adapt well to its environment. No virulence or virulence-related factor was found in P. chlororaphis. All four strains could synthesize antimicrobial metabolites including different phenazines and insecticidal protein FitD. Some genes related to the regulation of phenazine biosynthesis were detected among the four strains. It was shown that P. chlororaphis is a safe PGPR in agricultural application and could also be used to produce some phenazine antibiotics with high yield. PMID:26484173
Becságh, Péter; Szakács, Orsolya
2014-10-01
When detecting sequence alterations in a diagnostic workflow, it is sometimes important to design an algorithm that combines screening and direct tests. The use of direct tests, mainly sequencing, is normally limited. There is an increased need for effective screening tests that stay "closed tube" throughout the whole process and therefore decrease the risk of PCR product contamination. The aim of this study was to design such a closed-tube, detection-probe-based screening assay to detect different kinds of sequence alterations in exon 11 of the human c-kit gene. This region harbors a variety of possible deletions and single nucleotide changes. During assay setup, several probe chemistry formats were screened and tested. After some optimization steps the TaqMan probe format was selected.
ERIC Educational Resources Information Center
Noell, George H.; Gresham, Frank M.
2001-01-01
Describes design logic and potential uses of a variant of the multiple-baseline design. The multiple-baseline multiple-sequence (MBL-MS) consists of multiple-baseline designs that are interlaced with one another and include all possible sequences of treatments. The MBL-MS design appears to be primarily useful for comparison of treatments taking…
Liu, Nian; Huang, Yuan
2010-01-01
The complete 15,599-bp mitogenome of Acrida cinerea was determined and compared with that of the other 20 orthopterans. It displays characteristic gene content, genome organization, nucleotide composition, and codon usage found in other Caelifera mitogenomes. Comparison of 21 orthopteran sequences revealed that the tRNAs encoded by the H-strand appear more conserved than those by the L-strand. All tRNAs form the typical clover-leaf structure except trnS (agn), and most of the size variation among tRNAs stemmed from the length variation in the arm and loop of TΨC and the loop of DHU. The derived secondary structure models of the rrnS and rrnL from 21 orthopteran species closely resemble those from other insects on CRW except a considerably enlarged loop of helix 1399 of rrnS in Caelifera, which is potentially an autapomorphy of Caelifera. In the A+T-rich region, tandem repeats are not only conserved in the closely related mitogenome but also share some conserved motifs in the same subfamily. A stem-loop structure, 16 bp or longer, is likely to be involved in replication initiation in Caelifera and Grylloidea. A long T-stretch (>17 bp) with conserved stem-loop structure next to rrnS on the H-strand, bounded by a purine at either end, exists in the three species from Tettigoniidae. PMID:21197069
Emergence of Arctic-like Rabies Lineage in India
Turner, Geoff; Paul, Joel P. V.; Madhusudana, Shampur N.; Wandeler, Alexander I.
2007-01-01
A collection of 37 rabies-infected samples, 10 human saliva and 27 animal brain, were recovered during 2001–2004 from the cities of Bangalore and Hyderabad in southern India and from Kasauli, a mountainous region in Himachal Pradesh, northern India. Phylogenetic analysis of partial N gene nucleotide sequences of these 37 specimens and 1 archival specimen identified 2 groups, divided according to their geographic (north or south) origins. Comparison of selected Indian viruses with representative rabies viruses recovered worldwide showed a close association of all Indian isolates with the circumpolar Arctic rabies lineage distributed throughout northern latitudes of North America and Europe and other viruses recovered from several Asian countries. PMID:17370523
Genetic heterogeneity of hepatitis E virus in Darfur, Sudan, and neighboring Chad.
Nicand, Elisabeth; Armstrong, Gregory L; Enouf, Vincent; Guthmann, Jean Paul; Guerin, Jean-Philippe; Caron, Mélanie; Nizou, Jacques Yves; Andraghetti, Roberta
2005-12-01
The within-outbreak diversity of hepatitis E virus (HEV) was studied during the outbreak of hepatitis E that occurred in Sudan in 2004. Specimens were collected from internally displaced persons living in a Sudanese refugee camp and two camps implanted in Chad. A comparison of the sequences in the ORF2 region of 23 Sudanese isolates and five HEV samples from the two Chadian camps displayed a high similarity (>99.7%) to strains belonging to Genotype 1. But four isolates collected in one of the Chadian camps were close to Genotype 2. Circulation of divergent strains argues for possible multiple sources of infection. Copyright (c) 2005 Wiley-Liss, Inc.
Arai, Satoru; Gu, Se Hun; Baek, Luck Ju; Tabara, Kenji; Bennett, Shannon; Oh, Hong-Shik; Takada, Nobuhiro; Kang, Hae Ji; Tanaka-Taya, Keiko; Morikawa, Shigeru; Okabe, Nobuhiko; Yanagihara, Richard; Song, Jin-Won
2012-01-01
Spurred by the recent isolation of a novel hantavirus, named Imjin virus (MJNV), from the Ussuri white-toothed shrew (Crocidura lasiura), targeted trapping was conducted for the phylogenetically related Asian lesser white-toothed shrew (Crocidura shantungensis). Pair-wise alignment and comparison of the S, M and L segments of a newfound hantavirus, designated Jeju virus (JJUV), indicated remarkably low nucleotide and amino acid sequence similarity with MJNV. Phylogenetic analyses, using maximum likelihood and Bayesian methods, showed divergent ancestral lineages for JJUV and MJNV, despite the close phylogenetic relationship of their reservoir soricid hosts. Also, no evidence of host switching was apparent in tanglegrams, generated by TreeMap 2.0β. PMID:22230701
Register, Karen B; Ivanov, Yury V; Harvill, Eric T; Davison, Nick; Foster, Geoffrey
2015-03-01
During a succession of phocine morbillivirus outbreaks spanning the past 25 years, Bordetella bronchiseptica was identified as a frequent secondary invader and cause of death. The goal of this study was to evaluate genetic diversity and the molecular basis for host specificity among seal isolates from these outbreaks. MLST and PvuII ribotyping of 54 isolates from Scottish, English or Danish coasts of the Atlantic or North Sea revealed a single, host-restricted genotype. A single, novel genotype, unique from that of the Atlantic and North Sea isolates, was found in isolates from an outbreak in the Caspian Sea. Phylogenetic analysis based either on MLST sequence, ribotype patterns or genome-wide SNPs consistently placed both seal-specific genotypes within the same major clade but indicates a distinct evolutionary history for each. An additional isolate from the intestinal tract of a seal on the south-west coast of England has a genotype otherwise found in rabbit, guinea pig and pig isolates. To investigate the molecular basis for host specificity, DNA and predicted protein sequences of virulence genes that mediate host interactions were used in comparisons between a North Sea isolate, a Caspian Sea isolate and each of their closest relatives as inferred from genome-wide SNP analysis. Despite their phylogenetic divergence, fewer nucleotide and amino acid substitutions were found in comparisons of the two seal isolates than in comparisons with closely related strains. These data indicate isolates of B. bronchiseptica associated with respiratory disease in seals comprise unique, host-adapted and highly clonal populations. © 2015 The Authors.
UFO: a web server for ultra-fast functional profiling of whole genome protein sequences.
Meinicke, Peter
2009-09-02
Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.
A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment
Freschi, Valerio; Bogliolo, Alessandro
2012-01-01
In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
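The idea of iteratively collapsing repeats of increasing length can be sketched as follows. This is a simplified illustration and not the authors' compression scheme: it assumes exact (not approximate) tandem copies, processes periods from 1 up to a user-chosen max_period, and keeps a single copy of each repeated unit. The function name and parameter are hypothetical.

```python
def collapse_tandem_repeats(seq, max_period):
    """Lossy collapse of exact tandem duplications, shortest period first.

    For each period p, the sequence is rebuilt left to right; whenever the
    last p symbols written equal the p symbols before them, the newer copy
    is dropped, so runs like ACACAC shrink to AC.
    """
    for period in range(1, max_period + 1):
        out = []
        for symbol in seq:
            out.append(symbol)
            if len(out) >= 2 * period and out[-period:] == out[-2 * period:-period]:
                del out[-period:]  # discard the duplicated unit
        seq = "".join(out)
    return seq

# Two sequences that differ only in repeat copy number collapse to the same string.
a = collapse_tandem_repeats("GGACACACTT", max_period=3)
b = collapse_tandem_repeats("GACACTTT", max_period=3)
```

After collapsing, an ordinary alignment or edit-distance routine can be run on the shortened strings, which is the use the abstract describes for its compressed representations.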
Wu, Yue-Hong; Yu, Pei-Song; Zhou, Ya-Dong; Xu, Lin; Wang, Chun-Sheng; Wu, Min; Oren, Aharon; Xu, Xue-Wei
2013-09-01
A Gram-stain-negative, rod-shaped bacterium with appendages, designated Ar-22(T), was isolated from a seawater sample collected from the western part of Prydz Bay, near Cape Darnley, Antarctica. Strain Ar-22(T) grew optimally at 35 °C, at pH 7.5 and in the presence of 1-3% (w/v) NaCl. The isolate was positive for casein, gelatin and Tween 20 decomposition and negative for H2S production and indole formation. Chemotaxonomic analysis showed that MK-6 was the major isoprenoid quinone and phosphatidylethanolamine was the major polar lipid. The major fatty acids were iso-C(17:0) 3-OH, iso-C(15:1) G, iso-C(15:0) and C(16:1)ω7c/iso-C(15:0) 2OH. The genomic DNA G+C content was 44.8 mol%. Comparative 16S rRNA gene sequence analysis revealed that strain Ar-22(T) is closely related to members of the genus Muricauda, sharing 94.2-97.3% sequence similarity with the type strains of species of the genus Muricauda and being most closely related to Muricauda aquimarina. Phylogenetic analysis based on the 16S rRNA gene sequence comparison confirmed that strain Ar-22(T) formed a deep lineage with Muricauda flavescens. Sequence similarity between strain Ar-22(T) and Muricauda ruestringensis DSM 13258(T), the type species of the genus Muricauda, was 96.9%. Strain Ar-22(T) exhibited mean DNA-DNA relatedness values of 40.1%, 49.4% and 25.7% to M. aquimarina JCM 11811(T), M. flavescens JCM 11812(T) and Muricauda lutimaris KCTC 22173(T), respectively. On the basis of phenotypic and genotypic data, strain Ar-22(T) represents a novel species of the genus Muricauda, for which the name Muricauda antarctica sp. nov. (type strain Ar-22(T) =CGMCC 1.12174(T) = JCM 18450(T)) is proposed.
New powerful statistics for alignment-free sequence comparison under a pattern transfer model.
Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S; Sun, Fengzhu
2011-09-07
Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D*2 and D(s)2 showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D*2 and D(s)2 by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. Copyright © 2011 Elsevier Ltd. All rights reserved.
New Powerful Statistics for Alignment-free Sequence Comparison Under a Pattern Transfer Model
Liu, Xuemei; Wan, Lin; Li, Jing; Reinert, Gesine; Waterman, Michael S.; Sun, Fengzhu
2011-01-01
Alignment-free sequence comparison is widely used for comparing gene regulatory regions and for identifying horizontally transferred genes. Recent studies on the power of a widely used alignment-free comparison statistic D2 and its variants D2∗ and D2s showed that their power approximates a limit smaller than 1 as the sequence length tends to infinity under a pattern transfer model. We develop new alignment-free statistics based on D2, D2∗ and D2s by comparing local sequence pairs and then summing over all the local sequence pairs of certain length. We show that the new statistics are much more powerful than the corresponding statistics and the power tends to 1 as the sequence length tends to infinity under the pattern transfer model. PMID:21723298
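For readers unfamiliar with the D2 statistic mentioned in the two records above, a minimal sketch follows. It assumes the usual definition of D2 as the sum, over all k-tuples w, of the product of the counts of w in the two sequences. The function names are illustrative, and the variants (D2∗, D2s) and the new local-pair statistics developed in the paper are not reproduced here.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-tuple (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a: str, seq_b: str, k: int) -> int:
    """D2 = sum over words w of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b, so they add nothing.
    return sum(n * counts_b[w] for w, n in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGT", "ACGT", 2))
```

Because D2 is an uncentered count of shared words, it grows with sequence length and is sensitive to background composition, which is part of the motivation for the normalized variants discussed in the abstracts.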
Multiple alignment-free sequence comparison
Ren, Jie; Song, Kai; Sun, Fengzhu; Deng, Minghua; Reinert, Gesine
2013-01-01
Motivation: Recently, a range of new statistics have become available for the alignment-free comparison of two sequences based on k-tuple word content. Here, we extend these statistics to the simultaneous comparison of more than two sequences. Our suite of statistics contains, first, two extensions of statistics for pairwise comparison of the joint k-tuple content of all the sequences, and second, three averages of sums of pairwise comparison statistics. The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences. Results: Our investigation uses both simulated data and cis-regulatory module data where the task is to identify cis-regulatory modules with similar transcription factor binding sites. We find that although for real data, all of our statistics show a similar performance, on simulated data the Shepp-type statistics are in some instances outperformed by star-type statistics. The multiple alignment-free statistics are more sensitive to contamination in the data than the pairwise average statistics. Availability: Our implementation of the five statistics is available as an R package named ‘multiAlignFree’ at http://www-rcf.usc.edu/∼fsun/Programs/multiAlignFree/multiAlignFreemain.html. Contact: reinert@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23990418
Silva, C; Garcia-Mas, J; Sánchez, A M; Arús, P; Oliveira, M M
2005-03-01
Blooming time is one of the most important agronomic traits in almond. Biochemical and molecular events underlying flowering regulation must be understood before methods to stimulate late flowering can be developed. Attempts to elucidate the genetic control of this process have led to the identification of a major gene (Lb) and quantitative trait loci (QTLs) linked to observed phenotypic differences, but although this gene and these QTLs have been placed on the Prunus reference genetic map, their sequences and specific functions remain unknown. The aim of our investigation was to associate these loci with known genes using a candidate gene approach. Two almond cDNAs and eight Prunus expressed sequence tags were selected as candidate genes (CGs) since their sequences were highly identical to those of flowering regulatory genes characterized in other species. The CGs were amplified from both parental lines of the mapping population using specific primers. Sequence comparison revealed DNA polymorphisms between the parental lines, mainly of the single nucleotide type. Polymorphisms were used to develop co-dominant cleaved amplified polymorphic sequence markers or length polymorphisms based on insertion/deletion events for mapping the candidate genes on the Prunus reference map. Ten candidate genes were assigned to six linkage groups in the Prunus genome. The positions of two of these were compatible with the regions where two QTLs for blooming time were detected. One additional candidate was localized close to the position of the Evergrowing gene, which determines a non-deciduous behaviour in peach.
Lin, C S; Sun, Y L; Liu, C Y; Yang, P C; Chang, L C; Cheng, I C; Mao, S J; Huang, M C
1999-08-05
The complete nucleotide sequence of the pig (Sus scrofa) mitochondrial genome, containing 16,613 bp, is presented in this report. The genome does not have a fixed length because of the presence of a variable number of tandem repeats of 5'-CGTGCGTACA in the displacement loop (D-loop). Genes responsible for 12S and 16S rRNAs, 22 tRNAs, and 13 protein-coding regions are found. The genome carries very few intergenic nucleotides with several instances of overlap between protein-coding or tRNA genes, except in the D-loop region. For evaluating the possible evolutionary relationships between Artiodactyla and Cetacea, the nucleotide substitutions and amino acid sequences of 13 protein-coding genes were aligned by pairwise comparisons of the pig, cow, and fin whale. By comparing these sequences, we suggest that there is a closer relationship between the pig and cow than between either of these species and the fin whale. In addition, the accumulation of transversions and gaps in pig 12S and 16S rRNA genes was compared with that in other eutherian species, including cow, fin whale, human, horse, and harbor seal. The results also reveal a close phylogenetic relationship between pig and cow, as compared to fin whale and others. Thus, according to the sequence differences of mitochondrial rRNA genes in eutherian species, the evolutionary separation of pig and cow occurred about 53-60 million years ago.
A Modified LS+AR Model to Improve the Accuracy of the Short-term Polar Motion Prediction
NASA Astrophysics Data System (ADS)
Wang, Z. W.; Wang, Q. X.; Ding, Y. Q.; Zhang, J. J.; Liu, S. S.
2017-03-01
The LS (Least Squares)+AR (AutoRegressive) model has two problems in polar motion forecasting: the residuals within the LS fitting interval are reasonable but the residuals of the LS extrapolation are poor, and the LS fitting residual sequence is non-linear. It is therefore unsuitable to build an AR model for the residual sequence to be forecast from the residual sequence before the forecast epoch. In this paper, we address these two problems in two steps. First, constraints are added to the two endpoints of the LS fitting data to fix them on the LS fitting curve, so that the fitted values next to the two endpoints are very close to the observed values. Second, we select the interpolation residual sequence of an inward LS fitting curve, which has a variation trend similar to that of the LS extrapolation residual sequence, as the modeling object of the AR model for the residual forecast. Calculation examples show that this solution can effectively improve the short-term polar motion prediction accuracy of the LS+AR model. In addition, comparisons with the RLS (Robustified Least Squares)+AR, RLS+ARIMA (AutoRegressive Integrated Moving Average), and LS+ANN (Artificial Neural Network) forecast models confirm the feasibility and effectiveness of the solution for polar motion forecasting. The results, especially for forecasts of 1-10 days, show that the forecast accuracy of the proposed model can reach the world level.
Eduardoff, M; Gross, T E; Santos, C; de la Puente, M; Ballard, D; Strobl, C; Børsting, C; Morling, N; Fusco, L; Hussing, C; Egyed, B; Souto, L; Uacyisrael, J; Syndercombe Court, D; Carracedo, Á; Lareu, M V; Schneider, P M; Parson, W; Phillips, C
2016-07-01
The EUROFORGEN Global ancestry-informative SNP (AIM-SNPs) panel is a forensic multiplex of 128 markers designed to differentiate an individual's ancestry from amongst the five continental population groups of Africa, Europe, East Asia, Native America, and Oceania. A custom multiplex of AmpliSeq™ PCR primers was designed for the Global AIM-SNPs to perform massively parallel sequencing using the Ion PGM™ system. This study assessed individual SNP genotyping precision using the Ion PGM™, the forensic sensitivity of the multiplex using dilution series, degraded DNA plus simple mixtures, and the ancestry differentiation power of the final panel design, which required substitution of three original ancestry-informative SNPs with alternatives. Fourteen populations that had not been previously analyzed were genotyped using the custom multiplex and these studies allowed assessment of genotyping performance by comparison of data across five laboratories. Results indicate a low level of genotyping error can still occur from sequence misalignment caused by homopolymeric tracts close to the target SNP, despite careful scrutiny of candidate SNPs at the design stage. Such sequence misalignment required the exclusion of component SNP rs2080161 from the Global AIM-SNPs panel. However, the overall genotyping precision and sensitivity of this custom multiplex indicates the Ion PGM™ assay for the Global AIM-SNPs is highly suitable for forensic ancestry analysis with massively parallel sequencing. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
The complete chloroplast genome sequence of Dodonaea viscosa: comparative and phylogenetic analyses.
Saina, Josphat K; Gichira, Andrew W; Li, Zhi-Zhong; Hu, Guang-Wan; Wang, Qing-Feng; Liao, Kuo
2018-02-01
The plant chloroplast (cp) genome is a highly conserved structure which is beneficial for evolutionary and systematic research. Currently, numerous complete cp genome sequences have been reported due to high throughput sequencing technology. However, no complete chloroplast genome of the genus Dodonaea has been reported before. To better understand the molecular basis of the Dodonaea viscosa chloroplast, we used Illumina sequencing technology to sequence its complete genome. The whole length of the cp genome is 159,375 base pairs (bp), with a pair of inverted repeats (IRs) of 27,099 bp separated by a large single copy (LSC) region of 87,204 bp and a small single copy (SSC) region of 17,972 bp. The annotation analysis revealed a total of 115 unique genes, of which 81 were protein-coding, 30 tRNA, and four ribosomal RNA genes. Comparative genome analysis with other closely related Sapindaceae members showed conserved gene order in the inverted and single copy regions. Phylogenetic analysis clustered D. viscosa with other species of Sapindaceae with strong bootstrap support. Finally, a total of 249 SSRs were detected. Moreover, a comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates in D. viscosa showed very low values. The availability of the cp genome reported here provides a valuable genetic resource for comprehensive further studies in genetic variation, taxonomy and phylogenetic evolution of the Sapindaceae family. In addition, the SSR markers detected will be used in further phylogeographic and population structure studies of the species in this genus.
The TGA codons are present in the open reading frame of selenoprotein P cDNA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hill, K.E.; Lloyd, R.S.; Read, R.
1991-03-11
The TGA codon in DNA has been shown to direct incorporation of selenocysteine into protein. Several proteins from bacteria and animals contain selenocysteine in their primary structures. Each of the cDNA clones of these selenoproteins contains one TGA codon in the open reading frame which corresponds to the selenocysteine in the protein. A cDNA clone for selenoprotein P (SeP), obtained from a γZAP rat liver library, was sequenced by the dideoxy termination method. The correct reading frame was determined by comparison of the deduced amino acid sequence with the amino acid sequence of several peptides from SeP. Using SeP labelled with ⁷⁵Se in vivo, the selenocysteine content of the peptides was verified by the collection of carboxymethylated ⁷⁷Se-selenocysteine as it eluted from the amino acid analyzer and determination of the radioactivity contained in the collected samples. Ten TGA codons are present in the open reading frame of the cDNA. Peptide fragmentation studies and the deduced sequence indicate that selenium-rich regions are located close to the carboxy terminus. Nine of the 10 selenocysteines are located in the terminal 26% of the sequence with four in the terminal 15 amino acids. The deduced sequence codes for a protein of 385 amino acids. Cleavage of the signal peptide gives the mature protein with 366 amino acids and a calculated mol wt of 41,052 Da. Searches of PIR and SWISSPROT protein databases revealed no similarity with glutathione peroxidase or other selenoproteins.
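Counting TGA codons within a known reading frame, as opposed to TGA triplets at any offset, is simple to sketch. The function name `inframe_codon_positions` and the toy sequences below are hypothetical; they are not the SeP cDNA, and the reading frame is assumed to start at the first base of the input.

```python
def inframe_codon_positions(orf: str, codon: str = "TGA") -> list[int]:
    """Return 1-based codon positions where `codon` occurs in frame.

    Assumes the reading frame starts at index 0 of `orf`; triplets that
    match `codon` at other offsets are out of frame and are ignored.
    """
    orf = orf.upper()
    return [
        i // 3 + 1
        for i in range(0, len(orf) - 2, 3)
        if orf[i:i + 3] == codon
    ]


if __name__ == "__main__":
    toy_orf = "ATGTGAAAATGATAA"  # made-up sequence: ATG TGA AAA TGA TAA
    print(inframe_codon_positions(toy_orf))
```

Dividing each reported position by the total number of codons would give the relative location along the sequence, which is the kind of figure behind the statement that most selenocysteines lie near the carboxy terminus.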
Phenotypic and phylogenetic characterization of ruminal tannin-tolerant bacteria.
Nelson, K E; Thonney, M L; Woolston, T K; Zinder, S H; Pell, A N
1998-10-01
The 16S rRNA sequences and selected phenotypic characteristics were determined for six recently isolated bacteria that can tolerate high levels of hydrolyzable and condensed tannins. Bacteria were isolated from the ruminal contents of animals in different geographic locations, including Sardinian sheep (Ovis aries), Honduran and Colombian goats (Capra hircus), white-tail deer (Odocoileus virginianus) from upstate New York, and Rocky Mountain elk (Cervus elaphus nelsoni) from Oregon. Nearly complete sequences of the small-subunit rRNA genes, which were obtained by PCR amplification, cloning, and sequencing, were used for phylogenetic characterization. Comparisons of the 16S rRNA of the six isolates showed that four of the isolates were members of the genus Streptococcus and were most closely related to ruminal strains of Streptococcus bovis and the recently described organism Streptococcus gallolyticus. One of the other isolates, a gram-positive rod, clustered with the clostridia in the low-G+C-content group of gram-positive bacteria. The sixth isolate, a gram-negative rod, was a member of the family Enterobacteriaceae in the gamma subdivision of the class Proteobacteria. None of the 16S rRNA sequences of the tannin-tolerant bacteria examined was identical to the sequence of any previously described microorganism or to the sequence of any of the other organisms examined in this study. Three phylogenetically distinct groups of ruminal bacteria were isolated from four species of ruminants in Europe, North America, and South America. The presence of tannin-tolerant bacteria is not restricted by climate, geography, or host animal, although attempts to isolate tannin-tolerant bacteria from cows on low-tannin diets failed.
Khan, Abdul Latif; Khan, Muhammad Aaqil; Shahzad, Raheem; Lubna; Kang, Sang Mo; Al-Harrasi, Ahmed; Al-Rawahi, Ahmed; Lee, In-Jung
2018-01-01
Pinaceae, the largest family of conifers, has a diversified organization of chloroplast (cp) genomes with two typical highly reduced inverted repeats (IRs). In the current study, we determined the complete sequence of the cp genome of an economically and ecologically important conifer tree, the loblolly pine (Pinus taeda L.), using Illumina paired-end sequencing and compared the sequence with those of other pine species. The results revealed a genome size of 121,531 base pairs (bp) containing a pair of 830-bp IR regions, distinguished by a small single copy (42,258 bp) and large single copy (77,614 bp) region. The chloroplast genome of P. taeda encodes 120 genes, comprising 81 protein-coding genes, four ribosomal RNA genes, and 35 tRNA genes, with 151 randomly distributed microsatellites. Approximately 6 palindromic, 34 forward, and 22 tandem repeats were found in the P. taeda cp genome. Whole cp genome comparison with those of other Pinus species exhibited an overall high degree of sequence similarity, with some divergence in intergenic spacers. Higher and lower numbers of indels and single-nucleotide polymorphism substitutions were observed relative to P. contorta and P. monophylla, respectively. Phylogenomic analyses based on the complete genome sequence revealed that 60 shared genes generated trees with the same topologies, and P. taeda was closely related to P. contorta in the subgenus Pinus. Thus, the complete P. taeda genome provided valuable resources for population and evolutionary studies of gymnosperms and can be used to identify related species. PMID:29596414
Phenotypic and Phylogenetic Characterization of Ruminal Tannin-Tolerant Bacteria
Nelson, Karen E.; Thonney, Michael L.; Woolston, Tina K.; Zinder, Stephen H.; Pell, Alice N.
1998-01-01
The 16S rRNA sequences and selected phenotypic characteristics were determined for six recently isolated bacteria that can tolerate high levels of hydrolyzable and condensed tannins. Bacteria were isolated from the ruminal contents of animals in different geographic locations, including Sardinian sheep (Ovis aries), Honduran and Colombian goats (Capra hircus), white-tail deer (Odocoileus virginianus) from upstate New York, and Rocky Mountain elk (Cervus elaphus nelsoni) from Oregon. Nearly complete sequences of the small-subunit rRNA genes, which were obtained by PCR amplification, cloning, and sequencing, were used for phylogenetic characterization. Comparisons of the 16S rRNA of the six isolates showed that four of the isolates were members of the genus Streptococcus and were most closely related to ruminal strains of Streptococcus bovis and the recently described organism Streptococcus gallolyticus. One of the other isolates, a gram-positive rod, clustered with the clostridia in the low-G+C-content group of gram-positive bacteria. The sixth isolate, a gram-negative rod, was a member of the family Enterobacteriaceae in the gamma subdivision of the class Proteobacteria. None of the 16S rRNA sequences of the tannin-tolerant bacteria examined was identical to the sequence of any previously described microorganism or to the sequence of any of the other organisms examined in this study. Three phylogenetically distinct groups of ruminal bacteria were isolated from four species of ruminants in Europe, North America, and South America. The presence of tannin-tolerant bacteria is not restricted by climate, geography, or host animal, although attempts to isolate tannin-tolerant bacteria from cows on low-tannin diets failed. PMID:9758806
The need for high-quality whole-genome sequence databases in microbial forensics.
Sjödin, Andreas; Broman, Tina; Melefors, Öjar; Andersson, Gunnar; Rasmusson, Birgitta; Knutsson, Rickard; Forsman, Mats
2013-09-01
Microbial forensics is an important part of a strengthened capability to respond to biocrime and bioterrorism incidents to aid in the complex task of distinguishing between natural outbreaks and deliberate acts. The goal of a microbial forensic investigation is to identify and criminally prosecute those responsible for a biological attack, and it involves a detailed analysis of the weapon--that is, the pathogen. The recent development of next-generation sequencing (NGS) technologies has greatly increased the resolution that can be achieved in microbial forensic analyses. It is now possible to identify, quickly and in an unbiased manner, previously undetectable genome differences between closely related isolates. This development is particularly relevant for the most deadly bacterial diseases that are caused by bacterial lineages with extremely low levels of genetic diversity. Whole-genome analysis of pathogens is envisaged to be increasingly essential for this purpose. In a microbial forensic context, whole-genome sequence analysis is the ultimate method for strain comparisons as it is informative during identification, characterization, and attribution--all 3 major stages of the investigation--and at all levels of microbial strain identity resolution (ie, it resolves the full spectrum from family to isolate). Given these capabilities, one bottleneck in microbial forensics investigations is the availability of high-quality reference databases of bacterial whole-genome sequences. To be of high quality, databases need to be curated and accurate in terms of sequences, metadata, and genetic diversity coverage. The development of whole-genome sequence databases will be instrumental in successfully tracing pathogens in the future.
Identification of Escherichia coli and Shigella Species from Whole-Genome Sequences.
Chattaway, Marie A; Schaefer, Ulf; Tewolde, Rediat; Dallman, Timothy J; Jenkins, Claire
2017-02-01
Escherichia coli and Shigella species are closely related and genetically constitute the same species. Differentiating between these two pathogens and accurately identifying the four species of Shigella are therefore challenging. The organism-specific bioinformatics whole-genome sequencing (WGS) typing pipelines at Public Health England are dependent on the initial identification of the bacterial species by use of a kmer-based approach. Of the 1,982 Escherichia coli and Shigella sp. isolates analyzed in this study, 1,957 (98.4%) had concordant results by both traditional biochemistry and serology (TB&S) and the kmer identification (ID) derived from the WGS data. Of the 25 mismatches identified, 10 were enteroinvasive E. coli isolates that were misidentified as Shigella flexneri or S. boydii by the kmer ID, and 8 were S. flexneri isolates misidentified by TB&S as S. boydii due to nonfunctional S. flexneri O antigen biosynthesis genes. Analysis of the population structure based on multilocus sequence typing (MLST) data derived from the WGS data showed that the remaining discrepant results belonged to clonal complex 288 (CC288), comprising both S. boydii and S. dysenteriae strains. Mismatches between the TB&S and kmer ID results were explained by the close phylogenetic relationship between the two species and were resolved with reference to the MLST data. Shigella can be differentiated from E. coli and accurately identified to the species level by use of kmer comparisons and MLST. Analysis of the WGS data provided explanations for the discordant results between TB&S and WGS data, revealed the true phylogenetic relationships between different species of Shigella, and identified emerging pathoadapted lineages. © Crown copyright 2017.
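A kmer-based identification step can be pictured as comparing the k-mer content of a query against a set of labelled references and reporting the closest one. The sketch below uses Jaccard similarity between k-mer sets. It is an illustrative assumption and does not describe the Public Health England pipeline; the reference names and sequences are made up.

```python
def kmer_set(seq: str, k: int) -> set[str]:
    """Return the set of distinct overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_reference(query: str, references: dict[str, str], k: int):
    """Score the query against each reference; return (best name, all scores)."""
    q = kmer_set(query, k)
    scores = {name: jaccard(q, kmer_set(ref, k)) for name, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    refs = {"ref_a": "ACGTACGTAC", "ref_b": "TTTTGGGGCC"}  # hypothetical references
    print(best_reference("ACGTACG", refs, k=3))
```

With closely related organisms the top two scores can be nearly equal, which is consistent with the abstract's observation that some calls had to be resolved with MLST data.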
Bałazy, Stanisław; Wrzosek, Marta; Sosnowska, Danuta; Tkaczuk, Cezary; Muszewska, Anna
2008-02-01
Laboratory assays have been carried out to artificially infect insect larvae of the birch bark-beetle (Scolytus ratzeburgi Jans.; Coleoptera, Scolytidae) and codling moth (Cydia pomonella L.; Lepidoptera, Tortricidae) as well as the potato cyst nematode Globodera rostochiensis Wollenweber, sugar beet nematode Heterodera schachtii Schmidt and root-knot nematode Meloidogyne hapla Chif (Nematoda, Heteroderidae), by the phialoconidia of some fungal species of the genus Hirsutella. From among four species tested on insects only H. nodulosa Petch infected about 20% of S. ratzeburgi larvae, whereas H. kirchneri (Rostrup) Minter, Brady et Hall, H. minnesotensis Chen, Liu et Chen, and H. rostrata Bałazy et Wiśniewski did not affect insect larvae. Only single eggs of the root-knot nematode were infected by H. minnesotensis in the laboratory trials, whereas its larvae remained unaffected. No infection cases of the potato cyst nematode (G. rostochiensis) and sugar beet nematode eggs were obtained. Comparisons of DNA-ITS-region sequences of the investigated strains with GenBank data showed no differences between H. minnesotensis isolates from the nematodes Heterodera glycines Ichinohe and from tarsonemid mites (authors' isolate). A fragment of ITS 2 with the sequence characteristic only for H. minnesotensis was selected. Two cluster analyses indicated close similarity of this species to H. thompsonii as sister clades, but the latter appeared more heterogeneous. The insect and mite pathogenic species H. nodulosa localizes close to the specialized aphid pathogen H. aphidis, whereas the phytophagous mite pathogens H. kirchneri and H. gregis form a separate sister clade. Hirsutella rostrata does not show remarkable relations to the aforementioned groups. Considering the morphology, biology and DNA sequences of the investigated Hirsutella species together makes their identification more precise and facilitates the establishment of their systematic positions.
Facey, Paul D.; Méric, Guillaume; Hitchings, Matthew D.; Pachebat, Justin A.; Hegarty, Matt J.; Chen, Xiaorui; Morgan, Laura V.A.; Hoeppner, James E.; Whitten, Miranda M.A.; Kirk, William D.J.; Dyson, Paul J.; Sheppard, Sam K.; Sol, Ricardo Del
2015-01-01
Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. PMID:26185096
Facey, Paul D; Méric, Guillaume; Hitchings, Matthew D; Pachebat, Justin A; Hegarty, Matt J; Chen, Xiaorui; Morgan, Laura V A; Hoeppner, James E; Whitten, Miranda M A; Kirk, William D J; Dyson, Paul J; Sheppard, Sam K; Del Sol, Ricardo
2015-07-15
Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Wang, Guiqin; Yin, Renfu; Zhou, Paul; Ding, Zhuang
2017-01-01
Hemagglutinin (HA) head has long been considered able to elicit only a narrow, strain-specific antibody response because it undergoes rapid antigenic drift. However, we previously showed that a heterologous prime-boost strategy, in which mice were primed twice with DNA encoding HA and boosted once with virus-like particles (VLP) from an H5N1 strain A/Thailand/1(KAN)-1/2004 (noted as TH DDV), induced an anti-head broad cross-H5 neutralizing antibody response. To explain why TH DDV immunization could generate such breadth, we systematically compared the neutralization breadth and potency between TH DDV sera and immune sera elicited by TH DDD (three DNA immunizations), TH VVV (three VLP immunizations), TH DV (one DNA prime plus one VLP boost) and TK DDV (plasmid DNA and VLP derived from another H5N1 strain, A/Turkey/65596/2006). Then we determined the antigenic sites (AS) on TH HA head and the key residues of the main antigenic site. By comparing the different regimens, we found that TH DDV immunization generates broad neutralizing antibodies because it combines immunization with a sequence close to the consensus sequence with two DNA primes plus one VLP boost. Antigenic analysis showed that TH DDV, TH DV, TH DDD and TH VVV sera recognize the common antigenic site AS1. Antibodies directed to AS1 contribute the largest proportion of the neutralizing activity of these immune sera. Residues 188 and 193 in AS1 are the key residues responsible for the neutralization breadth of the immune sera. Interestingly, residues 188 and 193 are located in classical antigenic sites but are relatively conserved among the 16 tested strains and 1,663 HA sequences from the NCBI database. Thus, our results strongly indicate that it is feasible to develop broad cross-H5 influenza vaccines against HA head. PMID:28542275
Utro, Filippo; Di Benedetto, Valeria; Corona, Davide F V; Giancarlo, Raffaele
2016-03-15
Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions about the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. We help close this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al. Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter 'encoding'. Supplementary data are available at Bioinformatics online. Contact: futro@us.ibm.com. © The Author 2015. Published by Oxford University Press. All rights reserved.
Muwonge, Apollo; Nanyunja, Miriam; Bwogi, Josephine; Lowe, Luis; Liffick, Stephanie L.; Bellini, William J.; Sylvester, Sempala
2005-01-01
We report the first genetic characterization of wildtype measles viruses from Uganda. Thirty-six virus isolates from outbreaks in 6 districts were analyzed from 2000 to 2002. Analyses of sequences of the nucleoprotein (N) and hemagglutinin (H) genes showed that the Ugandan isolates were all closely related, and phylogenetic analysis indicated that these viruses were members of a unique group within clade D. Sequences of the Ugandan viruses were not closely related to any of the World Health Organization reference sequences representing the 22 currently recognized genotypes. The minimum nucleotide divergence between the Ugandan viruses and the most closely related reference strain, genotype D2, was 3.1% for the N gene and 2.6% for the H gene. Therefore, Ugandan viruses should be considered a new, proposed genotype (d10). This new sequence information will expand the utility of molecular epidemiologic techniques for describing measles transmission patterns in eastern Africa. PMID:16318690
Enterobacter muelleri sp. nov., isolated from the rhizosphere of Zea mays.
Kämpfer, Peter; McInroy, John A; Glaeser, Stefanie P
2015-11-01
A beige-pigmented, oxidase-negative bacterial strain (JM-458T), isolated from a rhizosphere sample, was studied using a polyphasic taxonomic approach. Cells of the isolate were rod-shaped and stained Gram-negative. A comparison of the 16S rRNA gene sequence of strain JM-458T with sequences of the type strains of closely related species of the genus Enterobacter showed that it shared highest sequence similarity with Enterobacter mori (98.7 %), Enterobacter hormaechei (98.3 %), Enterobacter cloacae subsp. dissolvens, Enterobacter ludwigii and Enterobacter asburiae (all 98.2 %). 16S rRNA gene sequence similarities to all other Enterobacter species were below 98 %. Multilocus sequence analysis based on concatenated partial rpoB, gyrB, infB and atpD gene sequences showed a clear distinction of strain JM-458T from its closest related type strains. The fatty acid profile of the strain consisted of C16 : 0, C17 : 0 cyclo, iso-C15 : 0 2-OH/C16 : 1ω7c and C18 : 1ω7c as major components. DNA-DNA hybridizations between strain JM-458T and the type strains of E. mori, E. hormaechei and E. ludwigii resulted in relatedness values of 29 % (reciprocal 25 %), 24 % (reciprocal 43 %) and 16 % (reciprocal 17 %), respectively. DNA-DNA hybridization results together with multilocus sequence analysis results and differential biochemical and chemotaxonomic properties showed that strain JM-458T represents a novel species of the genus Enterobacter, for which the name Enterobacter muelleri sp. nov. is proposed. The type strain is JM-458T ( = DSM 29346T = CIP 110826T = LMG 28480T = CCM 8546T).
GALAVANI, Hossein; GHOLIZADEH, Saber; HAZRATI TAPPEH, Khosrow
2016-01-01
Background: Fascioliasis, caused by Fasciola hepatica and F. gigantica, is of medical and economic importance worldwide. Compared with traditional methods, molecular approaches to the identification and characterization of Fasciola spp. are precise and reliable. The aims of the current study were the molecular characterization of Fasciola spp. in West Azerbaijan Province, Iran, and their comparative analysis against GenBank sequences. Methods: A total of 580 isolates were collected in 2014 from different hosts in five cities of West Azerbaijan Province, from 90 slaughtered animals: cattle (n=50) and sheep (n=40). After morphological identification and DNA extraction, specifically designed primers were used to amplify the ITS1, 5.8S and ITS2 regions, and 50 randomly selected samples were sequenced. Results: Using morphometric characters, 99.14% and 0.86% of isolates were identified as F. hepatica and F. gigantica, respectively. PCR amplification of a 1081 bp fragment and sequencing showed 100% similarity with F. hepatica in the ITS1 (428 bp), 5.8S (158 bp), and ITS2 (366 bp) regions. Sequence comparison between the current study sequences and GenBank data showed 98% identity with 11 nucleotide mismatches. However, in the phylogenetic tree, F. hepatica sequences from West Azerbaijan Province, Iran, were closely related to Iranian, Asian, and African isolates. Conclusions: Only F. hepatica is distributed among sheep and cattle in West Azerbaijan Province, Iran. However, the 5 and 6 bp variation in the ITS1 and ITS2 regions, respectively, is not enough to separate Fasciola spp. Therefore, more studies are essential for designing new molecular markers for correct species identification. PMID:27095969
Nullomers and High Order Nullomers in Genomic Sequences
Vergni, Davide; Santoni, Daniele
2016-01-01
A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e. it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is now debated in the literature. Here, we investigated the nature of nullomers, whether their absence in genomes has just a statistical explanation or it is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied different aspects of them: comparison with nullomers of random sequences, CpG distribution and mean helical rise. In agreement with previous results we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless antithetical results were found when considering a random DNA sequence preserving dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built on eleven species based on both the similarities between the dinucleotide frequencies and the number of nullomers two species share, showing that nullomers are fairly conserved among close species. Finally the study of mean helical rise of nullomers sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. The obtained results show that nullomers are the consequence of the peculiar structure of DNA (also including biased CpG frequency and CpGs islands), so that the hypermutability model, also taking into account CpG islands, seems to be not sufficient to explain nullomer phenomenon. 
Finally, high order nullomers could emphasize those features that already make simple nullomers useful in several applications. PMID:27906971
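The definitions in the abstract above can be made concrete with a small sketch. It assumes, for illustration only, that a "mutated sequence" means a single-nucleotide substitution, that only one strand is scanned, and that windows containing non-ACGT symbols are ignored; the function names are hypothetical and the authors' actual procedure may differ.

```python
from itertools import product

ALPHABET = "ACGT"


def observed_kmers(seq: str, k: int) -> set:
    """All length-k words over ACGT that occur in seq (single strand)."""
    seq = seq.upper()
    words = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return {w for w in words if set(w) <= set(ALPHABET)}


def nullomers(seq: str, k: int) -> set:
    """Length-k words that never occur in seq, i.e. its absent words."""
    every_word = {"".join(p) for p in product(ALPHABET, repeat=k)}
    return every_word - observed_kmers(seq, k)


def substitution_neighbors(word: str):
    """Yield every word that differs from `word` at exactly one position."""
    for i, base in enumerate(word):
        for alt in ALPHABET:
            if alt != base:
                yield word[:i] + alt + word[i + 1:]


def high_order_nullomers(seq: str, k: int) -> set:
    """Nullomers whose single-substitution mutants are all nullomers too."""
    absent = nullomers(seq, k)
    return {w for w in absent
            if all(n in absent for n in substitution_neighbors(w))}


if __name__ == "__main__":
    toy = "ACGT"
    print(sorted(nullomers(toy, 2)))
    print(sorted(high_order_nullomers(toy, 2)))
```

Enumerating all 4^k words is practical only for small k; for genome-scale sequences a more memory-conscious k-mer counting strategy would be needed.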
Lin, Mingqun; Zhang, Chunbin; Gibson, Kathryn; Rikihisa, Yasuko
2009-01-01
Neorickettsia risticii is an obligate intracellular bacterium of the trematodes and mammals. Horses develop Potomac horse fever (PHF) when they ingest aquatic insects containing encysted N. risticii-infected trematodes. The complete genome sequence of N. risticii Illinois consists of a single circular chromosome of 879 977 bp and encodes 38 RNA species and 898 proteins. Although N. risticii has limited ability to synthesize amino acids and lacks many metabolic pathways, it is capable of making major vitamins, cofactors and nucleotides. Comparison with its closely related human pathogen N. sennetsu showed that 758 (88.2%) of protein-coding genes are conserved between N. risticii and N. sennetsu. Four-way comparison of genes among N. risticii and other Anaplasmataceae showed that most genes are either shared among Anaplasmataceae (525 orthologs that generally associated with housekeeping functions), or specific to each genome (>200 genes that are mostly hypothetical proteins). Genes potentially involved in the pathogenesis of N. risticii were identified, including those encoding putative outer membrane proteins, two-component systems and a type IV secretion system (T4SS). The bipolar localization of T4SS pilus protein VirB2 on the bacterial surface was demonstrated for the first time in obligate intracellular bacteria. These data provide insights toward genomic potential of N. risticii and intracellular parasitism, and facilitate our understanding of PHF pathogenesis. PMID:19661282
Use of DNA barcodes to identify flowering plants.
Kress, W John; Wurdack, Kenneth J; Zimmer, Elizabeth A; Weigt, Lee A; Janzen, Daniel H
2005-06-07
Methods for identifying species by using short orthologous DNA sequences, known as "DNA barcodes," have been proposed and initiated to facilitate biodiversity studies, identify juveniles, associate sexes, and enhance forensic analyses. The cytochrome c oxidase 1 sequence, which has been found to be widely applicable in animal barcoding, is not appropriate for most species of plants because of a much slower rate of cytochrome c oxidase 1 gene evolution in higher plants than in animals. We therefore propose the nuclear internal transcribed spacer region and the plastid trnH-psbA intergenic spacer as potentially usable DNA regions for applying barcoding to flowering plants. The internal transcribed spacer is the most commonly sequenced locus used in plant phylogenetic investigations at the species level and shows high levels of interspecific divergence. The trnH-psbA spacer, although short (approximately 450 bp), is the most variable plastid region in angiosperms and is easily amplified across a broad range of land plants. Comparison of the total plastid genomes of tobacco and deadly nightshade, enhanced with trials on widely divergent angiosperm taxa, including closely related species in seven plant families and a group of species sampled from a local flora encompassing 50 plant families (for a total of 99 species, 80 genera, and 53 families), suggests that the sequences in this pair of loci have the potential to discriminate among the largest number of plant species for barcoding purposes.
Dupont, L; Boizet-Bonhoure, B; Coddeville, M; Auvray, F; Ritzenthaler, P
1995-01-01
Temperate phage mv4 integrates its DNA into the chromosome of Lactobacillus delbrueckii subsp. bulgaricus strains via site-specific recombination. Nucleotide sequencing of a 2.2-kb attP-containing phage fragment revealed the presence of four open reading frames. The larger open reading frame, close to the attP site, encoded a 427-amino-acid polypeptide with similarity in its C-terminal domain to site-specific recombinases of the integrase family. Comparison of the sequences of attP, bacterial attachment site attB, and host-phage junctions attL and attR identified a 17-bp common core sequence, where strand exchange occurs during recombination. Analysis of the attB sequence indicated that the core region overlaps the 3' end of a tRNA(Ser) gene. Phage mv4 DNA integration into the tRNA(Ser) gene preserved an intact tRNA(Ser) gene at the attL site. An integration vector based on the mv4 attP site and int gene was constructed. This vector transforms a heterologous host, L. plantarum, through site-specific integration into the tRNA(Ser) gene of the genome and will be useful for development of an efficient integration system for a number of additional bacterial species in which an identical tRNA gene is present. PMID:7836291
Niehaus, Eva-Maria; Münsterkötter, Martin; Proctor, Robert H.; Brown, Daren W.; Sharon, Amir; Idan, Yifat; Oren-Young, Liat; Sieber, Christian M.; Novák, Ondřej; Pěnčík, Aleš; Tarkowská, Danuše; Hromadová, Kristýna; Freeman, Stanley; Maymon, Marcel; Elazar, Meirav; Youssef, Sahar A.; El-Shabrawy, El Said M.; Shalaby, Abdel Baset A.; Houterman, Petra; Brock, Nelson L.; Burkhardt, Immo; Tsavkelova, Elena A.; Dickschat, Jeroen S.; Galuszka, Petr; Güldener, Ulrich; Tudzynski, Bettina
2016-01-01
Species of the Fusarium fujikuroi species complex (FFC) cause a wide spectrum of often devastating diseases on diverse agricultural crops, including coffee, fig, mango, maize, rice, and sugarcane. Although species within the FFC are difficult to distinguish by morphology, and their genes often share 90% sequence similarity, they can differ in host plant specificity and life style. FFC species can also produce structurally diverse secondary metabolites (SMs), including the mycotoxins fumonisins, fusarins, fusaric acid, and beauvericin, and the phytohormones gibberellins, auxins, and cytokinins. The spectrum of SMs produced can differ among closely related species, suggesting that SMs might be determinants of host specificity. To date, genomes of only a limited number of FFC species have been sequenced. Here, we provide draft genome sequences of three more members of the FFC: a single isolate of F. mangiferae, the cause of mango malformation, and two isolates of F. proliferatum, one a pathogen of maize and the other an orchid endophyte. We compared these genomes to publicly available genome sequences of three other FFC species. The comparisons revealed species-specific and isolate-specific differences in the composition and expression (in vitro and in planta) of genes involved in SM production, including those for phytohormone biosynthesis. Such differences have the potential to impact host specificity and, as in the case of F. proliferatum, the pathogenic versus endophytic life style. PMID:28040774
Functional dissection of the alphavirus capsid protease: sequence requirements for activity.
Thomas, Saijo; Rai, Jagdish; John, Lijo; Günther, Stephan; Drosten, Christian; Pützer, Brigitte M; Schaefer, Stephan
2010-11-18
The alphavirus capsid is multifunctional and plays a key role in the viral life cycle. The nucleocapsid domain is released by the self-cleavage activity of the serine protease domain within the capsid. All alphaviruses analyzed to date show this autocatalytic cleavage. Here we have analyzed the sequence requirements for the cleavage activity of the capsid protease of Chikungunya virus, a member of the genus Alphavirus. Amongst alphaviruses, the C-terminal amino acid tryptophan (W261) is conserved and was found to be important for the cleavage. Mutating tryptophan to alanine (W261A) completely inactivated the protease. Other amino acids near W261 had no effect on the activity of this protease. However, the serine protease inhibitor AEBSF did not inhibit the activity. Through error-prone PCR we found that isoleucine 227 is important for effective activity. The loss of activity was analyzed further by molecular modelling and comparison of WT and mutant structures. It was found that lysine introduced at position 227 is spatially very close to the catalytic triad and may disrupt electrostatic interactions in the catalytic site and thus inactivate the enzyme. We are also examining other sequence requirements for this protease activity. We analyzed various amino acid sequence requirements for the activity of ChikV capsid protease and found that amino acids outside the catalytic triad are important for the activity.
Liu, Di; Zhang, Xiang-Bin; Yan, Zhuan-Qiang; Chen, Feng; Ji, Jun; Qin, Jian-Ping; Li, Hai-Yan; Lu, Jun-Peng; Xue, Yu; Liu, Jia-Jia; Xie, Qing-Mei; Ma, Jing-Yun; Xue, Chun-Yi; Bee, Ying-Zuo
2013-06-01
Infectious bursal disease virus (IBDV) is a double-stranded RNA virus that causes immunosuppressive disease in young chickens. Thousands of cases of IBDV infection are reported each year in South China, and these infections can result in considerable economic losses to the poultry industry. To monitor variations of the virus during the outbreaks, 30 IBDVs were identified from vaccinated chicken flocks from nine provinces in South China in 2011. VP2 fragments from different virus strains were sequenced and analyzed by comparison with the published sequences of IBDV strains from China and around the world. Phylogenetic analysis of hypervariable regions of the VP2 (vVP2) gene showed that 29 of the isolates were very virulent (vv) IBDVs, and were closely related to vvIBDV strains from Europe and Asia. Alignment analysis of the deduced amino acid (aa) sequences of vVP2 showed that the 29 vv isolates had high uniformity, indicating low variability and slow evolution of the virus. The non-vvIBDV isolate JX2-11 was associated with higher than expected mortality, and had high deduced aa sequence similarity (99.2 %) with the attenuated vaccine strain B87 (BJ). The present study has demonstrated the continued circulation of IBDV strains in South China, and emphasizes the importance of reinforcing IBDV surveillance.
Pelsy, F.; Merdinoglu, D.
2002-09-01
A chromosome-walking strategy was used to sequence and characterize retrotransposons in the grapevine genome. The reconstitution of a family of retroelements, named Tvv1, was achieved by six successive steps. These elements share a single, highly conserved open reading frame 4,153 nucleotides-long, putatively encoding the gag, pro, int, rt and rh proteins. Comparison of the Tvv1 open reading frame coding potential with those of drosophila copia and tobacco Tnt1, revealed that Tvv1 is closely related to Ty 1 copia-like retrotransposons. A highly variable untranslated leader region, upstream of the open reading frame, allowed us to differentiate Tvv1 variants, which represent a family of at least 28 copies, in varying sizes. This internal region is flanked by two long terminal repeats in direct orientation, sized between 149 and 157 bp. Among elements theoretically sized from 4,970 to 5,550 bp, we describe the full-length sequence of a reference element Tvv1-1, 5,343 nucleotides-long. The full-length sequence of Tvv1-1 compared to pea PDR1 shows a 53.3% identity. In addition, both elements contain long terminal repeats of nearly the same size in which the U5 region could be entirely absent. Therefore, we assume that Tvv1 and PDR1 could constitute a particular class of short LTRs retroelements.
Marzocchi, W.; Vilardo, G.; Hill, D.P.; Ricciardi, G.P.; Ricco, C.
2001-01-01
We analyzed and compared the seismic activity that has occurred in the last two to three decades in three distinct volcanic areas: Phlegraean Fields, Italy; Vesuvius, Italy; and Long Valley, California. Our main goal is to identify and discuss common features and peculiarities in the temporal evolution of earthquake sequences that may reflect similarities and differences in the generating processes between these volcanic systems. In particular, we tried to characterize the time series of the number of events and of the seismic energy release in terms of stochastic, deterministic, and chaotic components. The time sequences from each area consist of thousands of earthquakes that allow a detailed quantitative analysis and comparison. The results obtained showed no evidence for either deterministic or chaotic components in the earthquake sequences in Long Valley caldera, which appears to be dominated by stochastic behavior. In contrast, earthquake sequences at Phlegraean Fields and Mount Vesuvius show a deterministic signal mainly consisting of a 24-hour periodicity. Our analysis suggests that the modulation in seismicity is in some way related to thermal diurnal processes, rather than luni-solar tidal effects. Independently of the process that generates these periodicities in the seismicity, it is suggested that the lack (or presence) of diurnal cycles in seismic swarms of volcanic areas could be closely linked to the presence (or lack) of magma motion.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lamb, J.; Harris, P.C.; Wood, W.G.
The authors have previously described a series of patients in whom the deletion of 1-2 megabases (Mb) of DNA from the tip of the short arm of chromosome 16 (band 16p13.3) is associated with α-thalassemia/mental retardation syndrome (ATR-16). They now show that one of these patients has a de novo truncation of the terminal 2 Mb of chromosome 16p and that telomeric sequence (TTAGGG)ₙ has been added at the site of breakage. This suggests that the chromosomal break, which is paternal in origin and which probably arose at meiosis, has been stabilized in vivo by the direct addition of the telomeric sequence. Sequence comparisons of this breakpoint with that of a previously described chromosomal truncation (ααᵀᴵ) do not reveal extensive sequence homology. However, both breakpoints show minimal complementarity (3-4 bp) to the proposed RNA template of human telomerase at the site at which telomere repeats have been added. Unlike previously characterized individuals with ATR-16, the clinical features of this patient appear to be solely due to monosomy for the terminal portion of 16p13.3. The identification of further patients with "pure" monosomy for the tip of chromosome 16p will be important for defining the loci contributing to the phenotype of this syndrome. 33 refs., 4 figs., 1 tab.
Klaassen, V A; Boeshore, M; Dolja, V V; Falk, B W
1994-07-01
Purified virions of lettuce infectious yellows virus (LIYV), a tentative member of the closterovirus group, contained two RNAs of approximately 8500 and 7300 nucleotides (RNAs 1 and 2 respectively) and a single coat protein species with M(r) of approximately 28,000. LIYV-infected plants contained multiple dsRNAs. The two largest were the correct size for the replicative forms of LIYV virion RNAs 1 and 2. To assess the relationships between LIYV RNAs 1 and 2, cDNAs corresponding to the virion RNAs were cloned. Northern blot hybridization analysis showed no detectable sequence homology between these RNAs. A partial amino acid sequence obtained from purified LIYV coat protein was found to align in the most upstream of four complete open reading frames (ORFs) identified in a LIYV RNA 2 cDNA clone. The identity of this ORF was confirmed as the LIYV coat protein gene by immunological analysis of the gene product expressed in vitro and in Escherichia coli. Computer analysis of the LIYV coat protein amino acid sequence indicated that it belongs to a large family of proteins forming filamentous capsids of RNA plant viruses. The LIYV coat protein appears to be most closely related to the coat proteins of two closteroviruses, beet yellows virus and citrus tristeza virus.
Thiry, Damien; Mauroy, Axel; Saegerman, Claude; Thomas, Isabelle; Wautier, Magali; Miry, Cora; Czaplicki, Guy; Berkvens, Dirk; Praet, Nicolas; van der Poel, Wim; Cariolet, Roland; Brochier, Bernard; Thiry, Etienne
2014-08-27
Zoonotic transmission of hepatitis E virus (HEV) is of special concern, particularly in high income countries where waterborne infections are less frequent than in developing countries. High HEV seroprevalences can be found in European pig populations. The aims of this study were to obtain prevalence data on HEV infection in swine in Belgium and to phylogenetically compare Belgian human HEV sequences with those obtained from swine. An ELISA screening prevalence of 73% (95% CI 68.8-77.5) was determined in Belgian pigs, and part of the results were re-evaluated by Western blot (WB). A receiver operating characteristic curve analysis was performed and scenarios varying the ELISA specificity relative to WB were analysed. The seroprevalences estimated by the different scenarios ranged between 69 and 81% and are in agreement with the high exposure of the European pig population to HEV. Pig HEV sequences were genetically compared to those detected in humans in Belgium, and a predominance of genotype 3 subtype f was shown in both swine and humans. The high HEV seroprevalence in swine and the close phylogenetic relationships between pig and human HEV sequences further support the risk for zoonotic transmission of HEV between humans and pigs.
Feng, X; Happ, G M
1996-11-14
The cDNA for Sp23, a structural protein of the spermatophore of Tenebrio molitor, had been previously cloned and characterized (Paesen, G.C., Schwartz, M.B., Peferoen, M., Weyda, F. and Happ, G.M. (1992a) Amino acid sequence of Sp23, a structure protein of the spermatophore of the mealworm beetle, Tenebrio molitor. J. Biol. Chem. 257, 18852-18857). Using the labeled cDNA for Sp23 as a probe to screen a library of genomic DNA from Tenebrio molitor, we isolated a genomic clone for Sp23. A 5373-base pair (bp) restriction fragment containing the Sp23 gene was sequenced. The coding region is separated by a 55-bp intron which is located close to the translation start site. Three putative ecdysone response elements (EcRE) are identified in the 5' flanking region of the Sp23 gene. Comparison of the flanking regions of the Sp23 gene with those of the D-protein gene expressed in the accessory glands of Tenebrio reveals similar sequences present in the flanking regions of the two genes. The genomic organization of the coding region of the Sp23 gene shares similarities with that of the D-protein gene, three Drosophila accessory gland genes and two Drosophila 20-OH ecdysone-responsive genes.
Premaratna, Ranjan; Blanton, Lucas S; Samaraweera, Dilhar N; de Silva, G Nalika N; Chandrasena, Nilmini T G A; Walker, David H; de Silva, H J
2017-01-13
To date, more than 20 antigenically distinct strains of Orientia tsutsugamushi (OT), which cause an undifferentiated acute febrile illness in humans, have been reported within the tsutsugamushi triangle. Genotypic characterization of OT in different geographic regions, or within the same country, is important for establishing effective diagnostics and clinical management and for developing effective vaccines. The genetic and antigenic characteristics of OT causing human disease are not known for Sri Lanka. Adult patients and children who were admitted with an acute febrile illness and presumed to have acute scrub typhus, based on the presence of an eschar and other supporting clinical features, were recruited. Eschar biopsies and buffy coat samples collected from patients in whom OT infection was confirmed by IFA were further studied by real-time PCR (Orientia 47 kD) and nested PCR (Orientia 56 kD) amplification. DNA sequences were obtained for the 56 kD gene amplicons, and phylogenetic comparisons were made using currently available data in GenBank [nucleotide substitutions per 100 residues, 1000 bootstrap trials]. Twenty eschar biopsies (Location 1, 19; Location 2, 1) and eight buffy coat samples (Location 1, 6; Location 2, 2) examined by real-time PCR revealed Orientia amplicons in 16 samples. DNA sequences were obtained for the 56 kD gene amplicons in 12 eschars and 4 buffy coat samples. Genotyping of the Location 1 samples revealed that seven exhibited close homology with JP1 [distantly related to UT177 Thai (Karp related)], five had close homology with the Kato strain, two had close homology with JGv and JG AF [distantly related to Kawasaki M63383] and one had close homology with the Gilliam strain. The Location 2 strain was closely related to Kuroki-Boryong L04956, a genotype distributed in far eastern Asia. Like the other patients in the cohort, this patient had never travelled out of Sri Lanka.
We observed all three main OT genotypes in Sri Lanka, and the majority fell into the Thai Karp-related clade. These results demonstrate great antigenic diversity of OT in the studied areas of Sri Lanka.
ABACAS: algorithm-based automatic contiguation of assembled sequences
Assefa, Samuel; Keane, Thomas M.; Otto, Thomas D.; Newbold, Chris; Berriman, Matthew
2009-01-01
Summary: Due to the availability of new sequencing technologies, we are now increasingly interested in sequencing closely related strains of existing finished genomes. Recently a number of de novo and mapping-based assemblers have been developed to produce high quality draft genomes from new sequencing technology reads. New tools are necessary to take contigs from a draft assembly through to a fully contiguated genome sequence. ABACAS is intended as a tool to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence. The input to ABACAS is a set of contigs which will be aligned to the reference genome, ordered and orientated, visualized in the ACT comparative browser, and optimal primer sequences are automatically generated. Availability and Implementation: ABACAS is implemented in Perl and is freely available for download from http://abacas.sourceforge.net Contact: sa4@sanger.ac.uk PMID:19497936
Ott, Alina; Trautschold, Brian; Sandhu, Devinder
2011-01-01
Soybean is a major crop that is an important source of oil and proteins. A number of genetic linkage maps have been developed in soybean. Specifically, hundreds of simple sequence repeat (SSR) markers have been developed and mapped. Recent sequencing of the soybean genome resulted in the generation of vast amounts of genetic information. The objectives of this investigation were to use SSR markers in developing a connection between genetic and physical maps and to determine the physical distribution of recombination on soybean chromosomes. A total of 2,188 SSRs were used for sequence-based physical localization on soybean chromosomes. Linkage information was used from different maps to create an integrated genetic map. Comparison of the integrated genetic linkage maps and sequence based physical maps revealed that the distal 25% of each chromosome was the most marker-dense, containing an average of 47.4% of the SSR markers and 50.2% of the genes. The proximal 25% of each chromosome contained only 7.4% of the markers and 6.7% of the genes. At the whole genome level, the marker density and gene density showed a high correlation (R²) of 0.64 and 0.83, respectively, with the physical distance from the centromere. Recombination followed a similar pattern with comparisons indicating that recombination is high in telomeric regions, though the correlation between crossover frequency and distance from the centromeres is low (R² = 0.21). Most of the centromeric regions were low in recombination. The crossover frequency for the entire soybean genome was 7.2%, with extremes much higher and lower than average. The number of recombination hotspots varied from 1 to 12 per chromosome. A high correlation of 0.83 between the distribution of SSR markers and genes suggested close association of SSRs with genes. The knowledge of distribution of recombination on chromosomes may be applied in characterizing and targeting genes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lan, Yemin; Rosen, Gail; Hershberg, Ruth
2016-05-03
The 16S rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16S rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16S rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16S rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16S rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.
Harhay, Gregory P; Harhay, Dayna M; Bono, James L; Smith, Timothy P L; Capik, Sarah F; DeDonder, Keith D; Apley, Michael D; Lubbers, Brian V; White, Bradley J; Larson, Robert L
2017-10-05
Histophilus somni is a fastidious Gram-negative opportunistic pathogenic Pasteurellaceae that affects multiple organ systems and is one of the principal bacterial species contributing to bovine respiratory disease complex (BRDC) in feed yard cattle. Here, we present seven closed genome sequences isolated from three beef calves showing signs of BRDC.
Chiapello, Hélène; Gendrault, Annie; Caron, Christophe; Blum, Jérome; Petit, Marie-Agnès; El Karoui, Meriem
2008-11-27
The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic.
Transcriptome-based differentiation of closely-related Miscanthus lines.
Chouvarine, Philippe; Cooksey, Amanda M; McCarthy, Fiona M; Ray, David A; Baldwin, Brian S; Burgess, Shane C; Peterson, Daniel G
2012-01-01
Distinguishing between individuals is critical to those conducting animal/plant breeding, food safety/quality research, diagnostic and clinical testing, and evolutionary biology studies. Classical genetic identification studies are based on marker polymorphisms, but polymorphism-based techniques are time and labor intensive and often cannot distinguish between closely related individuals. Illumina sequencing technologies provide the detailed sequence data required for rapid and efficient differentiation of related species, lines/cultivars, and individuals in a cost-effective manner. Here we describe the use of Illumina high-throughput exome sequencing, coupled with SNP mapping, as a rapid means of distinguishing between related cultivars of the lignocellulosic bioenergy crop giant miscanthus (Miscanthus × giganteus). We provide the first exome sequence database for Miscanthus species complete with Gene Ontology (GO) functional annotations. A SNP comparative analysis of rhizome-derived cDNA sequences was successfully utilized to distinguish three Miscanthus × giganteus cultivars from each other and from other Miscanthus species. Moreover, the resulting phylogenetic tree generated from SNP frequency data parallels the known breeding history of the plants examined. Some of the giant miscanthus plants exhibit considerable sequence divergence. Here we describe an analysis of Miscanthus in which high-throughput exome sequencing was utilized to differentiate between closely related genotypes despite the current lack of a reference genome sequence. We functionally annotated the exome sequences and provide resources to support Miscanthus systems biology. In addition, we demonstrate the use of the commercial high-performance cloud computing to do computational GO annotation.
2012-01-01
Background Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST) 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA) strains (including STs 16, 17, 18, and 78), in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade) and are evolutionally considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA) clade with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains, as previously reported. 
Conclusions Our findings along with other studies show that HA clonal lineages harbor specific genetic elements as well as sequence differences in the core genome which may confer selection advantages over the more heterogeneous CA E. faecium isolates. Which of these differences are important for the success of specific E. faecium lineages in the hospital environment remain(s) to be determined. PMID:22769602
Larsen, J B; Larsen, A; Bratbak, G; Sandaa, R-A
2008-05-01
Algal viruses are considered ecologically important by affecting host population dynamics and nutrient flow in aquatic food webs. Members of the family Phycodnaviridae are also interesting due to their extraordinary genome size. Few algal viruses in the Phycodnaviridae family have been sequenced, and those that have been have few genes in common and low gene homology. It has hence been difficult to design general PCR primers that allow further studies of their ecology and diversity. In this study, we screened the nine type I core genes of the nucleocytoplasmic large DNA viruses for sequences suitable for designing a general set of primers. Sequence comparison between members of the Phycodnaviridae family, including three partly sequenced viruses infecting the prymnesiophyte Pyramimonas orientalis and the haptophytes Phaeocystis pouchetii and Chrysochromulina ericina (Pyramimonas orientalis virus 01B [PoV-01B], Phaeocystis pouchetii virus 01 [PpV-01], and Chrysochromulina ericina virus 01B [CeV-01B], respectively), revealed eight conserved regions in the major capsid protein (MCP). Two of these regions also showed conservation at the nucleotide level, and this allowed us to design degenerate PCR primers. The primers produced 347- to 518-bp amplicons when applied to lysates from algal viruses kept in culture and from natural viral communities. The aim of this work was to use the MCP as a proxy to infer phylogenetic relationships and genetic diversity among members of the Phycodnaviridae family and to determine the occurrence and diversity of this gene in natural viral communities. The results support the current legitimate genera in the Phycodnaviridae based on alga host species. 
However, while placing the mimivirus in close proximity to the type species, PBCV-1, of Phycodnaviridae along with the three new viruses assigned to the family (PoV-01B, PpV-01, and CeV-01B), the results also indicate that the coccolithoviruses and phaeoviruses are more diverged from this group. Phylogenetic analysis of amplicons from virus assemblages from Norwegian coastal waters as well as from isolated algal viruses revealed a cluster of viruses infecting members of the prymnesiophyte and prasinophyte alga divisions. Other distinct clusters were also identified, containing amplicons from this study as well as sequences retrieved from the Sargasso Sea metagenome. This shows that closely related sequences of this family are present at geographically distant locations within the marine environment.
Nupur; Tanuku, Naga Radha Srinivas; Shinichi, Takaichi; Pinnaka, Anil Kumar
2015-08-01
A novel brown-coloured, Gram-negative-staining, rod-shaped, motile, phototrophic, purple sulfur bacterium, designated strain AK40T, was isolated in pure culture from a sediment sample collected from Coringa mangrove forest, India. Strain AK40T contained bacteriochlorophyll a and carotenoids of the rhodopin series as major photosynthetic pigments. Strain AK40T was able to grow photoheterotrophically and could utilize a number of organic substrates. It was unable to grow photoautotrophically and did not utilize sulfide or thiosulfate as electron donors. Thiamine and riboflavin were required for growth. The dominant fatty acids were C12 : 0, C16 : 0, C18 : 1ω7c and summed feature 3 (C16 : 1ω7c and/or iso-C15 : 0 2-OH). The polar lipid profile of strain AK40T was found to contain diphosphatidylglycerol, phosphatidylethanolamine, phosphatidylglycerol and eight unidentified lipids. Q-10 was the predominant respiratory quinone. The DNA G+C content of strain AK40T was 65.5 mol%. 16S rRNA gene sequence comparisons indicated that the isolate represented a member of the family Chromatiaceae within the class Gammaproteobacteria. 16S rRNA gene sequence analysis indicated that strain AK40T was closely related to Phaeochromatium fluminis, with 95.2% pairwise sequence similarity to the type strain; sequence similarity to strains of other species of the family was 90.8-94.8%. Based on the sequence comparison data, strain AK40T was positioned distinctly outside the group formed by the genera Phaeochromatium, Marichromatium, Halochromatium, Thiohalocapsa, Rhabdochromatium and Thiorhodovibrio. Distinct morphological, physiological and genotypic differences from previously described taxa supported the classification of this isolate as a representative of a novel species in a new genus, for which the name Phaeobacterium nitratireducens gen. nov., sp. nov. is proposed. The type strain of Phaeobacterium nitratireducens is AK40T ( = JCM 19219T = MTCC 11824T).
Chen, Jie; Moinard, Magalie; Xu, Jianping; Wang, Shouxian; Foulongne-Oriol, Marie; Zhao, Ruilin; Hyde, Kevin D.; Callac, Philippe
2016-01-01
The internal transcribed spacer (ITS) region of the nuclear ribosomal RNA gene cluster is widely used in fungal taxonomy and phylogeographic studies. The medicinal and edible mushroom Agaricus subrufescens has a worldwide distribution with a high level of polymorphism in the ITS region. A previous analysis suggested notable ITS sequence heterogeneity within the wild French isolate CA487. The objective of this study was to investigate the pattern and potential mechanism of ITS sequence heterogeneity within this strain. Using PCR, cloning, and sequencing, we identified three types of ITS sequences, A, B, and C with a balanced distribution, which differed from each other at 13 polymorphic positions. The phylogenetic comparisons with samples from different continents revealed that the type C sequence was similar to those found in Oceanian and Asian specimens of A. subrufescens while types A and B sequences were close to those found in the Americas or in Europe. We further investigated the inheritance of these three ITS sequence types by analyzing their distribution among single-spore isolates from CA487. In this analysis, three co-dominant markers were used firstly to distinguish the homokaryotic offspring from the heterokaryotic offspring. The homokaryotic offspring were then analyzed for their ITS types. Our genetic analyses revealed that types A and B were two alleles segregating at one locus ITSI, while type C was not allelic with types A and B but was located at another unlinked locus ITSII. Furthermore, type C was present in only one of the two constitutive haploid nuclei (n) of the heterokaryotic (n+n) parent CA487. These data suggest that there was a relatively recent introduction of the type C sequence and a duplication of the ITS locus in this strain. Whether other genes were also transferred and duplicated and their impacts on genome structure and stability remain to be investigated. PMID:27228131
DNA Barcode for Identifying Folium Artemisiae Argyi from Counterfeits.
Mei, Quanxi; Chen, Xiaolu; Xiang, Li; Liu, Yue; Su, Yanyan; Gao, Yuqiao; Dai, Weibo; Dong, Pengpeng; Chen, Shilin
2016-01-01
Folium Artemisiae Argyi is an important herb in traditional Chinese medicine. It is commonly used in moxibustion, medicine, etc. However, identifying Artemisia argyi is difficult because this herb exhibits similar morphological characteristics to closely related species and counterfeits. To verify the applicability of DNA barcoding, ITS2 and psbA-trnH were used to identify A. argyi from 15 closely related species and counterfeits. Results indicated that total DNA was easily extracted from all the samples and that both ITS2 and psbA-trnH fragments can be easily amplified. ITS2 was a more ideal barcode than psbA-trnH and ITS2+psbA-trnH to identify A. argyi from closely related species and counterfeits on the basis of sequence character, genetic distance, and tree methods. The sequence length was 225 bp for the 56 ITS2 sequences of A. argyi, and no variable site was detected. For the ITS2 sequences, A. capillaris, A. anomala, A. annua, A. igniaria, A. maximowicziana, A. princeps, Dendranthema vestitum, and D. indicum had single nucleotide polymorphisms (SNPs). The intraspecific Kimura 2-Parameter distance was zero, which is lower than the minimum interspecific distance (0.005). A. argyi, the closely related species, and counterfeits, except for Artemisia maximowicziana and Artemisia sieversiana, were separated into pairs of divergent clusters by using the neighbor joining, maximum parsimony, and maximum likelihood tree methods. Thus, the ITS2 sequence was an ideal barcode to identify A. argyi from closely related species and counterfeits to ensure the safe use of this plant.
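The Kimura 2-Parameter distance mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes two nucleotide sequences that are already aligned to the same length, and it skips gaps and ambiguity codes; the function name `k2p_distance` and the toy sequences in the usage note are hypothetical and are not taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P is the proportion of
    transitions and Q the proportion of transversions among compared sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumption: gaps and ambiguity codes are ignored
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # (saturated) for the model to give a finite distance.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For example, `k2p_distance("ACGT" * 50, "GCGT" + "ACGT" * 49)` compares 200 sites that differ by one transition and gives a distance slightly above the raw proportion 1/200, because the model corrects for unobserved multiple substitutions. Identical sequences give a distance of zero, which matches the intraspecific value reported above.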
K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.
Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue
2018-05-15
Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Compared with other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length, or increasing dataset sizes. The K2 and K2* approaches are implemented in the R language as a package that is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). Contact: yueljiang@163.com. Supplementary data are available at Bioinformatics online.
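The general idea of an alignment-free comparison based on Kendall statistics can be sketched as follows. This is not the K2 or K2* algorithm from the paper, whose details are not given in the abstract; it is only an illustrative assumption in which each sequence is reduced to a vector of k-mer counts and two vectors are compared with Kendall's tau-b. The function names `kmer_counts` and `kendall_tau_b` and the default `k = 2` are hypothetical.

```python
import math
from itertools import combinations, product


def kmer_counts(seq, k=2):
    """Count every DNA k-mer in seq, returned in a fixed lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # words containing non-ACGT symbols are skipped
            counts[word] += 1
    return [counts[w] for w in kmers]


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both vectors: contributes to neither term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied_x = concordant + discordant + tied_y_only  # pairs not tied in x
    untied_y = concordant + discordant + tied_x_only  # pairs not tied in y
    denom = math.sqrt(untied_x * untied_y)
    return (concordant - discordant) / denom if denom else 0.0


def kendall_similarity(seq_a, seq_b, k=2):
    """Similarity in [-1, 1] between the k-mer count profiles of two sequences."""
    return kendall_tau_b(kmer_counts(seq_a, k), kmer_counts(seq_b, k))
```

Because no alignment is computed, the cost per pair depends on the number of distinct k-mers rather than on the product of the sequence lengths. The quadratic pair loop is kept for clarity only and would be replaced by an O(n log n) tau computation in any serious use.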
Azevedo Antunes, Camila; Richardson, Emily J; Quick, Joshua; Fuentes-Utrilla, Pablo; Isom, Georgia L; Goodall, Emily C; Möller, Jens; Hoskisson, Paul A; Mattos-Guaraldi, Ana Luiza; Cunningham, Adam F; Loman, Nicholas J; Sangal, Vartul; Burkovski, Andreas; Henderson, Ian R
2018-02-01
The genome sequence of the human pathogen Corynebacterium diphtheriae bv. mitis strain ISS 3319 was determined and closed in this study. The genome is estimated to have 2,404,936 bp encoding 2,257 proteins. This strain also possesses a plasmid of 1,960 bp. Copyright © 2018 Azevedo Antunes et al.
Harhay, Dayna M.; Bono, James L.; Smith, Timothy P. L.; Capik, Sarah F.; DeDonder, Keith D.; Apley, Michael D.; Lubbers, Brian V.; White, Bradley J.; Larson, Robert L.
2017-01-01
Histophilus somni is a fastidious Gram-negative opportunistic pathogenic Pasteurellaceae that affects multiple organ systems and is one of the principal bacterial species contributing to bovine respiratory disease complex (BRDC) in feed yard cattle. Here, we present seven closed genome sequences isolated from three beef calves showing signs of BRDC. PMID:28983006
Umetsu, Kazuo; Iwabuchi, Naruki; Yuasa, Isao; Saitou, Naruya; Clark, Paul F; Boxshall, Geoff; Osawa, Motoki; Igarashi, Keiji
2002-12-01
The complete mitochondrial DNA (mtDNA) of the tadpole shrimp Triops cancriformis was sequenced. The sequence consisted of 15,101 bp with an A+T content of 69%. Its gene arrangement was identical to those of the water flea (Daphnia pulex) and giant tiger prawn (Penaeus monodon), whereas it differed from that of the brine shrimp (Artemia franciscana) in the arrangement of its genes for tRNAs. Phylogenetic analysis revealed T. cancriformis to be more closely related to the water flea than to the brine shrimp and giant tiger prawn. We also compared the 16S rRNA sequences of five formalin-fixed tadpole shrimps that had been collected in five different locations and stored in a museum. The sequence divergence was in the range of 0-1.51%, suggesting that those samples were closely related to each other.
USDA-ARS's Scientific Manuscript database
The complete nucleotide sequence of a recently discovered Florida (FL) isolate of Hibiscus infecting Cilevirus (HiCV) was determined by Sanger sequencing. The movement- and coat- protein gene sequences of the HiCV-FL isolate are more divergent than other genes of the previously sequenced HiCV-HA (Ha...
Reubel, Gerhard H.; Barlough, Jeffrey E.; Madigan, John E.
1998-01-01
We report on the production and characterization of Ehrlichia risticii, the agent of Potomac horse fever (PHF), from snails (Pleuroceridae: Juga spp.) maintained in aquarium culture and compare it genetically to equine strains. Snails were collected from stream waters on a pasture in Siskiyou County, Calif., where PHF is enzootic and were maintained for several weeks in freshwater aquaria in the laboratory. Upon exposure to temperatures above 22°C the snails released trematode cercariae tentatively identified as virgulate cercariae. Fragments of three different genes (genes for 16S rRNA, the groESL heat shock operon, and the 51-kDa major antigen) were amplified from cercaria lysates by PCR and sequenced. Genetic information was also obtained from E. risticii strains from horses with PHF. The PCR positivity of snail secretions was associated with the presence of trematode cercariae. Sequence analysis of the three genes indicated that the source organism closely resembled E. risticii, and the sequences of all three genes were virtually identical to those of the genes of an equine E. risticii strain from a property near the snail collection site. Phylogenetic analyses of the three genes indicated the presence of geographical E. risticii strain clusters. PMID:9620368
Reubel, G H; Barlough, J E; Madigan, J E
1998-06-01
We report on the production and characterization of Ehrlichia risticii, the agent of Potomac horse fever (PHF), from snails (Pleuroceridae: Juga spp.) maintained in aquarium culture and compare it genetically to equine strains. Snails were collected from stream waters on a pasture in Siskiyou County, Calif., where PHF is enzootic and were maintained for several weeks in freshwater aquaria in the laboratory. Upon exposure to temperatures above 22°C the snails released trematode cercariae tentatively identified as virgulate cercariae. Fragments of three different genes (genes for 16S rRNA, the groESL heat shock operon, and the 51-kDa major antigen) were amplified from cercaria lysates by PCR and sequenced. Genetic information was also obtained from E. risticii strains from horses with PHF. The PCR positivity of snail secretions was associated with the presence of trematode cercariae. Sequence analysis of the three genes indicated that the source organism closely resembled E. risticii, and the sequences of all three genes were virtually identical to those of the genes of an equine E. risticii strain from a property near the snail collection site. Phylogenetic analyses of the three genes indicated the presence of geographical E. risticii strain clusters.
Fonfara, Ines; Le Rhun, Anaïs; Chylinski, Krzysztof; Makarova, Kira S.; Lécrivain, Anne-Laure; Bzdrenga, Janek; Koonin, Eugene V.; Charpentier, Emmanuelle
2014-01-01
The CRISPR-Cas-derived RNA-guided Cas9 endonuclease is the key element of an emerging promising technology for genome engineering in a broad range of cells and organisms. The DNA-targeting mechanism of the type II CRISPR-Cas system involves maturation of tracrRNA:crRNA duplex (dual-RNA), which directs Cas9 to cleave invading DNA in a sequence-specific manner, dependent on the presence of a Protospacer Adjacent Motif (PAM) on the target. We show that evolution of dual-RNA and Cas9 in bacteria produced remarkable sequence diversity. We selected eight representatives of phylogenetically defined type II CRISPR-Cas groups to analyze possible coevolution of Cas9 and dual-RNA. We demonstrate that these two components are interchangeable only between closely related type II systems when the PAM sequence is adjusted to the investigated Cas9 protein. Comparison of the taxonomy of bacterial species that harbor type II CRISPR-Cas systems with the Cas9 phylogeny corroborates horizontal transfer of the CRISPR-Cas loci. The reported collection of dual-RNA:Cas9 with associated PAMs expands the possibilities for multiplex genome editing and could provide means to improve the specificity of the RNA-programmable Cas9 tool. PMID:24270795
NASA Technical Reports Server (NTRS)
Reddy, A. S.; Czernik, A. J.; An, G.; Poovaiah, B. W.
1992-01-01
We cloned and sequenced a plant cDNA that encodes U1 small nuclear ribonucleoprotein (snRNP) 70K protein. The plant U1 snRNP 70K protein cDNA is not full length and lacks the coding region for 68 amino acids in the amino-terminal region as compared to human U1 snRNP 70K protein. Comparison of the deduced amino acid sequence of the plant U1 snRNP 70K protein with the amino acid sequence of animal and yeast U1 snRNP 70K protein showed a high degree of homology. The plant U1 snRNP 70K protein is more closely related to the human counterpart than to the yeast 70K protein. The carboxy-terminal half is less well conserved but, like the vertebrate 70K proteins, is rich in charged amino acids. Northern analysis with the RNA isolated from different parts of the plant indicates that the snRNP 70K gene is expressed in all of the parts tested. Southern blotting of genomic DNA using the cDNA indicates that the U1 snRNP 70K protein is coded by a single gene.
Rodriguez Parkitna, Jan M; Ozyhar, Andrzej; Wiśniewski, Jacek R; Kochman, Marian
2002-09-01
Juvenile hormone binding proteins (JHBPs) serve as specific carriers of juvenile hormone (JH) in insect hemolymph. As shown in this report, Galleria mellonella JHBP is encoded by a cDNA of 1063 nucleotides. The pre-protein consists of 245 amino acids with a 20 amino acid leader sequence. The concentration of the JHBP mRNA reaches a maximum on the third day of the last larval instar, and decreases five-fold towards pupation. Comparison of amino acid sequences of JHBPs from Bombyx mori, Heliothis virescens, Manduca sexta and G. mellonella shows that 57 positions out of 226 are occupied by identical amino acids. A phylogenetic tree was constructed from 32 proteins whose function could be associated with JH. It has three major branches: (i) ligand binding domains of nuclear receptors, (ii) JHBPs and JH esterases (JHEs), and (iii) hypothetical proteins found in the Drosophila melanogaster genome. Despite the close positioning of JHEs and JHBPs on the tree, which probably arises from the presence of a common JH binding motif, these proteins are unlikely to belong to the same family. Detailed analysis of the secondary structure modeling shows that JHBPs may contain a beta-barrel motif flanked by alpha-helices and thus be evolutionarily related to the same superfamily as calycins.
Noise and drift analysis of non-equally spaced timing data
NASA Technical Reports Server (NTRS)
Vernotte, F.; Zalamansky, G.; Lantz, E.
1994-01-01
Generally, it is possible to obtain equally spaced timing data from oscillators. The measurement of the drifts and noises affecting oscillators is then performed by using a variance (Allan variance, modified Allan variance, or time variance) or a system of several variances (multivariance method). However, in some cases, several samples, or even several sets of samples, are missing. In the case of millisecond pulsar timing data, for instance, observations are quite irregularly spaced in time. Nevertheless, since some observations are very close together (one minute) and since the timing data sequence is very long (more than ten years), information on both short-term and long-term stability is available. Unfortunately, a direct variance analysis is not possible without interpolating missing data. Different interpolation algorithms (linear interpolation, cubic spline) are used to calculate variances in order to verify that they neither lose information nor add erroneous information. A comparison of the results of the different algorithms is given. Finally, the multivariance method was adapted to the measurement sequence of the millisecond pulsar timing data: the responses of each variance of the system are calculated for each type of noise and drift, with the same missing samples as in the pulsar timing sequence. An estimation of precision, dynamics, and separability of this method is given.
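The interpolate-then-analyze step described in this abstract can be sketched in Python under stated assumptions. The abstract does not give the authors' estimators, so the sketch uses linear interpolation onto a regular grid followed by the standard overlapping Allan variance computed from time-error (phase) samples; the function names `resample_linear` and `allan_variance` and the example numbers are hypothetical.

```python
from bisect import bisect_right


def resample_linear(times, values, step):
    """Linearly interpolate irregular samples onto a regular grid.

    `times` must be strictly increasing; the grid starts at times[0] and
    advances by `step` up to (approximately) times[-1].
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) samples")
    n_points = int((times[-1] - times[0]) / step) + 1
    resampled = []
    for k in range(n_points):
        t = times[0] + k * step
        # index of the last sample at or before t, clamped to a valid segment
        j = min(bisect_right(times, t) - 1, len(times) - 2)
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        resampled.append(values[j] + w * (values[j + 1] - values[j]))
    return resampled


def allan_variance(x, tau0, m=1):
    """Overlapping Allan variance at tau = m * tau0 from phase data x.

    AVAR(tau) = sum_i (x[i+2m] - 2 x[i+m] + x[i])^2 / (2 (N - 2m) tau^2)
    """
    n = len(x)
    if n < 2 * m + 1:
        raise ValueError("not enough samples for this averaging factor")
    tau = m * tau0
    total = sum(
        (x[i + 2 * m] - 2 * x[i + m] + x[i]) ** 2 for i in range(n - 2 * m)
    )
    return total / (2 * (n - 2 * m) * tau ** 2)
```

A typical call would be `allan_variance(resample_linear(times, phase, step), step, m)` for several values of `m`. A pure linear phase trend (a constant frequency offset) yields zero Allan variance even after interpolation, which is one simple check that the interpolation step has not added spurious noise; whether it removes real information is the harder question that the abstract addresses by comparing algorithms.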
Genetic characterization and phylogenetic analysis of Eimeria arloingi in Iranian native kids.
Khodakaram-Tafti, A; Hashemnia, M; Razavi, S M; Sharifiyazdi, H; Nazifi, S
2013-09-01
Among the 16 species of Eimeria from goats, Eimeria arloingi and Eimeria ninakohlyakimovae are regarded as the most pathogenic species in the world and cause clinical caprine coccidiosis. E. arloingi is known to be an important cause of coccidiosis in Iranian kids. Molecular analyses of two portions of nuclear ribosomal DNA (internal transcribed spacer 1 (ITS1) and 18S rDNA) were used for the genetic characterization of E. arloingi. Comparison of the sequencing data of E. arloingi obtained in the present study (ITS1: KC507793 and 18S rDNA: KC507792) with other Eimeria species in the GenBank database revealed a particularly close relationship between E. arloingi and Eimeria spp. from cattle and sheep. The phylogram based on the ITS1 sequences shows that E. arloingi, Eimeria bovis, and Eimeria zuernii formed a distinct group separate from the other remaining Eimeria spp. in cattle and poultry. In pairwise alignment, the 18S rDNA sequence derived from E. arloingi showed 99% similarity to Eimeria ahsata with differences observed at only three nucleotides. This study showed that ITS1 and the 18S rDNA gene are useful genetic markers for the specific identification and differentiation of Eimeria spp. in ruminants.
Deciphering the Diploid Ancestral Genome of the Mesohexaploid Brassica rapa
Cheng, Feng; Mandáková, Terezie; Wu, Jian; Xie, Qi; Lysak, Martin A.; Wang, Xiaowu
2013-01-01
The genus Brassica includes several important agricultural and horticultural crops. Their current genome structures were shaped by whole-genome triplication followed by extensive diploidization. The availability of several crucifer genome sequences, especially that of Chinese cabbage (Brassica rapa), enables study of the evolution of the mesohexaploid Brassica genomes from their diploid progenitors. We reconstructed three ancestral subgenomes of B. rapa (n = 10) by comparing its whole-genome sequence to ancestral and extant Brassicaceae genomes. All three B. rapa paleogenomes apparently consisted of seven chromosomes, similar to the ancestral translocation Proto-Calepineae Karyotype (tPCK; n = 7), which is the evolutionarily younger variant of the Proto-Calepineae Karyotype (n = 7). Based on comparative analysis of genome sequences or linkage maps of Brassica oleracea, Brassica nigra, radish (Raphanus sativus), and other closely related species, we propose a two-step merging of three tPCK-like genomes to form the hexaploid ancestor of the tribe Brassiceae with 42 chromosomes. Subsequent diversification of the Brassiceae was marked by extensive genome reshuffling and chromosome number reduction mediated by translocation events and followed by loss and/or inactivation of centromeres. Furthermore, via interspecies genome comparison, we refined intervals for seven of the genomic blocks of the Ancestral Crucifer Karyotype (n = 8), thus revising the key reference genome for evolutionary genomics of crucifers. PMID:23653472
Nucleotide sequence and phylogenetic analysis of Cucurbit yellow stunting disorder virus RNA 2.
Livieratos, Ioannis C; Coutts, Robert H A
2002-06-01
The complete nucleotide sequence of Cucurbit yellow stunting disorder virus (CYSDV) RNA 2, a whitefly (Bemisia tabaci)-transmitted closterovirus with a bi-partite genome, is reported. CYSDV RNA 2 is 7,281 nucleotides long and contains the closterovirus hallmark gene array with a similar arrangement to the prototype member of the genus Crinivirus, Lettuce infectious yellows virus (LIYV). CYSDV RNA 2 contains open reading frames (ORFs) potentially encoding, in a 5' to 3' direction, proteins of 5 kDa (ORF 1; hydrophobic protein), 62 kDa (ORF 2; heat shock protein 70 homolog, HSP70h), 59 kDa (ORF 3; protein of unknown function), 9 kDa (ORF 4; protein of unknown function), 28.5 kDa (ORF 5; coat protein, CP), 53 kDa (ORF 6; coat protein minor, CPm), and 26.5 kDa (ORF 7; protein of unknown function). Pairwise comparisons of CYSDV RNA 2-encoded proteins (HSP70h, p59 and CPm) among the closteroviruses showed that CYSDV is closely related to LIYV. Phylogenetic analysis based on the amino acid sequence of HSP70h indicated that CYSDV clusters with other members of the genus Crinivirus and is related to Little cherry virus-1 (LChV-1), but is distinct from the aphid- or mealybug-transmitted closteroviruses.
A new polymorphic and multicopy MHC gene family related to nonmammalian class I
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leelayuwat, C.; Degli-Esposti, M.A.; Abraham, L.J.
1994-12-31
The authors have used genomic analysis to characterize a region of the central major histocompatibility complex (MHC) spanning approximately 300 kilobases (kb) between TNF and HLA-B. This region has been suggested to carry genetic factors relevant to the development of autoimmune diseases such as myasthenia gravis (MG) and insulin dependent diabetes mellitus (IDDM). Genomic sequence was analyzed for coding potential, using two neural network programs, GRAIL and GeneParser. A genomic probe, JAB, containing putative coding sequences (PERB11) located 60 kb centromeric of HLA-B, was used for northern analysis of human tissues. Multiple transcripts were detected. Southern analysis of genomic DNA and overlapping YAC clones, covering the region from BAT1 to HLA-F, indicated that there are at least five copies of PERB11, four of which are located within this region of the MHC. The partial cDNA sequence of PERB11 was obtained from poly-A RNA derived from skeletal muscle. The putative amino acid sequence of PERB11 shares approximately 30% identity to MHC class I molecules from various species, including reptiles, chickens, and frogs, as well as to other MHC class I-like molecules, such as the IgG FcR of the mouse and rat and the human Zn-α2-glycoprotein. From direct comparison of amino acid sequences, it is concluded that PERB11 is a distinct molecule more closely related to nonmammalian than known mammalian MHC class I molecules. Genomic sequence analysis of PERB11 from five MHC ancestral haplotypes (AH) indicated that the gene is polymorphic at both DNA and protein level. The results suggest that the authors have identified a novel polymorphic gene family with multiple copies within the MHC. 48 refs., 10 figs., 2 tabs.
Chen, Y C; Huang, F D; Chen, N H; Shou, J Y; Wu, L
1998-04-01
In the last 2-3 decades the role of the premotor cortex (PM) of the monkey in memorized spatial sequential (MSS) movements has been amply investigated. However, it is not yet known whether PM participates in movement sequence behaviour guided by recognition of visual figures (i.e. the figure-recognition sequence, FRS). In the present work three monkeys were trained to perform both FRS and MSS tasks. Postmortem examination showed that 202 cells were in the dorso-lateral premotor cortex. Among 111 cells recorded during the two tasks, more than 50% changed their activity during the cue periods in either task. During the response period, the ratios of cells with changes of firing rate in both FRS and MSS were high and roughly equal to each other, while during the image period, the proportion in the FRS (83.7%) was significantly higher than that in the MSS (66.7%). Comparison of neuronal activities during the same motor sequence of the two different tasks showed that during the image periods PM neuronal activities were more closely related to the FRS task, while during the cue periods no difference could be found. Analysis of cell responses showed that neurons with longer latency were much more numerous in MSS than in FRS in either the cue or image period. The present results indicate that the premotor cortex participates in the FRS motor sequence as well as in MSS and suggest that the dorso-lateral PM represents another subarea in function shared by both FRS and MSS tasks. However, in view of the differences in PM neuronal responses in the cue or image periods of FRS and MSS tasks, it seems likely that the neural networks involved in FRS and MSS tasks are different.
Application of Genomic Technologies to the Breeding of Trees
Badenes, Maria L.; Fernández i Martí, Angel; Ríos, Gabino; Rubio-Cabetas, María J.
2016-01-01
The recent introduction of next generation sequencing (NGS) technologies represents a major revolution in providing new tools for identifying the genes and/or genomic intervals controlling important traits for selection in breeding programs. In perennial fruit trees with long generation times and large sizes of adult plants, the impact of these techniques is even more important. High-throughput DNA sequencing technologies have provided complete annotated sequences in many important tree species. Most of the high-throughput genotyping platforms described are being used for studies of genetic diversity and population structure. Dissection of complex traits became possible through the availability of genome sequences along with phenotypic variation data, which allow elucidation of the causative genetic differences that give rise to observed phenotypic variation. Association mapping facilitates the association between genetic markers and phenotype in unstructured and complex populations, identifying molecular markers for assisted selection and breeding. Also, genomic data provide in silico identification and characterization of genes and gene families related to important traits, enabling new tools for molecular marker assisted selection in tree breeding. Deep sequencing of transcriptomes is also a powerful tool for the analysis of precise expression levels of each gene in a sample. It consists of quantifying short cDNA reads, obtained by NGS technologies, in order to compare the entire transcriptomes between genotypes and environmental conditions. The miRNAs are non-coding short RNAs involved in the regulation of different physiological processes, which can be identified by high-throughput sequencing of RNA libraries obtained by reverse transcription of purified short RNAs, and by in silico comparison with known miRNAs from other species. Altogether, NGS techniques and their applications have increased the resources for plant breeding in tree species, closing the former gap of genetic tools between trees and annual species. PMID:27895664
Bjorklund, H.V.; Higman, K.H.; Kurath, G.
1996-01-01
The nucleotide sequences of the glycoprotein genes and all of the internal gene junctions of the fish pathogenic rhabdoviruses spring viremia of carp virus (SVCV) and hirame rhabdovirus (HIRRV) have been determined from cDNA clones generated from viral genomic RNA. The SVCV glycoprotein gene sequence is 1588 nucleotides (nt) long and encodes a 509 amino acid (aa) protein. The HIRRV glycoprotein gene sequence comprises 1612 nt, coding for a 508 aa protein. In sequence comparisons of 15 rhabdovirus glycoproteins, the SVCV glycoprotein gene showed the highest amino acid sequence identity (31.2–33.2%) with vesicular stomatitis New Jersey virus (VSNJV), Chandipura virus (CHPV) and vesicular stomatitis Indiana virus (VSIV). The HIRRV glycoprotein gene showed a very high amino acid sequence identity (74.3%) with the glycoprotein gene of another fish pathogenic rhabdovirus, infectious hematopoietic necrosis virus (IHNV), but no significant similarity with glycoproteins of VSIV or rabies virus (RABV). In phylogenetic analyses SVCV was grouped consistently with VSIV, VSNJV and CHPV in the Vesiculovirus genus of Rhabdoviridae. The fish rhabdoviruses HIRRV, IHNV and viral hemorrhagic septicemia virus (VHSV) showed close relationships with each other, but only very distant relationships with mammalian rhabdoviruses. The gene junctions are highly conserved between SVCV and VSIV, well conserved between IHNV and HIRRV, but not conserved between HIRRV/IHNV and RABV. Based on the combined results we suggest that the fish lyssa-type rhabdoviruses HIRRV, IHNV and VHSV may be grouped in their own genus within the family Rhabdoviridae. Aquarhabdovirus has been proposed for the name of this new genus.
Lei, Haiyan; Li, Tianwei; Hung, Guo-Chiuan; Li, Bingjie; Tsai, Shien; Lo, Shyh-Ching
2013-11-19
We conducted genomic sequencing to identify Epstein Barr Virus (EBV) genomes in 2 human peripheral blood B lymphocytes that underwent spontaneous immortalization promoted by mycoplasma infections in culture, using the high-throughput sequencing (HTS) Illumina MiSeq platform. The purpose of this study was to examine if rapid detection and characterization of a viral agent could be effectively achieved by HTS using a platform that has become readily available in general biology laboratories. Raw read sequences, averaging 175 bps in length, were mapped with DNA databases of human, bacteria, fungi and virus genomes using the CLC Genomics Workbench bioinformatics tool. Overall 37,757 out of 49,520,834 total reads in one lymphocyte line (# K4413-Mi) and 28,178 out of 45,335,960 reads in the other lymphocyte line (# K4123-Mi) were identified as EBV sequences. The two EBV genomes with estimated 35.22-fold and 31.06-fold sequence coverage respectively, designated K4413-Mi EBV and K4123-Mi EBV (GenBank accession number KC440852 and KC440851 respectively), are characteristic of type-1 EBV. Sequence comparison and phylogenetic analysis among K4413-Mi EBV, K4123-Mi EBV and the EBV genomes previously reported to GenBank as well as the NA12878 EBV genome assembled from database of the 1000 Genome Project showed that these 2 EBVs are most closely related to B95-8, an EBV previously isolated from a patient with infectious mononucleosis and WT-EBV. They are less similar to EBVs associated with nasopharyngeal carcinoma (NPC) from Hong Kong and China as well as the Akata strain of a case of Burkitt's lymphoma from Japan. They are most different from type 2 EBV found in Western African Burkitt's lymphoma.
Uda, Kouji; Ishida, Mikako; Matsui, Tohru; Suzuki, Tomohiko
2010-10-01
Arginine kinase (AK), which catalyzes the reversible transfer of phosphate from ATP to arginine to yield phosphoarginine and ADP, is widely distributed throughout the invertebrates. We determined the cDNA sequence of AK from the tardigrade (water bear) Macrobiotus occidentalis, cloned the sequence into pET30b plasmid, and expressed it in Escherichia coli as a 6x His-tag-fused protein. The cDNA is 1377 bp, has an open reading frame of 1080 bp, and has 5′- and 3′-untranslated regions of 116 and 297 bp, respectively. The open reading frame encodes a 359-amino acid protein containing the 12 residues considered necessary for substrate binding in Limulus AK. This is the first AK sequence from a tardigrade. From fragmented and non-annotated sequences available from DNA databases, we assembled 46 complete AK sequences: 26 from arthropods (including 19 from Insecta), 11 from nematodes, 4 from mollusks, 2 from cnidarians and 2 from onychophorans. No onychophoran sequences have been reported previously. The phylogenetic trees of 104 AKs indicated clearly that Macrobiotus AK (from the phylum Tardigrada) shows close affinity with Epiperipatus and Euperipatoides AKs (from the phylum Onychophora), and therefore forms a sister group with the arthropod AKs. Recombinant 6x His-tagged Macrobiotus AK was successfully expressed as a soluble protein, and the kinetic constants (K(m), K(d), V(max) and k(cat)) were determined for the forward reaction. Comparison of these kinetic constants with those of AKs from other sources (arthropods, mollusks and nematodes) indicated that Macrobiotus AK is unique in that it has the highest values for k(cat) and K(d)/K(m) (indicative of synergistic substrate binding) of all characterized AKs.
Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks
Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S. K.; Mammel, Mark K.; Tarr, Phillip I.; Eppinger, Mark
2016-01-01
Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and long-term evolution and can complement currently employed typing schemes for outbreak ex- and inclusion, diagnostics, surveillance, and forensic studies. PMID:27446025
Haney, Robert A.; Clarke, Thomas H.; Gadgil, Rujuta; Fitzpatrick, Ryan; Hayashi, Cheryl Y.; Ayoub, Nadia A.; Garb, Jessica E.
2016-01-01
Gene duplication and positive selection can be important determinants of the evolution of venom, a protein-rich secretion used in prey capture and defense. In a typical model of venom evolution, gene duplicates switch to venom gland expression and change function under the action of positive selection, which together with further duplication produces large gene families encoding diverse toxins. Although these processes have been demonstrated for individual toxin families, high-throughput multitissue sequencing of closely related venomous species can provide insights into evolutionary dynamics at the scale of the entire venom gland transcriptome. By assembling and analyzing multitissue transcriptomes from the Western black widow spider and two closely related species with distinct venom toxicity phenotypes, we do not find that gene duplication and duplicate retention is greater in gene families with venom gland biased expression in comparison with broadly expressed families. Positive selection has acted on some venom toxin families, but does not appear to be in excess for families with venom gland biased expression. Moreover, we find 309 distinct gene families that have single transcripts with venom gland biased expression, suggesting that the switching of genes to venom gland expression in numerous unrelated gene families has been a dominant mode of evolution. We also find ample variation in protein sequences of venom gland–specific transcripts, lineage-specific family sizes, and ortholog expression among species. This variation might contribute to the variable venom toxicity of these species. PMID:26733576
Approximation theory for LQG (Linear-Quadratic-Gaussian) optimal control of flexible structures
NASA Technical Reports Server (NTRS)
Gibson, J. S.; Adamian, A.
1988-01-01
An approximation theory is presented for the LQG (Linear-Quadratic-Gaussian) optimal control problem for flexible structures whose distributed models have bounded input and output operators. The main purpose of the theory is to guide the design of finite dimensional compensators that approximate closely the optimal compensator. The optimal LQG problem separates into an optimal linear-quadratic regulator problem and an optimal state estimation problem. The solution of the former problem lies in the solution to an infinite dimensional Riccati operator equation. The approximation scheme approximates the infinite dimensional LQG problem with a sequence of finite dimensional LQG problems defined for a sequence of finite dimensional, usually finite element or modal, approximations of the distributed model of the structure. Two Riccati matrix equations determine the solution to each approximating problem. The finite dimensional equations for numerical approximation are developed, including formulas for converting matrix control and estimator gains to their functional representation to allow comparison of gains based on different orders of approximation. Convergence of the approximating control and estimator gains and of the corresponding finite dimensional compensators is studied. Also, convergence and stability of the closed-loop systems produced with the finite dimensional compensators are discussed. The convergence theory is based on the convergence of the solutions of the finite dimensional Riccati equations to the solutions of the infinite dimensional Riccati equations. A numerical example with a flexible beam, a rotating rigid body, and a lumped mass is given.
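For orientation, the two Riccati matrix equations mentioned in this abstract can be written in their standard finite-dimensional LQG form. This is the textbook form, not copied from the report: the symbols $A_N, B_N, C_N$ (the $N$-th order approximation of the structure), the weights $Q_N, R$, and the noise covariances $\hat{Q}_N, \hat{R}$ are assumed notation.

```latex
% Regulator (control) Riccati equation and control gain
A_N^{T} P_N + P_N A_N - P_N B_N R^{-1} B_N^{T} P_N + Q_N = 0,
\qquad K_N = R^{-1} B_N^{T} P_N

% Estimator (filter) Riccati equation and estimator gain
A_N \hat{P}_N + \hat{P}_N A_N^{T} - \hat{P}_N C_N^{T} \hat{R}^{-1} C_N \hat{P}_N + \hat{Q}_N = 0,
\qquad F_N = \hat{P}_N C_N^{T} \hat{R}^{-1}

% Resulting finite-dimensional compensator
\dot{\hat{x}}_N = (A_N - B_N K_N - F_N C_N)\,\hat{x}_N + F_N\, y,
\qquad u = -K_N \hat{x}_N
```

In this notation, the convergence question studied in the abstract is whether $P_N$, $\hat{P}_N$, and hence $K_N$ and $F_N$, approach their infinite-dimensional counterparts as $N$ grows.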
Mekalanos, John J.
2014-01-01
Modern genomic and bioinformatic approaches have been applied to interrogate the V. cholerae genome, the role of genomic elements in cholera disease, and the origin, relatedness, and dissemination of epidemic strains. A universal attribute of choleragenic strains is a repertoire of pathogenicity islands and virulence genes, namely the CTX–ϕ prophage and Toxin Co-regulated Pilus (TCP), in addition to other virulence-associated genetic elements including those referred to as Seventh Pandemic Islands. During the last decade, the advent of Next Generation Sequencing (NGS) has provided highly resolved and often complete genomic sequences of epidemic isolates in addition to both clinical and environmental strains isolated from geographically unconnected regions. Genomic comparisons of these strains, as were completed during and following the Haitian outbreak in 2010, reveal that most epidemic strains appear closely related, regardless of region of origin. Non-O1 clinical or environmental strains may also possess some virulence islands, but phylogenetic analysis of the core genome suggests they are more diverse and distantly related than those isolated during epidemics. As in the case of Haiti, genomic studies that examine both the Vibrio core- and pan-genome in addition to Single Nucleotide Polymorphisms (SNPs) conclude that a number of epidemics are caused by strains that closely resemble those in Asia, and often appear to originate there and then spread globally. The accumulation of SNPs in the epidemic strains over time can then be applied to better understand the evolution of the V. cholerae genome as an etiological agent. PMID:24590676
Arslan, Naime; Timm, Tarmo; Rojo, Verónica; Vizcaíno, Antón; Schmelz, Rüdiger M
2018-02-21
Enchytraeus polatdemiri sp. nov. (Enchytraeidae, Oligochaeta) was discovered in the framework of a sampling campaign of the benthic invertebrate fauna of the hyperalkaline Lake Van in Eastern Anatolia, Turkey, the third-largest closed lake and the largest soda lake on Earth. It was the only oligochaete species found in all samples. DNA sequencing included a fragment of the mitochondrial cytochrome c oxidase subunit I (COI) gene, and a fragment of the nuclear histone 3 (H3) gene. For comparison, specimens from laboratory cultures of E. albidus Henle, 1837, a widespread and morphologically similar species, were sequenced as well. The new species differs from E. albidus in its comparatively small body size, 2 or 3 chaetae per bundle, saddle-shaped clitellum, absence of a copulatory field between the male pores, and vasa deferentia usually not extending beyond the clitellum. The individual gene trees of COI and H3, as well as the combined phylogenetic analysis of both genes, recovered Enchytraeus polatdemiri sp. nov. as a monophyletic group within the genus Enchytraeus, closely related to E. albidus, but with an average p-distance for COI of 14.5 %. E. polatdemiri sp. nov. may have evolved from a local population of Enchytraeus albidus, a species well-adapted to changing salinity conditions, or from a common ancestor into an extremophile species that dwells and reproduces in the profundal of a strongly alkaline soda lake.
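The p-distance quoted in this abstract is, by its usual definition, the proportion of aligned sites at which two sequences differ. The sketch below illustrates that definition only; it is not the authors' pipeline. The function name `p_distance`, the choice to skip gap positions, and the toy sequences are all assumptions made for illustration.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap are skipped (an assumed
    convention; other tools may treat gaps differently).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments (made up, not real COI data): 1 difference in 10 sites.
toy_distance = p_distance("ACGTACGTAC", "ACGTTCGTAC")
```

An average p-distance such as the 14.5 % reported above would be the mean of this quantity over all pairs of sequences drawn from the two species.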
DOE Office of Scientific and Technical Information (OSTI.GOV)
Han, Cliff S.; Xie, Gary; Challacombe, Jean F.
The sequencing and analysis of two close relatives of Bacillus anthracis are reported. AFLP analysis of over 300 isolates of B. cereus, B. thuringiensis and B. anthracis identified two isolates as being very closely related to B. anthracis. One, a B. cereus, BcE33L, was isolated from a zebra carcass in Namibia; the second, a B. thuringiensis, 97-27, was isolated from a necrotic human wound. The B. cereus appears to be the closest anthracis relative sequenced to date. A core genome of over 3,900 genes was compiled for the Bacillus cereus group, including B. anthracis. Comparative analysis of these two genomes with other members of the B. cereus group provides insight into the evolutionary relationships among these organisms. Evidence is presented that differential regulation modulates virulence, rather than simple acquisition of virulence factors. These genome sequences provide insight into the molecular mechanisms contributing to the host range and virulence of this group of organisms.
Challacombe, Jean Faust; Petersen, Jeannine M.; Gallegos-Graves, La Verne A.; ...
2016-11-23
Francisella tularensis is a highly virulent zoonotic pathogen that causes tularemia and, because of weaponization efforts in past world wars, is considered a tier 1 biothreat agent. Detection and surveillance of F. tularensis may be confounded by the presence of uncharacterized, closely related organisms. Through DNA-based diagnostics and environmental surveys, novel clinical and environmental Francisella isolates have been obtained in recent years. Here we present 7 new Francisella genomes and a comparison of their characteristics to each other and to 24 publicly available genomes as well as a comparative analysis of 16S rRNA and sdhA genes from over 90 Francisella strains. Delineation of new species in bacteria is challenging, especially when isolates having very close genomic characteristics exhibit different physiological features—for example, when some are virulent pathogens in humans and animals while others are nonpathogenic or are opportunistic pathogens. Species resolution within Francisella varies with analyses of single genes, multiple gene or protein sets, or whole-genome comparisons of nucleic acid and amino acid sequences. Analyses focusing on single genes (16S rRNA, sdhA), multiple gene sets (virulence genes, lipopolysaccharide [LPS] biosynthesis genes, pathogenicity island), and whole-genome comparisons (nucleotide and protein) gave congruent results, but with different levels of discrimination confidence. We designate four new species within the genus: Francisella opportunistica sp. nov. (MA06-7296), Francisella salina sp. nov. (TX07-7308), Francisella uliginis sp. nov. (TX07-7310), and Francisella frigiditurris sp. nov. (CA97-1460). Lastly, this study provides a robust comparative framework to discern species and virulence features of newly detected Francisella bacteria.
Shien, J-H; Wang, Y-S; Chen, C-H; Shieh, H K; Hu, C-C; Chang, P-C
2008-10-01
Live attenuated vaccines have been used for control of the disease caused by goose parvovirus (GPV), but the mechanism involved in attenuation of GPV remains elusive. This report presents the complete nucleotide sequences of two live attenuated strains of GPV (82-0321V and VG32/1) that were independently developed in Taiwan and Europe, together with the parental strain of 82-0321V and a field strain isolated in Taiwan in 2006. Sequence comparisons showed that 82-0321V and VG32/1 had multiple deletions and substitutions in the inverted terminal repeats region when compared with their parental strain or the field virus, but these changes did not affect the formation of the hairpin structure essential for viral replication. Moreover, 82-0321V and VG32/1 had five amino acid changes in the non-structural protein, but these changes were located at positions distant from known functional motifs in the non-structural protein. In contrast, 82-0321V had nine changes and VG32/1 had 11 changes in their capsid proteins (VP1), and the majority of these changes occurred at positions close to the putative receptor binding sites of VP1, as predicted using the structure of adeno-associated virus 2 as the model system. Taken together, the results suggest that changes in sequence near the receptor binding sites of VP1 might be responsible for attenuation of GPV. This is the first report of complete nucleotide sequences of GPV other than the virulent B strain, and suggests a possible mechanism for attenuation of GPV.
Poly A tail length analysis of in vitro transcribed mRNA by LC-MS.
Beverly, Michael; Hagen, Caitlin; Slack, Olga
2018-02-01
The 3'-polyadenosine (poly A) tail of in vitro transcribed (IVT) mRNA was studied using liquid chromatography coupled to mass spectrometry (LC-MS). Poly A tails were cleaved from the mRNA using ribonuclease T1 followed by isolation with dT magnetic beads. Extracted tails were then analyzed by LC-MS, which provided tail length information at single-nucleotide resolution. A 2100-nt mRNA with plasmid-encoded poly A tail lengths of either 27, 64, 100, or 117 nucleotides was used for these studies, as enzymatically added poly A tails showed significant length heterogeneity. The number of As observed in the tails closely matched Sanger sequencing results of the DNA template, and even minor plasmid populations with sequence variations were detected. When the plasmid sequence contained a discrete number of poly As in the tail, analysis revealed a distribution that included tails longer than the encoded tail lengths. These observations were consistent with transcriptional slippage of T7 RNAP taking place within a poly A sequence. The type of RNAP did not alter the observed tail distribution, and comparison of T3, T7, and SP6 showed all three RNAPs produced equivalent tail length distributions. The addition of a sequence at the 3' end of the poly A tail did, however, produce narrower tail length distributions, which supports a previously described model of slippage where the 3' end can be locked in place by having a G or C after the poly nucleotide region. Graphical abstract: Determination of mRNA poly A tail length using magnetic beads and LC-MS.
Cai, J; Collins, M D
1994-04-01
The 16S rRNA gene sequence of Melissococcus pluton, the causative agent of European foulbrood disease, was determined in order to investigate the phylogenetic relationships between this organism and other low-G + C-content gram-positive bacteria. A comparative sequence analysis revealed that M. pluton is a close phylogenetic relative of the genus Enterococcus.
Korber, B T; Osmanov, S; Esparza, J; Myers, G
1994-11-01
The World Health Organization Global Programme on AIDS (WHO/GPA) is conducting a large-scale collaborative study of human immunodeficiency virus type 1 (HIV-1) variation, based in four potential vaccine-trial site countries: Brazil, Rwanda, Thailand, and Uganda. Through the course of this study, it was crucial to keep track of certain attributes of the samples from which the viral nucleotide sequences were derived (e.g., country of origin and viral culture characterization), so that meaningful sequence comparisons could be made. Here we describe a system developed in the context of the WHO/GPA study that summarizes such critical attributes by representing them as standardized characters directly incorporated into sequence names. This nomenclature allows linkage of clinical, phenotypic, and geographic information with molecular data. We propose that other investigators involved in human immunodeficiency virus (HIV) nucleotide sequencing efforts adopt a similar standardized sequence nomenclature to facilitate cross-study sequence comparison. HIV sequence data are being generated at an ever-increasing rate; directly coupled to this increase is our deepening understanding of biological parameters that influence or result from sequence variability. A standardized sequence nomenclature that includes relevant biological information would enable researchers to better utilize the growing body of sequence data, and enhance their ability to interpret the biological implications of their own data through facilitating comparisons with previously published work.
Dai, Qi; Yang, Yanchun; Wang, Tianming
2008-10-15
Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all the ideas for sequence comparison try to use the information on the k-word distributions, a Markov model, or both. Motivated by adding k-word distributions to the Markov model directly, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences and phylogenetic analysis. This offers a systematic and quantitative experimental assessment of our measures. Moreover, we compared our results with those of alignment-based and alignment-free methods. We grouped our experiments into two sets. The first one, performed via ROC (receiver operating characteristic) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences from a database and discriminate functionally related regulatory sequences from unrelated sequences. The second one aims at assessing how well our statistical measures perform in phylogenetic analysis. The experimental assessment demonstrates that our similarity measures, which incorporate k-word distributions into the Markov model, are more efficient.
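As a rough illustration of the k-word idea that alignment-free measures build on, the sketch below counts overlapping k-words and compares two sequences by the squared Euclidean distance between their k-word frequency vectors. This is not the wre.k.r or S2.k.r statistic named in the abstract, whose definitions are not given here; the function names `kword_freqs` and `kword_distance` are hypothetical.

```python
from collections import Counter


def kword_freqs(seq, k):
    """Return relative frequencies of overlapping k-words in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    # An empty Counter (sequence shorter than k) yields an empty dict.
    return {word: n / total for word, n in counts.items()}


def kword_distance(a, b, k=2):
    """Squared Euclidean distance between k-word frequency vectors.

    A simple stand-in dissimilarity, assumed for illustration only.
    """
    fa, fb = kword_freqs(a, k), kword_freqs(b, k)
    words = set(fa) | set(fb)
    return sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words)
```

Identical sequences give a distance of zero, and sequences that share no k-words give the largest values, so the score can rank database hits without computing an alignment.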
Mosaic Graphs and Comparative Genomics in Phage Communities
Belcaid, Mahdi; Bergeron, Anne
2010-01-01
Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breadth and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications for phage metagenomics assembly, for the horizontal gene transfer paradigm, and more generally for the understanding of the composition and evolutionary dynamics of virus communities. PMID:20874413
NASA Astrophysics Data System (ADS)
Li, Lixia; Pan, Dong; Yu, Xuezhe; So, Hyok; Zhao, Jianhua
2017-10-01
Self-catalyzed GaAs nanowires (NWs) are grown on Si (111) substrates by molecular-beam epitaxy. The effect of different closing sequences of the Ga and As cell shutters on the morphology and structural phase of GaAs NWs is investigated. For the sequences of closing the Ga and As cell shutters simultaneously or closing the As cell shutter 1 min after closing the Ga cell shutter, the NWs grow vertically to the substrate surface. In contrast, when the As cell shutter is closed first, maintaining the Ga flux is found to be critical for the following growth of GaAs NWs, which can change the growth direction from [111] to $\langle 11\bar{1} \rangle$. The evolution of the morphology and the structural phase transition at the tips of these GaAs NWs confirm that the triple-phase-line shift mode is at work even for growth with different cell shutter closing sequences. Our work will provide new insights for better understanding the growth mechanism and for controlling the morphology and structure of GaAs NWs. Project supported partly by the MOST of China (No. 2015CB921503), the National Natural Science Foundation of China (Nos. 61504133, 61334006, 61404127), and Youth Innovation Promotion Association, CAS (No. 2017156).
Crossley, Beate M.; Mock, Richard E.; Callison, Scott A.; Hietala, Sharon K.
2012-01-01
In 2007, a novel coronavirus associated with an acute respiratory disease in alpacas (Alpaca Coronavirus, ACoV) was isolated. Full-length genomic sequencing of the ACoV demonstrated the genome to be consistent with other Alphacoronaviruses. A putative additional open-reading frame was identified between the nucleocapsid gene and 3'UTR. The ACoV was genetically most similar to the common human coronavirus (HCoV) 229E with 92.2% nucleotide identity over the entire genome. A comparison of spike gene sequences from ACoV and from HCoV-229E isolates recovered over a span of five decades showed the ACoV to be most similar to viruses isolated in the 1960’s to early 1980’s. The true origin of the ACoV is unknown; however, a common ancestor between the ACoV and HCoV-229E appears to have existed prior to the 1960’s, suggesting that virus transmission, either as a zoonosis or anthroponosis, has occurred between alpacas and humans. PMID:23235471
Yamashita, Teruo; Ito, Miyabi; Tsuzuki, Hideaki; Sakae, Kenji; Minagawa, Hiroko
2010-04-01
Of 58 enterovirus strains isolated from Japanese travellers returning from Asian countries, eight were non-serotypable with existing antisera. By sequencing a part of the VP1 region, six of these strains were typed as echovirus 9, enterovirus (EV)-73, EV-79 or EV-97. The nucleotide identity of the VP1 region of isolate T92-1499 to all enterovirus prototypes was <70 %. The VP1 sequence of isolate TN94-0349 was closely related to coxsackievirus (CV)-A9 (73.3 % nucleotide identity), but the virus could not be neutralized with a serum raised against the prototype CV-A9 strain. On the basis of complete molecular comparisons, T92-1499 and TN94-0349 were identified as EV-98 and EV-107, respectively, by the ICTV Picornavirus Study Group. Serum neutralization tests of Japanese individuals revealed a seroprevalence rate of 11 % for EV-73, and even lower seroprevalence rates, 1.0-3.8 %, were found for the other new enteroviruses, suggesting that prior circulation of these viruses in Japan was unlikely.
Campylobacter fetus subsp. testudinum subsp. nov., isolated from humans and reptiles.
Fitzgerald, Collette; Tu, Zheng Chao; Patrick, Mary; Stiles, Tracy; Lawson, Andy J; Santovenia, Monica; Gilbert, Maarten J; van Bergen, Marcel; Joyce, Kevin; Pruckler, Janet; Stroika, Steven; Duim, Birgitta; Miller, William G; Loparev, Vladimir; Sinnige, Jan C; Fields, Patricia I; Tauxe, Robert V; Blaser, Martin J; Wagenaar, Jaap A
2014-09-01
A polyphasic study was undertaken to determine the taxonomic position of 13 Campylobacter fetus-like strains from humans (n = 8) and reptiles (n = 5). The results of matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) MS and genomic data from sap analysis, 16S rRNA gene and hsp60 sequence comparison, pulsed-field gel electrophoresis, amplified fragment length polymorphism analysis, DNA-DNA hybridization and whole genome sequencing demonstrated that these strains are closely related to C. fetus but clearly differentiated from recognized subspecies of C. fetus. Therefore, this unique cluster of 13 strains represents a novel subspecies within the species C. fetus, for which the name Campylobacter fetus subsp. testudinum subsp. nov. is proposed, with strain 03-427(T) ( = ATCC BAA-2539(T) = LMG 27499(T)) as the type strain. Although this novel taxon could not be differentiated from C. fetus subsp. fetus and C. fetus subsp. venerealis using conventional phenotypic tests, MALDI-TOF MS revealed the presence of multiple phenotypic biomarkers which distinguish Campylobacter fetus subsp. testudinum subsp. nov. from recognized subspecies of C. fetus.
Expansion of inverted repeat does not decrease substitution rates in Pelargonium plastid genomes.
Weng, Mao-Lun; Ruhlman, Tracey A; Jansen, Robert K
2017-04-01
For species with minor inverted repeat (IR) boundary changes in the plastid genome (plastome), nucleotide substitution rates were previously shown to be lower in the IR than the single copy regions (SC). However, the impact of large-scale IR expansion/contraction on plastid nucleotide substitution rates among closely related species remains unclear. We included plastomes from 22 Pelargonium species, including eight newly sequenced genomes, and used both pairwise and model-based comparisons to investigate the impact of the IR on sequence evolution in plastids. Ten types of plastome organization with different inversions or IR boundary changes were identified in Pelargonium. Inclusion in the IR was not sufficient to explain the variation of nucleotide substitution rates. Instead, the rate heterogeneity in Pelargonium plastomes was a mixture of locus-specific, lineage-specific and IR-dependent effects. Our study of Pelargonium plastomes that vary in IR length and gene content demonstrates that the evolutionary consequences of retaining these repeats are more complicated than previously suggested. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Luo, Xiaoteng; Hsing, I-Ming
2009-10-01
Nucleic acid based analysis provides accurate differentiation among closely affiliated species and this species- and sequence-specific detection technique would be particularly useful for point-of-care (POC) testing for prevention and early detection of highly infectious and damaging diseases. Electrochemical (EC) detection and polymerase chain reaction (PCR) are two indispensable steps, in our view, in a nucleic acid based point-of-care testing device as the former, in comparison with the fluorescence counterpart, provides inherent advantages of detection sensitivity, device miniaturization and operation simplicity, and the latter offers an effective way to boost the amount of targets to a detectable quantity. In this mini-review, we will highlight some of the interesting investigations using the combined EC detection and PCR amplification approaches for end-point detection and real-time monitoring. The promise of current approaches and the direction for future investigations will be discussed. It would be our view that the synergistic effect of the combined EC-PCR steps in a portable device provides a promising detection technology platform that will be ready for point-of-care applications in the near future.
Luo, C; Zhang, F; Zhang, Q L; Guo, D Y; Luo, Z R
2013-01-09
We developed and characterized expressed sequence tag-simple sequence repeat (EST-SSR) and targeted region amplified polymorphism (TRAP) markers to examine genetic relationships in the persimmon genus Diospyros gene pool. In total, we characterized 14 EST-SSR primer pairs and 36 TRAP primer combinations, which were amplified across 20 germplasms of 4 species in the genus Diospyros. We used various genetic parameters, including effective multiplex ratio (EMR), diversity index (DI), and marker index (MI), to test the utility of these markers. TRAP markers gave higher EMR (24.85) but lower DI (0.33), compared to EST-SSRs (EMR = 3.65, DI = 0.34). TRAP gave a very high MI (8.08), which was about 8 times the MI of EST-SSR (1.25). These markers were utilized for phylogenetic inference of 20 genotypes of Diospyros kaki Thunb. and allied species, with the result that all kaki genotypes clustered closely and 3 allied species formed an independent group. These markers could be further exploited for large-scale genetic relationship inference.
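The marker index is commonly computed as the product of the effective multiplex ratio and the diversity index (MI = EMR × DI). The abstract does not state its formula, so that definition is an assumption here, and `marker_index` is a hypothetical helper. The sketch applies it to the mean values reported above; the products of the means come out near, but not equal to, the reported MI values (8.08 and 1.25), which may reflect averaging MI per primer rather than multiplying averages.

```python
def marker_index(emr, di):
    """Marker index as EMR x DI (assumed definition, not from the abstract)."""
    return emr * di


# Mean values reported in the abstract.
trap_mi = marker_index(24.85, 0.33)     # TRAP markers
est_ssr_mi = marker_index(3.65, 0.34)   # EST-SSR markers

# How many times larger the TRAP index is under this definition.
ratio = trap_mi / est_ssr_mi

print(f"TRAP MI ~ {trap_mi:.2f}, EST-SSR MI ~ {est_ssr_mi:.2f}, ratio ~ {ratio:.1f}")
```

Because TRAP and EST-SSR have nearly the same DI, almost all of the difference in MI comes from the much larger number of polymorphic bands per TRAP assay.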
Diehn, Till A.; Pommerrenig, Benjamin; Bernhardt, Nadine; Hartmann, Anja; Bienert, Gerd P.
2015-01-01
Aquaporins (AQPs) are essential channel proteins that regulate plant water homeostasis and the uptake and distribution of uncharged solutes such as metalloids, urea, ammonia, and carbon dioxide. Despite their importance as crop plants, little is known about AQP gene and protein function in cabbage (Brassica oleracea) and other Brassica species. The recent releases of the genome sequences of B. oleracea and Brassica rapa allow comparative genomic studies in these species to investigate the evolution and features of Brassica genes and proteins. In this study, we identified all AQP genes in B. oleracea by a genome-wide survey. In total, 67 genes of four plant AQP subfamilies were identified. Their full-length gene sequences and locations on chromosomes and scaffolds were manually curated. The identification of six additional full-length AQP sequences in the B. rapa genome added to the recently published AQP protein family of this species. A phylogenetic analysis of AQPs of Arabidopsis thaliana, B. oleracea, B. rapa allowed us to follow AQP evolution in closely related species and to systematically classify and (re-)name these isoforms. Thirty-three groups of AQP-orthologous genes were identified between B. oleracea and Arabidopsis and their expression was analyzed in different organs. The two selectivity filters, gene structure and coding sequences were highly conserved within each AQP subfamily while sequence variations in some introns and untranslated regions were frequent. These data suggest a similar substrate selectivity and function of Brassica AQPs compared to Arabidopsis orthologs. The comparative analyses of all AQP subfamilies in three Brassicaceae species give initial insights into AQP evolution in these taxa. Based on the genome-wide AQP identification in B. oleracea and the sequence analysis and reprocessing of Brassica AQP information, our dataset provides a sequence resource for further investigations of the physiological and molecular functions of Brassica crop AQPs. PMID:25904922
Dombrovsky, Aviv; Glanz, Eyal; Lachman, Oded; Sela, Noa; Doron-Faigenboim, Adi; Antignus, Yehezkel
2013-01-01
We determined the complete sequence and organization of the genome of a putative member of the genus Polerovirus tentatively named Pepper yellow leaf curl virus (PYLCV). PYLCV has a wider host range than Tobacco vein-distorting virus (TVDV) and has a close serological relationship with Cucurbit aphid-borne yellows virus (CABYV) (both poleroviruses). The extracted viral RNA was subjected to SOLiD next-generation sequence analysis and used as a template for reverse transcription synthesis, which was followed by PCR amplification. The ssRNA genome of PYLCV includes 6,028 nucleotides encoding six open reading frames (ORFs), which is typical of the genus Polerovirus. Comparisons of the deduced amino acid sequences of PYLCV ORFs 2-4 and ORF5 indicate high levels of similarity to ORFs 2-4 of TVDV (84-93%) and to ORF5 of CABYV (87%). Both PYLCV and Pepper vein yellowing virus (PeVYV) contain sequences that point to a common ancestral polerovirus. The recombination breakpoint, located at CABYV ORF3, which encodes the viral coat protein (CP), may explain the CABYV-like sequences found in the genomes of the pepper-infecting viruses PYLCV and PeVYV. Two additional regions unique to PYLCV (PY1 and PY2) were identified between nucleotides 4,962 and 5,061 (ORF 5) and between positions 5,866 and 6,028 in the 3' NCR. Sequence analysis of the pepper-infecting PeVYV revealed three unique regions (Pe1-Pe3) with no similarity to other members of the genus Polerovirus. Genomic analyses of PYLCV and PeVYV suggest that the speciation of these viruses occurred through putative recombination event(s) between poleroviruses co-infecting a common host(s), resulting in the emergence of PYLCV, a novel pathogen with a wider host range. PMID:23936244
Zhang, Yong; Hong, Mei; Sun, Qiang; Zhu, Shuangli; Tsewang; Li, Xiaolei; Yan, Dongmei; Wang, Dongyan; Xu, Wenbo
2014-04-01
Molecular methods, based on sequencing the region encoding the complete VP1 or P1 protein, have enabled the rapid identification of new enterovirus serotypes. In the present study, the complete genome of a newly discovered enterovirus serotype, strain Q0011/XZ/CHN/2000 (hereafter referred to as Q0011), was sequenced and analyzed. The virus, isolated from a stool sample from a patient with acute flaccid paralysis in the Tibet region of China in 2000, was characterized by amplicon sequencing and comparison to a GenBank database of enterovirus nucleotide sequences. The nucleotide sequence encoding the complete VP1 capsid protein is most closely related to the sequences of viruses within the species enterovirus B (EV-B), but is less than 72.1% identical to the homologous sequences of the recognized human enterovirus serotypes, with the greatest homology to EV-B101 and echovirus 32. Moreover, the deduced amino acid sequence of the complete VP1 region is less than 84.7% identical to those of the recognized serotypes, suggesting that the strain is a new serotype of enterovirus within EV-B. The virus was characterized as a new enterovirus type, named EV-B111, by the Picornaviridae Study Group of the International Committee on Taxonomy of Viruses. Low positive rate and titer of neutralizing antibody against EV-B111 were found in the Tibet region of China. Nearly 50% of children ≤5 years had no neutralizing antibody against EV-B111. So the extent of transmission and the exposure of the population to this new EV are very limited. This is the first identification of a new serotype of human enterovirus in China, and strain Q0011 was designated the prototype strain of EV-B111. Copyright © 2014 Elsevier B.V. All rights reserved.
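Typing by VP1 comparison rests on pairwise percent identity between a query and each reference sequence. A minimal sketch of that calculation follows, assuming the two sequences have already been aligned to equal length and that columns containing a gap are skipped; the abstract does not say how gaps were handled, and `percent_identity` is a hypothetical name.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over aligned, gap-free columns of two sequences.

    Assumes a and b come from the same alignment (equal length).
    Columns where either sequence has a gap are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

Applied to a query VP1 sequence against every prototype in a reference set, the maximum value identifies the closest known type; the same function works for nucleotide or deduced amino acid alignments.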
Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L
2011-06-02
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
Oono, Ryoko
2017-01-01
High-throughput sequencing technology has helped microbial community ecologists explore ecological and evolutionary patterns at unprecedented scales. The benefits of a large sample size still typically outweigh that of greater sequencing depths per sample for accurate estimations of ecological inferences. However, excluding or not sequencing rare taxa may mislead the answers to the questions ‘how and why are communities different?’ This study evaluates the confidence intervals of ecological inferences from high-throughput sequencing data of foliar fungal endophytes as case studies through a range of sampling efforts, sequencing depths, and taxonomic resolutions to understand how technical and analytical practices may affect our interpretations. Increasing sampling size reliably decreased confidence intervals across multiple community comparisons. However, the effects of sequencing depths on confidence intervals depended on how rare taxa influenced the dissimilarity estimates among communities and did not significantly decrease confidence intervals for all community comparisons. A comparison of simulated communities under random drift suggests that sequencing depths are important in estimating dissimilarities between microbial communities under neutral selective processes. Confidence interval analyses reveal important biases as well as biological trends in microbial community studies that otherwise may be ignored when communities are only compared for statistically significant differences. PMID:29253889
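One way to picture a confidence interval on a community dissimilarity is a percentile bootstrap over samples. The sketch below is only an illustration under stated assumptions: it uses Bray-Curtis dissimilarity and a simple percentile bootstrap, which may differ from the procedure used in the study, and all function names and the toy counts are hypothetical.

```python
import random


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(a) + sum(b)
    return num / den if den else 0.0


def mean_between_group_dissimilarity(group1, group2):
    """Average dissimilarity over all sample pairs drawn from the two groups."""
    values = [bray_curtis(s, t) for s in group1 for t in group2]
    return sum(values) / len(values)


def bootstrap_ci(group1, group2, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling samples with replacement within each group."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        r1 = [rng.choice(group1) for _ in group1]
        r2 = [rng.choice(group2) for _ in group2]
        stats.append(mean_between_group_dissimilarity(r1, r2))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical taxon counts: rows are samples, columns are taxa.
    site_a = [[10, 0, 5], [8, 1, 6], [9, 0, 4]]
    site_b = [[0, 9, 4], [1, 10, 3], [0, 8, 5]]
    print(bootstrap_ci(site_a, site_b, n_boot=500))
```

Adding samples to each group narrows the interval directly, whereas deeper sequencing only changes it to the extent that rare taxa shift the per-pair dissimilarities, which is consistent with the contrast the abstract draws.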
Impact of cultivation on characterisation of species composition of soil bacterial communities.
McCaig, A E.; Grayston, S J.; Prosser, J I.; Glover, L A.
2001-03-01
The species composition of culturable bacteria in Scottish grassland soils was investigated using a combination of Biolog and 16S rDNA analysis for characterisation of isolates. The inclusion of a molecular approach allowed direct comparison of sequences from culturable bacteria with sequences obtained during analysis of DNA extracted directly from the same soil samples. Bacterial strains were isolated on Pseudomonas isolation agar (PIA), a selective medium, and on tryptone soya agar (TSA), a general laboratory medium. In total, 12 and 21 morphologically different bacterial cultures were isolated on PIA and TSA, respectively. Biolog and sequencing placed PIA isolates in the same taxonomic groups, the majority of cultures belonging to the Pseudomonas (sensu stricto) group. However, analysis of 16S rDNA sequences proved more efficient than Biolog for characterising TSA isolates due to limitations of the Microlog database for identifying environmental bacteria. In general, 16S rDNA sequences from TSA isolates showed high similarities to cultured species represented in sequence databases, although TSA-8 showed only 92.5% similarity to the nearest relative, Bacillus insolitus. In general, there was very little overlap between the culturable and uncultured bacterial communities, although two sequences, PIA-2 and TSA-13, showed >99% similarity to soil clones. A cloning step was included prior to sequence analysis of two isolates, TSA-5 and TSA-14, and analysis of several clones confirmed that these cultures comprised at least four and three sequence types, respectively. All isolate clones were most closely related to uncultured bacteria, with clone TSA-5.1 showing 99.8% similarity to a sequence amplified directly from the same soil sample. Interestingly, one clone, TSA-5.4, clustered within a novel group comprising only uncultured sequences. 
This group, which is associated with the novel, deep-branching Acidobacterium capsulatum lineage, also included clones isolated during direct analysis of the same soil and from a wide range of other sample types studied elsewhere. The study demonstrates the value of fine-scale molecular analysis for identification of laboratory isolates and indicates the culturability of approximately 1% of the total population but under a restricted range of media and cultivation conditions.
Phylo-VISTA: Interactive visualization of multiple DNA sequence alignments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.
The power of multi-sequence comparison for biological discovery is well established. The need for new capabilities to visualize and compare cross-species alignment data is intensified by the growing number of genomic sequence datasets being generated for an ever-increasing number of organisms. To be efficient, these visualization algorithms must support the ability to accommodate consistently a wide range of evolutionary distances in a comparison framework based upon phylogenetic relationships. Results: We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing a similarity measure for multiple DNA sequences. The complexity of visual presentation is effectively organized using a framework based upon interspecies phylogenetic relationships. The phylogenetic organization supports rapid, user-guided interspecies comparison. To aid in navigation through large sequence datasets, Phylo-VISTA leverages concepts from VISTA that provide a user with the ability to select and view data at varying resolutions. The combination of multiresolution data visualization and analysis, combined with the phylogenetic framework for interspecies comparison, produces a highly flexible and powerful tool for visual data analysis of multiple sequence alignments. Availability: Phylo-VISTA is available at http://www-gsd.lbl.gov/phylovista. It requires an Internet browser with Java Plugin 1.4.2 and it is integrated into the global alignment program LAGAN at http://lagan.stanford.edu
Yang, Huaan; Tao, Ye; Zheng, Zequn; Li, Chengdao; Sweetingham, Mark W; Howieson, John G
2012-07-17
In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. 
Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. We demonstrated that more than 30 molecular markers linked to a target gene for an agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS-based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high-throughput multiplex format or low-cost, simple PCR-based markers desirable for large-scale marker implementation in plant breeding programs. The high-density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS-based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species. PMID:22805587
Sequence information signal processor
Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.
1999-01-01
An electronic circuit is used to compare two sequences, such as genetic sequences, to determine which alignment of the sequences produces the greatest similarity. The circuit includes a linear array of series-connected processors, each of which stores a single element from one of the sequences and compares that element with each successive element in the other sequence. For each comparison, the processor generates a scoring parameter that indicates which segment ending at those two elements produces the greatest degree of similarity between the sequences. The processor uses the scoring parameter to generate a similar scoring parameter for a comparison between the stored element and the next successive element from the other sequence. The processor also delivers the scoring parameter to the next processor in the array for use in generating a similar scoring parameter for another pair of elements. The electronic circuit determines which processor and alignment of the sequences produce the scoring parameter with the highest value.
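A software analogue of the comparison this record describes is a dynamic-programming local alignment score, in which each cell holds the best score of any aligned segment ending at that pair of elements. The sketch below is a standard Smith-Waterman-style scoring loop and not a model of the patented circuit; the match, mismatch, and gap values are assumptions chosen for illustration.

```python
def best_local_score(seq_a, seq_b, match=2, mismatch=-1, gap=-1):
    """Return (best score, (i, j)) where (i, j) are 1-based end positions of the best local alignment.

    Only two rows are kept in memory: each cell depends on its upper, left,
    and upper-left neighbours, loosely mirroring how a score is handed from
    one comparison to the next.
    """
    best = 0
    best_end = (0, 0)
    prev = [0] * (len(seq_b) + 1)
    for i, a in enumerate(seq_a, start=1):
        curr = [0]
        for j, b in enumerate(seq_b, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            if score > best:
                best, best_end = score, (i, j)
        prev = curr
    return best, best_end


if __name__ == "__main__":
    print(best_local_score("TTACGTT", "GGACGGG"))
```

Clamping each cell at zero is what makes the alignment local: a poorly matching prefix is discarded rather than dragged into the score of a later, well-matching segment.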
Use of DNA barcodes to identify flowering plants
Kress, W. John; Wurdack, Kenneth J.; Zimmer, Elizabeth A.; Weigt, Lee A.; Janzen, Daniel H.
2005-01-01
Methods for identifying species by using short orthologous DNA sequences, known as “DNA barcodes,” have been proposed and initiated to facilitate biodiversity studies, identify juveniles, associate sexes, and enhance forensic analyses. The cytochrome c oxidase 1 sequence, which has been found to be widely applicable in animal barcoding, is not appropriate for most species of plants because of a much slower rate of cytochrome c oxidase 1 gene evolution in higher plants than in animals. We therefore propose the nuclear internal transcribed spacer region and the plastid trnH-psbA intergenic spacer as potentially usable DNA regions for applying barcoding to flowering plants. The internal transcribed spacer is the most commonly sequenced locus used in plant phylogenetic investigations at the species level and shows high levels of interspecific divergence. The trnH-psbA spacer, although short (≈450-bp), is the most variable plastid region in angiosperms and is easily amplified across a broad range of land plants. Comparison of the total plastid genomes of tobacco and deadly nightshade enhanced with trials on widely divergent angiosperm taxa, including closely related species in seven plant families and a group of species sampled from a local flora encompassing 50 plant families (for a total of 99 species, 80 genera, and 53 families), suggest that the sequences in this pair of loci have the potential to discriminate among the largest number of plant species for barcoding purposes. PMID:15928076
Honda, Takashi; Morimoto, Daichi; Sako, Yoshihiko; Yoshida, Takashi
2018-05-17
Previously, we showed that DNA replication and cell division in the toxic cyanobacterium Microcystis aeruginosa are coordinated by transcriptional regulation of the cell division gene ftsZ and that an unknown protein specifically bound upstream of ftsZ (BpFz; DNA-binding protein to an upstream site of ftsZ) during successful DNA replication and cell division. Here, we purified BpFz from M. aeruginosa strain NIES-298 using DNA-affinity chromatography and gel-slicing combined with gel electrophoresis mobility shift assay (EMSA). The N-terminal amino acid sequence of BpFz was identified as TNLESLTQ, which was identical to that of transcription repressor LexA from NIES-843. EMSA analysis using mutant probes showed that the sequence GTACTAN3GTGTTC was important in LexA binding. Comparison of the upstream regions of lexA in the genomes of closely related cyanobacteria suggested that the sequence TASTRNNNNTGTWC could be a putative LexA recognition sequence (LexA box). Searches for TASTRNNNNTGTWC as a transcriptional regulatory site (TRS) in the genome of M. aeruginosa NIES-843 showed that it was present in genes involved in cell division, photosynthesis, and extracellular polysaccharide biosynthesis. Considering that BpFz binds to the TRS of ftsZ during normal cell division, LexA may function as a transcriptional activator of genes related to cell reproduction in M. aeruginosa, including ftsZ. This may be an example of informality in the control of bacterial cell division.
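A search for a degenerate motif such as TASTRNNNNTGTWC can be sketched by expanding the IUPAC ambiguity codes into a regular expression. This is a minimal illustration, not the search procedure used by the authors: it scans the forward strand only, the function names are hypothetical, and the test sequence is made up so that it contains one instance of the motif.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[code] for code in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif="TASTRNNNNTGTWC"):
    """Return (0-based start, matched text) for every forward-strand occurrence."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical upstream fragment containing one instance of the motif.
    print(find_motif("GTACTAAAAGTGTTC"))
```

A genome-scale scan would also need to search the reverse complement and restrict hits to upstream regions, both of which are left out here for brevity.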
High levels of variation in Salix lignocellulose genes revealed using poplar genomic resources
2013-01-01
Background Little is known about the levels of variation in lignin or other wood related genes in Salix, a genus that is being increasingly used for biomass and biofuel production. The lignin biosynthesis pathway is well characterized in a number of species, including the model tree Populus. We aimed to transfer the genomic resources already available in Populus to its sister genus Salix to assess levels of variation within genes involved in wood formation. Results Amplification trials for 27 gene regions were undertaken in 40 Salix taxa. Twelve of these regions were sequenced. Alignment searches of the resulting sequences against reference databases, combined with phylogenetic analyses, showed the close similarity of these Salix sequences to Populus, confirming homology of the primer regions and indicating a high level of conservation within the wood formation genes. However, all sequences were found to vary considerably among Salix species, mainly as SNPs with a smaller number of insertions-deletions. Between 25 and 176 SNPs per kbp per gene region (in predicted exons) were discovered within Salix. Conclusions The variation found is sizeable but not unexpected as it is based on interspecific and not intraspecific comparison; it is comparable to interspecific variation in Populus. The characterisation of genetic variation is a key process in pre-breeding and for the conservation and exploitation of genetic resources in Salix. This study characterises the variation in several lignocellulose gene markers for such purposes. PMID:23924375
Lentes, K U; Mathieu, E; Bischoff, R; Rasmussen, U B; Pavirani, A
1993-01-01
Current methods for comparative analyses of protein sequences are 1D-alignments of amino acid sequences based on the maximization of amino acid identity (homology) and the prediction of secondary structure elements. This method has a major drawback once the amino acid identity drops below 20-25%, since maximization of a homology score does not take into account any structural information. A new technique called Hydrophobic Cluster Analysis (HCA) has been developed by Lemesle-Varloot et al. (Biochimie 72, 555-574, 1990). This consists of comparing several sequences simultaneously and combining homology detection with secondary structure analysis. HCA is primarily based on the detection and comparison of structural segments constituting the hydrophobic core of globular protein domains, with or without transmembrane domains. We have applied HCA to the analysis of different families of G-protein coupled receptors, such as catecholamine receptors as well as peptide hormone receptors. Utilizing HCA, the thrombin receptor, a new and as yet unique member of the family of G-protein coupled receptors, can be clearly classified as being closely related to the family of neuropeptide receptors rather than to the catecholamine receptors, for which the shape of the hydrophobic clusters and the length of their third cytoplasmic loop are very different. Furthermore, the potential of HCA to predict relationships between new putative and already characterized members of this family of receptors will be presented.
Dou, Rong-kun; Bi, Zhen-fei; Bai, Rui-xue; Ren, Yao-yao; Tan, Rui; Song, Liang-ke; Li, Di-qiang; Mao, Can-quan
2015-04-01
This study aimed to ensure the quality and safety of medicinal plants by using ITS2 DNA barcode technology to identify Corydalis boweri, Meconopsis horridula and their closely related species. DNA was extracted from 13 herb samples, including C. boweri and M. horridula, from Lhasa, Tibet, and the ITS region was amplified by PCR and sequenced. The 5.8S and 28S regions were removed from the 71 ITS2 sequences, which comprised both assembled and web-downloaded sequences. Multiple sequence alignment was completed, the intraspecific and interspecific genetic distances were calculated with MEGA 5.0, and neighbor-joining phylogenetic trees were constructed. We also predicted the ITS2 secondary structure of C. boweri, M. horridula and their closely related species. The results showed that ITS2 as a DNA barcode was able to identify C. boweri, M. horridula and their closely related species effectively. The established ITS2 barcode-based method provides a regular and safe detection technology for identifying C. boweri, M. horridula and their closely related species, adulterants and counterfeits, in order to ensure their quality control, safe medication, and reasonable development and utilization.
Reaction schemes visualized in network form: the syntheses of strychnine as an example.
Proudfoot, John R
2013-05-24
Representation of synthesis sequences in a network form provides an effective method for the comparison of multiple reaction schemes and an opportunity to emphasize features such as reaction scale that are often relegated to experimental sections. An example of data formatting that allows construction of network maps in Cytoscape is presented, along with maps that illustrate the comparison of multiple reaction sequences, comparison of scaffold changes within sequences, and consolidation to highlight common key intermediates used across sequences. The 17 different synthetic routes reported for strychnine are used as an example basis set. The reaction maps presented required a significant data extraction and curation, and a standardized tabular format for reporting reaction information, if applied in a consistent way, could allow the automated combination of reaction information across different sources.
PIPI: PTM-Invariant Peptide Identification Using Coding Method.
Yu, Fengchao; Li, Ning; Yu, Weichuan
2016-12-02
In computational proteomics, the identification of peptides with an unlimited number of post-translational modification (PTM) types is a challenging task. The computational cost associated with database search increases exponentially with respect to the number of modified amino acids and linearly with respect to the number of potential PTM types at each amino acid. The problem becomes intractable very quickly if we want to enumerate all possible PTM patterns. To address this issue, one group of methods named restricted tools (including Mascot, Comet, and MS-GF+) only allow a small number of PTM types in database search process. Alternatively, the other group of methods named unrestricted tools (including MS-Alignment, ProteinProspector, and MODa) avoids enumerating PTM patterns with an alignment-based approach to localizing and characterizing modified amino acids. However, because of the large search space and PTM localization issue, the sensitivity of these unrestricted tools is low. This paper proposes a novel method named PIPI to achieve PTM-invariant peptide identification. PIPI belongs to the category of unrestricted tools. It first codes peptide sequences into Boolean vectors and codes experimental spectra into real-valued vectors. For each coded spectrum, it then searches the coded sequence database to find the top scored peptide sequences as candidates. After that, PIPI uses dynamic programming to localize and characterize modified amino acids in each candidate. We used simulation experiments and real data experiments to evaluate the performance in comparison with restricted tools (i.e., Mascot, Comet, and MS-GF+) and unrestricted tools (i.e., Mascot with error tolerant search, MS-Alignment, ProteinProspector, and MODa). Comparison with restricted tools shows that PIPI has a close sensitivity and running speed. Comparison with unrestricted tools shows that PIPI has the highest sensitivity except for Mascot with error tolerant search and ProteinProspector. 
These two tools simplify the task by only considering up to one modified amino acid in each peptide, which results in a higher sensitivity but has difficulty in dealing with multiple modified amino acids. The simulation experiments also show that PIPI has the lowest false discovery proportion, the highest PTM characterization accuracy, and the shortest running time among the unrestricted tools.
Liao, Weinan; Ren, Jie; Wang, Kun; Wang, Shun; Zeng, Feng; Wang, Ying; Sun, Fengzhu
2016-11-23
The comparison between microbial sequencing data is critical to understand the dynamics of microbial communities. The alignment-based tools analyzing metagenomic datasets require reference sequences and read alignments. The available alignment-free dissimilarity approaches model the background sequences with Fixed Order Markov Chain (FOMC) yielding promising results for the comparison of microbial communities. However, in FOMC, the number of parameters grows exponentially with the increase of the order of Markov Chain (MC). Under a fixed high order of MC, the parameters might not be accurately estimated owing to the limitation of sequencing depth. In our study, we investigate an alternative to FOMC to model background sequences with the data-driven Variable Length Markov Chain (VLMC) in metatranscriptomic data. The VLMC originally designed for long sequences was extended to apply to high-throughput sequencing reads and the strategies to estimate the corresponding parameters were developed. The flexible number of parameters in VLMC avoids estimating the vast number of parameters of high-order MC under limited sequencing depth. Different from the manual selection in FOMC, VLMC determines the MC order adaptively. Several beta diversity measures based on VLMC were applied to compare the bacterial RNA-Seq and metatranscriptomic datasets. Experiments show that VLMC outperforms FOMC to model the background sequences in transcriptomic and metatranscriptomic samples. A software pipeline is available at https://d2vlmc.codeplex.com.
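The exponential growth in parameters mentioned in this abstract is easy to make concrete: an order-k Markov chain over the four nucleotides has 4^k contexts, each with three free transition probabilities. The sketch below counts those parameters and estimates fixed-order transition probabilities from reads by simple counting. It illustrates the fixed-order model only, not the VLMC method of the paper, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def fomc_free_parameters(order, alphabet_size=4):
    """Free parameters of a fixed-order Markov chain: one distribution per context."""
    return alphabet_size ** order * (alphabet_size - 1)


def estimate_transitions(reads, order):
    """Maximum-likelihood transition probabilities P(next base | previous `order` bases)."""
    counts = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - order):
            context = read[i:i + order]
            counts[context][read[i + order]] += 1
    return {
        context: {base: n / sum(nexts.values()) for base, n in nexts.items()}
        for context, nexts in counts.items()
    }


if __name__ == "__main__":
    for k in (1, 3, 5, 10):
        print(k, fomc_free_parameters(k))
    # Toy reads (hypothetical data) with an order-2 model.
    print(estimate_transitions(["ACGT", "ACGA"], order=2))
```

With short reads and limited depth, most high-order contexts are observed only a handful of times, which is the estimation problem that motivates letting the context length vary with the data.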
NASA Astrophysics Data System (ADS)
Sessa, Jocelyn Anne; Larina, Ekaterina; Knoll, Katja; Garb, Matthew; Cochran, J. Kirk; Huber, Brian T.; MacLeod, Kenneth G.; Landman, Neil H.
2015-12-01
Ammonites are among the best-known fossils of the Phanerozoic, yet their habitat is poorly understood. Three common ammonite families (Baculitidae, Scaphitidae, and Sphenodiscidae) co-occur with well-preserved planktonic and benthic organisms at the type locality of the upper Maastrichtian Owl Creek Formation, offering an excellent opportunity to constrain their depth habitats through isotopic comparisons among taxa. Based on sedimentary evidence and the micro- and macrofauna at this site, we infer that the 9-m-thick sequence was deposited at a paleodepth of 70-150 m. Taxa present throughout the sequence include a diverse assemblage of ammonites, bivalves, and gastropods, abundant benthic foraminifera, and rare planktonic foraminifera. No stratigraphic trends are observed in the isotopic data of any taxon, and thus all of the data from each taxon are considered as replicates. Oxygen isotope-based temperature estimates from the baculites and scaphites overlap with those of the benthos and are distinct from those of the plankton. In contrast, sphenodiscid temperature estimates span a range that includes estimates of the planktonic foraminifera and of the warmer half of the benthic values. These results suggest baculites and scaphites lived close to the seafloor, whereas sphenodiscids sometimes inhabited the upper water column and/or lived closer to shore. In fact, the rarity and poorer preservation of the sphenodiscids relative to the baculites and scaphites suggests that the sphenodiscid shells may have only reached the Owl Creek locality by drifting seaward after death.
Chai, Huan-Na; Du, Yu-Zhou
2012-01-01
The complete 15,413-bp mitochondrial genome (mitogenome) of Sesamia inferens (Walker) (Lepidoptera: Noctuidae) was sequenced and compared with those of four other noctuid moths. All of the mitogenomes analyzed displayed similar characteristics with respect to gene content, genome organization, nucleotide comparison, and codon usage. Twelve protein-coding genes (PCGs) utilized the standard ATN, but the cox1 gene used CGA as the initiation codon; cox1, cox2, and nad4 genes had the truncated termination codon T in the S. inferens mitogenome. All of the tRNA genes had typical cloverleaf secondary structures except for trnS1(AGN), in which the dihydrouridine (DHU) arm did not form a stable stem-loop structure. Both the secondary structures of rrnL and rrnS genes inferred from the S. inferens mitogenome closely resembled those of other noctuid moths. In the A+T-rich region, the conserved motif “ATAGA” followed by a long T-stretch was observed in all noctuid moths, but other specific tandem-repeat elements were more variable. Additionally, the S. inferens mitogenome contained a potential stem-loop structure, a duplicated 17-bp repeat element, a decuplicated segment, and a microsatellite “(AT)7”, without a poly-A element upstream of the trnM in the A+T-rich region. Finally, the phylogenetic relationships were reconstructed based on amino acid sequences of the 13 mitochondrial PCGs, which support the traditional morphologically based view of relationships within the Noctuidae. PMID:22949858
Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing
USDA-ARS's Scientific Manuscript database
Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitat...
Ghouila, Amel; Florent, Isabelle; Guerfali, Fatma Zahra; Terrapon, Nicolas; Laouini, Dhafer; Yahia, Sadok Ben; Gascuel, Olivier; Bréhélin, Laurent
2014-01-01
Identification of protein domains is a key step for understanding protein function. Hidden Markov Models (HMMs) have proved to be a powerful tool for this task. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in sequenced organisms. This is done via sequence/HMM comparisons. However, this approach may lack sensitivity when searching for domains in divergent species. Recently, methods for HMM/HMM comparisons have been proposed and proved to be more sensitive than sequence/HMM approaches in certain cases. However, these approaches are usually not used for protein domain discovery at a genome scale, and the benefit that could be expected from their utilization for this problem has not been investigated. Using proteins of P. falciparum and L. major as examples, we investigate the extent to which HMM/HMM comparisons can identify new domain occurrences not already identified by sequence/HMM approaches. We show that although HMM/HMM comparisons are much more sensitive than sequence/HMM comparisons, they are not sufficiently accurate to be used as a standalone complement of sequence/HMM approaches at the genome scale. Hence, we propose to use domain co-occurrence — the general domain tendency to preferentially appear along with some favorite domains in the proteins — to improve the accuracy of the approach. We show that the combination of HMM/HMM comparisons and co-occurrence domain detection boosts protein annotations. At an estimated False Discovery Rate of 5%, it revealed 901 and 1098 new domains in Plasmodium and Leishmania proteins, respectively. Manual inspection of part of these predictions shows that it contains several domain families that were missing in the two organisms. All new domain occurrences have been integrated in the EuPathDomains database, along with the GO annotations that can be deduced. PMID:24901648
Mind the gap; seven reasons to close fragmented genome assemblies
USDA-ARS's Scientific Manuscript database
Like other domains of life, research into the biology of filamentous microbes has greatly benefited from the advent of whole-genome sequencing. Next-generation sequencing (NGS) technologies have revolutionized sequencing, making genomic sciences accessible to many academic laboratories including tho...
Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G
2012-09-01
Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB-based alignments are used to obtain a three-dimensional fit of the structures, followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than that of MULTIPROT, MUSTANG and the alignments in HOMSTRAD in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods.
Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A.; Larsen, Martin Jakob
2016-01-01
Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths. PMID:27002637
A 3D sequence-independent representation of the protein data bank.
Fischer, D; Tsai, C J; Nussinov, R; Wolfson, H
1995-10-01
Here we address the following questions. How many structurally different entries are there in the Protein Data Bank (PDB)? How do the proteins populate the structural universe? To investigate these questions a structurally non-redundant set of representative entries was selected from the PDB. Construction of such a dataset is not trivial: (i) the considerable size of the PDB requires a large number of comparisons (there were more than 3250 structures of protein chains available in May 1994); (ii) the PDB is highly redundant, containing many structurally similar entries, not necessarily with significant sequence homology, and (iii) there is no clear-cut definition of structural similarity. The latter depends on the criteria and methods used. Here, we analyze structural similarity ignoring protein topology. To date, representative sets have been selected either by hand, by sequence comparison techniques which ignore the three-dimensional (3D) structures of the proteins or by using sequence comparisons followed by linear structural comparison (i.e. the topology, or the sequential order of the chains, is enforced in the structural comparison). Here we describe a 3D sequence-independent automated and efficient method to obtain a representative set of protein molecules from the PDB which contains all unique structures and which is structurally non-redundant. The method has two novel features. The first is the use of strictly structural criteria in the selection process without taking into account the sequence information. To this end we employ a fast structural comparison algorithm which requires on average approximately 2 s per pairwise comparison on a workstation. The second novel feature is the iterative application of a heuristic clustering algorithm that greatly reduces the number of comparisons required. We obtain a representative set of 220 chains with resolution better than 3.0 Å, or 268 chains including lower resolution entries, NMR entries and models.
The resulting set can serve as a basis for extensive structural classification and studies of 3D recurring motifs and of sequence-structure relationships. The clustering algorithm succeeds in classifying into the same structural family chains with no significant sequence homology, e.g. all the globins in one single group, all the trypsin-like serine proteases in another or all the immunoglobulin-like folds into a third. In addition, unexpected structural similarities of interest have been automatically detected between pairs of chains. A cluster analysis of the representative structures demonstrates the way the "structural universe" is populated.
Host-specificity among abundant and rare taxa in the sponge microbiome.
Reveillaud, Julie; Maignien, Loïs; Murat Eren, A; Huber, Julie A; Apprill, Amy; Sogin, Mitchell L; Vanreusel, Ann
2014-06-01
Microbial communities have a key role in the physiology of the sponge host, and it is therefore essential to understand the stability and specificity of sponge-symbiont associations. Host-specific bacterial associations spanning large geographic distance are widely acknowledged in sponges. However, the full spectrum of specificity remains unclear. In particular, it is not known whether closely related sponges host similar or very different microbiota over wide bathymetric and geographic gradients, and whether specific associations extend to the rare members of the sponge microbiome. Using the ultra-deep Illumina sequencing technology, we conducted a comparison of sponge bacterial communities in seven closely related Hexadella species with a well-resolved host phylogeny, as well as of a distantly related sponge Mycale. These samples spanned unprecedentedly large bathymetric (15-960 m) gradients and varying European locations. In addition, this study included a bacterial community analysis of the local background seawater for both Mycale and the widespread deep-sea taxa Hexadella cf. dedritifera. We observed a striking diversity of microbes associated with the sponges, spanning 47 bacterial phyla. The data did not reveal any Hexadella microbiota co-speciation pattern, but confirmed sponge-specific and species-specific host-bacteria associations, even within extremely low-abundance taxa. Oligotyping analysis also revealed differential enrichment preferences of closely related Nitrospira members in closely related sponge species. Overall, these results demonstrate highly diverse, remarkably specific and stable sponge-bacteria associations that extend to members of the rare biosphere at a very fine phylogenetic scale, over significant geographic and bathymetric gradients.
Sequence comparison alignment-free approach based on suffix tree and L-words frequency.
Soares, Inês; Goios, Ana; Amorim, António
2012-01-01
The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts with the computation of a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words of a preset length L (L-words) in each sequence is rapidly calculated. Based on the L-words frequency profile of each sequence, a pairwise standard Euclidean distance is then computed, producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures with the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in the Python language and is freely available on the web.
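The word-counting and distance steps of this abstract can be illustrated with a minimal Python sketch. It counts L-words with a plain sliding window instead of the generalized suffix tree the authors describe, so it only illustrates the profile and distance computation, not their implementation. The function names, the toy sequences, the choice L = 3 and the normalisation of counts to relative frequencies are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def lword_profile(seq, L):
    """Relative frequency of every overlapping word of length L in seq."""
    counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than L: empty profile
        return {}
    return {word: n / total for word, n in counts.items()}


def euclidean(p, q):
    """Euclidean distance between two sparse frequency profiles."""
    words = set(p) | set(q)
    return sqrt(sum((p.get(w, 0.0) - q.get(w, 0.0)) ** 2 for w in words))


def distance_matrix(seqs, L):
    """Symmetric matrix of pairwise distances, keyed by sequence name."""
    profiles = {name: lword_profile(s, L) for name, s in seqs.items()}
    dist = {a: {b: 0.0 for b in seqs} for a in seqs}
    for a, b in combinations(seqs, 2):
        dist[a][b] = dist[b][a] = euclidean(profiles[a], profiles[b])
    return dist


# Toy sequences, purely illustrative (not data from the study).
toy = {"s1": "ACGTACGTAC", "s2": "ACGTACGTAA", "s3": "TTTTGGGGCC"}
matrix = distance_matrix(toy, L=3)
```

In this toy input, s1 and s2 share most of their 3-words and end up closer to each other than either is to s3. A matrix of this shape is what a neighbor joining or multidimensional scaling routine would take as input.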
Huang, Yao-Ting; Chen, Jia-Min; Ho, Bing-Ching; Wu, Zong-Yen; Kuo, Rita C; Liu, Po-Yu
2018-01-01
Stenotrophomonas acidaminiphila is an aerobic, glucose non-fermentative, Gram-negative bacterium that has been isolated from various environmental sources, particularly aquatic ecosystems. Although resistance to multiple antimicrobial agents has been reported in S. acidaminiphila, the mechanisms are largely unknown. Here, for the first time, we report the complete genome and antimicrobial resistome analysis of a clinical isolate, S. acidaminiphila SUNEO, which is resistant to sulfamethoxazole. Comparative analysis among closely related strains identified common and strain-specific genes. In particular, comparison with a sulfamethoxazole-sensitive strain identified a mutation within the sulfonamide-binding site of folP in SUNEO, which may reduce the binding affinity of sulfamethoxazole. Selection pressure analysis indicated folP in SUNEO is under purifying selection, which may be owing to long-term administration of sulfonamide against Stenotrophomonas.
Reclassification of Bacillus marismortui as Salibacillus marismortui comb. nov.
Arahal, D R; Márquez, M C; Volcani, B E; Schleifer, K H; Ventosa, A
2000-07-01
Recently, the features of a group of strains isolated from Dead Sea enrichments obtained in 1936 by one of us (B. E. Volcani) were described. They were gram-positive, moderately halophilic, spore-forming rods, and were placed in a new species, Bacillus marismortui. At the same time, the new genus Salibacillus was proposed for the halophilic species Bacillus salexigens. B. marismortui and Salibacillus salexigens have similar phenotypic characteristics and the same peptidoglycan type. Phylogenetic analysis based on 16S rRNA sequence comparisons showed that they are sufficiently closely related (96.6% similarity) as to warrant placement in the same genus. However, DNA-DNA hybridization experiments showed that they constitute two separate species (41% DNA similarity). Therefore the reclassification of Bacillus marismortui as Salibacillus marismortui comb. nov. is proposed.
Jiang, P; Stone, S; Wagner, R; Wang, S; Dayananth, P; Kozak, C A; Wold, B; Kamb, A
1995-12-01
Cyclin-dependent kinase inhibitors are a growing family of molecules that regulate important transitions in the cell cycle. At least one of these molecules, p16, has been implicated in human tumorigenesis while its close homolog, p15, is induced by cell contact and transforming growth factor-beta (TGF-beta). To investigate the evolutionary and functional features of p15 and p16, we have isolated mouse (Mus musculus) homologs of each gene. Comparative analysis of these sequences provides evidence that the genes have similar functions in mouse and human. In addition, the comparison suggests that a gene conversion event is part of the evolution of the human p15 and p16 genes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, Steven D; Nagaraju, Shilpa; Utturkar, Sagar M
Background Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published. Results A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage and nine copies of the rRNA gene operons. It has a low G + C content of 31.1%. Illumina, 454 and Illumina/454 hybrid assemblies were generated and then compared to the draft and PacBio assemblies using summary statistics, the CGAL, QUAST and REAPR bioinformatics tools and comparative genomic approaches. Assemblies based upon shorter read DNA technologies were confounded by the large number of repeats and their size, which in the case of the rRNA gene operons were ~5 kb. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) systems among biotechnologically relevant Clostridia were classified and related to plasmid content and prophages. Potential associations between plasmid content and CRISPR systems may have implications for historical industrial scale Acetone-Butanol-Ethanol (ABE) fermentation failures and future large scale bacterial fermentations. While C. autoethanogenum contains an active CRISPR system, no such system is present in the closely related Clostridium ljungdahlii DSM 13528. A common prophage inserted into the Arg-tRNA shared between the strains suggests a common ancestor. However, C. ljungdahlii contains several additional putative prophages and it has more than double the amount of prophage DNA compared to C. autoethanogenum.
Other differences include important metabolic genes for central metabolism (such as an additional hydrogenase and the absence of a phosphoenolpyruvate synthase) and substrate utilization pathways (mannose and aromatics utilization) that might explain phenotypic differences between C. autoethanogenum and C. ljungdahlii. Conclusions Single molecule sequencing will be increasingly used to produce finished microbial genomes. The complete genome will facilitate comparative genomics and functional genomics and support future comparisons between Clostridia and studies that examine the evolution of plasmids, bacteriophage and CRISPR systems.
Antisense transcription is pervasive but rarely conserved in enteric bacteria.
Raghavan, Rahul; Sloan, Daniel B; Ochman, Howard
2012-01-01
Noncoding RNAs, including antisense RNAs (asRNAs) that originate from the complementary strand of protein-coding genes, are involved in the regulation of gene expression in all domains of life. Recent application of deep-sequencing technologies has revealed that the transcription of asRNAs occurs genome-wide in bacteria. Although the role of the vast majority of asRNAs remains unknown, it is often assumed that their presence implies important regulatory functions, similar to those of other noncoding RNAs. Alternatively, many antisense transcripts may be produced by chance transcription events from promoter-like sequences that result from the degenerate nature of bacterial transcription factor binding sites. To investigate the biological relevance of antisense transcripts, we compared genome-wide patterns of asRNA expression in closely related enteric bacteria, Escherichia coli and Salmonella enterica serovar Typhimurium, by performing strand-specific transcriptome sequencing. Although antisense transcripts are abundant in both species, less than 3% of asRNAs are expressed at high levels in both species, and only about 14% appear to be conserved among species. And unlike the promoters of protein-coding genes, asRNA promoters show no evidence of sequence conservation between, or even within, species. Our findings suggest that many or even most bacterial asRNAs are nonadaptive by-products of the cell's transcription machinery. IMPORTANCE Application of high-throughput methods has revealed the expression throughout bacterial genomes of transcripts encoded on the strand complementary to protein-coding genes. Because transcription is costly, it is usually assumed that these transcripts, termed antisense RNAs (asRNAs), serve some function; however, the role of most asRNAs is unclear, raising questions about their relevance in cellular processes. 
Because natural selection conserves functional elements, comparisons between related species provide a method for assessing functionality genome-wide. Applying such an approach, we assayed all transcripts in two closely related bacteria, Escherichia coli and Salmonella enterica serovar Typhimurium, and demonstrate that, although the levels of genome-wide antisense transcription are similarly high in both bacteria, only a small fraction of asRNAs are shared across species. Moreover, the promoters associated with asRNAs show no evidence of sequence conservation between, or even within, species. These findings indicate that despite the genome-wide transcription of asRNAs, many of these transcripts are likely nonfunctional.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.
Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim
2010-03-01
Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
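As a hedged illustration of the dynamic programming approach this abstract refers to, the sketch below computes a Smith-Waterman local alignment score and applies it to every unordered pair in a small set. It is not the Genome Comparison Project's implementation: the match, mismatch and linear gap values are assumed toy parameters (protein comparisons normally use a substitution matrix and affine gap penalties), and the function names are hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b under a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def all_against_all(seqs):
    """Score every unordered pair once (the score is symmetric here)."""
    names = list(seqs)
    return {
        (names[i], names[j]): smith_waterman_score(seqs[names[i]], seqs[names[j]])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }


# Toy input, purely illustrative.
scores = all_against_all({"x": "ACGT", "y": "ACGT", "z": "TTTT"})
```

The number of pairs grows quadratically with the number of sequences, and each pair costs time proportional to the product of the two lengths, which is why an exhaustive comparison of millions of sequences calls for distributed computing.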
A Gibbs sampler for motif detection in phylogenetically close sequences
NASA Astrophysics Data System (ADS)
Siddharthan, Rahul; van Nimwegen, Erik; Siggia, Eric
2004-03-01
Genes are regulated by transcription factors that bind to DNA upstream of genes and recognize short conserved "motifs" in a random intergenic "background". Motif-finders such as the Gibbs sampler compare the probability of these short sequences being represented by "weight matrices" to the probability of their arising from the background "null model", and explore this space (analogous to a free-energy landscape). But closely related species may show conservation not because of functional sites but simply because they have not had sufficient time to diverge, so conventional methods will fail. We introduce a new Gibbs sampler algorithm that accounts for common ancestry when searching for motifs, while requiring minimal "prior" assumptions on the number and types of motifs, assessing the significance of detected motifs by "tracking" clusters that stay together. We apply this scheme to motif detection in sporulation-cycle genes in the yeast S. cerevisiae, using recent sequences of other closely-related Saccharomyces species.
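The comparison between a weight matrix and a background null model that conventional motif-finders rely on can be sketched as a log-odds score. The Python example below shows only that standard scoring step, not the phylogeny-aware sampler introduced in the abstract. The three-column matrix, the uniform background and the function names are hypothetical values chosen for illustration.

```python
from math import log

# Hypothetical 3-column weight matrix: per-position base probabilities.
WEIGHT_MATRIX = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
# Assumed uniform background ("null model").
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds(site, matrix=WEIGHT_MATRIX, background=BACKGROUND):
    """log P(site | weight matrix) minus log P(site | background)."""
    if len(site) != len(matrix):
        raise ValueError("site length must equal the motif width")
    return sum(log(col[base] / background[base])
               for col, base in zip(matrix, site))


def best_site(seq, matrix=WEIGHT_MATRIX, background=BACKGROUND):
    """Start and score of the best window; seq must be at least motif width."""
    width = len(matrix)
    scored = [(log_odds(seq[i:i + width], matrix, background), i)
              for i in range(len(seq) - width + 1)]
    score, start = max(scored)
    return start, score


start, score = best_site("TTTACGTTT")
```

A positive score means the window is better explained by the motif than by the background. Between closely related species, aligned windows can match simply because too little time has passed for them to diverge, which is the confound the abstract addresses.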
Comparative analysis of chloroplast genomes of the genus Citrus and its close relatives.
Liu, Xiaogang; Wu, Hongkun; Luo, Yan; Xi, Wanpeng; Zhou, Zhiqin
2017-01-01
The genus Citrus and its close relatives are economically and nutritionally important fruit trees. However, the huge controversy over the phylogeny of key wild species, as well as the genetic relationship between the cultivated species and their putative wild progenitors, remains unresolved. Comparative analyses of chloroplast (cp) genomes have been useful in resolving various phylogenetic issues. Thus far, the cp genomes of only two Citrus species have been sequenced. In this study, we sequenced six complete cp genomes, four belonging to the genus Citrus, and two belonging to the genera Fortunella and Poncirus, respectively. These newly sequenced genomes together with the two publicly available were used for comparative analyses of the genus Citrus and its close relatives. All eight cp genomes share similar basic structure, gene order and gene content. Phylogenetic analyses supported the monophyly of the three genera in the order Sapindales within the major clade Malvidae.
Detectable close-in planets around white dwarfs through late unpacking
NASA Astrophysics Data System (ADS)
Veras, Dimitri; Gänsicke, Boris T.
2015-02-01
Although 25-50 per cent of white dwarfs (WDs) display evidence for remnant planetary systems, their orbital architectures and overall sizes remain unknown. Vibrant close-in (≃1 R⊙) circumstellar activity is detected at WDs spanning many Gyr in age, suggestive of planets further away. Here we demonstrate how systems with 4 and 10 closely packed planets that remain stable and ordered on the main sequence can become unpacked when the star evolves into a WD and experience pervasive inward planetary incursions throughout WD cooling. Our full-lifetime simulations run for the age of the Universe and adopt main-sequence stellar masses of 1.5, 2.0 and 2.5 M⊙, which correspond to the mass range occupied by the progenitors of typical present-day WDs. These results provide (i) a natural way to generate an ever-changing dynamical architecture in post-main-sequence planetary systems, (ii) an avenue for planets to achieve temporary close-in orbits that are potentially detectable by transit photometry and (iii) a dynamical explanation for how residual asteroids might pollute particularly old WDs.
Li, Chun-Xiang; Yang, Qun
2003-03-01
DNA sequences from 28S rDNA were used to assess relationships between and within traditional Taxodiaceae and Cupressaceae s.s. The MP tree and NJ tree are generally similar to one another. The results show that Taxodiaceae and Cupressaceae s.s. form a monophyletic conifer lineage excluding Sciadopitys. In the Taxodiaceae-Cupressaceae s.s. monophyletic group, the Taxodiaceae is paraphyletic. Taxodium, Glyptostrobus and Cryptomeria form a clade (Taxodioideae), in which Glyptostrobus and Taxodium are closely related and sister to Cryptomeria; Sequoia, Sequoiadendron and Metasequoia are closely related to each other, forming another clade (Sequoioideae), in which Sequoia and Sequoiadendron are closely related and sister to Metasequoia; the seven genera of Cupressaceae s.s. are found to be closely related and form a monophyletic lineage (Cupressoideae). These results are basically similar to analyses from chloroplast gene data. However, the relationships among Taiwania, Sequoioideae, Taxodioideae, and Cupressoideae remain unclear because of the slow evolution rate of 28S rDNA; this question might best be answered by sequencing more rapidly evolving nuclear genes.
Daher, Rana K; Stewart, Gale; Boissinot, Maurice; Boudreau, Dominique K; Bergeron, Michel G
2015-04-01
Recombinase polymerase amplification (RPA) technology relies on three major proteins, recombinase proteins, single-strand binding proteins, and polymerases, to specifically amplify nucleic acid sequences in an isothermal format. The performance of RPA with respect to sequence mismatches of closely-related non-target molecules is not well documented and the influence of the number and distribution of mismatches in DNA sequences on RPA amplification reaction is not well understood. We investigated the specificity of RPA by testing closely-related species bearing naturally occurring mismatches for the tuf gene sequence of Pseudomonas aeruginosa and/or Mycobacterium tuberculosis and for the cfb gene sequence of Streptococcus agalactiae. In addition, the impact of the number and distribution of mismatches on RPA efficiency was assessed by synthetically generating 14 types of mismatched forward primers for detecting five bacterial species of high diagnostic relevance such as Clostridium difficile, Staphylococcus aureus, S. agalactiae, P. aeruginosa, and M. tuberculosis as well as Bacillus atrophaeus subsp. globigii for which we use the spores as internal control in diagnostic assays. A total of 87 mismatched primers were tested in this study. We observed that target specific RPA primers with mismatches (n > 1) at their 3' extremity hampered RPA reaction. In addition, 3 mismatches covering both extremities and the center of the primer sequence negatively affected RPA yield. We demonstrated that the specificity of RPA was multifactorial. Therefore its application in clinical settings must be selected and validated a priori. We recommend that the selection of a target gene must consider the presence of closely-related non-target genes. It is advisable to choose target regions with a high number of mismatches (≥36%, relative to the size of amplicon) with respect to closely-related species and the best case scenario would be by choosing a unique target gene.
Wu, Linhuan; McCluskey, Kevin; Desmeth, Philippe; Liu, Shuangjiang; Hideaki, Sugawara; Yin, Ye; Moriya, Ohkuma; Itoh, Takashi; Kim, Cha Young; Lee, Jung-Sook; Zhou, Yuguang; Kawasaki, Hiroko; Hazbón, Manzour Hernando; Robert, Vincent; Boekhout, Teun; Lima, Nelson; Evtushenko, Lyudmila; Boundy-Mills, Kyria; Bunk, Boyke; Moore, Edward R B; Eurwilaichitr, Lily; Ingsriswang, Supawadee; Shah, Heena; Yao, Su; Jin, Tao; Huang, Jinqun; Shi, Wenyu; Sun, Qinglan; Fan, Guomei; Li, Wei; Li, Xian; Kurtböke, Ipek; Ma, Juncai
2018-05-01
Genomic information is essential for taxonomic, phylogenetic, and functional studies to comprehensively decipher the characteristics of microorganisms, to explore microbiomes through metagenomics, and to answer fundamental questions of nature and human life. However, large gaps remain in the available genomic sequencing information published for bacterial and archaeal species, and the gaps are even larger for fungal type strains. The Global Catalogue of Microorganisms (GCM) leads an internationally coordinated effort to sequence type strains and close gaps in the genomic maps of microorganisms. Hence, the GCM aims to promote research by deep-mining genomic data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsichlis, P.N.; Donehower, L.; Hager, G.
1982-11-01
NTRE is an avian retrovirus recombinant of the endogenous nononcogenic Rous-associated virus-0 (RAV-0) and the oncogenic, exogenous, transformation-defective (td) Prague strain of Rous sarcoma virus B (td-PrRSV-B). Oligonucleotide mapping had shown that the recombinant virus is indistinguishable from its RAV-0 parent except for the 3'-end sequences, which were derived from td-PrRSV-B. However, the virus exhibits properties which are typical of an exogenous virus: it grows to high titers in tissue culture, and it is oncogenic in vivo. To accurately define the genetic region responsible for these properties, the authors determined the nucleotide sequences of the recombinant and its RAV-0 parent by using molecular clones of their DNA. These were compared with sequences already available for PrRSV-C, a virus closely related to the exogenous parent td-PrRSV-B. The results suggested that the crossover event which generated NTRE 7 took place in a region -501 to -401 nucleotides from the 3' end of the td-PrRSV parental genome and that sequences to the right of the recombination region were responsible for its growth properties and oncogenic potential. Since the exogenous-virus-specific sequences are expected to be missing from transformation-defective mutants of the Schmidt-Ruppin strain of RSV, which, like other exogenous viruses, grow to high titers in tissue culture and are oncogenic in vivo, the authors concluded that the growth properties and oncogenic potential of the exogenous viruses are determined by sequences in the U3 region of the long terminal repeat. However, the authors propose that the exogenous-virus-specific region may play a role in determining the oncogenic spectrum of a given oncogenic virus.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zemla, A; Lang, D; Kostova, T
2010-11-29
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory; still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
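The residue-frequency idea in the abstract above can be illustrated with a small sketch. This is not StralSV itself, which works on tight local structure alignments; it only assumes that a set of equal-length aligned fragments is already available as strings (with "-" for gaps) and counts residues per column. The function names `residue_frequencies` and `conserved_positions` and the 0.9 threshold are hypothetical choices made for illustration.

```python
from collections import Counter


def residue_frequencies(aligned_fragments):
    """Return one dict per alignment column mapping residue -> fraction.

    Assumes every fragment is a string of the same aligned length and
    that '-' marks a gap (gaps are ignored when computing fractions).
    """
    if not aligned_fragments:
        return []
    length = len(aligned_fragments[0])
    if any(len(f) != length for f in aligned_fragments):
        raise ValueError("all fragments must have the same aligned length")
    profile = []
    for column in zip(*aligned_fragments):
        residues = [r for r in column if r != "-"]
        counts = Counter(residues)
        total = len(residues)
        profile.append({r: c / total for r, c in counts.items()} if total else {})
    return profile


def conserved_positions(profile, threshold=0.9):
    """Indices of columns whose most frequent residue reaches the threshold."""
    return [i for i, freqs in enumerate(profile)
            if freqs and max(freqs.values()) >= threshold]
```

Columns returned by `conserved_positions` would be candidates for highly conserved residues, while columns with a low maximum fraction point to positions that deviate from a consensus.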
Bartonella dromedarii sp. nov. isolated from domesticated camels (Camelus dromedarius) in Israel.
Rasis, Michal; Rudoler, Nir; Schwartz, David; Giladi, Michael
2014-11-01
Bartonella spp. are fastidious, Gram-negative bacilli that cause a wide spectrum of diseases in humans. Most Bartonella spp. have adapted to a specific host, generally a domestic or wild mammal. Dromedary camels (Camelus dromedarius) have become a focus of growing public-health interest because they have been identified as a reservoir host for the Middle East respiratory syndrome coronavirus. Nevertheless, data on camel zoonoses are limited. We aimed to study the occurrence of Bartonella bacteremia among dromedaries in Israel. Nine of 51 (17.6%) camels were found to be bacteremic with Bartonella spp.; bacteremia levels ranged from five to >1000 colony-forming units/mL. Phylogenetic reconstruction based on the concatenated sequences of gltA and rpoB genes demonstrated that the dromedary Bartonella isolates are closely related to other ruminant-derived Bartonella spp., with B. bovis being the nearest relative. Using electron microscopy, the novel isolates were shown to be flagellated, whereas B. bovis is nonflagellated. Sequence comparisons analysis of the housekeeping genes ftsZ, ribC, and groEL showed the highest homology to B. chomelii, B. capreoli, and B. birtlesii, respectively. Sequence analysis of the gltA and rpoB revealed ∼96% identity to B. bovis, a previously suggested cutoff value for sequence-based differentiation of Bartonella spp., suggesting that this approach does not have sufficient discriminatory power for differentiating ruminant-related Bartonella spp. A comprehensive multilocus sequence typing (MLST) analysis based on nine genetic loci (gltA, rpoB, ftsZ, internal transcribed spacer (ITS), 16S rRNA, ribC, groEL, nuoG, and SsrA) identified seven sequence types of the new dromedary isolates. This is the first description of a Bartonella sp. from camelids. On the basis of a distinct reservoir and ecological niche, sequence analyses, and expression of flagella, we designate these isolates as a novel Bartonella sp. named Bartonella dromedarii sp. nov. 
Further studies are required to explore its zoonotic potential.
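As a rough illustration of how multilocus sequence typing groups isolates into sequence types, the sketch below numbers the distinct alleles at each locus and then numbers the distinct allele profiles. It is a toy under stated assumptions, not the procedure used in the study: the input layout (a dict of isolate name to a dict of locus to sequence), the exact-match allele definition, and the function name `assign_sequence_types` are all hypothetical.

```python
def assign_sequence_types(isolates):
    """Map each isolate name to a sequence type (ST) number.

    isolates: dict of isolate name -> dict of locus name -> allele sequence.
    Assumes every isolate has a sequence for every locus. Alleles and STs
    are numbered in order of first appearance.
    """
    loci = sorted({locus for alleles in isolates.values() for locus in alleles})
    allele_ids = {locus: {} for locus in loci}  # per-locus: sequence -> allele number
    profiles = {}                               # allele profile -> ST number
    sequence_types = {}
    for name, alleles in isolates.items():
        profile = []
        for locus in loci:
            ids = allele_ids[locus]
            seq = alleles[locus]
            if seq not in ids:
                ids[seq] = len(ids) + 1
            profile.append(ids[seq])
        profile = tuple(profile)
        if profile not in profiles:
            profiles[profile] = len(profiles) + 1
        sequence_types[name] = profiles[profile]
    return sequence_types
```

Two isolates receive the same ST only when they match at every locus, which is why adding loci (nine in the abstract above) increases discriminatory power.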
Amor, Nabil; Farjallah, Sarra; Salem, Mohamed; Lamine, Dia Mamadou; Merella, Paolo; Said, Khaled; Ben Slimane, Badreddine
2011-10-01
Fasciolosis caused by Fasciola hepatica and Fasciola gigantica (Platyhelminthes: Trematoda: Digenea) is considered the most important helminth infection of ruminants in tropical countries, causing considerable socioeconomic problems. From Africa, F. gigantica has been previously characterized from Burkina Faso, Senegal, Kenya, Zambia and Mali, while F. hepatica has been reported from Morocco and Tunisia, and both species have been observed from Ethiopia and Egypt on the basis of morphometric differences, while the use of molecular markers is necessary to distinguish exactly between species. Samples identified morphologically as F. gigantica (n=60) from sheep and cattle from different geographical localities of Mauritania were genetically characterized by sequences of the first (ITS-1), the 5.8S, and second (ITS-2) Internal Transcribed Spacers (ITS) of nuclear ribosomal DNA (rDNA) genes and the mitochondrial Cytochrome c Oxidase I (COI) gene. Comparison of the sequences of the Mauritanian samples with sequences of Fasciola spp. from GenBank confirmed that all samples belong to the species F. gigantica. The nucleotide sequencing of ITS rDNA of F. gigantica showed no nucleotide variation in the ITS-1, 5.8S, and ITS-2 rDNA sequences among all samples examined and those from Burkina Faso, Kenya, Egypt and Iran. The phylogenetic trees based on the ITS-1 and ITS-2 sequences showed a close relationship of the Mauritanian samples with isolates of F. gigantica from different localities of Africa and Asia. The COI genotypes of the Mauritanian specimens of F. gigantica had a high level of diversity, and they belonged to the F. gigantica phylogenically distinguishable clade. The present study is the first molecular characterization of F. gigantica in sheep and cattle from Mauritania, allowing a reliable approach for the genetic differentiation of Fasciola spp. and providing basis for further studies on liver flukes in the African countries. Copyright © 2011 Elsevier Inc. 
All rights reserved.
Transcriptome-Based Differentiation of Closely-Related Miscanthus Lines
Chouvarine, Philippe; Cooksey, Amanda M.; McCarthy, Fiona M.; ...
2012-01-10
Distinguishing between individuals is critical to those conducting animal/plant breeding, food safety/quality research, diagnostic and clinical testing, and evolutionary biology studies. Classical genetic identification studies are based on marker polymorphisms, but polymorphism-based techniques are time and labor intensive and often cannot distinguish between closely related individuals. Illumina sequencing technologies provide the detailed sequence data required for rapid and efficient differentiation of related species, lines/cultivars, and individuals in a cost-effective manner. Here we describe the use of Illumina high-throughput exome sequencing, coupled with SNP mapping, as a rapid means of distinguishing between related cultivars of the lignocellulosic bioenergy crop giant miscanthus (Miscanthus × giganteus). We provide the first exome sequence database for Miscanthus species complete with Gene Ontology (GO) functional annotations.
Fernandes, A P; Nelson, K; Beverley, S M
1993-01-01
Molecular evolutionary relationships within the protozoan order Kinetoplastida were deduced from comparisons of the nuclear small and large subunit ribosomal RNA (rRNA) gene sequences. These studies show that relationships among the trypanosomatid protozoans differ from those previously proposed from studies of organismal characteristics or mitochondrial rRNAs. The genera Leishmania, Endotrypanum, Leptomonas, and Crithidia form a closely related group, which shows progressively more distant relationships to Phytomonas and Blastocrithidia, Trypanosoma cruzi, and lastly Trypanosoma brucei. The rooting of the trypanosomatid tree was accomplished by using Bodo caudatus (family Bodonidae) as an outgroup, a status confirmed by molecular comparisons with other eukaryotes. The nuclear rRNA tree agrees well with data obtained from comparisons of other nuclear genes. Differences with the proposed mitochondrial rRNA tree probably reflect the lack of a suitable outgroup for this tree, as the topologies are otherwise similar. Small subunit rRNA divergences within the trypanosomatids are large, approaching those among plants and animals, which underscores the evolutionary antiquity of the group. Analysis of the distribution of different parasitic life-styles of these species in conjunction with a probable timing of evolutionary divergences suggests that vertebrate parasitism arose multiple times in the trypanosomatids. PMID:8265597
Insights from 20 years of bacterial genome sequencing
Land, Miriam L.; Hauser, Loren; Jun, Se-Ran; ...
2015-02-27
Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. 
Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
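The pan- and core-genome calculation mentioned in the review can be sketched with plain set operations. The example assumes that gene families have already been assigned and that each genome is represented as a set of family identifiers; the function name `pan_and_core` and the identifiers used are hypothetical, and real pipelines usually allow for incomplete draft genomes rather than requiring strict presence in every genome.

```python
def pan_and_core(genomes):
    """Return (pan_genome, core_genome) as sets of gene family IDs.

    genomes: dict of genome name -> set of gene family IDs present.
    The pan-genome is the union of all families; the core genome is the
    strict intersection (families present in every genome).
    """
    family_sets = [set(families) for families in genomes.values()]
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core
```

With this definition the core can only shrink and the pan-genome can only grow as genomes are added, which is consistent with the small core and very large total family count reported for E. coli above.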
Moreno, Ana; Lelli, Davide; de Sabato, Luca; Zaccaria, Guendalina; Boni, Arianna; Sozzi, Enrica; Prosperi, Alice; Lavazza, Antonio; Cella, Eleonora; Castrucci, Maria Rita; Ciccozzi, Massimo; Vaccari, Gabriele
2017-12-19
Middle East respiratory syndrome coronavirus (MERS-CoV), which belongs to the beta group of coronaviruses, can infect multiple host species and causes severe disease in humans. Multiple surveillance and phylogenetic studies suggest a bat origin. In this study, we describe the detection and full genome characterization of two CoVs closely related to MERS-CoV from two Italian bats, Pipistrellus kuhlii and Hypsugo savii. Pools of viscera were tested by a pan-coronavirus RT-PCR. Virus isolation was attempted by inoculation in different cell lines. Full genome sequencing was performed using the Ion Torrent platform and phylogenetic trees were constructed using IQtree software. Similarity plots of CoV clade c genomes were generated by using SSE v1.2. The three dimensional macromolecular structure (3DMMS) of the receptor binding domain (RBD) in the S protein was predicted by a sequence-homology method using the protein data bank (PDB). Both samples tested positive in the pan-coronavirus RT-PCR (IT-batCoVs) and their genome organization showed the same pattern as MERS-CoV. Phylogenetic analysis showed a monophyletic group placed in the Beta2c clade formed by MERS-CoV sequences originating from humans and camels and bat-related sequences from Africa, Italy and China. The comparison of the secondary structure and 3DMMS of the RBD of IT-batCoVs with MERS, HKU4 and HKU5 bat sequences showed two aa deletions located in a region corresponding to the external subdomain of MERS-RBD in IT-batCoV and HKU5 RBDs. This study reported two beta CoVs closely related to MERS that were obtained from two bats belonging to two commonly recorded species in Italy (P. kuhlii and H. savii). The analysis of the RBD showed a similar structure in IT-batCoVs and HKU5 with respect to HKU4 sequences. Since the RBD domain of HKU4 but not HKU5 can bind to the human DPP4 receptor for MERS-CoV, the absence of DPP4-binding potential can also be suggested for IT-batCoVs. 
More surveillance studies are needed to better investigate the potential intermediate hosts that may play a role in the interspecies transmission of known and currently unknown coronaviruses with particular attention to the S protein and the receptor specificity and binding affinity.
Nogueira, Christiane Lourenço; Whipps, Christopher M.; Matsumoto, Cristianne Kayoko; Chimara, Erica; Droz, Sara; Tortoli, Enrico; de Freitas, Denise; Cnockaert, Margo; Palomino, Juan Carlos; Martin, Anandi; Vandamme, Peter
2015-01-01
Five isolates of non-pigmented, rapidly growing mycobacteria were isolated from three patients and, in an earlier study, from zebrafish. Phenotypic and molecular tests confirmed that these isolates belong to the Mycobacterium chelonae–Mycobacterium abscessus group, but they could not be confidently assigned to any known species of this group. Phenotypic analysis and biochemical tests were not helpful for distinguishing these isolates from other members of the M. chelonae–M. abscessus group. The isolates presented higher drug resistance in comparison with other members of the group, showing susceptibility only to clarithromycin. The five isolates showed a unique PCR restriction analysis pattern of the hsp65 gene, 100 % similarity in 16S rRNA gene and hsp65 sequences and 1–2 nt differences in rpoB and internal transcribed spacer (ITS) sequences. Phylogenetic analysis of a concatenated dataset including 16S rRNA gene, hsp65, and rpoB sequences from type strains of more closely related species placed the five isolates together, as a distinct lineage from previously described species, suggesting a sister relationship to a group consisting of M. chelonae, Mycobacterium salmoniphilum, Mycobacterium franklinii and Mycobacterium immunogenum. DNA–DNA hybridization values >70 % confirmed that the five isolates belong to the same species, while values < 70 % between one of the isolates and the type strains of M. chelonae and M. abscessus confirmed that the isolates belong to a distinct species. The polyphasic characterization of these isolates, supported by DNA–DNA hybridization results, demonstrated that they share characteristics with M. chelonae–M. abscessus members, but constitute a different species, for which the name Mycobacterium saopaulense sp. nov. is proposed. The type strain is EPM 10906T ( = CCUG 66554T = LMG 28586T = INCQS 0733T). PMID:26358475
Allison, Andrew B.; Kohler, Dennis J.; Ortega, Alicia; Hoover, Elizabeth A.; Grove, Daniel M.; Holmes, Edward C.; Parrish, Colin R.
2014-11-01
Canine parvovirus (CPV) emerged as a new pandemic pathogen of dogs in the 1970s and is closely related to feline panleukopenia virus (FPV), a parvovirus of cats and related carnivores. Although both viruses have wide host ranges, analysis of viral sequences recovered from different wild carnivore species, as shown here, demonstrated that >95% were derived from CPV-like viruses, suggesting that CPV is dominant in sylvatic cycles. Many viral sequences showed host-specific mutations in their capsid proteins, which were often close to sites known to control binding to the transferrin receptor (TfR), the host receptor for these carnivore parvoviruses, and which exhibited frequent parallel evolution. To further examine the process of host adaptation, we passaged parvoviruses with alternative backgrounds in cells from different carnivore hosts. Specific mutations were selected in several viruses and these differed depending on both the background of the virus and the host cells in which they were passaged. Strikingly, these in vitro mutations recapitulated many specific changes seen in viruses from natural populations, strongly suggesting they are host adaptive, and which were shown to result in fitness advantages over their parental virus. Comparison of the sequences of the transferrin receptors of the different carnivore species demonstrated that many mutations occurred in and around the apical domain where the virus binds, indicating that viral variants were likely selected through their fit to receptor structures. Some of the viruses accumulated high levels of variation upon passage in alternative hosts, while others could infect multiple different hosts with no or only a few additional mutations. Overall, these studies demonstrate that the evolutionary history of a virus, including how long it has been circulating and in which hosts, as well as its phylogenetic background, has a profound effect on determining viral host range. PMID:25375184
Paenibacillus nebraskensis sp. nov., isolated from the root surface of field-grown maize.
Kämpfer, Peter; Busse, Hans-Jürgen; McInroy, John A; Hu, Chia-Hui; Kloepper, Joseph W; Glaeser, Stefanie P
2017-12-01
A Gram-positive-staining, aerobic, non-endospore-forming bacterial strain (JJ-59T), isolated from a field-grown maize plant in Dunbar, Nebraska in 2014 was studied by a polyphasic approach. Based on 16S rRNA gene sequence similarity comparisons, strain JJ-59T was shown to be a member of the genus Paenibacillus, most closely related to the type strains of Paenibacillus aceris (98.6 % 16S rRNA gene sequence similarity) and Paenibacillus chondroitinus (97.8 %). For all other type strains of species of the genus Paenibacillus lower 16S rRNA gene sequence similarities were obtained. DNA-DNA hybridization values of strain JJ-59T to the type strains of P. aceris and P. chondroitinus were 26 % (reciprocal, 59 %) and 52 % (reciprocal, 59 %), respectively. Chemotaxonomic characteristics such as the presence of meso-diaminopimelic acid in the peptidoglycan, the major quinone MK-7 and spermidine as the major polyamine were in agreement with the characteristics of the genus Paenibacillus. Strain JJ-59T shared with its next related species P. aceris the major lipids diphosphatidylglycerol, phosphatidylglycerol, phosphatidylethanolamine and an unidentified aminophospholipid, but the presence/absence of certain lipids was clearly distinguishable. Major fatty acids of strain JJ-59T were anteiso-C15 : 0, iso-C15 : 0 and iso-C16 : 0, and the genomic G+C content is 47.2 mol%. Physiological and biochemical characteristics of strain JJ-59T were clearly different from the most closely related species of the genus Paenibacillus. Thus, strain JJ-59T represents a novel species of the genus Paenibacillus, for which the name Paenibacillus nebraskensis sp. nov. is proposed, with JJ-59T (=DSM 103623T = CIP 111179T = LMG 29764T) as the type strain.
van der Linden, Mark; Otten, Julia; Bergmann, Carina; Latorre, Cristina; Liñares, Josefina
2017-01-01
The identification of commensal streptococci species is an everlasting problem due to their ability to genetically transform. A new challenge in this respect is the recent description of Streptococcus pseudopneumoniae as a new species, which was distinguished from closely related pathogenic S. pneumoniae and commensal S. mitis by a variety of physiological and molecular biological tests. Forty-one atypical S. pneumoniae isolates have been collected at the German National Reference Center for Streptococci (GNRCS). Multilocus sequence typing (MLST) confirmed 35 isolates as the species S. pseudopneumoniae. A comparison with the pbp2x sequences from 120 commensal streptococci isolated from different continents revealed that pbp2x is distinct among penicillin-susceptible S. pseudopneumoniae isolates. Four penicillin-binding protein x (PBPx) alleles of penicillin-sensitive S. mitis account for most of the diverse sequence blocks in resistant S. pseudopneumoniae, S. pneumoniae, and S. mitis, and S. infantis and S. oralis sequences were found in S. pneumoniae from Japan. PBP2x genes of the family of mosaic genes related to pbp2x in the S. pneumoniae clone Spain23F-1 were observed in S. oralis and S. infantis as well, confirming its global distribution. Thirty-eight sites were altered within the PBP2x transpeptidase domains of penicillin-resistant strains, excluding another 37 sites present in the reference genes of sensitive strains. Specific mutational patterns were detected depending on the parental sequence blocks, in agreement with distinct mutational pathways during the development of beta-lactam resistance. The majority of the mutations clustered around the active site, whereas others are likely to affect stability or interactions with the C-terminal domain or partner proteins. PMID:28193649
Genotypic analysis of Mucor from the platypus in Australia.
Connolly, J H; Stodart, B J; Ash, G J
2010-01-01
Mucor amphibiorum is the only pathogen known to cause significant morbidity and mortality in the free-living platypus (Ornithorhynchus anatinus) in Tasmania. Infection has also been reported in free-ranging cane toads (Bufo marinus) and green tree frogs (Litoria caerulea) from mainland Australia but has not been confirmed in platypuses from the mainland. To date, there has been little genotyping specifically conducted on M. amphibiorum. A collection of 21 Mucor isolates representing isolates from the platypus, frogs and toads, and environmental samples were obtained for genotypic analysis. Internal transcribed spacer (ITS) region sequencing and GenBank comparison confirmed the identity of most of the isolates. Representative isolates from infected platypuses formed a clade containing the reference isolates of M. amphibiorum from the Centraal Bureau voor Schimmelcultures repository. The M. amphibiorum isolates showed a close sequence identity with Mucor indicus and consisted of two haplotypes, differentiated by single nucleotide polymorphisms within the ITS1 and ITS2 regions. With the exception of isolate 96-4049, all isolates from platypuses were in one haplotype. Multilocus fingerprinting via the use of intersimple sequence repeats polymerase chain reaction identified 19 genotypes. Two major clusters were evident: 1) M. amphibiorum and Mucor racemosus; and 2) Mucor circinelloides, Mucor ramosissimus, and Mucor fragilis. Seven M. amphibiorum isolates from platypuses were present in two subclusters, with isolate 96-4053 appearing genetically distinct from all other isolates. Isolates classified as M. circinelloides by sequence analysis formed a separate subcluster, distinct from other Mucor spp. The combination of sequencing and multilocus fingerprinting has the potential to provide the tools for rapid identification of M. amphibiorum. 
Data presented on the diversity of the pathogen and further work in linking genetic diversity to functional diversity will provide critical information for its management in Tasmanian river systems.
Thakur, Shalabh; Guttman, David S
2016-06-30
Comparative analysis of whole genome sequence data from closely related prokaryotic species or strains is becoming an increasingly important and accessible approach for addressing both fundamental and applied biological questions. While there are a number of excellent tools developed for performing this task, most scale poorly when faced with hundreds of genome sequences, and many require extensive manual curation. We have developed a de-novo genome analysis pipeline (DeNoGAP) for the automated, iterative and high-throughput analysis of data from comparative genomics projects involving hundreds of whole genome sequences. The pipeline is designed to perform reference-assisted and de novo gene prediction, homolog protein family assignment, ortholog prediction, functional annotation, and pan-genome analysis using a range of proven tools and databases. While most existing methods scale quadratically with the number of genomes since they rely on pairwise comparisons among predicted protein sequences, DeNoGAP scales linearly since the homology assignment is based on iteratively refined hidden Markov models. This iterative clustering strategy enables DeNoGAP to handle a very large number of genomes using minimal computational resources. Moreover, the modular structure of the pipeline permits easy updates as new analysis programs become available. DeNoGAP integrates bioinformatics tools and databases for comparative analysis of a large number of genomes. The pipeline offers tools and algorithms for annotation and analysis of completed and draft genome sequences. The pipeline is developed using Perl, BioPerl and SQLite on Ubuntu Linux version 12.04 LTS. Currently, the software package is accompanied by a script for automated installation of necessary external programs on Ubuntu Linux; however, the pipeline should also be compatible with other Linux and Unix systems after necessary external programs are installed. 
DeNoGAP is freely available at https://sourceforge.net/projects/denogap/ .
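The scaling contrast drawn in the abstract (quadratic for all-against-all comparison, linear for iterative assignment to models) can be made concrete with a short counting sketch. It counts genome-level comparison rounds only and is not a model of DeNoGAP's actual cost: the assumption that each added genome needs a single pass against the current model set, and both function names, are illustrative.

```python
def pairwise_comparisons(n_genomes):
    """Number of genome pairs in an all-against-all comparison: n(n-1)/2."""
    return n_genomes * (n_genomes - 1) // 2


def iterative_comparisons(n_genomes):
    """Number of passes when each genome after the seed is compared once
    against the current set of family models (an assumed simplification)."""
    return max(n_genomes - 1, 0)
```

For a few hundred genomes the gap is already large: 500 genomes give 124,750 pairs but only 499 iterative passes under this simplification.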
Draft genome sequence of non-shiga toxin-producing Escherichia coli O157 NCCP15738.
Kwon, Taesoo; Kim, Jung-Beom; Bak, Young-Seok; Yu, Young-Bin; Kwon, Ki Sung; Kim, Won; Cho, Seung-Hak
2016-01-01
The non-shiga toxin-producing Escherichia coli (non-STEC) O157 is a pathogenic strain that causes diarrhea but does not cause hemolytic-uremic syndrome or hemorrhagic colitis. Here, we present the 5-Mb draft genome sequence of non-STEC O157 NCCP15738, which was isolated from the feces of a Korean patient with diarrhea, and describe its features and the structural basis for its genome evolution. A total of 565-Mbp paired-end reads were generated using the Illumina-HiSeq 2000 platform. The reads were assembled into 135 scaffolds throughout the de novo assembly. The assembled genome size of NCCP15738 was 5,005,278 bp with an N50 value of 142,450 bp and 50.65 % G+C content. Using Rapid Annotation using Subsystem Technology analysis, we predicted 4780 ORFs and 31 RNA genes. The evolutionary tree was inferred from multiple sequence alignment of 45 E. coli species. The most closely related neighbor of NCCP15738 indicated by whole-genome phylogeny was E. coli UMNK88, but that indicated by multilocus sequence analysis was E. coli DH1(ME8569). A comparison between the NCCP15738 genome and those of reference strains, E. coli K-12 substr. MG1655 and EHEC O157:H7 EDL933 by bioinformatics analyses revealed unique genes in NCCP15738 associated with lysis protein S, two-component signal transduction system, conjugation, the flagellum, nucleotide-binding proteins, and metal-ion binding proteins. Notably, NCCP15738 has a dual flagella system like that in Vibrio parahaemolyticus, Aeromonas spp., and Rhodospirillum centenum. The draft genome sequence and the results of bioinformatics analysis of NCCP15738 provide the basis for understanding the genomic evolution of this strain.
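The assembly statistics quoted above (N50 and G+C content) follow standard definitions that are easy to sketch. The code below is a minimal illustration, not the tooling used by the authors: N50 is taken as the length of the scaffold at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size, and G+C content is computed over unambiguous A/C/G/T bases only. The function names are hypothetical.

```python
def n50(lengths):
    """Length L such that scaffolds of length >= L cover at least half
    of the total assembly size."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt
```

Applied to the 135 scaffold lengths of an assembly like the one described, `n50` would return a single scaffold length comparable to the reported 142,450 bp figure.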
Watanabe, Kazuya; Teramoto, Maki; Futamata, Hiroyuki; Harayama, Shigeaki
1998-01-01
DNA was isolated from phenol-digesting activated sludge, and partial fragments of the 16S ribosomal DNA (rDNA) and the gene encoding the largest subunit of multicomponent phenol hydroxylase (LmPH) were amplified by PCR. An analysis of the amplified fragments by temperature gradient gel electrophoresis (TGGE) demonstrated that two major 16S rDNA bands (bands R2 and R3) and two major LmPH gene bands (bands P2 and P3) appeared after the activated sludge became acclimated to phenol. The nucleotide sequences of these major bands were determined. In parallel, bacteria were isolated from the activated sludge by direct plating or by plating after enrichment either in batch cultures or in a chemostat culture. The bacteria isolated were classified into 27 distinct groups by a repetitive extragenic palindromic sequence PCR analysis. The partial nucleotide sequences of 16S rDNAs and LmPH genes of members of these 27 groups were then determined. A comparison of these nucleotide sequences with the sequences of the major TGGE bands indicated that the major bacterial populations, R2 and R3, possessed major LmPH genes P2 and P3, respectively. The dominant populations could be isolated either by direct plating or by chemostat culture enrichment but not by batch culture enrichment. One of the dominant strains (R3), which contained a novel type of LmPH (P3), was closely related to Variovorax paradoxus, and the result of a kinetic analysis of its phenol-oxygenating activity suggested that this strain was the principal phenol digester in the activated sludge. PMID:9797297
Wyllie, David H; Sanderson, Nicholas; Myers, Richard; Peto, Tim; Robinson, Esther; Crook, Derrick W; Smith, E Grace; Walker, A Sarah
2018-06-06
Contact tracing requires reliable identification of closely related bacterial isolates. When we noticed the reporting of artefactual variation between M. tuberculosis isolates during routine next generation sequencing of Mycobacterium spp., we investigated its basis in 2,018 consecutive M. tuberculosis isolates. In the routine process used, clinical samples were decontaminated and inoculated into broth cultures; from positive broth cultures DNA was extracted, sequenced, reads mapped, and consensus sequences determined. We investigated the process of consensus sequence determination, which selects the most common nucleotide at each position. Having determined the high-quality read depth and depth of minor variants across 8,006 M. tuberculosis genomic regions, we quantified the relationship between the minor variant depth and the amount of non-Mycobacterial bacterial DNA, which originates from commensal microbes killed during sample decontamination. In the presence of non-Mycobacterial bacterial DNA, we found significant increases in minor variant frequencies of more than 1.5-fold in 242 regions covering 5.1% of the M. tuberculosis genome. Included within these were four high variation regions strongly influenced by the amount of non-Mycobacterial bacterial DNA. Excluding these four regions from pairwise distance comparisons reduced biologically implausible variation from 5.2% to 0% in an independent validation set derived from 226 individuals. Thus, we have demonstrated an approach identifying critical genomic regions contributing to clinically relevant artefactual variation in bacterial similarity searches. The approach described monitors the outputs of the complex multi-step laboratory and bioinformatics process, allows periodic process adjustments, and will have application to quality control of routine bacterial genomics. Copyright © 2018 Wyllie et al.
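The consensus step described above (taking the most common nucleotide at each position) and the associated minor variant depth can be illustrated with a short sketch. It is not the authors' pipeline: it assumes the reads are already aligned to the same coordinates and have equal length, and the function name is hypothetical.

```python
from collections import Counter


def consensus_and_minor_depth(aligned_reads):
    """Return (consensus string, per-position minor variant depth).

    Minor variant depth is the number of reads at a position that do not
    carry the most common nucleotide.
    """
    if not aligned_reads:
        return "", []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    consensus = []
    minor_depths = []
    # zip(*reads) yields one tuple of bases per alignment column.
    for column in zip(*aligned_reads):
        base, major_depth = Counter(column).most_common(1)[0]
        consensus.append(base)
        minor_depths.append(len(column) - major_depth)
    return "".join(consensus), minor_depths


if __name__ == "__main__":
    reads = ["ACGT", "ACGA", "ACGT"]  # toy reads, not real data
    print(consensus_and_minor_depth(reads))
```

In the toy input, only the last column contains a minority base, so its minor variant depth is 1 out of a read depth of 3. Contaminating non-Mycobacterial reads would raise such minor depths at positions where they map.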
Serratia aquatilis sp. nov., isolated from drinking water systems.
Kämpfer, Peter; Glaeser, Stefanie P
2016-01-01
A cream-white-pigmented, oxidase-negative bacterium (strain 2015-2462-01T), isolated from a drinking water system, was investigated in detail to determine its taxonomic position. Cells of the isolate were rod-shaped and stained Gram-negative. A comparison of the 16S rRNA gene sequence of strain 2015-2462-01T with sequences of the type strains of closely related species of the genus Serratia revealed highest similarity to Serratia fonticola (98.4 %), Serratia proteamaculans (97.8 %), Serratia liquefaciens and Serratia grimesii (both 97.7 %). 16S rRNA gene sequence similarities to all other Serratia species were below 97.4 %. Multilocus sequence analysis (MLSA) on the basis of concatenated partial gyrB, rpoB, infB and atpD gene sequences showed a clear distinction of strain 2015-2462-01T from the type strains of the most closely related Serratia species. The fatty acid profile of the strain consisted of C16 : 1 ω7c, C16 : 0, C14 : 0 and C14 : 0 3-OH/iso-C16 : 1 I as major components. DNA-DNA hybridizations between 2015-2462-01T and S. fonticola ATCC 29844T resulted in a relatedness value of 27 % (reciprocal 20 %). This DNA-DNA hybridization result in combination with the MLSA results and the differential biochemical properties indicated that strain 2015-2462-01T represents a novel species of the genus Serratia, for which the name Serratia aquatilis sp. nov. is proposed. The type strain is 2015-2462-01T ( = LMG 29119T = CCM 8626T).
Paldurai, Anandan; Subbiah, Madhuri; Kumar, Sachin; Collins, Peter L.; Samal, Siba K.
2009-01-01
Complete consensus genome sequences were determined for avian paramyxovirus type 8 (APMV-8) strains goose/Delaware/1053/76 (prototype strain) and pintail/Wakuya/20/78. The genome of each strain is 15,342 nucleotides (nt) long, which follows the “rule of six”. The genome consists of six genes in the order of 3′-N-P/V/W-M-F-HN-L-5′. The genes are flanked on either side by conserved transcription start and stop signals, and have intergenic regions ranging from 1 to 30 nt. The genome contains a 55 nt leader region at the 3′-end and a 171 nt trailer region at the 5′-end. Comparison of sequences of strains Delaware and Wakuya showed nucleotide identity of 96.8% at the genome level and amino acid identities of 99.3%, 96.5%, 98.6%, 99.4%, 98.6% and 99.1% for the predicted N, P, M, F, HN and L proteins, respectively. Both strains grew in embryonated chicken eggs, in primary chicken embryo kidney cells, and in 293T cells. Both strains contained only a single basic residue at the cleavage activation site of the F protein, and their efficiency of replication in vitro depended on, and was augmented by, the presence of exogenous protease in most cell lines. Sequence alignment and phylogenetic analysis of the predicted amino acid sequence of APMV-8 strain Delaware proteins with the cognate proteins of other available APMV serotypes showed that APMV-8 is more closely related to APMV-2 and -6 than to APMV-1, -3 and -4. PMID:19341613
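The “rule of six” mentioned above refers to the observation that paramyxovirus genome lengths are an even multiple of six nucleotides. The reported length can be checked with a one-line test. The sketch below also shows a simple percent-identity calculation over pre-aligned sequences of equal length; both function names are hypothetical, and the sequences in the usage example are toy strings, not APMV-8 data.

```python
def follows_rule_of_six(genome_length_nt: int) -> bool:
    """True if the genome length is a positive multiple of six."""
    return genome_length_nt > 0 and genome_length_nt % 6 == 0


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions in two pre-aligned,
    equal-length sequences (gap handling is deliberately omitted)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    print(follows_rule_of_six(15342))          # reported APMV-8 genome length
    print(percent_identity("ACGTAC", "ACGTTC"))  # toy sequences
```

The reported length of 15,342 nt equals 6 × 2,557, so it satisfies the rule.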