sequence analyses show: Topics by Science.gov

Sample records for sequence analyses show

Genetic diversity of Babesia bovis in virulent and attenuated strains.

PubMed

Mazuz, M L; Molad, T; Fish, L; Leibovitz, B; Wolkomirsky, R; Fleiderovitz, L; Shkap, V

2012-03-01

The aim of this study was to compare the genetic diversity of the single copy Bv80 gene sequences of Babesia bovis in populations of attenuated and virulent parasites. PCR/RT-PCR followed by cloning and sequence analyses of 4 attenuated and 4 virulent strains were performed. Multiple fragments in the range of 420 to 744 bp were amplified by PCR or RT-PCR. Cloning of the PCR fragments and sequence analyses revealed the presence of mixed subpopulations in both virulent and attenuated parasites, with a total of 19 variants with 12 different sequences that differed in number and type of tandem repeats. High levels of intra- and inter-strain diversity of the Bv80 gene, with the presence of mixed populations of parasites, were found in both the virulent field isolates and the attenuated vaccine strains. In addition, during the attenuation process, sequence analyses showed changes in the pattern of the parasite subpopulations. Despite high polymorphism found by sequence analyses, the patterns observed and the number of repeats, order, or motifs found could not discriminate between virulent field isolates and attenuated vaccine strains of the parasite.
Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic

PubMed Central

Yebra, Gonzalo; Hodcroft, Emma B.; Ragonnet-Cronin, Manon L.; Pillay, Deenan; Brown, Andrew J. Leigh; Fraser, Christophe; Kellam, Paul; de Oliveira, Tulio; Dennis, Ann; Hoppe, Anne; Kityo, Cissy; Frampton, Dan; Ssemwanga, Deogratius; Tanser, Frank; Keshani, Jagoda; Lingappa, Jairam; Herbeck, Joshua; Wawer, Maria; Essex, Max; Cohen, Myron S.; Paton, Nicholas; Ratmann, Oliver; Kaleebu, Pontiano; Hayes, Richard; Fidler, Sarah; Quinn, Thomas; Novitsky, Vladimir; Haywards, Andrew; Nastouli, Eleni; Morris, Steven; Clark, Duncan; Kozlakidis, Zisis

2016-01-01

HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows the use of other genes. We aimed to determine which gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree’s using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences. PMID:28008945
Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic.

PubMed

Yebra, Gonzalo; Hodcroft, Emma B; Ragonnet-Cronin, Manon L; Pillay, Deenan; Brown, Andrew J Leigh

2016-12-23

HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows the use of other genes. We aimed to determine which gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree's using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences.
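The sub-sampling design described in this abstract (several sampling depths, 100 replicates each) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the use of sampling without replacement, the rounding of the sample size, and the placeholder sequence identifiers are all assumptions made for illustration.

```python
import random


def subsample_replicates(ids, depth, n_replicates=100, seed=0):
    """Draw replicate sub-samples of sequence identifiers.

    Assumptions (not stated in the abstract): sampling is without
    replacement, and the sample size is the depth times the dataset
    size, rounded to the nearest integer.
    """
    if not 0 < depth <= 1:
        raise ValueError("depth must be in (0, 1]")
    rng = random.Random(seed)          # seeded for reproducibility
    k = round(len(ids) * depth)        # sequences kept per replicate
    return [rng.sample(ids, k) for _ in range(n_replicates)]


# Placeholder identifiers standing in for the 4662 simulated sequences.
sequence_ids = [f"seq{i}" for i in range(4662)]

# One list of replicates per sampling depth named in the abstract.
replicates_by_depth = {
    depth: subsample_replicates(sequence_ids, depth)
    for depth in (1.0, 0.6, 0.2, 0.05)
}
```

Each replicate would then be passed to a tree-building step; that step is left out here because it depends on external software.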
5S ribosomal ribonucleic acid sequences in Bacteroides and Fusobacterium: evolutionary relationships within these genera and among eubacteria in general

NASA Technical Reports Server (NTRS)

Van den Eynde, H.; De Baere, R.; Shah, H. N.; Gharbia, S. E.; Fox, G. E.; Michalik, J.; Van de Peer, Y.; De Wachter, R.

1989-01-01

The 5S ribosomal ribonucleic acid (rRNA) sequences were determined for Bacteroides fragilis, Bacteroides thetaiotaomicron, Bacteroides capillosus, Bacteroides veroralis, Porphyromonas gingivalis, Anaerorhabdus furcosus, Fusobacterium nucleatum, Fusobacterium mortiferum, and Fusobacterium varium. A dendrogram constructed by a clustering algorithm from these sequences, which were aligned with all other hitherto known eubacterial 5S rRNA sequences, showed differences as well as similarities with respect to results derived from 16S rRNA analyses. In the 5S rRNA dendrogram, Bacteroides clustered together with Cytophaga and Fusobacterium, as in 16S rRNA analyses. Intraphylum relationships deduced from 5S rRNAs suggested that Bacteroides is specifically related to Cytophaga rather than to Fusobacterium, as was suggested by 16S rRNA analyses. Previous taxonomic considerations concerning the genus Bacteroides, based on biochemical and physiological data, were confirmed by the 5S rRNA sequence analysis.
HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

PubMed

Wan, Shixiang; Zou, Quan

2017-01-01

Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing has resulted in a shortage of efficient ultra-large biological sequence alignment approaches for coping with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. The experiments on large-scale DNA and protein data sets, which are files of more than 1 GB, showed that HAlign-II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.
Cloning, sequencing and characterization of lipase genes from a polyhydroxyalkanoate- (PHA-) synthesizing Pseudomonas resinovorans

USDA-ARS's Scientific Manuscript database

Lipase (lip) and lipase-specific foldase (lif) genes of a biodegradable polyhydroxyalkanoate- (PHA-) synthesizing Pseudomonas resinovorans NRRL B-2649 were cloned using primers based on consensus sequences, followed by PCR-based genome walking. Sequence analyses showed a putative Lip gene-product (...
Characterization of Dermanyssus gallinae (Acarina: Dermanissydae) by sequence analysis of the ribosomal internal transcribed spacer regions.

PubMed

Potenza, L; Cafiero, M A; Camarda, A; La Salandra, G; Cucchiarini, L; Dachà, M

2009-10-01

In the present work, mites previously identified as Dermanyssus gallinae De Geer (Acari, Mesostigmata) using morphological keys were investigated by molecular tools. The complete internal transcribed spacer 1 (ITS1), 5.8S ribosomal DNA, and ITS2 region of the ribosomal DNA from mites were amplified and sequenced to examine the level of sequence variation and to explore the feasibility of using this region in the identification of this mite. Conserved primers located at the 3' end of 18S and at the 5' start of 28S rRNA genes were used first, and amplified fragments were sequenced. Sequence analyses showed no variation in the 5.8S and ITS2 regions, while slight intraspecific variations involving substitutions as well as deletions were concentrated in the ITS1 region. Based on the sequence analyses, a nested PCR of the ITS2 region followed by RFLP analyses has been set up in an attempt to provide a rapid molecular diagnostic tool for D. gallinae.
RY-Coding and Non-Homogeneous Models Can Ameliorate the Maximum-Likelihood Inferences From Nucleotide Sequence Data with Parallel Compositional Heterogeneity.

PubMed

Ishikawa, Sohta A; Inagaki, Yuji; Hashimoto, Tetsuo

2012-01-01

In phylogenetic analyses of nucleotide sequences, 'homogeneous' substitution models, which assume the stationarity of base composition across a tree, are widely used, albeit individual sequences may bear distinctive base frequencies. In the worst-case scenario, a homogeneous model-based analysis can yield an artifactual union of two distantly related sequences that achieved similar base frequencies in parallel. Such potential difficulty can be countered by two approaches, 'RY-coding' and 'non-homogeneous' models. The former approach converts four bases into purine and pyrimidine to normalize base frequencies across a tree, while the heterogeneity in base frequency is explicitly incorporated in the latter approach. The two approaches have been applied to real-world sequence data; however, their basic properties have not been fully examined by pioneering simulation studies. Here, we assessed the performances of the maximum-likelihood analyses incorporating RY-coding and a non-homogeneous model (RY-coding and non-homogeneous analyses) on simulated data with parallel convergence to similar base composition. Both RY-coding and non-homogeneous analyses showed superior performances compared with homogeneous model-based analyses. Curiously, the performance of RY-coding analysis appeared to be significantly affected by a setting of the substitution process for sequence simulation relative to that of non-homogeneous analysis. The performance of a non-homogeneous analysis was also validated by analyzing a real-world sequence data set with significant base heterogeneity.
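The RY-coding step described in this abstract, converting the four bases into purine (R) and pyrimidine (Y) classes, can be sketched in a few lines of Python. This is only an illustration of the recoding itself, not the authors' analysis; the function names and the choice to pass gaps and ambiguity codes through unchanged are assumptions.

```python
# Purines (A, G) map to R; pyrimidines (C, T, and U for RNA) map to Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}


def ry_code(seq):
    """Recode a nucleotide sequence into R/Y symbols.

    Assumption: gaps and any other symbols are upper-cased and kept
    unchanged rather than being recoded or removed.
    """
    return "".join(RY_MAP.get(base, base) for base in seq.upper())


def base_frequencies(seq, alphabet):
    """Relative frequency of each symbol of `alphabet` within `seq`."""
    counted = [s for s in seq.upper() if s in alphabet]
    return {s: counted.count(s) / len(counted) for s in alphabet}


# Two toy sequences with opposite base composition (AT-rich vs GC-rich).
at_rich = "AAAATTTT"
gc_rich = "GGGGCCCC"

# Their four-state frequencies differ, but after RY-coding both have
# the same composition (half R, half Y).
at_ry = base_frequencies(ry_code(at_rich), "RY")
gc_ry = base_frequencies(ry_code(gc_rich), "RY")
```

The toy pair shows why the recoding can normalize composition: the AT-rich and GC-rich sequences become indistinguishable in R/Y frequencies, at the cost of discarding transition substitutions.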
Genetic analysis of duck circovirus in Pekin ducks from South Korea.

PubMed

Cha, S-Y; Kang, M; Cho, J-G; Jang, H-K

2013-11-01

The genetic organization of the 24 duck circovirus (DuCV) strains detected in commercial Pekin ducks from South Korea between 2011 and 2012 is described in this study. Multiple sequence alignment and phylogenetic analyses were performed on the 24 viral genome sequences as well as on 45 genome sequences available from the GenBank database. Phylogenetic analyses based on the genomic and open reading frame 2/cap sequences demonstrated that all DuCV strains belonged to genotype 1 and were designated in a subcluster under genotype 1. Analysis of the capsid protein amino acid sequences of the 24 Korean DuCV strains showed 10 substitutions compared with that of other genotype 1 strains. Our analysis showed that genotype 1 is predominant and circulating in South Korea. These present results serve as incentive to add more data to the DuCV database and provide insight to conduct further intensive study on the geographic relationships among these virus strains.
A population genetics analysis in clinical isolates of Sporothrix schenckii based on calmodulin and calcium/calmodulin-dependent kinase partial gene sequences.

PubMed

Rangel-Gamboa, Lucia; Martinez-Hernandez, Fernando; Maravilla, Pablo; Flisser, Ana

2018-02-02

Sporotrichosis is a subcutaneous mycosis that is caused by diverse species of Sporothrix. High levels of genetic diversity in Sporothrix isolates have been reported, but few population genetics analyses have been documented. The aim of this study was to analyse the genetic variability and population genetics relations of Sporothrix schenckii Mexican clinical isolates and to compare them with other reported isolates. We studied the partial sequences of calmodulin and calcium/calmodulin-dependent kinase genes in 24 isolates: 22 from Mexico, one from Colombia, and one ATCC® 6331™; the latter was used as a positive control. Phylogenetic, haplotype and population genetic analyses were performed with 24 sequences obtained by us and 345 sequences obtained from GenBank. The frequency of S. schenckii sensu stricto was 81% in the 22 Mexican isolates, while the remaining 19% were Sporothrix globosa. Mexican S. schenckii sensu stricto had high genetic diversity and was related to isolates from South America. In contrast, S. globosa showed one haplotype related to isolates from Asia, Brazil, Spain and the USA. In S. schenckii sensu stricto, S. brasiliensis and S. globosa, haplotype polymorphism (θ) values were higher than the nucleotide diversity data (π). In addition, Tajima's D and Fu and Li's tests displayed negative values, suggesting directional selection and arguing against the model of neutral evolution in these populations. In addition, analyses showed that calcium/calmodulin-dependent kinase was a suitable genetic marker to discriminate between common Sporothrix species. © 2018 Blackwell Verlag GmbH.
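The relationship between θ, π and the sign of Tajima's D mentioned in this abstract can be illustrated with a small Python sketch. It assumes that θ is Watterson's estimator (the abstract does not say which estimator was used), that the input is a gap-free alignment, and it uses invented toy sequences. It computes only per-site π and θ; the sign of Tajima's D follows the sign of π minus θ, because the statistic divides that difference by a positive standard deviation.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Per-site pi: mean number of pairwise differences / alignment length."""
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two sequences of equal length")
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs) / len(seqs[0])


def watterson_theta(seqs):
    """Per-site Watterson estimator: S / a1 / alignment length."""
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two sequences of equal length")
    # S: number of alignment columns with more than one state.
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    # a1: harmonic number over 1 .. n-1, where n is the sample size.
    a1 = sum(1 / i for i in range(1, len(seqs)))
    return segregating / a1 / len(seqs[0])


# Toy alignment (hypothetical data): one singleton variant among four
# sequences, i.e. an excess of rare variants.
toy = ["AAAA", "AAAT", "AAAA", "AAAA"]
pi = nucleotide_diversity(toy)
theta = watterson_theta(toy)
tajima_d_is_negative = pi < theta
```

With an excess of rare variants, as in the toy alignment, θ exceeds π and the sign of Tajima's D is negative, which is the pattern the abstract reports.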
Sequence variation and phylogenetic analysis of envelope glycoprotein of hepatitis G virus.

PubMed

Lim, M Y; Fry, K; Yun, A; Chong, S; Linnen, J; Fung, K; Kim, J P

1997-11-01

A transfusion-transmissible agent provisionally designated hepatitis G virus (HGV) was recently identified. In this study, we examined the variability of the HGV genome by analysing sequences in the putative envelope region from 72 isolates obtained from diverse geographical sources. The 1561 nucleotide sequence of the E1/E2/NS2a region of HGV was determined from 12 isolates, and compared with three published sequences. The most variability was observed in 400 nucleotides at the N terminus of E2. We next analysed this 400 nucleotide envelope variable region (EV) from an additional 60 HGV isolates. This sequence varied considerably among the 75 isolates, with overall identity ranging from 79.3% to 99.5% at the nucleotide level, and from 83.5% to 100% at the amino acid level. However, hypervariable regions were not identified. Phylogenetic analyses indicated that the 75 HGV isolates belong to a single genotype. A single-tier distribution of evolutionary distances was observed among the 15 E1/E2/NS2a sequences and the 75 EV sequences. In contrast, 11 isolates of HCV were analysed and showed a three-tiered distribution, representing genotypes, subtypes, and isolates. The 75 isolates of HGV fell into four clusters on the phylogenetic tree. Tight geographical clustering was observed among the HGV isolates from Japan and Korea.
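Identity ranges such as the 79.3% to 99.5% reported in this abstract come from comparing every pair of aligned sequences. The Python sketch below shows one common way to compute such a range. The convention of skipping columns where either sequence has a gap is an assumption (the abstract does not state how identity was calculated), and the example sequences are invented.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns with a gap ('-') in either sequence are
    excluded from both the numerator and the denominator.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_range(seqs):
    """Minimum and maximum pairwise percent identity over all pairs."""
    values = [percent_identity(a, b) for a, b in combinations(seqs, 2)]
    return min(values), max(values)


# Hypothetical aligned fragments, for illustration only.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAA", "ACGTACGGAA"]
lowest, highest = identity_range(toy_alignment)
```

The same functions apply to aligned amino acid sequences, which is how a separate range at the protein level would be obtained.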
A new fungal large subunit ribosomal RNA primer for high throughput sequencing surveys

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mueller, Rebecca C.; Gallegos-Graves, La Verne; Kuske, Cheryl R.

The inclusion of phylogenetic metrics in community ecology has provided insights into important ecological processes, particularly when combined with high-throughput sequencing methods; however, these approaches have not been widely used in studies of fungal communities relative to other microbial groups. Two obstacles have been considered: (1) the internal transcribed spacer (ITS) region has limited utility for constructing phylogenies and (2) most PCR primers that target the large subunit (LSU) ribosomal unit generate amplicons that exceed current limits of high-throughput sequencing platforms. We designed and tested a PCR primer (LR22R) to target an approximately 300–400 bp region of the D2 hypervariable region of the fungal LSU for use with the Illumina MiSeq platform. Both in silico and empirical analyses showed that the LR22R–LR3 pair captured a broad range of fungal taxonomic groups with a small fraction of non-fungal groups. Phylogenetic placement of publicly available LSU D2 sequences showed broad agreement with taxonomic classification. Comparisons of the LSU D2 and the ITS2 ribosomal regions from environmental samples and known communities showed similar discriminatory abilities of the two primer sets. Altogether, these findings show that the LR22R–LR3 primer pair has utility for phylogenetic analyses of fungal communities using high-throughput sequencing methods.
A new fungal large subunit ribosomal RNA primer for high throughput sequencing surveys

DOE PAGES

Mueller, Rebecca C.; Gallegos-Graves, La Verne; Kuske, Cheryl R.

2015-12-09

The inclusion of phylogenetic metrics in community ecology has provided insights into important ecological processes, particularly when combined with high-throughput sequencing methods; however, these approaches have not been widely used in studies of fungal communities relative to other microbial groups. Two obstacles have been considered: (1) the internal transcribed spacer (ITS) region has limited utility for constructing phylogenies and (2) most PCR primers that target the large subunit (LSU) ribosomal unit generate amplicons that exceed current limits of high-throughput sequencing platforms. We designed and tested a PCR primer (LR22R) to target an approximately 300–400 bp region of the D2 hypervariable region of the fungal LSU for use with the Illumina MiSeq platform. Both in silico and empirical analyses showed that the LR22R–LR3 pair captured a broad range of fungal taxonomic groups with a small fraction of non-fungal groups. Phylogenetic placement of publicly available LSU D2 sequences showed broad agreement with taxonomic classification. Comparisons of the LSU D2 and the ITS2 ribosomal regions from environmental samples and known communities showed similar discriminatory abilities of the two primer sets. Altogether, these findings show that the LR22R–LR3 primer pair has utility for phylogenetic analyses of fungal communities using high-throughput sequencing methods.
Satellite DNA Sequences in Canidae and Their Chromosome Distribution in Dog and Red Fox.

PubMed

Vozdova, Miluse; Kubickova, Svatava; Cernohorska, Halina; Fröhlich, Jan; Rubes, Jiri

2016-01-01

Satellite DNA is a characteristic component of mammalian centromeric heterochromatin, and a comparative analysis of its evolutionary dynamics can be used for phylogenetic studies. We analysed satellite and satellite-like DNA sequences available in NCBI for 4 species of the family Canidae (red fox, Vulpes vulpes, VVU; domestic dog, Canis familiaris, CFA; arctic fox, Vulpes lagopus, VLA; raccoon dog, Nyctereutes procyonoides procyonoides, NPR) by comparative sequence analysis, which revealed 86-90% intraspecies and 76-79% interspecies similarity. Comparative fluorescence in situ hybridisation in the red fox and dog showed signals of the red fox satellite probe in canine and vulpine autosomal centromeres, on VVUY, B chromosomes, and in the distal parts of VVU9q and VVU10p which were shown to contain nucleolus organiser regions. The CFA satellite probe stained autosomal centromeres only in the dog. The CFA satellite-like DNA did not show any significant sequence similarity with the satellite DNA of any species analysed and was localised to the centromeres of 9 canine chromosome pairs. No significant heterochromatin block was detected on the B chromosomes of the red fox. Our results show extensive heterogeneity of satellite sequences among Canidae and prove close evolutionary relationships between the red and arctic fox. © 2017 S. Karger AG, Basel.
Evolution and molecular epidemiology of classical swine fever virus during a multi-annual outbreak amongst European wild boar.

PubMed

Goller, Katja V; Gabriel, Claudia; Dimna, Mireille Le; Le Potier, Marie-Frédérique; Rossi, Sophie; Staubach, Christoph; Merboth, Matthias; Beer, Martin; Blome, Sandra

2016-03-01

Classical swine fever is a viral disease of pigs that carries tremendous socio-economic impact. In outbreak situations, genetic typing is carried out for the purpose of molecular epidemiology in both domestic pigs and wild boar. These analyses are usually based on harmonized partial sequences. However, for high-resolution analyses towards the understanding of genetic variability and virus evolution, full-genome sequences are more appropriate. In this study, a unique set of representative virus strains was investigated that was collected during an outbreak in French free-ranging wild boar in the Vosges-du-Nord mountains between 2003 and 2007. Comparative sequence and evolutionary analyses of the nearly full-length sequences showed only slow evolution of classical swine fever virus strains over the years and no impact of vaccination on mutation rates. However, substitution rates varied amongst protein genes; furthermore, a spatial and temporal pattern could be observed whereby two separate clusters were formed that coincided with physical barriers.
Hepatitis A virus genotypes and strains from an endemic area of Europe, Bulgaria 2012-2014.

PubMed

Bruni, Roberto; Taffon, Stefania; Equestre, Michele; Cella, Eleonora; Lo Presti, Alessandra; Costantino, Angela; Chionne, Paola; Madonna, Elisabetta; Golkocheva-Markova, Elitsa; Bankova, Diljana; Ciccozzi, Massimo; Teoharov, Pavel; Ciccaglione, Anna Rita

2017-07-14

Hepatitis A virus (HAV) infection is endemic in Eastern European and Balkan region countries. In 2012, Bulgaria showed the highest rate (67.13 cases per 100,000) in Europe. Nevertheless, HAV genotypes and strains circulating in this country have never been described. The present study reports the molecular characterization of HAV from 105 patients from Bulgaria. Anti-HAV IgM positive serum samples collected in 2012-2014 from different towns and villages in Bulgaria were analysed by nested RT-PCR, sequencing of the VP1/2A region and phylogenetic analysis; the results were analysed together with patient and geographical data. Phylogenetic analysis revealed two main sequence groups corresponding to the IA (78/105, 74%) and IB (27/105, 26%) sub-genotypes. In the IA group, a major and a minor cluster were observed (62 and 16 sequences, respectively). Most sequences from the major cluster (44/62, 71%) belonged to either of two strains, termed "strain 1" and "strain 2", differing only by a single specific nucleotide; the remaining sequences (18/62, 29%) showed few (1 to 4) nucleotide variations with respect to strains 1 and 2. Strain 2 is identical to the strain previously responsible for an outbreak in the Czech Republic in 2008 and a large multi-country European outbreak caused by contaminated mixed frozen berries in 2013. Most sequences of the IA minor cluster and the IB group were detected in large/medium centers (LMCs). Overall, sequences from the IA major cluster were more frequent in small centers (SCs), but strain 1 and strain 2 showed an opposite relative frequency in SCs and LMCs (strain 1 more frequent in SCs, strain 2 in LMCs). Genotype IA predominated in Bulgaria in 2012-2014 and phylogenetic analysis identified a major cluster of highly related or identical IA sequences, representing 59% of the analysed cases; these isolates were mostly detected in SCs, in which HAV shows higher endemicity than in LMCs. The distribution of viral sequences suggests the existence of some differences between the transmission routes in SCs and LMCs. Molecular characterization of an increased number of isolates from Bulgaria, regularly collected over time, will be useful to explore specific transmission routes and plan appropriate preventive measures.
Analysis of alkaptonuria (AKU) mutations and polymorphisms reveals that the CCC sequence motif is a mutational hot spot in the homogentisate 1,2 dioxygenase gene (HGO).

PubMed Central

Beltrán-Valero de Bernabé, D; Jimenez, F J; Aquaron, R; Rodríguez de Córdoba, S

1999-01-01

We recently showed that alkaptonuria (AKU) is caused by loss-of-function mutations in the homogentisate 1,2 dioxygenase gene (HGO). Herein we describe haplotype and mutational analyses of HGO in seven new AKU pedigrees. These analyses identified two novel single-nucleotide polymorphisms (INV4+31A-->G and INV11+18A-->G) and six novel AKU mutations (INV1-1G-->A, W60G, Y62C, A122D, P230T, and D291E), which further illustrates the remarkable allelic heterogeneity found in AKU. Reexamination of all 29 mutations and polymorphisms thus far described in HGO shows that these nucleotide changes are not randomly distributed; the CCC sequence motif and its inverted complement, GGG, are preferentially mutated. These analyses also demonstrated that the nucleotide substitutions in HGO do not involve CpG dinucleotides, which illustrates important differences between HGO and other genes for the occurrence of mutation at specific short-sequence motifs. Because the CCC sequence motifs comprise a significant proportion (34.5%) of all mutated bases that have been observed in HGO, we conclude that the CCC triplet is a mutational hot spot in HGO. PMID:10205262
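The kind of check described in this abstract, asking what share of mutated bases falls inside the CCC motif or its inverted complement GGG, can be sketched in Python as follows. This is an illustrative reconstruction rather than the authors' method: the function names, the 0-based position convention, and the toy sequence and mutation positions are all hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A<->T, C<->G)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def motif_positions(seq, motif="CCC"):
    """0-based positions covered by the motif or its reverse complement.

    Overlapping occurrences are all counted, so 'CCCC' covers four
    positions.
    """
    seq = seq.upper()
    covered = set()
    for pattern in {motif.upper(), reverse_complement(motif)}:
        start = seq.find(pattern)
        while start != -1:
            covered.update(range(start, start + len(pattern)))
            start = seq.find(pattern, start + 1)
    return covered


def fraction_in_motif(seq, mutated_positions, motif="CCC"):
    """Share of the mutated positions that lie inside a motif occurrence."""
    covered = motif_positions(seq, motif)
    hits = sum(pos in covered for pos in mutated_positions)
    return hits / len(mutated_positions)


# Hypothetical sequence and mutation positions, for illustration only.
toy_gene = "ATCCCGTGGGA"
toy_mutations = [3, 5, 8, 10]
share = fraction_in_motif(toy_gene, toy_mutations)
```

To judge whether a motif is a hot spot, the observed share would be compared with the share of the sequence that the motif covers overall; that comparison is not shown here.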
Complete genome sequence and phylogenetic analyses of an aquabirnavirus isolated from a diseased marbled eel culture in Taiwan.

PubMed

Wen, Chiu-Ming

2017-08-01

An aquabirnavirus was isolated from diseased marbled eels (Anguilla marmorata; MEIPNV1310) with gill haemorrhages and associated mortality. Its genome segment sequences were obtained through next-generation sequencing and compared with published aquabirnavirus sequences. The results indicated that the genome sequence of MEIPNV1310 contains segment A (3099 nucleotides) and segment B (2789 nucleotides). Phylogenetic analysis showed that MEIPNV1310 is closely related to the infectious pancreatic necrosis Ab strain within genogroup II. This genome sequence is beneficial for studying the geographic distribution and evolution of aquabirnaviruses.
Complete Genome Sequence of a Multidrug-Resistant Salmonella enterica Serovar Typhimurium var. 5- Strain Isolated from Chicken Breast.

PubMed

Hoffmann, Maria; Muruvanda, Tim; Allard, Marc W; Korlach, Jonas; Roberts, Richard J; Timme, Ruth; Payne, Justin; McDermott, Patrick F; Evans, Peter; Meng, Jianghong; Brown, Eric W; Zhao, Shaohua

2013-12-19

Salmonella enterica subsp. enterica serovar Typhimurium is a leading cause of salmonellosis. Here, we report a closed genome sequence, including sequences of 3 plasmids, of Salmonella serovar Typhimurium var. 5- CFSAN001921 (National Antimicrobial Resistance Monitoring System [NARMS] strain ID N30688), which was isolated from chicken breast meat and shows resistance to 10 different antimicrobials. Whole-genome and plasmid sequence analyses of this isolate will help enhance our understanding of this pathogenic multidrug-resistant serovar.
Genomewide Analysis of the Antimicrobial Peptides in Python bivittatus and Characterization of Cathelicidins with Potent Antimicrobial Activity and Low Cytotoxicity.

PubMed

Kim, Dayeong; Soundrarajan, Nagasundarapandian; Lee, Juyeon; Cho, Hye-Sun; Choi, Minkyeung; Cha, Se-Yeoun; Ahn, Byeongyong; Jeon, Hyoim; Le, Minh Thong; Song, Hyuk; Kim, Jin-Hoi; Park, Chankyu

2017-09-01

In this study, we sought to identify novel antimicrobial peptides (AMPs) in Python bivittatus through bioinformatic analyses of publicly available genome information and experimental validation. In our analysis of the python genome, we identified 29 AMP-related candidate sequences. Of these, we selected five cathelicidin-like sequences and subjected them to further in silico analyses. The results showed that these sequences likely have antimicrobial activity. The sequences were named Pb-CATH1 to Pb-CATH5 according to their sequence similarity to previously reported snake cathelicidins. We predicted their molecular structure and then chemically synthesized the mature peptide for three putative cathelicidins and subjected them to biological activity tests. Interestingly, all three peptides showed potent antimicrobial effects against Gram-negative bacteria but very weak activity against Gram-positive bacteria. Remarkably, ΔPb-CATH4 showed potent activity against antibiotic-resistant clinical isolates and also was observed to possess very low hemolytic activity and cytotoxicity. ΔPb-CATH4 also showed considerable serum stability. Electron microscopic analysis indicated that ΔPb-CATH4 exerts its effects via toroidal pore preformation. Structural comparison of the cathelicidins identified in this study to previously reported ones revealed that these Pb-CATHs are representatives of a new group of reptilian cathelicidins lacking the acidic connecting domain. Furthermore, Pb-CATH4 possesses a completely different mature peptide sequence from those of previously described reptilian cathelicidins. These new AMPs may be candidates for the development of alternatives to or complements of antibiotics to control multidrug-resistant pathogens. Copyright © 2017 American Society for Microbiology.

Genomewide Analysis of the Antimicrobial Peptides in Python bivittatus and Characterization of Cathelicidins with Potent Antimicrobial Activity and Low Cytotoxicity

PubMed Central

Kim, Dayeong; Soundrarajan, Nagasundarapandian; Lee, Juyeon; Cho, Hye-sun; Choi, Minkyeung; Cha, Se-Yeoun; Ahn, Byeongyong; Jeon, Hyoim; Le, Minh Thong; Song, Hyuk; Kim, Jin-Hoi

2017-01-01

In this study, we sought to identify novel antimicrobial peptides (AMPs) in Python bivittatus through bioinformatic analyses of publicly available genome information and experimental validation. In our analysis of the python genome, we identified 29 AMP-related candidate sequences. Of these, we selected five cathelicidin-like sequences and subjected them to further in silico analyses. The results showed that these sequences likely have antimicrobial activity. The sequences were named Pb-CATH1 to Pb-CATH5 according to their sequence similarity to previously reported snake cathelicidins. We predicted their molecular structure and then chemically synthesized the mature peptide for three putative cathelicidins and subjected them to biological activity tests. Interestingly, all three peptides showed potent antimicrobial effects against Gram-negative bacteria but very weak activity against Gram-positive bacteria. Remarkably, ΔPb-CATH4 showed potent activity against antibiotic-resistant clinical isolates and also was observed to possess very low hemolytic activity and cytotoxicity. ΔPb-CATH4 also showed considerable serum stability. Electron microscopic analysis indicated that ΔPb-CATH4 exerts its effects via toroidal pore preformation. Structural comparison of the cathelicidins identified in this study to previously reported ones revealed that these Pb-CATHs are representatives of a new group of reptilian cathelicidins lacking the acidic connecting domain. Furthermore, Pb-CATH4 possesses a completely different mature peptide sequence from those of previously described reptilian cathelicidins. These new AMPs may be candidates for the development of alternatives to or complements of antibiotics to control multidrug-resistant pathogens. PMID:28630199
Variability and genetic structure of the population of watermelon mosaic virus infecting melon in Spain.

PubMed

Moreno, I M; Malpica, J M; Díaz-Pendón, J A; Moriones, E; Fraile, A; García-Arenal, F

2004-01-05

The genetic structure of the population of Watermelon mosaic virus (WMV) in Spain was analysed by the biological and molecular characterisation of isolates sampled from its main host plant, melon. The population was a highly homogeneous one, built of a single pathotype, and comprising isolates closely related genetically. There was indication of temporal replacement of genotypes, but not of spatial structure of the population. Analyses of nucleotide sequences in three genomic regions, that is, in the cistrons for the P1, cylindrical inclusion (CI) and capsid (CP) proteins, showed lower similar values of nucleotide diversity for the P1 than for the CI or CP cistrons. The CI protein and the CP were under tighter evolutionary constraints than the P1 protein. Also, for the CI and CP cistrons, but not for the P1 cistron, two groups of sequences, defining two genetic strains, were apparent. Thus, different genomic regions of WMV show different evolutionary dynamics. Interestingly, for the CI and CP cistrons, sequences were clustered into two regions of the sequence space, defining the two strains above, and no intermediary sequences were identified. Recombinant isolates were found, accounting for at least 7% of the population. These recombinants presented two interesting features: (i) crossover points were detected between the analysed regions in the CI and CP cistrons, but not between those in the P1 and CI cistrons, (ii) crossover points were not observed within the analysed coding regions for the P1, CI or CP proteins. This indicates strong selection against isolates with recombinant proteins, even when originated from closely related strains. Hence, data indicate that genotypes of WMV, generated by mutation or recombination, outside of acceptable, discrete, regions in the evolutionary space, are eliminated from the virus population by negative selection.
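The nucleotide diversity values compared in the abstract above can be illustrated with a minimal sketch. It assumes the common definition of nucleotide diversity as the mean proportion of differing sites over all pairs of aligned sequences; the function names and the three toy sequences are hypothetical and are not taken from the study.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs):
    """Mean pairwise p-distance over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical aligned fragments standing in for one genomic region.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(round(nucleotide_diversity(toy_alignment), 4))
```

Running the same function separately on alignments of each genomic region would give one diversity value per region, which is the kind of comparison the abstract reports for the P1, CI and CP cistrons.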
Complex dissemination of the diversified mcr-1-harbouring plasmids in Escherichia coli of different sequence types

PubMed Central

Lin, Jingxia; Wang, Xiuna; Deng, Xianbo; Feng, Youjun

2016-01-01

The emergence of the mobilized colistin resistance gene, representing a novel mechanism for bacterial drug resistance, challenges the last resort against severe infections by multidrug-resistant Gram-negative bacteria. Very recently, we showed the diversity in the mcr-1-carrying plasmid reservoirs from the gut microbiota. Here, we report that a similar but more complex scenario is present in healthy swine populations in Southern China in 2016. Among 1026 Escherichia coli isolates from 3 different pig farms, 302 were positive for the mcr-1 gene (30%, 302/1026). Multi-locus sequence typing assigned no fewer than 11 sequence types, including one novel sequence type, to these mcr-1-positive strains. PCR analyses combined with direct DNA sequencing revealed unexpected complexity of the mcr-1-harbouring plasmids, whose backbones fall into at least 6 types, four of which are new. Transcriptional analyses showed that mcr-1 promoters of different origins exhibit similar activity. It seems likely that complex dissemination of the diversified mcr-1-bearing plasmids occurs among E. coli of various sequence types inhabiting healthy swine populations in Southern China. PMID:27741523
Draft Whole Genome Sequence Analyses on Pseudomonas syringae pv. actinidiae Hypersensitive Response Negative Strains Detected from Kiwifruit Bleeding Sap Samples.

PubMed

Biondi, Enrico; Zamorano, Alan; Vega, Ernesto; Ardizzi, Stefano; Sitta, Davide; De Salvador, Flavio Roberto; Campos-Vargas, Reinaldo; Meneses, Claudio; Perez, Set; Bertaccini, Assunta; Fiore, Nicola

2018-05-01

Kiwifruit bleeding sap samples, collected in Italian and Chilean orchards from symptomatic and asymptomatic plants, were evaluated for the presence of Pseudomonas syringae pv. actinidiae, the causal agent of bacterial canker. The saps were sampled during the spring in both hemispheres, before the bud sprouting, during the optimal time window for the collection of an adequate volume of sample for the early detection of the pathogen, preliminarily by molecular assays, and then through its direct isolation and identification. The results of molecular analyses showed more effectiveness in the P. syringae pv. actinidiae detection when compared with those of microbiological analyses through the pathogen isolation on the nutritive and semiselective media selected. The bleeding sap analyses allowed the isolation and identification of two hypersensitive response (HR) negative and hypovirulent P. syringae pv. actinidiae strains from different regions in Italy. Moreover, multilocus sequence analysis (MLSA) and whole genome sequence (WGS) were carried out on selected Italian and Chilean P. syringae pv. actinidiae virulent strains to verify the presence of genetic variability compared with the HR negative strains and to compare the variability of selected gene clusters between strains isolated in both countries. All the strains showed the lack of argK and coronatine gene clusters as reported for the biovar 3 P. syringae pv. actinidiae strains. Despite the biologic differences obtained in the tobacco bioassays and in pathogenicity assays, the MLSA and WGS analyses did not show significant differences between the WGS of the HR negative and HR positive strains; the difference, on the other hand, between PAC_ICE sequences of Italian and Chilean P. syringae pv. actinidiae strains was confirmed. 
The inability of the hypovirulent strains IPV-BO 8893 and IPV-BO 9286 to provoke HR in tobacco and the low virulence shown in this host could not be associated with mutations or recombinations in T3SS island.
Fossil rhabdoviral sequences integrated into arthropod genomes: ontogeny, evolution, and potential functionality.

PubMed

Fort, Philippe; Albertini, Aurélie; Van-Hua, Aurélie; Berthomieu, Arnaud; Roche, Stéphane; Delsuc, Frédéric; Pasteur, Nicole; Capy, Pierre; Gaudin, Yves; Weill, Mylène

2012-01-01

Retroelements represent a considerable fraction of many eukaryotic genomes and are considered major drivers of adaptive genetic innovations. Recent discoveries showed that despite not normally using DNA intermediates like retroviruses do, Mononegaviruses (i.e., viruses with nonsegmented, negative-sense RNA genomes) can integrate gene fragments into the genomes of their hosts. This was shown for Bornaviridae and Filoviridae, the sequences of which have been found integrated into the germ line cells of many vertebrate hosts. Here, we show that sequences of Rhabdoviridae, the major Mononegavirales family, have integrated only into the genomes of arthropod species. We identified 185 integrated rhabdoviral elements (IREs) coding for nucleoproteins, glycoproteins, or RNA-dependent RNA polymerases; they were mostly found in the genomes of the mosquito Aedes aegypti and the blacklegged tick Ixodes scapularis. Phylogenetic analyses showed that most IREs in A. aegypti derived from multiple independent integration events. Since RNA viruses are subject to much higher substitution rates than their hosts, IREs thus represent fossil traces of the diversity of extinct Rhabdoviruses. Furthermore, analyses of orthologous IREs in A. aegypti field mosquitoes sampled worldwide identified an integrated polymerase IRE fragment that appeared under purifying selection within several million years, which supports a functional role in the host's biology. These results show that A. aegypti was subjected to repeated Rhabdovirus infectious episodes during its evolutionary history, which led to the accumulation of many integrated sequences. They also suggest that like retroviruses, integrated rhabdoviral sequences may participate actively in the evolution of their hosts.
The maize stripe virus major noncapsid protein messenger RNA transcripts contain heterogeneous leader sequences at their 5' termini.

PubMed

Huiet, L; Feldstein, P A; Tsai, J H; Falk, B W

1993-12-01

Primer extension analyses and a PCR-based cloning strategy were used to identify and characterize 5' nucleotide sequences on the maize stripe virus (MStV) RNA4 mRNA transcripts encoding the major noncapsid protein (NCP). Direct RNA sequence analysis by primer extension showed that the NCP mRNA transcripts had 10-15 nucleotides beyond the 5' terminus of the MStV RNA4 nucleotide sequence. MStV genomic RNAs isolated from ribonucleoprotein particles (RNPs) lacked the additional 5' nucleotides. cDNA clones representing the 5' region of the mRNA transcripts were constructed, and the nucleotide sequences of the 5' regions were determined for 16 clones. Each was found to have a distinct 10-15 nucleotide sequence immediately 5' of the MStV RNA4 sequence. Eleven of 16 clones had the correct MStV RNA4 5' nucleotide sequence, while five showed minor variations at or near the 5' most MStV RNA4 nucleotide. These characteristics show strong similarities to other viral mRNA transcripts which are synthesized by cap snatching.
The W22 genome: a foundation for maize functional genomics and transposon biology

USDA-ARS's Scientific Manuscript database

The maize W22 inbred has served as a platform for maize genetics since the mid twentieth century. To streamline maize genome analyses, we have sequenced and de novo assembled a W22 reference genome using small-read sequencing technologies. We show that significant structural heterogeneity exists in ...
A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria.

PubMed

Gaby, John Christian; Buckley, Daniel H

2014-01-01

We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm.
A comprehensive aligned nifH gene database: a multipurpose tool for studies of nitrogen-fixing bacteria

PubMed Central

Gaby, John Christian; Buckley, Daniel H.

2014-01-01

We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as strains having <3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm PMID:24501396
Genetic and structural analyses of cytochrome P450 hydroxylases in sex hormone biosynthesis: Sequential origin and subsequent coevolution.

PubMed

Goldstone, Jared V; Sundaramoorthy, Munirathinam; Zhao, Bin; Waterman, Michael R; Stegeman, John J; Lamb, David C

2016-01-01

Biosynthesis of steroid hormones in vertebrates involves three cytochrome P450 hydroxylases, CYP11A1, CYP17A1 and CYP19A1, which catalyze sequential steps in steroidogenesis. These enzymes are conserved in the vertebrates, but their origin and existence in other chordate subphyla (Tunicata and Cephalochordata) have not been clearly established. In this study, selected protein sequences of CYP11A1, CYP17A1 and CYP19A1 were compiled and analyzed using multiple sequence alignment and phylogenetic analysis. Our analyses show that cephalochordates have sequences orthologous to vertebrate CYP11A1, CYP17A1 or CYP19A1, and that echinoderms and hemichordates possess CYP11-like but not CYP19 genes. While the cephalochordate sequences have low identity with the vertebrate sequences, reflecting evolutionary distance, the data show apparent origin of CYP11 prior to the evolution of CYP19 and possibly CYP17, thus indicating a sequential origin of these functionally related steroidogenic CYPs. Co-occurrence of the three CYPs in early chordates suggests that the three genes may have coevolved thereafter, and that functional conservation should be reflected in functionally important residues in the proteins. CYP19A1 has the largest number of conserved residues while CYP11A1 sequences are less conserved. Structural analyses of human CYP11A1, CYP17A1 and CYP19A1 show that critical substrate binding site residues are highly conserved in each enzyme family. The results emphasize that the steroidogenic pathways producing glucocorticoids and reproductive steroids are several hundred million years old and that the catalytic structural elements of the enzymes have been conserved over the same period of time. Analysis of these elements may help to identify when precursor functions linked to these enzymes first arose. Copyright © 2015 Elsevier Inc. All rights reserved.
Stability of operational taxonomic units: an important but neglected property for analyzing microbial diversity.

PubMed

He, Yan; Caporaso, J Gregory; Jiang, Xiao-Tao; Sheng, Hua-Fang; Huse, Susan M; Rideout, Jai Ram; Edgar, Robert C; Kopylova, Evguenia; Walters, William A; Knight, Rob; Zhou, Hong-Wei

2015-01-01

The operational taxonomic unit (OTU) is widely used in microbial ecology. Reproducibility in microbial ecology research depends on the reliability of OTU-based 16S ribosomal subunit RNA (rRNA) analyses. Here, we report that many hierarchical and greedy clustering methods produce unstable OTUs, with membership that depends on the number of sequences clustered. If OTUs are regenerated with additional sequences or samples, sequences originally assigned to a given OTU can be split into different OTUs. Alternatively, sequences assigned to different OTUs can be merged into a single OTU. This OTU instability affects alpha-diversity analyses such as rarefaction curves, beta-diversity analyses such as distance-based ordination (for example, Principal Coordinate Analysis (PCoA)), and the identification of differentially represented OTUs. Our results show that the proportion of unstable OTUs varies for different clustering methods. We found that the closed-reference method is the only one that produces completely stable OTUs, with the caveat that sequences that do not match a pre-existing reference sequence collection are discarded. As a compromise to the factors listed above, we propose using an open-reference method to enhance OTU stability. This type of method clusters sequences against a database and includes unmatched sequences by clustering them via a relatively stable de novo clustering method. OTU stability is an important consideration when analyzing microbial diversity and is a feature that should be taken into account during the development of novel OTU clustering methods.
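The instability described in the abstract above can be reproduced with a toy sketch. It assumes a simplified abundance-ordered greedy centroid clustering and a simplified closed-reference assignment, both using Hamming distance on equal-length strings; the function names, the four-letter "reads", the counts and the threshold are all hypothetical and do not represent any specific tool evaluated in the study.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def greedy_otus(counts, max_mismatches=1):
    """De novo greedy clustering: the most abundant reads become centroids first."""
    centroids = []
    assignment = {}
    # Sort by decreasing abundance; ties are broken alphabetically for determinism.
    for seq in sorted(counts, key=lambda s: (-counts[s], s)):
        for centroid in centroids:
            if hamming(seq, centroid) <= max_mismatches:
                assignment[seq] = centroid
                break
        else:
            centroids.append(seq)
            assignment[seq] = seq
    return assignment


def closed_reference_otus(counts, references, max_mismatches=1):
    """Assign reads to fixed references; reads with no match are discarded."""
    assignment = {}
    for seq in counts:
        for ref in references:
            if hamming(seq, ref) <= max_mismatches:
                assignment[seq] = ref
                break
    return assignment


# Hypothetical read counts before and after adding more samples.
first_run = {"AAAA": 10, "AAAT": 5, "AATT": 3}
second_run = {"AAAA": 10, "AAAT": 12, "AATT": 3}

print(greedy_otus(first_run))   # AAAA and AATT fall into different OTUs
print(greedy_otus(second_run))  # all three reads merge into one OTU
print(closed_reference_otus(first_run, ["AAAA"]))
print(closed_reference_otus(second_run, ["AAAA"]))
```

In this toy case the de novo result changes only because the extra reads reorder the abundance ranking, whereas the closed-reference assignment is identical in both runs at the cost of discarding the read that matches no reference.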
Two alternative ways of start site selection in human norovirus reinitiation of translation.

PubMed

Luttermann, Christine; Meyers, Gregor

2014-04-25

The calicivirus minor capsid protein VP2 is expressed via termination/reinitiation. This process depends on an upstream sequence element denoted termination upstream ribosomal binding site (TURBS). We have shown for feline calicivirus and rabbit hemorrhagic disease virus that the TURBS contains three sequence motifs essential for reinitiation. Motif 1 is conserved among caliciviruses and is complementary to a sequence in the 18 S rRNA leading to the model that hybridization between motif 1 and 18 S rRNA tethers the post-termination ribosome to the mRNA. Motif 2 and motif 2* are proposed to establish a secondary structure positioning the ribosome relative to the start site of the terminal ORF. Here, we analyzed human norovirus (huNV) sequences for the presence and importance of these motifs. The three motifs were identified by sequence analyses in the region upstream of the VP2 start site, and we showed that these motifs are essential for reinitiation of huNV VP2 translation. More detailed analyses revealed that the site of reinitiation is not fixed to a single codon and does not need to be an AUG, even though this codon is clearly preferred. Interestingly, we were able to show that reinitiation can occur at AUG codons downstream of the canonical start/stop site in huNV and feline calicivirus but not in rabbit hemorrhagic disease virus. Although reinitiation at the original start site is independent of the Kozak context, downstream initiation exhibits requirements for start site sequence context known for linear scanning. These analyses on start codon recognition give a more detailed insight into this fascinating mechanism of gene expression.
Predicting 3D structure and stability of RNA pseudoknots in monovalent and divalent ion solutions.

PubMed

Shi, Ya-Zhou; Jin, Lei; Feng, Chen-Jie; Tan, Ya-Lan; Tan, Zhi-Jie

2018-06-01

RNA pseudoknots are a kind of minimal RNA tertiary structural motif, and their three-dimensional (3D) structures and stability play essential roles in a variety of biological functions. Therefore, predicting the 3D structures and stability of RNA pseudoknots is essential for understanding their functions. In this work, we employed our previously developed coarse-grained model with implicit salt to make extensive predictions and comprehensive analyses on the 3D structures and stability for RNA pseudoknots in monovalent/divalent ion solutions. The comparisons with available experimental data show that our model can successfully predict the 3D structures of RNA pseudoknots from their sequences, and can also make reliable predictions for the stability of RNA pseudoknots with different lengths and sequences over a wide range of monovalent/divalent ion concentrations. Furthermore, we made comprehensive analyses on the unfolding pathway for various RNA pseudoknots in ion solutions. Our analyses for extensive pseudoknots and the wide range of monovalent/divalent ion concentrations verify that the unfolding pathway of RNA pseudoknots is mainly dependent on the relative stability of unfolded intermediate states, and show that the unfolding pathway of RNA pseudoknots can be significantly modulated by their sequences and solution ion conditions.
Complete Genome Sequence of a Multidrug-Resistant Salmonella enterica Serovar Typhimurium var. 5− Strain Isolated from Chicken Breast

PubMed Central

Muruvanda, Tim; Allard, Marc W.; Korlach, Jonas; Roberts, Richard J.; Timme, Ruth; Payne, Justin; McDermott, Patrick F.; Evans, Peter; Meng, Jianghong; Brown, Eric W.; Zhao, Shaohua

2013-01-01

Salmonella enterica subsp. enterica serovar Typhimurium is a leading cause of salmonellosis. Here, we report a closed genome sequence, including sequences of 3 plasmids, of Salmonella serovar Typhimurium var. 5− CFSAN001921 (National Antimicrobial Resistance Monitoring System [NARMS] strain ID N30688), which was isolated from chicken breast meat and shows resistance to 10 different antimicrobials. Whole-genome and plasmid sequence analyses of this isolate will help enhance our understanding of this pathogenic multidrug-resistant serovar. PMID:24356834
Complete genome sequences of four avian paramyxoviruses of serotype 10 isolated from Rockhopper Penguins on the Falkland Islands

USDA-ARS's Scientific Manuscript database

The first complete genome sequences of four Avian paramyxovirus serotype 10 (APMV-10) isolates are described here. The viruses were isolated from Rockhopper Penguins sampled in 2007 on the Falkland Islands. All four genomes are 15,456 nucleotides in length and phylogenetic analyses show them to be c...
Cucumis melo endornavirus: Genome organization, host range and codivergence with the host

USDA-ARS's Scientific Manuscript database

A high molecular weight dsRNA was isolated from a Cucumis melo plant (referred to as “CL01”) of an unknown cultivar and completely sequenced. Sequence analyses showed similarities with members of the Endornaviridae. The name Cucumis melo endornavirus (CmEV) is proposed. The genome of CmEV-CL01 consis...
De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis

PubMed Central

Nowrousian, Minou; Stajich, Jason E.; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D.; Pöggeler, Stefanie; Read, Nick D.; Seiler, Stephan; Smith, Kristina M.; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-01-01

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30–90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in ∼4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology. PMID:20386741
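The N50 values quoted in the abstract above (117 kb, rising to 498 kb) are a contiguity statistic. As a minimal sketch, assuming the usual definition (the length of the shortest contig in the smallest set of longest contigs that together span at least half of the total assembly length), it can be computed as follows; the function name and the contig lengths are hypothetical and unrelated to the S. macrospora assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (arbitrary units), total 300.
contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(contigs))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

Merging contigs into longer scaffolds, as the comparative step in the abstract does, shifts more of the total length into fewer, longer pieces and therefore raises N50 without changing the total assembly size.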
De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

PubMed

Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-04-08

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology.
Relationships between physical properties and sequence in silkworm silks

PubMed Central

Malay, Ali D.; Sato, Ryota; Yazawa, Kenjiro; Watanabe, Hiroe; Ifuku, Nao; Masunaga, Hiroyasu; Hikima, Takaaki; Guan, Juan; Mandal, Biman B.; Damrongsakkul, Siriporn; Numata, Keiji

2016-01-01

Silk has attracted widespread attention due to its superlative material properties and promising applications. However, the determinants behind the variations in material properties among different types of silk are not well understood. We analysed the physical properties of silk samples from a variety of silkmoth cocoons, including domesticated Bombyx mori varieties and several species from Saturniidae. Tensile deformation tests, thermal analyses, and investigations on crystalline structure and orientation of the fibres were performed. The results showed that saturniid silks produce more highly-defined structural transitions compared to B. mori, as seen in the yielding and strain hardening events during tensile deformation and in the changes observed during thermal analyses. These observations were analysed in terms of the constituent fibroin sequences, which in B. mori are predicted to produce heterogeneous structures, whereas the strictly modular repeats of the saturniid sequences are hypothesized to produce structures that respond in a concerted manner. Within saturniid fibroins, thermal stability was found to correlate with the abundance of poly-alanine residues, whereas differences in fibre extensibility can be related to varying ratios of GGX motifs versus bulky hydrophobic residues in the amorphous phase. PMID:27279149
Relationships between physical properties and sequence in silkworm silks

NASA Astrophysics Data System (ADS)

Malay, Ali D.; Sato, Ryota; Yazawa, Kenjiro; Watanabe, Hiroe; Ifuku, Nao; Masunaga, Hiroyasu; Hikima, Takaaki; Guan, Juan; Mandal, Biman B.; Damrongsakkul, Siriporn; Numata, Keiji

2016-06-01

Silk has attracted widespread attention due to its superlative material properties and promising applications. However, the determinants behind the variations in material properties among different types of silk are not well understood. We analysed the physical properties of silk samples from a variety of silkmoth cocoons, including domesticated Bombyx mori varieties and several species from Saturniidae. Tensile deformation tests, thermal analyses, and investigations on crystalline structure and orientation of the fibres were performed. The results showed that saturniid silks produce more highly-defined structural transitions compared to B. mori, as seen in the yielding and strain hardening events during tensile deformation and in the changes observed during thermal analyses. These observations were analysed in terms of the constituent fibroin sequences, which in B. mori are predicted to produce heterogeneous structures, whereas the strictly modular repeats of the saturniid sequences are hypothesized to produce structures that respond in a concerted manner. Within saturniid fibroins, thermal stability was found to correlate with the abundance of poly-alanine residues, whereas differences in fibre extensibility can be related to varying ratios of GGX motifs versus bulky hydrophobic residues in the amorphous phase.

HIPPI: highly accurate protein family classification with ensembles of HMMs.

PubMed

Nguyen, Nam-Phuong; Nute, Michael; Mirarab, Siavash; Warnow, Tandy

2016-11-11

Given a new biological sequence, detecting membership in a known family is a basic step in many bioinformatics analyses, with applications to protein structure and function prediction and metagenomic taxon identification and abundance profiling, among others. Yet family identification of sequences that are distantly related to sequences in public databases or that are fragmentary remains one of the more difficult analytical problems in bioinformatics. We present a new technique for family identification called HIPPI (Hierarchical Profile Hidden Markov Models for Protein family Identification). HIPPI uses a novel technique to represent a multiple sequence alignment for a given protein family or superfamily by an ensemble of profile hidden Markov models computed using HMMER. An evaluation of HIPPI on the Pfam database shows that HIPPI has better overall precision and recall than blastp, HMMER, and pipelines based on HHsearch, and maintains good accuracy even for fragmentary query sequences and for protein families with low average pairwise sequence identity, both conditions where other methods degrade in accuracy. HIPPI provides accurate protein family identification and is robust to difficult model conditions. Our results, combined with observations from previous studies, show that ensembles of profile Hidden Markov models can better represent multiple sequence alignments than a single profile Hidden Markov model, and thus can improve downstream analyses for various bioinformatic tasks. Further research is needed to determine the best practices for building the ensemble of profile Hidden Markov models. HIPPI is available on GitHub at https://github.com/smirarab/sepp.
Novel Primer Sets for Next Generation Sequencing-Based Analyses of Water Quality

PubMed Central

Lee, Elvina; Khurana, Maninder S.; Whiteley, Andrew S.; Monis, Paul T.; Bath, Andrew; Gordon, Cameron; Ryan, Una M.; Paparini, Andrea

2017-01-01

Next generation sequencing (NGS) has rapidly become an invaluable tool for the detection, identification and relative quantification of environmental microorganisms. Here, we demonstrate two new 16S rDNA primer sets, which are compatible with NGS approaches and are primarily for use in water quality studies. Compared to 16S rRNA gene based universal primers, in silico and experimental analyses demonstrated that the new primers showed increased specificity for the Cyanobacteria and Proteobacteria phyla, allowing increased sensitivity for the detection, identification and relative quantification of toxic bloom-forming microalgae, microbial water quality bioindicators and common pathogens. Significantly, Cyanobacterial and Proteobacterial sequences accounted for ca. 95% of all sequences obtained within NGS runs (when compared to ca. 50% with standard universal NGS primers), providing higher sensitivity and greater phylogenetic resolution of key water quality microbial groups. The increased selectivity of the new primers allows the parallel sequencing of more samples through reduced sequence retrieval levels required to detect target groups, potentially reducing NGS costs by 50% but still guaranteeing optimal coverage and species discrimination. PMID:28118368
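The link between primer selectivity and sequencing effort in the abstract above can be illustrated with a short calculation. This is a minimal sketch, assuming that sequencing cost scales linearly with the total number of reads and that the only change is the fraction of reads falling in the target phyla (ca. 50% versus ca. 95%, as reported). The function name and the figure of 10,000 target reads are hypothetical and chosen only for illustration.

```python
def reads_required(target_reads: int, target_fraction: float) -> float:
    """Total reads needed so that, on average, `target_reads` belong to the target groups."""
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    return target_reads / target_fraction


# Hypothetical goal: 10,000 reads from Cyanobacteria and Proteobacteria per sample.
universal = reads_required(10_000, 0.50)   # standard universal primers (ca. 50% on target)
selective = reads_required(10_000, 0.95)   # new primer sets (ca. 95% on target)

# Relative reduction in total reads per sample under the linear-cost assumption.
saving = 1 - selective / universal
print(f"reads with universal primers: {universal:.0f}")
print(f"reads with selective primers: {selective:.0f}")
print(f"relative reduction: {saving:.1%}")
```

Under these assumptions the reduction comes out a little below one half, which is in line with the roughly 50% cost reduction the authors describe as potential.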
Ancient Recombination Events between Human Herpes Simplex Viruses

PubMed Central

Burrel, Sonia; Boutolleau, David; Ryu, Diane; Agut, Henri; Merkel, Kevin; Leendertz, Fabian H.

2017-01-01

Herpes simplex viruses 1 and 2 (HSV-1 and HSV-2) are seen as close relatives but also unambiguously considered as evolutionary independent units. Here, we sequenced the genomes of 18 HSV-2 isolates characterized by divergent UL30 gene sequences to further elucidate the evolutionary history of this virus. Surprisingly, genome-wide recombination analyses showed that all HSV-2 genomes sequenced to date contain HSV-1 fragments. Using phylogenomic analyses, we could also show that two main HSV-2 lineages exist. One lineage is mostly restricted to sub-Saharan Africa whereas the other has reached a global distribution. Interestingly, only the worldwide lineage is characterized by ancient recombination events with HSV-1. Our findings highlight the complexity of HSV-2 evolution, a virus of putative zoonotic origin which later recombined with its human-adapted relative. They also suggest that coinfections with HSV-1 and 2 may have genomic and potentially functional consequences and should therefore be monitored more closely. PMID:28369565
Across the Gap: Geochronological and Sedimentological Analyses from the Late Pleistocene-Holocene Sequence of Goda Buticha, Southeastern Ethiopia

PubMed Central

Asrat, Asfawossen; Bahain, Jean-Jacques; Chapon, Cécile; Douville, Eric; Fragnol, Carole; Hernandez, Marion; Hovers, Erella; Leplongeon, Alice; Martin, Loïc; Pleurdeau, David; Pearson, Osbjorn; Puaud, Simon; Assefa, Zelalem

2017-01-01

Goda Buticha is a cave site near Dire Dawa in southeastern Ethiopia that contains an archaeological sequence sampling the late Pleistocene and Holocene of the region. The sedimentary sequence displays complex cultural, chronological and sedimentological histories that seem incongruent with one another. A first set of radiocarbon ages suggested a long sedimentological gap from the end of Marine Isotopic Stage (MIS) 3 to the mid-Holocene. Macroscopic observations suggest that the main sedimentological change does not coincide with the chronostratigraphic hiatus. The cultural sequence shows technological continuity with a late persistence of artifacts that are usually attributed to the Middle Stone Age into the younger parts of the stratigraphic sequence, yet become increasingly associated with lithic artifacts typically related to the Later Stone Age. While not a unique case, this combination of features is unusual in the Horn of Africa. In order to evaluate the possible implications of these observations, sedimentological analyses combined with optically stimulated luminescence (OSL) were conducted. The OSL data now extend the radiocarbon chronology up to 63 ± 7 ka; they also confirm the existence of the chronological gap between 24.8 ± 2.6 ka and 7.5 ± 0.3 ka. The sedimentological analyses suggest that the origin and mode of deposition were largely similar throughout the whole sequence, although the anthropic and faunal activities increased in the younger levels. Regional climatic records are used to support the sedimentological observations and interpretations. We discuss the implications of the sedimentological and dating analyses for understanding cultural processes in the region. PMID:28125597
Across the Gap: Geochronological and Sedimentological Analyses from the Late Pleistocene-Holocene Sequence of Goda Buticha, Southeastern Ethiopia.

PubMed

Tribolo, Chantal; Asrat, Asfawossen; Bahain, Jean-Jacques; Chapon, Cécile; Douville, Eric; Fragnol, Carole; Hernandez, Marion; Hovers, Erella; Leplongeon, Alice; Martin, Loïc; Pleurdeau, David; Pearson, Osbjorn; Puaud, Simon; Assefa, Zelalem

2017-01-01

Goda Buticha is a cave site near Dire Dawa in southeastern Ethiopia that contains an archaeological sequence sampling the late Pleistocene and Holocene of the region. The sedimentary sequence displays complex cultural, chronological and sedimentological histories that seem incongruent with one another. A first set of radiocarbon ages suggested a long sedimentological gap from the end of Marine Isotopic Stage (MIS) 3 to the mid-Holocene. Macroscopic observations suggest that the main sedimentological change does not coincide with the chronostratigraphic hiatus. The cultural sequence shows technological continuity with a late persistence of artifacts that are usually attributed to the Middle Stone Age into the younger parts of the stratigraphic sequence, yet become increasingly associated with lithic artifacts typically related to the Later Stone Age. While not a unique case, this combination of features is unusual in the Horn of Africa. In order to evaluate the possible implications of these observations, sedimentological analyses combined with optically stimulated luminescence (OSL) were conducted. The OSL data now extend the radiocarbon chronology up to 63 ± 7 ka; they also confirm the existence of the chronological gap between 24.8 ± 2.6 ka and 7.5 ± 0.3 ka. The sedimentological analyses suggest that the origin and mode of deposition were largely similar throughout the whole sequence, although the anthropic and faunal activities increased in the younger levels. Regional climatic records are used to support the sedimentological observations and interpretations. We discuss the implications of the sedimentological and dating analyses for understanding cultural processes in the region.
Species identification of mutans streptococci by groESL gene sequence.

PubMed

Hung, Wei-Chung; Tsai, Jui-Chang; Hsueh, Po-Ren; Chia, Jean-San; Teng, Lee-Jene

2005-09-01

The near full-length sequences of the groESL genes were determined and analysed among eight reference strains (serotypes a to h) representing five species of mutans group streptococci. The groES sequences from these reference strains revealed that there are two lengths (285 and 288 bp) in the five species. The intergenic spacer between groES and groEL appears to be a unique marker for species, with a variable size (ranging from 111 to 310 bp) and sequence. Phylogenetic analysis of groES and groEL separated the eight serotypes into two major clusters. Strains of serotypes b, c, e and f were highly related and had groES gene sequences of the same length, 288 bp, while strains of serotypes a, d, g and h were also closely related and their groES gene sequence lengths were 285 bp. The groESL sequences in clinical isolates of three serotypes of S. mutans were analysed for intraspecies polymorphism. The results showed that the groESL sequences could provide information for differentiation among species, but were unable to distinguish serotypes of the same species. Based on the determined sequences, a PCR assay was developed that could differentiate members of the mutans streptococci by amplicon size and provide an alternative way for distinguishing mutans streptococci from other viridans streptococci.
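The grouping by groES length reported in the abstract above can be written down as a small lookup. This is only an illustrative sketch: the two lengths and the serotype groupings are taken from the abstract, while the function and variable names are hypothetical. It does not reproduce the authors' PCR assay, which differentiates species by amplicon size.

```python
from typing import Optional, Tuple

# groES gene length (bp) -> serotypes of the mutans streptococci reference strains
# reported with that length in the abstract.
GROES_LENGTH_TO_SEROTYPES = {
    288: ("b", "c", "e", "f"),
    285: ("a", "d", "g", "h"),
}


def serotype_cluster(groes_length_bp: int) -> Optional[Tuple[str, ...]]:
    """Return the serotype cluster matching a groES length, or None if the length is not one of the two reported."""
    return GROES_LENGTH_TO_SEROTYPES.get(groes_length_bp)


print(serotype_cluster(288))
print(serotype_cluster(285))
print(serotype_cluster(300))  # a length not reported in the abstract
```

As the abstract notes, the sequences separate the major clusters and species but cannot distinguish serotypes within a species, so the lookup returns a whole cluster rather than a single serotype.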
Analyzing the relationship between sequence divergence and nodal support using Bayesian phylogenetic analyses.

PubMed

Makowsky, Robert; Cox, Christian L; Roelke, Corey; Chippindale, Paul T

2010-11-01

Determining the appropriate gene for phylogeny reconstruction can be a difficult process. Rapidly evolving genes tend to resolve recent relationships, but suffer from alignment issues and increased homoplasy among distantly related species. Conversely, slowly evolving genes generally perform best for deeper relationships, but lack sufficient variation to resolve recent relationships. We determine the relationship between sequence divergence and Bayesian phylogenetic reconstruction ability using both natural and simulated datasets. The natural data are based on 28 well-supported relationships within the subphylum Vertebrata. Sequences of 12 genes were acquired and Bayesian analyses were used to determine phylogenetic support for correct relationships. Simulated datasets were designed to determine whether an optimal range of sequence divergence exists across extreme phylogenetic conditions. Across all genes we found that an optimal range of divergence for resolving the correct relationships does exist, although this level of divergence expectedly depends on the distance metric. Simulated datasets show that an optimal range of sequence divergence exists across diverse topologies and models of evolution. We determine that a simple-to-measure property of genetic sequences (genetic distance) is related to phylogenetic reconstruction ability in Bayesian analyses. This information should be useful for selecting the most informative gene to resolve any relationships, especially those that are difficult to resolve, as well as minimizing both cost and confounding information during project design. Copyright © 2010. Published by Elsevier Inc.
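The abstract above notes that the optimal divergence depends on the distance metric used. As a minimal sketch of what that means, the code below computes two common distances for a pair of aligned sequences: the uncorrected p-distance and the Jukes-Cantor (JC69) corrected distance. The abstract does not say which metrics the authors used, so these two are an assumption made for illustration, and the toy sequences are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences, skipping gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def jukes_cantor(p: float) -> float:
    """JC69 correction: expected substitutions per site for an observed p-distance."""
    if p >= 0.75:
        raise ValueError("JC69 distance is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned fragments differing at 2 of 10 sites.
p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
d = jukes_cantor(p)
print(f"p-distance: {p:.3f}, JC69 distance: {d:.3f}")
```

The corrected distance is always at least as large as the p-distance, and the gap widens as sequences diverge, which is one reason a divergence threshold stated under one metric does not carry over directly to another.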
Assessing the diversity of AM fungi in arid gypsophilous plant communities.

PubMed

Alguacil, M M; Roldán, A; Torres, M P

2009-10-01

In the present study, we used PCR-Single-Stranded Conformation Polymorphism (SSCP) techniques to analyse arbuscular mycorrhizal fungi (AMF) communities in four sites within a 10 km² gypsum area in Southern Spain. Four common plant species from these ecosystems were selected. The AM fungal small-subunit (SSU) rRNA genes were subjected to PCR, cloning, SSCP analysis, sequencing and phylogenetic analyses. A total of 1443 SSU rRNA sequences were analysed, for 21 AM fungal types: 19 belonged to the genus Glomus, 1 to the genus Diversispora and 1 to the genus Scutellospora. Four sequence groups were identified, which showed high similarity to sequences of known glomalean species or isolates: Glo G18 to Glomus constrictum, Glo G1 to Glomus intraradices, Glo G16 to Glomus clarum, Scut to Scutellospora dipurpurescens and Div to one new genus in the family Diversisporaceae identified recently as Otospora bareai. There were three sequence groups that received strong support in the phylogenetic analysis, and did not seem to be related to any sequences of AM fungi in culture or previously found in the database; thus, they could be novel taxa within the genus Glomus: Glo G4, Glo G2 and Glo G14. We have detected the presence of both generalist and potential specialist AMF in gypsum ecosystems. The AMF communities differed among the plant species studied, suggesting some degree of preference in the interactions between these symbionts.
The (in)complete organelle genome: exploring the use and nonuse of available technologies for characterizing mitochondrial and plastid chromosomes.

PubMed

Sanitá Lima, Matheus; Woods, Laura C; Cartwright, Matthew W; Smith, David Roy

2016-11-01

Not long ago, scientists paid dearly in time, money and skill for every nucleotide that they sequenced. Today, DNA sequencing technologies epitomize the slogan 'faster, easier, cheaper and more', and in many ways, sequencing an entire genome has become routine, even for the smallest laboratory groups. This is especially true for mitochondrial and plastid genomes. Given their relatively small sizes and high copy numbers per cell, organelle DNAs are currently among the most highly sequenced kind of chromosome. But accurately characterizing an organelle genome and the information it encodes can require much more than DNA sequencing and bioinformatics analyses. Organelle genomes can be surprisingly complex and can exhibit convoluted and unconventional modes of gene expression. Unravelling this complexity can demand a wide assortment of experiments, from pulsed-field gel electrophoresis to Southern and Northern blots to RNA analyses. Here, we show that it is exactly these types of 'complementary' analyses that are often lacking from contemporary organelle genome papers, particularly short 'genome announcement' articles. Consequently, crucial and interesting features of organelle chromosomes are going undescribed, which could ultimately lead to a poor understanding and even a misrepresentation of these genomes and the genes they express. High-throughput sequencing and bioinformatics have made it easy to sequence and assemble entire chromosomes, but they should not be used as a substitute for or at the expense of other types of genomic characterization methods. © 2016 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
Denoising DNA deep sequencing data—high-throughput sequencing errors and their correction

PubMed Central

Laehnemann, David; Borkhardt, Arndt

2016-01-01

Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and for which data types these hold, providing guidance which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here. PMID:26026159
Analysis of Ribosome Inactivating Protein (RIP): A Bioinformatics Approach

NASA Astrophysics Data System (ADS)

Jothi, G. Edward Gnana; Majilla, G. Sahaya Jose; Subhashini, D.; Deivasigamani, B.

2012-10-01

In spite of the medical advances in recent years, the world is in need of different sources to counter certain health issues. Ribosome Inactivating Proteins (RIPs) were found to be one among them. To provide easy access to information about RIPs, there is a need to analyse RIPs towards constructing a database on RIPs. Also, multiple sequence alignment was done towards screening for homologues of significant RIPs from rare sources against RIPs from easily available sources in terms of similarity. Protein sequences were retrieved from SWISS-PROT and further analysed using pairwise and multiple sequence alignment. Analysis shows that 151 RIPs have been characterized to date. Amongst them, there are 87 type I, 37 type II, 1 type III and 25 unknown RIPs. The sequence length information of various RIPs about the availability of full or partial sequence was also found. The multiple sequence alignment of 37 type I RIPs using the online server Multalin indicates the presence of 20 conserved residues. Pairwise alignment and multiple sequence alignment of certain selected RIPs in two groups namely Group I and Group II were carried out and the consensus level was found to be 98%, 98% and 90% respectively.
First complete genome sequence of vanilla mosaic strain of Dasheen mosaic virus isolated from the Cook Islands.

PubMed

Puli'uvea, Christopher; Khan, Subuhi; Chang, Wee-Leong; Valmonte, Gardette; Pearson, Michael N; Higgins, Colleen M

2017-02-01

We present the first complete genome of vanilla mosaic virus (VanMV). The VanMV genomic structure is consistent with that of a potyvirus, containing a single open reading frame (ORF) encoding a polyprotein of 3139 amino acids. Motif analyses indicate the polyprotein can be cleaved into the expected ten individual proteins; other recognised potyvirus motifs are also present. As expected, the VanMV genome shows high sequence similarity to the published Dasheen mosaic virus (DsMV) genome sequences; comparisons with DsMV continue to support VanMV as a vanilla infecting strain of DsMV. Phylogenetic analyses indicate that VanMV and DsMV share a common ancestor, with VanMV having the closest relationship with DsMV strains from the South Pacific.
Complete Genome Sequences of Four Avian Paramyxoviruses of Serotype 10 Isolated from Rockhopper Penguins on the Falkland Islands

PubMed Central

Goraichuk, Iryna V.; Dimitrov, Kiril M.; Sharma, Poonam; Miller, Patti J.; Swayne, David E.; Suarez, David L.

2017-01-01

The first complete genome sequences of four avian paramyxovirus serotype 10 (APMV-10) isolates are described here. The viruses were isolated from rockhopper penguins on the Falkland Islands, sampled in 2007. All four genomes are 15,456 nucleotides in length, and phylogenetic analyses show them to be closely related. PMID:28572332
Genetic and phylogenetic analysis of a novel parvovirus isolated from chickens in Guangxi, China.

PubMed

Feng, Bin; Xie, Zhixun; Deng, Xianwen; Xie, Liji; Xie, Zhiqin; Huang, Li; Fan, Qin; Luo, Sisi; Huang, Jiaoling; Zhang, Yanfang; Zeng, Tingting; Wang, Sheng; Wang, Leyi

2016-11-01

A previously unidentified chicken parvovirus (ChPV) strain, associated with runting-stunting syndrome (RSS), is now endemic among chickens in China. To explore the genetic diversity of ChPV strains, we determined the first complete genome sequence of a novel ChPV isolate (GX-CH-PV-7) identified in chickens in Guang Xi, China, and showed moderate genome sequence similarity to reference strains. Analysis showed that the viral genome sequence is 86.4 %-93.9 % identical to those of other ChPVs. Genetic and phylogenetic analyses showed that this newly emergent GX-CH-PV-7 is closely related to Gallus gallus enteric parvovirus isolate ChPV 798 from the USA, indicating that they may share a common ancestor. The complete DNA sequence is 4612 bp long with an A+T content of 56.66 %. We determined the first complete genome sequence of a previously unidentified ChPV strain to elucidate its origin and evolutionary status.
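The abstract above reports an A+T content of 56.66 % for the 4612 bp genome. The following is a minimal sketch of how such a base-composition figure is computed from a nucleotide sequence. The function name and the toy sequence are hypothetical, and the actual GX-CH-PV-7 sequence is not reproduced here.

```python
def at_content(sequence: str) -> float:
    """Percentage of A and T bases among the unambiguous bases (A, C, G, T) of a DNA sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    at = seq.count("A") + seq.count("T")
    return 100.0 * at / counted


# Toy example: 6 of the 8 bases are A or T.
print(at_content("ATGCATTA"))
```

Ambiguous symbols such as N are left out of the denominator in this sketch. Whether the published figure treats them the same way is not stated in the abstract.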
An effective approach for annotation of protein families with low sequence similarity and conserved motifs: identifying GDSL hydrolases across the plant kingdom.

PubMed

Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica

2016-02-18

The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding-VD and posterior decoding-PD) and the PD/VD protocol described here were successfully applied to 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) annotated by Pfam profile search (PfamA) as GDSL. Based on these analyses we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain a better insight into the evolutionary relationship of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte, Selaginella moellendorffii. Finally, we provide a general motif-HMM scanner which is easily accessible through the graphical user interface (http://compbio.math.hr/). Our results show that scanning with a carefully parameterized motif-HMM is an effective approach for annotation of protein families with low sequence similarity and conserved motifs. The results of this study expand current knowledge and provide new insights into the evolution of the large GDSL-lipase family in land plants.
Genotypic and phenotypic diversity of Alicyclobacillus acidocaldarius isolates.

PubMed

Félix-Valenzuela, L; Guardiola-Avila, I; Burgara-Estrella, A; Ibarra-Zavala, M; Mata-Haro, V

2015-10-01

The fruit juice industry recognizes Alicyclobacillus as a major quality control target micro-organism. In this study, we analysed 19 bacterial isolates to identify Alicyclobacillus species by polymerase chain reaction (PCR) and sequencing analyses. Phenotypic and genomic diversity among isolates were investigated by API 50CHB system and ERIC-PCR (enterobacterial repetitive intergenic consensus-PCR) respectively. All bacterial isolates were identified as Alicyclobacillus acidocaldarius, and almost all showed identical DNA sequences according to their 16S rRNA (rDNA) gene partial sequences. Only a few carbohydrates were fermented by A. acidocaldarius isolates, and there was little variability in the biochemical profile. Genotypic fingerprinting of the A. acidocaldarius isolates showed high diversity, and clusters by ERIC-PCR were distinct from those obtained from the 16S rRNA gene phylogenetic tree. There was no correlation between phenotypic and genotypic variability in the A. acidocaldarius isolates analysed in this study. Detection of Alicyclobacillus strains is imperative in fruit concentrates and juices due to the production of guaiacol. Identification of the genus leads to rejection of the product by the processing industry. However, not all Alicyclobacillus species cause spoilage, hence the importance of differentiating among them. In this study, partial 16S ribosomal RNA sequence alignment allowed the differentiation of species. In addition, ERIC-PCR was introduced for the genotypic characterization of Alicyclobacillus, as an alternative for differentiation among isolates from the same species. © 2015 The Society for Applied Microbiology.
Characterization of a Novel Polerovirus Infecting Maize in China

PubMed Central

Chen, Sha; Jiang, Guangzhuang; Wu, Jianxiang; Liu, Yong; Qian, Yajuan; Zhou, Xueping

2016-01-01

A novel virus, tentatively named Maize Yellow Mosaic Virus (MaYMV), was identified from the field-grown maize plants showing yellow mosaic symptoms on the leaves collected from the Yunnan Province of China by the deep sequencing of small RNAs. The complete 5642 nucleotide (nt)-long genome of the MaYMV shared the highest nucleotide sequence identity (73%) to Maize Yellow Dwarf Virus-RMV. Sequence comparisons and phylogenetic analyses suggested that MaYMV represents a new member of the genus Polerovirus in the family Luteoviridae. Furthermore, the P0 protein encoded by MaYMV was demonstrated to inhibit both local and systemic RNA silencing by co-infiltration assays using transgenic Nicotiana benthamiana line 16c carrying the GFP reporter gene, which further supported the identification of a new polerovirus. The biologically-active cDNA clone of MaYMV was generated by inserting the full-length cDNA of MaYMV into the binary vector pCB301. RT-PCR and Northern blot analyses showed that this clone was systemically infectious upon agro-inoculation into N. benthamiana. Subsequently, 13 different isolates of MaYMV from field-grown maize plants in different geographical locations of Yunnan and Guizhou provinces of China were sequenced. Analyses of their molecular variation indicate that the 3′ half of P3–P5 read-through protein coding region was the most variable, whereas the coat protein- (CP-) and movement protein- (MP-)coding regions were the most conserved. PMID:27136578
Characterization of a Novel Polerovirus Infecting Maize in China.

PubMed

Chen, Sha; Jiang, Guangzhuang; Wu, Jianxiang; Liu, Yong; Qian, Yajuan; Zhou, Xueping

2016-04-28

A novel virus, tentatively named Maize Yellow Mosaic Virus (MaYMV), was identified from the field-grown maize plants showing yellow mosaic symptoms on the leaves collected from the Yunnan Province of China by the deep sequencing of small RNAs. The complete 5642 nucleotide (nt)-long genome of the MaYMV shared the highest nucleotide sequence identity (73%) to Maize Yellow Dwarf Virus-RMV. Sequence comparisons and phylogenetic analyses suggested that MaYMV represents a new member of the genus Polerovirus in the family Luteoviridae. Furthermore, the P0 protein encoded by MaYMV was demonstrated to inhibit both local and systemic RNA silencing by co-infiltration assays using transgenic Nicotiana benthamiana line 16c carrying the GFP reporter gene, which further supported the identification of a new polerovirus. The biologically-active cDNA clone of MaYMV was generated by inserting the full-length cDNA of MaYMV into the binary vector pCB301. RT-PCR and Northern blot analyses showed that this clone was systemically infectious upon agro-inoculation into N. benthamiana. Subsequently, 13 different isolates of MaYMV from field-grown maize plants in different geographical locations of Yunnan and Guizhou provinces of China were sequenced. Analyses of their molecular variation indicate that the 3' half of P3-P5 read-through protein coding region was the most variable, whereas the coat protein- (CP-) and movement protein- (MP-)coding regions were the most conserved.
A weighted U-statistic for genetic association analyses of sequencing data.

PubMed

Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing

2014-12-01

With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL 4 and very low density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.
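For readers unfamiliar with the term, the sketch below shows the general form of a weighted U-statistic of order two: a weighted average of a symmetric kernel evaluated over all pairs of subjects. It is a generic illustration and not the WU-SEQ test itself. The kernel, the pair weights and the toy data are hypothetical, and the abstract does not specify how WU-SEQ defines them.

```python
from itertools import combinations
from typing import Callable, Sequence


def weighted_u_statistic(
    values: Sequence[float],
    weights: Sequence[Sequence[float]],
    kernel: Callable[[float, float], float],
) -> float:
    """Weighted average of kernel(values[i], values[j]) over all unordered pairs i < j."""
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(values)), 2):
        w = weights[i][j]
        numerator += w * kernel(values[i], values[j])
        denominator += w
    if denominator == 0:
        raise ValueError("all pair weights are zero")
    return numerator / denominator


# Hypothetical data: three phenotype values and symmetric pair weights,
# where a pair weight could stand for the genetic similarity of two subjects.
phenotypes = [1.0, 2.0, 4.0]
equal_weights = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
similarity_weights = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]


def product_kernel(a: float, b: float) -> float:
    return a * b


u_equal = weighted_u_statistic(phenotypes, equal_weights, product_kernel)
u_weighted = weighted_u_statistic(phenotypes, similarity_weights, product_kernel)
print(u_equal, u_weighted)
```

With equal weights this reduces to the ordinary U-statistic. Changing the weights changes how much each pair contributes, which is the general mechanism a weighted test can use to relate similarity between subjects on one measure to similarity on another.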
Species Diversity of Puerto Rican Heterotermes (Dictyoptera: Rhinotermitidae) Revealed by Phylogenetic Analyses of Two Mitochondrial Genes

PubMed Central

Jones, Susan C.; Jenkins, Tracie M.

2016-01-01

The goal of this study was to infer Heterotermes (Froggatt) (Dictyoptera: Rhinotermitidae) species diversity on the island of Puerto Rico from phylogenetic analyses of DNA sequence data from two mitochondrial genes, 16S rRNA and cytochrome oxidase II (COII). This termite genus is a structural pest known to be well adapted to arid environments in subtropical and tropical regions worldwide including Puerto Rico and many other Caribbean islands. Extensive sampling was accomplished across Puerto Rico, and phylogenetic analyses of individual gene sequences from these samples indicated robust datasets of congruent gene tree topologies showing three monophyletic groups: H. cardini (Snyder), H. convexinotatus (Snyder), and H. tenuis (Hagen). We found that H. cardini and H. convexinotatus were widespread in the arid coastal regions of Puerto Rico, whereas H. tenuis was uncommon and may represent a relatively new introduction. We found only H. convexinotatus on Culebra Island. We provide strong evidence that Puerto Rico may be linked to the Heterotermes in southern Florida, USA, since its GenBank 16S sequence was identical to that of seven Puerto Rican H. cardini sequences. Our study represents the first records of H. cardini from Puerto Rico and Grand Bahama.

Analysis of heterogeneity of Copia-like retrotransposons in the genome of cassava (Manihot esculenta Crantz).

PubMed

Gbadegesin, Micheal A; Beeching, John R

2011-12-20

Retrotransposons are ubiquitous in eukaryotic genomes and are now proving to be useful genetic tools for genetic diversity and phylogenetic analyses, especially in plants. In order to assess the diversity of Ty1/Copia-like retrotransposons of cassava, we used PCR primers anchored on the conserved domains of reverse transcriptases (RTs) to amplify cassava Ty1/Copia-like RT. The PCR product was cloned and sequenced. Sequence analysis of the clones revealed the presence of 69 families of Ty1/Copia-like retrotransposon in the genome of cassava. Comparative analyses of the predicted amino acid sequences of these clones with those of other plants showed that retroelements of this class are very heterogeneous in cassava. Cassava is widely grown for its edible roots in the tropical and subtropical regions of the world. Cassava roots, though poor in protein, are rich in starch (which makes up about 80% of the dry matter), vitamin C, carotenes, calcium and potassium. It has great commercial importance as a source of starch and starch-based products. Given this importance, cassava stands out as a crop to benefit from biotechnology development. Heterogeneity of Mecops (Manihot esculenta copia-like Retrotransposons) showed that they may be useful for genetic diversity and phylogenetic analyses of cassava germplasm.
Illumina sequencing-based analyses of bacterial communities during short-chain fatty-acid production from food waste and sewage sludge fermentation at different pH values.

PubMed

Cheng, Weixiao; Chen, Hong; Yan, ShuHai; Su, Jianqiang

2014-09-01

Short-chain fatty acids (SCFAs) can be produced by primary and waste activated sludge anaerobic fermentation. The yield and product spectrum distribution of SCFAs can be significantly affected by different initial pH values. However, most studies have focused on the physical and chemical aspects of SCFA production by waste activated sludge fermentation at different pH values. Information on the bacterial community structures during acidogenic fermentation is limited. In this study, comparisons of the bacterial communities during the co-substrate fermentation of food wastes and sewage sludge at different pH values were performed using the barcoded Illumina paired-end sequencing method. The results showed that different pH environments harbored a characteristic bacterial community, including sequences related to Lactobacillus, Prevotella, Mitsuokella, Treponema, Clostridium, and Ureibacillus. The most abundant bacterial operational taxonomic units in the different pH environments were those related to carbohydrate-degrading bacteria, which are associated with constituents of co-substrate fermentation. Further analyses showed that during organic matter fermentation, a core microbiota composed of Firmicutes, Proteobacteria, and Bacteroidetes existed. Comparison analyses revealed that the bacterial community during fermentation was significantly affected by the pH, and that the diverse product distribution was related to the shift in bacterial communities.
Bacterial community composition characterization of a lead-contaminated Microcoleus sp. consortium.

PubMed

Giloteaux, Ludovic; Solé, Antoni; Esteve, Isabel; Duran, Robert

2011-08-01

A Microcoleus sp. consortium, obtained from the Ebro delta microbial mat, was maintained under different conditions including uncontaminated, lead-contaminated, and acidic conditions. Terminal restriction fragment length polymorphism and 16S rRNA gene library analyses were performed in order to determine the effect of lead and culture conditions on the Microcoleus sp. consortium. The bacterial composition inside the consortium revealed low diversity and the presence of specific terminal-restriction fragments under lead conditions. 16S rRNA gene library analyses showed that members of the consortium were affiliated to the Alpha, Beta, and Gammaproteobacteria and Cyanobacteria. Sequences closely related to Achromobacter spp., Alcaligenes faecalis, and Thiobacillus species were exclusively found under lead conditions while sequences related to Geitlerinema sp., a cyanobacterium belonging to the Oscillatoriales, were not found in presence of lead. This result showed a strong lead selection of the bacterial members present in the Microcoleus sp. consortium. Several of the 16S rRNA sequences were affiliated to nitrogen-fixing microorganisms including members of the Rhizobiaceae and the Sphingomonadaceae. Additionally, confocal laser scanning microscopy and scanning and transmission electron microscopy showed that under lead-contaminated condition Microcoleus sp. cells were grouped and the number of electrodense intracytoplasmic inclusions was increased.
MaxAlign: maximizing usable data in an alignment.

PubMed

Gouveia-Oliveira, Rodrigo; Sackett, Peter W; Pedersen, Anders G

2007-08-28

The presence of gaps in an alignment of nucleotide or protein sequences is often an inconvenience for bioinformatical studies. In phylogenetic and other analyses, for instance, gapped columns are often discarded entirely from the alignment. MaxAlign is a program that optimizes the alignment prior to such analyses. Specifically, it maximizes the number of nucleotide (or amino acid) symbols that are present in gap-free columns - the alignment area - by selecting the optimal subset of sequences to exclude from the alignment. MaxAlign can be used prior to phylogenetic and bioinformatical analyses as well as in other situations where this form of alignment improvement is useful. In this work we test MaxAlign's performance in these tasks and compare the accuracy of phylogenetic estimates including and excluding gapped columns from the analysis, with and without processing with MaxAlign. In this paper we also introduce a new simple measure of tree similarity, Normalized Symmetric Similarity (NSS) that we consider useful for comparing tree topologies. We demonstrate how MaxAlign is helpful in detecting misaligned or defective sequences without requiring manual inspection. We also show that it is not advisable to exclude gapped columns from phylogenetic analyses unless MaxAlign is used first. Finally, we find that the sequences removed by MaxAlign from an alignment tend to be those that would otherwise be associated with low phylogenetic accuracy, and that the presence of gaps in any given sequence does not seem to disturb the phylogenetic estimates of other sequences. The MaxAlign web-server is freely available online at http://www.cbs.dtu.dk/services/MaxAlign where supplementary information can also be found. The program is also freely available as a Perl stand-alone package.
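The "alignment area" idea described above can be illustrated with a small sketch. This is not the MaxAlign implementation: the function names, the toy alignment, and the exhaustive search over at most two removed sequences are all assumptions made for illustration, and the abstract does not say how the real program searches for the optimal subset.

```python
from itertools import combinations

GAP = "-"

def alignment_area(rows):
    """Count symbols that sit in gap-free columns of the given aligned rows."""
    if not rows:
        return 0
    gap_free_columns = sum(1 for column in zip(*rows) if GAP not in column)
    return gap_free_columns * len(rows)

def best_subset(alignment, max_removed=2):
    """Try removing up to max_removed sequences and keep the largest area.

    Exhaustive search is only practical for tiny inputs; it stands in here
    for whatever optimisation strategy the real tool uses.
    """
    names = list(alignment)
    best_kept = names
    best_area = alignment_area([alignment[n] for n in names])
    for k in range(1, max_removed + 1):
        for removed in combinations(names, k):
            kept = [n for n in names if n not in removed]
            area = alignment_area([alignment[n] for n in kept])
            if area > best_area:
                best_kept, best_area = kept, area
    return best_kept, best_area

# Hypothetical toy alignment (not data from the paper).
toy = {
    "seq1": "ACGTACGT",
    "seq2": "ACGTAC-T",
    "seq3": "A--TACGT",
    "seq4": "----ACGT",
}
kept, area = best_subset(toy)
print(kept, area)
```

In this toy case dropping the heavily gapped `seq4` raises the area from 12 to 15 symbols, which is the kind of trade-off between number of sequences and number of usable columns that the abstract describes.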
The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase.

PubMed Central

Freemont, P S; Dunbar, B; Fothergill-Gilmore, L A

1988-01-01

The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase, comprising 363 residues, was determined. The sequence was deduced by automated sequencing of CNBr-cleavage, o-iodosobenzoic acid-cleavage, trypsin-digest and staphylococcal-proteinase-digest fragments. Comparison of the sequence with other class I aldolase sequences shows that the mammalian muscle isoenzyme is one of the most highly conserved enzymes known, with only about 2% of the residues changing per 100 million years. Non-mammalian aldolases appear to be evolving at the same rate as other glycolytic enzymes, with about 4% of the residues changing per 100 million years. Secondary-structure predictions are analysed in an accompanying paper [Sawyer, Fothergill-Gilmore & Freemont (1988) Biochem. J. 249, 789-793]. PMID:3355497
Secondary structural analyses of ITS1 in Paramecium.

PubMed

Hoshina, Ryo

2010-01-01

The nuclear ribosomal RNA gene operon is interrupted by internal transcribed spacer (ITS) 1 and ITS2. Although the secondary structure of ITS2 has been widely investigated, less is known about ITS1 and its structure. In this study, the secondary structure of ITS1 sequences for Paramecium and other ciliates was predicted. Each Paramecium ITS1 forms an open loop with three helices, A through C. Helix B was highly conserved among Paramecium, and similar helices were found in other ciliates. A phylogenetic analysis using the ITS1 sequences showed high resolution, implying that ITS1 is a good tool for species-level analyses.
mySyntenyPortal: an application package to construct websites for synteny block analysis.

PubMed

Lee, Jongin; Lee, Daehwan; Sim, Mikang; Kwon, Daehong; Kim, Juyeon; Ko, Younhee; Kim, Jaebum

2018-06-05

Advances in sequencing technologies have facilitated large-scale comparative genomics based on whole genome sequencing. Constructing and investigating conserved genomic regions among multiple species (called synteny blocks) are essential in comparative genomics. However, they require significant amounts of computational resources and time in addition to bioinformatics skills. Many web interfaces have been developed to make such tasks easier. However, these web interfaces cannot be customized for users who want to use their own set of genome sequences or definition of synteny blocks. To resolve this limitation, we present mySyntenyPortal, a stand-alone application package to construct websites for synteny block analyses by using users' own genome data. mySyntenyPortal provides both command line and web-based interfaces to build and manage websites for large-scale comparative genomic analyses. The websites can also be easily published and accessed by other users. To demonstrate the usability of mySyntenyPortal, we present an example study for building websites to compare genomes of three mammalian species (human, mouse, and cow) and show how they can be easily utilized to identify potential genes affected by genome rearrangements. mySyntenyPortal will contribute to extended comparative genomic analyses based on large-scale whole genome sequences by providing unique functionality to support the easy creation of interactive websites for synteny block analyses from users' own genome data.
Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations

PubMed Central

Marinier, Eric; Zaheer, Rahat; Berry, Chrystal; Weedmark, Kelly A.; Domaratzki, Michael; Mabon, Philip; Knox, Natalie C.; Reimer, Aleisha R.; Graham, Morag R.; Chui, Linda; Patterson-Fortin, Laura; Zhang, Jian; Pagotto, Franco; Farber, Jeff; Mahony, Jim; Seyer, Karine; Bekal, Sadjia; Tremblay, Cécile; Isaac-Renton, Judy; Prystajecky, Natalie; Chen, Jessica; Slade, Peter

2017-01-01

The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune. PMID:29048594
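The notion of content that is common to targets and absent from non-targets can be sketched with plain k-mer sets. This is a deliberately simplified illustration and not Neptune's algorithm: it uses exact matching only, ignores reverse complements, and replaces the probabilistic models with a hypothetical `min_target_fraction` cutoff. All names and sequences below are assumptions.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def signature_kmers(targets, non_targets, k, min_target_fraction=1.0):
    """k-mers seen in enough targets and in no non-target.

    min_target_fraction is a stand-in for a statistical criterion; 1.0 means
    the k-mer must occur in every target sequence.
    """
    counts = {}
    for seq in targets:
        for km in kmers(seq, k):  # set, so each target counts a k-mer once
            counts[km] = counts.get(km, 0) + 1
    excluded = set()
    for seq in non_targets:
        excluded |= kmers(seq, k)
    needed = min_target_fraction * len(targets)
    return {km for km, c in counts.items() if c >= needed and km not in excluded}

# Hypothetical toy sequences.
targets = ["AAACGTTT", "GGACGTCC"]
non_targets = ["AAATTTGG"]
print(signature_kmers(targets, non_targets, k=4))
```

With these inputs only `ACGT` is shared by both targets and missing from the non-target. A realistic tool would additionally merge neighbouring k-mers into loci and tolerate mismatches, as the abstract indicates.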
Using variable rate models to identify genes under selection in sequence pairs: their validity and limitations for EST sequences.

PubMed

Church, Sheri A; Livingstone, Kevin; Lai, Zhao; Kozik, Alexander; Knapp, Steven J; Michelmore, Richard W; Rieseberg, Loren H

2007-02-01

Using likelihood-based variable selection models, we determined if positive selection was acting on 523 EST sequence pairs from two lineages of sunflower and lettuce. Variable rate models are generally not used for comparisons of sequence pairs due to the limited information and the inaccuracy of estimates of specific substitution rates. However, previous studies have shown that the likelihood ratio test (LRT) is reliable for detecting positive selection, even with low numbers of sequences. These analyses identified 56 genes that show a signature of selection, of which 75% were not identified by simpler models that average selection across codons. Subsequent mapping studies in sunflower show four of five of the positively selected genes identified by these methods mapped to domestication QTLs. We discuss the validity and limitations of using variable rate models for comparisons of sequence pairs, as well as the limitations of using ESTs for identification of positively selected genes.
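The likelihood ratio test mentioned above compares the log-likelihoods of a null model and a nested alternative. The sketch below assumes that the two models differ by two free parameters, so the statistic is referred to a chi-square distribution with 2 degrees of freedom; the abstract does not state the models or degrees of freedom used, and the log-likelihood values are hypothetical. For 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

def lrt_df2(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by 2 parameters.

    Returns (statistic, p_value). The p-value uses the exact chi-square
    survival function for df=2, which is exp(-x / 2).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical log-likelihoods for one sequence pair.
stat, p = lrt_df2(lnl_null=-1234.5, lnl_alt=-1230.0)
print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
```

When many genes are screened, as with the 523 sequence pairs here, p-values of this kind are usually adjusted for multiple testing; whether and how the authors did so is not stated in the abstract.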
LINE-1 retrotransposons: from 'parasite' sequences to functional elements.

PubMed

Paço, Ana; Adega, Filomena; Chaves, Raquel

2015-02-01

Long interspersed nuclear elements-1 (LINE-1) are the most abundant and active retrotransposons in mammalian genomes. Traditionally, the occurrence of LINE-1 sequences in the genome of mammals has been explained by the selfish DNA hypothesis. Nevertheless, recently, it has also been argued that these sequences could play important roles in these genomes, such as in the regulation of gene expression, genome modelling and X-chromosome inactivation. The non-random chromosomal distribution is a striking feature of these retroelements that somehow reflects their functionality. In the present study, we have isolated and analysed a fraction of the open reading frame 2 (ORF2) LINE-1 sequence from three rodent species, Cricetus cricetus, Peromyscus eremicus and Praomys tullbergi. Physical mapping of the isolated sequences revealed an interspersed longitudinal AT pattern of distribution along all the chromosomes of the complement in the three genomes. A detailed analysis shows that these sequences are preferentially located in the euchromatic regions, although some signals could be detected in the heterochromatin. In addition, a coincidence between the location of imprinted gene regions (such as the Xist and Tsix gene regions) and the LINE-1 retroelements was also observed. According to these results, we propose an involvement of LINE-1 sequences in different genomic events such as gene imprinting, X-chromosome inactivation and evolution of repetitive sequences located at the heterochromatic regions (e.g. satellite DNA sequences) of the rodents' genomes analysed.
Cis-acting elements in the promoter region of the human aldolase C gene.

PubMed

Buono, P; de Conciliis, L; Olivetta, E; Izzo, P; Salvatore, F

1993-08-16

We investigated the cis-acting sequences involved in the expression of the human aldolase C gene by transient transfections into human neuroblastoma cells (SKNBE). We demonstrate that 420 bp of the 5'-flanking DNA direct at high efficiency the transcription of the CAT reporter gene. A deletion between -420 bp and -164 bp causes a 60% decrease of CAT activity. Gel shift and DNase I footprinting analyses revealed four protected elements: A, B, C and D. Competition analyses indicate that Sp1 or factors sharing a similar sequence specificity bind to elements A and B, but not to elements C and D. Sequence analysis shows a half palindromic ERE motif (GGTCA), in elements B and D. Region D binds a transactivating factor which appears also essential to stabilize the initiation complex.
The spectra of WC9 stars: evolution and dust formation

NASA Astrophysics Data System (ADS)

Williams, P. M.; Crowther, P. A.; van der Hucht, K. A.

2015-05-01

We present analyses of new optical spectra of three WC9 stars, WR 88, WR 92 and WR 103 to test the suggestion that they exemplify an evolutionary sequence amongst the WC9 stars. The spectrum of WR 88 shows conspicuous lines of N III and N IV, leading to classification as a transitional WN8o/WC9 star. The three stars show a sequence of increasing O II and O III line strengths, confirming and extending earlier studies. The spectra were analysed using CMFGEN models, finding greater abundances of oxygen and carbon in WR 103 than in WR 92 and, especially, in WR 88. Of the three stars, only WR 103 makes circumstellar dust. We suggest that oxygen itself does not enhance this process but that it is its higher carbon abundance that allows WR 103 to make dust.
Characterization of tannase protein sequences of bacteria and fungi: an in silico study.

PubMed

Banerjee, Amrita; Jana, Arijit; Pati, Bikash R; Mondal, Keshab C; Das Mohapatra, Pradeep K

2012-04-01

The tannase protein sequences of 149 bacteria and 36 fungi were retrieved from the NCBI database. Among them, only the 77 bacterial and 31 fungal tannase sequences that have different amino acid compositions were taken. These sequences were analysed for different physical and chemical properties, superfamily search, multiple sequence alignment, phylogenetic tree construction and motif finding to find out the functional motif and the evolutionary relationship among them. The superfamily search for these tannases exposed the occurrence of proline iminopeptidase-like, biotin biosynthesis protein BioH, O-acetyltransferase, carboxylesterase/thioesterase 1, carbon-carbon bond hydrolase, haloperoxidase, prolyl oligopeptidase, C-terminal domain and mycobacterial antigens families and alpha/beta hydrolase superfamily. Some bacterial and fungal sequences showed similarity with different families individually. The multiple sequence alignment of these tannase protein sequences showed conserved regions at different stretches with maximum homology from amino acid residues 389-469 and 482-523, which could be used for designing degenerate primers or probes specific for tannase-producing bacterial and fungal species. The phylogenetic tree showed two different clusters; one has only bacteria and the other has both fungi and bacteria, showing some relationship between these different genera. In the second cluster, however, nearly all fungal species were found together in a corner, which indicates sequence-level similarity among fungal genera. Analysis of the distribution of fourteen motifs revealed that Motif 1, with a signature amino acid sequence of 29 amino acids, i.e. GCSTGGREALKQAQRWPHDYDGIIANNPA, was uniformly observed in 83.3 % of studied tannase sequences, suggesting its participation in the structure and enzymatic function.
Quasispecies Analyses of the HIV-1 Near-full-length Genome With Illumina MiSeq

PubMed Central

Ode, Hirotaka; Matsuda, Masakazu; Matsuoka, Kazuhiro; Hachiya, Atsuko; Hattori, Junko; Kito, Yumiko; Yokomaku, Yoshiyuki; Iwatani, Yasumasa; Sugiura, Wataru

2015-01-01

Human immunodeficiency virus type-1 (HIV-1) exhibits high between-host genetic diversity and within-host heterogeneity, recognized as quasispecies. Because HIV-1 quasispecies fluctuate in terms of multiple factors, such as antiretroviral exposure and host immunity, analyzing the HIV-1 genome is critical for selecting effective antiretroviral therapy and understanding within-host viral coevolution mechanisms. Here, to obtain HIV-1 genome sequence information that includes minority variants, we sought to develop a method for evaluating quasispecies throughout the HIV-1 near-full-length genome using the Illumina MiSeq benchtop deep sequencer. To ensure the reliability of minority mutation detection, we applied an analysis method of sequence read mapping onto a consensus sequence derived from de novo assembly followed by iterative mapping and subsequent unique error correction. Deep sequencing analyses of an HIV-1 clone showed that the analysis method reduced erroneous base prevalence below 1% in each sequence position and discarded only < 1% of all collected nucleotides, maximizing the usage of the collected genome sequences. Further, we designed primer sets to amplify the HIV-1 near-full-length genome from clinical plasma samples. Deep sequencing of 92 samples in combination with the primer sets and our analysis method provided sufficient coverage to identify >1%-frequency sequences throughout the genome. When we evaluated sequences of pol genes from 18 treatment-naïve patients' samples, the deep sequencing results were in agreement with Sanger sequencing and identified numerous additional minority mutations. The results suggest that our deep sequencing method would be suitable for identifying within-host viral population dynamics throughout the genome. PMID:26617593
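The ">1%-frequency" criterion can be made concrete with a per-position base count over aligned reads. The sketch below is only an illustration of that thresholding step: it assumes reads that are already aligned and of equal length, and it does not reproduce the authors' pipeline (de novo assembly, iterative mapping, error correction). The function name and the toy reads are hypothetical.

```python
from collections import Counter

def minority_variants(reads, threshold=0.01):
    """Report non-consensus bases whose frequency exceeds threshold.

    reads: equal-length, pre-aligned strings.
    Returns {position: {base: frequency}} for positions with minority bases.
    """
    result = {}
    for pos, column in enumerate(zip(*reads)):
        counts = Counter(base for base in column if base in "ACGT")
        depth = sum(counts.values())
        if depth == 0:
            continue  # no usable bases at this position
        consensus, _ = counts.most_common(1)[0]
        minor = {
            base: count / depth
            for base, count in counts.items()
            if base != consensus and count / depth > threshold
        }
        if minor:
            result[pos] = minor
    return result

# Hypothetical pileup: 97 reads carry C at position 1, 3 reads carry T.
reads = ["ACGT"] * 97 + ["ATGT"] * 3
print(minority_variants(reads))
```

A threshold like this is only meaningful when the sequencing error rate at each position is below it, which is why the abstract stresses reducing erroneous base prevalence below 1%.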
Naturally occurring deletions/insertions in HBV core promoter tend to decrease in hepatitis B e antigen-positive chronic hepatitis B patients during antiviral therapy.

PubMed

Peng, Yaqin; Liu, Baoming; Hou, Jinlin; Sun, Jian; Hao, Ran; Xiang, Kuanhui; Yan, Ling; Zhang, Jiangbo; Zhuang, Hui; Li, Tong

2015-01-01

Mutations in HBV core promoter (CP) are suggested to affect viral replication and disease progression. We investigated CP deletion/insertion mutations (Del/Ins) in hepatitis B e antigen (HBeAg)-positive chronic hepatitis B (CHB) patients before and during antiviral treatment. Direct and clone sequencings were used for detection of CP Del/Ins in 12 patients. The dynamic changes of CP Del/Ins were tracked in these cases until week 48 of treatment. The effects of Del/Ins on CP activities and hepatitis B X protein (HBx) were analysed using luciferase assay and sequence comparison, respectively. Furthermore, 292 untreated HBeAg-positive CHB cases were also analysed. Twelve cases with multi-peak PCR direct sequencing electropherograms at baseline were confirmed to have CP Del/Ins by clone sequencing, with detection rates varying from 14.8% to 93.3% of clones analysed. Follow-up studies showed the detection rates of CP Del/Ins in patients decreased from 100% (12/12) at baseline to 16.7% (2/12) at week 48 of treatment (P<0.001), in parallel with a decline in HBV DNA, hepatitis B surface antigen (HBsAg), alanine aminotransferase (ALT) and aspartate transaminase (AST) levels along with an increase in HBeAg loss. Luciferase assay results showed distinct promoter activities among Del/Ins-harbouring CP sequences. Importantly, 71.8% (148/206) of Del/Ins sequences potentially resulted in HBx carboxy-terminal truncations. CP Del/Ins mutations were also found in 27.4% (80/292) of untreated cases. Naturally occurring complex of CP Del/Ins mutants existed in untreated HBeAg-positive CHB patients. These mutations would affect HBV transcription activities and integrity of HBx, which might correlate with disease progression. Their prevalence decreases on antiviral therapy in parallel with the decline in HBV DNA, HBsAg and ALT and AST levels.
Ancient Recombination Events between Human Herpes Simplex Viruses.

PubMed

Burrel, Sonia; Boutolleau, David; Ryu, Diane; Agut, Henri; Merkel, Kevin; Leendertz, Fabian H; Calvignac-Spencer, Sébastien

2017-07-01

Herpes simplex viruses 1 and 2 (HSV-1 and HSV-2) are seen as close relatives but also unambiguously considered as evolutionary independent units. Here, we sequenced the genomes of 18 HSV-2 isolates characterized by divergent UL30 gene sequences to further elucidate the evolutionary history of this virus. Surprisingly, genome-wide recombination analyses showed that all HSV-2 genomes sequenced to date contain HSV-1 fragments. Using phylogenomic analyses, we could also show that two main HSV-2 lineages exist. One lineage is mostly restricted to subSaharan Africa whereas the other has reached a global distribution. Interestingly, only the worldwide lineage is characterized by ancient recombination events with HSV-1. Our findings highlight the complexity of HSV-2 evolution, a virus of putative zoonotic origin which later recombined with its human-adapted relative. They also suggest that coinfections with HSV-1 and 2 may have genomic and potentially functional consequences and should therefore be monitored more closely. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
A novel flavivirus detected in two Aedes spp. collected near the demilitarized zone of the Republic of Korea.

PubMed

Korkusol, Achareeya; Takhampunya, Ratree; Hang, Jun; Jarman, Richard G; Tippayachai, Bousaraporn; Kim, Heung-Chul; Chong, Sung-Tae; Davidson, Silas A; Klein, Terry A

2017-05-01

Flaviviruses comprise a large and diverse group of positive-stranded RNA viruses, including tick-, mosquito- and unknown-vector-borne flaviviruses. A novel flavivirus was detected in pools of Aedes vexans nipponii (n=1) and Aedes esoensis (n=3) collected in 2012 and 2013 near the demilitarized zone (DMZ), Republic of Korea (ROK). Phylogenetic analyses of the NS5, E gene and complete polyprotein coding sequence (CDS) showed that the novel virus fell within the Aedes-borne flaviviruses (ABFVs), with nucleotide identity ranging from 57.8-75.1 %, 46.1-74.2 % and 51.1-76.2 %, respectively. While the novel ABFV was distant from other flaviviruses within the group, it formed a clade with Ilomantsi virus (ILOV). Sequence alignments of the partial NS5 gene, full-length E gene and polyprotein CDS between the novel virus and ILOV showed approximately 76.2 % nucleotide identity and 90 % amino acid identity, respectively. The ABFV identified in Aedes mosquitoes from the ROK is a novel ABFV based on the sequence analyses and is designated as Panmunjeom flavivirus (PANFV).
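Nucleotide and amino acid identity values such as those quoted above are computed from pairwise alignments. The sketch below shows one common convention, in which columns containing a gap in either sequence are skipped; the abstract does not say which convention or software was used, so this choice, the function name, and the toy sequences are assumptions.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical aligned fragments, not real flavivirus sequences.
print(round(percent_identity("ATGGCCAA", "ATGACC-A"), 1))
```

Counting gapped columns as mismatches instead would lower the value, so identities from different studies are only comparable when the convention is the same.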
A sequential analysis of classroom discourse in Italian primary schools: the many faces of the IRF pattern.

PubMed

Molinari, Luisa; Mameli, Consuelo; Gnisci, Augusto

2013-09-01

A sequential analysis of classroom discourse is needed to investigate the conditions under which the triadic initiation-response-feedback (IRF) pattern may host different teaching orientations. The purpose of the study is twofold: first, to describe the characteristics of classroom discourse and, second, to identify and explore the different interactive sequences that can be captured with a sequential statistical analysis. Twelve whole-class activities were video recorded in three Italian primary schools. We observed classroom interaction as it occurs naturally on an everyday basis. In total, we collected 587 min of video recordings. Subsequently, 828 triadic IRF patterns were extracted from this material and analysed with the programme Generalized Sequential Query (GSEQ). The results indicate that classroom discourse may unfold in different ways. In particular, we identified and described four types of sequences. Dialogic sequences were triggered by authentic questions, and continued through further relaunches. Monologic sequences were directed to fulfil the teachers' pre-determined didactic purposes. Co-constructive sequences fostered deduction, reasoning, and thinking. Scaffolding sequences helped and sustained children with difficulties. The application of sequential analyses allowed us to show that interactive sequences may account for a variety of meanings, thus making a significant contribution to the literature and research practice in classroom discourse. © 2012 The British Psychological Society.
Functional Assays and Metagenomic Analyses Reveals Differences between the Microbial Communities Inhabiting the Soil Horizons of a Norway Spruce Plantation

PubMed Central

Uroz, Stéphane; Ioannidis, Panos; Lengelle, Juliette; Cébron, Aurélie; Morin, Emmanuelle; Buée, Marc; Martin, Francis

2013-01-01

In temperate ecosystems, acidic forest soils are among the most nutrient-poor terrestrial environments. In this context, the long-term differentiation of the forest soils into horizons may impact the assembly and the functions of the soil microbial communities. To gain a more comprehensive understanding of the ecology and functional potentials of these microbial communities, a suite of analyses including comparative metagenomics was applied on independent soil samples from a spruce plantation (Breuil-Chenue, France). The objective was to assess whether the decreasing nutrient bioavailability and pH variations that naturally occur between the organic and mineral horizons affect the soil microbial functional biodiversity. The 14 Gbp of pyrosequencing and Illumina sequences generated in this study revealed complex microbial communities dominated by bacteria. Detailed analyses showed that the organic soil horizon was significantly enriched in sequences related to Bacteria, Chordata, Arthropoda and Ascomycota. In contrast, the mineral horizon was significantly enriched in sequences related to Archaea. Our analyses also highlighted that the microbial communities inhabiting the two soil horizons differed significantly in their functional potentials according to functional assays and MG-RAST analyses, suggesting a functional specialisation of these microbial communities. Consistent with this specialisation, our shotgun metagenomic approach revealed a significant increase in the relative abundance of sequences related to glycoside hydrolases in the organic horizon compared to the mineral horizon, which was significantly enriched in glycoside transferases. This functional stratification according to the soil horizon was also confirmed by a significant correlation between the functional assays performed in this study and the functional metagenomic analyses. Together, our results suggest that the soil stratification and particularly the soil resource availability impact the functional diversity and to a lesser extent the taxonomic diversity of the bacterial communities. PMID:23418476
Hierarchical Traces for Reduced NSM Memory Requirements

NASA Astrophysics Data System (ADS)

Dahl, Torbjørn S.

This paper presents work on using hierarchical long term memory to reduce the memory requirements of nearest sequence memory (NSM) learning, a previously published, instance-based reinforcement learning algorithm. A hierarchical memory representation reduces the memory requirements by allowing traces to share common sub-sequences. We present moderated mechanisms for estimating discounted future rewards and for dealing with hidden state using hierarchical memory. We also present an experimental analysis of how the sub-sequence length affects the memory compression achieved and show that the reduced memory requirements do not affect the speed of learning. Finally, we analyse and discuss the persistence of the sub-sequences independent of specific trace instances.
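The memory saving from letting traces share common sub-sequences can be illustrated by storing each distinct fixed-length chunk once and representing traces as lists of chunk identifiers. This is a generic sketch of the sharing idea, not the hierarchical NSM representation from the paper; the function name, the `chunk_len` parameter, and the toy traces are assumptions.

```python
def compress_traces(traces, chunk_len):
    """Store each distinct chunk once; traces become lists of chunk ids.

    Returns (chunk_ids, compressed), where chunk_ids maps a chunk (tuple of
    symbols) to an integer id and compressed holds one id list per trace.
    """
    chunk_ids = {}
    compressed = []
    for trace in traces:
        ids = []
        for start in range(0, len(trace), chunk_len):
            chunk = tuple(trace[start:start + chunk_len])
            # setdefault assigns the next free id only for unseen chunks
            ids.append(chunk_ids.setdefault(chunk, len(chunk_ids)))
        compressed.append(ids)
    return chunk_ids, compressed

# Hypothetical traces of observation/action symbols.
traces = [list("abcdabcd"), list("abcdefgh")]
chunk_ids, compressed = compress_traces(traces, chunk_len=4)
stored_symbols = sum(len(chunk) for chunk in chunk_ids)
print(compressed, stored_symbols)
```

Here 16 raw symbols reduce to 8 stored symbols plus four references. Shorter chunks are shared more often but need more references, which is the kind of dependence on sub-sequence length that the abstract says was examined experimentally.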

Genomic analysis of expressed sequence tags in American black bear Ursus americanus

PubMed Central

2010-01-01

Background: Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results: Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion: We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065
Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

PubMed

Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

2010-03-26

Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.
STAT1:DNA sequence-dependent binding modulation by phosphorylation, protein:protein interactions and small-molecule inhibition

PubMed Central

Bonham, Andrew J.; Wenta, Nikola; Osslund, Leah M.; Prussin, Aaron J.; Vinkemeier, Uwe; Reich, Norbert O.

2013-01-01

The DNA-binding specificity and affinity of the dimeric human transcription factor (TF) STAT1 were assessed by total internal reflectance fluorescence protein-binding microarrays (TIRF-PBM) to evaluate the effects of protein phosphorylation, higher-order polymerization and small-molecule inhibition. Active, phosphorylated STAT1 showed binding preferences consistent with prior characterization, whereas unphosphorylated STAT1 showed a weak-binding preference for one-half of the GAS consensus site, consistent with recent models of STAT1 structure and function in response to phosphorylation. This altered-binding preference was further tested by use of the inhibitor LLL3, which we show to disrupt STAT1 binding in a sequence-dependent fashion. To determine if this sequence-dependence is specific to STAT1 and not a general feature of human TF biology, the TF Myc/Max was analysed and tested with the inhibitor Mycro3. Myc/Max inhibition by Mycro3 is sequence independent, suggesting that the sequence-dependent inhibition of STAT1 may be specific to this system and a useful target for future inhibitor design. PMID:23180800
Skeletal development in the African elephant and ossification timing in placental mammals

PubMed Central

Hautier, Lionel; Stansfield, Fiona J.; Allen, W. R. Twink; Asher, Robert J.

2012-01-01

We provide here unique data on elephant skeletal ontogeny. We focus on the sequence of cranial and post-cranial ossification events during growth in the African elephant (Loxodonta africana). Previous analyses on ossification sequences in mammals have focused on monotremes, marsupials, boreoeutherian and xenarthran placentals. Here, we add data on ossification sequences in an afrotherian. We use two different methods to quantify sequence heterochrony: the sequence method and event-pairing/Parsimov. Compared with other placentals, elephants show late ossifications of the basicranium, manual and pedal phalanges, and early ossifications of the ischium and metacarpals. Moreover, ossification in elephants starts very early and progresses rapidly. Specifically, the elephant exhibits the same percentage of bones showing an ossification centre at the end of the first third of its gestation period as the mouse and hamster have close to birth. Elephants show a number of features of their ossification patterns that differ from those of other placental mammals. The pattern of the initiation of the ossification evident in the African elephant underscores a possible correlation between the timing of ossification onset and gestation time throughout mammals. PMID:22298853
The Neural Correlates of Implicit Sequence Learning in Schizophrenia

PubMed Central

Marvel, Cherie L.; Turner, Beth M.; O’Leary, Daniel S.; Johnson, Hans J.; Pierson, Ronald K.; Boles Ponto, Laura L.; Andreasen, Nancy C.

2009-01-01

Twenty-seven schizophrenia spectrum patients and 25 healthy controls performed a probabilistic version of the serial reaction time task (SRT) that included sequence trials embedded within random trials. Patients showed diminished, yet measurable, sequence learning. Postexperimental analyses revealed that a group of patients performed above chance when generating short spans of the sequence. This high-generation group showed SRT learning that was similar in magnitude to that of controls. Their learning was evident from the very 1st block; however, unlike controls, learning did not develop further with continued testing. A subset of 12 patients and 11 controls performed the SRT in conjunction with positron emission tomography. High-generation performance, which corresponded to SRT learning in patients, correlated to activity in the premotor cortex and parahippocampus. These areas have been associated with stimulus-driven visuospatial processing. Taken together, these results suggest that a subset of patients who showed moderate success on the SRT used an explicit stimulus-driven strategy to process the sequential stimuli. This adaptive strategy facilitated sequence learning but may have interfered with conventional implicit learning of the overall stimulus pattern. PMID:17983290
Bacteriomes of the corn leafhopper, Dalbulus maidis (DeLong & Wolcott, 1923) (Insecta, Hemiptera, Cicadellidae: Deltocephalinae) harbor Sulcia symbiont: molecular characterization, ultrastructure, and transovarial transmission.

PubMed

Brentassi, María Eugenia; Franco, Ernesto; Balatti, Pedro; Medina, Rocío; Bernabei, Franco; Marino de Remes Lenicov, Ana M

2017-05-01

In this study, we surveyed the bacteriome-associated microbiota of the corn leafhopper Dalbulus maidis by means of histological, ultrastructural, and molecular analyses. Amplification and sequencing of 16S rDNA genes revealed that the endosymbiont "Candidatus Sulcia muelleri" (Phylum Bacteroidetes) resides in bacteriomes of D. maidis. Phylogenetic analysis showed that the sequence was closely allied to others found in representatives of the subfamily Deltocephalinae. We failed to amplify other sequences, such as that of "Candidatus Nasuia deltocephalinicola," a co-primary symbiont frequently associated with deltocephaline leafhoppers. In addition, a metagenetic analysis carried out in order to investigate the presence of other bacteriome-associated bacteria of D. maidis showed that the sequence of Sulcia accounted for 98.56% of all the sequences. Histological and ultrastructural observations showed that microorganisms harbored in bacteriomes (central syncytium and cytoplasm of uninucleate bacteriocytes) look like other Sulcia described in hemipteran species and that they were transovarially transmitted from mother to offspring, which is typical of obligate endosymbionts. The exclusive presence of Sulcia in the bacteriomes of D. maidis is discussed.
[Methods, challenges and opportunities for big data analyses of microbiome].

PubMed

Sheng, Hua-Fang; Zhou, Hong-Wei

2015-07-01

The microbiome is a novel research field related to a variety of chronic inflammatory diseases. Technically, there are two major approaches to the analysis of the microbiome: the metataxonome, by sequencing the 16S rRNA variable tags, and the metagenome, by shotgun sequencing of the total microbial (mainly bacterial) genome mixture. The 16S rRNA sequencing analysis pipeline includes sequence quality control, diversity analyses, taxonomy and statistics; metagenome analysis further includes gene annotation and functional analyses. With the development of sequencing techniques, the cost of sequencing will decrease, and big data analyses will become the central task. Data standardization, accumulation, modeling and disease prediction are crucial for future exploitation of these data. Meanwhile, the information property in these data, and the functional verification with culture-dependent and culture-independent experiments remain the focus in future research. Studies of the human microbiome will bring a better understanding of the relations between the human body and the microbiome, especially in the context of disease diagnosis and therapy, which promise rich research opportunities.
Comparative molecular cytogenetics of major repetitive sequence families of three Dendrobium species (Orchidaceae) from Bangladesh

PubMed Central

Begum, Rabeya; Alam, Sheikh Shamimul; Menzel, Gerhard; Schmidt, Thomas

2009-01-01

Background and Aims Dendrobium species show tremendous morphological diversity and have broad geographical distribution. As repetitive sequence analysis is a useful tool to investigate the evolution of chromosomes and genomes, the aim of the present study was the characterization of repetitive sequences from Dendrobium moschatum for comparative molecular and cytogenetic studies in the related species Dendrobium aphyllum, Dendrobium aggregatum and representatives from other orchid genera. Methods In order to isolate highly repetitive sequences, a c0t-1 DNA plasmid library was established. Repeats were sequenced and used as probes for Southern hybridization. Sequence divergence was analysed using bioinformatic tools. Repetitive sequences were localized along orchid chromosomes by fluorescence in situ hybridization (FISH). Key Results Characterization of the c0t-1 library resulted in the detection of repetitive sequences including the (GA)n dinucleotide DmoO11, numerous Arabidopsis-like telomeric repeats and the highly amplified dispersed repeat DmoF14. The DmoF14 repeat is conserved in six Dendrobium species but diversified in representative species of three other orchid genera. FISH analyses showed the genome-wide distribution of DmoF14 in D. moschatum, D. aphyllum and D. aggregatum. Hybridization with the telomeric repeats demonstrated Arabidopsis-like telomeres at the chromosome ends of Dendrobium species. However, FISH using the telomeric probe revealed two pairs of chromosomes with strong intercalary signals in D. aphyllum. FISH showed the terminal position of 5S and 18S–5·8S–25S rRNA genes and a characteristic number of rDNA sites in the three Dendrobium species. Conclusions The repeated sequences isolated from D. moschatum c0t-1 DNA constitute major DNA families of the D. moschatum, D. aphyllum and D. aggregatum genomes with DmoF14 representing an ancient component of orchid genomes. 
Large intercalary telomere-like arrays suggest chromosomal rearrangements in D. aphyllum while the number and localization of rRNA genes as well as the species-specific distribution pattern of an abundant microsatellite reflect the genomic diversity of the three Dendrobium species. PMID:19635741
cDNA sequences and organization of IgM heavy chain genes in two holostean fish.

PubMed

Wilson, M R; van Ravenstein, E; Miller, N W; Clem, L W; Middleton, D L; Warr, G W

1995-01-01

Immunoglobulin M heavy chain (mu) sequences of two holostean fish, the bowfin, Amia calva, and the longnose gar, Lepisosteus osseus, were amplified from spleen mRNA by RACE-PCR, cloned, and sequenced. Each mu chain showed the conserved four constant domain structure typical of a secreted mu chain. Southern blot analyses with specific heavy chain variable (VH) and constant (CH) region probes suggest that both fish possess an IgH locus that resembles that of the teleosts, amphibians, and mammals in its organization. The overall sequence similarity of gar and bowfin mu chains was 60% and 48% at the nucleotide and amino acid levels, respectively, while similarity to the mu chains of teleosts and elasmobranchs was lower. The bowfin mu chain possesses a distinctive proline-rich sequence at the C mu 1/C mu 2 boundary; a shorter proline-rich sequence is present at this position in the gar mu chain. Both gar and bowfin show, in their C mu 4 sequences, motifs that could serve as cryptic splice donor sites for the production of mRNA encoding the membrane-bound form of the mu chains, and the bowfin also shows a potential cryptic splice donor site in the C mu 3 exon.
Analysis of whole genome sequencing for the Escherichia coli O157:H7 typing phages.

PubMed

Cowley, Lauren A; Beckett, Stephen J; Chase-Topping, Margo; Perry, Neil; Dallman, Tim J; Gally, David L; Jenkins, Claire

2015-04-08

Shiga toxin producing Escherichia coli O157 can cause severe bloody diarrhea and haemolytic uraemic syndrome. Phage typing of E. coli O157 facilitates public health surveillance and outbreak investigations; certain phage types are more likely to occupy specific niches and are associated with specific age groups and disease severity. The aim of this study was to analyse the genome sequences of 16 (fourteen T4 and two T7) E. coli O157 typing phages and to determine the genes responsible for the subtle differences in phage type profiles. The typing phages were sequenced using paired-end Illumina sequencing at The Genome Analysis Centre and the Animal Health and Veterinary Laboratories Agency and bioinformatics programs including Velvet, Brig and Easyfig were used to analyse them. A two-way Euclidean cluster analysis highlighted the associations between groups of phage types and typing phages. The analysis showed that the T7 typing phages (9 and 10) differed by only three genes and that the T4 typing phages formed three distinct groups of similar genomic sequences: Group 1 (1, 8, 11, 12 and 15, 16), Group 2 (3, 6, 7 and 13) and Group 3 (2, 4, 5 and 14). The E. coli O157 phage typing scheme exhibited a significantly modular network linked to the genetic similarity of each group showing that these groups are specialised to infect a subset of phage types. Sequencing the typing phage has enabled us to identify the variable genes within each group and to determine how this corresponds to changes in phage type.
RAPD and Internal Transcribed Spacer Sequence Analyses Reveal Zea nicaraguensis as a Section Luxuriantes Species Close to Zea luxurians

PubMed Central

Wang, Pei; Lu, Yanli; Zheng, Mingmin; Rong, Tingzhao; Tang, Qilin

2011-01-01

The genetic relationship of a newly discovered teosinte from Nicaragua, Zea nicaraguensis with waterlogging tolerance, was determined based on randomly amplified polymorphic DNA (RAPD) markers and the internal transcribed spacer (ITS) sequences of nuclear ribosomal DNA using 14 accessions from Zea species. RAPD analysis showed that a total of 5,303 fragments were produced by 136 random decamer primers, of which 84.86% of bands were polymorphic. RAPD-based UPGMA analysis demonstrated that the genus Zea can be divided into section Luxuriantes including Zea diploperennis, Zea luxurians, Zea perennis and Zea nicaraguensis, and section Zea including Zea mays ssp. mexicana, Zea mays ssp. parviglumis, Zea mays ssp. huehuetenangensis and Zea mays ssp. mays. ITS sequence analysis showed the lengths of the entire ITS region of the 14 taxa in Zea varied from 597 to 605 bp. The average GC content was 67.8%. In addition to the insertion/deletions, 78 variable sites were recorded in the total ITS region with 47 in ITS1, 5 in 5.8S, and 26 in ITS2. Sequences of these taxa were analyzed with neighbor-joining (NJ) and maximum parsimony (MP) methods to construct the phylogenetic trees, selecting Tripsacum dactyloides L. as the outgroup. The phylogenetic relationships of Zea species inferred from the ITS sequences are highly concordant with the RAPD evidence that resolved two major subgenus clades. Both RAPD and ITS sequence analyses indicate that Zea nicaraguensis is more closely related to Zea luxurians than to the other teosintes and cultivated maize, and should be regarded as a section Luxuriantes species. PMID:21525982
Biosystematics and Conservation: A Case Study with Two Enigmatic and Uncommon Species of Crassula from New Zealand

PubMed Central

De Lange, P. J.; Heenan, P. B.; Keeling, D. J.; Murray, B. G.; Smissen, R.; Sykes, W. R.

2008-01-01

Background and Aims Crassula hunua and C. ruamahanga have been taxonomically controversial. Here their distinctiveness is assessed so that their taxonomic and conservation status can be clarified. Methods Populations of these two species were analysed using morphological, chromosomal and DNA sequence data. Key Results It proved impossible to differentiate between these two species using 12 key morphological characters. Populations were found to be chromosomally variable with 11 different chromosome numbers ranging from 2n = 42 to 2n = 100. Meiotic behaviour and levels of pollen stainability were both variable. Phylogenetic analyses showed that differences exist in both nuclear and plastid DNA sequences between individual plants, sometimes from the same population. Conclusions The results suggest that these plants are a species complex that has evolved through interspecific hybridization and polyploidy. Their high levels of chromosomal and DNA sequence variation present a problem for their conservation. PMID:18055560
Analyses of Mitogenome Sequences Revealed that Asian Citrus Psyllids (Diaphorina citri) from California Were Related to Those from Florida.

PubMed

Wu, Fengnian; Kumagai, Luci; Cen, Yijing; Chen, Jianchi; Wallis, Christopher M; Polek, MaryLou; Jiang, Hongyan; Zheng, Zheng; Liang, Guangwen; Deng, Xiaoling

2017-08-31

Asian citrus psyllid (ACP, Diaphorina citri Kuwayama) transmits "Candidatus Liberibacter asiaticus" (CLas), an unculturable alpha-proteobacterium associated with citrus Huanglongbing (HLB). CLas has recently been found in California. Understanding ACP population diversity is necessary for HLB regulatory practices aimed at reducing CLas spread. In this study, two circular ACP mitogenome sequences from California (mt-CApsy, ~15,027 bp) and Florida (mt-FLpsy, ~15,012 bp), USA, were acquired. Each mitogenome contained 13 protein coding genes, 2 ribosomal RNA and 22 transfer RNA genes, and a control region varying in size. The Californian mt-CApsy was identical to the Floridian mt-FLpsy, but different from the mitogenome (mt-GDpsy) of Guangdong, China, in 50 single nucleotide polymorphisms (SNPs). Further analyses were performed on sequences in cox1 and trnAsn regions with 100 ACPs, and on SNPs in the nad1-nad4-nad5 locus through PCR with 252 ACP samples. All results showed the presence of a Chinese ACP cluster (CAC) and an American ACP cluster (AAC). We proposed that ACP in California was likely not introduced from China, based on our current ACP collection, but from somewhere in America. However, more studies with ACP samples from around the world are needed. ACP mitogenome sequence analyses will facilitate ACP population research.
Pretreatment drug resistance in a large countrywide Ethiopian HIV-1C cohort: a comparison of Sanger and high-throughput sequencing.

PubMed

Telele, Nigus Fikrie; Kalu, Amare Worku; Gebre-Selassie, Solomon; Fekade, Daniel; Abdurahman, Samir; Marrone, Gaetano; Neogi, Ujjwal; Tegbaru, Belete; Sönnerborg, Anders

2018-05-15

Baseline plasma samples of 490 randomly selected antiretroviral therapy (ART) naïve patients from seven hospitals participating in the first nationwide Ethiopian HIV-1 cohort were analysed for surveillance drug resistance mutations (sDRM) by population based Sanger sequencing (PBSS). Also next generation sequencing (NGS) was used in a subset of 109 baseline samples of patients. Treatment outcome after 6- and 12-months was assessed by on-treatment (OT) and intention-to-treat (ITT) analyses. Transmitted drug resistance (TDR) was detected in 3.9% (18/461) of successfully sequenced samples by PBSS. However, NGS detected sDRM more often (24%; 26/109) than PBSS (6%; 7/109) (p = 0.0001) and major integrase strand transfer inhibitors (INSTI) DRMs were also found in minor viral variants from five patients. Patients with sDRM had more frequent treatment failure in both OT and ITT analyses. The high rate of TDR by NGS and the identification of preexisting INSTI DRMs in minor wild-type HIV-1 subtype C viral variants infected Ethiopian patients underscores the importance of TDR surveillance in low- and middle-income countries and shows added value of high-throughput NGS in such studies.
Global Distribution of Polaromonas Phylotypes - Evidence for a Highly Successful Dispersal Capacity

PubMed Central

Darcy, John L.; Lynch, Ryan C.; King, Andrew J.; Robeson, Michael S.; Schmidt, Steven K.

2011-01-01

Bacteria from the genus Polaromonas are dominant phylotypes in clone libraries and culture collections from polar and high-elevation environments. Although Polaromonas has been found on six continents, we do not know if the same phylotypes exist in all locations or if they exhibit genetic isolation by distance patterns. To examine their biogeographic distribution, we analyzed all available, long-read 16S rRNA gene sequences of Polaromonas phylotypes from glacial and periglacial environments across the globe. Using genetic isolation by geographic distance analyses, including Mantel tests and Mantel correlograms, we found that Polaromonas phylotypes are globally distributed showing weak isolation by distance patterns at global scales. More focused analyses using discrete, equally sampled distance classes revealed that only two distance classes (out of 12 total) showed significant spatial structuring. Overall, our analyses show that most Polaromonas phylotypes are truly globally distributed, but that some, as yet unknown, environmental variable may be selecting for unique phylotypes at a minority of our global sites. Analyses of aerobiological and genomic data suggest that Polaromonas phylotypes are globally distributed as dormant cells through high-elevation air currents; Polaromonas phylotypes are common in air and snow samples from high altitudes, and a glacial-ice metagenome and the two sequenced Polaromonas genomes contain the gene hipA, suggesting that Polaromonas can form dormant cells. PMID:21897856
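The Mantel test mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes two symmetric distance matrices over the same samples (a genetic one and a geographic one, both given as hypothetical nested lists), uses the Pearson correlation of their upper triangles as the statistic, and estimates a one-sided p-value by jointly permuting the rows and columns of one matrix. The function names and the default permutation count are illustrative choices.

```python
import random
from statistics import mean


def _upper(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(gen, geo, permutations=999, seed=0):
    """Return (r, p) for a one-sided Mantel test (H1: positive correlation).

    gen, geo: symmetric distance matrices over the same samples.
    The sample labels of `gen` are shuffled, which permutes its rows
    and columns together while `geo` stays fixed.
    """
    rng = random.Random(seed)
    n = len(gen)
    geo_vec = _upper(geo)
    r_obs = _pearson(_upper(gen), geo_vec)
    hits = 1  # the observed arrangement counts as one permutation
    idx = list(range(n))
    for _ in range(permutations):
        rng.shuffle(idx)
        permuted = [[gen[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if _pearson(_upper(permuted), geo_vec) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical example: four sites on a line; genetic distance equals
# geographic distance, so the correlation is perfect.
sites = [0.0, 1.0, 3.0, 7.0]
geo = [[abs(a - b) for b in sites] for a in sites]
r, p = mantel(geo, geo)
```

A Mantel correlogram, also named in the abstract, would repeat this test within separate distance classes; that step is not shown here.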
Isolation and molecular identification of endophytic diazotrophs from seeds and stems of three cereal crops.

PubMed

Liu, Huawei; Zhang, Lei; Meng, Aihua; Zhang, Junbiao; Xie, Miaomiao; Qin, Yaohong; Faulk, Dylan Chase; Zhang, Baohong; Yang, Shushen; Qiu, Li

2017-01-01

Ten strains of endophytic diazotroph were isolated and identified from the plants collected from three different agricultural crop species, wheat, rice and maize, using the nitrogen-free selective isolation conditions. The nitrogen-fixing ability of endophytic diazotroph was verified by the nifH-PCR assay that showed positive nitrogen fixation ability. These identified strains were classified by 879F-RAPD and 16S rRNA sequence analysis. RAPD analyses revealed that the 10 strains were clustered into seven 879F-RAPD groups, suggesting a clonal origin. 16S rRNA sequencing analyses allowed the assignment of the 10 strains to known groups of nitrogen-fixing bacteria, including organisms from the genera Paenibacillus, Enterobacter, Klebsiella and Pantoea. These representative genera are not endophytic diazotrophs in the conventional sense. They may have obtained nitrogen fixation ability through lateral gene transfer; however, the evolutionary forces of lateral gene transfer are not well known. Molecular identification results from 16S rRNA analyses were also confirmed by morphological and biochemical data. The test strains SH6A and MZB showed positive effect on the growth of plants.
A reanalysis of the indirect evidence for recombination in human mitochondrial DNA.

PubMed

Piganeau, G; Eyre-Walker, A

2004-04-01

In an attempt to resolve the controversy about whether recombination occurs in human mtDNA, we have analysed three recently published data sets of complete mtDNA sequences along with 10 RFLP data sets. We have analysed the relationship between linkage disequilibrium (LD) and distance between sites under a variety of conditions using two measures of LD, r² and |D'|. We find that there is a negative correlation between r² and distance in the majority of data sets, but no overall trend for |D'|. Five out of six mtDNA sequence data sets show an excess of homoplasy, but this could be due to either recombination or hypervariable sites. Two additional recombination detection methods used, Geneconv and Maximum Chi-Square, showed nonsignificant results. The overall significance of these findings is hard to quantify because of nonindependence, but our results suggest a lack of evidence for recombination in human mtDNA.
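For readers unfamiliar with the two linkage disequilibrium measures named in the abstract above, the sketch below computes them for a single pair of biallelic sites using the standard textbook definitions (D = p_AB − p_A·p_B, r² = D² / [p_A(1 − p_A)p_B(1 − p_B)], and |D'| = |D| / D_max). It is an illustration only, not the authors' code; the function name and the example frequencies are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (r_squared, abs_d_prime) for two biallelic sites.

    p_ab: frequency of haplotypes carrying allele A at site 1 and B at site 2
    p_a:  frequency of allele A at site 1
    p_b:  frequency of allele B at site 2
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    r_squared = d * d / denom
    # D_max depends on the sign of D (Lewontin's normalisation).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    abs_d_prime = abs(d) / d_max
    return r_squared, abs_d_prime


# Hypothetical frequencies: complete association versus independence.
linked = ld_measures(p_ab=0.5, p_a=0.5, p_b=0.5)
independent = ld_measures(p_ab=0.25, p_a=0.5, p_b=0.5)
```

In an analysis like the one described, such values would be computed for every pair of sites and then correlated with the physical distance between the sites.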
Conflicting social motives in negotiating groups.

PubMed

Weingart, Laurie R; Brett, Jeanne M; Olekalns, Mara; Smith, Philip L

2007-12-01

Negotiators' social motives (cooperative vs. individualistic) influence their strategic behaviors. In this study, the authors used multilevel modeling and analyses of strategy sequences to test hypotheses regarding how negotiators' social motives and the composition of the group influence group members' negotiation strategies. Four-person groups negotiating a 5-issue mixed-motive decision-making task were videotaped, and the tapes were transcribed and coded. Group composition included 2 homogeneous conditions (all cooperators and all individualists) and 3 heterogeneous conditions (3 cooperators and 1 individualist, 2 cooperators and 2 individualists, 1 cooperator and 3 individualists). Results showed that cooperative negotiators adjusted their use of integrative and distributive strategies in response to the social-motive composition of the group, but individualistic negotiators did not. Results from analyses of strategy sequences showed that cooperators responded more systematically to others' behaviors than did individualists. They also redirected the negotiation depending on group composition. (c) 2007 APA, all rights reserved.
Phylogeny of Syndermata (syn. Rotifera): Mitochondrial gene order verifies epizoic Seisonidea as sister to endoparasitic Acanthocephala within monophyletic Hemirotifera.

PubMed

Sielaff, Malte; Schmidt, Hanno; Struck, Torsten H; Rosenkranz, David; Mark Welch, David B; Hankeln, Thomas; Herlyn, Holger

2016-03-01

A monophyletic origin of endoparasitic thorny-headed worms (Acanthocephala) and wheel-animals (Rotifera) is widely accepted. However, the phylogeny inside the clade, be it called Syndermata or Rotifera, has lacked validation by mitochondrial (mt) data. Herein, we present the first mt genome of the key taxon Seison and report conflicting results of phylogenetic analyses: while mt sequence-based topologies showed monophyletic Lemniscea (Bdelloidea+Acanthocephala), gene order analyses supported monophyly of Pararotatoria (Seisonidea+Acanthocephala) and Hemirotifera (Bdelloidea+Pararotatoria). Sequence-based analyses obviously suffered from substitution saturation, compositional bias, and branch length heterogeneity; however, we observed no compromising effects in gene order analyses. Moreover, gene order-based topologies were robust to changes in coding (genes vs. gene pairs, two-state vs. multistate, aligned vs. non-aligned), tree reconstruction methods, and the treatment of the two monogonont mt genomes. Thus, mt gene order verifies seisonids as sister to acanthocephalans within monophyletic Hemirotifera, while deviating results of sequence-based analyses reflect artificial signal. This conclusion implies that the complex life cycle of extant acanthocephalans evolved from a free-living state, as retained by most monogononts and bdelloids, via an epizoic state with a simple life cycle, as shown by seisonids. Hence, Acanthocephala represent a rare example where ancestral transitional stages have counterparts amongst the closest relatives. Copyright © 2015 Elsevier Inc. All rights reserved.
Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.

PubMed

Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H

2017-04-15

Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. 
Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. Copyright © 2017 Sinclair et al.

Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

PubMed Central

Sinclair, Robert M.; Ravantti, Janne J.

2017-01-01

ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. 
Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. PMID:28122979
Mitochondrial DNA diversity of the Amerindian populations living in the Andean Piedmont of Bolivia: Chimane, Moseten, Aymara and Quechua.

PubMed

Corella, Alfons; Bert, Francesc; Pérez-Pérez, Alejandro; Gené, Manel; Turbón, Daniel

2007-01-01

Chimane, Moseten, Aymara and Quechua are Amerindian populations living in the Bolivian Piedmont, a characteristic ecoregion between the eastern slope of the Andean mountains and the Amazonian Llanos de Moxos. In both neighbouring areas, dense and complex societies have developed over the centuries. The Piedmont area is especially interesting from a human peopling perspective since there is no clear evidence regarding the genetic influence and peculiarities of these populations. This land has been used extensively as a territory of economic and cultural exchange between the Andes and Amazonia; however, Chimane and Moseten populations have been sufficiently isolated from their neighbour groups to be recognized as distinct populations. Genetic information suggests that evolutionary processes, such as genetic drift, natural selection and genetic admixture have formed the history of the Piedmont populations. The objective of this study is to characterize the genetic diversity of the Piedmont populations, analysing the sequence variability of the HVR-I control region in the mitochondrial DNA (mtDNA). Haplogroup mtDNA data available from the whole of Central and South America were utilized to determine the relationship of the Piedmont populations with other Amerindian populations. Hair pulls were obtained in situ, and DNA from non-related individuals was extracted using a standard Chelex 100 method. A 401 bp DNA fragment of HVR-I region was amplified using standard procedures. Two independent 401 and 328 bp DNA fragments were sequenced separately for each sample. The sequence analyses included mismatch distribution and mean pairwise differences, median network analyses, AMOVA and principal component analyses. The genetic diversity of DNA sequences was measured and compared with other South Amerindian populations. 
The genetic diversity of 401 nucleotide mtDNA sequences, in the hypervariable Control Region, from positions 16 000-16 400, was characterized in a sample of 46 Amerindians living in the Piedmont area in the Beni Department of Bolivia. The results obtained indicate that the genetic diversity in the area is higher than that observed in other American groups living in much larger areas and despite the reduced size of the studied area the human groups analysed show high levels of inter-group variability. In addition, results show that Amerindian populations living in the Piedmont are genetically more related to those in the Andean than in the Amazonian populations.
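The "mean pairwise differences" statistic listed among the sequence analyses in this record can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and counts simple mismatches; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
from itertools import combinations


def hamming(a, b):
    """Count positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_differences(seqs):
    """Average the mismatch count over every unordered pair of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


# Toy data for illustration only (not real HVR-I sequences).
toy = ["ACGTAC", "ACGTAT", "ACCTAT"]
print(mean_pairwise_differences(toy))
```

Population-genetics packages typically also handle gaps, ambiguous bases and substitution models; this sketch deliberately ignores all three.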
Exploring Pandora's Box: Potential and Pitfalls of Low Coverage Genome Surveys for Evolutionary Biology

PubMed Central

Leese, Florian; Mayer, Christoph; Agrawal, Shobhit; Dambach, Johannes; Dietz, Lars; Doemel, Jana S.; Goodall-Copstake, William P.; Held, Christoph; Jackson, Jennifer A.; Lampert, Kathrin P.; Linse, Katrin; Macher, Jan N.; Nolzen, Jennifer; Raupach, Michael J.; Rivera, Nicole T.; Schubart, Christoph D.; Striewski, Sebastian; Tollrian, Ralph; Sands, Chester J.

2012-01-01

High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02–25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. 
In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers. PMID:23185309
Insights into rubber biosynthesis from transcriptome analysis of Hevea brasiliensis latex.

PubMed

Chow, Keng-See; Wan, Kiew-Lian; Isa, Mohd Noor Mat; Bahari, Azlina; Tan, Siang-Hee; Harikrishna, K; Yeang, Hoong-Yeet

2007-01-01

Hevea brasiliensis is the most widely cultivated species for commercial production of natural rubber (cis-polyisoprene). In this study, 10,040 expressed sequence tags (ESTs) were generated from the latex of the rubber tree, which represents the cytoplasmic content of a single cell type, in order to analyse the latex transcription profile with emphasis on rubber biosynthesis-related genes. A total of 3,441 unique transcripts (UTs) were obtained after quality editing and assembly of EST sequences. Functional classification of UTs according to the Gene Ontology convention showed that 73.8% were related to genes of unknown function. Among highly expressed ESTs, a significant proportion encoded proteins related to rubber biosynthesis and stress or defence responses. Sequences encoding rubber particle membrane proteins (RPMPs) belonging to three protein families accounted for 12% of the ESTs. Characterization of these ESTs revealed nine RPMP variants (7.9-27 kDa) including the 14 kDa REF (rubber elongation factor) and 22 kDa SRPP (small rubber particle protein). The expression of multiple RPMP isoforms in latex was shown using antibodies against REF and SRPP. Both EST and quantitative reverse transcription-PCR (QRT-PCR) analyses demonstrated REF and SRPP to be the most abundant transcripts in latex. Besides rubber biosynthesis, comparative sequence analysis showed that the RPMPs are highly similar to sequences in the plant kingdom having stress-related functions. Implications of the RPMP function in cis-polyisoprene biosynthesis in the context of transcript abundance and differential gene expression are discussed.
The discovery of Halictivirus resolves the Sinaivirus phylogeny.

PubMed

Bigot, Diane; Dalmon, Anne; Roy, Bronwen; Hou, Chunsheng; Germain, Michèle; Romary, Manon; Deng, Shuai; Diao, Qingyun; Weinert, Lucy A; Cook, James M; Herniou, Elisabeth A; Gayral, Philippe

2017-11-01

By providing pollination services, bees are among the most important insects, both in ecological and economical terms. Combined next-generation and classical sequencing approaches were applied to discover and study new insect viruses potentially harmful to bees. A bioinformatics virus discovery pipeline was used on individual Illumina transcriptomes of 13 wild bees from three species from the genus Halictus and 30 ants from six species of the genera Messor and Aphaenogaster. This allowed the discovery and description of three sequences of a new virus termed Halictus scabiosae Adlikon virus (HsAV). Phylogenetic analyses of ORF1, RNA-dependent RNA-polymerase (RdRp) and capsid genes showed that HsAV is closely related to (+)ssRNA viruses of the unassigned Sinaivirus genus but distant enough to belong to a different new genus we called Halictivirus. In addition, our study of ant transcriptomes revealed the first four sinaivirus sequences from ants (Messor barbarus, M. capitatus and M. concolor). Maximum likelihood phylogenetic analyses were performed on a 594 nt fragment of the ORF1/RdRp region from 84 sinaivirus sequences, including 31 new Lake Sinai viruses (LSVs) from honey bees collected in five countries across the globe and the four ant viral sequences. The phylogeny revealed four main clades potentially representing different viral species infecting honey bees. Moreover, the ant viruses belonged to the LSV4 clade, suggesting a possible cross-species transmission between bees and ants. Lastly, wide honey bee screening showed that all four LSV clades have worldwide distributions with no obvious geographical segregation.
Polymorphism and selection in the major histocompatibility complex DRA and DQA genes in the family Equidae.

PubMed

Janova, Eva; Matiasovic, Jan; Vahala, Jiri; Vodicka, Roman; Van Dyk, Enette; Horin, Petr

2009-07-01

The major histocompatibility complex genes coding for antigen binding and presenting molecules are the most polymorphic genes in the vertebrate genome. We studied the DRA and DQA gene polymorphism of the family Equidae. In addition to 11 previously reported DRA and 24 DQA alleles, six new DRA sequences and 13 new DQA alleles were identified in the genus Equus. Phylogenetic analysis of both DRA and DQA sequences provided evidence for trans-species polymorphism in the family Equidae. The phylogenetic trees differed from species relationships defined by standard taxonomy of Equidae and from trees based on mitochondrial or neutral gene sequence data. Analysis of selection showed differences between the less variable DRA and more variable DQA genes. DRA alleles were more often shared by more species. The DQA sequences analysed showed strong amongst-species positive selection; the selected amino acid positions mostly corresponded to selected positions in rodent and human DQA genes.
Multiplex PCR-Based Next-Generation Sequencing and Global Diversity of Seoul Virus in Humans and Rats.

PubMed

Kim, Won-Keun; No, Jin Sun; Lee, Seung-Ho; Song, Dong Hyun; Lee, Daesang; Kim, Jeong-Ah; Gu, Se Hun; Park, Sunhye; Jeong, Seong Tae; Kim, Heung-Chul; Klein, Terry A; Wiley, Michael R; Palacios, Gustavo; Song, Jin-Won

2018-02-01

Seoul virus (SEOV) poses a worldwide public health threat. This virus, which is harbored by Rattus norvegicus and R. rattus rats, is the causative agent of hemorrhagic fever with renal syndrome (HFRS) in humans, which has been reported in Asia, Europe, the Americas, and Africa. Defining SEOV genome sequences plays a critical role in development of preventive and therapeutic strategies against the unique worldwide hantavirus. We applied multiplex PCR-based next-generation sequencing to obtain SEOV genome sequences from clinical and reservoir host specimens. Epidemiologic surveillance of R. norvegicus rats in South Korea during 2000-2016 demonstrated that the serologic prevalence of enzootic SEOV infections was not significant on the basis of sex, weight (age), and season. Viral loads of SEOV in rats showed wide dissemination in tissues and dynamic circulation among populations. Phylogenetic analyses showed the global diversity of SEOV and possible genomic configuration of genetic exchanges.
Characterization of species-specific repeated DNA sequences from B. nigra.

PubMed

Gupta, V; Lakshmisita, G; Shaila, M S; Jagannathan, V; Lakshmikumaran, M S

1992-07-01

The construction and characterization of two genome-specific recombinant DNA clones from B. nigra are described. Southern analysis showed that the two clones belong to a dispersed repeat family. They differ from each other in their length, distribution and sequence, though the average GC content is nearly the same (45%). These B genome-specific repeats have been used to analyse the phylogenetic relationships between cultivated and wild species of the family Brassicaceae.
A combined computational-experimental analyses of selected metabolic enzymes in Pseudomonas species.

PubMed

Perumal, Deepak; Lim, Chu Sing; Chow, Vincent T K; Sakharkar, Kishore R; Sakharkar, Meena K

2008-09-10

Comparative genomic analysis has revolutionized our ability to predict the metabolic subsystems that occur in newly sequenced genomes, and to explore the functional roles of the set of genes within each subsystem. These computational predictions can considerably reduce the volume of experimental studies required to assess basic metabolic properties of multiple bacterial species. However, experimental validations are still required to resolve the apparent inconsistencies in the predictions by multiple resources. Here, we present combined computational-experimental analyses on eight completely sequenced Pseudomonas species. Comparative pathway analyses reveal that several pathways within the Pseudomonas species show high plasticity and versatility. Potential bypasses in 11 metabolic pathways were identified. We further confirmed the presence of the enzyme O-acetyl homoserine (thiol) lyase (EC: 2.5.1.49) in P. syringae pv. tomato that revealed inconsistent annotations in KEGG and in the recently published SYSTOMONAS database. These analyses connect and integrate systematic data generation, computational data interpretation, and experimental validation and represent a synergistic and powerful means for conducting biological research.
XPAT: a toolkit to conduct cross-platform association studies with heterogeneous sequencing datasets.

PubMed

Yu, Yao; Hu, Hao; Bohlender, Ryan J; Hu, Fulan; Chen, Jiun-Sheng; Holt, Carson; Fowler, Jerry; Guthery, Stephen L; Scheet, Paul; Hildebrandt, Michelle A T; Yandell, Mark; Huff, Chad D

2018-04-06

High-throughput sequencing data are increasingly being made available to the research community for secondary analyses, providing new opportunities for large-scale association studies. However, heterogeneity in target capture and sequencing technologies often introduces strong technological stratification biases that overwhelm subtle signals of association in studies of complex traits. Here, we introduce the Cross-Platform Association Toolkit, XPAT, which provides a suite of tools designed to support and conduct large-scale association studies with heterogeneous sequencing datasets. XPAT includes tools to support cross-platform aware variant calling, quality control filtering, gene-based association testing and rare variant effect size estimation. To evaluate the performance of XPAT, we conducted case-control association studies for three diseases, including 783 breast cancer cases, 272 ovarian cancer cases, 205 Crohn disease cases and 3507 shared controls (including 1722 females) using sequencing data from multiple sources. XPAT greatly reduced Type I error inflation in the case-control analyses, while replicating many previously identified disease-gene associations. We also show that association tests conducted with XPAT using cross-platform data have comparable performance to tests using matched platform data. XPAT enables new association studies that combine existing sequencing datasets to identify genetic loci associated with common diseases and other complex traits.
Characterization of the first complete genome sequence of an Impatiens necrotic spot orthotospovirus isolate from the United States and worldwide phylogenetic analyses of INSV isolates.

PubMed

Zhao, Kaixi; Margaria, Paolo; Rosa, Cristina

2018-05-10

Impatiens necrotic spot orthotospovirus (INSV) can impact economically important ornamental plants and vegetables worldwide. Characterization studies on INSV are limited. For most INSV isolates, there are no complete genome sequences available. This lack of genomic information has a negative impact on the understanding of the INSV genetic diversity and evolution. Here we report the first complete nucleotide sequence of a US INSV isolate. INSV-UP01 was isolated from an impatiens in Pennsylvania, US. RT-PCR was used to clone its full-length genome and Vector NTI to assemble overlapping sequences. Phylogenetic trees were constructed by using MEGA7 software to show the phylogenetic relationships with other available INSV sequences worldwide. This US isolate has genome and biological features typical of the INSV species and clusters in the Western Hemisphere clade, but its origin appears to be recent. Furthermore, INSV-UP01 might have been involved in a recombination event with an Italian isolate belonging to the Asian clade. Our analyses support that INSV isolates infect a broad plant-host range, that they group by geographic origin and not by host, and that they are subject to frequent recombination events. These results justify the need to generate and analyze complete genome sequences of orthotospoviruses in general and INSV in particular.
Comparative sequence analyses of sixteen reptilian paramyxoviruses

USGS Publications Warehouse

Ahne, W.; Batts, W.N.; Kurath, G.; Winton, J.R.

1999-01-01

Viral genomic RNA of Fer-de-Lance virus (FDLV), a paramyxovirus highly pathogenic for reptiles, was reverse transcribed and cloned. Plasmids with significant sequence similarities to the hemagglutinin-neuraminidase (HN) and polymerase (L) genes of mammalian paramyxoviruses were identified by BLAST search. Partial sequences of the FDLV genes were used to design primers for amplification by nested polymerase chain reaction (PCR) and sequencing of 518-bp L gene and 352-bp HN gene fragments from a collection of 15 previously uncharacterized reptilian paramyxoviruses. Phylogenetic analyses of the partial L and HN sequences produced similar trees in which there were two distinct subgroups of isolates that were supported with maximum bootstrap values, and several intermediate isolates. Within each subgroup the nucleotide divergence values were less than 2.5%, while the divergence between the two subgroups was 20-22%. This indicated that the two subgroups represent distinct virus species containing multiple virus strains. The five intermediate isolates had nucleotide divergence values of 11-20% and may represent additional distinct species. In addition to establishing diversity among reptilian paramyxoviruses, the phylogenetic groupings showed some correlation with geographic location, and clearly demonstrated a low level of host species-specificity within these viruses. Copyright (C) 1999 Elsevier Science B.V.
Extending the spectrum of DNA sequences retrieved from ancient bones and teeth

PubMed Central

Glocke, Isabelle; Meyer, Matthias

2017-01-01

The number of DNA fragments surviving in ancient bones and teeth is known to decrease with fragment length. Recent genetic analyses of Middle Pleistocene remains have shown that the recovery of extremely short fragments can prove critical for successful retrieval of sequence information from particularly degraded ancient biological material. Current sample preparation techniques, however, are not optimized to recover DNA sequences from fragments shorter than ∼35 base pairs (bp). Here, we show that much shorter DNA fragments are present in ancient skeletal remains but lost during DNA extraction. We present a refined silica-based DNA extraction method that not only enables efficient recovery of molecules as short as 25 bp but also doubles the yield of sequences from longer fragments due to improved recovery of molecules with single-strand breaks. Furthermore, we present strategies for monitoring inefficiencies in library preparation that may result from co-extraction of inhibitory substances during DNA extraction. The combination of DNA extraction and library preparation techniques described here substantially increases the yield of DNA sequences from ancient remains and provides access to a yet unexploited source of highly degraded DNA fragments. Our work may thus open the door for genetic analyses on even older material. PMID:28408382
Identification of Y-Chromosome Sequences in Turner Syndrome.

PubMed

Silva-Grecco, Roseane Lopes da; Trovó-Marqui, Alessandra Bernadete; Sousa, Tiago Alves de; Croce, Lilian Da; Balarin, Marly Aparecida Spadotto

2016-05-01

To investigate the presence of Y-chromosome sequences and determine their frequency in patients with Turner syndrome. The study included 23 patients with Turner syndrome from Brazil, who gave written informed consent for participating in the study. Cytogenetic analyses were performed in peripheral blood lymphocytes, with 100 metaphases per patient. Genomic DNA was also extracted from peripheral blood lymphocytes, and gene sequences DYZ1, DYZ3, ZFY and SRY were amplified by Polymerase Chain Reaction. The cytogenetic analysis showed a 45,X karyotype in 9 patients (39.2 %) and a mosaic pattern in 14 (60.8 %). In 8.7 % (2 out of 23) of the patients, Y-chromosome sequences were found. This prevalence is very similar to those reported previously. The initial karyotype analysis of these patients did not reveal Y-chromosome material, but they were found positive for Y-specific sequences in the lymphocyte DNA analysis. The PCR technique showed that 2 (8.7 %) of the patients with Turner syndrome had Y-chromosome sequences, both presenting marker chromosomes on cytogenetic analysis.
Spiroplasma species share common DNA sequences among their viruses, plasmids and genomes.

PubMed

Ranhand, J M; Nur, I; Rose, D L; Tully, J G

1987-01-01

Alkaline-Southern-blot analyses showed that a spiroplasma plasmid, pRA1, obtained from Spiroplasma citri (Maroc-R8A2), contained DNA sequences that were homologous to spiroplasma type 3 viruses (SV3) obtained from S. citri (Maroc-R8A2), S. citri (608) and S. mirum (SMCA). In addition, pRA1 and SV3(608) DNA shared common, but not necessarily related, sequences with extrachromosomal DNA derived from 11 Spiroplasma species or strains. Furthermore, SV3(608) had DNA homology with the chromosome from 6 distinct spiroplasmas but not with chromosomal DNA from eight other Spiroplasma species or strains. The biological function of these common sequences is unknown.
Spatial and temporal plasticity of chromatin during programmed DNA-reorganization in Stylonychia macronuclear development

PubMed Central

Postberg, Jan; Heyse, Katharina; Cremer, Marion; Cremer, Thomas; Lipps, Hans J

2008-01-01

Background: In this study we exploit the unique genome organization of ciliates to characterize the biological function of histone modification patterns and chromatin plasticity for the processing of specific DNA sequences during a nuclear differentiation process. Ciliates are single-cell eukaryotes containing two morphologically and functionally specialized types of nuclei, the somatic macronucleus and the germline micronucleus. In the course of sexual reproduction a new macronucleus develops from a micronuclear derivative. During this process specific DNA sequences are eliminated from the genome, while sequences that will be transcribed in the mature macronucleus are retained. Results: We show by immunofluorescence microscopy, Western analyses and chromatin immunoprecipitation (ChIP) experiments that each nuclear type establishes its specific histone modification signature. Our analyses reveal that the early macronuclear anlage adopts a permissive chromatin state immediately after the fusion of two heterochromatic germline micronuclei. As macronuclear development progresses, repressive histone modifications that specify sequences to be eliminated are introduced de novo. ChIP analyses demonstrate that permissive histone modifications are associated with sequences that will be retained in the new macronucleus. Furthermore, our data support the hypothesis that a PIWI-family protein is involved in a transnuclear cross-talk and in the RNAi-dependent control of developmental chromatin reorganization. Conclusion: Based on these data we present a comprehensive analysis of the spatial and temporal pattern of histone modifications during this nuclear differentiation process. Results obtained in this study may also be relevant for our understanding of chromatin plasticity during metazoan embryogenesis. PMID:19014664
Sequence and structural analyses of nuclear export signals in the NESdb database

PubMed Central

Xu, Darui; Farmer, Alicia; Collett, Garen; Grishin, Nick V.; Chook, Yuh Min

2012-01-01

We compiled >200 nuclear export signal (NES)–containing CRM1 cargoes in a database named NESdb. We analyzed the sequences and three-dimensional structures of natural, experimentally identified NESs and of false-positive NESs that were generated from the database in order to identify properties that might distinguish the two groups of sequences. Analyses of amino acid frequencies, sequence logos, and agreement with existing NES consensus sequences revealed strong preferences for the Φ1-X3-Φ2-X2-Φ3-X-Φ4 pattern and for negatively charged amino acids in the nonhydrophobic positions of experimentally identified NESs but not of false positives. Strong preferences against certain hydrophobic amino acids in the hydrophobic positions were also revealed. These findings led to a new and more precise NES consensus. More important, three-dimensional structures are now available for 68 NESs within 56 different cargo proteins. Analyses of these structures showed that experimentally identified NESs are more likely than the false positives to adopt α-helical conformations that transition to loops at their C-termini and more likely to be surface accessible within their protein domains or be present in disordered or unobserved parts of the structures. Such distinguishing features for real NESs might be useful in future NES prediction efforts. Finally, we also tested CRM1-binding of 40 NESs that were found in the 56 structures. We found that 16 of the NES peptides did not bind CRM1, hence illustrating how NESs are easily misidentified. PMID:22833565
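The Φ1-X3-Φ2-X2-Φ3-X-Φ4 spacing named in this record can be expressed as a regular expression, sketched below. The sketch assumes that Φ stands for one of L, I, V, F or M and that X is any residue; the abstract does not spell out the hydrophobic set, so that choice, the function name and the toy peptide are all illustrative assumptions.

```python
import re

# Assumed hydrophobic residue set for the Φ positions (not stated in the abstract).
HYDROPHOBIC = "LIVFM"

# Φ1-X3-Φ2-X2-Φ3-X-Φ4, wrapped in a lookahead so overlapping hits are all reported.
NES_PATTERN = re.compile(
    rf"(?=([{HYDROPHOBIC}].{{3}}[{HYDROPHOBIC}].{{2}}[{HYDROPHOBIC}].[{HYDROPHOBIC}]))"
)


def find_nes_candidates(protein):
    """Return (start_index, matched_segment) for every pattern hit in the sequence."""
    return [(m.start(), m.group(1)) for m in NES_PATTERN.finditer(protein.upper())]


# Toy peptide for illustration only.
print(find_nes_candidates("MSLALKLAGLDINK"))
```

A pattern match of this kind only flags candidates. As the record itself reports, many sequences that fit the consensus do not bind CRM1, so structural context and binding assays are still needed.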
A novel high-resolution multilocus sequence typing of Giardia intestinalis Assemblage A isolates reveals zoonotic transmission, clonal outbreaks and recombination.

PubMed

Ankarklev, Johan; Lebbad, Marianne; Einarsson, Elin; Franzén, Oscar; Ahola, Harri; Troell, Karin; Svärd, Staffan G

2018-06-01

Molecular epidemiology and genotyping studies of the parasitic protozoan Giardia intestinalis have proven difficult due to multiple factors, such as low discriminatory power in the commonly used genotyping loci, which has hampered molecular analyses of outbreak sources, zoonotic transmission and virulence types. Here we have focused on assemblage A Giardia and developed a high-resolution assemblage-specific multilocus sequence typing (MLST) method. Analyses of sequenced G. intestinalis assemblage A genomes from different sub-assemblages identified a set of six genetic loci with high genetic variability. DNA samples from both humans (n = 44) and animals (n = 18) that harbored Giardia assemblage A infections were PCR amplified (557-700 bp products) and sequenced at the six novel genetic loci. Bioinformatic analyses showed five- to ten-fold higher levels of polymorphic sites than what was previously found among assemblage A samples using the classic genotyping loci. Phylogenetically, a division of two major clusters in assemblage A became apparent, separating samples of human and animal origin. A subset of human samples (n = 9) from a documented Giardia outbreak in a Swedish day-care center showed full complementarity at nine genetic loci (the six new and the standard BG, TPI and GDH loci), strongly suggesting one source of infection. Furthermore, three samples of human origin displayed MLST profiles that were phylogenetically more closely related to MLST profiles from animal-derived samples, suggesting zoonotic transmission. These new genotyping loci enabled us to detect events of recombination between different assemblage A isolates but also between assemblage A and E isolates. In summary, we present a novel and expanded MLST strategy with significantly improved sensitivity for molecular analyses of virulence types, zoonotic potential and source tracking for assemblage A Giardia. Copyright © 2018. Published by Elsevier B.V.
Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

PubMed Central

2012-01-01

Background: Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements were explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results: Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species; however, higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions: The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678
Analysing the performance of personal computers based on Intel microprocessors for sequence aligning bioinformatics applications.

PubMed

Nair, Pradeep S; John, Eugene B

2007-01-01

Aligning specific sequences against a very large number of other sequences is a central aspect of bioinformatics. With the widespread availability of personal computers in biology laboratories, sequence alignment is now often performed locally. This makes it necessary to analyse the performance of personal computers for sequence aligning bioinformatics benchmarks. In this paper, we analyse the performance of a personal computer for the popular BLAST and FASTA sequence alignment suites. Results indicate that these benchmarks have a large number of recurring operations and use memory operations extensively. It seems that the performance can be improved with a bigger L1-cache.

Divergent nuclear 18S rDNA paralogs in a turkey coccidium, Eimeria meleagrimitis, complicate molecular systematics and identification.

PubMed

El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R

2013-07-01

Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
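The identity and divergence figures in this record are complementary: a mean identity of 97.4% corresponds to a divergence of about 2.6%. A minimal sketch of how percent identity between two aligned sequences might be computed follows. It assumes pre-aligned, equal-length inputs and skips any column containing a gap; that gap handling, the function names and the toy sequences are illustrative assumptions, not the authors' procedure.

```python
def percent_identity(a, b, gap="-"):
    """Percent of compared alignment columns in which both sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # assumption: columns with a gap are not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def percent_divergence(a, b, gap="-"):
    """Divergence expressed as the complement of percent identity."""
    return 100.0 - percent_identity(a, b, gap)


# Toy aligned sequences for illustration only.
print(percent_identity("ACGTACGT", "ACGTACGA"))
print(percent_divergence("ACGTACGT", "ACGTACGA"))
```

Other conventions, such as counting gaps as mismatches or applying a substitution model, would give different values for the same alignment.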
New species of Bordetella, Bordetella ansorpii sp. nov., isolated from the purulent exudate of an epidermal cyst.

PubMed

Ko, Kwan Soo; Peck, Kyong Ran; Oh, Won Sup; Lee, Nam Yong; Lee, Jang Ho; Song, Jae-Hoon

2005-05-01

A gram-negative bacillus, SMC-8986(T), which was isolated from the purulent exudate of an epidermal cyst but could not be identified by a conventional microbiologic method, was characterized by a variety of phenotypic and genotypic analyses. Sequences of the 16S rRNA gene revealed that this bacterium belongs to the genus Bordetella but diverged distinctly from previously described Bordetella species. Analyses of cellular fatty acid composition and performance of biochemical tests confirmed that this bacterium is distinct from other Bordetella species. Furthermore, the results of comparative sequence analyses of two protein-coding genes (risA and ompA) also showed that this strain represents a new species within the genus Bordetella. Based on the evaluated phenotypic and genotypic characteristics, it is proposed that SMC-8986(T) should be classified as a new species, namely Bordetella ansorpii sp. nov.
A broad-range survey of ticks from livestock in Northern Xinjiang: changes in tick distribution and the isolation of Borrelia burgdorferi sensu stricto.

PubMed

Wang, Yuan-Zhi; Mu, Lu-Meng; Zhang, Ke; Yang, Mei-Hua; Zhang, Lin; Du, Jing-Yun; Liu, Zhi-Qiang; Li, Yong-Xiang; Lu, Wei-Hua; Chen, Chuang-Fu; Wang, Yan; Chen, Rong-Gui; Xu, Jun; Yuan, Li; Zhang, Wan-Jiang; Zuo, Wei-Ze; Shao, Ren-Fu

2015-09-04

Borreliosis is highly prevalent in Xinjiang Uygur Autonomous Region, China. However, little is known about the presence of Borrelia pathogens in tick species in this region, and Borrelia pathogens have not been isolated from domestic animals. We collected adult ticks from domestic animals at 19 sampling sites in 14 counties in northern Xinjiang from 2012 to 2014. Ticks were identified to species by morphology and were molecularly analysed by sequences of mitochondrial 16S rDNA gene; 4-8 ticks of each species at every sampling site were sequenced. 112 live adult ticks were selected for each species in every county, and were used to culture Borrelia pathogens; the genotypes were then determined by sequences of the 5S-23S rRNA intergenic spacer and the outer surface protein A (ospA) gene. A total of 5257 adult ticks, belonging to four genera and seven species, were collected. Compared with three decades ago, the abundance of the five common tick species during the peak ixodid tick season has changed. Certain tick species, such as Rhipicephalus turanicus (Rh. turanicus), were found at Jimusaer, Yining, Fukang, and Chabuchaer Counties for the first time. Additionally, the sequence analyses showed that the Hyalomma asiaticum (Hy. asiaticum), Haemaphysalis punctata (Ha. punctata), and Dermacentor marginatus (D. marginatus) that were collected from different sampling sites (≥3 sites) shared identical 16S rDNA sequences respectively. For the tick species that were collected from the same county, such as Hy. asiaticum from Shihezi County and Rh. turanicus from Yining County, their 16S rDNA sequences showed genetic diversity. In addition, sixteen Borrelia isolates were found in Hy. asiaticum, Ha. punctata, D. marginatus and Rh. turanicus, which infested cattle, sheep, horse and camel in Yining, Chabuchaer, Shihezi and Shawan Counties. All of the isolates were genetically identified as B. burgdorferi sensu stricto.
Warmer and wetter climate may have contributed to the altered distribution and abundance of the five most common ticks in northern Xinjiang. The genetic analyses showed that certain tick species, such as Hy. asiaticum or Rh. turanicus, exhibit genetic commonness or diversity. Additionally, this study is the first to isolate B. burgdorferi sensu stricto in Hy. asiaticum asiaticum, H. punctata, D. nuttalli and D. marginatus ticks from domestic animals. These ticks may transmit borreliosis among livestock.
The diploid genome sequence of an Asian individual

PubMed Central

Wang, Jun; Wang, Wei; Li, Ruiqiang; Li, Yingrui; Tian, Geng; Goodman, Laurie; Fan, Wei; Zhang, Junqing; Li, Jun; Zhang, Juanbin; Guo, Yiran; Feng, Binxiao; Li, Heng; Lu, Yao; Fang, Xiaodong; Liang, Huiqing; Du, Zhenglin; Li, Dong; Zhao, Yiqing; Hu, Yujie; Yang, Zhenzhen; Zheng, Hancheng; Hellmann, Ines; Inouye, Michael; Pool, John; Yi, Xin; Zhao, Jing; Duan, Jinjie; Zhou, Yan; Qin, Junjie; Ma, Lijia; Li, Guoqing; Yang, Zhentao; Zhang, Guojie; Yang, Bin; Yu, Chang; Liang, Fang; Li, Wenjie; Li, Shaochuan; Li, Dawei; Ni, Peixiang; Ruan, Jue; Li, Qibin; Zhu, Hongmei; Liu, Dongyuan; Lu, Zhike; Li, Ning; Guo, Guangwu; Zhang, Jianguo; Ye, Jia; Fang, Lin; Hao, Qin; Chen, Quan; Liang, Yu; Su, Yeyang; san, A.; Ping, Cuo; Yang, Shuang; Chen, Fang; Li, Li; Zhou, Ke; Zheng, Hongkun; Ren, Yuanyuan; Yang, Ling; Gao, Yang; Yang, Guohua; Li, Zhuo; Feng, Xiaoli; Kristiansen, Karsten; Wong, Gane Ka-Shu; Nielsen, Rasmus; Durbin, Richard; Bolund, Lars; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian

2009-01-01

Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics. PMID:18987735
ReQON: a Bioconductor package for recalibrating quality scores from next-generation sequencing data

PubMed Central

2012-01-01

Background: Next-generation sequencing technologies have become important tools for genome-wide studies. However, the quality scores that are assigned to each base have been shown to be inaccurate. If the quality scores are used in downstream analyses, these inaccuracies can have a significant impact on the results. Results: Here we present ReQON, a tool that recalibrates the base quality scores from an input BAM file of aligned sequencing data using logistic regression. ReQON also generates diagnostic plots showing the effectiveness of the recalibration. We show that ReQON produces quality scores that are both more accurate, in the sense that they more closely correspond to the probability of a sequencing error, and do a better job of discriminating between sequencing errors and non-errors than the original quality scores. We also compare ReQON to other available recalibration tools and show that ReQON is less biased and performs favorably in terms of quality score accuracy. Conclusion: ReQON is an open source software package, written in R and available through Bioconductor, for recalibrating base quality scores for next-generation sequencing data. ReQON produces a new BAM file with more accurate quality scores, which can improve the results of downstream analysis, and produces several diagnostic plots showing the effectiveness of the recalibration. PMID:22946927
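The abstract above calls a quality score accurate when it closely corresponds to the probability of a sequencing error. Base qualities in BAM files follow the standard Phred convention, Q = -10 log10(p), which makes that correspondence easy to check. The sketch below is illustrative only: it is written in Python rather than R, it is not ReQON's logistic-regression method, and the helper names and counts are hypothetical.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred score: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score implied by an error probability: Q = -10 * log10(p)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


def empirical_quality(n_errors, n_bases):
    """Observed Phred score for a bin of bases (hypothetical helper).

    Comparing this with the reported score of the bin shows whether the
    reported score over- or under-states the true error rate.
    """
    if n_errors <= 0 or n_bases <= 0:
        raise ValueError("need positive counts")
    return error_prob_to_phred(n_errors / n_bases)


if __name__ == "__main__":
    # Made-up counts: bases reported as Q30 with 40 errors in 10,000 bases.
    reported = 30
    observed = empirical_quality(40, 10_000)
    print(f"reported Q{reported}, observed Q{observed:.1f}")
```

A recalibration tool aims to shrink the gap between the reported and observed values across all bins.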
Detection of Human Papillomavirus Type 2 Related Sequence in Oral Papilloma

PubMed Central

Yamaguchi, Taihei; Shindoh, Masanobu; Amemiya, Akira; Inoue, Nobuo; Kawamura, Masaaki; Sakaoka, Hiroshi; Inoue, Masakazu; Fujinaga, Kei

1998-01-01

Oral papilloma is a benign tumourous lesion. Part of this lesion is associated with human papillomavirus (HPV) infection. We analysed the genetical and histopathological evidence for HPV type 2 infection in three oral papillomas. Southern blot hybridization showed HPV 2a sequence in one lesion. Cells of the positive specimen appeared to contain high copy numbers of the viral DNA in an episomal state. In situ staining demonstrated virus capsid antigen in koilocytotic cells and surrounding cells in the hyperplastic epithelial layer. Two other specimens contained no HPV sequences by labeled probe of full length linear HPVs 2a, 6b, 11, 16, 18, 31 and 33 DNA under low stringency hybridization conditions. These results showed the possibility that HPV 2 plays a role in oral papilloma. PMID:9699941
Isolation and characterization of major histocompatibility complex class II B genes in cranes.

PubMed

Kohyama, Tetsuo I; Akiyama, Takuya; Nishida, Chizuko; Takami, Kazutoshi; Onuma, Manabu; Momose, Kunikazu; Masuda, Ryuichi

2015-11-01

In this study, we isolated and characterized the major histocompatibility complex (MHC) class II B genes in cranes. Genomic sequences spanning exons 1 to 4 were amplified and determined in 13 crane species and three other species closely related to cranes. In all, 55 unique sequences were identified, and at least two polymorphic MHC class II B loci were found in most species. An analysis of sequence polymorphisms showed the signature of positive selection and recombination. A phylogenetic reconstruction based on exon 2 sequences indicated that trans-species polymorphism has persisted for at least 10 million years, whereas phylogenetic analyses of the sequences flanking exon 2 revealed a pattern of concerted evolution. These results suggest that both balancing selection and recombination play important roles in the crane MHC evolution.
Pepo aphid-borne yellows virus: a new species in the genus Polerovirus.

PubMed

Ibaba, Jacques D; Laing, Mark D; Gubba, Augustine

2017-02-01

Pepo aphid-borne yellows virus (PABYV) has been proposed as a putative representative of a new species in the genus Polerovirus in the family Luteoviridae. The genomes of two South African (SA) isolates of cucurbit-infecting PABYV were described in this record. Total RNA, extracted from pattypan (Cucurbita pepo L.) and baby marrow (C. pepo L.) leaf samples, was subjected to next-generation sequencing (NGS) on the HiSeq Illumina platform. Sanger sequencing was subsequently used to authenticate the integrity of PABYV's genome generated from de novo assembly of the NGS data. The PABYV genome of the SA isolates consists of 5813 nucleotides and displays an organisation typical of poleroviruses. Genome sequence comparisons of the SA PABYV isolates to other poleroviruses support the classification of PABYV as a new species in the genus Polerovirus. Recombination analyses showed that PABYV and Cucurbit aphid-borne yellows virus (CABYV) shared the same ancestor for the genome part situated between breaking points. Phylogenetic analyses of the RNA-dependent RNA polymerase and the coat protein genes showed that SA PABYV isolates shared a distant relationship with CABYV and Suakwa aphid-borne yellows virus. Based on our results, we propose that PABYV is a distinct species in the genus Polerovirus.
Molecular Barcoding of Aquatic Oligochaetes: Implications for Biomonitoring

PubMed Central

Vivien, Régis; Wyler, Sofia; Lafont, Michel; Pawlowski, Jan

2015-01-01

Aquatic oligochaetes are well recognized bioindicators of quality of sediments and water in watercourses and lakes. However, the difficult taxonomic determination based on morphological features compromises their more common use in eco-diagnostic analyses. To overcome this limitation, we investigated molecular barcodes as identification tool for broad range of taxa of aquatic oligochaetes. We report 185 COI and 52 ITS2 rDNA sequences for specimens collected in Switzerland and belonging to the families Naididae, Lumbriculidae, Enchytraeidae and Lumbricidae. Phylogenetic analyses allowed distinguishing 41 lineages separated by more than 10 % divergence in COI sequences. The lineage distinction was confirmed by Automatic Barcode Gap Discovery (ABGD) method and by ITS2 data. Our results showed that morphological identification underestimates the oligochaete diversity. Only 26 of the lineages could be assigned to morphospecies, of which seven were sequenced for the first time. Several cryptic species were detected within common morphospecies. Many juvenile specimens that could not be assigned morphologically have found their home after genetic analysis. Our study showed that COI barcodes performed very well as species identifiers in aquatic oligochaetes. Their easy amplification and good taxonomic resolution might help promoting aquatic oligochaetes as bioindicators for next generation environmental DNA biomonitoring of aquatic ecosystems. PMID:25856230
Systematics of Cladophora spp. (Chlorophyta) from North Carolina, USA, based upon morphology and DNA sequence data with a description of Cladophora subtilissima sp. nov.

PubMed

Taylor, Robin L; Bailey, Jeffrey Craig; Freshwater, David Wilson

2017-06-01

Identification of Cladophora species is challenging due to conservation of gross morphology, few discrete autapomorphies, and environmental influences on morphology. Twelve species of marine Cladophora were reported from North Carolina waters. Cladophora specimens were collected from inshore and offshore marine waters for DNA sequence and morphological analyses. The nuclear-encoded rRNA internal transcribed spacer regions (ITS) were sequenced for 105 specimens and used in molecular assisted identification. The ITS1 and ITS2 region was highly variable, and sequences were sorted into ITS Sets of Alignable Sequences (SASs). Sequencing of short hyper-variable ITS1 sections from Cladophora type specimens was used to positively identify species represented by SASs when the types were made available. Secondary structures for the ITS1 locus were also predicted for each specimen and compared to predicted structures from Cladophora sequences available in GenBank. Nine ITS SASs were identified and representative specimens chosen for phylogenetic analyses of 18S and 28S rRNA gene sequences to reveal relationships with other Cladophora species. Phylogenetic analyses indicated that marine Cladophorales were polyphyletic and separated into two clades, the Cladophora clade and the "Siphonocladales" clade. Morphological analyses were performed to assess the consistency of character states within species, and complement the DNA sequence analyses. These analyses revealed intra- and interspecific character state variation, and that combined molecular and morphological analyses were required for the identification of species. One new report, Cladophora dotyana, and one new species Cladophora subtilissima sp. nov., were revealed, and increased the biodiversity of North Carolina marine Cladophora to 14 species. © 2017 Phycological Society of America.
Molecular identification of the Cryptosporidium deer genotype in the Hokkaido sika deer (Cervus nippon yesoensis) in Hokkaido, Japan.

PubMed

Kato, Satomi; Yanagawa, Yojiro; Matsuyama, Ryota; Suzuki, Masatsugu; Sugimoto, Chihiro

2016-04-01

The protozoan Cryptosporidium occurs in a wide range of animal species including many Cervidae species. Fecal samples collected from the Hokkaido sika deer (Cervus nippon yesoensis), a native deer of Hokkaido, in the central, western, and eastern areas of Hokkaido were examined by polymerase chain reaction (PCR) to detect infections with Cryptosporidium and for sequence analyses to reveal the molecular characteristics of the amplified DNA. DNA was extracted from 319 fecal samples and examined with PCR using primers for small-subunit ribosomal RNA (SSU-rRNA), actin, and 70-kDa heat shock protein (HSP70) gene loci. PCR-amplified fragments were sequenced and phylogenetic trees were created. In 319 fecal samples, 25 samples (7.8 %) were positive with SSU-rRNA PCR that were identified as the Cryptosporidium deer genotype. Among Cryptosporidium-positive samples, fawns showed higher prevalence (16.1 %) than yearlings (6.4 %) and adults (4.7 %). The result of Fisher's exact test showed a statistical significance in the prevalence of the Cryptosporidium deer genotype between fawn and other age groups. Sequence analyses with actin and HSP70 gene fragments confirmed the SSU-rRNA result, and there were no sequence diversities observed. The Cryptosporidium deer genotype appears to be the prevalent Cryptosporidium species in the wild sika deer in Hokkaido, Japan.
Genetic and antigenic diversity of Theileria parva in cattle in Eastern and Southern zones of Tanzania. A study to support control of East Coast fever.

PubMed

Elisa, Mwega; Hasan, Salih Dia; Moses, Njahira; Elpidius, Rukambile; Skilton, Robert; Gwakisa, Paul

2015-04-01

This study investigated the genetic and antigenic diversity of Theileria parva in cattle from the Eastern and Southern zones of Tanzania. Thirty-nine (62%) positive samples were genotyped using 14 mini- and microsatellite markers with coverage of all four T. parva chromosomes. Wright's F index (F(ST) = 0.094) indicated a high level of panmixis. Linkage equilibrium was observed in the two zones studied, suggesting existence of a panmictic population. In addition, sequence analysis of CD8+ T-cell target antigen genes Tp1 revealed a single protein sequence in all samples analysed, which is also present in the T. parva Muguga strain, which is a component of the FAO1 vaccine. All Tp2 epitope sequences were identical to those in the T. parva Muguga strain, except for one variant of a Tp2 epitope, which is found in T. parva Kiambu 5 strain, also a component of the FAO1 vaccine. A neighbour-joining tree of the nucleotide sequences of Tp2 showed clustering according to geographical origin. Our results show low genetic and antigenic diversity of T. parva within the populations analysed. This has very important implications for the development of sustainable control measures for T. parva in Eastern and Southern zones of Tanzania, where East Coast fever is endemic.
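Wright's F(ST), cited in the abstract above, compares expected heterozygosity within subpopulations (H_S) with that of the pooled population (H_T): F(ST) = (H_T - H_S) / H_T. Values near 0 indicate little differentiation between subpopulations. The sketch below works through that formula for a single locus, assuming equally weighted subpopulations and known allele frequencies; it is not the estimator or software used in the study, and the frequencies are made up.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: H = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def fst(subpop_freqs):
    """Wright's F_ST for one locus from per-subpopulation allele frequencies.

    subpop_freqs: list of frequency lists, one per subpopulation, all with
    the same allele order. Subpopulations are weighted equally (assumption).
    """
    n_alleles = len(subpop_freqs[0])
    if any(len(f) != n_alleles for f in subpop_freqs):
        raise ValueError("all subpopulations need the same alleles")
    k = len(subpop_freqs)
    # Mean within-subpopulation heterozygosity.
    h_s = sum(expected_heterozygosity(f) for f in subpop_freqs) / k
    # Heterozygosity of the pooled (average) allele frequencies.
    pooled = [sum(f[i] for f in subpop_freqs) / k for i in range(n_alleles)]
    h_t = expected_heterozygosity(pooled)
    if h_t == 0:
        raise ValueError("locus is monomorphic; F_ST is undefined")
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies in two zones at one biallelic marker.
    print(round(fst([[0.6, 0.4], [0.4, 0.6]]), 3))
```

With several markers, per-locus values are typically combined, and the exact weighting depends on the estimator chosen.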
The neural correlates of implicit sequence learning in schizophrenia.

PubMed

Marvel, Cherie L; Turner, Beth M; O'Leary, Daniel S; Johnson, Hans J; Pierson, Ronald K; Ponto, Laura L Boles; Andreasen, Nancy C

2007-11-01

Twenty-seven schizophrenia spectrum patients and 25 healthy controls performed a probabilistic version of the serial reaction time task (SRT) that included sequence trials embedded within random trials. Patients showed diminished, yet measurable, sequence learning. Postexperimental analyses revealed that a group of patients performed above chance when generating short spans of the sequence. This high-generation group showed SRT learning that was similar in magnitude to that of controls. Their learning was evident from the very 1st block; however, unlike controls, learning did not develop further with continued testing. A subset of 12 patients and 11 controls performed the SRT in conjunction with positron emission tomography. High-generation performance, which corresponded to SRT learning in patients, correlated to activity in the premotor cortex and parahippocampus. These areas have been associated with stimulus-driven visuospatial processing. Taken together, these results suggest that a subset of patients who showed moderate success on the SRT used an explicit stimulus-driven strategy to process the sequential stimuli. This adaptive strategy facilitated sequence learning but may have interfered with conventional implicit learning of the overall stimulus pattern. PsycINFO Database Record (c) 2007 APA, all rights reserved.
The genetic diversity of hepatitis A genotype I in Bulgaria

PubMed Central

Cella, Eleonora; Golkocheva-Markova, Elitsa N.; Trandeva-Bankova, Diljana; Gregori, Giulia; Bruni, Roberto; Taffon, Stefania; Equestre, Michele; Costantino, Angela; Spoto, Silvia; Curtis, Melissa; Ciccaglione, Anna Rita; Ciccozzi, Massimo; Angeletti, Silvia

2018-01-01

The purpose of this study was to analyze sequences of hepatitis A virus (HAV) Ia and Ib genotypes from Bulgarian patients to investigate the molecular epidemiology of HAV genotype I during the years 2012 to 2014. Around 105 serum samples were collected by the Department of Virology of the National Center of Infectious and Parasitic Diseases in Bulgaria. The sequenced region encompassed the VP1/2A region of HAV genome. A total of 103 sequences were obtained from the samples. For the phylogenetic analyses, 5 datasets were built to investigate the viral gene in/out flow among distinct HAV subpopulations in different geographic areas and to build a Bayesian dated tree; Bayesian phylogenetic and migration pattern analyses were performed. HAV Ib Bulgarian sequences mostly grouped into a single clade. This indicates that the Bulgarian epidemic is partially compartmentalized. It originated from a limited number of viruses and then spread through fecal-oral local transmission. HAV Ia Bulgarian sequences were intermixed with European sequences, suggesting that an Ia epidemic is not restricted to Bulgaria but can affect other European countries. The time-scaled phylogeny reconstruction showed the root of the tree dating in 2008 for genotype Ib and in 1999 for genotype Ia with a second epidemic entrance in 2003. The Bayesian skyline plot for genotype Ib showed a slow but continuous growth, sustained by fecal-oral route transmission. For genotype Ia, there was an exponential growth followed by a plateau, which suggests better infection control. Bidirectional viral flow for Ib genotype, involving different Bulgarian areas, was observed, whereas a unidirectional flow from Sofia to Ihtiman for genotype Ia was highlighted, suggesting the fecal-oral transmission route for Ia. PMID:29504993
The genetic diversity of hepatitis A genotype I in Bulgaria.

PubMed

Cella, Eleonora; Golkocheva-Markova, Elitsa N; Trandeva-Bankova, Diljana; Gregori, Giulia; Bruni, Roberto; Taffon, Stefania; Equestre, Michele; Costantino, Angela; Spoto, Silvia; Curtis, Melissa; Ciccaglione, Anna Rita; Ciccozzi, Massimo; Angeletti, Silvia

2018-01-01

The purpose of this study was to analyze sequences of hepatitis A virus (HAV) Ia and Ib genotypes from Bulgarian patients to investigate the molecular epidemiology of HAV genotype I during the years 2012 to 2014. Around 105 serum samples were collected by the Department of Virology of the National Center of Infectious and Parasitic Diseases in Bulgaria. The sequenced region encompassed the VP1/2A region of HAV genome. A total of 103 sequences were obtained from the samples. For the phylogenetic analyses, 5 datasets were built to investigate the viral gene in/out flow among distinct HAV subpopulations in different geographic areas and to build a Bayesian dated tree; Bayesian phylogenetic and migration pattern analyses were performed. HAV Ib Bulgarian sequences mostly grouped into a single clade. This indicates that the Bulgarian epidemic is partially compartmentalized. It originated from a limited number of viruses and then spread through fecal-oral local transmission. HAV Ia Bulgarian sequences were intermixed with European sequences, suggesting that an Ia epidemic is not restricted to Bulgaria but can affect other European countries. The time-scaled phylogeny reconstruction showed the root of the tree dating in 2008 for genotype Ib and in 1999 for genotype Ia with a second epidemic entrance in 2003. The Bayesian skyline plot for genotype Ib showed a slow but continuous growth, sustained by fecal-oral route transmission. For genotype Ia, there was an exponential growth followed by a plateau, which suggests better infection control. Bidirectional viral flow for Ib genotype, involving different Bulgarian areas, was observed, whereas a unidirectional flow from Sofia to Ihtiman for genotype Ia was highlighted, suggesting the fecal-oral transmission route for Ia. Copyright © 2017 The Authors. Published by Wolters Kluwer Health, Inc. All rights reserved.
Selection and Trans-Species Polymorphism of Major Histocompatibility Complex Class II Genes in the Order Crocodylia

PubMed Central

Jaratlerdsiri, Weerachai; Isberg, Sally R.; Higgins, Damien P.; Miles, Lee G.; Gongora, Jaime

2014-01-01

Major Histocompatibility Complex (MHC) class II genes encode for molecules that aid in the presentation of antigens to helper T cells. MHC characterisation within and between major vertebrate taxa has shed light on the evolutionary mechanisms shaping the diversity within this genomic region, though little characterisation has been performed within the Order Crocodylia. Here we investigate the extent and effect of selective pressures and trans-species polymorphism on MHC class II α and β evolution among 20 extant species of Crocodylia. Selection detection analyses showed that diversifying selection influenced MHC class II β diversity, whilst diversity within MHC class II α is the result of strong purifying selection. Comparison of translated sequences between species revealed the presence of twelve trans-species polymorphisms, some of which appear to be specific to the genera Crocodylus and Caiman. Phylogenetic reconstruction clustered MHC class II α sequences into two major clades representing the families Crocodilidae and Alligatoridae. However, no further subdivision within these clades was evident and, based on the observation that most MHC class II α sequences shared the same trans-species polymorphisms, it is possible that they correspond to the same gene lineage across species. In contrast, phylogenetic analyses of MHC class II β sequences showed a mixture of subclades containing sequences from Crocodilidae and/or Alligatoridae, illustrating orthologous relationships among those genes. Interestingly, two of the subclades containing sequences from both Crocodilidae and Alligatoridae shared specific trans-species polymorphisms, suggesting that they may belong to ancient lineages pre-dating the divergence of these two families from the common ancestor 85–90 million years ago. The results presented herein provide an immunogenetic resource that may be used to further assess MHC diversity and functionality in Crocodylia. PMID:24503938
Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

PubMed

Mackey, Aaron J; Pearson, William R

2004-10-01

Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
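The idea of generating a sequence library subset from a relational database, as described in the abstract above, can be sketched with an in-memory SQLite database. The schema below (a single `protein` table with `acc`, `taxon`, and `seq` columns) is a hypothetical stand-in and is not the actual seqdb_demo schema, which the abstract does not describe; the sequences are made up.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database (hypothetical schema, toy data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE protein ("
        "acc TEXT PRIMARY KEY, taxon TEXT NOT NULL, seq TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO protein (acc, taxon, seq) VALUES (?, ?, ?)",
        [
            ("P1", "Escherichia coli", "MKTAYIAK"),
            ("P2", "Homo sapiens", "MGLSDGEW"),
            ("P3", "Escherichia coli", "MSNQEPAT"),
        ],
    )
    return conn


def fasta_subset(conn, taxon):
    """Return a FASTA-formatted library restricted to one taxon."""
    rows = conn.execute(
        "SELECT acc, seq FROM protein WHERE taxon = ? ORDER BY acc",
        (taxon,),
    )
    return "".join(f">{acc}\n{seq}\n" for acc, seq in rows)


if __name__ == "__main__":
    conn = make_demo_db()
    print(fasta_subset(conn, "Escherichia coli"), end="")
```

Searching against a smaller, targeted library of this kind reduces the number of comparisons, which is the mechanism behind the improved statistical significance mentioned in the abstract.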
Vander Lugt correlation of DNA sequence data

NASA Astrophysics Data System (ADS)

Christens-Barry, William A.; Hawk, James F.; Martin, James C.

1990-12-01

DNA, the molecule containing the genetic code of an organism, is a linear chain of subunits. It is the sequence of subunits, of which there are four kinds, that constitutes the unique blueprint of an individual. This sequence is the focus of a large number of analyses performed by an army of geneticists, biologists, and computer scientists. Most of these analyses entail searches for specific subsequences within the larger set of sequence data. Thus, most analyses are essentially pattern recognition or correlation tasks. Yet, there are special features to such analysis that influence the strategy and methods of an optical pattern recognition approach. While the serial processing employed in digital electronic computers remains the main engine of sequence analyses, there is no fundamental reason that more efficient parallel methods cannot be used. We describe an approach using optical pattern recognition (OPR) techniques based on matched spatial filtering. This allows parallel comparison of large blocks of sequence data. In this study we have simulated a Vander Lugt [1] architecture implementing our approach. Searches for specific target sequence strings within a block of DNA sequence from the ColE1 plasmid [2] are performed.
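The abstract above frames subsequence search as a correlation task. A digital analogue of matched filtering is to encode each base as a separate binary channel and cross-correlate the target with the sequence block; the correlation peak equals the target length wherever the target occurs exactly. The sketch below is a plain software illustration under that assumption. It does not model the optical architecture, and the function names and test sequence are hypothetical.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as four binary channels, one per base."""
    return {b: [1 if c == b else 0 for c in seq] for b in BASES}


def correlate(text, target):
    """Cross-correlation score at every offset of target within text.

    The score at offset k is the number of positions where the target
    and the text carry the same base.
    """
    t, p = one_hot(text), one_hot(target)
    n, m = len(text), len(target)
    return [
        sum(t[b][k + j] * p[b][j] for b in BASES for j in range(m))
        for k in range(n - m + 1)
    ]


def find_matches(text, target):
    """Offsets where the correlation reaches its maximum possible value."""
    return [k for k, s in enumerate(correlate(text, target)) if s == len(target)]


if __name__ == "__main__":
    # Toy sequence, made up for illustration.
    print(find_matches("ACGTACGTTACG", "ACG"))
```

Lowering the threshold below the target length would report approximate matches, which is one reason a correlation view of sequence search is attractive.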
Phylogeny of culturable cyanobacteria from Brazilian mangroves.

PubMed

Silva, Caroline Souza Pamplona; Genuário, Diego Bonaldo; Vaz, Marcelo Gomes Marçal Vieira; Fiore, Marli Fátima

2014-03-01

The cyanobacterial community from Brazilian mangrove ecosystems was examined using a culture-dependent method. Fifty cyanobacterial strains were isolated from soil, water and periphytic samples collected from Cardoso Island and Bertioga mangroves using specific cyanobacterial culture media. Unicellular, homocytous and heterocytous morphotypes were recovered, representing five orders, seven families and eight genera (Synechococcus, Cyanobium, Cyanobacterium, Chlorogloea, Leptolyngbya, Phormidium, Nostoc and Microchaete). All of these novel mangrove strains had their 16S rRNA gene sequenced and BLAST analysis revealed sequence identities ranging from 92.5 to 99.7% when they were compared with other strains available in GenBank. The results showed a high variability of the 16S rRNA gene sequences among the genotypes that was not associated with the morphologies observed. Phylogenetic analyses showed several branches formed exclusively by some of these novel 16S rRNA gene sequences. BLAST and phylogeny analyses allowed for the identification of Nodosilinea and Oxynema strains, genera already known to exhibit poor morphological diacritic traits. In addition, several Nostoc and Leptolyngbya morphotypes of the mangrove strains may represent new generic entities, as they were distantly affiliated with true genera clades. The presence of non-ribosomal peptide synthetase, polyketide synthase, microcystin and saxitoxin genes were detected in 20.5%, 100%, 37.5% and 33.3%, respectively, of the 44 tested isolates. A total of 134 organic extracts obtained from 44 strains were tested against microorganisms, and 26% of the extracts showed some antimicrobial activity. This is the first polyphasic study of cultured cyanobacteria from Brazilian mangrove ecosystems using morphological, genetic and biological approaches. Copyright © 2014 Elsevier GmbH. All rights reserved.
Differentiation of Toxocara canis and Toxocara cati based on PCR-RFLP analyses of rDNA-ITS and mitochondrial cox1 and nad1 regions.

PubMed

Mikaeili, Fattaneh; Mathis, Alexander; Deplazes, Peter; Mirhendi, Hossein; Barazesh, Afshin; Ebrahimi, Sepideh; Kia, Eshrat Beigom

2017-09-26

The definitive genetic identification of Toxocara species is currently based on PCR/sequencing. The objectives of the present study were to design and conduct an in silico polymerase chain reaction-restriction fragment length polymorphism method for identification of Toxocara species. In silico analyses using the DNASIS and NEBcutter softwares were performed with rDNA internal transcribed spacers, and mitochondrial cox1 and nad1 sequences obtained in our previous studies along with relevant sequences deposited in GenBank. Consequently, RFLP profiles were designed and all isolates of T. canis and T. cati collected from dogs and cats in different geographical areas of Iran were investigated with the RFLP method using some of the identified suitable enzymes. The findings of in silico analyses predicted that on the cox1 gene only the MboII enzyme is appropriate for PCR-RFLP to reliably distinguish the two species. No suitable enzyme for PCR-RFLP on the nad1 gene was identified that yields the same pattern for all isolates of a species. DNASIS software showed that there are 241 suitable restriction enzymes for the differentiation of T. canis from T. cati based on ITS sequences. RsaI, MvaI and SalI enzymes were selected to evaluate the reliability of the in silico PCR-RFLP. The sizes of restriction fragments obtained by PCR-RFLP of all samples consistently matched the expected RFLP patterns. The ITS sequences are usually conserved and the PCR-RFLP approach targeting the ITS sequence is recommended for the molecular differentiation of Toxocara species and can provide a reliable tool for identification purposes particularly at the larval and egg stages.
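An in silico PCR-RFLP analysis of the kind described in the abstract above amounts to locating an enzyme's recognition site in each amplicon and comparing the predicted fragment sizes between species. The sketch below is a minimal illustration, assuming a linear amplicon in uppercase and using RsaI (recognition site GTAC, blunt cut between T and A) as the example enzyme. The amplicon sequences are made up and are not Toxocara sequences, and the function is not part of DNASIS or NEBcutter.

```python
def digest(seq, site="GTAC", cut_offset=2):
    """Predicted fragment lengths after cutting a linear sequence.

    site: recognition sequence (default is the RsaI site, GTAC).
    cut_offset: cut position relative to the start of the site
                (2 for RsaI, which cuts GT^AC).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    # Fragment boundaries: sequence start, each cut, sequence end.
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical amplicons from two species, for illustration only.
    amplicon_a = "AAAGTACCCCCGTACTT"
    amplicon_b = "AAAGTTCCCCCGTACTT"
    print(digest(amplicon_a))  # two sites, three fragments
    print(digest(amplicon_b))  # one site, two fragments
```

An enzyme is useful for species differentiation when the fragment patterns differ between species but are the same for all isolates within a species, which is the criterion the abstract applies.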

Identification of Bari Transposons in 23 Sequenced Drosophila Genomes Reveals Novel Structural Variants, MITEs and Horizontal Transfer

PubMed Central

D’Addabbo, Pietro; Caizzi, Ruggiero

2016-01-01

Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and as experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have been also studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore evaluation of transposon’s co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events as also suggested by the results of dS analyses. This study further clarifies the Bari transposon’s evolutionary dynamics and increases our understanding on the Tc1-mariner elements’ biology. PMID:27213270
Identification of Bari Transposons in 23 Sequenced Drosophila Genomes Reveals Novel Structural Variants, MITEs and Horizontal Transfer.

PubMed

Palazzo, Antonio; Lovero, Domenica; D'Addabbo, Pietro; Caizzi, Ruggiero; Marsano, René Massimiliano

2016-01-01

Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and as experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have been also studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore evaluation of transposon's co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events as also suggested by the results of dS analyses. This study further clarifies the Bari transposon's evolutionary dynamics and increases our understanding on the Tc1-mariner elements' biology.
Subsurface microbial diversity in deep-granitic-fracture water in Colorado

USGS Publications Warehouse

Sahl, J.W.; Schmidt, R.; Swanner, E.D.; Mandernack, K.W.; Templeton, A.S.; Kieft, Thomas L.; Smith, R.L.; Sanford, W.E.; Callaghan, R.L.; Mitton, J.B.; Spear, J.R.

2008-01-01

A microbial community analysis using 16S rRNA gene sequencing was performed on borehole water and a granite rock core from Henderson Mine, a >1,000-meter-deep molybdenum mine near Empire, CO. Chemical analysis of borehole water at two separate depths (1,044 m and 1,004 m below the mine entrance) suggests that a sharp chemical gradient exists, likely from the mixing of two distinct subsurface fluids, one metal rich and one relatively dilute; this has created unique niches for microorganisms. The microbial community analyzed from filtered, oxic borehole water indicated an abundance of sequences from iron-oxidizing bacteria (Gallionella spp.) and was compared to the community from the same borehole after 2 weeks of being plugged with an expandable packer. Statistical analyses with UniFrac revealed a significant shift in community structure following the addition of the packer. Phospholipid fatty acid (PLFA) analysis suggested that Nitrosomonadales dominated the oxic borehole, while PLFAs indicative of anaerobic bacteria were most abundant in the samples from the plugged borehole. Microbial sequences were represented primarily by Firmicutes, Proteobacteria, and a lineage of sequences which did not group with any identified bacterial division; phylogenetic analyses confirmed the presence of a novel candidate division. This "Henderson candidate division" dominated the clone libraries from the dilute anoxic fluids. Sequences obtained from the granitic rock core (1,740 m below the surface) were represented by the divisions Proteobacteria (primarily the family Ralstoniaceae) and Firmicutes. Sequences grouping within Ralstoniaceae were also found in the clone libraries from metal-rich fluids yet were absent in more dilute fluids. Lineage-specific comparisons, combined with phylogenetic statistical analyses, show that geochemical variance has an important effect on microbial community structure in deep, subsurface systems. Copyright © 2008, American Society for Microbiology. All Rights Reserved.
Subsurface Microbial Diversity in Deep-Granitic-Fracture Water in Colorado

PubMed Central

Sahl, Jason W.; Schmidt, Raleigh; Swanner, Elizabeth D.; Mandernack, Kevin W.; Templeton, Alexis S.; Kieft, Thomas L.; Smith, Richard L.; Sanford, William E.; Callaghan, Robert L.; Mitton, Jeffry B.; Spear, John R.

2008-01-01

A microbial community analysis using 16S rRNA gene sequencing was performed on borehole water and a granite rock core from Henderson Mine, a >1,000-meter-deep molybdenum mine near Empire, CO. Chemical analysis of borehole water at two separate depths (1,044 m and 1,004 m below the mine entrance) suggests that a sharp chemical gradient exists, likely from the mixing of two distinct subsurface fluids, one metal rich and one relatively dilute; this has created unique niches for microorganisms. The microbial community analyzed from filtered, oxic borehole water indicated an abundance of sequences from iron-oxidizing bacteria (Gallionella spp.) and was compared to the community from the same borehole after 2 weeks of being plugged with an expandable packer. Statistical analyses with UniFrac revealed a significant shift in community structure following the addition of the packer. Phospholipid fatty acid (PLFA) analysis suggested that Nitrosomonadales dominated the oxic borehole, while PLFAs indicative of anaerobic bacteria were most abundant in the samples from the plugged borehole. Microbial sequences were represented primarily by Firmicutes, Proteobacteria, and a lineage of sequences which did not group with any identified bacterial division; phylogenetic analyses confirmed the presence of a novel candidate division. This “Henderson candidate division” dominated the clone libraries from the dilute anoxic fluids. Sequences obtained from the granitic rock core (1,740 m below the surface) were represented by the divisions Proteobacteria (primarily the family Ralstoniaceae) and Firmicutes. Sequences grouping within Ralstoniaceae were also found in the clone libraries from metal-rich fluids yet were absent in more dilute fluids. Lineage-specific comparisons, combined with phylogenetic statistical analyses, show that geochemical variance has an important effect on microbial community structure in deep, subsurface systems. PMID:17981950
Mobile Genome Express (MGE): A comprehensive automatic genetic analyses pipeline with a mobile device.

PubMed

Yoon, Jun-Hee; Kim, Thomas W; Mendez, Pedro; Jablons, David M; Kim, Il-Jin

2017-01-01

The development of next-generation sequencing (NGS) technology makes it possible to sequence whole exomes or genomes. However, data analysis is still the biggest bottleneck for its wide implementation. Most laboratories still depend on manual procedures for data handling and analyses, which translates into a delay and decreased efficiency in the delivery of NGS results to doctors and patients. Thus, there is high demand for an automatic, easy-to-use NGS data analysis system. We developed a comprehensive, automatic genetic analysis controller named Mobile Genome Express (MGE) that works on smartphones or other mobile devices. MGE can handle all the steps of genetic analysis, such as sample information submission, sequencing run quality check from the sequencer, secured data transfer and results review. We sequenced an Actrometrix control DNA containing multiple proven human mutations using a targeted sequencing panel, and the whole analysis was managed by MGE and its data reviewing program, called ELECTRO. All steps were processed automatically except for the final sequencing review procedure with ELECTRO to confirm mutations. The data analysis process was completed within several hours. We confirmed that the mutations we identified were consistent with our previous results obtained using multi-step, manual pipelines.
Phylogeny of Neoparamoeba strains isolated from marine fish and invertebrates as inferred from SSU rDNA sequences.

PubMed

Dyková, Iva; Nowak, Barbara; Pecková, Hana; Fiala, Ivan; Crosbie, Philip; Dvoráková, Helena

2007-02-08

We characterised 9 strains selected from primary isolates referable to Paramoeba/Neoparamoeba spp. Based on ultrastructural study, 5 strains isolated from fish (amoebic gill disease [AGD]-affected Atlantic salmon and dead southern bluefin tuna), 1 strain from netting of a floating sea cage and 3 strains isolated from invertebrates (sea urchins and crab) were assigned to the genus Neoparamoeba Page, 1987. Phylogenetic analyses based on SSU rDNA sequences revealed affiliations of newly introduced and previously analysed Neoparamoeba strains. Three strains from the invertebrates and 2 out of 3 strains from gills of southern bluefin tunas were members of the N. branchiphila clade, while the remaining, fish-isolated strains, as well as the fish cage strain, clustered within the clade of N. pemaquidensis. These findings and previous reports point to the possibility that N. pemaquidensis and N. branchiphila can affect both fish and invertebrates. A new potential fish host, southern bluefin tuna, was included in the list of farmed fish endangered by N. branchiphila. The sequence of P. eilhardi (Culture Collection of Algae and Protozoa [CCAP] strain 1560/2) appeared in all analyses among sequences of strain representatives of Neoparamoeba species, in a position well supported by bootstrap value, Bremer index and Bayesian posterior probability. Our research shows that isolation of additional strains from invertebrates and further analyses of relations between molecular data and morphological characters of the genera Paramoeba and Neoparamoeba are required. This complexity needs to be considered when attempting to define molecular markers for identification of Paramoeba/Neoparamoeba species in tissues of fish and invertebrates.
Reassessment of the taxonomic position of Burkholderia andropogonis and description of Robbsia andropogonis gen. nov., comb. nov.

PubMed

Lopes-Santos, Lucilene; Castro, Daniel Bedo Assumpção; Ferreira-Tonin, Mariana; Corrêa, Daniele Bussioli Alves; Weir, Bevan Simon; Park, Duckchul; Ottoboni, Laura Maria Mariscal; Neto, Júlio Rodrigues; Destéfano, Suzete Aparecida Lanza

2017-06-01

The phylogenetic classification of the species Burkholderia andropogonis within the Burkholderia genus was reassessed using 16S rRNA gene phylogenetic analysis and multilocus sequence analysis (MLSA). Both phylogenetic trees revealed two main groups, named A and B, strongly supported by high bootstrap values (100%). Group A encompassed all of the Burkholderia species complex, while Group B only comprised B. andropogonis species, with low percentage similarities with other species of the genus, from 92 to 95% for 16S rRNA gene sequences and 83% for conserved gene sequences. Average nucleotide identity (ANI), tetranucleotide signature frequency, and percentage of conserved proteins (POCP) analyses were also carried out, and in the three analyses B. andropogonis showed lower values when compared to the other Burkholderia species complex, near 71% for ANI, from 0.484 to 0.724 for tetranucleotide signature frequency, and around 50% for POCP, reinforcing the distance observed in the phylogenetic analyses. Our findings provide an important insight into the taxonomy of B. andropogonis. It is clear from the results that this bacterial species exhibits genotypic differences and represents a new genus described herein as Robbsia andropogonis gen. nov., comb. nov.
Accounting for genotype uncertainty in the estimation of allele frequencies in autopolyploids.

PubMed

Blischak, Paul D; Kubatko, Laura S; Wolfe, Andrea D

2016-05-01

Despite the increasing opportunity to collect large-scale data sets for population genomic analyses, the use of high-throughput sequencing to study populations of polyploids has seen little application. This is due in large part to problems associated with determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty-ADU), which complicates the calculation of important quantities such as allele frequencies. Here, we describe a statistical model to estimate biallelic SNP frequencies in a population of autopolyploids using high-throughput sequencing data in the form of read counts. We bridge the gap from data collection (using restriction enzyme based techniques [e.g. GBS, RADseq]) to allele frequency estimation in a unified inferential framework using a hierarchical Bayesian model to sum over genotype uncertainty. Simulated data sets were generated under various conditions for tetraploid, hexaploid and octoploid populations to evaluate the model's performance and to help guide the collection of empirical data. We also provide an implementation of our model in the R package polyfreqs and demonstrate its use with two example analyses that investigate (i) levels of expected and observed heterozygosity and (ii) model adequacy. Our simulations show that the number of individuals sampled from a population has a greater impact on estimation error than sequencing coverage. The example analyses also show that our model and software can be used to make inferences beyond the estimation of allele frequencies for autopolyploids by providing assessments of model adequacy and estimates of heterozygosity. © 2015 John Wiley & Sons Ltd.
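To make the idea of summing over genotype uncertainty concrete, here is a minimal Python sketch. It is not the polyfreqs implementation and does not reproduce the authors' hierarchical sampler. It assumes, purely for illustration, that reference-allele read counts are binomial given the allele dosage, that dosage has a binomial prior given the population allele frequency, and that sequencing error is a fixed symmetric rate `err`. The function name and all parameter names are hypothetical.

```python
from math import comb

def genotype_posterior(ref_reads, total_reads, ploidy, freq, err=0.01):
    """Posterior probability of each reference-allele dosage 0..ploidy.

    Assumed model (illustrative): reads ~ Binomial(total_reads, p_read),
    dosage ~ Binomial(ploidy, freq).
    """
    weights = []
    for dosage in range(ploidy + 1):
        share = dosage / ploidy
        # Probability that a single read shows the reference allele,
        # allowing for symmetric sequencing error.
        p_read = share * (1 - err) + (1 - share) * err
        likelihood = (comb(total_reads, ref_reads)
                      * p_read ** ref_reads
                      * (1 - p_read) ** (total_reads - ref_reads))
        prior = (comb(ploidy, dosage)
                 * freq ** dosage
                 * (1 - freq) ** (ploidy - dosage))
        weights.append(likelihood * prior)
    total = sum(weights)
    return [w / total for w in weights]

# Tetraploid individual, 5 of 20 reads carry the reference allele.
posterior = genotype_posterior(ref_reads=5, total_reads=20, ploidy=4, freq=0.5)
print([round(p, 3) for p in posterior])
```

Rather than committing to the single most probable dosage, an allele frequency estimate can weight every dosage by its posterior probability, which is the sense in which genotype uncertainty is carried through to the frequency estimate.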
Clustered array of ochratoxin A biosynthetic genes in Aspergillus steynii and their expression patterns in permissive conditions.

PubMed

Gil-Serna, Jessica; Vázquez, Covadonga; González-Jaén, María Teresa; Patiño, Belén

2015-12-02

Aspergillus steynii is probably the most relevant species of section Circumdati producing ochratoxin A (OTA). This mycotoxin contaminates a wide number of commodities and it is highly toxic for humans and animals. Little is known on the biosynthetic genes and their regulation in Aspergillus species. In this work, we identified and analysed three contiguous genes in A. steynii using 5'-RACE and genome walking approaches which predicted a cytochrome P450 monooxygenase (p450ste), a non-ribosomal peptide synthetase (nrpsste) and a polyketide synthase (pksste). These three genes were contiguous within a 20742 bp long genomic DNA fragment. Their corresponding cDNA were sequenced and their expression was analysed in three A. steynii strains using real time RT-PCR specific assays in permissive conditions in in vitro cultures. OTA was also analysed in these cultures. Comparative analyses of predicted genomic, cDNA and amino acid sequences were performed with sequences of similar gene functions. All the results obtained in these analyses were consistent and point out the involvement of these three genes in OTA biosynthesis by A. steynii and showed a co-ordinated expression pattern. This is the first time that a clustered organization OTA biosynthetic genes has been reported in Aspergillus genus. The results also suggested that this situation might be common in Aspergillus OTA-producing species and distinct to the one described for Penicillium species. Copyright © 2015 Elsevier B.V. All rights reserved.
Molecular evidence of undescribed Ceratonova sp. (Cnidaria: Myxosporea) in the freshwater polychaete, Manayunkia speciosa, from western Lake Erie

USGS Publications Warehouse

Malakauskas, David M.; Snipes, Robert Benjamin; Thompson, Ann M.; Schloesser, Donald W.

2016-01-01

We used PCR to screen pooled individuals of Manayunkia speciosa from western Lake Erie, Michigan, USA for myxosporean parasites. Amplicons from positive PCRs were sequenced and showed a Ceratonova species in an estimated 1.1% (95% CI = 0.46%, 1.8%) of M. speciosa individuals. We sequenced 18S, ITS1, 5.8S, ITS2 and most of the 28S rDNA regions of this Ceratonova sp., and part of the protein-coding EF2 gene. Phylogenetic analyses of ribosomal and EF2 sequences showed the Lake Erie Ceratonova sp. is most similar to, but genetically distinct from, Ceratonova shasta. Marked interspecific polymorphism in all genes examined, including the ITS barcoding genes, along with geographic location suggests this is an undescribed Ceratonova species. COI sequences showed M. speciosa individuals in Michigan and California are the same species. These findings represent a third parasite in the genus Ceratonova potentially hosted by M. speciosa.
de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer.

PubMed

Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph; Aury, Jean-Marc

2017-02-01

Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. © The Author 2017. Published by Oxford University Press.
de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer

PubMed Central

Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph

2017-01-01

Abstract Background: Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Results: Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Conclusion: Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. PMID:28369459
Genome Sequences of Pseudomonas spp. Isolated from Cereal Crops

PubMed Central

Stiller, Jiri; Covarelli, Lorenzo; Lindeberg, Magdalen; Shivas, Roger G.; Manners, John M.

2013-01-01

Compared to those of dicot-infecting bacteria, the available genome sequences of bacteria that infect wheat and barley are limited. Herein, we report the draft genome sequences of four pseudomonads originally isolated from these cereals. These genome sequences provide a useful resource for comparative analyses within the genus and for cross-kingdom analyses of plant pathogenesis. PMID:23661484
Identification of Cis-Acting Promoter Elements in Cold- and Dehydration-Induced Transcriptional Pathways in Arabidopsis, Rice, and Soybean

PubMed Central

Maruyama, Kyonoshin; Todaka, Daisuke; Mizoi, Junya; Yoshida, Takuya; Kidokoro, Satoshi; Matsukura, Satoko; Takasaki, Hironori; Sakurai, Tetsuya; Yamamoto, Yoshiharu Y.; Yoshiwara, Kyouko; Kojima, Mikiko; Sakakibara, Hitoshi; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko

2012-01-01

The genomes of three plants, Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and soybean (Glycine max), have been sequenced, and their many genes and promoters have been predicted. In Arabidopsis, cis-acting promoter elements involved in cold- and dehydration-responsive gene expression have been extensively analysed; however, the characteristics of such cis-acting promoter sequences in cold- and dehydration-inducible genes of rice and soybean remain to be clarified. In this study, we performed microarray analyses using the three species, and compared characteristics of identified cold- and dehydration-inducible genes. Transcription profiles of the cold- and dehydration-responsive genes were similar among these three species, showing representative upregulated (dehydrin/LEA) and downregulated (photosynthesis-related) genes. All (4^6 = 4096) hexamer sequences in the promoters of the three species were investigated, revealing the frequency of conserved sequences in cold- and dehydration-inducible promoters. A core sequence of the abscisic acid-responsive element (ABRE) was the most conserved in dehydration-inducible promoters of all three species, suggesting that transcriptional regulation for dehydration-inducible genes is similar among these three species, with the ABRE-dependent transcriptional pathway. In contrast, for cold-inducible promoters, the conserved hexamer sequences were diversified among these three species, suggesting the existence of diverse transcriptional regulatory pathways for cold-inducible genes among the species. PMID:22184637
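The exhaustive hexamer scan mentioned above (four bases at six positions gives 4096 possible hexamers) can be sketched in a few lines of Python. This is only an illustration of the counting step, not the authors' pipeline. The promoter sequences are made-up toy strings, only the given strand is scanned, and the names `ALL_HEXAMERS` and `promoter_fraction` are hypothetical.

```python
from collections import Counter
from itertools import product

# Every possible hexamer over the DNA alphabet: 4**6 = 4096 sequences.
ALL_HEXAMERS = ["".join(bases) for bases in product("ACGT", repeat=6)]

def promoter_fraction(promoters):
    """Fraction of promoters that contain each hexamer at least once."""
    counts = Counter()
    for seq in promoters:
        seq = seq.upper()
        # A set, so each promoter contributes at most once per hexamer.
        counts.update({seq[i:i + 6] for i in range(len(seq) - 5)})
    return {h: counts[h] / len(promoters) for h in ALL_HEXAMERS}

# Toy data (illustrative only): stress-induced vs. background promoters.
induced = ["TTACGTGTAA", "GGACGTGTCC"]
background = ["TTTTTTTTTT", "ACGTGTAAAA", "CCCCCCCCCC", "GGGGGGGGGG"]

induced_frac = promoter_fraction(induced)
background_frac = promoter_fraction(background)
print(len(ALL_HEXAMERS))
print(induced_frac["ACGTGT"], background_frac["ACGTGT"])
```

Comparing the two fractions for every hexamer gives a simple ranking of candidate motifs; a real analysis would also need both strands, realistic background sets, and a statistical test.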
The occurrence of Toxocara malaysiensis in cats in China, confirmed by sequence-based analyses of ribosomal DNA.

PubMed

Li, Ming-Wei; Zhu, Xing-Quan; Gasser, Robin B; Lin, Rui-Qing; Sani, Rehana A; Lun, Zhao-Rong; Jacobs, Dennis E

2006-10-01

Non-isotopic polymerase chain reaction (PCR)-based single-strand conformation polymorphism and sequence analyses of the second internal transcribed spacer (ITS-2) of nuclear ribosomal DNA (rDNA) were utilized to genetically characterise ascaridoids from dogs and cats from China by comparison with those from other countries. The study showed that Toxocara canis, Toxocara cati, and Toxascaris leonina from China were genetically the same as those from other geographical origins. Specimens from cats from Guangzhou, China, which were morphologically consistent with Toxocara malaysiensis, were the same genetically as those from Malaysia, with the exception of a polymorphism in the ITS-2 but no unequivocal sequence difference. This is the first report of T. malaysiensis in cats outside of Malaysia (from where it was originally described), supporting the proposal that this species has a broader geographical distribution. The molecular approach employed provides a powerful tool for elucidating the biology, epidemiology, and zoonotic significance of T. malaysiensis.
Candidate chemosensory ionotropic receptors in a Lepidoptera.

PubMed

Olivier, V; Monsempes, C; François, M-C; Poivet, E; Jacquin-Joly, E

2011-04-01

A new family of candidate chemosensory ionotropic receptors (IRs) related to ionotropic glutamate receptors (iGluRs) was recently discovered in Drosophila melanogaster. Through Blast analyses of an expressed sequenced tag library prepared from male antennae of the noctuid moth Spodoptera littoralis, we identified 12 unigenes encoding proteins related to D. melanogaster and Bombyx mori IRs. Their full length sequences were obtained and the analyses of their expression patterns suggest that they were exclusively expressed or clearly enriched in chemosensory organs. The deduced protein sequences were more similar to B. mori and D. melanogaster IRs than to iGluRs and showed considerable variations in the predicted ligand-binding domains; none have the three glutamate-interacting residues found in iGluRs, suggesting different binding specificities. Our data suggest that we identified members of the insect IR chemosensory receptor family in S. littoralis and we report here the first demonstration of IR expression in Lepidoptera. © 2010 The Authors. Insect Molecular Biology © 2010 The Royal Entomological Society.
Genome sequence analysis of dengue virus 1 isolated in Key West, Florida.

PubMed

Shin, Dongyoung; Richards, Stephanie L; Alto, Barry W; Bettinardi, David J; Smartt, Chelsea T

2013-01-01

Dengue virus (DENV) is transmitted to humans through the bite of mosquitoes. In November 2010, a dengue outbreak was reported in Monroe County in southern Florida (FL), including greater than 20 confirmed human cases. The virus collected from the human cases was verified as DENV serotype 1 (DENV-1) and one isolate was provided for sequence analysis. RNA was extracted from the DENV-1 isolate and was used in reverse transcription polymerase chain reaction (RT-PCR) to amplify PCR fragments to sequence. Nucleic acid primers were designed to generate overlapping PCR fragments that covered the entire genome. The DENV-1 isolate found in Key West (KW), FL was sequenced for whole genome characterization. Sequence assembly, Genbank searches, and recombination analyses were performed to verify the identity of the genome sequences and to determine percent similarity to known DENV-1 sequences. We show that the KW DENV-1 strain is 99% identical to Nicaraguan and Mexican DENV-1 strains. Phylogenetic and recombination analyses suggest that the DENV-1 isolated in KW originated from Nicaragua (NI) and the KW strain may circulate in KW. Also, recombination analysis results detected recombination events in the KW strain compared to DENV-1 strains from Puerto Rico. We evaluate the relative growth of KW strain of DENV-1 compared to other dengue viruses to determine whether the underlying genetics of the strain is associated with a replicative advantage, an important consideration since local transmission of DENV may result because domestic tourism can spread DENVs.
Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures.

PubMed

Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi

2014-09-18

Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.
Molecular Detection of Rickettsia felis in Different Flea Species from Caldas, Colombia

PubMed Central

Ramírez-Hernández, Alejandro; Montoya, Viviana; Martínez, Alejandra; Pérez, Jorge E.; Mercado, Marcela; de la Ossa, Alberto; Vélez, Carolina; Estrada, Gloria; Correa, Maria I.; Duque, Laura; Ariza, Juan S.; Henao, Cesar; Valbuena, Gustavo; Hidalgo, Marylin

2013-01-01

Rickettsioses caused by Rickettsia felis are an emergent global threat. Historically, the northern region of the province of Caldas in Colombia has reported murine typhus cases, and recently, serological studies confirmed high seroprevalence for both R. felis and R. typhi. In the present study, fleas from seven municipalities were collected from dogs, cats, and mice. DNA was extracted and amplified by polymerase chain reaction (PCR) to identify gltA, ompB, and 17kD genes. Positive samples were sequenced to identify the species of Rickettsia. Of 1,341 fleas, Ctenocephalides felis was the most prevalent (76.7%). Positive PCR results in the three genes were evidenced in C. felis (minimum infection rates; 5.3%), C. canis (9.2%), and Pulex irritans (10.0%). Basic Local Alignment Search Tool (BLAST) analyses of sequences showed high identity values (> 98%) with R. felis, and all were highly related by phylogenetic analyses. This work shows the first detection of R. felis in fleas collected from animals in Colombia. PMID:23878183
Separating endogenous ancient DNA from modern day contamination in a Siberian Neandertal

PubMed Central

Skoglund, Pontus; Northoff, Bernd H.; Shunkov, Michael V.; Derevianko, Anatoli P.; Pääbo, Svante; Krause, Johannes; Jakobsson, Mattias

2014-01-01

One of the main impediments for obtaining DNA sequences from ancient human skeletons is the presence of contaminating modern human DNA molecules in many fossil samples and laboratory reagents. However, DNA fragments isolated from ancient specimens show a characteristic DNA damage pattern caused by miscoding lesions that differs from present day DNA sequences. Here, we develop a framework for evaluating the likelihood of a sequence originating from a model with postmortem degradation—summarized in a postmortem degradation score—which allows the identification of DNA fragments that are unlikely to originate from present day sources. We apply this approach to a contaminated Neandertal specimen from Okladnikov Cave in Siberia to isolate its endogenous DNA from modern human contaminants and show that the reconstructed mitochondrial genome sequence is more closely related to the variation of Western Neandertals than what was discernible from previous analyses. Our method opens up the potential for genomic analysis of contaminated fossil material. PMID:24469802
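The idea of scoring a fragment by how much better a damage model explains it than a no-damage model can be sketched as a log-likelihood ratio. The following is a minimal toy sketch, not the published postmortem degradation score: the exponential decay of the C-to-T rate with distance from the 5' end, and the parameter values `peak`, `scale` and `baseline`, are assumptions chosen only for illustration.

```python
import math


def degradation_score(read, reference, peak=0.3, scale=3.0, baseline=0.01):
    """Toy log-likelihood ratio: damage model versus no-damage model.

    Only positions where the reference base is C are scored. Under the
    assumed damage model, the chance of reading T instead of C is highest
    at the 5' end and decays exponentially with distance. Under the
    no-damage model, a C-to-T mismatch occurs at the constant rate
    `baseline`. All parameter values are hypothetical.
    """
    score = 0.0
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C" or read_base not in "CT":
            continue
        p_damage = baseline + peak * math.exp(-i / scale)
        if read_base == "T":
            score += math.log(p_damage / baseline)
        else:
            score += math.log((1.0 - p_damage) / (1.0 - baseline))
    return score


# Hypothetical fragments aligned to the same reference.
reference = "CAGC"
damaged_like = degradation_score("TAGC", reference)  # C read as T at the 5' end
modern_like = degradation_score("CAGC", reference)   # no mismatches
```

In this sketch a positive score means the damage model explains the fragment better, so fragments scoring above a chosen threshold would be the ones retained as likely ancient.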

Divergence, hybridization, and recombination in the mitochondrial genome of the human pathogenic yeast Cryptococcus gattii.

PubMed

Xu, Jianping; Yan, Zhun; Guo, Hong

2009-06-01

The inheritance of mitochondrial genes and genomes is uniparental in most sexual eukaryotes. This pattern of inheritance makes mitochondrial genomes in natural populations effectively clonal. Here, we examined the mitochondrial population genetics of the emerging human pathogenic fungus Cryptococcus gattii. The DNA sequences for five mitochondrial DNA fragments were obtained from each of 50 isolates belonging to two evolutionarily divergent lineages, VGI and VGII. Our analyses revealed a greater sequence diversity within VGI than that within VGII, consistent with observations of the nuclear genes. The combined analyses of all five gene fragments indicated significant divergence between VGI and VGII. However, the five individual genealogies showed different relationships among the isolates, consistent with recent hybridization and mitochondrial gene transfer between the two lineages. Population genetic analyses of the multilocus data identified evidence for predominantly clonal mitochondrial population structures within both lineages. Interestingly, there were clear signatures of recombination among mitochondrial genes within the VGII lineage. Our analyses suggest historical mitochondrial genome divergence within C. gattii, but there is evidence for recent hybridization and recombination in the mitochondrial genome of this important human yeast pathogen.
Analyses of mitochondrial amino acid sequence datasets support the proposal that specimens of Hypodontus macropi from three species of macropodid hosts represent distinct species

PubMed Central

2013-01-01

Background Hypodontus macropi is a common intestinal nematode of a range of kangaroos and wallabies (macropodid marsupials). Based on previous multilocus enzyme electrophoresis (MEE) and nuclear ribosomal DNA sequence data sets, H. macropi has been proposed to be a complex of species. To test this proposal using independent molecular data, we sequenced the whole mitochondrial (mt) genomes of individuals of H. macropi from three different species of hosts (Macropus robustus robustus, Thylogale billardierii and Macropus [Wallabia] bicolor) as well as that of Macropicola ocydromi (a related nematode), and undertook a comparative analysis of the amino acid sequence datasets derived from these genomes. Results The mt genomes sequenced by next-generation (454) technology from H. macropi from the three host species varied from 13,634 bp to 13,699 bp in size. Pairwise comparisons of the amino acid sequences predicted from these three mt genomes revealed differences of 5.8% to 18%. Phylogenetic analysis of the amino acid sequence data sets using Bayesian Inference (BI) showed that H. macropi from the three different host species formed distinct, well-supported clades. In addition, sliding window analysis of the mt genomes defined variable regions for future population genetic studies of H. macropi in different macropodid hosts and geographical regions around Australia. Conclusions The present analyses of inferred mt protein sequence datasets clearly supported the hypothesis that H. macropi from M. robustus robustus, M. bicolor and T. billardierii represent distinct species. PMID:24261823
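The pairwise comparison and sliding window analysis mentioned in this abstract can be sketched in a few lines. This is a minimal illustration that assumes the sequences are already aligned to equal length; the function names, the toy alignment and the window settings are hypothetical and are not taken from the study.

```python
from itertools import combinations


def percent_difference(a, b):
    """Percentage of aligned positions at which two sequences differ.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    return 100.0 * differing / compared if compared else 0.0


def sliding_window_difference(seqs, window, step):
    """Mean pairwise percent difference for each window of the alignment."""
    pairs = list(combinations(seqs, 2))
    result = []
    for start in range(0, len(seqs[0]) - window + 1, step):
        end = start + window
        diffs = [percent_difference(a[start:end], b[start:end]) for a, b in pairs]
        result.append((start, sum(diffs) / len(diffs)))
    return result


# Hypothetical toy alignment of three short protein fragments.
alignment = ["MKLVINGKTL", "MKLVINGRTL", "MRLVIDGRTL"]
windows = sliding_window_difference(alignment, window=5, step=5)
```

Windows with a high mean difference would be the candidate variable regions; real analyses typically use much longer alignments and overlapping windows.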
Sequence and phylogenetic analyses of novel totivirus-like double-stranded RNAs from field-collected powdery mildew fungi.

PubMed

Kondo, Hideki; Hisano, Sakae; Chiba, Sotaro; Maruyama, Kazuyuki; Andika, Ida Bagus; Toyoda, Kazuhiro; Fujimori, Fumihiro; Suzuki, Nobuhiro

2016-02-02

The identification of mycoviruses contributes greatly to understanding of the diversity and evolutionary aspects of viruses. Powdery mildew fungi are important and widely studied obligate phytopathogenic agents, but there has been no report on mycoviruses infecting these fungi. In this study, we used a deep sequencing approach to analyze the double-stranded RNA (dsRNA) segments isolated from field-collected samples of powdery mildew fungus-infected red clover plants in Japan. Database searches identified the presence of at least ten totivirus (genus Totivirus)-like sequences, termed red clover powdery mildew-associated totiviruses (RPaTVs). The majority of these sequences shared moderate amino acid sequence identity with each other (<44%) and with other known totiviruses (<59%). Nine of these identified sequences (RPaTV1a, 1b and 2-8) resembled the genome of the prototype totivirus, Saccharomyces cerevisiae virus-L-A (ScV-L-A) in that they contained two overlapping open reading frames (ORFs) encoding a putative coat protein (CP) and an RNA dependent RNA polymerase (RdRp), while one sequence (RPaTV9) showed similarity to another totivirus, Ustilago maydis virus H1 (UmV-H1) that encodes a single polyprotein (CP-RdRp fusion). Similar to yeast totiviruses, each ScV-L-A-like RPaTV contains a -1 ribosomal frameshift site downstream of a predicted pseudoknot structure in the overlapping region of these ORFs, suggesting that the RdRp is translated as a CP-RdRp fusion. Moreover, several ScV-L-A-like sequences were also found by searches of the transcriptome shotgun assembly (TSA) libraries from rust fungi, plants and insects. Phylogenetic analyses show that nine ScV-L-A-like RPaTVs along with ScV-L-A-like sequences derived from TSA libraries are clustered with most established members of the genus Totivirus, while one RPaTV forms a new distinct clade with UmV-H1, possibly establishing an additional genus in the family. Taken together, our results indicate the presence of diverse, novel totiviruses in the powdery mildew fungus populations infecting red clover plants in the field. Copyright © 2015 Elsevier B.V. All rights reserved.
Some methodical peculiarities of analysis of small-mass samples by SRXFA

NASA Astrophysics Data System (ADS)

Kudryashova, A. F.; Tarasov, L. S.; Ulyanov, A. A.; Baryshev, V. B.

1989-10-01

The stability of work of the element analysis station on the storage rings VEPP-3 and VEPP-4 in INP (Novosibirsk, USSR) was demonstrated on the example of three sets of rare element analyses carried out by SRXFA in May 1985, January and May-June 1988. These data show that there are some systematic deviations in the results of measurements of Zr and La contents. SRXFA and INAA data have been compared for the latter element. A false linear correlation on the Rb-Sr plot in one set of analyses has been attributed to an overlapping artificial Sr peak on a Rb peak. The authors proposed sequences of registration of spectra and computer treatment for samples and standards. Such sequences result in better final concentration data.
Bioinformatic Analysis of Strawberry GSTF12 Gene

NASA Astrophysics Data System (ADS)

Wang, Xiran; Jiang, Leiyu; Tang, Haoru

2018-01-01

GSTF12 has long been known as a key factor in proanthocyanin accumulation in the plant testa. Bioinformatic analysis of the nucleotide and encoded protein sequences of GSTF12 can therefore aid the study of genes in the anthocyanin biosynthesis and accumulation pathway. We chose the GSTF12 genes of 11 species as the research object, downloaded their nucleotide and protein sequences from NCBI, identified the strawberry GSTF12 gene by bioinformatic analysis, and constructed a phylogenetic tree. We also analysed the physical and chemical properties of the strawberry GSTF12 gene as well as its protein structure. The phylogenetic tree showed that strawberry and petunia were the closest relatives. Protein prediction indicated that the protein has one signal peptide and no obvious transmembrane regions.
The giant zooxanthellae-bearing ciliate Maristentor dinoferus (Heterotrichea) is closely related to folliculinidae.

PubMed

Miao, Wei; Simpson, Alastair G B; Fu, Chengjie; Lobban, Christopher S

2005-01-01

The small subunit rDNA sequence of Maristentor dinoferus (Lobban, Schefter, Simpson, Pochon, Pawlowski, and Foissner, 2002) was determined and compared with sequences from other Heterotrichea and Karyorelictea. Maristentor resembles Stentor in basic morphology and had been provisionally assigned to Stentoridae. However, our phylogenetic analyses show that Maristentor is more closely related to Folliculinidae. Our results support the creation of a separate family for Maristentor, Maristentoridae n. fam., and also confirm the phylogenetic grouping of Folliculinidae, Stentoridae, Blepharismidae, and Maristentoridae, which we informally call 'stentorids'. Maristentor, rather than Stentor itself, appears to be most significant in understanding the origins of folliculinids from their aloricate ancestors. Our analyses suggest continued uncertainty in the exact placement of the root of heterotrichs with this phylogenetic marker.
Detection of a new bat gammaherpesvirus in the Philippines.

PubMed

Watanabe, Shumpei; Ueda, Naoya; Iha, Koichiro; Masangkay, Joseph S; Fujii, Hikaru; Alviola, Phillip; Mizutani, Tetsuya; Maeda, Ken; Yamane, Daisuke; Walid, Azab; Kato, Kentaro; Kyuwa, Shigeru; Tohya, Yukinobu; Yoshikawa, Yasuhiro; Akashi, Hiroomi

2009-08-01

A new bat herpesvirus was detected in the spleen of an insectivorous bat (Hipposideros diadema, family Hipposideridae) collected on Panay Island, the Philippines. PCR analyses were performed using COnsensus-DEgenerate Hybrid Oligonucleotide Primers (CODEHOPs) targeting the herpesvirus DNA polymerase (DPOL) gene. Although we obtained PCR products with CODEHOPs, direct sequencing using the primers was not possible because of high degree of degeneracy. Direct sequencing technology developed in our rapid determination system of viral RNA sequences (RDV) was applied in this study, and a partial DPOL nucleotide sequence was determined. In addition, a partial gB gene nucleotide sequence was also determined using the same strategy. We connected the partial gB and DPOL sequences with long-distance PCR, and a 3741-bp nucleotide fragment, including the 3' part of the gB gene and the 5' part of the DPOL gene, was finally determined. Phylogenetic analysis showed that the sequence was novel and most similar to those of the subfamily Gammaherpesvirinae.
An experimental phylogeny to benchmark ancestral sequence reconstruction

PubMed Central

Randall, Ryan N.; Radford, Caelan E.; Roof, Kelsey A.; Natarajan, Divya K.; Gaucher, Eric A.

2016-01-01

Ancestral sequence reconstruction (ASR) is a still-burgeoning method that has revealed many key mechanisms of molecular evolution. One criticism of the approach is an inability to validate its algorithms within a biological context as opposed to a computer simulation. Here we build an experimental phylogeny using the gene of a single red fluorescent protein to address this criticism. The evolved phylogeny consists of 19 operational taxonomic units (leaves) and 17 ancestral bifurcations (nodes) that display a wide variety of fluorescent phenotypes. The 19 leaves then serve as 'modern' sequences that we subject to ASR analyses using various algorithms and to benchmark against the known ancestral genotypes and ancestral phenotypes. We confirm computer simulations that show all algorithms infer ancient sequences with high accuracy, yet we also reveal wide variation in the phenotypes encoded by incorrectly inferred sequences. Specifically, Bayesian methods incorporating rate variation significantly outperform the maximum parsimony criterion in phenotypic accuracy. Subsampling of extant sequences had minor effect on the inference of ancestral sequences. PMID:27628687
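To make the maximum parsimony criterion concrete, the sketch below applies the bottom-up pass of Fitch's algorithm to a single alignment column. It is a textbook illustration and not necessarily the implementation benchmarked in the study; the nested-tuple tree encoding and the taxon names are assumptions made for the example.

```python
def fitch(tree, leaf_states):
    """Bottom-up Fitch pass for one alignment column.

    `tree` is either a leaf name (str) or a pair (left, right) of subtrees.
    Returns (set of candidate states at this node, minimum number of changes).
    """
    if isinstance(tree, str):  # leaf: its observed state, no changes needed
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:  # children agree on at least one state: no extra change
        return shared, left_cost + right_cost
    # children disagree: keep all candidates and count one substitution
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-taxon tree and the base observed at one site in each taxon.
tree = (("A", "B"), ("C", "D"))
site = {"A": "G", "B": "G", "C": "T", "D": "G"}
root_states, changes = fitch(tree, site)
```

A full reconstruction would add a top-down pass to fix one state per internal node and repeat this for every column; the Bayesian methods mentioned in the abstract instead weigh states by posterior probability under a substitution model.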
Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing.

PubMed

Hargreaves, Adam D; Mulley, John F

2015-01-01

Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0-2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5' and 3' UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species.
Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing

PubMed Central

Hargreaves, Adam D.

2015-01-01

Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0–2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5′ and 3′ UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species. PMID:26623194
Characterization of a new apple luteovirus identified by high-throughput sequencing.

PubMed

Liu, Huawei; Wu, Liping; Nikolaeva, Ekaterina; Peter, Kari; Liu, Zongrang; Mollov, Dimitre; Cao, Mengji; Li, Ruhui

2018-05-15

'Rapid Apple Decline' (RAD) is a newly emerging problem of young, dwarf apple trees in the Northeastern USA. The affected trees show trunk necrosis, cracking and canker before collapse in summer. In this study, we discovered and characterized a new luteovirus from apple trees in RAD-affected orchards using high-throughput sequencing (HTS) technology and subsequent Sanger sequencing. Illumina NextSeq sequencing was applied to total RNAs prepared from three diseased apple trees. Sequence reads were de novo assembled, and contigs were annotated by BLASTx. RT-PCR and 5'/3' RACE sequencing were used to obtain the complete genome of a new virus. RT-PCR was used to detect the virus. Three common apple viruses and a new luteovirus were identified from the diseased trees by HTS and RT-PCR. Sequence analyses of the complete genome of the new virus show that it is a new species of the genus Luteovirus in the family Luteoviridae. The virus is graft transmissible and detected by RT-PCR in apple trees in a couple of orchards. A new luteovirus and/or three known viruses were found to be associated with RAD. Molecular characterization of the new luteovirus provides important information for further investigation of its distribution and etiological role.
What can we learn about lyssavirus genomes using 454 sequencing?

PubMed

Höper, Dirk; Finke, Stefan; Freuling, Conrad M; Hoffmann, Bernd; Beer, Martin

2012-01-01

The main task of individual project number four, "Whole genome sequencing, virus-host adaptation, and molecular epidemiological analyses of lyssaviruses", within the network "Lyssaviruses--a potential re-emerging public health threat", is to provide high quality complete genome sequences from lyssaviruses. These sequences are analysed in depth with regard to the diversity of the viral populations, covering both quasi-species and so-called defective interfering RNAs. Moreover, the sequence data will facilitate further epidemiological analyses, will provide insight into the evolution of lyssaviruses and will be the basis for the design of novel nucleic acid based diagnostics. The first results presented here indicate that not only can high quality full-length lyssavirus genome sequences be generated, but efficient analysis of the viral population also becomes feasible.
BRepertoire: a user-friendly web server for analysing antibody repertoire data.

PubMed

Margreitter, Christian; Lu, Hui-Chun; Townsend, Catherine; Stewart, Alexander; Dunn-Walters, Deborah K; Fraternali, Franca

2018-04-14

Antibody repertoire analysis by high throughput sequencing is now widely used, but a persisting challenge is enabling immunologists to explore their data to discover discriminating repertoire features for their own particular investigations. Computational methods are necessary for large-scale evaluation of antibody properties. We have developed BRepertoire, a suite of user-friendly web-based software tools for large-scale statistical analyses of repertoire data. The software is able to use data preprocessed by IMGT, and performs statistical and comparative analyses with versatile plotting options. BRepertoire has been designed to operate in various modes, for example analysing sequence-specific V(D)J gene usage, discerning physico-chemical properties of the CDR regions and clustering of clonotypes. Those analyses are performed on the fly by a number of R packages and are deployed by a shiny web platform. The user can download the analysed data in different table formats and save the generated plots as image files ready for publication. We believe BRepertoire to be a versatile analytical tool that complements experimental studies of immune repertoires. To illustrate the server's functionality, we show use cases including differential gene usage in a vaccination dataset and analysis of CDR3H properties in old and young individuals. The server is accessible under http://mabra.biomed.kcl.ac.uk/BRepertoire.
Next-generation sequencing library construction on a surface.

PubMed

Feng, Kuan; Costa, Justin; Edwards, Jeremy S

2018-05-30

Next-generation sequencing (NGS) has revolutionized almost all fields of biology, agriculture and medicine, and is widely utilized to analyse genetic variation. Over the past decade, the NGS pipeline has been steadily improved, and the entire process is currently relatively straightforward. However, NGS instrumentation still requires upfront library preparation, which can be a laborious process, requiring significant hands-on time. Herein, we present a simple but robust approach to streamline library preparation by utilizing surface bound transposases to construct DNA libraries directly on a flowcell surface. The surface bound transposases directly fragment genomic DNA while simultaneously attaching the library molecules to the flowcell. We sequenced and analysed a Drosophila genome library generated by this surface tagmentation approach, and we showed that our surface bound library quality was comparable to the quality of the library from a commercial kit. In addition to the time and cost savings, our approach does not require PCR amplification of the library, which eliminates potential problems associated with PCR duplicates. We described the first study to construct libraries directly on a flowcell. We believe our technique could be incorporated into the existing Illumina sequencing pipeline to simplify the workflow, reduce costs, and improve data quality.
Genomic and probiotic characterization of SJP-SNU strain of Pichia kudriavzevii.

PubMed

Hong, Seung-Min; Kwon, Hyuk-Joon; Park, Se-Joon; Seong, Won-Jin; Kim, Ilhwan; Kim, Jae-Hong

2018-05-17

The yeast strain SJP-SNU was investigated as a probiotic and was characterized with respect to growth temperature, bile salt resistance, hydrogen sulfide reducing activity, intestinal survival ability and chicken embryo pathogenicity. In addition, we determined the complete genomic and mitochondrial sequences of SJP-SNU and conducted comparative genomics analyses. SJP-SNU grew rapidly at 37 °C and formed colonies on MacConkey agar containing bile salt. SJP-SNU reduced hydrogen sulfide produced by Salmonella serotype Enteritidis and, after being fed to 4-week-old chickens, could be isolated from cecal feces. SJP-SNU did not cause mortality in 10-day-old chicken embryos. From 13 initial contigs, 11 were finally assembled and represented 10 chromosomal sequences and 1 mitochondrial DNA sequence. Comparative genomic analyses revealed that SJP-SNU was a strain of Pichia kudriavzevii. Although SJP-SNU possesses pathogenicity-related genes, they showed very low amino acid sequence identities to those of Candida albicans. Furthermore, SJP-SNU possessed useful genes, such as phytases and cellulase. Thus, SJP-SNU is a useful yeast possessing the basic traits of a probiotic, and further studies to demonstrate its efficacy as a probiotic in the future may be warranted.
Sequence change in the HS2-LCR and Ggamma-globin gene promoter region of sickle cell anemia patients.

PubMed

Adorno, E V; Moura-Neto, J P; Lyra, I; Zanette, A; Santos, L F O; Seixas, M O; Reis, M G; Goncalves, M S

2008-02-01

The fetal hemoglobin (HbF) levels and betaS-globin gene haplotypes of 125 sickle cell anemia patients from Brazil were investigated. We sequenced the Ggamma- and Agamma-globin gene promoters and the DNase I-2 hypersensitive sites in the locus control regions (HS2-LCR) of patients with HbF level disparities as compared to their betaS haplotypes. Sixty-four (51.2%) patients had CAR/Ben genotype; 36 (28.8%) Ben/Ben; 18 (14.4%) CAR/CAR; 2 (1.6%) CAR/Atypical; 2 (1.6%) Ben/Cam; 1 (0.8%) CAR/Cam; 1 (0.8%) CAR/Arab-Indian, and 1 (0.8%) Sen/Atypical. The HS2-LCR sequence analyses demonstrated a c.-10.677G>A change in patients with the Ben haplotype and high HbF levels. The Gg gene promoter sequence analyses showed a c.-157T>C substitution shared by all patients, and a c.-222_-225del related to the Cam haplotype. These results identify new polymorphisms in the HS2-LCR and Gg-globin gene promoter. Further studies are required to determine the correlation between HbF synthesis and the clinical profile of sickle cell anemia patients.
CEQer: a graphical tool for copy number and allelic imbalance detection from whole-exome sequencing data.

PubMed

Piazza, Rocco; Magistroni, Vera; Pirola, Alessandra; Redaelli, Sara; Spinelli, Roberta; Redaelli, Serena; Galbiati, Marta; Valletta, Simona; Giudici, Giovanni; Cazzaniga, Giovanni; Gambacorti-Passerini, Carlo

2013-01-01

Copy number alterations (CNA) are common events occurring in leukaemias and solid tumors. Comparative Genome Hybridization (CGH) is currently the gold standard technique to analyze CNAs; however, CGH analysis requires dedicated instruments and is able to perform only low resolution Loss of Heterozygosity (LOH) analyses. Here we present CEQer (Comparative Exome Quantification analyzer), a new graphical, event-driven tool for CNA/allelic-imbalance (AI) coupled analysis of exome sequencing data. By using case-control matched exome data, CEQer performs a comparative digital exonic quantification to generate CNA data and couples this information with exome-wide LOH and allelic imbalance detection. This data is used to build mixed statistical/heuristic models allowing the identification of CNA/AI events. To test our tool, we initially used in silico generated data, then we performed whole-exome sequencing from 20 leukemic specimens and corresponding matched controls and we analyzed the results using CEQer. Taken globally, these analyses showed that the combined use of comparative digital exon quantification and LOH/AI allows generating very accurate CNA data. Therefore, we propose CEQer as an efficient, robust and user-friendly graphical tool for the identification of CNA/AI in the context of whole-exome sequencing data.
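A case-control comparison of per-exon read counts is often summarised as a log2 ratio after normalising for sequencing depth. The sketch below shows that general idea only; it is not CEQer's actual model, and the function name, the pseudocount and the toy counts are assumptions.

```python
import math


def exon_log2_ratios(case_counts, control_counts, pseudocount=0.5):
    """Depth-normalised log2(case / control) read-count ratio per exon.

    Counts are divided by each sample's total so that samples sequenced to
    different depths are comparable. The pseudocount (an assumed value)
    avoids taking the logarithm of zero for uncovered exons.
    """
    if len(case_counts) != len(control_counts):
        raise ValueError("both samples must cover the same exons")
    case_total = sum(case_counts)
    control_total = sum(control_counts)
    ratios = []
    for case, control in zip(case_counts, control_counts):
        case_norm = (case + pseudocount) / case_total
        control_norm = (control + pseudocount) / control_total
        ratios.append(math.log2(case_norm / control_norm))
    return ratios


# Hypothetical read counts for four exons; the third is gained in the case sample.
ratios = exon_log2_ratios([100, 100, 200, 100], [100, 100, 100, 100])
```

Note that normalising by total depth pulls the unchanged exons slightly below zero when another exon is gained, which is one reason real tools use more robust normalisation and combine the ratios with allelic information.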
Complete genome sequence of 285P, a novel T7-like polyvalent E. coli bacteriophage.

PubMed

Xu, Bin; Ma, Xiangyu; Xiong, Hongyan; Li, Yafei

2014-06-01

Bacteriophages are considered potential biological agents for the control of infectious diseases and environmental disinfection. Here, we describe a novel T7-like polyvalent Escherichia coli bacteriophage, designated "285P," which can lyse several strains of E. coli. The genome, which consists of 39,270 base pairs with a G+C content of 48.73 %, was sequenced and annotated. Forty-three potential open reading frames were identified using bioinformatics tools. Based on whole-genome sequence comparison, phage 285P was identified as a novel strain of subgroup T7. It showed strongest sequence similarity to Kluyvera phage Kvp1. The phylogenetic analyses of both non-structural proteins (endonuclease gp3, amidase gp3.5, DNA primase/helicase gp4, DNA polymerase gp5, and exonuclease gp6) and structural protein (tail fiber protein gp17) led to the identification of 285P as T7-like phage. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis and matrix-assisted laser desorption/ionization time-of-flight mass spectrometric analyses verified the annotation of the structural proteins (major capsid protein gp10a, tail protein gp12, and tail fiber protein gp17).
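The G+C content quoted for the genome is the percentage of bases that are G or C. A minimal sketch of that calculation follows; the function name and the decision to ignore ambiguous bases such as N are assumptions.

```python
def gc_content(sequence):
    """Percentage of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical short fragment, not the 285P genome.
example = gc_content("ATGCGCTA")
```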
Dynamically heterogenous partitions and phylogenetic inference: an evaluation of analytical strategies with cytochrome b and ND6 gene sequences in cranes.

PubMed

Krajewski, C; Fain, M G; Buckley, L; King, D G

1999-11-01

Debates over whether molecular sequence data should be partitioned for phylogenetic analysis often confound two types of heterogeneity among partitions. We distinguish historical heterogeneity (i.e., different partitions have different evolutionary relationships) from dynamic heterogeneity (i.e., different partitions show different patterns of sequence evolution) and explore the impact of the latter on phylogenetic accuracy and precision with a two-gene, mitochondrial data set for cranes. The well-established phylogeny of cranes allows us to contrast tree-based estimates of relevant parameter values with estimates based on pairwise comparisons and to ascertain the effects of incorporating different amounts of process information into phylogenetic estimates. We show that codon positions in the cytochrome b and NADH dehydrogenase subunit 6 genes are dynamically heterogenous under both Poisson and invariable-sites + gamma-rates versions of the F84 model and that heterogeneity includes variation in base composition and transition bias as well as substitution rate. Estimates of transition-bias and relative-rate parameters from pairwise sequence comparisons were comparable to those obtained as tree-based maximum likelihood estimates. Neither rate-category nor mixed-model partitioning strategies resulted in a loss of phylogenetic precision relative to unpartitioned analyses. We suggest that weighted-average distances provide a computationally feasible alternative to direct maximum likelihood estimates of phylogeny for mixed-model analyses of large, dynamically heterogenous data sets. Copyright 1999 Academic Press.
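Transition bias in a pairwise comparison starts from counting how many differences are transitions (purine to purine, or pyrimidine to pyrimidine) and how many are transversions. The sketch below produces only these raw counts for two aligned sequences; it does not fit the F84 model used in the study, and the function name and toy sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(a, b):
    """Count transitions and transversions between two aligned DNA sequences.

    Positions with gaps or ambiguous bases in either sequence are ignored.
    """
    transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Hypothetical aligned fragments.
ts, tv = count_ts_tv("AACCGGTT", "GACTGTTA")
```

Counting separately for first, second and third codon positions would give a crude view of the kind of among-partition heterogeneity the abstract describes, although raw counts underestimate change when sites have been hit more than once.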
Mitochondrial genome sequencing helps show the evolutionary mechanism of mitochondrial genome formation in Brassica

PubMed Central

2011-01-01

Background Angiosperm mitochondrial genomes are more complex than those of other organisms. Analyses of the mitochondrial genome sequences of at least 11 angiosperm species have shown several common properties; these cannot easily explain, however, how the diverse mitotypes evolved within each genus or species. We analyzed the evolutionary relationships of Brassica mitotypes by sequencing. Results We sequenced the mitotypes of cam (Brassica rapa), ole (B. oleracea), jun (B. juncea), and car (B. carinata) and analyzed them together with two previously sequenced mitotypes of B. napus (pol and nap). The sizes of whole single circular genomes of cam, jun, ole, and car are 219,747 bp, 219,766 bp, 360,271 bp, and 232,241 bp, respectively. The mitochondrial genome of ole is largest as a result of the duplication of a 141.8 kb segment. The jun mitotype is the result of an inherited cam mitotype, and pol is also derived from the cam mitotype with evolutionary modifications. Genes with known functions are conserved in all mitotypes, but clear variation in open reading frames (ORFs) with unknown functions among the six mitotypes was observed. Sequence relationship analysis showed that there has been genome compaction and inheritance in the course of Brassica mitotype evolution. Conclusions We have sequenced four Brassica mitotypes, compared six Brassica mitotypes and suggested a mechanism for mitochondrial genome formation in Brassica, including evolutionary events such as inheritance, duplication, rearrangement, genome compaction, and mutation. PMID:21988783

Refined NrfA phylogeny improves PCR-based nrfA gene detection

USDA-ARS's Scientific Manuscript database

Dissimilatory nitrate reduction to ammonium (DNRA) promotes N-retention in the terrestrial nitrogen- (N-) cycle. Respiratory nitrite reduction to ammonium is catalyzed by the nitrite reductase NrfA. Prior phylogenetic analyses showed that NrfA divided into 18 distinct clades amongst available sequenc...
A review of bioinformatic methods for forensic DNA analyses.

PubMed

Liu, Yao-Yuan; Harbison, SallyAnn

2018-03-01

Short tandem repeats, single nucleotide polymorphisms, and whole mitochondrial analyses are three classes of markers which will play an important role in the future of forensic DNA typing. The arrival of massively parallel sequencing platforms in forensic science reveals new information such as insights into the complexity and variability of the markers that were previously unseen, along with amounts of data too immense for analyses by manual means. Along with the sequencing chemistries employed, bioinformatic methods are required to process and interpret this new and extensive data. As more is learnt about the use of these new technologies for forensic applications, development and standardization of efficient, favourable tools for each stage of data processing is being carried out, and faster, more accurate methods that improve on the original approaches have been developed. As forensic laboratories search for the optimal pipeline of tools, sequencer manufacturers have incorporated pipelines into sequencer software to make analyses convenient. This review explores the current state of bioinformatic methods and tools used for the analyses of forensic markers sequenced on the massively parallel sequencing (MPS) platforms currently most widely used. Copyright © 2017 Elsevier B.V. All rights reserved.
Obtaining a more resolute teleost growth hormone phylogeny by the introduction of gaps in sequence alignment.

PubMed

Rubin, D A; Dores, R M

1995-06-01

In order to obtain a more resolute phylogeny of teleosts based on growth hormone (GH) sequences, phylogenetic analyses were performed in which deletions (gaps), which appear to be order specific, were upheld to maintain GH's structural information. Sequences were analyzed at 194 amino acid positions. In addition, the two closest genealogically related groups to the teleosts, Amia calva and Acipenser guldenstadti, were used as outgroups. Modified sequence alignments were also analyzed to determine clade stability. Analyses indicated, in the most parsimonious cladogram, that molecular and morphological relationships for the orders of fishes are congruent. With GH molecular sequence data it was possible to resolve all clades at the familial level. Analyses of the primary sequence data indicate that: (a) the halecomorphean and chondrostean GH sequences are the appropriate outgroups for generating the most parsimonious cladogram for teleosts; (b) proper alignment of teleost GH sequence by the inclusion of gaps is necessary for resolution of the Percomorpha; and (c) removal of sequence information by deleting improperly aligned sequence decreases the phylogenetic signal obtained.
Arbuscular mycorrhizal fungi (Glomeromycota) harbour ancient fungal tubulin genes that resemble those of the chytrids (Chytridiomycota).

PubMed

Corradi, Nicolas; Hijri, Mohamed; Fumagalli, Luca; Sanders, Ian R

2004-11-01

The genes encoding alpha- and beta-tubulins have been widely sampled in most major fungal phyla and they are useful tools for fungal phylogeny. Here, we report the first isolation of alpha-tubulin sequences from arbuscular mycorrhizal fungi (AMF). In parallel, AMF beta-tubulins were sampled and analysed to identify the presence of paralogs of this gene. The AMF alpha-tubulin amino acid phylogeny was congruent with the results previously reported for AMF beta-tubulins and showed that AMF tubulins group together at a basal position in the fungal clade and showed high sequence similarities with members of the Chytridiomycota. This is in contrast with phylogenies for other regions of the AMF genome. The amount and nature of substitutions are consistent with an ancient divergence of both orthologs and paralogs of AMF tubulins. At the amino acid level, however, AMF tubulins have hardly evolved from those of the chytrids. This is remarkable given that these two groups are ancient and the monophyletic Glomeromycota probably diverged from basal fungal ancestors at least 500 million years ago. The specific primers we designed for the AMF tubulins, together with the high molecular variation we found among the AMF species we analysed, make AMF tubulin sequences potentially useful for AMF identification purposes.
Toxicity phenotype does not correlate with phylogeny of Cylindrospermopsis raciborskii strains.

PubMed

Stucken, Karina; Murillo, Alejandro A; Soto-Liebe, Katia; Fuentes-Valdés, Juan J; Méndez, Marco A; Vásquez, Mónica

2009-02-01

Cylindrospermopsis raciborskii is a species of freshwater, bloom-forming cyanobacterium. C. raciborskii produces toxins, including cylindrospermopsin (hepatotoxin) and saxitoxin (neurotoxin), although non-toxin-producing strains are also observed. In spite of differences in toxicity, C. raciborskii strains comprise a monophyletic group, based upon 16S rRNA gene sequence identities (greater than 99%). We performed phylogenetic analyses, 16S rRNA gene and 16S-23S rRNA gene internal transcribed spacer (ITS-1) sequence comparisons, and genomic DNA restriction fragment length polymorphism (RFLP) analysis, resolved by pulsed-field gel electrophoresis (PFGE), of strains of C. raciborskii obtained mainly from the Australian phylogeographic cluster. Our results showed no correlation between toxic phenotype and phylogenetic association in the Australian strains. Analyses of the 16S rRNA gene and the respective ITS-1 sequences (long L, and short S) showed an independent evolution of each ribosomal operon. The genes putatively involved in the cylindrospermopsin biosynthetic pathway were present in one locus and only in the hepatotoxic strains, demonstrating a common genomic organization for these genes and the absence of mutated or inactivated biosynthetic genes in the non-toxic strains. In summary, our results support the hypothesis that the genes involved in toxicity may have been transferred as an island by lateral gene transfer, rather than arising through convergent evolution.
Prevalence of Tobacco mosaic virus in Iran and Evolutionary Analyses of the Coat Protein Gene

PubMed Central

Alishiri, Athar; Rakhshandehroo, Farshad; Zamanizadeh, Hamid-Reza; Palukaitis, Peter

2013-01-01

The incidence and distribution of Tobacco mosaic virus (TMV) and related tobamoviruses was determined using an enzyme-linked immunosorbent assay on 1,926 symptomatic horticultural crops and 107 asymptomatic weed samples collected from 78 highly infected fields in the major horticultural crop-producing areas in 17 provinces throughout Iran. The results were confirmed by host range studies and reverse transcription-polymerase chain reaction. The overall incidence of infection by these viruses in symptomatic plants was 11.3%. The coat protein (CP) gene sequences of a number of isolates were determined and showed high identity (up to 100%) among the Iranian isolates. Phylogenetic analysis of all known TMV CP genes showed three clades on the basis of nucleotide sequences with all Iranian isolates distinctly clustered in clade II. Analysis using the complete CP amino acid sequence showed one clade with two subgroups, IA and IB, with Iranian isolates in both subgroups. The nucleotide diversity within each sub-group was very low, but higher between the two clades. No correlation was found between genetic distance and geographical origin or host species of isolation. Statistical analyses suggested negative selection and demonstrated the occurrence of gene flow from the isolates in other clades to the Iranian population. PMID:25288953
Characterization and Evolution of Cell Division and Cell Wall Synthesis Genes in the Bacterial Phyla Verrucomicrobia, Lentisphaerae, Chlamydiae, and Planctomycetes and Phylogenetic Comparison with rRNA Genes

PubMed Central

Pilhofer, Martin; Rappl, Kristina; Eckl, Christina; Bauer, Andreas Peter; Ludwig, Wolfgang; Schleifer, Karl-Heinz; Petroni, Giulio

2008-01-01

In the past, studies on the relationships of the bacterial phyla Planctomycetes, Chlamydiae, Lentisphaerae, and Verrucomicrobia using different phylogenetic markers have been controversial. Investigations based on 16S rRNA sequence analyses suggested a relationship of the four phyla, showing the branching order Planctomycetes, Chlamydiae, Verrucomicrobia/Lentisphaerae. Phylogenetic analyses of 23S rRNA genes in this study also support a monophyletic grouping and their branching order. This grouping is significant for understanding cell division, since the major bacterial cell division protein FtsZ is absent from members of two of the phyla, Chlamydiae and Planctomycetes. In Verrucomicrobia, knowledge about cell division is mainly restricted to the recent report of ftsZ in the closely related genera Prosthecobacter and Verrucomicrobium. In this study, genes of the conserved division and cell wall (dcw) cluster (ddl, ftsQ, ftsA, and ftsZ) were characterized in all verrucomicrobial subdivisions (1 to 4) with cultivable representatives. Sequence analyses and transcriptional analyses in Verrucomicrobia and genome data analyses in Lentisphaerae suggested that cell division is based on FtsZ in all verrucomicrobial subdivisions and possibly also in the sister phylum Lentisphaerae. Comprehensive sequence analyses of available genome data for representatives of Verrucomicrobia, Lentisphaerae, Chlamydiae, and Planctomycetes strongly indicate that their last common ancestor possessed a conserved, ancestral type of dcw gene cluster and an FtsZ-based cell division mechanism. This implies that Planctomycetes and Chlamydiae may have shifted independently to a non-FtsZ-based cell division mechanism after their separate branchings from their last common ancestor with Verrucomicrobia. PMID:18310338
Sequence and expression analyses of porcine ISG15 and ISG43 genes.

PubMed

Huang, Jiangnan; Zhao, Shuhong; Zhu, Mengjin; Wu, Zhenfang; Yu, Mei

2009-08-01

The coding sequences of porcine interferon-stimulated gene 15 (ISG15) and interferon-stimulated gene 43 (ISG43) were cloned from swine spleen mRNA. The amino acid sequences deduced from the porcine ISG15 and ISG43 coding sequences shared 24-75% and 29-83% similarity with ISG15s and ISG43s from other vertebrates, respectively. Structural analyses revealed that porcine ISG15 comprises two ubiquitin homologue (UBQ) domains and a conserved C-terminal LRLRGG conjugating motif. Porcine ISG43 contains a ubiquitin-processing protease-like domain. Phylogenetic analyses showed that porcine ISG15 and ISG43 were most closely related to rat ISG15 and cattle ISG43, respectively. Using a quantitative real-time PCR assay, significantly increased expression levels of porcine ISG15 and ISG43 genes were detected in porcine kidney endothelial (PK15) cells treated with poly I:C. We also observed enhanced mRNA expression of three members of the dsRNA pattern-recognition receptors (PRR), TLR3, DDX58 and IFIH1, which have been reported to act as critical receptors in inducing the mRNA expression of ISG15 and ISG43 genes. However, we did not detect any induced mRNA expression of IFNalpha and IFNbeta, suggesting that transcriptional activation of ISG15 and ISG43 was mediated through an IFN-independent signaling pathway in the poly I:C-treated PK15 cells. Association analyses in a Landrace pig population revealed that the ISG15 c.347T>C (BstUI) polymorphism and the ISG43 c.953T>G (BccI) polymorphism were significantly associated with hematological parameters and immune-related traits.
Bioconductor Workflow for Microbiome Data Analysis: from raw reads to community analyses

PubMed Central

Callahan, Ben J.; Sankaran, Kris; Fukuyama, Julia A.; McMurdie, Paul J.; Holmes, Susan P.

2016-01-01

High-throughput sequencing of PCR-amplified taxonomic markers (like the 16S rRNA gene) has enabled a new level of analysis of complex bacterial communities known as microbiomes. Many tools exist to quantify and compare abundance levels or OTU composition of communities in different conditions. The sequencing reads have to be denoised and assigned to the closest taxa from a reference database. Common approaches use a notion of 97% similarity and normalize the data by subsampling to equalize library sizes. In this paper, we show that statistical models allow more accurate abundance estimates. By providing a complete workflow in R, we enable the user to do sophisticated downstream statistical analyses, whether parametric or nonparametric. We provide examples of using the R packages dada2, phyloseq, DESeq2, ggplot2 and vegan to filter, visualize and test microbiome data. We also provide examples of supervised analyses using random forests and nonparametric testing using community networks and the ggnetwork package. PMID:27508062
Sequence diversity among badnavirus isolates infecting black pepper and related species in India.

PubMed

Bhat, A I; Sasi, Shina; Revathy, K A; Deeshma, K P; Saji, K V

2014-01-01

The badnavirus, piper yellow mottle virus (PYMoV) is known to infect black pepper (Piper nigrum), betelvine (P. betle) and Indian long pepper (P. longum) in India and other parts of the world. Occurrence of PYMoV or other badnaviruses in other species of Piper and its variability is not reported so far. We have analysed sequence variability in the conserved putative reverse transcriptase (RT)/ribonuclease H (RNase H) coding region of the virus using specific badnavirus primers from 13 virus isolates of black pepper collected from different cultivars and regions and one isolate each from 23 other species of Piper. Of these, four species failed to produce expected amplicon while amplicon from four other species showed more similarities to plant sequences than to badnaviruses. Of the remaining, isolates from black pepper, P. argyrophyllum, P. attenuatum, P. barberi, P. betle, P. colubrinum, P. galeatum, P. longum, P. ornatum, P. sarmentosum and P. trichostachyon showed an identity of >85 % at the nucleotide and >90 % at the amino acid level with PYMoV indicating that they are isolates of PYMoV. On the other hand high sequence variability of 21-43 % at nucleotide and 17-46 % at amino acid level compared to PYMoV was found among isolates infecting P. bababudani, P. chaba, P. peepuloides, P. mullesua and P. thomsonii suggesting the presence of new badnaviruses. Phylogenetic analyses showed close clustering of all PYMoV isolates that were well separated from other known badnaviruses. This is the first report of occurrence of PYMoV in eight Piper spp and likely occurrence of four new species in five Piper spp.
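The identity thresholds quoted above (>85 % nucleotide identity with PYMoV) rest on pairwise comparison of aligned sequences. A minimal sketch of such a comparison follows, assuming the two sequences are already aligned to equal length and that gapped columns are skipped. The function name, the gap convention and the toy sequences are illustrative assumptions, not the authors' procedure or data.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical 10-base aligned fragments differing at one position.
reference = "ATGGCCATTG"
isolate = "ATGGCTATTG"
identity = percent_identity(reference, isolate)
print(identity)          # 9 of 10 columns match
print(identity > 85.0)   # would pass an 85 % threshold
```

Real comparisons would first align the sequences with a dedicated alignment tool; how gaps are counted changes the reported percentage, so that choice should be stated alongside any threshold.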
Sequence Analysis of Leuconostoc mesenteroides Bacteriophage Φ1-A4 Isolated from an Industrial Vegetable Fermentation

PubMed Central

Lu, Z.; Altermann, E.; Breidt, F.; Kozyavkin, S.

2010-01-01

Vegetable fermentations rely on the proper succession of a variety of lactic acid bacteria (LAB). Leuconostoc mesenteroides initiates fermentation. As fermentation proceeds, L. mesenteroides dies off and other LAB complete the fermentation. Phages infecting L. mesenteroides may significantly influence the die-off of L. mesenteroides. However, no L. mesenteroides phages have been previously genetically characterized. Knowledge of more phage genome sequences may provide new insights into phage genomics, phage evolution, and phage-host interactions. We have determined the complete genome sequence of L. mesenteroides phage Φ1-A4, isolated from an industrial sauerkraut fermentation. The phage possesses a linear, double-stranded DNA genome consisting of 29,508 bp with a G+C content of 36%. Fifty open reading frames (ORFs) were predicted. Putative functions were assigned to 26 ORFs (52%), including 5 ORFs of structural proteins. The phage genome was modularly organized, containing DNA replication, DNA-packaging, head and tail morphogenesis, cell lysis, and DNA regulation/modification modules. In silico analyses showed that Φ1-A4 is a unique lytic phage with a large-scale genome inversion (∼30% of the genome). The genome inversion encompassed the lysis module, part of the structural protein module, and a cos site. The endolysin gene was flanked by two holin genes. The tail morphogenesis module was interspersed with cell lysis genes and other genes with unknown functions. The predicted amino acid sequences of the phage proteins showed little similarity to other phages, but functional analyses showed that Φ1-A4 clusters with several Lactococcus phages. To our knowledge, Φ1-A4 is the first genetically characterized L. mesenteroides phage. PMID:20118355
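The genome summary above reports a G+C content (36%) and the share of ORFs with a putative function (26 of 50, 52%). As a small illustration of how such figures are computed, here is a sketch that assumes G+C content is taken over unambiguous A/C/G/T bases only. The function name and the toy sequence are hypothetical and not part of the phage data.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


# Hypothetical toy sequence: 1 G and 1 C out of 10 bases.
toy = "ATGCATTAAT"
print(gc_content(toy))  # 0.2

# Share of ORFs with an assigned putative function, from the counts in the abstract.
annotated_fraction = 26 / 50
print(f"{annotated_fraction:.0%}")  # 52%
```

Ambiguity codes such as N are ignored here. Counting them in the denominator instead would lower the reported value, so the convention matters when comparing genomes.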
Life-history, substrate choice and Cytochrome Oxidase I variations in sandy beach peracaridans along the Rio de la Plata estuary

NASA Astrophysics Data System (ADS)

Fanini, L.; Zampicinini, G.; Tsigenopoulos, C. S.; Barboza, F. R.; Lozoya, J. P.; Gómez, J.; Celentano, E.; Lercari, D.; Marchetti, G. M.; Defeo, O.

2017-03-01

Life-history, substrate choice and Cytochrome Oxidase I (COI) sequences were analysed in populations of two peracaridans, the supralittoral talitrid Atlantorchestoidea brasiliensis and the intertidal cirolanid Excirolana armata. Three populations of each species, from beaches with similar grain size and located at different points along the natural gradient generated by the Rio de la Plata estuary were analysed. Abundance of E. armata increased with distance from the estuary, while the opposite trend was observed for A. brasiliensis. The proportion of females decreased towards high salinities for both species, significantly for E. armata. A test on substrate salinity preference revealed the absence of patterns due to active choice in E. armata. By contrast, A. brasiliensis showed no preference for the population closer to the estuary, while individuals from the other two sites significantly preferred high salinity substrates. Mitochondrial COI sequences were obtained from A. brasiliensis specimens tested for behaviour. Sequence analysis showed the population from the intermediate site to differ significantly from the other two. No significant genetic differentiation was instead found between populations from the two most distant sites, nor between individuals that expressed different salinity preference. Results showed that diverse sets of traits at the population level enable sandy beach species to cope with local environmental changes: life-history and behavioural traits appear to change in response to different ecological conditions, and, in the case of A. brasiliensis, independently of the population structure inferred from COI sequence variation. Information from multiple traits allowed detection of population profiles, highlighting the relevance of multidisciplinary information and the concurrent analysis of field data and laboratory experiments, to detect responses of resident biota to environmental changes.
Identification of food and beverage spoilage yeasts from DNA sequence analyses

USDA-ARS's Scientific Manuscript database

Detection, identification, and classification of yeasts has undergone a major transformation in the last decade and a half following application of gene sequence analyses and genome comparisons. Development of a database (barcode) of easily determined DNA sequences from domains 1 and 2 (D1/D2) of th...
Enabling large-scale next-generation sequence assembly with Blacklight

PubMed Central

Couger, M. Brian; Pipes, Lenore; Squina, Fabio; Prade, Rolf; Siepel, Adam; Palermo, Robert; Katze, Michael G.; Mason, Christopher E.; Blood, Philip D.

2014-01-01

Summary A variety of extremely challenging biological sequence analyses were conducted on the XSEDE large shared memory resource Blacklight, using current bioinformatics tools and encompassing a wide range of scientific applications. These include genomic sequence assembly, very large metagenomic sequence assembly, transcriptome assembly, and sequencing error correction. The data sets used in these analyses included uncategorized fungal species, reference microbial data, very large soil and human gut microbiome sequence data, and primate transcriptomes, composed of both short-read and long-read sequence data. A new parallel command execution program was developed on the Blacklight resource to handle some of these analyses. These results, initially reported previously at XSEDE13 and expanded here, represent significant advances for their respective scientific communities. The breadth and depth of the results achieved demonstrate the ease of use, versatility, and unique capabilities of the Blacklight XSEDE resource for scientific analysis of genomic and transcriptomic sequence data, and the power of these resources, together with XSEDE support, in meeting the most challenging scientific problems. PMID:25294974
Sequencing of Dust Filter Production Process Using Design Structure Matrix (DSM)

NASA Astrophysics Data System (ADS)

Sari, R. M.; Matondang, A. R.; Syahputri, K.; Anizar; Siregar, I.; Rizkya, I.; Ursula, C.

2018-01-01

A metal casting company produces machinery spare parts for manufacturers. One of its products is a dust filter, which is used by most palm oil mills. Because the product is so widely used, the company often has problems managing its production. One problem is a disordered production process, which stems from job sequencing: important jobs that should be completed first are carried out last, while less important jobs that could be completed later are carried out first. The Design Structure Matrix (DSM) is used to analyse and determine priorities in the production process. DSM analysis orders the production process through dependency sequencing. The resulting dependency sequence shows the process order according to the inter-process linkages, considering the activities that come before and after each step. Finally, the analysis shows the coupled activities for metal smelting, refining, grinding, cutting container castings, metal expenditure of molds, metal casting, coating processes, and manufacture of molds of sand.
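Dependency sequencing with a DSM can be sketched as ordering activities so that prerequisites come first, while activities that depend on each other in a loop are grouped into one coupled block. The sketch below uses Tarjan's strongly connected components algorithm, which is one common way to do this and is not stated in the abstract as the authors' method. The activity names and dependencies are hypothetical.

```python
def sequence_activities(depends_on):
    """Order activities prerequisites-first; mutually dependent ones share a block.

    depends_on maps each activity to the set of activities it needs as input.
    Returns a list of blocks; a block with more than one member is coupled.
    """
    index, low = {}, {}
    stack, on_stack, blocks = [], set(), []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for pre in sorted(depends_on.get(node, ())):
            if pre not in index:
                visit(pre)
                low[node] = min(low[node], low[pre])
            elif pre in on_stack:
                low[node] = min(low[node], index[pre])
        if low[node] == index[node]:
            # node is the root of a strongly connected component: pop it off.
            block = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                block.append(member)
                if member == node:
                    break
            blocks.append(sorted(block))

    for node in sorted(depends_on):
        if node not in index:
            visit(node)
    return blocks


# Hypothetical dependencies, loosely named after steps in the abstract.
depends_on = {
    "make_mold": set(),
    "smelt": set(),
    "cast": {"make_mold", "smelt"},
    "grind": {"cast", "inspect"},   # grind and inspect form a loop
    "inspect": {"grind"},
    "coat": {"inspect"},
}
for block in sequence_activities(depends_on):
    print(" + ".join(block))
```

Because edges point from an activity to its prerequisites, the components come out with prerequisites first, so the returned list can be read directly as a production sequence. The recursive form is adequate for a handful of activities; a very large matrix would need an iterative version.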
Phylogenomics and taxonomy of Lecomtelleae (Poaceae), an isolated panicoid lineage from Madagascar

PubMed Central

Besnard, Guillaume; Christin, Pascal-Antoine; Malé, Pierre-Jean G.; Coissac, Eric; Ralimanana, Hélène; Vorontsova, Maria S.

2013-01-01

Background and Aims An accurate characterization of biodiversity requires analyses of DNA sequences in addition to classical morphological descriptions. New methods based on high-throughput sequencing may allow investigation of specimens with a large set of genetic markers to infer their evolutionary history. In the grass family, the phylogenetic position of the monotypic genus Lecomtella, a rare bamboo-like endemic from Madagascar, has never been appropriately evaluated. Until now its taxonomic treatment has remained controversial, indicating the need for re-evaluation based on a combination of molecular and morphological data. Methods The phylogenetic position of Lecomtella in Poaceae was evaluated based on sequences from the nuclear and plastid genomes generated by next-generation sequencing (NGS). In addition, a detailed morphological description of L. madagascariensis was produced, and its distribution and habit were investigated in order to assess its conservation status. Key Results The complete plastid sequence, a ribosomal DNA unit and fragments of low-copy nuclear genes (phyB and ppc) were obtained. All phylogenetic analyses place Lecomtella as an isolated member of the core panicoids, which last shared a common ancestor with other species >20 million years ago. Although Lecomtella exhibits morphological characters typical of Panicoideae, an unusual combination of traits supports its treatment as a separate group. Conclusions The study showed that NGS can be used to generate abundant phylogenetic information rapidly, opening new avenues for grass phylogenetics. These data clearly showed that Lecomtella forms an isolated lineage, which, in combination with its morphological peculiarities, justifies its treatment as a separate tribe: Lecomtelleae. New descriptions of the tribe, genus and species are presented with a typification, a distribution map and an IUCN conservation assessment. PMID:23985988
Phylogenomics and taxonomy of Lecomtelleae (Poaceae), an isolated panicoid lineage from Madagascar.

PubMed

Besnard, Guillaume; Christin, Pascal-Antoine; Malé, Pierre-Jean G; Coissac, Eric; Ralimanana, Hélène; Vorontsova, Maria S

2013-10-01

An accurate characterization of biodiversity requires analyses of DNA sequences in addition to classical morphological descriptions. New methods based on high-throughput sequencing may allow investigation of specimens with a large set of genetic markers to infer their evolutionary history. In the grass family, the phylogenetic position of the monotypic genus Lecomtella, a rare bamboo-like endemic from Madagascar, has never been appropriately evaluated. Until now its taxonomic treatment has remained controversial, indicating the need for re-evaluation based on a combination of molecular and morphological data. The phylogenetic position of Lecomtella in Poaceae was evaluated based on sequences from the nuclear and plastid genomes generated by next-generation sequencing (NGS). In addition, a detailed morphological description of L. madagascariensis was produced, and its distribution and habit were investigated in order to assess its conservation status. The complete plastid sequence, a ribosomal DNA unit and fragments of low-copy nuclear genes (phyB and ppc) were obtained. All phylogenetic analyses place Lecomtella as an isolated member of the core panicoids, which last shared a common ancestor with other species >20 million years ago. Although Lecomtella exhibits morphological characters typical of Panicoideae, an unusual combination of traits supports its treatment as a separate group. The study showed that NGS can be used to generate abundant phylogenetic information rapidly, opening new avenues for grass phylogenetics. These data clearly showed that Lecomtella forms an isolated lineage, which, in combination with its morphological peculiarities, justifies its treatment as a separate tribe: Lecomtelleae. New descriptions of the tribe, genus and species are presented with a typification, a distribution map and an IUCN conservation assessment.
Genomic Analyses Yield Markers for Identifying Agronomically Important Genes in Potato

USDA-ARS's Scientific Manuscript database

This study explores the genetic architecture underlying potato evolution through a comprehensive assessment of wild and cultivated potato species based on the re-sequencing of 201 accessions of Solanum section Petota with >12 × genome coverage. We identified 450 domesticated genes, which showed e...
Sexual reproduction as the cause of heat resistance in the food spoilage fungus Byssochlamys spectabilis (anamorph Paecilomyces variotii).

PubMed

Houbraken, Jos; Varga, János; Rico-Munoz, Emilia; Johnson, Shawn; Samson, Robert A

2008-03-01

Paecilomyces variotii is a common cosmopolitan species that is able to spoil various food- and feedstuffs and is frequently encountered in heat-treated products. However, isolates from heat-treated products rarely form ascospores. In this study we examined by using molecular techniques and mating tests whether this species can undergo a sexual cycle and form ascospores. The population structure of this species was examined by analyzing the nuclear ribosomal internal transcribed spacer 1 (ITS1) and ITS2 and the 5.8S rRNA gene, as well as partial beta-tubulin, actin, and calmodulin gene sequences. Phylogenetic analyses revealed that P. variotii is a highly variable species. Partition homogeneity tests revealed that P. variotii has a recombining population structure. In addition to sequence analyses, mating experiments indicated that P. variotii is able to form ascomata and ascospores in culture in a heterothallic manner. The distribution of MAT1-1 and MAT1-2 genes showed a 1:1 ratio in the progeny of the mating experiments. From the sequence analyses and mating data we conclude that P. variotii is the anamorph of Talaromyces spectabilis and that it has a biallelic heterothallic mating system. Since Paecilomyces sensu stricto anamorphs group within Byssochlamys, a new combination Byssochlamys spectabilis is proposed.
Differentiation of Xylella fastidiosa Strains via Multilocus Sequence Analysis of Environmentally Mediated Genes (MLSA-E)

PubMed Central

Parker, Jennifer K.; Havird, Justin C.

2012-01-01

Isolates of the plant pathogen Xylella fastidiosa are genetically very similar, but studies on their biological traits have indicated differences in virulence and infection symptomatology. Taxonomic analyses have identified several subspecies, and phylogenetic analyses of housekeeping genes have shown broad host-based genetic differences; however, results are still inconclusive for genetic differentiation of isolates within subspecies. This study employs multilocus sequence analysis of environmentally mediated genes (MLSA-E; genes influenced by environmental factors) to investigate X. fastidiosa relationships and differentiate isolates with low genetic variability. Potential environmentally mediated genes, including host colonization and survival genes related to infection establishment, were identified a priori. The ratio of the rate of nonsynonymous substitutions to the rate of synonymous substitutions (dN/dS) was calculated to select genes that may be under increased positive selection compared to previously studied housekeeping genes. Nine genes were sequenced from 54 X. fastidiosa isolates infecting different host plants across the United States. Results of maximum likelihood (ML) and Bayesian phylogenetic (BP) analyses are in agreement with known X. fastidiosa subspecies clades but show novel within-subspecies differentiation, including geographic differentiation, and provide additional information regarding host-based isolate variation and specificity. dN/dS ratios of environmentally mediated genes, though <1 due to high sequence similarity, are significantly greater than housekeeping gene dN/dS ratios and correlate with increased sequence variability. MLSA-E can more precisely resolve relationships between closely related bacterial strains with low genetic variability, such as X. fastidiosa isolates. Discovering the genetic relationships between X. fastidiosa isolates will provide new insights into the epidemiology of populations of X. fastidiosa, allowing improved disease management in economically important crops. PMID:22194287
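The dN/dS ratio used above compares the nonsynonymous substitution rate with the synonymous one. The sketch below shows only the final step of a counting-style estimate: converting the observed proportions of differences to distances with the Jukes-Cantor correction and taking their ratio. It assumes the numbers of nonsynonymous and synonymous sites and differences have already been counted from a codon alignment. The input values are hypothetical, and the abstract does not say which estimator the authors used.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance for an observed proportion of differing sites p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ratio of nonsynonymous to synonymous substitution rates (dN/dS)."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0.0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


# Hypothetical counts for one gene compared between two isolates.
ratio = dn_ds(nonsyn_diffs=3, nonsyn_sites=700, syn_diffs=6, syn_sites=300)
print(round(ratio, 3))
```

A ratio below 1, as in this toy input, is conventionally read as purifying selection dominating. With very similar sequences the synonymous count can be zero, which is why the function refuses to divide in that case.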

Differentiation of Xylella fastidiosa strains via multilocus sequence analysis of environmentally mediated genes (MLSA-E).

PubMed

Parker, Jennifer K; Havird, Justin C; De La Fuente, Leonardo

2012-03-01

Isolates of the plant pathogen Xylella fastidiosa are genetically very similar, but studies on their biological traits have indicated differences in virulence and infection symptomatology. Taxonomic analyses have identified several subspecies, and phylogenetic analyses of housekeeping genes have shown broad host-based genetic differences; however, results are still inconclusive for genetic differentiation of isolates within subspecies. This study employs multilocus sequence analysis of environmentally mediated genes (MLSA-E; genes influenced by environmental factors) to investigate X. fastidiosa relationships and differentiate isolates with low genetic variability. Potential environmentally mediated genes, including host colonization and survival genes related to infection establishment, were identified a priori. The ratio of the rate of nonsynonymous substitutions to the rate of synonymous substitutions (dN/dS) was calculated to select genes that may be under increased positive selection compared to previously studied housekeeping genes. Nine genes were sequenced from 54 X. fastidiosa isolates infecting different host plants across the United States. Results of maximum likelihood (ML) and Bayesian phylogenetic (BP) analyses are in agreement with known X. fastidiosa subspecies clades but show novel within-subspecies differentiation, including geographic differentiation, and provide additional information regarding host-based isolate variation and specificity. dN/dS ratios of environmentally mediated genes, though <1 due to high sequence similarity, are significantly greater than housekeeping gene dN/dS ratios and correlate with increased sequence variability. MLSA-E can more precisely resolve relationships between closely related bacterial strains with low genetic variability, such as X. fastidiosa isolates. Discovering the genetic relationships between X. fastidiosa isolates will provide new insights into the epidemiology of populations of X. fastidiosa, allowing improved disease management in economically important crops.
Structural studies of the Sputnik virophage.

PubMed

Sun, Siyang; La Scola, Bernard; Bowman, Valorie D; Ryan, Christopher M; Whitelegge, Julian P; Raoult, Didier; Rossmann, Michael G

2010-01-01

The virophage Sputnik is a satellite virus of the giant mimivirus and is the only satellite virus reported to date whose propagation adversely affects its host virus' production. Genome sequence analysis showed that Sputnik has genes related to viruses infecting all three domains of life. Here, we report structural studies of Sputnik, which show that it is about 740 Å in diameter, has a T=27 icosahedral capsid, and has a lipid membrane inside the protein shell. Structural analyses suggest that the major capsid protein of Sputnik is likely to have a double jelly-roll fold, although sequence alignments do not show any detectable similarity with other viral double jelly-roll capsid proteins. Hence, the origin of Sputnik's capsid might have been derived from other viruses prior to its association with mimivirus.
Structural Studies of the Sputnik Virophage

PubMed Central

Sun, Siyang; La Scola, Bernard; Bowman, Valorie D.; Ryan, Christopher M.; Whitelegge, Julian P.; Raoult, Didier; Rossmann, Michael G.

2010-01-01

The virophage Sputnik is a satellite virus of the giant mimivirus and is the only satellite virus reported to date whose propagation adversely affects its host virus' production. Genome sequence analysis showed that Sputnik has genes related to viruses infecting all three domains of life. Here, we report structural studies of Sputnik, which show that it is about 740 Å in diameter, has a T=27 icosahedral capsid, and has a lipid membrane inside the protein shell. Structural analyses suggest that the major capsid protein of Sputnik is likely to have a double jelly-roll fold, although sequence alignments do not show any detectable similarity with other viral double jelly-roll capsid proteins. Hence, the origin of Sputnik's capsid might have been derived from other viruses prior to its association with mimivirus. PMID:19889775
Analysis of Facultative Lithotroph Distribution and Diversity on Volcanic Deposits by Use of the Large Subunit of Ribulose 1,5-Bisphosphate Carboxylase/Oxygenase

PubMed Central

Nanba, K.; King, G. M.; Dunfield, K.

2004-01-01

A 492- to 495-bp fragment of the gene coding for the large subunit of the form I ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO) (rbcL) was amplified by PCR from facultatively lithotrophic aerobic CO-oxidizing bacteria, colorless and purple sulfide-oxidizing microbial mats, and genomic DNA extracts from tephra and ash deposits from Kilauea volcano, for which atmospheric CO and hydrogen have been previously documented as important substrates. PCR products from the mats and volcanic sites were used to construct rbcL clone libraries. Phylogenetic analyses showed that the rbcL sequences from all isolates clustered with form IC rbcL sequences derived from facultative lithotrophs. In contrast, the microbial mat clone sequences clustered with sequences from obligate lithotrophs representative of form IA rbcL. Clone sequences from volcanic sites fell within the form IC clade, suggesting that these sites were dominated by facultative lithotrophs, an observation consistent with biogeochemical patterns at the sites. Based on phylogenetic and statistical analyses, clone libraries differed significantly among volcanic sites, indicating that they support distinct lithotrophic assemblages. Although some of the clone sequences were similar to known rbcL sequences, most were novel. Based on nucleotide diversity and average pairwise difference, a forested site and an 1894 lava flow were found to support the most diverse and least diverse lithotrophic populations, respectively. These indices of diversity were not correlated with rates of atmospheric CO and hydrogen uptake but were correlated with estimates of respiration and microbial biomass. PMID:15066819
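The two diversity indices named in the abstract above can be illustrated with a small sketch. It assumes the clone sequences are already aligned to equal length and treats every differing column as one difference; the function names and the toy sequences are hypothetical and are not taken from the study, and gap handling is deliberately ignored.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def diversity_indices(sequences):
    """Return (average pairwise difference, nucleotide diversity).

    Nucleotide diversity is taken here as the average pairwise
    difference divided by the alignment length (an assumption of this
    sketch; gapped or ambiguous sites would need extra handling).
    """
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    pairs = list(combinations(sequences, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    avg_diff = total / len(pairs)
    return avg_diff, avg_diff / len(sequences[0])


# Toy alignment (hypothetical data, not from the clone libraries).
clones = ["ACGT", "ACGA", "ACGT"]
avg_diff, pi = diversity_indices(clones)
print(avg_diff, pi)
```

A library with higher values of both indices would be read as the more diverse one, which is how the forested site and the 1894 lava flow are contrasted in the abstract.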
Analysis of facultative lithotroph distribution and diversity on volcanic deposits by use of the large subunit of ribulose 1,5-bisphosphate carboxylase/oxygenase.

PubMed

Nanba, K; King, G M; Dunfield, K

2004-04-01

A 492- to 495-bp fragment of the gene coding for the large subunit of the form I ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO) (rbcL) was amplified by PCR from facultatively lithotrophic aerobic CO-oxidizing bacteria, colorless and purple sulfide-oxidizing microbial mats, and genomic DNA extracts from tephra and ash deposits from Kilauea volcano, for which atmospheric CO and hydrogen have been previously documented as important substrates. PCR products from the mats and volcanic sites were used to construct rbcL clone libraries. Phylogenetic analyses showed that the rbcL sequences from all isolates clustered with form IC rbcL sequences derived from facultative lithotrophs. In contrast, the microbial mat clone sequences clustered with sequences from obligate lithotrophs representative of form IA rbcL. Clone sequences from volcanic sites fell within the form IC clade, suggesting that these sites were dominated by facultative lithotrophs, an observation consistent with biogeochemical patterns at the sites. Based on phylogenetic and statistical analyses, clone libraries differed significantly among volcanic sites, indicating that they support distinct lithotrophic assemblages. Although some of the clone sequences were similar to known rbcL sequences, most were novel. Based on nucleotide diversity and average pairwise difference, a forested site and an 1894 lava flow were found to support the most diverse and least diverse lithotrophic populations, respectively. These indices of diversity were not correlated with rates of atmospheric CO and hydrogen uptake but were correlated with estimates of respiration and microbial biomass.
Molecular characterization of Giardia psittaci by multilocus sequence analysis.

PubMed

Abe, Niichiro; Makino, Ikuko; Kojima, Atsushi

2012-12-01

Multilocus sequence analyses targeting small subunit ribosomal DNA (SSU rDNA), elongation factor 1 alpha (ef1α), glutamate dehydrogenase (gdh), and beta giardin (β-giardin) were performed on Giardia psittaci isolates from three Budgerigars (Melopsittacus undulatus) and four Barred parakeets (Bolborhynchus lineola) kept in individual households or imported from overseas. Nucleotide differences and phylogenetic analyses at four loci indicate the distinction of G. psittaci from the other known Giardia species: Giardia muris, Giardia microti, Giardia ardeae, and Giardia duodenalis assemblages. Furthermore, G. psittaci was related more closely to G. duodenalis than to the other known Giardia species, except for G. microti. Conflicting signals regarded as "double peaks" were found at the same nucleotide positions of the ef1α in all isolates. However, the sequences of the other three loci, including gdh and β-giardin, which are known to be highly variable, from all isolates were also mutually identical at every locus. They showed no double peaks. These results suggest that double peaks found in the ef1α sequences are caused not by mixed infection with genetically different G. psittaci isolates but by allelic sequence heterogeneity (ASH), which is observed in diplomonad lineages including G. duodenalis. No sequence difference was found in any G. psittaci isolates at the gdh and β-giardin, suggesting that G. psittaci is indeed not more diverse genetically than other Giardia species. This report is the first to provide evidence related to the genetic characteristics of G. psittaci obtained using multilocus sequence analysis. Copyright © 2012 Elsevier B.V. All rights reserved.
Highly effective sequencing whole chloroplast genomes of angiosperms by nine novel universal primer pairs.

PubMed

Yang, Jun-Bo; Li, De-Zhu; Li, Hong-Tao

2014-09-01

Chloroplast genomes supply indispensable information that helps improve phylogenetic resolution and can even serve as organelle-scale barcodes. Next-generation sequencing technologies have helped promote sequencing of complete chloroplast genomes, but compared with the number of angiosperms, relatively few chloroplast genomes have been sequenced. There are two major reasons for the paucity of completely sequenced chloroplast genomes: (i) massive amounts of fresh leaves are needed for chloroplast sequencing and (ii) there are considerable gaps in the sequenced chloroplast genomes of many plants because of the difficulty of isolating high-quality chloroplast DNA, preventing complete chloroplast genomes from being assembled. To overcome these obstacles, we analysed all known angiosperm chloroplast genomes available to date and then designed nine universal primer pairs corresponding to the highly conserved regions. Using these primers, angiosperm whole chloroplast genomes can be amplified using long-range PCR and sequenced using next-generation sequencing methods. The primers showed high universality, which was tested using 24 species representing major clades of angiosperms. To validate the functionality of the primers, the complete chloroplast genomes of eight species representing major groups of angiosperms, that is, early-diverging angiosperms, magnoliids, monocots, Saxifragales, fabids, malvids and asterids, were sequenced and assembled. In our trials, only 100 mg of fresh leaves was used. The results show that the universal primer set provided an easy, effective and feasible approach for sequencing whole chloroplast genomes in angiosperms. The designed universal primer pairs make it possible to accelerate genome-scale data acquisition and will therefore improve phylogenetic resolution and species identification in angiosperms. © 2014 John Wiley & Sons Ltd.
Independent assessment and improvement of wheat genome sequence assemblies using Fosill jumping libraries.

PubMed

Lu, Fu-Hao; McKenzie, Neil; Kettleborough, George; Heavens, Darren; Clark, Matthew D; Bevan, Michael W

2018-05-01

The accurate sequencing and assembly of very large, often polyploid, genomes remains a challenging task, limiting long-range sequence information and phased sequence variation for applications such as plant breeding. The 15-Gb hexaploid bread wheat (Triticum aestivum) genome has been particularly challenging to sequence, and several different approaches have recently generated long-range assemblies. Mapping and understanding the types of assembly errors are important for optimising future sequencing and assembly approaches and for comparative genomics. Here we use a Fosill 38-kb jumping library to assess medium and longer-range order of different publicly available wheat genome assemblies. Modifications to the Fosill protocol generated longer Illumina sequences and enabled comprehensive genome coverage. Analyses of two independent Bacterial Artificial Chromosome (BAC)-based chromosome-scale assemblies, two independent Illumina whole genome shotgun assemblies, and a hybrid Single Molecule Real Time (SMRT-PacBio) and short read (Illumina) assembly were carried out. We revealed a surprising scale and variety of discrepancies using Fosill mate-pair mapping and validated several of each class. In addition, Fosill mate-pairs were used to scaffold a whole genome Illumina assembly, leading to a 3-fold increase in N50 values. Our analyses, using an independent means to validate different wheat genome assemblies, show that whole genome shotgun assemblies based solely on Illumina sequences are significantly more accurate by all measures compared to BAC-based chromosome-scale assemblies and hybrid SMRT-Illumina approaches. Although current whole genome assemblies are reasonably accurate and useful, additional improvements will be needed to generate complete assemblies of wheat genomes using open-source, computationally efficient, and cost-effective methods.
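The N50 statistic mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration of the usual definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length); the function name and the toy lengths are hypothetical and not taken from the wheat assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly size is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy illustration (hypothetical numbers): joining six 4-kb contigs into
# two 12-kb scaffolds triples the N50 while the total length is unchanged.
before = [4, 4, 4, 4, 4, 4]
after = [12, 12]
print(n50(before), n50(after))
```

Because N50 rewards joins regardless of whether they are correct, it is usually read alongside an independent accuracy check such as the mate-pair mapping described in the abstract.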
Incorporation of unique molecular identifiers in TruSeq adapters improves the accuracy of quantitative sequencing.

PubMed

Hong, Jungeui; Gresham, David

2017-11-01

Quantitative analysis of next-generation sequencing (NGS) data requires discriminating duplicate reads generated by PCR from identical molecules that are of unique origin. Typically, PCR duplicates are identified as sequence reads that align to the same genomic coordinates using reference-based alignment. However, identical molecules can be independently generated during library preparation. Misidentification of these molecules as PCR duplicates can introduce unforeseen biases during analyses. Here, we developed a cost-effective sequencing adapter design by modifying Illumina TruSeq adapters to incorporate a unique molecular identifier (UMI) while maintaining the capacity to undertake multiplexed, single-index sequencing. Incorporation of UMIs into TruSeq adapters (TrUMIseq adapters) enables identification of bona fide PCR duplicates as identically mapped reads with identical UMIs. Using TrUMIseq adapters, we show that accurate removal of PCR duplicates results in improved accuracy of both allele frequency (AF) estimation in heterogeneous populations using DNA sequencing and gene expression quantification using RNA-Seq.
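The distinction drawn in the abstract above, between coordinate-only duplicate marking and duplicate marking that also requires identical UMIs, can be sketched as follows. The `Read` record, its field names, and the example reads are hypothetical simplifications; a real pipeline would work on alignment records and would typically tolerate sequencing errors in the UMI.

```python
from collections import namedtuple

# Hypothetical minimal read record: mapping coordinates plus the UMI.
Read = namedtuple("Read", ["name", "chrom", "pos", "strand", "umi"])


def mark_duplicates(reads, use_umi=True):
    """Split reads into (unique, duplicates).

    With use_umi=False, reads sharing mapping coordinates are duplicates.
    With use_umi=True, they must also share the same UMI.
    """
    seen = set()
    unique, duplicates = [], []
    for read in reads:
        key = (read.chrom, read.pos, read.strand)
        if use_umi:
            key += (read.umi,)
        if key in seen:
            duplicates.append(read)
        else:
            seen.add(key)
            unique.append(read)
    return unique, duplicates


reads = [
    Read("r1", "chr1", 100, "+", "ACGT"),
    Read("r2", "chr1", 100, "+", "ACGT"),  # same position and UMI: PCR duplicate
    Read("r3", "chr1", 100, "+", "TTGA"),  # same position, different UMI
    Read("r4", "chr2", 50, "-", "ACGT"),
]
unique_umi, dup_umi = mark_duplicates(reads, use_umi=True)
unique_pos, dup_pos = mark_duplicates(reads, use_umi=False)
print([r.name for r in dup_umi], [r.name for r in dup_pos])
```

In this toy case the coordinate-only rule discards `r3` even though its UMI indicates an independent molecule, which is the kind of misidentification the abstract describes.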
Phylogenetic Analysis of Ruminant Theileria spp. from China Based on 28S Ribosomal RNA Gene

PubMed Central

Gou, Huitian; Guan, Guiquan; Ma, Miling; Liu, Aihong; Liu, Zhijie; Xu, Zongke; Ren, Qiaoyun; Li, Youquan; Yang, Jifei; Chen, Ze

2013-01-01

Species identification using DNA sequences is the basis for DNA taxonomy. In this study, we sequenced the large-subunit ribosomal RNA genes (3,037-3,061 bp in length) of 13 Chinese Theileria stocks that were infective to cattle and sheep. The complete 28S rRNA gene is relatively difficult to amplify and its conserved region is not important for phylogenetic study. Therefore, we selected the D2-D3 region from the complete 28S rRNA sequences for phylogenetic analysis. Our analyses of 28S rRNA gene sequences showed that the 28S rRNA was useful as a phylogenetic marker for analyzing the relationships among Theileria spp. in ruminants. In addition, the D2-D3 region was a short segment that could be used instead of the whole 28S rRNA sequence during the phylogenetic analysis of Theileria, and it may be an ideal DNA barcode. PMID:24327775
Phylogenetic analysis of ruminant Theileria spp. from China based on 28S ribosomal RNA gene.

PubMed

Gou, Huitian; Guan, Guiquan; Ma, Miling; Liu, Aihong; Liu, Zhijie; Xu, Zongke; Ren, Qiaoyun; Li, Youquan; Yang, Jifei; Chen, Ze; Yin, Hong; Luo, Jianxun

2013-10-01

Species identification using DNA sequences is the basis for DNA taxonomy. In this study, we sequenced the large-subunit ribosomal RNA genes (3,037-3,061 bp in length) of 13 Chinese Theileria stocks that were infective to cattle and sheep. The complete 28S rRNA gene is relatively difficult to amplify and its conserved region is not important for phylogenetic study. Therefore, we selected the D2-D3 region from the complete 28S rRNA sequences for phylogenetic analysis. Our analyses of 28S rRNA gene sequences showed that the 28S rRNA was useful as a phylogenetic marker for analyzing the relationships among Theileria spp. in ruminants. In addition, the D2-D3 region was a short segment that could be used instead of the whole 28S rRNA sequence during the phylogenetic analysis of Theileria, and it may be an ideal DNA barcode.
Flexible, fast and accurate sequence alignment profiling on GPGPU with PaSWAS.

PubMed

Warris, Sven; Yalcin, Feyruz; Jackson, Katherine J L; Nap, Jan Peter

2015-01-01

To obtain large-scale sequence alignments in a fast and flexible way is an important step in the analyses of next generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis. With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.
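For readers unfamiliar with the Smith-Waterman algorithm named above, a plain CPU sketch is given here. It is not PaSWAS and makes no use of a GPU; it only illustrates local alignment with a traceback that reports the kind of details the abstract lists (score, number of gaps, number of mismatches). The scoring values and the linear gap penalty are assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment of strings a and b with a linear gap penalty.

    Returns the best score, the gap and mismatch counts, and the two
    aligned substrings. Scoring parameters are illustrative defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; cells never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    top, bottom = [], []
    gaps = mismatches = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            if a[i - 1] != b[j - 1]:
                mismatches += 1
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            gaps += 1
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            gaps += 1
            j -= 1

    return {
        "score": best,
        "gaps": gaps,
        "mismatches": mismatches,
        "aligned_a": "".join(reversed(top)),
        "aligned_b": "".join(reversed(bottom)),
    }


print(smith_waterman("TTACGTTT", "GGACGTGG"))
```

The matrix fill is quadratic in the sequence lengths, which is why large read sets motivate parallel implementations such as the one described in the abstract.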
Small studies may overestimate the effect sizes in critical care meta-analyses: a meta-epidemiological study

PubMed Central

2013-01-01

Introduction Small-study effects refer to the fact that trials with limited sample sizes are more likely to report larger beneficial effects than large trials. However, this has never been investigated in critical care medicine. Thus, the present study aimed to examine the presence and extent of small-study effects in critical care medicine. Methods Critical care meta-analyses involving randomized controlled trials and reported mortality as an outcome measure were considered eligible for the study. Component trials were classified as large (≥100 patients per arm) and small (<100 patients per arm) according to their sample sizes. Ratio of odds ratio (ROR) was calculated for each meta-analysis and then RORs were combined using a meta-analytic approach. ROR<1 indicated larger beneficial effect in small trials. Small and large trials were compared in methodological qualities including sequence generating, blinding, allocation concealment, intention to treat and sample size calculation. Results A total of 27 critical care meta-analyses involving 317 trials were included. Of them, five meta-analyses showed statistically significant RORs <1, and other meta-analyses did not reach a statistical significance. Overall, the pooled ROR was 0.60 (95% CI: 0.53 to 0.68); the heterogeneity was moderate with an I² of 50.3% (chi-squared = 52.30; P = 0.002). Large trials showed significantly better reporting quality than small trials in terms of sequence generating, allocation concealment, blinding, intention to treat, sample size calculation and incomplete follow-up data. Conclusions Small trials are more likely to report larger beneficial effects than large trials in critical care medicine, which could be partly explained by the lower methodological quality in small trials. Caution should be practiced in the interpretation of meta-analyses involving small trials. PMID:23302257
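The ratio of odds ratios (ROR) described in the Methods above can be sketched as follows. The sketch assumes the ROR is the odds ratio from small trials divided by the odds ratio from large trials, which matches the statement that ROR<1 indicates a larger beneficial effect in small trials when the outcome is mortality. The abstract does not state which pooling model was used; a fixed-effect inverse-variance pool is shown only because it is the shortest to write. All function names and counts are hypothetical.

```python
import math


def log_odds_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Log odds ratio and its standard error from a 2x2 table.

    Assumes no zero cells; a continuity correction would be needed otherwise.
    """
    a, b = events_trt, n_trt - events_trt
    c, d = events_ctl, n_ctl - events_ctl
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def ratio_of_odds_ratios(small, large):
    """ROR = OR(small trials) / OR(large trials), computed on the log scale.

    `small` and `large` are (log_or, se) tuples.
    """
    (log_small, se_small), (log_large, se_large) = small, large
    log_ror = log_small - log_large
    se = math.sqrt(se_small ** 2 + se_large ** 2)
    return math.exp(log_ror), log_ror, se


def pool_fixed_effect(log_rors, ses):
    """Inverse-variance fixed-effect pool; returns (ROR, CI low, CI high)."""
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * x for w, x in zip(weights, log_rors)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return (
        math.exp(pooled),
        math.exp(pooled - 1.96 * pooled_se),
        math.exp(pooled + 1.96 * pooled_se),
    )


# Hypothetical counts for one meta-analysis: deaths / patients per arm.
small = log_odds_ratio(10, 50, 20, 50)      # small trials, OR = 0.375
large = log_odds_ratio(40, 200, 50, 200)    # large trials, OR = 0.75
ror, log_ror, se_ror = ratio_of_odds_ratios(small, large)
print(round(ror, 3))
```

In this made-up example the ROR is 0.5, meaning the small trials report an odds ratio half that of the large trials.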
Mitochondrial genomes of Meloidogyne chitwoodi and M. incognita (Nematoda: Tylenchina): comparative analysis, gene order and phylogenetic relationships with other nematodes.

PubMed

Humphreys-Pereira, Danny A; Elling, Axel A

2014-01-01

Root-knot nematodes (Meloidogyne spp.) are among the most important plant pathogens. In this study, the mitochondrial (mt) genomes of the root-knot nematodes, M. chitwoodi and M. incognita were sequenced. PCR analyses suggest that both mt genomes are circular, with an estimated size of 19.7 and 18.6-19.1kb, respectively. The mt genomes each contain a large non-coding region with tandem repeats and the control region. The mt gene arrangement of M. chitwoodi and M. incognita is unlike that of other nematodes. Sequence alignments of the two Meloidogyne mt genomes showed three translocations; two in transfer RNAs and one in cox2. Compared with other nematode mt genomes, the gene arrangement of M. chitwoodi and M. incognita was most similar to Pratylenchus vulnus. Phylogenetic analyses (Maximum Likelihood and Bayesian inference) were conducted using 78 complete mt genomes of diverse nematode species. Analyses based on nucleotides and amino acids of the 12 protein-coding mt genes showed strong support for the monophyly of class Chromadorea, but only amino acid-based analyses supported the monophyly of class Enoplea. The suborder Spirurina was not monophyletic in any of the phylogenetic analyses, contradicting the Clade III model, which groups Ascaridomorpha, Spiruromorpha and Oxyuridomorpha based on the small subunit ribosomal RNA gene. Importantly, comparisons of mt gene arrangement and tree-based methods placed Meloidogyne as sister taxa of Pratylenchus, a migratory plant endoparasitic nematode, and not with the sedentary endoparasitic Heterodera. Thus, comparative analyses of mt genomes suggest that sedentary endoparasitism in Meloidogyne and Heterodera is based on convergent evolution. Copyright © 2014 Elsevier B.V. All rights reserved.
A genome-wide screening of BEL-Pao like retrotransposons in Anopheles gambiae by the LTR_STRUC program.

PubMed

Marsano, Renè Massimiliano; Caizzi, Ruggiero

2005-09-12

The advanced status of assembly of the nematoceran Anopheles gambiae genomic sequence allowed us to perform a genome-wide analysis looking for the presence of Long Terminal Repeats (LTRs) in the range of 10 kb by means of the LTR_STRUC tool. More than three hundred sequences were retrieved and 210 were treated as putative complete retrotransposons that were individually analysed with respect to known retrotransposons of A. gambiae and D. melanogaster. The results show that the vast majority of the retrotransposons analysed belong to the Ty3/gypsy class and only 8% to the Ty1/copia class. In addition, phylogenetic analysis allowed us to characterize in more detail the relationship of a large BEL-Pao lineage in which a single family was shown to harbour an additional env gene.
Caught in the middle with multiple displacement amplification: the myth of pooling for avoiding multiple displacement amplification bias in a metagenome.

PubMed

Marine, Rachel; McCarren, Coleen; Vorrasane, Vansay; Nasko, Dan; Crowgey, Erin; Polson, Shawn W; Wommack, K Eric

2014-01-30

Shotgun metagenomics has become an important tool for investigating the ecology of microorganisms. Underlying these investigations is the assumption that metagenome sequence data accurately estimates the census of microbial populations. Multiple displacement amplification (MDA) of microbial community DNA is often used in cases where it is difficult to obtain enough DNA for sequencing; however, MDA can result in amplification biases that may impact subsequent estimates of population census from metagenome data. Some have posited that pooling replicate MDA reactions negates these biases and restores the accuracy of population analyses. This assumption has not been empirically tested. Using mock viral communities, we examined the influence of pooling on population-scale analyses. In pooled and single reaction MDA treatments, sequence coverage of viral populations was highly variable and coverage patterns across viral genomes were nearly identical, indicating that initial priming biases were reproducible and that pooling did not alleviate biases. In contrast, control unamplified sequence libraries showed relatively even coverage across phage genomes. MDA should be avoided for metagenomic investigations that require quantitative estimates of microbial taxa and gene functional groups. While MDA is an indispensable technique in applications such as single-cell genomics, amplification biases cannot be overcome by combining replicate MDA reactions. Alternative library preparation techniques should be utilized for quantitative microbial ecology studies utilizing metagenomic sequencing approaches.
Homology analyses of the protein sequences of fatty acid synthases from chicken liver, rat mammary gland, and yeast

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, Soo-Ik; Hammes, G.G.

1989-11-01

Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homologies exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
kWIP: The k-mer weighted inner product, a de novo estimator of genetic similarity.

PubMed

Murray, Kevin D; Webers, Christfried; Ong, Cheng Soon; Borevitz, Justin; Warthmann, Norman

2017-09-01

Modern genomics techniques generate overwhelming quantities of data. Extracting population genetic variation demands computationally efficient methods to determine genetic relatedness between individuals (or "samples") in an unbiased manner, preferably de novo. Rapid estimation of genetic relatedness directly from sequencing data has the potential to overcome reference genome bias, and to verify that individuals belong to the correct genetic lineage before conclusions are drawn using mislabelled, or misidentified samples. We present the k-mer Weighted Inner Product (kWIP), an assembly-, and alignment-free estimator of genetic similarity. kWIP combines a probabilistic data structure with a novel metric, the weighted inner product (WIP), to efficiently calculate pairwise similarity between sequencing runs from their k-mer counts. It produces a distance matrix, which can then be further analysed and visualised. Our method does not require prior knowledge of the underlying genomes and applications include establishing sample identity and detecting mix-up, non-obvious genomic variation, and population structure. We show that kWIP can reconstruct the true relatedness between samples from simulated populations. By re-analysing several published datasets we show that our results are consistent with marker-based analyses. kWIP is written in C++, licensed under the GNU GPL, and is available from https://github.com/kdmurray91/kwip.
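The idea of a weighted inner product over k-mer counts can be sketched in a simplified form. This is not the kWIP implementation: exact Python counters stand in for the probabilistic data structure, and the weighting (Shannon entropy of each k-mer's presence frequency across samples) and the kernel-to-distance conversion are assumptions of this sketch rather than details confirmed by the abstract. All function names and sequences are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Exact k-mer counts for one sample (a stand-in for a count sketch)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def entropy_weights(samples):
    """Weight each k-mer by the entropy of its presence frequency.

    K-mers present in all samples (or in none) carry no information
    about relatedness and receive weight zero.
    """
    n = len(samples)
    presence = Counter()
    for counts in samples:
        presence.update(counts.keys())
    weights = {}
    for kmer, seen in presence.items():
        p = seen / n
        if 0 < p < 1:
            weights[kmer] = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
        else:
            weights[kmer] = 0.0
    return weights


def weighted_inner_product(a, b, weights):
    """Sum of weight * count_a * count_b over the k-mers shared by a and b."""
    return sum(weights.get(k, 0.0) * a[k] * b[k] for k in a.keys() & b.keys())


def wip_distance(a, b, weights):
    """Distance derived from the normalised weighted inner product."""
    kaa = weighted_inner_product(a, a, weights)
    kbb = weighted_inner_product(b, b, weights)
    if kaa == 0 or kbb == 0:
        raise ValueError("sample has no informative k-mers")
    similarity = weighted_inner_product(a, b, weights) / math.sqrt(kaa * kbb)
    return math.sqrt(max(0.0, 2.0 - 2.0 * similarity))


# Toy sequences (hypothetical): two identical samples and one unrelated one.
samples = [kmer_counts(s, 3) for s in ("ACGTACGT", "ACGTACGT", "TTTTGGGG")]
weights = entropy_weights(samples)
print(wip_distance(samples[0], samples[1], weights),
      wip_distance(samples[0], samples[2], weights))
```

Computing the distance for every pair of samples gives a distance matrix of the kind the abstract says can be analysed and visualised further.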
From algae to angiosperms–inferring the phylogeny of green plants (Viridiplantae) from 360 plastid genomes

PubMed Central

2014-01-01

Background Next-generation sequencing has provided a wealth of plastid genome sequence data from an increasingly diverse set of green plants (Viridiplantae). Although these data have helped resolve the phylogeny of numerous clades (e.g., green algae, angiosperms, and gymnosperms), their utility for inferring relationships across all green plants is uncertain. Viridiplantae originated 700-1500 million years ago and may comprise as many as 500,000 species. This clade represents a major source of photosynthetic carbon and contains an immense diversity of life forms, including some of the smallest and largest eukaryotes. Here we explore the limits and challenges of inferring a comprehensive green plant phylogeny from available complete or nearly complete plastid genome sequence data. Results We assembled protein-coding sequence data for 78 genes from 360 diverse green plant taxa with complete or nearly complete plastid genome sequences available from GenBank. Phylogenetic analyses of the plastid data recovered well-supported backbone relationships and strong support for relationships that were not observed in previous analyses of major subclades within Viridiplantae. However, there also is evidence of systematic error in some analyses. In several instances we obtained strongly supported but conflicting topologies from analyses of nucleotides versus amino acid characters, and the considerable variation in GC content among lineages and within single genomes affected the phylogenetic placement of several taxa. Conclusions Analyses of the plastid sequence data recovered a strongly supported framework of relationships for green plants. 
This framework includes: i) the placement of Zygnematophyceae as sister to land plants (Embryophyta), ii) a clade of extant gymnosperms (Acrogymnospermae) with cycads + Ginkgo sister to remaining extant gymnosperms and with gnetophytes (Gnetophyta) sister to non-Pinaceae conifers (Gnecup trees), and iii) within the monilophyte clade (Monilophyta), Equisetales + Psilotales are sister to Marattiales + leptosporangiate ferns. Our analyses also highlight the challenges of using plastid genome sequences in deep-level phylogenomic analyses, and we provide suggestions for future analyses that will likely incorporate plastid genome sequence data for thousands of species. We particularly emphasize the importance of exploring the effects of different partitioning and character coding strategies. PMID:24533922
The complete genome sequence of a south Indian isolate of Rice tungro spherical virus reveals evidence of genetic recombination between distinct isolates.

PubMed

Sailaja, B; Anjum, Najreen; Patil, Yogesh K; Agarwal, Surekha; Malathi, P; Krishnaveni, D; Balachandran, S M; Viraktamath, B C; Mangrauthia, Satendra K

2013-12-01

In this study, complete genome of a south Indian isolate of Rice tungro spherical virus (RTSV) from Andhra Pradesh (AP) was sequenced, and the predicted amino acid sequence was analysed. The RTSV RNA genome consists of 12,171 nt without the poly(A) tail, encoding a putative typical polyprotein of 3,470 amino acids. Furthermore, cleavage sites and sequence motifs of the polyprotein were predicted. Multiple alignment with other RTSV isolates showed a nucleotide sequence identity of 95% to east Indian isolates and 90% to Philippines isolates. A phylogenetic tree based on complete genome sequence showed that Indian isolates clustered together, while Vt6 and PhilA isolates of Philippines formed two separate clusters. Twelve recombination events were detected in RNA genome of RTSV using the Recombination Detection Program version 3. Recombination analysis suggested significant role of 5' end and central region of genome in virus evolution. Further, AP and Odisha isolates appeared as important RTSV isolates involved in diversification of this virus in India through recombination phenomenon. The new addition of complete genome of first south Indian isolate provided an opportunity to establish the molecular evolution of RTSV through recombination analysis and phylogenetic relationship.

Combined proteomic and molecular approaches for cloning and characterization of copper-zinc superoxide dismutase (Cu, Zn-SOD2) from garlic (Allium sativum).

PubMed

Hadji Sfaxi, Imen; Ezzine, Aymen; Coquet, Laurent; Cosette, Pascal; Jouenne, Thierry; Marzouki, M Nejib

2012-09-01

Superoxide dismutases (SODs; EC 1.15.1.1) are key enzymes in the cell's protection against oxidant agents. Thus, SODs play a major role in the protection of aerobic organisms against oxygen-mediated damage. Three SOD isoforms were previously identified by zymogram staining from Allium sativum bulbs. The purified Cu, Zn-SOD2 shows an antagonist effect to an anticancer drug and alleviates cytotoxicity in the cell lines B16F0 (mouse melanoma cells) and PAE (porcine aortic endothelial cells). To extend the characterization of Allium SODs and their corresponding genes, a proteomic approach was applied involving two-dimensional gel electrophoresis and LC-MS/MS analyses. From peptide sequence data obtained by mass spectrometry and sequence homologies, primers were defined and a cDNA fragment of 456 bp was amplified by RT-PCR. The cDNA nucleotide sequence analysis revealed an open reading frame coding for 152 residues. The deduced amino acid sequence showed high identity (82-87%) with sequences of Cu, Zn-SODs from other plant species. Molecular analysis was achieved by a protein 3D structural model.
Genetic variability of Echinococcus granulosus from the Tibetan plateau inferred by mitochondrial DNA sequences.

PubMed

Yan, Ning; Nie, Hua-Ming; Jiang, Zhong-Rong; Yang, Ai-Guo; Deng, Shi-Jin; Guo, Li; Yu, Hua; Yan, Yu-Bao; Tsering, Dawa; Kong, Wei-Shu; Wang, Ning; Wang, Jia-Hai; Xie, Yue; Fu, Yan; Yang, De-Ying; Wang, Shu-Xian; Gu, Xiao-Bin; Peng, Xue-Rong; Yang, Guang-You

2013-09-01

To analyse genetic variability and population structure, 84 isolates of Echinococcus granulosus (Cestoda: Taeniidae) collected from various host species at different sites of the Tibetan plateau in China were sequenced for the whole mitochondrial nad1 (894 bp) and atp6 (513 bp) genes. The vast majority were classified as G1 genotype (n=82), and two samples from human patients in Sichuan province were identified as G3 genotype. Based on the concatenated sequences of nad1+atp6, 28 different haplotypes (NA1-NA28) were identified. A parsimonious network of the concatenated sequence haplotypes showed star-like features in the overall population, with NA1 as the major haplotype in the population networks. By AMOVA it was shown that variation of E. granulosus within the overall population was the main pattern of the total genetic variability. Neutrality indexes of the concatenated sequence (nad1+atp6) were computed by Tajima's D and Fu's Fs tests and showed high negative values for E. granulosus, indicating significant deviations from neutrality. FST and Nm values suggested that the populations were not genetically differentiated. Copyright © 2013 Elsevier B.V. All rights reserved.
Identification of the first nonsense CDSN mutation with expression of a truncated protein causing peeling skin syndrome type B.

PubMed

Mallet, A; Kypriotou, M; George, K; Leclerc, E; Rivero, D; Mazereeuw-Hautier, J; Serre, G; Huber, M; Jonca, N; Hohl, D

2013-12-01

Peeling skin disease (PSD), a generalized inflammatory form of peeling skin syndrome, is caused by autosomal recessive nonsense mutations in the corneodesmosin gene (CDSN). The objective was to investigate a novel mutation in CDSN. A 50-year-old white woman showed widespread peeling with erythema and elevated serum IgE. DNA sequencing, immunohistochemistry, Western blot and real-time polymerase chain reaction analyses of skin biopsies were performed in order to study the genetics and to characterize the molecular profile of the disease. Histology showed hyperkeratosis and acanthosis of the epidermis, and inflammatory infiltrates in the dermis. DNA sequencing revealed a homozygous mutation leading to a premature termination codon in CDSN: p.Gly142*. Protein analyses showed reduced expression of a 16-kDa corneodesmosin mutant in the upper epidermal layers, whereas the full-length protein was absent. These results are interesting regarding the genotype-phenotype correlations in diseases caused by CDSN mutations. The PSD-causing CDSN mutations identified heretofore result in total corneodesmosin loss, suggesting that PSD is due to full corneodesmosin deficiency. Here, we show for the first time that a mutant corneodesmosin can be stably expressed in some patients with PSD, and that this truncated protein is very probably nonfunctional. © 2013 British Association of Dermatologists.
The Salmonella In Silico Typing Resource (SISTR): An Open Web-Accessible Tool for Rapidly Typing and Subtyping Draft Salmonella Genome Assemblies.

PubMed

Yoshida, Catherine E; Kruczkiewicz, Peter; Laing, Chad R; Lingohr, Erika J; Gannon, Victor P J; Nash, John H E; Taboada, Eduardo N

2016-01-01

For nearly 100 years serotyping has been the gold standard for the identification of Salmonella serovars. Despite the increasing adoption of DNA-based subtyping approaches, serotype information remains a cornerstone in food safety and public health activities aimed at reducing the burden of salmonellosis. At the same time, recent advances in whole-genome sequencing (WGS) promise to revolutionize our ability to perform advanced pathogen characterization in support of improved source attribution and outbreak analysis. We present the Salmonella In Silico Typing Resource (SISTR), a bioinformatics platform for rapidly performing simultaneous in silico analyses for several leading subtyping methods on draft Salmonella genome assemblies. In addition to performing serovar prediction by genoserotyping, this resource integrates sequence-based typing analyses for: Multi-Locus Sequence Typing (MLST), ribosomal MLST (rMLST), and core genome MLST (cgMLST). We show how phylogenetic context from cgMLST analysis can supplement the genoserotyping analysis and increase the accuracy of in silico serovar prediction to over 94.6% on a dataset comprised of 4,188 finished genomes and WGS draft assemblies. In addition to allowing analysis of user-uploaded whole-genome assemblies, the SISTR platform incorporates a database comprising over 4,000 publicly available genomes, allowing users to place their isolates in a broader phylogenetic and epidemiological context. The resource incorporates several metadata driven visualizations to examine the phylogenetic, geospatial and temporal distribution of genome-sequenced isolates. As sequencing of Salmonella isolates at public health laboratories around the world becomes increasingly common, rapid in silico analysis of minimally processed draft genome assemblies provides a powerful approach for molecular epidemiology in support of public health investigations. 
Moreover, this type of integrated analysis using multiple sequence-based methods of sub-typing allows for continuity with historical serotyping data as we transition towards the increasing adoption of genomic analyses in epidemiology. The SISTR platform is freely available on the web at https://lfz.corefacility.ca/sistr-app/.
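The idea of placing an isolate in phylogenetic context from allele-based typing can be sketched as a comparison of allele profiles. This is a minimal illustration, not SISTR's implementation: the locus names, isolate names, allele numbers, and the functions `allele_distance` and `nearest_isolate` are all hypothetical.

```python
def allele_distance(profile_a, profile_b):
    """Fraction of shared, called loci whose allele numbers differ."""
    shared = [
        locus
        for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    if not shared:
        raise ValueError("profiles share no called loci")
    differing = sum(profile_a[locus] != profile_b[locus] for locus in shared)
    return differing / len(shared)


def nearest_isolate(query, database):
    """Name of the database isolate with the smallest allele distance."""
    return min(database, key=lambda name: allele_distance(query, database[name]))


# Hypothetical allele profiles; None marks a locus missing from a draft assembly.
database = {
    "isolate_A": {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": 7},
    "isolate_B": {"locus1": 3, "locus2": 5, "locus3": 2, "locus4": 9},
}
query = {"locus1": 1, "locus2": 4, "locus3": None, "locus4": 8}
print(nearest_isolate(query, database))
```

Skipping uncalled loci rather than counting them as mismatches is one possible design choice for draft assemblies; a real scheme would need a documented policy for missing data.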
Molecular characterization of putative Hepatozoon sp. from the sedge warbler (Acrocephalus schoenobaenus).

PubMed

Biedrzycka, Aleksandra; Kloch, Agnieszka; Migalska, Magdalena; Bielański, Wojciech

2013-05-01

We characterized partial sequences of 18S rDNA from sedge warblers infected with a parasite described previously as Hepatozoon kabeeni. Prevalence was 47% in sampled birds. We detected 3 parasite haplotypes in 62 sequenced samples from infected animals. In phylogenetic analyses, 2 of the putative Hepatozoon haplotypes closely resembled Lankesterella minima and L. valsainensis. The third haplotype grouped in a wider clade composed of Caryospora and Eimeria. None of the haplotypes showed resemblance to sequences of Hepatozoon from reptiles and mammals. Molecular detection results were consistent with those from microscopy of stained blood smears, confirming that the primers indeed amplified the parasite sequences. Here we provide evidence that the avian Hepatozoon-like parasites are most likely Lankesterella, supporting the suggestion that the systematic position of avian Hepatozoon-like species needs to be revised.
Complete genome sequence of Menghai rhabdovirus, a novel mosquito-borne rhabdovirus from China.

PubMed

Sun, Qiang; Zhao, Qiumin; An, Xiaoping; Guo, Xiaofang; Zuo, Shuqing; Zhang, Xianglilan; Pei, Guangqian; Liu, Wenli; Cheng, Shi; Wang, Yunfei; Shu, Peng; Mi, Zhiqiang; Huang, Yong; Zhang, Zhiyi; Tong, Yigang; Zhou, Hongning; Zhang, Jiusong

2017-04-01

Menghai rhabdovirus (MRV) was isolated from Aedes albopictus in Menghai county of Yunnan Province, China, in August 2010. Whole-genome sequencing of MRV was performed using an Ion PGM™ Sequencer. We found that MRV is a single-stranded, negative-sense RNA virus. The complete genome of MRV has 10,744 nt, with short inverted repeat termini, encoding five typical rhabdovirus proteins (N, P, M, G, and L) and an additional small hypothetical protein. Nucleotide BLAST analysis using the BLASTn method showed that the genome sequence most similar to that of MRV is that of Arboretum virus (NC_025393.1), with a Max score of 322, query coverage of 14%, and 66% identity. Genomic and phylogenetic analyses both demonstrated that MRV should be considered a member of a novel species of the family Rhabdoviridae.
Controllability of Deterministic Networks with the Identical Degree Sequence

PubMed Central

Ma, Xiujuan; Zhao, Haixing; Wang, Binghong

2015-01-01

Controlling complex networks is an essential problem in network science and engineering. Recent advances indicate that the controllability of a complex network depends on the network's topology. Liu, Barabási, et al. speculated that the degree distribution was one of the most important factors affecting controllability for arbitrary complex directed networks with random link weights. In this paper, we analysed the effect of degree distribution on the controllability of unweighted and undirected deterministic networks. We introduce a class of deterministic networks with identical degree sequence, called (x,y)-flower. We analysed the controllability of two deterministic networks ((1, 3)-flower and (2, 2)-flower) in detail by exact controllability theory and give accurate results for the minimum number of driver nodes of the two networks. In simulation, we compare the controllability of (x,y)-flower networks. Our results show that the family of (x,y)-flower networks have the same degree sequence, but their controllability is totally different. So the degree distribution itself is not sufficient to characterize the controllability of unweighted and undirected deterministic networks. PMID:26020920
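The claim that an identical degree sequence does not fix controllability can be illustrated on two toy graphs. The sketch assumes the commonly stated result of exact controllability theory for undirected networks, namely that the minimum number of driver nodes equals the largest multiplicity of any eigenvalue of the adjacency matrix. The graphs used here (a 6-cycle and two disjoint triangles) are stand-ins chosen for brevity, not the (x,y)-flowers of the paper, and the Jacobi eigenvalue routine is a simple substitute for a numerical library.

```python
import math


def adjacency(n, edges):
    """Dense adjacency matrix of an unweighted, undirected graph."""
    a = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    return a


def symmetric_eigenvalues(matrix, sweeps=100):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-20:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q (A J)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q (J^T A J)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def min_driver_nodes(matrix, tol=1e-6):
    """Largest eigenvalue multiplicity (assumed exact-controllability result)."""
    eigenvalues = symmetric_eigenvalues(matrix)
    best = run = 1
    for previous, current in zip(eigenvalues, eigenvalues[1:]):
        run = run + 1 if abs(current - previous) < tol else 1
        best = max(best, run)
    return best


def degree_sequence(matrix):
    return sorted(int(sum(row)) for row in matrix)


# Two toy graphs in which every node has degree 2.
cycle6 = adjacency(6, [(i, (i + 1) % 6) for i in range(6)])
two_triangles = adjacency(6, [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)])
print(degree_sequence(cycle6) == degree_sequence(two_triangles))
print(min_driver_nodes(cycle6), min_driver_nodes(two_triangles))
```

Both graphs share the degree sequence (2, 2, 2, 2, 2, 2), yet their adjacency spectra differ in how often eigenvalues repeat, so the driver-node counts differ as well.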
Phylogenetic relationships in Peniocereus (Cactaceae) inferred from plastid DNA sequence data.

PubMed

Arias, Salvador; Terrazas, Teresa; Arreola-Nava, Hilda J; Vázquez-Sánchez, Monserrat; Cameron, Kenneth M

2005-10-01

The phylogenetic relationships of Peniocereus (Cactaceae) species were studied using parsimony analyses of DNA sequence data. The plastid rpl16 and trnL-F regions were sequenced for 98 taxa including 17 species of Peniocereus, representatives from all genera of tribe Pachycereeae, four genera of tribe Hylocereeae, as well as from three additional outgroup genera of tribes Calymmantheae, Notocacteae, and Trichocereeae. Phylogenetic analyses support neither the monophyly of Peniocereus as currently circumscribed, nor the monophyly of tribe Pachycereeae since species of Peniocereus subgenus Pseudoacanthocereus are embedded within tribe Hylocereeae. Furthermore, these results show that the eight species of Peniocereus subgenus Peniocereus (Peniocereus sensu stricto) form a well-supported clade within subtribe Pachycereinae; P. serpentinus is also a member of this subtribe, but is sister to Bergerocactus. Moreover, Nyctocereus should be resurrected as a monotypic genus. Species of Peniocereus subgenus Pseudoacanthocereus are positioned among species of Acanthocereus within tribe Hylocereeae, indicating that they may be better classified within that genus. A number of morphological and anatomical characters, especially related to the presence or absence of dimorphic branches, are discussed to support these relationships.
Complete genome sequence of Bacillus velezensis QST713: A biocontrol agent that protects Agaricus bisporus crops against the green mould disease.

PubMed

Pandin, Caroline; Le Coq, Dominique; Deschamps, Julien; Védie, Régis; Rousseau, Thierry; Aymerich, Stéphane; Briandet, Romain

2018-04-24

Bacillus subtilis QST713 is extensively used as a biological control agent in agricultural fields, including in the culture of the button mushroom, Agaricus bisporus. This last use exploits its inhibitory activity against microbial pathogens such as Trichoderma aggressivum f. europaeum, the main button mushroom green mould competitor. Here, we report the complete genome sequence of this bacterium with a genome size of 4 233 757 bp, 4263 predicted genes and an average GC content of 45.9%. Based on phylogenomic analyses, strain QST713 is finally designated as Bacillus velezensis. Genomic analyses revealed two clusters encoding potential new antimicrobials with NRPS and TransATPKS synthetase. The B. velezensis QST713 genome also harbours several genes previously described as being involved in surface colonization and biofilm formation. This strain shows a strong ability to form in vitro spatially organized biofilm and to antagonize T. aggressivum. The availability of this genome sequence could bring new elements to understand the interactions with micro- and/or macroorganisms in crops. Copyright © 2018 Elsevier B.V. All rights reserved.
Wheat-specific gene, ribosomal protein l21, used as the endogenous reference gene for qualitative and real-time quantitative polymerase chain reaction detection of transgenes.

PubMed

Liu, Yi-Ke; Li, He-Ping; Huang, Tao; Cheng, Wei; Gao, Chun-Sheng; Zuo, Dong-Yun; Zhao, Zheng-Xi; Liao, Yu-Cai

2014-10-29

Wheat-specific ribosomal protein L21 (RPL21) is an endogenous reference gene suitable for genetically modified (GM) wheat identification. This taxon-specific RPL21 sequence displayed high homogeneity in different wheat varieties. Southern blots revealed 1 or 3 copies, and sequence analyses showed one amplicon in common wheat. Combined analyses with sequences from common wheat (AABBDD) and three diploid ancestral species, Triticum urartu (AA), Aegilops speltoides (BB), and Aegilops tauschii (DD), demonstrated the presence of this amplicon in the AA genome. Using conventional qualitative polymerase chain reaction (PCR), the limit of detection was 2 copies of wheat haploid genome per reaction. In the quantitative real-time PCR assay, limits of detection and quantification were about 2 and 8 haploid genome copies, respectively, the latter of which is 2.5-4-fold lower than other reported wheat endogenous reference genes. Construct-specific PCR assays were developed using RPL21 as an endogenous reference gene, and as little as 0.5% of GM wheat contents containing Arabidopsis NPR1 were properly quantified.
Analysis of Claviceps africana and C. sorghi from India using AFLPs, EF-1alpha gene intron 4, and beta-tubulin gene intron 3.

PubMed

Tooley, Paul W; Bandyopadhyay, Ranajit; Carras, Marie M; Pazoutová, Sylvie

2006-04-01

Isolates of Claviceps causing ergot on sorghum in India were analysed by AFLP analysis, and by analysis of DNA sequences of the EF-1alpha gene intron 4 and beta-tubulin gene intron 3 region. Of 89 isolates assayed from six states in India, four were determined to be C. sorghi, and the rest C. africana. A relatively low level of genetic diversity was observed within the Indian C. africana population. No evidence of genetic exchange between C. africana and C. sorghi was observed in either AFLP or DNA sequence analysis. Phylogenetic analysis was conducted using DNA sequences from 14 different Claviceps species. A multigene phylogeny based on the EF-1alpha gene intron 4, the beta-tubulin gene intron 3 region, and rDNA showed that C. sorghi grouped most closely with C. gigantea and C. africana. Although the Claviceps species we analysed were closely related, they colonize hosts that are taxonomically very distinct suggesting that there is no direct coevolution of Claviceps with its hosts.
Listeria booriae sp. nov. and Listeria newyorkensis sp. nov., from food processing environments in the USA.

PubMed

Weller, Daniel; Andrus, Alexis; Wiedmann, Martin; den Bakker, Henk C

2015-01-01

Sampling of seafood and dairy processing facilities in the north-eastern USA produced 18 isolates of Listeria spp. that could not be identified at the species-level using traditional phenotypic and genotypic identification methods. Results of phenotypic and genotypic analyses suggested that the isolates represent two novel species with an average nucleotide blast identity of less than 92% with previously described species of the genus Listeria. Phylogenetic analyses based on whole genome sequences, 16S rRNA gene and sigB gene sequences confirmed that the isolates represented by type strain FSL M6-0635(T) and FSL A5-0209 cluster phylogenetically with Listeria cornellensis. Phylogenetic analyses also showed that the isolates represented by type strain FSL A5-0281(T) cluster phylogenetically with Listeria riparia. The name Listeria booriae sp. nov. is proposed for the species represented by type strain FSL A5-0281(T) ( =DSM 28860(T) =LMG 28311(T)), and the name Listeria newyorkensis sp. nov. is proposed for the species represented by type strain FSL M6-0635(T) ( =DSM 28861(T) =LMG 28310(T)). Phenotypic and genotypic analyses suggest that neither species is pathogenic. © 2015 IUMS.
Interordinal gene capture, the phylogenetic position of Steller's sea cow based on molecular and morphological data, and the macroevolutionary history of Sirenia.

PubMed

Springer, Mark S; Signore, Anthony V; Paijmans, Johanna L A; Vélez-Juarbe, Jorge; Domning, Daryl P; Bauer, Cameron E; He, Kai; Crerar, Lorelei; Campos, Paula F; Murphy, William J; Meredith, Robert W; Gatesy, John; Willerslev, Eske; MacPhee, Ross D E; Hofreiter, Michael; Campbell, Kevin L

2015-10-01

The recently extinct (ca. 1768) Steller's sea cow (Hydrodamalis gigas) was a large, edentulous North Pacific sirenian. The phylogenetic affinities of this taxon to other members of this clade, living and extinct, are uncertain based on previous morphological and molecular studies. We employed hybridization capture methods and second generation sequencing technology to obtain >30kb of exon sequences from 26 nuclear genes for both H. gigas and Dugong dugon. We also obtained complete coding sequences for the tooth-related enamelin (ENAM) gene. Hybridization probes designed using dugong and manatee sequences were both highly effective in retrieving sequences from H. gigas (mean=98.8% coverage), as were more divergent probes for regions of ENAM (99.0% coverage) that were designed exclusively from a proboscidean (African elephant) and a hyracoid (Cape hyrax). New sequences were combined with available sequences for representatives of all other afrotherian orders. We also expanded a previously published morphological matrix for living and fossil Sirenia by adding both new taxa and nine new postcranial characters. Maximum likelihood and parsimony analyses of the molecular data provide robust support for an association of H. gigas and D. dugon to the exclusion of living trichechids (manatees). Parsimony analyses of the morphological data also support the inclusion of H. gigas in Dugongidae with D. dugon and fossil dugongids. Timetree analyses based on calibration density approaches with hard- and soft-bounded constraints suggest that H. gigas and D. dugon diverged in the Oligocene and that crown sirenians last shared a common ancestor in the Eocene. The coding sequence for the ENAM gene in H. gigas does not contain frameshift mutations or stop codons, but there is a transversion mutation (AG to CG) in the acceptor splice site of intron 2. 
This disruption in the edentulous Steller's sea cow is consistent with previous studies that have documented inactivating mutations in tooth-specific loci of a variety of edentulous and enamelless vertebrates including birds, turtles, aardvarks, pangolins, xenarthrans, and baleen whales. Further, branch-site dN/dS analyses provide evidence for positive selection in ENAM on the stem dugongid branch where extensive tooth reduction occurred, followed by neutral evolution on the Hydrodamalis branch. Finally, we present a synthetic evolutionary tree for living and fossil sirenians showing several key innovations in the history of this clade including character state changes that parallel those that occurred in the evolutionary history of cetaceans. Copyright © 2015 Elsevier Inc. All rights reserved.
Leveraging genome-wide datasets to quantify the functional role of the anti-Shine-Dalgarno sequence in regulating translation efficiency.

PubMed

Hockenberry, Adam J; Pah, Adam R; Jewett, Michael C; Amaral, Luís A N

2017-01-01

Studies dating back to the 1970s established that sequence complementarity between the anti-Shine-Dalgarno (aSD) sequence on prokaryotic ribosomes and the 5' untranslated region of mRNAs helps to facilitate translation initiation. The optimal location of aSD sequence binding relative to the start codon, the full extents of the aSD sequence and the functional form of the relationship between aSD sequence complementarity and translation efficiency have not been fully resolved. Here, we investigate these relationships by leveraging the sequence diversity of endogenous genes and recently available genome-wide estimates of translation efficiency. We show that, after accounting for predicted mRNA structure, aSD sequence complementarity increases the translation of endogenous mRNAs by roughly 50%. Further, we observe that this relationship is nonlinear, with translation efficiency maximized for mRNAs with intermediate levels of aSD sequence complementarity. The mechanistic insights that we observe are highly robust: we find nearly identical results in multiple datasets spanning three distantly related bacteria. Further, we verify our main conclusions by re-analysing a controlled experimental dataset. © 2017 The Authors.
DMINDA: an integrated web server for DNA motif identification and analyses

PubMed Central

Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

2014-01-01

DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419
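Function (ii) above, scanning a sequence for instances of a query motif, is commonly done with a position weight matrix. The sketch below shows that general technique under stated assumptions: it is not DMINDA's algorithm, the example binding sites and the sequence are invented, a uniform background of 0.25 per base is assumed, and the names `build_pwm` and `scan` are hypothetical.

```python
from math import log2


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length example sites."""
    pwm = []
    for position in range(len(sites[0])):
        column = [site[position] for site in sites]
        total = len(column) + 4 * pseudocount
        pwm.append(
            {
                base: log2(((column.count(base) + pseudocount) / total) / background)
                for base in "ACGT"
            }
        )
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


# Hypothetical example sites and promoter fragment (A, C, G, T only).
example_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_pwm(example_sites)
hits = scan("GGCGTATAATGCCG", pwm, threshold=5.0)
print(hits)
```

The threshold is arbitrary here; a server such as the one described would instead derive statistical scores from a control set, as the abstract notes for its de novo motif finding.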
Candida mesorugosa sp. nov., a novel yeast species similar to Candida rugosa, isolated from a tertiary hospital in Brazil.

PubMed

Chaves, Guilherme M; Terçarioli, Gisela R; Padovan, Ana Carolina B; Rosas, Robert C; Ferreira, Renata C; Melo, Analy S A; Colombo, Arnaldo L

2013-04-01

Candida rugosa is a yeast species that is emerging as a causative agent of invasive infection, particularly in Latin America. Recently, C. pseudorugosa was proposed as a new species closely related to C. rugosa. We evaluated in this investigation the genetic heterogeneity within the C. rugosa species complex. All clinical isolates used in this study were identified phenotypically as C. rugosa but were genotypically different from the C. rugosa type, ATCC 10571. RAPD marker analysis revealed less than 83% similarity between our clinical isolates and the C. rugosa type strain. The D1/D2 region sequences of our clinical isolates showed 98% identity with C. rugosa but only 94-95% identity with C. pseudorugosa. The ITS rDNA sequences of the Brazilian isolates showed 91% identity with the C. rugosa ATCC 10571 ITS sequence. Network and Bayesian analyses of ITS and housekeeping gene sequences separated our clinical isolates into different branches from C. rugosa type strain. These differences are sufficient to reassign our isolates to a distinct species, named C. mesorugosa.
Complete genome sequences of two highly divergent Japanese isolates of Plantago asiatica mosaic virus.

PubMed

Komatsu, Ken; Yamashita, Kazuo; Sugawara, Kota; Verbeek, Martin; Fujita, Naoko; Hanada, Kaoru; Uehara-Ichiki, Tamaki; Fuji, Shin-Ichi

2017-02-01

Plantago asiatica mosaic virus (PlAMV) is a member of the genus Potexvirus and has an exceptionally wide host range. It causes severe damage to lilies. Here we report on the complete nucleotide sequences of two new Japanese PlAMV isolates, one from the eudicot weed Viola grypoceras (PlAMV-Vi), and the other from the eudicot shrub Nandina domestica Thunb. (PlAMV-NJ). Their genomes contain five open reading frames (ORFs), which is characteristic of potexviruses. Surprisingly, the isolates showed only 76.0-78.0 % sequence identity with each other and with other PlAMV isolates, including isolates from Japanese lily and American nandina. Amino acid alignments of the replicase coding region encoded by ORF1 showed that the regions between the methyltransferase and helicase domains were less conserved than other regions, with several insertions and/or deletions. Phylogenetic analyses of the full-length nucleotide sequences revealed a moderate correlation between phylogenetic clustering and the original host plants of the PlAMV isolates. This study revealed the presence of two highly divergent PlAMV isolates in Japan.
Associations among measures of sequential processing in motor and linguistics tasks in adults with and without a family history of childhood apraxia of speech: a replication study.

PubMed

Button, Le; Peter, Beate; Stoel-Gammon, Carol; Raskind, Wendy H

2013-03-01

The purpose of this study was to address the hypothesis that childhood apraxia of speech (CAS) is influenced by an underlying deficit in sequential processing that is also expressed in other modalities. In a sample of 21 adults from five multigenerational families, 11 with histories of various familial speech sound disorders, 3 biologically related adults from a family with familial CAS showed motor sequencing deficits in an alternating motor speech task. Compared with the other adults, these three participants showed deficits in tasks requiring high loads of sequential processing, including nonword imitation, nonword reading and spelling. Qualitative error analyses in real word and nonword imitations revealed group differences in phoneme sequencing errors. Motor sequencing ability was correlated with phoneme sequencing errors during real word and nonword imitation, reading and spelling. Correlations were characterized by extremely high scores in one family and extremely low scores in another. Results are consistent with a central deficit in sequential processing in CAS of familial origin.
High-resolution sedimentological and subsidence analysis of the Late Neogene, Pannonian Basin, Hungary

USGS Publications Warehouse

Juhasz, E.; Muller, P.; Toth-Makk, A.; Hamor, T.; Farkas-Bulla, J.; Suto-Szentai, M.; Phillips, R.L.; Ricketts, B.

1996-01-01

Detailed sedimentological and paleontological analyses were carried out on more than 13,000 m of core from ten boreholes in the Late Neogene sediments of the Pannonian Basin, Hungary. These data provide the basis for determining the character of high-order depositional cycles and their stacking patterns. In the Late Neogene sediments of the Pannonian Basin there are two third-order sequences: the Late Miocene and the Pliocene ones. The Miocene sequence shows a regressive, upward-coarsening trend. There are four distinguishable sedimentary units in this sequence: the basal transgressive, the lower aggradational, the progradational and the upper aggradational units. The Pliocene sequence is also of aggradational character. The progradation does not coincide in time in the wells within the basin. The character of the relative water-level curves is similar throughout the basin but shows only very faint similarity to the sea-level curve. Therefore, it is unlikely that eustasy played any significant role in the pattern of basin filling. Rather, the dominant controls were the rapidly changing basin subsidence and high sedimentation rates, together with possible climatic factors.
Associations among measures of sequential processing in motor and linguistics tasks in adults with and without a family history of childhood apraxia of speech: A replication study

PubMed Central

BUTTON, LE; PETER, BEATE; STOEL-GAMMON, CAROL; RASKIND, WENDY H.

2013-01-01

The purpose of this study was to address the hypothesis that childhood apraxia of speech (CAS) is influenced by an underlying deficit in sequential processing that is also expressed in other modalities. In a sample of 21 adults from five multigenerational families, 11 with histories of various familial speech sound disorders, 3 biologically related adults from a family with familial CAS showed motor sequencing deficits in an alternating motor speech task. Compared with the other adults, these three participants showed deficits in tasks requiring high loads of sequential processing, including nonword imitation, nonword reading and spelling. Qualitative error analyses in real word and nonword imitations revealed group differences in phoneme sequencing errors. Motor sequencing ability was correlated with phoneme sequencing errors during real word and nonword imitation, reading and spelling. Correlations were characterized by extremely high scores in one family and extremely low scores in another. Results are consistent with a central deficit in sequential processing in CAS of familial origin. PMID:23339292

Targeted sequencing for high-resolution evolutionary analyses following genome duplication in salmonid fish: Proof of concept for key components of the insulin-like growth factor axis.

PubMed

Lappin, Fiona M; Shaw, Rebecca L; Macqueen, Daniel J

2016-12-01

High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth factor (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics. 
We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Exome copy number variation detection: Use of a pool of unrelated healthy tissue as reference sample.

PubMed

Wenric, Stephane; Sticca, Tiberio; Caberg, Jean-Hubert; Josse, Claire; Fasquelle, Corinne; Herens, Christian; Jamar, Mauricette; Max, Stéphanie; Gothot, André; Caers, Jo; Bours, Vincent

2017-01-01

An increasing number of bioinformatic tools designed to detect CNVs (copy number variants) in tumor samples based on paired exome data, where a matched healthy tissue constitutes the reference, have been published in recent years. The idea of using a pool of unrelated healthy DNA as reference has previously been formulated but not thoroughly validated. As of today, the gold standard for CNV calling is still aCGH but there is an increasing interest in detecting CNVs by exome sequencing. We designed a metric allowing the comparison of two CNV profiles, independently of the technique used, and assessed the validity of using a pool of unrelated healthy DNA instead of a matched healthy tissue as reference in exome-based CNV detection. We compared the CNV profiles obtained with three different approaches (aCGH, exome sequencing with a matched healthy tissue as reference, exome sequencing with a pool of eight unrelated healthy tissues as reference) on three multiple myeloma samples. We show that the usual analyses performed to compare CNV profiles (deletion/amplification ratios and CNV size distribution) lack precision when confronted with low LRR values, as they only consider the binary status of each CNV. We show that the metric-based distance constitutes a more accurate comparison of two CNV profiles. Based on these analyses, we conclude that a reliable picture of CNV alterations in multiple myeloma samples can be obtained from whole-exome sequencing in the absence of a matched healthy sample. © 2016 WILEY PERIODICALS, INC.
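To make the pooled-reference idea concrete, the sketch below computes per-target log R ratios (LRR) of a tumor sample against the mean of several unrelated reference samples. It illustrates the general principle only and is not the authors' pipeline or their comparison metric, which the abstract does not define. All depth values are invented, every target is assumed to have non-zero coverage, and the function names are hypothetical.

```python
from math import log2
from statistics import mean


def normalise(depths):
    """Scale per-target read depths so that they sum to 1 for one sample."""
    total = sum(depths)
    return [depth / total for depth in depths]


def log_r_ratios(tumor_depths, pool_depths):
    """log2(tumor / pooled reference) per target, after library-size scaling."""
    tumor = normalise(tumor_depths)
    pool = [normalise(sample) for sample in pool_depths]
    # Pooled reference: mean normalised depth of each target across the pool.
    reference = [mean(target) for target in zip(*pool)]
    return [log2(t / r) for t, r in zip(tumor, reference)]


# Hypothetical depths over four exome targets; target index 2 has lost a copy.
pool = [[100, 100, 100, 100], [210, 190, 200, 200], [48, 52, 50, 50]]
tumor = [100, 100, 50, 100]
lrr = log_r_ratios(tumor, pool)
print([round(value, 2) for value in lrr])
```

Averaging over several unrelated samples smooths sample-specific coverage noise in the reference, which is one plausible reason a pool can stand in for a matched healthy tissue.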
Applying Agrep to r-NSA to solve multiple sequences approximate matching.

PubMed

Ni, Bing; Wong, Man-Hon; Lam, Chi-Fai David; Leung, Kwong-Sak

2014-01-01

This paper addresses the approximate matching problem in a database consisting of multiple DNA sequences, where the proposed approach applies Agrep to a new truncated suffix array, r-NSA. The construction time of the structure is linear in the database size, and indexing a substring in the structure takes constant time. The number of characters processed in applying Agrep is analysed theoretically, and the theoretical upper bound closely approximates the empirical number of characters, which is obtained by enumerating the characters in the actual structure built. Experiments are carried out using (synthetic) random DNA sequences, as well as (real) genome sequences including Hepatitis-B Virus and X-chromosome. Experimental results show that, compared to the straightforward approach that applies Agrep to multiple sequences individually, the proposed approach solves the matching problem in much shorter time. The speed-up of our approach depends on the sequence patterns, and for highly similar homologous genome sequences, which are the common cases in real-life genomes, it can be up to several orders of magnitude.
Characterizing novel endogenous retroviruses from genetic variation inferred from short sequence reads

PubMed Central

Mourier, Tobias; Mollerup, Sarah; Vinner, Lasse; Hansen, Thomas Arn; Kjartansdóttir, Kristín Rós; Guldberg Frøslev, Tobias; Snogdal Boutrup, Torsten; Nielsen, Lars Peter; Willerslev, Eske; Hansen, Anders J.

2015-01-01

From Illumina sequencing of DNA from brain and liver tissue from the lion, Panthera leo, and tumor samples from the pike-perch, Sander lucioperca, we obtained two assembled sequence contigs with similarity to known retroviruses. Phylogenetic analyses suggest that the pike-perch retrovirus belongs to the epsilonretroviruses, and the lion retrovirus to the gammaretroviruses. To determine if these novel retroviral sequences originate from an endogenous retrovirus or from a recently integrated exogenous retrovirus, we assessed the genetic diversity of the parental sequences from which the short Illumina reads are derived. First, we showed by simulations that we can robustly infer the level of genetic diversity from short sequence reads. Second, we find that the measures of nucleotide diversity inferred from our retroviral sequences significantly exceed the level observed from Human Immunodeficiency Virus infections, prompting us to conclude that the novel retroviruses are both of endogenous origin. Through further simulations, we rule out the possibility that the observed elevated levels of nucleotide diversity are the result of co-infection with two closely related exogenous retroviruses. PMID:26493184
Target gene analyses of 39 amelogenesis imperfecta kindreds

PubMed Central

Chan, Hui-Chen; Estrella, Ninna M. R. P.; Milkovich, Rachel N.; Kim, Jung-Wook; Simmer, James P.; Hu, Jan C-C.

2012-01-01

Previously, mutational analyses identified six disease-causing mutations in 24 amelogenesis imperfecta (AI) kindreds. We have since expanded the number of AI kindreds to 39, and performed mutation analyses covering the coding exons and adjoining intron sequences for the six proven AI candidate genes [amelogenin (AMELX), enamelin (ENAM), family with sequence similarity 83, member H (FAM83H), WD repeat containing domain 72 (WDR72), enamelysin (MMP20), and kallikrein-related peptidase 4 (KLK4)] and for ameloblastin (AMBN) (a suspected candidate gene). All four of the X-linked AI families (100%) had disease-causing mutations in AMELX, suggesting that AMELX is the only gene involved in the aetiology of X-linked AI. Eighteen families showed an autosomal-dominant pattern of inheritance. Disease-causing mutations were identified in 12 (67%): eight in FAM83H, and four in ENAM. No FAM83H coding-region or splice-junction mutations were identified in three probands with autosomal-dominant hypocalcification AI (ADHCAI), suggesting that a second gene may contribute to the aetiology of ADHCAI. Six families showed an autosomal-recessive pattern of inheritance, and disease-causing mutations were identified in three (50%): two in MMP20, and one in WDR72. No disease-causing mutations were found in 11 families with only one affected member. We conclude that mutation analyses of the current candidate genes for AI have about a 50% chance of identifying the disease-causing mutation in a given kindred. PMID:22243262
Phylogeny of Castanea (Fagaceae) based on chloroplast trnT-L-F sequence data

Treesearch

Ping Lang; Fenny Dane; Thomas L. Kubisiak

2005-01-01

Species in the genus Castanea are widely distributed in the deciduous forests of the Northern Hemisphere from Asia to Europe and North America. They show floristic similarity but differences in chestnut blight resistance especially among eastern Asian and eastern North American species. Phylogenetic analyses were conducted in this study using...
Phylogenetic Analysis of Klebsiella pneumoniae from Hospitalized Children, Pakistan.

PubMed

Ejaz, Hasan; Wang, Nancy; Wilksch, Jonathan J; Page, Andrew J; Cao, Hanwei; Gujaran, Shruti; Keane, Jacqueline A; Lithgow, Trevor; Ul-Haq, Ikram; Dougan, Gordon; Strugnell, Richard A; Heinz, Eva

2017-11-01

Klebsiella pneumoniae shows increasing emergence of multidrug-resistant lineages, including strains resistant to all available antimicrobial drugs. We conducted whole-genome sequencing of 178 highly drug-resistant isolates from a tertiary hospital in Lahore, Pakistan. Phylogenetic analyses to place these isolates into global context demonstrate the expansion of multiple independent lineages, including K. quasipneumoniae.
New Hepatitis E Virus Genotype in Camels, the Middle East

PubMed Central

Lau, Susanna K.P.; Teng, Jade L.L.; Tsang, Alan K. L.; Joseph, Marina; Wong, Emily Y.M.; Tang, Ying; Sivakumar, Saritha; Xie, Jun; Bai, Ru; Wernery, Renate; Wernery, Ulrich; Yuen, Kwok-Yung

2014-01-01

In a molecular epidemiology study of hepatitis E virus (HEV) in dromedaries in Dubai, United Arab Emirates, HEV was detected in fecal samples from 3 camels. Complete genome sequencing of 2 strains showed >20% overall nucleotide difference to known HEVs. Comparative genomic and phylogenetic analyses revealed a previously unrecognized HEV genotype. PMID:24856611
Determinants of Student Attitudes toward Team Exams

ERIC Educational Resources Information Center

Reinig, Bruce A.; Horowitz, Ira; Whittenburg, Gene

2014-01-01

We examine how student attitudes toward their group, learning method, and perceived development of professional skills are initially shaped and subsequently evolve through multiple uses of team exams. Using a Tobit regression model to analyse a sequence of 10 team quizzes given in a graduate-level tax accounting course, we show that there is an…
Amblyomma imitator Ticks as Vectors of Rickettsia rickettsii, Mexico

PubMed Central

Oliveira, Karla A.; Pinter, Adriano; Medina-Sanchez, Aaron; Boppana, Venkata D.; Wikel, Stephen K.; Saito, Tais B.; Shelite, Thomas; Blanton, Lucas; Popov, Vsevolod; Teel, Pete D.; Walker, David H.; Galvao, Marcio A.M.; Mafra, Claudio

2010-01-01

Real-time PCR of Amblyomma imitator tick egg masses obtained in Nuevo Leon State, Mexico, identified a Rickettsia species. Sequence analyses of 17-kD common antigen and outer membrane protein A and B gene fragments showed it to be R. rickettsii, which suggested a potential new vector for this bacterium. PMID:20678325
Novel ferulate esterase from Gram-positive lactic acid bacteria and analyses of the recombinant enzyme produced in E. coli

USDA-ARS's Scientific Manuscript database

Using a plate containing ethyl ferulate as sole carbon source, various bacterial cultures were screened for ferulate esterase (FAE). Among a dozen species showing positive FAE activity, one Lactobacillus fermentum strain NRRL 1932 demonstrated the strongest activity. Using a published sequence of ferulate ...
Phylogenetic analyses of mtDNA sequences corroborate taxonomic designations based on cuticular hydrocarbons in subterranean termites

Treesearch

Kirsten A. Copren; Lori J. Nelson; Edward L. Vargo; Michael I. Haverty

2005-01-01

Cuticular hydrocarbons (CHCs) are valuable characters for the analysis of cryptic insect species with few discernible morphological characters. Yet, their use in insect systematics, specifically in subterranean termites in the genus Reticulitermes (Isoptera: Rhinotermitidae), remains controversial. In this paper, we show that taxonomic designations...
New insights into replication origin characteristics in metazoans

PubMed Central

Puy, Aurore; Rialle, Stéphanie; Kaplan, Noam; Segal, Eran

2012-01-01

We recently reported the identification and characterization of DNA replication origins (Oris) in metazoan cell lines. Here, we describe additional bioinformatic analyses showing that the previously identified GC-rich sequence elements form origin G-rich repeated elements (OGREs) that are present in 67% to 90% of the DNA replication origins from Drosophila to human cells, respectively. Our analyses also show that initiation of DNA synthesis takes place precisely at 160 bp (Drosophila) and 280 bp (mouse) from the OGRE. We also found that in most CpG islands, an OGRE is positioned in opposite orientation on each of the two DNA strands and detected two sites of initiation of DNA synthesis upstream or downstream of each OGRE. Conversely, Oris not associated with CpG islands have a single initiation site. OGRE density along chromosomes correlated with previously published replication timing data. Ori sequences centered on the OGRE are also predicted to have high intrinsic nucleosome occupancy. Finally, OGREs predict G-quadruplex structures at Oris that might be structural elements controlling the choice or activation of replication origins. PMID:22373526
Analyses of expression and localization of two mammalian-type transglutaminases in Physarum polycephalum, an acellular slime mold.

PubMed

Wada, Fumitaka; Ogawa, Atsuko; Hanai, Yuko; Nakamura, Akio; Maki, Masatoshi; Hitomi, Kiyotaka

2004-11-01

Transglutaminase (TGase) is an enzyme that modifies proteins by crosslinking or polyamination. Physarum polycephalum, an acellular slime mold, is the evolutionarily lowest organism that has a mammalian-type transglutaminase. We have cloned a cDNA for Physarum polycephalum TGase (PpTGB), homologous to a previously identified TGase (PpTGA), whose sequence is similar to that of mammalian TGases. PpTGB encodes a primary sequence identical to that of PpTGA except for 11 amino acid residues at the N-terminus. Reverse transcription-PCR and Western blotting analyses showed that both PpTGA and PpTGB are expressed in microplasmodia and macroplasmodia during their life cycle, but not in sporangia. For biochemical characterization, we carried out ectopic expression of PpTGA and PpTGB in Dictyostelium discoideum. Subcellular fractionation of these Dictyostelium cells showed that the expressed PpTGA, but not PpTGB, localizes to the membrane fraction. Furthermore, in Physarum, subcellular fractionation and immunostaining indicated specific localization at the plasma membrane in macroplasmodia, while the localization was entirely cytoplasmic in microplasmodia.
Molecular epidemiology demonstrated three emerging clusters of human immunodeficiency virus type 1 subtype B infection in Hong Kong.

PubMed

Leung, Tommy W C; Mak, Darwin; Wong, K H; Wang, Y; Song, Y H; Tsang, D N C; Wong, C; Shao, Y M; Lim, W L

2008-07-01

We conducted a molecular epidemiological study on newly diagnosed human immunodeficiency virus type 1 (HIV-1)-infected patients in Hong Kong to identify the epidemiological linkage of HIV-1 infection in the locality. Reverse transcription polymerase chain reaction (RT-PCR) for HIV-1 was performed on newly diagnosed HIV-1-positive sera collected from January 2002 to December 2006. PCR products corresponding to the env C2V3V4 region and gag p17/p24 junction of the HIV-1 genome were nucleotide sequenced. Phylogenetic analyses performed on the acquired nucleotide sequences revealed that CRF01_AE and subtype B were the two dominant HIV-1 subtypes. Analyses also demonstrated the presence of three emerging HIV-1 clusters among the subtype B sequences in Hong Kong. Each cluster possesses a unique cluster-specific amino acid signature for identification. Data show that one of the clusters (Cluster I) is rapidly expanding. In addition to the unique cluster-specific amino acid signature, the majority of sequences in Cluster I harbor a 6-amino acid insertion at the gag p17/p24 junction in a region that is thought to be closely associated with HIV-1 infectivity.
CEQer: A Graphical Tool for Copy Number and Allelic Imbalance Detection from Whole-Exome Sequencing Data

PubMed Central

Piazza, Rocco; Magistroni, Vera; Pirola, Alessandra; Redaelli, Sara; Spinelli, Roberta; Redaelli, Serena; Galbiati, Marta; Valletta, Simona; Giudici, Giovanni; Cazzaniga, Giovanni; Gambacorti-Passerini, Carlo

2013-01-01

Copy number alterations (CNA) are common events occurring in leukaemias and solid tumors. Comparative Genome Hybridization (CGH) is currently the gold standard technique to analyze CNAs; however, CGH analysis requires dedicated instruments and is able to perform only low resolution Loss of Heterozygosity (LOH) analyses. Here we present CEQer (Comparative Exome Quantification analyzer), a new graphical, event-driven tool for CNA/allelic-imbalance (AI) coupled analysis of exome sequencing data. By using case-control matched exome data, CEQer performs a comparative digital exonic quantification to generate CNA data and couples this information with exome-wide LOH and allelic imbalance detection. This data is used to build mixed statistical/heuristic models allowing the identification of CNA/AI events. To test our tool, we initially used in silico generated data, then we performed whole-exome sequencing from 20 leukemic specimens and corresponding matched controls and we analyzed the results using CEQer. Taken together, these analyses showed that the combined use of comparative digital exon quantification and LOH/AI allows the generation of very accurate CNA data. Therefore, we propose CEQer as an efficient, robust and user-friendly graphical tool for the identification of CNA/AI in the context of whole-exome sequencing data. PMID:24124457
Nuclear DNA analyses in genetic studies of populations: practice, problems and prospects.

PubMed

Zhang, De-Xing; Hewitt, Godfrey M

2003-03-01

Population-genetic studies have been remarkably productive and successful in the last decade following the invention of PCR technology and the introduction of mitochondrial and microsatellite DNA markers. While mitochondrial DNA has proven powerful for genealogical and evolutionary studies of animal populations, and microsatellite sequences are the most revealing DNA markers available so far for inferring population structure and dynamics, they both have important and unavoidable limitations. To obtain a fuller picture of the history and evolutionary potential of populations, genealogical data from nuclear loci are essential, and the inclusion of other nuclear markers, i.e. single copy nuclear polymorphic (scnp) sequences, is clearly needed. Four major uncertainties for nuclear DNA analyses of populations have been facing us, i.e. the availability of scnp markers for carrying out such analysis, technical laboratory hurdles for resolving haplotypes, difficulty in data analysis because of recombination, low divergence levels and intraspecific multifurcation evolution, and the utility of scnp markers for addressing population-genetic questions. In this review, we discuss the availability of highly polymorphic single copy DNA in the nuclear genome, describe patterns and rate of evolution of nuclear sequences, summarize past empirical and theoretical efforts to recover and analyse data from scnp markers, and examine the difficulties, challenges and opportunities faced in such studies. We show that although challenges still exist, the above-mentioned obstacles are now being removed. Recent advances in technology and increases in statistical power provide the prospect of nuclear DNA analyses becoming routine practice, allowing allele-discriminating characterization of scnp loci and microsatellite loci. This certainly will increase our ability to address more complex questions, and thereby the sophistication of genetic analyses of populations.
Isolation of Brucella inopinata-Like Bacteria from White's and Denny's Tree Frogs.

PubMed

Kimura, Masanobu; Une, Yumi; Suzuki, Michio; Park, Eun-Sil; Imaoka, Koichi; Morikawa, Shigeru

2017-05-01

Brucella inopinata strain BO1 and Brucella sp. strain BO2, both isolated from human patients, are genetically different from classical Brucella species. We isolated bacteria of the genus Brucella from two species of wild-caught tropical frogs kept in facilities in Japan: White's tree frog, which inhabits Oceania, and Denny's tree frog, which inhabits Southeast Asia. Phylogenetic analyses based on 16S rRNA and recA gene sequences and multilocus sequence analysis revealed that two isolates of Brucella spp. showed significant similarity to BO1, BO2, and the isolates from other wild-caught frogs. These results suggest that a variety of frog species are susceptible to a novel clade of Brucella bacteria, including B. inopinata.
Discordant genetic diversity and geographic patterns between Crassicutis cichlasomae (Digenea: Apocreadiidae) and its cichlid host, "Cichlasoma" urophthalmus (Osteichthyes: Cichlidae), in Middle-America.

PubMed

Razo-Mendivil, Ulises; Vázquez-Domínguez, Ella; de León, Gerardo Pérez-Ponce

2013-12-01

Genetic analyses of hosts and their parasites are key to understanding the evolutionary patterns and processes that have shaped host-parasite associations. We evaluated the genetic structure of the digenean Crassicutis cichlasomae and its most common host, the Mayan cichlid "Cichlasoma" urophthalmus, encompassing most of their geographical range in Middle-America (river basins in southeastern Mexico, Belize, and Guatemala together with the Yucatan Peninsula). Genetic diversity and structure analyses were done based on 167 cytochrome c oxidase subunit 1 sequences (330 bp) for C. cichlasomae from 21 populations and 161 cytochrome b sequences (599 bp) for "C." urophthalmus from 26 populations. Analyses performed included phylogenetic tree estimation under Bayesian inference and maximum likelihood analysis, genetic diversity, distance and structure estimates, haplotype networks, and demographic evaluations. Crassicutis cichlasomae showed high genetic diversity values and genetic structuring, corresponding to 4 clearly differentiated and highly divergent groups. Conversely, "C." urophthalmus showed low levels of genetic diversity and genetic differentiation, defined as 2 groups with low divergence and with no correspondence with geographical distribution. Our results show that species of cichlids parasitized by C. cichlasomae other than "C." urophthalmus, along with multiple colonization events and subsequent isolation in different basins, are likely factors that shaped the genetic structure of the parasite. Meanwhile, historical long-distance dispersal and drought periods during the Holocene, with significant population size reductions and fragmentations, are factors that could have shaped the genetic structure of the Mayan cichlid.
Complete mitochondrial genome sequence of the hedgehog seahorse Hippocampus spinosissimus Weber, 1933 (Gasterosteiformes:Syngnathidae).

PubMed

Liu, Shuaishuai; Zhang, Yanhong; Wang, Changming; Lin, Qiang

2016-07-01

The complete mitochondrial genome sequence of the hedgehog seahorse Hippocampus spinosissimus was determined for the first time in this article. The total length of the H. spinosissimus mitogenome is 16 527 bp, and it consists of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and 1 control region. The gene order and composition of H. spinosissimus were similar to those of most other vertebrates. The overall base composition of H. spinosissimus is 32.1% A, 30.3% T, 14.9% G and 22.7% C, with a slight A + T-rich feature (62.4%). Phylogenetic analyses based on the complete mitochondrial genome sequence showed that H. spinosissimus has a close genetic relationship to H. ingens and H. kuda.

The complete nucleotide sequence of the barley yellow dwarf GPV isolate from China shows that it is a new member of the genus Polerovirus.

PubMed

Zhang, Wenwei; Cheng, Zhuomin; Xu, Lei; Wu, Maosen; Waterhouse, Peter; Zhou, Guanghe; Li, Shifang

2009-01-01

The complete nucleotide sequence of the ssRNA genome of a Chinese GPV isolate of barley yellow dwarf virus (BYDV) was determined. It comprised 5673 nucleotides, and the deduced genome organization resembled that of members of the genus Polerovirus. It was most closely related to cereal yellow dwarf virus-RPV (77% nt identity over the entire genome; coat protein amino acid identity 79%). The GPV isolate also differs in vector specificity from other BYDV strains. Biological properties, phylogenetic analyses and detailed sequence comparisons suggest that GPV should be considered a member of a new species within the genus, and the name Wheat yellow dwarf virus-GPV is proposed.
Multilocus sequence analysis for assessment of phylogenetic diversity and biogeography in Thalassospira bacteria from diverse marine environments.

PubMed

Lai, Qiliang; Liu, Yang; Yuan, Jun; Du, Juan; Wang, Liping; Sun, Fengqin; Shao, Zongze

2014-01-01

Thalassospira bacteria are widespread and have been isolated from various marine environments. Less is known about their genetic diversity and biogeography, as well as their role in marine environments, and many of them cannot be discriminated merely using the 16S rRNA gene. To address these issues, in this report, the phylogenetic analysis of 58 strains from seawater and deep sea sediments was carried out using the multilocus sequence analysis (MLSA) based on acsA, aroE, gyrB, mutL, rpoD and trpB genes, and the DNA-DNA hybridization (DDH) and average nucleotide identity (ANI) based on genome sequences. The MLSA analysis demonstrated that the 58 strains were clearly separated into 15 lineages, corresponding to seven validly described species and eight potential novel species. The DDH and ANI values further confirmed the validity of the MLSA analysis and eight potential novel species. The MLSA interspecies gap of the genus Thalassospira was determined to be 96.16-97.12% sequence identity on the basis of the combined analyses of the DDH and MLSA, while the ANIm interspecies gap was 95.76-97.20% based on the in silico DDH analysis. Meanwhile, phylogenetic analyses showed that the Thalassospira bacteria exhibited a distribution pattern to a certain degree according to geographic regions. Moreover, they clustered together according to habitat depth. In short, the phylogenetic analyses and biogeography of the Thalassospira bacteria were systematically investigated for the first time. These results will be helpful to explore further their ecological role and adaptive evolution in marine environments.
Multilocus Sequence Analysis for Assessment of Phylogenetic Diversity and Biogeography in Thalassospira Bacteria from Diverse Marine Environments

PubMed Central

Yuan, Jun; Du, Juan; Wang, Liping; Sun, Fengqin; Shao, Zongze

2014-01-01

Thalassospira bacteria are widespread and have been isolated from various marine environments. Less is known about their genetic diversity and biogeography, as well as their role in marine environments, and many of them cannot be discriminated merely using the 16S rRNA gene. To address these issues, in this report, the phylogenetic analysis of 58 strains from seawater and deep sea sediments was carried out using the multilocus sequence analysis (MLSA) based on acsA, aroE, gyrB, mutL, rpoD and trpB genes, and the DNA-DNA hybridization (DDH) and average nucleotide identity (ANI) based on genome sequences. The MLSA analysis demonstrated that the 58 strains were clearly separated into 15 lineages, corresponding to seven validly described species and eight potential novel species. The DDH and ANI values further confirmed the validity of the MLSA analysis and eight potential novel species. The MLSA interspecies gap of the genus Thalassospira was determined to be 96.16–97.12% sequence identity on the basis of the combined analyses of the DDH and MLSA, while the ANIm interspecies gap was 95.76–97.20% based on the in silico DDH analysis. Meanwhile, phylogenetic analyses showed that the Thalassospira bacteria exhibited a distribution pattern to a certain degree according to geographic regions. Moreover, they clustered together according to habitat depth. In short, the phylogenetic analyses and biogeography of the Thalassospira bacteria were systematically investigated for the first time. These results will be helpful to explore further their ecological role and adaptive evolution in marine environments. PMID:25198177
Unique Phylogenetic Lineage Found in the Fusarium-like Clade after Re-examining BCCM/IHEM Fungal Culture Collection Material

PubMed Central

De Cremer, Koen; Piérard, Denis; Hendrickx, Marijke

2016-01-01

Recently, the Fusarium genus has been narrowed based upon phylogenetic analyses and a Fusarium-like clade was adopted. The few species of the Fusarium-like clade were moved to new, re-installed or existing genera or provisionally retained as "Fusarium." Only a limited number of reference strains and DNA marker sequences are available for this clade and not much is known about its actual species diversity. Here, we report six strains, preserved by the Belgian fungal culture collection BCCM/IHEM as a Fusarium species, that belong to the Fusarium-like clade. They showed slow growth and produced pionnotes, typical morphological characteristics of many Fusarium-like species. Multilocus sequencing with comparative sequence analyses in GenBank and phylogenetic analyses, using reference sequences of type material, confirmed that they were indeed members of the Fusarium-like clade. One strain was identified as "Fusarium" ciliatum whereas another strain was identified as Fusicolla merismoides. The four remaining strains were shown to represent a unique phylogenetic lineage in the Fusarium-like clade and were also found to be morphologically distinct from other members of the Fusarium-like clade. Based upon phylogenetic considerations, a new genus, Pseudofusicolla gen. nov., and a new species, Pseudofusicolla belgica sp. nov., were installed for this lineage. A formal description is provided in this study. Additional sampling will be required to gather isolates other than the historical strains presented in the present study as well as to further reveal the actual species diversity in the Fusarium-like clade. PMID:27790062
Reprint of "Sequence and phylogenetic analyses of novel totivirus-like double-stranded RNAs from field-collected powdery mildew fungi".

PubMed

Kondo, Hideki; Hisano, Sakae; Chiba, Sotaro; Maruyama, Kazuyuki; Andika, Ida Bagus; Toyoda, Kazuhiro; Fujimori, Fumihiro; Suzuki, Nobuhiro

2016-07-02

The identification of mycoviruses contributes greatly to understanding of the diversity and evolutionary aspects of viruses. Powdery mildew fungi are important and widely studied obligate phytopathogenic agents, but there has been no report on mycoviruses infecting these fungi. In this study, we used a deep sequencing approach to analyze the double-stranded RNA (dsRNA) segments isolated from field-collected samples of powdery mildew fungus-infected red clover plants in Japan. Database searches identified the presence of at least ten totivirus (genus Totivirus)-like sequences, termed red clover powdery mildew-associated totiviruses (RPaTVs). The majority of these sequences shared moderate amino acid sequence identity with each other (<44%) and with other known totiviruses (<59%). Nine of these identified sequences (RPaTV1a, 1b and 2-8) resembled the genome of the prototype totivirus, Saccharomyces cerevisiae virus-L-A (ScV-L-A) in that they contained two overlapping open reading frames (ORFs) encoding a putative coat protein (CP) and an RNA dependent RNA polymerase (RdRp), while one sequence (RPaTV9) showed similarity to another totivirus, Ustilago maydis virus H1 (UmV-H1) that encodes a single polyprotein (CP-RdRp fusion). Similar to yeast totiviruses, each ScV-L-A-like RPaTV contains a -1 ribosomal frameshift site downstream of a predicted pseudoknot structure in the overlapping region of these ORFs, suggesting that the RdRp is translated as a CP-RdRp fusion. Moreover, several ScV-L-A-like sequences were also found by searches of the transcriptome shotgun assembly (TSA) libraries from rust fungi, plants and insects. Phylogenetic analyses show that nine ScV-L-A-like RPaTVs along with ScV-L-A-like sequences derived from TSA libraries are clustered with most established members of the genus Totivirus, while one RPaTV forms a new distinct clade with UmV-H1, possibly establishing an additional genus in the family. Taken together, our results indicate the presence of diverse, novel totiviruses in the powdery mildew fungus populations infecting red clover plants in the field. Copyright © 2015 Elsevier B.V. All rights reserved.
Composition and diversity of nifH genes of nitrogen-fixing cyanobacteria associated with boreal forest feather mosses.

PubMed

Ininbergs, Karolina; Bay, Guillaume; Rasmussen, Ulla; Wardle, David A; Nilsson, Marie-Charlotte

2011-10-01

Recent studies have revealed that nitrogen fixation by cyanobacteria living in association with feather mosses is a major input of nitrogen to boreal forests. We characterized the community composition and diversity of cyanobacterial nifH phylotypes associated with each of two feather moss species (Pleurozium schreberi and Hylocomium splendens) on each of 30 lake islands varying in ecosystem properties in northern Sweden. Nitrogen fixation was measured using acetylene reduction, and nifH sequences were amplified using general and cyanobacterial selective primers, separated and analyzed using density gradient gel electrophoresis (DGGE) or cloning, and further sequenced for phylogenetic analyses. Analyses of DGGE fingerprinting patterns revealed two host-specific clusters (one for each moss species), and sequence analysis showed five clusters of nifH phylotypes originating from heterocystous cyanobacteria. For H. splendens only, N(2) fixation was related to both nifH composition and diversity among islands. We demonstrated that the cyanobacterial communities associated with feather mosses show a high degree of host specificity. However, phylotype composition and diversity, and nitrogen fixation, did not differ among groups of islands that varied greatly in their availability of resources. These results suggest that moss species identity, but not extrinsic environmental conditions, serves as the primary determinant of nitrogen-fixing cyanobacterial communities that inhabit mosses. © 2011 The Authors. New Phytologist © 2011 New Phytologist Trust.
A comparative study of AMF diversity in annual and perennial plant species from semiarid gypsum soils.

NASA Astrophysics Data System (ADS)

Alguacil, M. M.; Torrecillas, E.; Roldán, A.; Díaz, G.; Torres, P.

2012-04-01

The composition of arbuscular mycorrhizal fungi (AMF) communities regulates plant interactions and determines the structure of plant communities. In this study we analysed the diversity of AMF in the roots of two perennial gypsophyte plant species, Herniaria fruticosa and Senecio auricula, and an annual herbaceous species, Bromus rubens, growing in a gypsum soil from a semiarid area. The objective was to determine whether perennial and annual host plants support different AMF communities in their roots and whether there are AMF species that might be indicators of specific functional plant roles in these ecosystems. The roots were analysed by nested PCR, cloning, sequencing of the ribosomal DNA small subunit region and phylogenetic analysis. Twenty AMF sequence types, belonging to the Glomus group A, Glomus group B, Diversisporaceae, Acaulosporaceae, Archaeosporaceae and Paraglomeraceae, were identified. Both gypsophyte perennial species had differing compositions of the AMF community and higher diversity when compared with the annual species, showing preferential selection by specific AMF sequence types. B. rubens did not show host specificity, sharing the full composition of its AMF community with both perennial plant species. Seasonal variations in the competitiveness of AM fungi could explain the observed differences in AMF community composition, but this is still a working hypothesis that requires the analysis of further data obtained from a higher number of both annual and perennial plant species in order to be fully tested.
Genomics of an emerging clone of Salmonella serovar Typhimurium ST313 from Nigeria and the Democratic Republic of Congo.

PubMed

Leekitcharoenphon, Pimlapas; Friis, Carsten; Zankari, Ea; Svendsen, Christina Aaby; Price, Lance B; Rahmani, Maral; Herrero-Fresno, Ana; Fashae, Kayode; Vandenberg, Olivier; Aarestrup, Frank M; Hendriksen, Rene S

2013-10-15

Salmonella enterica serovar Typhimurium ST313 is an invasive and phylogenetically distinct lineage present in sub-Saharan Africa. We report the presence of S. Typhimurium ST313 from patients in the Democratic Republic of Congo and Nigeria. Eighteen S. Typhimurium ST313 isolates were characterized by antimicrobial susceptibility testing, pulsed-field gel electrophoresis (PFGE), and multilocus sequence typing (MLST). Additionally, six of the isolates were characterized by whole genome sequence typing (WGST). The presence of a putative virulence determinant was examined in 177 Salmonella isolates belonging to 57 different serovars. All S. Typhimurium ST313 isolates harbored the resistance genes blaTEM1b, catA1, strA/B, sul1, and dfrA1. Additionally, the aac(6')1aa gene was detected. Phylogenetic analyses revealed close genetic relationships among Congolese and Nigerian isolates from both blood and stool. Comparative genomic analyses identified a putative virulence fragment (ST313-TD) unique to S. Typhimurium ST313 and S. Dublin. We showed in a limited number of isolates that S. Typhimurium ST313 is a prevalent sequence type causing gastrointestinal diseases and septicemia in patients from Nigeria and DRC. We found three distinct phylogenetic clusters based on the origin of isolation, suggesting some spatial evolution. Comparative genomics showed an interesting putative virulence fragment (ST313-TD) unique to S. Typhimurium ST313 and invasive S. Dublin.
Sequences of Normative Evaluation in Two Telecollaboration Projects: A Comparative Study of Multimodal Feedback through Desktop Videoconference

ERIC Educational Resources Information Center

Cappellini, Marco; Azaoui, Brahim

2017-01-01

In our study we analyse how the same interactional dynamic is produced in two different pedagogical settings exploiting a desktop videoconference system. We propose to focus our attention on a specific type of conversational side sequence, known in the Francophone literature as sequences of normative evaluation. More particularly, we analyse data…
msgbsR: An R package for analysing methylation-sensitive restriction enzyme sequencing data.

PubMed

Mayne, Benjamin T; Leemaqz, Shalem Y; Buckberry, Sam; Rodriguez Lopez, Carlos M; Roberts, Claire T; Bianco-Miotto, Tina; Breen, James

2018-02-01

Genotyping-by-sequencing (GBS) or restriction-site associated DNA marker sequencing (RAD-seq) is a practical and cost-effective method for analysing large genomes from high diversity species. This method of sequencing, coupled with methylation-sensitive enzymes (often referred to as methylation-sensitive restriction enzyme sequencing or MRE-seq), is an effective tool to study DNA methylation in parts of the genome that are inaccessible in other sequencing techniques or are not annotated in microarray technologies. Current software tools do not fulfil all methylation-sensitive restriction sequencing assays for determining differences in DNA methylation between samples. To fill this computational need, we present msgbsR, an R package that contains tools for the analysis of methylation-sensitive restriction enzyme sequencing experiments. msgbsR can be used to identify and quantify read counts at methylated sites directly from alignment files (BAM files) and enables verification of restriction enzyme cut sites with the correct recognition sequence of the individual enzyme. In addition, msgbsR assesses DNA methylation based on read coverage, similar to RNA sequencing experiments, rather than methylation proportion and is a useful tool in analysing differential methylation on large populations. The package is fully documented and available freely online as a Bioconductor package ( https://bioconductor.org/packages/release/bioc/html/msgbsR.html ).
Imaging different components of a tectonic tremor sequence in southwestern Japan using an automatic statistical detection and location method

NASA Astrophysics Data System (ADS)

Poiata, Natalia; Vilotte, Jean-Pierre; Bernard, Pascal; Satriano, Claudio; Obara, Kazushige

2018-06-01

In this study, we demonstrate the capability of an automatic network-based detection and location method to extract and analyse different components of tectonic tremor activity by analysing a 9-day energetic tectonic tremor sequence occurring at the downdip extension of the subducting slab in southwestern Japan. The applied method exploits the coherency of multiscale, frequency-selective characteristics of non-stationary signals recorded across the seismic network. Use of different characteristic functions in the signal processing step of the method makes it possible to extract and locate the sources of short-duration impulsive signal transients associated with low-frequency earthquakes and of longer-duration energy transients during the tectonic tremor sequence. Frequency-dependent characteristic functions, based on higher-order statistics' properties of the seismic signals, are used for the detection and location of low-frequency earthquakes. This allows extracting a more complete (~6.5 times more events) and time-resolved catalogue of low-frequency earthquakes than the routine catalogue provided by the Japan Meteorological Agency. As such, this catalogue allows resolving the space-time evolution of the low-frequency earthquake activity in great detail, unravelling spatial and temporal clustering, modulation in response to tide, and different scales of space-time migration patterns. In the second part of the study, the detection and source location of longer-duration signal energy transients within the tectonic tremor sequence is performed using characteristic functions built from smoothed frequency-dependent energy envelopes. This leads to a catalogue of longer-duration energy sources during the tectonic tremor sequence, characterized by their durations and 3-D spatial likelihood maps of the energy-release source regions. The summary 3-D likelihood map for the 9-day tectonic tremor sequence, built from this catalogue, exhibits an along-strike spatial segmentation of the long-duration energy-release regions, matching the large-scale clustering features evidenced from the analysis of low-frequency earthquake activity. Further examination of the two catalogues showed that the extracted short-duration low-frequency earthquake activity coincides in space, within about 10-15 km distance, with the longer-duration energy sources during the tectonic tremor sequence. This observation provides a potential constraint on the size of the longer-duration energy-radiating source region in relation to the clustering of low-frequency earthquake activity during the analysed tectonic tremor sequence. We show that advanced statistical network-based methods offer new capabilities for automatic high-resolution detection, location and monitoring of different scale-components of tectonic tremor activity, enriching existing slow earthquake catalogues. Systematic application of such methods to large continuous data sets will allow imaging the slow transient seismic energy-release activity at higher resolution, and therefore provide new insights into the underlying multiscale mechanisms of slow earthquake generation.
[Complete genome sequencing and analyses of rabies viruses isolated from wild animals (Chinese Ferret-Badger) in Zhejiang province].

PubMed

Lei, Yong-Liang; Wang, Xiao-Guang; Liu, Fu-Ming; Chen, Xiu-Ying; Ye, Bi-Feng; Mei, Jian-Hua; Lan, Jin-Quan; Tang, Qing

2009-08-01

By sequencing the full-length genomes of two rabies viruses from Chinese ferret-badgers, we analyzed the genetic variation of rabies viruses at the molecular level to obtain information on the prevalence and variation of rabies viruses in Zhejiang, and to enrich the genome database of rabies virus street strains isolated from Chinese wildlife. Overlapping fragments were amplified by RT-PCR and full-length genomes were assembled to analyze nucleotide and deduced protein similarities and to carry out phylogenetic analyses of the N genes from Chinese ferret-badger, sika deer, vole, dog, and vaccine strains. Both full-length genomes were completely sequenced and had the same genetic structure of 11,923 nt, including a 58-nt leader, 1353-nt NP, 894-nt PP, 609-nt MP, 1575-nt GP, 6386-nt LP, intergenic regions (IGRs) of 2, 5, and 5 nt, a 423-nt pseudogene-like sequence (Psi), and a 70-nt trailer. BLAST and multiple sequence alignment showed that the two genomes were consistent with the properties of the genus Lyssavirus, family Rhabdoviridae. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. For the two full-length genomes, similarity at the amino acid level was markedly higher than at the nucleotide level, so the nucleotide mutations in these two genomes were most probably synonymous. Compared with the reference rabies viruses, the five protein-coding regions showed no changes in length and no recombination, only a few point mutations. The five proteins therefore appeared to be stable. The sites and types of variation in the two ferret-badger genomes were similar to those of the reference vaccine or street strains. Multiple sequence alignment and phylogenetic analyses placed the two strains in genotype 1, with the distinct geographic characteristics of China. All the evidence suggested that these two ferret-badger rabies viruses were likely to be street viruses already circulating in wildlife.
Parallel computation of genome-scale RNA secondary structure to detect structural constraints on human genome.

PubMed

Kawaguchi, Risa; Kiryu, Hisanori

2016-05-06

RNA secondary structure around splice sites is known to assist normal splicing by promoting spliceosome recognition. However, analyzing the structural properties of entire intronic regions or pre-mRNA sequences has hitherto been difficult, owing to serious experimental and computational limitations, such as low read coverage and numerical problems. Our novel software, "ParasoR", is designed to run on a computer cluster and enables the exact computation of various structural features of long RNA sequences under the constraint of maximal base-pairing distance. ParasoR divides dynamic programming (DP) matrices into smaller pieces, such that each piece can be computed by a separate computer node without losing the connectivity information between the pieces. ParasoR directly computes the ratios of DP variables to avoid the reduction of numerical precision caused by the cancellation of a large number of Boltzmann factors. The structural preferences of mRNAs computed by ParasoR show a high concordance with those determined by high-throughput sequencing analyses. Using ParasoR, we investigated the global structural preferences of transcribed regions in the human genome. A genome-wide folding simulation indicated that transcribed regions are significantly more structural than intergenic regions after removing repeat sequences and k-mer frequency bias. In particular, we observed a highly significant preference for base pairing over entire intronic regions as compared to their antisense sequences, as well as to intergenic regions. A comparison between pre-mRNAs and mRNAs showed that coding regions become more accessible after splicing, indicating constraints for translational efficiency. Such changes are correlated with gene expression levels, as well as GC content, and are enriched among genes associated with cytoskeleton and kinase functions. We have shown that ParasoR is very useful for analyzing the structural properties of long RNA sequences such as mRNAs, pre-mRNAs, and long non-coding RNAs whose lengths can be more than a million bases in the human genome. In our analyses, transcribed regions including introns are indicated to be subject to various types of structural constraints that cannot be explained by simple sequence composition biases. ParasoR is freely available at https://github.com/carushi/ParasoR .
Improved PCR-Based Detection of Soil Transmitted Helminth Infections Using a Next-Generation Sequencing Approach to Assay Design.

PubMed

Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S; Williams, Steven A

2016-03-01

The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.
Comparison of Microbiomes between Red Poultry Mite Populations (Dermanyssus gallinae): Predominance of Bartonella-like Bacteria.

PubMed

Hubert, Jan; Erban, Tomas; Kopecky, Jan; Sopko, Bruno; Nesvorna, Marta; Lichovnikova, Martina; Schicht, Sabine; Strube, Christina; Sparagano, Olivier

2017-11-01

Blood-feeding red poultry mites (RPM) serve as vectors of pathogenic bacteria and viruses among vertebrate hosts including wild birds, poultry hens, mammals, and humans. The microbiome of RPM has not yet been studied by high-throughput sequencing. RPM eggs, larvae, and engorged adult/nymph samples obtained in four poultry houses in Czechia were used for microbiome analyses by Illumina amplicon sequencing of the 16S ribosomal RNA (rRNA) gene V4 region. A laboratory RPM population was used as positive control for transcriptome analysis by pyrosequencing with identification of sequences originating from bacteria. The samples of engorged adult/nymph stages had 100-fold more 16S rRNA gene copies than the samples of eggs and larvae. The microbiome composition showed differences among the four poultry houses and among observed developmental stadia. In the adults' microbiome 10 OTUs comprised 90 to 99% of all sequences. Bartonella-like bacteria accounted for between 30 and 70% of sequences in the RPM microbiome and 25% of bacterial sequences in the transcriptome. The phylogenetic analyses of 16S rRNA gene sequences revealed two distinct groups of Bartonella-like bacteria forming sister groups: (i) symbionts of ants; (ii) Bartonella genus. Cardinium, Wolbachia, and Rickettsiella sp. were found in the microbiomes of all tested stadia, while Spiroplasma eriocheiris and Wolbachia were identified in the laboratory RPM transcriptome. The microbiomes from eggs, larvae, and engorged adults/nymphs differed. Bartonella-like symbionts were found in all stadia and sampling sites. Bartonella-like bacteria were the most diversified group within the RPM microbiome. The presence of identified putative pathogenic bacteria is relevant with respect to human and animal health issues, while the identification of symbiotic bacteria can lead to new control methods targeting them to destabilize the arthropod host.
Genome-wide signatures of convergent evolution in echolocating mammals

PubMed Central

Parker, Joe; Tsagkogeorga, Georgia; Cotton, James A.; Liu, Yuan; Provero, Paolo; Stupka, Elia; Rossiter, Stephen J.

2013-01-01

Evolution is typically thought to proceed through divergence of genes, proteins, and ultimately phenotypes [1-3]. However, similar traits might also evolve convergently in unrelated taxa due to similar selection pressures [4,5]. Adaptive phenotypic convergence is widespread in nature, and recent results from a handful of genes have suggested that this phenomenon is powerful enough to also drive recurrent evolution at the sequence level [6-9]. Where homoplasious substitutions do occur these have long been considered the result of neutral processes. However, recent studies have demonstrated that adaptive convergent sequence evolution can be detected in vertebrates using statistical methods that model parallel evolution [9,10], although the extent to which sequence convergence between genera occurs across genomes is unknown. Here we analyse genomic sequence data in mammals that have independently evolved echolocation and show for the first time that convergence is not a rare process restricted to a handful of loci but is instead widespread, continuously distributed and commonly driven by natural selection acting on a small number of sites per locus. Systematic analyses of convergent sequence evolution in 805,053 amino acids within 2,326 orthologous coding gene sequences compared across 22 mammals (including four new bat genomes) revealed signatures consistent with convergence in nearly 200 loci. Strong and significant support for convergence among bats and the dolphin was seen in numerous genes linked to hearing or deafness, consistent with an involvement in echolocation. Surprisingly we also found convergence in many genes linked to vision: the convergent signal of many sensory genes was robustly correlated with the strength of natural selection. This first attempt to detect genome-wide convergent sequence evolution across divergent taxa reveals the phenomenon to be much more pervasive than previously recognised. PMID:24005325
The complete mitochondrial genomes of three parasitic nematodes of birds: a unique gene order and insights into nematode phylogeny

PubMed Central

2013-01-01

Background: Analyses of mitochondrial (mt) genome sequences in recent years challenge the current working hypothesis of Nematoda phylogeny proposed from morphology, ecology and nuclear small subunit rRNA gene sequences, and raise the need to sequence additional mt genomes for a broad range of nematode lineages. Results: We sequenced the complete mt genomes of three Ascaridia species (family Ascaridiidae) that infest chickens, pigeons and parrots, respectively. These three Ascaridia species have an identical arrangement of mt genes to each other but differ substantially from other nematodes. Phylogenetic analyses of the mt genome sequences of the Ascaridia species, together with 62 other nematode species, support the monophylies of seven high-level taxa of the phylum Nematoda: 1) the subclass Dorylaimia; 2) the orders Rhabditida, Trichinellida and Mermithida; 3) the suborder Rhabditina; and 4) the infraorders Spiruromorpha and Oxyuridomorpha. Analyses of mt genome sequences, however, reject the monophylies of the suborders Spirurina and Tylenchina, and the infraorders Rhabditomorpha, Panagrolaimomorpha and Tylenchomorpha. Monophyly of the infraorder Ascaridomorpha varies depending on the methods of phylogenetic analysis. The Ascaridomorpha was more closely related to the infraorders Rhabditomorpha and Diplogasteromorpha (suborder Rhabditina) than it was to the other two infraorders of the Spirurina: Oxyuridomorpha and Spiruromorpha. The closer relationship among Ascaridomorpha, Rhabditomorpha and Diplogasteromorpha was also supported by a shared common pattern of mitochondrial gene arrangement. Conclusions: Analyses of mitochondrial genome sequences and gene arrangement have provided novel insights into the phylogenetic relationships among several major lineages of nematodes. Many lineages of nematodes, however, are underrepresented or not represented in these analyses. Expanding taxon sampling is necessary for future phylogenetic studies of nematodes with mt genome sequences. PMID:23800363
A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica

PubMed Central

2012-01-01

Background: The Azadirachta indica (neem) tree is a source of a wide number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using a next-generation sequencing approach. Results: The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family, validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions: This study on the A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds and for comparative evolutionary studies among various Meliaceae family members, and will help annotate their genomes. A better understanding of the molecular pathways involved in azadirachtin synthesis in A. indica will pave the way for bulk production of environment-friendly biopesticides. PMID:22958331
Computational optimisation of targeted DNA sequencing for cancer detection

NASA Astrophysics Data System (ADS)

Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul; Gerlinger, Marco; Swanton, Charles

2013-12-01

Despite recent progress thanks to next-generation sequencing technologies, personalised cancer medicine is still hampered by intra-tumour heterogeneity and drug resistance. As most patients with advanced metastatic disease face poor survival, there is a need to improve early diagnosis. Analysing circulating tumour DNA (ctDNA) might represent a non-invasive method to detect mutations in patients, facilitating early detection. In this article, we define reduced gene panels from publicly available datasets as a first step to assess and optimise the potential of targeted ctDNA scans for early tumour detection. Dividing 4,467 samples into one discovery and two independent validation cohorts, we show that up to 76% of 10 cancer types harbour at least one mutation in a panel of only 25 genes, with high sensitivity across most tumour types. Our analyses demonstrate that targeting "hotspot" regions would introduce biases towards in-frame mutations and would compromise the reproducibility of tumour detection.
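
The panel-coverage idea in the record above (what fraction of samples carry at least one mutation in a small gene panel) can be illustrated with a short sketch. The abstract does not say how the panels were selected; the greedy rule below is an assumption chosen for simplicity, and the function names and toy data are hypothetical.

```python
from typing import Dict, List, Set


def panel_coverage(mutations: Dict[str, Set[str]], panel: Set[str]) -> float:
    """Fraction of samples with at least one mutated gene in the panel."""
    if not mutations:
        return 0.0
    hits = sum(1 for genes in mutations.values() if genes & panel)
    return hits / len(mutations)


def greedy_panel(mutations: Dict[str, Set[str]], size: int) -> List[str]:
    """Assumed greedy rule: repeatedly add the gene that is mutated in the
    largest number of samples not yet covered by the panel."""
    uncovered = set(mutations)
    panel: List[str] = []
    while uncovered and len(panel) < size:
        counts: Dict[str, int] = {}
        for sample in uncovered:
            for gene in mutations[sample]:
                counts[gene] = counts.get(gene, 0) + 1
        if not counts:  # remaining samples carry no mutations at all
            break
        # Ties are broken alphabetically so the result is deterministic.
        best = max(sorted(counts), key=counts.get)
        panel.append(best)
        uncovered = {s for s in uncovered if best not in mutations[s]}
    return panel


# Toy data: sample identifier -> set of mutated genes (hypothetical).
toy = {
    "s1": {"TP53", "KRAS"},
    "s2": {"TP53"},
    "s3": {"KRAS", "EGFR"},
    "s4": {"EGFR"},
    "s5": set(),
}
chosen = greedy_panel(toy, 2)
print(chosen, panel_coverage(toy, set(chosen)))
```

A panel chosen on a discovery cohort would then be scored with `panel_coverage` on separate validation cohorts, mirroring the discovery/validation split the abstract describes.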

Integrative analysis of environmental sequences using MEGAN4.

PubMed

Huson, Daniel H; Mitra, Suparna; Ruscheweyh, Hans-Joachim; Weber, Nico; Schuster, Stephan C

2011-09-01

A major challenge in the analysis of environmental sequences is data integration. The question is how to analyze different types of data in a unified approach, addressing both the taxonomic and functional aspects. To facilitate such analyses, we have substantially extended MEGAN, a widely used taxonomic analysis program. The new program, MEGAN4, provides an integrated approach to the taxonomic and functional analysis of metagenomic, metatranscriptomic, metaproteomic, and rRNA data. While taxonomic analysis is performed based on the NCBI taxonomy, functional analysis is performed using the SEED classification of subsystems and functional roles or the KEGG classification of pathways and enzymes. A number of examples illustrate how such analyses can be performed, and show that one can also import and compare classification results obtained using others' tools. MEGAN4 is freely available for academic purposes, and installers for all three major operating systems can be downloaded from www-ab.informatik.uni-tuebingen.de/software/megan.
Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

PubMed Central

Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

2010-01-01

Despite its economic, cultural, and biological importance, there has not been a large-scale sequencing project to date for Camelus dromedarius. With the goal of sequencing the complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, clustering and assembly, we obtained 23,602 putative gene sequences, of which over 4,500 potentially novel or fast-evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full-length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full-length cDNAs, suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. The accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with the possibility to add sequences from the public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies, enabling a starting point for whole genome sequencing of the organism. PMID:20502665
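
The ORF>300 bp criterion mentioned in the record above can be sketched as follows. The abstract does not define how ORFs were called, so this sketch makes its own assumptions: an ORF runs from an ATG to the first in-frame stop codon, only the three forward frames are scanned, and the stop codon is counted in the length. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> int:
    """Length in bp of the longest ATG-to-stop ORF in the three forward
    frames (stop codon included); 0 if no complete ORF is found."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def passes_orf_filter(seq: str, min_len: int = 300) -> bool:
    """True when the contig has an ORF longer than min_len bp."""
    return longest_orf(seq) > min_len
```

For example, `longest_orf("ATGAAAAAATAA")` returns 12, and a contig would count toward the 80% figure only when `passes_orf_filter` is true under these assumed definitions.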
Independent origins and incipient speciation among host-associated populations of Thielaviopsis ethacetica in Cameroon.

PubMed

Mbenoun, Michael; Wingfield, Michael J; Letsoalo, Teboho; Bihon, Wubetu; Wingfield, Brenda D; Roux, Jolanda

2015-11-01

Thielaviopsis ethacetica was recently reinstated as a distinct taxon using DNA phylogenies. It is widespread, affecting several crop plants of global economic importance. In this study, microsatellite markers were developed and used in conjunction with sequence data to investigate the genetic diversity and structure of Th. ethacetica in Cameroon. A collection of 71 isolates from cacao, oil palm, and pineapple, supplemented with nine isolates from other countries, was analysed. Four genetic groups were identified. Two of these were associated with oil palm in Cameroon and showed high genetic diversity, suggesting that they might represent an indigenous population of the pathogen. In contrast, the remaining two groups, associated with cacao and pineapple, had low genetic diversity and most likely represent introduced populations. There was no evidence of gene flow between these groups. Phylogenetic analyses based on sequences of the tef1-α as well as the combined flanking regions of six microsatellite loci were consistent with population genetic analyses and suggested that Th. ethacetica is comprised of two divergent genetic lineages. Copyright © 2015 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Variability and molecular typing of the woody-tree infecting prunus necrotic ringspot ilarvirus.

PubMed

Vasková, D; Petrzik, K; Karesová, R

2000-01-01

The 3'-part of the movement protein gene, the intergenic region and the complete coat protein gene of sixteen isolates of Prunus necrotic ringspot virus (PNRSV) from five different host species from the Czech Republic were sequenced in order to search for the bases of extensive variability of viroses caused by this pathogen. According to phylogenetic analyses, all 46 isolates sequenced to date split into three main groups, which correlated to a certain extent with their geographic origin. Modelled serological properties showed that all the new isolates belong to one serotype.
Diversity and distribution of unicellular opisthokonts along the European coast analysed using high-throughput sequencing.

PubMed

Del Campo, Javier; Mallo, Diego; Massana, Ramon; de Vargas, Colomban; Richards, Thomas A; Ruiz-Trillo, Iñaki

2015-09-01

The opisthokonts are one of the major super groups of eukaryotes. They comprise two major clades: (i) the Metazoa and their unicellular relatives and (ii) the Fungi and their unicellular relatives. There is, however, little knowledge of the role of opisthokont microbes in many natural environments, especially among non-metazoan and non-fungal opisthokonts. Here, we begin to address this gap by analysing high-throughput 18S rDNA and 18S rRNA sequencing data from different European coastal sites, sampled at different size fractions and depths. In particular, we analyse the diversity and abundance of choanoflagellates, filastereans, ichthyosporeans, nucleariids, corallochytreans and their related lineages. Our results show the great diversity of choanoflagellates in coastal waters as well as a relevant representation of the ichthyosporeans and the uncultured marine opisthokonts (MAOP). Furthermore, we describe a new lineage of marine fonticulids (MAFO) that appears to be abundant in sediments. Taken together, our work points to a greater potential ecological role for unicellular opisthokonts than previously appreciated in marine environments, both in the water column and in sediments, and also provides evidence of novel opisthokont phylogenetic lineages. This study highlights the importance of high-throughput sequencing approaches to unravel the diversity and distribution of both known and novel eukaryotic lineages. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.
Choice of Reference Sequence and Assembler for Alignment of Listeria monocytogenes Short-Read Sequence Data Greatly Influences Rates of Error in SNP Analyses

PubMed Central

Pightling, Arthur W.; Petronella, Nicholas; Pagotto, Franco

2014-01-01

The wide availability of whole-genome sequencing (WGS) and an abundance of open-source software have made detection of single-nucleotide polymorphisms (SNPs) in bacterial genomes an increasingly accessible and effective tool for comparative analyses. Thus, ensuring that real nucleotide differences between genomes (i.e., true SNPs) are detected at high rates and that the influences of errors (such as false positive SNPs, ambiguously called sites, and gaps) are mitigated is of utmost importance. The choices researchers make regarding the generation and analysis of WGS data can greatly influence the accuracy of short-read sequence alignments and, therefore, the efficacy of such experiments. We studied the effects of some of these choices, including: i) depth of sequencing coverage, ii) choice of reference-guided short-read sequence assembler, iii) choice of reference genome, and iv) whether to perform read-quality filtering and trimming, on our ability to detect true SNPs and on the frequencies of errors. We performed benchmarking experiments, during which we assembled simulated and real Listeria monocytogenes strain 08-5578 short-read sequence datasets of varying quality with four commonly used assemblers (BWA, MOSAIK, Novoalign, and SMALT), using reference genomes of varying genetic distances, and with or without read pre-processing (i.e., quality filtering and trimming). We found that assemblies of at least 50-fold coverage provided the most accurate results. In addition, MOSAIK yielded the fewest errors when reads were aligned to a nearly identical reference genome, while using SMALT to align reads against a reference sequence that is ∼0.82% distant from 08-5578 at the nucleotide level resulted in the detection of the greatest numbers of true SNPs and the fewest errors. Finally, we show that whether read pre-processing improves SNP detection depends upon the choice of reference sequence and assembler. 
In total, this study demonstrates that researchers should test a variety of conditions to achieve optimal results. PMID:25144537
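The benchmarking idea in this abstract (counting true SNPs recovered versus erroneous calls) can be sketched as a set comparison. This is an illustration under stated assumptions, not the authors' pipeline: SNPs are represented as hypothetical (position, alternate base) tuples, and ambiguous sites and gaps are not modelled.

```python
def score_snp_calls(true_snps, called_snps):
    """Compare called SNPs against a truth set.

    Both arguments are iterables of (position, alt_base) tuples
    (an assumed representation chosen for this sketch).
    """
    true_snps, called_snps = set(true_snps), set(called_snps)
    tp = len(true_snps & called_snps)   # true SNPs that were detected
    fp = len(called_snps - true_snps)   # false positive calls
    fn = len(true_snps - called_snps)   # true SNPs that were missed
    recall = tp / len(true_snps) if true_snps else 0.0
    precision = tp / len(called_snps) if called_snps else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "recall": recall, "precision": precision}


# Made-up positions, for illustration only.
truth = {(10, "A"), (25, "T"), (40, "G"), (77, "C")}
calls = {(10, "A"), (25, "T"), (40, "C"), (90, "G")}
print(score_snp_calls(truth, calls))
```

Running the same scoring for each combination of assembler, reference genome, coverage depth and pre-processing setting would give a comparison table of the kind the study describes.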
Sublinear growth of information in DNA sequences.

PubMed

Menconi, Giulia

2005-07-01

We introduce a novel method to analyse complete genomes and recognise some distinctive features by means of an adaptive compression algorithm, which is not DNA-oriented, based on the Lempel-Ziv scheme. We study the Information Content as a function of the number of symbols encoded by the algorithm and we analyse the dictionary created by the algorithm. Preliminary results are shown concerning regions showing a sublinear type of information growth, which is strictly connected to the presence of highly repetitive subregions that might be supposed to have a regulatory function within the genome.
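How dictionary growth can reveal repetitive regions may be easier to see with a small sketch. The code below uses plain LZ78 parsing as a stand-in for the Lempel-Ziv-based compressor, which the abstract does not specify in detail; the function name is hypothetical.

```python
def lz78_phrase_counts(seq):
    """Return a list whose entry i is the number of LZ78 dictionary
    phrases created after reading the first i + 1 symbols of seq."""
    dictionary = set()
    phrase = ""
    counts = []
    for symbol in seq:
        phrase += symbol
        if phrase not in dictionary:
            dictionary.add(phrase)   # a new phrase ends here
            phrase = ""
        counts.append(len(dictionary))
    return counts


repetitive = "A" * 64
print(lz78_phrase_counts(repetitive)[-1])
print(lz78_phrase_counts("ACGTACGT"))
```

Because the number of phrases is a rough proxy for the information encoded, a highly repetitive stretch adds phrases slowly (sublinear growth), whereas a varied stretch adds them faster.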
The Large Subunit rDNA Sequence of Plasmodiophora brassicae Does not Contain Intra-species Polymorphism

PubMed Central

Schwelm, Arne; Berney, Cédric; Dixelius, Christina; Bass, David; Neuhauser, Sigrid

2016-01-01

Clubroot disease caused by Plasmodiophora brassicae is one of the most important diseases of cultivated brassicas. P. brassicae occurs in pathotypes which differ in their aggressiveness towards their Brassica host plants. To date no DNA-based method to distinguish these pathotypes has been described. In 2011, polymorphism within the 28S rDNA of P. brassicae was reported, which could potentially allow pathotypes to be distinguished without the need for time-consuming bioassays. However, isolates of P. brassicae from around the world analysed in this study do not show polymorphism in their LSU rDNA sequences. The previously described polymorphism most likely derived from soil-inhabiting Cercozoa, more specifically Neoheteromita-like glissomonads. Here we correct the LSU rDNA sequence of P. brassicae. By using FISH we demonstrate that our newly generated sequence belongs to the causal agent of clubroot disease. PMID:27750174
snpAD: An ancient DNA genotype caller.

PubMed

Prüfer, Kay

2018-06-21

The study of ancient genomes can elucidate the evolutionary past. However, analyses are complicated by base-modifications in ancient DNA molecules that result in errors in DNA sequences. These errors are particularly common near the ends of sequences and pose a challenge for genotype calling. I describe an iterative method that estimates genotype frequencies and errors along sequences to allow for accurate genotype calling from ancient sequences. The implementation of this method, called snpAD, performs well on high-coverage ancient data, as shown by simulations and by subsampling the data of a high-coverage Neandertal genome. Although estimates for low-coverage genomes are less accurate, I am able to derive approximate estimates of heterozygosity from several low-coverage Neandertals. These estimates show that low heterozygosity, compared to modern humans, was common among Neandertals. The C++ code of snpAD is freely available at http://bioinf.eva.mpg.de/snpAD/. Supplementary data are available at Bioinformatics online.
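Why position-dependent errors matter for genotype calling can be shown with a toy diploid likelihood. This is a generic textbook-style sketch and not the snpAD model: the function name is hypothetical, errors are assumed to be uniform over the three wrong bases, and each observation carries its own assumed error rate (for instance, higher near read ends).

```python
import math
from itertools import combinations_with_replacement


def call_genotype(observations):
    """Pick the maximum-likelihood diploid genotype.

    observations: list of (base, error_rate) pairs, one per read covering
    the site; error rates must lie strictly between 0 and 1.
    Returns (genotype string, log-likelihood).
    """
    best, best_ll = None, -math.inf
    for genotype in combinations_with_replacement("ACGT", 2):
        ll = 0.0
        for base, err in observations:
            # Each allele is sampled with probability 1/2; a read shows the
            # allele with probability 1 - err, any other base with err / 3.
            p = sum(0.5 * ((1 - err) if base == allele else err / 3)
                    for allele in genotype)
            ll += math.log(p)
        if ll > best_ll:
            best, best_ll = genotype, ll
    return "".join(best), best_ll


# Balanced, low-error evidence for two alleles.
print(call_genotype([("C", 0.01)] * 5 + [("T", 0.01)] * 5)[0])
# Two T observations with a high assumed error rate (e.g. damaged read ends).
print(call_genotype([("C", 0.01)] * 5 + [("T", 0.3)] * 2)[0])
```

In the second call the T observations are down-weighted by their high error rate, so the homozygous genotype scores higher than the heterozygous one.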
Rapid protein alignment in the cloud: HAMOND combines fast DIAMOND alignments with Hadoop parallelism.

PubMed

Yu, Jia; Blom, Jochen; Sczyrba, Alexander; Goesmann, Alexander

2017-09-10

The introduction of next generation sequencing has caused a steady increase in the amounts of data that have to be processed in modern life science. Sequence alignment plays a key role in the analysis of sequencing data e.g. within whole genome sequencing or metagenome projects. BLAST is a commonly used alignment tool that was the standard approach for more than two decades, but in the last years faster alternatives have been proposed including RapSearch, GHOSTX, and DIAMOND. Here we introduce HAMOND, an application that uses Apache Hadoop to parallelize DIAMOND computation in order to scale-out the calculation of alignments. HAMOND is fault tolerant and scalable by utilizing large cloud computing infrastructures like Amazon Web Services. HAMOND has been tested in comparative genomics analyses and showed promising results both in efficiency and accuracy. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.
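The scale-out idea (split the query set, align each part independently, then merge the results) can be sketched without Hadoop or DIAMOND. The code below only shows the splitting step on in-memory FASTA text; the function names are hypothetical, and how HAMOND actually partitions its input is not described in the abstract.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, parts = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        else:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_records(records, n_chunks):
    """Distribute records round-robin into at most n_chunks non-empty lists,
    each of which could be handed to a separate alignment task."""
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return [chunk for chunk in chunks if chunk]


# Made-up protein fragments, for illustration only.
fasta = ">a\nMKV\n>b\nMLL\nAAA\n>c\nMTT\n>d\nMQQ\n>e\nMPP\n"
for chunk in split_records(parse_fasta(fasta), 2):
    print([header for header, _ in chunk])
```

Since each query is aligned independently of the others, the per-chunk outputs can simply be concatenated afterwards.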
The sequence and de novo assembly of the giant panda genome

PubMed Central

Li, Ruiqiang; Fan, Wei; Tian, Geng; Zhu, Hongmei; He, Lin; Cai, Jing; Huang, Quanfei; Cai, Qingle; Li, Bo; Bai, Yinqi; Zhang, Zhihe; Zhang, Yaping; Wang, Wen; Li, Jun; Wei, Fuwen; Li, Heng; Jian, Min; Li, Jianwen; Zhang, Zhaolei; Nielsen, Rasmus; Li, Dawei; Gu, Wanjun; Yang, Zhentao; Xuan, Zhaoling; Ryder, Oliver A.; Leung, Frederick Chi-Ching; Zhou, Yan; Cao, Jianjun; Sun, Xiao; Fu, Yonggui; Fang, Xiaodong; Guo, Xiaosen; Wang, Bo; Hou, Rong; Shen, Fujun; Mu, Bo; Ni, Peixiang; Lin, Runmao; Qian, Wubin; Wang, Guodong; Yu, Chang; Nie, Wenhui; Wang, Jinhuan; Wu, Zhigang; Liang, Huiqing; Min, Jiumeng; Wu, Qi; Cheng, Shifeng; Ruan, Jue; Wang, Mingwei; Shi, Zhongbin; Wen, Ming; Liu, Binghang; Ren, Xiaoli; Zheng, Huisong; Dong, Dong; Cook, Kathleen; Shan, Gao; Zhang, Hao; Kosiol, Carolin; Xie, Xueying; Lu, Zuhong; Zheng, Hancheng; Li, Yingrui; Steiner, Cynthia C.; Lam, Tommy Tsan-Yuk; Lin, Siyuan; Zhang, Qinghui; Li, Guoqing; Tian, Jing; Gong, Timing; Liu, Hongde; Zhang, Dejin; Fang, Lin; Ye, Chen; Zhang, Juanbin; Hu, Wenbo; Xu, Anlong; Ren, Yuanyuan; Zhang, Guojie; Bruford, Michael W.; Li, Qibin; Ma, Lijia; Guo, Yiran; An, Na; Hu, Yujie; Zheng, Yang; Shi, Yongyong; Li, Zhiqiang; Liu, Qing; Chen, Yanling; Zhao, Jing; Qu, Ning; Zhao, Shancen; Tian, Feng; Wang, Xiaoling; Wang, Haiyin; Xu, Lizhi; Liu, Xiao; Vinar, Tomas; Wang, Yajun; Lam, Tak-Wah; Yiu, Siu-Ming; Liu, Shiping; Zhang, Hemin; Li, Desheng; Huang, Yan; Wang, Xia; Yang, Guohua; Jiang, Zhi; Wang, Junyi; Qin, Nan; Li, Li; Li, Jingxiang; Bolund, Lars; Kristiansen, Karsten; Wong, Gane Ka-Shu; Olson, Maynard; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian; Wang, Jun

2013-01-01

Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility for using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes. PMID:20010809
Phylogenetic diversity of bacterial communities in bovine rumen as affected by diets and microenvironments.

PubMed

Kim, Minseok; Morrison, Mark; Yu, Zhongtang

2011-09-01

Phylogenetic analysis was conducted to examine ruminal bacteria in two ruminal fractions (adherent fraction vs. liquid fraction) collected from cattle fed with two different diets: forage alone vs. forage plus concentrate. One hundred forty-four 16S rRNA gene (rrs) sequences were obtained from clone libraries constructed from the four samples. These rrs sequences were assigned to 116 different operational taxonomic units (OTUs) defined at 0.03 phylogenetic distance. Most of these OTUs could not be assigned to any known genus. The phylum Firmicutes was represented by approximately 70% of all the sequences. By comparing to the OTUs already documented in the rumen, 52 new OTUs were identified. UniFrac, SONS, and denaturing gradient gel electrophoresis analyses revealed difference in diversity between the two fractions and between the two diets. This study showed that rrs sequences recovered from small clone libraries can still help identify novel species-level OTUs.
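Grouping sequences into OTUs at a 0.03 distance cutoff can be sketched with a greedy centroid scheme. This is a simplified illustration: it assumes pre-aligned, equal-length sequences and uses the uncorrected proportion of differing sites, whereas the study's own distance measure and clustering software are not named in the abstract.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.03):
    """Assign each sequence to the first OTU whose centroid (its first
    member) lies within the threshold; otherwise start a new OTU."""
    centroids, otus = [], []
    for seq in seqs:
        for idx, centroid in enumerate(centroids):
            if p_distance(seq, centroid) <= threshold:
                otus[idx].append(seq)
                break
        else:
            centroids.append(seq)
            otus.append([seq])
    return otus


ref = "ACGT" * 25                 # 100 bp toy sequence
near = "T" + ref[1:]              # 1 difference  -> distance 0.01
far = "TTTTT" + ref[5:]           # 4 differences -> distance 0.04
print(len(greedy_otus([ref, near, far])))
```

Greedy clustering depends on input order, so real analyses usually sort sequences (for example by abundance) or use hierarchical methods instead.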
New Measles Genotype, Uganda

PubMed Central

Muwonge, Apollo; Nanyunja, Miriam; Bwogi, Josephine; Lowe, Luis; Liffick, Stephanie L.; Bellini, William J.; Sylvester, Sempala

2005-01-01

We report the first genetic characterization of wildtype measles viruses from Uganda. Thirty-six virus isolates from outbreaks in 6 districts were analyzed from 2000 to 2002. Analyses of sequences of the nucleoprotein (N) and hemagglutinin (H) genes showed that the Ugandan isolates were all closely related, and phylogenetic analysis indicated that these viruses were members of a unique group within clade D. Sequences of the Ugandan viruses were not closely related to any of the World Health Organization reference sequences representing the 22 currently recognized genotypes. The minimum nucleotide divergence between the Ugandan viruses and the most closely related reference strain, genotype D2, was 3.1% for the N gene and 2.6% for the H gene. Therefore, Ugandan viruses should be considered a new, proposed genotype (d10). This new sequence information will expand the utility of molecular epidemiologic techniques for describing measles transmission patterns in eastern Africa. PMID:16318690
Large-Scale Biomonitoring of Remote and Threatened Ecosystems via High-Throughput Sequencing

PubMed Central

Gibson, Joel F.; Shokralla, Shadi; Curry, Colin; Baird, Donald J.; Monk, Wendy A.; King, Ian; Hajibabaei, Mehrdad

2015-01-01

Biodiversity metrics are critical for assessment and monitoring of ecosystems threatened by anthropogenic stressors. Existing sorting and identification methods are too expensive and labour-intensive to be scaled up to meet management needs. Alternately, a high-throughput DNA sequencing approach could be used to determine biodiversity metrics from bulk environmental samples collected as part of a large-scale biomonitoring program. Here we show that both morphological and DNA sequence-based analyses are suitable for recovery of individual taxonomic richness, estimation of proportional abundance, and calculation of biodiversity metrics using a set of 24 benthic samples collected in the Peace-Athabasca Delta region of Canada. The high-throughput sequencing approach was able to recover all metrics with a higher degree of taxonomic resolution than morphological analysis. The reduced cost and increased capacity of DNA sequence-based approaches will finally allow environmental monitoring programs to operate at the geographical and temporal scale required by industrial and regulatory end-users. PMID:26488407
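The metrics named above (taxonomic richness, proportional abundance, and a diversity index) can be computed from a table of per-taxon counts. The sketch below assumes the Shannon index as the diversity metric, which is a choice made for illustration; the taxon names and counts are invented.

```python
import math


def richness(counts):
    """Number of taxa with at least one individual or read."""
    return sum(1 for n in counts.values() if n > 0)


def proportional_abundance(counts):
    """Map each taxon to its share of the total count."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with p_i > 0."""
    props = proportional_abundance(counts).values()
    return -sum(p * math.log(p) for p in props if p > 0)


# Hypothetical benthic sample.
sample = {"Chironomidae": 50, "Baetidae": 30, "Hydropsychidae": 20, "Elmidae": 0}
print(richness(sample))
print(proportional_abundance(sample))
print(shannon_index(sample))
```

The same functions apply whether the counts come from morphological sorting or from sequence reads assigned to taxa, which is what makes the two approaches directly comparable.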
Chromosome rearrangements via template switching between diverged repeated sequences

PubMed Central

Anand, Ranjith P.; Tsaponina, Olga; Greenwell, Patricia W.; Lee, Cheng-Sheng; Du, Wei; Petes, Thomas D.

2014-01-01

Recent high-resolution genome analyses of cancer and other diseases have revealed the occurrence of microhomology-mediated chromosome rearrangements and copy number changes. Although some of these rearrangements appear to involve nonhomologous end-joining, many must have involved mechanisms requiring new DNA synthesis. Models such as microhomology-mediated break-induced replication (MM-BIR) have been invoked to explain these rearrangements. We examined BIR and template switching between highly diverged sequences in Saccharomyces cerevisiae, induced during repair of a site-specific double-strand break (DSB). Our data show that such template switches are robust mechanisms that give rise to complex rearrangements. Template switches between highly divergent sequences appear to be mechanistically distinct from the initial strand invasions that establish BIR. In particular, such jumps are less constrained by sequence divergence and exhibit a different pattern of microhomology junctions. BIR traversing repeated DNA sequences frequently results in complex translocations analogous to those seen in mammalian cells. These results suggest that template switching among repeated genes is a potent driver of genome instability and evolution. PMID:25367035
Phylogenetic relationships in Epidendroideae (Orchidaceae), one of the great flowering plant radiations: progressive specialization and diversification.

PubMed

Freudenstein, John V; Chase, Mark W

2015-03-01

The largest subfamily of orchids, Epidendroideae, represents one of the most significant diversifications among flowering plants in terms of pollination strategy, vegetative adaptation and number of species. Although many groups in the subfamily have been resolved, significant relationships in the tree remain unclear, limiting conclusions about diversification and creating uncertainty in the classification. This study brings together DNA sequences from nuclear, plastid and mitochondrial genomes in order to clarify relationships, to test associations of key characters with diversification and to improve the classification. Sequences from seven loci were concatenated in a supermatrix analysis for 312 genera representing most of epidendroid diversity. Maximum-likelihood and parsimony analyses were performed on this matrix and on subsets of the data to generate trees and to investigate the effect of missing values. Statistical character-associated diversification analyses were performed. Likelihood and parsimony analyses yielded highly resolved trees that are in strong agreement and show significant support for many key clades. Many previously proposed relationships among tribes and subtribes are supported, and some new relationships are revealed. Analyses of subsets of the data suggest that the relatively high number of missing data for the full analysis is not problematic. Diversification analyses show that epiphytism is most strongly associated with diversification among epidendroids, followed by expansion into the New World and anther characters that are involved with pollinator specificity, namely early anther inflexion, cellular pollinium stalks and the superposed pollinium arrangement. All tested characters show significant association with speciation in Epidendroideae, suggesting that no single character accounts for the success of this group. 
Rather, it appears that a succession of key features appeared that have contributed to diversification, sometimes in parallel. © The Author 2015. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Mitochondrial DNA variation of indigenous goats in Narok and Isiolo counties of Kenya.

PubMed

Kibegwa, F M; Githui, K E; Jung'a, J O; Badamana, M S; Nyamu, M N

2016-06-01

Phylogenetic relationships among and genetic variability within 60 goats from two different indigenous breeds in Narok and Isiolo counties in Kenya and 22 published goat samples were analysed using mitochondrial control region sequences. The results showed that there were 54 polymorphic sites in a 481-bp sequence and 29 haplotypes were determined. The mean haplotype diversity and nucleotide diversity were 0.981 ± 0.006 and 0.019 ± 0.001, respectively. The phylogenetic analysis in combination with goat haplogroup reference sequences from GenBank showed that all goat sequences were clustered into two haplogroups (A and G), of which haplogroup A was the commonest in the two populations. A very high percentage (99.90%) of the genetic variation was distributed within the regions, and a smaller percentage (0.10%) was distributed among regions, as revealed by the analysis of molecular variance (AMOVA). These AMOVA results showed that the divergence between regions was not statistically significant. We concluded that the high levels of intrapopulation diversity in Isiolo and Narok goats and the weak phylogeographic structuring suggest strong gene flow among goat populations, probably caused by extensive transportation of goats in the past. © 2015 Blackwell Verlag GmbH.
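The two summary statistics reported here can be computed directly from a set of aligned sequences. The sketch below uses the standard estimators (haplotype diversity as n/(n-1) times one minus the sum of squared haplotype frequencies, and nucleotide diversity as the mean pairwise difference per site); the function names and the four toy sequences are assumptions for illustration, not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """pi = mean number of pairwise differences per aligned site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / (len(pairs) * length)


toy = ["AAAA", "AAAA", "AAAT", "AATT"]   # aligned, equal-length toy sequences
print(haplotype_diversity(toy))
print(nucleotide_diversity(toy))
```

A haplotype diversity close to 1 combined with a low nucleotide diversity, as reported above, indicates many distinct haplotypes that differ from one another at only a few sites.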
Complete nuclear ribosomal DNA sequence amplification and molecular analyses of Bangia (Bangiales, Rhodophyta) from China

NASA Astrophysics Data System (ADS)

Xu, Jiajie; Jiang, Bo; Chai, Sanming; He, Yuan; Zhu, Jianyi; Shen, Zonggen; Shen, Songdong

2016-09-01

Filamentous Bangia, which are distributed extensively throughout the world, have simple and similar morphological characteristics. Scientists can classify these organisms using molecular markers in combination with morphology. We successfully sequenced the complete nuclear ribosomal DNA, approximately 13 kb in length, from a marine Bangia population. We further analyzed the small subunit ribosomal DNA gene (nrSSU) and the internal transcribed spacer (ITS) sequence regions along with nine other marine and two freshwater Bangia samples from China. Pairwise distances of the nrSSU and 5.8S ribosomal DNA gene sequences show the marine samples grouping together with low divergences (0-0.003; 0-0.006, respectively) from each other, but high divergences (0.123-0.126; 0.198, respectively) from freshwater samples. An exception is the marine sample collected from Weihai, which shows high divergence from both other marine samples (0.063-0.065; 0.129, respectively) and the freshwater samples (0.097; 0.120, respectively). A maximum likelihood phylogenetic tree based on a combined SSU-ITS dataset shows the samples divided into three clades: two marine clades containing Bangia spp. from North America, Europe, Asia, and Australia, and one freshwater clade containing Bangia atropurpurea from North America and China.
Saturnia jonasii Butler, 1877 on Jejudo Island, a new saturnid moth of South Korea with DNA data and morphology (Lepidoptera: Saturniidae).

PubMed

Kim, Min Jee; Choi, Sei-Woong; Kim, Iksoo

2015-04-10

Saturnia (Rinaca) jonasii Butler, 1877 is distributed in Japan, including Tsushima Island and Taiwan, whereas S. boisduvalii Eversmann, 1846 is distributed in northern areas, such as China, Russia, and South Korea. In the present study we found that the specimens from Mt. Hallasan on Jejudo, a southern remote offshore island, were S. jonasii, rather than S. boisduvalii based on morphology, DNA barcode, and nuclear elongation factor 1 alpha (EF-1α) sequences. The major morphological differences between the two species included the shape of wing pattern elements of fore- and hindwings and male and female genitalia. A DNA barcode analysis of the sequences of the Jejudo specimens and S. boisduvalii, along with those of Saturnia species obtained from a public database showed a minimum sequence divergence of 4.26% (28 bp). A phylogenetic analysis also showed clustering of the Jejudo specimens with S. jonasii, separating S. boisduvalii (Bayesian posterior probability = 0.99). The EF-1α-based sequence and phylogenetic analyses of the two species from Jejudo Island and the Korean mainland showed the uniqueness of the Jejudo specimens from S. boisduvalii collected on the Korean mainland, indicating distribution of S. jonasii on Jejudo Island in South Korea, instead of S. boisduvalii.
Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

PubMed Central

Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

2014-01-01

The coming deluge of genome data presents significant challenges: storing and processing large-scale genome data, providing easy access to biomedical analysis tools, and sharing and retrieving data efficiently. Because variability in data volume results in variable computing and storage requirements, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600
DMINDA: an integrated web server for DNA motif identification and analyses.

PubMed

Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

2014-07-01

DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
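Function (ii) above, scanning for instances of a query motif in a sequence, can be sketched with an exact-match scan against an IUPAC consensus string. This is a simplification for illustration: the abstract does not describe DMINDA's scoring, and the function name and example sequence are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def scan_motif(motif, seq):
    """Return 0-based start positions where seq matches the IUPAC motif
    on the forward strand."""
    motif, seq = motif.upper(), seq.upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        if all(seq[i + j] in IUPAC[code] for j, code in enumerate(motif)):
            hits.append(i)
    return hits


promoter = "GGTATAAAGGTATATACC"      # made-up promoter fragment
print(scan_motif("TATAWA", promoter))
```

Real motif scanners typically use position weight matrices and report a score per hit, and they also search the reverse strand; both are left out here to keep the sketch short.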
Diversity of Babesia bovis merozoite surface antigen genes in the Philippines.

PubMed

Tattiyapong, Muncharee; Sivakumar, Thillaiampalam; Ybanez, Adrian Patalinghug; Ybanez, Rochelle Haidee Daclan; Perez, Zandro Obligado; Guswanto, Azirwan; Igarashi, Ikuo; Yokoyama, Naoaki

2014-02-01

Babesia bovis is the causative agent of fatal babesiosis in cattle. In the present study, we investigated the genetic diversity of B. bovis among Philippine cattle, based on the genes that encode merozoite surface antigens (MSAs). Forty-one B. bovis-positive blood DNA samples from cattle were used to amplify the msa-1, msa-2b, and msa-2c genes. In phylogenetic analyses, the msa-1, msa-2b, and msa-2c gene sequences generated from Philippine B. bovis-positive DNA samples were found in six, three, and four different clades, respectively. All of the msa-1 and most of the msa-2b sequences were found in clades that were formed only by Philippine msa sequences in the respective phylograms. While all the msa-1 sequences from the Philippines showed similarity to those formed by Australian msa-1 sequences, the msa-2b sequences showed similarity to either Australian or Mexican msa-2b sequences. In contrast, msa-2c sequences from the Philippines were distributed across all the clades of the phylogram, although one clade was formed exclusively by Philippine msa-2c sequences. Similarities among the deduced amino acid sequences of MSA-1, MSA-2b, and MSA-2c from the Philippines were 62.2-100, 73.1-100, and 67.3-100%, respectively. The present findings demonstrate that B. bovis populations are genetically diverse in the Philippines. This information will provide a good foundation for the future design and implementation of improved immunological preventive methodologies against bovine babesiosis in the Philippines. The study has also generated a set of data that will be useful for further understanding of the global genetic diversity of this important parasite. © 2013.
Interaction between core protein of classical swine fever virus with cellular IQGAP1 protein appears essential for virulence in swine

USDA-ARS's Scientific Manuscript database

Here we show that IQGAP1, a cellular protein that plays a pivotal role as a regulator of the cytoskeleton affecting cell adhesion, polarization and migration, interacts with Classical Swine Fever Virus (CSFV) Core protein. Sequence analyses identified a defined set of residues within CSFV Core prote...
Characterization and Complete Nucleotide Sequence of an Unusual Reptilian Retrovirus Recovered from the Order Crocodylia

PubMed Central

Martin, Joanne; Kabat, Peter; Herniou, Elisabeth; Tristem, Michael

2002-01-01

A novel group of retroviruses found within the order Crocodylia are described. Phylogenetic analyses demonstrate that they are probably the most divergent members of the Retroviridae described to date; even the most conserved regions of Pol show an average of only 23% amino acid identity when compared to other retroviruses. PMID:11932432
Phylogenetic Analysis of Klebsiella pneumoniae from Hospitalized Children, Pakistan

PubMed Central

Ejaz, Hasan; Wang, Nancy; Wilksch, Jonathan J.; Page, Andrew J.; Cao, Hanwei; Gujaran, Shruti; Keane, Jacqueline A.; Lithgow, Trevor; ul-Haq, Ikram; Dougan, Gordon

2017-01-01

Klebsiella pneumoniae shows increasing emergence of multidrug-resistant lineages, including strains resistant to all available antimicrobial drugs. We conducted whole-genome sequencing of 178 highly drug-resistant isolates from a tertiary hospital in Lahore, Pakistan. Phylogenetic analyses to place these isolates into global context demonstrate the expansion of multiple independent lineages, including K. quasipneumoniae. PMID:29048298
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kimelman, Aya; Levy, Asaf; Sberro, Hila

In the process of clone-based genome sequencing, initial assemblies frequently contain cloning gaps that can be resolved using cloning-independent methods, but the reason for their occurrence is largely unknown. By analyzing 9,328,693 sequencing clones from 393 microbial genomes we systematically mapped more than 15,000 genes residing in cloning gaps and experimentally showed that their expression products are toxic to the Escherichia coli host. A subset of these toxic sequences was further evaluated through a series of functional assays exploring the mechanisms of their toxicity. Among these genes our assays revealed novel toxins and restriction enzymes, and new classes of small non-coding toxic RNAs that reproducibly inhibit E. coli growth. Further analyses also revealed abundant, short toxic DNA fragments that were predicted to suppress E. coli growth by interacting with the replication initiator dnaA. Our results show that cloning gaps, once considered the result of technical problems, actually serve as a rich source for the discovery of biotechnologically valuable functions, and suggest new modes of antimicrobial interventions.
HIV genetic diversity in Cameroon: possible public health importance.

PubMed

Ndongmo, Clement B; Pieniazek, Danuta; Holberg-Petersen, Mona; Holm-Hansen, Carol; Zekeng, Leopold; Jeansson, Stig L; Kaptue, Lazare; Kalish, Marcia L

2006-08-01

To monitor the evolving molecular epidemiology and genetic diversity of HIV in a country where many distinct strains cocirculate, we performed genetic analyses on sequences from 75 HIV-1-infected Cameroonians: 74 were group M and 1 was group O. Of the group M sequences, 74 were classified into the following env gp41 subtypes or recombinant forms: CRF02 (n = 54), CRF09 (n = 2), CRF13 (n = 2), A (n = 5), CRF11 (n = 4), CRF06 (n = 1), G (n = 2), F2 (n = 2), and E (n = 1, CRF01), and 1 was a JG recombinant. Comparison of phylogenies for 70 matched gp41 and protease sequences showed inconsistent classifications for 18 (26%) strains. Our data show that recombination is rampant in Cameroon with recombinant viruses continuing to recombine, adding to the complexity of circulating HIV strains. This expanding genetic diversity raises public health concerns for the ability of diagnostic assays to detect these unique HIV mosaic variants and for the development of broadly effective HIV vaccines.
Atropos: specific, sensitive, and speedy trimming of sequencing reads.

PubMed

Didion, John P; Martin, Marcel; Collins, Francis S

2017-01-01

A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves significant increases in trimming accuracy while remaining competitive in execution times. Furthermore, Atropos maintains high accuracy even when trimming data with elevated rates of sequencing errors. The accuracy, high performance, and broad feature set offered by Atropos makes it an appropriate choice for the pre-processing of Illumina, ABI SOLiD, and other current-generation short-read sequencing datasets. Atropos is open source and free software written in Python (3.3+) and available at https://github.com/jdidion/atropos.
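To make the two trimming operations named in this abstract concrete, here is a deliberately naive sketch of 3' adapter removal and 3' quality trimming. It is not Atropos's algorithm: it uses exact string matching only, whereas real trimmers tolerate mismatches and indels. The function names, the `min_overlap` default, and the quality threshold of 20 are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter using exact matching (illustrative only)."""
    # Case 1: the full adapter occurs inside the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends with a prefix of the adapter
    # (the adapter runs off the end of the read).
    max_len = min(len(adapter) - 1, len(read))
    for k in range(max_len, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[: len(read) - k]
    return read


def trim_low_quality(
    read: str, quals: List[int], threshold: int = 20
) -> Tuple[str, List[int]]:
    """Drop trailing bases whose Phred quality is below the threshold."""
    if len(read) != len(quals):
        raise ValueError("read and quality list must have equal length")
    end = len(read)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return read[:end], quals[:end]
```

The `min_overlap` guard reflects the sensitivity versus specificity trade-off the abstract mentions: a very short adapter prefix at the end of a read often matches by chance, so requiring a few bases of overlap reduces over-trimming at the cost of missing some short adapter remnants.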
Mumps virus F gene and HN gene sequencing as a molecular tool to study mumps virus transmission.

PubMed

Gouma, Sigrid; Cremer, Jeroen; Parkkali, Saara; Veldhuijzen, Irene; van Binnendijk, Rob S; Koopmans, Marion P G

2016-11-01

Various mumps outbreaks have occurred in the Netherlands since 2004, particularly among persons who had received 2 doses of measles, mumps, and rubella (MMR) vaccination. Genomic typing of pathogens can be used to track outbreaks, but the established genotyping of mumps virus based on the small hydrophobic (SH) gene sequences did not provide sufficient resolution. Therefore, we expanded the sequencing to include fusion (F) gene and haemagglutinin-neuraminidase (HN) gene sequences in addition to the SH gene sequences from 109 mumps virus genotype G strains obtained between 2004 and mid 2015 in the Netherlands. When the molecular information from these 3 genes was combined, we were able to identify separate mumps virus clusters and track mumps virus transmission. The analyses suggested that multiple mumps virus introductions occurred in the Netherlands between 2004 and 2015 resulting in several mumps outbreaks throughout this period, whereas during some local outbreaks the molecular data pointed towards endemic circulation. Combined analysis of epidemiological data and sequence data collected in 2015 showed good support for the phylogenetic clustering. Copyright © 2016 Elsevier B.V. All rights reserved.
Identification of fungi in shotgun metagenomics datasets

PubMed Central

Donovan, Paul D.; Gonzalez, Gabriel; Higgins, Desmond G.

2018-01-01

Metagenomics uses nucleic acid sequencing to characterize species diversity in different niches such as environmental biomes or the human microbiome. Most studies have used 16S rRNA amplicon sequencing to identify bacteria. However, the decreasing cost of sequencing has resulted in a gradual shift away from amplicon analyses and towards shotgun metagenomic sequencing. Shotgun metagenomic data can be used to identify a wide range of species, but have rarely been applied to fungal identification. Here, we develop a sequence classification pipeline, FindFungi, and use it to identify fungal sequences in public metagenome datasets. We focus primarily on animal metagenomes, especially those from pig and mouse microbiomes. We identified fungi in 39 of 70 datasets comprising 71 fungal species. At least 11 pathogenic species with zoonotic potential were identified, including Candida tropicalis. We identified Pseudogymnoascus species from 13 Antarctic soil samples initially analyzed for the presence of bacteria capable of degrading diesel oil. We also show that Candida tropicalis and Candida loboi are likely the same species. In addition, we identify several examples where contaminating DNA was erroneously included in fungal genome assemblies. PMID:29444186
Atropos: specific, sensitive, and speedy trimming of sequencing reads

PubMed Central

Collins, Francis S.

2017-01-01

A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves significant increases in trimming accuracy while remaining competitive in execution times. Furthermore, Atropos maintains high accuracy even when trimming data with elevated rates of sequencing errors. The accuracy, high performance, and broad feature set offered by Atropos makes it an appropriate choice for the pre-processing of Illumina, ABI SOLiD, and other current-generation short-read sequencing datasets. Atropos is open source and free software written in Python (3.3+) and available at https://github.com/jdidion/atropos. PMID:28875074
A comprehensive analysis of three Asiatic black bear mitochondrial genomes (subspecies ussuricus, formosanus and mupinensis), with emphasis on the complete mtDNA sequence of Ursus thibetanus ussuricus (Ursidae).

PubMed

Hwang, Dae-Sik; Ki, Jang-Seu; Jeong, Dong-Hyuk; Kim, Bo-Hyun; Lee, Bae-Keun; Han, Sang-Hoon; Lee, Jae-Seong

2008-08-01

In the present paper, we describe the mitochondrial genome sequence of the Asiatic black bear (Ursus thibetanus ussuricus) with particular emphasis on the control region (CR), and compared with mitochondrial genomes on molecular relationships among the bears. The mitochondrial genome sequence of U. thibetanus ussuricus was 16,700 bp in size with mostly conserved structures (e.g. 13 protein-coding, two rRNA genes, 22 tRNA genes). The CR consisted of several typical conserved domains such as F, E, D, and C boxes, and a conserved sequence block. Nucleotide sequences and the repeated motifs in the CR were different among the bear species, and their copy numbers were also variable according to populations, even within F1 generations of U. thibetanus ussuricus. Comparative analyses showed that the CR D1 region was highly informative for the discrimination of the bear family. These findings suggest that nucleotide sequences of both repeated motifs and CR D1 in the bear family are good markers for species discriminations.
Isolation and characterization of 5S rDNA sequences in catfishes genome (Heptapteridae and Pseudopimelodidae): perspectives for rDNA studies in fish by C0t method.

PubMed

Gouveia, Juceli Gonzalez; Wolf, Ivan Rodrigo; de Moraes-Manécolo, Vivian Patrícia Oliveira; Bardella, Vanessa Belline; Ferracin, Lara Munique; Giuliano-Caetano, Lucia; da Rosa, Renata; Dias, Ana Lúcia

2016-12-01

Sequences of 5S ribosomal RNA (rRNA) are extensively used in fish cytogenomic studies, since they have a flexible organization at the chromosomal level, showing inter- and intra-specific variation in number and position in karyotypes. Sequences from the genome of Imparfinis schubarti (Heptapteridae) were isolated, aiming to understand the organization of 5S rDNA families in the fish genome. The isolation of 5S rDNA from the genome of I. schubarti was carried out by reassociation kinetics (C0t) and PCR amplification. The obtained sequences were cloned for the construction of a micro-library. The obtained clones were sequenced and hybridized in I. schubarti and Microglanis cottoides (Pseudopimelodidae) for chromosome mapping. An analysis of the sequence alignments with other fish groups was accomplished. Both methods were effective when using 5S rDNA for hybridization in I. schubarti genome. However, the C0t method enabled the use of a complete 5S rRNA gene, which was also successful in the hybridization of M. cottoides. Nevertheless, this gene was obtained only partially by PCR. The hybridization results and sequence analyses showed that intact 5S regions are more appropriate for the probe operation, due to conserved structure and motifs. This study contributes to a better understanding of the organization of multigene families in catfish's genomes.
Magnetostratigraphy of the impact breccias and post-impact carbonates from borehole Yaxcopoil-1, Chicxulub impact crater, Yucatán, Mexico

NASA Astrophysics Data System (ADS)

Rebolledo-Vieyra, Mario; Urrutia-Fucugauchi, Jaime

2004-06-01

We report the magnetostratigraphy of the sedimentary sequence between the impact breccias and the post-impact carbonate sequence conducted on samples recovered by Yaxcopoil-1 (Yax-1). Samples of impact breccias show reverse polarities that span up to ~56 cm into the postimpact carbonate lithologies. We correlate these breccias to those of PEMEX boreholes Yucatán-6 and Chicxulub-1, from which we tied our magnetostratigraphy to the radiometric age from a melt sample from the Yucatán-6 borehole. Thin section analyses of the carbonate samples showed a significant amount of dark minerals and glass shards that we identified as the magnetic carriers; therefore, we propose that the mechanism of magnetic acquisition within the carbonate rocks for the interval studied is detrital remanent magnetism (DRM). With these samples, we constructed the scale of geomagnetic polarities where we find two polarities within the sequence, a reverse polarity event within the impact breccias and the base of the post-impact carbonate sequence (up to 794.07 m), and a normal polarity event in the last ~20 cm of the interval studied. The polarities recorded in the sequence analyzed are interpreted to span from chron 29r to 29n, and we propose that the reverse polarity event lies within the 29r chron. The magnetostratigraphy of the sequence studied shows that the horizon at 794.11 m deep, interpreted as the K/T boundary, lies within the geomagnetic chron 29r, which contains the K/T boundary.
Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions.

PubMed

Nishizawa, M; Nishizawa, K

2000-10-01

The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D. melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the 'between gene' GC content heterogeneity, which is linked to 'isochores', is a principal factor associated with the bias in substitution patterns in human, 'within gene' heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed.
Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions

PubMed Central

Nishizawa, Manami; Nishizawa, Kazuhisa

2000-01-01

The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D. melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the ‘between gene’ GC content heterogeneity, which is linked to ‘isochores’, is a principal factor associated with the bias in substitution patterns in human, ‘within gene’ heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed. PMID:11000273
Combined molecular and morphological phylogenetic analyses of the New Zealand wolf spider genus Anoteropsis (Araneae: Lycosidae).

PubMed

Vink, Cor J; Paterson, Adrian M

2003-09-01

Datasets from the mitochondrial gene regions NADH dehydrogenase subunit I (ND1) and cytochrome c oxidase subunit I (COI) of the 20 species in the New Zealand wolf spider (Lycosidae) genus Anoteropsis were generated. Sequence data were phylogenetically analysed using parsimony and maximum likelihood analyses. The phylogenies generated from the ND1 and COI sequence data and a previously generated morphological dataset were significantly congruent (p<0.001). Sequence data were combined with morphological data and phylogenetically analysed using parsimony. The ND1 region sequenced included part of tRNA(Leu(CUN)), which appears to have an unstable amino-acyl arm and no TpsiC arm in lycosids. Analyses supported the existence of five species groups within Anoteropsis and the monophyly of species represented by multiple samples. A radiation of Anoteropsis species within the last five million years is inferred from the ND1 and COI likelihood phylograms, habitat and geological data, which also indicates that Anoteropsis arrived in New Zealand some time after it separated from Gondwana.
Next-Generation Sequencing of Aquatic Oligochaetes: Comparison of Experimental Communities

PubMed Central

Vivien, Régis; Lejzerowicz, Franck; Pawlowski, Jan

2016-01-01

Aquatic oligochaetes are a common group of freshwater benthic invertebrates known to be very sensitive to environmental changes and currently used as bioindicators in some countries. However, more extensive application of oligochaetes for assessing the ecological quality of sediments in watercourses and lakes would require overcoming the difficulties related to morphology-based identification of oligochaete species. This study tested the Next-Generation Sequencing (NGS) of a standard cytochrome c oxidase I (COI) barcode as a tool for the rapid assessment of oligochaete diversity in environmental samples, based on mixed specimen samples. To know the composition of each sample we Sanger sequenced every specimen present in these samples. Our study showed that a large majority of OTUs (Operational Taxonomic Unit) could be detected by NGS analyses. We also observed congruence between the NGS and specimen abundance data for several but not all OTUs. Because the differences in sequence abundance data were consistent across samples, we exploited these variations to empirically design correction factors. We showed that such factors increased the congruence between the values of oligochaetes-based indices inferred from the NGS and the Sanger-sequenced specimen data. The validation of these correction factors by further experimental studies will be needed for the adaptation and use of NGS technology in biomonitoring studies based on oligochaete communities. PMID:26866802
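The abstract does not describe how the correction factors were derived, so the following is only one plausible sketch of the idea. It assumes a per-OTU factor equal to the mean ratio of specimen proportion to read proportion across calibration samples, which is then applied to the read counts of a sample and renormalized. All names and the data layout (one dict of counts per sample, keyed by OTU) are hypothetical.

```python
from typing import Dict, List


def proportions(counts: Dict[str, float]) -> Dict[str, float]:
    """Convert raw counts to proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must not sum to zero")
    return {key: value / total for key, value in counts.items()}


def estimate_correction_factors(
    read_counts: List[Dict[str, float]],
    specimen_counts: List[Dict[str, float]],
) -> Dict[str, float]:
    """Per-OTU factor: mean of (specimen proportion / read proportion).

    Assumption: each list holds one dict per calibration sample,
    in matching order.
    """
    ratios: Dict[str, List[float]] = {}
    for reads, specimens in zip(read_counts, specimen_counts):
        read_prop = proportions(reads)
        spec_prop = proportions(specimens)
        for otu, p_read in read_prop.items():
            if p_read > 0 and otu in spec_prop:
                ratios.setdefault(otu, []).append(spec_prop[otu] / p_read)
    return {otu: sum(vals) / len(vals) for otu, vals in ratios.items()}


def apply_correction(
    reads: Dict[str, float], factors: Dict[str, float]
) -> Dict[str, float]:
    """Weight read counts by their factors and renormalize to proportions."""
    weighted = {otu: n * factors.get(otu, 1.0) for otu, n in reads.items()}
    return proportions(weighted)
```

A factor above 1 marks an OTU that is under-represented in the reads relative to specimen counts, and a factor below 1 marks an over-represented one. OTUs without a calibrated factor are left unweighted here, which is a design choice rather than anything stated in the source.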
Utility of Whole-Genome Sequencing of Escherichia coli O157 for Outbreak Detection and Epidemiological Surveillance.

PubMed

Holmes, Anne; Allison, Lesley; Ward, Melissa; Dallman, Timothy J; Clark, Richard; Fawkes, Angie; Murphy, Lee; Hanson, Mary

2015-11-01

Detailed laboratory characterization of Escherichia coli O157 is essential to inform epidemiological investigations. This study assessed the utility of whole-genome sequencing (WGS) for outbreak detection and epidemiological surveillance of E. coli O157, and the data were used to identify discernible associations between genotypes and clinical outcomes. One hundred five E. coli O157 strains isolated over a 5-year period from human fecal samples in Lothian, Scotland, were sequenced with the Ion Torrent Personal Genome Machine. A total of 8,721 variable sites in the core genome were identified among the 105 isolates; 47% of the single nucleotide polymorphisms (SNPs) were attributable to six "atypical" E. coli O157 strains and included recombinant regions. Phylogenetic analyses showed that WGS correlated well with the epidemiological data. Epidemiological links existed between cases whose isolates differed by three or fewer SNPs. WGS also correlated well with multilocus variable-number tandem repeat analysis (MLVA) typing data, with only three discordant results observed, all among isolates from cases not known to be epidemiologically related. WGS produced a better-supported, higher-resolution phylogeny than MLVA, confirming that the method is more suitable for epidemiological surveillance of E. coli O157. A combination of in silico analyses (VirulenceFinder, ResFinder, and local BLAST searches) were used to determine stx subtypes, multilocus sequence types (15 loci), and the presence of virulence and acquired antimicrobial resistance genes. There was a high level of correlation between the WGS data and our routine typing methods, although some discordant results were observed, mostly related to the limitation of short sequence read assembly. The data were used to identify sublineages and clades of E. coli O157, and when they were correlated with the clinical outcome data, they showed that one clade, Ic3, was significantly associated with severe disease. 
Together, the results show that WGS data can provide higher resolution of the relationships between E. coli O157 isolates than that provided by MLVA. The method has the potential to streamline the laboratory workflow and provide detailed information for the clinical management of patients and public health interventions. Copyright © 2015, Holmes et al.
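The observation that epidemiologically linked cases differed by three or fewer SNPs suggests a simple distance-threshold grouping, sketched below. This is an illustration and not the study's pipeline: it assumes a gap-free core-genome alignment held as equal-length strings, counts only positions where both isolates have an unambiguous base, and groups isolates by single linkage. The isolate names and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count differing positions where both bases are unambiguous (A, C, G, T)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in "ACGT" and b in "ACGT"
    )


def link_isolates(alignment: Dict[str, str], max_snps: int = 3) -> List[List[str]]:
    """Single-linkage clusters of isolates within max_snps of each other."""
    names = sorted(alignment)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if snp_distance(alignment[a], alignment[b]) <= max_snps:
            parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())
```

Single linkage can chain isolates together, so two members of one cluster may differ by more than `max_snps`. Whether that is acceptable depends on the surveillance question, and any cluster would still need confirmation against epidemiological data, as the study did.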

Utility of Whole-Genome Sequencing of Escherichia coli O157 for Outbreak Detection and Epidemiological Surveillance

PubMed Central

Allison, Lesley; Ward, Melissa; Dallman, Timothy J.; Clark, Richard; Fawkes, Angie; Murphy, Lee; Hanson, Mary

2015-01-01

Detailed laboratory characterization of Escherichia coli O157 is essential to inform epidemiological investigations. This study assessed the utility of whole-genome sequencing (WGS) for outbreak detection and epidemiological surveillance of E. coli O157, and the data were used to identify discernible associations between genotypes and clinical outcomes. One hundred five E. coli O157 strains isolated over a 5-year period from human fecal samples in Lothian, Scotland, were sequenced with the Ion Torrent Personal Genome Machine. A total of 8,721 variable sites in the core genome were identified among the 105 isolates; 47% of the single nucleotide polymorphisms (SNPs) were attributable to six “atypical” E. coli O157 strains and included recombinant regions. Phylogenetic analyses showed that WGS correlated well with the epidemiological data. Epidemiological links existed between cases whose isolates differed by three or fewer SNPs. WGS also correlated well with multilocus variable-number tandem repeat analysis (MLVA) typing data, with only three discordant results observed, all among isolates from cases not known to be epidemiologically related. WGS produced a better-supported, higher-resolution phylogeny than MLVA, confirming that the method is more suitable for epidemiological surveillance of E. coli O157. A combination of in silico analyses (VirulenceFinder, ResFinder, and local BLAST searches) were used to determine stx subtypes, multilocus sequence types (15 loci), and the presence of virulence and acquired antimicrobial resistance genes. There was a high level of correlation between the WGS data and our routine typing methods, although some discordant results were observed, mostly related to the limitation of short sequence read assembly. The data were used to identify sublineages and clades of E. coli O157, and when they were correlated with the clinical outcome data, they showed that one clade, Ic3, was significantly associated with severe disease. 
Together, the results show that WGS data can provide higher resolution of the relationships between E. coli O157 isolates than that provided by MLVA. The method has the potential to streamline the laboratory workflow and provide detailed information for the clinical management of patients and public health interventions. PMID:26354815
Association of the gut microbiota mobilome with hospital location and birth weight in preterm infants.

PubMed

Ravi, Anuradha; Estensmo, Eva Lena F; Abée-Lund, Trine M L'; Foley, Steven L; Allgaier, Bernhard; Martin, Camilia R; Claud, Erika C; Rudi, Knut

2017-11-01

Background: The preterm infant gut microbiota is vulnerable to different biotic and abiotic factors. Although the development of this microbiota has been extensively studied, the mobilome, i.e. the mobile genetic elements (MGEs) in the gut microbiota, has not been considered. Therefore, the aim of this study was to investigate the association of the mobilome with birth weight and hospital location in the preterm infant gut microbiota. Methods: The data set consists of fecal samples from 62 preterm infants with and without necrotizing enterocolitis (NEC) from three different hospitals. We analyzed the gut microbiome by using 16S rRNA amplicon sequencing, shot-gun metagenome sequencing, and quantitative PCR. Predictive models and other data analyses were performed using MATLAB and QIIME. Results: The microbiota composition was significantly different between NEC-positive and NEC-negative infants and significantly different between hospitals. An operational taxonomic unit (OTU) showed strong positive and negative correlation with NEC and birth weight, respectively, whereas none showed significance for mode of delivery. Metagenome analyses revealed high levels of conjugative plasmids with MGEs and virulence genes. Results from quantitative PCR showed that the plasmid signature genes were significantly different between hospitals and in NEC-positive infants. Conclusion: Our results point toward an association of the mobilome with hospital location in preterm infants.
Phylogeny, diet, and habitat of an extinct ground sloth from Cuchillo Curá, Neuquén Province, southwest Argentina

USGS Publications Warehouse

Hofreiter, Michael; Betancourt, Julio L.; Sbriller, Alicia Pelliza; Markgraf, Vera; McDonald, H. Gregory

2003-01-01

Advancements in ancient DNA analyses now permit comparative molecular and morphological studies of extinct animal dung commonly preserved in caves of semiarid regions. These new techniques are showcased using a unique dung deposit preserved in a late glacial vizcacha (Lagidium sp.) midden from a limestone cave in southwestern Argentina (38.5° S). Phylogenetic analyses of the mitochondrial DNA show that the dung originated from a small ground sloth species not yet represented by skeletal material in the region, and not closely related to any of the four previously sequenced extinct and extant sloth species. Analyses of pollen and plant cuticles, as well as analyses of the chloroplast DNA, show that the Cuchillo Curá ground sloth browsed on many of the same herb, grass, and shrub genera common at the site today, and that its habitat was treeless Patagonian scrub-steppe. We envision a day when molecular analyses are used routinely to supplement morphological identifications and possibly to provide a time-lapse view of molecular diversification.
Phylogeny, diet, and habitat of an extinct ground sloth from Cuchillo Curá, Neuquén Province, southwest Argentina

NASA Astrophysics Data System (ADS)

Hofreiter, Michael; Betancourt, Julio L.; Sbriller, Alicia Pelliza; Markgraf, Vera; McDonald, H. Gregory

2003-05-01

Advancements in ancient DNA analyses now permit comparative molecular and morphological studies of extinct animal dung commonly preserved in caves of semiarid regions. These new techniques are showcased using a unique dung deposit preserved in a late glacial vizcacha (Lagidium sp.) midden from a limestone cave in southwestern Argentina (38.5° S). Phylogenetic analyses of the mitochondrial DNA show that the dung originated from a small ground sloth species not yet represented by skeletal material in the region, and not closely related to any of the four previously sequenced extinct and extant sloth species. Analyses of pollen and plant cuticles, as well as analyses of the chloroplast DNA, show that the Cuchillo Curá ground sloth browsed on many of the same herb, grass, and shrub genera common at the site today, and that its habitat was treeless Patagonian scrub-steppe. We envision a day when molecular analyses are used routinely to supplement morphological identifications and possibly to provide a time-lapse view of molecular diversification.
A Comprehensive Genetic Study of Streptococcal Immunoglobulin A1 Proteases: Evidence for Recombination within and between Species

PubMed Central

Poulsen, Knud; Reinholdt, Jesper; Jespersgaard, Christina; Boye, Kit; Brown, Thomas A.; Hauge, Majbritt; Kilian, Mogens

1998-01-01

An analysis of 13 immunoglobulin A1 (IgA1) protease genes (iga) of strains of Streptococcus pneumoniae, Streptococcus oralis, Streptococcus mitis, and Streptococcus sanguis was carried out to obtain information on the structure, polymorphism, and phylogeny of this specific protease, which enables bacteria to evade functions of the predominant Ig isotype on mucosal surfaces. The analysis included cloning and sequencing of iga genes from S. oralis and S. mitis biovar 1, sequencing of an additional seven iga genes from S. sanguis biovars 1 through 4, and restriction fragment length polymorphism (RFLP) analyses of iga genes of another 10 strains of S. mitis biovar 1 and 6 strains of S. oralis. All 13 genes sequenced had the potential of encoding proteins with molecular masses of approximately 200 kDa containing the sequence motif HEMTH and an E residue 20 amino acids downstream, which are characteristic of Zn metalloproteinases. In addition, all had a typical gram-positive cell wall anchor motif, LPNTG, which, in contrast to such motifs in other known streptococcal and staphylococcal proteins, was located in their N-terminal parts. Repeat structures showing variation in number and sequence were present in all strains and may be of relevance to the immunogenicities of the enzymes. Protease activities in cultures of the streptococcal strains were associated with species of different molecular masses ranging from 130 to 200 kDa, suggesting posttranslational processing possibly as a result of autoproteolysis at post-proline peptide bonds in the N-terminal parts of the molecules. Comparison of deduced amino acid sequences revealed a 94% similarity between S. oralis and S. mitis IgA1 proteases and a 75 to 79% similarity between IgA1 proteases of these species and those of S. pneumoniae and S. sanguis, respectively. 
Combined with the results of RFLP analyses using different iga gene fragments as probes, the results of nucleotide sequence comparisons provide evidence of horizontal transfer of iga gene sequences among individual strains of S. sanguis as well as among S. mitis and the two species S. pneumoniae and S. oralis. While iga genes of S. sanguis and S. oralis were highly homogeneous, the genes of S. pneumoniae and S. mitis showed extensive polymorphism reflected in different degrees of antigenic diversity. PMID:9423856
Bacterial diversity in typical Italian salami at different ripening stages as revealed by high-throughput sequencing of 16S rRNA amplicons.

PubMed

Połka, Justyna; Rebecchi, Annalisa; Pisacane, Vincenza; Morelli, Lorenzo; Puglisi, Edoardo

2015-04-01

The bacterial diversity involved in food fermentations is one of the most important factors shaping the final characteristics of traditional foods. Knowledge about this diversity can be greatly improved by the application of high-throughput sequencing technologies (HTS) coupled to the PCR amplification of the 16S rRNA subunit. Here we investigated the bacterial diversity in batches of Salame Piacentino PDO (Protected Designation of Origin), a dry fermented sausage that is typical of a regional area of Northern Italy. Salami samples from 6 different local factories were analysed at 0, 21, 49 and 63 days of ripening; raw meat at time 0 and casing samples at 21 days of ripening were also analysed, and the effect of starter addition was included in the experimental set-up. Culture-based microbiological analyses and PCR-DGGE were carried out in order to be compared with HTS results. A total of 722,196 high quality sequences were obtained after trimming, paired-reads assembly and quality screening of raw reads obtained by Illumina MiSeq sequencing of the two bacterial 16S hypervariable regions V3 and V4; manual curation of 16S database allowed a correct taxonomical classification at the species level for 99.5% of these reads. Results confirmed the presence of main bacterial species involved in the fermentation of salami as assessed by PCR-DGGE, but with a greater extent of resolution and quantitative assessments that are not possible by the mere analyses of gel banding patterns. Thirty-two different Staphylococcus and 33 Lactobacillus species were identified in the salami from different producers, while the whole data set obtained accounted for 13 main families and 98 rare ones, 23 of which were present in at least 10% of the investigated samples, with casings being the major sources of the observed diversity. Multivariate analyses also showed that batches from 6 local producers tend to cluster together after 21 days of ripening, thus indicating that HTS has the potential for fine scale differentiation of local fermented foods. Copyright © 2014 Elsevier Ltd. All rights reserved.
Complete Genome Sequence of Germline Chromosomally Integrated Human Herpesvirus 6A and Analyses Integration Sites Define a New Human Endogenous Virus with Potential to Reactivate as an Emerging Infection.

PubMed

Tweedy, Joshua; Spyrou, Maria Alexandra; Pearson, Max; Lassner, Dirk; Kuhl, Uwe; Gompels, Ursula A

2016-01-15

Human herpesvirus-6A and B (HHV-6A, HHV-6B) have recently defined endogenous genomes, resulting from integration into the germline: chromosomally-integrated "CiHHV-6A/B". These affect approximately 1.0% of human populations, giving potential for virus gene expression in every cell. We previously showed that CiHHV-6A was more divergent than CiHHV-6B by examining four genes in 44 European CiHHV-6A/B cardiac/haematology patients. There was evidence for gene expression/reactivation, implying functional non-defective genomes. To further define the relationship between HHV-6A and CiHHV-6A we used next-generation sequencing to characterize genomes from three CiHHV-6A cardiac patients. Comparisons to known exogenous HHV-6A showed CiHHV-6A genomes formed a separate clade; including all 85 non-interrupted genes and necessary cis-acting signals for reactivation as infectious virus. Greater single nucleotide polymorphism (SNP) density was defined in 16 genes and the direct repeats (DR) terminal regions. Using these SNPs, deep sequencing analyses demonstrated superinfection with exogenous HHV-6A in two of the CiHHV-6A patients with recurrent cardiac disease. Characterisation of the integration sites in twelve patients identified the human chromosome 17p subtelomere as a prevalent site, which had specific repeat structures and phylogenetically related CiHHV-6A coding sequences indicating common ancestral origins. Overall CiHHV-6A genomes were similar, but distinct from known exogenous HHV-6A virus, and have the capacity to reactivate as emerging virus infections.
Tripping over emerging pathogens around the world: a phylogeographical approach for determining the epidemiology of Porcine circovirus-2 (PCV-2), considering global trading.

PubMed

Vidigal, Pedro M P; Mafra, Claudio L; Silva, Fernanda M F; Fietto, Juliana L R; Silva Júnior, Abelardo; Almeida, Márcia R

2012-01-01

Porcine circovirus-2 (PCV-2) is an emerging virus associated with a number of different syndromes in pigs known as Porcine Circovirus Associated Diseases (PCVAD). Since its identification and characterization in the early 1990s, PCV-2 has achieved a worldwide distribution, becoming endemic in most pig-producing countries, and is currently considered as the main cause of losses on pig farms. In this study, we analyzed the main routes of the spread of PCV-2 between pig-producing countries using phylogenetic and phylogeographical approaches. A search for PCV-2 genome sequences in GenBank was performed, and the 420 PCV-2 sequences obtained were grouped into haplotypes (group of sequences that showed 100% identity), based on the infinite sites model of genome evolution. A phylogenetic hypothesis was inferred by Bayesian Inference for the classification of viral strains and a haplotype network was constructed by Median Joining to predict the geographical distribution of and genealogical relationships between haplotypes. In order to establish an epidemiological and economic context in these analyses, we considered all information about PCV-2 sequences available in GenBank, including papers published on viral isolation, and live pig trading statistics available on the UN Comtrade database (http://comtrade.un.org/). In these analyses, we identified a strong correlation between the means of PCV-2 dispersal predicted by the haplotype network and the statistics on the international trading of live pigs. This correlation provides a new perspective on the epidemiology of PCV-2, highlighting the importance of the movement of animals around the world in the emergence of new pathogens, and showing the need for effective sanitary barriers when trading live animals. Copyright © 2011 Elsevier B.V. All rights reserved.
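The haplotype grouping step described in this abstract (sequences sharing 100% identity are collapsed into one haplotype) can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned and of equal length, and it treats "100% identity" as an exact string match after upper-casing. The function name `group_haplotypes` and the toy sequences are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def group_haplotypes(sequences):
    """Group sequence IDs whose (upper-cased) sequences are exactly identical.

    `sequences` maps a sequence ID to its nucleotide string. This is an
    illustrative assumption about the input, not the authors' pipeline.
    """
    groups = defaultdict(list)
    for seq_id, seq in sequences.items():
        groups[seq.upper()].append(seq_id)
    # Largest haplotypes first; ties broken by the first ID for determinism.
    return sorted(groups.values(), key=lambda ids: (-len(ids), ids[0]))


if __name__ == "__main__":
    # Hypothetical toy data, not real PCV-2 sequences.
    toy = {"s1": "ACGTAC", "s2": "acgtac", "s3": "ACGTAA", "s4": "ACGTAC"}
    for number, members in enumerate(group_haplotypes(toy), start=1):
        print(f"haplotype {number}: {members}")
```

Exact matching is the simplest reading of "100% identity"; real data would typically also need a decision on how to treat gaps and ambiguous bases.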
Spatiotemporal Phylogenetic Analysis and Molecular Characterisation of Infectious Bursal Disease Viruses Based on the VP2 Hyper-Variable Region

PubMed Central

Dolz, Roser; Valle, Rosa; Perera, Carmen L.; Bertran, Kateri; Frías, Maria T.; Majó, Natàlia; Ganges, Llilianne; Pérez, Lester J.

2013-01-01

Background Infectious bursal disease is a highly contagious and acute viral disease caused by the infectious bursal disease virus (IBDV); it affects all major poultry producing areas of the world. The current study was designed to rigorously measure the global phylogeographic dynamics of IBDV strains to gain insight into viral population expansion as well as the emergence, spread and pattern of the geographical structure of very virulent IBDV (vvIBDV) strains. Methodology/Principal Findings Sequences of the hyper-variable region of the VP2 (HVR-VP2) gene from IBDV strains isolated from diverse geographic locations were obtained from the GenBank database; Cuban sequences were obtained in the current work. All sequences were analysed by Bayesian phylogeographic analysis, implemented in the Bayesian Evolutionary Analysis Sampling Trees (BEAST), Bayesian Tip-association Significance testing (BaTS) and Spatial Phylogenetic Reconstruction of Evolutionary Dynamics (SPREAD) software packages. Selection pressure on the HVR-VP2 was also assessed. The phylogeographic association-trait analysis showed that viruses sampled from individual countries tend to cluster together, suggesting a geographic pattern for IBDV strains. Spatial analysis from this study revealed that strains carrying sequences that were linked to increased virulence of IBDV appeared in Iran in 1981 and spread to Western Europe (Belgium) in 1987, Africa (Egypt) around 1990, East Asia (China and Japan) in 1993, the Caribbean Region (Cuba) by 1995 and South America (Brazil) around 2000. Selection pressure analysis showed that several codons in the HVR-VP2 region were under purifying selection. Conclusions/Significance To our knowledge, this work is the first study applying the Bayesian phylogeographic reconstruction approach to analyse the emergence and spread of vvIBDV strains worldwide. PMID:23805195
Genomic Epidemiology of Clostridium botulinum Isolates from Temporally Related Cases of Infant Botulism in New South Wales, Australia

PubMed Central

Gray, Timothy J.; Wang, Qinning; Ng, Jimmy; Hicks, Leanne; Nguyen, Trang; Yuen, Marion; Hill-Cawthorne, Grant A.; Sintchenko, Vitali

2015-01-01

Infant botulism is a potentially life-threatening paralytic disease that can be associated with prolonged morbidity if not rapidly diagnosed and treated. Four infants were diagnosed and treated for infant botulism in NSW, Australia, between May 2011 and August 2013. Despite the temporal relationship between the cases, there was no close geographical clustering or other epidemiological links. Clostridium botulinum isolates, three of which produced botulism neurotoxin serotype A (BoNT/A) and one BoNT serotype B (BoNT/B), were characterized using whole-genome sequencing (WGS). In silico multilocus sequence typing (MLST) found that two of the BoNT/A-producing isolates shared an identical novel sequence type, ST84. The other two isolates were single-locus variants of this sequence type (ST85 and ST86). All BoNT/A-producing isolates contained the same chromosomally integrated BoNT/A2 neurotoxin gene cluster. The BoNT/B-producing isolate carried a single plasmid-borne bont/B gene cluster, encoding BoNT subtype B6. Single nucleotide polymorphism (SNP)-based typing results corresponded well with MLST; however, the extra resolution provided by the whole-genome SNP comparisons showed that the isolates differed from each other by >3,500 SNPs. WGS analyses indicated that the four infant botulism cases were caused by genomically distinct strains of C. botulinum that were unlikely to have originated from a common environmental source. The isolates did, however, cluster together, compared with international isolates, suggesting that C. botulinum from environmental reservoirs throughout NSW have descended from a common ancestor. Analyses showed that the high resolution of WGS provided important phylogenetic information that would not be captured by standard seven-loci MLST. PMID:26109442
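The two typing resolutions contrasted in this abstract can be sketched in a few lines of Python: a single-locus variant is an MLST profile that differs from another at exactly one locus, while a SNP distance counts differing positions between two aligned genomes. The function names and the allele numbers below are hypothetical placeholders (they are not the real ST84, ST85 or ST86 profiles), and the SNP count assumes pre-aligned sequences of equal length, ignoring positions with ambiguous bases.

```python
VALID_BASES = set("ACGT")


def locus_differences(profile_a, profile_b):
    """Count loci at which two MLST allele profiles differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def is_single_locus_variant(profile_a, profile_b):
    """True when the profiles differ at exactly one locus."""
    return locus_differences(profile_a, profile_b) == 1


def snp_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a non-ACGT character are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


if __name__ == "__main__":
    # Hypothetical seven-locus profiles, for illustration only.
    reference = (1, 2, 3, 4, 5, 6, 7)
    variant = (1, 2, 3, 4, 5, 6, 8)
    print(is_single_locus_variant(reference, variant))
    print(snp_distance("ACGTN", "ACCTA"))
```

A seven-locus profile can be identical or nearly identical for isolates that are thousands of SNPs apart, which is the resolution gap the abstract reports.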
Genomic Epidemiology of Clostridium botulinum Isolates from Temporally Related Cases of Infant Botulism in New South Wales, Australia.

PubMed

McCallum, Nadine; Gray, Timothy J; Wang, Qinning; Ng, Jimmy; Hicks, Leanne; Nguyen, Trang; Yuen, Marion; Hill-Cawthorne, Grant A; Sintchenko, Vitali

2015-09-01

Infant botulism is a potentially life-threatening paralytic disease that can be associated with prolonged morbidity if not rapidly diagnosed and treated. Four infants were diagnosed and treated for infant botulism in NSW, Australia, between May 2011 and August 2013. Despite the temporal relationship between the cases, there was no close geographical clustering or other epidemiological links. Clostridium botulinum isolates, three of which produced botulism neurotoxin serotype A (BoNT/A) and one BoNT serotype B (BoNT/B), were characterized using whole-genome sequencing (WGS). In silico multilocus sequence typing (MLST) found that two of the BoNT/A-producing isolates shared an identical novel sequence type, ST84. The other two isolates were single-locus variants of this sequence type (ST85 and ST86). All BoNT/A-producing isolates contained the same chromosomally integrated BoNT/A2 neurotoxin gene cluster. The BoNT/B-producing isolate carried a single plasmid-borne bont/B gene cluster, encoding BoNT subtype B6. Single nucleotide polymorphism (SNP)-based typing results corresponded well with MLST; however, the extra resolution provided by the whole-genome SNP comparisons showed that the isolates differed from each other by >3,500 SNPs. WGS analyses indicated that the four infant botulism cases were caused by genomically distinct strains of C. botulinum that were unlikely to have originated from a common environmental source. The isolates did, however, cluster together, compared with international isolates, suggesting that C. botulinum from environmental reservoirs throughout NSW have descended from a common ancestor. Analyses showed that the high resolution of WGS provided important phylogenetic information that would not be captured by standard seven-loci MLST. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Spatiotemporal Phylogenetic Analysis and Molecular Characterisation of Infectious Bursal Disease Viruses Based on the VP2 Hyper-Variable Region.

PubMed

Alfonso-Morales, Abdulahi; Martínez-Pérez, Orlando; Dolz, Roser; Valle, Rosa; Perera, Carmen L; Bertran, Kateri; Frías, Maria T; Majó, Natàlia; Ganges, Llilianne; Pérez, Lester J

2013-01-01

Infectious bursal disease is a highly contagious and acute viral disease caused by the infectious bursal disease virus (IBDV); it affects all major poultry producing areas of the world. The current study was designed to rigorously measure the global phylogeographic dynamics of IBDV strains to gain insight into viral population expansion as well as the emergence, spread and pattern of the geographical structure of very virulent IBDV (vvIBDV) strains. Sequences of the hyper-variable region of the VP2 (HVR-VP2) gene from IBDV strains isolated from diverse geographic locations were obtained from the GenBank database; Cuban sequences were obtained in the current work. All sequences were analysed by Bayesian phylogeographic analysis, implemented in the Bayesian Evolutionary Analysis Sampling Trees (BEAST), Bayesian Tip-association Significance testing (BaTS) and Spatial Phylogenetic Reconstruction of Evolutionary Dynamics (SPREAD) software packages. Selection pressure on the HVR-VP2 was also assessed. The phylogeographic association-trait analysis showed that viruses sampled from individual countries tend to cluster together, suggesting a geographic pattern for IBDV strains. Spatial analysis from this study revealed that strains carrying sequences that were linked to increased virulence of IBDV appeared in Iran in 1981 and spread to Western Europe (Belgium) in 1987, Africa (Egypt) around 1990, East Asia (China and Japan) in 1993, the Caribbean Region (Cuba) by 1995 and South America (Brazil) around 2000. Selection pressure analysis showed that several codons in the HVR-VP2 region were under purifying selection. To our knowledge, this work is the first study applying the Bayesian phylogeographic reconstruction approach to analyse the emergence and spread of vvIBDV strains worldwide.
Complete Genome Sequence of Germline Chromosomally Integrated Human Herpesvirus 6A and Analyses Integration Sites Define a New Human Endogenous Virus with Potential to Reactivate as an Emerging Infection

PubMed Central

Tweedy, Joshua; Spyrou, Maria Alexandra; Pearson, Max; Lassner, Dirk; Kuhl, Uwe; Gompels, Ursula A.

2016-01-01

Human herpesvirus-6A and B (HHV-6A, HHV-6B) have recently defined endogenous genomes, resulting from integration into the germline: chromosomally-integrated “CiHHV-6A/B”. These affect approximately 1.0% of human populations, giving potential for virus gene expression in every cell. We previously showed that CiHHV-6A was more divergent than CiHHV-6B by examining four genes in 44 European CiHHV-6A/B cardiac/haematology patients. There was evidence for gene expression/reactivation, implying functional non-defective genomes. To further define the relationship between HHV-6A and CiHHV-6A we used next-generation sequencing to characterize genomes from three CiHHV-6A cardiac patients. Comparisons to known exogenous HHV-6A showed CiHHV-6A genomes formed a separate clade; including all 85 non-interrupted genes and necessary cis-acting signals for reactivation as infectious virus. Greater single nucleotide polymorphism (SNP) density was defined in 16 genes and the direct repeats (DR) terminal regions. Using these SNPs, deep sequencing analyses demonstrated superinfection with exogenous HHV-6A in two of the CiHHV-6A patients with recurrent cardiac disease. Characterisation of the integration sites in twelve patients identified the human chromosome 17p subtelomere as a prevalent site, which had specific repeat structures and phylogenetically related CiHHV-6A coding sequences indicating common ancestral origins. Overall CiHHV-6A genomes were similar, but distinct from known exogenous HHV-6A virus, and have the capacity to reactivate as emerging virus infections. PMID:26784220
In-silico and in-vivo analyses of EST databases unveil conserved miRNAs from Carthamus tinctorius and Cynara cardunculus

PubMed Central

2012-01-01

Background MicroRNAs (miRNAs) are small RNAs (21-24 bp) providing an RNA-based system of gene regulation highly conserved in plants and animals. In plants, miRNAs control mRNA degradation or restrain translation, affecting development and responses to stresses. Plant miRNAs show imperfect but extensive complementarity to mRNA targets, making their computational prediction possible, useful when data mining is applied on different species. In this study we used a comparative approach to identify both miRNAs and their targets, in artichoke and safflower. Results Two complete expressed sequence tags (ESTs) datasets from artichoke (3.6·10^4 entries) and safflower (4.2·10^4), were analysed with a bioinformatic pipeline and in vitro experiments, identifying 17 potential miRNAs. For each EST, using RNAhybrid program and 953 non redundant miRNA mature sequences, available in mirBase as reference, we searched matching putative targets. 8730 out of 42011 ESTs from safflower and 7145 of 36323 ESTs from artichoke showed at least one predicted miRNA target. BLAST analysis showed that 75% of all ESTs shared at least a common homologous region (E-value < 10^-4) and about 50% of these displayed 400 bp or longer aligned sequences as conserved homologous/orthologous (COS) regions. 960 and 890 ESTs of safflower and artichoke organized in COS shared 79 different miRNA targets, considered functionally conserved, and statistically significant when compared with random sequences (signal to noise ratio > 2 and specificity ≥ 0.85). Four highly significant miRNAs selected from in silico data were experimentally validated in globe artichoke leaves. Conclusions Mature miRNAs and targets were predicted within EST sequences of safflower and artichoke. Most of the miRNA targets appeared highly/moderately conserved, highlighting an important and conserved function. 
In this study we introduce a stringent parameter for the comparative sequence analysis, represented by the identification of the same target in the COS region. After statistical analysis 79 targets, found on the COS regions and belonging to 60 miRNA families, have a signal to noise ratio > 2, with ≥ 0.85 specificity. The putative miRNAs identified belong to 55 dicotyledon plants and to 24 families only in monocotyledon. PMID:22536958
“Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes”

PubMed Central

Neafsey, Daniel E.; Waterhouse, Robert M.; Abai, Mohammad R.; Aganezov, Sergey S.; Alekseyev, Max A.; Allen, James E.; Amon, James; Arcà, Bruno; Arensburger, Peter; Artemov, Gleb; Assour, Lauren A.; Basseri, Hamidreza; Berlin, Aaron; Birren, Bruce W.; Blandin, Stephanie A.; Brockman, Andrew I.; Burkot, Thomas R.; Burt, Austin; Chan, Clara S.; Chauve, Cedric; Chiu, Joanna C.; Christensen, Mikkel; Costantini, Carlo; Davidson, Victoria L.M.; Deligianni, Elena; Dottorini, Tania; Dritsou, Vicky; Gabriel, Stacey B.; Guelbeogo, Wamdaogo M.; Hall, Andrew B.; Han, Mira V.; Hlaing, Thaung; Hughes, Daniel S.T.; Jenkins, Adam M.; Jiang, Xiaofang; Jungreis, Irwin; Kakani, Evdoxia G.; Kamali, Maryam; Kemppainen, Petri; Kennedy, Ryan C.; Kirmitzoglou, Ioannis K.; Koekemoer, Lizette L.; Laban, Njoroge; Langridge, Nicholas; Lawniczak, Mara K.N.; Lirakis, Manolis; Lobo, Neil F.; Lowy, Ernesto; MacCallum, Robert M.; Mao, Chunhong; Maslen, Gareth; Mbogo, Charles; McCarthy, Jenny; Michel, Kristin; Mitchell, Sara N.; Moore, Wendy; Murphy, Katherine A.; Naumenko, Anastasia N.; Nolan, Tony; Novoa, Eva M.; O'Loughlin, Samantha; Oringanje, Chioma; Oshaghi, Mohammad A.; Pakpour, Nazzy; Papathanos, Philippos A.; Peery, Ashley N.; Povelones, Michael; Prakash, Anil; Price, David P.; Rajaraman, Ashok; Reimer, Lisa J.; Rinker, David C.; Rokas, Antonis; Russell, Tanya L.; Sagnon, N'Fale; Sharakhova, Maria V.; Shea, Terrance; Simão, Felipe A.; Simard, Frederic; Slotman, Michel A.; Somboon, Pradya; Stegniy, Vladimir; Struchiner, Claudio J.; Thomas, Gregg W.C.; Tojo, Marta; Topalis, Pantelis; Tubio, José M.C.; Unger, Maria F.; Vontas, John; Walton, Catherine; Wilding, Craig S.; Willis, Judith H.; Wu, Yi-Chieh; Yan, Guiyun; Zdobnov, Evgeny M.; Zhou, Xiaofan; Catteruccia, Flaminia; Christophides, George K.; Collins, Frank H.; Cornman, Robert S.; Crisanti, Andrea; Donnelly, Martin J.; Emrich, Scott J.; Fontaine, Michael C.; Gelbart, William; Hahn, Matthew W.; Hansen, Immo A.; Howell, Paul I.; Kafatos, 
Fotis C.; Kellis, Manolis; Lawson, Daniel; Louis, Christos; Luckhart, Shirley; Muskavitch, Marc A.T.; Ribeiro, José M.; Riehle, Michael A.; Sharakhov, Igor V.; Tu, Zhijian; Zwiebel, Laurence J.; Besansky, Nora J.

2015-01-01

Variation in vectorial capacity for human malaria among Anopheles mosquito species is determined by many factors, including behavior, immunity, and life history. To investigate the genomic basis of vectorial capacity and explore new avenues for vector control, we sequenced the genomes of 16 anopheline mosquito species from diverse locations spanning ~100 million years of evolution. Comparative analyses show faster rates of gene gain and loss, elevated gene shuffling on the X chromosome, and more intron losses, relative to Drosophila. Some determinants of vectorial capacity, such as chemosensory genes, do not show elevated turnover, but instead diversify through protein-sequence changes. This dynamism of anopheline genes and genomes may contribute to their flexible capacity to take advantage of new ecological niches, including adapting to humans as primary hosts. PMID:25554792
Genetic divergence in populations of Lutzomyia ayacuchensis, a vector of Andean-type cutaneous leishmaniasis, in Ecuador and Peru.

PubMed

Kato, Hirotomo; Cáceres, Abraham G; Gomez, Eduardo A; Mimori, Tatsuyuki; Uezato, Hiroshi; Hashiguchi, Yoshihisa

2015-01-01

Haplotype and gene network analyses were performed on mitochondrial cytochrome oxidase I and cytochrome b gene sequences of Lutzomyia (Lu.) ayacuchensis populations from Andean areas of Ecuador and southern Peru where the sand fly species transmit Leishmania (Leishmania) mexicana and Leishmania (Viannia) peruviana, respectively, and populations from the northern Peruvian Andes, for which transmission of Leishmania by Lu. ayacuchensis has not been reported. The haplotype analyses showed higher intrapopulation genetic divergence in northern Peruvian Andes populations and less divergence in the southern Peru and Ecuador populations, suggesting that a population bottleneck occurred in the latter populations, but not in former ones. Importantly, both haplotype and phylogenetic analyses showed that populations from Ecuador consisted of clearly distinct clusters from southern Peru, and the two populations were separated from those of northern Peru. Crown Copyright © 2014. Published by Elsevier B.V. All rights reserved.
Analysis of a diverse assemblage of diazotrophic bacteria from Spartina alterniflora using DGGE and clone library screening.

PubMed

Lovell, Charles R; Decker, Peter V; Bagwell, Christopher E; Thompson, Shelly; Matsui, George Y

2008-05-01

Methods to assess the diversity of the diazotroph assemblage in the rhizosphere of the salt marsh cordgrass, Spartina alterniflora were examined. The effectiveness of nifH PCR-denaturing gradient gel electrophoresis (DGGE) was compared to that of nifH clone library analysis. Seventeen DGGE gel bands were sequenced and yielded 58 nonidentical nifH sequences from a total of 67 sequences determined. A clone library constructed using the GC-clamp nifH primers that were employed in the PCR-DGGE (designated the GC-Library) yielded 83 nonidentical sequences from a total of 257 nifH sequences. A second library constructed using an alternate set of nifH primers (N-Library) yielded 83 nonidentical sequences from a total of 138 nifH sequences. Rarefaction curves for the libraries did not reach saturation, although the GC-Library curve was substantially dampened and appeared to be closer to saturation than the N-Library curve. Phylogenetic analyses showed that DGGE gel band sequencing recovered nifH sequences that were frequently sampled in the GC-Library, as well as sequences that were infrequently sampled, and provided a species composition assessment that was robust, efficient, and relatively inexpensive to obtain. Further, the DGGE method permits a large number of samples to be examined for differences in banding patterns, after which bands of interest can be sampled for sequence determination.
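The rarefaction curves mentioned in this abstract can be illustrated with the standard analytical (hypergeometric) formula: for a library of N sequences in which type i occurs N_i times, the expected number of distinct types in a random subsample of n sequences is the sum over i of 1 - C(N - N_i, n) / C(N, n). The sketch below is a generic Python illustration of that formula, not the authors' procedure. The abstract reports only totals (for example 83 nonidentical sequences among 257), so the abundance counts used here are hypothetical.

```python
from math import comb


def expected_richness(counts, n):
    """Expected number of distinct sequence types in a subsample of size n.

    `counts` lists how many times each distinct type was observed.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must be between 0 and the library size")
    # comb(total - c, n) is 0 when n > total - c, i.e. the type is always drawn.
    return sum(1 - comb(total - c, n) / comb(total, n) for c in counts)


def rarefaction_curve(counts, step=1):
    """Return (n, expected richness) pairs for n = 0, step, 2*step, ..."""
    total = sum(counts)
    return [(n, expected_richness(counts, n)) for n in range(0, total + 1, step)]


if __name__ == "__main__":
    # Hypothetical abundance counts for a small clone library.
    toy_counts = [5, 3, 2, 1, 1]
    for n, richness in rarefaction_curve(toy_counts, step=3):
        print(n, round(richness, 2))
```

A curve that is still rising steeply at the full library size, as reported for the N-Library, indicates that further sampling would be expected to recover additional sequence types.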
Complementary DNA cloning and molecular evolution of opine dehydrogenases in some marine invertebrates.

PubMed

Kimura, Tomohiro; Nakano, Toshiki; Yamaguchi, Toshiyasu; Sato, Minoru; Ogawa, Tomohisa; Muramoto, Koji; Yokoyama, Takehiko; Kan-No, Nobuhiro; Nagahisa, Eizou; Janssen, Frank; Grieshaber, Manfred K

2004-01-01

The complete complementary DNA sequences of genes presumably coding for opine dehydrogenases from Arabella iricolor (sandworm), Haliotis discus hannai (abalone), and Patinopecten yessoensis (scallop) were determined, and partial cDNA sequences were derived for Meretrix lusoria (Japanese hard clam) and Spisula sachalinensis (Sakhalin surf clam). The primers ODH-9F and ODH-11R proved useful for amplifying the sequences for opine dehydrogenases from the 4 mollusk species investigated in this study. The sequence of the sandworm was obtained using primers constructed from the amino acid sequence of tauropine dehydrogenase, the main opine dehydrogenase in A. iricolor. The complete cDNA sequences of A. iricolor, H. discus hannai, and P. yessoensis encode 397, 400, and 405 amino acids, respectively. All sequences were aligned and compared with published databank sequences of Loligo opalescens, Loligo vulgaris (squid), Sepia officinalis (cuttlefish), and Pecten maximus (scallop). As expected, a high level of homology was observed for the cDNA from closely related species, such as for cephalopods or scallops, whereas cDNA from the other species showed lower-level homologies. A similar trend was observed when the deduced amino acid sequences were compared. Furthermore, alignment of these sequences revealed some structural motifs that are possibly related to the binding sites of the substrates. The phylogenetic trees derived from the nucleotide and amino acid sequences were consistent with the classification of species resulting from classical taxonomic analyses.
Phylogeny and differentiation of reptilian and amphibian ranaviruses detected in Europe.

PubMed

Stöhr, Anke C; López-Bueno, Alberto; Blahak, Silvia; Caeiro, Maria F; Rosa, Gonçalo M; Alves de Matos, António Pedro; Martel, An; Alejo, Alí; Marschang, Rachel E

2015-01-01

Ranaviruses in amphibians and fish are considered emerging pathogens and several isolates have been extensively characterized in different studies. Ranaviruses have also been detected in reptiles with increasing frequency, but the role of reptilian hosts is still unclear and only limited sequence data has been provided. In this study, we characterized a number of ranaviruses detected in wild and captive animals in Europe based on sequence data from six genomic regions (major capsid protein (MCP), DNA polymerase (DNApol), ribonucleoside diphosphate reductase alpha and beta subunit-like proteins (RNR-α and -β), viral homolog of the alpha subunit of eukaryotic initiation factor 2, eIF-2α (vIF-2α) genes and microsatellite region). A total of ten different isolates from reptiles (tortoises, lizards, and a snake) and four ranaviruses from amphibians (anurans, urodeles) were included in the study. Furthermore, the complete genome sequences of three reptilian isolates were determined and a new PCR for rapid classification of the different variants of the genomic arrangement was developed. All ranaviruses showed slight variations on the partial nucleotide sequences from the different genomic regions (92.6-100%). Some very similar isolates could be distinguished by the size of the band from the microsatellite region. Three of the lizard isolates had a truncated vIF-2α gene; the other ranaviruses had full-length genes. In the phylogenetic analyses of concatenated sequences from different genes (3223 nt/10287 aa), the reptilian ranaviruses were often more closely related to amphibian ranaviruses than to each other, and most clustered together with previously detected ranaviruses from the same geographic region of origin. Comparative analyses show that among the closely related amphibian-like ranaviruses (ALRVs) described to date, three recently split and independently evolving distinct genetic groups can be distinguished. 
These findings underline the wide host range of ranaviruses and the emergence of pathogen pollution via animal trade of ectothermic vertebrates.
Phylogeny and Differentiation of Reptilian and Amphibian Ranaviruses Detected in Europe

PubMed Central

Stöhr, Anke C.; López-Bueno, Alberto; Blahak, Silvia; Caeiro, Maria F.; Rosa, Gonçalo M.; Alves de Matos, António Pedro; Martel, An; Alejo, Alí; Marschang, Rachel E.

2015-01-01

Ranaviruses in amphibians and fish are considered emerging pathogens and several isolates have been extensively characterized in different studies. Ranaviruses have also been detected in reptiles with increasing frequency, but the role of reptilian hosts is still unclear and only limited sequence data has been provided. In this study, we characterized a number of ranaviruses detected in wild and captive animals in Europe based on sequence data from six genomic regions (major capsid protein (MCP), DNA polymerase (DNApol), ribonucleoside diphosphate reductase alpha and beta subunit-like proteins (RNR-α and -β), viral homolog of the alpha subunit of eukaryotic initiation factor 2, eIF-2α (vIF-2α) genes and microsatellite region). A total of ten different isolates from reptiles (tortoises, lizards, and a snake) and four ranaviruses from amphibians (anurans, urodeles) were included in the study. Furthermore, the complete genome sequences of three reptilian isolates were determined and a new PCR for rapid classification of the different variants of the genomic arrangement was developed. All ranaviruses showed slight variations on the partial nucleotide sequences from the different genomic regions (92.6–100%). Some very similar isolates could be distinguished by the size of the band from the microsatellite region. Three of the lizard isolates had a truncated vIF-2α gene; the other ranaviruses had full-length genes. In the phylogenetic analyses of concatenated sequences from different genes (3223 nt/10287 aa), the reptilian ranaviruses were often more closely related to amphibian ranaviruses than to each other, and most clustered together with previously detected ranaviruses from the same geographic region of origin. Comparative analyses show that among the closely related amphibian-like ranaviruses (ALRVs) described to date, three recently split and independently evolving distinct genetic groups can be distinguished. 
These findings underline the wide host range of ranaviruses and the emergence of pathogen pollution via animal trade of ectothermic vertebrates. PMID:25706285

Unraveling the genetic diversity and phylogeny of Leishmania RNA virus 1 strains of infected Leishmania isolates circulating in French Guiana.

PubMed

Tirera, Sourakhata; Ginouves, Marine; Donato, Damien; Caballero, Ignacio S; Bouchier, Christiane; Lavergne, Anne; Bourreau, Eliane; Mosnier, Emilie; Vantilcke, Vincent; Couppié, Pierre; Prevot, Ghislaine; Lacoste, Vincent

2017-07-01

Leishmania RNA virus type 1 (LRV1) is an endosymbiont of some Leishmania (Viannia) species in South America. Presence of LRV1 in parasites exacerbates disease severity in animal models and humans, related to a disproportionate innate immune response, and is correlated with drug treatment failures in humans. Although the virus was identified decades ago, its genomic diversity has been overlooked until now. We subjected LRV1 strains from 19 L. (V.) guyanensis and one L. (V.) braziliensis isolates obtained from cutaneous leishmaniasis samples identified throughout French Guiana to next-generation sequencing and de novo sequence assembly. We generated and analyzed 24 unique LRV1 sequences over their full-length coding regions. Multiple alignment of these new sequences revealed variability (0.5%-23.5%) across the entire sequence except for highly conserved motifs within the 5' untranslated region. Phylogenetic analyses showed that viral genomes of L. (V.) guyanensis grouped into five distinct clusters. They further showed a species-dependent clustering between viral genomes of L. (V.) guyanensis and L. (V.) braziliensis, confirming a long-term co-evolutionary history. Notably, we identified cases of multiple LRV1 infections in three of the 20 Leishmania isolates. Here, we present the first-ever estimate of LRV1 genomic diversity that exists in Leishmania (V.) guyanensis parasites. Genetic characterization and phylogenetic analyses of these viruses have shed light on their evolutionary relationships. To our knowledge, this study is also the first to report cases of multiple LRV1 infections in some parasites. Finally, this work has made it possible to develop molecular tools for adequate identification and genotyping of LRV1 strains for diagnostic purposes. Given the suspected worsening role of LRV1 infection in the pathogenesis of human leishmaniasis, these data have a major impact from a clinical viewpoint and for the management of Leishmania-infected patients.
Unraveling the genetic diversity and phylogeny of Leishmania RNA virus 1 strains of infected Leishmania isolates circulating in French Guiana

PubMed Central

Caballero, Ignacio S.; Bouchier, Christiane; Lavergne, Anne; Bourreau, Eliane; Mosnier, Emilie; Vantilcke, Vincent; Couppié, Pierre; Prevot, Ghislaine

2017-01-01

Introduction: Leishmania RNA virus type 1 (LRV1) is an endosymbiont of some Leishmania (Viannia) species in South America. Presence of LRV1 in parasites exacerbates disease severity in animal models and humans, related to a disproportionate innate immune response, and is correlated with drug treatment failures in humans. Although the virus was identified decades ago, its genomic diversity has been overlooked until now. Methodology/Principal findings: We subjected LRV1 strains from 19 L. (V.) guyanensis and one L. (V.) braziliensis isolates obtained from cutaneous leishmaniasis samples identified throughout French Guiana to next-generation sequencing and de novo sequence assembly. We generated and analyzed 24 unique LRV1 sequences over their full-length coding regions. Multiple alignment of these new sequences revealed variability (0.5%–23.5%) across the entire sequence except for highly conserved motifs within the 5’ untranslated region. Phylogenetic analyses showed that viral genomes of L. (V.) guyanensis grouped into five distinct clusters. They further showed a species-dependent clustering between viral genomes of L. (V.) guyanensis and L. (V.) braziliensis, confirming a long-term co-evolutionary history. Notably, we identified cases of multiple LRV1 infections in three of the 20 Leishmania isolates. Conclusions/Significance: Here, we present the first-ever estimate of LRV1 genomic diversity that exists in Leishmania (V.) guyanensis parasites. Genetic characterization and phylogenetic analyses of these viruses have shed light on their evolutionary relationships. To our knowledge, this study is also the first to report cases of multiple LRV1 infections in some parasites. Finally, this work has made it possible to develop molecular tools for adequate identification and genotyping of LRV1 strains for diagnostic purposes.
Given the suspected worsening role of LRV1 infection in the pathogenesis of human leishmaniasis, these data have a major impact from a clinical viewpoint and for the management of Leishmania-infected patients. PMID:28715422
Phylogenetic placement of the enigmatic parasite, Polypodium hydriforme, within the Phylum Cnidaria

PubMed Central

2008-01-01

Background: Polypodium hydriforme is a parasite with an unusual life cycle and peculiar morphology, both of which have made its systematic position uncertain. Polypodium has traditionally been considered a cnidarian because it possesses nematocysts, the stinging structures characteristic of this phylum. However, recent molecular phylogenetic studies using 18S rDNA sequence data have challenged this interpretation, and have shown that Polypodium is a close relative to myxozoans and together they share a closer affinity to bilaterians than cnidarians. Due to the variable rates of 18S rDNA sequences, these results have been suggested to be an artifact of long-branch attraction (LBA). A recent study, using multiple protein coding markers, shows that the myxozoan Buddenbrockia, is nested within cnidarians. Polypodium was not included in this study. To further investigate the phylogenetic placement of Polypodium, we have performed phylogenetic analyses of metazoans with 18S and partial 28S rDNA sequences in a large dataset that includes Polypodium and a comprehensive sampling of cnidarian taxa. Results: Analyses of a combined dataset of 18S and partial 28S sequences, and partial 28S alone, support the placement of Polypodium within Cnidaria. Removal of the long-branched myxozoans from the 18S dataset also results in Polypodium being nested within Cnidaria. These results suggest that previous reports showing that Polypodium and Myxozoa form a sister group to Bilateria were an artifact of long-branch attraction. Conclusion: By including 28S rDNA sequences and a comprehensive sampling of cnidarian taxa, we demonstrate that previously conflicting hypotheses concerning the phylogenetic placement of Polypodium can be reconciled. Specifically, the data presented provide evidence that Polypodium is indeed a cnidarian and is either the sister taxon to Hydrozoa, or part of the hydrozoan clade, Leptothecata.
The former hypothesis is consistent with the traditional view that Polypodium should be placed in its own cnidarian class, Polypodiozoa. PMID:18471296
Phylogenetic placement of the enigmatic parasite, Polypodium hydriforme, within the Phylum Cnidaria.

PubMed

Evans, Nathaniel M; Lindner, Alberto; Raikova, Ekaterina V; Collins, Allen G; Cartwright, Paulyn

2008-05-09

Polypodium hydriforme is a parasite with an unusual life cycle and peculiar morphology, both of which have made its systematic position uncertain. Polypodium has traditionally been considered a cnidarian because it possesses nematocysts, the stinging structures characteristic of this phylum. However, recent molecular phylogenetic studies using 18S rDNA sequence data have challenged this interpretation, and have shown that Polypodium is a close relative to myxozoans and together they share a closer affinity to bilaterians than cnidarians. Due to the variable rates of 18S rDNA sequences, these results have been suggested to be an artifact of long-branch attraction (LBA). A recent study, using multiple protein coding markers, shows that the myxozoan Buddenbrockia, is nested within cnidarians. Polypodium was not included in this study. To further investigate the phylogenetic placement of Polypodium, we have performed phylogenetic analyses of metazoans with 18S and partial 28S rDNA sequences in a large dataset that includes Polypodium and a comprehensive sampling of cnidarian taxa. Analyses of a combined dataset of 18S and partial 28S sequences, and partial 28S alone, support the placement of Polypodium within Cnidaria. Removal of the long-branched myxozoans from the 18S dataset also results in Polypodium being nested within Cnidaria. These results suggest that previous reports showing that Polypodium and Myxozoa form a sister group to Bilateria were an artifact of long-branch attraction. By including 28S rDNA sequences and a comprehensive sampling of cnidarian taxa, we demonstrate that previously conflicting hypotheses concerning the phylogenetic placement of Polypodium can be reconciled. Specifically, the data presented provide evidence that Polypodium is indeed a cnidarian and is either the sister taxon to Hydrozoa, or part of the hydrozoan clade, Leptothecata. 
The former hypothesis is consistent with the traditional view that Polypodium should be placed in its own cnidarian class, Polypodiozoa.
Analysis of mitochondrial DNA in Bolivian llama, alpaca and vicuna populations: a contribution to the phylogeny of the South American camelids.

PubMed

Barreta, J; Gutiérrez-Gil, B; Iñiguez, V; Saavedra, V; Chiri, R; Latorre, E; Arranz, J J

2013-04-01

The objectives of this work were to assess the mtDNA diversity of Bolivian South American camelid (SAC) populations and to shed light on the evolutionary relationships between the Bolivian camelids and other populations of SACs. We have analysed two different mtDNA regions: the complete coding region of the MT-CYB gene and 513 bp of the D-loop region. The populations sampled included Bolivian llamas, alpacas and vicunas, and Chilean guanacos. High levels of genetic diversity were observed in the studied populations. In general, MT-CYB was more variable than D-loop. On a species level, the vicunas showed the lowest genetic variability, followed by the guanacos, alpacas and llamas. Phylogenetic analyses performed by including additional available mtDNA sequences from the studied species confirmed the existence of the two monophyletic clades previously described by other authors for guanacos (G) and vicunas (V). Significant levels of mtDNA hybridization were found in the domestic species. Our sequence analyses revealed significant sequence divergence within clade G, and some of the Bolivian llamas grouped with the majority of the southern guanacos. This finding supports the existence of more than the one llama domestication centre in South America previously suggested on the basis of archaeozoological evidence. Additionally, analysis of D-loop sequences revealed two new matrilineal lineages that are distinct from the previously reported G and V clades. The results presented here represent the first report on the population structure and genetic variability of Bolivian camelids and may help to elucidate the complex and dynamic domestication process of SAC populations. © 2012 The Authors, Animal Genetics © 2012 Stichting International Foundation for Animal Genetics.
Molecular typing and phylogenetic analysis of some species belonging to Phlebotomus (Larroussius) and Phlebotomus (Adlerius) subgenera (Diptera: Psychodidae) from two locations in Iran.

PubMed

Parvizi, P; Naddaf, S R; Alaeenovin, E

2010-01-01

Haematophagous females of some phlebotomine sandflies are the only natural vectors of Leishmania species, the causative agents of leishmaniasis in many parts of the tropics and subtropics, including Iran. We report the presence of Phlebotomus (Larroussius) major and Phlebotomus (Adlerius) halepensis in Tonekabon (Mazanderan Province) and Phlebotomus (Larroussius) tobbi in Pakdasht (Tehran Province). This is the first report of these species, known as potential vectors of zoonotic visceral leishmaniasis in Iran, in these areas. In 2006-2007, individual wild-caught sandflies were characterized by both morphological features and sequence analysis of their mitochondrial cytochrome b (Cyt b) gene. The analyses were based on a fragment of 494 bp at the 3' end of the Cyt b gene (Cyt b 3' fragment) and a fragment of 382 bp CB3 at the 5' end of the Cyt b gene (Cyt b 5' fragment). We also analysed the Cyt b Long fragment, which is located on the last 717 bp of the Cyt b gene, followed by 20 bp of intergenic spacer and the transfer RNA ser(TCN) gene. Twenty-seven P. halepensis and four P. major from Dohezar, Tonekabon, Mazanderan Province and eight P. tobbi from Pakdasht, Tehran Province were identified by morphological and molecular characters. Cyt b 5' and Cyt b 3' fragment sequences were obtained from 15 and 9 flies, respectively. Cyt b Long fragment sequences were obtained from 8 out of 27 P. halepensis. Parsimony analyses (using heuristic searches) of the DNA sequences of Cyt b always showed monophyletic clades of subgenera, and each species formed a monophyletic group.
Combinatory RNA-Sequencing Analyses Reveal a Dual Mode of Gene Regulation by ADAR1 in Gastric Cancer.

PubMed

Cho, Charles J; Jung, Jaeeun; Jiang, Lushang; Lee, Eun Ji; Kim, Dae-Soo; Kim, Byung Sik; Kim, Hee Sung; Jung, Hwoon-Yong; Song, Ho-June; Hwang, Sung Wook; Park, Yangsoon; Jung, Min Kyo; Pack, Chan Gi; Myung, Seung-Jae; Chang, Suhwan

2018-04-25

Adenosine deaminase acting on RNA 1 (ADAR1) is known to mediate deamination of adenosine-to-inosine through binding to double-stranded RNA, the phenomenon known as RNA editing. Currently, the function of ADAR1 in gastric cancer is unclear. This study was aimed at investigating RNA editing-dependent and editing-independent functions of ADAR1 in gastric cancer, especially focusing on its influence on editing of 3' untranslated regions (UTRs) and subsequent changes in expression of messenger RNAs (mRNAs) as well as microRNAs (miRNAs). RNA-sequencing and small RNA-sequencing were performed on AGS and MKN-45 cells with a stable ADAR1 knockdown. Changed frequencies of editing and mRNA and miRNA expression were then identified by bioinformatic analyses. Targets of RNA editing were further validated in patients' samples. In the Alu region of both gastric cell lines, editing was most commonly of the A-to-I type in 3'-UTR or intron. mRNA and protein levels of PHACTR4 increased in ADAR1 knockdown cells, because of the loss of seed sequences in 3'-UTR of PHACTR4 mRNA that are required for miRNA-196a-3p binding. Immunohistochemical analyses of tumor and paired normal samples from 16 gastric cancer patients showed that ADAR1 expression was higher in tumors than in normal tissues and inversely correlated with PHACTR4 staining. On the other hand, decreased miRNA-148a-3p expression in ADAR1 knockdown cells led to increased mRNA and protein expression of NFYA, demonstrating ADAR1's editing-independent function. ADAR1 regulates post-transcriptional gene expression in gastric cancer through both RNA editing-dependent and editing-independent mechanisms.
Whole Genome Sequence and Phylogenetic Analysis Show Helicobacter pylori Strains from Latin America Have Followed a Unique Evolution Pathway

PubMed Central

Muñoz-Ramírez, Zilia Y.; Mendez-Tenorio, Alfonso; Kato, Ikuko; Bravo, Maria M.; Rizzato, Cosmeri; Thorell, Kaisa; Torres, Roberto; Aviles-Jimenez, Francisco; Camorlinga, Margarita; Canzian, Federico; Torres, Javier

2017-01-01

Helicobacter pylori (HP) genetics may determine its clinical outcomes. Despite high prevalence of HP infection in Latin America (LA), there have been no phylogenetic studies in the region. We aimed to understand the structure of HP populations in LA mestizo individuals, where gastric cancer incidence remains high. The genomes of 107 HP strains from Mexico, Nicaragua and Colombia were analyzed with 59 publicly available worldwide genomes. To study bacterial relationships at the whole-genome level, we propose a virtual hybridization technique using thousands of high-entropy 13 bp DNA probes to generate fingerprints. Phylogenetic virtual genome fingerprint (VGF) analysis was compared with Multi Locus Sequence Analysis (MLST) and with phylogenetic analyses of cagPAI virulence island sequences. With MLST, some Nicaraguan and Mexican strains clustered close to African isolates, whereas European isolates were spread without clustering and intermingled with LA isolates. VGF analysis resulted in increased resolution of populations, separating European from LA strains. Furthermore, clusters with exclusively Colombian, Mexican, or Nicaraguan strains were observed, where the Colombian cluster separated from Europe, Asia, and Africa, while Nicaraguan and Mexican clades grouped close to Africa. In addition, a large mixed LA cluster including Mexican, Colombian, Nicaraguan, Peruvian, and Salvadorian strains was observed; all LA clusters separated from the Amerind clade. With cagPAI sequence analyses, LA clades clearly separated from Europe, Asia and Amerind, and Colombian strains formed a single cluster. A NeighborNet analysis suggested frequent and recent recombination events, particularly among LA strains. The results suggest that in the New World, H. pylori has evolved to fit mestizo LA populations in only 500 years since the Spanish colonization. This co-adaptation may account for regional variability in gastric cancer risk. PMID:28293542
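The virtual hybridization idea in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' VGF method: hybridization is modelled as an exact match on either strand, the Jaccard distance is one possible way to compare fingerprints, and the function names, probes, and toy genomes are all hypothetical (the study used thousands of high-entropy 13 bp probes).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def fingerprint(genome, probes):
    """Set of probes that 'hybridize' to the genome.

    Assumption for this sketch: a probe hybridizes if it occurs
    exactly on either strand of the genome.
    """
    return {
        probe
        for probe in probes
        if probe in genome or reverse_complement(probe) in genome
    }


def jaccard_distance(fp_a, fp_b):
    """Distance between two fingerprints: 1 - |intersection| / |union|."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return 1.0 - len(fp_a & fp_b) / len(union)


# Toy data, made up for illustration (real probes would be 13 bp long).
probes = ["ACGTA", "GGCCA", "TTGAC"]
genome_a = "TTACGTAGGCCATT"
genome_b = "CCGTCAACCACGTACC"

fp_a = fingerprint(genome_a, probes)
fp_b = fingerprint(genome_b, probes)
distance = jaccard_distance(fp_a, fp_b)
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method; which method the study used is not stated in the abstract.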
A highly divergent Puumala virus lineage in southern Poland.

PubMed

Rosenfeld, Ulrike M; Drewes, Stephan; Ali, Hanan Sheikh; Sadowska, Edyta T; Mikowska, Magdalena; Heckel, Gerald; Koteja, Paweł; Ulrich, Rainer G

2017-05-01

Puumala virus (PUUV) represents one of the most important hantaviruses in Central Europe. Phylogenetic analyses of PUUV strains indicate a strong genetic structuring of this hantavirus. Recently, PUUV sequences were identified in the natural reservoir, the bank vole (Myodes glareolus), collected in the northern part of Poland. The objective of this study was to evaluate the presence of PUUV in bank voles from southern Poland. A total of 72 bank voles were trapped in 2009 at six sites in this part of Poland. RT-PCR and IgG-ELISA analyses detected three PUUV positive voles at one trapping site. The PUUV-infected animals were identified by cytochrome b gene analysis to belong to the Carpathian and Eastern evolutionary lineages of bank vole. The novel PUUV S, M and L segment nucleotide sequences showed the closest similarity to sequences of the Russian PUUV lineage from Latvia, but were highly divergent to those previously found in northern Poland, Slovakia and Austria. In conclusion, the detection of a highly divergent PUUV lineage in southern Poland indicates the necessity of further bank vole monitoring in this region allowing rational public health measures to prevent human infections.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.

PubMed

Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H

2013-12-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family

PubMed Central

Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.

2013-01-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704
BM-Map: Bayesian Mapping of Multireads for Next-Generation Sequencing Data

PubMed Central

Ji, Yuan; Xu, Yanxun; Zhang, Qiong; Tsui, Kam-Wah; Yuan, Yuan; Norris, Clift; Liang, Shoudan; Liang, Han

2011-01-01

Summary: Next-generation sequencing (NGS) technology generates millions of short reads, which provide valuable information for various aspects of cellular activities and biological functions. A key step in NGS applications (e.g., RNA-Seq) is to map short reads to correct genomic locations within the source genome. While most reads are mapped to a unique location, a significant proportion of reads align to multiple genomic locations with equal or similar numbers of mismatches; these are called multireads. The ambiguity in mapping the multireads may lead to bias in downstream analyses. Currently, most practitioners discard the multireads in their analysis, resulting in a loss of valuable information, especially for the genes with similar sequences. To refine the read mapping, we develop a Bayesian model that computes the posterior probability of mapping a multiread to each competing location. The probabilities are used for downstream analyses, such as the quantification of gene expression. We show through simulation studies and RNA-Seq analysis of real-life data that the Bayesian method yields better mapping than the current leading methods. We provide a C++ program for download that is being packaged into user-friendly software. PMID:21517792
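As a rough illustration of assigning a multiread probabilistically instead of discarding it, the sketch below computes a posterior over candidate locations from mismatch counts. It is not the BM-Map model itself: the independent per-base error model, the default read length and error rate, the prior weights (which might, for example, come from the coverage of uniquely mapped reads), and the function name are all assumptions made for this example.

```python
def multiread_posterior(mismatches, priors, read_length=50, error_rate=0.01):
    """Posterior probability that a read originates from each candidate location.

    mismatches: number of mismatches at each candidate location
    priors: non-negative prior weight for each location (assumed input)
    """
    if len(mismatches) != len(priors):
        raise ValueError("mismatches and priors must have the same length")
    # Likelihood under an independent per-base error model (an assumption):
    # each mismatch has probability error_rate, each match 1 - error_rate.
    likelihoods = [
        (error_rate ** m) * ((1.0 - error_rate) ** (read_length - m))
        for m in mismatches
    ]
    # Bayes' rule: posterior is proportional to likelihood times prior.
    unnormalized = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("all candidate locations have zero posterior weight")
    return [u / total for u in unnormalized]


# Two locations with the same mismatch count: the prior decides.
posterior = multiread_posterior(mismatches=[1, 1], priors=[3.0, 1.0])

# Equal priors: the location with fewer mismatches is favoured.
by_mismatch = multiread_posterior(mismatches=[0, 2], priors=[1.0, 1.0])
```

Downstream, such probabilities could be used as fractional read counts when quantifying expression, which is the kind of use the abstract describes.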
Retention-error patterns in complex alphanumeric serial-recall tasks.

PubMed

Mathy, Fabien; Varré, Jean-Stéphane

2013-01-01

We propose a new method based on an algorithm usually dedicated to DNA sequence alignment in order to both reliably score short-term memory performance on immediate serial-recall tasks and analyse retention-error patterns. There can be considerable confusion on how performance on immediate serial list recall tasks is scored, especially when the to-be-remembered items are sampled with replacement. We discuss the utility of sequence-alignment algorithms to compare the stimuli to the participants' responses. The idea is that deletion, substitution, translocation, and insertion errors, which are typical in DNA, are also typical putative errors in short-term memory (respectively omission, confusion, permutation, and intrusion errors). We analyse four data sets in which alphanumeric lists included a few (or many) repetitions. After examining the method on two simple data sets, we show that sequence alignment offers 1) a compelling method for measuring capacity in terms of chunks when many regularities are introduced in the material (third data set) and 2) a reliable estimator of individual differences in short-term memory capacity. This study illustrates the difficulty of arriving at a good measure of short-term memory performance, and also attempts to characterise the primary factors underpinning remembering and forgetting.
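The mapping between alignment operations and recall errors described above can be sketched with a plain edit-distance alignment. This is a minimal sketch under assumptions, not the authors' scoring method: it uses unit costs, reports one optimal alignment among possibly several, and does not model translocation (permutation) errors, which a plain edit distance treats as combinations of other operations. The function name and example lists are hypothetical.

```python
def align_recall(stimulus, response):
    """Align a response to the stimulus list and count error types.

    Returns counts of matches, substitutions (confusions), deletions
    (omissions), and insertions (intrusions), plus the edit distance.
    """
    n, m = len(stimulus), len(response)
    # dist[i][j] = edit distance between stimulus[:i] and response[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if stimulus[i - 1] == response[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion (item omitted)
                dist[i][j - 1] + 1,         # insertion (item intruded)
            )
    # Trace back one optimal alignment to classify each step.
    counts = {"match": 0, "substitution": 0, "deletion": 0, "insertion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if stimulus[i - 1] == response[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                counts["match" if cost == 0 else "substitution"] += 1
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletion"] += 1
            i -= 1
        else:
            counts["insertion"] += 1
            j -= 1
    counts["distance"] = dist[n][m]
    return counts


# A list with a repeated item ("A"); the participant omits the second "A".
result = align_recall("A7K2A9", "A7K29")
```

Because the alignment works on positions rather than on item identity alone, repeated items in the list do not confuse the scoring, which is the situation the abstract highlights for lists sampled with replacement.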
Targeted sequencing identifies novel variants involved in autosomal recessive hereditary hearing loss in Qatari families.

PubMed

Alkowari, Moza K; Vozzi, Diego; Bhagat, Shruti; Krishnamoorthy, Navaneethakrishnan; Morgan, Anna; Hayder, Yousra; Logendra, Barathy; Najjar, Nehal; Gandin, Ilaria; Gasparini, Paolo; Badii, Ramin; Girotto, Giorgia; Abdulhadi, Khalid

2017-08-01

Hereditary hearing loss (HHL) is characterized by very high genetic heterogeneity. In the Qatari population, the role of GJB2, the major contributor to HHL worldwide, seems to be quite limited compared to Caucasian populations. In this study we analysed 18 Qatari families affected by non-syndromic hearing loss using a targeted sequencing approach that allowed us to analyse 81 genes simultaneously. With this approach, 50% of these families (9 out of 18) tested positive for likely causative alleles in 6 different genes: CDH23, MYO6, GJB6, OTOF, TMC1 and OTOA. In particular, 4 novel alleles were detected, while the remaining ones had already been described as associated with HHL in other ethnic groups. Molecular modelling was used to further investigate the role of the novel alleles identified in the CDH23 and TMC1 genes, demonstrating their crucial role in Ca2+ binding and therefore a possible functional role in the proteins. The present study showed that an accurate molecular diagnosis based on next-generation sequencing technologies might largely improve molecular diagnostic outcomes, benefiting both genetic counseling and the definition of recurrence risk. Copyright © 2017 Elsevier B.V. All rights reserved.
Time Clustered Sampling Can Inflate the Inferred Substitution Rate in Foot-And-Mouth Disease Virus Analyses.

PubMed

Pedersen, Casper-Emil T; Frandsen, Peter; Wekesa, Sabenzia N; Heller, Rasmus; Sangula, Abraham K; Wadsworth, Jemma; Knowles, Nick J; Muwanika, Vincent B; Siegismund, Hans R

2015-01-01

With the emergence of analytical software for the inference of viral evolution, a number of studies have focused on estimating important parameters such as the substitution rate and the time to the most recent common ancestor (tMRCA) for rapidly evolving viruses. Coupled with an increasing abundance of sequence data sampled under widely different schemes, an effort to keep results consistent and comparable is needed. Through a study of the foot-and-mouth disease (FMD) virus serotypes SAT 1 and SAT 2, this study emphasizes commonly disregarded problems in the inference of evolutionary rates in viral sequence data when sampling is unevenly distributed on a temporal scale. Our study shows that clustered temporal sampling in phylogenetic analyses of FMD viruses will strongly bias the inferences of substitution rates and tMRCA because the inferred rates in such data sets reflect a rate closer to the mutation rate than to the substitution rate. Estimating evolutionary parameters from viral sequences should be performed with due consideration of the differences in short-term and longer-term evolutionary processes occurring within sets of temporally sampled viruses, and studies should carefully consider how samples are combined.
The phylogenetic position of the Critically Endangered Saint Croix ground lizard Ameiva polops: revisiting molecular systematics of West Indian Ameiva.

PubMed

Hurtado, Luis A; Santamaria, Carlos A; Fitzgerald, Lee A

2014-05-06

The phylogenetic position of the critically endangered Saint Croix ground lizard Ameiva polops is presently unknown and several hypotheses have been proposed. We investigated the phylogenetic position of this species using molecular phylogenetic methods. We obtained sequences of DNA fragments of the mitochondrial ribosomal genes 12S rDNA and 16S rDNA for this species. We aligned these sequences with published sequences of other Ameiva species, which include most of the Ameiva species from the West Indies, three Ameiva species from Central America and South America, and one from the teiid lizard Tupinambis teguixin, which was used as outgroup. We conducted Maximum Likelihood and Bayesian phylogenetic analyses. The phylogenetic reconstructions among the different methods were very similar, supporting the monophyly of West Indian Ameiva and showing within this lineage, a basal polytomy of four clades that are separated geographically. Ameiva polops grouped in a cluster that included the other two Ameiva species found in the Puerto Rican Bank: A. wetmorei and A. exsul. A sister relationship between A. polops and A. wetmorei is suggested by our analyses. We compare our results with a previous study on molecular systematics of West Indian Ameiva.
Infections by Babesia caballi and Theileria equi in Jordanian equids: epidemiology and genetic diversity.

PubMed

Qablan, Moneeb A; Oborník, Miroslav; Petrželková, Klára J; Sloboda, Michal; Shudiefat, Mustafa F; Hořín, Petr; Lukeš, Julius; Modrý, David

2013-08-01

Microscopic diagnosis of equine piroplasmoses, caused by Theileria equi and Babesia caballi, is hindered by low parasitaemia during the latent phase of the infections. However, this constraint can be overcome by the application of PCR followed by sequencing. Out of 288 animals examined, the piroplasmid DNA was detected in 78 (27·1%). Multiplex PCR indicated that T. equi (18·8%) was more prevalent than B. caballi (7·3%), while mixed infections were conspicuously absent. Sequences of 69 PCR amplicons obtained by the 'catch-all' PCR were in concordance with those amplified by the multiplex strategy. Computed minimal adequate model analyses for both equine piroplasmid species separately showed a significant effect of host species and age in the case of T. equi, while in the B. caballi infections only the correlation with host sex was significant. Phylogenetic analyses inferred the occurrence of three genotypes of T. equi and B. caballi. Moreover, a novel genotype C of B. caballi was identified. The dendrogram based on obtained sequences of T. equi revealed possible speciation events. The infections with T. equi and B. caballi are enzootic in all ecozones of Jordan and different genotypes circulate wherever dense horse population exists.
Molecular taxonomy of Dunaliella (Chlorophyceae), with a special focus on D. salina: ITS2 sequences revisited with an extensive geographical sampling

PubMed Central

2012-01-01

We used ITS2 primary and secondary structure and Compensatory Base Changes (CBCs) analyses on new French and Spanish Dunaliella salina strains to investigate their phylogenetic position and taxonomic status within the genus Dunaliella. Our analyses show a great diversity within D. salina (with only some clades not statistically supported) and reveal considerable genetic diversity and structure within Dunaliella, although the CBC analysis did not bolster the existence of different biological groups within this taxon. The ITS2 sequences of the new Spanish and French D. salina strains were very similar except for two of them: ITC5105 "Janubio" from Spain and ITC5119 from France. Although the Spanish one had a unique ITS2 sequence profile and the phylogenetic tree indicates that this strain can represent a new species, this hypothesis was not confirmed by CBCs, and clarification of its taxonomic status requires further investigation with new data. Overall, the use of CBCs to define species boundaries within Dunaliella was not conclusive in some cases, and the ITS2 region does not contain a geographical signal overall. PMID:22520929
A molecular phylogenetic investigation of Bakuella, Anteholosticha, and Caudiholosticha (Protista, Ciliophora, Hypotrichia) based on three-gene sequences.

PubMed

Lv, Zhao; Shao, Chen; Yi, Zhenzhen; Warren, Alan

2015-01-01

Traditionally, classifications of the Urostyloida have been based mainly on morphology and morphogenesis. Recent molecular phylogenetic analyses have been largely based on single-gene data for a limited number of taxa. Consequently, incongruence has arisen between the morphological/morphogenetic and the molecular data. In this study, three phylogenetic markers (SSU rDNA, ITS1-5.8S-ITS2 region, and LSU rDNA) of three urostyloid genera represented by four species (Bakuella granulifera, Anteholosticha monilata, Caudiholosticha sylvatica, and C. tetracirra) were sequenced to investigate their phylogeny. The results show that: (1) all three genera should be regarded as members of the order Urostyloida within the subclass Hypotrichia, as indicated by morphological characters; (2) phylogenetic analyses and sequence similarities both indicate that neither Anteholosticha nor Caudiholosticha is monophyletic and the systematic assignment of both genera awaits further evaluation; and (3) Bakuella has a closer relationship with Urostyla than with bakuellids (e.g. Apobakuella and Metaurostylopsis), suggesting Bakuella may belong to the family Urostylidae rather than the family Bakuellidae. © 2014 The Author(s) Journal of Eukaryotic Microbiology © 2014 International Society of Protistologists.
Whole-genome analyses of DS-1-like human G2P[4] and G8P[4] rotavirus strains from Eastern, Western and Southern Africa

PubMed Central

Nyaga, Martin M.; Stucker, Karla M.; Esona, Mathew D.; Jere, Khuzwayo C.; Mwinyi, Bakari; Shonhai, Annie; Tsolenyanu, Enyonam; Mulindwa, Augustine; Chibumbya, Julia N.; Adolfine, Hokororo; Halpin, Rebecca A.; Roy, Sunando; Stockwell, Timothy B.; Berejena, Chipo; Seheri, Mapaseka L.; Mwenda, Jason M.; Steele, A. Duncan; Wentworth, David E.

2018-01-01

Group A rotaviruses (RVAs) with distinct G and P genotype combinations have been reported globally. We report the genome composition and possible origin of seven G8P[4] and five G2P[4] human RVA strains based on the genetic evolution of all 11 genome segments at the nucleotide level. Twelve RVA ELISA positive stool samples collected in the representative countries of Eastern, Southern and West Africa during the 2007–2012 surveillance seasons were subjected to sequencing using the Ion Torrent PGM and Illumina MiSeq platforms. A reference-based assembly was performed using CLC Bio’s clc_ref_assemble_long program, and full-genome consensus sequences were obtained. With the exception of the neutralising antigen, VP7, all study strains exhibited the DS-1-like genome constellation (P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2) and clustered phylogenetically with reference strains having a DS-1-like genetic backbone. Comparison of the nucleotide and amino acid sequences with selected global cognate genome segments revealed nucleotide and amino acid sequence identities of 81.7–100 % and 90.6–100 %, respectively, with NSP4 gene segment showing the most diversity among the strains. Bayesian analyses of all gene sequences to estimate the time of divergence of the lineage indicated that divergence times ranged from 16 to 44 years, except for the NSP4 gene where the lineage seemed to arise in the more distant past at an estimated 203 years ago. However, the long-term effects of changes found within the NSP4 genome segment should be further explored, and thus we recommend continued whole-genome analyses from larger sample sets to determine the evolutionary mechanisms of the DS-1-like strains collected in Africa. PMID:24952422
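The nucleotide and amino acid identity ranges quoted above are pairwise comparisons between aligned sequences. As a minimal sketch of that kind of calculation (not the authors' pipeline), the function below computes percent identity over the ungapped columns of two pre-aligned sequences. The function name `percent_identity` and the two toy sequences are assumptions made for illustration; real analyses may treat gaps differently.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns, skipping any column with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not taken from the study.
strain_x = "ATGGCTTCA-GGA"
strain_y = "ATGGCATCATGGA"
identity = percent_identity(strain_x, strain_y)
print(f"{identity:.1f}% identity")
```

Skipping gapped columns is only one convention; counting a gap as a mismatch would give lower values for the same alignment.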

Conifer R2R3-MYB transcription factors: sequence analyses and gene expression in wood-forming tissues of white spruce (Picea glauca)

PubMed Central

Bedon, Frank; Grima-Pettenati, Jacqueline; Mackay, John

2007-01-01

Background Several members of the R2R3-MYB family of transcription factors act as regulators of lignin and phenylpropanoid metabolism during wood formation in angiosperm and gymnosperm plants. The angiosperm Arabidopsis has over one hundred R2R3-MYB genes; however, only a few members of this family have been discovered in gymnosperms. Results We isolated and characterised full-length cDNAs encoding R2R3-MYB genes from the gymnosperms white spruce, Picea glauca (13 sequences), and loblolly pine, Pinus taeda L. (five sequences). Sequence similarities and phylogenetic analyses placed the spruce and pine sequences in diverse subgroups of the large R2R3-MYB family, although several of the sequences clustered closely together. We searched the highly variable C-terminal region of diverse plant MYBs for conserved amino acid sequences and identified 20 motifs in the spruce MYBs, nine of which have not previously been reported and three of which are specific to conifers. The number and length of the introns in spruce MYB genes varied significantly, but their positions were well conserved relative to angiosperm MYB genes. Quantitative RT-PCR of MYB gene transcript abundance in root and stem tissues revealed diverse expression patterns; three MYB genes were preferentially expressed in secondary xylem, whereas others were preferentially expressed in phloem or were ubiquitous. The MYB genes expressed in xylem, and three others, were up-regulated in the compression wood of leaning trees within 76 hours of induction. Conclusion Our survey of 18 conifer R2R3-MYB genes clearly showed a gene family structure similar to that of Arabidopsis. Three of the sequences are likely to play a role in lignin metabolism and/or wood formation in gymnosperm trees, including a close homolog of the loblolly pine PtMYB4, shown to regulate lignin biosynthesis in transgenic tobacco. PMID:17397551
Deep sequencing of the Trypanosoma cruzi GP63 surface proteases reveals diversity and diversifying selection among chronic and congenital Chagas disease patients.

PubMed

Llewellyn, Martin S; Messenger, Louisa A; Luquetti, Alejandro O; Garcia, Lineth; Torrico, Faustino; Tavares, Suelene B N; Cheaib, Bachar; Derome, Nicolas; Delepine, Marc; Baulard, Céline; Deleuze, Jean-Francois; Sauer, Sascha; Miles, Michael A

2015-04-01

Chagas disease results from infection with the diploid protozoan parasite Trypanosoma cruzi. T. cruzi is highly genetically diverse, and multiclonal infections in individual hosts are common, but little studied. In this study, we explore T. cruzi infection multiclonality in the context of age, sex and clinical profile among a cohort of chronic patients, as well as paired congenital cases from Cochabamba, Bolivia and Goias, Brazil using amplicon deep sequencing technology. A 450 bp fragment of the trypomastigote TcGP63I surface protease gene was amplified and sequenced across 70 chronic and 22 congenital cases on the Illumina MiSeq platform. In addition, a second, mitochondrial target (ND5) was sequenced across the same cohort of cases. Several million reads were generated, and sequencing read depths were normalized within patient cohorts (Goias chronic, n = 43; Goias congenital, n = 2; Bolivia chronic, n = 27; Bolivia congenital, n = 20). Among chronic cases, analyses of variance indicated no clear correlation between intra-host sequence diversity and age, sex or symptoms, while principal coordinate analyses showed no clustering by symptoms between patients. Between congenital pairs, we found evidence for the transmission of multiple sequence types from mother to infant, as well as widespread instances of novel genotypes in infants. Finally, non-synonymous to synonymous (dn:ds) nucleotide substitution ratios among sequences of TcGP63Ia and TcGP63Ib subfamilies within each cohort provided powerful evidence of strong diversifying selection at this locus. Our results shed light on the diversity of parasite DTUs within each patient, as well as the extent to which parasite strains pass between mother and foetus in congenital cases. Although we were unable to find any evidence that parasite diversity accumulates with age in our study cohorts, putative diversifying selection within members of the TcGP63I gene family suggests a link between genetic diversity within this gene family and survival in the mammalian host.
FDSTools: A software package for analysis of massively parallel sequencing data with the ability to recognise and correct STR stutter and other PCR or sequencing noise.

PubMed

Hoogenboom, Jerry; van der Gaag, Kristiaan J; de Leeuw, Rick H; Sijen, Titia; de Knijff, Peter; Laros, Jeroen F J

2017-03-01

Massively parallel sequencing (MPS) is on the verge of broad-scale application in forensic research and casework. The improved capability to analyse evidentiary traces representing unbalanced mixtures is often mentioned as one of the major advantages of this technique. However, most of the available software packages that analyse forensic short tandem repeat (STR) sequencing data are not well suited for high-throughput analysis of such mixed traces. The largest challenge is the presence of stutter artefacts in STR amplifications, which are not readily discerned from minor contributions. FDSTools is an open-source software solution developed for this purpose. The level of stutter formation is influenced by various aspects of the sequence, such as the length of the longest uninterrupted stretch occurring in an STR. When MPS is used, STRs are evaluated as sequence variants that each have particular stutter characteristics which can be precisely determined. FDSTools uses a database of reference samples to determine stutter and other systemic PCR or sequencing artefacts for each individual allele. In addition, stutter models are created for each repeating element in order to predict stutter artefacts for alleles that are not included in the reference set. This information is subsequently used to recognise and compensate for the noise in a sequence profile. The result is a better representation of the true composition of a sample. Using Promega Powerseq™ Auto System data from 450 reference samples and 31 two-person mixtures, we show that the FDSTools correction module decreases stutter ratios above 20% to below 3%. Consequently, much lower levels of contributions in the mixed traces are detected. FDSTools contains modules to visualise the data in an interactive format, allowing users to filter data with their own preferred thresholds. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
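To make the idea of a stutter ratio and its correction concrete, here is a deliberately simplified sketch. It is not the FDSTools algorithm or API: it assumes only one-repeat-shorter ("-1") stutter, a single expected ratio for all alleles, and a profile keyed by repeat count rather than by full sequence. The names `stutter_ratio`, `correct_profile`, and all read counts are hypothetical.

```python
def stutter_ratio(stutter_reads: float, parent_reads: float) -> float:
    """Reads at the stutter position divided by reads of the parent allele."""
    if parent_reads <= 0:
        raise ValueError("parent allele must have a positive read count")
    return stutter_reads / parent_reads


def correct_profile(profile: dict[int, float], expected_ratio: float) -> dict[int, float]:
    """Subtract the predicted -1 stutter of each allele from the allele one repeat shorter.

    profile maps repeat count -> observed reads. Observed parent reads are used
    as-is, which is a simplification: they may themselves contain stutter.
    """
    corrected = dict(profile)
    for repeats, reads in profile.items():
        target = repeats - 1
        if target in corrected:
            predicted = expected_ratio * reads
            corrected[target] = max(0.0, corrected[target] - predicted)
    return corrected


# Hypothetical single-source profile: allele 12 with a stutter product at 11.
observed = {12: 1000.0, 11: 250.0}
before = stutter_ratio(observed[11], observed[12])
after_profile = correct_profile(observed, expected_ratio=0.23)
after = stutter_ratio(after_profile[11], after_profile[12])
print(f"stutter ratio before: {before:.2f}, after: {after:.2f}")
```

In a mixture, any reads left at position 11 after such a correction are more plausibly a genuine minor contributor, which is the practical benefit the abstract describes.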
Identification and characterisation of Short Interspersed Nuclear Elements in the olive tree (Olea europaea L.) genome.

PubMed

Barghini, Elena; Mascagni, Flavia; Natali, Lucia; Giordani, Tommaso; Cavallini, Andrea

2017-02-01

Short Interspersed Nuclear Elements (SINEs) are nonautonomous retrotransposons in the genome of most eukaryotic species. While SINEs have been intensively investigated in humans and other animal systems, SINE identification has been carried out only in a limited number of plant species. This lack of information is apparent especially in non-model plants whose genome has not been sequenced yet. The aim of this work was to produce a specific bioinformatics pipeline for analysing second generation sequence reads of a non-model species and identifying SINEs. We have identified, for the first time, 227 putative SINEs of the olive tree (Olea europaea), which constitute one of the few sets of such sequences in dicotyledonous species. The identified SINEs ranged from 140 to 362 bp in length and were characterised with regard to the occurrence of the tRNA domain in their sequence. The majority of the identified elements were single-copy or present at very low copy number, often in association with genic sequences. Analysis of sequence similarity allowed us to identify two major groups of SINEs showing different abundances in the olive tree genome, the former with sequence similarity to SINEs of Scrophulariaceae and Solanaceae and the latter to SINEs of Salicaceae. A comparison of sequence conservation between olive SINEs and LTR retrotransposon families suggested that SINE expansion in the genome occurred especially in very ancient times, before LTR retrotransposon expansion, and presumably before the separation of the rosids (to which Oleaceae belong) from the asterids. Besides providing data on olive SINEs, our results demonstrate the suitability of the pipeline employed for SINE identification. Applying this pipeline will facilitate further structural and functional analyses of these relatively unknown elements in other plant species, even in the absence of a reference genome, and will allow general evolutionary patterns to be established for this kind of repeat in plants.
Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome

PubMed Central

Margulies, Elliott H.; Cooper, Gregory M.; Asimenos, George; Thomas, Daryl J.; Dewey, Colin N.; Siepel, Adam; Birney, Ewan; Keefe, Damian; Schwartz, Ariel S.; Hou, Minmei; Taylor, James; Nikolaev, Sergey; Montoya-Burgos, Juan I.; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Brown, James B.; Bickel, Peter; Holmes, Ian; Mullikin, James C.; Ureta-Vidal, Abel; Paten, Benedict; Stone, Eric A.; Rosenbloom, Kate R.; Kent, W. James; Bouffard, Gerard G.; Guan, Xiaobin; Hansen, Nancy F.; Idol, Jacquelyn R.; Maduro, Valerie V.B.; Maskeri, Baishali; McDowell, Jennifer C.; Park, Morgan; Thomas, Pamela J.; Young, Alice C.; Blakesley, Robert W.; Muzny, Donna M.; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Jiang, Huaiyang; Weinstock, George M.; Gibbs, Richard A.; Graves, Tina; Fulton, Robert; Mardis, Elaine R.; Wilson, Richard K.; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B.; Chang, Jean L.; Lindblad-Toh, Kerstin; Lander, Eric S.; Hinrichs, Angie; Trumbower, Heather; Clawson, Hiram; Zweig, Ann; Kuhn, Robert M.; Barber, Galt; Harte, Rachel; Karolchik, Donna; Field, Matthew A.; Moore, Richard A.; Matthewson, Carrie A.; Schein, Jacqueline E.; Marra, Marco A.; Antonarakis, Stylianos E.; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross; Haussler, David; Miller, Webb; Pachter, Lior; Green, Eric D.; Sidow, Arend

2007-01-01

A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization. PMID:17567995
Occurrence of different Canine distemper virus lineages in Italian dogs.

PubMed

Balboni, Andrea; De Lorenzo Dandola, Giorgia; Scagliarini, Alessandra; Prosperi, Santino; Battilani, Mara

2014-01-01

This study describes the sequence analysis of the H gene of 7 Canine distemper virus (CDV) strains identified in dogs in Italy between 2002 and 2012. The phylogenetic analysis showed that the CDV strains belonged to 2 clusters: 6 viruses were identified as Arctic-like lineage and 1 as Europe 1 lineage. These data show a considerable prevalence of Arctic-like CDVs in the analysed dogs. The dogs and the 3 viruses more recently identified showed 4 distinctive amino acid mutations compared to all other Arctic CDVs.
Genome sequence of the progenitor of wheat A subgenome Triticum urartu.

PubMed

Ling, Hong-Qing; Ma, Bin; Shi, Xiaoli; Liu, Hui; Dong, Lingli; Sun, Hua; Cao, Yinghao; Gao, Qiang; Zheng, Shusong; Li, Ye; Yu, Ying; Du, Huilong; Qi, Ming; Li, Yan; Lu, Hongwei; Yu, Hua; Cui, Yan; Wang, Ning; Chen, Chunlin; Wu, Huilan; Zhao, Yan; Zhang, Juncheng; Li, Yiwen; Zhou, Wenjuan; Zhang, Bairu; Hu, Weijuan; van Eijk, Michiel J T; Tang, Jifeng; Witsenboer, Hanneke M A; Zhao, Shancen; Li, Zhensheng; Zhang, Aimin; Wang, Daowen; Liang, Chengzhi

2018-05-09

Triticum urartu (diploid, AA) is the progenitor of the A subgenome of tetraploid (Triticum turgidum, AABB) and hexaploid (Triticum aestivum, AABBDD) wheat. Genomic studies of T. urartu have been useful for investigating the structure, function and evolution of polyploid wheat genomes. Here we report the generation of a high-quality genome sequence of T. urartu by combining bacterial artificial chromosome (BAC)-by-BAC sequencing, single molecule real-time whole-genome shotgun sequencing, linked reads and optical mapping. We assembled seven chromosome-scale pseudomolecules and identified protein-coding genes, and we suggest a model for the evolution of T. urartu chromosomes. Comparative analyses with genomes of other grasses showed gene loss and amplification in the numbers of transposable elements in the T. urartu genome. Population genomics analysis of 147 T. urartu accessions from across the Fertile Crescent showed clustering of three groups, with differences in altitude and biostress, such as powdery mildew disease. The T. urartu genome assembly provides a valuable resource for studying genetic variation in wheat and related grasses, and promises to facilitate the discovery of genes that could be useful for wheat improvement.
Nucleotide sequence analysis of the 3' terminal region of a wasabi strain of crucifer tobamovirus genomic RNA: subgrouping of crucifer tobamoviruses.

PubMed

Shimamoto, I; Sonoda, S; Vazquez, P; Minaka, N; Nishiguchi, M

1998-01-01

The 3' terminal 2378 nucleotides of a wasabi strain of crucifer tobamovirus (CTMV-W) infectious to crucifer plants were determined. This region includes the 3' non-coding region of 235 nucleotides, coat protein (CP) gene (468 nucleotides), movement protein (MP) gene (798 nucleotides) and C-terminal partial readthrough portion of the 180 K protein gene (940 nucleotides). Comparison of the sequence with homologous regions of thirteen other tobamovirus genomes showed that it had much higher identity to those of four other crucifer tobamoviruses, 85.2% to cr-TMV and turnip vein-clearing virus (TVCV), 87.4% to oilseed rape mosaic virus (ORMV) and 87.1% to TMV-Cg, than to those of other tobamoviruses. Thus CTMV-W was most similar to ORMV and TMV-Cg in sequence, but only marginally so, whereas the location and size of its MP gene were the same as those of cr-TMV and TVCV. These results, together with other analyses, show that CTMV-W is a new crucifer tobamovirus, that the five crucifer tobamoviruses can be classified into two subgroups based on MP gene organization, and that the rate of sequence change is not the same in all lineages.
The accelerated build-up of the red sequence in high-redshift galaxy clusters

NASA Astrophysics Data System (ADS)

Cerulo, P.; Couch, W. J.; Lidman, C.; Demarco, R.; Huertas-Company, M.; Mei, S.; Sánchez-Janssen, R.; Barrientos, L. F.; Muñoz, R. P.

2016-04-01

We analyse the evolution of the red sequence in a sample of galaxy clusters at redshifts 0.8 < z < 1.5 taken from the HAWK-I Cluster Survey (HCS). The comparison with the low-redshift (0.04 < z < 0.08) sample of the WIde-field Nearby Galaxy-cluster Survey (WINGS) and other literature results shows that the slope and intrinsic scatter of the cluster red sequence have undergone little evolution since z = 1.5. We find that the luminous-to-faint ratio and the slope of the faint end of the luminosity distribution of the HCS red sequence are consistent with those measured in WINGS, implying that there is no deficit of red galaxies at magnitudes fainter than M_V^* at high redshifts. We find that the most massive HCS clusters host a population of bright red sequence galaxies at M_V < -22.0 mag, which are not observed in low-mass clusters. Interestingly, we also note the presence of a population of very bright (M_V < -23.0 mag) and massive (log (M*/M⊙) > 11.5) red sequence galaxies in the WINGS clusters, which do not include only the brightest cluster galaxies and which are not present in the HCS clusters, suggesting that they formed at epochs later than z = 0.8. The comparison with the luminosity distribution of a sample of passive red sequence galaxies drawn from the COSMOS/UltraVISTA field in the photometric redshift range 0.8 < z_phot < 1.5 shows that the red sequence in clusters is more developed at the faint end, suggesting that halo mass plays an important role in setting the time-scales for the build-up of the red sequence.
Mitochondrial DNA sequences of 37 collar-spined echinostomes (Digenea: Echinostomatidae) in Thailand and Lao PDR reveals presence of two species: Echinostoma revolutum and E. miyagawai.

PubMed

Nagataki, Mitsuru; Tantrawatpan, Chairat; Agatsuma, Takeshi; Sugiura, Tetsuro; Duenngai, Kunyarat; Sithithaworn, Paiboon; Andrews, Ross H; Petney, Trevor N; Saijuntha, Weerachai

2015-10-01

The "37 collar-spined" or "revolutum" group of echinostomes is recognized as a species complex. The identification of members of this complex by morphological taxonomic characters is difficult and confusing, and hence, molecular analyses are a useful alternative method for molecular systematic studies. The current study examined the genetic diversity of those 37 collar-spined echinostomes which are recognized morphologically as Echinostoma revolutum in Thailand and Lao PDR using the cytochrome c oxidase subunit 1 (CO1) and the NADH dehydrogenase subunit 1 (ND1) sequences. On the basis of molecular investigations, at least two species of 37 collar-spined echinostomes exist in Southeast Asia, namely E. revolutum and Echinostoma miyagawai. The specimens examined in this study, coming from ducks in Thailand and Lao PDR, were compared to isolates from America, Europe and Australia for which DNA sequences are available in public databases. Haplotype analysis detected 6 and 26 haplotypes when comparing the CO1 sequences of E. revolutum and E. miyagawai, respectively, from different geographical isolates from Thailand and Lao PDR. The phylogenetic trees, ND1 haplotype network and genetic differentiation (ɸST) analyses showed that E. revolutum were genetically different on a continental scale, i.e. Eurasian and American lineages. Copyright © 2015 Elsevier B.V. All rights reserved.
CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects

PubMed Central

Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

2014-01-01

CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB PMID:25281234
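The filtering described above, which considers variant calls from all samples at once to find candidate causative mutations within a family, can be illustrated with plain set operations. The sketch below is not CanvasDB's R interface or schema; the function `candidate_variants`, the tuple layout `(chromosome, position, ref, alt)`, and the sample names are assumptions made for illustration.

```python
Variant = tuple[str, int, str, str]  # (chromosome, position, ref allele, alt allele)


def candidate_variants(
    calls: dict[str, set[Variant]],
    affected: list[str],
    unaffected: list[str],
) -> set[Variant]:
    """Variants called in every affected sample and in no unaffected sample."""
    if not affected:
        raise ValueError("at least one affected sample is required")
    shared = set.intersection(*(calls[name] for name in affected))
    seen_in_unaffected: set[Variant] = set().union(*(calls[name] for name in unaffected))
    return shared - seen_in_unaffected


# Hypothetical family with two affected siblings and two unaffected parents.
calls = {
    "child": {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T"), ("chr7", 42, "G", "A")},
    "sibling": {("chr1", 1000, "A", "G"), ("chr7", 42, "G", "A")},
    "mother": {("chr7", 42, "G", "A")},
    "father": set(),
}
result = candidate_variants(calls, affected=["child", "sibling"], unaffected=["mother", "father"])
print(result)
```

This particular filter corresponds to a de novo or fully penetrant dominant scenario; a recessive model would need genotype (zygosity) information, which the toy data omit.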
Adjacent DNA sequences modulate Sox9 transcriptional activation at paired Sox sites in three chondrocyte-specific enhancer elements

PubMed Central

Bridgewater, Laura C.; Walker, Marlan D.; Miller, Gwen C.; Ellison, Trevor A.; Holsinger, L. Daniel; Potter, Jennifer L.; Jackson, Todd L.; Chen, Reuben K.; Winkel, Vicki L.; Zhang, Zhaoping; McKinney, Sandra; de Crombrugghe, Benoit

2003-01-01

Expression of the type XI collagen gene Col11a2 is directed to cartilage by at least three chondrocyte-specific enhancer elements, two in the 5′ region and one in the first intron of the gene. The three enhancers each contain two heptameric sites with homology to the Sox protein-binding consensus sequence. The two sites are separated by 3 or 4 bp and arranged in opposite orientation to each other. Targeted mutational analyses of these three enhancers showed that in the intronic enhancer, as in the other two enhancers, both Sox sites in a pair are essential for enhancer activity. The transcription factor Sox9 binds as a dimer at the paired sites, and the introduction of insertion mutations between the sites demonstrated that physical interactions between the adjacently bound proteins are essential for enhancer activity. Additional mutational analyses demonstrated that although Sox9 binding at the paired Sox sites is necessary for enhancer activity, it alone is not sufficient. Adjacent DNA sequences in each enhancer are also required, and mutation of those sequences can eliminate enhancer activity without preventing Sox9 binding. The data suggest a new model in which adjacently bound proteins affect the DNA bend angle produced by Sox9, which in turn determines whether an active transcriptional enhancer complex is assembled. PMID:12595563
DNA mutation motifs in the genes associated with inherited diseases.

PubMed

Růžička, Michal; Kulhánek, Petr; Radová, Lenka; Čechová, Andrea; Špačková, Naďa; Fajkusová, Lenka; Réblová, Kamila

2017-01-01

Mutations in human genes can be responsible for inherited genetic disorders and cancer. Mutations can arise due to environmental factors or spontaneously. It has been shown that certain DNA sequences are more prone to mutate. These sites are termed hotspots and exhibit a higher mutation frequency than expected by chance. In contrast, DNA sequences with lower mutation frequencies than expected by chance are termed coldspots. Mutation hotspots are usually derived from a mutation spectrum, which reflects a particular population in which the effect of a common ancestor plays a role. To detect coldspots/hotspots unaffected by population bias, we analysed the presence of germline mutations obtained from the HGMD database in the 5-nucleotide segments repeatedly occurring in genes associated with common inherited disorders, in particular the PAH, LDLR, CFTR, F8, and F9 genes. Statistically significant sequences (mutational motifs) rarely associated with mutations (coldspots) and frequently associated with mutations (hotspots) exhibited characteristic sequence patterns, e.g. coldspots contained purine tracts while hotspots showed alternating purine-pyrimidine bases, often with the presence of a CpG dinucleotide. Using molecular dynamics simulations and free energy calculations, we analysed the global bending properties of two selected coldspots and two hotspots with a G/T mismatch. We observed that the coldspots were inherently more flexible than the hotspots. We assume that this property might be critical for effective mismatch repair, as DNA with a mutation recognized by the MutSα protein is noticeably bent.
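The motif analysis above rests on counting how often each 5-nucleotide segment occurs and how often it coincides with a mutation. The following sketch shows one way such a tally could be made. It is an illustration only: attributing a mutation to the segment whose central base is mutated is an assumption of this sketch, not necessarily the study's definition, and the sequence and mutated positions are invented.

```python
from collections import Counter


def kmer_mutation_frequency(
    sequence: str, mutated_positions: set[int], k: int = 5
) -> dict[str, float]:
    """For each k-mer, the fraction of its occurrences whose central base is mutated.

    mutated_positions holds 0-based indices into sequence.
    """
    occurrences: Counter[str] = Counter()
    mutated: Counter[str] = Counter()
    centre = k // 2
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        occurrences[kmer] += 1
        if start + centre in mutated_positions:
            mutated[kmer] += 1
    return {kmer: mutated[kmer] / count for kmer, count in occurrences.items()}


# Hypothetical gene fragment with mutations reported at positions 2 and 6.
fragment = "ACGTACGTACGTA"
frequencies = kmer_mutation_frequency(fragment, mutated_positions={2, 6})
for kmer, freq in sorted(frequencies.items()):
    print(kmer, f"{freq:.2f}")
```

Deciding whether a motif is a hotspot or coldspot would additionally require a statistical test against the frequency expected by chance, which this tally does not perform.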
Investigating the genetics of Bti resistance using mRNA tag sequencing: application on laboratory strains and natural populations of the dengue vector Aedes aegypti

PubMed Central

Paris, Margot; Marcombe, Sebastien; Coissac, Eric; Corbel, Vincent; David, Jean-Philippe; Després, Laurence

2013-01-01

Mosquito control is often the main method used to reduce mosquito-transmitted diseases. In order to investigate the genetic basis of resistance to the bio-insecticide Bacillus thuringiensis subsp. israelensis (Bti), we used information on polymorphism obtained from cDNA tag sequences from pooled larvae of laboratory Bti-resistant and susceptible Aedes aegypti mosquito strains to identify and analyse 1520 single nucleotide polymorphisms (SNPs). Of the 372 SNPs tested, 99.2% were validated using DNA Illumina GoldenGate® array, with a strong correlation between the allelic frequencies inferred from the pooled and individual data (r = 0.85). A total of 11 genomic regions and five candidate genes were detected using a genome scan approach. One of these candidate genes showed significant departures from neutrality in the resistant strain at sequence level. Six natural populations from Martinique Island were sequenced for the 372 tested SNPs with a high transferability (87%), and association mapping analyses detected 14 loci associated with Bti resistance, including one located in a putative receptor for Cry11 toxins. Three of these loci were also significantly differentiated between the laboratory strains, suggesting that most of the genes associated with resistance might differ between the two environments. It also suggests that common selected regions might harbour key genes for Bti resistance. PMID:24187584
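The validation step above compares allele frequencies inferred from pooled sequencing with those from individual genotyping and summarises the agreement with a correlation coefficient (r = 0.85 in the study). A minimal sketch of Pearson's r follows; the function name `pearson_r` and the four pairs of frequencies are hypothetical and are not data from the paper.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical allele frequencies for four SNPs (pooled vs. individual genotyping).
pooled = [0.10, 0.35, 0.50, 0.80]
individual = [0.15, 0.30, 0.55, 0.75]
r = pearson_r(pooled, individual)
print(f"r = {r:.3f}")
```

A high r indicates that pooled estimates rank and scale the SNPs similarly to individual genotyping; it does not by itself rule out a systematic bias in the pooled frequencies.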
Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).

PubMed

Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar

2016-12-01

In higher eukaryotes, the minor rDNA family codes for 5S rRNA, is arranged in tandem arrays and comprises a highly conserved 120 bp long coding sequence with a variable non-transcribed spacer (NTS). Initially, the 5S rDNA repeats were considered to have evolved by the process of concerted evolution. However, some recent reports, including those on teleost fishes, suggested that evolution of the 5S rDNA repeat does not fit the concerted evolution model and that evolution of the 5S rDNA family may be explained by a birth-and-death evolution model. In order to study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear clustering in a species-specific manner. The stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the view that 5S rDNA repeats in the genus Channa evolved under the process of birth-and-death.
CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects.

PubMed

Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

2014-01-01

CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB. © The Author(s) 2014. Published by Oxford University Press.
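The abstract describes a filter that considers variant calls from all samples at once, for example to find candidate causative mutations within a family. CanvasDB exposes this through R commands; the Python below is only an illustrative stand-in for the underlying set logic, and the function name, sample names, and variant tuples are all hypothetical.

```python
def candidate_variants(calls, affected, unaffected):
    """Return variants present in every affected sample and in no unaffected one.

    calls: dict mapping sample name -> set of (chrom, pos, ref, alt) tuples.
    """
    if not affected:
        raise ValueError("at least one affected sample is required")
    # Variants shared by all affected samples.
    shared = set.intersection(*(calls[s] for s in affected))
    # Any variant seen in at least one unaffected sample is excluded.
    seen_in_unaffected = set().union(*(calls[s] for s in unaffected))
    return shared - seen_in_unaffected

# Hypothetical family with two affected children and unaffected parents.
calls = {
    "child": {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")},
    "sibling": {("chr1", 100, "A", "G"), ("chr3", 300, "G", "A")},
    "mother": {("chr2", 200, "C", "T")},
    "father": {("chr3", 300, "G", "A")},
}
candidates = candidate_variants(calls, ["child", "sibling"], ["mother", "father"])
```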
Sockeye: A 3D Environment for Comparative Genomics

PubMed Central

Montgomery, Stephen B.; Astakhova, Tamara; Bilenky, Mikhail; Birney, Ewan; Fu, Tony; Hassel, Maik; Melsopp, Craig; Rak, Marcin; Robertson, A. Gordon; Sleumer, Monica; Siddiqui, Asim S.; Jones, Steven J.M.

2004-01-01

Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases, the ability to perform large-scale comparative analyses has become increasingly relevant. In addition, the growing complexity of genomic feature annotation means that new approaches to genomic visualization need to be explored. We have developed a Java-based application called Sockeye that uses three-dimensional (3D) graphics technology to facilitate the visualization of annotation and conservation across multiple sequences. This software uses the Ensembl database project to import sequence and annotation information from several eukaryotic species. A user can additionally import their own custom sequence and annotation data. Individual annotation objects are displayed in Sockeye by using custom 3D models. Ensembl-derived and imported sequences can be analyzed by using a suite of multiple and pair-wise alignment algorithms. The results of these comparative analyses are also displayed in the 3D environment of Sockeye. By using the Java3D API to visualize genomic data in a 3D environment, we are able to compactly display cross-sequence comparisons. This provides the user with a novel platform for visualizing and comparing genomic feature organization. PMID:15123592
RIEMS: a software pipeline for sensitive and comprehensive taxonomic classification of reads from metagenomics datasets.

PubMed

Scheuch, Matthias; Höper, Dirk; Beer, Martin

2015-03-03

Fuelled by the advent and subsequent development of next generation sequencing technologies, metagenomics became a powerful tool for the analysis of microbial communities both scientifically and diagnostically. The biggest challenge is the extraction of relevant information from the huge sequence datasets generated for metagenomics studies. Although a plethora of tools are available, data analysis is still a bottleneck. To overcome the bottleneck of data analysis, we developed an automated computational workflow called RIEMS - Reliable Information Extraction from Metagenomic Sequence datasets. RIEMS assigns every individual read sequence within a dataset taxonomically by cascading different sequence analyses with decreasing stringency of the assignments using various software applications. After completion of the analyses, the results are summarised in a clearly structured result protocol organised taxonomically. The high accuracy and performance of RIEMS analyses were proven in comparison with other tools for metagenomics data analysis using simulated sequencing read datasets. RIEMS has the potential to fill the gap that still exists with regard to data analysis for metagenomics studies. The usefulness and power of RIEMS for the analysis of genuine sequencing datasets was demonstrated with an early version of RIEMS in 2011 when it was used to detect the orthobunyavirus sequences leading to the discovery of Schmallenberg virus.
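The cascading idea described above (each read is passed through successive analyses of decreasing stringency until one of them assigns it) can be sketched as follows. This is not the RIEMS implementation: the two matching functions and the tiny reference table are hypothetical placeholders for the external software the pipeline actually calls.

```python
def classify_cascade(reads, stages):
    """Assign each read at the first (most stringent) stage that classifies it.

    stages: list of (stage_name, classifier) ordered from most to least
    stringent; a classifier returns a taxon name or None.
    Reads are used as dictionary keys, so duplicate reads collapse.
    """
    assignments = {}
    remaining = list(reads)
    for stage_name, classifier in stages:
        unassigned = []
        for read in remaining:
            taxon = classifier(read)
            if taxon is None:
                unassigned.append(read)  # passed on to the next, looser stage
            else:
                assignments[read] = (taxon, stage_name)
        remaining = unassigned
    return assignments, remaining

# Hypothetical reference sequences and matchers (illustration only).
REFERENCE = {"ACGTACGT": "Virus A", "TTGGCCAA": "Bacterium B"}

def exact_match(read):
    return REFERENCE.get(read)

def one_mismatch(read):
    for ref, taxon in REFERENCE.items():
        if len(ref) == len(read) and sum(x != y for x, y in zip(ref, read)) <= 1:
            return taxon
    return None

reads = ["ACGTACGT", "TTGGCCAT", "GGGGGGGG"]
assignments, unclassified = classify_cascade(
    reads, [("exact", exact_match), ("one-mismatch", one_mismatch)]
)
```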
Levels of integration in cognitive control and sequence processing in the prefrontal cortex.

PubMed

Bahlmann, Jörg; Korb, Franziska M; Gratton, Caterina; Friederici, Angela D

2012-01-01

Cognitive control is necessary to flexibly act in changing environments. Sequence processing is needed in language comprehension to build the syntactic structure in sentences. Functional imaging studies suggest that sequence processing engages the left ventrolateral prefrontal cortex (PFC). In contrast, cognitive control processes additionally recruit bilateral rostral lateral PFC regions. The present study aimed to investigate these two types of processes in one experimental paradigm. Sequence processing was manipulated using two different sequencing rules varying in complexity. Cognitive control was varied with different cue-sets that determined the choice of a sequencing rule. Univariate analyses revealed distinct PFC regions for the two types of processing (i.e. sequence processing: left ventrolateral PFC and cognitive control processing: bilateral dorsolateral and rostral PFC). Moreover, in a common brain network (including left lateral PFC and intraparietal sulcus) no interaction between sequence and cognitive control processing was observed. In contrast, a multivariate pattern analysis revealed an interaction of sequence and cognitive control processing, such that voxels in left lateral PFC and parietal cortex showed different tuning functions for tasks involving different sequencing and cognitive control demands. These results suggest that the difference between the process of rule selection (i.e. cognitive control) and the process of rule-based sequencing (i.e. sequence processing) find their neuronal underpinnings in distinct activation patterns in lateral PFC. Moreover, the combination of rule selection and rule sequencing can shape the response of neurons in lateral PFC and parietal cortex.
Levels of Integration in Cognitive Control and Sequence Processing in the Prefrontal Cortex

PubMed Central

Bahlmann, Jörg; Korb, Franziska M.; Gratton, Caterina; Friederici, Angela D.

2012-01-01

Cognitive control is necessary to flexibly act in changing environments. Sequence processing is needed in language comprehension to build the syntactic structure in sentences. Functional imaging studies suggest that sequence processing engages the left ventrolateral prefrontal cortex (PFC). In contrast, cognitive control processes additionally recruit bilateral rostral lateral PFC regions. The present study aimed to investigate these two types of processes in one experimental paradigm. Sequence processing was manipulated using two different sequencing rules varying in complexity. Cognitive control was varied with different cue-sets that determined the choice of a sequencing rule. Univariate analyses revealed distinct PFC regions for the two types of processing (i.e. sequence processing: left ventrolateral PFC and cognitive control processing: bilateral dorsolateral and rostral PFC). Moreover, in a common brain network (including left lateral PFC and intraparietal sulcus) no interaction between sequence and cognitive control processing was observed. In contrast, a multivariate pattern analysis revealed an interaction of sequence and cognitive control processing, such that voxels in left lateral PFC and parietal cortex showed different tuning functions for tasks involving different sequencing and cognitive control demands. These results suggest that the difference between the process of rule selection (i.e. cognitive control) and the process of rule-based sequencing (i.e. sequence processing) find their neuronal underpinnings in distinct activation patterns in lateral PFC. Moreover, the combination of rule selection and rule sequencing can shape the response of neurons in lateral PFC and parietal cortex. PMID:22952762

Whole-genome characterization of a Peruvian alpaca rotavirus isolate expressing a novel VP4 genotype.

PubMed

Rojas, Miguel; Gonçalves, Jorge Luiz S; Dias, Helver G; Manchego, Alberto; Pezo, Danilo; Santos, Norma

2016-11-30

The SA44 isolate of Rotavirus A (RVA) was identified from a neonatal Peruvian alpaca presenting with diarrhea, and the full-length genome sequence of the isolate (designated RVA/Alpaca-tc/PER/SA44/2014/G3P[40]) was determined. Phylogenetic analyses showed that the isolate possessed the genotype constellation G3-P[40]-I8-R3-C3-M3-A9-N3-T3-E3-H6, which differs considerably from those of RVA strains isolated from other species of the order Artiodactyla. Overall, the genetic constellation of the SA44 strain was quite similar to those of RVA strains isolated from a bat in Asia (MSLH14 and MYAS33). Nonetheless, phylogenetic analyses of each genome segment identified a distinct combination of genes. Several sequences were closely related to corresponding gene sequences in RVA strains from other species, including human (VP1, VP2, NSP1, and NSP2), simian (VP3 and NSP5), bat (VP6 and NSP4), and equine (NSP3). The VP7 gene sequence was closely related to RVA strains from a Peruvian alpaca (K'ayra/3368-10; 99.0% nucleotide and 99.7% amino acid identity) and from humans (RCH272; 95% nucleotide and 99.0% amino acid identity). The nucleotide sequence of the VP4 gene was distantly related to other VP4 sequences and was designated as the reference strain for the new P[40] genotype. This unique genetic makeup suggests that the SA44 strain emerged from multiple reassortment events between bat-, equine-, and human-like RVA strains. Copyright © 2016 Elsevier B.V. All rights reserved.
Communities of archaea and bacteria in a subsurface radioactive thermal spring in the Austrian Central Alps, and evidence of ammonia-oxidizing Crenarchaeota.

PubMed

Weidler, Gerhard W; Dornmayr-Pfaffenhuemer, Marion; Gerbl, Friedrich W; Heinen, Wolfgang; Stan-Lotter, Helga

2007-01-01

Scanning electron microscopy revealed great morphological diversity in biofilms from several largely unexplored subterranean thermal Alpine springs, which contain radium 226 and radon 222. A culture-independent molecular analysis of microbial communities on rocks and in the water of one spring, the "Franz-Josef-Quelle" in Bad Gastein, Austria, was performed. Four hundred fifteen clones were analyzed. One hundred thirty-two sequences were affiliated with 14 bacterial operational taxonomic units (OTUs) and 283 with four archaeal OTUs. Rarefaction analysis indicated a high diversity of bacterial sequences, while archaeal sequences were less diverse. The majority of the cloned archaeal 16S rRNA gene sequences belonged to the soil-freshwater-subsurface (1.1b) crenarchaeotic group; other representatives belonged to the freshwater-wastewater-soil (1.3b) group, except one clone, which was related to a group of uncultivated Euryarchaeota. These findings support recent reports that Crenarchaeota are not restricted to high-temperature environments. Most of the bacterial sequences were related to the Proteobacteria (alpha, beta, gamma, and delta), Bacteroidetes, and Planctomycetes. One OTU was allied with Nitrospina sp. (delta-Proteobacteria) and three others grouped with Nitrospira. Statistical analyses suggested high diversity based on 16S rRNA gene analyses; the rarefaction plot of archaeal clones showed a plateau. Since Crenarchaeota have been implicated recently in the nitrogen cycle, the spring environment was probed for the presence of the ammonia monooxygenase subunit A (amoA) gene. Sequences were obtained which were related to crenarchaeotic amoA genes from marine and soil habitats. The data suggested that nitrification processes are occurring in the subterranean environment and that ammonia may possibly be an energy source for the resident communities.
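A rarefaction curve of the kind mentioned above gives the expected number of OTUs in a random subsample of n clones; a plateau indicates that further sampling would add few new OTUs. A minimal sketch using the standard hypergeometric expectation, E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)], is shown below. The per-OTU clone counts are hypothetical: the abstract reports only the totals (283 archaeal clones in four OTUs), not the split.

```python
from math import comb

def rarefaction_curve(otu_counts, step=1):
    """Expected number of OTUs observed in subsamples of increasing size.

    otu_counts: number of clones assigned to each OTU.
    Returns a list of (subsample_size, expected_otus) pairs.
    """
    total = sum(otu_counts)
    curve = []
    for n in range(0, total + 1, step):
        # Probability that an OTU with c clones is missed entirely is
        # C(total - c, n) / C(total, n); comb() returns 0 when n > total - c.
        expected = sum(1 - comb(total - c, n) / comb(total, n) for c in otu_counts)
        curve.append((n, expected))
    return curve

# Hypothetical split of 283 archaeal clones over four OTUs (illustration only).
archaeal_counts = [200, 60, 22, 1]
curve = rarefaction_curve(archaeal_counts)
```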
First molecular detection and phylogenetic analysis of Anaplasma phagocytophilum in shelter dogs in Seoul, Korea.

PubMed

Lee, Sukyee; Lee, Seung-Hun; VanBik, Dorene; Kim, Neung-Hee; Kim, Kyoo-Tae; Goo, Youn-Kyoung; Rhee, Man Hee; Kwon, Oh-Deog; Kwak, Dongmi

2016-07-01

In this study, the status of Anaplasma phagocytophilum infection was assessed in shelter dogs in Seoul, Korea, with PCR and phylogenetic analyses. Nested PCR on 1058 collected blood samples revealed only one A. phagocytophilum positive sample (female, age <1year, mixed breed, collected from the north of the Han River). The genetic variability of A. phagocytophilum was evaluated by genotyping, using the 16S rRNA, groEL, and msp2 gene sequences of the positive sample. BLASTn analysis revealed that the 16S rRNA, groEL, and msp2 genes had 99.6%, 99.9%, and 100% identity with the following sequences deposited in GenBank: a cat 16S rRNA sequence from Korea (KR021166), a rat groEL sequence from Korea (KT220194), and a water deer msp2 sequence from Korea (HM752099), respectively. Phylogenetic analyses classified the groEL gene into two distinct groups (serine and alanine), whereas the msp2 gene showed a general classification into two groups (USA and Europe) that were further subgrouped according to region. To the best of our knowledge, this study is the first to describe the molecular diagnosis of A. phagocytophilum in dogs reared in Korea. In addition, the high genetic identity of the 16S rRNA and groEL sequences between humans and dogs from the same region suggests a possible epidemiological relation. Given the conditions of climate change, tick ecology, and recent incidence of human granulocytic anaplasmosis in Korea, the findings of this study underscore the need to establish appropriate control programs for tick-borne diseases in Korea. Copyright © 2016 Elsevier GmbH. All rights reserved.
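The identity values quoted above (99.6%, 99.9%, and 100%) are percentages of matching positions in a pairwise alignment. As a rough sketch of that calculation, assuming two already-aligned sequences of equal length with '-' for gaps (the function is hypothetical, and BLASTn's own reporting may treat gaps differently):

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical positions between two aligned sequences.

    Columns where both sequences have a gap are ignored; a gap opposite a
    base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```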
A Phylogenetic Analysis of the Genus Fragaria (Strawberry) Using Intron-Containing Sequence from the ADH-1 Gene

PubMed Central

DiMeglio, Laura M.; Yu, Hongrun; Davis, Thomas M.

2014-01-01

The genus Fragaria encompasses species at ploidy levels ranging from diploid to decaploid. The cultivated strawberry, Fragaria×ananassa, and its two immediate progenitors, F. chiloensis and F. virginiana, are octoploids. To elucidate the ancestries of these octoploid species, we performed a phylogenetic analysis using intron-containing sequences of the nuclear ADH-1 gene from 39 germplasm accessions representing nineteen Fragaria species and one outgroup species, Dasiphora fruticosa. All trees from Maximum Parsimony and Maximum Likelihood analyses showed two major clades, Clade A and Clade B. Each of the sampled octoploids contributed alleles to both major clades. All octoploid-derived alleles in Clade A clustered with alleles of diploid F. vesca, with the exception of one octoploid allele that clustered with the alleles of diploid F. mandshurica. All octoploid-derived alleles in clade B clustered with the alleles of only one diploid species, F. iinumae. When gaps encoded as binary characters were included in the Maximum Parsimony analysis, tree resolution was improved with the addition of six nodes, and the bootstrap support was generally higher, rising above the 50% threshold for an additional nine branches. These results, coupled with the congruence of the sequence data and the coded gap data, validate and encourage the employment of sequence sets containing gaps for phylogenetic analysis. Our phylogenetic conclusions, based upon sequence data from the ADH-1 gene located on F. vesca linkage group II, complement and generally agree with those obtained from analyses of protein-encoding genes GBSSI-2 and DHAR located on F. vesca linkage groups V and VII, respectively, but differ from a previous study that utilized rDNA sequences and did not detect the ancestral role of F. iinumae. PMID:25078607
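The abstract reports that coding alignment gaps as binary characters improved tree resolution. One simple way to derive such characters is sketched below, under the assumption that every distinct gap (identified by its start and end column) becomes one presence/absence character. This is a simplified illustration, not necessarily the coding scheme used in the study, and it does not treat overlapping or terminal gaps specially.

```python
import re

def code_gaps(alignment):
    """Turn alignment gaps into binary presence/absence characters.

    alignment: dict mapping taxon name -> aligned sequence ('-' for gaps).
    Returns (characters, matrix): characters is a sorted list of
    (start, end) gap spans; matrix maps each taxon to a '0'/'1' string.
    """
    gaps_by_taxon = {
        name: {m.span() for m in re.finditer(r"-+", seq)}
        for name, seq in alignment.items()
    }
    characters = sorted(set().union(*gaps_by_taxon.values()))
    matrix = {
        name: "".join("1" if span in gaps else "0" for span in characters)
        for name, gaps in gaps_by_taxon.items()
    }
    return characters, matrix

# Hypothetical three-taxon alignment (illustration only).
characters, matrix = code_gaps(
    {"t1": "ACG--TAC", "t2": "ACGTTTAC", "t3": "A-G--TAC"}
)
```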
Interim Reliability Evaluation Program: analysis of the Browns Ferry, Unit 1, nuclear plant. Main report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mays, S.E.; Poloski, J.P.; Sullivan, W.H.

1982-07-01

A probabilistic risk assessment (PRA) was made of the Browns Ferry, Unit 1, nuclear plant as part of the Nuclear Regulatory Commission's Interim Reliability Evaluation Program (IREP). Specific goals of the study were to identify the dominant contributors to core melt, develop a foundation for more extensive use of PRA methods, expand the cadre of experienced PRA practitioners, and apply procedures for extension of IREP analyses to other domestic light water reactors. Event tree and fault tree analyses were used to estimate the frequency of accident sequences initiated by transients and loss of coolant accidents. External events such as floods, fires, earthquakes, and sabotage were beyond the scope of this study and were, therefore, excluded. From these sequences, the dominant contributors to probable core melt frequency were chosen. Uncertainty and sensitivity analyses were performed on these sequences to better understand the limitations associated with the estimated sequence frequencies. Dominant sequences were grouped according to common containment failure modes and corresponding release categories on the basis of comparison with analyses of similar designs rather than on the basis of detailed plant-specific calculations.
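In an event tree, the frequency of one accident sequence is commonly estimated as the initiating-event frequency multiplied by the conditional probabilities of the branches along that path. The sketch below illustrates this arithmetic and the ranking of dominant sequences; it assumes independent branch probabilities, and all sequence names and numbers are hypothetical rather than values from the report.

```python
from math import prod

def sequence_frequency(initiator_per_year, branch_probabilities):
    """Frequency (per year) of one event-tree path."""
    return initiator_per_year * prod(branch_probabilities)

def rank_sequences(sequences):
    """Sort sequences by estimated frequency, most frequent first.

    sequences: dict name -> (initiator_per_year, [branch probabilities]).
    """
    freqs = {name: sequence_frequency(f, ps) for name, (f, ps) in sequences.items()}
    return sorted(freqs.items(), key=lambda item: item[1], reverse=True)

# Hypothetical sequences (illustration only).
ranked = rank_sequences({
    "transient, cooling system fails": (0.1, [1e-2, 5e-3]),
    "small LOCA, injection fails": (1e-3, [2e-3]),
})
```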
Cytophotometric and biochemical analyses of DNA in pentaploid and diploid Agave species.

PubMed

Cavallini, A; Natali, L; Cionini, G; Castorena-Sanchez, I

1996-04-01

Nuclear DNA content, chromatin structure, and DNA composition were investigated in four Agave species: two diploid, Agave tequilana Weber and Agave angustifolia Haworth var. marginata Hort., and two pentaploid, Agave fourcroydes Lemaire and Agave sisalana Perrine. It was determined that the genome size of pentaploid species is nearly 2.5 times that of diploid ones. Cytophotometric analyses of chromatin structure were performed following Feulgen or DAPI staining to determine optical density profiles of interphase nuclei. Pentaploid species showed higher frequencies of condensed chromatin (heterochromatin) than diploid species. On the other hand, a lower frequency of A-T rich (DAPI stained) heterochromatin was found in pentaploid species than in diploid ones, indicating that heterochromatin in pentaploid species is made up of sequences with base compositions different from those of diploid species. Since thermal denaturation profiles of extracted DNA showed minor variations in the base composition of the genomes of the four species, it is supposed that, in pentaploid species, the large heterochromatin content is not due to an overrepresentation of G-C repetitive sequences but rather to the condensation of nonrepetitive sequences, such as, for example, redundant gene copies switched off in the polyploid complement. It is suggested that speciation in the genus Agave occurs through point mutations and minor DNA rearrangements, as is also indicated by the relative stability of the karyotype of this genus. Key words : Agave, DNA cytophotometry, DNA melting profiles, chromatin structure, genome size.
Lack of viral selection in human immunodeficiency virus type 1 mother-to-child transmission with primary infection during late pregnancy and/or breastfeeding.

PubMed

Ceballos, Ana; Andreani, Guadalupe; Ripamonti, Chiara; Dilernia, Dario; Mendez, Ramiro; Rabinovich, Roberto D; Cárdenas, Patricia Coll; Zala, Carlos; Cahn, Pedro; Scarlatti, Gabriella; Martínez Peralta, Liliana

2008-11-01

Mother-to-child transmission (MTCT) of human immunodeficiency virus type 1 (HIV-1) as described for women with an established infection is, in most cases, associated with the transmission of few maternal variants. This study analysed virus variability in four cases of maternal primary infection occurring during pregnancy and/or breastfeeding. Estimated time of seroconversion was at 4 months of pregnancy for one woman (early seroconversion) and during the last months of pregnancy and/or breastfeeding for the remaining three (late seroconversion). The C2V3 envelope region was analysed in samples of mother-child pairs by molecular cloning and sequencing. Comparisons of nucleotide and amino acid sequences as well as phylogenetic analysis were performed. The results showed low variability in the virus population of both mother and child. Maximum-likelihood analysis showed that, in the early pregnancy seroconversion case, a minor viral variant with further evolution in the child was transmitted, which could indicate a selection event in MTCT or a stochastic event, whereas in the late seroconversion cases, the mother's and child's sequences were intermingled, which is compatible with the transmission of multiple viral variants from the mother's major population. These results could be explained by the less pronounced selective pressure exerted by the immune system in the early stages of the mother's infection, which could play a role in MTCT of HIV-1.
Taxonomic and predicted metabolic profiles of the human gut microbiome in pre-Columbian mummies.

PubMed

Santiago-Rodriguez, Tasha M; Fornaciari, Gino; Luciani, Stefania; Dowd, Scot E; Toranzos, Gary A; Marota, Isolina; Cano, Raul J

2016-11-01

Characterization of naturally mummified human gut remains could potentially provide insights into the preservation and evolution of commensal and pathogenic microorganisms, and metabolic profiles. We characterized the gut microbiome of two pre-Columbian Andean mummies dating to the 10-15th centuries using 16S rRNA gene high-throughput sequencing and metagenomics, and compared them to a previously characterized gut microbiome of an 11th century AD pre-Columbian Andean mummy. Our previous study showed that the Clostridiales represented the majority of the bacterial communities in the mummified gut remains, but that other microbial communities were also preserved during the process of natural mummification, as shown with the metagenomics analyses. The gut microbiomes of the other two mummies were mainly composed of Clostridiales or Bacillales, as demonstrated with 16S rRNA gene amplicon sequencing, many of which are facultative anaerobes, possibly consistent with the process of natural mummification requiring low oxygen levels. Metagenome analyses showed the presence of other microbial groups that were positively or negatively correlated with specific metabolic profiles. The presence of sequences similar to both Trypanosoma cruzi and Leishmania donovani could suggest that these pathogens were prevalent in pre-Columbian individuals. Taxonomic and functional profiling of mummified human gut remains will aid in the understanding of the microbial ecology of the process of natural mummification. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Pigmentiphaga aceris sp. nov., isolated from tree sap.

PubMed

Lee, Soon Dong

2017-09-01

Two Gram-stain-negative bacterial strains, SAP-32T and SAP-36, were isolated from sap drawn from the Acer pictum from Mount Halla in Jeju, Republic of Korea. The organisms were strictly aerobic, non-sporulating, motile rods and showed growth at 10-30 °C, pH 7-8 and with 0-2 % NaCl. The major isoprenoid quinone was Q-8. The predominant fatty acids were C16 : 0, cyclo-C17 : 0, summed feature 3 and C18 : 0. The polar lipids contained phosphatidylcholine, phosphatidylethanolamine, phosphatidylglycerol, an unknown aminophosphoglycolipid, an unknown glycolipid, an unknown phospholipid and two unknown lipids. The DNA G+C content was 64.4 mol%. The results of phylogenetic analyses based on 16S rRNA gene sequences indicated that SAP-32T and SAP-36 formed a distinct cluster with members of the genus Pigmentiphaga within the family Alcaligenaceae. Both strains showed 16S rRNA gene sequence similarity of 100 % to each other. The closest relatives of the isolates were Pigmentiphaga daeguensis (97.08 % sequence similarity), Pigmentiphaga kullae (97.01 %) and Pigmentiphaga litoralis (96.73 %). On the basis of data from phenotypic, chemotaxonomic and phylogenetic analyses, SAP-32T (=KCTC 52619T=DSM 104039T) and SAP-36 (=KCTC 52620=DSM 104072) represent members of a novel species of the genus Pigmentiphaga, for which the name Pigmentiphaga aceris sp. nov. is proposed.
Next generation sequencing of SNPs for non-invasive prenatal diagnosis: challenges and feasibility as illustrated by an application to β-thalassaemia

PubMed Central

Papasavva, Thessalia; van IJcken, Wilfred F J; Kockx, Christel E M; van den Hout, Mirjam C G N; Kountouris, Petros; Kythreotis, Loukas; Kalogirou, Eleni; Grosveld, Frank G; Kleanthous, Marina

2013-01-01

β-Thalassaemia is one of the most common autosomal recessive single-gene disorders worldwide, with a carrier frequency of 12% in Cyprus. Prenatal tests for at-risk pregnancies use invasive methods, and development of a non-invasive prenatal diagnostic (NIPD) method is of paramount importance to prevent unnecessary risks inherent to invasive methods. Here, we describe such a method by assessing a modified version of next generation sequencing (NGS) using the Illumina platform, called 'targeted sequencing', based on the detection of paternally inherited fetal alleles in maternal plasma. We selected four single-nucleotide polymorphisms (SNPs) located in the β-globin locus with a high degree of heterozygosity in the Cypriot population. Spiked genomic samples were used to determine the specificity of the platform. We could detect the minor alleles in the expected ratio, showing the specificity of the platform. We then developed a multiplexed format for the selected SNPs and analysed ten maternal plasma samples from pregnancies at risk. The presence or absence of the paternal mutant allele was correctly determined in 27 out of 34 samples analysed. With haplotype analysis, NIPD was possible in eight out of ten families. This is the first study carried out for the NIPD of β-thalassaemia using targeted NGS and haplotype analysis. Preliminary results show that NGS is effective in detecting paternally inherited alleles in the maternal plasma. PMID:23572027
Defining objective clusters for rabies virus sequences using affinity propagation clustering

PubMed Central

Fischer, Susanne; Freuling, Conrad M.; Pfaff, Florian; Bodenhofer, Ulrich; Höper, Dirk; Fischer, Mareike; Marston, Denise A.; Fooks, Anthony R.; Mettenleiter, Thomas C.; Conraths, Franz J.; Homeier-Bachmann, Timo

2018-01-01

Rabies is caused by lyssaviruses, and is one of the oldest known zoonoses. In recent years, more than 21,000 nucleotide sequences of rabies viruses (RABV), from the prototype species rabies lyssavirus, have been deposited in public databases. Subsequent phylogenetic analyses in combination with metadata suggest geographic distributions of RABV. However, these analyses face technical difficulties in defining verifiable criteria for cluster allocation in phylogenetic trees, which invites a more rational approach. Therefore, we applied a relatively new mathematical clustering algorithm named 'affinity propagation clustering' (AP) to propose a standardized sub-species classification utilizing full-genome RABV sequences. Because AP is computationally fast and works for any meaningful measure of similarity between data samples, it has previously been applied successfully in bioinformatics for the analysis of microarray and gene expression data; however, cluster analysis of sequences is still in its infancy. Existing (516) and original (46) full-genome RABV sequences were used to demonstrate the application of AP for RABV clustering. On a global scale, AP proposed four clusters, i.e. New World, Arctic/Arctic-like, Cosmopolitan, and Asian, as previously assigned by phylogenetic studies. By combining AP with established phylogenetic analyses, it is possible to resolve phylogenetic relationships between verifiably determined clusters and sequences. This workflow will be useful in confirming cluster distributions in a uniform, transparent manner, not only for RABV, but also for other comparative sequence analyses. PMID:29357361
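Affinity propagation works on a matrix of pairwise similarities and lets data points exchange two kinds of messages ("responsibility" and "availability") until a set of exemplars emerges. The sketch below is a small, unoptimised pure-Python version of the textbook message-passing updates, not the implementation used in the study. The toy one-dimensional points stand in for sequences, and the negative squared distance stands in for a sequence similarity measure; both are assumptions made for illustration, as is setting the preference to the median similarity.

```python
from statistics import median

def affinity_propagation(similarity, preference, damping=0.5, iterations=200):
    """Cluster items from a square similarity matrix; returns (exemplars, labels)."""
    n = len(similarity)
    # Copy the similarities and put the shared preference on the diagonal.
    s = [[preference if i == k else similarity[i][k] for k in range(n)]
         for i in range(n)]
    r = [[0.0] * n for _ in range(n)]  # responsibilities r[i][k]
    a = [[0.0] * n for _ in range(n)]  # availabilities a[i][k]
    for _ in range(iterations):
        # Responsibility: how well suited k is as exemplar for i,
        # relative to the best competing candidate.
        for i in range(n):
            for k in range(n):
                rival = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - rival)
        # Availability: accumulated support for k from the other points.
        for i in range(n):
            for k in range(n):
                support = sum(max(0.0, r[j][k]) for j in range(n) if j not in (i, k))
                new = support if i == k else min(0.0, r[k][k] + support)
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if r[k][k] + a[k][k] > 0]
    if not exemplars:
        return [], []
    # Each non-exemplar joins the exemplar it is most similar to.
    labels = [i if i in exemplars else max(exemplars, key=lambda k: s[i][k])
              for i in range(n)]
    return exemplars, labels

# Toy data: two well-separated groups of points (illustration only).
points = [0, 1, 2, 20, 21, 22]
sim = [[-(p - q) ** 2 for q in points] for p in points]
off_diagonal = [sim[i][k] for i in range(6) for k in range(6) if i != k]
exemplars, labels = affinity_propagation(sim, preference=median(off_diagonal))
```

The number of clusters is not fixed in advance; it is governed by the preference value, with lower values tending to produce fewer clusters.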
How Does Context Play "a Part" in Splitting Words "Apart"? Production and Perception of Word Boundaries in Casual Speech

ERIC Educational Resources Information Center

Kim, Dahee; Stephens, Joseph D. W.; Pitt, Mark A.

2012-01-01

Four experiments examined listeners' segmentation of ambiguous schwa-initial sequences (e.g., "a long" vs. "along") in casual speech, where acoustic cues can be unclear, possibly increasing reliance on contextual information to resolve the ambiguity. In Experiment 1, acoustic analyses of talkers' productions showed that the one-word and two-word…
A phylogenetic analysis of Aquifex pyrophilus

NASA Technical Reports Server (NTRS)

Burggraf, S.; Olsen, G. J.; Stetter, K. O.; Woese, C. R.

1992-01-01

The 16S rRNA of the bacterium Aquifex pyrophilus, a microaerophilic, oxygen-reducing hyperthermophile, has been sequenced directly from the PCR amplified gene. Phylogenetic analyses show the Aq. pyrophilus lineage to be probably the deepest (earliest) in the (eu)bacterial tree. The addition of this deep branching to the bacterial tree further supports the argument that the Bacteria are of thermophilic ancestry.
Low-pass sequencing for microbial comparative genomics

PubMed Central

Goo, Young Ah; Roach, Jared; Glusman, Gustavo; Baliga, Nitin S; Deutsch, Kerry; Pan, Min; Kennedy, Sean; DasSarma, Shiladitya; Victor Ng, Wailap; Hood, Leroy

2004-01-01

Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. 
marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich genome of H. sp. NRC-1. Identification of multiple TBP and TFB homologs in these four halophiles is consistent with the hypothesis that different types of complex transcriptional regulation may occur through multiple TBP-TFB combinations in response to rapidly changing environmental conditions. Low-pass shotgun sequence analyses of genomes permit extensive and diverse analyses, and should be generally useful for comparative microbial genomics. PMID:14718067
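As an illustration of the GC-content comparison reported in this abstract, the following is a minimal sketch of how GC content might be estimated from a pool of single-pass reads. The function name `gc_content` and the example reads are hypothetical and are not taken from the study; ambiguous bases such as N are assumed to be excluded from the denominator.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / total


# Hypothetical reads standing in for low-pass shotgun data.
reads = ["GCGCGGCCAT", "GGCCNNGCGA", "ATGCGCGCCG"]

# Pool the reads so that longer reads carry proportionally more weight.
pooled = gc_content("".join(reads))
print(f"pooled GC content: {pooled:.1%}")
```

Pooling the reads before counting, rather than averaging per-read values, is one possible design choice; it keeps short reads from skewing the estimate.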
Acoustic sequences in non-human animals: a tutorial review and prospectus.

PubMed

Kershenbaum, Arik; Blumstein, Daniel T; Roch, Marie A; Akçay, Çağlar; Backus, Gregory; Bee, Mark A; Bohn, Kirsten; Cao, Yan; Carter, Gerald; Cäsar, Cristiane; Coen, Michael; DeRuiter, Stacy L; Doyle, Laurance; Edelman, Shimon; Ferrer-i-Cancho, Ramon; Freeberg, Todd M; Garland, Ellen C; Gustison, Morgan; Harley, Heidi E; Huetz, Chloé; Hughes, Melissa; Hyland Bruno, Julia; Ilany, Amiyaal; Jin, Dezhe Z; Johnson, Michael; Ju, Chenghui; Karnowski, Jeremy; Lohr, Bernard; Manser, Marta B; McCowan, Brenda; Mercado, Eduardo; Narins, Peter M; Piel, Alex; Rice, Megan; Salmi, Roberta; Sasahara, Kazutoshi; Sayigh, Laela; Shiu, Yu; Taylor, Charles; Vallejo, Edgar E; Waller, Sara; Zamora-Gutierrez, Veronica

2016-02-01

Animal acoustic communication often takes the form of complex sequences, made up of multiple distinct acoustic units. Apart from the well-known example of birdsong, other animals such as insects, amphibians, and mammals (including bats, rodents, primates, and cetaceans) also generate complex acoustic sequences. Occasionally, such as with birdsong, the adaptive role of these sequences seems clear (e.g. mate attraction and territorial defence). More often however, researchers have only begun to characterise - let alone understand - the significance and meaning of acoustic sequences. Hypotheses abound, but there is little agreement as to how sequences should be defined and analysed. Our review aims to outline suitable methods for testing these hypotheses, and to describe the major limitations to our current and near-future knowledge on questions of acoustic sequences. This review and prospectus is the result of a collaborative effort between 43 scientists from the fields of animal behaviour, ecology and evolution, signal processing, machine learning, quantitative linguistics, and information theory, who gathered for a 2013 workshop entitled, 'Analysing vocal sequences in animals'. Our goal is to present not just a review of the state of the art, but to propose a methodological framework that summarises what we suggest are the best practices for research in this field, across taxa and across disciplines. We also provide a tutorial-style introduction to some of the most promising algorithmic approaches for analysing sequences. We divide our review into three sections: identifying the distinct units of an acoustic sequence, describing the different ways that information can be contained within a sequence, and analysing the structure of that sequence. Each of these sections is further subdivided to address the key questions and approaches in that area. 
We propose a uniform, systematic, and comprehensive approach to studying sequences, with the goal of clarifying research terms used in different fields, and facilitating collaboration and comparative studies. Allowing greater interdisciplinary collaboration will facilitate the investigation of many important questions in the evolution of communication and sociality. © 2014 Cambridge Philosophical Society.
Acoustic sequences in non-human animals: a tutorial review and prospectus

PubMed Central

Kershenbaum, Arik; Blumstein, Daniel T.; Roch, Marie A.; Akçay, Çağlar; Backus, Gregory; Bee, Mark A.; Bohn, Kirsten; Cao, Yan; Carter, Gerald; Cäsar, Cristiane; Coen, Michael; DeRuiter, Stacy L.; Doyle, Laurance; Edelman, Shimon; Ferrer-i-Cancho, Ramon; Freeberg, Todd M.; Garland, Ellen C.; Gustison, Morgan; Harley, Heidi E.; Huetz, Chloé; Hughes, Melissa; Bruno, Julia Hyland; Ilany, Amiyaal; Jin, Dezhe Z.; Johnson, Michael; Ju, Chenghui; Karnowski, Jeremy; Lohr, Bernard; Manser, Marta B.; McCowan, Brenda; Mercado, Eduardo; Narins, Peter M.; Piel, Alex; Rice, Megan; Salmi, Roberta; Sasahara, Kazutoshi; Sayigh, Laela; Shiu, Yu; Taylor, Charles; Vallejo, Edgar E.; Waller, Sara; Zamora-Gutierrez, Veronica

2015-01-01

Animal acoustic communication often takes the form of complex sequences, made up of multiple distinct acoustic units. Apart from the well-known example of birdsong, other animals such as insects, amphibians, and mammals (including bats, rodents, primates, and cetaceans) also generate complex acoustic sequences. Occasionally, such as with birdsong, the adaptive role of these sequences seems clear (e.g. mate attraction and territorial defence). More often however, researchers have only begun to characterise – let alone understand – the significance and meaning of acoustic sequences. Hypotheses abound, but there is little agreement as to how sequences should be defined and analysed. Our review aims to outline suitable methods for testing these hypotheses, and to describe the major limitations to our current and near-future knowledge on questions of acoustic sequences. This review and prospectus is the result of a collaborative effort between 43 scientists from the fields of animal behaviour, ecology and evolution, signal processing, machine learning, quantitative linguistics, and information theory, who gathered for a 2013 workshop entitled, “Analysing vocal sequences in animals”. Our goal is to present not just a review of the state of the art, but to propose a methodological framework that summarises what we suggest are the best practices for research in this field, across taxa and across disciplines. We also provide a tutorial-style introduction to some of the most promising algorithmic approaches for analysing sequences. We divide our review into three sections: identifying the distinct units of an acoustic sequence, describing the different ways that information can be contained within a sequence, and analysing the structure of that sequence. Each of these sections is further subdivided to address the key questions and approaches in that area. 
We propose a uniform, systematic, and comprehensive approach to studying sequences, with the goal of clarifying research terms used in different fields, and facilitating collaboration and comparative studies. Allowing greater interdisciplinary collaboration will facilitate the investigation of many important questions in the evolution of communication and sociality. PMID:25428267
Positive Selection Driving Cytoplasmic Genome Evolution of the Medicinally Important Ginseng Plant Genus Panax

PubMed Central

Jiang, Peng; Shi, Feng-Xue; Li, Ming-Rui; Liu, Bao; Wen, Jun; Xiao, Hong-Xing; Li, Lin-Feng

2018-01-01

Panax L. (the ginseng genus) is a shade-demanding group within the family Araliaceae and all of its species are of crucial significance in traditional Chinese medicine. Phylogenetic and biogeographic analyses demonstrated that two rounds of whole genome duplications accompanied by geographic and ecological isolation promoted the diversification of Panax species. However, contributions of the cytoplasmic genomes to the adaptive evolution of Panax species remained largely uninvestigated. In this study, we sequenced the chloroplast and mitochondrial genomes of 11 accessions belonging to seven Panax species. Our results show that heterogeneity in nucleotide substitution rate is abundant in both of the two cytoplasmic genomes, with the mitochondrial genome possessing more variants at the total level but the chloroplast showing higher sequence polymorphisms at the genic regions. Genome-wide scanning of positive selection identified five and 12 genes from the chloroplast and mitochondrial genomes, respectively. Functional analyses further revealed that these selected genes play important roles in plant development, cellular metabolism and adaptation. We therefore conclude that positive selection might be one of the potential evolutionary forces that shaped the nucleotide variation pattern of these Panax species. In particular, the mitochondrial genes evolved under stronger selective pressure compared to the chloroplast genes. PMID:29670636
Positive Selection Driving Cytoplasmic Genome Evolution of the Medicinally Important Ginseng Plant Genus Panax.

PubMed

Jiang, Peng; Shi, Feng-Xue; Li, Ming-Rui; Liu, Bao; Wen, Jun; Xiao, Hong-Xing; Li, Lin-Feng

2018-01-01

Panax L. (the ginseng genus) is a shade-demanding group within the family Araliaceae and all of its species are of crucial significance in traditional Chinese medicine. Phylogenetic and biogeographic analyses demonstrated that two rounds of whole genome duplications accompanied by geographic and ecological isolation promoted the diversification of Panax species. However, contributions of the cytoplasmic genomes to the adaptive evolution of Panax species remained largely uninvestigated. In this study, we sequenced the chloroplast and mitochondrial genomes of 11 accessions belonging to seven Panax species. Our results show that heterogeneity in nucleotide substitution rate is abundant in both of the two cytoplasmic genomes, with the mitochondrial genome possessing more variants at the total level but the chloroplast showing higher sequence polymorphisms at the genic regions. Genome-wide scanning of positive selection identified five and 12 genes from the chloroplast and mitochondrial genomes, respectively. Functional analyses further revealed that these selected genes play important roles in plant development, cellular metabolism and adaptation. We therefore conclude that positive selection might be one of the potential evolutionary forces that shaped the nucleotide variation pattern of these Panax species. In particular, the mitochondrial genes evolved under stronger selective pressure compared to the chloroplast genes.
Microbial diversity and chemical analysis of the starters used in traditional Chinese sweet rice wine.

PubMed

Cai, Haiying; Zhang, Ting; Zhang, Qi; Luo, Jie; Cai, Chenggang; Mao, Jianwei

2018-08-01

Chinese sweet rice wine (CSRW) is a popular alcoholic drink in China. To investigate the effect of the microbial composition in CSRW starters on the final quality of the alcoholic drink, high-throughput sequencing on the fungal internal transcribed spacer II and bacterial 16S rRNA gene of the microflora in 8 starter samples was performed. Analysis of the sequencing data identified 10 genera of yeasts and molds and 11 genera of bacteria. Fungal diversity analyses showed significant variation in fungal composition among the starter samples. Starter microbiota were dominated by the Rhizopus genus in SZ5, LS6, NN8, QD9, DZ10 and DZ11, indicating its important role in starch hydrolysis during CSRW brewing. According to principal coordinate analyses, the bacterial composition was even less similar among the 8 starter samples. The chemical determination of CSRW fermented with the 8 starters demonstrated that the CSRW quality and flavor were drastically influenced by the taxonomic composition and metabolism of the microbes in the starters. This study suggests it is necessary to standardize rice wine manufacturing and flavor classification by specifying starter and fermentation techniques. Copyright © 2018 Elsevier Ltd. All rights reserved.
Cloning and characterization of cDNAs encoding human gastrin-releasing peptide.

PubMed Central

Spindel, E R; Chin, W W; Price, J; Rees, L H; Besser, G M; Habener, J F

1984-01-01

We have prepared and cloned cDNAs derived from poly(A)+ RNA from a human pulmonary carcinoid tumor rich in immunoreactivity to gastrin-releasing peptide, a peptide closely related in structure to amphibian bombesin. Mixtures of synthetic oligodeoxyribonucleotides corresponding to amphibian bombesin were used as hybridization probes to screen a cDNA library prepared from the tumor RNA. Sequencing of the recombinant plasmids shows that human gastrin-releasing peptide (hGRP) mRNA encodes a precursor of 148 amino acids containing a typical signal sequence, hGRP consisting of 27 or 28 amino acids, and a carboxyl-terminal extension peptide. hGRP is flanked at its carboxyl terminus by two basic amino acids, following a glycine used for amidation of the carboxyl-terminal methionine. RNA blot analyses of tumor RNA show a major mRNA of 900 bases and a minor mRNA of 850 bases. Blot hybridization analyses using human genomic DNA are consistent with a single hGRP-encoding gene. The presence of two mRNAs encoding the hGRP precursor protein in the face of a single hGRP gene raises the possibility of alternative processing of the single RNA transcript. PMID:6207529

Shotgun Pyrosequencing Metagenomic Analyses of Dusts from Swine Confinement and Grain Facilities

PubMed Central

Boissy, Robert J.; Romberger, Debra J.; Roughead, William A.; Weissenburger-Moser, Lisa; Poole, Jill A.; LeVan, Tricia D.

2014-01-01

Inhalation of agricultural dusts causes inflammatory reactions and symptoms such as headache, fever, and malaise, which can progress to chronic airway inflammation and associated diseases, e.g. asthma, chronic bronchitis, chronic obstructive pulmonary disease, and hypersensitivity pneumonitis. Although in many agricultural environments feed particles are the major constituent of these dusts, the inflammatory responses that they provoke are likely attributable to particle-associated bacteria, archaebacteria, fungi, and viruses. In this study, we performed shotgun pyrosequencing metagenomic analyses of DNA from dusts from swine confinement facilities or grain elevators, with comparisons to dusts from pet-free households. DNA sequence alignment showed that 19% or 62% of shotgun pyrosequencing metagenomic DNA sequence reads from swine facility or household dusts, respectively, were of swine or human origin, respectively. In contrast only 2% of such reads from grain elevator dust were of mammalian origin. These metagenomic shotgun reads of mammalian origin were excluded from our analyses of agricultural dust microbiota. The ten most prevalent bacterial taxa identified in swine facility compared to grain elevator or household dust were comprised of 75%, 16%, and 42% gram-positive organisms, respectively. Four of the top five swine facility dust genera were assignable (Clostridium, Lactobacillus, Ruminococcus, and Eubacterium, ranging from 4% to 19% relative abundance). The relative abundances of these four genera were lower in dust from grain elevators or pet-free households. These analyses also highlighted the predominance in swine facility dust of Firmicutes (70%) at the phylum level, Clostridia (44%) at the Class level, and Clostridiales at the Order level (41%). 
In summary, shotgun pyrosequencing metagenomic analyses of agricultural dusts show that they differ qualitatively and quantitatively at the level of microbial taxa present, and that the bioinformatic analyses used for such studies must be carefully designed to avoid the potential contribution of non-microbial DNA, e.g. from resident mammals. PMID:24748147
Shotgun pyrosequencing metagenomic analyses of dusts from swine confinement and grain facilities.

PubMed

Boissy, Robert J; Romberger, Debra J; Roughead, William A; Weissenburger-Moser, Lisa; Poole, Jill A; LeVan, Tricia D

2014-01-01

Inhalation of agricultural dusts causes inflammatory reactions and symptoms such as headache, fever, and malaise, which can progress to chronic airway inflammation and associated diseases, e.g. asthma, chronic bronchitis, chronic obstructive pulmonary disease, and hypersensitivity pneumonitis. Although in many agricultural environments feed particles are the major constituent of these dusts, the inflammatory responses that they provoke are likely attributable to particle-associated bacteria, archaebacteria, fungi, and viruses. In this study, we performed shotgun pyrosequencing metagenomic analyses of DNA from dusts from swine confinement facilities or grain elevators, with comparisons to dusts from pet-free households. DNA sequence alignment showed that 19% or 62% of shotgun pyrosequencing metagenomic DNA sequence reads from swine facility or household dusts, respectively, were of swine or human origin, respectively. In contrast only 2% of such reads from grain elevator dust were of mammalian origin. These metagenomic shotgun reads of mammalian origin were excluded from our analyses of agricultural dust microbiota. The ten most prevalent bacterial taxa identified in swine facility compared to grain elevator or household dust were comprised of 75%, 16%, and 42% gram-positive organisms, respectively. Four of the top five swine facility dust genera were assignable (Clostridium, Lactobacillus, Ruminococcus, and Eubacterium, ranging from 4% to 19% relative abundance). The relative abundances of these four genera were lower in dust from grain elevators or pet-free households. These analyses also highlighted the predominance in swine facility dust of Firmicutes (70%) at the phylum level, Clostridia (44%) at the Class level, and Clostridiales at the Order level (41%). 
In summary, shotgun pyrosequencing metagenomic analyses of agricultural dusts show that they differ qualitatively and quantitatively at the level of microbial taxa present, and that the bioinformatic analyses used for such studies must be carefully designed to avoid the potential contribution of non-microbial DNA, e.g. from resident mammals.
Molecular phylogeny of 21 tropical bamboo species reconstructed by integrating non-coding internal transcribed spacer (ITS1 and 2) sequences and their consensus secondary structure.

PubMed

Ghosh, Jayadri Sekhar; Bhattacharya, Samik; Pal, Amita

2017-06-01

The unavailability of reproductive structures and the unpredictability of vegetative characters for the identification and phylogenetic study of bamboo prompted the application of molecular techniques for greater resolution and consensus. We first employed internal transcribed spacer (ITS1, 5.8S rRNA and ITS2) sequences to construct the phylogenetic tree of 21 tropical bamboo species. While the sequence alone could grossly reconstruct the traditional phylogeny among the 21 tropical species studied, some anomalies were encountered that prompted a further refinement of the phylogenetic analyses. Therefore, we integrated the secondary structure of the ITS sequences to derive an individual sequence-structure matrix to gain more resolution on the phylogenetic reconstruction. The results showed that ITS sequence-structure is a reliable alternative to the conventional phenotypic method for the identification of bamboo species. The best-fit topology obtained by the sequence-structure based phylogeny over the sole sequence based one underscores closer clustering of all the studied Bambusa species (Sub-tribe Bambusinae), while Melocanna baccifera, which belongs to Sub-Tribe Melocanneae, disjointedly clustered as an out-group within the consensus phylogenetic tree. In this study, we demonstrated the dependability of the combined (ITS sequence+structure-based) approach over sequence-only analysis for phylogenetic relationship assessment of bamboo.
The Evolution of Ebola virus: Insights from the 2013–2016 Epidemic

PubMed Central

Holmes, Edward C.; Dudas, Gytis; Rambaut, Andrew; Andersen, Kristian G.

2017-01-01

Preface The 2013–2016 epidemic of Ebola virus disease in West Africa was of unprecedented magnitude and changed our perspective on this lethal but sporadically emerging virus. This outbreak also marked the beginning of large-scale real-time molecular epidemiology. Herein, we show how evolutionary analyses of Ebola virus genome sequences provided key insights into virus origins, evolution, and spread during the epidemic. We provide basic scientists, epidemiologists, medical practitioners, and other outbreak responders with an enhanced understanding of the utility and limitations of pathogen genomic sequencing. This will be crucially important in our attempts to track and control future infectious disease outbreaks. PMID:27734858
CPm gene diversity in field isolates of Citrus tristeza virus from Colombia.

PubMed

Oliveros-Garay, Oscar Arturo; Martinez-Salazar, Natalhie; Torres-Ruiz, Yanneth; Acosta, Orlando

2009-01-01

The nucleotide sequence diversity of the CPm gene from 28 field isolates of Citrus tristeza virus (CTV) was assessed by SSCP and sequence analyses. These isolates showed two major shared haplotypes, which differed in distribution: A1 was the major haplotype in 23 isolates from different geographic regions, whereas R1 was found in isolates from a discrete region. Phylogenetic reconstruction clustered A1 within an independent group, while R1 was grouped with mild isolates T30 from Florida and T385 from Spain. Some isolates contained several minor haplotypes, which were very similar to, and associated with, the major haplotype.
Molecular detection and sequence characterization of diverse rhabdoviruses in bats, China.

PubMed

Xu, Lin; Wu, Jianmin; Jiang, Tinglei; Qin, Shaomin; Xia, Lele; Li, Xingyu; He, Biao; Tu, Changchun

2018-01-15

The Rhabdoviridae is among the most diverse families of RNA viruses and is currently classified into 18 genera, with some rhabdoviruses lethal to humans and other animals. Herein, we describe the genetic characterization of three novel rhabdoviruses from bats in China. Of these, two viruses (Jinghong bat virus and Benxi bat virus) found in Rhinolophus bats showed a phylogenetic relationship with vesiculoviruses, and sequence analyses indicate that they represent two new species within the genus Vesiculovirus. The remaining virus, Yangjiang bat virus, found in Hipposideros larvatus bats, was only distantly related to currently known rhabdoviruses. Copyright © 2017 Elsevier B.V. All rights reserved.
Maintenance of an Intact Human Immunodeficiency Virus Type 1 vpr Gene following Mother-to-Infant Transmission

PubMed Central

Yedavalli, Venkat R. K.; Chappey, Colombe; Ahmad, Nafees

1998-01-01

The vpr sequences from six human immunodeficiency virus type 1 (HIV-1)-infected mother-infant pairs following perinatal transmission were analyzed. We found that 153 of the 166 clones analyzed from uncultured peripheral blood mononuclear cell DNA samples (92.17%) contained intact vpr open reading frames. There was a low degree of heterogeneity of vpr genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vpr sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Moreover, the infants’ sequences displayed patterns similar to those seen in their mothers. The functional domains essential for Vpr activity, including virion incorporation, nuclear import, and cell cycle arrest and differentiation, were highly conserved in most of the sequences. Phylogenetic analyses of the 166 mother-infant pair sequences and 195 other available vpr sequences from HIV databases formed distinct clusters for each mother-infant pair and for other vpr sequences and grouped the six mother-infant pairs’ sequences with subtype B sequences. A high degree of conservation of intact and functional vpr supports the notion that vpr plays an important role in HIV-1 infection and replication in mother-infant isolates that are involved in perinatal transmission. PMID:9658150
Development of self-compressing BLSOM for comprehensive analysis of big sequence data.

PubMed

Kikuchi, Akihito; Ikemura, Toshimichi; Abe, Takashi

2015-01-01

With the remarkable increase in genomic sequence data from various organisms, novel tools are needed for comprehensive analyses of available big sequence data. We previously developed a Batch-Learning Self-Organizing Map (BLSOM), which can cluster genomic fragment sequences according to phylotype based solely on oligonucleotide composition, and applied it to genome and metagenomic studies. BLSOM is suitable for high-performance parallel computing and can analyze big data simultaneously, but a large-scale BLSOM needs a large computational resource. We have developed Self-Compressing BLSOM (SC-BLSOM) for reduction of computation time, which allows us to carry out comprehensive analysis of big sequence data without the use of high-performance supercomputers. The strategy of SC-BLSOM is to hierarchically construct BLSOMs according to data class, such as phylotype. The first-layer BLSOM was constructed with each of the divided input data pieces that represents the data subclass, such as phylotype division, resulting in compression of the number of data pieces. The second BLSOM was constructed with all of the weight vectors obtained from the first-layer BLSOMs. We compared SC-BLSOM with the conventional BLSOM by analyzing bacterial genome sequences. SC-BLSOM could be constructed faster than BLSOM and cluster the sequences according to phylotype with high accuracy, showing the method's suitability for efficient knowledge discovery from big sequence data.
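The abstract states that BLSOM clusters genomic fragments solely on oligonucleotide composition. The sketch below shows one way such a composition vector could be computed for a single fragment. It covers only the input features, not the self-organizing map itself, and is not the authors' implementation. The function name `kmer_composition` and the default word length `k = 4` are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq: str, k: int = 4) -> list[float]:
    """Return relative frequencies of all 4**k k-mers, in lexicographic order.

    Windows containing characters other than A/C/G/T are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("no valid k-mers found in sequence")
    return [counts[m] / total for m in kmers]


# Hypothetical fragment; a real input would be a genomic fragment of several kb.
vector = kmer_composition("ACGTACGTTGCAACGT", k=4)
print(len(vector))  # 256 dimensions for tetranucleotides
```

Each fragment becomes a fixed-length vector regardless of its sequence length, which is what allows fragments from different genomes to be compared on the same map.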
The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.).

PubMed

Raju, Nikku L; Gnanesh, Belaghihalli N; Lekha, Pazhamala; Jayashree, Balaji; Pande, Suresh; Hiremath, Pavana J; Byregowda, Munishamappa; Singh, Nagendra K; Varshney, Rajeev K

2010-03-11

Pigeonpea (Cajanus cajan (L.) Millsp) is one of the major grain legume crops of the tropics and subtropics, but biotic stresses [Fusarium wilt (FW), sterility mosaic disease (SMD), etc.] are serious challenges for sustainable crop production. Modern genomic tools such as molecular markers and candidate genes associated with resistance to these stresses offer the possibility of facilitating pigeonpea breeding for improving biotic stress resistance. Availability of limited genomic resources, however, is a serious bottleneck to undertake molecular breeding in pigeonpea to develop superior genotypes with enhanced resistance to above mentioned biotic stresses. With an objective of enhancing genomic resources in pigeonpea, this study reports generation and analysis of comprehensive resource of FW- and SMD- responsive expressed sequence tags (ESTs). A total of 16 cDNA libraries were constructed from four pigeonpea genotypes that are resistant and susceptible to FW ('ICPL 20102' and 'ICP 2376') and SMD ('ICP 7035' and 'TTB 7') and a total of 9,888 (9,468 high quality) ESTs were generated and deposited in dbEST of GenBank under accession numbers GR463974 to GR473857 and GR958228 to GR958231. Clustering and assembly analyses of these ESTs resulted in 4,557 unique sequences (unigenes) including 697 contigs and 3,860 singletons. BLASTN analysis of 4,557 unigenes showed a significant identity with ESTs of different legumes (23.2-60.3%), rice (28.3%), Arabidopsis (33.7%) and poplar (35.4%). As expected, pigeonpea ESTs are more closely related to soybean (60.3%) and cowpea ESTs (43.6%) than other plant ESTs. Similarly, BLASTX similarity results showed that only 1,603 (35.1%) out of 4,557 total unigenes correspond to known proteins in the UniProt database (≤ 1E-08). Functional categorization of the annotated unigenes sequences showed that 153 (3.3%) genes were assigned to cellular component category, 132 (2.8%) to biological process, and 132 (2.8%) in molecular function. Further, 19 genes were identified differentially expressed between FW- responsive genotypes and 20 between SMD- responsive genotypes. Generated ESTs were compiled together with 908 ESTs available in public domain, at the time of analysis, and a set of 5,085 unigenes were defined that were used for identification of molecular markers in pigeonpea. For instance, 3,583 simple sequence repeat (SSR) motifs were identified in 1,365 unigenes and 383 primer pairs were designed. Assessment of a set of 84 primer pairs on 40 elite pigeonpea lines showed polymorphism with 15 (28.8%) markers with an average of four alleles per marker and an average polymorphic information content (PIC) value of 0.40. Similarly, in silico mining of 133 contigs with ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs. As an example, a set of 10 contigs were used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. 
The occurrence of SNPs was confirmed for all 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained for 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs, which indicates the possibility of assaying SNPs in 37 genes using cleaved amplified polymorphic sequences (CAPS) assay. The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have shown conservation of a considerable number of pigeonpea transcripts across legume and model plant species analysed as well as some putative pigeonpea specific genes. Validation of identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding.
The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.)

PubMed Central

2010-01-01

Background Pigeonpea (Cajanus cajan (L.) Millsp) is one of the major grain legume crops of the tropics and subtropics, but biotic stresses [Fusarium wilt (FW), sterility mosaic disease (SMD), etc.] are serious challenges for sustainable crop production. Modern genomic tools such as molecular markers and candidate genes associated with resistance to these stresses offer the possibility of facilitating pigeonpea breeding for improving biotic stress resistance. Availability of limited genomic resources, however, is a serious bottleneck to undertake molecular breeding in pigeonpea to develop superior genotypes with enhanced resistance to above mentioned biotic stresses. With an objective of enhancing genomic resources in pigeonpea, this study reports generation and analysis of comprehensive resource of FW- and SMD- responsive expressed sequence tags (ESTs). Results A total of 16 cDNA libraries were constructed from four pigeonpea genotypes that are resistant and susceptible to FW ('ICPL 20102' and 'ICP 2376') and SMD ('ICP 7035' and 'TTB 7') and a total of 9,888 (9,468 high quality) ESTs were generated and deposited in dbEST of GenBank under accession numbers GR463974 to GR473857 and GR958228 to GR958231. Clustering and assembly analyses of these ESTs resulted into 4,557 unique sequences (unigenes) including 697 contigs and 3,860 singletons. BLASTN analysis of 4,557 unigenes showed a significant identity with ESTs of different legumes (23.2-60.3%), rice (28.3%), Arabidopsis (33.7%) and poplar (35.4%). As expected, pigeonpea ESTs are more closely related to soybean (60.3%) and cowpea ESTs (43.6%) than other plant ESTs. Similarly, BLASTX similarity results showed that only 1,603 (35.1%) out of 4,557 total unigenes correspond to known proteins in the UniProt database (≤ 1E-08). 
Functional categorization of the annotated unigenes sequences showed that 153 (3.3%) genes were assigned to cellular component category, 132 (2.8%) to biological process, and 132 (2.8%) in molecular function. Further, 19 genes were identified differentially expressed between FW- responsive genotypes and 20 between SMD- responsive genotypes. Generated ESTs were compiled together with 908 ESTs available in public domain, at the time of analysis, and a set of 5,085 unigenes were defined that were used for identification of molecular markers in pigeonpea. For instance, 3,583 simple sequence repeat (SSR) motifs were identified in 1,365 unigenes and 383 primer pairs were designed. Assessment of a set of 84 primer pairs on 40 elite pigeonpea lines showed polymorphism with 15 (28.8%) markers with an average of four alleles per marker and an average polymorphic information content (PIC) value of 0.40. Similarly, in silico mining of 133 contigs with ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs. As an example, a set of 10 contigs were used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. Occurrence of SNPs were confirmed for all the 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained in case of 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs that indicates possibility of assaying SNPs in 37 genes using cleaved amplified polymorphic sequences (CAPS) assay. Conclusion The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have showed conservation of a considerable number of pigeonpea transcripts across legume and model plant species analysed as well as some putative pigeonpea specific genes. 
Validation of identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding. PMID:20222972
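The abstract above reports an average polymorphic information content (PIC) of 0.40 without stating which formula was used. As a minimal sketch, assuming the commonly used Botstein-style definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, a per-marker PIC could be computed from allele frequencies as follows. The function name `pic` and the example frequencies are illustrative and do not come from the study.

```python
def pic(allele_freqs):
    """Polymorphic information content for one marker.

    allele_freqs: allele frequencies that sum to 1.
    Assumes the Botstein-style formula; the study may have used another variant.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term over all unordered allele pairs.
    cross = 0.0
    n = len(allele_freqs)
    for i in range(n - 1):
        for j in range(i + 1, n):
            cross += 2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
    return 1.0 - homozygosity - cross


# Hypothetical marker with four equally frequent alleles.
example = pic([0.25, 0.25, 0.25, 0.25])
```

Under this definition a monomorphic marker has a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed.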
Cloning of Giardia lamblia heat shock protein HSP70 homologs: implications regarding origin of eukaryotic cells and of endoplasmic reticulum.

PubMed Central

Gupta, R S; Aitken, K; Falah, M; Singh, B

1994-01-01

The genes for two different 70-kDa heat shock protein (HSP70) homologs have been cloned and sequenced from the protozoan Giardia lamblia. On the basis of their sequence features, one of these genes corresponds to the cytoplasmic form of HSP70. The second gene, on the basis of its characteristic N-terminal hydrophobic signal sequence and C-terminal endoplasmic reticulum (ER) retention sequence (Lys-Asp-Glu-Leu), is the equivalent of ER-resident GRP78 or the Bip family of proteins. Phylogenetic trees based on HSP70 sequences show that G. lamblia homologs show the deepest divergence among eukaryotic species. The identification of a GRP78 or Bip homolog in G. lamblia strongly suggests the existence of ER in this ancient eukaryote. Detailed phylogenetic analyses of HSP70 sequences by boot-strap neighbor-joining and maximum-parsimony methods show that the cytoplasmic and ER homologs form distinct subfamilies that evolved from a common eukaryotic ancestor by gene duplication that occurred very early in the evolution of eukaryotic cells. It is postulated that because of the essential "molecular chaperone" function of these proteins in translocation of other proteins across membranes, duplication of their genes accompanied the evolution of ER or nucleus in the eukaryotic cell ancestor. The presence in all eukaryotic cytoplasmic HSP70 homologs (including the cognate, heat-induced, and ER forms) of a number of autapomorphic sequence signatures that are not present in any prokaryotic or organellar homologs provides strong evidence regarding the monophyletic nature of eukaryotic lineage. Further, all eukaryotic HSP70 homologs share in common with the Gram-negative group of eubacteria a number of sequence features that are not present in any archaebacterium or Gram-positive bacterium, indicating their evolution from this group of organisms. Some implications of these findings regarding the evolution of eukaryotic cells and ER are discussed. PMID:8159675
Short branches lead to systematic artifacts when BLAST searches are used as surrogate for phylogenetic reconstruction.

PubMed

Dick, Amanda A; Harlow, Timothy J; Gogarten, J Peter

2017-02-01

Long Branch Attraction (LBA) is a well-known artifact in phylogenetic reconstruction when dealing with branch length heterogeneity. Here we show another phenomenon, Short Branch Attraction (SBA), which occurs when BLAST searches, a phenetic analysis, are used as a surrogate method for phylogenetic analysis. This error also results from branch length heterogeneity, but this time it is the short branches that are attracting. The SBA artifact is reciprocal and can be returned 100% of the time when multiple branches differ in length by a factor of more than two. SBA is an intended feature of BLAST searches, but becomes an issue when top-scoring BLAST hit analyses are used to infer Horizontal Gene Transfers (HGTs), assign taxonomic category with environmental sequence data in phylotyping, or gather homologous sequences for building gene families. SBA can lead researchers to believe that there has been an HGT event when only vertical descent has occurred, cause slowly evolving taxa to be over-represented and quickly evolving taxa to be under-represented in phylotyping, or systematically exclude quickly evolving taxa from analyses. SBA also contributes to the changing results of top-scoring BLAST hit analyses as the database grows, because more slowly evolving taxa, or short branches, are added over time, introducing more potential for SBA. SBA can be detected by examining reciprocal best BLAST hits among a larger group of taxa, including the known closest phylogenetic neighbors. Therefore, one should look for this phenomenon when conducting best BLAST hit analyses as a surrogate method to identify HGTs, in phylotyping, or when using BLAST to gather homologous sequences. Copyright © 2016 Elsevier Inc. All rights reserved.
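The reciprocal best BLAST hit check mentioned in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: hits are assumed to be already parsed into (query, subject, bit score) tuples, the best hit is taken to be the one with the highest bit score, and the function names and example identifiers are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples (assumed input format).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hit tables for two genomes A and B.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b1", 180.0)]
b_vs_a = [("b1", "a1", 210.0), ("b2", "a2", 90.0)]
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

Because the abstract describes the SBA artifact as reciprocal, a pairwise check like this one is only a starting point; the abstract recommends comparing among a larger group of taxa that includes the known closest phylogenetic neighbors.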
Re-evaluation of the taxonomy of the Mitis group of the genus Streptococcus based on whole genome phylogenetic analyses, and proposed reclassification of Streptococcus dentisani as Streptococcus oralis subsp. dentisani comb. nov., Streptococcus tigurinus as Streptococcus oralis subsp. tigurinus comb. nov., and Streptococcus oligofermentans as a later synonym of Streptococcus cristatus.

PubMed

Jensen, Anders; Scholz, Christian F P; Kilian, Mogens

2016-11-01

The Mitis group of the genus Streptococcus currently comprises 20 species with validly published names, including the pathogen S. pneumoniae. They have been the subject of much taxonomic confusion, due to phenotypic overlap and genetic heterogeneity, which has hampered a full appreciation of their clinical significance. The purpose of this study was to critically re-examine the taxonomy of the Mitis group using 195 publicly available genomes, including designated type strains for phylogenetic analyses based on core genomes, multilocus sequences and 16S rRNA gene sequences, combined with estimates of average nucleotide identity (ANI) and in silico and in vitro analyses of specific phenotypic characteristics. Our core genomic phylogenetic analyses revealed distinct clades that, to some extent and judging from the clustering of type strains, represent known species. However, many of the genomes have been incorrectly identified, adding to the current confusion. Furthermore, our data show that 16S rRNA gene sequences and ANI are unsuitable for identifying and circumscribing new species of the Mitis group of the genus Streptococcus. Based on the clustering patterns resulting from core genome phylogenetic analysis, we conclude that S. oligofermentans is a later synonym of S. cristatus. The recently described strains of the species Streptococcus dentisani include one previously referred to as 'S. mitis biovar 2'. Together with S. oralis, S. dentisani and S. tigurinus form subclusters within a coherent phylogenetic clade. We propose that the species S. oralis consists of three subspecies: S. oralis subsp. oralis subsp. nov., S. oralis subsp. tigurinus comb. nov., and S. oralis subsp. dentisani comb. nov.
An Ancient Transkingdom Horizontal Transfer of Penelope-Like Retroelements from Arthropods to Conifers

PubMed Central

Lin, Xuan; Faridi, Nurul; Casola, Claudio

2016-01-01

Comparative genomics analyses empowered by the wealth of sequenced genomes have revealed numerous instances of horizontal DNA transfers between distantly related species. In eukaryotes, repetitive DNA sequences known as transposable elements (TEs) are especially prone to move across species boundaries. Such horizontal transposon transfers, or HTTs, are relatively common within major eukaryotic kingdoms, including animals, plants, and fungi, while rarely occurring across these kingdoms. Here, we describe the first case of HTT from animals to plants, involving TEs known as Penelope-like elements, or PLEs, a group of retrotransposons closely related to eukaryotic telomerases. Using a combination of in situ hybridization on chromosomes, polymerase chain reaction experiments, and computational analyses we show that the predominant PLE lineage, EN(+)PLEs, is highly diversified in loblolly pine and other conifers, but appears to be absent in other gymnosperms. Phylogenetic analyses of both protein and DNA sequences reveal that conifers EN(+)PLEs, or Dryads, form a monophyletic group clustering within a clade of primarily arthropod elements. Additionally, no EN(+)PLEs were detected in 1,928 genome assemblies from 1,029 nonmetazoan and nonconifer genomes from 14 major eukaryotic lineages. These findings indicate that Dryads emerged following an ancient horizontal transfer of EN(+)PLEs from arthropods to a common ancestor of conifers approximately 340 Ma. This represents one of the oldest known interspecific transmissions of TEs, and the most conspicuous case of DNA transfer between animals and plants. PMID:27190138
Annotated ESTs from various tissues of the brown planthopper Nilaparvata lugens: a genomic resource for studying agricultural pests.

PubMed

Noda, Hiroaki; Kawai, Sawako; Koizumi, Yoko; Matsui, Kageaki; Zhang, Qiang; Furukawa, Shigetoyo; Shimomura, Michihiko; Mita, Kazuei

2008-03-03

The brown planthopper (BPH), Nilaparvata lugens (Hemiptera, Delphacidae), is a serious insect pest of rice plants. Major means of BPH control are application of agricultural chemicals and cultivation of BPH resistant rice varieties. Nevertheless, BPH strains that are resistant to agricultural chemicals have developed, and BPH strains have appeared that are virulent against the resistant rice varieties. Expressed sequence tag (EST) analysis and related applications are useful to elucidate the mechanisms of resistance and virulence and to reveal physiological aspects of this non-model insect, with its poorly understood genetic background. More than 37,000 high-quality ESTs, excluding sequences of mitochondrial genome, microbial genomes, and rDNA, have been produced from 18 libraries of various BPH tissues and stages. About 10,200 clusters have been made from whole EST sequences, with average EST size of 627 bp. Among the top ten most abundantly expressed genes, three are unique and show no homology in BLAST searches. The actin gene was highly expressed in BPH, especially in the thorax. Tissue-specifically expressed genes were extracted based on the expression frequency among the libraries. An EST database is available at our web site. The EST library will provide useful information for transcriptional analyses, proteomic analyses, and gene functional analyses of BPH. Moreover, specific genes for hemimetabolous insects will be identified. The microarray fabricated based on the EST information will be useful for finding genes related to agricultural and biological problems related to this pest.
Development of a real-time PCR for detection of Staphylococcus pseudintermedius using a novel automated comparison of whole-genome sequences.

PubMed

Verstappen, Koen M; Huijbregts, Loes; Spaninks, Mirlin; Wagenaar, Jaap A; Fluit, Ad C; Duim, Birgitta

2017-01-01

Staphylococcus pseudintermedius is an opportunistic pathogen in dogs and cats and occasionally causes infections in humans. S. pseudintermedius is often resistant to multiple classes of antimicrobials. It requires a reliable detection so that it is not misidentified as S. aureus. Phenotypic and currently-used molecular-based diagnostic assays lack specificity or are labour-intensive using multiplex PCR or nucleic acid sequencing. The aim of this study was to identify a specific target for real-time PCR by comparing whole genome sequences of S. pseudintermedius and non-pseudintermedius. Genome sequences were downloaded from public repositories and supplemented by isolates that were sequenced in this study. A Perl script was written that analysed 300-nt fragments from a reference genome sequence of S. pseudintermedius and checked if this sequence was present in other S. pseudintermedius genomes (n = 74) and non-pseudintermedius genomes (n = 138). Six sequences specific for S. pseudintermedius were identified (sequence length between 300-500 nt). One sequence, which was located in the spsJ gene, was used to develop primers and a probe. The real-time PCR showed 100% specificity when testing for S. pseudintermedius isolates (n = 54), and eight other staphylococcal species (n = 43). In conclusion, a novel approach by comparing whole genome sequences identified a sequence that is specific for S. pseudintermedius and provided a real-time PCR target for rapid and reliable detection of S. pseudintermedius.
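The fragment-screening idea described in the abstract above (cut a reference genome into fixed-size fragments, keep those present in every target genome and absent from every non-target genome) can be sketched as follows. The original tool was a Perl script whose matching rules are not given in the abstract, so this Python version is only an approximation: exact substring matching, non-overlapping windows, and no reverse-complement search are all assumptions, and the names and toy sequences are hypothetical.

```python
def specific_fragments(reference, targets, non_targets, size=300, step=300):
    """Return (start, fragment) pairs from `reference` that occur in all
    target genomes and in none of the non-target genomes.

    Assumptions: exact substring matching, forward strand only.
    """
    found = []
    for start in range(0, len(reference) - size + 1, step):
        fragment = reference[start:start + size]
        in_all_targets = all(fragment in genome for genome in targets)
        in_any_non_target = any(fragment in genome for genome in non_targets)
        if in_all_targets and not in_any_non_target:
            found.append((start, fragment))
    return found


# Toy example with 4-nt fragments instead of 300-nt fragments.
reference = "AAAACCCCGGGGTTTT"
targets = ["TTAAAACCCCAA", "CCCCAAAAG"]
non_targets = ["GGAAAAGG"]
candidates = specific_fragments(reference, targets, non_targets, size=4, step=4)
```

A practical screen would tolerate mismatches and search both strands, for example by running a sequence aligner, which this sketch deliberately omits.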
A High-Density Genetic Map with Array-Based Markers Facilitates Structural and Quantitative Trait Locus Analyses of the Common Wheat Genome

PubMed Central

Iehisa, Julio Cesar Masaru; Ohno, Ryoko; Kimura, Tatsuro; Enoki, Hiroyuki; Nishimura, Satoru; Okamoto, Yuki; Nasuda, Shuhei; Takumi, Shigeo

2014-01-01

The large genome and allohexaploidy of common wheat have complicated construction of a high-density genetic map. Although improvements in the throughput of next-generation sequencing (NGS) technologies have made it possible to obtain a large amount of genotyping data for an entire mapping population by direct sequencing, including hexaploid wheat, a significant number of missing data points are often apparent due to the low coverage of sequencing. In the present study, a microarray-based polymorphism detection system was developed using NGS data obtained from complexity-reduced genomic DNA of two common wheat cultivars, Chinese Spring (CS) and Mironovskaya 808. After design and selection of polymorphic probes, 13,056 new markers were added to the linkage map of a recombinant inbred mapping population between CS and Mironovskaya 808. On average, 2.49 missing data points per marker were observed in the 201 recombinant inbred lines, with a maximum of 42. Around 40% of the new markers were derived from genic regions and 11% from repetitive regions. The low number of retroelements indicated that the new polymorphic markers were mainly derived from the less repetitive region of the wheat genome. Around 25% of the mapped sequences were useful for alignment with the physical map of barley. Quantitative trait locus (QTL) analyses of 14 agronomically important traits related to flowering, spikes, and seeds demonstrated that the new high-density map showed improved QTL detection, resolution, and accuracy over the original simple sequence repeat map. PMID:24972598
A New Omics Data Resource of Pleurocybella porrigens for Gene Discovery

PubMed Central

Dohra, Hideo; Someya, Takumi; Takano, Tomoyuki; Harada, Kiyonori; Omae, Saori; Hirai, Hirofumi; Yano, Kentaro; Kawagishi, Hirokazu

2013-01-01

Background Pleurocybella porrigens is a mushroom-forming fungus, which has been consumed as a traditional food in Japan. In 2004, 55 people were poisoned by eating the mushroom and 17 people among them died of acute encephalopathy. Since then, the Japanese government has been alerting Japanese people to take precautions against eating the P. porrigens mushroom. Unfortunately, despite efforts, the molecular mechanism of the encephalopathy remains elusive. The genome and transcriptome sequence data of P. porrigens and the related species, however, are not stored in the public database. To gain the omics data in P. porrigens, we sequenced genome and transcriptome of its fruiting bodies and mycelia by next generation sequencing. Methodology/Principal Findings Short read sequences of genomic DNAs and mRNAs in P. porrigens were generated by Illumina Genome Analyzer. Genome short reads were de novo assembled into scaffolds using Velvet. Comparisons of genome signatures among Agaricales showed that P. porrigens has a unique genome signature. Transcriptome sequences were assembled into contigs (unigenes). Biological functions of unigenes were predicted by Gene Ontology and KEGG pathway analyses. The majority of unigenes would be novel genes without significant counterparts in the public omics databases. Conclusions Functional analyses of unigenes present the existence of numerous novel genes in the basidiomycetes division. The results mean that the omics information such as genome, transcriptome and metabolome in basidiomycetes is short in the current databases. The large-scale omics information on P. porrigens, provided from this research, will give a new data resource for gene discovery in basidiomycetes. PMID:23936076
A high-density genetic map with array-based markers facilitates structural and quantitative trait locus analyses of the common wheat genome.

PubMed

Iehisa, Julio Cesar Masaru; Ohno, Ryoko; Kimura, Tatsuro; Enoki, Hiroyuki; Nishimura, Satoru; Okamoto, Yuki; Nasuda, Shuhei; Takumi, Shigeo

2014-10-01

The large genome and allohexaploidy of common wheat have complicated construction of a high-density genetic map. Although improvements in the throughput of next-generation sequencing (NGS) technologies have made it possible to obtain a large amount of genotyping data for an entire mapping population by direct sequencing, including hexaploid wheat, a significant number of missing data points are often apparent due to the low coverage of sequencing. In the present study, a microarray-based polymorphism detection system was developed using NGS data obtained from complexity-reduced genomic DNA of two common wheat cultivars, Chinese Spring (CS) and Mironovskaya 808. After design and selection of polymorphic probes, 13,056 new markers were added to the linkage map of a recombinant inbred mapping population between CS and Mironovskaya 808. On average, 2.49 missing data points per marker were observed in the 201 recombinant inbred lines, with a maximum of 42. Around 40% of the new markers were derived from genic regions and 11% from repetitive regions. The low number of retroelements indicated that the new polymorphic markers were mainly derived from the less repetitive region of the wheat genome. Around 25% of the mapped sequences were useful for alignment with the physical map of barley. Quantitative trait locus (QTL) analyses of 14 agronomically important traits related to flowering, spikes, and seeds demonstrated that the new high-density map showed improved QTL detection, resolution, and accuracy over the original simple sequence repeat map. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Molecular epidemiologic analysis of a Pneumocystis pneumonia outbreak among renal transplant patients.

PubMed

Urabe, N; Ishii, Y; Hyodo, Y; Aoki, K; Yoshizawa, S; Saga, T; Murayama, S Y; Sakai, K; Homma, S; Tateda, K

2016-04-01

Between 18 November and 3 December 2011, five renal transplant patients at the Department of Nephrology, Toho University Omori Medical Centre, Tokyo, were diagnosed with Pneumocystis pneumonia (PCP). We used molecular epidemiologic methods to determine whether the patients were infected with the same strain of Pneumocystis jirovecii. DNA extracted from the residual bronchoalveolar lavage fluid from the five outbreak cases and from another 20 cases of PCP between 2007 and 2014 were used for multilocus sequence typing to compare the genetic similarity of the P. jirovecii. DNA base sequencing by the Sanger method showed some regions where two bases overlapped and could not be defined. A next-generation sequencer was used to analyse the types and ratios of these overlapping bases. DNA base sequences of P. jirovecii in the bronchoalveolar lavage fluid from four of the five PCP patients in the 2011 outbreak and from another two renal transplant patients who developed PCP in 2013 were highly homologous. The Sanger method revealed 14 genomic regions where two differing DNA bases overlapped and could not be identified. Analyses of the overlapping bases by a next-generation sequencer revealed that the differing types of base were present in almost identical ratios. There is a strong possibility that the PCP outbreak at the Toho University Omori Medical Centre was caused by the same strain of P. jirovecii. Two different types of base present in some regions may be due to P. jirovecii's being a diploid species. Copyright © 2015 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.

Draft genome sequences of bacteria isolated from the Deschampsia antarctica phyllosphere.

PubMed

Cid, Fernanda P; Maruyama, Fumito; Murase, Kazunori; Graether, Steffen P; Larama, Giovanni; Bravo, Leon A; Jorquera, Milko A

2018-05-01

Genome analyses are being used to characterize plant growth-promoting (PGP) bacteria living in different plant compartments. In this context, we have recently isolated bacteria from the phyllosphere of an Antarctic plant (Deschampsia antarctica) showing ice recrystallization inhibition (IRI), an activity related to the presence of antifreeze proteins (AFPs). In this study, the draft genomes of six phyllospheric bacteria showing IRI activity were sequenced and annotated according to their functional gene categories. Genome sizes ranged from 5.6 to 6.3 Mbp, and based on sequence analysis of the 16S rRNA genes, five strains were identified as Pseudomonas and one as Janthinobacterium. Interestingly, most strains showed genes associated with PGP traits, such as nutrient uptake (ammonia assimilation, nitrogen fixing, phosphatases, and organic acid production), bioactive metabolites (indole acetic acid and 1-aminocyclopropane-1-carboxylate deaminase), and antimicrobial compounds (hydrogen cyanide and pyoverdine). In relation to IRI activity, a search for putative AFPs using current bioinformatic tools was also carried out. Although genes associated with reported AFPs were not found in these genomes, genes connected to ice-nucleation proteins (InaA) were found in all Pseudomonas strains, but not in the Janthinobacterium strain.
First results of the German Barcode of Life (GBOL) – Myriapoda project: Cryptic lineages in German Stenotaenia linearis (Koch, 1835) (Chilopoda, Geophilomorpha)

PubMed Central

Wesener, Thomas; Voigtländer, Karin; Decker, Peter; Oeyen, Jan Philip; Spelda, Jörg; Lindner, Norman

2015-01-01

As part of the German Barcode of Life (GBOL) Myriapoda program, which aims to sequence the COI barcoding fragment for 2000 specimens of Germany’s 200 myriapod species in the near future, 44 sequences of the centipede order Geophilomorpha are analyzed. The analyses are limited to the genera Geophilus Leach, 1814 and Stenotaenia Koch, 1847 and include a total of six species. A special focus is Stenotaenia, of which 19 specimens from southern, western and eastern Germany could be successfully sequenced. The Stenotaenia data show the presence of three to four vastly different (13.7–16.7% p-distance) lineages of the genus in Germany. At least two of the three lineages show a wide distribution across Germany; only the lineage including topotypes of Stenotaenia linearis shows a more restricted distribution in southern Germany. In a maximum likelihood phylogenetic analysis the Italian species Stenotaenia ‘sorrentina’ (Attems, 1903) groups with the different German Stenotaenia linearis clades. The strongly different Stenotaenia linearis lineages within Germany, independent of geography, are a strong hint for the presence of additional, cryptic Stenotaenia species in Germany. PMID:26257532
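The lineage divergences in the abstract above are reported as uncorrected p-distances, that is, the proportion of aligned sites at which two sequences differ. A minimal sketch of that calculation follows. It assumes the two sequences are already aligned to equal length and skips sites where either sequence has a gap, which is one common convention and not necessarily the one used in the study. The function name and the toy sequences are illustrative.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap are ignored (assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned fragments (not real COI data): 1 difference over 8 sites.
distance = p_distance("ACGTACGT", "ACGTACGA")
```

On this scale, the reported 13.7–16.7% corresponds to values of 0.137 to 0.167.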
Novel, diverse RNA viruses from Mediterranean isolates of the phytopathogenic fungus, Rosellinia necatrix: insights into evolutionary biology of fungal viruses.

PubMed

Arjona-Lopez, Juan Manuel; Telengech, Paul; Jamal, Atif; Hisano, Sakae; Kondo, Hideki; Yelin, Mery Dafny; Arjona-Girona, Isabel; Kanematsu, Satoko; Lopez-Herrera, Carlos José; Suzuki, Nobuhiro

2018-04-01

To reveal mycovirus diversity, we conducted a search of as-yet-unexplored Mediterranean isolates of the phytopathogenic ascomycete Rosellinia necatrix for virus infections. Of seventy-nine, eleven fungal isolates tested RNA virus-positive, with many showing coinfections, indicating a virus incidence of 14%, which is slightly lower than that (approximately 20%) previously reported for extensive surveys of over 1000 Japanese R. necatrix isolates. All viral sequences were fully or partially characterized by Sanger and next-generation sequencing. These sequences appear to represent isolates of various new species spanning at least 6 established or previously proposed families such as Partiti-, Hypo-, Megabirna-, Yado-kari-, Fusagra- and Fusarividae, as well as a newly proposed family, Megatotiviridae. This observation greatly expands the diversity of R. necatrix viruses, because no hypo-, fusagra- or megatotiviruses were previously reported from R. necatrix. The sequence analyses showed a rare horizontal gene transfer event of the 2A-like protease domain between a dsRNA (phlegivirus) and a positive-sense, single-stranded RNA virus (hypovirus). Moreover, many of the newly detected viruses showed the closest relation to viruses reported from fungi other than R. necatrix, such as Fusarium spp., which are sympatric to R. necatrix. These combined results imply horizontal virus transfer between these soil-inhabitant fungi. © 2018 Society for Applied Microbiology and John Wiley & Sons Ltd.
Sequence variations of the partially dominant DELLA gene Rht-B1c in wheat and their functional impacts

PubMed Central

Ma, Zhengqiang

2013-01-01

Rht-B1c, allelic to the DELLA protein-encoding gene Rht-B1a, is a natural mutation documented in common wheat (Triticum aestivum). It confers variation to a number of traits related to cell and plant morphology, seed dormancy, and photosynthesis. The present study was conducted to examine the sequence variations of Rht-B1c and their functional impacts. The results showed that Rht-B1c was partially dominant or co-dominant for plant height, and exhibited an increased dwarfing effect. At the sequence level, Rht-B1c differed from Rht-B1a by one 2kb Veju retrotransposon insertion, three coding region single nucleotide polymorphisms (SNPs), one 197bp insertion, and four SNPs in the 1kb upstream sequence. Haplotype investigations, association analyses, transient expression assays, and expression profiling showed that the Veju insertion was primarily responsible for the extreme dwarfing effect. It was found that the Veju insertion changed processing of the Rht-B1c transcripts and resulted in DELLA motif primary structure disruption. Expression assays showed that Rht-B1c caused reduction of total Rht-1 transcript levels, and up-regulation of GATA-like transcription factors and genes positively regulated by these factors, suggesting that one way in which Rht-1 proteins affect plant growth and development is through GATA-like transcription factor regulation. PMID:23918966
Base substitutions at scissile bond sites are sufficient to alter RNA-binding and cleavage activity of RNase III.

PubMed

Kim, Kyungsub; Sim, Se-Hoon; Jeon, Che Ok; Lee, Younghoon; Lee, Kangseok

2011-02-01

RNase III, a double-stranded RNA-specific endoribonuclease, degrades bdm mRNA via cleavage at specific sites. To better understand the mechanism of cleavage site selection by RNase III, we performed a genetic screen for sequences containing mutations at the bdm RNA cleavage sites that resulted in altered mRNA stability using a transcriptional bdm'-'cat fusion construct. While most of the isolated mutants showed the increased bdm'-'cat mRNA stability that resulted from the inability of RNase III to cleave the mutated sequences, one mutant sequence (wt-L) displayed in vivo RNA stability similar to that of the wild-type sequence. In vivo and in vitro analyses of the wt-L RNA substrate showed that it was cut only once on the RNA strand to the 5'-terminus by RNase III, while the binding constant of RNase III to this mutant substrate was moderately increased. A base substitution at the uncleaved RNase III cleavage site in wt-L mutant RNA found in another mutant lowered the RNA-binding affinity by 11-fold and abolished the hydrolysis of scissile bonds by RNase III. Our results show that base substitutions at sites forming the scissile bonds are sufficient to alter RNA cleavage as well as the binding activity of RNase III. © 2010 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
Dyslexic children show short-term memory deficits in phonological storage and serial rehearsal: an fMRI study.

PubMed

Beneventi, Harald; Tønnessen, Finn Egil; Ersland, Lars

2009-01-01

Dyslexia is primarily associated with a phonological processing deficit. However, the clinical manifestation also includes a reduced verbal working memory (WM) span. It is unclear whether this WM impairment is caused by the phonological deficit or a distinct WM deficit. The main aim of this study was to investigate neuronal activation related to phonological storage and rehearsal of serial order in WM in a sample of 13-year-old dyslexic children compared with age-matched nondyslexic children. A sequential verbal WM task with two tasks was used. In the Letter Probe task, the probe consisted of a single letter and the judgment was for the presence or absence of that letter in the prior sequence of six letters. In the Sequence Probe (SP) task, the probe consisted of all six letters and the judgment was for a match of their serial order with the temporal order in the prior sequence. Group analyses as well as single-subject analysis were performed with the statistical parametric mapping software SPM2. In the Letter Probe task, the dyslexic readers showed reduced activation in the left precentral gyrus (BA6) compared to control group. In the Sequence Probe task, the dyslexic readers showed reduced activation in the prefrontal cortex and the superior parietal cortex (BA7) compared to the control subjects. Our findings suggest that a verbal WM impairment in dyslexia involves an extended neural network including the prefrontal cortex and the superior parietal cortex. Reduced activation in the left BA6 in both the Letter Probe and Sequence Probe tasks may be caused by a deficit in phonological processing. However, reduced bilateral activation in the BA7 in the Sequence Probe task only could indicate a distinct working memory deficit in dyslexia associated with temporal order processing.
Molecular Cloning and Characterization of cDNA Encoding a Putative Stress-Induced Heat-Shock Protein from Camelus dromedarius

PubMed Central

Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.

2011-01-01

Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedarius (the Arabian camel), domesticated under semi-desert environments, is well adapted to tolerate and survive severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of a full-length cDNA encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from the Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in pET-b expression vector. The sequence analysis of the HSPA6 gene showed a 1932 bp-long open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GenBank (accession number HQ214118.1). The BLAST analysis indicated that the C. dromedarius HSPA6 gene nucleotide sequence shared high similarity (77–91%) with heat shock gene nucleotide sequences of other mammals. The deduced 643 amino acid sequence (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. The comparative analyses of camel HSPA6 protein sequences with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). The camel HSPA6 protein structure predicted using Protein 3D structural analysis showed high similarities with human and mouse HSPs. Taken together, this study indicates that the cDNA sequences of HSPA6 gene and its amino acid and protein structure from the Arabian camel are highly conserved and have similarities with other mammalian species. PMID:21845074
The glycoprotein genes and gene junctions of the fish rhabdoviruses spring viremia of carp virus and hirame rhabdovirus: Analysis of relationships with other rhabdoviruses

USGS Publications Warehouse

Bjorklund, H.V.; Higman, K.H.; Kurath, G.

1996-01-01

The nucleotide sequences of the glycoprotein genes and all of the internal gene junctions of the fish pathogenic rhabdoviruses spring viremia of carp virus (SVCV) and hirame rhabdovirus (HIRRV) have been determined from cDNA clones generated from viral genomic RNA. The SVCV glycoprotein gene sequence is 1588 nucleotides (nt) long and encodes a 509 amino acid (aa) protein. The HIRRV glycoprotein gene sequence comprises 1612 nt, coding for a 508 aa protein. In sequence comparisons of 15 rhabdovirus glycoproteins, the SVCV glycoprotein gene showed the highest amino acid sequence identity (31.2–33.2%) with vesicular stomatitis New Jersey virus (VSNJV), Chandipura virus (CHPV) and vesicular stomatitis Indiana virus (VSIV). The HIRRV glycoprotein gene showed a very high amino acid sequence identity (74.3%) with the glycoprotein gene of another fish pathogenic rhabdovirus, infectious hematopoietic necrosis virus (IHNV), but no significant similarity with glycoproteins of VSIV or rabies virus (RABV). In phylogenetic analyses SVCV was grouped consistently with VSIV, VSNJV and CHPV in the Vesiculovirus genus of Rhabdoviridae. The fish rhabdoviruses HIRRV, IHNV and viral hemorrhagic septicemia virus (VHSV) showed close relationships with each other, but only very distant relationships with mammalian rhabdoviruses. The gene junctions are highly conserved between SVCV and VSIV, well conserved between IHNV and HIRRV, but not conserved between HIRRV/IHNV and RABV. Based on the combined results we suggest that the fish lyssa-type rhabdoviruses HIRRV, IHNV and VHSV may be grouped in their own genus within the family Rhabdoviridae. Aquarhabdovirus has been proposed for the name of this new genus.
Complete cpDNA genome sequence of Smilax china and phylogenetic placement of Liliales--influences of gene partitions and taxon sampling.

PubMed

Liu, Juan; Qi, Zhe-Chen; Zhao, Yun-Peng; Fu, Cheng-Xin; Jenny Xiang, Qiu-Yun

2012-09-01

The complete nucleotide sequence of the chloroplast genome (cpDNA) of Smilax china L. (Smilacaceae) is reported. It is the first complete cp genome sequence in Liliales. Genomic analyses were conducted to examine the rate and pattern of cpDNA genome evolution in Smilax relative to other major lineages of monocots. The cpDNA genomic sequences were combined with those available for Lilium to evaluate the phylogenetic position of Liliales and to investigate the influence of taxon sampling, gene sampling, gene function, natural selection, and substitution rate on phylogenetic inference in monocots. Phylogenetic analyses using sequence data of gene groups partitioned according to gene function, selection force, and total substitution rate demonstrated evident impacts of these factors on phylogenetic inference of monocots and the placement of Liliales, suggesting potential evolutionary convergence or adaptation of some cpDNA genes in monocots. Our study also demonstrated that reduced taxon sampling reduced the bootstrap support for the placement of Liliales in the cpDNA phylogenomic analysis. Analyses of sequences of 77 protein genes with some missing data and sequences of 81 genes (all protein genes plus the rRNA genes) support a sister relationship of Liliales to the commelinids-Asparagales clade, consistent with the APG III system. Analyses of 63 cpDNA protein genes for 32 taxa with few missing data, however, support a sister relationship of Liliales (represented by Smilax and Lilium) to Dioscoreales-Pandanales. Topology tests indicated that these two alignments do not significantly differ given any of these three cpDNA genomic sequence data sets. Furthermore, we found no saturation effect of the data, suggesting that the cpDNA genomic sequence data used in the study are appropriate for monocot phylogenetic study and long-branch attraction is unlikely to be the cause to explain the result of two well-supported, conflict placements of Liliales. 
Further analyses using sufficient nuclear data remain necessary to evaluate these two phylogenetic hypotheses regarding the position of Liliales and to address the causes of signal conflict among genes and partitions. Copyright © 2012 Elsevier Inc. All rights reserved.
Phylogenetic Characterizations of Highly Mutated EV-B106 Recombinants Showing Extensive Genetic Exchanges with Other EV-B in Xinjiang, China.

PubMed

Song, Yang; Zhang, Yong; Fan, Qin; Cui, Hui; Yan, Dongmei; Zhu, Shuangli; Tang, Haishu; Sun, Qiang; Wang, Dongyan; Xu, Wenbo

2017-02-23

Human enterovirus B106 (EV-B106) is a new member of the enterovirus B species. To date, only three nucleotide sequences of EV-B106 have been published, and only one full-length genome sequence (the Yunnan strain 148/YN/CHN/12) is available in the GenBank database. In this study, we conducted phylogenetic characterisation of four EV-B106 strains isolated in Xinjiang, China. Pairwise comparisons of the nucleotide sequences and the deduced amino acid sequences revealed that the four Xinjiang EV-B106 strains had only 80.5-80.8% nucleotide identity and 95.4-97.3% amino acid identity with the Yunnan EV-B106 strain, indicating high mutagenicity. Similarity plots and bootscanning analyses revealed that frequent intertypic recombination occurred in all four Xinjiang EV-B106 strains in the non-structural region. These four strains may share a donor sequence with the EV-B85 strain, which circulated in Xinjiang in 2011, indicating extensive genetic exchanges between these strains. All Xinjiang EV-B106 strains were temperature-sensitive. An antibody seroprevalence study against EV-B106 in two Xinjiang prefectures also showed low titres of neutralizing antibodies, suggesting limited exposure and transmission in the population. This study contributes the whole genome sequences of EV-B106 to the GenBank database and provides valuable information regarding the molecular epidemiology of EV-B106 in China.
Phylogenetic Characterizations of Highly Mutated EV-B106 Recombinants Showing Extensive Genetic Exchanges with Other EV-B in Xinjiang, China

PubMed Central

Song, Yang; Zhang, Yong; Fan, Qin; Cui, Hui; Yan, Dongmei; Zhu, Shuangli; Tang, Haishu; Sun, Qiang; Wang, Dongyan; Xu, Wenbo

2017-01-01

Human enterovirus B106 (EV-B106) is a new member of the enterovirus B species. To date, only three nucleotide sequences of EV-B106 have been published, and only one full-length genome sequence (the Yunnan strain 148/YN/CHN/12) is available in the GenBank database. In this study, we conducted phylogenetic characterisation of four EV-B106 strains isolated in Xinjiang, China. Pairwise comparisons of the nucleotide sequences and the deduced amino acid sequences revealed that the four Xinjiang EV-B106 strains had only 80.5–80.8% nucleotide identity and 95.4–97.3% amino acid identity with the Yunnan EV-B106 strain, indicating high mutagenicity. Similarity plots and bootscanning analyses revealed that frequent intertypic recombination occurred in all four Xinjiang EV-B106 strains in the non-structural region. These four strains may share a donor sequence with the EV-B85 strain, which circulated in Xinjiang in 2011, indicating extensive genetic exchanges between these strains. All Xinjiang EV-B106 strains were temperature-sensitive. An antibody seroprevalence study against EV-B106 in two Xinjiang prefectures also showed low titres of neutralizing antibodies, suggesting limited exposure and transmission in the population. This study contributes the whole genome sequences of EV-B106 to the GenBank database and provides valuable information regarding the molecular epidemiology of EV-B106 in China. PMID:28230168
Single-cell analyses of transcriptional heterogeneity during drug tolerance transition in cancer cells by RNA sequencing.

PubMed

Lee, Mei-Chong Wendy; Lopez-Diaz, Fernando J; Khan, Shahid Yar; Tariq, Muhammad Akram; Dayn, Yelena; Vaske, Charles Joseph; Radenbaugh, Amie J; Kim, Hyunsung John; Emerson, Beverly M; Pourmand, Nader

2014-11-04

The acute cellular response to stress generates a subpopulation of reversibly stress-tolerant cells under conditions that are lethal to the majority of the population. Stress tolerance is attributed to heterogeneity of gene expression within the population to ensure survival of a minority. We performed whole transcriptome sequencing analyses of metastatic human breast cancer cells subjected to the chemotherapeutic agent paclitaxel at the single-cell and population levels. Here we show that specific transcriptional programs are enacted within untreated, stressed, and drug-tolerant cell groups while generating high heterogeneity between single cells within and between groups. We further demonstrate that drug-tolerant cells contain specific RNA variants residing in genes involved in microtubule organization and stabilization, as well as cell adhesion and cell surface signaling. In addition, the gene expression profile of drug-tolerant cells is similar to that of untreated cells within a few doublings. Thus, single-cell analyses reveal the dynamics of the stress response in terms of cell-specific RNA variants driving heterogeneity, the survival of a minority population through generation of specific RNA variants, and the efficient reconversion of stress-tolerant cells back to normalcy.
Single-cell analyses of transcriptional heterogeneity during drug tolerance transition in cancer cells by RNA sequencing

PubMed Central

Lee, Mei-Chong Wendy; Lopez-Diaz, Fernando J.; Khan, Shahid Yar; Tariq, Muhammad Akram; Dayn, Yelena; Vaske, Charles Joseph; Radenbaugh, Amie J.; Kim, Hyunsung John; Emerson, Beverly M.; Pourmand, Nader

2014-01-01

The acute cellular response to stress generates a subpopulation of reversibly stress-tolerant cells under conditions that are lethal to the majority of the population. Stress tolerance is attributed to heterogeneity of gene expression within the population to ensure survival of a minority. We performed whole transcriptome sequencing analyses of metastatic human breast cancer cells subjected to the chemotherapeutic agent paclitaxel at the single-cell and population levels. Here we show that specific transcriptional programs are enacted within untreated, stressed, and drug-tolerant cell groups while generating high heterogeneity between single cells within and between groups. We further demonstrate that drug-tolerant cells contain specific RNA variants residing in genes involved in microtubule organization and stabilization, as well as cell adhesion and cell surface signaling. In addition, the gene expression profile of drug-tolerant cells is similar to that of untreated cells within a few doublings. Thus, single-cell analyses reveal the dynamics of the stress response in terms of cell-specific RNA variants driving heterogeneity, the survival of a minority population through generation of specific RNA variants, and the efficient reconversion of stress-tolerant cells back to normalcy. PMID:25339441
Bone morphogenetic protein-binding endothelial regulator of liver sinusoidal endothelial cells induces iron overload in a fatty liver mouse model.

PubMed

Hasebe, Takumu; Tanaka, Hiroki; Sawada, Koji; Nakajima, Shunsuke; Ohtake, Takaaki; Fujiya, Mikihiro; Kohgo, Yutaka

2017-03-01

Non-alcoholic fatty liver disease (NAFLD) is frequently accompanied by iron overload. However, because of the complex hepcidin-regulating molecules, the molecular mechanism underlying iron overload remains unknown. To identify the key molecule involved in NAFLD-associated iron dysregulation, we performed whole-RNA sequencing on the livers of obese mice. Male C57BL/6 mice were fed a regular or high-fat diet for 16 or 48 weeks. Internal iron was evaluated by plasma iron, ferritin or hepatic iron content. Whole-RNA sequencing was performed by transcriptome analysis using semiconductor high-throughput sequencer. Mouse liver tissues or isolated hepatocytes and sinusoidal endothelial cells were used to assess the expression of iron-regulating molecules. Mice fed a high-fat diet for 16 weeks showed excess iron accumulation. Longer exposure to a high-fat diet increased hepatic fibrosis and intrahepatic iron accumulation. A pathway analysis of the sequencing data showed that several inflammatory pathways, including bone morphogenetic protein (BMP)-SMAD signaling, were significantly affected. Sequencing analysis showed 2314 altered genes, including decreased mRNA expression of the hepcidin-coding gene Hamp. Hepcidin protein expression and SMAD phosphorylation, which induces Hamp, were found to be reduced. The expression of BMP-binding endothelial regulator (BMPER), which inhibits BMP-SMAD signaling by binding BMP extracellularly, was up-regulated in fatty livers. In addition, immunohistochemical and cell isolation analyses showed that BMPER was primarily expressed in the liver sinusoidal endothelial cells (LSECs) rather than hepatocytes. BMPER secretion by LSECs inhibits BMP-SMAD signaling in hepatocytes and further reduces hepcidin protein expression. These intrahepatic molecular interactions suggest a novel molecular basis of iron overload in NAFLD.
Mosquito genomics. Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes.

PubMed

Neafsey, Daniel E; Waterhouse, Robert M; Abai, Mohammad R; Aganezov, Sergey S; Alekseyev, Max A; Allen, James E; Amon, James; Arcà, Bruno; Arensburger, Peter; Artemov, Gleb; Assour, Lauren A; Basseri, Hamidreza; Berlin, Aaron; Birren, Bruce W; Blandin, Stephanie A; Brockman, Andrew I; Burkot, Thomas R; Burt, Austin; Chan, Clara S; Chauve, Cedric; Chiu, Joanna C; Christensen, Mikkel; Costantini, Carlo; Davidson, Victoria L M; Deligianni, Elena; Dottorini, Tania; Dritsou, Vicky; Gabriel, Stacey B; Guelbeogo, Wamdaogo M; Hall, Andrew B; Han, Mira V; Hlaing, Thaung; Hughes, Daniel S T; Jenkins, Adam M; Jiang, Xiaofang; Jungreis, Irwin; Kakani, Evdoxia G; Kamali, Maryam; Kemppainen, Petri; Kennedy, Ryan C; Kirmitzoglou, Ioannis K; Koekemoer, Lizette L; Laban, Njoroge; Langridge, Nicholas; Lawniczak, Mara K N; Lirakis, Manolis; Lobo, Neil F; Lowy, Ernesto; MacCallum, Robert M; Mao, Chunhong; Maslen, Gareth; Mbogo, Charles; McCarthy, Jenny; Michel, Kristin; Mitchell, Sara N; Moore, Wendy; Murphy, Katherine A; Naumenko, Anastasia N; Nolan, Tony; Novoa, Eva M; O'Loughlin, Samantha; Oringanje, Chioma; Oshaghi, Mohammad A; Pakpour, Nazzy; Papathanos, Philippos A; Peery, Ashley N; Povelones, Michael; Prakash, Anil; Price, David P; Rajaraman, Ashok; Reimer, Lisa J; Rinker, David C; Rokas, Antonis; Russell, Tanya L; Sagnon, N'Fale; Sharakhova, Maria V; Shea, Terrance; Simão, Felipe A; Simard, Frederic; Slotman, Michel A; Somboon, Pradya; Stegniy, Vladimir; Struchiner, Claudio J; Thomas, Gregg W C; Tojo, Marta; Topalis, Pantelis; Tubio, José M C; Unger, Maria F; Vontas, John; Walton, Catherine; Wilding, Craig S; Willis, Judith H; Wu, Yi-Chieh; Yan, Guiyun; Zdobnov, Evgeny M; Zhou, Xiaofan; Catteruccia, Flaminia; Christophides, George K; Collins, Frank H; Cornman, Robert S; Crisanti, Andrea; Donnelly, Martin J; Emrich, Scott J; Fontaine, Michael C; Gelbart, William; Hahn, Matthew W; Hansen, Immo A; Howell, Paul I; Kafatos, Fotis C; Kellis, Manolis; Lawson, Daniel; Louis, Christos; 
Luckhart, Shirley; Muskavitch, Marc A T; Ribeiro, José M; Riehle, Michael A; Sharakhov, Igor V; Tu, Zhijian; Zwiebel, Laurence J; Besansky, Nora J

2015-01-02

Variation in vectorial capacity for human malaria among Anopheles mosquito species is determined by many factors, including behavior, immunity, and life history. To investigate the genomic basis of vectorial capacity and explore new avenues for vector control, we sequenced the genomes of 16 anopheline mosquito species from diverse locations spanning ~100 million years of evolution. Comparative analyses show faster rates of gene gain and loss, elevated gene shuffling on the X chromosome, and more intron losses, relative to Drosophila. Some determinants of vectorial capacity, such as chemosensory genes, do not show elevated turnover but instead diversify through protein-sequence changes. This dynamism of anopheline genes and genomes may contribute to their flexible capacity to take advantage of new ecological niches, including adapting to humans as primary hosts. Copyright © 2015, American Association for the Advancement of Science.
Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

DOE PAGES

Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.; ...

2018-01-09

The fungal genus of Aspergillus is highly interesting, containing everything from industrial cell factories over model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus, suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.
Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae).

PubMed

Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren

2016-04-01

Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke has not been known yet. The present study has determined the complete mt genome sequences of E. hortense and assessed the phylogenetic relationships with other digenean species for which the complete mt genome sequences are available in GenBank using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum gathered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequences of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans.
Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae)

PubMed Central

Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren

2016-01-01

Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke has not been known yet. The present study has determined the complete mt genome sequences of E. hortense and assessed the phylogenetic relationships with other digenean species for which the complete mt genome sequences are available in GenBank using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum gathered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequences of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans. PMID:27180575
Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.

The fungal genus of Aspergillus is highly interesting, containing everything from industrial cell factories over model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus, suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.

Current state-of-art of STR sequencing in forensic genetics.

PubMed

Alonso, Antonio; Barrio, Pedro A; Müller, Petra; Köcher, Steffi; Berger, Burkhard; Martin, Pablo; Bodner, Martin; Willuweit, Sascha; Parson, Walther; Roewer, Lutz; Budowle, Bruce

2018-05-11

The current state of validation and implementation strategies of MPS technology for the analysis of STR markers for forensic genetics use is described, covering the topics of the current catalogue of commercial MPS-STR panels, leading MPS-platforms, and MPS-STR data analysis tools. In addition, the developmental and internal validation studies carried out to date to evaluate reliability, sensitivity, mixture analysis, concordance, and the ability to analyze challenged samples are summarized. The results of various MPS-STR population studies that showed a large number of new STR sequence variants that increase the power of discrimination in several forensically-relevant loci are also presented. Finally, various initiatives developed by several international projects and standardization (or guidelines) groups to facilitate application of MPS technology for STR marker analyses are discussed in regard to promoting a standard STR sequence nomenclature, performing population studies to detect sequence variants, and developing a universal system to translate sequence variants into a simple STR nomenclature (numbers and letters) compatible with national STR databases. This article is protected by copyright. All rights reserved.
Phylogenetic position of the North American isolate of Pasteuria that parasitizes the soybean cyst nematode, Heterodera glycines, as inferred from 16S rDNA sequence analysis.

PubMed

Atibalentja, N; Noel, G R; Domier, L L

2000-03-01

A 1341 bp sequence of the 16S rDNA of an undescribed species of Pasteuria that parasitizes the soybean cyst nematode, Heterodera glycines, was determined and then compared with a homologous sequence of Pasteuria ramosa, a parasite of cladoceran water fleas of the family Daphnidae. The two Pasteuria sequences, which diverged from each other by a dissimilarity index of 7%, also were compared with the 16S rDNA sequences of 30 other bacterial species to determine the phylogenetic position of the genus Pasteuria among the Gram-positive eubacteria. Phylogenetic analyses using maximum-likelihood, maximum-parsimony and neighbour-joining methods showed that the Heterodera glycines-infecting Pasteuria and its sister species, P. ramosa, form a distinct line of descent within the Alicyclobacillus group of the Bacillaceae. These results are consistent with the view that the genus Pasteuria is a deeply rooted member of the Clostridium-Bacillus-Streptococcus branch of the Gram-positive eubacteria, neither related to the actinomycetes nor closely related to true endospore-forming bacteria.
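The "dissimilarity index of 7%" in the record above is a pairwise distance between two aligned 16S rDNA sequences. The abstract does not say how the index was calculated, so the sketch below assumes the simplest common definition, the uncorrected p-distance (proportion of differing sites among aligned, ungapped positions). The function name and the toy sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Uncorrected p-distance between two pre-aligned sequences.

    Columns where either sequence has a gap are skipped. This is an
    assumed definition; the cited study may have used a different index.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences have a residue
    pairs = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != gap and y != gap
    ]
    if not pairs:
        raise ValueError("no comparable (ungapped) positions")
    mismatches = sum(x != y for x, y in pairs)
    return mismatches / len(pairs)


if __name__ == "__main__":
    # Toy alignment: 1 mismatch over 10 compared sites
    print(p_distance("ACGTACGTAC", "ACGTACGTAA"))
```

Under this definition, a 7% index over a 1341 bp alignment would correspond to roughly 94 differing sites, if no columns were excluded.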
A multiple-alignment based primer design algorithm for genetically highly variable DNA targets

PubMed Central

2013-01-01

Background: Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design. Results: Design constraints include degenerate sites to maximize population coverage, matching of melting temperatures, optimizing de novo sequence length, finding optimal bio-barcodes to allow efficient downstream analyses, and minimizing risk of dimerization. To facilitate primer design addressing these and other constraints, we created a novel computer program (PrimerDesign) that automates this complex procedure. We show its powers and limitations and give examples of successful designs for the analysis of HIV-1 populations. Conclusions: PrimerDesign is useful for researchers who want to design DNA primers and probes for analyzing highly variable DNA populations. It can be used to design primers for PCR, RT-PCR, Sanger sequencing, next-generation sequencing, and other experimental protocols targeting highly variable DNA samples. PMID:23965160
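Two of the constraints named in the record above, degenerate sites and matching of melting temperatures, can be illustrated with a small sketch. This is not the PrimerDesign program or its algorithm: it expands IUPAC degenerate bases into concrete primer variants and estimates melting temperature with the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is only a rough guide for short oligonucleotides. All function names, the 5 °C tolerance, and the example primers are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(variant) for variant in product(*choices)]


def wallace_tm(seq: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tm_range(primer: str) -> tuple[int, int]:
    """Lowest and highest Tm estimate over all variants of a primer."""
    tms = [wallace_tm(variant) for variant in expand(primer)]
    return min(tms), max(tms)


def tm_matched(fwd: str, rev: str, max_diff: int = 5) -> bool:
    """True if all variants of both primers fall within max_diff degrees."""
    f_lo, f_hi = tm_range(fwd)
    r_lo, r_hi = tm_range(rev)
    return max(f_hi, r_hi) - min(f_lo, r_lo) <= max_diff


if __name__ == "__main__":
    print(expand("AR"))            # ['AA', 'AG']
    print(tm_range("ACGTR"))       # (14, 16)
    print(tm_matched("ACGTR", "ACGTA"))  # True
```

The number of variants grows multiplicatively with each degenerate site, so a real design tool would need to cap degeneracy; this sketch makes no attempt to do so.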
Molecular systematics of Indian Alysicarpus (Fabaceae) based on analyses of nuclear ribosomal DNA sequences.

PubMed

Gholami, Akram; Subramaniam, Shweta; Geeta, R; Pandey, Arun K

2017-06-01

Alysicarpus Necker ex Desvaux (Fabaceae, Desmodieae) consists of ~30 species that are distributed in tropical and subtropical regions of the world. In India, the genus is represented by ca. 18 species, of which seven are endemic. Sequences of the nuclear internal transcribed spacer from 38 accessions representing 16 Indian species were subjected to phylogenetic analyses. The ITS sequence data strongly support the monophyly of the genus Alysicarpus. Analyses revealed four major well-supported clades within Alysicarpus. Ancestral state reconstructions were done for two morphological characters, namely calyx length in relation to pod (macrocalyx and microcalyx) and pod surface ornamentation (transversely rugose and nonrugose). The present study is the first report on molecular systematics of Indian Alysicarpus.
Osteoblast-specific factor 2: cloning of a putative bone adhesion protein with homology with the insect protein fasciclin I.

PubMed Central

Takeshita, S; Kikuno, R; Tezuka, K; Amann, E

1993-01-01

A cDNA library prepared from the mouse osteoblastic cell line MC3T3-E1 was screened for the presence of specifically expressed genes by employing a combined subtraction hybridization/differential screening approach. A cDNA was identified and sequenced which encodes a protein designated osteoblast-specific factor 2 (OSF-2) comprising 811 amino acids. OSF-2 has a typical signal sequence, followed by a cysteine-rich domain, a fourfold repeated domain and a C-terminal domain. The protein lacks a typical transmembrane region. The fourfold repeated domain of OSF-2 shows homology with the insect protein fasciclin I. RNA analyses revealed that OSF-2 is expressed in bone and to a lesser extent in lung, but not in other tissues. Mouse OSF-2 cDNA was subsequently used as a probe to clone the human counterpart. Mouse and human OSF-2 show a high amino acid sequence conservation except for the signal sequence and two regions in the C-terminal domain in which 'in-frame' insertions or deletions are observed, implying alternative splicing events. On the basis of the amino acid sequence homology with fasciclin I, we suggest that OSF-2 functions as a homophilic adhesion molecule in bone formation. PMID:8363580
Deep Sequencing Analysis of RNAs from Citrus Plants Grown in a Citrus Sudden Death-Affected Area Reveals Diverse Known and Putative Novel Viruses.

PubMed

Matsumura, Emilyn E; Coletta-Filho, Helvecio D; Nouri, Shahideh; Falk, Bryce W; Nerva, Luca; Oliveira, Tiago S; Dorta, Silvia O; Machado, Marcos A

2017-04-24

Citrus sudden death (CSD) has caused the death of approximately four million orange trees in a very important citrus region in Brazil. Although its etiology is still not completely clear, symptoms and distribution of affected plants indicate a viral disease. In a search for viruses associated with CSD, we have performed a comparative high-throughput sequencing analysis of the transcriptome and small RNAs from CSD-symptomatic and -asymptomatic plants using the Illumina platform. The data revealed mixed infections that included Citrus tristeza virus (CTV) as the most predominant virus, followed by the Citrus sudden death-associated virus (CSDaV), Citrus endogenous pararetrovirus (CitPRV) and two putative novel viruses tentatively named Citrus jingmen-like virus (CJLV), and Citrus virga-like virus (CVLV). The deep sequencing analyses were sensitive enough to differentiate two genotypes of both viruses previously associated with CSD-affected plants: CTV and CSDaV. Our data also showed a putative association of the CSD-symptomatic plants with a specific CSDaV genotype and a likely association with CitPRV as well, whereas the two putative novel viruses appeared to be more associated with CSD-asymptomatic plants. This is the first high-throughput sequencing-based study of the viral sequences present in CSD-affected citrus plants, and generated valuable information for further CSD studies.
A case study to determine the geographical origin of unknown GM papaya in routine food sample analysis, followed by identification of papaya events 16-0-1 and 18-2-4.

PubMed

Prins, Theo W; Scholtens, Ingrid M J; Bak, Arno W; van Dijk, Jeroen P; Voorhuijzen, Marleen M; Laurensse, Emile J; Kok, Esther J

2016-12-15

During routine monitoring for GMOs in food in the Netherlands, papaya-containing food supplements were found positive for the genetically modified (GM) elements P-35S and T-nos. The goal of this study was to identify the unknown and EU unauthorised GM papaya event(s). A screening strategy was applied using additional GM screening elements including a newly developed PRSV coat protein PCR. The detected PRSV coat protein PCR product was sequenced and the nucleotide sequence showed identity to PRSV YK strains indigenous to China and Taiwan. The GM events 16-0-1 and 18-2-4 could be identified by amplifying and sequencing events-specific sequences. Further analyses showed that both papaya event 16-0-1 and event 18-2-4 were transformed with the same construct. For use in routine analysis, derived TaqMan qPCR methods for events 16-0-1 and 18-2-4 were developed. Event 16-0-1 was detected in all samples tested whereas event 18-2-4 was detected in one sample. This study presents a strategy for combining information from different sources (literature, patent databases) and novel sequence data to identify unknown GM papaya events. Copyright © 2016 Elsevier Ltd. All rights reserved.
Circadian clock protein KaiC forms ATP-dependent hexameric rings and binds DNA

PubMed Central

Mori, Tetsuya; Saveliev, Sergei V.; Xu, Yao; Stafford, Walter F.; Cox, Michael M.; Inman, Ross B.; Johnson, Carl H.

2002-01-01

KaiC from Synechococcus elongatus PCC 7942 (KaiC) is an essential circadian clock protein in cyanobacteria. Previous sequence analyses suggested its inclusion in the RecA/DnaB superfamily. A characteristic of the proteins of this superfamily is that they form homohexameric complexes that bind DNA. We show here that KaiC also forms ring complexes with a central pore that can be visualized by electron microscopy. A combination of analytical ultracentrifugation and chromatographic analyses demonstrates that these complexes are hexameric. The association of KaiC molecules into hexamers depends on the presence of ATP. The KaiC sequence does not include the obvious DNA-binding motifs found in RecA or DnaB. Nevertheless, KaiC binds forked DNA substrates. These data support the inclusion of KaiC into the RecA/DnaB superfamily and have important implications for enzymatic activity of KaiC in the circadian clock mechanism that regulates global changes in gene expression patterns. PMID:12477935
Circadian clock protein KaiC forms ATP-dependent hexameric rings and binds DNA.

PubMed

Mori, Tetsuya; Saveliev, Sergei V; Xu, Yao; Stafford, Walter F; Cox, Michael M; Inman, Ross B; Johnson, Carl H

2002-12-24

KaiC from Synechococcus elongatus PCC 7942 (KaiC) is an essential circadian clock protein in cyanobacteria. Previous sequence analyses suggested its inclusion in the RecA/DnaB superfamily. A characteristic of the proteins of this superfamily is that they form homohexameric complexes that bind DNA. We show here that KaiC also forms ring complexes with a central pore that can be visualized by electron microscopy. A combination of analytical ultracentrifugation and chromatographic analyses demonstrates that these complexes are hexameric. The association of KaiC molecules into hexamers depends on the presence of ATP. The KaiC sequence does not include the obvious DNA-binding motifs found in RecA or DnaB. Nevertheless, KaiC binds forked DNA substrates. These data support the inclusion of KaiC into the RecA/DnaB superfamily and have important implications for enzymatic activity of KaiC in the circadian clock mechanism that regulates global changes in gene expression patterns.
Phylogeny of sipunculan worms: A combined analysis of four gene regions and morphology.

PubMed

Schulze, Anja; Cutler, Edward B; Giribet, Gonzalo

2007-01-01

The intra-phyletic relationships of sipunculan worms were analyzed based on DNA sequence data from four gene regions and 58 morphological characters. Initially we analyzed the data under direct optimization using parsimony as optimality criterion. An implied alignment resulting from the direct optimization analysis was subsequently utilized to perform a Bayesian analysis with mixed models for the different data partitions. For this we applied a doublet model for the stem regions of the 18S rRNA. Both analyses support monophyly of Sipuncula and most of the same clades within the phylum. The analyses differ with respect to the relationships among the major groups but whereas the deep nodes in the direct optimization analysis generally show low jackknife support, they are supported by 100% posterior probability in the Bayesian analysis. Direct optimization has been useful for handling sequences of unequal length and generating conservative phylogenetic hypotheses whereas the Bayesian analysis under mixed models provided high resolution in the basal nodes of the tree.
The Large Subunit rDNA Sequence of Plasmodiophora brassicae Does not Contain Intra-species Polymorphism.

PubMed

Schwelm, Arne; Berney, Cédric; Dixelius, Christina; Bass, David; Neuhauser, Sigrid

2016-12-01

Clubroot disease caused by Plasmodiophora brassicae is one of the most important diseases of cultivated brassicas. P. brassicae occurs in pathotypes which differ in their aggressiveness towards their Brassica host plants. To date no DNA-based method to distinguish these pathotypes has been described. In 2011 polymorphism within the 28S rDNA of P. brassicae was reported which potentially could allow pathotypes to be distinguished without the need for time-consuming bioassays. However, isolates of P. brassicae from around the world analysed in this study do not show polymorphism in their LSU rDNA sequences. The previously described polymorphism most likely derived from soil-inhabiting Cercozoa, more specifically Neoheteromita-like glissomonads. Here we correct the LSU rDNA sequence of P. brassicae. By using FISH we demonstrate that our newly generated sequence belongs to the causal agent of clubroot disease. Copyright © 2016 The Authors. Published by Elsevier GmbH. All rights reserved.
Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data

PubMed Central

Flickinger, Matthew; Jun, Goo; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min

2015-01-01

DNA sample contamination is a frequent problem in DNA sequencing studies and can result in genotyping errors and reduced power for association testing. We recently described methods to identify within-species DNA sample contamination based on sequencing read data, showed that our methods can reliably detect and estimate contamination levels as low as 1%, and suggested strategies to identify and remove contaminated samples from sequencing studies. Here we propose methods to model contamination during genotype calling as an alternative to removal of contaminated samples from further analyses. We compare our contamination-adjusted calls to calls that ignore contamination and to calls based on uncontaminated data. We demonstrate that, for moderate contamination levels (5%–20%), contamination-adjusted calls eliminate 48%–77% of the genotyping errors. For lower levels of contamination, our contamination correction methods produce genotypes nearly as accurate as those based on uncontaminated data. Our contamination correction methods are useful generally, but are particularly helpful for sample contamination levels from 2% to 20%. PMID:26235984
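The idea of modelling contamination during genotype calling, rather than discarding contaminated samples, can be illustrated with a simple mixture likelihood. The following is a minimal sketch and not the authors' implementation: it assumes a biallelic site, a known contamination fraction `alpha`, a population alternate-allele frequency `pop_alt_freq` for the contaminating DNA, and a uniform per-base error rate. All function and parameter names are hypothetical.

```python
from math import comb

def genotype_likelihoods(n_alt, n_total, alpha, pop_alt_freq, error=0.01):
    """Likelihood of the read counts for genotypes with 0, 1 and 2 alt alleles.

    Assumption (illustrative only): each read comes from the contaminating
    sample with probability alpha, and the contaminant carries the alt allele
    with probability pop_alt_freq.
    """
    likelihoods = []
    for alt_copies in (0, 1, 2):
        # Expected alt fraction among true molecules: sample plus contaminant.
        p_alt = (1 - alpha) * alt_copies / 2 + alpha * pop_alt_freq
        # Allow for sequencing error flipping the observed allele.
        q = p_alt * (1 - error) + (1 - p_alt) * error
        likelihoods.append(
            comb(n_total, n_alt) * q ** n_alt * (1 - q) ** (n_total - n_alt)
        )
    return likelihoods

def call_genotype(n_alt, n_total, alpha, pop_alt_freq, error=0.01):
    """Return the number of alt alleles (0, 1 or 2) with the highest likelihood."""
    liks = genotype_likelihoods(n_alt, n_total, alpha, pop_alt_freq, error)
    return max(range(3), key=lambda g: liks[g])

# 6 alt reads out of 30: ignoring contamination suggests a heterozygote,
# whereas allowing for 20% contamination favours homozygous reference.
naive_call = call_genotype(6, 30, alpha=0.0, pop_alt_freq=0.5)
adjusted_call = call_genotype(6, 30, alpha=0.2, pop_alt_freq=0.5)
```

In this toy case the alt reads are better explained by the contaminant than by a heterozygous genotype, which is the kind of genotyping error a contamination-aware model is meant to avoid.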
Production of a full-length infectious GFP-tagged cDNA clone of Beet mild yellowing virus for the study of plant-polerovirus interactions.

PubMed

Stevens, Mark; Viganó, Felicita

2007-04-01

The full-length cDNA of Beet mild yellowing virus (Broom's Barn isolate) was sequenced and cloned into the vector pLitmus 29 (pBMYV-BBfl). The sequence of BMYV-BBfl (5721 bases) shared 96% and 98% nucleotide identity with the other complete sequences of BMYV (BMYV-2ITB, France and BMYV-IPP, Germany respectively). Full-length capped RNA transcripts of pBMYV-BBfl were synthesised and found to be biologically active in Arabidopsis thaliana protoplasts following electroporation or PEG inoculation when the protoplasts were subsequently analysed using serological and molecular methods. The BMYV sequence was modified by inserting DNA that encoded the jellyfish green fluorescent protein (GFP) into the P5 gene close to its 3' end. A. thaliana protoplasts electroporated with these RNA transcripts were biologically active and up to 2% of transfected protoplasts showed GFP-specific fluorescence. The exploitation of these cDNA clones for the study of the biology of beet poleroviruses is discussed.
Genomic sequencing and the impact of molecular diagnosis on patient care.

PubMed

Solomon, Benjamin D

2015-02-01

Evolving sequencing technologies allow more accurate, efficient and affordable genomic analysis. As a result, these technologies are increasingly available, especially to provide molecular diagnoses for patients with suspected genetic disorders. However, there are many challenges to using genomic sequencing to benefit patients, including concerns that there is insufficient evidence that identifying an underlying molecular explanation may positively impact a patient's healthcare. This concern has many repercussions, including funding and/or (in some countries and healthcare systems) insurance reimbursement for genomic sequencing. To investigate this concern, all monogenic disorders were analyzed based on the impact of achieving molecular diagnosis. Of the 2,849 individual genes in which germline mutations cause disorders (not including contiguous gene syndromes or what may be categorized as susceptibility alleles), our analyses showed a specific, available intervention related to at least one affected organ system for 1,419 (49.8%) genes. In 95.6% of these genes, the intervention(s) would be recommended during the pediatric time frame.
The BaMM web server for de-novo motif discovery and regulatory sequence analysis.

PubMed

Kiesel, Anja; Roth, Christian; Ge, Wanwan; Wess, Maximilian; Meier, Markus; Söding, Johannes

2018-05-28

The BaMM web server offers four tools: (i) de-novo discovery of enriched motifs in a set of nucleotide sequences, (ii) scanning a set of nucleotide sequences with motifs to find motif occurrences, (iii) searching with an input motif for similar motifs in our BaMM database with motifs for >1000 transcription factors, trained from the GTRD ChIP-seq database and (iv) browsing and keyword searching the motif database. In contrast to most other servers, we represent sequence motifs not by position weight matrices (PWMs) but by Bayesian Markov Models (BaMMs) of order 4, which we showed previously to perform substantially better in ROC analyses than PWMs or first order models. To address the inadequacy of P- and E-values as measures of motif quality, we introduce the AvRec score, the average recall over the TP-to-FP ratio between 1 and 100. The BaMM server is freely accessible without registration at https://bammmotif.mpibpc.mpg.de.
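For orientation, the position weight matrix (PWM) representation that the abstract contrasts with BaMMs can be sketched in a few lines. This is a generic zeroth-order log-odds scan, not the BaMM server's algorithm; the toy motif, the uniform background and the threshold are assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def log_odds_matrix(position_probs, background=0.25):
    """Convert per-position base probabilities into log2-odds scores."""
    return [
        {b: math.log2(col[b] / background) for b in BASES}
        for col in position_probs
    ]

def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # A PWM scores each position independently of its neighbours.
        score = sum(pwm[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

def toy_column(consensus, p=0.85):
    """Hypothetical motif column: probability p for the consensus base."""
    rest = (1.0 - p) / 3
    return {b: (p if b == consensus else rest) for b in BASES}

TOY_PWM = log_odds_matrix([toy_column(b) for b in "TATA"])
hits = scan("GGTATACC", TOY_PWM, threshold=5.0)
```

A higher-order Markov model would replace the independent per-position lookup with probabilities conditioned on the preceding bases, which is where the two representations differ.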
De novo assembly and next-generation sequencing to analyse full-length gene variants from codon-barcoded libraries.

PubMed

Cho, Namjin; Hwang, Byungjin; Yoon, Jung-ki; Park, Sangun; Lee, Joongoo; Seo, Han Na; Lee, Jeewon; Huh, Sunghoon; Chung, Jinsoo; Bang, Duhee

2015-09-21

Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable to assess functional relationships between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validated JigsawSeq on small sub-pools and observed high precision and recall at various experimental settings. With extensive simulations, we then applied JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants.
Application of genetic algorithm in integrated setup planning and operation sequencing

NASA Astrophysics Data System (ADS)

Kafashi, Sajad; Shakeri, Mohsen

2011-01-01

Process planning is an essential component for linking design and manufacturing. Setup planning and operation sequencing are the two main tasks in process planning. Many studies have solved these two problems separately. Because the two functions are complementary, they need to be integrated more tightly so that the performance of a manufacturing system can be improved economically and competitively. This paper presents a generative system and a genetic algorithm (GA) approach to process planning for a given part. The proposed approach and optimization methodology analyse the TAD (tool approach direction), tolerance relations between features and feature precedence relations to generate all possible setups and operations using a workshop resource database. Based on these technological constraints, the GA approach, which adopts a feature-based representation, optimizes the setup plan and sequence of operations using cost indices. A case study shows that the developed system can generate satisfactory results in optimizing setup planning and operation sequencing simultaneously under feasible conditions.
Application of rDNA-PCR amplification and DGGE fingerprinting for detection of microbial diversity in a Malaysian crude oil.

PubMed

Liew, Pauline Woanying; Jong, Bor Chyan

2008-05-01

Two culture-independent methods, namely ribosomal DNA libraries and denaturing gradient gel electrophoresis (DGGE), were adopted to examine the microbial community of a Malaysian light crude oil. In this study, both 16S and 18S rDNAs were PCR-amplified from bulk DNA of crude oil samples, cloned, and sequenced. Analyses of restriction fragment length polymorphism (RFLP) and phylogenetics clustered the 16S and 18S rDNA sequences into seven and six groups, respectively. The ribosomal DNA sequences obtained showed sequence similarity of between 90 and 100% to those available in the GenBank database. The closest relatives documented for the 16S rDNAs include member species of Thermoincola and Rhodopseudomonas, whereas the closest fungal relatives include Acremonium, Ceriporiopsis, Xeromyces, Lecythophora, and Candida. Others were affiliated to uncultured bacteria and uncultured ascomycete. The 16S rDNA library demonstrated predomination by a single uncultured bacterial type by >80% relative abundance. The predomination was confirmed by DGGE analysis.
Variability of Actinobacteria, a minor component of rumen microflora.

PubMed

Suľák, M; Sikorová, L; Jankuvová, J; Javorský, P; Pristaš, P

2012-07-01

Actinobacteria (Actinomycetes) are a significant and interesting group of gram-positive bacteria. They are regular, though infrequent, members of the microbial life in the rumen and represent up to 3 % of total rumen bacteria; however, there is a considerable lack of information about the ecology and biology of rumen actinobacteria. During the characterization of variability of rumen treponemas using a non-cultivation approach, we also noted the variability of rumen actinobacteria. By using Treponema-specific primers a specific 16S rRNA gene library was prepared from cow and sheep rumen total DNA. About 10 % of recombinant clones contained actinobacteria-like sequences. Phylogenetic analyses of 11 clones obtained showed the high variability of actinobacteria in the ruminant digestive system. While some sequences are nearly identical to known sequences of actinobacteria, we detected completely new clusters of actinobacteria-like sequences, probably representing a new, as yet undiscovered, group of rumen Actinobacteria. Further research will be necessary for understanding their nature and functions in the rumen.
Gene sequence analyses and other DNA-based methods for yeast species recognition

USDA-ARS's Scientific Manuscript database

DNA sequence analyses, as well as other DNA-based methodologies, have transformed the way in which yeasts are identified. The focus of this chapter will be on the resolution of species using various types of DNA comparisons. In other chapters in this book, Rozpedowska, Piškur and Wolfe discuss mul...

Heuristics for multiobjective multiple sequence alignment.

PubMed

Abbasi, Maryam; Paquete, Luís; Pereira, Francisco B

2016-07-15

Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by the current software packages are highly dependent on the parameters setting, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may provide an undesirable bias in further steps of the analysis and give too simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives to the practitioner further information about the similarity of the sequences, from which she could analyse and choose the most plausible alignment. We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments, which are representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments. The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. 
Experimental results show that our approaches can obtain better results than TCoffee and Clustal Omega in terms of the first ratio.
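The two objectives described in this abstract (substitution score to maximize, indels to minimize) and the notion of a trade-off set can be made concrete with a short sketch. It is illustrative only and not the authors' heuristics: it assumes a simple match/mismatch score instead of a substitution matrix and counts every residue-versus-gap pair as one indel.

```python
from itertools import combinations

def objectives(alignment, match=1, mismatch=-1):
    """Return (substitution_score, indel_count) summed over all sequence pairs.

    alignment: list of equal-length strings using '-' for gaps.
    Assumption: simplified scoring chosen for illustration.
    """
    subst, indels = 0, 0
    for a, b in combinations(alignment, 2):
        for x, y in zip(a, b):
            if x == "-" and y == "-":
                continue  # gap against gap contributes to neither objective
            if x == "-" or y == "-":
                indels += 1
            else:
                subst += match if x == y else mismatch
    return subst, indels

def dominates(p, q):
    """True if p is at least as good as q on both objectives and differs from q."""
    return p[0] >= q[0] and p[1] <= q[1] and p != q

def pareto_front(points):
    """Keep only the score vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

score_a = objectives(["ACGT", "AC-T"])   # three matches, one indel
score_b = objectives(["ACGT", "ACT-"])   # two matches, one mismatch, one indel
front = pareto_front([score_a, score_b, (2, 0)])
```

Here `(2, 0)` is an arbitrary extra score vector added to show that two non-dominated alignments can coexist; a practitioner would then choose among the alignments on the front.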
MELOGEN: an EST database for melon functional genomics

PubMed Central

Gonzalez-Ibeas, Daniel; Blanca, José; Roig, Cristina; González-To, Mireia; Picó, Belén; Truniger, Verónica; Gómez, Pedro; Deleu, Wim; Caño-Delgado, Ana; Arús, Pere; Nuez, Fernando; Garcia-Mas, Jordi; Puigdomènech, Pere; Aranda, Miguel A

2007-01-01

Background Melon (Cucumis melo L.) is one of the most important fleshy fruits for fresh consumption. Despite this, few genomic resources exist for this species. To facilitate the discovery of genes involved in essential traits, such as fruit development, fruit maturation and disease resistance, and to speed up the process of breeding new and better adapted melon varieties, we have produced a large collection of expressed sequence tags (ESTs) from eight normalized cDNA libraries from different tissues in different physiological conditions. Results We determined over 30,000 ESTs that were clustered into 16,637 non-redundant sequences or unigenes, comprising 6,023 tentative consensus sequences (contigs) and 10,614 unclustered sequences (singletons). Many potential molecular markers were identified in the melon dataset: 1,052 potential simple sequence repeats (SSRs) and 356 single nucleotide polymorphisms (SNPs) were found. Sixty-nine percent of the melon unigenes showed a significant similarity with proteins in databases. Functional classification of the unigenes was carried out following the Gene Ontology scheme. In total, 9,402 unigenes were mapped to one or more ontology. Remarkably, the distributions of melon and Arabidopsis unigenes followed similar tendencies, suggesting that the melon dataset is representative of the whole melon transcriptome. Bioinformatic analyses primarily focused on potential precursors of melon micro RNAs (miRNAs) in the melon dataset, but many other genes potentially controlling disease resistance and fruit quality traits were also identified. Patterns of transcript accumulation were characterised by Real-Time-qPCR for 20 of these genes. Conclusion The collection of ESTs characterised here represents a substantial increase on the genetic information available for melon. A database (MELOGEN) which contains all EST sequences, contig images and several tools for analysis and data mining has been created. 
This set of sequences constitutes also the basis for an oligo-based microarray for melon that is being used in experiments to further analyse the melon transcriptome. PMID:17767721
High-throughput sequencing reveals unprecedented diversities of Aspergillus species in outdoor air.

PubMed

Lee, S; An, C; Xu, S; Lee, S; Yamamoto, N

2016-09-01

This study used the Illumina MiSeq to analyse compositions and diversities of Aspergillus species in outdoor air. The seasonal air samplings were performed at two locations in Seoul, South Korea. The results showed the relative abundances of all Aspergillus species combined ranging from 0·20 to 18% and from 0·19 to 21% based on the number of the internal transcribed spacer 1 (ITS1) and β-tubulin (BenA) gene sequences respectively. Aspergillus fumigatus was the most dominant species with the mean relative abundances of 1·2 and 5·5% based on the number of the ITS1 and BenA sequences respectively. A total of 29 Aspergillus species were detected and identified down to the species rank, among which nine species were known opportunistic pathogens. Remarkably, eight of the nine pathogenic species were detected by either one of the two markers, suggesting the need of using multiple markers and/or primer pairs when the assessments are made based on the high-throughput sequencing. Due to diversity of species within the genus Aspergillus, the high-throughput sequencing was useful to characterize their compositions and diversities in outdoor air, which are thought to be difficult to be accurately characterized by conventional culture and/or Sanger sequencing-based techniques. Aspergillus is a diverse genus of fungi with more than 300 species reported in literature. Aspergillus is important since some species are known allergens and opportunistic human pathogens. Traditionally, growth-dependent methods have been used to detect Aspergillus species in air. However, these methods are limited in the number of isolates that can be analysed for their identities, resulting in inaccurate characterizations of Aspergillus diversities. This study used the high-throughput sequencing to explore Aspergillus diversities in outdoor air, which are thought to be difficult to be accurately characterized by traditional growth-dependent techniques. © 2016 The Society for Applied Microbiology.
Diversity and community composition of methanogenic archaea in the rumen of Scottish upland sheep assessed by different methods.

PubMed

Snelling, Timothy J; Genç, Buğra; McKain, Nest; Watson, Mick; Waters, Sinéad M; Creevey, Christopher J; Wallace, R John

2014-01-01

Ruminal archaeomes of two mature sheep grazing in the Scottish uplands were analysed by different sequencing and analysis methods in order to compare the apparent archaeal communities. All methods revealed that the majority of methanogens belonged to the Methanobacteriales order containing the Methanobrevibacter, Methanosphaera and Methanobacteria genera. Sanger sequenced 1.3 kb 16S rRNA gene amplicons identified the main species of Methanobrevibacter present to be a SGMT Clade member Mbb. millerae (≥ 91% of OTUs); Methanosphaera comprised the remainder of the OTUs. The primers did not amplify ruminal Thermoplasmatales-related 16S rRNA genes. Illumina sequenced V6-V8 16S rRNA gene amplicons identified similar Methanobrevibacter spp. and Methanosphaera clades and also identified the Thermoplasmatales-related order as 13% of total archaea. Unusually, both methods concluded that Mbb. ruminantium and relatives from the same clade (RO) were almost absent. Sequences mapping to rumen 16S rRNA and mcrA gene references were extracted from Illumina metagenome data. Mapping of the metagenome data to 16S rRNA gene references produced taxonomic identification to Order level including 2-3% Thermoplasmatales, but was unable to discriminate to species level. Mapping of the metagenome data to mcrA gene references resolved 69% to unclassified Methanobacteriales. Only 30% of sequences were assigned to species level clades: of the sequences assigned to Methanobrevibacter, most mapped to SGMT (16%) and RO (10%) clades. The Sanger 16S amplicon and Illumina metagenome mcrA analyses showed similar species richness (Chao1 Index 19-35), while Illumina metagenome and amplicon 16S rRNA analysis gave lower richness estimates (10-18). The values of the Shannon Index were low in all methods, indicating low richness and uneven species distribution. 
Thus, although much information may be extracted from the other methods, Illumina amplicon sequencing of the V6-V8 16S rRNA gene would be the method of choice for studying rumen archaeal communities.
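The richness and diversity measures named in this abstract can be computed from a vector of per-OTU read counts. The sketch below uses the commonly cited bias-corrected form of the Chao1 estimator and the natural-log Shannon index; whether the authors used exactly these variants is an assumption, and the example counts are invented.

```python
import math

def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs observed exactly once and exactly twice.
    """
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)
    f1 = sum(1 for c in observed if c == 1)
    f2 = sum(1 for c in observed if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Invented counts: one dominant OTU and a tail of rare ones.
otu_counts = [50, 3, 2, 1, 1, 1]
richness = chao1(otu_counts)
diversity = shannon(otu_counts)
```

A community dominated by one OTU, as in this toy vector, gives a Shannon index well below the maximum of ln(S) that an even community with S OTUs would reach.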
Viral metagenomics of aphids present in bean and maize plots on mixed-use farms in Kenya reveals the presence of three dicistroviruses including a novel Big Sioux River virus-like dicistrovirus.

PubMed

Wamonje, Francis O; Michuki, George N; Braidwood, Luke A; Njuguna, Joyce N; Musembi Mutuku, J; Djikeng, Appolinaire; Harvey, Jagger J W; Carr, John P

2017-10-02

Aphids are major vectors of plant viruses. Common bean (Phaseolus vulgaris L.) and maize (Zea mays L.) are important crops that are vulnerable to aphid herbivory and aphid-transmitted viruses. In East and Central Africa, common bean is frequently intercropped by smallholder farmers to provide fixed nitrogen for cultivation of starch crops such as maize. We used a PCR-based technique to identify aphids prevalent in smallholder bean farms and next generation sequencing shotgun metagenomics to examine the diversity of viruses present in aphids and in maize leaf samples. Samples were collected from farms in Kenya in a range of agro-ecological zones. Cytochrome oxidase 1 (CO1) gene sequencing showed that Aphis fabae was the sole aphid species present in bean plots in the farms visited. Sequencing of total RNA from aphids using the Illumina platform detected three dicistroviruses. Maize leaf RNA was also analysed. Identification of Aphid lethal paralysis virus (ALPV), Rhopalosiphum padi virus (RhPV), and a novel Big Sioux River virus (BSRV)-like dicistrovirus in aphid and maize samples was confirmed using reverse transcription-polymerase chain reactions and sequencing of amplified DNA products. Phylogenetic, nucleotide and protein sequence analyses of eight ALPV genomes revealed evidence of intra-species recombination, with the data suggesting there may be two ALPV lineages. Analysis of BSRV-like virus genomic RNA sequences revealed features that are consistent with other dicistroviruses and that it is phylogenetically closely related to dicistroviruses of the genus Cripavirus. The discovery of ALPV and RhPV in aphids and maize further demonstrates the broad occurrence of these dicistroviruses. Dicistroviruses are remarkable in that they use plants as reservoirs that facilitate infection of their insect replicative hosts, such as aphids. This is the first report of these viruses being isolated from either organism. 
The BSRV-like sequences represent a potentially novel dicistrovirus infecting A. fabae.
Evaluation of Benthic Foraminiferal Mg/Ca and δ18O: Paleoceanographic Application

NASA Astrophysics Data System (ADS)

Fukuda, K.; Frew, R. D.; Fordyce, R. E.

2005-12-01

Using several different analytical approaches on the same samples is crucial for reducing uncertainties in paleoceanographic studies. We examined two different sequences near Oamaru, New Zealand to evaluate a combination of Mg/Ca and δ18O techniques on benthic foraminifera. As a trial, we chose well-preserved material from the Altonian stage (~18 Ma) while as an application, cemented/altered material in the Whaingaroan/Runangan stage (~34 Ma) was selected. For the Altonian, Mg/Ca in Notorotalia spinosa and Cibicides spp. were analysed by ICP-OES throughout the fossiliferous sequence and then paleotemperatures were estimated by our modern Mg/Ca calibration curves. The δ18O in N. spinosa and some Cibicides were also measured from the same stations for pairing with Mg/Ca results. Further, to evaluate paleotemperature estimates from the whole tests, spot analyses of Mg/Ca were taken through the successive chambers for the two species using Electron Probe Micro Analysis (EPMA). Paleotemperatures through the successive chambers, which should be related to their life spans, were estimated by the modern calibration curves established from EPMA analysis. Results show that Notorotalia may retain at least an annual record while the signal in Cibicides may retain only part of a season. There is distinctive seasonality observed in this period and the δ18Oseawater estimates paired with Mg/Ca in N. spinosa are comparable with published estimates. For the Whaingaroan/Runangan, Mg/Ca in Cibicides parki (ICP) shows relatively low values (cool) through this sequence in agreement with EPMA analysis. However, δ18O-derived temperatures from C. parki imply warmer conditions prevailed. In addition, Mg/Ca and δ18O from Cribrorotalia (closely related to Notorotalia) provide similar temperature estimates to the C. parki isotope results. It appears that Mg/Ca in certain species are susceptible to post-mortem alteration resulting in lower apparent temperatures.
Spot analyses in Cribrorotalia show no distinctive seasonality and the δ18Oseawater estimates indicate ice-free conditions. We conclude that pairing Mg/Ca with δ18O allows the estimation of δ18Oseawater, but only if well-preserved and annual recorder specimens are examined. Combination with EPMA analysis may provide insight into seasonal variability.
Natural Selection and Functional Potentials of Human Noncoding Elements Revealed by Analysis of Next Generation Sequencing Data

PubMed Central

Xu, Shuhua

2015-01-01

Noncoding DNA sequences (NCS) have attracted much attention recently due to their functional potentials. Here we attempted to reveal the functional roles of noncoding sequences from the point of view of natural selection that typically indicates the functional potentials of certain genomic elements. We analyzed nearly 37 million single nucleotide polymorphisms (SNPs) of Phase I data of the 1000 Genomes Project. We estimated a series of key parameters of population genetics and molecular evolution to characterize sequence variations of the noncoding genome within and between populations, and identified the natural selection footprints in NCS in worldwide human populations. Our results showed that purifying selection is prevalent and there is substantial constraint of variations in NCS, while positive selection is more likely to be specific to some particular genomic regions and regional populations. Intriguingly, we observed a larger fraction of non-conserved NCS variants with lower derived allele frequency in the genome, indicating possible functional gain of non-conserved NCS. Notably, NCS elements are enriched for potentially functional markers such as eQTLs, TF motifs, and DNase I footprints in the genome. More interestingly, some NCS variants associated with diseases such as Alzheimer's disease, Type 1 diabetes, and immune-related bowel disorder (IBD) showed signatures of positive selection, although the majority of NCS variants, reported as risk alleles by genome-wide association studies, showed signatures of negative selection. Our analyses provided compelling evidence of natural selection forces on noncoding sequences in the human genome and advanced our understanding of their functional potentials that play important roles in disease etiology and human evolution. PMID:26053627
Sequencing, bioinformatic characterization and expression pattern of a putative amino acid transporter from the parasitic cestode Echinococcus granulosus.

PubMed

Camicia, Federico; Paredes, Rodolfo; Chalar, Cora; Galanti, Norbel; Kamenetzky, Laura; Gutierrez, Ariana; Rosenzvit, Mara C

2008-03-31

We have sequenced and partially characterized an Echinococcus granulosus cDNA, termed egat1, from a protoscolex signal sequence trap (SST) cDNA library. The isolated 1627 bp long cDNA contains an ORF of 489 amino acids and shows an amino acid identity of 30% with neutral and excitatory amino acid transporter members of the Dicarboxylate/Amino Acid Na+ and/or H+ Cation Symporter family (DAACS) (TC 2.A.23). Additional bioinformatics analysis of EgAT1 confirmed the results obtained by similarity searches and showed the presence of 9 to 10 transmembrane domains, consensus sequences for N-glycosylation between the third and fourth transmembrane domain, a highly similar hydropathy profile with ASCT1 (a known member of DAACS family), high score with SDF (Sodium Dicarboxylate Family) and similar motifs with EDTRANSPORT, a fingerprint of excitatory amino acid transporters. The localization of the putative amino acid transporter was analyzed by in situ hybridization and immunofluorescence in protoscoleces and associated germinal layer. The in situ hybridization labelling indicates the distribution of egat1 mRNA throughout the tegument. EgAT1 protein, which showed in Western blots a molecular mass of approximately 60 kD, is localized in the subtegumental region of the metacestode, particularly around suckers and rostellum of protoscoleces and layers from brood capsules. The sequence and expression analyses of EgAT1 pave the way for functional analysis of amino acid transporters of E. granulosus and their evaluation as new drug targets against cystic echinococcosis.
Complete sequence analysis of 18S rDNA based on genomic DNA extraction from individual Demodex mites (Acari: Demodicidae).

PubMed

Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang

2012-05-01

This study attempted, for the first time, to accomplish 18S ribosomal DNA (rDNA) complete sequence amplification and analysis for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated with DNA Release Additive and Hot Start II DNA Polymerase so as to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield was highest at 1 mite and tended to decrease as the number of mites increased. The individual mite gDNA was successfully used for 18S rDNA fragment (about 900 bp) amplification examination. The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples (≥ 1000 mites/sample) showed over 97% identities for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion or phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that the individual Demodex mite gDNA can satisfy the molecular study of Demodex. 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the ascertainment of the types of Demodex mites parasitizing in dogs. Copyright © 2012 Elsevier Inc. All rights reserved.
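The pairwise measures named in this abstract (percent identity and transition/transversion counts) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `compare_aligned` is hypothetical, the two sequences are assumed to be already aligned to the same length, and skipping gaps and ambiguous bases is a choice made only for this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare_aligned(seq_a, seq_b):
    """Return (identity, transitions, transversions) for two aligned DNA sequences.

    Assumption for this sketch: columns containing a gap or an ambiguous
    base in either sequence are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code
        compared += 1
        if a == b:
            matches += 1
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    identity = matches / compared if compared else 0.0
    return identity, transitions, transversions
```

A threshold such as the "over 97% identities" reported above would then be a simple comparison on the first returned value.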
PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data.

PubMed

Anslan, Sten; Bahram, Mohammad; Hiiesalu, Indrek; Tedersoo, Leho

2017-11-01

High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in the public and private sectors. These methods provide vast amounts of data, which need to be analysed in several steps. Although the bioinformatics may be applied using several public tools, many analytical pipelines allow too few options for the optimal analysis for more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We described the design and options of PipeCraft and evaluated its performance by analysing the data sets from three different sequencing platforms. We demonstrated that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable. © 2017 John Wiley & Sons Ltd.
The effects of recombination, mutation and selection on the evolution of the Rp1 resistance genes in grasses.

PubMed

Jouet, Agathe; McMullan, Mark; van Oosterhout, Cock

2015-06-01

Plant immune genes, or resistance genes, are involved in a co-evolutionary arms race with a diverse range of pathogens. In agronomically important grasses, such R genes have been extensively studied because of their role in pathogen resistance and in the breeding of resistant cultivars. In this study, we evaluate the importance of recombination, mutation and selection on the evolution of the R gene complex Rp1 of Sorghum, Triticum, Brachypodium, Oryza and Zea. Analyses show that recombination is widespread, and we detected 73 independent instances of sequence exchange, involving on average 1567 of 4692 nucleotides analysed (33.4%). We were able to date 24 interspecific recombination events and found that four occurred postspeciation, which suggests that genetic introgression took place between different grass species. Other interspecific events seemed to have been maintained over long evolutionary time, suggesting the presence of balancing selection. Significant positive selection (i.e. a relative excess of nonsynonymous substitutions, dN/dS > 1) was detected in 17-95 codons (0.42-2.02%). Recombination was significantly associated with areas with high levels of polymorphism but not with an elevated dN/dS ratio. Finally, phylogenetic analyses show that recombination results in a general overestimation of the divergence time (mean = 14.3%) and an alteration of the gene tree topology if the tree is not calibrated. Given that the statistical power to detect recombination is determined by the level of polymorphism of the amplicon as well as the number of sequences analysed, it is likely that many studies have underestimated the importance of recombination relative to the mutation rate. © 2015 John Wiley & Sons Ltd.
Molecular, phylogenetic and comparative genomic analysis of the cytokinin oxidase/dehydrogenase gene family in the Poaceae.

PubMed

Mameaux, Sabine; Cockram, James; Thiel, Thomas; Steuernagel, Burkhard; Stein, Nils; Taudien, Stefan; Jack, Peter; Werner, Peter; Gray, John C; Greenland, Andy J; Powell, Wayne

2012-01-01

The genomes of cereals such as wheat (Triticum aestivum) and barley (Hordeum vulgare) are large and therefore problematic for the map-based cloning of agronomically important traits. However, comparative approaches within the Poaceae permit transfer of molecular knowledge between species, despite their divergence from a common ancestor sixty million years ago. The finding that null variants of the rice gene cytokinin oxidase/dehydrogenase 2 (OsCKX2) result in large yield increases provides an opportunity to explore whether similar gains could be achieved in other Poaceae members. Here, phylogenetic, molecular and comparative analyses of CKX families in the sequenced grass species rice, brachypodium, sorghum, maize and foxtail millet, as well as members identified from the transcriptomes/genomes of wheat and barley, are presented. Phylogenetic analyses define four Poaceae CKX clades. Comparative analyses showed that CKX phylogenetic groupings can largely be explained by a combination of local gene duplication, and the whole-genome duplication event that predates their speciation. Full-length OsCKX2 homologues in barley (HvCKX2.1, HvCKX2.2) and wheat (TaCKX2.3, TaCKX2.4, TaCKX2.5) are characterized, with comparative analysis at the DNA, protein and genetic/physical map levels suggesting that true CKX2 orthologs have been identified. Furthermore, our analysis shows CKX2 genes in barley and wheat have undergone a Triticeae-specific gene-duplication event. Finally, by identifying ten of the eleven CKX genes predicted to be present in barley by comparative analyses, we show that next-generation sequencing approaches can efficiently determine the gene space of large-genome crops. Together, this work provides the foundation for future functional investigation of CKX family members within the Poaceae. © 2011 National Institute of Agricultural Botany (NIAB). Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
DUF1220 protein domains drive proliferation in human neural stem cells and are associated with increased cortical volume in anthropoid primates.

PubMed

Keeney, J G; Davis, J M; Siegenthaler, J; Post, M D; Nielsen, B S; Hopkins, W D; Sikela, J M

2015-09-01

Genome sequences encoding DUF1220 protein domains show a burst in copy number among anthropoid species and especially humans, where they have undergone the greatest human lineage-specific copy number expansion of any protein coding sequence in the genome. While DUF1220 copy number shows a dosage-related association with brain size in both normal populations and in 1q21.1-associated microcephaly and macrocephaly, a function for these domains has not yet been described. Here we provide multiple lines of evidence supporting the view that DUF1220 domains function as drivers of neural stem cell proliferation among anthropoid species including humans. First, we show that brain MRI data from 131 individuals across 7 anthropoid species shows a strong correlation between DUF1220 copy number and multiple brain size-related measures. Using in situ hybridization analyses of human fetal brain, we also show that DUF1220 domains are expressed in the ventricular zone and primarily during human cortical neurogenesis, and are therefore expressed at the right time and place to be affecting cortical brain development. Finally, we demonstrate that in vitro expression of DUF1220 sequences in neural stem cells strongly promotes proliferation. Taken together, these data provide the strongest evidence so far reported implicating DUF1220 dosage in anthropoid and human brain expansion through mechanisms involving increasing neural stem cell proliferation.
Anaerobic degradation of cyclohexane by sulfate-reducing bacteria from hydrocarbon-contaminated marine sediments.

PubMed

Jaekel, Ulrike; Zedelius, Johannes; Wilkes, Heinz; Musat, Florin

2015-01-01

The fate of cyclohexane, often used as a model compound for the biodegradation of cyclic alkanes due to its abundance in crude oils, in anoxic marine sediments has been poorly investigated. In the present study, we obtained an enrichment culture of cyclohexane-degrading sulfate-reducing bacteria from hydrocarbon-contaminated intertidal marine sediments. Microscopic analyses showed an apparent dominance by oval cells of 1.5 × 0.8 μm. Analysis of a 16S rRNA gene library, followed by whole-cell hybridization with group- and sequence-specific oligonucleotide probes, showed that these cells belonged to a single phylotype and accounted for more than 80% of the total cell number. The dominant phylotype, affiliated with the Desulfosarcina-Desulfococcus cluster of the Deltaproteobacteria, is proposed to be responsible for the degradation of cyclohexane. Quantitative growth experiments showed that cyclohexane degradation was coupled with the stoichiometric reduction of sulfate to sulfide. Substrate response tests, corroborated by hybridization with a sequence-specific oligonucleotide probe, suggested that the dominant phylotype was able to degrade other cyclic and n-alkanes, including the gaseous alkane n-butane. Based on GC-MS analyses of culture extracts, cyclohexylsuccinate was identified as a metabolite, indicating an activation of cyclohexane by addition to fumarate. Other metabolites detected were 3-cyclohexylpropionate and cyclohexanecarboxylate, providing evidence that the overall degradation pathway of cyclohexane under anoxic conditions is analogous to that of n-alkanes.
The Control Region of Mitochondrial DNA Shows an Unusual CpG and Non-CpG Methylation Pattern

PubMed Central

Bellizzi, Dina; D'Aquila, Patrizia; Scafone, Teresa; Giordano, Marco; Riso, Vincenzo; Riccio, Andrea; Passarino, Giuseppe

2013-01-01

DNA methylation is a common epigenetic modification of the mammalian genome. Conflicting data regarding the possible presence of methylated cytosines within mitochondrial DNA (mtDNA) have been reported. To clarify this point, we analysed the methylation status of the mtDNA control region (D-loop) on human and murine DNA samples from blood and cultured cells by bisulphite sequencing and methylated/hydroxymethylated DNA immunoprecipitation assays. We found methylated and hydroxymethylated cytosines in the L-strand of all samples analysed. MtDNA methylation particularly occurs within non-C-phosphate-G (non-CpG) nucleotides, mainly in the promoter region of the heavy strand and in conserved sequence blocks, suggesting its involvement in regulating mtDNA replication and/or transcription. We observed DNA methyltransferases within the mitochondria, but the inactivation of Dnmt1, Dnmt3a, and Dnmt3b in mouse embryonic stem (ES) cells results in a reduction of CpG methylation, while non-CpG methylation appears to be unaffected. This suggests that D-loop epigenetic modification is only partially established by these enzymes. Our data show that DNA methylation occurs in the mtDNA control region of mammals, not only at symmetrical CpG dinucleotides, typical of the nuclear genome, but also in a peculiar non-CpG pattern previously reported for plants and fungi. The molecular mechanisms responsible for this pattern remain an open question. PMID:23804556
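The distinction between CpG and non-CpG methylation in bisulphite data can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis: the function name `methylation_by_context` is hypothetical, and the sketch assumes a read already aligned to the same strand of the reference without indels, and complete bisulphite conversion (an unmethylated C reads as T, a methylated C stays C).

```python
def methylation_by_context(reference, read):
    """Count methylated cytosines per sequence context.

    Returns {"CpG": (methylated, total), "non-CpG": (methylated, total)}.
    Assumptions of this sketch: `read` is aligned base-for-base to
    `reference` on the same strand, and bisulphite conversion is complete.
    """
    if len(reference) != len(read):
        raise ValueError("read must be aligned to the reference without indels")
    ref, rd = reference.upper(), read.upper()
    counts = {"CpG": [0, 0], "non-CpG": [0, 0]}  # [methylated, total]
    for i, base in enumerate(ref):
        if base != "C" or rd[i] not in "CT":
            continue  # not a cytosine, or uninformative read base
        is_cpg = i + 1 < len(ref) and ref[i + 1] == "G"
        context = "CpG" if is_cpg else "non-CpG"
        counts[context][1] += 1
        if rd[i] == "C":  # survived conversion, so called methylated
            counts[context][0] += 1
    return {name: tuple(pair) for name, pair in counts.items()}
```

Summing these counts over many reads would give per-context methylation fractions of the kind compared in the abstract.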
FusionAnalyser: a new graphical, event-driven tool for fusion rearrangements discovery

PubMed Central

Piazza, Rocco; Pirola, Alessandra; Spinelli, Roberta; Valletta, Simona; Redaelli, Sara; Magistroni, Vera; Gambacorti-Passerini, Carlo

2012-01-01

Gene fusions are common driver events in leukaemias and solid tumours; here we present FusionAnalyser, a tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. We initially tested FusionAnalyser by using a set of in silico randomly generated sequencing data from 20 known human translocations occurring in cancer and subsequently using transcriptome data from three chronic and three acute myeloid leukaemia samples. In all cases our tool was able to detect the presence of the correct driver fusion event(s) with high specificity. In one of the acute myeloid leukaemia samples, FusionAnalyser identified a novel, cryptic, in-frame ETS2–ERG fusion. A fully event-driven graphical interface and a flexible filtering system allow complex analyses to be run in the absence of any a priori programming or scripting knowledge. Therefore, we propose FusionAnalyser as an efficient and robust graphical tool for the identification of functional rearrangements in the context of high-throughput transcriptome sequencing data. PMID:22570408
FusionAnalyser: a new graphical, event-driven tool for fusion rearrangements discovery.

PubMed

Piazza, Rocco; Pirola, Alessandra; Spinelli, Roberta; Valletta, Simona; Redaelli, Sara; Magistroni, Vera; Gambacorti-Passerini, Carlo

2012-09-01

Gene fusions are common driver events in leukaemias and solid tumours; here we present FusionAnalyser, a tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. We initially tested FusionAnalyser by using a set of in silico randomly generated sequencing data from 20 known human translocations occurring in cancer and subsequently using transcriptome data from three chronic and three acute myeloid leukaemia samples. In all cases our tool was able to detect the presence of the correct driver fusion event(s) with high specificity. In one of the acute myeloid leukaemia samples, FusionAnalyser identified a novel, cryptic, in-frame ETS2-ERG fusion. A fully event-driven graphical interface and a flexible filtering system allow complex analyses to be run in the absence of any a priori programming or scripting knowledge. Therefore, we propose FusionAnalyser as an efficient and robust graphical tool for the identification of functional rearrangements in the context of high-throughput transcriptome sequencing data.
Sequence of Radiotherapy and Chemotherapy in Breast Cancer After Breast-Conserving Surgery

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jobsen, Jan J., E-mail: J.Jobsen@mst.nl; Palen, Job van der; Department of Research Methodology, Measurement and Data Analysis, Faculty of Behavioural Science, University of Twente

2012-04-01

Purpose: The optimal sequence of radiotherapy and chemotherapy in breast-conserving therapy is unknown. Methods and Materials: From 1983 through 2007, a total of 641 patients with 653 instances of breast-conserving therapy (BCT) received both chemotherapy and radiotherapy and form the basis of this analysis. Patients were divided into three groups. Groups A and B comprised patients treated before 2005; Group A received radiotherapy first and Group B chemotherapy first. Group C consisted of patients treated from 2005 onward, when we had a fixed sequence of radiotherapy first, followed by chemotherapy. Results: Local control did not show any differences among the three groups. For distant metastasis, no difference was shown between Groups A and B. Group C, when compared with Group A, showed, on univariate and multivariate analyses, a significantly better distant metastasis-free survival. The same was noted for disease-free survival. With respect to disease-specific survival, no differences were shown on multivariate analysis among the three groups. Conclusion: Radiotherapy, as an integral part of the primary treatment of BCT, should be administered first, followed by adjuvant chemotherapy.
Molecular characterization of a divergent strain of calla lily chlorotic spot virus infecting celtuce (Lactuca sativa var. augustana) in China.

PubMed

Wu, Xiaodong; Wu, Xiaoyun; Li, Wenbin; Cheng, Xiaofei

2018-05-01

Through sequencing and assembly of small RNAs, an orthotospovirus was identified from a celtuce plant (Lactuca sativa var. augustana) showing vein clearing and chlorotic spots in the Zhejiang province of China. The S, M, and L RNAs of this orthotospovirus were determined to be 3146, 4734, and 8934 nt, respectively, and shared 30.4-72.5%, 43.4-80.8%, and 29.84-82.9% nucleotide sequence identities with that of known orthotospoviruses. The full length nucleoprotein (N) of this orthotospovirus shared highest amino acid sequence identity (90.25%) with that of calla lily chlorotic spot virus isolated from calla lily (CCSV-calla) [China: Taiwan: 2001] and tobacco (CCSV-LJ1) [China: Lijiang: 2014]. Phylogenetic analyses showed that this orthotospovirus is phylogenetically associated with CCSV isolates and clustered with CCSV, tomato zonate spot virus (TZSV), and tomato necrotic spot-associated virus (TNSaV) in a separate sub-branch. These results suggest that this orthotospovirus is a divergent isolate of CCSV and was thus named CCSV-Cel [China: Zhejiang: 2017].
Evaluation of peptides release using a natural rubber latex biomembrane as a carrier.

PubMed

Miranda, M C R; Borges, F A; Barros, N R; Santos Filho, N A; Mendonça, R J; Herculano, R D; Cilli, E M

2018-05-01

The natural rubber latex (NRL) biomembrane, produced from the latex of the rubber tree Hevea brasiliensis, has shown great potential for application in biomedicine and biomaterials. Reflecting the biocompatibility and low bounce rate of this material, NRL has been used as a physical barrier to infectious agents and for the controlled release of drugs and extracts. The aim of the present study was to evaluate the incorporation and release of peptides using a latex biomembrane carrier. After incorporation, the release of material from the membrane was observed using spectrophotometry. Analyses using HPLC and mass spectroscopy did not confirm the release of the antimicrobial peptide [W6]Hylin a1 after 24 h. In addition, analysis of the release solution showed new compounds, indicating the degradation of the peptide by enzymes contained in the latex. Additionally, the release of a peptide with a shorter sequence (Ac-WAAAA) was evaluated, and degradation was not observed. These results showed that the use of NRL as a solid matrix for peptide delivery is sequence dependent and should be evaluated for each sequence.

Analysis of quality raw data of second generation sequencers with Quality Assessment Software.

PubMed

Ramos, Rommel Tj; Carneiro, Adriana R; Baumbach, Jan; Azevedo, Vasco; Schneider, Maria Pc; Silva, Artur

2011-04-18

Second generation technologies have advantages over Sanger; however, they have resulted in new challenges for the genome construction process, especially because of the small size of the reads, despite the high degree of coverage. Independent of the program chosen for the construction process, DNA sequences are superimposed, based on identity, to extend the reads, generating contigs; mismatches indicate a lack of homology and are not included. This process improves our confidence in the sequences that are generated. We developed Quality Assessment Software, with which one can review graphs showing the distribution of quality values from the sequencing reads. This software allows us to adopt more stringent quality standards for sequence data, based on quality-graph analysis and estimated coverage after applying the quality filter, providing acceptable sequence coverage for genome construction from short reads. Quality filtering is a fundamental step in the process of constructing genomes, as it reduces the frequency of incorrect alignments that are caused by measuring errors, which can occur during the construction process due to the size of the reads, provoking misassemblies. Application of quality filters to sequence data, using the software Quality Assessment, along with graphing analyses, provided greater precision in the definition of cutoff parameters, which increased the accuracy of genome construction.
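The idea of filtering reads by quality and then checking the remaining coverage can be sketched as follows. This is a hedged illustration, not the Quality Assessment Software itself: the function names, the Phred+33 quality encoding, and the default cutoff of 20 are all assumptions made for the example.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of one read, assuming Phred+33 ASCII encoding."""
    return sum(ord(ch) - offset for ch in quality_string) / len(quality_string)


def filter_reads(reads, min_mean_quality=20):
    """Keep (sequence, quality) pairs whose mean quality meets the cutoff.

    The cutoff of 20 is a placeholder; in practice it would be chosen
    from the quality distribution of the run.
    """
    return [(seq, qual) for seq, qual in reads
            if qual and mean_phred(qual) >= min_mean_quality]


def estimated_coverage(reads, genome_length):
    """Total bases in the reads divided by the expected genome length."""
    return sum(len(seq) for seq, _ in reads) / genome_length
```

Comparing `estimated_coverage` before and after `filter_reads` shows how much depth a given cutoff costs, which is the trade-off the abstract describes when choosing cutoff parameters.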
Transposon-like properties of the major, long repetitive sequence family in the genome of Physarum polycephalum

PubMed Central

Pearston, Douglas H.; Gordon, Mairi; Hardman, Norman

1985-01-01

A family of long, highly-repetitive sequences, referred to previously as `HpaII-repeats', dominates the genome of the eukaryotic slime mould Physarum polycephalum. These sequences are found exclusively in scrambled clusters. They account for about one-half of the total complement of repetitive DNA in Physarum, and represent the major sequence component found in hypermethylated, 20-50 kb segments of Physarum genomic DNA that fail to be cleaved using the restriction endonuclease HpaII. The structure of this abundant repetitive element was investigated by analysing cloned segments derived from the hypermethylated genomic DNA compartment. We show that the `HpaII-repeat' forms part of a larger repetitive DNA structure, ∼8.6 kb in length, with several structural features in common with recognised eukaryotic transposable genetic elements. Scrambled clusters of the sequence probably arise as a result of transposition-like events, during which the element preferentially recombines in either orientation with target sites located in other copies of the same repeated sequence. The target sites for transposition/recombination are not related in sequence but in all cases studied they are potentially capable of promoting the formation of small `cruciforms' or `Z-DNA' structures which might be recognised during the recombination process. PMID:16453652
Software for optimization of SNP and PCR-RFLP genotyping to discriminate many genomes with the fewest assays

PubMed Central

Gardner, Shea N; Wagner, Mark C

2005-01-01

Background: Microbial forensics is important in tracking the source of a pathogen, whether the disease is a naturally occurring outbreak or part of a criminal investigation. Results: A method and SPR Opt (SNP and PCR-RFLP Optimization) software to perform a comprehensive, whole-genome analysis to forensically discriminate multiple sequences is presented. Tools for the optimization of forensic typing using Single Nucleotide Polymorphism (SNP) and PCR-Restriction Fragment Length Polymorphism (PCR-RFLP) analyses across multiple isolate sequences of a species are described. The PCR-RFLP analysis includes prediction and selection of optimal primers and restriction enzymes to enable maximum isolate discrimination based on sequence information. SPR Opt calculates all SNP or PCR-RFLP variations present in the sequences, groups them into haplotypes according to their co-segregation across those sequences, and performs combinatoric analyses to determine which sets of haplotypes provide maximal discrimination among all the input sequences. Those set combinations requiring that membership in the fewest haplotypes be queried (i.e. the fewest assays be performed) are found. These analyses highlight variable regions based on existing sequence data. These markers may be heterogeneous among unsequenced isolates as well, and thus may be useful for characterizing the relationships among unsequenced as well as sequenced isolates. The predictions are multi-locus. Analyses of mumps and SARS viruses are summarized. Phylogenetic trees created based on SNPs, PCR-RFLPs, and full genomes are compared for SARS virus, illustrating that purported phylogenies based only on SNP or PCR-RFLP variations do not match those based on multiple sequence alignment of the full genomes. Conclusion: This is the first software to optimize the selection of forensic markers to maximize information gained from the fewest assays, accepting whole or partial genome sequence data as input. 
As more sequence data becomes available for multiple strains and isolates of a species, automated, computational approaches such as those described here will be essential to make sense of large amounts of information, and to guide and optimize efforts in the laboratory. The software and source code for SPR Opt is publicly available and free for non-profit use at . PMID:15904493
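The marker-selection steps described in this abstract (find variable positions, group co-segregating positions into haplotypes, then search for the smallest set of haplotypes that separates the sequences) can be sketched in Python. This is a brute-force illustration under stated assumptions, not the SPR Opt implementation: all function names are hypothetical, the input is assumed to be equal-length aligned sequences, and the exhaustive search is only practical for small inputs.

```python
from itertools import combinations


def variable_columns(seqs):
    """Alignment positions at which the sequences do not all agree."""
    return [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) > 1]


def group_by_pattern(seqs, columns):
    """Group columns that split the sequences identically (co-segregating)."""
    groups = {}
    for col in columns:
        labels = {}  # allele -> index in order of first appearance
        pattern = tuple(labels.setdefault(s[col], len(labels)) for s in seqs)
        groups.setdefault(pattern, []).append(col)
    return groups


def n_distinguished(seqs, patterns):
    """Number of distinct sequence classes given the chosen patterns."""
    return len({tuple(p[i] for p in patterns) for i in range(len(seqs))})


def fewest_assays(seqs):
    """Smallest set of haplotype groups giving maximal discrimination.

    Returns a list of column groups; any one column per group could be assayed.
    """
    groups = group_by_pattern(seqs, variable_columns(seqs))
    patterns = list(groups)
    best_possible = n_distinguished(seqs, patterns)
    for k in range(1, len(patterns) + 1):
        for combo in combinations(patterns, k):
            if n_distinguished(seqs, combo) == best_possible:
                return [groups[p] for p in combo]
    return []  # no variable positions at all
```

For four toy sequences `["AAAAG", "AATAG", "CATAT", "CAATT"]`, positions 0 and 4 co-segregate, and two assays (one from that group plus position 2) are enough to tell all four apart.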
The complete plastome of macaw palm [Acrocomia aculeata (Jacq.) Lodd. ex Mart.] and extensive molecular analyses of the evolution of plastid genes in Arecaceae.

PubMed

de Santana Lopes, Amanda; Gomes Pacheco, Túlio; Nimz, Tabea; do Nascimento Vieira, Leila; Guerra, Miguel P; Nodari, Rubens O; de Souza, Emanuel Maltempi; de Oliveira Pedrosa, Fábio; Rogalski, Marcelo

2018-04-01

The plastome of macaw palm was sequenced, allowing analyses of evolution and molecular markers. Additionally, we demonstrated that more than half of plastid protein-coding genes in Arecaceae underwent positive selection. Macaw palm is a native species from tropical and subtropical Americas. It shows high production of oil per hectare, reaching up to 70% of oil content in fruits, and an interesting plasticity to grow in different ecosystems. Its domestication and breeding are still at an early stage, which makes the development of molecular markers essential to assess natural populations and germplasm collections. Therefore, we sequenced and characterized in detail the plastome of macaw palm. A total of 221 SSR loci were identified in the plastome of macaw palm. Additionally, eight polymorphism hotspots were characterized at the level of subfamily and tribe. Moreover, several events of gain and loss of RNA editing sites were found within the subfamily Arecoideae. Aiming to uncover evolutionary events in Arecaceae, we also analyzed extensively the evolution of plastid genes. The analyses show that highly divergent genes seem to evolve in a species-specific manner, suggesting that gene degeneration events may be occurring within Arecaceae at the level of genus or species. Unexpectedly, we found that more than half of plastid protein-coding genes are under positive selection, including genes for photosynthesis, gene expression machinery and other essential plastid functions. Furthermore, we performed a phylogenomic analysis using whole plastomes of 40 taxa, representing all subfamilies of Arecaceae, which placed the macaw palm within the tribe Cocoseae. Finally, the data shown here are important for genetic studies in macaw palm and provide new insights into the evolution of plastid genes and environmental adaptation in Arecaceae.
MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes

PubMed Central

Verneau, Jonathan; Levasseur, Anthony; Raoult, Didier; La Scola, Bernard; Colson, Philippe

2016-01-01

The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such big data is difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant amount of non-annotated sequences, representing a ‘dark matter.’ We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set and applied this tool to the detection of sequences matching large and giant DNA viral members of the proposed order Megavirales or virophages. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and 5 virophages, were collected. The pipeline was built from scripts written in Python and named MG-Digger. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate hundreds of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome. 
The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about the presence and prevalence of giant viruses in the environment and the human body. PMID:27065984
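The "selected reads with best match identification" step can be illustrated with a short Python sketch that keeps the top-scoring hit per read from BLAST tabular output. This is not MG-Digger's code: the function name `best_hits` and the E-value cutoff are assumptions, and the input is assumed to use the standard 12-column tabular layout (`-outfmt 6`: query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score).

```python
def best_hits(blast_lines, max_evalue=1e-5):
    """Return {read_id: (subject_id, bitscore)} for the best hit of each read.

    Assumes tab-separated BLAST output with the standard 12 columns;
    the E-value cutoff is a placeholder, not a value from the paper.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        read_id, subject_id = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit too weak to annotate the read
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (subject_id, bitscore)
    return best
```

Because the function accepts any iterable of lines, an open file handle for a BLAST result file can be passed to it directly.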
Construction of a dairy microbial genome catalog opens new perspectives for the metagenomic analysis of dairy fermented products.

PubMed

Almeida, Mathieu; Hébert, Agnès; Abraham, Anne-Laure; Rasmussen, Simon; Monnet, Christophe; Pons, Nicolas; Delbès, Céline; Loux, Valentin; Batto, Jean-Michel; Leonard, Pierre; Kennedy, Sean; Ehrlich, Stanislas Dusko; Pop, Mihai; Montel, Marie-Christine; Irlinger, Françoise; Renault, Pierre

2014-12-13

Microbial communities of traditional cheeses are complex and insufficiently characterized. The origin, safety and functional role in cheese making of these microbial communities are still not well understood. Metagenomic analysis of these communities by high throughput shotgun sequencing is a promising approach to characterize their genomic and functional profiles. Such analyses, however, critically depend on the availability of appropriate reference genome databases against which the sequencing reads can be aligned. We built a reference genome catalog suitable for short read metagenomic analysis using a low-cost sequencing strategy. We selected 142 bacteria isolated from dairy products belonging to 137 different species and 67 genera, and succeeded in reconstructing the draft genomes of 117 of them at a standard or high quality level, including isolates from the genera Kluyvera, Luteococcus and Marinilactibacillus, still missing from public databases. To demonstrate the potential of this catalog, we analysed the microbial composition of the surface of two smear cheeses and one blue-veined cheese, and showed that a significant part of the microbiota of these traditional cheeses was composed of microorganisms newly sequenced in our study. Our study provides data that, combined with publicly available genome references, represent the most expansive catalog to date of cheese-associated bacteria. Using this extended dairy catalog, we revealed the presence in traditional cheese of dominant microorganisms not deliberately inoculated, mainly Gram-negative bacteria such as Pseudoalteromonas haloplanktis or Psychrobacter immobilis, that may contribute to the characteristics of cheese produced through traditional methods.
Comparative fine mapping of the Wax 1 (W1) locus in hexaploid wheat.

PubMed

Lu, Ping; Qin, Jinxia; Wang, Guoxin; Wang, Lili; Wang, Zhenzhong; Wu, Qiuhong; Xie, Jingzhong; Liang, Yong; Wang, Yong; Zhang, Deyun; Sun, Qixin; Liu, Zhiyong

2015-08-01

By applying comparative genomics analyses, a high-density genetic linkage map of the Wax 1 (W1) locus was constructed as a framework for map-based cloning. Glaucousness is described as the scattering effect of visible light from wax deposited on the cuticle of plant aerial organs. In wheat, the wax on leaves and stems is mainly controlled by two sets of genes: glaucousness loci (W1 and W2) and non-glaucousness loci (Iw1 and Iw2). Bulked segregant analysis (BSA) and simple sequence repeat (SSR) mapping showed that Wax 1 (W1) is located on chromosome arm 2BS between markers Xgwm210 and Xbarc35. By applying comparative genomics analyses, genomic regions collinear with the W1 locus on wheat 2BS were identified in Brachypodium distachyon chromosome 5, rice chromosome 4 and sorghum chromosome 6. Four STS markers were developed using the Triticum aestivum cv. Chinese Spring 454 contig sequences and the International Wheat Genome Sequencing Consortium (IWGSC) survey sequences. W1 was mapped into a 0.93 cM genetic interval flanked by markers XWGGC3197 and XWGGC2484, which has synteny with genomic regions of 56.5 kb in Brachypodium, 390 kb in rice and 31.8 kb in sorghum. The fine genetic map can serve as a framework for chromosome landing, physical mapping and map-based cloning of W1 in wheat.
A cost-effectiveness analysis of first-line induction and maintenance treatment sequences in patients with advanced nonsquamous non-small-cell lung cancer in France

PubMed Central

Taipale, Kaisa; Winfree, Katherine B; Boye, Mark; Basson, Mickael; Sleilaty, Ghassan; Eaton, James; Evans, Rachel; Chouaid, Christos

2017-01-01

Background: Comparative effectiveness and cost-effectiveness data for induction–maintenance (I–M) sequences for the treatment of patients with nonsquamous non-small-cell lung cancer (nsqNSCLC) are limited because of a lack of direct evidence. This analysis aimed to compare the cost-effectiveness of I–M pemetrexed with that of other I–M regimens used for the treatment of patients with advanced nsqNSCLC in the French health-care setting. Materials and methods: A previously developed global partitioned survival model was adapted to the France-only setting by restricting treatment sequences to include 12 I–M regimens most relevant to France, and incorporating French costs and resource-use data. Following a systematic literature review, network meta-analyses were performed to obtain hazard ratios for progression-free survival (PFS) and overall survival (OS) relative to gemcitabine + cisplatin (induction sequences) or best supportive care (BSC) (maintenance sequences). Modeled health-care benefits were expressed as life-years (LYs) and quality-adjusted LYs (QALYs) (estimated using French EuroQol five-dimension questionnaire tariffs). The study was conducted from the payer perspective (National Health Insurance). Cost- and benefit-model inputs were discounted at an annual rate of 4%. Results: Base-case results showed pemetrexed + cisplatin induction followed by (→) pemetrexed maintenance had the longest mean OS and PFS and highest LYs and QALYs. Costs ranged from €12,762 for paclitaxel + carboplatin → BSC to €35,617 for pemetrexed + cisplatin → pemetrexed (2015 values). Gemcitabine + cisplatin → BSC, pemetrexed + cisplatin → BSC, and pemetrexed + cisplatin → pemetrexed were associated with fully incremental cost-effectiveness ratios (ICERs) of €16,593, €80,656, and €102,179, respectively, per QALY gained versus paclitaxel + carboplatin → BSC. All other treatment sequences were either dominated (ie, another sequence had lower costs and better/equivalent outcomes) or extendedly dominated (ie, the comparator had a higher ICER than a more effective comparator) in the model. Sensitivity analyses showed the model to be relatively insensitive to plausible changes in the main assumptions, with none increasing or decreasing the ICER by more than ~€20,000 per QALY gained. Conclusion: In the absence of direct comparative trial evidence, this cost-effectiveness analysis indicated that of a large number of I–M sequences used for the treatment of patients with nsqNSCLC in France, pemetrexed + cisplatin → pemetrexed achieved the best clinical outcomes (0.28 incremental QALYs gained) versus paclitaxel + carboplatin → BSC. PMID:28860832
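The terms used in this abstract (discounting, ICER, dominated, extendedly dominated) can be made concrete with a small sketch. It is a generic illustration of a fully incremental analysis, not the study's partitioned survival model; the function names, the convention that year 0 is undiscounted, and the four options A to D with their costs and QALYs are hypothetical.

```python
def discounted_total(yearly_values, rate=0.04):
    """Present value of yearly costs or QALYs (assumes year 0 is undiscounted)."""
    return sum(v / (1 + rate) ** t for t, v in enumerate(yearly_values))


def icer(cost_new, qaly_new, cost_ref, qaly_ref):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly <= 0:
        raise ValueError("new option must be more effective than the reference")
    return (cost_new - cost_ref) / delta_qaly


def efficiency_frontier(options):
    """Drop dominated and extendedly dominated options.

    options: list of (name, cost, qaly). Returns the remaining options
    ordered by increasing cost and effectiveness.
    """
    # Dominance: a cheaper (or equally costly) option is at least as effective.
    ranked = sorted(options, key=lambda o: (o[1], -o[2]))
    frontier = []
    for opt in ranked:
        if frontier and opt[2] <= frontier[-1][2]:
            continue
        frontier.append(opt)
    # Extended dominance: ICERs must rise along the frontier; an option whose
    # ICER exceeds that of the next, more effective option is removed.
    changed = True
    while changed:
        changed = False
        for i in range(1, len(frontier) - 1):
            prev, cur, nxt = frontier[i - 1], frontier[i], frontier[i + 1]
            if icer(cur[1], cur[2], prev[1], prev[2]) > icer(nxt[1], nxt[2], cur[1], cur[2]):
                del frontier[i]
                changed = True
                break
    return frontier


# Hypothetical options: (name, total cost in EUR, total QALYs).
options = [("A", 10000, 1.0), ("B", 15000, 1.1), ("C", 16000, 1.3), ("D", 20000, 1.2)]
frontier = efficiency_frontier(options)
print([name for name, _, _ in frontier])  # D is dominated, B extendedly dominated
```

In a fully incremental analysis each ICER on the frontier is computed against the next cheaper non-dominated option, which is why it can differ from a simple pairwise comparison against the cheapest regimen.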
Within-Host Variations of Human Papillomavirus Reveal APOBEC Signature Mutagenesis in the Viral Genome.

PubMed

Hirose, Yusuke; Onuki, Mamiko; Tenjimbayashi, Yuri; Mori, Seiichiro; Ishii, Yoshiyuki; Takeuchi, Takamasa; Tasaka, Nobutaka; Satoh, Toyomi; Morisada, Tohru; Iwata, Takashi; Miyamoto, Shingo; Matsumoto, Koji; Sekizawa, Akihiko; Kukimoto, Iwao

2018-06-15

Persistent infection with oncogenic human papillomaviruses (HPVs) causes cervical cancer, accompanied by the accumulation of somatic mutations in the host genome. There are concomitant genetic changes in the HPV genome during viral infection; however, their relevance to cervical carcinogenesis is poorly understood. Here, we explored within-host genetic diversity of HPV by performing deep-sequencing analyses of viral whole-genome sequences in clinical specimens. The whole genomes of HPV types 16, 52, and 58 were amplified by type-specific PCR from total cellular DNA of cervical exfoliated cells collected from patients with cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC) and were deep sequenced. After constructing a reference viral genome sequence for each specimen, nucleotide positions showing changes at frequencies of >0.5% compared to the reference sequence were determined for individual samples. In total, 1,052 positions of nucleotide variations were detected in HPV genomes from 151 samples (CIN1, n = 56; CIN2/3, n = 68; ICC, n = 27), with various numbers per sample. Overall, C-to-T and C-to-A substitutions were the dominant changes observed across all histological grades. While C-to-T transitions were predominantly detected in CIN1, their prevalence was decreased in CIN2/3 and fell below that of C-to-A transversions in ICC. Analysis of the trinucleotide context encompassing substituted bases revealed that TpCpN, a preferred target sequence for cellular APOBEC cytosine deaminases, was a primary site for C-to-T substitutions in the HPV genome. These results strongly imply that the APOBEC proteins are drivers of HPV genome mutation, particularly in CIN1 lesions. IMPORTANCE: HPVs exhibit surprisingly high levels of genetic diversity, including a large repertoire of minor genomic variants in each viral genotype. Here, by conducting deep-sequencing analyses, we show for the first time a comprehensive snapshot of the within-host genetic diversity of high-risk HPVs during cervical carcinogenesis. Quasispecies harboring minor nucleotide variations in viral whole-genome sequences were extensively observed across different grades of CIN and cervical cancer. Among the within-host variations, C-to-T transitions, a characteristic change mediated by cellular APOBEC cytosine deaminases, were predominantly detected throughout the whole viral genome, most strikingly in low-grade CIN lesions. The results strongly suggest that within-host variations of the HPV genome are primarily generated through the interaction with host cell DNA-editing enzymes and that such within-host variability is an evolutionary source of the genetic diversity of HPVs. Copyright © 2018 American Society for Microbiology.
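The trinucleotide-context analysis mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it treats the reference as a single linear strand (ignoring that the HPV genome is circular and that G-to-A changes are C-to-T on the opposite strand), and the function names, toy reference and variant list are hypothetical.

```python
from collections import Counter


def c_to_t_contexts(reference, variants):
    """Count trinucleotide contexts (5' base, C, 3' base) of C-to-T changes.

    reference: reference sequence for one specimen (single strand, 5'->3').
    variants: iterable of (position, ref_base, alt_base), 0-based positions.
    """
    reference = reference.upper()
    counts = Counter()
    for pos, ref_base, alt_base in variants:
        if (ref_base, alt_base) != ("C", "T"):
            continue  # only C-to-T substitutions are tallied here
        if pos == 0 or pos >= len(reference) - 1:
            continue  # no complete trinucleotide at the sequence ends
        if reference[pos] != "C":
            raise ValueError(f"reference base at {pos} is not C")
        counts[reference[pos - 1:pos + 2]] += 1
    return counts


def tcn_fraction(counts):
    """Fraction of counted C-to-T changes that sit in a TpCpN context."""
    total = sum(counts.values())
    tcn = sum(n for context, n in counts.items() if context.startswith("TC"))
    return tcn / total if total else 0.0


# Toy data: three C-to-T changes and one C-to-A change.
reference = "ATCAGCCTCGA"
variants = [(2, "C", "T"), (6, "C", "T"), (8, "C", "T"), (5, "C", "A")]
contexts = c_to_t_contexts(reference, variants)
print(dict(contexts), tcn_fraction(contexts))
```

A real analysis would also compare the observed TpCpN fraction with the fraction expected from the genome's own trinucleotide composition before attributing it to APOBEC activity.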
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns.

PubMed

Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie

2011-09-12

Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.
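The idea of counting motifs shared by pairs of gene promoters can be illustrated with a deliberately simplified sketch. The abstract defines SSM by length and similarity; the code below assumes exact matches of a fixed length k, which is a simplification and not the published algorithm. The function names, the gene labels and the toy sequences are hypothetical.

```python
from itertools import combinations


def kmers(seq, k):
    """Set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_motif_count(promoter_a, promoter_b, k=8):
    """Number of distinct length-k motifs present in both promoters."""
    return len(kmers(promoter_a, k) & kmers(promoter_b, k))


def pairwise_scores(promoters, k=8):
    """Shared-motif count for every pair of genes in {gene: promoter}."""
    return {
        (a, b): shared_motif_count(promoters[a], promoters[b], k)
        for a, b in combinations(sorted(promoters), 2)
    }


# Toy promoter fragments for three hypothetical genes.
promoters = {
    "geneA": "ACGTACGTTTGACA",
    "geneB": "GGGACGTACGTCCC",
    "geneC": "TTTTTTTTTTTTTT",
}
scores = pairwise_scores(promoters, k=6)
print(scores)
```

Deciding whether a pair's count is exceptional would additionally require a background distribution, for example counts obtained from shuffled or unrelated promoter pairs.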
Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

PubMed Central

2011-01-01

Background: Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results: Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions: Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886
Evolutionary dynamics of Hepatitis C virus in a chronic HIV co-infected patient and its correlation with the immune status.

PubMed

Culasso, Andrés Carlos Alberto; Monzani, María Cecilia; Baré, Patricia; Campos, Rodolfo Hector

2018-05-04

The HCV evolutionary dynamics play a key role in the infection onset, maintenance of chronicity, pathogenicity, and fixation of drug-resistance variants, and are thought to be one of the main obstacles to the development of an effective vaccine. Previous studies in HCV/HIV co-infected patients suggest that a decline in the immune status is related to increases in the HCV intra-host genetic diversity. However, these findings are based on single point sequence diversity measures or coalescence analyses in several virus-host interactions. In this work, we describe the molecular evolution of the HCV-E2 region in a single HIV-co-infected patient with two clearly defined immune conditions. The phylogenetic analysis of the HCV-1a sequences from the studied patient showed that he was co-infected with three different viral lineages. These lineages were not evenly detected over time. The sequence diversity and coalescence analyses of these lineages suggested the action of different evolutionary patterns in different immune conditions: a slow rate, drift-like process in an immunocompromised condition (low levels of CD4+ T lymphocytes); and a fast rate, variant-switch process in an immunocompetent condition (high levels of CD4+ T lymphocytes). Copyright © 2017. Published by Elsevier B.V.
Comparative Analyses of DNA Methylation and Sequence Evolution Using Nasonia Genomes

PubMed Central

Park, Jungsun; Peng, Zuogang; Zeng, Jia; Elango, Navin; Park, Taesung; Wheeler, Dave; Werren, John H.; Yi, Soojin V.

2011-01-01

The functional and evolutionary significance of DNA methylation in insect genomes remains to be resolved. Nasonia is well situated for comparative analyses of DNA methylation and genome evolution, since the genomes of a moderately distant outgroup species as well as closely related sibling species are available. Using direct sequencing of bisulfite-converted DNA, we uncovered a substantial level of DNA methylation in 17 of 18 Nasonia vitripennis genes and a strong correlation between methylation level and CpG depletion. Notably, in the sex-determining locus transformer, the exon that is alternatively spliced between the sexes is heavily methylated in both males and females, whereas other exons are only sparsely methylated. Orthologous genes of the honeybee and Nasonia show highly similar relative levels of CpG depletion, despite ∼190 My divergence. Densely and sparsely methylated genes in these species also exhibit similar functional enrichments. We found that the degree of CpG depletion is negatively correlated with substitution rates between closely related Nasonia species for synonymous, nonsynonymous, and intron sites. This suggests that mutation rates increase with decreasing levels of germ line methylation. Thus, DNA methylation is prevalent in the Nasonia genome, may participate in regulatory processes such as sex determination and alternative splicing, and is correlated with several aspects of genome and sequence evolution. PMID:21693438
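CpG depletion is commonly summarized as the ratio of observed to expected CpG dinucleotides (CpG O/E), where values well below 1 indicate depletion. The sketch below uses one common normalization; the abstract does not state which variant the authors used, so the formula and the function name are assumptions.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Returns None when the sequence contains no C or no G, since the
    expectation is then undefined.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return None
    cpg_count = seq.count("CG")  # CG cannot overlap itself, so count() is exact
    return cpg_count * len(seq) / (c_count * g_count)


for name, seq in [("cpg_rich", "ACGCGT"), ("cpg_poor", "CCATGG"), ("no_c_or_g", "ATAT")]:
    print(name, cpg_obs_exp(seq))
```

Because methylated cytosines in CpG context deaminate to thymine more readily, genes methylated in the germ line tend to lose CpGs over evolutionary time, which is why a low ratio is read as a signature of historical methylation.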
Molecular and Immunological Characterization of Ragweed (Ambrosia artemisiifolia L.) Pollen after Exposure of the Plants to Elevated Ozone over a Whole Growing Season

PubMed Central

Kanter, Ulrike; Heller, Werner; Durner, Jörg; Winkler, J. Barbro; Engel, Marion; Behrendt, Heidrun; Holzinger, Andreas; Braun, Paula; Hauser, Michael; Ferreira, Fatima; Mayer, Klaus; Pfeifer, Matthias; Ernst, Dieter

2013-01-01

Climate change and air pollution, including ozone, are known to affect plants and might also influence ragweed pollen, which is known to carry strong allergens. We compared the transcriptome of ragweed pollen produced under ambient and elevated ozone by 454-sequencing. An enzyme-linked immunosorbent assay (ELISA) was carried out for the major ragweed allergen Amb a 1. Pollen surface was examined by scanning electron microscopy and attenuated total reflectance–Fourier transform infrared spectroscopy (ATR-FTIR), and phenolics were analysed by high-performance liquid chromatography. Elevated ozone had no influence on the pollen size, shape, surface structure or amount of phenolics. ATR-FTIR indicated increased pectin-like material in the exine. Transcriptomic analyses showed changes in expressed-sequence tags (ESTs), including allergens. However, ELISA indicated no significantly increased amounts of Amb a 1 under elevated ozone concentrations. The data highlight a direct influence of ozone on the exine components and transcript level of allergens. As the total protein amount of Amb a 1 was not altered, a direct correlation to an increased risk to human health could not be derived. Additionally, the 454-sequencing contributes to the identification of stress-related transcripts in mature pollen that could be grouped into distinct gene ontology terms. PMID:23637846
The complete mitochondrial genome of the dwarf tapeworm Hymenolepis nana--a neglected zoonotic helminth.

PubMed

Cheng, Tian; Liu, Guo-Hua; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan

2016-03-01

Hymenolepis nana, commonly known as the dwarf tapeworm, is one of the most common tapeworms of humans and rodents and can cause hymenolepiasis. Although this zoonotic tapeworm is of socio-economic significance in many countries of the world, its genetics, systematics, epidemiology, and biology are poorly understood. In the present study, we sequenced and characterized the complete mitochondrial (mt) genome of H. nana. The mt genome is 13,764 bp in size and encodes 36 genes, including 12 protein-coding genes, 2 ribosomal RNA, and 22 transfer RNA genes. All genes are transcribed in the same direction. The gene order and genome content are identical to those of its congener Hymenolepis diminuta. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes by Bayesian inference, Maximum likelihood, and Maximum parsimony showed the division of class Cestoda into two orders and supported the monophyly of both orders, Cyclophyllidea and Pseudophyllidea. Analyses of mt genome sequences also support the monophyly of the three families Taeniidae, Hymenolepididae, and Diphyllobothriidae. This novel mt genome provides a useful genetic marker for studying the molecular epidemiology, systematics, and population genetics of the dwarf tapeworm and should have implications for the diagnosis, prevention, and control of hymenolepiasis in humans.
Unveiling the metabolic potential of two soil-derived microbial consortia selected on wheat straw

PubMed Central

Jiménez, Diego Javier; Chaves-Moreno, Diego; van Elsas, Jan Dirk

2015-01-01

Based on the premise that plant biomass can be efficiently degraded by mixed microbial cultures and/or enzymes, we here applied a targeted metagenomics-based approach to explore the metabolic potential of two forest soil-derived lignocellulolytic microbial consortia, denoted RWS and TWS (bred on wheat straw). Using the metagenomes of three selected batches of two experimental systems, about 1.2 Gb of sequence was generated. Comparative analyses revealed an overrepresentation of predicted carbohydrate transporters (ABC, TonB and phosphotransferases), two-component sensing systems and β-glucosidases/galactosidases in the two consortia as compared to the forest soil inoculum. Additionally, “profiling” of carbohydrate-active enzymes showed significant enrichments of several genes encoding glycosyl hydrolases of families GH2, GH43, GH92 and GH95. Sequence analyses revealed these to be most strongly affiliated to genes present on the genomes of Sphingobacterium, Bacteroides, Flavobacterium and Pedobacter spp. Assembly of the RWS and TWS metagenomes generated 16,536 and 15,902 contigs of ≥10 kb, respectively. Thirteen contigs, containing 39 glycosyl hydrolase genes, constitute novel (hemi)cellulose utilization loci with affiliation to sequences primarily found in the Bacteroidetes. Overall, this study provides deep insight into the plant polysaccharide degrading capabilities of microbial consortia bred from forest soil, highlighting their biotechnological potential. PMID:26343383
Telomerecat: A ploidy-agnostic method for estimating telomere length from whole genome sequencing data.

PubMed

Farmery, James H R; Smith, Mike L; Lynch, Andy G

2018-01-22

Telomere length is a risk factor in disease and the dynamics of telomere length are crucial to our understanding of cell replication and vitality. The proliferation of whole genome sequencing represents an unprecedented opportunity to glean new insights into telomere biology on a previously unimaginable scale. To this end, a number of approaches for estimating telomere length from whole-genome sequencing data have been proposed. Here we present Telomerecat, a novel approach to the estimation of telomere length. Previous methods have been dependent on the number of telomeres present in a cell being known, which may be problematic when analysing aneuploid cancer data and non-human samples. Telomerecat is designed to be agnostic to the number of telomeres present, making it suited for the purpose of estimating telomere length in cancer studies. Telomerecat also accounts for interstitial telomeric reads and presents a novel approach to dealing with sequencing errors. We show that Telomerecat performs well at telomere length estimation when compared to leading experimental and computational methods. Furthermore, we show that it detects expected patterns in longitudinal data, repeated measurements, and cross-species comparisons. We also apply the method to cancer cell data, uncovering an interesting relationship with the underlying telomerase genotype.
DNA isolation protocol effects on nuclear DNA analysis by microarrays, droplet digital PCR, and whole genome sequencing, and on mitochondrial DNA copy number estimation.

PubMed

Nacheva, Elizabeth; Mokretar, Katya; Soenmez, Aynur; Pittman, Alan M; Grace, Colin; Valli, Roberto; Ejaz, Ayesha; Vattathil, Selina; Maserati, Emanuela; Houlden, Henry; Taanman, Jan-Willem; Schapira, Anthony H; Proukakis, Christos

2017-01-01

Potential bias introduced during DNA isolation is inadequately explored, although it could have significant impact on downstream analysis. To investigate this in human brain, we isolated DNA from cerebellum and frontal cortex using spin columns under different conditions, and salting-out. We first analysed DNA using array CGH, which revealed a striking wave pattern suggesting primarily GC-rich cerebellar losses, even against matched frontal cortex DNA, with a similar pattern on a SNP array. The aCGH changes varied with the isolation protocol. Droplet digital PCR of two genes also showed protocol-dependent losses. Whole genome sequencing showed GC-dependent variation in coverage with spin column isolation from cerebellum. We also extracted and sequenced DNA from substantia nigra using salting-out and phenol / chloroform. The mtDNA copy number, assessed by reads mapping to the mitochondrial genome, was higher in substantia nigra when using phenol / chloroform. We thus provide evidence for significant method-dependent bias in DNA isolation from human brain, as reported in rat tissues. This may contribute to array "waves", and could affect copy number determination, particularly if mosaicism is being sought, and sequencing coverage. Variations in isolation protocol may also affect apparent mtDNA abundance.
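The abstract states that mtDNA copy number was assessed from reads mapping to the mitochondrial genome but does not give the normalization. A widely used estimate, shown here purely as an assumption, scales the ratio of mitochondrial to autosomal sequencing depth by the nuclear ploidy. The function name and the depth values are hypothetical.

```python
def mtdna_copies_per_cell(mt_mean_depth, autosomal_mean_depth, ploidy=2):
    """Estimate mtDNA copies per cell from whole-genome sequencing depth.

    Assumes autosomal DNA is present at `ploidy` copies per cell, so each
    unit of autosomal depth corresponds to 1/ploidy of a cell's genome.
    """
    if autosomal_mean_depth <= 0:
        raise ValueError("autosomal depth must be positive")
    return ploidy * mt_mean_depth / autosomal_mean_depth


# Hypothetical depths: 3000x over the mitochondrial genome, 30x over autosomes.
print(mtdna_copies_per_cell(3000, 30))  # → 200.0
```

Because the estimate is a ratio of depths, any isolation method that recovers mitochondrial and nuclear DNA with different efficiency shifts the result, which is the kind of protocol dependence the abstract reports.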
DNA isolation protocol effects on nuclear DNA analysis by microarrays, droplet digital PCR, and whole genome sequencing, and on mitochondrial DNA copy number estimation

PubMed Central

Nacheva, Elizabeth; Mokretar, Katya; Soenmez, Aynur; Pittman, Alan M.; Grace, Colin; Valli, Roberto; Ejaz, Ayesha; Vattathil, Selina; Maserati, Emanuela; Houlden, Henry; Taanman, Jan-Willem; Schapira, Anthony H.

2017-01-01

Potential bias introduced during DNA isolation is inadequately explored, although it could have significant impact on downstream analysis. To investigate this in human brain, we isolated DNA from cerebellum and frontal cortex using spin columns under different conditions, and salting-out. We first analysed DNA using array CGH, which revealed a striking wave pattern suggesting primarily GC-rich cerebellar losses, even against matched frontal cortex DNA, with a similar pattern on a SNP array. The aCGH changes varied with the isolation protocol. Droplet digital PCR of two genes also showed protocol-dependent losses. Whole genome sequencing showed GC-dependent variation in coverage with spin column isolation from cerebellum. We also extracted and sequenced DNA from substantia nigra using salting-out and phenol / chloroform. The mtDNA copy number, assessed by reads mapping to the mitochondrial genome, was higher in substantia nigra when using phenol / chloroform. We thus provide evidence for significant method-dependent bias in DNA isolation from human brain, as reported in rat tissues. This may contribute to array “waves”, and could affect copy number determination, particularly if mosaicism is being sought, and sequencing coverage. Variations in isolation protocol may also affect apparent mtDNA abundance. PMID:28683077
Genetic diversity of mtDNA D-loop sequences in four native Chinese chicken breeds.

PubMed

Guo, H W; Li, C; Wang, X N; Li, Z J; Sun, G R; Li, G X; Liu, X J; Kang, X T; Han, R L

2017-10-01

1. To explore the genetic diversity of Chinese indigenous chicken breeds, a 585 bp fragment of the mitochondrial DNA (mtDNA) D-loop region was sequenced in 102 birds from the Xichuan black-bone chicken, Yunyang black-bone chicken and Lushi chicken breeds. In addition, 30 mtDNA D-loop sequences of Silkie fowls were downloaded from NCBI. The mtDNA D-loop sequence polymorphism and maternal origin of 4 chicken breeds were analysed in this study. 2. The results showed that a total of 33 mutation sites and 28 haplotypes were detected in the 4 chicken breeds. The haplotype diversity and nucleotide diversity of these 4 native breeds were 0.916 ± 0.014 and 0.012 ± 0.002, respectively. Three clusters were formed in 4 Chinese native chickens and 12 reference breeds. Both the Xichuan black-bone chicken and Yunyang black-bone chicken were grouped into one cluster. Four haplogroups (A, B, C and E) emerged in the median-joining network in these breeds. 3. It was concluded that these 4 Chinese chicken breeds had high genetic diversity. The phylogenetic tree and median-joining network profiles showed that native chickens from China and its neighbouring countries had at least two maternal origins, one from Yunnan, China and another from Southeast Asia or its surrounding area.
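Haplotype diversity and nucleotide diversity, the two statistics reported above, can be computed from aligned sequences as sketched below. The sketch uses the standard Nei estimators (haplotype diversity with the n/(n-1) correction, nucleotide diversity as mean pairwise differences per site); the abstract does not name the software or estimator used, so this is an assumption. The function names and the four toy sequences are hypothetical and are not the study's data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    differences = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return differences / (len(pairs) * length)


# Toy alignment: four sequences, three distinct haplotypes.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

Alignment gaps and ambiguous bases are compared as ordinary characters here; population-genetics packages usually exclude such sites before counting differences.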
