consensus sequence resulted: Topics by Science.gov

Sample records for consensus sequence resulted

To Clone or Not To Clone: Method Analysis for Retrieving Consensus Sequences In Ancient DNA Samples

PubMed Central

Winters, Misa; Barta, Jodi Lynn; Monroe, Cara; Kemp, Brian M.

2011-01-01

The challenges associated with the retrieval and authentication of ancient DNA (aDNA) evidence are principally due to post-mortem damage which makes ancient samples particularly prone to contamination from “modern” DNA sources. The necessity for authentication of results has led many aDNA researchers to adopt methods considered to be “gold standards” in the field, including cloning aDNA amplicons as opposed to directly sequencing them. However, no standardized protocol has emerged regarding the necessary number of clones to sequence, how a consensus sequence is most appropriately derived, or how results should be reported in the literature. In addition, there has been no systematic demonstration of the degree to which direct sequences are affected by damage or whether direct sequencing would provide disparate results from a consensus of clones. To address this issue, a comparative study was designed to examine both cloned and direct sequences amplified from ∼3,500 year-old ancient northern fur seal DNA extracts. Majority rules and the Consensus Confidence Program were used to generate consensus sequences for each individual from the cloned sequences, which exhibited damage at 31 of 139 base pairs across all clones. In no instance did the consensus of clones differ from the direct sequence. This study demonstrates that, when appropriate, cloning need not be the default method, but instead, should be used as a measure of authentication on a case-by-case basis, especially when this practice adds time and cost to studies where it may be superfluous. PMID:21738625
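The majority-rule step described in this abstract can be sketched in a few lines. This is a minimal illustration only, assuming the clones are already aligned to equal length; the function name `majority_consensus`, the tie handling, and the example sequences are hypothetical, and the Consensus Confidence Program is not reproduced here.

```python
from collections import Counter

def majority_consensus(clones, ambiguous="N"):
    """Return the most frequent base per column of pre-aligned clone sequences."""
    if not clones:
        raise ValueError("need at least one clone sequence")
    length = len(clones[0])
    if any(len(c) != length for c in clones):
        raise ValueError("clones must be aligned to the same length")
    consensus = []
    for column in zip(*clones):
        counts = Counter(column).most_common()
        # A tie for first place has no majority, so mark the position as ambiguous.
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            consensus.append(ambiguous)
        else:
            consensus.append(counts[0][0])
    return "".join(consensus)

# Hypothetical clones: two carry isolated C->T changes of the kind damage can produce.
clones = ["ACGTCA", "ACGTTA", "ATGTCA", "ACGTCA"]
direct_sequence = "ACGTCA"
print(majority_consensus(clones) == direct_sequence)
```

Because each damaged position appears in only one clone, the per-column vote recovers the same sequence as the hypothetical direct read, which is the kind of agreement the study reports.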
Embedding strategies for effective use of information from multiple sequence alignments.

PubMed Central

Henikoff, S.; Henikoff, J. G.

1997-01-01

We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452
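As a rough illustration of what a position-specific scoring matrix is, the sketch below builds log-odds scores from a small aligned block and scores a candidate segment. It is not the authors' embedding procedure: the uniform background frequencies, the pseudocount of 1, and the names `build_pssm` and `score` are all assumptions made for this example.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(block, pseudocount=1.0):
    """Build per-column log2 odds scores from an ungapped aligned block."""
    background = 1.0 / len(ALPHABET)  # assumption: uniform background frequencies
    pssm = []
    for column in zip(*block):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return pssm

def score(pssm, segment):
    """Sum the column scores for a segment of the same length as the block."""
    return sum(col[aa] for col, aa in zip(pssm, segment))

# Hypothetical conserved block from four family members.
block = ["GDSGG", "GDSGG", "GDAGG", "GDSGS"]
pssm = build_pssm(block)
print(score(pssm, "GDSGG") > score(pssm, "AAAAA"))
```

A segment resembling the family scores higher than an unrelated one, which is the signal a PSSM-embedded query carries in its conserved regions.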
Comparative analysis of seven viral nuclear export signals (NESs) reveals the crucial role of nuclear export mediated by the third NES consensus sequence of nucleoprotein (NP) in influenza A virus replication.

PubMed

Chutiwitoonchai, Nopporn; Kakisaka, Michinori; Yamada, Kazunori; Aida, Yoko

2014-01-01

The assembly of influenza virus progeny virions requires machinery that exports viral genomic ribonucleoproteins from the cell nucleus. Currently, seven nuclear export signal (NES) consensus sequences have been identified in different viral proteins, including NS1, NS2, M1, and NP. The present study examined the roles of viral NES consensus sequences and their significance in terms of viral replication and nuclear export. Mutation of the NP-NES3 consensus sequence resulted in a failure to rescue viruses using a reverse genetics approach, whereas mutation of the NS2-NES1 and NS2-NES2 sequences led to a strong reduction in viral replication kinetics compared with the wild-type sequence. While the viral replication kinetics for other NES mutant viruses were also lower than those of the wild-type, the difference was not so marked. Immunofluorescence analysis after transient expression of NP-NES3, NS2-NES1, or NS2-NES2 proteins in host cells showed that they accumulated in the cell nucleus. These results suggest that the NP-NES3 consensus sequence is mostly required for viral replication. Therefore, each of the hydrophobic (Φ) residues within this NES consensus sequence (Φ1, Φ2, Φ3, or Φ4) was mutated, and its viral replication and nuclear export function were analyzed. No viruses harboring NP-NES3 Φ2 or Φ3 mutants could be rescued. Consistent with this, the NP-NES3 Φ2 and Φ3 mutants showed reduced binding affinity with CRM1 in a pull-down assay, and both accumulated in the cell nucleus. Indeed, a nuclear export assay revealed that these mutant proteins showed lower nuclear export activity than the wild-type protein. Moreover, the Φ2 and Φ3 residues (along with other Φ residues) within the NP-NES3 consensus were highly conserved among different influenza A viruses, including human, avian, and swine. Taken together, these results suggest that the Φ2 and Φ3 residues within the NP-NES3 protein are important for its nuclear export function during viral replication.
Multiple splicing defects in an intronic false exon.

PubMed

Sun, H; Chasin, L A

2000-09-01

Splice site consensus sequences alone are insufficient to dictate the recognition of real constitutive splice sites within the typically large transcripts of higher eukaryotes, and large numbers of pseudoexons flanked by pseudosplice sites with good matches to the consensus sequences can be easily designated. In an attempt to identify elements that prevent pseudoexon splicing, we have systematically altered known splicing signals, as well as immediately adjacent flanking sequences, of an arbitrarily chosen pseudoexon from intron 1 of the human hprt gene. The substitution of a 5' splice site that perfectly matches the 5' consensus combined with mutation to match the CAG/G sequence of the 3' consensus failed to get this model pseudoexon included as the central exon in a dhfr minigene context. Provision of a real 3' splice site and a consensus 5' splice site and removal of an upstream inhibitory sequence were necessary and sufficient to confer splicing on the pseudoexon. This activated context also supported the splicing of a second pseudoexon sequence containing no apparent enhancer. Thus, both the 5' splice site sequence and the polypyrimidine tract of the pseudoexon are defective despite their good agreement with the consensus. On the other hand, the pseudoexon body did not exert a negative influence on splicing. The introduction into the pseudoexon of a sequence selected for binding to ASF/SF2 or its replacement with beta-globin exon 2 only partially reversed the effect of the upstream negative element and the defective polypyrimidine tract. These results support the idea that exon-bridging enhancers are not a prerequisite for constitutive exon definition and suggest that intrinsically defective splice sites and negative elements play important roles in distinguishing the real splicing signal from the vast number of false splicing signals.
Intercalation of XR5944 with the estrogen response element is modulated by the tri-nucleotide spacer sequence between half-sites

PubMed Central

Sidell, Neil; Mathad, Raveendra I.; Shu, Feng-jue; Zhang, Zhenjiang; Kallen, Caleb B.; Yang, Danzhou

2011-01-01

DNA-intercalating molecules can impair DNA replication, DNA repair, and gene transcription. We previously demonstrated that XR5944, a DNA bis-intercalator, specifically blocks binding of estrogen receptor-α (ERα) to the consensus estrogen response element (ERE). The consensus ERE sequence is AGGTCAnnnTGACCT, where nnn is known as the tri-nucleotide spacer. Recent work has shown that the tri-nucleotide spacer can modulate ERα-ERE binding affinity and ligand-mediated transcriptional responses. To further understand the mechanism by which XR5944 inhibits ERα-ERE binding, we tested its ability to interact with consensus EREs with variable tri-nucleotide spacer sequences and with natural but non-consensus ERE sequences using one dimensional nuclear magnetic resonance (1D 1H NMR) titration studies. We found that the tri-nucleotide spacer sequence significantly modulates the binding of XR5944 to EREs. Of the sequences that were tested, EREs with CGG and AGG spacers showed the best binding specificity with XR5944, while those spaced with TTT demonstrated the least specific binding. The binding stoichiometry of XR5944 with EREs was 2:1, which can explain why the spacer influences the drug-DNA interaction; each XR5944 spans four nucleotides (including portions of the spacer) when intercalating with DNA. To validate our NMR results, we conducted functional studies using reporter constructs containing consensus EREs with tri-nucleotide spacers CGG, CTG, and TTT. Results of reporter assays in MCF-7 cells indicated that XR5944 was significantly more potent in inhibiting the activity of CGG- than TTT-spaced EREs, consistent with our NMR results. Taken together, these findings predict that the anti-estrogenic effects of XR5944 will depend not only on ERE half-site composition but also on the tri-nucleotide spacer sequence of EREs located in the promoters of estrogen-responsive genes. PMID:21333738
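The consensus ERE pattern quoted in this abstract, AGGTCAnnnTGACCT, maps directly onto a regular expression. The sketch below is illustrative only; the function name `find_eres` and the test sequence are hypothetical, and it reports only perfect half-sites on the given strand.

```python
import re

# Two perfect half-sites separated by any three-nucleotide spacer.
ERE = re.compile(r"AGGTCA([ACGT]{3})TGACCT")

def find_eres(seq):
    """Return (start, spacer) for each consensus ERE found in seq."""
    return [(m.start(), m.group(1)) for m in ERE.finditer(seq.upper())]

# Hypothetical promoter fragment with a CGG-spaced and a TTT-spaced ERE.
promoter = "ttAGGTCAcggTGACCTaaAGGTCATTTTGACCTgg"
print(find_eres(promoter))
```

Capturing the spacer separately makes it easy to group sites by spacer sequence, the variable the study found to modulate XR5944 binding.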
Consensus generation and variant detection by Celera Assembler.

PubMed

Denisov, Gennady; Walenz, Brian; Halpern, Aaron L; Miller, Jason; Axelrod, Nelson; Levy, Samuel; Sutton, Granger

2008-04-15

We present an algorithm to identify allelic variation given a Whole Genome Shotgun (WGS) assembly of haploid sequences, and to produce a set of haploid consensus sequences rather than a single consensus sequence. Existing WGS assemblers take a column-by-column approach to consensus generation, and produce a single consensus sequence which can be inconsistent with the underlying haploid alleles, and inconsistent with any of the aligned sequence reads. Our new algorithm uses a dynamic windowing approach. It detects alleles by simultaneously processing the portions of aligned reads spanning a region of sequence variation, assigns reads to their respective alleles, phases adjacent variant alleles and generates a consensus sequence corresponding to each confirmed allele. This algorithm was used to produce the first diploid genome sequence of an individual human. It can also be applied to assemblies of multiple diploid individuals and hybrid assemblies of multiple haploid organisms. When applied to the individual human genome assembly, the new algorithm detects exactly two confirmed alleles and reports two consensus sequences in 98.98% of the 2,033,311 detected regions of sequence variation. In 33,269 of the 460,373 detected regions larger than 1 bp, it fixes the errors of a mosaic haploid representation of a diploid locus as produced by the original Celera Assembler consensus algorithm. Using an optimized procedure calibrated against 1,506,344 known SNPs, it detects 438,814 new heterozygous SNPs with a false positive rate of 12%. The open source code is available at: http://wgs-assembler.cvs.sourceforge.net/wgs-assembler/
Site directed recombination

DOEpatents

Jurka, Jerzy W.

1997-01-01

Enhanced homologous recombination is obtained by employing a consensus sequence which has been found to be associated with integration of repeat sequences, such as Alu and ID. The consensus sequence or sequence having a single transition mutation determines one site of a double break which allows for high efficiency of integration at the site. By introducing single or double stranded DNA having the consensus sequence flanking region joined to a sequence of interest, one can reproducibly direct integration of the sequence of interest at one or a limited number of sites. In this way, specific sites can be identified and homologous recombination achieved at the site by employing a second flanking sequence associated with a sequence proximal to the 3'-nick.
Using a color-coded ambigraphic nucleic acid notation to visualize conserved palindromic motifs within and across genomes

PubMed Central

2014-01-01

Background: Ambiscript is a graphically-designed nucleic acid notation that uses symbol symmetries to support sequence complementation, highlight biologically-relevant palindromes, and facilitate the analysis of consensus sequences. Although the original Ambiscript notation was designed to easily represent consensus sequences for multiple sequence alignments, the notation’s black-on-white ambiguity characters are unable to reflect the statistical distribution of nucleotides found at each position. We now propose a color-augmented ambigraphic notation to encode the frequency of positional polymorphisms in these consensus sequences. Results: We have implemented this color-coding approach by creating an Adobe Flash® application (http://www.ambiscript.org) that shades and colors modified Ambiscript characters according to the prevalence of the encoded nucleotide at each position in the alignment. The resulting graphic helps viewers perceive biologically-relevant patterns in multiple sequence alignments by uniquely combining color, shading, and character symmetries to highlight palindromes and inverted repeats in conserved DNA motifs. Conclusion: Juxtaposing an intuitive color scheme over the deliberate character symmetries of an ambigraphic nucleic acid notation yields a highly-functional nucleic acid notation that maximizes information content and successfully embodies key principles of graphic excellence put forth by the statistician and graphic design theorist, Edward Tufte. PMID:24447494
Molecular phylogeny of 21 tropical bamboo species reconstructed by integrating non-coding internal transcribed spacer (ITS1 and 2) sequences and their consensus secondary structure.

PubMed

Ghosh, Jayadri Sekhar; Bhattacharya, Samik; Pal, Amita

2017-06-01

The unavailability of reproductive structures and the unpredictability of vegetative characters for the identification and phylogenetic study of bamboo prompted the application of molecular techniques for greater resolution and consensus. We first employed internal transcribed spacer (ITS1, 5.8S rRNA and ITS2) sequences to construct the phylogenetic tree of 21 tropical bamboo species. While the sequence alone could broadly reconstruct the traditional phylogeny among the 21 tropical species studied, some anomalies were encountered that prompted a further refinement of the phylogenetic analyses. Therefore, we integrated the secondary structure of the ITS sequences to derive an individual sequence-structure matrix and gain more resolution in the phylogenetic reconstruction. The results showed that ITS sequence-structure is a reliable alternative to the conventional phenotypic method for the identification of bamboo species. The best-fit topology obtained by the sequence-structure based phylogeny, compared with the sequence-only one, shows closer clustering of all the studied Bambusa species (Sub-tribe Bambusinae), while Melocanna baccifera, which belongs to Sub-tribe Melocanneae, clustered separately as an out-group within the consensus phylogenetic tree. In this study, we demonstrated the dependability of the combined (ITS sequence+structure-based) approach over sequence-only analysis for assessing phylogenetic relationships in bamboo.
Accurate RNA consensus sequencing for high-fidelity detection of transcriptional mutagenesis-induced epimutations.

PubMed

Reid-Bayliss, Kate S; Loeb, Lawrence A

2017-08-29

Transcriptional mutagenesis (TM) due to misincorporation during RNA transcription can result in mutant RNAs, or epimutations, that generate proteins with altered properties. TM has long been hypothesized to play a role in aging, cancer, and viral and bacterial evolution. However, inadequate methodologies have limited progress in elucidating a causal association. We present a high-throughput, highly accurate RNA sequencing method to measure epimutations with single-molecule sensitivity. Accurate RNA consensus sequencing (ARC-seq) uniquely combines RNA barcoding and generation of multiple cDNA copies per RNA molecule to eliminate errors introduced during cDNA synthesis, PCR, and sequencing. The stringency of ARC-seq can be scaled to accommodate the quality of input RNAs. We apply ARC-seq to directly assess transcriptome-wide epimutations resulting from RNA polymerase mutants and oxidative stress.
Sequence polymorphism in an insect RNA virus field population: A snapshot from a single point in space and time reveals stochastic differences among and within individual hosts

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stenger, Drake C.

Population structure of Homalodisca coagulata Virus-1 (HoCV-1) among and within field-collected insects sampled from a single point in space and time was examined. Polymorphism in complete consensus sequences among single-insect isolates was dominated by synonymous substitutions. The mutant spectrum of the C2 helicase region within each single-insect isolate was unique and dominated by nonsynonymous singletons. Bootstrapping was used to correct the within-isolate nonsynonymous:synonymous arithmetic ratio (N:S) for RT-PCR error, yielding an N:S value approximately one log unit greater than that of consensus sequences. Probability of all possible single-base substitutions for the C2 region predicted N:S values within 95% confidence limits of the corrected within-isolate N:S when the only constraint imposed was viral polymerase error bias for transitions over transversions. These results indicate that bottlenecks coupled with strong negative/purifying selection drive consensus sequences toward neutral sequence space, and that most polymorphism within single-insect isolates is composed of newly-minted mutations sampled prior to selection. Highlights: • Sampling protocol minimized differential selection/history among isolates. • Polymorphism among consensus sequences dominated by negative/purifying selection. • Within-isolate N:S ratio corrected for RT-PCR error by bootstrapping. • Within-isolate mutant spectrum dominated by new mutations yet to undergo selection.
A safe an easy method for building consensus HIV sequences from 454 massively parallel sequencing data.

PubMed

Fernández-Caballero Rico, Jose Ángel; Chueca Porcuna, Natalia; Álvarez Estévez, Marta; Mosquera Gutiérrez, María Del Mar; Marcos Maeso, María Ángeles; García, Federico

2018-02-01

The aim was to show how to generate a consensus sequence, suitable for molecular epidemiology studies, from massively parallel sequencing data obtained in routine HIV antiretroviral resistance studies. Paired Sanger (Trugene-Siemens) and next-generation sequencing (NGS) (454 GSJunior-Roche) HIV RT and protease sequences from 62 patients were studied. NGS consensus sequences were generated using Mesquite, using 10%, 15%, and 20% thresholds. Molecular evolutionary genetics analysis (MEGA) was used for phylogenetic studies. At a 10% threshold, NGS-Sanger sequences from 17/62 patients were phylogenetically related, with a median bootstrap value of 88% (IQR 83.5-95.5). Association increased to 36/62 sequences, with a median bootstrap value of 94% (IQR 85.5-98), using a 15% threshold. Maximum association was at the 20% threshold, with 61/62 sequences associated, and a median bootstrap value of 99% (IQR 98-100). A safe method is presented to generate consensus sequences from HIV-NGS data at a 20% threshold, which will prove useful for molecular epidemiological studies. Copyright © 2016 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
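One plausible reading of a thresholded consensus is sketched below: at each position, every base present in at least the given fraction of reads is kept, and mixtures are written as IUPAC ambiguity codes. This is an assumption for illustration, not a description of how Mesquite was configured in the study; the function name `threshold_consensus` and the read counts are hypothetical.

```python
from collections import Counter

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}

def threshold_consensus(reads, threshold):
    """Keep every base whose column frequency is at least `threshold`."""
    consensus = []
    for column in zip(*reads):
        counts = Counter(column)
        depth = len(column)
        kept = frozenset(b for b, n in counts.items() if n / depth >= threshold)
        # Unknown symbols or an empty set fall back to N.
        consensus.append(IUPAC.get(kept, "N"))
    return "".join(consensus)

# Hypothetical pileup: a minority T variant present in 12% of reads at position 2.
reads = ["ACGT"] * 88 + ["ATGT"] * 12
for t in (0.10, 0.15, 0.20):
    print(t, threshold_consensus(reads, t))
```

Under this reading, a lower threshold introduces more ambiguity codes into the NGS consensus, which could be one reason fewer sequences paired with their Sanger counterparts at 10% than at 20%.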
Fine-tuning structural RNA alignments in the twilight zone

PubMed Central

2010-01-01

Background: A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Results: Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually but consistently with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, which is then re-submitted to alignment folding, and so on. This cycle may be iterated as long as structural conservation improves, but normally one step suffices. Conclusions: Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index. PMID:20433706
The hypervariable region 1 protein of hepatitis C virus broadly reactive with sera of patients with chronic hepatitis C has a similar amino acid sequence with the consensus sequence.

PubMed

Watanabe, K; Yoshioka, K; Ito, H; Ishigami, M; Takagi, K; Utsunomiya, S; Kobayashi, M; Kishimoto, H; Yano, M; Kakumu, S

1999-11-10

Hypervariable region 1 (HVR1) proteins of hepatitis C virus (HCV) have been reported to react broadly with sera of patients with HCV infection. However, the variability of the broad reactivity of individual HVR1 proteins has not been elucidated. We assessed the reactivity of 25 different HVR1 proteins (genotype 1b) with sera of 81 patients with HCV infection (genotype 1b) by Western blot. HVR1 proteins reacted with 2-60 sera. The number of sera reactive with each HVR1 protein significantly correlated with the number of amino acid residues identical to the consensus sequence defined by Puntoriero et al. (G. Puntoriero, A. Lahm, S. Zucchelli, B. B. Ercole, R. Tafi, M. Penzzanera, M. U. Mondelli, R. Cortese, A. Tramontano, G. Galfre', and A. Nicosia. 1998. EMBO J. 17, 3521-3533.) (r = 0.561, P < 0.005). The most widely reactive HVR1 protein, 12-22, had a sequence similar to the consensus sequence. The peptide with the C-terminal 13-amino-acid sequence of HVR1 protein 12-22 (NH2-CSFTSLFTPGPSQK) was injected into rabbits as an immunogen. The rabbit immune sera reacted with 9 of 25 HVR1 proteins of genotype 1b including HVR1 protein 12-22 and with 3 of 12 proteins of genotype 2a. These results indicate that the HVR1 protein broadly reactive with patients' sera has a sequence similar to the consensus sequence, can induce broadly reactive sera, and could be one of the candidate immunogens in a prophylactic vaccine against HCV. Copyright 1999 Academic Press.
A first report and complete genome sequence of alfalfa enamovirus from Sudan

USDA-ARS's Scientific Manuscript database

A full genome sequence of a viral pathogen, provisionally named alfalfa enamovirus 2 (AEV-2), was reconstructed from short reads obtained by Illumina RNA sequencing of an alfalfa sample originating from Sudan. Ambiguous nucleotides in the resultant consensus assembly and identity of the predicted virus...
Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor.

PubMed Central

Davis, C A; Wyatt, G R

1989-01-01

The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. Eighteen dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID:2762148
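The divergence figure quoted in this abstract is a simple mismatch percentage against the consensus. The sketch below shows that arithmetic under the assumption that each monomer is already aligned to the consensus without gaps; the function name `percent_divergence` and the sequences are hypothetical, not the beetle satellite itself.

```python
def percent_divergence(monomer, consensus):
    """Percentage of positions at which a monomer differs from the consensus."""
    if len(monomer) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(monomer, consensus))
    return 100.0 * mismatches / len(consensus)

# Hypothetical 100-nt consensus and a monomer carrying two substitutions.
consensus = "ACGT" * 25
monomer = "T" + consensus[1:50] + "A" + consensus[51:]
print(percent_divergence(monomer, consensus))
```

For a 142-nt repeat, an average divergence of 2% corresponds to roughly three variant nucleotides per monomer.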
Structure of genes and an insertion element in the methane producing archaebacterium Methanobrevibacter smithii.

PubMed

Hamilton, P T; Reeve, J N

1985-01-01

DNA fragments cloned from the methanogenic archaebacterium Methanobrevibacter smithii which complement mutations in the purE and proC genes of E. coli have been sequenced. Sequence analyses, transposon mutagenesis and expression in E. coli minicells indicate that purE and proC complementations result from the synthesis of M. smithii polypeptides with molecular weights of 36,697 and 27,836 respectively. The encoding genes appear to be located in operons. The M. smithii genome contains 69% A/T basepairs (bp) which is reflected in unusual codon usages and intergenic regions containing approximately 85% A/T bp. An insertion element, designated ISM1, was found within the cloned M. smithii DNA located adjacent to the proC complementing region. ISM1 is 1381 bp in length, has 29 bp terminal inverted repeat sequences and contains one major ORF encoded in 87% of the ISM1 sequence. ISM1 is mobile, present in approximately 10 copies per genome and integration duplicates 8 bp at the site of insertion. The duplicated sequences show homology with sequences within the 29 bp terminal repeat sequence of ISM1. Comparison of our data with sequences from halophilic archaebacteria suggests that 5'GAANTTTCA and 5'TTTTAATATAAA may be consensus promoter sequences for archaebacteria. These sequences closely resemble the consensus sequences which precede Drosophila heat-shock genes (Pelham 1982; Davidson et al. 1983). Methanogens appear to employ the eubacterial system of mRNA: 16SrRNA hybridization to ensure initiation of translation; the consensus ribosome binding sequence is 5'AGGTGA.
Generation and Characterization of HIV-1 Transmitted and Founder Virus Consensus Sequence from Intravenous Drug Users in Xinjiang, China.

PubMed

Li, Fan; Ma, Liying; Feng, Yi; Hu, Jing; Ni, Na; Ruan, Yuhua; Shao, Yiming

2017-06-01

HIV-1 transmission in intravenous drug users (IDUs) is characterized by high genetic multiplicity, which suggests a greater challenge for blocking HIV-1 infection. We investigated a total of 749 sequences of the full-length gp160 gene obtained by single genome sequencing (SGS) from 22 HIV-1 early infected IDUs in Xinjiang province, northwest China, and generated a transmitted and founder virus (T/F virus) consensus sequence (IDU.CON). The T/F virus was classified as subtype CRF07_BC and predicted to be a CCR5-tropic virus. The variable regions (V1, V2, and V4 loops) of IDU.CON showed length variation compared with the heterosexual T/F virus consensus sequence (HSX.CON) and the homosexual T/F virus consensus sequence (MSM.CON). A total of 26 N-linked glycosylation sites were discovered in the IDU.CON sequence, fewer than in MSM.CON and HSX.CON. Characterization of T/F virus from IDUs highlights the genetic make-up and complexity of the virus near the moment of transmission or in early infection preceding systemic dissemination, and is important for the development of effective HIV-1 preventive methods, including vaccines.
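Potential N-linked glycosylation sites are commonly predicted from the sequon N-X-S/T, where X is any residue except proline. The abstract does not state which tool the authors used, so the sketch below is only an illustration of that general rule; the function name `find_sequons` and the peptide are hypothetical.

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead lets overlapping sequons (e.g. NNTS) each be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(protein):
    """Return 0-based positions of asparagines in N-X-S/T sequons (X != P)."""
    return [m.start() for m in SEQUON.finditer(protein.upper())]

# Hypothetical peptide: NGS matches, NPT is excluded, NNT and NTS overlap.
peptide = "MNGSANPTQNNTS"
sites = find_sequons(peptide)
print(sites, len(sites))
```

Counting sequons this way in two consensus sequences gives a quick comparison of predicted glycan density, though a sequon is only a potential site and is not proof of glycosylation.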
Transcriptome characterization and polymorphism detection between subspecies of big sagebrush (Artemisia tridentata)

PubMed Central

2011-01-01

Background: Big sagebrush (Artemisia tridentata) is one of the most widely distributed and ecologically important shrub species in western North America. This species serves as a critical habitat and food resource for many animals and invertebrates. Habitat loss due to a combination of disturbances followed by establishment of invasive plant species is a serious threat to big sagebrush ecosystem sustainability. Lack of genomic data has limited our understanding of the evolutionary history and ecological adaptation in this species. Here, we report on the sequencing of expressed sequence tags (ESTs) and detection of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers in subspecies of big sagebrush. Results: cDNA of A. tridentata sspp. tridentata and vaseyana were normalized and sequenced using the 454 GS FLX Titanium pyrosequencing technology. Assembly of the reads resulted in 20,357 contig consensus sequences in ssp. tridentata and 20,250 contigs in ssp. vaseyana. A BLASTx search against the non-redundant (NR) protein database using 29,541 consensus sequences obtained from a combined assembly resulted in 21,436 sequences with significant blast alignments (≤ 1e-15). A total of 20,952 SNPs and 119 polymorphic SSRs were detected between the two subspecies. SNPs were validated through various methods including sequence capture. Validation of SNPs in different individuals uncovered a high level of nucleotide variation in EST sequences. EST sequences of a third, tetraploid subspecies (ssp. wyomingensis) obtained by Illumina sequencing were mapped to the consensus sequences of the combined 454 EST assembly. Approximately one-third of the SNPs between sspp. tridentata and vaseyana identified in the combined assembly were also polymorphic within the two geographically distant ssp. wyomingensis samples. Conclusion: We have produced a large EST dataset for Artemisia tridentata, which contains a large sample of the big sagebrush leaf transcriptome. SNP mapping among the three subspecies suggests the origin of ssp. wyomingensis via mixed ancestry. A large number of SNP and SSR markers provide the foundation for future research to address questions in big sagebrush evolution, ecological genetics, and conservation using genomic approaches. PMID:21767398
Construction of an ultra-high density consensus genetic map, and enhancement of the physical map from genome sequencing in Lupinus angustifolius.

PubMed

Zhou, Gaofeng; Jian, Jianbo; Wang, Penghao; Li, Chengdao; Tao, Ye; Li, Xuan; Renshaw, Daniel; Clements, Jonathan; Sweetingham, Mark; Yang, Huaan

2018-01-01

An ultra-high density genetic map containing 34,574 sequence-defined markers was developed in Lupinus angustifolius. Markers closely linked to nine genes of agronomic traits were identified. A physical map was improved to cover 560.5 Mb genome sequence. Lupin (Lupinus angustifolius L.) is a recently domesticated legume grain crop. In this study, we applied the restriction-site associated DNA sequencing (RADseq) method to genotype an F9 recombinant inbred line population derived from a wild type × domesticated cultivar (W × D) cross. A high density linkage map was developed based on the W × D population. By integrating sequence-defined DNA markers reported in previous mapping studies, we established an ultra-high density consensus genetic map, which contains 34,574 markers consisting of 3,508 loci covering 2,399 cM on 20 linkage groups. The largest gap in the entire consensus map was 4.73 cM. The high density W × D map and the consensus map were used to develop an improved physical map, which covered 560.5 Mb of genome sequence data. The ultra-high density consensus linkage map, the improved physical map and the markers linked to genes of breeding interest reported in this study provide a common tool for genome sequence assembly, structural genomics, comparative genomics, functional genomics, QTL mapping, and molecular plant breeding in lupin.

Sequences within the 5' untranslated region regulate the levels of a kinetoplast DNA topoisomerase mRNA during the cell cycle.

PubMed Central

Pasion, S G; Hines, J C; Ou, X; Mahmood, R; Ray, D S

1996-01-01

Gene expression in trypanosomatids appears to be regulated largely at the posttranscriptional level and involves maturation of mRNA precursors by trans splicing of a 39-nucleotide miniexon sequence to the 5' end of the mRNA and cleavage and polyadenylation at the 3' end of the mRNA. To initiate the identification of sequences involved in the periodic expression of DNA replication genes in trypanosomatids, we have mapped splice acceptor sites in the 5' flanking region of the TOP2 gene, which encodes the kinetoplast DNA topoisomerase, and have carried out deletion analysis of this region on a plasmid-encoded TOP2 gene. Block deletions within the 5' untranslated region (UTR) identified two regions (-608 to -388 and -387 to -186) responsible for periodic accumulation of the mRNA. Deletion of one or the other of these sequences had no effect on periodic expression of the mRNA, while deletion of both regions resulted in constitutive expression of the mRNA throughout the cell cycle. Subcloning of these sequences into the 5' UTR of a construct lacking both regions of the TOP2 5' UTR has shown that an octamer consensus sequence present in the 5' UTR of the TOP2, RPA1, and DHFR-TS mRNAs is required for normal cycling of the TOP2 mRNA. Mutation of the consensus octamer sequence in the TOP2 5' UTR in a plasmid construct containing only a single consensus octamer and that shows normal cycling of the plasmid-encoded TOP2 mRNA resulted in substantial reduction of the cycling of the mRNA level. These results imply a negative regulation of TOP2 mRNA during the cell cycle by a mechanism involving redundant elements containing one or more copies of a conserved octamer sequence within the 5' UTR of TOP2 mRNA. PMID:8943327
Cofactor specificity switch in Shikimate dehydrogenase by rational design and consensus engineering.

PubMed

García-Guevara, Fernando; Bravo, Iris; Martínez-Anaya, Claudia; Segovia, Lorenzo

2017-08-01

Consensus engineering has been used to design more stable variants using the most frequent amino acid at each site of a multiple sequence alignment; sometimes consensus engineering modifies function, but efforts have mainly been focused on studying stability. Here we constructed a consensus Rossmann domain for the Shikimate dehydrogenase enzyme; separately we decided to switch the cofactor specificity through rational design in the Escherichia coli Shikimate dehydrogenase enzyme and then analyzed the effect of consensus mutations on top of our design. We found that consensus mutations closest to the 2' adenine moiety increased the activity in our design. Consensus engineering has been shown to result in more stable proteins and our findings suggest it could also be used as a complementary tool for increasing or modifying enzyme activity during design. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
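
The rule named in this abstract, taking the most frequent amino acid at each site of a multiple sequence alignment, can be illustrated with a minimal Python sketch. The function name, the toy alignment and the tie-breaking behaviour are assumptions made for illustration; the abstract does not describe the authors' software or how gaps and ties were handled.

```python
from collections import Counter


def consensus_sequence(alignment, gap="-"):
    """Return the most frequent non-gap residue at each alignment column."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        if not counts:
            # Column contains only gaps: keep the gap character.
            consensus.append(gap)
            continue
        # Ties are resolved in favour of the residue seen first in the column
        # (an arbitrary choice made for this sketch).
        consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)


# Hypothetical toy alignment, not taken from the paper.
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MKV-A"]
print(consensus_sequence(toy_alignment))
```

A real consensus design would normally start from a curated alignment of homologues and might weight sequences to reduce sampling bias; none of that is attempted here.
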
Asparagine-linked oligosaccharides present on a non-consensus amino acid sequence in the CH1 domain of human antibodies.

PubMed

Valliere-Douglass, John F; Kodama, Paul; Mujacic, Mirna; Brady, Lowell J; Wang, Wes; Wallace, Alison; Yan, Boxu; Reddy, Pranhitha; Treuheit, Michael J; Balland, Alain

2009-11-20

We report that N-linked oligosaccharide structures can be present on an asparagine residue not adhering to the consensus site motif NX(S/T), where X is not proline, described in the literature. We have observed oligosaccharides on a non-consensus asparaginyl residue in the C(H)1 constant domain of IgG1 and IgG2 antibodies. The initial findings were obtained from characterization of charge variant populations evident in a recombinant human antibody of the IgG2 subclass. HPLC-MS results indicated that cation-exchange chromatography acidic variant populations were enriched in antibody with a second glycosylation site, in addition to the well documented canonical glycosylation site located in the C(H)2 domain. Subsequent tryptic and chymotryptic peptide map data indicated that the second glycosylation site was associated with the amino acid sequence TVSWN(162)SGAL in the C(H)1 domain of the antibody. This highly atypical modification is present at levels of 0.5-2.0% on most of the recombinant antibodies that have been tested and has also been observed in IgG1 antibodies derived from human donors. Site-directed mutagenesis of the C(H)1 domain sequence in a recombinant-human IgG1 antibody resulted in an increase in non-consensus glycosylation to 3.15%, a greater than 4-fold increase over the level observed in the wild type, by changing the -1 and +1 amino acids relative to the asparagine residue at position 162. We believe that further understanding of the phenomenon of non-consensus glycosylation can be used to gain fundamental insights into the fidelity of the cellular glycosylation machinery.
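
The canonical motif cited in this abstract, NX(S/T) with X not proline, can be written as a regular expression. The sketch below is illustrative only: the function name and the second test string are hypothetical, and the CH1 peptide is copied from the abstract without its residue number.

```python
import re

# Canonical sequon N-X-(S/T) where X is any residue except proline.
# The lookahead lets overlapping sites (e.g. "NNSS") be reported separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def canonical_sequon_positions(protein):
    """Return 0-based indices of Asn residues that start an N-X-(S/T) sequon."""
    return [match.start() for match in SEQUON.finditer(protein)]


# CH1 peptide quoted in the abstract (TVSWN(162)SGAL): N is followed by S-G,
# so it does not satisfy the canonical motif.
print(canonical_sequon_positions("TVSWNSGAL"))
# Hypothetical peptide containing canonical sites and one N-P-S non-site.
print(canonical_sequon_positions("AANGTNPSNNSS"))
```

The first call finds no canonical site in the CH1 peptide, which is consistent with the abstract's description of the glycosylated asparagine as non-consensus.
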
Rice MEL2, the RNA recognition motif (RRM) protein, binds in vitro to meiosis-expressed genes containing U-rich RNA consensus sequences in the 3'-UTR.

PubMed

Miyazaki, Saori; Sato, Yutaka; Asano, Tomoya; Nagamura, Yoshiaki; Nonomura, Ken-Ichi

2015-10-01

Post-transcriptional gene regulation by RNA recognition motif (RRM) proteins through binding to cis-elements in the 3'-untranslated region (3'-UTR) is widely used in eukaryotes to complete various biological processes. Rice MEIOSIS ARRESTED AT LEPTOTENE2 (MEL2) is an RRM protein that functions in the properly timed transition to meiosis. In vitro, the MEL2 RRM preferentially associated with the U-rich RNA consensus UUAGUU[U/A][U/G][A/U/G]U in a sequence-dependent manner and in proportion to the amount of MEL2 protein. The consensus sequences were located in the putative looped structures of the RNA ligand. A genome-wide survey revealed a tendency for the MEL2-binding consensus to appear in the 3'-UTR of rice genes. Of 249 genes that conserved the consensus in their 3'-UTR, 13 genes were spatiotemporally co-expressed with MEL2 in meiotic flowers and included several genes with a presumed function in meiosis, such as Replication protein A and OsMADS3. Proteome analysis revealed that the amounts of small ubiquitin-related modifier-like protein and eukaryotic translation initiation factor3-like protein were dramatically altered in mel2 mutant anthers. Taken together with transcriptome and gene ontology results, we propose that the rice MEL2 is involved in the translational regulation of key meiotic genes on 3'-UTRs to achieve the faithful transition of germ cells to meiosis.
GeneSilico protein structure prediction meta-server

PubMed Central

Kurowski, Michal A.; Bujnicki, Janusz M.

2003-01-01

Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta. PMID:12824313
PROSPECT improves cis-acting regulatory element prediction by integrating expression profile data with consensus pattern searches

PubMed Central

Fujibuchi, Wataru; Anderson, John S. J.; Landsman, David

2001-01-01

Consensus pattern and matrix-based searches designed to predict cis-acting transcriptional regulatory sequences have historically been subject to large numbers of false positives. We sought to decrease false positives by incorporating expression profile data into a consensus pattern-based search method. We have systematically analyzed the expression phenotypes of over 6000 yeast genes, across 121 expression profile experiments, and correlated them with the distribution of 14 known regulatory elements over sequences upstream of the genes. Our method is based on a metric we term probabilistic element assessment (PEA), which is a ranking of potential sites based on sequence similarity in the upstream regions of genes with similar expression phenotypes. For eight of the 14 known elements that we examined, our method had a much higher selectivity than a naïve consensus pattern search. Based on our analysis, we have developed a web-based tool called PROSPECT, which allows consensus pattern-based searching of gene clusters obtained from microarray data. PMID:11574681
R2R - software to speed the depiction of aesthetic consensus RNA secondary structures

PubMed Central

2011-01-01

Background With continuing identification of novel structured noncoding RNAs, there is an increasing need to create schematic diagrams showing the consensus features of these molecules. RNA structural diagrams are typically made either with general-purpose drawing programs like Adobe Illustrator, or with automated or interactive programs specific to RNA. Unfortunately, the use of applications like Illustrator is extremely time consuming, while existing RNA-specific programs produce figures that are useful, but usually not of the same aesthetic quality as those produced at great cost in Illustrator. Additionally, most existing RNA-specific applications are designed for drawing single RNA molecules, not consensus diagrams. Results We created R2R, a computer program that facilitates the generation of aesthetic and readable drawings of RNA consensus diagrams in a fraction of the time required with general-purpose drawing programs. Since the inference of a consensus RNA structure typically requires a multiple-sequence alignment, the R2R user annotates the alignment with commands directing the layout and annotation of the RNA. R2R creates SVG or PDF output that can be imported into Adobe Illustrator, Inkscape or CorelDRAW. R2R can be used to create consensus sequence and secondary structure models for novel RNA structures or to revise models when new representatives for known RNA classes become available. Although R2R does not currently have a graphical user interface, it has proven useful in our efforts to create 100 schematic models of distinct noncoding RNA classes. Conclusions R2R makes it possible to obtain high-quality drawings of the consensus sequence and structural models of many diverse RNA structures with a more practical amount of effort. R2R software is available at http://breaker.research.yale.edu/R2R and as an Additional file. PMID:21205310
DNA sequence analysis of ARS elements from chromosome III of Saccharomyces cerevisiae: identification of a new conserved sequence.

PubMed Central

Palzkill, T G; Oliver, S G; Newlon, C S

1986-01-01

Four fragments of Saccharomyces cerevisiae chromosome III DNA which carry ARS elements have been sequenced. Each fragment contains multiple copies of sequences that have at least 10 out of 11 bases of homology to a previously reported 11 bp core consensus sequence. A survey of these new ARS sequences and previously reported sequences revealed the presence of an additional 11 bp conserved element located on the 3' side of the T-rich strand of the core consensus. Subcloning analysis as well as deletion and transposon insertion mutagenesis of ARS fragments support a role for 3' conserved sequence in promoting ARS activity. PMID:3529036
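
The criterion used in this abstract, at least 10 of 11 bases matching a core consensus, amounts to a sliding-window count of matching positions. The Python sketch below assumes the commonly cited form of the 11 bp ARS core consensus, (A/T)TTTA(T/C)(A/G)TTT(A/T); the abstract does not spell the consensus out, so that pattern, the function name and the example strings are all assumptions.

```python
# Each entry lists the bases allowed at that position of the ASSUMED
# 11 bp core consensus (A/T)TTTA(T/C)(A/G)TTT(A/T).
ARS_CORE = ["AT", "T", "T", "T", "A", "TC", "AG", "T", "T", "T", "AT"]


def near_matches(dna, pattern=ARS_CORE, min_matches=10):
    """Return (start, matches) for every window with >= min_matches agreeing positions."""
    dna = dna.upper()
    width = len(pattern)
    hits = []
    for start in range(len(dna) - width + 1):
        window = dna[start:start + width]
        matches = sum(base in allowed for base, allowed in zip(window, pattern))
        if matches >= min_matches:
            hits.append((start, matches))
    return hits


# Hypothetical 11-mers: a perfect match and a single mismatch at the last base.
print(near_matches("ATTTATATTTA"))
print(near_matches("ATTTATATTTC"))
```

This scans one strand only. A fuller search would also scan the reverse complement, since the abstract refers to the T-rich strand of the core consensus.
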
CapZyme-Seq Comprehensively Defines Promoter-Sequence Determinants for RNA 5' Capping with NAD.

PubMed

Vvedenskaya, Irina O; Bird, Jeremy G; Zhang, Yuanchao; Zhang, Yu; Jiao, Xinfu; Barvík, Ivan; Krásný, Libor; Kiledjian, Megerditch; Taylor, Deanne M; Ebright, Richard H; Nickels, Bryce E

2018-05-03

Nucleoside-containing metabolites such as NAD+ can be incorporated as 5' caps on RNA by serving as non-canonical initiating nucleotides (NCINs) for transcription initiation by RNA polymerase (RNAP). Here, we report CapZyme-seq, a high-throughput-sequencing method that employs NCIN-decapping enzymes NudC and Rai1 to detect and quantify NCIN-capped RNA. By combining CapZyme-seq with multiplexed transcriptomics, we determine efficiencies of NAD+ capping by Escherichia coli RNAP for ∼16,000 promoter sequences. The results define preferred transcription start site (TSS) positions for NAD+ capping and define a consensus promoter sequence for NAD+ capping: HRRASWW (TSS underlined). By applying CapZyme-seq to E. coli total cellular RNA, we establish that sequence determinants for NCIN capping in vivo match the NAD+-capping consensus defined in vitro, and we identify and quantify NCIN-capped small RNAs (sRNAs). Our findings define the promoter-sequence determinants for NCIN capping with NAD+ and provide a general method for analysis of NCIN capping in vitro and in vivo. Copyright © 2018 Elsevier Inc. All rights reserved.
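
The consensus HRRASWW is written in IUPAC degenerate nucleotide codes (H = A, C or T; R = A or G; S = C or G; W = A or T). A minimal Python sketch of how such a motif can be expanded into a regular expression and searched for is given below. The function names and the test sequence are hypothetical, and the sketch does not record which position is the TSS, because the underlining referred to in the abstract is not preserved in this listing.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate an IUPAC motif such as 'HRRASWW' into a regex pattern string."""
    return "".join(IUPAC[code] for code in motif.upper())


def find_motif(motif, dna):
    """Return (start, matched_text) for every, possibly overlapping, occurrence."""
    # The lookahead keeps overlapping occurrences from being skipped.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


print(iupac_to_regex("HRRASWW"))
# Hypothetical promoter fragment, not taken from the paper.
print(find_motif("HRRASWW", "GGCAGACTTGG"))
```
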
The Functional Human C-Terminome

PubMed Central

Hedden, Michael; Lyon, Kenneth F.; Brooks, Steven B.; David, Roxanne P.; Limtong, Justin; Newsome, Jacklyn M.; Novakovic, Nemanja; Rajasekaran, Sanguthevar; Thapar, Vishal; Williams, Sean R.; Schiller, Martin R.

2016-01-01

All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new “C-terminome” database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Gene Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching for new anchored sequences on the last 10 amino acids of proteins in the human proteome with lengths between 3–10 residues and up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com. PMID:27050421
Sequence specificity of the human mRNA N6-adenosine methylase in vitro.

PubMed Central

Harper, J E; Miceli, S M; Roberts, R J; Manley, J L

1990-01-01

N6-adenosine methylation is a frequent modification of mRNAs and their precursors, but little is known about the mechanism of the reaction or the function of the modification. To explore these questions, we developed conditions to examine N6-adenosine methylase activity in HeLa cell nuclear extracts. Transfer of the methyl group from S-[3H-methyl]-adenosylmethionine to unlabeled random copolymer RNA substrates of varying ribonucleotide composition revealed a substrate specificity consistent with a previously deduced consensus sequence, Pu[G>A]AC[A/C/U]. 32P-labeled RNA substrates of defined sequence were used to examine the minimum sequence requirements for methylation. Each RNA was 20 nucleotides long, and contained either the core consensus sequence GGACU, or some variation of this sequence. RNAs containing GGACU, either in single or multiple copies, were good substrates for methylation, whereas RNAs containing single base substitutions within the GGACU sequence gave dramatically reduced methylation. These results demonstrate that the N6-adenosine methylase has a strict sequence specificity, and that there is no requirement for extended sequences or secondary structures for methylation. Recognition of this sequence does not require an RNA component, as micrococcal nuclease pretreatment of nuclear extracts actually increased methylation efficiency. PMID:2216767
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.

PubMed Central

Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A

1993-01-01

The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich, which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the isolated genomic fragments were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and also shows that this protein can further activate transcription in cells in culture. PMID:7909943
A consensus linkage map of lentil based on DArT markers from three RIL mapping populations

PubMed Central

Ates, Duygu; Aldemir, Secil; Alsaleh, Ahmad; Erdogmus, Semih; Nemli, Seda; Kahriman, Abdullah; Ozkan, Hakan; Vandenberg, Albert

2018-01-01

Background Lentil (Lens culinaris ssp. culinaris Medikus) is a diploid (2n = 2x = 14), self-pollinating grain legume with a haploid genome size of about 4 Gbp and is grown throughout the world with current annual production of 4.9 million tonnes. Materials and methods A consensus map of lentil (Lens culinaris ssp. culinaris Medikus) was constructed using three different lentil recombinant inbred line (RIL) populations: “CDC Redberry” x “ILL7502” (LR8), “ILL8006” x “CDC Milestone” (LR11) and “PI320937” x “Eston” (LR39). Results The lentil consensus map was composed of 9,793 DArT markers, covered a total of 977.47 cM with an average distance of 0.10 cM between adjacent markers, and comprised 7 linkage groups representing the 7 chromosomes of the lentil genome. The consensus map had no gap larger than 12.67 cM, and only 5 gaps were between 6.0 cM and 12.67 cM (on LG3 and LG4). The localization of the SNP markers on the lentil consensus map was in general consistent with their localization on the three individual genetic linkage maps, and the lentil consensus map has a longer map length, higher marker density and shorter average distance between adjacent markers than the component linkage maps. Conclusion This high-density consensus map could provide insight into the lentil genome. The consensus map could also help to construct a physical map using a Bacterial Artificial Chromosome library and support map-based cloning studies. Sequence information of DArT may help localization of orientation scaffolds from Next Generation Sequencing data. PMID:29351563
Automated Sanger Analysis Pipeline (ASAP): A Tool for Rapidly Analyzing Sanger Sequencing Data with Minimum User Interference.

PubMed

Singh, Aditya; Bhatia, Prateek

2016-12-01

Sanger sequencing platforms, such as Applied Biosystems instruments, generate chromatogram files. Generally, each region of a sequence is sequenced with both forward and reverse primers, which yields 2 sequences that must be aligned and merged into a consensus before mutation detection studies. This work is cumbersome and takes time, especially if the gene is large with many exons. Hence, we devised a rapid automated command system to filter, build, and align consensus sequences and also optionally extract exonic regions, translate them in all frames, and perform an amino acid alignment starting from raw sequence data within a very short time. At full capability, the Automated Sanger Analysis Pipeline (ASAP) can read "*.ab1" chromatogram files through a command line interface, convert them to the FASTQ format, trim the low-quality regions, reverse-complement the reverse sequence, create a consensus sequence, extract the exonic regions using a reference exonic sequence, translate the sequence in all frames, and align the nucleic acid and amino acid sequences to reference nucleic acid and amino acid sequences, respectively. All files are created and can be used for further analysis. ASAP is available as a Python 3.x executable at https://github.com/aditya-88/ASAP. The version described in this paper is 0.28.
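
Two of the steps listed in this abstract, reverse-complementing the reverse read and building a consensus from the forward and reverse reads, can be sketched in a few lines of Python. This is not ASAP's code. The function names are hypothetical, and the sketch assumes the two trimmed reads cover exactly the same region with no insertions or deletions, so that no alignment step is needed. Disagreements are resolved by per-base quality, which is one plausible rule and not necessarily the one ASAP uses.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_reads(forward, fwd_qual, reverse, rev_qual):
    """Merge a forward read with a reverse read given in its sequenced orientation.

    Qualities are per-base integer scores in the same order as each read.
    """
    rev = reverse_complement(reverse)
    rev_q = rev_qual[::-1]  # qualities must follow the flipped orientation
    if not (len(forward) == len(rev) == len(fwd_qual) == len(rev_q)):
        raise ValueError("reads and quality lists must all have the same length")

    consensus = []
    for f_base, f_q, r_base, r_q in zip(forward, fwd_qual, rev, rev_q):
        if f_base == r_base:
            consensus.append(f_base)
        elif f_q == r_q:
            consensus.append("N")  # unresolved disagreement
        else:
            consensus.append(f_base if f_q > r_q else r_base)
    return "".join(consensus)


# Hypothetical reads: they disagree at one position, where the reverse read
# has the higher quality score.
print(merge_reads("ACGTAC", [30, 30, 30, 10, 30, 30],
                  "GTTCGT", [30, 30, 40, 30, 30, 30]))
```

Reads of unequal length or with indels would first need a pairwise alignment, which is outside the scope of this sketch.
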
First full-length genome sequence of the polerovirus luffa aphid-borne yellows virus (LABYV) reveals the presence of at least two consensus sequences in an isolate from Thailand.

PubMed

Knierim, Dennis; Maiss, Edgar; Kenyon, Lawrence; Winter, Stephan; Menzel, Wulf

2015-10-01

Luffa aphid-borne yellows virus (LABYV) was proposed as the name for a previously undescribed polerovirus based on partial genome sequences obtained from samples of cucurbit plants collected in Thailand between 2008 and 2013. In this study, we determined the first full-length genome sequence of LABYV. Based on phylogenetic analysis and genome properties, it is clear that this virus represents a distinct species in the genus Polerovirus. Analysis of sequences from sample TH24, which was collected in 2010 from a luffa plant in Thailand, reveals the presence of two different full-length genome consensus sequences.
Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies.

PubMed

Zeng, Lu; Kortschak, R Daniel; Raison, Joy M; Bertozzi, Terry; Adelson, David L

2018-01-01

Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package. PMID:29538441
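
The single linkage clustering step mentioned in this abstract groups sequences into families whenever they are connected by a chain of pairwise alignments. A minimal Python sketch using a union-find structure is shown below. The interval identifiers and the function name are hypothetical, and this is not CARP's implementation. It only illustrates the clustering idea, assuming the aligner has already reported which pairs of genomic intervals align.

```python
def single_linkage_families(pairs):
    """Group items into families: two items share a family if any chain of
    pairwise hits connects them (single linkage)."""
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = {}
    for item in list(parent):
        families.setdefault(find(item), set()).add(item)
    # Sort members and families so the output is deterministic.
    return sorted((sorted(members) for members in families.values()),
                  key=lambda members: members[0])


# Hypothetical pairwise hits between genomic intervals.
hits = [("chr1:100-900", "chr2:50-850"),
        ("chr2:50-850", "chr5:10-810"),
        ("chr3:1-400", "chr4:7-406")]
for family in single_linkage_families(hits):
    print(family)
```

Because single linkage merges families through any one connecting hit, unrelated elements can be chained together by a spurious alignment; any score or length thresholds applied before clustering are not described in the abstract and are not modelled here.
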
Enterobacterial Repetitive Intergenic Consensus Sequences as Molecular Targets for Typing of Mycobacterium tuberculosis Strains

PubMed Central

Sechi, Leonardo A.; Zanetti, Stefania; Dupré, Ilaria; Delogu, Giovanni; Fadda, Giovanni

1998-01-01

The presence of enterobacterial repetitive intergenic consensus (ERIC) sequences was demonstrated for the first time in the genome of Mycobacterium tuberculosis; these sequences have been found in transcribed regions of the chromosomes of gram-negative bacteria. In this study genetic diversity among clinical isolates of M. tuberculosis was determined by PCR with ERIC primers (ERIC-PCR). The study isolates comprised 71 clinical isolates collected from Sardinia, Italy. ERIC-PCR was able to identify 59 distinct profiles. The results obtained were compared with IS6110 and PCR-GTG fingerprinting. We found that the level of differentiation obtained by ERIC-PCR is greater than that obtained by IS6110 fingerprinting and comparable to that obtained by PCR-GTG. This method of fingerprinting is rapid and sensitive and can be applied to the study of the epidemiology of M. tuberculosis infections, especially when IS6110 fingerprinting is not of any help. PMID:9431935

Simplifying complex sequence information: a PCP-consensus protein binds antibodies against all four Dengue serotypes.

PubMed

Bowen, David M; Lewis, Jessica A; Lu, Wenzhe; Schein, Catherine H

2012-09-14

Designing proteins that reflect the natural variability of a pathogen is essential for developing novel vaccines and drugs. Flaviviruses, including Dengue (DENV) and West Nile (WNV), evolve rapidly and can "escape" neutralizing monoclonal antibodies by mutation. Designing antigens that represent many distinct strains is important for DENV, where infection with a strain from one of the four serotypes may lead to severe hemorrhagic disease on subsequent infection with a strain from another serotype. Here, a DENV physicochemical property (PCP)-consensus sequence was derived from 671 unique sequences from the Flavitrack database. PCP-consensus proteins for domain 3 of the envelope protein (EdomIII) were expressed from synthetic genes in Escherichia coli. The ability of the purified consensus proteins to bind polyclonal antibodies generated in response to infection with strains from each of the four DENV serotypes was determined. The initial consensus protein bound antibodies from DENV-1-3 in ELISA and Western blot assays. This sequence was altered in 3 steps to incorporate regions of maximum variability, identified as significant changes in the PCPs, characteristic of DENV-4 strains. The final protein was recognized by antibodies against all four serotypes. Two amino acids essential for efficient binding to all DENV antibodies are part of a discontinuous epitope previously defined for a neutralizing monoclonal antibody. The PCP-consensus method can significantly reduce the number of experiments required to define a multivalent antigen, which is particularly important when dealing with pathogens that must be tested at higher biosafety levels. Copyright © 2012 Elsevier Ltd. All rights reserved.
Fine-tuning structural RNA alignments in the twilight zone.

PubMed

Bremges, Andreas; Schirmer, Stefanie; Giegerich, Robert

2010-04-30

A widely used method to find conserved secondary structure in RNA is to first construct a multiple sequence alignment, and then fold the alignment, optimizing a score based on thermodynamics and covariance. This method works best around 75% sequence similarity. However, in a "twilight zone" below 55% similarity, the sequence alignment tends to obscure the covariance signal used in the second phase. Therefore, while the overall shape of the consensus structure may still be found, the degree of conservation cannot be estimated reliably. Based on a combination of available methods, we present a method named planACstar for improving structure conservation in structural alignments in the twilight zone. After constructing a consensus structure by alignment folding, planACstar abandons the original sequence alignment, refolds the sequences individually, but consistent with the consensus, aligns the structures, irrespective of sequence, by a pure structure alignment method, and derives an improved sequence alignment from the alignment of structures, to be re-submitted to alignment folding, etc. This cycle may be iterated as long as structural conservation improves, but normally, one step suffices. Employing the tools ClustalW, RNAalifold, and RNAforester, we find that for sequences with 30-55% sequence identity, structural conservation can be improved by 10% on average, with a large variation, measured in terms of RNAalifold's own criterion, the structure conservation index.
Genetic dissection of the consensus sequence for the class 2 and class 3 flagellar promoters

PubMed Central

Wozniak, Christopher E.; Hughes, Kelly T.

2008-01-01

Summary Computational searches for DNA binding sites often utilize consensus sequences. These search models make assumptions that the frequency of a base pair in an alignment relates to the base pair’s importance in binding and presume that base pairs contribute independently to the overall interaction with the DNA binding protein. These two assumptions have generally been found to be accurate for DNA binding sites. However, these assumptions are often not satisfied for promoters, which are involved in additional steps in transcription initiation after RNA polymerase has bound to the DNA. To test these assumptions for the flagellar regulatory hierarchy, class 2 and class 3 flagellar promoters were randomly mutagenized in Salmonella. Important positions were then saturated for mutagenesis and compared to scores calculated from the consensus sequence. Double mutants were constructed to determine how mutations combined for each promoter type. Mutations in the binding site for FlhD4C2, the activator of class 2 promoters, better satisfied the assumptions for the binding model than did mutations in the class 3 promoter, which is recognized by the σ28 transcription factor. These in vivo results indicate that the activator sites within flagellar promoters can be modeled using simple assumptions but that the DNA sequences recognized by the flagellar sigma factor require more complex models. PMID:18486950
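The two assumptions named in this abstract (base frequency reflects importance, and positions contribute independently) are what a simple additive position weight matrix encodes. The sketch below illustrates that model only; the aligned sites are invented toy data, not flagellar promoter sequences, and the function names, pseudocount and uniform background are assumptions.

```python
import math

def build_log_odds(sites, pseudocount=1.0, background=0.25):
    """Per-position log2 odds of each base versus a uniform background."""
    matrix = []
    for i in range(len(sites[0])):
        column = [s[i] for s in sites]
        total = len(column) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return matrix

def score(matrix, site):
    """Additive score: positions are assumed to contribute independently."""
    return sum(col[b] for col, b in zip(matrix, site))

# Invented toy alignment, for illustration only.
toy_sites = ["TAAA", "TAAT", "TACA", "TAAA"]
pwm = build_log_odds(toy_sites)

wild_type, single_1, single_2, double = "TAAA", "GAAA", "TACA", "GACA"
# Under independence, the predicted effect of a double mutant is exactly
# the sum of the two single-mutant effects.
effect_double = score(pwm, double) - score(pwm, wild_type)
effect_sum = (score(pwm, single_1) - score(pwm, wild_type)) + \
             (score(pwm, single_2) - score(pwm, wild_type))
```

A measured double mutant whose effect departs from `effect_sum` would indicate that the independence assumption does not hold for that site, which is the kind of comparison the double-mutant experiments address.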
Regulation of the alpha-glucuronidase-encoding gene ( aguA) from Aspergillus niger.

PubMed

de Vries, R P; van de Vondervoort, P J I; Hendriks, L; van de Belt, M; Visser, J

2002-09-01

The alpha-glucuronidase gene aguA from Aspergillus niger was cloned and characterised. Analysis of the promoter region of aguA revealed the presence of four putative binding sites for the major carbon catabolite repressor protein CREA and one putative binding site for the transcriptional activator XLNR. In addition, a sequence motif was detected which differed only in the last nucleotide from the XLNR consensus site. A construct in which part of the aguA coding region was deleted still resulted in production of a stable mRNA upon transformation of A. niger. The putative XLNR binding sites and two of the putative CREA binding sites were mutated individually in this construct and the effects on expression were examined in A. niger transformants. Northern analysis of the transformants revealed that the consensus XLNR site is not actually functional in the aguA promoter, whereas the sequence that diverges from the consensus at a single position is functional. This indicates that XLNR is also able to bind to the sequence GGCTAG, and the XLNR binding site consensus should therefore be changed to GGCTAR. Both CREA sites are functional, indicating that CREA has a strong influence on aguA expression. A detailed expression analysis of aguA in four genetic backgrounds revealed a second regulatory system involved in activation of aguA gene expression. This system responds to the presence of glucuronic and galacturonic acids, and is not dependent on XLNR.
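Writing the revised consensus as GGCTAR uses the IUPAC ambiguity code R (A or G). A minimal sketch of searching with such a consensus follows; the helper `find_sites` and the test sequence are hypothetical, only a subset of IUPAC codes is included, and only the forward strand is scanned.

```python
import re

# Subset of IUPAC nucleotide codes, expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

def find_sites(consensus, sequence):
    """Return (position, matched_text) for every forward-strand match,
    including overlapping ones (hence the lookahead)."""
    body = "".join(IUPAC[c] for c in consensus)
    pattern = re.compile("(?=(" + body + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]

# Invented sequence containing one GGCTAA and one GGCTAG.
promoter = "TTGGCTAACCGGCTAGAA"
strict_hits = find_sites("GGCTAA", promoter)
revised_hits = find_sites("GGCTAR", promoter)
```

With the stricter pattern GGCTAA, the GGCTAG site is missed; the degenerate consensus GGCTAR recovers both.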
Suppression Analysis Reveals a Functional Difference between the Serines in Positions Two and Five in the Consensus Sequence of the C-Terminal Domain of Yeast RNA Polymerase II

PubMed Central

Yuryev, A.; Corden, J. L.

1996-01-01

The largest subunit of RNA polymerase II contains a repetitive C-terminal domain (CTD) consisting of tandem repeats of the consensus sequence Tyr(1)Ser(2)Pro(3)Thr(4)Ser(5)Pro(6)Ser(7). Substitution of nonphosphorylatable amino acids at positions two or five of the Saccharomyces cerevisiae CTD is lethal. We developed a selection system for isolating suppressors of this lethal phenotype and cloned a gene, SCA1 (suppressor of CTD alanine), which complements recessive suppressors of lethal multiple-substitution mutations. A partial deletion of SCA1 (sca1Δ::hisG) suppresses alanine or glutamate substitutions at position two of the consensus CTD sequence, and a lethal CTD truncation mutation, but SCA1 deletion does not suppress alanine or glutamate substitutions at position five. SCA1 is identical to SRB9, a suppressor of a cold-sensitive CTD truncation mutation. Strains carrying dominant SRB mutations have the same suppression properties as a sca1Δ::hisG strain. These results reveal a functional difference between positions two and five of the consensus CTD heptapeptide repeat. The ability of SCA1 and SRB mutant alleles to suppress CTD truncation mutations suggests that substitutions at position two, but not at position five, cause a defect in RNA polymerase II function similar to that introduced by CTD truncation. PMID:8725217
Mini-midi-mito: adapting the amplification and sequencing strategy of mtDNA to the degradation state of crime scene samples.

PubMed

Berger, Cordula; Parson, Walther

2009-06-01

The degradation state of some biological traces recovered from the crime scene requires the amplification of very short fragments to attain a useful mitochondrial (mt)DNA sequence. We have previously introduced two mini-multiplex assays that amplify 10 overlapping control region (CR) fragments in two separate multiplex PCRs, which brought successful CR consensus sequences from even highly degraded DNA extracts. This procedure requires a total of 20 sequencing reactions per sample, which is laborious and cost intensive. For only moderately degraded samples that we encounter more frequently with typical mtDNA casework material, we developed two new multiplex assays that use a subset of the mini-amplicon primers but embrace larger fragments (midis) and require only 10 sequencing reactions to build a double-stranded CR consensus sequence. We used a preceding mtDNA quantitation step by real-time PCR with two different target fragments (143 and 283 bp) that roughly correspond to the average fragment sizes of the different multiplex approaches to estimate size-dependent mtDNA quantities and to aid the choice of the appropriate PCR multiplexes with respect to quality of the results and required costs.
Human Splice-Site Prediction with Deep Neural Networks.

PubMed

Naito, Tatsuhiko

2018-04-18

Accurate splice-site prediction is essential to delineate gene structures from sequence data. Several computational techniques have been applied to create a system to predict canonical splice sites. For classification tasks, deep neural networks (DNNs) have achieved record-breaking results and often outperformed other supervised learning techniques. In this study, a new method of splice-site prediction using DNNs was proposed. The proposed system receives input sequence data and returns an answer as to whether it is a splice site. The length of the input is 140 nucleotides, with the consensus sequence (i.e., "GT" and "AG" for the donor and acceptor sites, respectively) in the middle. Each input sequence is applied to the pretrained DNN model, which determines the probability that the input is a splice site. The model consists of convolutional layers and bidirectional long short-term memory network layers. The pretraining and validation were conducted using the data set tested in previously reported methods. The performance evaluation results showed that the proposed method can outperform the previous methods. In addition, the pattern learned by the DNNs was visualized as position frequency matrices (PFMs). Some of the PFMs were very similar to the consensus sequence. The trained DNN model and the brief source code for the prediction system are uploaded. Further improvement will be achieved following the further development of DNNs.
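The input format described here, a 140-nucleotide window with the consensus dinucleotide in the middle, can be sketched as a simple candidate extraction step. The exact placement of the dinucleotide within the window is not given in the abstract; the sketch below assumes 69 flanking nucleotides on each side, and the function name `candidate_windows` is hypothetical.

```python
def candidate_windows(sequence, dinucleotide="GT", window=140):
    """Collect fixed-length windows centred on each occurrence of the
    consensus dinucleotide; occurrences too close to either end are skipped."""
    flank = (window - len(dinucleotide)) // 2  # assumed: 69 nt on each side
    windows = []
    start = sequence.find(dinucleotide)
    while start != -1:
        left = start - flank
        right = start + len(dinucleotide) + flank
        if left >= 0 and right <= len(sequence):
            windows.append((start, sequence[left:right]))
        start = sequence.find(dinucleotide, start + 1)
    return windows

# Invented sequence: a single GT flanked by 100 A's on each side.
toy = "A" * 100 + "GT" + "A" * 100
donor_candidates = candidate_windows(toy, "GT")
```

Each extracted window would then be encoded and passed to a classifier; that step is outside the scope of this sketch.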
Novel Bioinformatics-Based Approach for Proteomic Biomarkers Prediction of Calpain-2 & Caspase-3 Protease Fragmentation: Application to βII-Spectrin Protein

NASA Astrophysics Data System (ADS)

El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges; Kobeissy, Firas

2017-01-01

The crucial biological role of proteases has become visible with the development of the degradomics discipline, which involves the determination of the proteases/substrates resulting in breakdown products (BDPs) that can be utilized as putative biomarkers associated with different biological-clinical significance. In the field of cancer biology, matrix metalloproteinases (MMPs) have been shown to generate protein BDPs that are indicative of malignant growth in cancer, while in the field of neural injury, calpain-2 and caspase-3 proteases generate BDP fragments that are indicative of different neural cell death mechanisms in different injury scenarios. Advanced proteomic techniques have shown remarkable progress in identifying these BDPs experimentally. In this work, we present a bioinformatics-based prediction method that identifies protease-associated BDPs with high precision and efficiency. The method utilizes state-of-the-art sequence matching and alignment algorithms. It starts by locating consensus sequence occurrences and their variants in any set of protein substrates, generating all fragments resulting from cleavage. The complexity is O(mn) in space and O(Nmn) in time, where N, m, and n are the number of protein sequences, length of the consensus sequence, and length per protein sequence, respectively. Finally, the proposed methodology is validated against βII-spectrin protein, a brain injury validated biomarker.
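The idea of locating consensus occurrences and then emitting the resulting fragments can be illustrated with a naive scan. This is a sketch under stated assumptions, not the authors' algorithm: the function `cleave`, the wildcard convention (X matches any residue) and the toy protein are hypothetical. The motif DXXD with cleavage after the final aspartate is used as a simplified stand-in for a caspase-3-like site.

```python
def cleave(protein, motif, cut_offset, wildcard="X"):
    """Cut the protein at every motif occurrence and return the fragments.
    The cut falls cut_offset residues after the start of each match."""
    m = len(motif)
    cuts = []
    for i in range(len(protein) - m + 1):            # about n start positions
        window = protein[i:i + m]
        if all(p == wildcard or p == r for p, r in zip(motif, window)):  # m comparisons
            cuts.append(i + cut_offset)
    bounds = [0] + cuts + [len(protein)]
    # Drop empty fragments (e.g. a cut exactly at the C-terminus).
    return [protein[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]

# Invented substrate with two DXXD sites (DEVD and DEAD).
toy_protein = "MKDEVDGSAAADEADLL"
fragments = cleave(toy_protein, "DXXD", cut_offset=4)
```

Scanning one protein this way costs on the order of m·n comparisons, so N proteins cost O(Nmn), which is consistent with the time bound quoted in the abstract.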
AMS 4.0: consensus prediction of post-translational modifications in protein sequences.

PubMed

Plewczynski, Dariusz; Basu, Subhadip; Saha, Indrajit

2012-08-01

We present here the 2011 update of the AutoMotif Service (AMS 4.0) that predicts a wide selection of 88 different types of single amino acid post-translational modifications (PTM) in protein sequences. The selection of experimentally confirmed modifications is acquired from the latest UniProt and Phospho.ELM databases for training. The sequence vicinity of each modified residue is represented using amino acid physico-chemical features encoded using high quality indices (HQI) obtained by automatic clustering of known indices extracted from the AAindex database. For each type of numerical representation, the method builds an ensemble of Multi-Layer Perceptron (MLP) pattern classifiers, each optimising different objectives during the training (for example the recall, precision or area under the ROC curve (AUC)). The consensus is built using brainstorming technology, which combines multi-objective instances of the machine learning algorithm and the data fusion of different training object representations, in order to boost the overall prediction accuracy of conserved short sequence motifs. The performance of AMS 4.0 is compared with the accuracy of previous versions, which were constructed using single machine learning methods (artificial neural networks, support vector machine). Our software improves the average AUC score of the earlier version by close to 7 % as calculated on the test datasets of all 88 PTM types. Moreover, for the selected most-difficult sequence motif types it is able to improve the prediction performance by almost 32 %, when compared with previously used single machine learning methods. Summarising, the brainstorming consensus meta-learning methodology on average boosts the AUC score up to around 89 %, averaged over all 88 PTM types. Detailed results for single machine learning methods and the consensus methodology are also provided, together with the comparison to previously published methods and state-of-the-art software tools. The source code and precompiled binaries of the brainstorming tool are available at http://code.google.com/p/automotifserver/ under Apache 2.0 licensing.
Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM.

PubMed

Liang, Yunyun; Liu, Sanyang; Zhang, Shengli

2015-01-01

Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and it mainly uses protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves the favorable and competitive performance. This will offer an important complementary to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.
Prototype foamy virus envelope glycoprotein leader peptide processing is mediated by a furin-like cellular protease, but cleavage is not essential for viral infectivity.

PubMed

Duda, Anja; Stange, Annett; Lüftenegger, Daniel; Stanke, Nicole; Westphal, Dana; Pietschmann, Thomas; Eastman, Scott W; Linial, Maxine L; Rethwilm, Axel; Lindemann, Dirk

2004-12-01

Analogous to cellular glycoproteins, viral envelope proteins contain N-terminal signal sequences responsible for targeting them to the secretory pathway. The prototype foamy virus (PFV) envelope (Env) shows a highly unusual biosynthesis. Its precursor protein has a type III membrane topology with both the N and C terminus located in the cytoplasm. Coexpression of FV glycoprotein and interaction of its leader peptide (LP) with the viral capsid is essential for viral particle budding and egress. Processing of PFV Env into the particle-associated LP, surface (SU), and transmembrane (TM) subunits occurs posttranslationally during transport to the cell surface by yet-unidentified cellular proteases. Here we provide strong evidence that furin itself or a furin-like protease and not the signal peptidase complex is responsible for both processing events. N-terminal protein sequencing of the SU and TM subunits of purified PFV Env-immunoglobulin G immunoadhesin identified furin consensus sequences upstream of both cleavage sites. Mutagenesis analysis of two overlapping furin consensus sequences at the PFV LP/SU cleavage site in the wild-type protein confirmed the sequencing data and demonstrated utilization of only the first site. Fully processed SU was almost completely absent in viral particles of mutants having conserved arginine residues replaced by alanines in the first furin consensus sequence, but normal processing was observed upon mutation of the second motif. Although these mutants displayed a significant loss in infectivity as a result of reduced particle release, no correlation to processing inhibition was observed, since another mutant having normal LP/SU processing had a similar defect.
The role of recombination in the origin and evolution of Alu subfamilies.

PubMed

Teixeira-Silva, Ana; Silva, Raquel M; Carneiro, João; Amorim, António; Azevedo, Luísa

2013-01-01

Alus are the most abundant and successful short interspersed nuclear elements found in primate genomes. In humans, they represent about 10% of the genome, although few are retrotransposition-competent and are clustered into subfamilies according to the source gene from which they evolved. Recombination between them can lead to genomic rearrangements of clinical and evolutionary significance. In this study, we have addressed the role of recombination in the origin of chimeric Alu source genes by the analysis of all known consensus sequences of human Alus. From the allelic diversity of Alu consensus sequences, validated in extant elements resulting from whole genome searches, distinct events of recombination were detected in the origin of particular subfamilies of AluS and AluY source genes. These results demonstrate that at least two subfamilies are likely to have emerged from ectopic Alu-Alu recombination, which stimulates further research regarding the potential of chimeric active Alus to punctuate the genome.
Sampled-Data Consensus of Linear Multi-agent Systems With Packet Losses.

PubMed

Zhang, Wenbing; Tang, Yang; Huang, Tingwen; Kurths, Jurgen

In this paper, the consensus problem is studied for a class of multi-agent systems with sampled data and packet losses, where random and deterministic packet losses are considered, respectively. For random packet losses, a Bernoulli-distributed white sequence is used to describe packet dropouts among agents in a stochastic way. For deterministic packet losses, a switched system with stable and unstable subsystems is employed to model packet dropouts in a deterministic way. The purpose of this paper is to derive consensus criteria, such that linear multi-agent systems with sampled-data and packet losses can reach consensus. By means of the Lyapunov function approach and the decomposition method, the design problem of a distributed controller is solved in terms of convex optimization. The interplay among the allowable bound of the sampling interval, the probability of random packet losses, and the rate of deterministic packet losses are explicitly derived to characterize consensus conditions. The obtained criteria are closely related to the maximum eigenvalue of the Laplacian matrix versus the second minimum eigenvalue of the Laplacian matrix, which reveals the intrinsic effect of communication topologies on consensus performance. Finally, simulations are given to show the effectiveness of the proposed results.
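The random packet-loss model can be illustrated with a much simpler system than the one studied in the paper. The sketch below assumes single-integrator agents updated once per sampling instant on a four-agent ring, with each incoming packet dropped independently with a fixed Bernoulli probability; the function name, gain, topology and initial states are all illustrative assumptions, and no controller design from the paper is reproduced.

```python
import random

def simulate(x0, neighbors, loss_prob, gain=0.2, steps=300, seed=0):
    """Discrete-time averaging where each agent moves toward the neighbour
    states whose packets actually arrived at this sampling instant."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        nxt = []
        for i, xi in enumerate(x):
            # Each incoming packet is lost independently with probability loss_prob.
            received = [x[j] for j in neighbors[i] if rng.random() >= loss_prob]
            nxt.append(xi + gain * sum(xj - xi for xj in received))
        x = nxt
    return x

# Hypothetical ring of four agents with invented initial states.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
initial = [1.0, 5.0, -2.0, 8.0]
final = simulate(initial, ring, loss_prob=0.3)
spread = max(final) - min(final)
```

With this small gain every update is a convex combination of current states, so the states stay within the initial range and the spread shrinks toward zero despite the dropouts. Because losses are asymmetric, the agreed value need not equal the initial average.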
Investigation of DNA sequence recognition by a streptomycete MarR family transcriptional regulator through surface plasmon resonance and X-ray crystallography

PubMed Central

Stevenson, Clare E. M.; Assaad, Aoun; Chandra, Govind; Le, Tung B. K.; Greive, Sandra J.; Bibb, Mervyn J.; Lawson, David M.

2013-01-01

Consistent with their complex lifestyles and rich secondary metabolite profiles, the genomes of streptomycetes encode a plethora of transcription factors, the vast majority of which are uncharacterized. Herein, we use Surface Plasmon Resonance (SPR) to identify and delineate putative operator sites for SCO3205, a MarR family transcriptional regulator from Streptomyces coelicolor that is well represented in sequenced actinomycete genomes. In particular, we use a novel SPR footprinting approach that exploits indirect ligand capture to vastly extend the lifetime of a standard streptavidin SPR chip. We define two operator sites upstream of sco3205 and a pseudopalindromic consensus sequence derived from these enables further potential operator sites to be identified in the S. coelicolor genome. We evaluate each of these through SPR and test the importance of the conserved bases within the consensus sequence. Informed by these results, we determine the crystal structure of a SCO3205-DNA complex at 2.8 Å resolution, enabling molecular level rationalization of the SPR data. Taken together, our observations support a DNA recognition mechanism involving both direct and indirect sequence readout. PMID:23748564
A 1,681-locus consensus genetic map of cultivated cucumber including 67 NB-LRR resistance gene homolog and ten gene loci

PubMed Central

2013-01-01

Background Cucumber is an important vegetable crop that is susceptible to many pathogens, but no disease resistance (R) genes have been cloned. The availability of whole genome sequences provides an excellent opportunity for systematic identification and characterization of the nucleotide binding and leucine-rich repeat (NB-LRR) type R gene homolog (RGH) sequences in the genome. Cucumber has a very narrow genetic base making it difficult to construct high-density genetic maps. Development of a consensus map by synthesizing information from multiple segregating populations is a method of choice to increase marker density. As such, the objectives of the present study were to identify and characterize NB-LRR type RGHs, and to develop a high-density, integrated cucumber genetic-physical map anchored with RGH loci. Results From the Gy14 draft genome, 70 NB-containing RGHs were identified and characterized. Most RGHs were in clusters with uneven distribution across seven chromosomes. In silico analysis indicated that all 70 RGHs had EST support for gene expression. Phylogenetic analysis classified 58 RGHs into two clades: CNL and TNL. Comparative analysis revealed high-degree sequence homology and synteny in chromosomal locations of these RGH members between the cucumber and melon genomes. Fifty-four molecular markers were developed to delimit 67 of the 70 RGHs, which were integrated into a genetic map through linkage analysis. A 1,681-locus cucumber consensus map including 10 gene loci and spanning 730.0 cM in seven linkage groups was developed by integrating three component maps with a bin-mapping strategy. Physically, 308 scaffolds with 193.2 Mbp total DNA sequences were anchored onto this consensus map that covered 52.6% of the 367 Mbp cucumber genome. Conclusions Cucumber contains relatively few NB-LRR RGHs that are clustered and unevenly distributed in the genome. All RGHs seem to be transcribed and shared significant sequence homology and synteny with the melon genome suggesting conservation of these RGHs in the Cucumis lineage. The 1,681-locus consensus genetic-physical map developed and the RGHs identified and characterized herein are valuable genomics resources that may have many applications such as quantitative trait loci identification, map-based gene cloning, association mapping, marker-assisted selection, as well as assembly of a more complete cucumber genome. PMID:23531125
Nanoplatforms for highly sensitive fluorescence detection of cancer-related proteases.

PubMed

Wang, Hongwang; Udukala, Dinusha N; Samarakoon, Thilani N; Basel, Matthew T; Kalita, Mausam; Abayaweera, Gayani; Manawadu, Harshi; Malalasekera, Aruni; Robinson, Colette; Villanueva, David; Maynez, Pamela; Bossmann, Leonie; Riedy, Elizabeth; Barriga, Jenny; Wang, Ni; Li, Ping; Higgins, Daniel A; Zhu, Gaohong; Troyer, Deryl L; Bossmann, Stefan H

2014-02-01

Numerous proteases are known to be necessary for cancer development and progression including matrix metalloproteinases (MMPs), tissue serine proteases, and cathepsins. The goal of this research is to develop an Fe/Fe3O4 nanoparticle-based system for clinical diagnostics, which has the potential to measure the activity of cancer-associated proteases in biospecimens. Nanoparticle-based "light switches" for measuring protease activity consist of fluorescent cyanine dyes and porphyrins that are attached to Fe/Fe3O4 nanoparticles via consensus sequences. These consensus sequences can be cleaved in the presence of the correct protease, thus releasing a fluorescent dye from the Fe/Fe3O4 nanoparticle, resulting in highly sensitive (down to 1 × 10(-16) mol l(-1) for 12 proteases), selective, and fast nanoplatforms (required time: 60 min).
Characterization of cis-acting elements required for autorepression of the equine herpesvirus 1 IE gene

PubMed Central

Kim, Seongman; Dai, Gan; O’Callaghan, Dennis J.; Kim, Seong Kee

2012-01-01

The immediate-early protein (IEP), the major regulatory protein encoded by the IE gene of equine herpesvirus 1 (EHV-1), plays a crucial role as both transcription activator and repressor during a productive lytic infection. To investigate the mechanism by which the EHV-1 IEP inhibits its own promoter, IE promoter-luciferase reporter plasmids containing wild-type and mutant IEP-binding site (IEBS) were constructed and used for luciferase reporter assays. The IEP inhibited transcription from its own promoter in the presence of a consensus IEBS (5’-ATCGT-3’) located near the transcription initiation site but did not inhibit when the consensus sequence was deleted. To determine whether the distance between the TATA box and the IEBS affects transcriptional repression, the IEBS was displaced from the original site by the insertion of synthetic DNA sequences. Luciferase reporter assays revealed that the IEP is able to repress its own promoter when the IEBS is located within 26-bp from the TATA box. We also found that the proper orientation and position of the IEBS were required for the repression by the IEP. Interestingly, the level of repression was significantly reduced when a consensus TATA sequence was deleted from the promoter region, indicating that the IEP efficiently inhibits its own promoter in a TATA box-dependent manner. Taken together, these results suggest that the EHV-1 IEP delicately modulates autoregulation of its gene through the consensus IEBS that is near the transcription initiation site and the TATA box. PMID:22265772
Characterization of cis-acting elements required for autorepression of the equine herpesvirus 1 IE gene.

PubMed

Kim, Seongman; Dai, Gan; O'Callaghan, Dennis J; Kim, Seong Kee

2012-04-01

The immediate-early protein (IEP), the major regulatory protein encoded by the IE gene of equine herpesvirus 1 (EHV-1), plays a crucial role as both transcription activator and repressor during a productive lytic infection. To investigate the mechanism by which the EHV-1 IEP inhibits its own promoter, IE promoter-luciferase reporter plasmids containing wild-type and mutant IEP-binding site (IEBS) were constructed and used for luciferase reporter assays. The IEP inhibited transcription from its own promoter in the presence of a consensus IEBS (5'-ATCGT-3') located near the transcription initiation site but did not inhibit when the consensus sequence was deleted. To determine whether the distance between the TATA box and the IEBS affects transcriptional repression, the IEBS was displaced from the original site by the insertion of synthetic DNA sequences. Luciferase reporter assays revealed that the IEP is able to repress its own promoter when the IEBS is located within 26-bp from the TATA box. We also found that the proper orientation and position of the IEBS were required for the repression by the IEP. Interestingly, the level of repression was significantly reduced when a consensus TATA sequence was deleted from the promoter region, indicating that the IEP efficiently inhibits its own promoter in a TATA box-dependent manner. Taken together, these results suggest that the EHV-1 IEP delicately modulates autoregulation of its gene through the consensus IEBS that is near the transcription initiation site and the TATA box. Copyright © 2012. Published by Elsevier B.V.
Using information content and base frequencies to distinguish mutations from genetic polymorphisms in splice junction recognition sites.

PubMed

Rogan, P K; Schneider, T D

1995-01-01

Predicting the effects of nucleotide substitutions in human splice sites has been based on analysis of consensus sequences. We used a graphic representation of sequence conservation and base frequency, the sequence logo, to demonstrate that a change in a splice acceptor of hMSH2 (a gene associated with familial nonpolyposis colon cancer) probably does not reduce splicing efficiency. This confirms a population genetic study that suggested that this substitution is a genetic polymorphism. The information theory-based sequence logo is quantitative and more sensitive than the corresponding splice acceptor consensus sequence for detection of true mutations. Information analysis may potentially be used to distinguish polymorphisms from mutations in other types of transcriptional, translational, or protein-coding motifs.
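The quantity plotted in a sequence logo is the information content at each aligned position, which for DNA is 2 bits minus the Shannon entropy of the observed base frequencies. A minimal sketch follows; it omits the small-sample correction that is normally applied, and the aligned sites are invented toy data rather than real splice acceptors.

```python
import math
from collections import Counter

def information_per_position(sites):
    """Information content in bits at each column of an ungapped DNA
    alignment: 2 - H, with no small-sample correction applied."""
    n = len(sites)
    bits = []
    for column in zip(*sites):
        entropy = -sum((c / n) * math.log2(c / n) for c in Counter(column).values())
        bits.append(2.0 - entropy)
    return bits

# Invented alignment: position 0 is invariant, position 1 is fully variable.
toy_sites = ["AA", "AC", "AG", "AT"]
info = information_per_position(toy_sites)
```

An invariant position carries the full 2 bits, while a position where all four bases are equally frequent carries none. A plain consensus sequence would report a single letter in both cases, which is why the information measure is the more sensitive of the two.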
The challenge of informed consent and return of results in translational genomics: empirical analysis and recommendations.

PubMed

Henderson, Gail E; Wolf, Susan M; Kuczynski, Kristine J; Joffe, Steven; Sharp, Richard R; Parsons, D Williams; Knoppers, Bartha M; Yu, Joon-Ho; Appelbaum, Paul S

2014-01-01

As exome and genome sequencing move into clinical application, questions surround how to elicit consent and handle potential return of individual genomic results. This study analyzes nine consent forms used in NIH-funded sequencing studies. Content analysis reveals considerable heterogeneity, including in defining results that may be returned, identifying potential benefits and risks of return, protecting privacy, addressing placement of results in the medical record, and data-sharing. In response to lack of consensus, we offer recommendations. © 2014 American Society of Law, Medicine & Ethics, Inc.

Efficient and Accurate Algorithm for Cleaved Fragments Prediction (CFPA) in Protein Sequences Dataset Based on Consensus and Its Variants: A Novel Degradomics Prediction Application.

PubMed

El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges; Hajj, Hazem; Kobeissy, Firas H

2017-01-01

Degradomics is a novel discipline that involves determination of the proteases/substrate fragmentation profile, called the substrate degradome, and has been recently applied in different disciplines. A major application of degradomics is its utility in the field of biomarkers where the breakdown products (BDPs) of different proteases have been investigated. Among the major proteases assessed, calpain and caspase proteases have been associated with the execution phases of the pro-apoptotic and pro-necrotic cell death, generating caspase/calpain-specific cleaved fragments. The distinction between calpain and caspase protein fragments has been applied to distinguish injury mechanisms. Advanced proteomics technology has been used to identify these BDPs experimentally. However, it has been a challenge to identify these BDPs with high precision and efficiency, especially if we are targeting a number of proteins at one time. In this chapter, we present a novel bioinformatic detection method that identifies BDPs accurately and efficiently with validation against experimental data. This method aims at predicting the consensus sequence occurrences and their variants in a large set of experimentally detected protein sequences based on state-of-the-art sequence matching and alignment algorithms. After detection, the method generates all the potential cleaved fragments by a specific protease. This space and time-efficient algorithm is flexible to handle the different orientations that the consensus sequence and the protein sequence can take before cleaving. It is O(mn) in space complexity and O(Nmn) in time complexity, where N is the number of protein sequences, m the length of the consensus sequence, and n the length of each protein sequence. Ultimately, this knowledge will feed into the development of a novel tool for researchers to detect diverse types of selected BDPs as putative disease markers, contributing to the diagnosis and treatment of related disorders.
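The general idea of scanning proteins for a consensus site and its variants, then cutting at each hit, can be sketched as follows. This is not the CFPA implementation: it is a simple sliding-window matcher that tolerates a fixed number of mismatches, with an O(mn) scan per protein. The function names, the toy proteins, and the choice to cut after the last residue of the matched site are assumptions for illustration; DEVD is used only as a familiar caspase-type motif.

```python
def find_sites(protein, consensus, max_mismatches=1):
    """Return start indices where the consensus matches with at most
    max_mismatches substitutions ('X' in the consensus matches any residue)."""
    m = len(consensus)
    hits = []
    for start in range(len(protein) - m + 1):
        window = protein[start:start + m]
        mismatches = sum(1 for c, p in zip(consensus, window) if c != "X" and c != p)
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


def cleave(protein, consensus, max_mismatches=1):
    """Cut after the last residue of every matched site and return the fragments."""
    cut_points = sorted({s + len(consensus) for s in find_sites(protein, consensus, max_mismatches)})
    fragments, prev = [], 0
    for cut in cut_points:
        fragments.append(protein[prev:cut])
        prev = cut
    if prev < len(protein):
        fragments.append(protein[prev:])
    return fragments


# Hypothetical toy proteins, not experimental data
proteins = {"toy1": "MKDEVDGSAAL", "toy2": "MDEADTTK"}
for name, seq in proteins.items():
    print(name, cleave(seq, "DEVD", max_mismatches=1))
```

Looping over N proteins with this scan gives the O(Nmn) time behaviour quoted in the abstract, although the published method relies on alignment algorithms rather than this mismatch count.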
Biological assay using T cell response for Cry-consensus peptide designed for the peptide-based immunotherapy of Japanese cedar pollinosis.

PubMed

Kozutsumi, Daisuke; Tsunematsu, Masako; Yamaji, Taketo; Kino, Kohsuke

2007-01-01

Cry-consensus peptide is a linearly linked peptide of T-cell epitopes for the management of Japanese cedar (JC) pollinosis and is expected to become a new drug for immunotherapy. However, the mechanism of T-cell epitopes in allergic diseases is not well understood, and thus, a simple in vitro procedure for evaluation of its biological activity is desired. Peripheral blood mononuclear cells (PBMC) were isolated from 27 JC pollinosis patients and 10 healthy subjects, and cultured in vitro for 4 days in the presence of Cry-consensus peptide and [3H]thymidine. The relationship between growth stimulation (stimulation index; SI) and antigen-specific IgE levels in serum was also investigated in JC pollinosis patients. Moreover, to confirm the importance of the primary sequence in Cry-consensus peptide, heat-treated Cry-consensus peptide and a mixture of the amino acids of which Cry-consensus peptide is composed were tested, and their [3H]thymidine uptake was compared with that of Cry-consensus peptide. Finally, whether Cry-consensus peptide stimulates PBMCs from healthy subjects was investigated. The mean SI of JC patients showed a good correlation with Cry-consensus peptide concentration in the culture medium; however, the SI was independent of the anti-Cry j 1 IgE level. Heat-denatured Cry-consensus peptide retained a PBMC proliferation stimulatory effect comparable to the original Cry-consensus peptide, while the mixture of amino acids constituting Cry-consensus peptide did not stimulate PBMC proliferation. PBMCs from healthy subjects did not respond to Cry-consensus peptide at all. These data indicate that the PBMC response of patients suffering from JC pollinosis to Cry-consensus peptide is specific for the sequence of T cell epitopes thereof and may be useful for the evaluation of the efficacy of Cry-consensus peptide in vivo.
Reliable Detection of Herpes Simplex Virus Sequence Variation by High-Throughput Resequencing.

PubMed

Morse, Alison M; Calabro, Kaitlyn R; Fear, Justin M; Bloom, David C; McIntyre, Lauren M

2017-08-16

High-throughput sequencing (HTS) has resulted in data for a number of herpes simplex virus (HSV) laboratory strains and clinical isolates. The knowledge of these sequences has been critical for investigating viral pathogenicity. However, the assembly of complete herpesviral genomes, including HSV, is complicated due to the existence of large repeat regions and arrays of smaller reiterated sequences that are commonly found in these genomes. In addition, the inherent genetic variation in populations of isolates for viruses and other microorganisms presents an additional challenge to many existing HTS sequence assembly pipelines. Here, we evaluate two approaches for the identification of genetic variants in HSV1 strains using Illumina short read sequencing data. The first, a reference-based approach, identifies variants from reads aligned to a reference sequence and the second, a de novo assembly approach, identifies variants from reads aligned to de novo assembled consensus sequences. Of critical importance for both approaches is the reduction in the number of low complexity regions through the construction of a non-redundant reference genome. We compared variants identified in the two methods. Our results indicate that approximately 85% of variants are identified regardless of the approach. The reference-based approach to variant discovery captures an additional 15% representing variants divergent from the HSV1 reference possibly due to viral passage. Reference-based approaches are significantly less labor-intensive and identify variants across the genome where de novo assembly-based approaches are limited to regions where contigs have been successfully assembled. In addition, regions of poor quality assembly can lead to false variant identification in de novo consensus sequences. For viruses with a well-assembled reference genome, a reference-based approach is recommended.
Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

PubMed Central

Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

1985-01-01

The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815
R2R--software to speed the depiction of aesthetic consensus RNA secondary structures.

PubMed

Weinberg, Zasha; Breaker, Ronald R

2011-01-04

With continuing identification of novel structured noncoding RNAs, there is an increasing need to create schematic diagrams showing the consensus features of these molecules. RNA structural diagrams are typically made either with general-purpose drawing programs like Adobe Illustrator, or with automated or interactive programs specific to RNA. Unfortunately, the use of applications like Illustrator is extremely time consuming, while existing RNA-specific programs produce figures that are useful, but usually not of the same aesthetic quality as those produced at great cost in Illustrator. Additionally, most existing RNA-specific applications are designed for drawing single RNA molecules, not consensus diagrams. We created R2R, a computer program that facilitates the generation of aesthetic and readable drawings of RNA consensus diagrams in a fraction of the time required with general-purpose drawing programs. Since the inference of a consensus RNA structure typically requires a multiple-sequence alignment, the R2R user annotates the alignment with commands directing the layout and annotation of the RNA. R2R creates SVG or PDF output that can be imported into Adobe Illustrator, Inkscape or CorelDRAW. R2R can be used to create consensus sequence and secondary structure models for novel RNA structures or to revise models when new representatives for known RNA classes become available. Although R2R does not currently have a graphical user interface, it has proven useful in our efforts to create 100 schematic models of distinct noncoding RNA classes. R2R makes it possible to obtain high-quality drawings of the consensus sequence and structural models of many diverse RNA structures with a more practical amount of effort. R2R software is available at http://breaker.research.yale.edu/R2R and as an Additional file.
Three-dimensional sampling perfection with application-optimised contrasts using a different flip angle evolutions sequence for routine imaging of the spine: preliminary experience

PubMed Central

Tins, B; Cassar-Pullicino, V; Haddaway, M; Nachtrab, U

2012-01-01

Objectives: The bulk of spinal imaging is still performed with conventional two-dimensional sequences. This study assesses the suitability of three-dimensional sampling perfection with application-optimised contrasts using a different flip angle evolutions (SPACE) sequence for routine spinal imaging. Methods: 62 MRI examinations of the spine were evaluated by 2 examiners in consensus for the depiction of anatomy and presence of artefact. We noted pathologies that might be missed using the SPACE sequence only or the SPACE and a sagittal T1 weighted sequence. The reference standards were sagittal and axial T1 weighted and T2 weighted sequences. At a later date the evaluation was repeated by one of the original examiners and an additional examiner. Results: There was good agreement of the single evaluations and consensus evaluation for the conventional sequences: κ>0.8, confidence interval (CI)>0.6–1.0. For the SPACE sequence, depiction of anatomy was very good for 84% of cases, with high interobserver agreement, but there was poor interobserver agreement for other cases. For artefact assessment of SPACE, κ=0.92, CI=0.92–1.0. The SPACE sequence was superior to conventional sequences for depiction of anatomy and artefact resistance. The SPACE sequence occasionally missed bone marrow oedema. In conjunction with sagittal T1 weighted sequences, no abnormality was missed. The isotropic SPACE sequence was superior to conventional sequences in imaging difficult anatomy such as in scoliosis and spondylolysis. Conclusion: The SPACE sequence allows excellent assessment of anatomy owing to high spatial resolution and resistance to artefact. The sensitivity for bone marrow abnormalities is limited. PMID:22374284
Defining a Conformational Consensus Motif in Cotransin-Sensitive Signal Sequences: A Proteomic and Site-Directed Mutagenesis Study

PubMed Central

Klein, Wolfgang; Westendorf, Carolin; Schmidt, Antje; Conill-Cortés, Mercè; Rutz, Claudia; Blohs, Marcus; Beyermann, Michael; Protze, Jonas; Krause, Gerd; Krause, Eberhard; Schülein, Ralf

2015-01-01

The cyclodepsipeptide cotransin was described to inhibit the biosynthesis of a small subset of proteins by a signal sequence-discriminatory mechanism at the Sec61 protein-conducting channel. However, it was not clear how selective cotransin is, i.e. how many proteins are sensitive. Moreover, a consensus motif in signal sequences mediating cotransin sensitivity has not yet been described. To address these questions, we performed a proteomic study using cotransin-treated human hepatocellular carcinoma cells and the stable isotope labelling by amino acids in cell culture technique in combination with quantitative mass spectrometry. We used a saturating concentration of cotransin (30 micromolar) to also identify less-sensitive proteins and to discriminate the latter from completely resistant proteins. We found that the biosynthesis of almost all secreted proteins was cotransin-sensitive under these conditions. In contrast, biosynthesis of the majority of the integral membrane proteins was cotransin-resistant. Cotransin sensitivity of signal sequences was related neither to their length nor to their hydrophobicity. Instead, in the case of signal anchor sequences, we identified for the first time a conformational consensus motif mediating cotransin sensitivity. PMID:25806945
Interactive web-based identification and visualization of transcript shared sequences.

PubMed

Azhir, Alaleh; Merino, Louis-Henri; Nauen, David W

2018-05-12

We have developed TraC (Transcript Consensus), a web-based tool for detecting and visualizing shared sequences among two or more mRNA transcripts such as splice variants. Results including exon-exon boundaries are returned in a highly intuitive, data-rich, interactive plot that permits users to explore the similarities and differences of multiple transcript sequences. The online tool (http://labs.pathology.jhu.edu/nauen/trac/) is free to use. The source code is freely available for download (https://github.com/nauenlab/TraC). Copyright © 2018 Elsevier Inc. All rights reserved.
From cheek swabs to consensus sequences: an A to Z protocol for high-throughput DNA sequencing of complete human mitochondrial genomes

PubMed Central

2014-01-01

Background: Next-generation DNA sequencing (NGS) technologies have made huge impacts in many fields of biological research, but especially in evolutionary biology. One area where NGS has shown potential is for high-throughput sequencing of complete mtDNA genomes (of humans and other animals). Despite the increasing use of NGS technologies and a better appreciation of their importance in answering biological questions, there remain significant obstacles to the successful implementation of NGS-based projects, especially for new users. Results: Here we present an ‘A to Z’ protocol for obtaining complete human mitochondrial (mtDNA) genomes – from DNA extraction to consensus sequence. Although designed for use on humans, this protocol could also be used to sequence small, organellar genomes from other species, and also nuclear loci. This protocol includes DNA extraction, PCR amplification, fragmentation of PCR products, barcoding of fragments, sequencing using the 454 GS FLX platform, and a complete bioinformatics pipeline (primer removal, reference-based mapping, output of coverage plots and SNP calling). Conclusions: All steps in this protocol are designed to be straightforward to implement, especially for researchers who are undertaking next-generation sequencing for the first time. The molecular steps are scalable to large numbers (hundreds) of individuals and all steps post-DNA extraction can be carried out in 96-well plate format. Also, the protocol has been assembled so that individual ‘modules’ can be swapped out to suit available resources. PMID:24460871
Bioinformatics prediction of siRNAs as potential antiviral agents against dengue viruses

PubMed Central

Villegas-Rosales, Paula M; Méndez-Tenorio, Alfonso; Ortega-Soto, Elizabeth; Barrón, Blanca L

2012-01-01

Dengue virus (DENV 1-4) represents the major emerging arthropod-borne viral infection in the world. Currently, there is neither an available vaccine nor a specific treatment. Hence, there is a need for antiviral drugs for these viral infections; we describe the prediction of short interfering RNAs (siRNAs) as potential therapeutic agents against the four DENV serotypes. Our strategy was to carry out a series of multiple alignments using the ClustalX program to find conserved sequences among the four DENV serotype genomes and to obtain a consensus sequence for siRNA design. A highly conserved sequence among the four DENV serotypes, located in the coding sequence for the NS4B and NS5 proteins, was found. A total of 2,893 complete DENV genomes were downloaded from the NCBI, and after a curation step to remove identical sequences, 220 complete DENV genomes were left. They were edited to select the NS4B and NS5 sequences, which were aligned to obtain a consensus sequence. Three different servers were used for siRNA design, and the resulting siRNAs were aligned to identify the most prevalent sequences. Three siRNAs were chosen: one targeted the genome region that encodes the NS4B protein and the other two targeted the region encoding the NS5 protein. Predicted secondary structure for DENV genomes was used to demonstrate that the siRNAs were able to target the viral genome, forming the double-stranded structures necessary to activate the RNA silencing machinery. PMID:22829722
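Deriving a consensus from an alignment and locating fully conserved stretches, as this abstract outlines, can be sketched as below. This is a minimal sketch, assuming the sequences are already aligned to equal length (for example by ClustalX); the function names, the `min_fraction` threshold and the toy alignment are hypothetical and do not reproduce the authors' procedure.

```python
from collections import Counter


def majority_consensus(alignment, min_fraction=0.5):
    """Majority-rule consensus; emits 'N' where no base exceeds min_fraction
    or where the most common symbol is a gap."""
    consensus = []
    for column in zip(*alignment):
        base, count = Counter(column).most_common(1)[0]
        if base != "-" and count / len(column) > min_fraction:
            consensus.append(base)
        else:
            consensus.append("N")
    return "".join(consensus)


def conserved_windows(alignment, width):
    """Start positions of windows in which every column is identical
    across all sequences and contains no gap."""
    invariant = [len(set(col)) == 1 and col[0] != "-" for col in zip(*alignment)]
    return [i for i in range(len(invariant) - width + 1) if all(invariant[i:i + width])]


# Hypothetical toy alignment, not DENV data
alignment = ["ATGGCTAAGC", "ATGGCTATGC", "ATGGCTAAGC", "TTGGCTACGC"]
print(majority_consensus(alignment))    # consensus string
print(conserved_windows(alignment, 6))  # candidate target start positions
```

For siRNA design the window width would be set to the intended siRNA length, and candidate windows would still need the secondary-structure check described in the abstract.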
BASIC PENTACYSTEINE Proteins Mediate MADS Domain Complex Binding to the DNA for Tissue-Specific Expression of Target Genes in Arabidopsis

PubMed Central

Simonini, Sara; Roig-Villanova, Irma; Gregis, Veronica; Colombo, Bilitis; Colombo, Lucia; Kater, Martin M.

2012-01-01

BASIC PENTACYSTEINE (BPC) transcription factors have been identified in a large variety of plant species. In Arabidopsis thaliana there are seven BPC genes, which, except for BPC5, are expressed ubiquitously. BPC genes are functionally redundant in a wide range of developmental processes. Recently, we reported that BPC1 binds to guanine and adenine (GA)–rich consensus sequences in the SEEDSTICK (STK) promoter in vitro and induces conformational changes. Here we show by chromatin immunoprecipitation experiments that in vivo BPCs also bind to the consensus boxes, and when these were mutated, expression from the STK promoter was derepressed, resulting in ectopic expression in the inflorescence. We also reveal that SHORT VEGETATIVE PHASE (SVP) is a direct regulator of STK. SVP is a floral meristem identity gene belonging to the MADS box gene family. The SVP-APETALA1 (AP1) dimer recruits the SEUSS (SEU)-LEUNIG (LUG) transcriptional cosuppressor to repress floral homeotic gene expression in the floral meristem. Interestingly, we found that GA consensus sequences in the STK promoter to which BPCs bind are essential for recruitment of the corepressor complex to this promoter. Our data suggest that we have identified a new regulatory mechanism controlling plant gene expression that is probably generally used, when considering BPCs’ wide expression profile and the frequent presence of consensus binding sites in plant promoters. PMID:23054472
The Role of Recombination in the Origin and Evolution of Alu Subfamilies

PubMed Central

Teixeira-Silva, Ana; Silva, Raquel M.; Carneiro, João; Amorim, António; Azevedo, Luísa

2013-01-01

Alus are the most abundant and successful short interspersed nuclear elements found in primate genomes. In humans, they represent about 10% of the genome, although few are retrotransposition-competent and are clustered into subfamilies according to the source gene from which they evolved. Recombination between them can lead to genomic rearrangements of clinical and evolutionary significance. In this study, we have addressed the role of recombination in the origin of chimeric Alu source genes by the analysis of all known consensus sequences of human Alus. From the allelic diversity of Alu consensus sequences, validated in extant elements resulting from whole genome searches, distinct events of recombination were detected in the origin of particular subfamilies of AluS and AluY source genes. These results demonstrate that at least two subfamilies are likely to have emerged from ectopic Alu-Alu recombination, which stimulates further research regarding the potential of chimeric active Alus to punctuate the genome. PMID:23750218
Using multi-locus allelic sequence data to estimate genetic divergence among four Lilium (Liliaceae) cultivars

PubMed Central

Shahin, Arwa; Smulders, Marinus J. M.; van Tuyl, Jaap M.; Arens, Paul; Bakker, Freek T.

2014-01-01

Next Generation Sequencing (NGS) may enable estimating relationships among genotypes using allelic variation of multiple nuclear genes simultaneously. We explored the potential and caveats of this strategy in four genetically distant Lilium cultivars to estimate their genetic divergence from transcriptome sequences using three approaches: POFAD (Phylogeny of Organisms from Allelic Data, uses allelic information of sequence data), RAxML (Randomized Accelerated Maximum Likelihood, tree building based on concatenated consensus sequences) and Consensus Network (constructing a network summarizing among gene tree conflicts). Twenty six gene contigs were chosen based on the presence of orthologous sequences in all cultivars, seven of which also had an orthologous sequence in Tulipa, used as out-group. The three approaches generated the same topology. Although the resolution offered by these approaches is high, in this case there was no extra benefit in using allelic information. We conclude that these 26 genes can be widely applied to construct a species tree for the genus Lilium. PMID:25368628
Predicting the transmembrane secondary structure of ligand-gated ion channels.

PubMed

Bertaccini, E; Trudell, J R

2002-06-01

Recent mutational analyses of ligand-gated ion channels (LGICs) have demonstrated a plausible site of anesthetic action within their transmembrane domains. Although there is a consensus that the transmembrane domain is formed from four membrane-spanning segments, the secondary structure of these segments is not known. We utilized 10 state-of-the-art bioinformatics techniques to predict the transmembrane topology of the tetrameric regions within six members of the LGIC family that are relevant to anesthetic action. They are the human forms of the GABA alpha 1 receptor, the glycine alpha 1 receptor, the 5HT3 serotonin receptor, the nicotinic AChR alpha 4 and alpha 7 receptors and the Torpedo nAChR alpha 1 receptor. The algorithms utilized were HMMTOP, TMHMM, TMPred, PHDhtm, DAS, TMFinder, SOSUI, TMAP, MEMSAT and TOPPred2. The resulting predictions were superimposed on to a multiple sequence alignment of the six amino acid sequences created using the CLUSTAL W algorithm. There was a clear statistical consensus for the presence of four alpha helices in those regions experimentally thought to span the membrane. The consensus of 10 topology prediction techniques supports the hypothesis that the transmembrane subunits of the LGICs are tetrameric bundles of alpha helices.
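Combining several topology predictors into a consensus can be sketched as a per-residue vote. This is a minimal sketch, assuming each predictor's output has been reduced to a string of the same length as the protein with 'H' for a predicted membrane-spanning helix residue and '-' otherwise; the function name, the `min_votes` threshold and the toy predictions are hypothetical, and the published analysis superimposed predictions on a multiple sequence alignment rather than voting in this way.

```python
def consensus_segments(predictions, min_votes):
    """Return (start, end) index pairs (0-based, inclusive) of residue runs
    that at least min_votes predictors label as helix ('H')."""
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    votes = [sum(p[i] == "H" for p in predictions) for i in range(length)]
    segments, start = [], None
    for i, v in enumerate(votes):
        if v >= min_votes and start is None:
            start = i                        # a consensus run begins
        elif v < min_votes and start is not None:
            segments.append((start, i - 1))  # the run ended at the previous residue
            start = None
    if start is not None:
        segments.append((start, length - 1))
    return segments


# Hypothetical toy output from three predictors
predictions = ["--HHHH---HHH", "-HHHHH---HH-", "--HHH----HHH"]
print(consensus_segments(predictions, min_votes=2))
```

Raising `min_votes` toward the number of predictors trades sensitivity for agreement, which is one simple way to express how strong a consensus is.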
Quantitative mutant analysis of viral quasispecies by chip-based matrix-assisted laser desorption/ ionization time-of-flight mass spectrometry

PubMed Central

Amexis, Georgios; Oeth, Paul; Abel, Kenneth; Ivshina, Anna; Pelloquin, Francois; Cantor, Charles R.; Braun, Andreas; Chumakov, Konstantin

2001-01-01

RNA viruses exist as quasispecies, heterogeneous and dynamic mixtures of mutants having one or more consensus sequences. An adequate description of the genomic structure of such viral populations must include the consensus sequence(s) plus a quantitative assessment of sequence heterogeneities. For example, in quality control of live attenuated viral vaccines, the presence of even small quantities of mutants or revertants may indicate incomplete or unstable attenuation that may influence vaccine safety. Previously, we demonstrated the monitoring of oral poliovirus vaccine with the use of mutant analysis by PCR and restriction enzyme cleavage (MAPREC). In this report, we investigate genetic variation in live attenuated mumps virus vaccine by using both MAPREC and a platform (DNA MassArray) based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. Mumps vaccines prepared from the Jeryl Lynn strain typically contain at least two distinct viral substrains, JL1 and JL2, which have been characterized by full length sequencing. We report the development of assays for characterizing sequence variants in these substrains and demonstrate their use in quantitative analysis of substrains and sequence variations in mixed virus cultures and mumps vaccines. The results obtained from both the MAPREC and MALDI-TOF methods showed excellent correlation. This suggests the potential utility of MALDI-TOF for routine quality control of live viral vaccines and for assessment of genetic stability and quantitative monitoring of genetic changes in other RNA viruses of clinical interest. PMID:11593021
Single molecule sequencing of the M13 virus genome without amplification

PubMed Central

Zhao, Luyang; Deng, Liwei; Li, Gailing; Jin, Huan; Cai, Jinsen; Shang, Huan; Li, Yan; Wu, Haomin; Xu, Weibin; Zeng, Lidong; Zhang, Renli; Zhao, Huan; Wu, Ping; Zhou, Zhiliang; Zheng, Jiao; Ezanno, Pierre; Yang, Andrew X.; Yan, Qin; Deem, Michael W.; He, Jiankui

2017-01-01

Next generation sequencing (NGS) has revolutionized life sciences research. However, GC bias and costly, time-intensive library preparation make NGS an ill fit for increasing sequencing demands in the clinic. A new class of third-generation sequencing platforms has arrived to meet this need, capable of directly measuring DNA and RNA sequences at the single-molecule level without amplification. Here, we use the new GenoCare single-molecule sequencing platform from Direct Genomics to sequence the genome of the M13 virus. Our platform detects single-molecule fluorescence by total internal reflection microscopy, with sequencing-by-synthesis chemistry. We sequenced the genome of M13 to a depth of 316x, with 100% coverage. We determined a consensus sequence accuracy of 100%. In contrast to GC bias inherent to NGS results, we demonstrated that our single-molecule sequencing method yields minimal GC bias. PMID:29253901
Integration of next-generation sequencing in clinical diagnostic molecular pathology laboratories for analysis of solid tumours; an expert opinion on behalf of IQN Path ASBL.

PubMed

Deans, Zandra C; Costa, Jose Luis; Cree, Ian; Dequeker, Els; Edsjö, Anders; Henderson, Shirley; Hummel, Michael; Ligtenberg, Marjolijn Jl; Loddo, Marco; Machado, Jose Carlos; Marchetti, Antonio; Marquis, Katherine; Mason, Joanne; Normanno, Nicola; Rouleau, Etienne; Schuuring, Ed; Snelson, Keeda-Marie; Thunnissen, Erik; Tops, Bastiaan; Williams, Gareth; van Krieken, Han; Hall, Jacqueline A

2017-01-01

The clinical demand for mutation detection within multiple genes from a single tumour sample requires molecular diagnostic laboratories to develop rapid, high-throughput, highly sensitive, accurate and parallel testing within tight budget constraints. To meet this demand, many laboratories employ next-generation sequencing (NGS) based on small amplicons. Building on existing publications and general guidance for the clinical use of NGS and learnings from germline testing, the following guidelines establish consensus standards for somatic diagnostic testing, specifically for identifying and reporting mutations in solid tumours. These guidelines cover the testing strategy, implementation of testing within clinical service, sample requirements, data analysis and reporting of results. In conjunction with appropriate staff training and international standards for laboratory testing, these consensus standards for the use of NGS in molecular pathology of solid tumours will assist laboratories in implementing NGS in clinical services.
Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data.

PubMed

Chin, Chen-Shan; Alexander, David H; Marks, Patrick; Klammer, Aaron A; Drake, James; Heiner, Cheryl; Clum, Alicia; Copeland, Alex; Huddleston, John; Eichler, Evan E; Turner, Stephen W; Korlach, Jonas

2013-06-01

We present a hierarchical genome-assembly process (HGAP) for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. Our method uses the longest reads as seeds to recruit all other reads for construction of highly accurate preassembled reads through a directed acyclic graph-based consensus procedure, which we follow with assembly using off-the-shelf long-read assemblers. In contrast to hybrid approaches, HGAP does not require highly accurate raw reads for error correction. We demonstrate efficient genome assembly for several microorganisms using as few as three SMRT Cell zero-mode waveguide arrays of sequencing and for BACs using just one SMRT Cell. Long repeat regions can be successfully resolved with this workflow. We also describe a consensus algorithm that incorporates SMRT sequencing primary quality values to produce de novo genome sequence exceeding 99.999% accuracy.
Consensus-Degenerate Hybrid Oligonucleotide Primers for Amplification of Priming Glycosyltransferase Genes of the Exopolysaccharide Locus in Strains of the Lactobacillus casei Group

PubMed Central

Provencher, Cathy; LaPointe, Gisèle; Sirois, Stéphane; Van Calsteren, Marie-Rose; Roy, Denis

2003-01-01

A primer design strategy named CODEHOP (consensus-degenerate hybrid oligonucleotide primer) for amplification of distantly related sequences was used to detect the priming glycosyltransferase (GT) gene in strains of the Lactobacillus casei group. Each hybrid primer consisted of a short 3′ degenerate core based on four highly conserved amino acids and a longer 5′ consensus clamp region based on six sequences of the priming GT gene products from exopolysaccharide (EPS)-producing bacteria. The hybrid primers were used to detect the priming GT gene of 44 commercial isolates and reference strains of Lactobacillus rhamnosus, L. casei, Lactobacillus zeae, and Streptococcus thermophilus. The priming GT gene was detected in the genome of both non-EPS-producing (EPS−) and EPS-producing (EPS+) strains of L. rhamnosus. The sequences of the cloned PCR products were similar to those of the priming GT gene of various gram-negative and gram-positive EPS+ bacteria. Specific primers designed from the L. rhamnosus RW-9595M GT gene were used to sequence the end of the priming GT gene in selected EPS+ strains of L. rhamnosus. Phylogenetic analysis revealed that Lactobacillus spp. form a distinctive group apart from other lactic acid bacteria for which GT genes have been characterized to date. Moreover, the sequences show a divergence existing among strains of L. rhamnosus with respect to the terminal region of the priming GT gene. Thus, the PCR approach with consensus-degenerate hybrid primers designed with CODEHOP is a practical approach for the detection of similar genes containing conserved motifs in different bacterial genomes. PMID:12788729
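The two parts of a hybrid primer described here, a degenerate 3′ core and a 5′ consensus clamp, can be illustrated with a short calculation. This is a minimal sketch and not the CODEHOP program: it counts the codon combinations that could encode a conserved amino acid core using the standard genetic code, and takes the most frequent nucleotide per column as a clamp. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from math import prod

# Number of codons per amino acid in the standard genetic code
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def core_degeneracy(core_peptide):
    """Number of distinct codon combinations that encode the core peptide."""
    return prod(CODON_COUNTS[aa] for aa in core_peptide)


def consensus_clamp(aligned_dna):
    """Most frequent nucleotide in each column of equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned_dna))


# Hypothetical toy inputs, not the priming GT gene sequences
print(core_degeneracy("WDMK"))  # a core rich in Trp/Met keeps degeneracy low
print(core_degeneracy("LSRG"))  # Leu/Ser/Arg make the core highly degenerate
print(consensus_clamp(["GGTACC", "GGTACA", "GGAACC"]))
```

Keeping the degenerate part short is what holds the primer pool small, while the non-degenerate clamp stabilises annealing in later PCR cycles.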

Detection and Analysis of Six Lizard Adenoviruses by Consensus Primer PCR Provides Further Evidence of a Reptilian Origin for the Atadenoviruses

PubMed Central

Wellehan, James F. X.; Johnson, April J.; Harrach, Balázs; Benkö, Mária; Pessier, Allan P.; Johnson, Calvin M.; Garner, Michael M.; Childress, April; Jacobson, Elliott R.

2004-01-01

A consensus nested-PCR method was designed for investigation of the DNA polymerase gene of adenoviruses. Gene fragments were amplified and sequenced from six novel adenoviruses from seven lizard species, including four species from which adenoviruses had not previously been reported. Host species included Gila monster, leopard gecko, fat-tail gecko, blue-tongued skink, Tokay gecko, bearded dragon, and mountain chameleon. This is the first sequence information from lizard adenoviruses. Phylogenetic analysis indicated that these viruses belong to the genus Atadenovirus, supporting the reptilian origin of atadenoviruses. This PCR method may be useful for obtaining templates for initial sequencing of novel adenoviruses. PMID:15542689
Detection and analysis of six lizard adenoviruses by consensus primer PCR provides further evidence of a reptilian origin for the atadenoviruses.

PubMed

Wellehan, James F X; Johnson, April J; Harrach, Balázs; Benkö, Mária; Pessier, Allan P; Johnson, Calvin M; Garner, Michael M; Childress, April; Jacobson, Elliott R

2004-12-01

A consensus nested-PCR method was designed for investigation of the DNA polymerase gene of adenoviruses. Gene fragments were amplified and sequenced from six novel adenoviruses from seven lizard species, including four species from which adenoviruses had not previously been reported. Host species included Gila monster, leopard gecko, fat-tail gecko, blue-tongued skink, Tokay gecko, bearded dragon, and mountain chameleon. This is the first sequence information from lizard adenoviruses. Phylogenetic analysis indicated that these viruses belong to the genus Atadenovirus, supporting the reptilian origin of atadenoviruses. This PCR method may be useful for obtaining templates for initial sequencing of novel adenoviruses.
Identification and application of self-binding zipper-like sequences in SARS-CoV spike protein.

PubMed

Zhang, Si Min; Liao, Ying; Neo, Tuan Ling; Lu, Yanning; Liu, Ding Xiang; Vahlne, Anders; Tam, James P

2018-05-22

Self-binding peptides containing zipper-like sequences, such as the Leu/Ile zipper sequence within the coiled coil regions of proteins and the cross-β spine steric zippers within the amyloid-like fibrils, could bind to the protein-of-origin through homophilic sequence-specific zipper motifs. These self-binding sequences represent opportunities for the development of biochemical tools and/or therapeutics. Here, we report on the identification of a putative self-binding β-zipper-forming peptide within the severe acute respiratory syndrome-associated coronavirus spike (S) protein and its application in viral detection. Peptide array scanning of overlapping peptides covering the entire length of S protein identified 34 putative self-binding peptides of six clusters, five of which contained octapeptide core consensus sequences. The Cluster I consensus octapeptide sequence GINITNFR was predicted by the Eisenberg's 3D profile method to have high amyloid-like fibrillation potential through steric β-zipper formation. Peptide C6 containing the Cluster I consensus sequence was shown to oligomerize and form amyloid-like fibrils. Taking advantage of this, C6 was further applied to detect the S protein expression in vitro by fluorescence staining. Meanwhile, the coiled-coil-forming Leu/Ile heptad repeat sequences within the S protein were under-represented during peptide array scanning, in agreement with that long peptide lengths were required to attain high helix-mediated interaction avidity. The data suggest that short β-zipper-like self-binding peptides within the S protein could be identified through combining the peptide scanning and predictive methods, and could be exploited as biochemical detection reagents for viral infection. Copyright © 2018. Published by Elsevier Ltd.
Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.

1987-01-01

The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with [α-³²P]dCTP (3000 Ci/mmol) or [α-³⁵S]dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.
Genotypic and Functional Impact of HIV-1 Adaptation to Its Host Population during the North American Epidemic

PubMed Central

Carlson, Jonathan M.; Chan, Benjamin; Chopera, Denis R.; Brumme, Chanson J.; Markle, Tristan J.; Martin, Eric; Shahid, Aniqa; Anmole, Gursev; Mwimanzi, Philip; Nassab, Pauline; Penney, Kali A.; Rahman, Manal A.; Milloy, M.-J.; Schechter, Martin T.; Markowitz, Martin; Carrington, Mary; Walker, Bruce D.; Wagner, Theresa; Buchbinder, Susan; Fuchs, Jonathan; Koblin, Beryl; Mayer, Kenneth H.; Harrigan, P. Richard; Brockman, Mark A.; Poon, Art F. Y.; Brumme, Zabrina L.

2014-01-01

HLA-restricted immune escape mutations that persist following HIV transmission could gradually spread through the viral population, thereby compromising host antiviral immunity as the epidemic progresses. To assess the extent and phenotypic impact of this phenomenon in an immunogenetically diverse population, we genotypically and functionally compared linked HLA and HIV (Gag/Nef) sequences from 358 historic (1979–1989) and 382 modern (2000–2011) specimens from four key cities in the North American epidemic (New York, Boston, San Francisco, Vancouver). Inferred HIV phylogenies were star-like, with approximately two-fold greater mean pairwise distances in modern versus historic sequences. The reconstructed epidemic ancestral (founder) HIV sequence was essentially identical to the North American subtype B consensus. Consistent with gradual diversification of a “consensus-like” founder virus, the median “background” frequencies of individual HLA-associated polymorphisms in HIV (in individuals lacking the restricting HLA[s]) were ∼2-fold higher in modern versus historic HIV sequences, though these remained notably low overall (e.g. in Gag, medians were 3.7% in the 2000s versus 2.0% in the 1980s). HIV polymorphisms exhibiting the greatest relative spread were those restricted by protective HLAs. Despite these increases, when HIV sequences were analyzed as a whole, their total average burden of polymorphisms that were “pre-adapted” to the average host HLA profile was only ∼2% greater in modern versus historic eras. Furthermore, HLA-associated polymorphisms identified in historic HIV sequences were consistent with those detectable today, with none identified that could explain the few HIV codons where the inferred epidemic ancestor differed from the modern consensus. Results are therefore consistent with slow HIV adaptation to HLA, but at a rate unlikely to yield imminent negative implications for cellular immunity, at least in North America. 
Intriguingly, temporal changes in protein activity of patient-derived Nef (though not Gag) sequences were observed, suggesting functional implications of population-level HIV evolution on certain viral proteins. PMID:24762668
Discovering weighted patterns in intron sequences using self-adaptive harmony search and back-propagation algorithms.

PubMed

Huang, Yin-Fu; Wang, Chia-Ming; Liou, Sing-Wu

2013-01-01

A hybrid self-adaptive harmony search and back-propagation mining system was proposed to discover weighted patterns in human intron sequences. By testing the weights under a lazy nearest neighbor classifier, the numerical results revealed the significance of these weighted patterns. Comparing these weighted patterns with the popular intron consensus model, it is clear that the discovered weighted patterns make the originally ambiguous 5SS and 3SS header patterns more specific and concrete.
Discovering Weighted Patterns in Intron Sequences Using Self-Adaptive Harmony Search and Back-Propagation Algorithms

PubMed Central

Wang, Chia-Ming; Liou, Sing-Wu

2013-01-01

A hybrid self-adaptive harmony search and back-propagation mining system was proposed to discover weighted patterns in human intron sequences. By testing the weights under a lazy nearest neighbor classifier, the numerical results revealed the significance of these weighted patterns. Comparing these weighted patterns with the popular intron consensus model, it is clear that the discovered weighted patterns make the originally ambiguous 5SS and 3SS header patterns more specific and concrete. PMID:23737711
A Bioinformatics-Based Alternative mRNA Splicing Code that May Explain Some Disease Mutations Is Conserved in Animals.

PubMed

Qu, Wen; Cingolani, Pablo; Zeeberg, Barry R; Ruden, Douglas M

2017-01-01

Deep sequencing of cDNAs made from spliced mRNAs indicates that most coding genes in many animals and plants have pre-mRNA transcripts that are alternatively spliced. In pre-mRNAs, in addition to invariant exons that are present in almost all mature mRNA products, there are at least 6 additional types of exons, such as exons from alternative promoters or with alternative polyA sites, mutually exclusive exons, skipped exons, or exons with alternative 5' or 3' splice sites. Our bioinformatics-based hypothesis is that, in analogy to the genetic code, there is an "alternative-splicing code" in introns and flanking exon sequences that directs alternative splicing of many of the 36 types of introns. In humans, we identified 42 different consensus sequences that are each present in at least 100 human introns. 37 of the 42 top consensus sequences are significantly enriched or depleted in at least one of the 36 types of introns. We further supported our hypothesis by showing that 96 out of 96 analyzed human disease mutations that affect RNA splicing, and change alternative splicing from one class to another, can be partially explained by a mutation altering a consensus sequence from one type of intron to that of another type of intron. Some of the alternative splicing consensus sequences, and presumably their small-RNA or protein targets, are evolutionarily conserved in 50 species, from plants to animals. We also noticed that the set of introns within a gene usually shares the same splicing codes, thus arguing that one sub-type of spliceosome might process all (or most) of the introns in a given gene. Our work sheds new light on a possible mechanism for generating the tremendous diversity in protein structure by alternative splicing of pre-mRNAs.
Molecular detection of Sarcocystis lutrae in the European badger (Meles meles) in Scotland.

PubMed

Lepore, T; Bartley, P M; Chianini, F; Macrae, A I; Innes, E A; Katzer, F

2017-09-01

Neck samples from 54 badgers and 32 tongue samples of the same badgers (Meles meles), collected in the Lothians and Borders regions of Scotland, were tested using polymerase chain reactions (PCRs) directed against the 18S ribosomal DNA and the internal transcribed spacer (ITS1) region of protozoan parasites of the family Sarcocystidae. Positive results were obtained from 36/54 (67%) neck and 24/32 (75%) tongue samples using an 18S rDNA PCR. A 468 base pair consensus sequence that was generated from the 18S rDNA PCR amplicons (KX229728) showed 100% identity to Sarcocystis lutrae. The ITS1 PCR results revealed that 12/20 (60%) neck and 10/20 (50%) tongue samples were positive for Sarcocystidae DNA. A 1074 bp consensus sequence was generated from the ITS1 PCR amplicons (KX431307) and showed 100% identity to S. lutrae. Multiple sequence alignments and phylogenetic analysis support the finding that the rDNA found in badgers is identical to that of S. lutrae. This parasite has not been previously reported in badgers or in the UK. Sarcocystis lutrae has previously only been detected in tongue, skeletal muscle and diaphragm samples of the Eurasian otter (Lutra lutra) in Norway and potentially in the Arctic fox (Vulpes lagopus).
Cloning, sequencing and characterization of lipase genes from a polyhydroxyalkanoate- (PHA-) synthesizing Pseudomonas resinovorans

USDA-ARS's Scientific Manuscript database

Lipase (lip) and lipase-specific foldase (lif) genes of a biodegradable polyhydroxyalkanoate- (PHA-) synthesizing Pseudomonas resinovorans NRRL B-2649 were cloned using primers based on consensus sequences, followed by PCR-based genome walking. Sequence analyses showed a putative Lip gene-product (...
Insights Into Upland Cotton (Gossypium hirsutum L.) Genetic Recombination Based on 3 High-Density Single-Nucleotide Polymorphism and a Consensus Map Developed Independently With Common Parents.

PubMed

Ulloa, Mauricio; Hulse-Kemp, Amanda M; De Santiago, Luis M; Stelly, David M; Burke, John J

2017-01-01

High-density linkage maps are vital to supporting the correct placement of scaffolds and gene sequences on chromosomes and fundamental to contemporary organismal research and scientific approaches to genetic improvement, especially in paleopolyploids with exceptionally complex genomes, e.g., upland cotton (Gossypium hirsutum L., 2n = 52). Three independently developed intraspecific upland mapping populations were analyzed to generate 3 high-density genetic linkage single-nucleotide polymorphism (SNP) maps and a consensus map using the CottonSNP63K array. The populations consisted of a previously reported F2, a recombinant inbred line (RIL), and reciprocal RIL population, from "Phytogen 72" and "Stoneville 474" cultivars. The cluster file provided 7417 genotyped SNP markers, resulting in 26 linkage groups corresponding to the 26 chromosomes (c) of the allotetraploid upland cotton (AD)1, which arose from the merging of 2 genomes ("A" Old World and "D" New World). Patterns of chromosome-specific recombination were largely consistent across mapping populations. The high-density genetic consensus map included 7244 SNP markers that spanned 3538 cM and comprised 3824 SNP bins, of which 1783 and 2041 were in the At and Dt subgenomes with 1825 and 1713 cM map lengths, respectively. Subgenome average distances were nearly identical, indicating that subgenomic differences in bin number arose due to the high numbers of SNPs on the Dt subgenome. Examination of expected recombination frequency or crossovers (COs) on the chromosomes within each population of the 2 subgenomes revealed that COs were also not affected by the SNPs or SNP bin number in these subgenomes. Comparative alignment analyses identified historical ancestral At-subgenomic translocations of c02 and c03, as well as of c04 and c05. The consensus map SNP sequences aligned with high congruency to the NBI assembly of Gossypium hirsutum.
However, the genomic comparisons revealed evidence of additional unconfirmed possible duplications, inversions and translocations, and unbalanced SNP sequence homology or SNP sequence/loci genomic dominance, or homeolog loci bias of the upland tetraploid At and Dt subgenomes. The alignments indicated that 364 SNP-associated previously unintegrated scaffolds can be placed in pseudochromosomes of the NBI G. hirsutum assembly. This is the first intraspecific SNP genetic linkage consensus map assembled in G. hirsutum with a core of reproducible Mendelian SNP markers assayed on different populations and it provides further knowledge of chromosome arrangement of genic and nongenic SNPs. Together, the consensus map and RIL populations provide a synergistically useful platform for localizing and identifying agronomically important loci for improvement of the cotton crop.
Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets

PubMed Central

2010-01-01

Background Natively unfolded proteins lack a well defined three dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. Establishing that a given protein is natively unfolded requires laborious experimental investigations, so reliable sequence-only methods for predicting whether a sequence corresponds to a folded or to an unfolded protein are of interest in fundamental and applied studies. Many proteins have amino acid compositions compatible with both the folded and unfolded status, and belong to a twilight zone between order and disorder. This makes a dichotomic classification of protein sequences into folded and natively unfolded ones difficult. In this work we propose an operational method to identify proteins belonging to the twilight zone by combining well-performing single predictors of folding into a consensus score. Results In this methodological paper dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, called here gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets, in particular on a new dataset composed of 2369 folded and 81 natively unfolded proteins. Poodle-W, gVSL2 and mean pairwise energy have good performance and stability in all the datasets considered and are combined into a strictly unanimous combination score SSU, which leaves proteins unclassified when the consensus of all combined indexes is not reached. The unclassified proteins: i) belong to an overlap region in the vector space of amino acid compositions occupied by both folded and unfolded proteins; ii) are composed of approximately the same number of order-promoting and disorder-promoting amino acids; iii) have a mean flexibility intermediate between that of folded and that of unfolded proteins.
Conclusions Our results show that proteins unclassified by SSU belong to a twilight zone. Proteins left unclassified by the consensus score SSU have physical properties intermediate between those of folded and those of natively unfolded proteins, and their structural properties and evolutionary history are worth investigating. PMID:20409339
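The strictly unanimous combination described in this abstract can be sketched as a small decision rule: a protein receives a class only when every combined index agrees, and is otherwise left unclassified. The function name `unanimous_consensus` and the boolean encoding (True for folded, False for natively unfolded) are assumptions made for illustration; the individual predictors themselves are not reproduced here.

```python
from typing import Optional, Sequence


def unanimous_consensus(votes: Sequence[bool]) -> Optional[bool]:
    """Return the shared call when all predictors agree, else None.

    Assumed encoding: True = folded, False = natively unfolded,
    None = left unclassified (the "twilight zone" case).
    """
    if not votes:
        raise ValueError("at least one predictor vote is required")
    first = votes[0]
    return first if all(vote == first for vote in votes) else None


if __name__ == "__main__":
    # Hypothetical calls from three indexes for three proteins.
    print(unanimous_consensus([True, True, True]))     # all agree: folded
    print(unanimous_consensus([False, False, False]))  # all agree: unfolded
    print(unanimous_consensus([True, False, True]))    # disagreement: None
```

Returning None rather than falling back to a majority is the point of the design: disagreement among predictors is treated as information rather than noise.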
Presence of a consensus DNA motif at nearby DNA sequence of the mutation susceptible CG nucleotides.

PubMed

Chowdhury, Kaushik; Kumar, Suresh; Sharma, Tanu; Sharma, Ankit; Bhagat, Meenakshi; Kamai, Asangla; Ford, Bridget M; Asthana, Shailendra; Mandal, Chandi C

2018-01-10

Complexity in tissues affected by cancer arises from somatic mutations and epigenetic modifications in the genome. The mutation susceptible hotspots present within the genome indicate a non-random nature and/or a position specific selection of mutation. An association exists between the occurrence of mutations and epigenetic DNA methylation. This study is primarily aimed at determining mutation status, and identifying a signature for predicting mutation prone zones of tumor suppressor (TS) genes. Nearby sequences from the top five positions having a higher mutation frequency in each gene of 42 TS genes were selected from the COSMIC database and were considered as mutation prone zones. The conserved motifs present in the mutation prone DNA fragments were identified. Molecular docking studies were done to determine putative interactions between the identified conserved motifs and the methyltransferase enzyme DNMT1. Collective analysis of 42 TS genes found GC as the most commonly replaced and AT as the most commonly formed residues after mutation. Analysis of the top 5 mutated positions of each gene (210 DNA segments for 42 TS genes) identified that CG nucleotides of the amino acid codons (e.g., Arginine) are most susceptible to mutation, and found a consensus DNA "T/AGC/GAGGA/TG" sequence present in these mutation prone DNA segments. Similar to TS genes, analysis of 54 oncogenes not only found CG nucleotides of the amino acid Arg as the most susceptible to mutation, but also identified the presence of similar consensus DNA motifs in the mutation prone DNA fragments (270 DNA segments for 54 oncogenes) of oncogenes. Docking studies indicated that, when DNMT1 binds to and methylates this consensus DNA motif (C residues of CpG islands), mutation is likely to occur.
Thus, this study proposes that DNMT1 mediated methylation in chromosomal DNA may decrease if a foreign DNA segment containing this consensus sequence along with CG nucleotides is exogenously introduced to dividing cancer cells. Copyright © 2017 Elsevier B.V. All rights reserved.
Information Performances and Illative Sequences: Sequential Organization of Explanations of Chemical Phase Equilibrium

ERIC Educational Resources Information Center

Brown, Nathaniel James Swanton

2009-01-01

While there is consensus that conceptual change is surprisingly difficult, many competing theories of conceptual change co-exist in the literature. This dissertation argues that this discord is partly the result of an inadequate account of the unwritten rules of human social interaction that underlie the field's preferred…
A consensus linkage map of lentil based on DArT markers from three RIL mapping populations.

PubMed

Ates, Duygu; Aldemir, Secil; Alsaleh, Ahmad; Erdogmus, Semih; Nemli, Seda; Kahriman, Abdullah; Ozkan, Hakan; Vandenberg, Albert; Tanyolac, Bahattin

2018-01-01

Lentil (Lens culinaris ssp. culinaris Medikus) is a diploid (2n = 2x = 14), self-pollinating grain legume with a haploid genome size of about 4 Gbp and is grown throughout the world with current annual production of 4.9 million tonnes. A consensus map of lentil (Lens culinaris ssp. culinaris Medikus) was constructed using three different lentil recombinant inbred line (RIL) populations, including "CDC Redberry" x "ILL7502" (LR8), "ILL8006" x "CDC Milestone" (LR11) and "PI320937" x "Eston" (LR39). The lentil consensus map was composed of 9,793 DArT markers, covered a total of 977.47 cM with an average distance of 0.10 cM between adjacent markers, and formed 7 linkage groups representing the 7 chromosomes of the lentil genome. The consensus map had no gap larger than 12.67 cM and only 5 gaps were found to be between 12.67 cM and 6.0 cM (on LG3 and LG4). The localization of the SNP markers on the lentil consensus map was in general consistent with their localization on the three individual genetic linkage maps, and the lentil consensus map has a longer map length, higher marker density and shorter average distance between adjacent markers compared to the component linkage maps. This high-density consensus map could provide insight into the lentil genome. The consensus map could also help to construct a physical map using a Bacterial Artificial Chromosome library and map based cloning studies. Sequence information of DArT may help localization of orientation scaffolds from Next Generation Sequencing data.
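The reported average spacing can be checked from the figures in this abstract. The sketch below assumes that each of the 7 linkage groups contributes one fewer interval than it has markers; the helper name `mean_marker_spacing` is hypothetical, and the authors may have computed the average differently.

```python
def mean_marker_spacing(total_cm: float, n_markers: int, n_groups: int) -> float:
    """Mean distance between adjacent markers, in cM.

    Assumption: a linkage group with k markers has k - 1 intervals,
    so the whole map has n_markers - n_groups intervals.
    """
    intervals = n_markers - n_groups
    if intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return total_cm / intervals


if __name__ == "__main__":
    # Figures taken from the abstract: 977.47 cM, 9,793 markers, 7 groups.
    spacing = mean_marker_spacing(977.47, 9793, 7)
    print(f"{spacing:.4f} cM between adjacent markers")
```

Under this assumption the result rounds to the 0.10 cM quoted in the abstract.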
Definition of a consensus DNA-binding site for PecS, a global regulator of virulence gene expression in Erwinia chrysanthemi and identification of new members of the PecS regulon.

PubMed

Rouanet, Carine; Reverchon, Sylvie; Rodionov, Dmitry A; Nasser, William

2004-07-16

In Erwinia chrysanthemi, production of pectic enzymes is modulated by a complex network involving several regulators. One of them, PecS, which belongs to the MarR family, also controls the synthesis of various other virulence factors, such as cellulases and indigoidine. Here, the PecS consensus-binding site is defined by combining a systematic evolution of ligands by an exponential enrichment approach and mutational analyses. The consensus consists of a 23-base pair palindromic-like sequence (C(-11)G(-10)A(-9)N(-8)W(-7)T(-6)C(-5)G(-4)T(-3)A(-2))T(-1)A(0)T(1)(T(2)A(3)C(4)G(5)A(6)N(7)N(8)N(9)C(10)G(11)). Mutational experiments revealed that (i) the palindromic organization is required for the binding of PecS, (ii) the very conserved part of the consensus (-6 to 6) allows for a specific interaction with PecS, but the presence of the relatively degenerated bases located apart significantly increases PecS affinity, (iii) the four bases G, A, T, and C are required for efficient binding of PecS, and (iv) the presence of several binding sites on the same promoter increases the affinity of PecS. This consensus is detected in the regions involved in PecS binding on the previously characterized target genes. This variable consensus is in agreement with the observation that the members of the MarR family are able to bind various DNA targets as dimers by means of a winged helix DNA-binding motif. Binding of PecS on a promoter region containing the defined consensus results in a repression of gene transcription in vitro. Preliminary scanning of the E. chrysanthemi genome sequence with the consensus revealed the presence of strong PecS-binding sites in the intergenic region between fliE and fliFGHIJKLMNOPQR which encode proteins involved in the biogenesis of flagellum. Accordingly, PecS directly represses fliE expression. Thus, PecS seems to control the synthesis of virulence factors required for the key steps of plant infection.
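Scanning a sequence with a degenerate consensus of this kind can be sketched by translating IUPAC codes into a regular expression. The 23-base string below is transcribed from positions -11 to 11 of the consensus given in the abstract; the function names are hypothetical, and exact matching on one strand is a simplifying assumption (the scoring actually used for the genome scan is not described here).

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Consensus from the abstract, positions -11 .. 11 written as one string.
PECS_CONSENSUS = "CGANWTCGTATATTACGANNNCG"


def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif into a regular-expression pattern."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_sites(sequence: str, motif: str = PECS_CONSENSUS):
    """Return (0-based start, matched text) for every exact motif match.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up test sequence containing one instance of the consensus.
    toy = "TT" + "CGATATCGTATATTACGAGGGCG" + "AA"
    print(find_sites(toy))
```

Because the abstract notes that degenerate flanking bases still contribute to affinity, a position weight matrix with a score threshold would likely be a closer model than exact matching; the regular expression is only the simplest starting point.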
Molecular identification and characterization of clustered regularly interspaced short palindromic repeat (CRISPR) gene cluster in Taylorella equigenitalis.

PubMed

Hara, Yasushi; Hayashi, Kyohei; Nakajima, Takuya; Kagawa, Shizuko; Tazumi, Akihiro; Moore, John E; Matsuda, Motoo

2013-09-01

Clustered regularly interspaced short palindromic repeats (CRISPRs), of approximately 10,000 base pairs (bp) in length, were shown to occur in the Japanese Taylorella equigenitalis strain, EQ59. The locus was composed of the putative CRISPRs-associated with 5 (cas5), RAMP csd1, csd2, recB, cas1, a leader region, 13 CRISPR consensus sequence repeats (each 32 bp; 5'-TCAGCCACGTTCGCGTGGCTGTGTGTTTAAAG-3'). These were in turn separated by 12 non repetitive unique spacer regions of similar length. In addition, a leader region, a transposase/IS protein, a leader region, and cas3 were also seen. All seven putative open reading frames carry their ribosome binding sites. Promoter consensus sequences at the -35 and -10 regions and putative intrinsic ρ-independent transcription terminator regions also occurred. A possible long overlap of 170 bp in length occurred between the recB and cas1 loci. Positive reverse transcription PCR signals of cas5, RAMP csd1, csd2-recB/cas1, and cas3 were generated. A putative secondary structure of the CRISPR consensus repeats was constructed. Following this, CRISPR results of the T. equigenitalis EQ59 isolate were subsequently compared with those from the Taylorella asinigenitalis MCE3 isolate.
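The repeat-and-spacer layout described here (13 repeats separated by 12 spacers) can be illustrated with a small sketch that locates exact copies of the 32 bp repeat and returns the sequence between consecutive copies. The function name `extract_spacers` and the toy locus are hypothetical, and exact string matching is an assumption; real repeats may carry mismatches.

```python
# 32 bp CRISPR consensus repeat quoted in the abstract.
REPEAT = "TCAGCCACGTTCGCGTGGCTGTGTGTTTAAAG"


def extract_spacers(locus: str, repeat: str = REPEAT) -> list:
    """Return the sequences lying between consecutive exact repeat copies."""
    starts = []
    pos = locus.find(repeat)
    while pos != -1:
        starts.append(pos)
        # Resume the search after the current repeat (no overlapping copies).
        pos = locus.find(repeat, pos + len(repeat))
    return [
        locus[left + len(repeat):right]
        for left, right in zip(starts, starts[1:])
    ]


if __name__ == "__main__":
    # Toy locus with made-up spacers, for illustration only.
    toy = "AAAA" + REPEAT + "ACGTACGT" + REPEAT + "GGGGCCCC" + REPEAT + "TT"
    print(extract_spacers(toy))
```

With n repeat copies the function returns n - 1 spacers, matching the 13-repeat, 12-spacer arrangement reported for the EQ59 strain.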
[An intriguing model for 5S rDNA sequences dispersion in the genome of freshwater stingray Potamotrygon motoro (Chondrichthyes: Potamotrygonidae)].

PubMed

Cruz, V P; Oliveira, C; Foresti, F

2015-01-01

5S rDNA genes of the stingray Potamotrygon motoro were amplified by PCR, purified, cloned and sequenced. Two distinct classes of segments of different sizes were obtained. The smallest, with 342 bp units, was classified as class I, and the largest, with 1900 bp units, was designated as class II. Alignment with the consensus sequences for both classes showed changes in a few bases in the 5S rDNA genes. TATA-like sequences were detected in the nontranscribed spacer (NTS) regions of class I and a microsatellite (GCT)10 sequence was detected in the NTS region of class II. The results obtained can help to understand the molecular organization of ribosomal genes and the mechanism of gene dispersion.
Aggregating and Predicting Sequence Labels from Crowd Annotations

PubMed Central

Nguyen, An T.; Wallace, Byron C.; Li, Junyi Jessy; Nenkova, Ani; Lease, Matthew

2017-01-01

Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text. Given such annotations, we consider two complementary tasks: (1) aggregating sequential crowd labels to infer a best single set of consensus annotations; and (2) using crowd annotations as training data for a model that can predict sequences in unannotated text. For aggregation, we propose a novel Hidden Markov Model variant. To predict sequences in unannotated text, we propose a neural approach using Long Short Term Memory. We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. Results show improvement over strong baselines. Our source code and data are available online. PMID:29093611
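For orientation, the aggregation task in this abstract can be illustrated with the simplest possible approach, a per-token majority vote over annotators. This is a naive baseline sketch and not the Hidden Markov Model variant the authors propose; the function name `majority_vote` and the example labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(annotations: Sequence[Sequence[str]]) -> List[str]:
    """Aggregate label sequences token by token using a plurality vote.

    `annotations` holds one label sequence per annotator, all of the
    same length. Ties are resolved in favor of the label seen first.
    """
    if not annotations:
        raise ValueError("at least one annotator is required")
    length = len(annotations[0])
    if any(len(labels) != length for labels in annotations):
        raise ValueError("all label sequences must have the same length")
    return [
        Counter(token_labels).most_common(1)[0][0]
        for token_labels in zip(*annotations)
    ]


if __name__ == "__main__":
    # Three hypothetical annotators labeling the same four tokens.
    crowd = [
        ["B-PER", "I-PER", "O", "O"],
        ["B-PER", "O", "O", "B-LOC"],
        ["B-PER", "I-PER", "O", "O"],
    ]
    print(majority_vote(crowd))
```

A token-wise vote ignores dependencies between neighboring labels and differences in annotator reliability, which is the gap a sequence-aware aggregation model is meant to close.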
The nucleotide sequence of the intergenic region between the 5.8S and 26S rRNA genes of the yeast ribosomal RNA operon. Possible implications for the interaction between 5.8S and 26S rRNA and the processing of the primary transcript.

PubMed Central

Veldman, G M; Klootwijk, J; van Heerikhuizen, H; Planta, R J

1981-01-01

We have determined the nucleotide sequence of part of a cloned yeast ribosomal RNA operon extending from the 5.8S RNA gene downstream into the 5'-terminal region of the 26S RNA gene. We mapped the pertinent processing sites, viz. the 5' end of 26S rRNA and the 3' ends of 5.8S rRNA and its immediate precursor, 7S RNA. At the 3' end of 7S RNA we find the sequence UCGUUU which is very similar to the type I consensus sequence UCAUUA/U present at the 3' ends of 17S, 5.8S and 26S rRNA as well as 18S precursor rRNA in yeast. At the 5' end of the 26S RNA gene we find a sequence of thirteen nucleotides which is homologous to the type II sequence present at the 5' termini of both the 17S and the 5.8S RNA gene. These findings further support the suggestion put forward earlier (G.M. Veldman et al. (1980) Nucl. Acids Res. 8, 2907-2920) that both consensus sequences are involved in the recognition of precursor rRNA by the processing nuclease(s). We discuss a model for the processing of yeast rRNA in which a processing enzyme sequentially recognizes several combinations of a type I and a type II consensus sequence. We also describe the existence of a significant base complementarity between sequences in the 5'-terminal region of 26S rRNA and the 3'-terminal region of 5.8S rRNA. We suggest that base pairing between these sequences contributes to the binding between 5.8S and 26S rRNA. PMID:7312619

Consensus statement: Virus taxonomy in the age of metagenomics.

PubMed

Simmonds, Peter; Adams, Mike J; Benkő, Mária; Breitbart, Mya; Brister, J Rodney; Carstens, Eric B; Davison, Andrew J; Delwart, Eric; Gorbalenya, Alexander E; Harrach, Balázs; Hull, Roger; King, Andrew M Q; Koonin, Eugene V; Krupovic, Mart; Kuhn, Jens H; Lefkowitz, Elliot J; Nibert, Max L; Orton, Richard; Roossinck, Marilyn J; Sabanadzovic, Sead; Sullivan, Matthew B; Suttle, Curtis A; Tesh, Robert B; van der Vlugt, René A; Varsani, Arvind; Zerbini, F Murilo

2017-03-01

The number and diversity of viral sequences that are identified in metagenomic data far exceeds that of experimentally characterized virus isolates. In a recent workshop, a panel of experts discussed the proposal that, with appropriate quality control, viruses that are known only from metagenomic data can, and should be, incorporated into the official classification scheme of the International Committee on Taxonomy of Viruses (ICTV). Although a taxonomy that is based on metagenomic sequence data alone represents a substantial departure from the traditional reliance on phenotypic properties, the development of a robust framework for sequence-based virus taxonomy is indispensable for the comprehensive characterization of the global virome. In this Consensus Statement article, we consider the rationale for why metagenomic sequence data should, and how it can, be incorporated into the ICTV taxonomy, and present proposals that have been endorsed by the Executive Committee of the ICTV.
Conservative secondary structure motifs already present in early-stage folding (in silico) as found in serpines family.

PubMed

Brylinski, Michal; Konieczny, Leszek; Kononowicz, Andrzej; Roterman, Irena

2008-03-21

The well-known procedure implemented in ClustalW oriented on the sequence comparison was applied to structure comparison. The consensus sequence as well as consensus structure has been defined for proteins belonging to the serpin family. The structure of the early-stage intermediate was the object for similarity search. The high values of W(sequence) appeared to be accordant with high values of W(structure), making possible structure comparison using common criteria for sequence and structure comparison. Since the early-stage structural form has been created according to a limited conformational sub-space which does not include the beta-structure (this structure is mediated by the C7eq structural form), it is particularly important to see that the C7eq structural form may be treated as the seed for the beta-structure present in the final native structure of the protein. The applicability of the ClustalW procedure to structure comparison makes these two comparisons unified.
Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples

PubMed Central

Quick, Josh; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J

2018-01-01

Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples without isolation remains challenging for viruses such as Zika, where metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence complete genomes comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimised library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved starting with clinical samples in 1-2 days following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. PMID:28538739
Fast and accurate de novo genome assembly from long uncorrected reads

PubMed Central

Vaser, Robert; Sović, Ivan; Nagarajan, Niranjan

2017-01-01

The assembly of long reads from Pacific Biosciences and Oxford Nanopore Technologies typically requires resource-intensive error-correction and consensus-generation steps to obtain high-quality assemblies. We show that the error-correction step can be omitted and that high-quality consensus sequences can be generated efficiently with a SIMD-accelerated, partial-order alignment–based, stand-alone consensus module called Racon. Based on tests with PacBio and Oxford Nanopore data sets, we show that Racon coupled with miniasm enables consensus genomes with similar or better quality than state-of-the-art methods while being an order of magnitude faster. PMID:28100585
The consensus sequence of FAMLF alternative splice variants is overexpressed in undifferentiated hematopoietic cells.

PubMed

Chen, W L; Luo, D F; Gao, C; Ding, Y; Wang, S Y

2015-07-01

The familial acute myeloid leukemia related factor gene (FAMLF) was previously identified from a familial AML subtractive cDNA library and shown to undergo alternative splicing. This study used real-time quantitative PCR to investigate the expression of the FAMLF alternative-splicing transcript consensus sequence (FAMLF-CS) in peripheral blood mononuclear cells (PBMCs) from 119 patients with de novo acute leukemia (AL) and 104 healthy controls, as well as in CD34+ cells from 12 AL patients and 10 healthy donors. A 429-bp fragment from a novel splicing variant of FAMLF was obtained, and a 363-bp consensus sequence was targeted to quantify total FAMLF expression. Kruskal-Wallis, Nemenyi, Spearman's correlation, and Mann-Whitney U-tests were used to analyze the data. FAMLF-CS expression in PBMCs from AL patients and CD34+ cells from AL patients and controls was significantly higher than in control PBMCs (P < 0.0001). Moreover, FAMLF-CS expression in PBMCs from the AML group was positively correlated with red blood cell count (rs = 0.317, P = 0.006), hemoglobin levels (rs = 0.210, P = 0.049), and percentage of peripheral blood blasts (rs = 0.256, P = 0.027), but inversely correlated with hemoglobin levels in the control group (rs = -0.391, P < 0.0001). AML patients with high CD34+ expression showed significantly higher FAMLF-CS expression than those with low CD34+ expression (P = 0.041). Our results showed that FAMLF is highly expressed in both normal and malignant immature hematopoietic cells, but that expression is lower in normal mature PBMCs.
Randomization and In Vivo Selection Reveal a GGRG Motif Essential for Packaging Human Immunodeficiency Virus Type 2 RNA

PubMed Central

Baig, Tayyba T.; Lanchy, Jean-Marc; Lodmell, J. Stephen

2009-01-01

The packaging signal (ψ) of human immunodeficiency virus type 2 (HIV-2) is present in the 5′ noncoding region of RNA and contains a 10-nucleotide palindrome (pal; 5′-392-GGAGUGCUCC) located upstream of the dimerization signal stem-loop 1 (SL1). pal has been shown to be functionally important in vitro and in vivo. We previously showed that the 3′ side of pal (GCUCC-3′) is involved in base-pairing interactions with a sequence downstream of SL1 to make an extended SL1, which is important for replication in vivo and the regulation of dimerization in vitro. However, the role of the 5′ side of pal (5′-GGAGU) was less clear. Here, we characterized this role using an in vivo SELEX approach. We produced a population of HIV-2 DNA genomes with random sequences within the 5′ side of pal and transfected these into COS-7 cells. Viruses from COS-7 cells were used to infect C8166 permissive cells. After several weeks of serial passage in C8166 cells, surviving viruses were sequenced. On the 5′ side of pal there was a striking convergence toward a GGRGN consensus sequence. Individual clones with consensus and nonconsensus sequences were tested in infectivity and packaging assays. Analysis of individuals that diverged from the consensus sequence showed normal viral RNA and protein synthesis but had replication defects and impaired RNA packaging. These findings clearly indicate that the GGRG motif is essential for viral replication and genomic RNA packaging. PMID:18971263
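The GGRGN consensus reported above uses IUPAC ambiguity codes (R for a purine, A or G; N for any base). As a minimal sketch of how a recovered 5-nucleotide sequence could be checked against such a consensus, the Python below expands the codes into a regular expression. The function names and the test sequences other than the wild-type GGAGU are hypothetical illustrations, not part of the study.

```python
import re

# Subset of the IUPAC nucleotide codes (RNA alphabet); only the codes
# needed for this illustration are listed.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]",     # purine
    "Y": "[CU]",     # pyrimidine
    "N": "[ACGU]",   # any base
}

def iupac_to_regex(motif):
    """Translate an IUPAC motif such as 'GGRGN' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in motif))

def matches_consensus(seq, motif="GGRGN"):
    """True if seq matches the motif over its full length."""
    return iupac_to_regex(motif).fullmatch(seq) is not None

print(matches_consensus("GGAGU"))  # wild-type 5' side of pal
print(matches_consensus("GGCGU"))  # hypothetical variant with a pyrimidine at position 3
```

Using `fullmatch` rather than `search` keeps the comparison positional, which suits a randomized region of fixed length.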
Magnetic resonance imaging for the detection, localisation, and characterisation of prostate cancer: recommendations from a European consensus meeting.

PubMed

Dickinson, Louise; Ahmed, Hashim U; Allen, Clare; Barentsz, Jelle O; Carey, Brendan; Futterer, Jurgen J; Heijmink, Stijn W; Hoskin, Peter J; Kirkham, Alex; Padhani, Anwar R; Persad, Raj; Puech, Philippe; Punwani, Shonit; Sohaib, Aslam S; Tombal, Bertrand; Villers, Arnauld; van der Meulen, Jan; Emberton, Mark

2011-04-01

Multiparametric magnetic resonance imaging (mpMRI) may have a role in detecting clinically significant prostate cancer in men with raised serum prostate-specific antigen levels. Variations in technique and the interpretation of images have contributed to inconsistency in its reported performance characteristics. Our aim was to make recommendations on a standardised method for the conduct, interpretation, and reporting of prostate mpMRI for prostate cancer detection and localisation. A consensus meeting of 16 European prostate cancer experts was held that followed the UCLA-RAND Appropriateness Method and was facilitated by an independent chair. Before the meeting, 520 items were scored for "appropriateness" by panel members, discussed face to face, and rescored. Agreement was reached in 67% of 260 items related to imaging sequence parameters. T2-weighted, dynamic contrast-enhanced, and diffusion-weighted MRI were the key sequences incorporated into the minimum requirements. Consensus was also reached on 54% of 260 items related to image interpretation and reporting, including features of malignancy on individual sequences. A 5-point scale was agreed on for communicating the probability of malignancy, with a minimum of 16 prostatic regions of interest, to include a pictorial representation of suspicious foci. Limitations relate to consensus methodology. Dominant personalities are known to affect the opinions of the group and were countered by a neutral chairperson. Consensus was reached on a number of areas related to the conduct, interpretation, and reporting of mpMRI for the detection, localisation, and characterisation of prostate cancer. Before optimal dissemination of this technology, these outcomes will require formal validation in prospective trials. Copyright © 2010 European Association of Urology. Published by Elsevier B.V. All rights reserved.
A filtering method to generate high quality short reads using illumina paired-end technology.

PubMed

Eren, A Murat; Vineis, Joseph H; Morrison, Hilary G; Sogin, Mitchell L

2013-01-01

Consensus between independent reads improves the accuracy of genome and transcriptome analyses; however, a lack of consensus between very similar sequences in metagenomic studies can and often does represent natural variation of biological significance. The common use of machine-assigned quality scores on next generation platforms does not necessarily correlate with accuracy. Here, we describe using the overlap of paired-end, short sequence reads to identify error-prone reads in marker gene analyses and their contribution to spurious OTUs following clustering analysis using QIIME. Our approach can also reduce error in shotgun sequencing data generated from libraries with small, tightly constrained insert sizes. The open-source implementation of this algorithm in the Python programming language with user instructions can be obtained from https://github.com/meren/illumina-utils.
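The idea of using the overlap between paired-end reads to flag error-prone reads can be illustrated with a small, self-contained sketch. This is not the illumina-utils implementation or its API. The function names, the `min_overlap` and `max_mismatches` parameters, and the toy fragment are all assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def best_overlap(read1, read2, min_overlap=10):
    """Find the suffix of read1 / prefix of revcomp(read2) overlap with the
    lowest mismatch rate. Returns (overlap_length, mismatches) or None."""
    rc2 = reverse_complement(read2)
    best = None
    # Try the longest candidate overlap first; ties keep the longer overlap.
    for length in range(min(len(read1), len(rc2)), min_overlap - 1, -1):
        mismatches = sum(a != b for a, b in zip(read1[-length:], rc2[:length]))
        rate = mismatches / length
        if best is None or rate < best[2]:
            best = (length, mismatches, rate)
    return None if best is None else best[:2]

def passes_filter(read1, read2, max_mismatches=0, min_overlap=10):
    """Keep a pair only if its best overlap has few enough mismatches."""
    result = best_overlap(read1, read2, min_overlap)
    return result is not None and result[1] <= max_mismatches

# Toy 24-base fragment sequenced from both ends with 18-base reads.
fragment = "ACGTTGCAAGCTTAGGCTAACGTA"
read1 = fragment[:18]
read2 = reverse_complement(fragment[6:])
print(best_overlap(read1, read2))   # (12, 0)
print(passes_filter(read1, read2))  # True
```

A stricter or looser `max_mismatches` trades read retention against error removal; the right setting depends on the library and is not specified in the abstract.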
BAUM: improving genome assembly by adaptive unique mapping and local overlap-layout-consensus approach.

PubMed

Wang, Anqi; Wang, Zhanyu; Li, Zheng; Li, Lei M

2018-06-15

It is highly desirable to assemble genomes of high continuity and consistency at low cost. The current bottleneck of draft genome continuity using the second generation sequencing (SGS) reads is primarily caused by uncertainty among repetitive sequences. Even though the single-molecule real-time sequencing technology is very promising to overcome the uncertainty issue, its relatively high cost and error rate add to the budgetary and computational burden. Many long-read assemblers take the overlap-layout-consensus (OLC) paradigm, which is less sensitive to sequencing errors, heterozygosity and variability of coverage. However, current assemblers of SGS data do not sufficiently take advantage of the OLC approach. Aiming at minimizing uncertainty, the proposed method, BAUM, breaks the whole genome into regions by adaptive unique mapping; then the local OLC is used to assemble each region in parallel. BAUM can (i) perform reference-assisted assembly based on the genome of a closely related species or (ii) improve the results of existing assemblies that are obtained based on short or long sequencing reads. The tests on two eukaryote genomes, a wild rice Oryza longistaminata and a parrot Melopsittacus undulatus, show that BAUM achieved substantial improvement on genome size and continuity. Besides, BAUM reconstructed a considerable amount of repetitive regions that failed to be assembled by existing short read assemblers. We also propose statistical approaches to control the uncertainty in different steps of BAUM. http://www.zhanyuwang.xin/wordpress/index.php/2017/07/21/baum. Supplementary data are available at Bioinformatics online.
Molecular cloning of MSSP-2, a c-myc gene single-strand binding protein: characterization of binding specificity and DNA replication activity.

PubMed Central

Takai, T; Nishita, Y; Iguchi-Ariga, S M; Ariga, H

1994-01-01

We have previously reported the human cDNA encoding MSSP-1, a sequence-specific double- and single-stranded DNA binding protein [Negishi, Nishita, Saëgusa, Kakizaki, Galli, Kihara, Tamai, Miyajima, Iguchi-Ariga and Ariga (1994) Oncogene, 9, 1133-1143]. MSSP-1 binds to a DNA replication origin/transcriptional enhancer of the human c-myc gene and has turned out to be identical with Scr2, a human protein which complements the defect of cdc2 kinase in S.pombe [Kataoka and Nojima (1994) Nucleic Acid Res., 22, 2687-2693]. We have cloned the cDNA for MSSP-2, another member of the MSSP family of proteins. The MSSP-2 cDNA shares highly homologous sequences with MSSP-1 cDNA, except for the insertion of 48 bp coding 16 amino acids near the C-terminus. Like MSSP-1, MSSP-2 has RNP-1 consensus sequences. The results of the experiments using bacterially expressed MSSP-2, and its deletion mutants, as histidine fusion proteins suggested that the binding specificity of MSSP-2 to double- and single-stranded DNA is the same as that of MSSP-1, and that the RNP consensus sequences are required for the DNA binding of the protein. MSSP-2 stimulated the DNA replication of an SV40-derived plasmid containing the binding sequence for MSSP-1 or -2. MSSP-2 is hence suggested to play an important role in regulation of DNA replication. Images PMID:7838710
Sequence conservation, HLA-E-Restricted peptide, and best-defined CTL/CD8+ epitopes in gag P24 (capsid) of HIV-1 subtype B

NASA Astrophysics Data System (ADS)

Prasetyo, Afiono Agung; Dharmawan, Ruben; Sari, Yulia; Sariyatun, Ratna

2017-02-01

Human immunodeficiency virus type 1 (HIV-1) remains a global health problem. Continuous studies of HIV-1 genetic and immunological profiles are important to find strategies against the virus. This study aimed to analyse sequence conservation, the HLA-E-restricted peptide, and best-defined CTL/CD8+ epitopes in p24 (capsid) of HIV-1 subtype B worldwide. The p24-coding sequences from 3,557 HIV subtype B isolates were aligned using MUSCLE and analysed. Some highly conserved regions (sequence conservation ≥95%) were observed. Two considerably long series of sequences with conservation of 100% were observed at bases 349-356 and 550-557 of p24 (HXB2 numbering). The consensus from all aligned isolates was precisely the same as consensus B in the Los Alamos HIV Database. The HLA-E-restricted peptide in amino acid (aa) 14-22 of HIV-1 p24 (AISPRTLNA) was found in 55.9% (1,987/3,557) of HIV-1 subtype B worldwide. Forty-four best-defined CTL/CD8+ epitopes were observed, in which VKNWMTETL epitope (aa 181-189 of p24) restricted by B*4801 was the most frequent, as found in 94.9% of isolates. The results of this study contribute information about HIV-1 subtype B and may benefit further work to develop diagnostic and therapeutic strategies against the virus.
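Per-position conservation of the kind summarised above can be computed from any set of aligned, equal-length sequences. The sketch below is one simple way to do it, assuming conservation means the fraction of sequences carrying the most common character in a column; the abstract does not state the exact definition used, and the function names and toy alignment are hypothetical.

```python
from collections import Counter

def column_conservation(aligned):
    """Fraction of sequences sharing the most common character per column.
    Gap characters are counted like any other character in this sketch."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*aligned):
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(aligned))
    return scores

def conserved_runs(scores, threshold=0.95):
    """Return 1-based inclusive (start, end) runs with score >= threshold."""
    runs, start = [], None
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos
        elif start is not None:
            runs.append((start, pos - 1))
            start = None
    if start is not None:
        runs.append((start, len(scores)))
    return runs

toy_alignment = ["ACGTAC", "ACGTAC", "ACGAAC", "ACGTTC"]
scores = column_conservation(toy_alignment)
print(scores)                                 # [1.0, 1.0, 1.0, 0.75, 0.75, 1.0]
print(conserved_runs(scores, threshold=1.0))  # [(1, 3), (6, 6)]
```

Taking the most common character in each column also yields a majority consensus sequence, which is how a consensus can be compared with a reference such as consensus B.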
Community of Priors: A Bayesian Approach to Consensus Building

ERIC Educational Resources Information Center

Hara, Motoaki

2010-01-01

Despite having drawn from empirical evidence and cumulative prior expertise in the formulation of research questions as well as study design, each study is treated as a stand-alone product rather than positioned within a sequence of cumulative evidence. While results of prior studies are typically cited within the body of prior literature review,…
Improvement in Protein Domain Identification Is Reached by Breaking Consensus, with the Agreement of Many Profiles and Domain Co-occurrence

PubMed Central

Bernardes, Juliana; Zaverucha, Gerson; Vaquero, Catherine; Carbone, Alessandra

2016-01-01

Traditional protein annotation methods describe known domains with probabilistic models representing consensus among homologous domain sequences. However, when relevant signals become too weak to be identified by a global consensus, attempts for annotation fail. Here we address the fundamental question of domain identification for highly divergent proteins. By using high performance computing, we demonstrate that the limits of state-of-the-art annotation methods can be bypassed. We design a new strategy based on the observation that many structural and functional protein constraints are not globally conserved through all species but might be locally conserved in separate clades. We propose a novel exploitation of the large amount of data available: 1. for each known protein domain, several probabilistic clade-centered models are constructed from a large and differentiated panel of homologous sequences, 2. a decision-making protocol combines outcomes obtained from multiple models, 3. a multi-criteria optimization algorithm finds the most likely protein architecture. The method is evaluated for domain and architecture prediction over several datasets and statistical testing hypotheses. Its performance is compared against HMMScan and HHblits, two widely used search methods based on sequence-profile and profile-profile comparison. Due to their closeness to actual protein sequences, clade-centered models are shown to be more specific and functionally predictive than the broadly used consensus models. Based on them, we improved annotation of Plasmodium falciparum protein sequences on a scale not previously possible. We successfully predict at least one domain for 72% of P. falciparum proteins against 63% achieved previously, corresponding to 30% of improvement over the total number of Pfam domain predictions on the whole genome. 
The method is applicable to any genome and opens new avenues to tackle evolutionary questions such as the reconstruction of ancient domain duplications, the reconstruction of the history of protein architectures, and the estimation of protein domain age. Website and software: http://www.lcqb.upmc.fr/CLADE. PMID:27472895
Development of a EST dataset and characterization of EST-SSRs in a traditional Chinese medicinal plant, Epimedium sagittatum (Sieb. Et Zucc.) Maxim

PubMed Central

2010-01-01

Background: Epimedium sagittatum (Sieb. Et Zucc.) Maxim, a traditional Chinese medicinal plant species, has been used extensively as genuine medicinal materials. Certain Epimedium species are endangered due to commercial overexploitation, while sustainable application studies, conservation genetics, systematics, and marker-assisted selection (MAS) of Epimedium are less studied due to the lack of molecular markers. Here, we report a set of expressed sequence tags (ESTs) and simple sequence repeats (SSRs) identified in these ESTs for E. sagittatum. Results: cDNAs of E. sagittatum are sequenced using 454 GS-FLX pyrosequencing technology. The raw reads are cleaned and assembled into a total of 76,459 consensus sequences comprising 17,231 contigs and 59,228 singlets. About 38.5% (29,466) of the consensus sequences significantly match to the non-redundant protein database (E-value < 1e-10), 22,295 of which are further annotated using Gene Ontology (GO) terms. A total of 2,810 EST-SSRs are identified from the Epimedium EST dataset. Trinucleotide SSR is the dominant repeat type (55.2%) followed by dinucleotide (30.4%), tetranucleotide (7.3%), hexanucleotide (4.9%), and pentanucleotide (2.2%) SSR. The dominant repeat motif is AAG/CTT (23.6%) followed by AG/CT (19.3%), ACC/GGT (11.1%), AT/AT (7.5%), and AAC/GTT (5.9%). Thirty-two SSR-ESTs are randomly selected and primer pairs are synthesized for testing the transferability across 52 Epimedium species. Eighteen primer pairs (85.7%) could be successfully transferred to Epimedium species and sixteen of those show high genetic diversity with 0.35 of observed heterozygosity (Ho) and 0.65 of expected heterozygosity (He) and high number of alleles per locus (11.9). Conclusion: A large EST dataset with a total of 76,459 consensus sequences is generated, aiming to provide sequence information for deciphering secondary metabolism, especially for flavonoid pathway in Epimedium. A total of 2,810 EST-SSRs are identified from the EST dataset and ~1580 EST-SSR markers are transferable. E. sagittatum EST-SSR transferability to the major Epimedium germplasm is up to 85.7%. Therefore, this EST dataset and EST-SSRs will be a powerful resource for further studies such as taxonomy, molecular breeding, genetics, genomics, and secondary metabolism in Epimedium species. PMID:20141623
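Finding di- to hexanucleotide SSRs in assembled consensus sequences is commonly done by scanning for tandem repeats of a short unit. The sketch below shows one way to do this with regular expressions. The minimum repeat counts in `MIN_REPEATS` are assumed values chosen for illustration (the abstract does not give the thresholds used), and `find_ssrs` and the toy sequence are hypothetical.

```python
import re

# Assumed minimum repeat counts per unit length (illustrative only).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, unit, repeat_count) tuples for tandem repeats.
    start is a 0-based index into seq."""
    min_repeats = MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A unit followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter unit
            # (e.g. 'AGAG' or 'AAA'), so each SSR is reported once.
            if unit in (unit + unit)[1:-1]:
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)

toy_est = "TTGC" + "AAG" * 6 + "CCAT" + "AG" * 7 + "TTC"
print(find_ssrs(toy_est))  # [(4, 'AAG', 6), (26, 'AG', 7)]
```

Counting hits by unit length and by motif (after merging each motif with its reverse complement, as in AAG/CTT) would give summary percentages of the kind quoted in the abstract.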
Regulation of iron assimilation: nucleotide sequence analysis of an iron-regulated promoter from a fluorescent pseudomonad.

PubMed

O'Sullivan, D J; O'Gara, F

1991-08-01

An iron-regulated promoter was cloned on a 2.1 kb BglII fragment from Pseudomonas sp. strain M114 and fused to the lacZ reporter gene. Iron-regulated lacZ expression from the resulting construct (pSP1) in strain M114 was mediated via the Fur-like repressor which also regulates siderophore production in this strain. A 390 bp StuI-PstI internal fragment contained the necessary information for iron-regulated promoter expression. This fragment was sequenced and the initiation point for transcription was determined by primer extension analysis. The region directly upstream of the transcription start point contained no significant homology to known promoter consensus sequences. However, the -16 to -25 bp region contained homology to four other iron-regulated pseudomonad promoters. Deletion of bases downstream from the transcriptional start did not affect the iron-regulated expression of the promoter. The -37 and -43 bp regions exhibited some homology to the 19 bp Escherichia coli Fur-binding consensus sequence. When expressed in E. coli (via a cloned trans-acting factor from strain M114) lacZ expression from pSP1 was found to be regulated by iron. A region of greater than 77 bases but less than 131 upstream from the transcriptional start was found to be necessary for promoter activity, further suggesting that a transcriptional activator may be required for expression.
RNA-Seq analysis and transcriptome assembly for blackberry (Rubus sp. Var. Lochness) fruit.

PubMed

Garcia-Seco, Daniel; Zhang, Yang; Gutierrez-Mañero, Francisco J; Martin, Cathie; Ramos-Solano, Beatriz

2015-01-22

There is an increasing interest in berries, especially blackberries in the diet, because of recent reports of their health benefits due to their high content of flavonoids. A broad range of genomic tools are available for other Rosaceae species but these tools are still lacking in the Rubus genus, thus limiting gene discovery and the breeding of improved varieties. De novo RNA-seq of ripe blackberries grown under field conditions was performed using Illumina HiSeq 2000. Almost 9 billion nucleotide bases were sequenced in total. Following assembly, 42,062 consensus sequences were detected. For functional annotation, 33,040 (NR), 32,762 (NT), 21,932 (Swiss-Prot), 20,134 (KEGG), 13,676 (COG), 24,168 (GO) consensus sequences were annotated using different databases; in total 34,552 annotated sequences were identified. For protein prediction analysis, the number of coding DNA sequences (CDS) that mapped to the protein database was 32,540. Non-redundant (NR) annotation showed that 25,418 genes (73.5%) have the highest similarity with Fragaria vesca subspecies vesca. Reanalysis was undertaken by aligning the reads with this reference genome for a deeper analysis of the transcriptome. We demonstrated that de novo assembly, using Trinity and later annotation with Blast using different databases, was complementary to alignment to the reference sequence using SOAPaligner/SOAP2. The Fragaria reference genome belongs to a species in the same family as blackberry (Rosaceae) but to a different genus. Since blackberries are tetraploids, the possibility of artefactual gene chimeras resulting from mis-assembly was tested with one of the genes sequenced by RNAseq, Chalcone Synthase (CHS). cDNAs encoding this protein were cloned and sequenced. Primers designed to the assembled sequences accurately distinguished different contigs, at least for chalcone synthase genes. We prepared and analysed transcriptome data from ripe blackberries, for which prior genomic information was limited. This new sequence information will improve the knowledge of this important and healthy fruit, providing an invaluable new tool for biological research.
Identification of an estrogen response element in the 3'-flanking region of the murine c-fos protooncogene.

PubMed

Hyder, S M; Stancel, G M; Nawaz, Z; McDonnell, D P; Loose-Mitchell, D S

1992-09-05

We have used transient transfection assays with reporter plasmids expressing chloramphenicol acetyltransferase, linked to regions of mouse c-fos, to identify a specific estrogen response element (ERE) in this protooncogene. This element is located in the untranslated 3'-flanking region of the c-fos gene, 5 kilobases (kb) downstream from the c-fos promoter and 1.5 kb downstream of the poly(A) signal. This element confers estrogen responsiveness to chloramphenicol acetyltransferase reporters linked to both the herpes simplex virus thymidine kinase promoter and the homologous c-fos promoter. Deletion analysis localized the response element to a 200-base pair fragment which contains the element GGTCACCACAGCC that resembles the consensus ERE sequence GGTCACAGTGACC originally identified in Xenopus vitellogenin A2 gene. A synthetic 36-base pair oligodeoxynucleotide containing this c-fos sequence conferred estrogen inducibility to the thymidine kinase promoter. The corresponding sequence also induced reporter activity when present in the c-fos gene fragment 3 kb from the thymidine kinase promoter. Gel-shift experiments demonstrated that synthetic oligonucleotides containing either the consensus ERE or the c-fos element bind human estrogen receptor obtained from a yeast expression system. However, the mobility of the shifted band is faster for the fos-ERE-complex than the consensus ERE complex suggesting that the three-dimensional structure of the protein-DNA complexes is different or that other factors are differentially involved in the two reactions. When the 5'-GGTCA sequence present in the c-fos ERE is mutated to 5'-TTTCA, transcriptional activation and receptor binding activities are both lost. Mutation of the CAGCC-3' element corresponding to the second half-site of the c-fos sequence also led to the loss of receptor binding activity, suggesting that both half-sites of this element are involved in this function. 
The estrogen induction mediated by either the c-fos or the consensus ERE was blunted by the antiestrogen tamoxifen. Based on these studies, we believe the 3'-fos ERE sequence we have identified may be a major cis-acting element involved in the physiological regulation of the gene by estrogens in vivo.
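The relationship between the c-fos element and the consensus ERE quoted in the abstract can be made concrete by comparing the two 13-mers position by position and by checking how well each one's 5' half-site matches the reverse complement of its 3' half-site. The sketch below uses only the two sequences given above; the function names and the choice of 5-base half-sites are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def mismatch_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]

def half_site_mismatches(element, half=5):
    """Mismatches between the 5' half-site and the reverse complement of
    the 3' half-site; 0 means the half-sites form a perfect inverted repeat."""
    return len(mismatch_positions(element[:half],
                                  reverse_complement(element[-half:])))

CONSENSUS_ERE = "GGTCACAGTGACC"  # Xenopus vitellogenin A2 consensus, from the abstract
FOS_ELEMENT = "GGTCACCACAGCC"    # c-fos 3'-flanking element, from the abstract

print(mismatch_positions(FOS_ELEMENT, CONSENSUS_ERE))  # [7, 8, 9, 10, 11]
print(half_site_mismatches(CONSENSUS_ERE))             # 0
print(half_site_mismatches(FOS_ELEMENT))               # 3
```

In this comparison the 5' half-site GGTCA is identical in both elements, and the differences fall in the spacer and the 3' half-site.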
Sequences of Zika Virus Genomes from a Pediatric Cohort in Nicaragua.

PubMed

Oldfield, Lauren M; Fedorova, Nadia; Puri, Vinita; Shrivastava, Susmita; Amedeo, Paolo; Durbin, Alan; Rocchi, Iara; Williams, Torrey; Shabman, Reed S; Tan, Gene S; Balmaseda, Angel; Kuan, Guillermina; Saborio, Saira; Gordon, Aubree; Harris, Eva; Pickett, Brett E

2018-06-14

We report here the whole-genome sequence of 11 Zika virus (ZIKV) samples from six pediatric patients in Nicaragua. Serum samples were collected, and ZIKV was isolated in tissue culture. Both serum and virus isolates were sequenced. The consensus ZIKV genomes are greater than 99% identical to each other. Copyright © 2018 Oldfield et al.
PERMANENT GENETIC RESOURCES: Consensus primers of cyp73 genes discriminate willow species and hybrids (Salix, Salicaceae).

PubMed

Trung, Le Quang; Van Puyvelde, Karolien; Triest, Ludwig

2008-03-01

Consensus primers, based on exon sequences of the cyp73 gene family coding for cinnamate 4-hydroxylase (C4H) of the lignin biosynthesis pathway, were designed for the tetraploid willow species Salix alba and Salix fragilis. Diagnostic alleles at species level were observed among introns of three cyp73 genes and allowed unambiguous detection of the first generation and introgressed hybrids in populations. Progeny analysis of a female S. alba with a male introgressed hybrid confirmed the codominant inheritance of each intron. Sequences of the diagnostic alleles of both species were similar to those found in the hybrids. © 2007 The Authors.
Saturation of an Intra-Gene Pool Linkage Map: Towards a Unified Consensus Linkage Map for Fine Mapping and Synteny Analysis in Common Bean

PubMed Central

Galeano, Carlos H.; Fernandez, Andrea C.; Franco-Herrera, Natalia; Cichy, Karen A.; McClean, Phillip E.; Vanderleyden, Jos; Blair, Matthew W.

2011-01-01

Map-based cloning and fine mapping to find genes of interest and marker assisted selection (MAS) require good genetic maps with reproducible markers. In this study, we saturated the linkage map of the intra-gene pool population of common bean DOR364×BAT477 (DB) by evaluating 2,706 molecular markers including SSR, SNP, and gene-based markers. On average the polymorphism rate was 7.7% due to the narrow genetic base between the parents. The DB linkage map consisted of 291 markers with a total map length of 1,788 cM. A consensus map was built using the core mapping populations derived from inter-gene pool crosses: DOR364×G19833 (DG) and BAT93×JALO EEP558 (BJ). The consensus map consisted of a total of 1,010 markers mapped, with a total map length of 2,041 cM across 11 linkage groups. On average, each linkage group on the consensus map contained 91 markers of which 83% were single copy markers. Finally, a synteny analysis was carried out using our highly saturated consensus maps compared with the soybean pseudo-chromosome assembly. A total of 772 marker sequences were compared with the soybean genome. A total of 44 syntenic blocks were identified. The linkage group Pv6 presented the most diverse pattern of synteny with seven syntenic blocks, and Pv9 showed the most consistent relations with soybean with just two syntenic blocks. Additionally, a co-linear analysis using common bean transcript map information against soybean coding sequences (CDS) revealed the relationship with 787 soybean genes. The common bean consensus map has allowed us to map a larger number of markers, to obtain a more complete coverage of the common bean genome. Our results, combined with synteny relationships, provide tools to increase marker density in selected genomic regions to identify closely linked polymorphic markers for indirect selection, fine mapping or for positional cloning. PMID:22174773

A cell-free stock of simian-human immunodeficiency virus that causes AIDS in pig-tailed macaques has a limited number of amino acid substitutions in both SIVmac and HIV-1 regions of the genome and has altered cytotropism.

PubMed

Stephens, E B; Mukherjee, S; Sahni, M; Zhuge, W; Raghavan, R; Singh, D K; Leung, K; Atkinson, B; Li, Z; Joag, S V; Liu, Z Q; Narayan, O

1997-05-12

We have examined both the sequence changes in the LTR, gag, vif, vpr, vpx, tat, rev, vpu, env, and nef genes and the cell tropism of a cell-free stock of chimeric simian-human immunodeficiency virus (SHIV) isolated from the cerebrospinal fluid of a pig-tailed macaque (PNb) that developed AIDS. This virus (SHIVKU-1) is highly pathogenic when inoculated into other macaques. DNA sequence analysis of PCR-amplified products revealed a total of 5 nucleotide changes in the LTR while vif had 2 consensus amino acid changes. The gag, vif, and vpx had no consensus amino acid substitutions, whereas vpr had 1 consensus substitution. The tat and rev genes of the HXB2 region of SHIVKU-1 had 2 and 1 consensus amino acid changes, respectively. The vpu gene of the HXB2 region of SHIV, which originally had an ACG at the beginning of the gene, reverted to an initiation ATG codon and in addition contained a consensus amino acid substitution at position 69 of this protein. As expected, the majority of the nucleotide substitutions were found in the env and nef genes. Thirteen and 5 amino acid changes were predicted for the corresponding Env and Nef proteins, respectively. In addition, one-third of the env gene clones isolated from the SHIVKU-1 stock had a 5-amino-acid deletion in the V4 region. Using three independent assays, we determined that the changes in the SHIVKU-1 were associated with an increase in the efficiency of replication in macrophages. The strikingly few consensus changes in the virus suggest that conversion of this virus to one capable of causing AIDS in pig-tailed macaques was associated with relatively few changes in the viral envelope and/or accessory genes. These results will provide the basis for the development of a pathogenic, molecular clone of SHIV capable of causing AIDS in pig-tailed macaques.
PipeOnline 2.0: automated EST processing and functional data sorting.

PubMed

Ayoubi, Patricia; Jin, Xiaojing; Leite, Saul; Liu, Xianghui; Martajaja, Jeson; Abduraham, Abdurashid; Wan, Qiaolan; Yan, Wei; Misawa, Eduardo; Prade, Rolf A

2002-11-01

Expressed sequence tags (ESTs) are generated and deposited in the public domain, as redundant, unannotated, single-pass reactions, with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA-sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling and rewriting consensus sequences of redundant input files into a unigene EST data set and finally through translation, amino acid sequence similarity searches, annotation of public databases and functional data. PipeOnline generates an annotated database, retaining the processed unigene sequence, clone/file history, alignments with similar sequences, and proposed functional classification, if available. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records are examined through a function ordered browser or keyword queries with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and alert service. PipeOnline is available at http://stress-genomics.org.
The combinatorial PP1-binding consensus Motif (R/K)x((0,1))V/IxFxx(R/K)x(R/K) is a new apoptotic signature.

PubMed

Godet, Angélique N; Guergnon, Julien; Maire, Virginie; Croset, Amélie; Garcia, Alphonse

2010-04-01

Previous studies established that PP1 is a target for Bcl-2 proteins and an important regulator of apoptosis. The two distinct functional PP1 consensus docking motifs, R/Kx((0,1))V/IxF and FxxR/KxR/K, involved in PP1 binding and cell death were previously characterized in the BH1 and BH3 domains of some Bcl-2 proteins. In this study, we demonstrate that DPT-AIF(1), a peptide containing the AIF(562-571) sequence located in a C-terminal domain of AIF, is a new PP1-interacting and cell-penetrating molecule. We also showed that DPT-AIF(1) provoked apoptosis in several human cell lines. Furthermore, DPT-APAF(1), a bipartite cell-penetrating peptide containing APAF-1(122-131), a non-penetrating sequence from APAF-1 protein, linked to our previously described DPT-sh1 peptide shuttle, is also a PP1-interacting death molecule. Both AIF(562-571) and APAF-1(122-131) sequences contain a common R/Kx((0,1))V/IxFxxR/KxR/K motif, shared by several proteins involved in control of cell survival pathways. This motif combines the two distinct PP1c consensus docking motifs initially identified in some Bcl-2 proteins. Interestingly, DPT-AIF(2) and DPT-APAF(2), which carry an F to A mutation within this combinatorial motif, no longer exhibited any PP1c binding or apoptotic effects. Moreover, the F to A mutation in DPT-AIF(2) also suppressed cell penetration. These results indicate that the combinatorial PP1c docking motif R/Kx((0,1))V/IxFxxR/KxR/K, deduced from AIF(562-571) and APAF-1(122-131) sequences, is a new PP1c-dependent Apoptotic Signature. This motif is also a new tool for drug design that could be used to characterize potential anti-tumour molecules.
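
The combinatorial motif in the abstract above can be read as a pattern over one-letter amino acid codes. The following is a minimal sketch of one way to scan a protein string for it, assuming "x" means any residue and "x((0,1))" means zero or one residue. The function name and both test peptides are hypothetical; the peptides are not the real AIF(562-571) or APAF-1(122-131) sequences.

```python
import re

# Motif from the abstract: R/K x(0,1) V/I x F x x R/K x R/K
# Assumption: "x" is any residue and x(0,1) is an optional single residue.
PP1_MOTIF = re.compile(r"[RK].?[VI].F..[RK].[RK]")


def find_pp1_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in PP1_MOTIF.finditer(protein.upper())]


# Hypothetical peptides for illustration only.
wild_type = "GGRAVTFGSKLRGG"
f_to_a = "GGRAVTAGSKLRGG"  # same peptide with the central F replaced by A

print(find_pp1_motifs(wild_type))  # [(2, 'RAVTFGSKLR')]
print(find_pp1_motifs(f_to_a))     # []
```

The F to A variant no longer matches because the phenylalanine is a fixed position in the pattern. A match only shows that a sequence fits the pattern; it says nothing about whether the peptide binds PP1c.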
A universal protocol to generate consensus level genome sequences for foot-and-mouth disease virus and other positive-sense polyadenylated RNA viruses using the Illumina MiSeq.

PubMed

Logan, Grace; Freimanis, Graham L; King, David J; Valdazo-González, Begoña; Bachanek-Bankowska, Katarzyna; Sanderson, Nicholas D; Knowles, Nick J; King, Donald P; Cottam, Eleanor M

2014-09-30

Next-Generation Sequencing (NGS) is revolutionizing molecular epidemiology by providing new approaches to undertake whole genome sequencing (WGS) in diagnostic settings for a variety of human and veterinary pathogens. Previous sequencing protocols have been subject to biases such as those encountered during PCR amplification and cell culture, or are restricted by the need for large quantities of starting material. We describe here a simple and robust methodology for the generation of whole genome sequences on the Illumina MiSeq. This protocol is specific for foot-and-mouth disease virus (FMDV) or other polyadenylated RNA viruses and circumvents both the use of PCR and the requirement for large amounts of initial template. The protocol was successfully validated using five FMDV positive clinical samples from the 2001 epidemic in the United Kingdom, as well as a panel of representative viruses from all seven serotypes. In addition, this protocol was successfully used to recover 94% of an FMDV genome that had previously been identified as cell culture negative. Genome sequences from three other non-FMDV polyadenylated RNA viruses (EMCV, ERAV, VESV) were also obtained with minor protocol amendments. We calculated that a minimum coverage depth of 22 reads was required to produce an accurate consensus sequence for FMDV O. This was achieved in 5 FMDV/O/UKG isolates and the type O FMDV from the serotype panel with the exception of the 5' genomic termini and area immediately flanking the poly(C) region. We have developed a universal WGS method for FMDV and other polyadenylated RNA viruses. This method works successfully from a limited quantity of starting material and eliminates the requirement for genome-specific PCR amplification. This protocol has the potential to generate consensus-level sequences within a routine high-throughput diagnostic environment.
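
The abstract above reports a minimum coverage depth of 22 reads for an accurate FMDV O consensus. The following is a minimal sketch of how such a threshold might be applied when calling a consensus, assuming a simple majority rule and masking of low-coverage positions with "N". Neither choice is stated in the abstract, and the pileup input format is hypothetical.

```python
from collections import Counter

MIN_DEPTH = 22  # depth threshold reported in the abstract for FMDV O


def call_consensus(pileup_columns, min_depth=MIN_DEPTH):
    """Majority-rule consensus; positions below min_depth become 'N'.

    pileup_columns: one string per genome position holding the bases seen
    in the reads covering that position (hypothetical input format).
    """
    consensus = []
    for column in pileup_columns:
        if len(column) < min_depth:
            consensus.append("N")  # assumption: mask under-covered positions
            continue
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


columns = ["A" * 30, "C" * 25 + "T" * 5, "G" * 10, "T" * 22]
print(call_consensus(columns))  # ACNT
```

The third position is masked because only 10 reads cover it, which mirrors the abstract's remark that the genomic termini and the area flanking the poly(C) region fell short of the required depth.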
An update on the Society for Immunotherapy of Cancer consensus statement on tumor immunotherapy for the treatment of cutaneous melanoma: version 2.0.

PubMed

Sullivan, Ryan J; Atkins, Michael B; Kirkwood, John M; Agarwala, Sanjiv S; Clark, Joseph I; Ernstoff, Marc S; Fecher, Leslie; Gajewski, Thomas F; Gastman, Brian; Lawson, David H; Lutzky, Jose; McDermott, David F; Margolin, Kim A; Mehnert, Janice M; Pavlick, Anna C; Richards, Jon M; Rubin, Krista M; Sharfman, William; Silverstein, Steven; Slingluff, Craig L; Sondak, Vernon K; Tarhini, Ahmad A; Thompson, John A; Urba, Walter J; White, Richard L; Whitman, Eric D; Hodi, F Stephen; Kaufman, Howard L

2018-05-30

Cancer immunotherapy has been firmly established as a standard of care for patients with advanced and metastatic melanoma. Therapeutic outcomes in clinical trials have resulted in the approval of 11 new drugs and/or combination regimens for patients with melanoma. However, prospective data to support evidence-based clinical decisions with respect to the optimal schedule and sequencing of immunotherapy and targeted agents, how best to manage emerging toxicities and when to stop treatment are not yet available. To address this knowledge gap, the Society for Immunotherapy of Cancer (SITC) Melanoma Task Force developed a process for consensus recommendations for physicians treating patients with melanoma integrating evidence-based data, where available, with best expert consensus opinion. The initial consensus statement was published in 2013, and version 2.0 of this report is an update based on a recent meeting of the Task Force and extensive subsequent discussions on new agents, contemporary peer-reviewed literature and emerging clinical data. The Academy of Medicine (formerly Institute of Medicine) clinical practice guidelines were used as a basis for consensus development with an updated literature search for important studies published between 1992 and 2017 and supplemented, as appropriate, by recommendations from Task Force participants. The Task Force considered patients with stage II-IV melanoma and here provide consensus recommendations for how they would incorporate the many immunotherapy options into clinical pathways for patients with cutaneous melanoma. These clinical guidelines provide physicians and healthcare providers with consensus recommendations for managing melanoma patients electing treatment with tumor immunotherapy.
HIV-1 transmission linkage in an HIV-1 prevention clinical trial

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leitner, Thomas; Campbell, Mary S; Mullins, James I

2009-01-01

HIV-1 sequencing has been used extensively in epidemiologic and forensic studies to investigate patterns of HIV-1 transmission. However, the criteria for establishing genetic linkage between HIV-1 strains in HIV-1 prevention trials have not been formalized. The Partners in Prevention HSV/HIV Transmission Study (ClinicalTrials.gov NCT00194519) enrolled 3408 HIV-1 serodiscordant heterosexual African couples to determine the efficacy of genital herpes suppression with acyclovir in reducing HIV-1 transmission. The trial analysis required laboratory confirmation of HIV-1 linkage between enrolled partners in couples in which seroconversion occurred. Here we describe the process and results from HIV-1 sequencing studies used to perform transmission linkage determination in this clinical trial. Consensus Sanger sequencing of env (C2-V3-C3) and gag (p17-p24) genes was performed on plasma HIV-1 RNA from both partners within 3 months of seroconversion; env single molecule or pyrosequencing was also performed in some cases. For linkage, we required monophyletic clustering between HIV-1 sequences in the transmitting and seroconverting partners, and developed a Bayesian algorithm using genetic distances to evaluate the posterior probability of linkage of participants' sequences. Adjudicators classified transmissions as linked, unlinked, or indeterminate. Among 151 seroconversion events, we found 108 (71.5%) linked, 40 (26.5%) unlinked, and 3 (2.0%) to have indeterminate transmissions. Nine (8.3%) were linked by consensus gag sequencing only and 8 (7.4%) required deep sequencing of env. In this first use of HIV-1 sequencing to establish endpoints in a large clinical trial, more than one-fourth of transmissions were unlinked to the enrolled partner, illustrating the relevance of these methods in the design of future HIV-1 prevention trials in serodiscordant couples. A hierarchy of sequencing techniques, analysis methods, and expert adjudication contributed to the linkage determination process.
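
The percentages in the abstract above use two different denominators, which is easy to miss on a first read. The short check below recomputes them from the reported counts. Taking the 108 linked transmissions as the denominator for the last two figures is an inference from the arithmetic, not something the abstract states outright.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place, as in the abstract."""
    return round(100 * part / whole, 1)


total_events = 151
counts = {"linked": 108, "unlinked": 40, "indeterminate": 3}

# Shares of all seroconversion events.
shares = {label: pct(n, total_events) for label, n in counts.items()}
print(shares)  # {'linked': 71.5, 'unlinked': 26.5, 'indeterminate': 2.0}

# Shares of the linked transmissions only (inferred denominator: 108).
gag_only = pct(9, counts["linked"])
deep_env = pct(8, counts["linked"])
print(gag_only, deep_env)  # 8.3 7.4
```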
Easy and accurate reconstruction of whole HIV genomes from short-read sequence data with shiver.

PubMed

Wymant, Chris; Blanquart, François; Golubchik, Tanya; Gall, Astrid; Bakker, Margreet; Bezemer, Daniela; Croucher, Nicholas J; Hall, Matthew; Hillebregt, Mariska; Ong, Swee Hoe; Ratmann, Oliver; Albert, Jan; Bannert, Norbert; Fellay, Jacques; Fransen, Katrien; Gourlay, Annabelle; Grabowski, M Kate; Gunsenheimer-Bartmeyer, Barbara; Günthard, Huldrych F; Kivelä, Pia; Kouyos, Roger; Laeyendecker, Oliver; Liitsola, Kirsi; Meyer, Laurence; Porter, Kholoud; Ristola, Matti; van Sighem, Ard; Berkhout, Ben; Cornelissen, Marion; Kellam, Paul; Reiss, Peter; Fraser, Christophe

2018-01-01

Studying the evolution of viruses and their molecular epidemiology relies on accurate viral sequence data, so that small differences between similar viruses can be meaningfully interpreted. Despite its higher throughput and more detailed minority variant data, next-generation sequencing has yet to be widely adopted for HIV. The difficulty of accurately reconstructing the consensus sequence of a quasispecies from reads (short fragments of DNA) in the presence of large between- and within-host diversity, including frequent indels, may have presented a barrier. In particular, mapping (aligning) reads to a reference sequence leads to biased loss of information; this bias can distort epidemiological and evolutionary conclusions. De novo assembly avoids this bias by aligning the reads to themselves, producing a set of sequences called contigs. However contigs provide only a partial summary of the reads, misassembly may result in their having an incorrect structure, and no information is available at parts of the genome where contigs could not be assembled. To address these problems we developed the tool shiver to pre-process reads for quality and contamination, then map them to a reference tailored to the sample using corrected contigs supplemented with the user's choice of existing reference sequences. Run with two commands per sample, it can easily be used for large heterogeneous data sets. We used shiver to reconstruct the consensus sequence and minority variant information from paired-end short-read whole-genome data produced with the Illumina platform, for sixty-five existing publicly available samples and fifty new samples. We show the systematic superiority of mapping to shiver's constructed reference compared with mapping the same reads to the closest of 3,249 real references: median values of 13 bases called differently and more accurately, 0 bases called differently and less accurately, and 205 bases of missing sequence recovered. 
We also successfully applied shiver to whole-genome samples of Hepatitis C Virus and Respiratory Syncytial Virus. shiver is publicly available from https://github.com/ChrisHIV/shiver.
cWINNOWER algorithm for finding fuzzy dna motifs

NASA Technical Reports Server (NTRS)

Liang, S.; Samanta, M. P.; Biegel, B. A.

2004-01-01

The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if a clique consisting of a sufficiently large number of mutated copies of the motif (i.e., the signals) is present in the DNA sequence. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum detectable clique size qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l, d) = (15, 4). Copyright Imperial College Press.
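
The signal definition in the abstract above (a pattern of length l differing from the motif at up to d positions) can be illustrated with a brute-force scan. This sketch only shows what counts as a signal when the motif is already known. It is not the cWINNOWER algorithm, which searches for unknown motifs through cliques of mutually similar windows. The function names and the example sequence are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_signals(sequence: str, motif: str, d: int) -> list[int]:
    """Start positions of windows within d mismatches of a known motif."""
    l = len(motif)
    return [
        i
        for i in range(len(sequence) - l + 1)
        if hamming(sequence[i:i + l], motif) <= d
    ]


# Hypothetical (l, d) = (8, 2) example: one exact copy and one copy with
# two substitutions, separated by filler bases.
motif = "ACGTACGT"
sequence = "GG" + "ACGTACGT" + "GG" + "ACCTACGA" + "GG"
print(find_signals(sequence, motif, d=2))  # [2, 12]
```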
cWINNOWER Algorithm for Finding Fuzzy DNA Motifs

NASA Technical Reports Server (NTRS)

Liang, Shoudan

2003-01-01

The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is defined as any short nucleotide pattern having up to d mutations differing from a motif of length l. The algorithm finds such motifs if multiple mutated copies of the motif (i.e., the signals) are present in the DNA sequence in sufficient abundance. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum number of detectable motifs qc as a function of sequence length N for random sequences. We found that qc increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces qc by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12000 for (l,d) = (15,4).
CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design

PubMed Central

Rose, Timothy M.; Henikoff, Jorja G.; Henikoff, Steven

2003-01-01

We have developed a new primer design strategy for PCR amplification of distantly related gene sequences based on consensus-degenerate hybrid oligonucleotide primers (CODEHOPs). An interactive program has been written to design CODEHOP PCR primers from conserved blocks of amino acids within multiply-aligned protein sequences. Each CODEHOP consists of a pool of related primers containing all possible nucleotide sequences encoding 3–4 highly conserved amino acids within a 3′ degenerate core. A longer 5′ non-degenerate clamp region contains the most probable nucleotide predicted for each flanking codon. CODEHOPs are used in PCR amplification to isolate distantly related sequences encoding the conserved amino acid sequence. The primer design software and the CODEHOP PCR strategy have been utilized for the identification and characterization of new gene orthologs and paralogs in different plant, animal and bacterial species. In addition, this approach has been successful in identifying new pathogen species. The CODEHOP designer (http://blocks.fhcrc.org/codehop.html) is linked to BlockMaker and the Multiple Alignment Processor within the Blocks Database World Wide Web (http://blocks.fhcrc.org). PMID:12824413
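
The 3′ degenerate core described in the abstract above is a pool of every nucleotide sequence that encodes a short run of conserved amino acids. The following is a minimal sketch of that enumeration, assuming the standard genetic code. Only a few amino acids are included in the table, the peptide "WDKF" is a hypothetical example, and the 5′ consensus clamp and any trimming of the core that the real CODEHOP designer may perform are not modelled.

```python
from itertools import product

# Subset of the standard genetic code (DNA codons), enough for the example.
CODONS = {
    "W": ["TGG"],
    "M": ["ATG"],
    "D": ["GAT", "GAC"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "Y": ["TAT", "TAC"],
}


def degenerate_core(peptide: str) -> list[str]:
    """Enumerate every nucleotide sequence that encodes the peptide."""
    choices = (CODONS[aa] for aa in peptide)
    return ["".join(codons) for codons in product(*choices)]


core = degenerate_core("WDKF")  # hypothetical conserved block
print(len(core))  # 8
print(core[0])    # TGGGATAAATTT
```

The pool size is the product of the codon counts (1 × 2 × 2 × 2 here), which is why a short core of 3 to 4 residues keeps the degeneracy manageable.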
Mapping and Sequencing the Human Genome

DOE R&D Accomplishments Database

1988-01-01

Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.
Identification and Structural Characterization of the ALIX-Binding Late Domains of Simian Immunodeficiency Virus SIV mac239 and SIV agmTan-1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Q Zhai; M Landesman; H Robinson

2011-12-31

Retroviral Gag proteins contain short late-domain motifs that recruit cellular ESCRT pathway proteins to facilitate virus budding. ALIX-binding late domains often contain the core consensus sequence YPX{sub n}L (where X{sub n} can vary in sequence and length). However, some simian immunodeficiency virus (SIV) Gag proteins lack this consensus sequence, yet still bind ALIX. We mapped divergent, ALIX-binding late domains within the p6{sup Gag} proteins of SIV{sub MAC239} ({sub 40}SREK{und P}YKE{und VT}ED{und L}LHLNSLF{sub 59}) and SIV{sub agmTan-1} ({sub 24}AAG{und A}YDP{und AR}KL{und L}EQYAKK{sub 41}). Crystal structures revealed that anchoring tyrosines (in lightface) and nearby hydrophobic residues (underlined) contact the ALIX V domain, revealing how lentiviruses employ a diverse family of late-domain sequences to bind ALIX and promote virus budding.
Identification and Structural Characterization of the ALIX-Binding Late Domains of Simian Immunodeficiency Virus SIVmac239 and SIVagmTan-1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhai, Q.; Robinson, H.; Landesman, M. B.

2011-01-01

Retroviral Gag proteins contain short late-domain motifs that recruit cellular ESCRT pathway proteins to facilitate virus budding. ALIX-binding late domains often contain the core consensus sequence YPX{sub n}L (where X{sub n} can vary in sequence and length). However, some simian immunodeficiency virus (SIV) Gag proteins lack this consensus sequence, yet still bind ALIX. We mapped divergent, ALIX-binding late domains within the p6{sup Gag} proteins of SIV{sub mac239} ({sub 40}SREK{und P}YKE{und VT}ED{und L}LHLNSLF{sub 59}) and SIV{sub agmTan-1} ({sub 24}AAG{und A}YDP{und AR}KL{und L}EQYAKK{sub 41}). Crystal structures revealed that anchoring tyrosines (in lightface) and nearby hydrophobic residues (underlined) contact the ALIX V domain, revealing how lentiviruses employ a diverse family of late-domain sequences to bind ALIX and promote virus budding.
Human adenovirus serotype 12 virion precursors pMu and pVI are cleaved at amino-terminal and carboxy-terminal sites that conform to the adenovirus 2 endoproteinase cleavage consensus sequence.

PubMed

Freimuth, P; Anderson, C W

1993-03-01

The sequence of a 1158-base pair fragment of the human adenovirus serotype 12 (Ad12) genome was determined. This segment encodes the precursors for virion components Mu and VI. Both Ad12 precursors contain two sequences that conform to a consensus sequence motif for cleavage by the endoproteinase of adenovirus 2 (Ad2). Analysis of the amino terminus of VI and of the peptide fragments found in Ad12 virions demonstrated that these sites are cleaved during Ad12 maturation. This observation suggests that the recognition motif for adenovirus endoproteinases is highly conserved among human serotypes. The adenovirus 2 endoproteinase polypeptide requires additional co-factors for activity (C. W. Anderson, Protein Expression Purif., 1993, 4, 8-15). Synthetic Ad12 or Ad2 pVI carboxy-terminal peptides each permitted efficient cleavage of an artificial endoproteinase substrate by recombinant Ad2 endoproteinase polypeptide.
High-Throughput SNP Discovery through Deep Resequencing of a Reduced Representation Library to Anchor and Orient Scaffolds in the Soybean Whole Genome Sequence

USDA-ARS's Scientific Manuscript database

The soybean Consensus Map 4.0 facilitated the anchoring of 95.6% of the soybean whole genome sequence developed by the Joint Genome Institute, Department of Energy but only properly oriented 66% of the sequence scaffolds. To find additional single nucleotide polymorphism (SNP) markers for additiona...
Nanopore DNA Sequencing and Genome Assembly on the International Space Station.

PubMed

Castro-Wallace, Sarah L; Chiu, Charles Y; John, Kristen K; Stahl, Sarah E; Rubins, Kathleen H; McIntyre, Alexa B R; Dworkin, Jason P; Lupisella, Mark L; Smith, David J; Botkin, Douglas J; Stephenson, Timothy A; Juul, Sissel; Turner, Daniel J; Izquierdo, Fernando; Federman, Scot; Stryke, Doug; Somasekar, Sneha; Alexander, Noah; Yu, Guixia; Mason, Christopher E; Burton, Aaron S

2017-12-21

We evaluated the performance of the MinION DNA sequencer in-flight on the International Space Station (ISS), and benchmarked its performance off-Earth against the MinION, Illumina MiSeq, and PacBio RS II sequencing platforms in terrestrial laboratories. Samples contained equimolar mixtures of genomic DNA from lambda bacteriophage, Escherichia coli (strain K12, MG1655) and Mus musculus (female BALB/c mouse). Nine sequencing runs were performed aboard the ISS over a 6-month period, yielding a total of 276,882 reads with no apparent decrease in performance over time. From sequence data collected aboard the ISS, we constructed directed assemblies of the ~4.6 Mb E. coli genome, ~48.5 kb lambda genome, and a representative M. musculus sequence (the ~16.3 kb mitochondrial genome), at 100%, 100%, and 96.7% consensus pairwise identity, respectively; de novo assembly of the E. coli genome from raw reads yielded a single contig comprising 99.9% of the genome at 98.6% consensus pairwise identity. Simulated real-time analyses of in-flight sequence data using an automated bioinformatic pipeline and laptop-based genomic assembly demonstrated the feasibility of sequencing analysis and microbial identification aboard the ISS. These findings illustrate the potential for sequencing applications including disease diagnosis, environmental monitoring, and elucidating the molecular basis for how organisms respond to spaceflight.
Sequence of a second gene encoding bovine submaxillary mucin: implication for mucin heterogeneity and cloning.

PubMed

Jiang, W; Woitach, J T; Gupta, D; Bhavanandan, V P

1998-10-20

Secreted epithelial mucins are extremely large and heterogeneous glycoproteins. We report the 5 kilobase DNA sequence of a second gene, BSM2, which encodes bovine submaxillary mucin. The determined nucleotide and deduced amino acid sequences of BSM2 are 95.2% and 92.2% identical, respectively, to those of the previously described BSM1 gene isolated from the same cow. Further, the five predicted protein domains of the two genes are 100%, 94%, 93%, 77%, and 88% identical. Based on the above results, we propose that expression of multiple homologous core proteins from a single animal is a factor in generating diversity of saccharides in mucins and in providing resistance of the molecules to proteolysis. In addition, this work raises several important issues in mucin cloning such as assembling sequences from seemingly overlapping clones and deducing consensus sequences for nearly identical tandem repeats. Copyright 1998 Academic Press.
The sampled-data consensus of multi-agent systems with probabilistic time-varying delays and packet losses

NASA Astrophysics Data System (ADS)

Sui, Xin; Yang, Yongqing; Xu, Xianyun; Zhang, Shuai; Zhang, Lingzhong

2018-02-01

This paper investigates the consensus of multi-agent systems with probabilistic time-varying delays and packet losses via sampled-data control. On the one hand, a Bernoulli-distributed white sequence is employed to model random packet losses among agents. On the other hand, a switched system is used to describe packet dropouts in a deterministic way. Based on the special property of the Laplacian matrix, the consensus problem can be converted into a stabilization problem of a switched system with lower dimensions. Some mean square consensus criteria are derived in terms of constructing an appropriate Lyapunov function and using linear matrix inequalities (LMIs). Finally, two numerical examples are given to show the effectiveness of the proposed method.
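
The setting in the abstract above can be made concrete with a much simplified simulation. This sketch assumes a discrete sampled-data update x(k+1) = x(k) - h·γ(k)·L·x(k), where L is the graph Laplacian and γ(k) is a Bernoulli variable that is 0 when the packets of a sampling instant are lost. Time-varying delays, per-link losses, the switched-system model and the LMI criteria from the paper are not modelled, and the graph, gain and loss probability are hypothetical.

```python
import random


def simulate_consensus(x0, neighbors, h=0.2, loss_prob=0.3, steps=300, seed=0):
    """Simulate agents averaging with neighbours under Bernoulli packet loss.

    neighbors: dict mapping agent index -> list of neighbour indices
    (undirected graph, so the state average is preserved).
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        if rng.random() < loss_prob:
            continue  # all packets lost at this sampling instant: hold states
        # Synchronous update, equivalent to x <- x - h * L * x.
        x = [
            xi + h * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Hypothetical 4-agent path graph: 0 - 1 - 2 - 3
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x0 = [1.0, 4.0, -2.0, 5.0]
final = simulate_consensus(x0, neighbors)
print(max(final) - min(final))  # spread shrinks towards 0
print(sum(final) / len(final))  # stays close to the initial average, 2.0
```

With h = 0.2 the step is small enough for this graph that every successful update contracts the disagreement, so lost packets only slow convergence rather than prevent it.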
Guinea Pig ID-Like Families of SINEs

PubMed Central

Kass, David H.; Schaetz, Brian A.; Beitler, Lindsey; Bonney, Kevin M.; Jamison, Nicole; Wiesner, Cathy

2009-01-01

Previous studies have indicated a paucity of SINEs within the genomes of the guinea pig and nutria, representatives of the Hystricognathi suborder of rodents. More recent work has shown that the guinea pig genome contains a large number of B1 elements, expanding to various levels among different rodents. In this work we utilized A–B PCR and screened GenBank with sequences from isolated clones to identify potentially uncharacterized SINEs within the guinea pig genome, and identified numerous sequences with a high degree of similarity (>92%) specific to the guinea pig. The presence of A-tails and flanking direct repeats associated with these sequences supported the identification of a full-length SINE, with a consensus sequence notably distinct from other rodent SINEs. Although most similar to the ID SINE, it clearly was not derived from the known ID master gene (BC1), hence we refer to this element as guinea pig ID-like (GPIDL). Using the consensus to screen the guinea pig genomic database (Assembly CavPor2) with Ensembl BlastView, we estimated at least 100,000 copies, which contrasts markedly to just over 100 copies of ID elements. Additionally we provided evidence of recent integrations of GPIDL as two of seven analyzed conserved GPIDL-containing loci demonstrated presence/absence variants in Cavia porcellus and C. aperea. Using intra-IDL PCR and sequence analyses we also provide evidence that GPIDL is derived from a hystricognath-specific SINE family. These results demonstrate that this SINE family continues to contribute to the dynamics of genomes of hystricognath rodents. PMID:19232383
Guinea pig ID-like families of SINEs.

PubMed

Kass, David H; Schaetz, Brian A; Beitler, Lindsey; Bonney, Kevin M; Jamison, Nicole; Wiesner, Cathy

2009-05-01

Previous studies have indicated a paucity of SINEs within the genomes of the guinea pig and nutria, representatives of the Hystricognathi suborder of rodents. More recent work has shown that the guinea pig genome contains a large number of B1 elements, expanding to various levels among different rodents. In this work we utilized A-B PCR and screened GenBank with sequences from isolated clones to identify potentially uncharacterized SINEs within the guinea pig genome, and identified numerous sequences with a high degree of similarity (>92%) specific to the guinea pig. The presence of A-tails and flanking direct repeats associated with these sequences supported the identification of a full-length SINE, with a consensus sequence notably distinct from other rodent SINEs. Although most similar to the ID SINE, it clearly was not derived from the known ID master gene (BC1), hence we refer to this element as guinea pig ID-like (GPIDL). Using the consensus to screen the guinea pig genomic database (Assembly CavPor2) with Ensembl BlastView, we estimated at least 100,000 copies, which contrasts markedly to just over 100 copies of ID elements. Additionally we provided evidence of recent integrations of GPIDL as two of seven analyzed conserved GPIDL-containing loci demonstrated presence/absence variants in Cavia porcellus and C. aperea. Using intra-IDL PCR and sequence analyses we also provide evidence that GPIDL is derived from a hystricognath-specific SINE family. These results demonstrate that this SINE family continues to contribute to the dynamics of genomes of hystricognath rodents.

CoVaCS: a consensus variant calling system.

PubMed

Chiara, Matteo; Gioiosa, Silvia; Chillemi, Giovanni; D'Antonio, Mattia; Flati, Tiziano; Picardi, Ernesto; Zambelli, Federico; Horner, David Stephen; Pesole, Graziano; Castrignanò, Tiziana

2018-02-05

The advent and ongoing development of next generation sequencing technologies (NGS) has led to a rapid increase in the rate of human genome re-sequencing data, paving the way for personalized genomics and precision medicine. The body of genome resequencing data is progressively increasing, underlining the need for accurate and time-effective bioinformatics systems for genotyping, a crucial prerequisite for identification of candidate causal mutations in diagnostic screens. Here we present CoVaCS, a fully automated, highly accurate system with a web based graphical interface for genotyping and variant annotation. Extensive tests on a gold standard benchmark data set (the NA12878 Illumina platinum genome) confirm that call-sets based on our consensus strategy are completely in line with those attained by similar command line based approaches, and far more accurate than call-sets from any individual tool. Importantly, our system exhibits better sensitivity and higher specificity than equivalent commercial software. CoVaCS offers optimized pipelines integrating state of the art tools for variant calling and annotation for whole genome sequencing (WGS), whole-exome sequencing (WES) and target-gene sequencing (TGS) data. The system is currently hosted at Cineca, and offers the speed of an HPC facility, a crucial consideration when large numbers of samples must be analysed. Importantly, all the analyses are performed automatically, allowing high reproducibility of the results. As such, we believe that CoVaCS can be a valuable tool for the analysis of human genome resequencing studies. CoVaCS is available at: https://bioinformatics.cineca.it/covacs .
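
The abstract above does not spell out the consensus rule, so the following is only a sketch of one common approach: keep a variant when at least a minimum number of independent callers report it. The caller names, the tuple layout (chromosome, position, reference allele, alternate allele) and the 2-of-3 threshold are all assumptions for illustration, not the documented CoVaCS behaviour.

```python
from collections import Counter


def consensus_calls(call_sets, min_support=2):
    """Keep variants reported by at least min_support callers.

    call_sets: dict mapping caller name -> list of variant tuples
    (chrom, pos, ref, alt). Duplicates within one caller count once.
    """
    support = Counter(
        variant for calls in call_sets.values() for variant in set(calls)
    )
    return sorted(v for v, n in support.items() if n >= min_support)


# Hypothetical call sets from three unnamed callers.
call_sets = {
    "caller_a": [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")],
    "caller_b": [("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")],
    "caller_c": [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")],
}
print(consensus_calls(call_sets))
# [('chr1', 100, 'A', 'G'), ('chr1', 250, 'C', 'T')]
```

Raising min_support to the number of callers gives an intersection (fewer false positives), while lowering it to 1 gives a union (fewer false negatives).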
Revisiting and re-engineering the classical zinc finger peptide: consensus peptide-1 (CP-1).

PubMed

Besold, Angelique N; Widger, Leland R; Namuswe, Frances; Michalek, Jamie L; Michel, Sarah L J; Goldberg, David P

2016-04-01

Zinc plays key structural and catalytic roles in biology. Structural zinc sites are often referred to as zinc finger (ZF) sites, and the classical ZF contains a Cys2His2 motif that is involved in coordinating Zn(II). An optimized Cys2His2 ZF, named consensus peptide 1 (CP-1), was identified more than 20 years ago using a limited set of sequenced proteins. We have reexamined the CP-1 sequence, using our current, much larger database of sequenced proteins that have been identified from high-throughput sequencing methods, and found the sequence to be largely unchanged. The CCHH ligand set of CP-1 was then altered to a CAHH motif to impart hydrolytic activity. This ligand set mimics the His2Cys ligand set of peptide deformylase (PDF), a hydrolytically active M(II)-centered (M = Zn or Fe) protein. The resultant peptide [CP-1(CAHH)] was evaluated for its ability to coordinate Zn(II) and Co(II) ions, adopt secondary structure, and promote hydrolysis. CP-1(CAHH) was found to coordinate Co(II) and Zn(II) and a pentacoordinate geometry for Co(II)-CP-1(CAHH) was implicated from UV-vis data. This suggests a His2Cys(H2O)2 environment at the metal center. The Zn(II)-bound CP-1(CAHH) was shown to adopt partial secondary structure by 1-D (1)H NMR spectroscopy. Both Zn(II)-CP-1(CAHH) and Co(II)-CP-1(CAHH) show good hydrolytic activity toward the test substrate 4-nitrophenyl acetate, exhibiting faster rates than most active synthetic Zn(II) complexes.
Application of circular consensus sequencing and network analysis to characterize the bovine IgG repertoire

USDA-ARS's Scientific Manuscript database

Background: Vertebrate immune systems generate diverse repertoires of antibodies capable of mediating response to a variety of antigens. Next generation sequencing methods provide unique approaches to a number of immuno-based research areas including antibody discovery and engineering, disease surve...
Evaluation of Bioinformatic Programmes for the Analysis of Variants within Splice Site Consensus Regions

PubMed Central

Tang, Rongying; Prosser, Debra O.; Love, Donald R.

2016-01-01

The increasing diagnostic use of gene sequencing has led to an expanding dataset of novel variants that lie within consensus splice junctions. The challenge for diagnostic laboratories is the evaluation of these variants in order to determine if they affect splicing or are merely benign. A common evaluation strategy is to use in silico analysis, and it is here that a number of programmes are available online; however, currently, there are no consensus guidelines on the selection of programmes or protocols to interpret the prediction results. Using a collection of 222 pathogenic mutations and 50 benign polymorphisms, we evaluated the sensitivity and specificity of four in silico programmes in predicting the effect of each variant on splicing. The programmes comprised Human Splice Finder (HSF), Max Entropy Scan (MES), NNSplice, and ASSP. The MES and ASSP programmes gave the highest performance based on Receiver Operator Curve analysis, with an optimal cut-off of score reduction of 10%. The study also showed that the sensitivity of prediction is affected by the level of conservation of individual positions, with in silico predictions for variants at positions −4 and +7 within consensus splice sites being largely uninformative. PMID:27313609
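As a minimal sketch of how sensitivity and specificity are computed in an evaluation of this kind, the Python below scores one hypothetical programme against 222 pathogenic and 50 benign variants. The numbers of correct calls (200 and 45) are invented for illustration and are not results from the study; the function name is likewise an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of pathogenic variants predicted to affect splicing
    specificity = tn / (tn + fp)  # share of benign variants predicted to be harmless
    return sensitivity, specificity


# Dataset sizes come from the abstract; the correct-call counts are hypothetical.
PATHOGENIC, BENIGN = 222, 50
tp, tn = 200, 45

sens, spec = sensitivity_specificity(tp, PATHOGENIC - tp, tn, BENIGN - tn)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Repeating this calculation over a range of score cut-offs yields the points of a receiver operating characteristic curve, which is one way an optimal cut-off can be chosen.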
Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets.

PubMed

Deiana, Antonio; Giansanti, Andrea

2010-04-21

Natively unfolded proteins lack a well-defined three-dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. Establishing that a given protein is natively unfolded requires laborious experimental investigation, so reliable sequence-only methods for predicting whether a sequence corresponds to a folded or an unfolded protein are of interest in fundamental and applied studies. Many proteins have amino acid compositions compatible with both the folded and the unfolded state, and belong to a twilight zone between order and disorder. This makes a dichotomic classification of protein sequences into folded and natively unfolded ones difficult. In this work we propose an operational method to identify proteins belonging to the twilight zone by combining well-performing single predictors of folding into a consensus score. In this methodological paper dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, here called gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets, in particular on a new dataset composed of 2369 folded and 81 natively unfolded proteins. Poodle-W, gVSL2 and mean pairwise energy show good performance and stability on all the datasets considered and are combined into a strictly unanimous combination score, SSU, that leaves proteins unclassified when the consensus of all combined indexes is not reached. The unclassified proteins: i) belong to an overlap region in the vector space of amino acid compositions occupied by both folded and unfolded proteins; ii) are composed of approximately the same number of order-promoting and disorder-promoting amino acids; iii) have a mean flexibility intermediate between that of folded and that of unfolded proteins. Our results show that proteins unclassified by SSU belong to a twilight zone.
Proteins left unclassified by the consensus score SSU have physical properties intermediate between those of folded and those of natively unfolded proteins, and their structural properties and evolutionary history are worth investigating.
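The strictly unanimous rule described in this abstract can be sketched in a few lines of Python. This is only an illustration of the combination logic, assuming each index has already been reduced to a binary call (True for folded, False for natively unfolded). The function name and the example calls are hypothetical; the abstract does not give the thresholds or the implementation used by the authors.

```python
from typing import Optional, Sequence


def unanimous_consensus(calls: Sequence[bool]) -> Optional[bool]:
    """Combine binary predictor calls (True = folded, False = unfolded).

    Returns the shared call when every predictor agrees, otherwise None,
    which stands for "unclassified" (the twilight zone).
    """
    if not calls:
        raise ValueError("at least one predictor call is required")
    first = calls[0]
    return first if all(call == first for call in calls) else None


# Hypothetical calls from three indexes, e.g. Poodle-W, gVSL2 and mean pairwise energy.
print(unanimous_consensus([True, True, True]))   # all agree: folded
print(unanimous_consensus([True, False, True]))  # disagreement: unclassified
```

Returning None instead of forcing a majority vote is the design choice that distinguishes a unanimous score from a majority-rule consensus: disagreement is reported rather than resolved.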
Virus genome dynamics under different propagation pressures: reconstruction of whole genome haplotypes of West Nile viruses from NGS data.

PubMed

Kortenhoeven, Cornell; Joubert, Fourie; Bastos, Armanda D S; Abolnik, Celia

2015-02-22

Extensive focus is placed on the comparative analyses of consensus genotypes in the study of West Nile virus (WNV) emergence. Few studies account for genetic change in the underlying WNV quasispecies population variants. These variants are not discernible in the consensus genome at the time of emergence, and the maintenance of mutation-selection equilibria of population variants is greatly underestimated. The emergence of lineage 1 WNV strains has been studied extensively, but recent epidemics caused by lineage 2 WNV strains in Hungary, Austria, Greece and Italy emphasize the increasing importance of this lineage to public health. In this study we used Next Generation Sequencing (NGS) data of a historic lineage 2 WNV strain to explore the quasispecies dynamics of minority variants that contribute to cell-tropism and host determination, i.e. the ability to infect different cell types or cells from different species. Minority variants contributing to host cell membrane association persist in the viral population without contributing to the genetic change in the consensus genome. Minority variants are shown to maintain a stable mutation-selection equilibrium under positive selection, particularly in the capsid gene region. This study is the first to infer positive selection and the persistence of WNV haplotype variants that contribute to viral fitness without accompanying genetic change in the consensus genotype, documented solely from NGS sequence data. The approach used in this study streamlines the experimental design seeking viral minority variants accurately from NGS data whilst minimizing the influence of associated sequence error.
Peptide Array X-Linking (PAX): A New Peptide-Protein Identification Approach

PubMed Central

Okada, Hirokazu; Uezu, Akiyoshi; Soderblom, Erik J.; Moseley, M. Arthur; Gertler, Frank B.; Soderling, Scott H.

2012-01-01

Many protein interaction domains bind short peptides based on canonical sequence consensus motifs. Here we report the development of a peptide array-based proteomics tool to identify proteins directly interacting with ligand peptides from cell lysates. Array-formatted bait peptides containing an amino acid-derived cross-linker are photo-induced to crosslink with interacting proteins from lysates of interest. Indirect associations are removed by high stringency washes under denaturing conditions. Covalently trapped proteins are subsequently identified by LC-MS/MS and screened by cluster analysis and domain scanning. We apply this methodology to peptides with different proline-containing consensus sequences and show successful identifications from brain lysates of known and novel proteins containing polyproline motif-binding domains such as EH, EVH1, SH3, WW domains. These results suggest the capacity of arrayed peptide ligands to capture and subsequently identify proteins by mass spectrometry is relatively broad and robust. Additionally, the approach is rapid and applicable to cell or tissue fractions from any source, making the approach a flexible tool for initial protein-protein interaction discovery. PMID:22606326
Characterization of Hepatitis C Virus (HCV) Envelope Diversification from Acute to Chronic Infection within a Sexually Transmitted HCV Cluster by Using Single-Molecule, Real-Time Sequencing

PubMed Central

Ho, Cynthia K. Y.; Raghwani, Jayna; Koekkoek, Sylvie; Liang, Richard H.; Van der Meer, Jan T. M.; Van Der Valk, Marc; De Jong, Menno; Pybus, Oliver G.

2016-01-01

ABSTRACT In contrast to other available next-generation sequencing platforms, PacBio single-molecule, real-time (SMRT) sequencing has the advantage of generating long reads albeit with a relatively higher error rate in unprocessed data. Using this platform, we longitudinally sampled and sequenced the hepatitis C virus (HCV) envelope genome region (1,680 nucleotides [nt]) from individuals belonging to a cluster of sexually transmitted cases. All five subjects were coinfected with HIV-1 and a closely related strain of HCV genotype 4d. In total, 50 samples were analyzed by using SMRT sequencing. By using 7 passes of circular consensus sequencing, the error rate was reduced to 0.37%, and the median number of sequences was 612 per sample. A further reduction of insertions was achieved by alignment against a sample-specific reference sequence. However, in vitro recombination during PCR amplification could not be excluded. Phylogenetic analysis supported close relationships among HCV sequences from the four male subjects and subsequent transmission from one subject to his female partner. Transmission was characterized by a strong genetic bottleneck. Viral genetic diversity was low during acute infection and increased upon progression to chronicity but subsequently fluctuated during chronic infection, caused by the alternate detection of distinct coexisting lineages. SMRT sequencing combines long reads with sufficient depth for many phylogenetic analyses and can therefore provide insights into within-host HCV evolutionary dynamics without the need for haplotype reconstruction using statistical algorithms. IMPORTANCE Next-generation sequencing has revolutionized the study of genetically variable RNA virus populations, but for phylogenetic and evolutionary analyses, longer sequences than those generated by most available platforms, while minimizing the intrinsic error rate, are desired. 
Here, we demonstrate for the first time that PacBio SMRT sequencing technology can be used to generate full-length HCV envelope sequences at the single-molecule level, providing a data set with large sequencing depth for the characterization of intrahost viral dynamics. The selection of consensus reads derived from at least 7 full circular consensus sequencing rounds significantly reduced the intrinsic high error rate of this method. We used this method to genetically characterize a unique transmission cluster of sexually transmitted HCV infections, providing insight into the distinct evolutionary pathways in each patient over time and identifying the transmission-associated genetic bottleneck as well as fluctuations in viral genetic diversity over time, accompanied by dynamic shifts in viral subpopulations. PMID:28077634
ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos

PubMed Central

2014-01-01

Background The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org. PMID:25237393
Cloning, sequencing and characterization of lipase from a polyhydroxyalkanoate- (PHA-) synthesizing Pseudomonas resinovorans

USDA-ARS's Scientific Manuscript database

Lipase gene (lip) of a biodegradable polyhydroxyalkanoate- (PHA-) synthesizing bacterium P. resinovorans NRRL B-2649 was cloned, sequenced and characterized by using consensus primers and PCR-based genome walking method. The ORF of the putative Lip (314 amino acids) and its active site (Ser111, Asp...
A FRET Biosensor for ROCK Based on a Consensus Substrate Sequence Identified by KISS Technology.

PubMed

Li, Chunjie; Imanishi, Ayako; Komatsu, Naoki; Terai, Kenta; Amano, Mutsuki; Kaibuchi, Kozo; Matsuda, Michiyuki

2017-01-11

Genetically-encoded biosensors based on Förster/fluorescence resonance energy transfer (FRET) are versatile tools for studying the spatio-temporal regulation of signaling molecules within not only the cells but also tissues. Perhaps the hardest task in the development of a FRET biosensor for protein kinases is to identify the kinase-specific substrate peptide to be used in the FRET biosensor. To solve this problem, we took advantage of kinase-interacting substrate screening (KISS) technology, which deduces a consensus substrate sequence for the protein kinase of interest. Here, we show that a consensus substrate sequence for ROCK identified by KISS yielded a FRET biosensor for ROCK, named Eevee-ROCK, with high sensitivity and specificity. By treating HeLa cells with inhibitors or siRNAs against ROCK, we show that a substantial part of the basal FRET signal of Eevee-ROCK was derived from the activities of ROCK1 and ROCK2. Eevee-ROCK readily detected ROCK activation by epidermal growth factor, lysophosphatidic acid, and serum. When cells stably-expressing Eevee-ROCK were time-lapse imaged for three days, ROCK activity was found to increase after the completion of cytokinesis, concomitant with the spreading of cells. Eevee-ROCK also revealed a gradual increase in ROCK activity during apoptosis. Thus, Eevee-ROCK, which was developed from a substrate sequence predicted by the KISS technology, will pave the way to a better understanding of the function of ROCK in a physiological context.
ChIP-seq analysis of the σ E regulon of Salmonella enterica serovar typhimurium reveals new genes implicated in heat shock and oxidative stress response

DOE PAGES

Li, Jie; Overall, Christopher C.; Johnson, Rudd C.; ...

2015-09-21

The alternative sigma factor σ E functions to maintain bacterial homeostasis and membrane integrity in response to extracytoplasmic stress by regulating thousands of genes both directly and indirectly. The transcriptional regulatory network governed by σ E in Salmonella and E. coli has been examined using microarray, however a genome-wide analysis of σ E–binding sites in Salmonella has not yet been reported. We infected macrophages with Salmonella Typhimurium over a select time course. Using chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), 31 σ E–binding sites were identified. Seventeen sites were new, which included outer membrane proteins, a quorum-sensing protein, a cell division factor, and a signal transduction modulator. The consensus sequence identified for σ E in vivo binding was similar to the one previously reported, except for a conserved G and A between the -35 and -10 regions. One third of the σ E–binding sites did not contain the consensus sequence, suggesting there may be alternative mechanisms by which σ E modulates transcription. By dissecting direct and indirect modes of σ E-mediated regulation, we found that σ E activates gene expression through recognition of both canonical and reversed consensus sequence. Lastly, new σ E regulated genes (greA, luxS, ompA and ompX) are shown to be involved in heat shock and oxidative stress responses.
ChIP-seq analysis of the σ E regulon of Salmonella enterica serovar typhimurium reveals new genes implicated in heat shock and oxidative stress response

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Jie; Overall, Christopher C.; Johnson, Rudd C.

The alternative sigma factor σ E functions to maintain bacterial homeostasis and membrane integrity in response to extracytoplasmic stress by regulating thousands of genes both directly and indirectly. The transcriptional regulatory network governed by σ E in Salmonella and E. coli has been examined using microarray, however a genome-wide analysis of σ E–binding sites in Salmonella has not yet been reported. We infected macrophages with Salmonella Typhimurium over a select time course. Using chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), 31 σ E–binding sites were identified. Seventeen sites were new, which included outer membrane proteins, a quorum-sensing protein, a cell division factor, and a signal transduction modulator. The consensus sequence identified for σ E in vivo binding was similar to the one previously reported, except for a conserved G and A between the -35 and -10 regions. One third of the σ E–binding sites did not contain the consensus sequence, suggesting there may be alternative mechanisms by which σ E modulates transcription. By dissecting direct and indirect modes of σ E-mediated regulation, we found that σ E activates gene expression through recognition of both canonical and reversed consensus sequence. Lastly, new σ E regulated genes (greA, luxS, ompA and ompX) are shown to be involved in heat shock and oxidative stress responses.
Development of the first consensus genetic map of intermediate wheatgrass (Thinopyrum intermedium) using genotyping-by-sequencing.

PubMed

Kantarski, Traci; Larson, Steve; Zhang, Xiaofei; DeHaan, Lee; Borevitz, Justin; Anderson, James; Poland, Jesse

2017-01-01

Development of the first consensus genetic map of intermediate wheatgrass gives insight into the genome and tools for molecular breeding. Intermediate wheatgrass (Thinopyrum intermedium) has been identified as a candidate for domestication and improvement as a perennial grain, forage, and biofuel crop and is actively being improved by several breeding programs. To accelerate this process using genomics-assisted breeding, efficient genotyping methods and genetic marker reference maps are needed. We present here the first consensus genetic map for intermediate wheatgrass (IWG), which confirms the species' allohexaploid nature (2n = 6x = 42) and homology to Triticeae genomes. Genotyping-by-sequencing was used to identify markers that fit expected segregation ratios and construct genetic maps for 13 heterogeneous parents of seven full-sib families. These maps were then integrated using a linear programming method to produce a consensus map with 21 linkage groups containing 10,029 markers, 3601 of which were present in at least two populations. Each of the 21 linkage groups contained between 237 and 683 markers, cumulatively covering 5061 cM (2891 cM--Kosambi) with an average distance of 0.5 cM between each pair of markers. Through mapping the sequence tags to the diploid (2n = 2x = 14) barley reference genome, we observed high colinearity and synteny between these genomes, with three homoeologous IWG chromosomes corresponding to each of the seven barley chromosomes, and mapped translocations that are known in the Triticeae. The consensus map is a valuable tool for wheat breeders to map important disease-resistance genes within intermediate wheatgrass. These genomic tools can help lead to rapid improvement of IWG and development of high-yielding cultivars of this perennial grain that would facilitate the sustainable intensification of agricultural systems.
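The summary figures quoted in this abstract can be cross-checked with simple arithmetic. The Python below is only a consistency check on the published numbers, not part of the authors' mapping method. The choice between dividing by markers or by marker intervals is an assumption made for illustration, and both give roughly 0.5 cM.

```python
# Figures taken from the abstract.
total_markers = 10_029
linkage_groups = 21
map_length_cm = 5061

# Average markers per linkage group (the abstract reports a range of 237 to 683).
mean_markers_per_group = total_markers / linkage_groups

# Each linkage group with n markers has n - 1 intervals between adjacent markers.
intervals = total_markers - linkage_groups

mean_spacing_per_marker = map_length_cm / total_markers
mean_spacing_per_interval = map_length_cm / intervals

# Three homoeologous chromosomes per barley chromosome implies 21 / 3 groups.
barley_chromosomes = linkage_groups // 3

print(f"markers per group: {mean_markers_per_group:.1f}")
print(f"spacing per marker: {mean_spacing_per_marker:.2f} cM")
print(f"spacing per interval: {mean_spacing_per_interval:.2f} cM")
print(f"implied barley chromosomes: {barley_chromosomes}")
```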
Identification of the regulatory autophosphorylation site of autophosphorylation-dependent protein kinase (auto-kinase). Evidence that auto-kinase belongs to a member of the p21-activated kinase family.

PubMed

Yu, J S; Chen, W J; Ni, M H; Chan, W H; Yang, S D

1998-08-15

Autophosphorylation-dependent protein kinase (auto-kinase) was identified from pig brain and liver on the basis of its unique autophosphorylation/activation property [Yang, Fong, Yu and Liu (1987) J. Biol. Chem. 262, 7034-7040; Yang, Chang and Soderling (1987) J. Biol. Chem. 262, 9421-9427]. Its substrate consensus sequence motif was determined as being -R-X-(X)-S*/T*-X3-S/T-. To characterize auto-kinase further, we partly sequenced the kinase purified from pig liver. The N-terminal sequence (VDGGAKTSDKQKKKAXMTDE) and two internal peptide sequences (EKLRTIV and LQNPEK/ILTP/FI) of auto-kinase were obtained. These sequences identify auto-kinase as a C-terminal catalytic fragment of p21-activated protein kinase 2 (PAK2 or gamma-PAK) lacking its N-terminal regulatory region. Auto-kinase can be recognized by an antibody raised against the C-terminal peptide of human PAK2 by immunoblotting. Furthermore the autophosphorylation site sequence of auto-kinase was successfully predicted on the basis of its substrate consensus sequence motif and the known PAK2 sequence, and was further demonstrated to be RST(P)MVGTPYWMAPEVVTR by phosphoamino acid analysis, manual Edman degradation and phosphopeptide mapping via the help of phosphorylation site analysis of a synthetic peptide corresponding to the sequence of PAK2 from residues 396 to 418. During the activation process, auto-kinase autophosphorylates mainly on a single threonine residue Thr402 (according to the sequence numbering of human PAK2). In addition, a phospho-specific antibody against a synthetic phosphopeptide containing this identified sequence was generated and shown to be able to differentially recognize the activated auto-kinase autophosphorylated at Thr402 but not the non-phosphorylated/inactive auto-kinase. 
Immunoblot analysis with this phospho-specific antibody further revealed that the change in phosphorylation level of Thr402 of auto-kinase was well correlated with the activity change of the kinase during both autophosphorylation/activation and protein phosphatase-mediated dephosphorylation/inactivation processes. Taken together, our results identify Thr402 as the regulatory autophosphorylation site of auto-kinase, which is a C-terminal catalytic fragment of PAK2.
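The substrate consensus motif quoted in this abstract, -R-X-(X)-S*/T*-X3-S/T-, can be written as a regular expression and checked against the reported autophosphorylation site peptide. The Python below is a minimal sketch under stated assumptions: the translation of the motif into a regex (one or two arbitrary residues after Arg, exactly three before the final Ser/Thr) is my reading of the notation, and the function name is hypothetical. The peptide is the one given in the abstract with the (P) annotation removed.

```python
import re

# Arg, one or two arbitrary residues, the phospho-acceptor Ser/Thr (captured),
# three arbitrary residues, then Ser/Thr.
MOTIF = re.compile(r"R.{1,2}([ST]).{3}[ST]")


def find_motif(sequence):
    """Return (match_start, acceptor_index, matched_text) for the first match, or None.

    Indices are 0-based positions within the given sequence.
    """
    match = MOTIF.search(sequence)
    if match is None:
        return None
    return match.start(), match.start(1), match.group(0)


peptide = "RSTMVGTPYWMAPEVVTR"  # from the abstract, phosphate marker stripped
print(find_motif(peptide))  # (0, 2, 'RSTMVGT')
```

The captured acceptor is the threonine in the third position of the peptide, which agrees with the RST(P)M annotation in the abstract.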
Identification of the regulatory autophosphorylation site of autophosphorylation-dependent protein kinase (auto-kinase). Evidence that auto-kinase belongs to a member of the p21-activated kinase family.

PubMed Central

Yu, J S; Chen, W J; Ni, M H; Chan, W H; Yang, S D

1998-01-01

Autophosphorylation-dependent protein kinase (auto-kinase) was identified from pig brain and liver on the basis of its unique autophosphorylation/activation property [Yang, Fong, Yu and Liu (1987) J. Biol. Chem. 262, 7034-7040; Yang, Chang and Soderling (1987) J. Biol. Chem. 262, 9421-9427]. Its substrate consensus sequence motif was determined as being -R-X-(X)-S*/T*-X3-S/T-. To characterize auto-kinase further, we partly sequenced the kinase purified from pig liver. The N-terminal sequence (VDGGAKTSDKQKKKAXMTDE) and two internal peptide sequences (EKLRTIV and LQNPEK/ILTP/FI) of auto-kinase were obtained. These sequences identify auto-kinase as a C-terminal catalytic fragment of p21-activated protein kinase 2 (PAK2 or gamma-PAK) lacking its N-terminal regulatory region. Auto-kinase can be recognized by an antibody raised against the C-terminal peptide of human PAK2 by immunoblotting. Furthermore the autophosphorylation site sequence of auto-kinase was successfully predicted on the basis of its substrate consensus sequence motif and the known PAK2 sequence, and was further demonstrated to be RST(P)MVGTPYWMAPEVVTR by phosphoamino acid analysis, manual Edman degradation and phosphopeptide mapping via the help of phosphorylation site analysis of a synthetic peptide corresponding to the sequence of PAK2 from residues 396 to 418. During the activation process, auto-kinase autophosphorylates mainly on a single threonine residue Thr402 (according to the sequence numbering of human PAK2). In addition, a phospho-specific antibody against a synthetic phosphopeptide containing this identified sequence was generated and shown to be able to differentially recognize the activated auto-kinase autophosphorylated at Thr402 but not the non-phosphorylated/inactive auto-kinase. 
Immunoblot analysis with this phospho-specific antibody further revealed that the change in phosphorylation level of Thr402 of auto-kinase was well correlated with the activity change of the kinase during both autophosphorylation/activation and protein phosphatase-mediated dephosphorylation/inactivation processes. Taken together, our results identify Thr402 as the regulatory autophosphorylation site of auto-kinase, which is a C-terminal catalytic fragment of PAK2. PMID:9693111
Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples.

PubMed

Quick, Joshua; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah C; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno R; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J

2017-06-01

Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples (i.e., without isolation and culture) remains challenging for viruses such as Zika, for which metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence-complete genomes, comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimized library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an Internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved in 1-2 d by starting with clinical samples and following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. The protocol can be used to sequence other viral genomes using the online Primal Scheme primer designer software. It is suitable for sequencing either RNA or DNA viruses in the field during outbreaks or as an inexpensive, convenient method for use in the lab.
A conserved mechanism for replication origin recognition and binding in archaea.

PubMed

Majerník, Alan I; Chong, James P J

2008-01-15

To date, methanogens are the only group within the archaea where firing DNA replication origins have not been demonstrated in vivo. In the present study we show that a previously identified cluster of ORB (origin recognition box) sequences do indeed function as an origin of replication in vivo in the archaeon Methanothermobacter thermautotrophicus. Although the consensus sequence of ORBs in M. thermautotrophicus is somewhat conserved when compared with ORB sequences in other archaea, the Cdc6-1 protein from M. thermautotrophicus (termed MthCdc6-1) displays sequence-specific binding that is selective for the MthORB sequence and does not recognize ORBs from other archaeal species. Stabilization of in vitro MthORB DNA binding by MthCdc6-1 requires additional conserved sequences 3' to those originally described for M. thermautotrophicus. By testing synthetic sequences bearing mutations in the MthORB consensus sequence, we show that Cdc6/ORB binding is critically dependent on the presence of an invariant guanine found in all archaeal ORB sequences. Mutation of a universally conserved arginine residue in the recognition helix of the winged helix domain of archaeal Cdc6-1 shows that specific origin sequence recognition is dependent on the interaction of this arginine residue with the invariant guanine. Recognition of a mutated origin sequence can be achieved by mutation of the conserved arginine residue to a lysine or glutamine residue. Thus despite a number of differences in protein and DNA sequences between species, the mechanism of origin recognition and binding appears to be conserved throughout the archaea.
Three-dimensional sampling perfection with application-optimised contrasts using a different flip angle evolutions sequence for routine imaging of the spine: preliminary experience.

PubMed

Tins, B; Cassar-Pullicino, V; Haddaway, M; Nachtrab, U

2012-08-01

The bulk of spinal imaging is still performed with conventional two-dimensional sequences. This study assesses the suitability of three-dimensional sampling perfection with application-optimised contrasts using a different flip angle evolutions (SPACE) sequence for routine spinal imaging. 62 MRI examinations of the spine were evaluated by 2 examiners in consensus for the depiction of anatomy and presence of artefact. We noted pathologies that might be missed using the SPACE sequence only or the SPACE and a sagittal T(1) weighted sequence. The reference standards were sagittal and axial T(1) weighted and T(2) weighted sequences. At a later date the evaluation was repeated by one of the original examiners and an additional examiner. There was good agreement of the single evaluations and consensus evaluation for the conventional sequences: κ>0.8, confidence interval (CI)>0.6-1.0. For the SPACE sequence, depiction of anatomy was very good for 84% of cases, with high interobserver agreement, but there was poor interobserver agreement for other cases. For artefact assessment of SPACE, κ=0.92, CI=0.92-1.0. The SPACE sequence was superior to conventional sequences for depiction of anatomy and artefact resistance. The SPACE sequence occasionally missed bone marrow oedema. In conjunction with sagittal T(1) weighted sequences, no abnormality was missed. The isotropic SPACE sequence was superior to conventional sequences in imaging difficult anatomy such as in scoliosis and spondylolysis. The SPACE sequence allows excellent assessment of anatomy owing to high spatial resolution and resistance to artefact. The sensitivity for bone marrow abnormalities is limited.
Evaluation of a decision aid for incidental genomic results, the Genomics ADvISER: protocol for a mixed methods randomised controlled trial.

PubMed

Shickh, Salma; Clausen, Marc; Mighton, Chloe; Casalino, Selina; Joshi, Esha; Glogowski, Emily; Schrader, Kasmintan A; Scheer, Adena; Elser, Christine; Panchal, Seema; Eisen, Andrea; Graham, Tracy; Aronson, Melyssa; Semotiuk, Kara M; Winter-Paquette, Laura; Evans, Michael; Lerner-Ellis, Jordan; Carroll, June C; Hamilton, Jada G; Offit, Kenneth; Robson, Mark; Thorpe, Kevin E; Laupacis, Andreas; Bombard, Yvonne

2018-04-26

Genome sequencing, a novel genetic diagnostic technology that analyses the billions of base pairs of DNA, promises to optimise healthcare through personalised diagnosis and treatment. However, implementation of genome sequencing faces challenges including the lack of consensus on disclosure of incidental results, gene changes unrelated to the disease under investigation, but of potential clinical significance to the patient and their provider. Current recommendations encourage clinicians to return medically actionable incidental results and stress the importance of education and informed consent. Given the shortage of genetics professionals and genomics expertise among healthcare providers, decision aids (DAs) can help fill a critical gap in the clinical delivery of genome sequencing. We aim to assess the effectiveness of an interactive DA developed for selection of incidental results. We will compare the DA in combination with a brief Q&A session with a genetic counsellor to genetic counselling alone in a mixed-methods randomised controlled trial. Patients who received negative standard cancer genetic results for their personal and family history of cancer and are thus eligible for sequencing will be recruited from cancer genetics clinics in Toronto. Our primary outcome is decisional conflict. Secondary outcomes are knowledge, satisfaction, preparation for decision-making, anxiety and length of session with the genetic counsellor. A subset of participants will complete a qualitative interview about preferences for incidental results. This study has been approved by research ethics boards of St. Michael's Hospital, Mount Sinai Hospital and Sunnybrook Health Sciences Centre. This research poses no significant risk to participants. This study evaluates the effectiveness of a novel patient-centred tool to support clinical delivery of incidental results. 
Results will be shared through national and international conferences, and at a stakeholder workshop to develop a consensus statement to optimise implementation of the DA in practice. NCT03244202; Pre-results. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

Improved Thermostability of Clostridium thermocellum Endoglucanase Cel8A by Using Consensus-Guided Mutagenesis

PubMed Central

Anbar, Michael; Gul, Ozgur; Lamed, Raphael; Sezerman, Ugur O.

2012-01-01

The use of thermostable cellulases is advantageous for the breakdown of lignocellulosic biomass toward the commercial production of biofuels. Previously, we have demonstrated the engineering of an enhanced thermostable family 8 cellulosomal endoglucanase (EC 3.2.1.4), Cel8A, from Clostridium thermocellum, using random error-prone PCR and a combination of three beneficial mutations, dominated by an intriguing serine-to-glycine substitution (M. Anbar, R. Lamed, E. A. Bayer, ChemCatChem 2:997–1003, 2010). In the present study, we used a bioinformatics-based approach involving sequence alignment of homologous family 8 glycoside hydrolases to create a library of consensus mutations in which residues of the catalytic module are replaced at specific positions with the most prevalent amino acids in the family. One of the mutants (G283P) displayed a higher thermal stability than the wild-type enzyme. Introducing this mutation into the previously engineered Cel8A triple mutant resulted in an optimized enzyme, increasing the half-life of activity by 14-fold at 85°C. Remarkably, no loss of catalytic activity was observed compared to that of the wild-type endoglucanase. The structural changes were simulated by molecular dynamics analysis, and specific regions were identified that contributed to the observed thermostability. Intriguingly, most of the proteins used for sequence alignment in determining the consensus residues were derived from mesophilic bacteria, with optimal temperatures well below that of C. thermocellum Cel8A. PMID:22389377
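The consensus step described in this abstract (replacing residues with the most prevalent amino acid in the family) can be illustrated with a minimal sketch. It assumes the homologs are already aligned to equal length; the function names and the toy sequences are hypothetical and are not taken from the study, whose actual alignment and library-design procedure is not given here.

```python
from collections import Counter


def column_consensus(aligned_seqs):
    """Return the most common non-gap residue at each alignment column."""
    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(res for res in column if res != "-")
        # An all-gap column has no residue to report; mark it with a gap.
        consensus.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(consensus)


def candidate_mutations(wild_type, consensus):
    """List substitutions (1-based) where the wild type differs from the consensus."""
    return [
        f"{wt}{pos}{cons}"
        for pos, (wt, cons) in enumerate(zip(wild_type, consensus), start=1)
        if wt != "-" and cons != "-" and wt != cons
    ]


# Toy alignment invented for illustration (not family 8 glycoside hydrolase data).
homologs = ["MKPLA", "MRPLA", "MKPIA", "MKPLS"]
wild_type = "MKGLA"

consensus = column_consensus(homologs)
print(consensus, candidate_mutations(wild_type, consensus))
```

Ties between equally frequent residues are broken arbitrarily in this sketch; a real design would likely need an explicit rule for them.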
Rapid Multi-Locus Sequence Typing Using Microfluidic Biochips

DTIC Science & Technology

2010-05-12

Sequence Types. The evolutionary history of all the B. cereus MLST concatenated Sequence Types (545 taxa, 2,394 nucleotide positions) was inferred using...the Neighbor-Joining method [28]. The bootstrap consensus tree inferred from 100 replicates was taken to represent the evolutionary history of the... Chlamydia (manuscript in preparation) and performed pilot studies on Staphylococcus aureus and Streptococcus pneumoniae (Data S4 and Text S2). Another potential
Restarting and recentering genetic algorithm variations for DNA fragment assembly: The necessity of a multi-strategy approach.

PubMed

Hughes, James Alexander; Houghten, Sheridan; Ashlock, Daniel

2016-12-01

DNA fragment assembly - an NP-hard problem - is one of the major steps in DNA sequencing. Multiple strategies have been used for this problem, including greedy graph-based algorithms, de Bruijn graphs, and the overlap-layout-consensus approach. This study focuses on the overlap-layout-consensus approach. Heuristics and computational intelligence methods are combined to exploit their respective benefits. These algorithm combinations were able to produce high quality results surpassing the best results obtained by a number of competitive algorithms specially designed and tuned for this problem on thirteen of sixteen popular benchmarks. This work also reinforces the necessity of using multiple search strategies as it is clearly observed that algorithm performance is dependent on problem instance; without a deeper look into many searches, top solutions could be missed entirely. Copyright © 2016. Published by Elsevier Ireland Ltd.
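As a rough illustration of the overlap-layout-consensus idea named in this abstract, the sketch below scores a candidate fragment ordering by the total overlap between neighbours, which is the kind of objective a search heuristic could try to maximise. It assumes error-free reads on a single strand with exact suffix-prefix overlaps; the function names and toy fragments are hypothetical, and this is not the authors' genetic algorithm.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def layout_score(order, fragments):
    """Sum of overlaps between adjacent fragments in a candidate ordering."""
    return sum(overlap(fragments[i], fragments[j]) for i, j in zip(order, order[1:]))


def merge(order, fragments):
    """Collapse an ordering into one sequence by trimming each exact overlap."""
    seq = fragments[order[0]]
    for i, j in zip(order, order[1:]):
        seq += fragments[j][overlap(fragments[i], fragments[j]):]
    return seq


# Toy fragments invented for illustration.
fragments = ["ACGTAC", "TACGGA", "GGATTC"]
print(layout_score([0, 1, 2], fragments), merge([0, 1, 2], fragments))
```

Because the number of orderings grows factorially with the number of fragments, exhaustive scoring is impractical, which is why heuristic search is applied to this problem.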
The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.

PubMed

Khoe, Clairine V; Chung, Long H; Murray, Vincent

2018-06-01

The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
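The degenerate consensus sequences reported here, 5'-YTC*Y and 5'-YTT*C, can be scanned for in a DNA string with a small sketch like the following. It assumes the asterisk only marks the damaged position and is dropped for matching; the `IUPAC` table, the function name and the toy sequence are illustrative assumptions, not part of the study.

```python
import re

# Subset of IUPAC nucleotide codes needed here: Y is a pyrimidine (C or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "N": "[ACGT]"}


def find_sites(consensus, dna):
    """Return 0-based start positions of every (possibly overlapping) match."""
    pattern = "".join(IUPAC[base] for base in consensus)
    # A lookahead keeps overlapping matches, which plain finditer would skip.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", dna)]


# Toy sequence invented for illustration.
toy = "GATTCCTTTCGA"
print(find_sites("YTCY", toy), find_sites("YTTC", toy))
```

A match only indicates that a site fits the consensus; it says nothing about the damage intensity that would actually be measured there.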
Event-triggered consensus tracking of multi-agent systems with Lur'e nonlinear dynamics

NASA Astrophysics Data System (ADS)

Huang, Na; Duan, Zhisheng; Wen, Guanghui; Zhao, Yu

2016-05-01

In this paper, the distributed consensus tracking problem for networked Lur'e systems is investigated based on event-triggered information interactions. An event-triggered control algorithm is designed with the advantages of reducing controller update frequency and sensor energy consumption. By using the S-procedure and the Lyapunov functional method, some sufficient conditions are derived to guarantee that consensus tracking is achieved under a directed communication topology. Meanwhile, it is shown that Zeno behaviour of triggering time sequences is excluded for the proposed event-triggered rule. Finally, some numerical simulations on coupled Chua's circuits are performed to illustrate the effectiveness of the theoretical algorithms.
An Alternative Time for Telling: When Conceptual Instruction Prior to Problem Solving Improves Mathematical Knowledge

ERIC Educational Resources Information Center

Fyfe, Emily R.; DeCaro, Marci S.; Rittle-Johnson, Bethany

2014-01-01

Background: The sequencing of learning materials greatly influences the knowledge that learners construct. Recently, learning theorists have focused on the sequencing of instruction in relation to solving related problems. The general consensus suggests explicit instruction should be provided; however, when to provide instruction remains unclear.…
Examination of the catalytic fitness of the hammerhead ribozyme by in vitro selection.

PubMed Central

Tang, J; Breaker, R R

1997-01-01

We have designed a self-cleaving ribozyme construct that is rendered inactive during preparative in vitro transcription by allosteric interactions with ATP. This allosteric ribozyme was constructed by joining a hammerhead domain to an ATP-binding RNA aptamer, thereby creating a ribozyme whose catalytic rate can be controlled by ATP. Upon purification by PAGE, the engineered ribozyme undergoes rapid self-cleavage when incubated in the absence of ATP. This strategy of "allosteric delay" was used to prepare intact hammerhead ribozymes that would otherwise self-destruct during transcription. Using a similar strategy, we have prepared a combinatorial pool of RNA in order to assess the catalytic fitness of ribozymes that carry the natural consensus sequence for the hammerhead. Using in vitro selection, this comprehensive RNA pool was screened for sequence variants of the hammerhead ribozyme that also display catalytic activity. We find that sequences that comprise the core of naturally occurring hammerhead dominate the population of selected RNAs, indicating that the natural consensus sequence of this ribozyme is optimal for catalytic function. PMID:9257650
Twin anemia polycythemia sequence: a single center experience and literature review.

PubMed

Moaddab, Amirhossein; Nassr, Ahmed A; Espinoza, Jimmy; Ruano, Rodrigo; Bateni, Zhoobin H; Shamshirsaz, Amir A; Mandy, George T; Welty, Stephen E; Erfani, Hadi; Popek, Edwina J; Belfort, Michael A; Shamshirsaz, Alireza A

2016-10-01

Twin anemia polycythemia sequence (TAPS) is defined by significant intertwin hemoglobin discordance without the amniotic fluid discordance that characterizes twin-twin-transfusion syndrome (TTTS) in monochorionic twin pregnancies. TAPS is an uncommon condition which can either occur spontaneously, or following fetoscopic laser ablation for TTTS. This complication is thought to result from chronic transfusion through very small placental anastomoses; however, the pathogenesis of TAPS remains unknown. Consequently, there is no consensus in the management of TAPS. In this article, three cases of TAPS are described and we review the literature on this uncommon pregnancy complication. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
A core phylogeny of Dictyostelia inferred from genomes representative of the eight major and minor taxonomic divisions of the group.

PubMed

Singh, Reema; Schilde, Christina; Schaap, Pauline

2016-11-17

Dictyostelia are a well-studied group of organisms with colonial multicellularity, which are members of the mostly unicellular Amoebozoa. A phylogeny based on SSU rDNA data subdivided all Dictyostelia into four major groups, but left the position of the root and of six group-intermediate taxa unresolved. Recent phylogenies inferred from 30 or 213 proteins from sequenced genomes, positioned the root between two branches, each containing two major groups, but lacked data to position the group-intermediate taxa. Since the positions of these early diverging taxa are crucial for understanding the evolution of phenotypic complexity in Dictyostelia, we sequenced six representative genomes of early diverging taxa. We retrieved orthologs of 47 housekeeping proteins with an average size of 890 amino acids from six newly sequenced and eight published genomes of Dictyostelia and unicellular Amoebozoa and inferred phylogenies from single and concatenated protein sequence alignments. Concatenated alignments of all 47 proteins, and four out of five subsets of nine concatenated proteins all produced the same consensus phylogeny with 100% statistical support. Trees inferred from just two out of the 47 proteins, individually reproduced the consensus phylogeny, highlighting that single gene phylogenies will rarely reflect correct species relationships. However, sets of two or three concatenated proteins again reproduced the consensus phylogeny, indicating that a small selection of genes suffices for low cost classification of as yet unincorporated or newly discovered dictyostelid and amoebozoan taxa by gene amplification. The multi-locus consensus phylogeny shows that groups 1 and 2 are sister clades in branch I, with the group-intermediate taxon D. polycarpum positioned as outgroup to group 2. 
Branch II consists of groups 3 and 4, with the group-intermediate taxon Polysphondylium violaceum positioned as sister to group 4, and the group-intermediate taxon Dictyostelium polycephalum branching at the base of that whole clade. Given the data, the approximately unbiased test rejects all alternative topologies favoured by SSU rDNA and individual proteins with high statistical support. The test also rejects monophyletic origins for the genera Acytostelium, Polysphondylium and Dictyostelium. The current position of Acytostelium ellipticum in the consensus phylogeny indicates that somatic cells were lost twice in Dictyostelia.
A Cluster of Legionella-Associated Pneumonia Cases in a Population of Military Recruits

DTIC Science & Technology

2007-06-01

this cluster may suggest a previously unrecognized suscep- FIG. 1. Phylogenic analysis of the training center strain (represented by the MCRD consensus...military recruits during population-based surveillance for pneumonia pathogens. Results were confirmed by sequence analysis. Cases cluster tightly...17 April 2007 A Legionella cluster was identified through retrospective PCR analysis of 240 throat swab samples from X-ray-confirmed pneumonia cases
Characterization of the Porphyromonas gingivalis conjugative transposon CTnPg1: determination of the integration site and the genes essential for conjugal transfer.

PubMed

Naito, Mariko; Sato, Keiko; Shoji, Mikio; Yukitake, Hideharu; Ogura, Yoshitoshi; Hayashi, Tetsuya; Nakayama, Koji

2011-07-01

In our previous study, extensive genomic rearrangements were found in two strains of the Gram-negative anaerobic bacterium Porphyromonas (Por.) gingivalis, and most of these rearrangements were associated with mobile genetic elements such as insertion sequences and conjugative transposons (CTns). CTnPg1, identified in Por. gingivalis strain ATCC 33277, was the first complete CTn reported for the genus Porphyromonas. In the present study, we found that CTnPg1 can be transferred from strain ATCC 33277 to another Por. gingivalis strain, W83, at a frequency of 10⁻⁷ to 10⁻⁶. The excision of CTnPg1 from the chromosome in a donor cell depends on an integrase (Int; PGN_0094) encoded in CTnPg1, whereas CTnPg1 excision is independent of PGN_0084 (a DNA topoisomerase I homologue; Exc) encoded within CTnPg1 and recA (PGN_1057) on the donor chromosome. Intriguingly, however, the transfer of CTnPg1 between Por. gingivalis strains requires RecA function in the recipient. Sequencing analysis of CTnPg1-integrated sites on the chromosomes of transconjugants revealed that the consensus attachment (att) sequence is a 13 bp sequence, TTTTCNNNNAAAA. We further report that CTnPg1 is able to transfer to two other bacterial species, Bacteroides thetaiotaomicron and Prevotella oralis. In addition, CTnPg1-like CTns are located in the genomes of other oral anaerobic bacteria, Porphyromonas endodontalis, Prevotella buccae and Prevotella intermedia, with the same consensus att sequence. These results suggest that CTns in the CTnPg1 family are widely distributed among oral anaerobic Gram-negative bacteria found in humans and play important roles in horizontal gene transfer among these bacteria.
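The 13 bp consensus att sequence TTTTCNNNNAAAA has two fixed arms around four unspecified bases, so candidate sites can be located with a short regular expression. The sketch below is only an illustration under simple assumptions: it searches the given strand only, treats N as any of A, C, G or T, and uses a made-up toy sequence and hypothetical names.

```python
import re

# TTTTCNNNNAAAA: fixed arms flanking four unspecified bases (N = A, C, G or T).
ATT_SITE = re.compile(r"TTTTC([ACGT]{4})AAAA")


def att_sites(dna):
    """Return (0-based start, 4-base spacer) for each att-like match in dna."""
    return [(m.start(), m.group(1)) for m in ATT_SITE.finditer(dna.upper())]


# Toy sequence invented for illustration.
toy = "GGCATTTTCGATCAAAATCC"
print(att_sites(toy))
```

Scanning the reverse complement as well would be needed to cover both strands of a genome.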
Systematic sequencing of mRNA from the Antarctic krill (Euphausia superba) and first tissue specific transcriptional signature

PubMed Central

De Pittà, Cristiano; Bertolucci, Cristiano; Mazzotta, Gabriella M; Bernante, Filippo; Rizzo, Giorgia; De Nardi, Barbara; Pallavicini, Alberto; Lanfranchi, Gerolamo; Costa, Rodolfo

2008-01-01

Background: Little is known about the genome sequences of Euphausiacea (krill) although these crustaceans are abundant components of the pelagic ecosystems in all oceans and are used in aquaculture and the pharmaceutical industry. This study reports the results of an expressed sequence tag (EST) sequencing project from different tissues of Euphausia superba (the Antarctic krill). Results: We have constructed and sequenced five cDNA libraries from different Antarctic krill tissues: head, abdomen, thoracopods and photophores. We have identified 1,770 high-quality ESTs which were assembled into 216 overlapping clusters and 801 singletons, resulting in a total of 1,017 non-redundant sequences. Quantitative RT-PCR analysis was performed to quantify and validate the expression levels of ten genes presenting different EST counts in krill tissues. In addition, bioinformatic screening of the non-redundant E. superba sequences identified 69 microsatellite-containing ESTs. Clusters, consensuses and related similarity and gene ontology searches were organized in a dedicated E. superba database. Conclusion: We defined the first tissue transcriptional signatures of E. superba based on functional categorization among the examined tissues. The analyses of annotated transcripts showed a higher similarity with genes from insects than with those from Malacostraca, possibly as an effect of the limited number of Malacostraca sequences in the public databases. Our catalogue provides for the first time a genomic tool to investigate the biology of the Antarctic krill. PMID:18226200
Evaluation of Phage Display Discovered Peptides as Ligands for Prostate-Specific Membrane Antigen (PSMA)

PubMed Central

Edwards, W. Barry

2013-01-01

The aim of this study was to identify potential ligands of PSMA suitable for further development as novel PSMA-targeted peptides using phage display technology. The human PSMA protein was immobilized as a target followed by incubation with a 15-mer phage display random peptide library. After one round of prescreening and two rounds of screening, high-stringency screening at the third round of panning was performed to identify the highest affinity binders. Phages which had a specific binding activity to PSMA in human prostate cancer cells were isolated and the DNA corresponding to the 15-mers were sequenced to provide three consensus sequences: GDHSPFT, SHFSVGS and EVPRLSLLAVFL as well as other sequences that did not display consensus. Two of the peptide sequences deduced from DNA sequencing of binding phages, SHSFSVGSGDHSPFT and GRFLTGGTGRLLRIS were labeled with 5-carboxyfluorescein and shown to bind and co-internalize with PSMA on human prostate cancer cells by fluorescence microscopy. The high stringency requirements yielded peptides with affinities KD∼1 µM or greater which are suitable starting points for affinity maturation. While these values were less than anticipated, the high stringency did yield peptide sequences that apparently bound to different surfaces on PSMA. These peptide sequences could be the basis for further development of peptides for prostate cancer tumor imaging and therapy. PMID:23935860
Epstein-Barr virus latent gene sequences as geographical markers of viral origin: unique EBNA3 gene signatures identify Japanese viruses as distinct members of the Asian virus family.

PubMed

Sawada, Akihisa; Croom-Carter, Deborah; Kondo, Osamu; Yasui, Masahiro; Koyama-Sato, Maho; Inoue, Masami; Kawa, Keisei; Rickinson, Alan B; Tierney, Rosemary J

2011-05-01

Polymorphisms in Epstein-Barr virus (EBV) latent genes can identify virus strains from different human populations and individual strains within a population. An Asian EBV signature has been defined almost exclusively from Chinese viruses, with little information from other Asian countries. Here we sequenced polymorphic regions of the EBNA1, 2, 3A, 3B, 3C and LMP1 genes of 31 Japanese strains from control donors and EBV-associated T/NK-cell lymphoproliferative disease (T/NK-LPD) patients. Though identical to Chinese strains in their dominant EBNA1 and LMP1 alleles, Japanese viruses were subtly different at other loci. Thus, while Chinese viruses mainly fall into two families with strongly linked 'Wu' or 'Li' alleles at EBNA2 and EBNA3A/B/C, Japanese viruses all have the consensus Wu EBNA2 allele but fall into two families at EBNA3A/B/C. One family has variant Li-like sequences at EBNA3A and 3B and the consensus Li sequence at EBNA3C; the other family has variant Wu-like sequences at EBNA3A, variants of a low frequency Chinese allele 'Sp' at EBNA3B and a consensus Sp sequence at EBNA3C. Thus, EBNA3A/B/C allelotypes clearly distinguish Japanese from Chinese strains. Interestingly, most Japanese viruses also lack those immune-escape mutations in the HLA-A11 epitope-encoding region of EBNA3B that are so characteristic of viruses from the highly A11-positive Chinese population. Control donor-derived and T/NK-LPD-derived strains were similarly distributed across allelotypes and, by using allelic polymorphisms to track virus strains in patients pre- and post-haematopoietic stem-cell transplant, we show that a single strain can induce both T/NK-LPD and B-cell-lymphoproliferative disease in the same patient.
Importing statistical measures into Artemis enhances gene identification in the Leishmania genome project.

PubMed

Aggarwal, Gautam; Worthey, E A; McDonagh, Paul D; Myler, Peter J

2003-06-07

Seattle Biomedical Research Institute (SBRI), as part of the Leishmania Genome Network (LGN), is sequencing chromosomes of the trypanosomatid protozoan species Leishmania major. At SBRI, chromosomal sequence is annotated using a combination of trained and untrained non-consensus gene-prediction algorithms with ARTEMIS, an annotation platform with rich and user-friendly interfaces. Here we describe a methodology used to import results from three different protein-coding gene-prediction algorithms (GLIMMER, TESTCODE and GENESCAN) into the ARTEMIS sequence viewer and annotation tool. Comparison of these methods, along with the CODONUSAGE algorithm built into ARTEMIS, shows the importance of combining methods to more accurately annotate the L. major genomic sequence. An improvised and powerful tool for gene prediction has been developed by importing data from widely-used algorithms into an existing annotation platform. This approach is especially fruitful in the Leishmania genome project, where there is a large proportion of novel genes requiring manual annotation.
A Benchmark Study on Error Assessment and Quality Control of CCS Reads Derived from the PacBio RS

PubMed Central

Jiao, Xiaoli; Zheng, Xin; Ma, Liang; Kutty, Geetha; Gogineni, Emile; Sun, Qiang; Sherman, Brad T.; Hu, Xiaojun; Jones, Kristine; Raley, Castle; Tran, Bao; Munroe, David J.; Stephens, Robert; Liang, Dun; Imamichi, Tomozumi; Kovacs, Joseph A.; Lempicki, Richard A.; Huang, Da Wei

2013-01-01

PacBio RS, a newly emerging third-generation DNA sequencing platform, is based on a real-time, single-molecule, nano-nitch sequencing technology that can generate very long reads (up to 20-kb) in contrast to the shorter reads produced by the first and second generation sequencing technologies. Because the platform is new, it is important to assess the sequencing error rate, as well as the quality control (QC) parameters associated with the PacBio sequence data. In this study, a mixture of 10 previously known, closely related DNA amplicons was sequenced using the PacBio RS sequencing platform. After aligning Circular Consensus Sequence (CCS) reads derived from the above sequencing experiment to the known reference sequences, we found that the median error rate was 2.5% without read QC, and improved to 1.3% with an SVM based multi-parameter QC method. In addition, a de novo assembly was used as a downstream application to evaluate the effects of different QC approaches. This benchmark study indicates that even though CCS reads are post error-corrected it is still necessary to perform appropriate QC on CCS reads in order to produce successful downstream bioinformatics analytical results. PMID:24179701
A high-density consensus map of barley linking DArT markers to SSR, RFLP and STS loci and agricultural traits

PubMed Central

Wenzl, Peter; Li, Haobing; Carling, Jason; Zhou, Meixue; Raman, Harsh; Paul, Edie; Hearnden, Phillippa; Maier, Christina; Xia, Ling; Caig, Vanessa; Ovesná, Jaroslava; Cakir, Mehmet; Poulsen, David; Wang, Junping; Raman, Rosy; Smith, Kevin P; Muehlbauer, Gary J; Chalmers, Ken J; Kleinhofs, Andris; Huttner, Eric; Kilian, Andrzej

2006-01-01

Background: Molecular marker technologies are undergoing a transition from largely serial assays measuring DNA fragment sizes to hybridization-based technologies with high multiplexing levels. Diversity Arrays Technology (DArT) is a hybridization-based technology that is increasingly being adopted by barley researchers. There is a need to integrate the information generated by DArT with previous data produced with gel-based marker technologies. The goal of this study was to build a high-density consensus linkage map from the combined datasets of ten populations, most of which were simultaneously typed with DArT and Simple Sequence Repeat (SSR), Restriction Fragment Length Polymorphism (RFLP) and/or Sequence Tagged Site (STS) markers. Results: The consensus map, built using a combination of JoinMap 3.0 software and several purpose-built perl scripts, comprised 2,935 loci (2,085 DArT, 850 other loci) and spanned 1,161 cM. It contained a total of 1,629 'bins' (unique loci), with an average inter-bin distance of 0.7 ± 1.0 cM (median = 0.3 cM). More than 98% of the map could be covered with a single DArT assay. The arrangement of loci was very similar to, and almost as optimal as, the arrangement of loci in component maps built for individual populations. The locus order of a synthetic map derived from merging the component maps without considering the segregation data was only slightly inferior. The distribution of loci along chromosomes indicated centromeric suppression of recombination in all chromosomes except 5H. DArT markers appeared to have a moderate tendency toward hypomethylated, gene-rich regions in distal chromosome areas. On average, 14 ± 9 DArT loci were identified within 5 cM on either side of SSR, RFLP or STS loci previously identified as linked to agricultural traits. 
Conclusion: Our barley consensus map provides a framework for transferring genetic information between different marker systems and for deploying DArT markers in molecular breeding schemes. The study also highlights the need for improved software for building consensus maps from high-density segregation data of multiple populations. PMID:16904008
Structure and Temporal Dynamics of Populations within Wheat Streak Mosaic Virus Isolates

PubMed Central

Hall, Jeffrey S.; French, Roy; Morris, T. Jack; Stenger, Drake C.

2001-01-01

Variation within the Type and Sidney 81 strains of wheat streak mosaic virus was assessed by single-strand conformation polymorphism (SSCP) analysis and confirmed by nucleotide sequencing. Limiting-dilution subisolates (LDSIs) of each strain were evaluated for polymorphism in the P1, P3, NIa, and CP cistrons. Different SSCP patterns among LDSIs of a strain were associated with single-nucleotide substitutions. Sidney 81 LDSI-S10 was used as founding inoculum to establish three lineages each in wheat, corn, and barley. The P1, HC-Pro, P3, CI, NIa, NIb, and CP cistrons of LDSI-S10 and each lineage at passages 1, 3, 6, and 9 were evaluated for polymorphism. By passage 9, each lineage differed in consensus sequence from LDSI-S10. The majority of substitutions occurred within NIa and CP, although at least one change occurred in each cistron except HC-Pro and P3. Most consensus sequence changes among lineages were independent, with substitutions accumulating over time. However, LDSI-S10 bore a variant nucleotide (G6016) in NIa that was restored to A6016 in eight of nine lineages by passage 6. This near-global reversion is most easily explained by selection. Examination of nonconsensus variation revealed a pool of unique substitutions (singletons) that remained constant in frequency during passage, regardless of the host species examined. These results suggest that mutations arising by viral polymerase error are generated at a constant rate but that most newly generated mutants are sequestered in virions and do not serve as replication templates. Thus, a substantial fraction of variation generated is static and has yet to be tested for relative fitness. In contrast, nonsingleton variation increased upon passage, suggesting that some mutants do serve as replication templates and may become established in a population. 
Replicated mutants may or may not rise to prominence to become the consensus sequence in a lineage, with the fate of any particular mutant subject to selection and stochastic processes such as genetic drift and population growth factors. PMID:11581391
A Consensus Genetic Map for Pinus taeda and Pinus elliottii and Extent of Linkage Disequilibrium in Two Genotype-Phenotype Discovery Populations of Pinus taeda

PubMed Central

Westbrook, Jared W.; Chhatre, Vikram E.; Wu, Le-Shin; Chamala, Srikar; Neves, Leandro Gomide; Muñoz, Patricio; Martínez-García, Pedro J.; Neale, David B.; Kirst, Matias; Mockaitis, Keithanne; Nelson, C. Dana; Peter, Gary F.; Echt, Craig S.

2015-01-01

A consensus genetic map for Pinus taeda (loblolly pine) and Pinus elliottii (slash pine) was constructed by merging three previously published P. taeda maps with a map from a pseudo-backcross between P. elliottii and P. taeda. The consensus map positioned 3856 markers via genotyping of 1251 individuals from four pedigrees. It is the densest linkage map for a conifer to date. Average marker spacing was 0.6 cM and total map length was 2305 cM. Functional predictions of mapped genes were improved by aligning expressed sequence tags used for marker discovery to full-length P. taeda transcripts. Alignments to the P. taeda genome mapped 3305 scaffold sequences onto 12 linkage groups (LGs). The consensus genetic map was used to compare the genome-wide linkage disequilibrium (LD) in a population of distantly related P. taeda individuals (ADEPT2) used for association genetic studies and a multiple-family pedigree used for genomic selection (CCLONES). The prevalence and extent of LD were greater in CCLONES than in ADEPT2; however, extended LD within LGs or between LGs was rare in both populations. The average squared correlations, r², between SNP alleles less than 1 cM apart were less than 0.05 in both populations and r² did not decay substantially with genetic distance. The consensus map and analysis of linkage disequilibrium establish a foundation for comparative association mapping and genomic selection in P. taeda and P. elliottii. PMID:26068575

Draft Genome Sequences of Two Novel Salmonella enterica subsp. enterica Strains Isolated from Low-Moisture Foods with Applications in Food Safety Research.

PubMed

Radford, Devon R; Leon-Velarde, Carlos G; Chen, Shu; Hamidi Oskouei, Amir M; Balamurugan, Sampathkumar

2018-03-29

The genomes of two strains of Salmonella enterica subsp. enterica serovar Cubana and serovar Muenchen, isolated from dry hazelnuts and chia seeds, respectively, were sequenced using the Illumina MiSeq platform, assembled de novo using the overlap-layout-consensus method, and aligned to their respective most identical sequence genome scaffolds using MUMMER and BLAST searches. Copyright © 2018 Radford et al.
Strong spurious transcription likely contributes to DNA insert bias in typical metagenomic clone libraries.

PubMed

Lam, Kathy N; Charles, Trevor C

2015-01-01

Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources. Metagenomic clone libraries in particular have aided in studies of microbial biodiversity and function, and allowed the mining of novel enzymes. Libraries are often constructed by cloning large inserts into cosmid or fosmid vectors. Recently, there have been reports of GC bias in fosmid metagenomic libraries, and it was speculated to be a result of fragmentation and loss of AT-rich sequences during cloning. However, evidence in the literature suggests that transcriptional activity or gene product toxicity may play a role. To explore possible mechanisms responsible for sequence bias in clone libraries, we constructed a cosmid library from a human microbiome sample and sequenced DNA from different steps during library construction: crude extract DNA, size-selected DNA, and cosmid library DNA. We confirmed a GC bias in the final cosmid library, and we provide evidence that the bias is not due to fragmentation and loss of AT-rich sequences but is likely occurring after DNA is introduced into Escherichia coli. To investigate the influence of strong constitutive transcription, we searched the sequence data for promoters and found that rpoD/σ(70) promoter sequences were underrepresented in the cosmid library. Furthermore, when we examined the genomes of taxa that were differentially abundant in the cosmid library relative to the original sample, we found the bias to be more correlated with the number of rpoD/σ(70) consensus sequences in the genome than with simple GC content. The GC bias of metagenomic libraries does not appear to be due to DNA fragmentation. Rather, analysis of promoter sequences provides support for the hypothesis that strong constitutive transcription from sequences recognized as rpoD/σ(70) consensus-like in E. coli may lead to instability, causing loss of the plasmid or loss of the insert DNA that gives rise to the transcription. 
Despite widespread use of E. coli to propagate foreign DNA in metagenomic libraries, the effects of in vivo transcriptional activity on clone stability are not well understood. Further work is required to tease apart the effects of transcription from those of gene product toxicity.
Easy and accurate reconstruction of whole HIV genomes from short-read sequence data with shiver

PubMed Central

Blanquart, François; Golubchik, Tanya; Gall, Astrid; Bakker, Margreet; Bezemer, Daniela; Croucher, Nicholas J; Hall, Matthew; Hillebregt, Mariska; Ratmann, Oliver; Albert, Jan; Bannert, Norbert; Fellay, Jacques; Fransen, Katrien; Gourlay, Annabelle; Grabowski, M Kate; Gunsenheimer-Bartmeyer, Barbara; Günthard, Huldrych F; Kivelä, Pia; Kouyos, Roger; Laeyendecker, Oliver; Liitsola, Kirsi; Meyer, Laurence; Porter, Kholoud; Ristola, Matti; van Sighem, Ard; Cornelissen, Marion; Kellam, Paul; Reiss, Peter

2018-01-01

Abstract Studying the evolution of viruses and their molecular epidemiology relies on accurate viral sequence data, so that small differences between similar viruses can be meaningfully interpreted. Despite its higher throughput and more detailed minority variant data, next-generation sequencing has yet to be widely adopted for HIV. The difficulty of accurately reconstructing the consensus sequence of a quasispecies from reads (short fragments of DNA) in the presence of large between- and within-host diversity, including frequent indels, may have presented a barrier. In particular, mapping (aligning) reads to a reference sequence leads to biased loss of information; this bias can distort epidemiological and evolutionary conclusions. De novo assembly avoids this bias by aligning the reads to themselves, producing a set of sequences called contigs. However contigs provide only a partial summary of the reads, misassembly may result in their having an incorrect structure, and no information is available at parts of the genome where contigs could not be assembled. To address these problems we developed the tool shiver to pre-process reads for quality and contamination, then map them to a reference tailored to the sample using corrected contigs supplemented with the user’s choice of existing reference sequences. Run with two commands per sample, it can easily be used for large heterogeneous data sets. We used shiver to reconstruct the consensus sequence and minority variant information from paired-end short-read whole-genome data produced with the Illumina platform, for sixty-five existing publicly available samples and fifty new samples. We show the systematic superiority of mapping to shiver’s constructed reference compared with mapping the same reads to the closest of 3,249 real references: median values of 13 bases called differently and more accurately, 0 bases called differently and less accurately, and 205 bases of missing sequence recovered. 
We also successfully applied shiver to whole-genome samples of Hepatitis C Virus and Respiratory Syncytial Virus. shiver is publicly available from https://github.com/ChrisHIV/shiver. PMID:29876136
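The idea of calling a consensus base at each position from mapped reads can be sketched as a per-column majority vote. This is a simplified illustration only, not shiver's method: it assumes the reads are already aligned to equal length with `-` marking gaps, and the function name, the `min_fraction` threshold, and the `N` placeholder are hypothetical choices.

```python
from collections import Counter


def majority_consensus(aligned_reads, min_fraction=0.5, unknown="N"):
    """Call the most common base in each alignment column.

    A base is called only if it accounts for more than `min_fraction`
    of the non-gap bases in that column; otherwise `unknown` is emitted.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != "-")
        if not counts:
            consensus.append(unknown)  # column contains only gaps
            continue
        base, n = counts.most_common(1)[0]
        total = sum(counts.values())
        consensus.append(base if n / total > min_fraction else unknown)
    return "".join(consensus)
```

The per-column counts that are discarded here are the kind of information a minority variant report would retain.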
Comparison of MY09/11 consensus PCR and type-specific PCRs in the detection of oncogenic HPV types.

PubMed

Depuydt, C E; Boulet, G A V; Horvath, C A J; Benoy, I H; Vereecken, A J; Bogers, J J

2007-01-01

The causal relationship between persistent infection with high-risk HPV and cervical cancer has resulted in the development of HPV DNA detection systems. The widely used MY09/11 consensus PCR targets a 450 bp conserved sequence in the HPV L1 gene, and can therefore amplify a broad spectrum of HPV types. However, limitations of these consensus primers are evident, particularly in regard to the variability in detection sensitivity among different HPV types. This study compared MY09/11 PCR with type-specific PCRs in the detection of oncogenic HPV types. The study population comprised 15,774 patients. Consensus PCR failed to detect 522 (10.9%) HPV infections indicated by type-specific PCRs. A significant correlation between failure of consensus PCR and HPV type was found. HPV types 51, 68 and 45 were missed most frequently. The clinical relevance of the HPV infections missed by MY09/11 PCR was reflected in the fraction of cases with cytological abnormalities and in follow-up, showing 104 (25.4%) CIN2+ cases. The MY09/11 false negativity could be the result of poor sensitivity, mismatch of MY09/11 primers or disruption of L1 target by HPV integration or DNA degradation. Furthermore, MY09/11 PCR lacked specificity for oncogenic HPVs. Diagnostic accuracy of the PCR systems, in terms of sensitivity (MY09/11 PCR: 87.9%; type-specific PCRs: 98.3%) and specificity (MY09/11 PCR: 38.7%; type-specific PCRs: 76.14%), and predictive values for histologically confirmed CIN2+, suggest that type-specific PCRs could be used in a clinical setting as a reliable screening tool.
Investigating intra-host and intra-herd sequence diversity of foot-and-mouth disease virus.

PubMed

King, David J; Freimanis, Graham L; Orton, Richard J; Waters, Ryan A; Haydon, Daniel T; King, Donald P

2016-10-01

Due to the poor fidelity of the enzymes involved in RNA genome replication, foot-and-mouth disease (FMD) virus samples comprise unique polymorphic populations. In this study, deep sequencing was utilised to characterise the diversity of FMD virus (FMDV) populations in 6 infected cattle present on a single farm during the series of outbreaks in the UK in 2007. A novel RT-PCR method was developed to amplify a 7.6 kb nucleotide fragment encompassing the polyprotein coding region of the FMDV genome. Illumina sequencing of each sample identified the fine polymorphic structures at each nucleotide position, from consensus level changes to variants present at a 0.24% frequency. These data were used to investigate population dynamics of FMDV at both herd and host levels, evaluate the impact of host on the viral swarm structure and to identify transmission links with viruses recovered from other farms in the same series of outbreaks. In 7 samples, from 6 different animals, a total of 5 consensus level variants were identified, in addition to 104 sub-consensus variants of which 22 were shared between 2 or more animals. Further analysis revealed differences in swarm structures from samples derived from the same animal suggesting the presence of distinct viral populations evolving independently at different lesion sites within the same infected animal. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Modeling leaderless transcription and atypical genes results in more accurate gene prediction in prokaryotes.

PubMed

Lomsadze, Alexandre; Gemayel, Karl; Tang, Shiyuyun; Borodovsky, Mark

2018-05-17

In a conventional view of the prokaryotic genome organization, promoters precede operons and ribosome binding sites (RBSs) with Shine-Dalgarno consensus precede genes. However, recent experimental research suggesting a more diverse view motivated us to develop an algorithm with improved gene-finding accuracy. We describe GeneMarkS-2, an ab initio algorithm that uses a model derived by self-training for finding species-specific (native) genes, along with an array of precomputed "heuristic" models designed to identify harder-to-detect genes (likely horizontally transferred). Importantly, we designed GeneMarkS-2 to identify several types of distinct sequence patterns (signals) involved in gene expression control, among them the patterns characteristic for leaderless transcription as well as noncanonical RBS patterns. To assess the accuracy of GeneMarkS-2, we used genes validated by COG (Clusters of Orthologous Groups) annotation, proteomics experiments, and N-terminal protein sequencing. We observed that GeneMarkS-2 performed better on average in all accuracy measures when compared with the current state-of-the-art gene prediction tools. Furthermore, the screening of ∼5000 representative prokaryotic genomes made by GeneMarkS-2 predicted frequent leaderless transcription in both archaea and bacteria. We also observed that the RBS sites in some species with leadered transcription did not necessarily exhibit the Shine-Dalgarno consensus. The modeling of different types of sequence motifs regulating gene expression prompted a division of prokaryotic genomes into five categories with distinct sequence patterns around the gene starts. © 2018 Lomsadze et al.; Published by Cold Spring Harbor Laboratory Press.
Testing Convergent Evolution in Auditory Processing Genes between Echolocating Mammals and the Aye-Aye, a Percussive-Foraging Primate

PubMed Central

Jerjos, Michael; Hohman, Baily; Lauterbur, M. Elise; Kistler, Logan

2017-01-01

Abstract Several taxonomically distinct mammalian groups—certain microbats and cetaceans (e.g., dolphins)—share both morphological adaptations related to echolocation behavior and strong signatures of convergent evolution at the amino acid level across seven genes related to auditory processing. Aye-ayes (Daubentonia madagascariensis) are nocturnal lemurs with a specialized auditory processing system. Aye-ayes tap rapidly along the surfaces of trees, listening to reverberations to identify the mines of wood-boring insect larvae; this behavior has been hypothesized to functionally mimic echolocation. Here we investigated whether there are signals of convergence in auditory processing genes between aye-ayes and known mammalian echolocators. We developed a computational pipeline (Basic Exon Assembly Tool) that produces consensus sequences for regions of interest from shotgun genomic sequencing data for nonmodel organisms without requiring de novo genome assembly. We reconstructed complete coding region sequences for the seven convergent echolocating bat–dolphin genes for aye-ayes and another lemur. We compared sequences from these two lemurs in a phylogenetic framework with those of bat and dolphin echolocators and appropriate nonecholocating outgroups. Our analysis reaffirms the existence of amino acid convergence at these loci among echolocating bats and dolphins; some methods also detected signals of convergence between echolocating bats and both mice and elephants. However, we observed no significant signal of amino acid convergence between aye-ayes and echolocating bats and dolphins, suggesting that aye-aye tap-foraging auditory adaptations represent distinct evolutionary innovations. These results are also consistent with a developing consensus that convergent behavioral ecology does not reliably predict convergent molecular evolution. PMID:28810710
Current whole-body MRI applications in the neurofibromatoses

PubMed Central

Fayad, Laura M.; Khan, Muhammad Shayan; Bredella, Miriam A.; Harris, Gordon J.; Evans, D. Gareth; Farschtschi, Said; Jacobs, Michael A.; Chhabra, Avneesh; Salamon, Johannes M.; Wenzel, Ralph; Mautner, Victor F.; Dombi, Eva; Cai, Wenli; Plotkin, Scott R.; Blakeley, Jaishri O.

2016-01-01

Objectives: The Response Evaluation in Neurofibromatosis and Schwannomatosis (REiNS) International Collaboration Whole-Body MRI (WB-MRI) Working Group reviewed the existing literature on WB-MRI, an emerging technology for assessing disease in patients with neurofibromatosis type 1 (NF1), neurofibromatosis type 2 (NF2), and schwannomatosis (SWN), to recommend optimal image acquisition and analysis methods to enable WB-MRI as an endpoint in NF clinical trials. Methods: A systematic process was used to review all published data about WB-MRI in NF syndromes to assess diagnostic accuracy, feasibility and reproducibility, and data about specific techniques for assessment of tumor burden, characterization of neoplasms, and response to therapy. Results: WB-MRI at 1.5T or 3.0T is feasible for image acquisition. Short tau inversion recovery (STIR) sequence is used in all investigations to date, suggesting consensus about the utility of this sequence for detection of WB tumor burden in people with NF. There are insufficient data to support a consensus statement about the optimal imaging planes (axial vs coronal) or 2D vs 3D approaches. Functional imaging, although used in some NF studies, has not been systematically applied or evaluated. There are no comparative studies between regional vs WB-MRI or evaluations of WB-MRI reproducibility. Conclusions: WB-MRI is feasible for identifying tumors using both 1.5T and 3.0T systems. The STIR sequence is a core sequence. Additional investigation is needed to define the optimal approach for volumetric analysis, the reproducibility of WB-MRI in NF, and the diagnostic performance of WB-MRI vs regional MRI. PMID:27527647
Full trans-activation mediated by the immediate-early protein of equine herpesvirus 1 requires a consensus TATA box, but not its cognate binding sequence.

PubMed

Kim, Seong K; Shakya, Akhalesh K; O'Callaghan, Dennis J

2016-01-04

The immediate-early protein (IEP) of equine herpesvirus 1 (EHV-1) has extensive homology to the IEP of alphaherpesviruses and possesses domains essential for trans-activation, including an acidic trans-activation domain (TAD) and binding domains for DNA, TFIIB, and TBP. Our data showed that the IEP directly interacted with transcription factor TFIIA, which is known to stabilize the binding of TBP and TFIID to the TATA box of core promoters. When the TATA box of the EICP0 promoter was mutated to a nonfunctional TATA box, IEP-mediated trans-activation was reduced from 22-fold to 7-fold. The IEP trans-activated the viral promoters in a TATA motif-dependent manner. Our previous data showed that the IEP is able to repress its own promoter when the IEP-binding sequence (IEBS) is located within 26-bp from the TATA box. When the IEBS was located at 100 bp upstream of the TATA box, IEP-mediated trans-activation was very similar to that of the minimal IE(nt -89 to +73) promoter lacking the IEBS. As the distance from the IEBS to the TATA box decreased, IEP-mediated trans-activation progressively decreased, indicating that the IEBS located within 100 bp from the TATA box sequence functions as a distance-dependent repressive element. These results indicated that IEP-mediated full trans-activation requires a consensus TATA box of core promoters, but not its binding to the cognate sequence (IEBS). Copyright © 2015 Elsevier B.V. All rights reserved.
Full trans–activation mediated by the immediate–early protein of equine herpesvirus 1 requires a consensus TATA box, but not its cognate binding sequence

PubMed Central

Kim, Seong K.; Shakya, Akhalesh K.; O'Callaghan, Dennis J.

2015-01-01

The immediate-early protein (IEP) of equine herpesvirus 1 (EHV-1) has extensive homology to the IEP of alphaherpesviruses and possesses domains essential for trans-activation, including an acidic trans-activation domain (TAD) and binding domains for DNA, TFIIB, and TBP. Our data showed that the IEP directly interacted with transcription factor TFIIA, which is known to stabilize the binding of TBP and TFIID to the TATA box of core promoters. When the TATA box of the EICP0 promoter was mutated to a nonfunctional TATA box, IEP-mediated trans-activation was reduced from 22-fold to 7-fold. The IEP trans-activated the viral promoters in a TATA motif-dependent manner. Our previous data showed that the IEP is able to repress its own promoter when the IEP-binding sequence (IEBS) is located within 26-bp from the TATA box. When the IEBS was located at 100 bp upstream of the TATA box, IEP-mediated trans-activation was very similar to that of the minimal IE(nt −89 to +73) promoter lacking the IEBS. As the distance from the IEBS to the TATA box decreased, IEP-mediated trans-activation progressively decreased, indicating that the IEBS located within 100 bp from the TATA box sequence functions as a distance-dependent repressive element. These results indicated that IEP-mediated full trans-activation requires a consensus TATA box of core promoters, but not its binding to the cognate sequence (IEBS). PMID:26541315
Detection of a novel herpesvirus from bats in the Philippines.

PubMed

Sano, Kaori; Okazaki, Sachiko; Taniguchi, Satoshi; Masangkay, Joseph S; Puentespina, Roberto; Eres, Eduardo; Cosico, Edison; Quibod, Niña; Kondo, Taisuke; Shimoda, Hiroshi; Hatta, Yuuki; Mitomo, Shumpei; Oba, Mami; Katayama, Yukie; Sassa, Yukiko; Furuya, Tetsuya; Nagai, Makoto; Une, Yumi; Maeda, Ken; Kyuwa, Shigeru; Yoshikawa, Yasuhiro; Akashi, Hiroomi; Omatsu, Tsutomu; Mizutani, Tetsuya

2015-08-01

Bats are natural hosts of many zoonotic viruses. Monitoring bat viruses is important to detect novel bat-borne infectious diseases. In this study, next generation sequencing techniques and conventional PCR were used to analyze intestine, lung, and blood clot samples collected from wild bats captured at three locations in the Davao region of the Philippines in 2012. Different viral genes belonging to the Retroviridae and Herpesviridae families were identified using next generation sequencing. The existence of herpesvirus in the samples was confirmed by PCR using herpesvirus consensus primers. The nucleotide sequences of the resulting PCR amplicons were 166 bp long. Further phylogenetic analysis identified that the virus from which this nucleotide sequence was obtained belonged to the Gammaherpesvirinae subfamily. PCR using primers specific to the nucleotide sequence obtained revealed that the infection rate among the captured bats was 30%. In this study, we present the partial genome of a novel gammaherpesvirus detected from wild bats. Our observations also indicate that this herpesvirus may be widely distributed in bat populations in the Davao region.
Real-time UAV trajectory generation using feature points matching between video image sequences

NASA Astrophysics Data System (ADS)

Byun, Younggi; Song, Jeongheon; Han, Dongyeob

2017-09-01

Unmanned aerial vehicles (UAVs), equipped with navigation systems and video capability, are currently being deployed for intelligence, reconnaissance and surveillance missions. In this paper, we present a systematic approach for the generation of UAV trajectory using a video image matching system based on SURF (Speeded-Up Robust Features) and Preemptive RANSAC (Random Sample Consensus). Video image matching to find matching points is one of the most important steps for the accurate generation of UAV trajectory (sequence of poses in 3D space). We used the SURF algorithm to find the matching points between video image sequences, and removed mismatches by using Preemptive RANSAC, which divides all matching points into outliers and inliers. Only the inliers are used to determine the epipolar geometry for estimating the relative pose (rotation and translation) between image sequences. Experimental results from simulated video image sequences showed that our approach has good potential to be applied to the automatic geo-localization of UAV systems.
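The inlier/outlier split performed by Random Sample Consensus can be sketched on a much simpler model than epipolar geometry. The example below is plain RANSAC fitting a 2D line, not the Preemptive RANSAC variant or the pose estimation used in the paper; the function name, the tolerance, and the iteration count are hypothetical choices for illustration.

```python
import random


def ransac_line(points, n_iter=200, tol=0.5, seed=0):
    """Return the largest set of points consistent with a single line y = m*x + c.

    Each iteration fits a candidate line through two randomly sampled points
    and counts the points whose vertical residual is within `tol`.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidates are skipped in this simple sketch
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


# Ten points on y = 2x + 1 plus two gross mismatches.
line_points = [(x, 2 * x + 1) for x in range(10)]
matches = line_points + [(3, 40), (7, -20)]
inliers = ransac_line(matches)
outliers = [p for p in matches if p not in inliers]
```

In the image-matching setting, the two-point line fit would be replaced by a model estimated from a minimal set of point correspondences, and only the surviving inliers would feed the final estimate.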
PHASTpep: Analysis Software for Discovery of Cell-Selective Peptides via Phage Display and Next-Generation Sequencing

PubMed Central

Dasa, Siva Sai Krishna; Kelly, Kimberly A.

2016-01-01

Next-generation sequencing has enhanced the phage display process, allowing for the quantification of millions of sequences resulting from the biopanning process. In response, many valuable analysis programs focused on specificity and finding targeted motifs or consensus sequences were developed. For targeted drug delivery and molecular imaging, it is also necessary to find peptides that are selective—targeting only the cell type or tissue of interest. We present a new analysis strategy and accompanying software, PHage Analysis for Selective Targeted PEPtides (PHASTpep), which identifies highly specific and selective peptides. Using this process, we discovered and validated, both in vitro and in vivo in mice, two sequences (HTTIPKV and APPIMSV) targeted to pancreatic cancer-associated fibroblasts that escaped identification using previously existing software. Our selectivity analysis makes it possible to discover peptides that target a specific cell type and avoid other cell types, enhancing clinical translatability by circumventing complications with systemic use. PMID:27186887
Protein evolution analysis of S-hydroxynitrile lyase by complete sequence design utilizing the INTMSAlign software.

PubMed

Nakano, Shogo; Asano, Yasuhisa

2015-02-03

Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.
Protein evolution analysis of S-hydroxynitrile lyase by complete sequence design utilizing the INTMSAlign software

NASA Astrophysics Data System (ADS)

Nakano, Shogo; Asano, Yasuhisa

2015-02-01

Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.
Spatial Sequences, but Not Verbal Sequences, Are Vulnerable to General Interference during Retention in Working Memory

ERIC Educational Resources Information Center

Morey, Candice C.; Miron, Monica D.

2016-01-01

Among models of working memory, there is not yet a consensus about how to describe functions specific to storing verbal or visual-spatial memories. We presented aural-verbal and visual-spatial lists simultaneously and sometimes cued one type of information after presentation, comparing accuracy in conditions with and without informative…
Homologues of insulinase, a new superfamily of metalloendopeptidases.

PubMed Central

Rawlings, N D; Barrett, A J

1991-01-01

On the basis of a statistical analysis of an alignment of the amino acid sequences, a new superfamily of metalloendopeptidases is proposed, consisting of human insulinase, Escherichia coli protease III and mitochondrial processing endopeptidases from Saccharomyces and Neurospora. These enzymes do not contain the 'HEXXH' consensus sequence found in all previously recognized zinc metalloendopeptidases. PMID:2025223
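Checking a protein for a consensus motif such as 'HEXXH' (where X stands for any residue) is a simple pattern search. This is a minimal sketch using Python's standard `re` module; the function name and the toy sequences in the usage lines are hypothetical and are not real enzyme sequences.

```python
import re


def find_hexxh(protein: str):
    """Return 0-based start positions of every HEXXH match.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    return [m.start() for m in re.finditer(r"(?=HE[A-Z]{2}H)", protein.upper())]


# Toy sequences for illustration only.
with_motif = find_hexxh("MKHEAAHGG")
without_motif = find_hexxh("MKAAAAAGG")
```

An empty result for every member of an alignment would be consistent with the abstract's observation that these enzymes lack the motif.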
A Consensus Method for the Prediction of ‘Aggregation-Prone’ Peptides in Globular Proteins

PubMed Central

Tsolis, Antonios C.; Papandreou, Nikos C.; Iconomidou, Vassiliki A.; Hamodrakas, Stavros J.

2013-01-01

The purpose of this work was to construct a consensus prediction algorithm of ‘aggregation-prone’ peptides in globular proteins, combining existing tools. This allows comparison of the different algorithms and the production of more objective and accurate results. Eleven (11) individual methods are combined to produce AMYLPRED2, a web tool publicly and freely available to academic users (http://biophysics.biol.uoa.gr/AMYLPRED2), for the consensus prediction of amyloidogenic determinants/‘aggregation-prone’ peptides in proteins, from sequence alone. The performance of AMYLPRED2 indicates that it functions better than individual aggregation-prediction algorithms, as perhaps expected. AMYLPRED2 is a useful tool for identifying amyloid-forming regions in proteins that are associated with several conformational diseases, called amyloidoses, such as Alzheimer's, Parkinson's, prion diseases and type II diabetes. It may also be useful for understanding the properties of protein folding and misfolding and for helping to control protein aggregation/solubility in biotechnology (recombinant proteins forming bacterial inclusion bodies) and biotherapeutics (monoclonal antibodies and biopharmaceutical proteins). PMID:23326595
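Combining several predictors into a consensus can be sketched as per-residue voting. This is an illustration of the general idea only, not AMYLPRED2's actual scoring: the function name is hypothetical, and the vote threshold is left as a caller-supplied parameter because the abstract does not state one.

```python
def consensus_hits(predictions, min_votes):
    """Flag residues predicted as aggregation-prone by at least `min_votes` methods.

    predictions: one list of booleans per method, each with one entry per residue.
    """
    if not predictions:
        raise ValueError("at least one method is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all methods must cover the same residues")
    votes = [sum(column) for column in zip(*predictions)]  # True counts as 1
    return [v >= min_votes for v in votes]


# Three hypothetical methods scoring a three-residue peptide.
methods = [
    [True, True, False],
    [True, False, False],
    [True, True, True],
]
flagged = consensus_hits(methods, min_votes=2)
```

Raising `min_votes` trades sensitivity for specificity, which is one reason a consensus can behave differently from any single method.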
Common Viral Integration Sites Identified in Avian Leukosis Virus-Induced B-Cell Lymphomas

PubMed Central

Justice, James F.; Morgan, Robin W.

2015-01-01

ABSTRACT Avian leukosis virus (ALV) induces B-cell lymphoma and other neoplasms in chickens by integrating within or near cancer genes and perturbing their expression. Four genes—MYC, MYB, Mir-155, and TERT—have previously been identified as common integration sites in these virus-induced lymphomas and are thought to play a causal role in tumorigenesis. In this study, we employ high-throughput sequencing to identify additional genes driving tumorigenesis in ALV-induced B-cell lymphomas. In addition to the four genes implicated previously, we identify other genes as common integration sites, including TNFRSF1A, MEF2C, CTDSPL, TAB2, RUNX1, MLL5, CXorf57, and BACH2. We also analyze the genome-wide ALV integration landscape in vivo and find increased frequency of ALV integration near transcriptional start sites and within transcripts. Previous work has shown ALV prefers a weak consensus sequence for integration in cultured human cells. We confirm this consensus sequence for ALV integration in vivo in the chicken genome. PMID:26670384
Modeling repetitive, non‐globular proteins

PubMed Central

Basu, Koli; Campbell, Robert L.; Guo, Shuaiqi; Sun, Tianjun

2016-01-01

Abstract While ab initio modeling of protein structures is not routine, certain types of proteins are more straightforward to model than others. Proteins with short repetitive sequences typically exhibit repetitive structures. These repetitive sequences can be more amenable to modeling if some information is known about the predominant secondary structure or other key features of the protein sequence. We have successfully built models of a number of repetitive structures with novel folds using knowledge of the consensus sequence within the sequence repeat and an understanding of the likely secondary structures that these may adopt. Our methods for achieving this success are reviewed here. PMID:26914323

Histoimmunogenetics Markup Language 1.0: Reporting next generation sequencing-based HLA and KIR genotyping.

PubMed

Milius, Robert P; Heuer, Michael; Valiga, Daniel; Doroschak, Kathryn J; Kennedy, Caleb J; Bolon, Yung-Tsi; Schneider, Joel; Pollack, Jane; Kim, Hwa Ran; Cereb, Nezih; Hollenbach, Jill A; Mack, Steven J; Maiers, Martin

2015-12-01

We present an electronic format for exchanging data for HLA and KIR genotyping with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, NGS is supported by new XML structures to capture additional NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. A full genotype, consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are externally referenced. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity in a complete and accurate fashion. HML also continues to enable the transmission of legacy methods (e.g. site-specific oligonucleotide, sequence-specific priming, and Sequence Based Typing (SBT)), adding features such as allowing multiple group-specific sequencing primers, and fully leveraging techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Towards the Rational Design of a Candidate Vaccine against Pregnancy Associated Malaria: Conserved Sequences of the DBL6ε Domain of VAR2CSA

PubMed Central

Badaut, Cyril; Bertin, Gwladys; Rustico, Tatiana; Fievet, Nadine; Massougbodji, Achille; Gaye, Alioune; Deloron, Philippe

2010-01-01

Background Placental malaria is a disease linked to the sequestration of Plasmodium falciparum infected red blood cells (IRBC) in the placenta, leading to reduced materno-fetal exchanges and to local inflammation. One of the virulence factors of P. falciparum involved in cytoadherence to chondroitin sulfate A, its placental receptor, is the adhesive protein VAR2CSA. Its localisation on the surface of IRBC makes it accessible to the immune system. VAR2CSA contains six DBL domains. The DBL6ε domain is the most variable. High variability constitutes a means for the parasite to evade the host immune response. The DBL6ε domain could constitute a very attractive basis for a vaccine candidate but its reported variability necessitates, for antigenic characterisations, identifying and classifying commonalities across isolates. Methodology/Principal Findings Local alignment analysis of the DBL6ε domain had revealed that it is not as variable as previously described. Variability is concentrated in seven regions present on the surface of the DBL6ε domain. The main goal of our work is to classify and group variable sequences that will simplify further research to determine dominant epitopes. Firstly, variable sequences were grouped following their average percent pairwise identity (APPI). Groups comprising many variable sequences sharing low variability were found. Secondly, ELISA experiments following the IgG recognition of a recombinant DBL6ε domain, and of peptides mimicking its seven variable blocks, allowed to determine an APPI cut-off and to isolate groups represented by a single consensus sequence. Conclusions/Significance A new sequence approach is used to compare variable regions in sequences that have extensive segmental gene relationship. Using this approach, the VAR2CSA DBL6 domain is composed of 7 variable blocks with limited polymorphism. Each variable block is composed of a limited number of consensus types. 
Based on peptide-based ELISA, variable blocks with 85% or greater sequence identity are expected to be recognized equally well by antibodies and can be considered the same consensus type. Therefore, analysis of the antibody response against this small number of classified sequences should help to determine epitopes. PMID:20585655
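As a rough illustration of the grouping criterion in the record above, the following sketch computes the average percent pairwise identity (APPI) of a set of equal-length, pre-aligned peptide variants and compares it with the 85% cut-off. The function names and example peptides are hypothetical, and identity is computed by simple column-wise comparison, which may differ from the procedure the authors used.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def appi(seqs):
    """Average percent pairwise identity over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 100.0  # a single sequence is trivially identical to itself
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


def same_consensus_type(seqs, cutoff=85.0):
    """Apply the 85% cut-off quoted in the abstract (assumed inclusive here)."""
    return appi(seqs) >= cutoff


# Made-up peptide variants, for illustration only.
block = ["KTEVCKIINK", "KTEVCKIINR", "KTEVCKIVNK"]
print(round(appi(block), 2), same_consensus_type(block))
```

With these invented peptides the pairwise identities are 90%, 90% and 80%, so the group as a whole sits just above the cut-off.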
MytiBase: a knowledgebase of mussel (M. galloprovincialis) transcribed sequences

PubMed Central

Venier, Paola; De Pittà, Cristiano; Bernante, Filippo; Varotto, Laura; De Nardi, Barbara; Bovo, Giuseppe; Roch, Philippe; Novoa, Beatriz; Figueras, Antonio; Pallavicini, Alberto; Lanfranchi, Gerolamo

2009-01-01

Background Although bivalves are among the most studied marine organisms due to their ecological role, economic importance and use in pollution biomonitoring, very little information is available on the genome sequences of mussels. This study reports the functional analysis of a large-scale Expressed Sequence Tag (EST) sequencing from different tissues of Mytilus galloprovincialis (the Mediterranean mussel) challenged with toxic pollutants, temperature and potentially pathogenic bacteria. Results We have constructed and sequenced seventeen cDNA libraries from different Mediterranean mussel tissues: gills, digestive gland, foot, anterior and posterior adductor muscle, mantle and haemocytes. A total of 24,939 clones were sequenced from these libraries generating 18,788 high-quality ESTs which were assembled into 2,446 overlapping clusters and 4,666 singletons resulting in a total of 7,112 non-redundant sequences. In particular, a high-quality normalized cDNA library (Nor01) was constructed as determined by the high rate of gene discovery (65.6%). Bioinformatic screening of the non-redundant M. galloprovincialis sequences identified 159 microsatellite-containing ESTs. Clusters, consensuses, related similarities and gene ontology searches have been organized in a dedicated, searchable database. Conclusion We defined the first species-specific catalogue of M. galloprovincialis ESTs including 7,112 unique transcribed sequences. Putative microsatellite markers were identified. This annotated catalogue represents a valuable platform for expression studies, marker validation and genetic linkage analysis for investigations in the biology of Mediterranean mussels. PMID:19203376
Convergence of DNA methylation and phosphorothioation epigenetics in bacterial genomes.

PubMed

Chen, Chao; Wang, Lianrong; Chen, Si; Wu, Xiaolin; Gu, Meijia; Chen, Xi; Jiang, Susu; Wang, Yunfu; Deng, Zixin; Dedon, Peter C; Chen, Shi

2017-04-25

Explosive growth in the study of microbial epigenetics has revealed a diversity of chemical structures and biological functions of DNA modifications in restriction-modification (R-M) and basic genetic processes. Here, we describe the discovery of shared consensus sequences for two seemingly unrelated DNA modification systems, 6mA methylation and phosphorothioation (PT), in which sulfur replaces a nonbridging oxygen in the DNA backbone. Mass spectrometric analysis of DNA from Escherichia coli B7A and Salmonella enterica serovar Cerro 87, strains possessing PT-based R-M genes, revealed d(GPS6mA) dinucleotides in the GPS6mAAC consensus representing ∼5% of the 1,100 to 1,300 PT-modified d(GPSA) motifs per genome, with 6mA arising from a yet-to-be-identified methyltransferase. To further explore PT and 6mA in another consensus sequence, GPS6mATC, we engineered a strain of E. coli HST04 to express Dnd genes from Hahella chejuensis KCTC2396 (PT in GPSATC) and Dam methyltransferase from E. coli DH10B (6mA in G6mATC). Based on this model, in vitro studies revealed reduced Dam activity in GPSATC-containing oligonucleotides whereas single-molecule real-time sequencing of HST04 DNA revealed 6mA in all 2,058 GPSATC sites (5% of 37,698 total GATC sites). This model system also revealed temperature-sensitive restriction by DndFGH in KCTC2396 and B7A, which was exploited to discover that 6mA can substitute for PT to confer resistance to restriction by the DndFGH system. These results point to complex but unappreciated interactions between DNA modification systems and raise the possibility of coevolution of interacting systems to facilitate the function of each.
A High Density Consensus Genetic Map of Tetraploid Cotton That Integrates Multiple Component Maps through Molecular Marker Redundancy Check

PubMed Central

Blenda, Anna; Fang, David D.; Rami, Jean-François; Garsmeur, Olivier; Luo, Feng; Lacape, Jean-Marc

2012-01-01

A consensus genetic map of tetraploid cotton was constructed using six high-density maps and after the integration of a sequence-based marker redundancy check. Public cotton SSR libraries (17,343 markers) were curated for sequence redundancy using 90% as a similarity cutoff. As a result, 20% of the markers (3,410) could be considered as redundant with some other markers. The marker redundancy information had been a crucial part of the map integration process, in which the six most informative interspecific Gossypium hirsutum×G. barbadense genetic maps were used for assembling a high density consensus (HDC) map for tetraploid cotton. With redundant markers being removed, the HDC map could be constructed thanks to the sufficient number of collinear non-redundant markers in common between the component maps. The HDC map consists of 8,254 loci, originating from 6,669 markers, and spans 4,070 cM, with an average of 2 loci per cM. The HDC map presents a high rate of locus duplications, as 1,292 markers among the 6,669 were mapped in more than one locus. Two thirds of the duplications are bridging homoeologous AT and DT chromosomes constitutive of allopolyploid cotton genome, with an average of 64 duplications per AT/DT chromosome pair. Sequences of 4,744 mapped markers were used for a mutual blast alignment (BBMH) with the 13 major scaffolds of the recently released Gossypium raimondii genome indicating high level of homology between the diploid D genome and the tetraploid cotton genetic map, with only a few minor possible structural rearrangements. Overall, the HDC map will serve as a valuable resource for trait QTL comparative mapping, map-based cloning of important genes, and better understanding of the genome structure and evolution of tetraploid cotton. PMID:23029214
ESTree db: a Tool for Peach Functional Genomics

PubMed Central

Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo

2005-01-01

Background The ESTree db represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A php-based web interface was developed to query the database. Results The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. Conclusion The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig. PMID:16351742
Design and Evaluation of Illumina MiSeq-Compatible, 18S rRNA Gene-Specific Primers for Improved Characterization of Mixed Phototrophic Communities.

PubMed

Bradley, Ian M; Pinto, Ameet J; Guest, Jeremy S

2016-10-01

The use of high-throughput sequencing technologies with the 16S rRNA gene for characterization of bacterial and archaeal communities has become routine. However, the adoption of sequencing methods for eukaryotes has been slow, despite their significance to natural and engineered systems. There are large variations among the target genes used for amplicon sequencing, and for the 18S rRNA gene, there is no consensus on which hypervariable region provides the most suitable representation of diversity. Additionally, it is unclear how much PCR/sequencing bias affects the depiction of community structure using current primers. The present study amplified the V4 and V8-V9 regions from seven microalgal mock communities as well as eukaryotic communities from freshwater, coastal, and wastewater samples to examine the effect of PCR/sequencing bias on community structure and membership. We found that degeneracies on the 3' end of the current V4-specific primers impact read length and mean relative abundance. Furthermore, the PCR/sequencing error is markedly higher for GC-rich members than for communities with balanced GC content. Importantly, the V4 region failed to reliably capture 2 of the 12 mock community members, and the V8-V9 hypervariable region more accurately represents mean relative abundance and alpha and beta diversity. Overall, the V4 and V8-V9 regions show similar community representations over freshwater, coastal, and wastewater environments, but specific samples show markedly different communities. These results indicate that multiple primer sets may be advantageous for gaining a more complete understanding of community structure and highlight the importance of including mock communities composed of species of interest. The quantification of error associated with community representation by amplicon sequencing is a critical challenge that is often ignored. 
When target genes are amplified using currently available primers, differential amplification efficiencies result in inaccurate estimates of community structure. The extent to which amplification bias affects community representation and the accuracy with which different gene targets represent community structure are not known. As a result, there is no consensus on which region provides the most suitable representation of diversity for eukaryotes. This study determined the accuracy with which commonly used 18S rRNA gene primer sets represent community structure and identified particular biases related to PCR amplification and Illumina MiSeq sequencing in order to more accurately study eukaryotic microbial communities. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Cloning and Expression of the Erwinia carotovora subsp. carotovora Gene Encoding the Low-Molecular-Weight Bacteriocin Carocin S1

PubMed Central

Chuang, Duen-yau; Chien, Yung-chei; Wu, Huang-Pin

2007-01-01

The purpose of this study was to clone the carocin S1 gene and express it in a non-carocin-producing strain of Erwinia carotovora. A mutant, TH22-10, which produced a high-molecular-weight bacteriocin but not a low-molecular-weight bacteriocin, was obtained by Tn5 insertional mutagenesis using H-rif-8-2 (a spontaneous rifampin-resistant mutant of Erwinia carotovora subsp. carotovora 89-H-4). Using thermal asymmetric interlaced PCR, the DNA sequence from the Tn5 insertion site and the DNA sequence of the contiguous 2,280-bp region were determined. Two complete open reading frames (ORF), designated ORF2 and ORF3, were identified within the sequence fragment. ORF2 and ORF3 were identified with the carocin S1 genes, caroS1K (ORF2) and caroS1I (ORF3), which, respectively, encode a killing protein (CaroS1K) and an immunity protein (CaroS1I). These genes were homologous to the pyocin S3 gene and the pyocin AP41 gene. Carocin S1 was expressed in E. carotovora subsp. carotovora Ea1068 and replicated in TH22-10 but could not be expressed in Escherichia coli (JM101) because a consensus sequence resembling an SOS box was absent. A putative sequence similar to the consensus sequence for the E. coli cyclic AMP receptor protein binding site (−312 bp) was found upstream of the start codon. Production of this bacteriocin was also induced by glucose and lactose. The homology search results indicated that the carocin S1 gene (between bp 1078 and bp 1704) was homologous to the pyocin S3 and pyocin AP41 genes in Pseudomonas aeruginosa. These genes encode proteins with nuclease activity (domain 4). This study found that carocin S1 also has nuclease activity. PMID:17071754
p53 Specifically Binds Triplex DNA In Vitro and in Cells

PubMed Central

Brázdová, Marie; Tichý, Vlastimil; Helma, Robert; Bažantová, Pavla; Polášková, Alena; Krejčí, Aneta; Petr, Marek; Navrátilová, Lucie; Tichá, Olga; Nejedlý, Karel; Bennink, Martin L.; Subramaniam, Vinod; Bábková, Zuzana; Martínek, Tomáš; Lexa, Matej; Adámik, Matej

2016-01-01

Triplex DNA is implicated in a wide range of biological activities, including regulation of gene expression and genomic instability leading to cancer. The tumor suppressor p53 is a central regulator of cell fate in response to different types of insults. Sequence and structure specific modes of DNA recognition are core attributes of the p53 protein. The focus of this work is the structure-specific binding of p53 to DNA containing triplex-forming sequences in vitro and in cells and the effect on p53-driven transcription. This is the first DNA binding study of full-length p53 and its deletion variants to both intermolecular and intramolecular T.A.T triplexes. We demonstrate that the interaction of p53 with intermolecular T.A.T triplex is comparable to the recognition of CTG-hairpin non-B DNA structure. Using deletion mutants we determined the C-terminal DNA binding domain of p53 to be crucial for triplex recognition. Furthermore, strong p53 recognition of intramolecular T.A.T triplexes (H-DNA), stabilized by negative superhelicity in plasmid DNA, was detected by competition and immunoprecipitation experiments, and visualized by AFM. Moreover, chromatin immunoprecipitation revealed p53 binding to a T.A.T-forming sequence in vivo. Enhanced reporter transactivation by p53 on insertion of a triplex-forming sequence into a plasmid with the p53 consensus sequence was observed by luciferase reporter assays. In-silico scan of human regulatory regions for the simultaneous presence of both consensus sequence and T.A.T motifs identified a set of candidate p53 target genes and p53-dependent activation of several of them (ABCG5, ENOX1, INSR, MCC, NFAT5) was confirmed by RT-qPCR. Our results show that T.A.T triplex comprises a new class of p53 binding sites targeted by p53 in a DNA structure-dependent mode in vitro and in cells. The contribution of p53 DNA structure-dependent binding to the regulation of transcription is discussed. PMID:27907175
Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing

PubMed Central

2013-01-01

Background Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Results Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li’s D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li’s D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. 
FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low FST. Conclusions This initial survey of genetic variation within honey bee RNA viruses suggests future directions for studies examining the underlying causes of population-genetic structure in these economically important pathogens. PMID:23497218
Characterization of NIST human mitochondrial DNA SRM-2392 and SRM-2392-I standard reference materials by next generation sequencing.

PubMed

Riman, Sarah; Kiesler, Kevin M; Borsuk, Lisa A; Vallone, Peter M

2017-07-01

Standard Reference Materials SRM 2392 and 2392-I are intended to provide quality control when amplifying and sequencing human mitochondrial genome sequences. The National Institute of Standards and Technology (NIST) offers these SRMs to laboratories performing DNA-based forensic human identification, molecular diagnosis of mitochondrial diseases, mutation detection, evolutionary anthropology, and genetic genealogy. The entire mtGenome (∼16569bp) of SRM 2392 and 2392-I have previously been characterized at NIST by Sanger sequencing. Herein, we used the sensitivity, specificity, and accuracy offered by next generation sequencing (NGS) to: (1) re-sequence the certified values of the SRM 2392 and 2392-I; (2) confirm Sanger data with a high coverage new sequencing technology; (3) detect lower level heteroplasmies (<20%); and thus (4) support mitochondrial sequencing communities in the adoption of NGS methods. To obtain a consensus sequence for the SRMs as well as identify and control any bias, sequencing was performed using two NGS platforms and data was analyzed using different bioinformatics pipelines. Our results confirm five low level heteroplasmy sites that were not previously observed with Sanger sequencing: three sites in the GM09947A template in SRM 2392 and two sites in the HL-60 template in SRM 2392-I. Copyright © 2017 Elsevier B.V. All rights reserved.
High resolution identity testing of inactivated poliovirus vaccines

PubMed Central

Mee, Edward T.; Minor, Philip D.; Martin, Javier

2015-01-01

Background Definitive identification of poliovirus strains in vaccines is essential for quality control, particularly where multiple wild-type and Sabin strains are produced in the same facility. Sequence-based identification provides the ultimate in identity testing and would offer several advantages over serological methods. Methods We employed random RT-PCR and high throughput sequencing to recover full-length genome sequences from monovalent and trivalent poliovirus vaccine products at various stages of the manufacturing process. Results All expected strains were detected in previously characterised products and the method permitted identification of strains comprising as little as 0.1% of sequence reads. Highly similar Mahoney and Sabin 1 strains were readily discriminated on the basis of specific variant positions. Analysis of a product known to contain incorrect strains demonstrated that the method correctly identified the contaminants. Conclusion Random RT-PCR and shotgun sequencing provided high resolution identification of vaccine components. In addition to the recovery of full-length genome sequences, the method could also be easily adapted to the characterisation of minor variant frequencies and distinction of closely related products on the basis of distinguishing consensus and low frequency polymorphisms. PMID:26049003
Accurate Typing of Human Leukocyte Antigen Class I Genes by Oxford Nanopore Sequencing.

PubMed

Liu, Chang; Xiao, Fangzhou; Hoisington-Lopez, Jessica; Lang, Kathrin; Quenzel, Philipp; Duffy, Brian; Mitra, Robi David

2018-04-03

Oxford Nanopore Technologies' MinION has expanded the current DNA sequencing toolkit by delivering long read lengths and extreme portability. The MinION has the potential to enable expedited point-of-care human leukocyte antigen (HLA) typing, an assay routinely used to assess the immunologic compatibility between organ donors and recipients, but the platform's high error rate makes it challenging to type alleles with accuracy. We developed and validated accurate typing of HLA by Oxford nanopore (Athlon), a bioinformatic pipeline that i) maps nanopore reads to a database of known HLA alleles, ii) identifies candidate alleles with the highest read coverage at different resolution levels that are represented as branching nodes and leaves of a tree structure, iii) generates consensus sequences by remapping the reads to the candidate alleles, and iv) calls the final diploid genotype by blasting consensus sequences against the reference database. Using two independent data sets generated on the R9.4 flow cell chemistry, Athlon achieved a 100% accuracy in class I HLA typing at the two-field resolution. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
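The consensus-generation step (iii) in the record above can be pictured with a generic column-wise majority vote over reads that have already been aligned to a candidate allele. This is only a minimal sketch under that assumption: the function name, the `min_fraction` threshold and the toy reads are hypothetical, and the abstract does not say how Athlon itself derives its consensus.

```python
from collections import Counter


def majority_consensus(aligned_reads, min_fraction=0.5, ambiguous="N"):
    """Column-wise majority consensus over equal-length, pre-aligned reads.

    A base is called only if it is seen in strictly more than `min_fraction`
    of the reads at that column; otherwise `ambiguous` is emitted.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) > min_fraction else ambiguous)
    return "".join(consensus)


# Toy reads with one scattered error each, for illustration only.
reads = ["ACGT", "ACGA", "ACTT", "TCGT"]
print(majority_consensus(reads))
```

Because each toy read carries an error at a different column, the vote recovers the underlying sequence even though only one read matches it exactly, which is the reason a consensus is useful on a platform with a high per-read error rate.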
Armored RNA Technology for Production of Ribonuclease-Resistant Viral RNA Controls and Standards

PubMed Central

Pasloske, Brittan L.; Walkerpeach, Cindy R.; Obermoeller, R. Dawn; Winkler, Matthew; DuBois, Dwight B.

1998-01-01

The widespread use of sensitive assays for the detection of viral and cellular RNA sequences has created a need for stable, well-characterized controls and standards. We describe the development of a versatile, novel system for creating RNase-resistant RNA. “Armored RNA” is a complex of MS2 bacteriophage coat protein and RNA produced in Escherichia coli by the induction of an expression plasmid that encodes the coat protein and an RNA standard sequence. The RNA sequences are completely protected from RNase digestion within the bacteriophage-like complexes. As a prototype, a 172-base consensus sequence from a portion of the human immunodeficiency virus type 1 (HIV-1) gag gene was synthesized and cloned into the packaging vector used to produce the bacteriophage-like particles. After production and purification, the resulting HIV-1 Armored RNA particles were shown to be resistant to degradation in human plasma and produced reproducible results in the Amplicor HIV-1 Monitor assay for 180 days when stored at −20°C or for 60 days at 4°C. Additionally, Armored RNA preparations are homogeneous and noninfectious. PMID:9817878
CHROMA: consensus-based colouring of multiple alignments for publication.

PubMed

Goodstadt, L; Ponting, C P

2001-09-01

CHROMA annotates multiple protein sequence alignments by consensus to produce formatted and coloured text suitable for incorporation into other documents for publication. The package is designed to be flexible and reliable, and has a simple-to-use graphical user interface running under Microsoft Windows. Both the executables and source code for CHROMA running under Windows and Linux (portable command-line only) are freely available at http://www.lg.ndirect.co.uk/chroma. Software enquiries should be directed to CHROMA@lg.ndirect.co.uk.
MEGANTE: A Web-Based System for Integrated Plant Genome Annotation

PubMed Central

Numa, Hisataka; Itoh, Takeshi

2014-01-01

The recent advancement of high-throughput genome sequencing technologies has resulted in a considerable increase in demands for large-scale genome annotation. While annotation is a crucial step for downstream data analyses and experimental studies, this process requires substantial expertise and knowledge of bioinformatics. Here we present MEGANTE, a web-based annotation system that makes plant genome annotation easy for researchers unfamiliar with bioinformatics. Without any complicated configuration, users can perform genomic sequence annotations simply by uploading a sequence and selecting the species to query. MEGANTE automatically runs several analysis programs and integrates the results to select the appropriate consensus exon–intron structures and to predict open reading frames (ORFs) at each locus. Functional annotation, including a similarity search against known proteins and a functional domain search, is also performed for the predicted ORFs. The resultant annotation information is visualized with a widely used genome browser, GBrowse. For ease of analysis, the results can be downloaded in Microsoft Excel format. All of the query sequences and annotation results are stored on the server side so that users can access their own data from virtually anywhere on the web. The current release of MEGANTE targets 24 plant species from the Brassicaceae, Fabaceae, Musaceae, Poaceae, Salicaceae, Solanaceae, Rosaceae and Vitaceae families, and it allows users to submit a sequence up to 10 Mb in length and to save up to 100 sequences with the annotation information on the server. The MEGANTE web service is available at https://megante.dna.affrc.go.jp/. PMID:24253915
Mining a database of single amplified genomes from Red Sea brine pool extremophiles—improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA)

PubMed Central

Grötzinger, Stefan W.; Alam, Intikhab; Ba Alawi, Wail; Bajic, Vladimir B.; Stingl, Ulrich; Eppinger, Jörg

2014-01-01

Reliable functional annotation of genomic data is the key-step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophile's genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO) data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO)-terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO-terms and 49 consensus patterns. A subset of INDIGO-sequences consisting of 58 SAGs from six different taxons of bacteria and archaea were selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO-terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable, if at least two relevant descriptors (GO-terms and/or consensus patterns) are present. 
Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website. PMID:24778629
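The reliability rule stated in the record above (a hit counts as reliable if at least two relevant descriptors, GO-terms and/or consensus patterns, are present) can be sketched as follows. All function names and identifiers are hypothetical placeholders rather than real GO or PROSITE accessions, and this is not the code distributed through the INDIGO website.

```python
def count_descriptors(gene_go_terms, gene_prosite_ids, relevant_go, relevant_prosite):
    """Count distinct relevant descriptors (GO-terms plus PROSITE patterns) on a gene."""
    go_hits = set(gene_go_terms) & set(relevant_go)
    pattern_hits = set(gene_prosite_ids) & set(relevant_prosite)
    return len(go_hits) + len(pattern_hits)


def is_reliable(gene_go_terms, gene_prosite_ids, relevant_go, relevant_prosite,
                min_descriptors=2):
    """A hit is kept only if at least `min_descriptors` relevant descriptors are present."""
    n = count_descriptors(gene_go_terms, gene_prosite_ids, relevant_go, relevant_prosite)
    return n >= min_descriptors


# Placeholder identifiers, not real accessions.
relevant_go = {"GO:profile_A", "GO:profile_B"}
relevant_prosite = {"PS:pattern_X"}

# One profile hit plus one pattern hit: two descriptors, so reliable.
print(is_reliable(["GO:profile_A"], ["PS:pattern_X"], relevant_go, relevant_prosite))
# A single profile hit alone is not enough.
print(is_reliable(["GO:profile_A"], [], relevant_go, relevant_prosite))
```

Requiring two independent descriptors trades some sensitivity for fewer false positives, which matches the stated aim of filtering candidates before labor-intensive expression work.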
Quasispecies Analyses of the HIV-1 Near-full-length Genome With Illumina MiSeq

PubMed Central

Ode, Hirotaka; Matsuda, Masakazu; Matsuoka, Kazuhiro; Hachiya, Atsuko; Hattori, Junko; Kito, Yumiko; Yokomaku, Yoshiyuki; Iwatani, Yasumasa; Sugiura, Wataru

2015-01-01

Human immunodeficiency virus type-1 (HIV-1) exhibits high between-host genetic diversity and within-host heterogeneity, recognized as quasispecies. Because HIV-1 quasispecies fluctuate in terms of multiple factors, such as antiretroviral exposure and host immunity, analyzing the HIV-1 genome is critical for selecting effective antiretroviral therapy and understanding within-host viral coevolution mechanisms. Here, to obtain HIV-1 genome sequence information that includes minority variants, we sought to develop a method for evaluating quasispecies throughout the HIV-1 near-full-length genome using the Illumina MiSeq benchtop deep sequencer. To ensure the reliability of minority mutation detection, we applied an analysis method of sequence read mapping onto a consensus sequence derived from de novo assembly followed by iterative mapping and subsequent unique error correction. Deep sequencing analyses of an HIV-1 clone showed that the analysis method reduced erroneous base prevalence below 1% in each sequence position and discarded only < 1% of all collected nucleotides, maximizing the usage of the collected genome sequences. Further, we designed primer sets to amplify the HIV-1 near-full-length genome from clinical plasma samples. Deep sequencing of 92 samples in combination with the primer sets and our analysis method provided sufficient coverage to identify >1%-frequency sequences throughout the genome. When we evaluated sequences of pol genes from 18 treatment-naïve patients' samples, the deep sequencing results were in agreement with Sanger sequencing and identified numerous additional minority mutations. The results suggest that our deep sequencing method would be suitable for identifying within-host viral population dynamics throughout the genome. PMID:26617593
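To make the 1% threshold in the record above concrete, the sketch below reports non-consensus bases at a single alignment position whose frequency exceeds a cut-off. It assumes reads are already mapped and error-corrected; the function name, the pileup representation and the toy counts are hypothetical and do not reproduce the authors' pipeline.

```python
from collections import Counter


def minority_variants(pileup_column, consensus_base, min_freq=0.01):
    """Return {base: frequency} for non-consensus bases seen above `min_freq`.

    `pileup_column` is the string of bases observed at one mapped position.
    """
    counts = Counter(pileup_column)
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {
        base: count / depth
        for base, count in counts.items()
        if base != consensus_base and count / depth > min_freq
    }


# Toy position at depth 1,000: G at 2.5% is reported, T at 0.5% is not.
column = "A" * 970 + "G" * 25 + "T" * 5
print(minority_variants(column, consensus_base="A"))
```

The cut-off only makes sense if the residual error rate is below it, which is why the abstract stresses reducing erroneous base prevalence below 1% at each position first.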
Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets.

PubMed

Nelson, Christopher S; Fuller, Chris K; Fordyce, Polly M; Greninger, Alexander L; Li, Hao; DeRisi, Joseph L

2013-07-01

The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein's DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2's binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved.
Microfluidic affinity and ChIP-seq analyses converge on a conserved FOXP2-binding motif in chimp and human, which enables the detection of evolutionarily novel targets

PubMed Central

Nelson, Christopher S.; Fuller, Chris K.; Fordyce, Polly M.; Greninger, Alexander L.; Li, Hao; DeRisi, Joseph L.

2013-01-01

The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein’s DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2’s binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved. PMID:23625967

Optimal treatment sequence in COPD: Can a consensus be found?

PubMed

Ferreira, J; Drummond, M; Pires, N; Reis, G; Alves, C; Robalo-Cordeiro, C

2016-01-01

There is currently no consensus on the treatment sequence in chronic obstructive pulmonary disease (COPD), although it is recognized that early diagnosis is of paramount importance to start treatment in the early stages of the disease. Although it is fairly consensual that initial treatment should be with an inhaled short-acting beta agonist, a short-acting muscarinic antagonist, a long-acting beta-agonist or a long-acting muscarinic antagonist, several therapeutic options are available as the disease progresses, and which to choose at each disease stage remains controversial. When and in which patients to use dual bronchodilation? When to use inhaled corticosteroids? And triple therapy? Are the existing non-inhaled therapies, such as mucolytic agents, antibiotics, phosphodiesterase-4 inhibitors, methylxanthines and immunostimulating agents, useful? If so, which patients would benefit? Should co-morbidities be taken into account when choosing COPD therapy for a patient? This paper reviews current guidelines and available evidence and proposes a therapeutic scheme for COPD patients. We also propose a treatment algorithm in the hope that it will help physicians to decide the best approach for their patients. The authors conclude that, at present, a full consensus on optimal treatment sequence in COPD cannot be found, mainly due to disease heterogeneity and lack of biomarkers to guide treatment. For the time being, and although some therapeutic approaches are consensual, treatment of COPD should be patient-oriented. Copyright © 2015 Sociedade Portuguesa de Pneumologia. Published by Elsevier España, S.L.U. All rights reserved.
Sampled-data consensus in switching networks of integrators based on edge events

NASA Astrophysics Data System (ADS)

Xiao, Feng; Meng, Xiangyu; Chen, Tongwen

2015-02-01

This paper investigates the event-driven sampled-data consensus in switching networks of multiple integrators and studies both the bidirectional interaction and leader-following passive reaction topologies in a unified framework. In these topologies, each information link is modelled by an edge of the information graph and assigned a sequence of edge events, which activate the mutual data sampling and controller updates of the two linked agents. Two kinds of edge-event-detecting rules are proposed for the general asynchronous data-sampling case and the synchronous periodic event-detecting case. They are implemented in a distributed fashion, and their effectiveness in reducing communication costs and solving consensus problems under a jointly connected topology condition is shown by both theoretical analysis and simulation examples.
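The idea of edge events driving sampling and controller updates can be sketched with a small simulation. The code below is a generic illustration of the synchronous periodic event-detecting case on a fixed undirected graph, not the paper's actual rules: the relative-threshold trigger (`sigma`), the Euler step `h`, and the function name are all assumptions made for this example. Each edge stores one sampled difference shared by both endpoints, so the state average is preserved.

```python
def simulate_edge_event_consensus(x0, edges, h=0.05, sigma=0.5, steps=400):
    """Single-integrator agents x_i' = u_i with edge-event-triggered sampling.

    Assumptions (illustrative only): fixed undirected graph, event check every
    h time units, and h * (1 + sigma) * max_degree < 1 so the Euler update
    stays stable.
    """
    x = list(x0)
    # Each edge keeps the last sampled difference x_i - x_j.
    sampled = {(i, j): x[i] - x[j] for i, j in edges}
    events = 0
    for _ in range(steps):
        # Event detection: resample an edge only when its stored difference
        # has drifted too far from the current one.
        for (i, j) in edges:
            current = x[i] - x[j]
            if abs(current - sampled[(i, j)]) > sigma * abs(current):
                sampled[(i, j)] = current
                events += 1
        # Controllers use only the sampled edge data.
        u = [0.0] * len(x)
        for (i, j), d in sampled.items():
            u[i] -= d
            u[j] += d
        x = [xi + h * ui for xi, ui in zip(x, u)]
    return x, events

ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
final, n_events = simulate_edge_event_consensus([1.0, 3.0, -2.0, 6.0], ring)
print(final, n_events)
```

Comparing `n_events` with `steps * len(edges)` gives a feel for how much communication an event-based rule can save relative to sampling every edge at every period.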
WebLogo

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crooks, Gavin E.

WebLogo is a web-based application designed to make the generation of sequence logos as easy and painless as possible. Sequence logos are a graphical representation of an amino acid or nucleic acid multiple sequence alignment developed by Tom Schneider and Mike Stephens. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. In general, a sequence logo provides a richer and more precise description of, for example, a binding site, than would a consensus sequence.
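The relationship between stack height, symbol height, and a plain consensus can be sketched numerically. The example below assumes an ungapped DNA alignment, measures conservation as log2(4) minus the Shannon entropy of each column, and omits the small-sample correction that logo tools usually apply; the function names are hypothetical and this is not WebLogo's own code.

```python
import math
from collections import Counter

def logo_heights(sequences, alphabet="ACGT"):
    """Per-position symbol heights in bits for an ungapped alignment.

    Assumption: every column contains at least one symbol from `alphabet`.
    """
    max_bits = math.log2(len(alphabet))
    columns = []
    for column in zip(*sequences):
        counts = Counter(column)
        total = sum(counts[s] for s in alphabet)
        freqs = {s: counts[s] / total for s in alphabet if counts[s] > 0}
        entropy = -sum(p * math.log2(p) for p in freqs.values())
        info = max_bits - entropy  # overall stack height (conservation)
        # Each symbol's height is its relative frequency times the stack height.
        columns.append({s: p * info for s, p in freqs.items()})
    return columns

def consensus(sequences):
    """Majority-rule consensus: the most common symbol in each column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*sequences))

alignment = ["ACGT", "ACGA", "ACGA", "ATCA"]
print(consensus(alignment))
for position, heights in enumerate(logo_heights(alignment)):
    print(position, heights)
```

In this toy alignment the consensus reports only the majority base at each position, while the height table still shows the minority T, C and T in the last three columns, which is the sense in which a logo is the richer description.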
Spatial analysis of extension fracture systems: A process modeling approach

USGS Publications Warehouse

Ferguson, C.C.

1985-01-01

Little consensus exists on how best to analyze natural fracture spacings and their sequences. Field measurements and analyses published in geotechnical literature imply fracture processes radically different from those assumed by theoretical structural geologists. The approach adopted in this paper recognizes that disruption of rock layers by layer-parallel extension results in two spacing distributions, one representing layer-fragment lengths and another separation distances between fragments. These two distributions and their sequences reflect mechanics and history of fracture and separation. Such distributions and sequences, represented by a 2 × n matrix of lengths L, can be analyzed using a method that is history sensitive and which also yields a scalar estimate of bulk extension, e(L). The method is illustrated by a series of Monte Carlo experiments representing a variety of fracture-and-separation processes, each with distinct implications for extension history. Resulting distributions of e(L) are process-specific, suggesting that the inverse problem of deducing fracture-and-separation history from final structure may be tractable. © 1985 Plenum Publishing Corporation.
Testing Convergent Evolution in Auditory Processing Genes between Echolocating Mammals and the Aye-Aye, a Percussive-Foraging Primate.

PubMed

Bankoff, Richard J; Jerjos, Michael; Hohman, Baily; Lauterbur, M Elise; Kistler, Logan; Perry, George H

2017-07-01

Several taxonomically distinct mammalian groups (certain microbats and cetaceans, e.g., dolphins) share both morphological adaptations related to echolocation behavior and strong signatures of convergent evolution at the amino acid level across seven genes related to auditory processing. Aye-ayes (Daubentonia madagascariensis) are nocturnal lemurs with a specialized auditory processing system. Aye-ayes tap rapidly along the surfaces of trees, listening to reverberations to identify the mines of wood-boring insect larvae; this behavior has been hypothesized to functionally mimic echolocation. Here we investigated whether there are signals of convergence in auditory processing genes between aye-ayes and known mammalian echolocators. We developed a computational pipeline (Basic Exon Assembly Tool) that produces consensus sequences for regions of interest from shotgun genomic sequencing data for nonmodel organisms without requiring de novo genome assembly. We reconstructed complete coding region sequences for the seven convergent echolocating bat-dolphin genes for aye-ayes and another lemur. We compared sequences from these two lemurs in a phylogenetic framework with those of bat and dolphin echolocators and appropriate nonecholocating outgroups. Our analysis reaffirms the existence of amino acid convergence at these loci among echolocating bats and dolphins; some methods also detected signals of convergence between echolocating bats and both mice and elephants. However, we observed no significant signal of amino acid convergence between aye-ayes and echolocating bats and dolphins, suggesting that aye-aye tap-foraging auditory adaptations represent distinct evolutionary innovations. These results are also consistent with a developing consensus that convergent behavioral ecology does not reliably predict convergent molecular evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Transcription activation mediated by a cyclic AMP receptor protein from Thermus thermophilus HB8.

PubMed

Shinkai, Akeo; Kira, Satoshi; Nakagawa, Noriko; Kashihara, Aiko; Kuramitsu, Seiki; Yokoyama, Shigeyuki

2007-05-01

The extremely thermophilic bacterium Thermus thermophilus HB8, which belongs to the phylum Deinococcus-Thermus, has an open reading frame encoding a protein belonging to the cyclic AMP (cAMP) receptor protein (CRP) family present in many bacteria. The protein, named T. thermophilus CRP, is highly homologous to the CRP family proteins from the phyla Firmicutes, Actinobacteria, and Cyanobacteria, and it forms a homodimer and interacts with cAMP. CRP mRNA and intracellular cAMP were detected in this strain, and their levels did not fluctuate drastically during cultivation in a rich medium. The expression of several genes was altered upon disruption of the T. thermophilus CRP gene. We found six CRP-cAMP-dependent promoters in in vitro transcription assays involving DNA fragments containing the upstream regions of the genes exhibiting decreased expression in the CRP disruptant, indicating that the CRP is a transcriptional activator. The consensus T. thermophilus CRP-binding site predicted upon nucleotide sequence alignment is 5'-(C/T)NNG(G/T)(G/T)C(A/C)N(A/T)NNTCACAN(G/C)(G/C)-3'. This sequence is unique compared with the known consensus binding sequences of CRP family proteins. A putative -10 hexamer sequence resides at 18 to 19 bp downstream of the predicted T. thermophilus CRP-binding site. The CRP-regulated genes found in this study comprise clustered regularly interspaced short palindromic repeat (CRISPR)-associated (cas) genes, and the genes of a putative transcriptional regulator, a protein containing the exonuclease III-like domain of DNA polymerase, a GCN5-related acetyltransferase homolog, and T. thermophilus-specific proteins of unknown function. These results suggest a role for cAMP signal transduction in T. thermophilus and imply the T. thermophilus CRP is a cAMP-responsive regulator.
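A degenerate consensus written in this notation can be turned into a regular expression for scanning sequences. The sketch below is only an illustration: the helper names and the test sequence are invented for the example, only the given strand is searched, and a regex match is a much cruder criterion than the alignment-based prediction described in the abstract.

```python
import re

# Consensus site as printed in the abstract: (X/Y) = either base, N = any base.
CONSENSUS = "(C/T)NNG(G/T)(G/T)C(A/C)N(A/T)NNTCACAN(G/C)(G/C)"

def consensus_to_regex(consensus):
    """Translate '(C/T)' groups to '[CT]' and 'N' to '[ACGT]'."""
    pattern = re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )
    return pattern.replace("N", "[ACGT]")

def find_sites(dna, consensus=CONSENSUS):
    """Return (start, matched_site) pairs; a lookahead keeps overlapping hits."""
    regex = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(dna.upper())]

# Hypothetical 20-bp site flanked by filler sequence.
example = "tttt" + "caagggcaaaaatcacaagg" + "tttt"
print(consensus_to_regex(CONSENSUS))
print(find_sites(example))
```

To cover both strands, the same scan could be repeated on the reverse complement of the input.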
Intramolecular control of transcriptional activity by the NK2-specific domain in NK-2 homeodomain proteins

PubMed Central

Watada, Hirotaka; Mirmira, Raghavendra G.; Kalamaras, Julie; German, Michael S.

2000-01-01

The developmentally important homeodomain transcription factors of the NK-2 class contain a highly conserved region, the NK2-specific domain (NK2-SD). The function of this domain, however, remains unknown. The primary structure of the NK2-SD suggests that it might function as an accessory DNA-binding domain or as a protein–protein interaction interface. To assess the possibility that the NK2-SD may contribute to DNA-binding specificity, we used a PCR-based approach to identify a consensus DNA-binding sequence for Nkx2.2, an NK-2 family member involved in pancreas and central nervous system development. The consensus sequence (TCTAAGTGAGCTT) is similar to the known binding sequences for other NK-2 homeodomain proteins, but we show that the NK2-SD does not contribute significantly to specific DNA binding to this sequence. To determine whether the NK2-SD contributes to transactivation, we used GAL4-Nkx2.2 fusion constructs to map a powerful transcriptional activation domain in the C-terminal region beyond the conserved NK2-SD. Interestingly, this C-terminal region functions as a transcriptional activator only in the absence of an intact NK2-SD. The NK2-SD also can mask transactivation from the paired homeodomain transcription factor Pax6, but it has no effect on transcription by itself. These results demonstrate that the NK2-SD functions as an intramolecular regulator of the C-terminal activation domain in Nkx2.2 and support a model in which interactions through the NK2-SD regulate the ability of NK-2-class proteins to activate specific genes during development. PMID:10944215
DOE Office of Scientific and Technical Information (OSTI.GOV)

Gacias, Mar; Perez-Marti, Albert; Pujol-Vidal, Magdalena

Highlights: ► The Cact gene is induced in mouse skeletal muscle after 24 h of fasting. ► The Cact gene contains a functional consensus sequence for ERR. ► This sequence binds ERRα both in vivo and in vitro. ► This ERRE is required for the activation of Cact expression by the PGC-1/ERR axis. ► Our results add Cact as a genuine gene target of these transcriptional regulators. Abstract: Carnitine/acylcarnitine translocase (CACT) is a mitochondrial-membrane carrier protein that mediates the transport of acylcarnitines into the mitochondrial matrix for their oxidation by the mitochondrial fatty acid-oxidation pathway. CACT deficiency causes a variety of pathological conditions, such as hypoketotic hypoglycemia, cardiac arrest, hepatomegaly, hepatic dysfunction and muscle weakness, and it can be fatal in newborns and infants. Here we report that expression of the Cact gene is induced in mouse skeletal muscle after 24 h of fasting. To gain insight into the control of Cact gene expression, we examine the transcriptional regulation of the mouse Cact gene. We show that the 5′-flanking region of this gene is transcriptionally active and contains a consensus sequence for the estrogen-related receptor (ERR), a member of the nuclear receptor family of transcription factors. This sequence binds ERRα in vivo and in vitro and is required for the activation of Cact expression by the peroxisome proliferator-activated receptor gamma coactivator (PGC)-1/ERR axis. We also demonstrate that XCT790, the inverse agonist of ERRα, specifically blocks Cact activation by PGC-1β in C2C12 cells.
Sequence diversity of wheat mosaic virus isolates.

PubMed

Stewart, Lucy R

2016-02-02

Wheat mosaic virus (WMoV), transmitted by eriophyid wheat curl mites (Aceria tosichella), is the causal agent of High Plains disease in wheat and maize. WMoV and other members of the genus Emaravirus evaded thorough molecular characterization for many years due to the experimental challenges of mite transmission and manipulating multisegmented negative-sense RNA genomes. Recently, the complete genome sequence of a Nebraska isolate of WMoV revealed eight segments, plus a variant sequence of the nucleocapsid protein-encoding segment. Here, near-complete and partial consensus sequences of five more WMoV isolates are reported and compared to the Nebraska isolate: an Ohio maize isolate (GG1), a Kansas barley isolate (KS7), and three Ohio wheat isolates (H1, K1, W1). Results show two distinct groups of WMoV isolates: Ohio wheat isolate RNA segments had 84% or lower nucleotide sequence identity to the NE isolate, whereas GG1 and KS7 had 98% or higher nucleotide sequence identity to the NE isolate. Knowledge of the sequence variability of WMoV isolates is a step toward understanding virus biology, and potentially explaining observed biological variation. Published by Elsevier B.V.
A possible structural model of members of the CPF family of cuticular proteins implicating binding to components other than chitin

PubMed Central

Papandreou, Nikos C.; Iconomidou, Vassiliki A.; Willis, Judith H.; Hamodrakas, Stavros J.

2010-01-01

The physical properties of cuticle are determined by the structure of its two major components, cuticular proteins (CPs) and chitin, and, also, by their interactions. A common consensus region (extended R&R Consensus) found in the majority of cuticular proteins, the CPRs, binds to chitin. Previous work established that β-pleated sheet predominates in the Consensus region and we proposed that it is responsible for the formation of helicoidal cuticle. Remote sequence similarity between CPRs and a lipocalin, bovine plasma retinol binding protein (RBP), led us to suggest an antiparallel β-sheet half-barrel structure as the basic folding motif of the R&R Consensus. There are several other families of cuticular proteins. One of the best defined is CPF. Its four members in Anopheles gambiae are expressed during the early stages of either pharate pupal or pharate adult development, suggesting that the proteins contribute to the outer regions of the cuticle, the epi- and/or exocuticle. These proteins did not bind to chitin in the same assay used successfully for CPRs. Although CPFs are distinct in sequence from CPRs, the same lipocalin could also be used to derive homology models for one Anopheles gambiae and one Drosophila melanogaster CPF. For the CPFs, the basic folding motif predicted is an eight-stranded, antiparallel β-sheet, full-barrel structure. Possible implications of this structure are discussed and docking experiments were carried out with one possible Drosophila ligand, 7(Z), 11(Z)-heptacosadiene. PMID:20417215
Final Report for LDRD Project 02-ERD-069: Discovering the Unknown Mechanism(s) of Virulence in a BW, Class A Select Agent

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chain, P; Garcia, E

2003-02-06

The goal of this proposed effort was to assess the difficulty in identifying and characterizing virulence candidate genes in an organism for which very limited data exist. This was accomplished by first addressing the finishing phase of draft-sequenced F. tularensis genomes and conducting comparative analyses to determine the coding potential of each genome, to discover the differences in genome structure and content, and to identify potential genes whose products may be involved in the F. tularensis virulence process. The project was divided into three parts: (1) Genome finishing: This part involves determining the order and orientation of the consensus sequences of contigs obtained from Phrap assemblies of random draft genomic sequences. This tedious process consists of linking contig ends using information embedded in each sequence file that relates the sequence to the original cloned insert. Since inserts are sequenced from both ends, we can establish a link between these paired-ends in different contigs and thus order and orient contigs. Since these genomes carry numerous copies of insertion sequences, these repeated elements "confuse" the Phrap assembly program. It is thus necessary to break these contigs apart at the repeated sequences and individually join the proper flanking regions using paired-end information, or using results of comparisons against a similar genome. Larger repeated elements such as the small subunit ribosomal RNA operon require verification with PCR. Tandem repeats require manual intervention and typically rely on single nucleotide polymorphisms to be resolved. Remaining gaps require PCR reactions and sequencing. Once the genomes have been "closed", low quality regions are addressed by resequencing reactions. (2) Genome analysis: The final consensus sequences are processed by combining the results of three gene modelers: Glimmer, Critica and Generation.
The final gene models are submitted to a battery of homology searches and domain prediction programs in order to annotate them (e.g. BLAST, Pfam, TIGRfam, COG, KEGG, InterPro, TMhmm, SignalP). The genome structure is also assessed in terms of G+C content, GC bias (GC skew), and locations of repeated regions (e.g. IS elements) and phage-like genes. (3) Comparative genomics: The results of the various genome analyses are compared between the finished (or almost finished) genomes. Here, we have compared the F. tularensis genomes from the extremely lethal strain Schu4 (subsp. tularensis), the vaccine strain LVS (subsp. holarctica), and strain UT01-4992 of the less virulent, opportunistic subsp. novicida. Regions present in the highly virulent strain that are absent from the other less virulent strains may provide insight into what factors are required for the high level of virulence.
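The G+C content and GC skew assessment mentioned in part (2) can be illustrated with a windowed calculation. The sketch below uses the common definition of GC skew, (G - C) / (G + C); the window size, the function name, and the toy sequence are assumptions for the example, and the report does not describe the tool actually used.

```python
def gc_content_and_skew(seq, window=1000, step=None):
    """Return (start, gc_fraction, gc_skew) for each full window of `seq`.

    GC skew is (G - C) / (G + C); it is reported as 0.0 when a window
    contains no G or C. Trailing bases that do not fill a window are ignored.
    """
    step = step or window
    seq = seq.upper()
    rows = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        gc_fraction = (g + c) / window
        skew = (g - c) / (g + c) if g + c else 0.0
        rows.append((start, gc_fraction, skew))
    return rows

# Toy sequence: a G-rich stretch followed by an AT-only stretch.
toy = "GGGC" * 2 + "ATAT" * 2
print(gc_content_and_skew(toy, window=8))
```

On a real bacterial chromosome, plotting the skew column against the start coordinate typically shows sign changes that are used as a hint for the replication origin and terminus.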
UK quantitative WB-DWI technical workgroup: consensus meeting recommendations on optimisation, quality control, processing and analysis of quantitative whole-body diffusion-weighted imaging for cancer.

PubMed

Barnes, Anna; Alonzi, Roberto; Blackledge, Matthew; Charles-Edwards, Geoff; Collins, David J; Cook, Gary; Coutts, Glynn; Goh, Vicky; Graves, Martin; Kelly, Charles; Koh, Dow-Mu; McCallum, Hazel; Miquel, Marc E; O'Connor, James; Padhani, Anwar; Pearson, Rachel; Priest, Andrew; Rockall, Andrea; Stirling, James; Taylor, Stuart; Tunariu, Nina; van der Meulen, Jan; Walls, Darren; Winfield, Jessica; Punwani, Shonit

2018-01-01

Applications of whole-body diffusion-weighted MRI (WB-DWI) for oncology are rapidly increasing within both research and routine clinical domains. However, WB-DWI as a quantitative imaging biomarker (QIB) has significantly slower adoption. To date, challenges relating to accuracy and reproducibility, essential criteria for a good QIB, have limited widespread clinical translation. In recognition, a UK workgroup was established in 2016 to provide technical consensus guidelines (to maximise accuracy and reproducibility of WB-MRI QIBs) and accelerate the clinical translation of quantitative WB-DWI applications for oncology. A panel of experts convened from cancer centres around the UK with subspecialty expertise in quantitative imaging and/or the use of WB-MRI with DWI. A formal consensus method was used to obtain consensus agreement regarding best practice. Questions were asked about the appropriateness or otherwise of scanner hardware and software, sequence optimisation, acquisition protocols, reporting, and ongoing quality control programs to monitor precision and accuracy and agreement on quality control. The consensus panel was able to reach consensus on 73% (255/351) of items and, based on consensus areas, made recommendations to maximise accuracy and reproducibility of quantitative WB-DWI studies performed at 1.5T. The panel were unable to reach consensus on the majority of items related to quantitative WB-DWI performed at 3T. This UK Quantitative WB-DWI Technical Workgroup consensus provides guidance on maximising accuracy and reproducibility of quantitative WB-DWI for oncology. The consensus guidance can be used by researchers and clinicians to harmonise WB-DWI protocols which will accelerate clinical translation of WB-DWI-derived QIBs.
Integration of transcriptomic and proteomic data from a single wheat cultivar provides new tools for understanding the roles of individual alpha gliadin proteins in flour quality and celiac disease

USDA-ARS's Scientific Manuscript database

One-hundred-thirty-six expressed sequence tags (ESTs) encoding alpha gliadins from Triticum aestivum cv Butte 86 were identified in public databases and assembled into 19 contigs. Consensus sequences for 12 of the contigs encoded complete alpha gliadin proteins, but only two were identical to protei...
A new fast method for inferring multiple consensus trees using k-medoids.

PubMed

Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir

2018-04-05

Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. 
The main advantage of our method is that it is much faster than the existing tree clustering approaches, while providing similar or better clustering results in most cases. This makes it particularly well suited for the analysis of large genomic and phylogenetic datasets.
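The clustering step can be sketched with a minimal k-medoids routine over a precomputed distance matrix. This is a simplified alternating (assign, then update) variant with a naive deterministic start, not the authors' implementation, and it omits their adapted Silhouette and Caliński-Harabasz indices. The abstract does not say which tree distance is used, so the matrix below is a placeholder built from points on a line; in practice each entry would be a distance between two gene trees.

```python
def k_medoids(dist, k, max_iter=100):
    """Cluster items given a symmetric distance matrix `dist` (list of lists).

    Assumptions: dist[i][i] == 0, distinct items have positive distance,
    and the first k items are used as initial medoids.
    Returns (medoids, labels), where labels[i] is the medoid index of item i.
    """
    n = len(dist)
    medoids = list(range(k))
    for _ in range(max_iter):
        # Assignment step: attach every item to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            nearest = min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: the new medoid minimises total distance within its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    labels = [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return sorted(medoids), labels

# Placeholder distances: two well-separated groups of three items each.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
print(k_medoids(dist, k=2))
```

Because medoids are always actual items, each cluster's medoid is one of the input trees; a consensus tree per cluster would still have to be built from the cluster's members in a separate step.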
Diversity and Evolution of Bacterial Twin Arginine Translocase Protein, TatC, Reveals a Protein Secretion System That Is Evolving to Fit Its Environmental Niche

PubMed Central

Simone, Domenico; Bay, Denice C.; Leach, Thorin; Turner, Raymond J.

2013-01-01

Background: The twin-arginine translocation (Tat) protein export system enables the transport of fully folded proteins across a membrane. This system is composed of two integral membrane proteins belonging to TatA and TatC protein families and in some systems a third component, TatB, a homolog of TatA. TatC participates in substrate protein recognition through its interaction with a twin arginine leader peptide sequence. Methodology/Principal Findings: The aim of this study was to explore TatC diversity, evolution and sequence conservation in bacteria to identify how TatC is evolving and diversifying in various bacterial phyla. Surveying bacterial genomes revealed that 77% of all species possess one or more tatC loci and half of these classes possessed only tatC and tatA genes. Phylogenetic analysis of diverse TatC homologues showed that they were primarily inherited but identified a small subset of taxonomically unrelated bacteria that exhibited evidence supporting lateral gene transfer within an ecological niche. Examination of Bacilli tatCd/tatCy isoform operons identified a number of known and potentially new Tat substrate genes based on their frequent association with tatC loci. Evolutionary analysis of these Bacilli isoforms determined that TatCy was the progenitor of TatCd. A bacterial TatC consensus sequence was determined and highlighted conserved and variable regions within a three dimensional model of the Escherichia coli TatC protein. Comparative analysis between the TatC consensus sequence and Bacilli TatCd/y isoform consensus sequences revealed unique sites that may contribute to isoform substrate specificity or make TatA specific contacts. Synonymous to non-synonymous nucleotide substitution analyses of bacterial tatC homologues determined that tatC sequence variation differs dramatically between various classes and suggests TatC specialization in these species.
Conclusions/Significance: TatC proteins appear to be diversifying within particular bacterial classes, and their specialization may be driven by the substrates they transport and the environment of their host. PMID:24236045
Sequencing ebola and marburg viruses genomes using microarrays.

PubMed

Hardick, Justin; Woelfel, Roman; Gardner, Warren; Ibrahim, Sofi

2016-08-01

Periodic outbreaks of Ebola and Marburg hemorrhagic fevers have occurred in Africa over the past four decades with case fatality rates reaching as high as 90%. The latest Ebola outbreak in West Africa in 2014 raised concerns that these infections can spread across continents and pose serious health risks. Early and accurate identification of the causative agents is necessary to contain outbreaks. In this report, we describe a sequencing-by-hybridization (SBH) technique using high density microarrays to identify Ebola and Marburg viruses. The microarrays were designed to interrogate the sequences of entire viral genomes, and were evaluated with three species of Ebolavirus (Reston, Sudan, and Zaire), and three strains of Marburgvirus (Angola, Musoke, and Ravn). The results showed that the consensus sequences generated with four or more hybridizations had 92.1-98.9% accuracy over 95-99% of the genomes. Additionally, with SBH microarrays it was possible to distinguish between different strains of the Lake Victoria Marburgvirus. J. Med. Virol. 88:1303-1308, 2016. © 2016 Wiley Periodicals, Inc.
Nuclear Mitochondrial DNA Activates Replication in Saccharomyces cerevisiae

PubMed Central

Chatre, Laurent; Ricchetti, Miria

2011-01-01

The nuclear genome of eukaryotes is colonized by DNA fragments of mitochondrial origin, called NUMTs. These insertions have been associated with a variety of germ-line diseases in humans. The significance of this uptake of potentially dangerous sequences into the nuclear genome is unclear. Here we provide functional evidence that sequences of mitochondrial origin promote nuclear DNA replication in Saccharomyces cerevisiae. We show that NUMTs are rich in key autonomously replicating sequence (ARS) consensus motifs, whose mutation results in the reduction or loss of DNA replication activity. Furthermore, 2D-gel analysis of the mrc1 mutant exposed to hydroxyurea shows that several NUMTs function as late chromosomal origins. We also show that NUMTs located close to or within ARS provide key sequence elements for replication. Thus NUMTs can act as independent origins, when inserted in an appropriate genomic context or affect the efficiency of pre-existing origins. These findings show that migratory mitochondrial DNAs can impact on the replication of the nuclear region they are inserted in. PMID:21408151
International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

PubMed Central

Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

2015-01-01

This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
Prevention of influenza virus shedding and protection from lethal H1N1 challenge using a consensus 2009 H1N1 HA and NA adenovirus vector vaccine

PubMed Central

Jones, Frank R.; Gabitzsch, Elizabeth S.; Xu, Younong; Balint, Joseph P.; Borisevich, Viktoriya; Smith, Jennifer; Smith, Jeanon; Peng, Bi-Hung; Walker, Aida; Salazar, Magda; Paessler, Slobodan

2013-01-01

Vaccines against emerging pathogens such as the 2009 H1N1 pandemic virus can benefit from current technologies such as rapid genomic sequencing to construct the most biologically relevant vaccine. A novel platform (Ad5 [E1-, E2b-]) has been utilized to induce immune responses to various antigenic targets. We employed this vector platform to express hemagglutinin (HA) and neuraminidase (NA) genes from 2009 H1N1 pandemic viruses. Inserts were consensus sequences designed from viral isolate sequences, and the vaccine was rapidly constructed and produced. Vaccination induced H1N1 immune responses in mice, which afforded protection from lethal virus challenge. In ferrets, vaccination protected from disease development and significantly reduced viral titers in nasal washes. H1N1 cell mediated immunity as well as antibody induction correlated with the prevention of disease symptoms and reduction of virus replication. The Ad5 [E1-, E2b-] should be evaluated for the rapid development of effective vaccines against infectious diseases. PMID:21821082

Molecular identification and characterization of clustered regularly interspaced short palindromic repeats (CRISPRs) in a urease-positive thermophilic Campylobacter sp. (UPTC).

PubMed

Tasaki, E; Hirayama, J; Tazumi, A; Hayashi, K; Hara, Y; Ueno, H; Moore, J E; Millar, B C; Matsuda, M

2012-02-01

A novel clustered regularly interspaced short palindromic repeats (CRISPRs) locus [7,500 base pairs (bp) in length] occurred in the urease-positive thermophilic Campylobacter (UPTC) Japanese isolate, CF89-12. The 7,500 bp locus consisted of the 5'-methylaminomethyl-2-thiouridylate methyltransferase gene, putative (P) CRISPR associated (p-Cas), putative open reading frames, Cas1 and Cas2, leader sequence region (146 bp), 12 CRISPRs consensus sequence repeats (each 36 bp) separated by a non-repetitive unique spacer region of similar length (26-31 bp) and the phosphatidyl glycerophosphatase A gene. When the CRISPRs loci in the UPTC CF89-12 and five C. jejuni isolates were compared with one another, these six isolates contained p-Cas, Cas1 and Cas2 within the loci. Four to 12 CRISPRs consensus sequence repeats separated by a non-repetitive unique spacer region occurred in six isolates and the nucleotide sequences of those repeats gave approximately 92-100% similarity with each other. However, no sequence similarity occurred in the unique spacer regions among these isolates. The putative σ(70) transcriptional promoter and the hypothetical ρ-independent terminator structures for the CRISPRs and Cas were detected. No in vivo transcription of p-Cas, Cas1 and Cas2 was confirmed in the UPTC cells.
Structural determinants of nuclear export signal orientation in binding to exportin CRM1

DOE PAGES

Fung, Ho Yee Joyce; Fu, Szu-Chin; Brautigam, Chad A.; ...

2015-09-08

The Chromosome Region of Maintenance 1 (CRM1) protein mediates nuclear export of hundreds of proteins through recognition of their nuclear export signals (NESs), which are highly variable in sequence and structure. The plasticity of the CRM1-NES interaction is not well understood, as there are many NES sequences that seem incompatible with structures of the NES-bound CRM1 groove. Crystal structures of CRM1 bound to two different NESs with unusual sequences showed the NES peptides binding the CRM1 groove in the opposite orientation (minus) to that of previously studied NESs (plus). A comparison of minus and plus NESs identified structural and sequence determinants for NES orientation. The binding of NESs to CRM1 in both orientations results in a large expansion in NES consensus patterns and therefore a corresponding expansion of potential NESs in the proteome.
Fabrication of a New Lineage of Artificial Luciferases from Natural Luciferase Pools.

PubMed

Kim, Sung Bae; Nishihara, Ryo; Citterio, Daniel; Suzuki, Koji

2017-09-11

The fabrication of artificial luciferases (ALucs) with unique optical properties has a fundamental impact on bioassays and molecular imaging. In this study, we developed a new lineage of ALucs with unique substrate preferences by extracting consensus amino acids from the alignment of 25 copepod luciferase sequences available in natural luciferase pools. The primary sequence was first created with a sequence logo generator resulting in a total of 11 sibling sequences. Phylogenetic analysis shows that the newly fabricated ALucs form an independent branch, genetically isolated from the natural luciferases, and from a prior series of ALucs produced by our laboratory using a smaller basis set. The new lineage of ALucs were strongly luminescent in living mammalian cells with specific substrate selectivity to native coelenterazine. A single-residue-level comparison of the C-terminal sequences of new ALucs reveals that some amino acids in the C-terminal ends are greatly influential on the optical intensities but limited in the color variance. The success of this approach guides on how to engineer and functionalize marine luciferases for bioluminescence imaging and assays.
Describing sequencing results of structural chromosome rearrangements with a suggested next-generation cytogenetic nomenclature.

PubMed

Ordulu, Zehra; Wong, Kristen E; Currall, Benjamin B; Ivanov, Andrew R; Pereira, Shahrin; Althari, Sara; Gusella, James F; Talkowski, Michael E; Morton, Cynthia C

2014-05-01

With recent rapid advances in genomic technologies, precise delineation of structural chromosome rearrangements at the nucleotide level is becoming increasingly feasible. In this era of "next-generation cytogenetics" (i.e., an integration of traditional cytogenetic techniques and next-generation sequencing), a consensus nomenclature is essential for accurate communication and data sharing. Currently, nomenclature for describing the sequencing data of these aberrations is lacking. Herein, we present a system called Next-Gen Cytogenetic Nomenclature, which is concordant with the International System for Human Cytogenetic Nomenclature (2013). This system starts with the alignment of rearrangement sequences by BLAT or BLAST (alignment tools) and arrives at a concise and detailed description of chromosomal changes. To facilitate usage and implementation of this nomenclature, we are developing a program designated BLA(S)T Output Sequence Tool of Nomenclature (BOSToN), a demonstrative version of which is accessible online. A standardized characterization of structural chromosomal rearrangements is essential both for research analyses and for application in the clinical setting. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
ESTree db: a tool for peach functional genomics.

PubMed

Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo

2005-12-01

The ESTree db http://www.itb.cnr.it/estree/ represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A php-based web interface was developed to query the database. The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig.
A consensus-hemagglutinin-based vaccine delivered by an attenuated Salmonella mutant protects chickens against heterologous H7N1 influenza virus.

PubMed

Hyoung, Kim Je; Hajam, Irshad Ahmed; Lee, John Hwa

2017-06-13

H7N3 and H7N7 are highly pathogenic avian influenza (HPAI) viruses and have posed a great threat not only for the poultry industry but for human health as well. H7N9, a low pathogenic avian influenza (LPAI) virus, is also highly pathogenic to humans, and there is a great concern that these H7 subtypes would acquire the ability to spread efficiently between humans, thereby becoming a pandemic threat. A vaccine candidate covering all the three subtypes must, therefore, be an integral part of any pandemic preparedness plan. To address this need, we constructed a consensus hemagglutinin (HA) sequence of H7N3, H7N7, and H7N9 based on the data available in the NCBI in early 2012-2015. This artificial sequence was then optimized for protein expression before being transformed into an attenuated auxotrophic mutant of Salmonella Typhimurium, JOL1863 strain. Immunizing chickens with JOL1863, delivered intramuscularly, nasally or orally, elicited efficient humoral and cell mediated immune responses, independently of the route of vaccination. Our results also showed that JOL1863 delivers efficient maturation signals to chicken monocyte derived dendritic cells (MoDCs) which were characterized by upregulation of costimulatory molecules and higher cytokine induction. Moreover, immunization with JOL1863 in chickens conferred a significant protection against the heterologous LPAI H7N1 virus challenge as indicated by reduced viral shedding in the cloacal swabs. We conclude that this vaccine, based on a consensus HA, could induce a broader spectrum of protection against divergent H7 influenza viruses and thus warrants further study.
Shotgun Protein Sequencing with Meta-contig Assembly

PubMed Central

Guthals, Adrian; Clauser, Karl R.; Bandeira, Nuno

2012-01-01

Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins such as antibodies or proteins from organisms with unsequenced genomes remains a challenging open problem. Conventional algorithms designed to individually sequence each MS/MS spectrum are limited by incomplete peptide fragmentation or low signal to noise ratios and tend to result in short de novo sequences at low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). But whereas SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, thus limiting its effectiveness on sequences from poorly annotated proteins. Using low and high resolution CID and high resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings. PMID:22798278
Shotgun protein sequencing with meta-contig assembly.

PubMed

Guthals, Adrian; Clauser, Karl R; Bandeira, Nuno

2012-10-01

Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins such as antibodies or proteins from organisms with unsequenced genomes remains a challenging open problem. Conventional algorithms designed to individually sequence each MS/MS spectrum are limited by incomplete peptide fragmentation or low signal to noise ratios and tend to result in short de novo sequences at low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). But whereas SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, thus limiting its effectiveness on sequences from poorly annotated proteins. Using low and high resolution CID and high resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings.
Context based computational analysis and characterization of ARS consensus sequences (ACS) of Saccharomyces cerevisiae genome.

PubMed

Singh, Vinod Kumar; Krishnamachari, Annangarachari

2016-09-01

Genome-wide experimental studies in Saccharomyces cerevisiae reveal that autonomous replicating sequence (ARS) requires an essential consensus sequence (ACS) for replication activity. Computational studies identified thousands of ACS-like patterns in the genome. However, only a few hundred of these sites act as replicating sites and the rest are considered as dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content and context-based analysis was performed on a set of replicating ACS sequences that bind to origin-recognition complex (ORC) denoted as ORC-ACS and non-replicating ACS sequences (nrACS), that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS depict marked differences in nucleotide composition and context features in its vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within its nucleosome-free region. Profound changes in the conformational features, such as DNA helical twist, inclination angle and stacking energy between ORC-ACS and nrACS were observed. Distribution of ACS motifs in the non-coding segments points to the locations of ORC-ACS which are found far away from the adjacent gene start position compared to nrACS thereby enabling an accessible environment for ORC-proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.
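The point that pattern matching alone finds far more ACS-like sites than functional origins can be illustrated with a small scan. This is a minimal sketch, not the authors' method: it assumes the commonly cited 11-bp ACS written in IUPAC codes as 5'-WTTTAYRTTTW-3' (the abstract itself does not state the motif), and the function names are hypothetical.

```python
import re

# Assumption: the commonly cited 11-bp ACS, 5'-WTTTAYRTTTW-3' (IUPAC codes).
ACS_IUPAC = "WTTTAYRTTTW"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "Y": "[CT]", "R": "[AG]"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_acs_like(seq):
    """Return sorted (start, strand, match) tuples for ACS-like sites.

    Start positions are 0-based on the forward strand. A lookahead is
    used so that overlapping matches are all reported.
    """
    regex = "".join(IUPAC[c] for c in ACS_IUPAC)
    pattern = re.compile("(?=(" + regex + "))")
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    # Scan the reverse complement and map positions back to the forward strand.
    for m in pattern.finditer(reverse_complement(seq)):
        start = len(seq) - m.start() - len(ACS_IUPAC)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    print(find_acs_like("GGGATTTATATTTACCC"))
```

A scan like this only nominates candidates; as the abstract notes, flanking context and nucleosome positioning are what separate ORC-bound sites from non-replicating ones.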
WebLogo: A Sequence Logo Generator

PubMed Central

Crooks, Gavin E.; Hon, Gary; Chandonia, John-Marc; Brenner, Steven E.

2004-01-01

WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment. Sequence logos provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive. Each logo consists of stacks of letters, one stack for each position in the sequence. The overall height of each stack indicates the sequence conservation at that position (measured in bits), whereas the height of symbols within the stack reflects the relative frequency of the corresponding amino or nucleic acid at that position. WebLogo has been enhanced recently with additional features and options, to provide a convenient and highly configurable sequence logo generator. A command line interface and the complete, open WebLogo source code are available for local installation and customization. PMID:15173120
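The stack and symbol heights described above can be sketched in a few lines. This is an illustrative calculation under stated assumptions, not WebLogo's own code: it takes stack height as the maximum possible entropy minus the observed entropy of the column, omits the small-sample correction, and treats every character (including gaps) as an ordinary symbol. The function name `logo_heights` is hypothetical.

```python
import math
from collections import Counter


def logo_heights(aligned, alphabet_size=4):
    """Per-column logo heights in bits for equal-length aligned sequences.

    Returns a list of (stack_height, {symbol: symbol_height}) tuples.
    alphabet_size=4 suits nucleic acids; use 20 for proteins.
    """
    max_bits = math.log2(alphabet_size)
    columns = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = sum(counts.values())
        freqs = {s: n / total for s, n in counts.items()}
        # Shannon entropy of the observed symbol distribution.
        entropy = -sum(p * math.log2(p) for p in freqs.values())
        stack = max_bits - entropy
        # Each symbol's height is its relative frequency times the stack height.
        columns.append((stack, {s: p * stack for s, p in freqs.items()}))
    return columns


if __name__ == "__main__":
    for stack, symbols in logo_heights(["AC", "AG", "AT", "AA"]):
        print(round(stack, 3), symbols)
```

A fully conserved DNA column reaches the 2-bit maximum, while a column with all four bases at equal frequency has height zero, which is why logos convey more than a single consensus letter per position.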
Genomic Variability of Haemophilus influenzae Isolated from Mexican Children Determined by Using Enterobacterial Repetitive Intergenic Consensus Sequences and PCR

PubMed Central

Gomez-De-Leon, Patricia; Santos, Jose I.; Caballero, Javier; Gomez, Demostenes; Espinosa, Luz E.; Moreno, Isabel; Piñero, Daniel; Cravioto, Alejandro

2000-01-01

Genomic fingerprints from 92 capsulated and noncapsulated strains of Haemophilus influenzae from Mexican children with different diseases and healthy carriers were generated by PCR using the enterobacterial repetitive intergenic consensus (ERIC) sequences. A cluster analysis by the unweighted pair-group method with arithmetic averages based on the overall similarity as estimated from the characteristics of the genomic fingerprints, was conducted to group the strains. A total of 69 fingerprint patterns were detected in the H. influenzae strains. Isolates from patients with different diseases were represented by a variety of patterns, which clustered into two major groups. Of the 37 strains isolated from cases of meningitis, 24 shared patterns and were clustered into five groups within a similarity level of 1.0. One fragment of 1.25 kb was common to all meningitis strains. H. influenzae strains from healthy carriers presented fingerprint patterns different from those found in strains from sick children. Isolates from healthy individuals were more variable and were distributed differently from those from patients. The results show that ERIC-PCR provides a powerful tool for the determination of the distinctive pathogenicity potentials of H. influenzae strains and encourage its use for molecular epidemiology investigations. PMID:10878033
σ54-Dependent Response to Nitrogen Limitation and Virulence in Burkholderia cenocepacia Strain H111

PubMed Central

Lardi, Martina; Aguilar, Claudio; Pedrioli, Alessandro; Omasits, Ulrich; Suppiger, Angela; Cárcamo-Oyarce, Gerardo; Schmid, Nadine; Ahrens, Christian H.

2015-01-01

Members of the genus Burkholderia are versatile bacteria capable of colonizing highly diverse environmental niches. In this study, we investigated the global response of the opportunistic pathogen Burkholderia cenocepacia H111 to nitrogen limitation at the transcript and protein expression levels. In addition to a classical response to nitrogen starvation, including the activation of glutamine synthetase, PII proteins, and the two-component regulatory system NtrBC, B. cenocepacia H111 also upregulated polyhydroxybutyrate (PHB) accumulation and exopolysaccharide (EPS) production in response to nitrogen shortage. A search for consensus sequences in promoter regions of nitrogen-responsive genes identified a σ54 consensus sequence. The mapping of the σ54 regulon as well as the characterization of a σ54 mutant suggests an important role of σ54 not only in control of nitrogen metabolism but also in the virulence of this organism. PMID:25841012
Characterization of an Avipoxvirus From a Bald Eagle ( Haliaeetus leucocephalus ) Using Novel Consensus PCR Protocols for the rpo147 and DNA-Dependent DNA Polymerase Genes.

PubMed

Stephen, Alexa A; Leone, Angelique M; Toplon, David E; Archer, Linda L; Wellehan, James F X

2016-12-01

A juvenile female bald eagle ( Haliaeetus leucocephalus ) was presented with emaciation and proliferative periocular lesions. The eagle did not respond to supportive therapy and was euthanatized. Histopathologic examination of the skin lesions revealed plaques of marked epidermal hyperplasia, parakeratosis, marked acanthosis and spongiosis, and eosinophilic intracytoplasmic inclusion bodies. Novel polymerase chain reaction (PCR) assays were done to amplify and sequence DNA polymerase and rpo147 genes. The 4b gene was also analyzed by a previously developed assay. Bayesian and maximum likelihood phylogenetic analyses of the obtained sequences found it to be a poxvirus of the genus Avipoxvirus and clustered with other raptor isolates. Better phylogenetic resolution was found in rpo147 rather than the commonly used DNA polymerase. The novel consensus rpo147 PCR assay will create more accurate phylogenetic trees and allow better insight into poxvirus history.
Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics

PubMed Central

Ardui, Simon; Ameur, Adam; Vermeesch, Joris R; Hestand, Matthew S

2018-01-01

Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio's single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. PMID:29401301
Sequence analysis of the internal transcribed spacer (ITS) region reveals a novel clade of Ichthyophonus sp. from rainbow trout

USGS Publications Warehouse

Rasmussen, C.; Purcell, M.K.; Gregg, J.L.; LaPatra, S.E.; Winton, J.R.; Hershberger, P.K.

2010-01-01

The mesomycetozoean parasite Ichthyophonus hoferi is most commonly associated with marine fish hosts but also occurs in some components of the freshwater rainbow trout Oncorhynchus mykiss aquaculture industry in Idaho, USA. It is not certain how the parasite was introduced into rainbow trout culture, but it might have been associated with the historical practice of feeding raw, ground common carp Cyprinus carpio that were caught by commercial fishermen. Here, we report a major genetic division between west coast freshwater and marine isolates of Ichthyophonus hoferi. Sequence differences were not detected in 2 regions of the highly conserved small subunit (18S) rDNA gene; however, nucleotide variation was seen in internal transcribed spacer loci (ITS1 and ITS2), both within and among the isolates. Intra-isolate variation ranged from 2.4 to 7.6 nucleotides over a region consisting of ~740 bp. Majority consensus sequences from marine/anadromous hosts differed in only 0 to 3 nucleotides (99.6 to 100% nucleotide identity), while those derived from freshwater rainbow trout had no nucleotide substitutions relative to each other. However, the consensus sequences between isolates from freshwater rainbow trout and those from marine/anadromous hosts differed in 13 to 16 nucleotides (97.8 to 98.2% nucleotide identity).
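The identity percentages quoted above follow from the difference counts if the compared region is taken as exactly 740 bp. That length is an assumption here, since the abstract gives only an approximate figure. The sketch below also shows a simple majority-rule consensus of the kind the abstract refers to; the function names are hypothetical and ties are resolved arbitrarily (first symbol seen).

```python
from collections import Counter


def majority_consensus(aligned):
    """Most frequent symbol in each column of equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def count_differences(a, b):
    """Number of mismatching positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def percent_identity(differences, length):
    """Percent identity given a mismatch count over an aligned length."""
    return 100.0 * (1 - differences / length)


if __name__ == "__main__":
    region = 740  # assumed length; the abstract says ~740 bp
    for diffs in (0, 3, 13, 16):
        print(diffs, round(percent_identity(diffs, region), 1))
```

Under that assumption, 3 differences give 99.6% identity, 13 give 98.2% and 16 give 97.8%, matching the ranges reported in the abstract.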
Accurate multiplex polony sequencing of an evolved bacterial genome.

PubMed

Shendure, Jay; Porreca, Gregory J; Reppas, Nikos B; Lin, Xiaoxia; McCutcheon, John P; Rosenbaum, Abraham M; Wang, Michael D; Zhang, Kun; Mitra, Robi D; Church, George M

2005-09-09

We describe a DNA sequencing technology in which a commonly available, inexpensive epifluorescence microscope is converted to rapid nonelectrophoretic DNA sequencing automation. We apply this technology to resequence an evolved strain of Escherichia coli at less than one error per million consensus bases. A cell-free, mate-paired library provided single DNA molecules that were amplified in parallel to 1-micrometer beads by emulsion polymerase chain reaction. Millions of beads were immobilized in a polyacrylamide gel and subjected to automated cycles of sequencing by ligation and four-color imaging. Cost per base was roughly one-ninth as much as that of conventional sequencing. Our protocols were implemented with off-the-shelf instrumentation and reagents.
Nucleotide Sequence of the blaRTG-2 (CARB-5) Gene and Phylogeny of a New Group of Carbenicillinases

PubMed Central

Choury, Daniele; Szajnert, Marie-France; Joly-Guillou, Marie-Laure; Azibi, Kemal; Delpech, Marc; Paul, Gérard

2000-01-01

We determined the nucleotide sequence of the bla gene for the Acinetobacter calcoaceticus β-lactamase previously described as CARB-5. Alignment of the deduced amino acid sequence with those of known β-lactamases revealed that CARB-5 possesses an RTG triad in box VII, as described for the Proteus mirabilis GN79 enzyme, instead of the RSG consensus characteristic of the other carbenicillinases. Phylogenetic studies showed that these RTG enzymes constitute a new, separate group, possibly ancestors of the carbenicillinase family. PMID:10722515
A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer.

PubMed

Quick, Joshua; Quinlan, Aaron R; Loman, Nicholas J

2014-01-01

The MinION™ is a new, portable single-molecule sequencer developed by Oxford Nanopore Technologies. It measures four inches in length and is powered from the USB 3.0 port of a laptop computer. The MinION™ measures the change in current resulting from DNA strands interacting with a charged protein nanopore. These measurements can then be used to deduce the underlying nucleotide sequence. We present a read dataset from whole-genome shotgun sequencing of the model organism Escherichia coli K-12 substr. MG1655 generated on a MinION™ device during the early-access MinION™ Access Program (MAP). Sequencing runs of the MinION™ are presented, one generated using R7 chemistry (released in July 2014) and one using R7.3 (released in September 2014). Base-called sequence data are provided to demonstrate the nature of data produced by the MinION™ platform and to encourage the development of customised methods for alignment, consensus and variant calling, de novo assembly and scaffolding. FAST5 files containing event data within the HDF5 container format are provided to assist with the development of improved base-calling methods.
Ability of HIV-1 Nef to downregulate CD4 and HLA class I differs among viral subtypes

PubMed Central

2013-01-01

Background: The highly genetically diverse HIV-1 group M subtypes may differ in their biological properties. Nef is an important mediator of viral pathogenicity; however, to date, a comprehensive inter-subtype comparison of Nef in vitro function has not been undertaken. Here, we investigate two of Nef’s most well-characterized activities, CD4 and HLA class I downregulation, for clones obtained from 360 chronic patients infected with HIV-1 subtypes A, B, C or D. Results: Single HIV-1 plasma RNA Nef clones were obtained from N=360 antiretroviral-naïve, chronically infected patients from Africa and North America: 96 (subtype A), 93 (B), 85 (C), and 86 (D). Nef clones were expressed by transfection in an immortalized CD4+ T-cell line. CD4 and HLA class I surface levels were assessed by flow cytometry. Nef expression was verified by Western blot. Subset analyses and multivariable linear regression were used to adjust for differences in age, sex and clinical parameters between cohorts. Consensus HIV-1 subtype B and C Nef sequences were synthesized and functionally assessed. Exploratory sequence analyses were performed to identify potential genotypic correlates of Nef function. Subtype B Nef clones displayed marginally greater CD4 downregulation activity (p = 0.03) and markedly greater HLA class I downregulation activity (p < 0.0001) than clones from other subtypes. Subtype C Nefs displayed the lowest in vitro functionality. Inter-subtype differences in HLA class I downregulation remained statistically significant after controlling for differences in age, sex, and clinical parameters (p < 0.0001). The synthesized consensus subtype B Nef showed higher activities compared to consensus C Nef, which was most pronounced in cells expressing lower protein levels.
Nef clones exhibited substantial inter-subtype diversity: cohort consensus residues differed at 25% of codons, while a similar proportion of codons exhibited substantial inter-subtype differences in major variant frequency. These amino acids, along with others identified in intra-subtype analyses, represent candidates for mediating inter-subtype differences in Nef function. Conclusions: Results support a functional hierarchy of subtype B > A/D > C for Nef-mediated CD4 and HLA class I downregulation. The mechanisms underlying these differences and their relevance to HIV-1 pathogenicity merit further investigation. PMID:24041011
Genome characterization of Sugarcane Yellow Leaf Virus with special reference to RNAi based molecular breeding.

PubMed

Khalil, Farghama; Yueyu, Xu; Naiyan, Xiao; Di, Liu; Tayyab, Muhammad; Hengbo, Wang; Islam, Waqar; Rauf, Saeed; Pinghua, Chen

2018-05-04

Sugarcane is an essential crop for sugar and biofuel. Globally, its production is severely affected by sugarcane yellow leaf disease (SCYLD) caused by Sugarcane Yellow Leaf Virus (SCYLV). Many aphid vectors are involved in the spread of the disease, which reduces the effectiveness of cultural and chemical management. Empirical methods of plant breeding, such as introgression from wild and cultivated germplasm, were not possible or at least challenging due to the absence of resistance in cultivated and wild germplasm of sugarcane. RNA interference (RNAi) transformation is an effective method to create virus-resistant varieties. Nevertheless, limited progress has been made due to the lack of a comprehensive research program on SCYLV based on the RNAi technique. In order to show improvement and to propose future strategies for the feasibility of the RNAi technique to cope with SCYLV, genome-wide consensus sequences of SCYLV were analyzed through GenBank. The coverage rates of every consensus sequence in SCYLV isolates were calculated to evaluate their practicability. Our analysis showed that a single consensus sequence from SCYLV could not work well for RNAi-based sugarcane breeding programs. This may be due to the high mutation rate and continuous recombination within and between various viral strains. An alternative multi-target RNAi strategy is suggested to combat several strains of the virus and to reduce silencing escape. Multi-target small interfering RNAs (siRNAs) can be used together to construct an RNAi plant expression plasmid and to transform sugarcane tissues to develop new sugarcane varieties resistant to SCYLV. Copyright © 2018 Elsevier Ltd. All rights reserved.

Chromosome-Encoded Broad-Spectrum Ambler Class A β-Lactamase RUB-1 from Serratia rubidaea

PubMed Central

Didi, Jennifer; Ergani, Ayla; Lima, Sandra

2016-01-01

ABSTRACT Whole-genome sequencing of Serratia rubidaea CIP 103234T revealed a chromosomally located Ambler class A β-lactamase gene. The gene was cloned, and the β-lactamase, RUB-1, was characterized. RUB-1 displayed 74% and 73% amino acid sequence identity with the GIL-1 and TEM-1 penicillinases, respectively, and its substrate profile was similar to that of the latter β-lactamases. Analysis by 5′ rapid amplification of cDNA ends revealed promoter sequences highly divergent from the Escherichia coli σ70 consensus sequence. This work further illustrates the heterogeneity of β-lactamases among Serratia spp. PMID:27956418
Chromosome-Encoded Broad-Spectrum Ambler Class A β-Lactamase RUB-1 from Serratia rubidaea.

PubMed

Bonnin, Rémy A; Didi, Jennifer; Ergani, Ayla; Lima, Sandra; Naas, Thierry

2017-02-01

Whole-genome sequencing of Serratia rubidaea CIP 103234T revealed a chromosomally located Ambler class A β-lactamase gene. The gene was cloned, and the β-lactamase, RUB-1, was characterized. RUB-1 displayed 74% and 73% amino acid sequence identity with the GIL-1 and TEM-1 penicillinases, respectively, and its substrate profile was similar to that of the latter β-lactamases. Analysis by 5' rapid amplification of cDNA ends revealed promoter sequences highly divergent from the Escherichia coli σ70 consensus sequence. This work further illustrates the heterogeneity of β-lactamases among Serratia spp. Copyright © 2017 American Society for Microbiology.
LongISLND: in silico sequencing of lengthy and noisy datatypes

PubMed Central

Lau, Bayo; Mohiyuddin, Marghoob; Mu, John C.; Fang, Li Tai; Bani Asadi, Narges; Dallett, Carolina; Lam, Hugo Y. K.

2016-01-01

Summary: LongISLND is a software package designed to simulate sequencing data according to the characteristics of third generation, single-molecule sequencing technologies. The general software architecture is easily extendable, as demonstrated by the emulation of Pacific Biosciences (PacBio) multi-pass sequencing with P5 and P6 chemistries, producing data in FASTQ, H5, and the latest PacBio BAM format. We demonstrate its utility by downstream processing with consensus building and variant calling. Availability and Implementation: LongISLND is implemented in Java and available at http://bioinform.github.io/longislnd Contact: hugo.lam@roche.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27667791
Integrated consensus genetic and physical maps of flax (Linum usitatissimum L.).

PubMed

Cloutier, Sylvie; Ragupathy, Raja; Miranda, Evelyn; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Ward, Kerry; Rowland, Gordon; Duguid, Scott; Banik, Mitali

2012-12-01

Three linkage maps of flax (Linum usitatissimum L.) were constructed from populations CDC Bethune/Macbeth, E1747/Viking and SP2047/UGG5-5 containing between 385 and 469 mapped markers each. The first consensus map of flax was constructed incorporating 770 markers based on 371 shared markers including 114 that were shared by all three populations and 257 shared between any two populations. The 15 linkage group map corresponds to the haploid number of chromosomes of this species. The marker order of the consensus map was largely collinear in all three individual maps but a few local inversions and marker rearrangements spanning short intervals were observed. Segregation distortion was present in all linkage groups which contained 1-52 markers displaying non-Mendelian segregation. The total length of the consensus genetic map is 1,551 cM with a mean marker density of 2.0 cM. A total of 670 markers were anchored to 204 of the 416 fingerprinted contigs of the physical map corresponding to ~274 Mb or 74 % of the estimated flax genome size of 370 Mb. This high resolution consensus map will be a resource for comparative genomics, genome organization, evolution studies and anchoring of the whole genome shotgun sequence.
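The marker counts and coverage figures reported above can be cross-checked with simple arithmetic. The snippet below is only a sanity check on the numbers in the abstract; the variable names are illustrative and do not come from the paper.

```python
# Figures as reported in the abstract (variable names are illustrative).
shared_all_three = 114   # markers shared by all three populations
shared_two = 257         # markers shared between any two populations
total_markers = 770      # markers on the consensus map
map_length_cm = 1551     # total consensus map length in cM
anchored_mb = 274        # physical map anchored to markers, in Mb
genome_mb = 370          # estimated flax genome size, in Mb

shared_total = shared_all_three + shared_two
mean_spacing_cm = map_length_cm / total_markers
anchored_percent = 100 * anchored_mb / genome_mb

print(shared_total)                 # total shared markers
print(round(mean_spacing_cm, 1))    # mean spacing between markers, cM
print(round(anchored_percent))      # percent of genome anchored
```

The three derived values agree with the 371 shared markers, 2.0 cM mean density, and 74 % genome coverage stated in the abstract.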
Viral Linkage in HIV-1 Seroconverters and Their Partners in an HIV-1 Prevention Clinical Trial

PubMed Central

Campbell, Mary S.; Mullins, James I.; Hughes, James P.; Celum, Connie; Wong, Kim G.; Raugi, Dana N.; Sorensen, Stefanie; Stoddard, Julia N.; Zhao, Hong; Deng, Wenjie; Kahle, Erin; Panteleeff, Dana; Baeten, Jared M.; McCutchan, Francine E.; Albert, Jan; Leitner, Thomas; Wald, Anna; Corey, Lawrence; Lingappa, Jairam R.

2011-01-01

Background Characterization of viruses in HIV-1 transmission pairs will help identify biological determinants of infectiousness and evaluate candidate interventions to reduce transmission. Although HIV-1 sequencing is frequently used to substantiate linkage between newly HIV-1 infected individuals and their sexual partners in epidemiologic and forensic studies, viral sequencing is seldom applied in HIV-1 prevention trials. The Partners in Prevention HSV/HIV Transmission Study (ClinicalTrials.gov #NCT00194519) was a prospective randomized placebo-controlled trial that enrolled serodiscordant heterosexual couples to determine the efficacy of genital herpes suppression in reducing HIV-1 transmission; as part of the study analysis, HIV-1 sequences were examined for genetic linkage between seroconverters and their enrolled partners. Methodology/Principal Findings We obtained partial consensus HIV-1 env and gag sequences from blood plasma for 151 transmission pairs and performed deep sequencing of env in some cases. We analyzed sequences with phylogenetic techniques and developed a Bayesian algorithm to evaluate the probability of linkage. For linkage, we required monophyletic clustering between enrolled partners' sequences and a Bayesian posterior probability of ≥50%. Adjudicators classified each seroconversion, finding 108 (71.5%) linked, 40 (26.5%) unlinked, and 3 (2.0%) indeterminate transmissions, with linkage determined by consensus env sequencing in 91 (84%). Male seroconverters had a higher frequency of unlinked transmissions than female seroconverters. The likelihood of transmission from the enrolled partner was related to time on study, with increasing numbers of unlinked transmissions occurring after longer observation periods. Finally, baseline viral load was found to be significantly higher among linked transmitters. 
Conclusions/Significance In this first use of HIV-1 sequencing to establish endpoints in a large clinical trial, more than one-fourth of transmissions were unlinked to the enrolled partner, illustrating the relevance of these methods in the design of future HIV-1 prevention trials in serodiscordant couples. A hierarchy of sequencing techniques, analysis methods, and expert adjudication contributed to the linkage determination process. PMID:21399681
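The stated linkage requirement (monophyletic clustering plus a Bayesian posterior probability of at least 50%) and the reported proportions can be expressed in a few lines. This is a minimal sketch: the function name and its inputs are hypothetical, and it does not reproduce the study's Bayesian algorithm or the adjudication step.

```python
def meets_linkage_criteria(monophyletic: bool, posterior: float) -> bool:
    """Hypothetical helper: apply the two criteria named in the abstract."""
    return monophyletic and posterior >= 0.5


# Classification counts reported in the abstract.
counts = {"linked": 108, "unlinked": 40, "indeterminate": 3}
total_pairs = sum(counts.values())
percentages = {k: round(100 * v / total_pairs, 1) for k, v in counts.items()}

# Share of linked pairs resolved by consensus env sequencing alone.
env_share = round(100 * 91 / counts["linked"])

print(total_pairs, percentages, env_share)
```

The computed values match the 151 pairs and the 71.5%, 26.5%, 2.0%, and 84% figures given above.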
[Discussion on the botanical origin of Isatidis radix and Isatidis folium based on DNA barcoding].

PubMed

Sun, Zhi-Ying; Pang, Xiao-Hui

2013-12-01

This paper aimed to investigate the botanical origins of Isatidis Radix and Isatidis Folium and to clarify the confusion over their classification. The second internal transcribed spacer (ITS2) of ribosomal DNA and the chloroplast matK gene of 22 samples from major production areas were amplified and sequenced. Sequence assembly and consensus sequence generation were performed using CodonCode Aligner. Phylogenetic analysis was performed using MEGA 4.0 software with the Kimura 2-Parameter (K2P) model, and the phylogenetic tree was constructed using the neighbor-joining (NJ) method. The results showed that the length of the ITS2 sequence of the botanical origins of Isatidis Radix and Isatidis Folium was 191 bp. Some samples had several SNP sites, and some had heterozygous sites. In the NJ tree based on the ITS2 sequence, the studied samples were separated into two groups, one of which clustered with Isatis tinctoria L. The studied samples were also clearly divided into two groups based on the chloroplast matK gene. In conclusion, our results support that the botanical origin of Isatidis Radix and Isatidis Folium is Isatis indigotica Fortune, and that Isatis indigotica and Isatis tinctoria are two distinct species. This study does not support the merging of these two species proposed in Flora of China.
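For readers unfamiliar with the Kimura 2-Parameter model named above, the pairwise distance is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions between two aligned sequences. The sketch below implements that standard formula; it is not the MEGA implementation, and the function name and the choice to skip gapped or ambiguous sites are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment: one transition (A -> G) in ten sites.
print(k2p_distance("ACGTACGTAC", "GCGTACGTAC"))
```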
A genosensor for detection of consensus DNA sequence of Dengue virus using ZnO/Pt-Pd nanocomposites.

PubMed

Singhal, Chaitali; Pundir, C S; Narang, Jagriti

2017-11-15

An electrochemical genosensor based on a zinc oxide/platinum-palladium (ZnO/Pt-Pd) modified fluorine-doped tin oxide (FTO) glass plate was fabricated for detection of a consensus DNA sequence of Dengue virus (DENV) using methylene blue (MB) as an intercalating agent. To achieve this, probe DNA (PDNA) was immobilized on the surface of the ZnO/Pt-Pd nanocomposite-modified FTO electrode. The synthesized nanocomposites were characterized by high resolution transmission electron microscopy (HRTEM), energy dispersive X-ray analysis (EDX), atomic force microscopy (AFM), scanning electron microscopy (SEM), UV-Vis spectroscopy, X-ray diffraction (XRD) analysis and Fourier transform infra-red (FTIR) spectroscopy. This PDNA-modified electrode (PDNA/ZnO/Pt-Pd/FTO) served as a signal amplification platform for the detection of the target hybridized DNA (TDNA). The hybridization between PDNA and TDNA was detected by a reduction in current generated by the interaction of the anionic mediator, i.e., methylene blue (MB), with free guanine (3'G) of ssDNA. The sensor showed a dynamic linear range of 1 × 10⁻⁶ M to 100 × 10⁻⁶ M, with an LOD of 4.3 × 10⁻⁵ M and an LOQ of 9.5 × 10⁻⁵ M. To date, mostly serotype-specific biosensors for dengue detection have been developed. The genosensor reported here eliminates the possibility of false results that arises with serotype-specific DNA sensors. In this report, conserved sequences present in all serotypes of Dengue virus have been employed for fabrication of a genosensor. Copyright © 2017 Elsevier B.V. All rights reserved.
The Society for Immunotherapy of Cancer consensus statement on immunotherapy for the treatment of prostate carcinoma.

PubMed

McNeel, Douglas G; Bander, Neil H; Beer, Tomasz M; Drake, Charles G; Fong, Lawrence; Harrelson, Stacey; Kantoff, Philip W; Madan, Ravi A; Oh, William K; Peace, David J; Petrylak, Daniel P; Porterfield, Hank; Sartor, Oliver; Shore, Neal D; Slovin, Susan F; Stein, Mark N; Vieweg, Johannes; Gulley, James L

2016-01-01

Prostate cancer is the most commonly diagnosed malignancy and second leading cause of cancer death among men in the United States. In recent years, several new agents, including cancer immunotherapies, have been approved or are currently being investigated in late-stage clinical trials for the management of advanced prostate cancer. Therefore, the Society for Immunotherapy of Cancer (SITC) convened a multidisciplinary panel, including physicians, nurses, and patient advocates, to develop consensus recommendations for the clinical application of immunotherapy for prostate cancer patients. To do so, a systematic literature search was performed to identify high-impact papers from 2006 until 2014 and was further supplemented with literature provided by the panel. Results from the consensus panel voting and discussion as well as the literature review were used to rate supporting evidence and generate recommendations for the use of immunotherapy in prostate cancer patients. Sipuleucel-T, an autologous dendritic cell vaccine, is the first and currently only immunotherapeutic agent approved for the clinical management of metastatic castrate resistant prostate cancer (mCRPC). The consensus panel utilized this model to discuss immunotherapy in the treatment of prostate cancer, issues related to patient selection, monitoring of patients during and post treatment, and sequence/combination with other anti-cancer treatments. Potential immunotherapies emerging from late-stage clinical trials are also discussed. As immunotherapy evolves as a therapeutic option for the treatment of prostate cancer, these recommendations will be updated accordingly.
Event-Triggered Distributed Average Consensus Over Directed Digital Networks With Limited Communication Bandwidth.

PubMed

Li, Huaqing; Chen, Guo; Huang, Tingwen; Dong, Zhaoyang; Zhu, Wei; Gao, Lan

2016-12-01

In this paper, we consider the event-triggered distributed average consensus of discrete-time first-order multiagent systems with limited communication data rate and general directed network topology. In the framework of a digital communication network, each agent has a real-valued state but can only exchange a finite-bit binary symbolic data sequence with its neighboring agents at each time step, owing to digital communication channels with energy constraints. A novel event-triggered dynamic encoder and decoder are designed for each agent, based on which a distributed control algorithm is proposed. A scheme that selects the number of channel quantization levels (number of bits) at each time step is developed, under which none of the quantizers in the network is ever saturated. The convergence rate of consensus is explicitly characterized and is related to the scale of the network, the maximum degree of nodes, the network structure, the scaling function, the quantization interval, the initial states of agents, the control gain and the event gain. It is also found that, under the designed event-triggered protocol and with suitable parameters, for any directed digital network containing a spanning tree, distributed average consensus can always be achieved with an exponential convergence rate based on merely one bit of information exchange between each pair of adjacent agents at each time step. Two simulation examples are provided to illustrate the feasibility of the presented protocol and the correctness of the theoretical results.
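The basic idea of event-triggered average consensus can be illustrated with a much simpler setting than the one studied in the paper. The sketch below is an assumption-laden toy: it uses an undirected ring, real-valued broadcasts, and a fixed trigger threshold, and it omits the paper's quantization, encoder/decoder design, and directed topology entirely. All names and parameter values are hypothetical.

```python
def event_triggered_consensus(x0, neighbors, step=0.2, threshold=0.01,
                              iterations=200):
    """Toy event-triggered averaging on an undirected graph.

    Each agent re-broadcasts its state only when it has drifted more than
    `threshold` from its last broadcast; updates use broadcast values only.
    """
    x = list(x0)
    broadcast = list(x0)   # last value each agent transmitted
    events = 0
    for _ in range(iterations):
        # Event check: transmit only if the local drift is large enough.
        for i in range(len(x)):
            if abs(x[i] - broadcast[i]) > threshold:
                broadcast[i] = x[i]
                events += 1
        # Consensus update driven by the broadcast values. On a symmetric
        # graph the pairwise terms cancel, so the average is preserved.
        x = [
            x[i] + step * sum(broadcast[j] - broadcast[i] for j in neighbors[i])
            for i in range(len(x))
        ]
    return x, events


# Four agents on a ring; the initial average is 3.0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
final_states, n_events = event_triggered_consensus([1.0, 2.0, 3.0, 6.0], ring)
print(final_states, n_events)
```

With a fixed threshold the states settle in a small neighborhood of the average rather than converging exactly; the paper's scheme is designed to achieve exact average consensus under quantized communication, which this toy does not attempt.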
Sequence Bundles: a novel method for visualising, discovering and exploring sequence motifs

PubMed Central

2014-01-01

Background We introduce Sequence Bundles--a novel data visualisation method for representing multiple sequence alignments (MSAs). We identify and address key limitations of the existing bioinformatics data visualisation methods (i.e. the Sequence Logo) by enabling Sequence Bundles to give salient visual expression to sequence motifs and other data features, which would otherwise remain hidden. Methods For the development of Sequence Bundles we employed research-led information design methodologies. Sequences are encoded as uninterrupted, semi-opaque lines plotted on a 2-dimensional reconfigurable grid. Each line represents a single sequence. The thickness and opacity of the stack at each residue in each position indicates the level of conservation and the lines' curved paths expose patterns in correlation and functionality. Several MSAs can be visualised in a composite image. The Sequence Bundles method is designed to favour a tangible, continuous and intuitive display of information. Results We have developed a software demonstration application for generating a Sequence Bundles visualisation of MSAs provided for the BioVis 2013 redesign contest. A subsequent exploration of the visualised line patterns allowed for the discovery of a number of interesting features in the dataset. Reported features include the extreme conservation of sequences displaying a specific residue and bifurcations of the consensus sequence. Conclusions Sequence Bundles is a novel method for visualisation of MSAs and the discovery of sequence motifs. It can aid in generating new insight and hypothesis making. Sequence Bundles is well disposed for future implementation as an interactive visual analytics software, which can complement existing visualisation tools. PMID:25237395
Population-genetic analysis of HvABCG31 promoter sequence in wild barley (Hordeum vulgare ssp. spontaneum)

PubMed Central

2012-01-01

Background The cuticle is an important adaptive structure whose origin played a crucial role in the transition of plants from aqueous to terrestrial conditions. HvABCG31/Eibi1 is an ABCG transporter gene, involved in cuticle formation that was recently identified in wild barley (Hordeum vulgare ssp. spontaneum). To study the genetic variation of HvABCG31 in different habitats, its 2 kb promoter region was sequenced from 112 wild barley accessions collected from five natural populations from southern and northern Israel. The sites included three mesic and two xeric habitats, and differed in annual rainfall, soil type, and soil water capacity. Results Phylogenetic analysis of the aligned HvABCG31 promoter sequences clustered the majority of accessions (69 out of 71) from the three northern mesic populations into one cluster, while all 21 accessions from the Dead Sea area, a xeric southern population, and two isolated accessions (one from a xeric population at Mitzpe Ramon and one from the xeric ‘African Slope’ of “Evolution Canyon”) formed the second cluster. The southern arid populations included six haplotypes, but they differed from the consensus sequence at a large number of positions, while the northern mesic populations included 15 haplotypes that were, on average, more similar to the consensus sequence. Most of the haplotypes (20 of 22) were unique to a population. Interestingly, higher genetic variation occurred within populations (54.2%) than among populations (45.8%). Analysis of the promoter region detected a large number of transcription factor binding sites: 121–128 and 121–134 sites in the two southern arid populations, and 123–128, 125–128, and 123–125 sites in the three northern mesic populations. Three types of TFBSs were significantly enriched: those related to GA (gibberellin), Dof (DNA binding with one finger), and light.
Conclusions Drought stress and adaptive natural selection may have been important determinants in the observed sequence variation of HvABCG31 promoter. Abiotic stresses may be involved in the HvABCG31 gene transcription regulations, generating more protective cuticles in plants under stresses. PMID:23006777
Structure and stability of the ankyrin domain of the Drosophila Notch receptor.

PubMed

Zweifel, Mark E; Leahy, Daniel J; Hughson, Frederick M; Barrick, Doug

2003-11-01

The Notch receptor contains a conserved ankyrin repeat domain that is required for Notch-mediated signal transduction. The ankyrin domain of Drosophila Notch contains six ankyrin sequence repeats previously identified as closely matching the ankyrin repeat consensus sequence, and a putative seventh C-terminal sequence repeat that exhibits lower similarity to the consensus sequence. To better understand the role of the Notch ankyrin domain in Notch-mediated signaling and to examine how structure is distributed among the seven ankyrin sequence repeats, we have determined the crystal structure of this domain to 2.0 angstroms resolution. The seventh, C-terminal, ankyrin sequence repeat adopts a regular ankyrin fold, but the first, N-terminal ankyrin repeat, which contains a 15-residue insertion, appears to be largely disordered. The structure reveals a substantial interface between ankyrin polypeptides, showing a high degree of shape and charge complementarity, which may be related to homotypic interactions suggested from indirect studies. However, the Notch ankyrin domain remains largely monomeric in solution, demonstrating that this interface alone is not sufficient to promote tight association. Using the structure, we have classified reported mutations within the Notch ankyrin domain that are known to disrupt signaling into those that affect buried residues and those restricted to surface residues. We show that the buried substitutions greatly decrease protein stability, whereas the surface substitutions have only a marginal effect on stability. The surface substitutions are thus likely to interfere with Notch signaling by disrupting specific Notch-effector interactions and map the sites of these interactions.
DNA sequence requirements for the accurate transcription of a protein-coding plastid gene in a plastid in vitro system from mustard (Sinapis alba L.)

PubMed Central

Link, Gerhard

1984-01-01

A nuclease-treated plastid extract from mustard (Sinapis alba L.) allows efficient transcription of cloned plastid DNA templates. In this in vitro system, the major runoff transcript of the truncated gene for the 32 000 mol. wt. photosystem II protein was accurately initiated from a site close to or identical with the in vivo start site. By using plasmids with deletions in the 5'-flanking region of this gene as templates, a DNA region required for efficient and selective initiation was detected ~28-35 nucleotides upstream of the transcription start site. This region contains the sequence element TTGACA, which matches the consensus sequence for prokaryotic '−35' promoter elements. In the absence of this region, a region ~13-27 nucleotides upstream of the start site still enables a basic level of specific transcription. This second region contains the sequence element TATATAA, which matches the consensus sequence for the 'TATA' box of genes transcribed by RNA polymerase II (or B). The region between the 'TATA'-like element and the transcription start site is not sufficient but may be required for specific transcription of the plastid gene. This latter region contains the sequence element TATACT, which resembles the prokaryotic '−10' (Pribnow) box. Based on the structural and transcriptional features of the 5' upstream region, a 'promoter switch' mechanism is proposed, which may account for the developmentally regulated expression of this plastid gene. PMID:16453540
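How closely the two plastid elements resemble the prokaryotic promoter hexamers can be made concrete by counting mismatches. In the sketch below, TTGACA is the −35 consensus given in the abstract; TATAAT is the commonly cited −10 (Pribnow box) consensus and is supplied here as background knowledge, not taken from the abstract. The helper name is illustrative.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


MINUS_35_CONSENSUS = "TTGACA"   # stated in the abstract
MINUS_10_CONSENSUS = "TATAAT"   # standard Pribnow box, added as background

plastid_minus_35 = "TTGACA"     # element found ~28-35 nt upstream
plastid_minus_10_like = "TATACT"  # element near the start site

print(mismatches(plastid_minus_35, MINUS_35_CONSENSUS))
print(mismatches(plastid_minus_10_like, MINUS_10_CONSENSUS))
```

An exact match for the −35 element and a single difference for the −10-like element is consistent with the abstract's wording that the first "matches" and the second "resembles" the prokaryotic consensus.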
Architecture of a Fur Binding Site: a Comparative Analysis

PubMed Central

Lavrrar, Jennifer L.; McIntosh, Mark A.

2003-01-01

Fur is an iron-binding transcriptional repressor that recognizes a 19-bp consensus site of the sequence 5′-GATAATGATAATCATTATC-3′. This site can be defined as three adjacent hexamers of the sequence 5′-GATAAT-3′, with the third being slightly imperfect (an F-F-F configuration), or as two hexamers in the forward orientation separated by one base pair from a third hexamer in the reverse orientation (an F-F-x-R configuration). Although Fur can bind synthetic DNA sequences containing the F-F-F arrangement, most natural binding sites are variations of the F-F-x-R arrangement. The studies presented here compared the ability of Fur to recognize synthetic DNA sequences containing two to four adjacent hexamers with binding to sequences containing variations of the F-F-x-R arrangement (including natural operator sequences from the entS and fepB promoter regions of Escherichia coli). Gel retardation assays showed that the F-F-x-R architecture was necessary for high-affinity Fur-DNA interactions and that contiguous hexamers were not recognized as effectively. In addition, the stoichiometry of Fur at each binding site was determined, showing that Fur interacted with its minimal 19-bp binding site as two overlapping dimers. These data confirm the proposed overlapping-dimer binding model, where the unit of interaction with a single Fur dimer is two inverted hexamers separated by a C:G base pair, with two overlapping units comprising the 19-bp consensus binding site required for the high-affinity interaction with two Fur dimers. PMID:12644489
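The two readings of the 19-bp site described above can be verified directly from the sequence. The following sketch splits the consensus both ways and checks the reverse-orientation hexamer against the reverse complement of GATAAT; the function and variable names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


FUR_SITE = "GATAATGATAATCATTATC"   # 19-bp consensus from the abstract
HEXAMER = "GATAAT"

# F-F-F reading: three adjacent hexamers (the last base is left over).
fff = [FUR_SITE[i:i + 6] for i in (0, 6, 12)]
third_mismatches = sum(a != b for a, b in zip(fff[2], HEXAMER))

# F-F-x-R reading: two forward hexamers, one spacer base, one reverse hexamer.
forward_1, forward_2 = FUR_SITE[0:6], FUR_SITE[6:12]
spacer, reverse_hexamer = FUR_SITE[12], FUR_SITE[13:19]

print(fff, third_mismatches)
print(forward_1, forward_2, spacer, reverse_hexamer)
print(reverse_hexamer == reverse_complement(HEXAMER))
```

In the F-F-F reading the third hexamer (CATTAT) differs from GATAAT at two positions, whereas in the F-F-x-R reading the final hexamer is an exact reverse complement, separated from the forward pair by a single C.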
Identification of a DNA sequence motif required for expression of iron-regulated genes in pseudomonads.

PubMed

Rombel, I T; McMorran, B J; Lamont, I L

1995-02-20

Many bacteria respond to a lack of iron in the environment by synthesizing siderophores, which act as iron-scavenging compounds. Fluorescent pseudomonads synthesize strain-specific but chemically related siderophores called pyoverdines or pseudobactins. We have investigated the mechanisms by which iron controls expression of genes involved in pyoverdine metabolism in Pseudomonas aeruginosa. Transcription of these genes is repressed by the presence of iron in the growth medium. Three promoters from these genes were cloned and the activities of the promoters were dependent on the amounts of iron in the growth media. Two of the promoters were sequenced and the transcriptional start sites were identified by S1 nuclease analysis. Sequences similar to the consensus binding site for the Fur repressor protein, which controls expression of iron-repressible genes in several gram-negative species, were not present in the promoters, suggesting that they are unlikely to have a high affinity for Fur. However, comparison of the promoter sequences with those of iron-regulated genes from other Pseudomonas species and also the iron-regulated exotoxin gene of P. aeruginosa allowed identification of a shared sequence element, with the consensus sequence (G/C)CTAAAT-CCC, which is likely to act as a binding site for a transcriptional activator protein. Mutations in this sequence greatly reduced the activities of the promoters characterized here as well as those of other iron-regulated promoters. The requirement for this motif in the promoters of iron-regulated genes of different Pseudomonas species indicates that similar mechanisms are likely to be involved in controlling expression of a range of iron-regulated genes in pseudomonads.
Identification of Dendrobium species by a candidate DNA barcode sequence: the chloroplast psbA-trnH intergenic region.

PubMed

Yao, Hui; Song, Jing-Yuan; Ma, Xin-Ye; Liu, Chang; Li, Ying; Xu, Hong-Xi; Han, Jian-Ping; Duan, Li-Sheng; Chen, Shi-Lin

2009-05-01

DNA barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Although a consensus has not been reached regarding which DNA sequences can be used as the best plant barcodes, the psbA-trnH spacer region has been tested extensively in recent years. In this study, we hypothesize that the psbA-trnH spacer regions are also effective barcodes for Dendrobium species. We have sequenced the chloroplast psbA-trnH intergenic spacers of 17 Dendrobium species to test this hypothesis. The sequences were found to be significantly different from those of other species, with percentages of variation ranging from 0.3 % to 2.3 % and an average of 1.2 %. In contrast, the intraspecific variation among the Dendrobium species studied ranged from 0 % to 0.1 %. The sequence difference between the psbA-trnH sequences of 17 Dendrobium species and one Bulbophyllum odoratissimum ranged from 2.0 % to 3.1 %, with an average of 2.5 %. Our results support the notion that the psbA-trnH intergenic spacer region could be used as a barcode to distinguish various Dendrobium species and to differentiate Dendrobium species from other adulterating species. Copyright Georg Thieme Verlag KG Stuttgart. New York.
Echinococcus granulosus Sensu Stricto in Dogs and Jackals from Caspian Sea Region, Northern Iran

PubMed Central

GHOLAMI, Shirzad; JAHANDAR, Hefzallah; ABASTABAR, Mahdi; PAGHEH, Abdolsatar; MOBEDI, Iraj; SHARBATKHORI, Mitra

2016-01-01

Background: The aim of the present study was to genotype Echinococcus granulosus isolates from dogs and jackals in Mazandaran Province, northern Iran, using a partial sequence of the mitochondrial cytochrome c oxidase subunit 1 gene (cox1). Methods: E. granulosus isolates (n = 15) were collected from 42 stray dogs and 16 jackals found south of the Caspian Sea in northern Iran. After morphological study, the isolates were genetically characterized using consensus sequences (366 bp) of the cox1 gene. Phylogenetic analysis of cox1 nucleotide sequence data was performed using a Bayesian Inference approach. Results: Four different sequences were observed among the isolates. Two genotypes [G1 (66.7%) and G3 (33.3%)] were identified among the isolates. The G1 sequences indicated three sequence profiles. One profile (Maz1) had 100% homology with a reference sequence (AN: KP339045). Two other profiles, designated Maz2 and Maz3, had 99% homology with the G1 genotype (ANs: KP339046 and KP339047). A G3 sequence designated Maz4 showed 100% homology with a G3 reference sequence (AN: KP339048). Conclusion: The occurrence of the G1 genotype of E. granulosus sensu stricto as a frequent genotype in dogs is emphasized. This study established the first molecular characterization of E. granulosus in the province. PMID:28096852
Process of labeling specific chromosomes using recombinant repetitive DNA

DOEpatents

Moyzis, R.K.; Meyne, J.

1988-02-12

Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.
C-Mannosylation of thrombopoietin receptor (c-Mpl) regulates thrombopoietin-dependent JAK-STAT signaling.

PubMed

Sasazawa, Yukiko; Sato, Natsumi; Suzuki, Takehiro; Dohmae, Naoshi; Simizu, Siro

The thrombopoietin receptor, also known as c-Mpl, is a member of the cytokine superfamily, which regulates the differentiation of megakaryocytes and formation of platelets by binding to its ligand, thrombopoietin (TPO), through Janus kinase (JAK)-signal transducer and activator of transcription (STAT) signaling. Loss-of-function mutations of c-Mpl cause severe thrombocytopenia due to impaired megakaryocytopoiesis, and gain-of-function mutations cause thrombocythemia. c-Mpl contains two Trp-Ser-Xaa-Trp-Ser (Xaa represents any amino acid) sequences, which are characteristic of type I cytokine receptors and correspond to the C-mannosylation consensus sequence Trp-Xaa-Xaa-Trp/Cys. C-mannosylation is a post-translational modification in which one mannose is attached to the first tryptophan residue in the consensus sequence via a C-C linkage. Although c-Mpl contains several C-mannosylation sequences, whether c-Mpl is C-mannosylated had not been investigated. We found that c-Mpl is C-mannosylated not only at Trp(269) and Trp(474), which are putative C-mannosylation sites, but also at Trp(272), Trp(416), and Trp(477). Using C-mannosylation-defective mutants of c-Mpl, we found that the C-mannosylated tryptophan residues at four sites (Trp(269), Trp(272), Trp(474), and Trp(477)) are essential for c-Mpl-mediated JAK-STAT signaling. Our findings suggest that C-mannosylation of c-Mpl is a possible therapeutic target for platelet disorders. Copyright © 2015 Elsevier Inc. All rights reserved.
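The C-mannosylation consensus Trp-Xaa-Xaa-Trp/Cys translates naturally into a regular expression over one-letter amino acid codes. The sketch below scans a sequence for candidate sites; the peptide is a made-up toy string, not the c-Mpl sequence, and the function name is hypothetical. A lookahead is used so that overlapping candidates are all reported.

```python
import re

# W, any two residues, then W or C. The lookahead keeps overlapping hits.
C_MANNOSYLATION = re.compile(r"(?=(W..[WC]))")


def find_candidate_sites(protein: str):
    """Return (1-based position of the first Trp, matched motif) pairs."""
    return [(m.start() + 1, m.group(1)) for m in C_MANNOSYLATION.finditer(protein)]


# Toy peptide (hypothetical) containing a WSxWS-like stretch and a W..C motif.
toy_peptide = "AAWSGWSAAWAAC"
print(find_candidate_sites(toy_peptide))
```

A consensus match only marks a candidate: as the abstract notes, the study also found modification at tryptophans outside the putative sites, so experimental confirmation is still required.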
Sequence and structural analyses of nuclear export signals in the NESdb database

PubMed Central

Xu, Darui; Farmer, Alicia; Collett, Garen; Grishin, Nick V.; Chook, Yuh Min

2012-01-01

We compiled >200 nuclear export signal (NES)–containing CRM1 cargoes in a database named NESdb. We analyzed the sequences and three-dimensional structures of natural, experimentally identified NESs and of false-positive NESs that were generated from the database in order to identify properties that might distinguish the two groups of sequences. Analyses of amino acid frequencies, sequence logos, and agreement with existing NES consensus sequences revealed strong preferences for the Φ1-X3-Φ2-X2-Φ3-X-Φ4 pattern and for negatively charged amino acids in the nonhydrophobic positions of experimentally identified NESs but not of false positives. Strong preferences against certain hydrophobic amino acids in the hydrophobic positions were also revealed. These findings led to a new and more precise NES consensus. More important, three-dimensional structures are now available for 68 NESs within 56 different cargo proteins. Analyses of these structures showed that experimentally identified NESs are more likely than the false positives to adopt α-helical conformations that transition to loops at their C-termini and more likely to be surface accessible within their protein domains or be present in disordered or unobserved parts of the structures. Such distinguishing features for real NESs might be useful in future NES prediction efforts. Finally, we also tested CRM1-binding of 40 NESs that were found in the 56 structures. We found that 16 of the NES peptides did not bind CRM1, hence illustrating how NESs are easily misidentified. PMID:22833565
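The Φ1-X3-Φ2-X2-Φ3-X-Φ4 pattern mentioned in this abstract translates directly into a positional search. The sketch below is illustrative only: the abstract does not list which residues count as hydrophobic (Φ), so the set L, I, V, F, M is an assumption, and the function name and toy peptide are hypothetical. As the abstract itself stresses, a pattern match alone easily yields false positives.

```python
import re

# Assumed hydrophobic set for the Phi positions (not specified in the abstract).
PHI = "[LIVFM]"

# Phi1-X3-Phi2-X2-Phi3-X-Phi4, wrapped in a lookahead to allow overlapping hits.
NES_PATTERN = re.compile(rf"(?=({PHI}.{{3}}{PHI}.{{2}}{PHI}.{PHI}))")

def find_nes_candidates(protein):
    """Return (1-based start, matched 10-mer) for each pattern match."""
    return [(m.start() + 1, m.group(1)) for m in NES_PATTERN.finditer(protein)]

# Toy peptide (illustrative only).
hits = find_nes_candidates("GSLALKLAGLDINK")
```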

Cloning and sequence analysis of the invertase gene INV 1 from the yeast Pichia anomala.

PubMed

Pérez, J A; Rodríguez, J; Rodríguez, L; Ruiz, T

1996-02-01

A genomic library from the yeast Pichia anomala has been constructed and employed to clone the gene encoding the sucrose-hydrolysing enzyme invertase by complementation of a sucrose non-fermenting mutant of Saccharomyces cerevisiae. The cloned gene, INV1, was sequenced and found to encode a polypeptide of 550 amino acids which contained a 22 amino-acid signal sequence and ten potential glycosylation sites. The amino-acid sequence shows significant identity with other yeast invertases and also with Kluyveromyces marxianus inulinase, a yeast beta-fructofuranosidase which has a different substrate specificity. The nucleotide sequences of the 5' and 3' non-coding regions were found to contain several consensus motifs probably involved in the initiation and termination of gene transcription.
Biclustering as a method for RNA local multiple sequence alignment.

PubMed

Wang, Shu; Gutell, Robin R; Miranker, Daniel P

2007-12-15

Biclustering is a clustering method that simultaneously clusters both the domain and range of a relation. A challenge in multiple sequence alignment (MSA) is that the alignment of sequences is often intended to reveal groups of conserved functional subsequences. Simultaneously, the grouping of the sequences can impact the alignment; precisely the kind of dual situation biclustering is intended to address. We define a representation of the MSA problem enabling the application of biclustering algorithms. We develop a computer program for local MSA, BlockMSA, that combines biclustering with divide-and-conquer. BlockMSA simultaneously finds groups of similar sequences and locally aligns subsequences within them. Further alignment is accomplished by dividing both the set of sequences and their contents. The net result is both a multiple sequence alignment and a hierarchical clustering of the sequences. BlockMSA was tested on the subsets of the BRAliBase 2.1 benchmark suite that display high variability and on an extension to that suite to larger problem sizes. Also, alignments were evaluated of two large datasets of current biological interest, T box sequences and Group IC1 Introns. The results were compared with alignments computed by the ClustalW, MAFFT, MUSCLE and PROBCONS alignment programs using Sum of Pairs (SPS) and Consensus Count. Results for the benchmark suite are sensitive to problem size. On problems of 15 or more sequences, BlockMSA is consistently the best. On none of the problems in the test suite are there appreciable differences in scores among BlockMSA, MAFFT and PROBCONS. On the T box sequences, BlockMSA does the most faithful job of reproducing known annotations. MAFFT and PROBCONS do not. On the Intron sequences, BlockMSA, MAFFT and MUSCLE are comparable at identifying conserved regions. BlockMSA is implemented in Java. Source code and supplementary datasets are available at http://aug.csres.utexas.edu/msa/
Detection of the CLOCK/BMAL1 heterodimer using a nucleic acid probe with cycling probe technology.

PubMed

Nakagawa, Kazuhiro; Yamamoto, Takuro; Yasuda, Akio

2010-09-15

An isothermal signal amplification technique for specific DNA sequences, known as cycling probe technology (CPT), has enabled rapid acquisition of genomic information. Here we report an analogous technique for the detection of an activated transcription factor, a transcription element-binding assay with fluorescent amplification by apurinic/apyrimidinic (AP) site lysis cycle (TEFAL). This simple amplification assay can detect activated transcription factors by using a unique nucleic acid probe containing a consensus binding sequence and an AP site, which enables the CPT reaction with AP endonuclease. In this article, we demonstrate that this method detects the functional CLOCK/BMAL1 heterodimer via the TEFAL probe containing the E-box consensus sequence to which the CLOCK/BMAL1 heterodimer binds. Using TEFAL combined with immunoassays, we measured oscillations in the amount of CLOCK/BMAL1 heterodimer in serum-stimulated HeLa cells. Furthermore, we succeeded in measuring the circadian accumulation of the functional CLOCK/BMAL1 heterodimer in human buccal mucosa cells. TEFAL contributes greatly to the study of transcription factor activation in mammalian tissues and cell extracts and is a powerful tool for less invasive investigation of human circadian rhythms. 2010 Elsevier Inc. All rights reserved.
Comparison of the AdvanSure human papillomavirus screening real-time PCR, the Abbott RealTime High Risk human papillomavirus test, and the Hybrid Capture human papillomavirus DNA test for the detection of human papillomavirus.

PubMed

Hwang, Yusun; Lee, Miae

2012-05-01

We evaluated the performance of various commercial assays for the molecular detection of human papillomavirus (HPV); the recently developed AdvanSure HPV Screening real-time PCR assay (AdvanSure PCR) and the Abbott RealTime High Risk HPV PCR assay (Abbott PCR) were compared with the Hybrid Capture 2 HPV DNA Test (HC2). All 3 tests were performed on 177 samples, and any sample that showed a discrepancy in any of the 3 tests was genotyped using INNO-LiPA HPV genotyping and/or sequencing. On the basis of these results, we obtained a consensus HPV result, and the performance of each test was evaluated. We also evaluated high-risk HPV 16/18 detection by using the 2 real-time PCR assays. Among the 177 samples, 65 were negative and 75 were positive in all 3 assays; however, the results of the 3 assays with 37 samples were discrepant. Compared with the consensus HPV result, the sensitivities and specificities of HC2, AdvanSure PCR, and Abbott PCR were 97.6%, 91.7%, and 86.9% and 83.9%, 98.8%, and 100.0%, respectively. For HPV type 16/18 detection, the concordance rate between the AdvanSure PCR and Abbott PCR assays was 98.3%; however, 3 samples were discrepant (positive in AdvanSure PCR and negative in Abbott PCR) and were confirmed as HPV type 16 by INNO-LiPA genotyping and/or sequencing. For HPV detection, the AdvanSure HPV Screening real-time PCR assay and the Abbott PCR assay are less sensitive but more specific than the HC2 assay, but can simultaneously differentiate type 16/18 HPV from other types.
Transcriptome Analysis and Comparison of Marmota monax and Marmota himalayana.

PubMed

Liu, Yanan; Wang, Baoju; Wang, Lu; Vikash, Vikash; Wang, Qin; Roggendorf, Michael; Lu, Mengji; Yang, Dongliang; Liu, Jia

2016-01-01

The Eastern woodchuck (Marmota monax) is a classical animal model for studying hepatitis B virus (HBV) infection and hepatocellular carcinoma (HCC) in humans. Recently, we found that Marmota himalayana, an Asian animal species closely related to Marmota monax, is susceptible to woodchuck hepatitis virus (WHV) infection and can be used as a new mammalian model for HBV infection. However, the lack of genomic sequence information of both Marmota models strongly limited their application breadth and depth. To address this major obstacle of the Marmota models, we utilized Illumina RNA-Seq technology to sequence the cDNA libraries of liver and spleen samples of two Marmota monax and four Marmota himalayana. In total, over 13 billion nucleotide bases were sequenced and approximately 1.5 billion clean reads were obtained. Following assembly, 106,496 consensus sequences of Marmota monax and 78,483 consensus sequences of Marmota himalayana were detected. For functional annotation, in total 73,603 Unigenes of Marmota monax and 78,483 Unigenes of Marmota himalayana were identified using different databases (NR, NT, Swiss-Prot, KEGG, COG, GO). The Unigenes were aligned by blastx to protein databases to decide the coding DNA sequences (CDS) and in total 41,247 CDS of Marmota monax and 34,033 CDS of Marmota himalayana were predicted. The single nucleotide polymorphisms (SNPs) and the simple sequence repeats (SSRs) were also analyzed for all Unigenes obtained. Moreover, a large-scale transcriptome comparison was performed and revealed a high similarity in transcriptome sequences between the two marmota species. Our study provides an extensive amount of novel sequence information for Marmota monax and Marmota himalayana. 
This information may serve as a valuable genomics resource for further molecular, developmental and comparative evolutionary studies, as well as for the identification and characterization of functional genes that are involved in WHV infection and HCC development in the woodchuck model.
Transcriptome Analysis and Comparison of Marmota monax and Marmota himalayana

PubMed Central

Wang, Lu; Vikash, Vikash; Wang, Qin; Roggendorf, Michael; Lu, Mengji; Yang, Dongliang; Liu, Jia

2016-01-01

The Eastern woodchuck (Marmota monax) is a classical animal model for studying hepatitis B virus (HBV) infection and hepatocellular carcinoma (HCC) in humans. Recently, we found that Marmota himalayana, an Asian animal species closely related to Marmota monax, is susceptible to woodchuck hepatitis virus (WHV) infection and can be used as a new mammalian model for HBV infection. However, the lack of genomic sequence information of both Marmota models strongly limited their application breadth and depth. To address this major obstacle of the Marmota models, we utilized Illumina RNA-Seq technology to sequence the cDNA libraries of liver and spleen samples of two Marmota monax and four Marmota himalayana. In total, over 13 billion nucleotide bases were sequenced and approximately 1.5 billion clean reads were obtained. Following assembly, 106,496 consensus sequences of Marmota monax and 78,483 consensus sequences of Marmota himalayana were detected. For functional annotation, in total 73,603 Unigenes of Marmota monax and 78,483 Unigenes of Marmota himalayana were identified using different databases (NR, NT, Swiss-Prot, KEGG, COG, GO). The Unigenes were aligned by blastx to protein databases to decide the coding DNA sequences (CDS) and in total 41,247 CDS of Marmota monax and 34,033 CDS of Marmota himalayana were predicted. The single nucleotide polymorphisms (SNPs) and the simple sequence repeats (SSRs) were also analyzed for all Unigenes obtained. Moreover, a large-scale transcriptome comparison was performed and revealed a high similarity in transcriptome sequences between the two marmota species. Our study provides an extensive amount of novel sequence information for Marmota monax and Marmota himalayana. 
This information may serve as a valuable genomics resource for further molecular, developmental and comparative evolutionary studies, as well as for the identification and characterization of functional genes that are involved in WHV infection and HCC development in the woodchuck model. PMID:27806133
Forensic Loci Allele Database (FLAD): Automatically generated, permanent identifiers for sequenced forensic alleles.

PubMed

Van Neste, Christophe; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

2016-01-01

It is difficult to predict if and when massively parallel sequencing of forensic STR loci will replace capillary electrophoresis as the new standard technology in forensic genetics. The main benefits of sequencing are increased multiplexing scales and SNP detection. There is not yet a consensus on how sequenced profiles should be reported. We present the Forensic Loci Allele Database (FLAD) service, made freely available on http://forensic.ugent.be/FLAD/. It offers permanent identifiers for sequenced forensic alleles (STR or SNP) and their microvariants for use in forensic allele nomenclature. Analogous to Genbank, its aim is to provide permanent identifiers for forensically relevant allele sequences. Researchers that are developing forensic sequencing kits or are performing population studies, can register on http://forensic.ugent.be/FLAD/ and add loci and allele sequences with a short and simple application interface (API). Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
The Malarial Host-Targeting Signal Is Conserved in the Irish Potato Famine Pathogen

PubMed Central

Liolios, Konstantinos; Win, Joe; Kanneganti, Thirumala-Devi; Young, Carolyn; Kamoun, Sophien; Haldar, Kasturi

2006-01-01

Animal and plant eukaryotic pathogens, such as the human malaria parasite Plasmodium falciparum and the potato late blight agent Phytophthora infestans, are widely divergent eukaryotic microbes. Yet they both produce secretory virulence and pathogenic proteins that alter host cell functions. In P. falciparum, export of parasite proteins to the host erythrocyte is mediated by leader sequences shown to contain a host-targeting (HT) motif centered on an RxLx (E, D, or Q) core: this motif appears to signify a major pathogenic export pathway with hundreds of putative effectors. Here we show that a secretory protein of P. infestans, which is perceived by plant disease resistance proteins and induces hypersensitive plant cell death, contains a leader sequence that is equivalent to the Plasmodium HT-leader in its ability to export fusion of green fluorescent protein (GFP) from the P. falciparum parasite to the host erythrocyte. This export is dependent on an RxLR sequence conserved in P. infestans leaders, as well as in leaders of all ten secretory oomycete proteins shown to function inside plant cells. The RxLR motif is also detected in hundreds of secretory proteins of P. infestans, Phytophthora sojae, and Phytophthora ramorum and has high value in predicting host-targeted leaders. A consensus motif further reveals E/D residues enriched within ~25 amino acids downstream of the RxLR, which are also needed for export. Together the data suggest that in these plant pathogenic oomycetes, a consensus HT motif may reside in an extended sequence of ~25–30 amino acids, rather than in a short linear sequence. Evidence is presented that although the consensus is much shorter in P. falciparum, information sufficient for vacuolar export is contained in a region of ~30 amino acids, which includes sequences flanking the HT core. Finally, positional conservation between Phytophthora RxLR and P. falciparum RxLx (E, D, Q) is consistent with the idea that the context of their presentation is constrained. These studies provide the first evidence to our knowledge that eukaryotic microbes share equivalent pathogenic HT signals and thus conserved mechanisms to access host cells across plant and animal kingdoms that may present unique targets for prophylaxis across divergent pathogens. PMID:16733545
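The extended motif described in this abstract (an RxLR core followed by E/D enrichment within roughly 25 residues downstream) can be approximated by a two-step scan. The sketch below is a rough illustration, not the authors' prediction method: the function name, the fixed 25-residue window, and the toy sequence are all assumptions, and no enrichment threshold is applied because the abstract gives none.

```python
import re

# RxLR core; the lookahead keeps overlapping cores from being skipped.
RXLR_CORE = re.compile(r"(?=(R.LR))")

def rxlr_sites(protein, window=25):
    """Return (1-based start, core, count of E/D in the downstream window)."""
    sites = []
    for m in RXLR_CORE.finditer(protein):
        core_end = m.start() + 4
        downstream = protein[core_end:core_end + window]
        acidic = sum(residue in "ED" for residue in downstream)
        sites.append((m.start() + 1, m.group(1), acidic))
    return sites

# Toy leader fragment (illustrative only, not a real effector sequence).
sites = rxlr_sites("MKRSLRAAEEDAAA")
```

Reporting the acidic-residue count rather than a yes/no call leaves the choice of cutoff to the user, since any cutoff would have to be calibrated on real effector sets.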
Hydroxyapatite-binding peptides for bone growth and inhibition

DOEpatents

Bertozzi, Carolyn R [Berkeley, CA]; Song, Jie [Shrewsbury, MA]; Lee, Seung-Wuk [Walnut Creek, CA]

2011-09-20

Hydroxyapatite (HA)-binding peptides are selected using combinatorial phage library display. Pseudo-repetitive consensus amino acid sequences possessing periodic hydroxyl side chains in every two or three amino acid sequences are obtained. These sequences resemble the (Gly-Pro-Hyp)x repeat of human type I collagen, a major component of extracellular matrices of natural bone. A consistent presence of basic amino acid residues is also observed. The peptides are synthesized by the solid-phase synthetic method and then used for template-driven HA-mineralization. Microscopy reveals that the peptides template the growth of polycrystalline HA crystals ~40 nm in size.
The autism sequencing consortium: large-scale, high-throughput sequencing in autism spectrum disorders.

PubMed

Buxbaum, Joseph D; Daly, Mark J; Devlin, Bernie; Lehner, Thomas; Roeder, Kathryn; State, Matthew W

2012-12-20

Research during the past decade has seen significant progress in the understanding of the genetic architecture of autism spectrum disorders (ASDs), with gene discovery accelerating as the characterization of genomic variation has become increasingly comprehensive. At the same time, this research has highlighted ongoing challenges. Here we address the enormous impact of high-throughput sequencing (HTS) on ASD gene discovery, outline a consensus view for leveraging this technology, and describe a large multisite collaboration developed to accomplish these goals. Similar approaches could prove effective for severe neurodevelopmental disorders more broadly. Copyright © 2012 Elsevier Inc. All rights reserved.
Unenhanced breast MRI (STIR, T2-weighted TSE, DWIBS): An accurate and alternative strategy for detecting and differentiating breast lesions.

PubMed

Telegrafo, Michele; Rella, Leonarda; Stabile Ianora, Amato Antonio; Angelelli, Giuseppe; Moschetta, Marco

2015-10-01

To assess the role of STIR, T2-weighted TSE and DWIBS sequences for detecting and characterizing breast lesions and to compare unenhanced (UE)-MRI results with contrast-enhanced (CE)-MRI and histological findings, having the latter as the reference standard. Two hundred eighty consecutive patients (age range, 27-73 years; mean age±standard deviation (SD), 48.8±9.8 years) underwent MR examination with a diagnostic protocol including STIR, T2-weighted TSE, THRIVE and DWIBS sequences. Two radiologists blinded to both dynamic sequences and histological findings evaluated in consensus the STIR, T2-weighted TSE and DWIBS sequences and, after two weeks, the CE-MRI images, searching for breast lesions. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic accuracy for UE-MRI and CE-MRI were calculated. UE-MRI results were also compared with CE-MRI. UE-MRI sequences obtained sensitivity, specificity, diagnostic accuracy, PPV and NPV values of 94%, 79%, 86%, 79% and 94%, respectively. CE-MRI sequences obtained sensitivity, specificity, diagnostic accuracy, PPV and NPV values of 98%, 83%, 90%, 84% and 98%, respectively. No statistically significant difference between UE-MRI and CE-MRI was found. Breast UE-MRI could represent an accurate diagnostic tool and a valid alternative to CE-MRI for evaluating breast lesions. STIR and DWIBS sequences allow detection of breast lesions, while T2-weighted TSE sequences and ADC values could be useful for lesion characterization. Copyright © 2015 Elsevier Inc. All rights reserved.
Primary and secondary structural analyses of glutathione S-transferase pi from human placenta.

PubMed

Ahmad, H; Wilson, D E; Fritz, R R; Singh, S V; Medh, R D; Nagle, G T; Awasthi, Y C; Kurosky, A

1990-05-01

The primary structure of glutathione S-transferase (GST) pi from a single human placenta was determined. The structure was established by chemical characterization of tryptic and cyanogen bromide peptides as well as automated sequence analysis of the intact enzyme. The structural analysis indicated that the protein is comprised of 209 amino acid residues and gave no evidence of post-translational modifications. The amino acid sequence differed from that of the deduced amino acid sequence determined by nucleotide sequence analysis of a cDNA clone (Kano, T., Sakai, M., and Muramatsu, M., 1987, Cancer Res. 47, 5626-5630) at position 104 which contained both valine and isoleucine whereas the deduced sequence from nucleotide sequence analysis identified only isoleucine at this position. These results demonstrated that in the one individual placenta studied at least two GST pi genes are coexpressed, probably as a result of allelomorphism. Computer assisted consensus sequence evaluation identified a hydrophobic region in GST pi (residues 155-181) that was predicted to be either a buried transmembrane helical region or a signal sequence region. The significance of this hydrophobic region was interpreted in relation to the mode of action of the enzyme especially in regard to the potential involvement of a histidine in the active site mechanism. A comparison of the chemical similarity of five known human GST complete enzyme structures, one of pi, one of mu, two of alpha, and one microsomal, gave evidence that all five enzymes have evolved by a divergent evolutionary process after gene duplication, with the microsomal enzyme representing the most divergent form.
LongISLND: in silico sequencing of lengthy and noisy datatypes.

PubMed

Lau, Bayo; Mohiyuddin, Marghoob; Mu, John C; Fang, Li Tai; Bani Asadi, Narges; Dallett, Carolina; Lam, Hugo Y K

2016-12-15

LongISLND is a software package designed to simulate sequencing data according to the characteristics of third generation, single-molecule sequencing technologies. The general software architecture is easily extendable, as demonstrated by the emulation of Pacific Biosciences (PacBio) multi-pass sequencing with P5 and P6 chemistries, producing data in FASTQ, H5, and the latest PacBio BAM format. We demonstrate its utility by downstream processing with consensus building and variant calling. LongISLND is implemented in Java and available at http://bioinform.github.io/longislnd. Contact: hugo.lam@roche.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Age-related regulation of genes: slow homeostatic changes and age-dimension technology

NASA Astrophysics Data System (ADS)

Kurachi, Kotoku; Zhang, Kezhong; Huo, Jeffrey; Ameri, Afshin; Kuwahara, Mitsuhiro; Fontaine, Jean-Marc; Yamamoto, Kei; Kurachi, Sumiko

2002-11-01

Through systematic studies of pro- and anti-blood coagulation factors, we have determined molecular mechanisms involving two genetic elements, age-related stability element (ASE), GAGGAAG, and age-related increase element (AIE), a unique stretch of dinucleotide repeats. ASE and AIE are essential for age-related patterns of stable and increased gene expression, respectively. Such age-related gene regulatory mechanisms are also critical for explaining homeostasis in various physiological reactions as well as slow homeostatic changes in them. The age-related increase in expression of the human factor IX (hFIX) gene requires the presence of both ASE and AIE, which apparently function additively. The anti-coagulant factor protein C (hPC) gene uses an ASE (CAGGAG) to produce age-related stable expression. Both ASE sequences (G/CAGAAG) share the consensus sequence of the transcription factor PEA-3 element. No other similar sequences, including another PEA-3 consensus sequence, GAGGATG, function in conferring age-related gene regulation. The age-regulatory mechanisms involving ASE and AIE apparently function universally with different genes and across different animal species. These findings have led us to develop a new field of research and applications, which we named “age-dimension technology (ADT)”. ADT has exciting potential for modifying age-related expression of genes as well as associated physiological processes, and developing novel, more effective prophylaxis or treatments for age-related diseases.
Dfam: a database of repetitive DNA based on profile hidden Markov models.

PubMed

Wheeler, Travis J; Clements, Jody; Eddy, Sean R; Hubley, Robert; Jones, Thomas A; Jurka, Jerzy; Smit, Arian F A; Finn, Robert D

2013-01-01

We present a database of repetitive DNA elements, called Dfam (http://dfam.janelia.org). Many genomes contain a large fraction of repetitive DNA, much of which is made up of remnants of transposable elements (TEs). Accurate annotation of TEs enables research into their biology and can shed light on the evolutionary processes that shape genomes. Identification and masking of TEs can also greatly simplify many downstream genome annotation and sequence analysis tasks. The commonly used TE annotation tools RepeatMasker and Censor depend on sequence homology search tools such as cross_match and BLAST variants, as well as Repbase, a collection of known TE families each represented by a single consensus sequence. Dfam contains entries corresponding to all Repbase TE entries for which instances have been found in the human genome. Each Dfam entry is represented by a profile hidden Markov model, built from alignments generated using RepeatMasker and Repbase. When used in conjunction with the hidden Markov model search tool nhmmer, Dfam produces a 2.9% increase in coverage over consensus sequence search methods on a large human benchmark, while maintaining low false discovery rates, and coverage of the full human genome is 54.5%. The website provides a collection of tools and data views to support improved TE curation and annotation efforts. Dfam is also available for download in flat file format or in the form of MySQL table dumps.
New clades of euthyneuran gastropods (Mollusca) from 28S rRNA sequences.

PubMed

Dayrat, B; Tillier, A; Lecointre, G; Tillier, S

2001-05-01

Recent morphological and molecular results on phylogeny of euthyneuran gastropods, which include opisthobranchs and pulmonates, have greatly diminished previous supposed resolution of their phylogenetic relationships. In addition to recent morphological results, sequences of the D1 and D2 domains of the 28S rRNA are here analyzed by parsimony for 31 euthyneuran species. The molecular and previous morphological data sets were not congruent according to an ILD test, and morphological and molecular data could not be analyzed simultaneously. Consequently Bremer's Combinable Component Consensus was used to obtain a new tree, with the following supported molecular results: monophyly of a new clade of opisthobranchs including actively swimming Euthyneura, i.e., pelagic Gymnosomata and Thecosomata plus benthic Anaspidea; first molecular confirmation of monophylies of Hygrophila, including Chilina, Acteonoidea, and Sacoglossa, which include both shell-bearing species and slugs; and new confirmation of the monophyly of Stylommatophora. Morphological characters which support the new clades obtained here are discussed. Copyright 2001 Academic Press.
Expression of ADP-ribosylation factor (ARF)-like protein 6 during mouse embryonic development.

PubMed

Takada, Tatsuyuki; Iida, Keiko; Sasaki, Hiroshi; Taira, Masanori; Kimura, Hiroshi

2005-01-01

ADP-ribosylation factor (ARF)-like protein 6 (ARL6) is a member of the ARF-like protein (ARL) subfamily of small GTPases (Moss, 1995; Chavrier, 1999). ARLs are highly conserved through evolution and most of them possess the consensus sequence required for GTP binding and hydrolysis (Pasquallato, 2002). Among ARLs, ARL6, which was initially isolated from a J2E erythroleukemic cell line, is divergent in its consensus sequences, and its expression has been shown to be limited to the brain and kidney in adult mouse (Ingley, 1999). Recently, it was reported that mutations of the ARL6 gene cause type 3 Bardet-Biedl syndrome in humans and that ARL6 is involved in ciliary transport in C. elegans (Chiang, 2004; Fan, 2004). Here, we investigated the expression pattern of ARL6 during early mouse development by whole-mount in situ hybridization and found that, interestingly, ARL6 mRNA was localized around the node in embryos at 7.0-7.5 days post coitum (dpc), while weak expression was also found in the ectoderm. At the later stage (8.5 dpc) ARL6 was expressed in the neural plate and probably in the somites. Based on these results, a possible role of ARL6 in early development is discussed in relation to the findings in human and C. elegans (Chiang, 2004; Fan, 2004).
Overproduction, purification, and ATPase activity of the Escherichia coli RuvB protein involved in DNA repair.

PubMed Central

Iwasaki, H; Shiba, T; Makino, K; Nakata, A; Shinagawa, H

1989-01-01

The ruvA and ruvB genes of Escherichia coli constitute an operon which belongs to the SOS regulon. Genetic evidence suggests that the products of the ruv operon are involved in DNA repair and recombination. To begin biochemical characterization of these proteins, we developed a plasmid system that overproduced RuvB protein to 20% of total cell protein. Starting from the overproducing system, we purified RuvB protein. The purified RuvB protein behaved like a monomer in gel filtration chromatography and had an apparent relative molecular mass of 38 kilodaltons in sodium dodecyl sulfate-polyacrylamide gel electrophoresis, which agrees with the value predicted from the DNA sequence. The amino acid sequence of the amino-terminal region of the purified protein was analyzed, and the sequence agreed with the one deduced from the DNA sequence. Since the deduced sequence of RuvB protein contained the consensus sequence for ATP-binding proteins, we examined the ATP-binding and ATPase activities of the purified RuvB protein. RuvB protein had a stronger affinity to ADP than to ATP and weak ATPase activity. The results suggest that the weak ATPase activity of RuvB protein is at least partly due to end product inhibition by ADP. PMID:2529252
Experimental Identification of Actinobacillus pleuropneumoniae Strains L20 and JL03 Heptosyltransferases, Evidence for a New Heptosyltransferase Signature Sequence

PubMed Central

Merino, Susana; Knirel, Yuriy A.; Regué, Miguel; Tomás, Juan M.

2013-01-01

We experimentally identified the activities of six predicted heptosyltransferases in Actinobacillus pleuropneumoniae genome serotype 5b strain L20 and serotype 3 strain JL03. The initial identification was based on a bioinformatic analysis of the amino acid similarity between these putative heptosyltransferases with others of known function from enteric bacteria and Aeromonas. The putative functions of all the Actinobacillus pleuropneumoniae heptosyltransferases were determined by using surrogate LPS acceptor molecules from well-defined A. hydrophila AH-3 and A. salmonicida A450 mutants. Our results show that heptosyltransferases APL_0981 and APJL_1001 are responsible for the transfer of the terminal outer core D-glycero-D-manno-heptose (D,D-Hep) residue although they are not currently included in the CAZY glycosyltransferase 9 family. The WahF heptosyltransferase group signature sequence [S(T/S)(GA)XXH] differs from the heptosyltransferases consensus signature sequence [D(TS)(GA)XXH], because of the substitution of D261 for S261, being unique. PMID:23383222
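The two signature sequences quoted in this abstract differ only at the first position (D versus S), which makes them easy to tell apart with a pair of patterns. In the Python sketch below, reading (TS) and (GA) as "either residue at this position" and X as "any residue" is an interpretation of the abstract's notation; the labels, function name, and toy sequence are hypothetical.

```python
import re

# Assumed reading: (TS) = T or S, (GA) = G or A, X = any residue.
SIGNATURES = {
    "consensus": re.compile(r"D[TS][GA]..H"),
    "WahF-like": re.compile(r"S[TS][GA]..H"),
}

def scan_signatures(protein):
    """Return (label, 1-based start, matched 6-mer), ordered by position."""
    hits = []
    for label, pattern in SIGNATURES.items():
        for m in pattern.finditer(protein):
            hits.append((label, m.start() + 1, m.group()))
    return sorted(hits, key=lambda hit: hit[1])

# Toy sequence carrying one motif of each kind (illustrative only).
found = scan_signatures("AADTGAAHAASSGKLHAA")
```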
Identification of IBV QX vaccine markers: Should vaccine acceptance by authorities require similar identifications for all live IBV vaccines?

PubMed

Listorti, Valeria; Laconi, Andrea; Catelli, Elena; Cecchinato, Mattia; Lupini, Caterina; Naylor, Clive J

2017-10-09

IBV genotype QX causes sufficient disease in Europe for several commercial companies to have started developing live attenuated vaccines. Here, one of those vaccines (L1148) was fully consensus sequenced alongside its progenitor field strain (1148-A) to determine vaccine markers, thereby enabling detection on farms. Twenty-eight single nucleotide substitutions were associated with the 1148-A attenuation, of which any combination can identify vaccine L1148 in the field. Sixteen substitutions resulted in amino acid coding changes of which half were in spike. One change in the 1b gene altered the normally highly conserved final 5 nucleotides of the transcription regulatory sequence of the S gene, common to all IBV QX genes. No mutations can currently be associated with the attenuation process. Field vaccination strategies would greatly benefit by such comparative sequence data being mandatorily submitted to regulators prior to vaccine release following a successful registration process. Copyright © 2017. Published by Elsevier Ltd.

Direct repeat sequences are essential for function of the cis-acting locus of transfer (clt) of Streptomyces phaeochromogenes plasmid pJV1.

PubMed

Franco, Bernardo; González-Cerón, Gabriela; Servín-González, Luis

2003-11-01

The functionality of direct and inverted repeat sequences inside the cis acting locus of transfer (clt) of the Streptomyces plasmid pJV1 was determined by testing the effect of different deletions on plasmid transfer. The results show that the single most important element for pJV1 clt function is a series of evenly spaced 9 bp long direct repeats which match the consensus CCGCACA(C/G)(C/G), since their deletion caused a dramatic reduction in plasmid transfer. The presence of these repeats in the absence of any other clt sequences allowed plasmid transfer to occur at a frequency that was at least two orders of magnitude higher than that obtained in the complete absence of clt. A database search revealed regions with a similar organization, and in the same position, in Streptomyces plasmids pSN22 and pSLS, which have transfer proteins homologous to those of pJV1.
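The degenerate consensus quoted in this abstract, CCGCACA(C/G)(C/G), maps directly onto a regular expression. The following is a minimal sketch of how one might scan a sequence for such 9 bp repeats and check their spacing; the function name and the demo sequence are hypothetical illustrations, not data from pJV1.

```python
import re

# Consensus from the abstract: CCGCACA(C/G)(C/G), a 9 bp direct repeat.
CLT_REPEAT = re.compile(r"CCGCACA[CG][CG]")

def find_repeats(seq):
    """Return 0-based start positions of every match of the 9 bp consensus."""
    return [m.start() for m in CLT_REPEAT.finditer(seq.upper())]

# Hypothetical demo sequence (NOT from pJV1): three repeats with 4 bp spacers.
demo = "TTCCGCACACGAAAACCGCACAGCAAAACCGCACACCTT"
starts = find_repeats(demo)

# Distances between consecutive repeat starts; equal values mean even spacing.
spacings = [b - a for a, b in zip(starts, starts[1:])]
print(starts, spacings)
```

Only the forward strand is scanned here; a fuller search would also scan the reverse complement.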
Molecular Diagnostic Analysis of Outbreak Scenarios

ERIC Educational Resources Information Center

Morsink, M. C.; Dekter, H. E.; Dirks-Mulder, A.; van Leeuwen, W. B.

2012-01-01

In the current laboratory assignment, technical aspects of the polymerase chain reaction (PCR) are integrated in the context of six different bacterial outbreak scenarios. The "Enterobacterial Repetitive Intergenic Consensus Sequence" (ERIC) PCR was used to analyze different outbreak scenarios. First, groups of 2-4 students determined optimal…
Characterization of the molecular chaperone calnexin in the channel catfish, Ictalurus punctatus, and its association with MHC class II molecules.

PubMed

Fuller, James R; Pitzer, Joshua E; Godwin, Ulla; Albertino, Mark; Machon, Benjamin D; Kearse, Kelly P; McConnell, Thomas J

2004-05-17

Folding and assembly of MHC molecules in mammals occurs in the endoplasmic reticulum (ER), but has not been studied in teleosts. Calnexin (CNX) is an ER chaperone that associates with glycoproteins bearing a monoglucosylated N-linked oligosaccharide side chain. Here we report the first identification and characterization of a full-length CNX cDNA clone in a teleost, and the association of the CNX chaperone with MHC class II in a channel catfish T cell line. The 1.8 kb CNX clone encodes a protein of 607 amino acids that is 72% identical to the consensus sequence of mammalian CNXs. The association of CNX with class II is of particular interest because the native MHC class II alpha chain of Ictalurus punctatus does not bear any N-linked oligosaccharide consensus glycosylation sequences. Thus the assembly of class II molecules in the catfish probably proceeds via different steps than occurs in mammals. Copyright 2003 Elsevier Ltd.
Screening of matrix metalloproteinases available from the protein data bank: insights into biological functions, domain organization, and zinc binding groups.

PubMed

Nicolotti, Orazio; Miscioscia, Teresa Fabiola; Leonetti, Francesco; Muncipinto, Giovanni; Carotti, Angelo

2007-01-01

A total of 142 matrix metalloproteinase (MMP) X-ray crystallographic structures were retrieved from the Protein Data Bank (PDB) and analyzed by an automated and efficient routine, developed in-house, with a series of bioinformatic tools. Highly informative heat maps and hierarchical clusterograms provided a reliable and comprehensive representation of the relationships existing among MMPs, enlarging and complementing the current knowledge in the field. Multiple sequence and structural alignments permitted better location and display of key MMP motifs and quantification of the residue consensus at each amino acid position in the most critical binding subsites of MMPs. The MMP active site consensus sequences, the C-alpha root-mean-square deviation (RMSd) analysis of diverse enzymatic subsites, and the examination of the chemical nature, binding topologies, and zinc binding groups (ZBGs) of ligands extracted from crystallographic complexes provided useful insights on the structural arrangements of the most potent MMP inhibitors.
Regulated expression of the Ras effector Rin1 in forebrain neurons

PubMed Central

Dzudzor, Bartholomew; Huynh, Lucia; Thai, Minh; Bliss, Joanne M.; Nagaoka, Yoshiko; Wang, Ying; Ch'ng, Toh Hean; Jiang, Meisheng; Martin, Kelsey C.; Colicelli, John

2009-01-01

The Ras effector Rin1 is induced concomitant with synaptogenesis in forebrain neurons, where it inhibits fear conditioning and amygdala LTP. In epithelial cells, lower levels of Rin1 orchestrate receptor endocytosis. A 945bp Rin1 promoter fragment was active in hippocampal neurons and directed accurate tissue-specific and temporal expression in transgenic mice. Regulated expression in neurons and epithelial cells was mediated in part by Snail transcriptional repressors: mutation of a conserved Snail site increased expression and endogenous Snai1 was detected at the Rin1 promoter. We also describe an element closely related to, but distinct from, the consensus site for REST, a master repressor of neuronal genes. Conversion to a consensus REST sequence reduced expression in both cell types. These results provide insight into regulated expression of a neuronal Ras effector, define a promoter useful in telencephalic neuron studies, and describe a novel REST site variant directing expression to mature neurons. PMID:19837165
Core-SINE blocks comprise a large fraction of monotreme genomes; implications for vertebrate chromosome evolution.

PubMed

Kirby, Patrick J; Greaves, Ian K; Koina, Edda; Waters, Paul D; Marshall Graves, Jennifer A

2007-01-01

The genomes of the egg-laying platypus and echidna are of particular interest because monotremes are the most basal mammal group. The chromosomal distribution of an ancient family of short interspersed repeats (SINEs), the core-SINEs, was investigated to better understand monotreme genome organization and evolution. Previous studies have identified the core-SINE as the predominant SINE in the platypus genome, and in this study we quantified, characterized and localized subfamilies. Dot blot analysis suggested that a very large fraction (32% of the platypus and 16% of the echidna genome) is composed of Mon core-SINEs. Core-SINE-specific primers were used to amplify PCR products from platypus and echidna genomic DNA. Sequence analysis suggests a common consensus sequence Mon 1-B, shared by platypus and echidna, as well as platypus-specific Mon 1-C and echidna specific Mon 1-D consensus sequences. FISH mapping of the Mon core-SINE products to platypus metaphase spreads demonstrates that the Mon 1-C subfamily is responsible for the striking Mon core-SINE accumulation in the distal regions of the six large autosomal pairs and the largest X chromosome. This unusual distribution highlights the dichotomy between the seven large chromosome pairs and the 19 smaller pairs in the monotreme karyotype, which has some similarity to the macro- and micro-chromosomes of birds and reptiles, and suggests that accumulation of repetitive sequences may have enlarged small chromosomes in an ancestral vertebrate. In the forthcoming sequence of the platypus genome there are still large gaps, and the extensive Mon core-SINE accumulation on the distal regions of the six large autosomal pairs may provide one explanation for this missing sequence.
Hepatitis delta genotypes in chronic delta infection in the northeast of Spain (Catalonia).

PubMed

Cotrina, M; Buti, M; Jardi, R; Quer, J; Rodriguez, F; Pascual, C; Esteban, R; Guardia, J

1998-06-01

Based on genetic analysis of variants obtained around the world, three genotypes of the hepatitis delta virus have been defined. Hepatitis delta virus variants have been associated with different disease patterns and geographic distributions. To determine the prevalence of hepatitis delta virus genotypes in the northeast of Spain (Catalonia) and the correlation with transmission routes and clinical disease, we studied the nucleotide divergence of the consensus sequence of HDV RNA obtained from 33 patients with chronic delta hepatitis (24 were intravenous drug users and nine had no risk factors), and four patients with acute self-limited delta infection. Serum HDV RNA was amplified by the polymerase chain reaction technique and a fragment of 350 nucleotides (nt 910 to 1259) was directly sequenced. Genetic analysis of the nucleotide consensus sequence obtained showed a high degree of conservation among sequences (mean identity of 93%). Comparison of these sequences with those derived from different geographic areas and pertaining to genotypes I, II and III, showed a mean sequence identity of 92% with genotype I, 73% with genotype II and 61% with genotype III. At the amino acid level (aa 115 to 214), the mean identity was 87% with genotype I, 63% with genotype II and 56% with genotype III. Conserved regions included the RNA editing domain, the carboxyl terminal 19 amino acids of the hepatitis delta antigen and the polyadenylation signal of the viral mRNA. Hepatitis delta virus isolates in the northeast of Spain are exclusively genotype I, independently of the transmission route and the type of infection. No hepatitis delta virus subgenotypes were found, suggesting that the origin of hepatitis delta virus infection in our geographical area is homogeneous.
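The genotype assignment above rests on mean sequence identity between a patient sequence and reference sequences of each genotype. A minimal sketch of that calculation follows, assuming the sequences are already aligned to equal length and that gapped columns are skipped. Both choices are assumptions, since the abstract does not state how identity was computed, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumption: columns with a gap ('-') in either sequence are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

def mean_identity(query, references):
    """Average percent identity of one query against a set of references."""
    return sum(percent_identity(query, r) for r in references) / len(references)

# Toy aligned sequences (hypothetical, not HDV data).
print(mean_identity("ACGTACGTAC", ["ACGTACGTAA", "ACGTACGTAC"]))
```

Under this scheme, a query would be assigned to whichever genotype's reference set gives the highest mean identity.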
Structure, replication efficiency and fragility of yeast ARS elements.

PubMed

Dhar, Manoj K; Sehgal, Shelly; Kaul, Sanjana

2012-05-01

DNA replication in eukaryotes initiates at specific sites known as origins of replication, or replicators. These replication origins occur throughout the genome, though the propensity of their occurrence depends on the type of organism. In eukaryotes, zones of initiation of replication spanning from about 100 to 50,000 base pairs have been reported. The characteristics of eukaryotic replication origins are best understood in the budding yeast Saccharomyces cerevisiae, where some autonomously replicating sequences, or ARS elements, confer origin activity. ARS elements are short DNA sequences of a few hundred base pairs, identified by their efficiency at initiating a replication event when cloned in a plasmid. ARS elements, although structurally diverse, maintain a basic structure composed of three domains, A, B and C. Domain A is comprised of a consensus sequence designated ACS (ARS consensus sequence), while the B domain has the DNA unwinding element and the C domain is important for DNA-protein interactions. Although there are ∼400 ARS elements in the yeast genome, not all of them are active origins of replication. Different groups within the genus Saccharomyces have ARS elements as components of replication origin. The present paper provides a comprehensive review of various aspects of ARSs, starting from their structural conservation to sequence thermodynamics. All significant and conserved functional sequence motifs within different types of ARS elements have been extensively described. Issues like silencing at ARSs, their inherent fragility and factors governing their replication efficiency have also been addressed. Progress in understanding crucial components associated with the replication machinery and timing at these ARS elements is discussed in the section entitled "The replicon revisited". Copyright © 2012 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
The short interspersed repetitive element of Trypanosoma cruzi, SIRE, is part of VIPER, an unusual retroelement related to long terminal repeat retrotransposons

PubMed Central

Vázquez, Martín; Ben-Dov, Claudia; Lorenzi, Hernan; Moore, Troy; Schijman, Alejandro; Levin, Mariano J.

2000-01-01

The short interspersed repetitive element (SIRE) of Trypanosoma cruzi was first detected when comparing the sequences of loci that encode the TcP2β genes. It is present in about 1,500–3,000 copies per genome, depending on the strain, and it is distributed in all chromosomes. An initial analysis of SIRE sequences from 21 genomic fragments allowed us to derive a consensus nucleotide sequence and structure for the element, consisting of three regions (I, II, and III) each harboring distinctive features. Analysis of 158 transcribed SIREs demonstrates that the consensus is highly conserved. The sequences of 51 cDNAs show that SIRE is included in the 3′ end of several mRNAs, always transcribed from the sense strand, contributing the polyadenylation site in 63% of the cases. This study led to the characterization of VIPER (vestigial interposed retroelement), a 2,326-bp-long unusual retroelement. VIPER's 5′ end is formed by the first 182 bp of SIRE, whereas its 3′ end is formed by the last 220 bp of the element. Both SIRE moieties are connected by a 1,924-bp-long fragment that carries a unique ORF encoding a complete reverse transcriptase-RNase H gene whose 15 C-terminal amino acids derive from codons specified by SIRE's region II. The amino acid sequence of VIPER's reverse transcriptase-RNase H shares significant homology to that of long terminal repeat retrotransposons. The fact that SIRE and VIPER sequences are found only in the T. cruzi genome may be of relevance for studies concerning the evolution and the genome flexibility of this protozoan parasite. PMID:10688909
srRNA evolution and phylogenetic relationships of the genus Naegleria (Protista: Rhizopoda).

PubMed

Baverstock, P R; Illana, S; Christy, P E; Robinson, B S; Johnson, A M

1989-05-01

A rapid RNA sequencing technique was used to partially sequence the small-subunit ribosomal RNA (srRNA) of four species of the amoeboid genus Naegleria. The extent of nucleotide sequence divergence between the two most divergent species was roughly similar to that found between mammals and frogs. However, the pattern of variation among the Naegleria species was quite different from that found for those species of tetrapods characterized to date. A phylogenetic analysis of the consensus Naegleria sequence showed that Naegleria was not monophyletic with either Acanthamoeba castellanii or Dictyostelium discoideum, two other amoebas for which sequences were available. It was shown that the semiconserved regions of the srRNA molecule evolve in a clocklike fashion and that the clock is time dependent rather than generation dependent.
Multiple mobile promoter regions for the rare carbapenem resistance gene of Bacteroides fragilis.

PubMed

Podglajen, I; Breuil, J; Rohaut, A; Monsempes, C; Collatz, E

2001-06-01

Two novel insertion sequences (IS), IS1187 and IS1188, are described upstream from the carbapenem resistance gene cfiA in strains of Bacteroides fragilis. Mapping, with the RACE procedure, of transcription start sites of cfiA in these and two other previously reported IS showed that transcription of this rarely encountered gene is initiated close to a variety of B. fragilis consensus promoter sequences, as recently defined (D. P. Bayley, E. R. Rocha, and C. J. Smith, FEMS Microbiol. Lett. 193:149-154, 2000). In the cases of IS1186 and IS1188, these sequences overlap with putative Esigma(70) promoter sequences, while in IS942 and IS1187 such sequences can be observed either upstream or downstream of the B. fragilis promoters.
Advanced evolutionary molecular engineering to produce thermostable cellulase by using a small but efficient library.

PubMed

Ito, Y; Ikeuchi, A; Imamura, C

2013-01-01

We aimed at constructing thermostable cellulase variants of cellobiohydrolase II, derived from the mesophilic fungus Phanerochaete chrysosporium, by using an advanced evolutionary molecular engineering method. By aligning the amino acid sequences of the catalytic domains of five thermophilic fungal CBH2 and PcCBH2 proteins, we identified 45 positions where the PcCBH2 genes differ from the consensus sequence of two to five thermophilic fungal CBH2s. PcCBH2 variants with the consensus mutations were obtained by a cell-free translation system that was chosen for easy evaluation of thermostability. From the small library of consensus mutations, advantageous mutations for improving thermostability were found to occur with much higher frequency relative to a random library. To further improve thermostability, advantageous mutations were accumulated within the wild-type gene. Finally, we obtained the most thermostable variant Mall4, which contained all 15 advantageous mutations found in this study. This variant had the same specific cellulase activity as the wild type and retained sufficient activity at 50°C for >72 h, whereas wild-type PcCBH2 retained much less activity under the same conditions. The history of the accumulation process indicated that evolution of PcCBH2 toward improved thermostability was ideally and rapidly accomplished through the evolutionary process employed in this study.
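The consensus-mutation idea in this abstract can be sketched as a column-by-column comparison: wherever enough of the thermophilic homologs agree on a residue that differs from the target, that substitution becomes a library candidate. The sketch below assumes pre-aligned sequences and reads "two to five" as a minimum support of two; the function name, the `min_support` parameter, and the toy sequences are hypothetical, not the authors' procedure or data.

```python
from collections import Counter

def consensus_mutations(target, homologs, min_support=2):
    """List substitutions (e.g. 'T3S') where the homolog consensus differs from target.

    Positions are 1-based alignment columns; gap characters ('-') are ignored.
    """
    if any(len(h) != len(target) for h in homologs):
        raise ValueError("all sequences must be aligned to the same length")
    mutations = []
    for i, t in enumerate(target):
        column = [h[i] for h in homologs if h[i] != "-"]
        if t == "-" or not column:
            continue
        residue, count = Counter(column).most_common(1)[0]
        if count >= min_support and residue != t:
            mutations.append(f"{t}{i + 1}{residue}")
    return mutations

# Toy alignment (hypothetical): one mesophilic target and five thermophilic homologs.
target = "MKTAWL"
homologs = ["MKSAYL", "MKSAFL", "MRSAYL", "MKTAFL", "MKSAYI"]
print(consensus_mutations(target, homologs))
```

Each reported substitution would then be built and screened individually, as the abstract describes for its small library.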
A consensus genetic map of cowpea [Vigna unguiculata (L) Walp.] and synteny based on EST-derived SNPs.

PubMed

Muchero, Wellington; Diop, Ndeye N; Bhat, Prasanna R; Fenton, Raymond D; Wanamaker, Steve; Pottorff, Marti; Hearne, Sarah; Cisse, Ndiaga; Fatokun, Christian; Ehlers, Jeffrey D; Roberts, Philip A; Close, Timothy J

2009-10-27

Consensus genetic linkage maps provide a genomic framework for quantitative trait loci identification, map-based cloning, assessment of genetic diversity, association mapping, and applied breeding in marker-assisted selection schemes. Among "orphan crops" with limited genomic resources such as cowpea [Vigna unguiculata (L.) Walp.] (2n = 2x = 22), the use of transcript-derived SNPs in genetic maps provides opportunities for automated genotyping and estimation of genome structure based on synteny analysis. Here, we report the development and validation of a high-throughput EST-derived SNP assay for cowpea, its application in consensus map building, and determination of synteny to reference genomes. SNP mining from 183,118 ESTs sequenced from 17 cDNA libraries yielded approximately 10,000 high-confidence SNPs from which an Illumina 1,536-SNP GoldenGate genotyping array was developed and applied to 741 recombinant inbred lines from six mapping populations. Approximately 90% of the SNPs were technically successful, providing 1,375 dependable markers. Of these, 928 were incorporated into a consensus genetic map spanning 680 cM with 11 linkage groups and an average marker distance of 0.73 cM. Comparison of this cowpea genetic map to reference legumes, soybean (Glycine max) and Medicago truncatula, revealed extensive macrosynteny encompassing 85 and 82%, respectively, of the cowpea map. Regions of soybean genome duplication were evident relative to the simpler diploid cowpea. Comparison with Arabidopsis revealed extensive genomic rearrangement with some conserved microsynteny. These results support evolutionary closeness between cowpea and soybean and identify regions for synteny-based functional genomics studies in legumes.
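The summary figures in this abstract can be checked with simple arithmetic. The sketch below assumes the success rate is dependable markers over array size and that the average marker distance is map length over mapped markers. The variable names are illustrative, and the authors' exact formulas are not given in the abstract.

```python
array_size = 1536        # SNPs on the GoldenGate array
dependable = 1375        # technically successful markers
mapped = 928             # markers placed on the consensus map
map_length_cm = 680.0    # total map length in cM

success_rate = dependable / array_size    # about 0.895, i.e. "approximately 90%"
mean_spacing = map_length_cm / mapped     # about 0.73 cM per marker

print(f"success rate: {success_rate:.1%}")
print(f"mean marker spacing: {mean_spacing:.2f} cM")
```

Both values agree with the reported "approximately 90%" and 0.73 cM.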
Transcriptome sequencing for high throughput SNP development and genetic mapping in Pea

PubMed Central

2014-01-01

Background: Pea has a complex genome of 4.3 Gb for which only limited genomic resources are available to date. Although SNP markers are now highly valuable for research and modern breeding, only a few are described and used in pea for genetic diversity and linkage analysis. Results: We developed a large resource by cDNA sequencing of 8 genotypes representative of modern breeding material using the Roche 454 technology, combining both long reads (400 bp) and high coverage (3.8 million reads, reaching a total of 1,369 megabases). Sequencing data were assembled and generated a 68 K unigene set, from which 41 K were annotated from their best blast hit against the model species Medicago truncatula. Annotated contigs showed an even distribution along M. truncatula pseudochromosomes, suggesting a good representation of the pea genome. 10 K pea contigs were found to be polymorphic among the genetic material surveyed, corresponding to 35 K SNPs. We validated a subset of 1538 SNPs through the GoldenGate assay, proving their ability to structure a diversity panel of breeding germplasm. Among them, 1340 were genetically mapped and used to build a new consensus map comprising a total of 2070 markers. Based on blast analysis, we could establish 1252 bridges between our pea consensus map and the pseudochromosomes of M. truncatula, which provides new insight on synteny between the two species. Conclusions: Our approach created significant new resources in pea, i.e. the most comprehensive genetic map to date tightly linked to the model species M. truncatula and a large SNP resource for both academic research and breeding. PMID:24521263
A novel thermophilic and halophilic esterase from Janibacter sp. R02, the first member of a new lipase family (Family XVII).

PubMed

Castilla, Agustín; Panizza, Paola; Rodríguez, Diego; Bonino, Luis; Díaz, Pilar; Irazoqui, Gabriela; Rodríguez Giordano, Sonia

2017-03-01

Janibacter sp. strain R02 (BNM 560) was isolated in our laboratory from an Antarctic soil sample. A remarkable trait of the strain was its high lipolytic activity, detected in Rhodamine-olive oil supplemented plates. Supernatants of Janibacter sp. R02 displayed superb activity on transesterification of acyl glycerols, thus being a good candidate for lipase prospection. Considering the lack of information concerning lipases of the genus Janibacter, we focused on the identification, cloning, expression and characterization of the extracellular lipases of this strain. By means of sequence alignment and clustering of consensus nucleotide sequences, a DNA fragment of 1272 bp was amplified, cloned and expressed in E. coli. The resulting recombinant enzyme, named LipJ2, showed preference for short to medium chain-length substrates, and displayed maximum activity at 80°C and pH 8-9, being strongly activated by a mixture of Na+ and K+. The enzyme presented an outstanding stability regarding both pH and temperature. Bioinformatics analysis of the amino acid sequence of LipJ2 revealed the presence of a consensus catalytic triad and a canonical pentapeptide. However, two additional rare motifs were found in LipJ2: an SXXL β-lactamase motif and two putative Y-type oxyanion holes (YAP). Although some of the previous features could allow assigning LipJ2 to the bacterial lipase families VIII or X, the phylogenetic analysis showed that LipJ2 clusters apart from other members of known lipase families, indicating that the newly isolated Janibacter esterase LipJ2 would be the first characterized member of a new family of bacterial lipases. Published by Elsevier Inc.
Specific Inhibition of the transcription factor Ci by a Cobalt(III)-Schiff base-DNA conjugate

PubMed Central

Hurtado, Ryan R.; Harney, Allison S.; Heffern, Marie C.; Holbrook, Robert J.; Holmgren, Robert A.; Meade, Thomas J.

2012-01-01

We describe the use of Co(III) Schiff base-DNA conjugates, a versatile class of research tools that target C2H2 transcription factors, to inhibit the Hedgehog (Hh) pathway. In developing mammalian embryos, Hh signaling is critical for the formation and development of many tissues and organs. Inappropriate activation of the Hedgehog (Hh) pathway has been implicated in a variety of cancers including medulloblastomas and basal cell carcinomas. It is well known that Hh regulates the activity of the Gli family of C2H2 zinc finger transcription factors in mammals. In Drosophila the function of the Gli proteins is performed by a single transcription factor with an identical DNA binding consensus sequence, Cubitus Interruptus (Ci). We have demonstrated previously that conjugation of a specific 17 base-pair oligonucleotide to a Co(III) Schiff base complex results in a targeted inhibitor of the Snail family C2H2 zinc finger transcription factors. Modification of the oligonucleotide sequence in the Co(III) Schiff base-DNA conjugate to that of Ci’s consensus sequence (Co(III)-Ci) generates an equally selective inhibitor of Ci. Co(III)-Ci irreversibly binds the Ci zinc finger domain and prevents it from binding DNA in vitro. In a Ci responsive tissue culture reporter gene assay, Co(III)-Ci reduces the transcriptional activity of Ci in a concentration dependent manner. In addition, injection of wild-type Drosophila embryos with Co(III)-Ci phenocopies a Ci loss of function phenotype, demonstrating effectiveness in vivo. This study provides evidence that Co(III) Schiff base-DNA conjugates are a versatile class of specific and potent tools for studying zinc finger domain proteins and have potential applications as customizable anti-cancer therapeutics. PMID:22214326
Characterization of clade 2.3.4.4 H5N8 highly pathogenic avian influenza viruses from wild birds possessing atypical hemagglutinin polybasic cleavage sites.

PubMed

Usui, Tatsufumi; Soda, Kosuke; Tomioka, Yukiko; Ito, Hiroshi; Yabuta, Toshiyo; Takakuwa, Hiroki; Otsuki, Koichi; Ito, Toshihiro; Yamaguchi, Tsuyoshi

2017-02-01

Since 2014, clade 2.3.4.4 H5 subtype highly pathogenic avian influenza viruses (HPAIVs) have been distributed worldwide. These viruses, which were reported to be highly virulent in chickens by intravenous inoculation, have a consensus HPAI motif PLRERRRKR at the HA cleavage site. However, two clade 2.3.4.4 H5N8 viruses which we isolated from wild migratory birds in late 2014 in Japan possessed atypical HA cleavage sequences. A swan isolate, Tottori/C6, had a novel polybasic cleavage sequence, PLGERRRKR, and another isolate from a dead mandarin duck, Gifu/01, had a heterogeneous mixture of consensus PLRERRRKR and variant PLRERRRRKR sequences. The polybasic HA cleavage site is the prime virulence determinant of AIVs. Therefore, in the present study, we examined the pathogenicity of these H5N8 isolates in chickens by intravenous inoculation. When 10^6 EID50 of these viruses were intravenously inoculated into chickens, the mean death time associated with Tottori/C6 was substantially longer (>6.1 days) than that associated with Gifu/01 (2.5 days). These viruses had comparable abilities to replicate in tissue culture cells in the presence and absence of exogenous trypsin, but the growth of Tottori/C6 was hampered. These results indicate that the novel cleavage motif of Tottori/C6 did not directly affect the infectivity of the virus, but Tottori/C6 caused attenuated pathogenicity in chickens because of hampered replication efficiency. It is important to test for the emergence of diversified HPAIVs, because introduction of HPAIVs with a lower virulence like Tottori/C6 might hinder early detection of affected birds in poultry farms.
Species classifier choice is a key consideration when analysing low-complexity food microbiome data.

PubMed

Walsh, Aaron M; Crispie, Fiona; O'Sullivan, Orla; Finnegan, Laura; Claesson, Marcus J; Cotter, Paul D

2018-03-20

The use of shotgun metagenomics to analyse low-complexity microbial communities in foods has the potential to be of considerable fundamental and applied value. However, there is currently no consensus with respect to choice of species classification tool, platform, or sequencing depth. Here, we benchmarked the performances of three high-throughput short-read sequencing platforms, the Illumina MiSeq, NextSeq 500, and Ion Proton, for shotgun metagenomics of food microbiota. Briefly, we sequenced six kefir DNA samples and a mock community DNA sample, the latter constructed by evenly mixing genomic DNA from 13 food-related bacterial species. A variety of bioinformatic tools were used to analyse the data generated, and the effects of sequencing depth on these analyses were tested by randomly subsampling reads. Compositional analysis results were consistent between the platforms at divergent sequencing depths. However, we observed pronounced differences in the predictions from species classification tools. Indeed, PERMANOVA indicated that there were no significant differences between the compositional results generated by the different sequencers (p = 0.693, R² = 0.011), but there was a significant difference between the results predicted by the species classifiers (p = 0.01, R² = 0.127). The relative abundances predicted by the classifiers, apart from MetaPhlAn2, were apparently biased by reference genome sizes. Additionally, we observed varying false-positive rates among the classifiers. MetaPhlAn2 had the lowest false-positive rate, whereas SLIMM had the greatest false-positive rate. Strain-level analysis results were also similar across platforms. Each platform correctly identified the strains present in the mock community, but accuracy was improved slightly with greater sequencing depth. Notably, PanPhlAn detected the dominant strains in each kefir sample above 500,000 reads per sample. Again, the outputs from functional profiling analysis using SUPER-FOCUS were generally accordant between the platforms at different sequencing depths. Finally, and expectedly, metagenome assembly completeness was significantly lower on the MiSeq than either on the NextSeq (p = 0.03) or the Proton (p = 0.011), and it improved with increased sequencing depth. Our results demonstrate a remarkable similarity in the results generated by the three sequencing platforms at different sequencing depths, and, in fact, the choice of bioinformatics methodology had a more evident impact on results than the choice of sequencer did.
Survey of gene splicing algorithms based on reads.

PubMed

Si, Xiuhua; Wang, Qian; Zhang, Lei; Wu, Ruo; Ma, Jiquan

2017-11-02

Gene splicing is the process of assembling a large number of unordered short sequence fragments to the original genome sequence as accurately as possible. Several popular splicing algorithms based on reads are reviewed in this article, including reference genome algorithms and de novo splicing algorithms (Greedy-extension, Overlap-Layout-Consensus graph, De Bruijn graph). We also discuss a new splicing method based on the MapReduce strategy and Hadoop. By comparing these algorithms, some conclusions are drawn and some suggestions on gene splicing research are made.
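Of the de novo approaches named in this abstract, the De Bruijn graph is the simplest to sketch. What the abstract calls splicing is more commonly called sequence assembly. The example below is a toy illustration under strong assumptions (error-free reads, no repeated (k-1)-mers, a single linear path); the function names and reads are hypothetical, and real assemblers handle branches, errors, and coverage.

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in some read."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)

def walk(graph):
    """Spell a contig from a graph that forms one unbranched, acyclic path."""
    edges = {node: sorted(set(nxt)) for node, nxt in graph.items()}
    targets = {t for nxt in edges.values() for t in nxt}
    starts = [n for n in edges if n not in targets]
    if len(starts) != 1:
        raise ValueError("expected exactly one start node")
    node, contig, seen = starts[0], starts[0], {starts[0]}
    while node in edges:
        if len(edges[node]) != 1:
            raise ValueError("branch found; graph is not a simple path")
        node = edges[node][0]
        if node in seen:
            raise ValueError("cycle found; graph is not a simple path")
        seen.add(node)
        contig += node[-1]   # each step extends the contig by one base
    return contig

# Toy overlapping reads (hypothetical) covering one short sequence.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
print(walk(de_bruijn(reads, k=4)))
```

Overlap-Layout-Consensus methods instead compare whole reads pairwise, which suits longer reads but scales less well with read count.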
Glutamate Receptor Aptamers and ALS

DTIC Science & Technology

2008-01-01

XXGATC ACC Consensus sequence RNA library Cloning and sequencing SELEX DIAGRAM Fig. 1. (a) Flow chart of SELEX. The library we used for SELEX...recording electrode and placed ~100 µm away from the hole. The linear flow rate of the solution is 1-4 cm/s. An optical fiber through which laser light for...channel recording (26) GluR6Q 1.1 × 104 4.2 × 102 Laser-pulse photolysis (67) 1.0 × 104 4.4 × 102 Fitting (52) 1.0 × 104 Flow measurement (46

Evidence of Divergent Amino Acid Usage in Comparative Analyses of R5- and X4-Associated HIV-1 Vpr Sequences

PubMed Central

Antell, Gregory C.; Zhong, Wen; Kercher, Katherine; Passic, Shendra; Williams, Jean; Liu, Yucheng; James, Tony; Jacobson, Jeffrey M.; Szep, Zsofia

2017-01-01

Vpr is an HIV-1 accessory protein that plays numerous roles during viral replication, some of which are cell type dependent. To test the hypothesis that HIV-1 tropism extends beyond the envelope into the vpr gene, studies were performed to identify the associations between coreceptor usage and Vpr variation in HIV-1-infected patients. Colinear HIV-1 Env-V3 and Vpr amino acid sequences were obtained from the LANL HIV-1 sequence database and from well-suppressed patients in the Drexel/Temple Medicine CNS AIDS Research and Eradication Study (CARES) Cohort. Genotypic classification of Env-V3 sequences as X4 (CXCR4-utilizing) or R5 (CCR5-utilizing) was used to group colinear Vpr sequences. To reveal the sequences associated with a specific coreceptor usage genotype, Vpr amino acid sequences were assessed for amino acid diversity and Jensen-Shannon divergence between the two groups. Five amino acid alphabets were used to comprehensively examine the impact of amino acid substitutions involving side chains with similar physicochemical properties. Positions 36, 37, 41, 89, and 96 of Vpr were characterized by statistically significant divergence across multiple alphabets when X4 and R5 sequence groups were compared. In addition, consensus amino acid switches were found at positions 37 and 41 in comparisons of the R5 and X4 sequence populations. These results suggest an evolutionary link between Vpr and gp120 in HIV-1-infected patients. PMID:28620613
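The per-position Jensen-Shannon divergence mentioned in the abstract above can be sketched as follows. This is a minimal illustration assuming two groups of residues observed at one alignment position and a fixed alphabet; the function names and the toy columns are hypothetical and do not reproduce the study's data or its reduced amino acid alphabets.

```python
import math
from collections import Counter


def column_distribution(residues, alphabet):
    """Relative frequency of each alphabet symbol among the residues at one position."""
    counts = Counter(residues)
    total = sum(counts[a] for a in alphabet)
    return [counts[a] / total for a in alphabet]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence between two distributions, bounded by 0 and 1 bit."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical residues at one position in two sequence groups.
alphabet = "IL"
group_a = column_distribution("IIII", alphabet)
group_b = column_distribution("LLLL", alphabet)
jsd_disjoint = jensen_shannon_divergence(group_a, group_b)
jsd_identical = jensen_shannon_divergence(group_a, group_a)
print(jsd_disjoint, jsd_identical)
```

A value near 1 bit indicates that the two groups use almost entirely different residues at that position, while 0 indicates identical usage.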
Genetic mapping and identification of QTL for earliness in the globe artichoke/cultivated cardoon complex

PubMed Central

2012-01-01

Background The Asteraceae species Cynara cardunculus (2n = 2x = 34) includes the two fully cross-compatible domesticated taxa globe artichoke (var. scolymus L.) and cultivated cardoon (var. altilis DC). As both are out-pollinators and suffer from marked inbreeding depression, linkage analysis has focussed on the use of a two-way pseudo-test cross approach. Results A set of 172 microsatellite (SSR) loci derived from expressed sequence tag DNA sequence were integrated into the reference C. cardunculus genetic maps, based on segregation among the F1 progeny of a cross between a globe artichoke and a cultivated cardoon. The resulting maps each detected 17 major linkage groups, corresponding to the species’ haploid chromosome number. A consensus map based on 66 co-dominant shared loci (64 SSRs and two SNPs) assembled 694 loci, with a mean inter-marker spacing of 2.5 cM. When the maps were used to elucidate the pattern of inheritance of head production earliness, a key commercial trait, seven regions were shown to harbour relevant quantitative trait loci (QTL). Together, these QTL accounted for up to 74% of the overall phenotypic variance. Conclusion The newly developed consensus as well as the parental genetic maps can accelerate the process of tagging and eventually isolating the genes underlying earliness in both the domesticated C. cardunculus forms. The largest single effect mapped to the same linkage group in each parental map, and explained about one half of the phenotypic variance, thus representing a good candidate for marker assisted selection. PMID:22621324
The diploid genome sequence of an Asian individual

PubMed Central

Wang, Jun; Wang, Wei; Li, Ruiqiang; Li, Yingrui; Tian, Geng; Goodman, Laurie; Fan, Wei; Zhang, Junqing; Li, Jun; Zhang, Juanbin; Guo, Yiran; Feng, Binxiao; Li, Heng; Lu, Yao; Fang, Xiaodong; Liang, Huiqing; Du, Zhenglin; Li, Dong; Zhao, Yiqing; Hu, Yujie; Yang, Zhenzhen; Zheng, Hancheng; Hellmann, Ines; Inouye, Michael; Pool, John; Yi, Xin; Zhao, Jing; Duan, Jinjie; Zhou, Yan; Qin, Junjie; Ma, Lijia; Li, Guoqing; Yang, Zhentao; Zhang, Guojie; Yang, Bin; Yu, Chang; Liang, Fang; Li, Wenjie; Li, Shaochuan; Li, Dawei; Ni, Peixiang; Ruan, Jue; Li, Qibin; Zhu, Hongmei; Liu, Dongyuan; Lu, Zhike; Li, Ning; Guo, Guangwu; Zhang, Jianguo; Ye, Jia; Fang, Lin; Hao, Qin; Chen, Quan; Liang, Yu; Su, Yeyang; san, A.; Ping, Cuo; Yang, Shuang; Chen, Fang; Li, Li; Zhou, Ke; Zheng, Hongkun; Ren, Yuanyuan; Yang, Ling; Gao, Yang; Yang, Guohua; Li, Zhuo; Feng, Xiaoli; Kristiansen, Karsten; Wong, Gane Ka-Shu; Nielsen, Rasmus; Durbin, Richard; Bolund, Lars; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian

2009-01-01

Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics. PMID:18987735
ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos.

PubMed

Roca, Alberto I

2014-01-01

The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org.
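The matrix of residue frequencies per aligned position that the abstract above describes can be computed along the following lines. This is a minimal sketch assuming a gap-free toy alignment of equal-length sequences; the function name and data are hypothetical and this is not the JProfileGrid implementation.

```python
from collections import Counter


def residue_frequency_matrix(alignment):
    """Return {residue: [frequency at column 0, column 1, ...]} for equal-length aligned sequences."""
    n_seqs = len(alignment)
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    residues = sorted(set("".join(alignment)))
    matrix = {r: [0.0] * n_cols for r in residues}
    for col in range(n_cols):
        counts = Counter(seq[col] for seq in alignment)
        for residue, count in counts.items():
            matrix[residue][col] = count / n_seqs
    return matrix


# Hypothetical three-column protein alignment.
alignment = ["MKV", "MRV", "MKI", "MKV"]
grid = residue_frequency_matrix(alignment)
for residue, row in grid.items():
    print(residue, row)
```

Each row can then be colour-coded by frequency to obtain a grid-style view; a consensus sequence corresponds to the highest-frequency residue in each column.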
Bioinformatic flowchart and database to investigate the origins and diversity of Clan AA peptidases

PubMed Central

Llorens, Carlos; Futami, Ricardo; Renaud, Gabriel; Moya, Andrés

2009-01-01

Background Clan AA of aspartic peptidases relates the family of pepsin monomers evolutionarily with all dimeric peptidases encoded by eukaryotic LTR retroelements. Recent findings describing various pools of single-domain nonviral host peptidases, in prokaryotes and eukaryotes, indicate that the diversity of clan AA is larger than previously thought. The ensuing approach to investigate this enzyme group is by studying its phylogeny. However, clan AA is a difficult case to study due to the low similarity and different rates of evolution. This work is an ongoing attempt to investigate the different clan AA families to understand the cause of their diversity. Results In this paper, we describe an in-progress database and bioinformatic flowchart designed to characterize the clan AA protein domain based on all possible protein families through ancestral reconstructions, sequence logos, and hidden Markov models (HMMs). The flowchart includes the characterization of a major consensus sequence based on 6 amino acid patterns with correspondence with Andreeva's model, the structural template describing the clan AA peptidase fold. The set of tools is a work in progress that we have organized in a database within the GyDB project, referred to as the Clan AA Reference Database. Conclusion The pre-existing classification combined with the evolutionary history of LTR retroelements permits a consistent taxonomical collection of sequence logos and HMMs. This set is useful for gene annotation but also a reference to evaluate the diversity of, and the relationships among, the different families. Comparisons among HMMs suggest a common ancestor for all dimeric clan AA peptidases that is halfway between single-domain nonviral peptidases and those coded by Ty3/Gypsy LTR retroelements. Sequence logos reveal how all clan AA families follow similar protein domain architecture related to the peptidase fold. In particular, each family nucleates a particular consensus motif in the sequence position related to the flap. The different motifs constitute a network where an alanine-asparagine-like variable motif predominates, instead of the canonical flap of the HIV-1 peptidase and closer relatives. Reviewers This article was reviewed by Daniel H. Haft, Vladimir Kapitonov (nominated by Jerry Jurka), and Ben M. Dunn (nominated by Claus Wilke). PMID:19173708
Combination of the immunization with the sequence close to the consensus sequence and two DNA prime plus one VLP boost generate H5 hemagglutinin specific broad neutralizing antibodies

PubMed Central

Wang, Guiqin; Yin, Renfu; Zhou, Paul; Ding, Zhuang

2017-01-01

Hemagglutinin (HA) head has long been considered to be able to elicit only a narrow, strain-specific antibody response as it undergoes rapid antigenic drift. However, we previously showed that a heterologous prime-boost strategy, in which mice were primed twice with DNA encoding HA and boosted once with virus-like particles (VLP) from an H5N1 strain A/Thailand/1(KAN)-1/2004 (noted as TH DDV), induced an anti-head, broad cross-H5 neutralizing antibody response. To explain why TH DDV immunization could generate such breadth, we systematically compared the neutralization breadth and potency between TH DDV sera and immune sera elicited by TH DDD (three DNA immunizations), TH VVV (three VLP immunizations), TH DV (one DNA prime plus one VLP boost) and TK DDV (plasmid DNA and VLP derived from another H5N1 strain, A/Turkey/65596/2006). Then we determined the antigenic sites (AS) on TH HA head and the key residues of the main antigenic site. Through the comparison of different regimens, we found that the combination of immunization with a sequence close to the consensus sequence and two DNA primes plus one VLP boost enabled TH DDV immunization to generate broad neutralizing antibodies. Antigenic analysis showed that TH DDV, TH DV, TH DDD and TH VVV sera recognize the common antigenic site AS1. Antibodies directed to AS1 contribute the largest proportion of the neutralizing activity of these immune sera. Residues 188 and 193 in AS1 are the key residues responsible for the neutralization breadth of the immune sera. Interestingly, residues 188 and 193 are located in classical antigenic sites but are relatively conserved among the 16 tested strains and 1,663 HA sequences from the NCBI database. Thus, our results strongly indicate that it is feasible to develop broad cross-H5 influenza vaccines against HA head. PMID:28542275
In vivo binding of PRDM9 reveals interactions with noncanonical genomic sites

PubMed Central

Grey, Corinne; Clément, Julie A.J.; Buard, Jérôme; Leblanc, Benjamin; Gut, Ivo; Gut, Marta; Duret, Laurent

2017-01-01

In mouse and human meiosis, DNA double-strand breaks (DSBs) initiate homologous recombination and occur at specific sites called hotspots. The localization of these sites is determined by the sequence-specific DNA binding domain of the PRDM9 histone methyl transferase. Here, we performed an extensive analysis of PRDM9 binding in mouse spermatocytes. Unexpectedly, we identified a noncanonical recruitment of PRDM9 to sites that lack recombination activity and the PRDM9 binding consensus motif. These sites include gene promoters, where PRDM9 is recruited in a DSB-dependent manner. Another subset reveals DSB-independent interactions between PRDM9 and genomic sites, such as the binding sites for the insulator protein CTCF. We propose that these DSB-independent sites result from interactions between hotspot-bound PRDM9 and genomic sequences located on the chromosome axis. PMID:28336543
An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge

PubMed Central

2014-01-01

Background There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance. Results A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity of the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization. Conclusions The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups. PMID:24667040
A microsatellite-based consensus linkage map for species of Eucalyptus and a novel set of 230 microsatellite markers for the genus

PubMed Central

Brondani, Rosana PV; Williams, Emlyn R; Brondani, Claudio; Grattapaglia, Dario

2006-01-01

Background Eucalypts are the most widely planted hardwood trees in the world occupying globally more than 18 million hectares as an important source of carbon neutral renewable energy and raw material for pulp, paper and solid wood. Quantitative Trait Loci (QTLs) in Eucalyptus have been localized on pedigree-specific RAPD or AFLP maps seriously limiting the value of such QTL mapping efforts for molecular breeding. The availability of a genus-wide genetic map with transferable microsatellite markers has become a must for the effective advancement of genomic undertakings. This report describes the development of a novel set of 230 EMBRA microsatellites, the construction of the first comprehensive microsatellite-based consensus linkage map for Eucalyptus and the consolidation of existing linkage information for other microsatellites and candidate genes mapped in other species of the genus. Results The consensus map covers ~90% of the recombining genome of Eucalyptus, involves 234 mapped EMBRA loci on 11 linkage groups, an observed length of 1,568 cM and a mean distance between markers of 8.4 cM. A compilation of all microsatellite linkage information published in Eucalyptus allowed us to establish the homology among linkage groups between this consensus map and other maps published for E. globulus. Comparative mapping analyses also resulted in the linkage group assignment of 41 other microsatellites derived from other Eucalyptus species as well as candidate genes and QTLs for wood and flowering traits published in the literature. This report significantly increases the availability of microsatellite markers and mapping information for species of Eucalyptus and corroborates the high conservation of microsatellite flanking sequences and locus ordering between species of the genus. Conclusion This work represents an important step forward for Eucalyptus comparative genomics, opening stimulating perspectives for evolutionary studies and molecular breeding applications. The generalized use of an increasingly larger set of interspecific transferable markers and consensus mapping information will allow faster and more detailed investigations of QTL synteny among species, validation of expression-QTL across variable genetic backgrounds and positioning of a growing number of candidate genes co-localized with QTLs, to be tested in association mapping experiments. PMID:16995939
Characterization of rabies virus from a human case in Nepal.

PubMed

Pant, G R; Horton, D L; Dahal, M; Rai, J N; Ide, S; Leech, S; Marston, D A; McElhinney, L M; Fooks, A R

2011-04-01

Rabies is endemic throughout most of Asia, with the majority of human cases transmitted by domestic dogs (Canis familiaris). Here, we report a case of rabies in a 12-year-old girl in the Lalitpur district of Nepal that might have been prevented by better public awareness and timely post-exposure prophylaxis. Molecular characterization of the virus showed 100% identity over a partial nucleoprotein gene sequence to previous isolates from Nepal belonging to the 'arctic-like' lineage of rabies virus. Sequence analysis of both partial nucleoprotein and glycoprotein genes showed differences in consensus sequence after passage in vitro but not after passage in vivo.
EST-derived SSR markers used as anchor loci for the construction of a consensus linkage map in ryegrass (Lolium spp.)

PubMed Central

2010-01-01

Background Genetic markers and linkage mapping are basic prerequisites for marker-assisted selection and map-based cloning. In the case of the key grassland species Lolium spp., numerous mapping populations have been developed and characterised for various traits. Although some genetic linkage maps of these populations have been aligned with each other using publicly available DNA markers, the number of common markers among genetic maps is still low, limiting the ability to compare candidate gene and QTL locations across germplasm. Results A set of 204 expressed sequence tag (EST)-derived simple sequence repeat (SSR) markers has been assigned to map positions using eight different ryegrass mapping populations. Marker properties of a subset of 64 EST-SSRs were assessed in six to eight individuals of each mapping population and revealed 83% of the markers to be polymorphic in at least one population and an average number of alleles of 4.88. EST-SSR markers polymorphic in multiple populations served as anchor markers and allowed the construction of the first comprehensive consensus map for ryegrass. The integrated map was complemented with 97 SSRs from previously published linkage maps and finally contained 284 EST-derived and genomic SSR markers. The total map length was 742 centiMorgan (cM), ranging for individual chromosomes from 70 cM of linkage group (LG) 6 to 171 cM of LG 2. Conclusions The consensus linkage map for ryegrass based on eight mapping populations and constructed using a large set of publicly available Lolium EST-SSRs mapped for the first time together with previously mapped SSR markers will allow for consolidating existing mapping and QTL information in ryegrass. Map and markers presented here will prove to be an asset in the development for both molecular breeding of ryegrass as well as comparative genetics and genomics within grass species. PMID:20712870
DNA breathing dynamics distinguish binding from nonbinding consensus sites for transcription factor YY1 in cells.

PubMed

Alexandrov, Boian S; Fukuyo, Yayoi; Lange, Martin; Horikoshi, Nobuo; Gelev, Vladimir; Rasmussen, Kim Ø; Bishop, Alan R; Usheva, Anny

2012-11-01

The genome-wide mapping of the major gene expression regulators, the transcription factors (TFs) and their DNA binding sites, is of great importance for describing cellular behavior and phenotypic diversity. Presently, the methods for prediction of genomic TF binding produce a large number of false positives, most likely due to insufficient description of the physicochemical mechanisms of protein-DNA binding. Growing evidence suggests that, in the cell, the double-stranded DNA (dsDNA) is subject to local transient strand separations (breathing) that contribute to genomic functions. By using site-specific chromatin immunoprecipitations, gel shifts, BIOBASE data, and our model that accurately describes the melting behavior and breathing dynamics of dsDNA, we report a specific DNA breathing profile found at YY1 binding sites in cells. We find that the genomic flanking sequence variations and SNPs may exert long-range effects on DNA dynamics and predetermine YY1 binding. The ubiquitous TF YY1 has a fundamental role in essential biological processes by activating, initiating or repressing transcription depending upon the sequence context it binds. We anticipate that consensus binding sequences together with the related DNA dynamics profile may significantly improve the accuracy of genomic TF binding sites and TF binding-related functional SNPs.
Application of phage display for the development of a novel inhibitor of PLA2 activity in Western cottonmouth venom

PubMed Central

Titus, James K; Kay, Matthew K; Glaser, CDR Jacob J

2017-01-01

Snakebite envenomation is an important global health concern. The current standard treatment approach for snakebite envenomation relies on antibody-based antisera, which are expensive, not universally available, and can lead to adverse physiological effects. Phage display techniques offer a powerful tool for the selection of phage-expressed peptides, which can bind with high specificity and affinity towards venom components. In this research, the amino acid sequences of Phospholipase A2 (PLA2) from multiple cottonmouth species were analyzed, and a consensus peptide synthesized. Three phage display libraries were panned against this consensus peptide, crosslinked to capillary tubes, followed by a modified surface panning procedure. This high throughput selection method identified four phage clones with anti-PLA2 activity against Western cottonmouth venom, and the amino acid sequences of the displayed peptides were identified. This is the first report identifying short peptide sequences capable of inhibiting PLA2 activity of Western cottonmouth venom in vitro, using a phage display technique. Additionally, this report utilizes synthetic panning targets, designed using venom proteomic data, to mimic epitope regions. M13 phages displaying circular 7-mer or linear 12-mer peptides with antivenom activity may offer a novel alternative to traditional antibody-based therapy. PMID:29285351
Application of phage display for the development of a novel inhibitor of PLA2 activity in Western cottonmouth venom.

PubMed

Titus, James K; Kay, Matthew K; Glaser, Cdr Jacob J

2017-01-01

Snakebite envenomation is an important global health concern. The current standard treatment approach for snakebite envenomation relies on antibody-based antisera, which are expensive, not universally available, and can lead to adverse physiological effects. Phage display techniques offer a powerful tool for the selection of phage-expressed peptides, which can bind with high specificity and affinity towards venom components. In this research, the amino acid sequences of Phospholipase A2 (PLA2) from multiple cottonmouth species were analyzed, and a consensus peptide synthesized. Three phage display libraries were panned against this consensus peptide, crosslinked to capillary tubes, followed by a modified surface panning procedure. This high throughput selection method identified four phage clones with anti-PLA2 activity against Western cottonmouth venom, and the amino acid sequences of the displayed peptides were identified. This is the first report identifying short peptide sequences capable of inhibiting PLA2 activity of Western cottonmouth venom in vitro, using a phage display technique. Additionally, this report utilizes synthetic panning targets, designed using venom proteomic data, to mimic epitope regions. M13 phages displaying circular 7-mer or linear 12-mer peptides with antivenom activity may offer a novel alternative to traditional antibody-based therapy.
The General Definition of the p97/Valosin-containing Protein (VCP)-interacting Motif (VIM) Delineates a New Family of p97 Cofactors*

PubMed Central

Stapf, Christopher; Cartwright, Edward; Bycroft, Mark; Hofmann, Kay; Buchberger, Alexander

2011-01-01

Cellular functions of the essential, ubiquitin-selective AAA ATPase p97/valosin-containing protein (VCP) are controlled by regulatory cofactors determining substrate specificity and fate. Most cofactors bind p97 through a ubiquitin regulatory X (UBX) or UBX-like domain or linear sequence motifs, including the hitherto ill defined p97/VCP-interacting motif (VIM). Here, we present the new, minimal consensus sequence RX5AAX2R as a general definition of the VIM that unites a novel family of known and putative p97 cofactors, among them UBXD1 and ZNF744/ANKZF1. We demonstrate that this minimal VIM consensus sequence is necessary and sufficient for p97 binding. Using NMR chemical shift mapping, we identified several residues of the p97 N-terminal domain (N domain) that are critical for VIM binding. Importantly, we show that cellular stress resistance conferred by the yeast VIM-containing cofactor Vms1 depends on the physical interaction between its VIM and the critical N domain residues of the yeast p97 homolog, Cdc48. Thus, the VIM-N domain interaction characterized in this study is required for the physiological function of Vms1 and most likely other members of the newly defined VIM family of cofactors. PMID:21896481
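A consensus such as RX5AAX2R can be turned into a simple pattern scan. The sketch below assumes that X stands for any residue and that the digits are repeat counts (five and two arbitrary residues), which is the usual reading of this notation but is not spelled out in the abstract. The function name and the toy sequence are hypothetical; the sequence is not a real p97 cofactor.

```python
import re

# R, five arbitrary residues, AA, two arbitrary residues, R.
# The lookahead allows overlapping candidate matches to be reported.
VIM_PATTERN = re.compile(r"(?=(R.{5}AA.{2}R))")


def find_vim_matches(protein_sequence):
    """Return (start_index, matched_substring) for every candidate motif occurrence."""
    return [(m.start(), m.group(1)) for m in VIM_PATTERN.finditer(protein_sequence)]


# Invented toy sequence containing one occurrence of the pattern.
toy_sequence = "MSRKLEDLAAQERGT"
matches = find_vim_matches(toy_sequence)
print(matches)
```

A pattern hit only flags a candidate; the abstract's claim that the motif is necessary and sufficient for binding rests on the authors' experiments, not on sequence matching.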
Molecular cloning of actin genes in Trichomonas vaginalis and phylogeny inferred from actin sequences.

PubMed

Bricheux, G; Brugerolle, G

1997-08-01

The parasitic protozoan Trichomonas vaginalis is known to contain the ubiquitous and highly conserved protein actin. A genomic library and a cDNA library have been screened to identify and clone the actin gene(s) of T. vaginalis. The nucleotide sequence of one gene and its flanking regions have been determined. The open reading frame encodes a protein of 376 amino acids. The sequence is not interrupted by any introns and the promoter could be represented by a 10 bp motif close to a consensus motif also found upstream of most sequenced T. vaginalis genes. The five different clones isolated from the cDNA library have similar sequences and encode three actin proteins differing only by one or two amino acids. A phylogenetic analysis of 31 actin sequences by distance matrix and parsimony methods, using centractin as outgroup, gives congruent trees with Parabasala branching above Diplomonadida.
Cloning and sequencing of a laccase gene from the lignin-degrading basidiomycete Pleurotus ostreatus.

PubMed Central

Giardina, P; Cannio, R; Martirani, L; Marzullo, L; Palmieri, G; Sannia, G

1995-01-01

The gene (pox1) encoding a phenol oxidase from Pleurotus ostreatus, a lignin-degrading basidiomycete, was cloned and sequenced, and the corresponding pox1 cDNA was also synthesized and sequenced. The isolated gene consists of 2,592 bp, with the coding sequence being interrupted by 19 introns and flanked by an upstream region in which putative CAAT and TATA consensus sequences could be identified at positions -174 and -84, respectively. The isolation of a second cDNA (pox2 cDNA), showing 84% similarity, and of the corresponding truncated genomic clones demonstrated the existence of a multigene family coding for isoforms of laccase in P. ostreatus. PCR amplifications of specific regions on the DNA of isolated monokaryons proved that the two genes are not allelic forms. The POX1 amino acid sequence deduced was compared with those of other known laccases from different fungi. PMID:7793961
Development of the first consensus genetic map of intermediate wheatgrass (Thinopyrum intermedium) using genotyping-by-sequencing

USDA-ARS's Scientific Manuscript database

Intermediate wheatgrass (Thinopyrum intermedium) has been identified as a candidate for domestication and improvement as a perennial grain, forage, and biofuel crop by several active breeding programs. To accelerate this process using genomics-assisted breeding, efficient genotyping methods and gen...
Development and Implementation of High-Throughput SNP Genotyping in Barley

USDA-ARS's Scientific Manuscript database

Approximately 22,000 SNPs were identified from barley ESTs and sequenced amplicons; 4,596 of them were tested for performance in three pilot phase Illumina GoldenGate assays. Pilot phase data from three barley doubled haploid mapping populations supported the production of an initial consensus map, ...
Mosaic protein and nucleic acid vaccines against hepatitis C virus

DOEpatents

Yusim, Karina; Korber, Bette T. M.; Kuiken, Carla L.; Fischer, William M.

2013-06-11

The invention relates to immunogenic compositions useful as HCV vaccines. Provided are HCV mosaic polypeptide and nucleic acid compositions which provide higher levels of T-cell epitope coverage while minimizing the occurrence of unnatural and rare epitopes compared to natural HCV polypeptides and consensus HCV sequences.

Identification of multiple binding sites for the THAP domain of the Galileo transposase in the long terminal inverted-repeats

PubMed Central

Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald

2013-01-01

Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity. PMID:23648487
Identification of multiple binding sites for the THAP domain of the Galileo transposase in the long terminal inverted-repeats.

PubMed

Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald

2013-08-01

Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity.
Metagenome assembly through clustering of next-generation sequencing data using protein sequences.

PubMed

Sim, Mikang; Kim, Jaebum

2015-02-01

The study of environmental microbial communities, called metagenomics, has gained a lot of attention because of the recent advances in next-generation sequencing (NGS) technologies. Microbes play a critical role in changing their environments, and the way they do so can be elucidated by investigating metagenomes. However, the complexity of metagenomes, such as the mixture of multiple microbes and differing species abundances, makes metagenome assembly more challenging. In this paper, we developed a new metagenome assembly method that uses protein sequences in addition to the NGS read sequences. Our method (i) builds read clusters by using mapping information against available protein sequences, and (ii) creates contig sequences by finding consensus sequences through probabilistic choices from the read clusters. By using simulated NGS read sequences from real microbial genome sequences, we evaluated our method in comparison with four existing assembly programs. We found that our method could generate relatively long and accurate metagenome assemblies, indicating that the idea of using protein sequences as a guide for the assembly is promising. Copyright © 2015 Elsevier B.V. All rights reserved.
Minimum Information for Reporting Next Generation Sequence Genotyping (MIRING): Guidelines for Reporting HLA and KIR Genotyping via Next Generation Sequencing

PubMed Central

Mack, Steven J.; Milius, Robert P.; Gifford, Benjamin D.; Sauter, Jürgen; Hofmann, Jan; Osoegawa, Kazutoyo; Robinson, James; Groeneweg, Mathijs; Turenchalk, Gregory S.; Adai, Alex; Holcomb, Cherie; Rozemuller, Erik H.; Penning, Maarten T.; Heuer, Michael L.; Wang, Chunlin; Salit, Marc L.; Schmidt, Alexander H.; Parham, Peter R.; Müller, Carlheinz; Hague, Tim; Fischer, Gottfried; Fernandez-Viña, Marcelo; Hollenbach, Jill A; Norman, Paul J.; Maiers, Martin

2015-01-01

The development of next-generation sequencing (NGS) technologies for HLA and KIR genotyping is rapidly advancing knowledge of genetic variation of these highly polymorphic loci. NGS genotyping is poised to replace older methods for clinical use, but standard methods for reporting and exchanging these new, high quality genotype data are needed. The Immunogenomic NGS Consortium, a broad collaboration of histocompatibility and immunogenetics clinicians, researchers, instrument manufacturers and software developers, has developed the Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines. MIRING is a checklist that specifies the content of NGS genotyping results as well as a set of messaging guidelines for reporting the results. A MIRING message includes five categories of structured information – message annotation, reference context, full genotype, consensus sequence and novel polymorphism – and references to three categories of accessory information – NGS platform documentation, read processing documentation and primary data. These eight categories of information ensure the long-term portability and broad application of this NGS data for all current histocompatibility and immunogenetics use cases. In addition, MIRING can be extended to allow the reporting of genotype data generated using pre-NGS technologies. Because genotyping results reported using MIRING are easily updated in accordance with reference and nomenclature databases, MIRING represents a bold departure from previous methods of reporting HLA and KIR genotyping results, which have provided static and less-portable data. More information about MIRING can be found online at miring.immunogenomics.org. PMID:26407912
Molecular characterization of Hepatozoon sp. from Brazilian dogs and its phylogenetic relationship with other Hepatozoon spp.

PubMed

Forlano, M D; Teixeira, K R S; Scofield, A; Elisei, C; Yotoko, K S C; Fernandes, K R; Linhares, G F C; Ewing, S A; Massard, C L

2007-04-10

To characterize phylogenetically the species which causes canine hepatozoonosis in two rural areas of Rio de Janeiro State, Brazil, we used universal or Hepatozoon spp. primer sets for the 18S SSU rRNA coding region. DNA extracts were obtained from blood samples of thirteen naturally infected dogs, four experimentally infected dogs, and five puppies infected by vertical transmission from an experimentally infected dam. DNA of sporozoites of Hepatozoon americanum was used as positive control. The amplification of DNA extracts from blood of dogs infected with sporozoites of Hepatozoon spp. was observed in the presence of primers to 18S SSU rRNA gene of Hepatozoon spp., whereas DNA of H. americanum sporozoites was amplified in the presence of either universal or Hepatozoon spp.-specific primer sets; the amplified products were approximately 600 bp in size. Cloned PCR products obtained from DNA extracts of blood from two dogs experimentally infected with Hepatozoon sp. were sequenced. The consensus sequence, derived from six sequence data sets, was blasted against sequences of 18S SSU rRNA of Hepatozoon spp. available at GenBank and aligned to homologous sequences to perform the phylogenetic analysis. This analysis clearly showed that our sequence clustered, independently of H. americanum sequences, within a group comprising other Hepatozoon canis sequences. Our results confirmed the hypothesis that the agent causing hepatozoonosis in the areas studied in Brazil is H. canis, supporting previous reports that were based on morphological and morphometric analyses.
Cloning and characterization of the gene encoding IMP dehydrogenase from Arabidopsis thaliana.

PubMed

Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E

1996-10-03

We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Arabidopsis thaliana (At). The transcription unit of the At gene spans approximately 1900 bp and specifies a protein of 503 amino acids with a calculated relative molecular mass (M(r)) of 54,190. The gene is comprised of a minimum of four introns and five exons with all donor and acceptor splice sequences conforming to previously proposed consensus sequences. The deduced IMPDH amino-acid sequence from At shows a remarkable similarity to other eukaryotic IMPDH sequences, with a 48% identity to human Type II enzyme. Allowing for conservative substitutions, the enzyme is 69% similar to human Type II IMPDH. The putative active-site sequence of At IMPDH conforms to the IMP dehydrogenase/guanosine monophosphate reductase motif and contains an essential active-site cysteine residue.
Sequence specificity of single-stranded DNA-binding proteins: a novel DNA microarray approach

PubMed Central

Morgan, Hugh P.; Estibeiro, Peter; Wear, Martin A.; Max, Klaas E.A.; Heinemann, Udo; Cubeddu, Liza; Gallagher, Maurice P.; Sadler, Peter J.; Walkinshaw, Malcolm D.

2007-01-01

We have developed a novel DNA microarray-based approach for identification of the sequence-specificity of single-stranded nucleic-acid-binding proteins (SNABPs). For verification, we have shown that the major cold shock protein (CspB) from Bacillus subtilis binds with high affinity to pyrimidine-rich sequences, with a binding preference for the consensus sequence, 5′-GTCTTTG/T-3′. The sequence was modelled onto the known structure of CspB and a cytosine-binding pocket was identified, which explains the strong preference for a cytosine base at position 3. This microarray method offers a rapid high-throughput approach for determining the specificity and strength of ss DNA–protein interactions. Further screening of this newly emerging family of transcription factors will help provide an insight into their cellular function. PMID:17488853
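The consensus written as 5′-GTCTTTG/T-3′ denotes a seven-base motif whose last position may be G or T. A minimal sketch of scanning a single-stranded sequence for that motif is given below; the function name `find_cspb_sites` and the test sequence are hypothetical illustrations and are not taken from the paper, which measured binding on a microarray rather than by string search.

```python
import re

# Consensus 5'-GTCTTTG/T-3': six fixed bases followed by G or T.
# The lookahead lets overlapping occurrences be reported as well.
CSPB_CONSENSUS = re.compile(r"(?=(GTCTTT[GT]))")


def find_cspb_sites(seq):
    """Return (start, matched_site) pairs for each consensus hit in seq.

    Only the given strand is scanned, since the motif describes
    single-stranded DNA binding. Positions are 0-based.
    """
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in CSPB_CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Invented example sequence containing both variants of the motif.
    example = "AAGTCTTTGCCGTCTTTTAA"
    for start, site in find_cspb_sites(example):
        print(start, site)
```

A pattern match of this kind only flags candidate sites; it says nothing about binding affinity, which the abstract reports as an experimental measurement.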
Human Ro60 (SSA2) genomic organization and sequence alterations, examined in cutaneous lupus erythematosus.

PubMed

Millard, T P; Ashton, G H S; Kondeatis, E; Vaughan, R W; Hughes, G R V; Khamashta, M A; Hawk, J L M; McGregor, J M; McGrath, J A

2002-02-01

The Ro 60 kDa protein (Ro60 or SSA2) is the major component of the Ro ribonucleoprotein (Ro RNP) complex, to which an immune response is a specific feature of several autoimmune diseases. The genomic organization and any sequence variation within the DNA encoding Ro60 are unknown. Our aims were to characterize the Ro60 gene structure and to assess whether any sequence alterations might be associated with serum anti-Ro antibody in subacute cutaneous lupus erythematosus (SCLE), thus potentially providing new insight into disease pathogenesis. The cDNA sequence for Ro60 was obtained from the NCBI database and used for a BLAST search for a clone containing the entire genomic sequence. The intron-exon borders were confirmed by designing intronic primer pairs to flank each exon, which were then used to amplify genomic DNA for automated sequencing from 36 Caucasian patients with SCLE (anti-Ro positive) and 49 with discoid LE (DLE, anti-Ro negative), in addition to 36 healthy Caucasian controls. Heteroduplex analysis of polymerase chain reaction (PCR) products from patients and controls spanning all Ro60 exons (1-8) revealed a common bandshift in the PCR products spanning exon 7. Sequencing of the corresponding PCR products demonstrated an A > G substitution at nucleotide position 1318-7, within the consensus acceptor splice site of exon 7 (GenBank XM001901). The allele frequencies were major allele A (0.71) and minor allele G (0.29) in 72 control chromosomes, with no significant differences found between SCLE patients, DLE patients and controls. The genomic organization of the DNA encoding the Ro60 protein is described, including a common polymorphism within the consensus acceptor splice site of exon 7. Our delineation of a strategy for the genomic amplification of Ro60 forms a basis for further examination of the pathological functions of the Ro RNP in autoimmune disease.
A High Quality Draft Consensus Sequence of the Genome of a Heterozygous Grapevine Variety

PubMed Central

Cartwright, Dustin A.; Cestaro, Alessandro; Pruss, Dmitry; Pindo, Massimo; FitzGerald, Lisa M.; Vezzulli, Silvia; Reid, Julia; Malacarne, Giulia; Iliev, Diana; Coppola, Giuseppina; Wardell, Bryan; Micheletti, Diego; Macalma, Teresita; Facci, Marco; Mitchell, Jeff T.; Perazzolli, Michele; Eldredge, Glenn; Gatto, Pamela; Oyzerski, Rozan; Moretto, Marco; Gutin, Natalia; Stefanini, Marco; Chen, Yang; Segala, Cinzia; Davenport, Christine; Demattè, Lorenzo; Mraz, Amy; Battilana, Juri; Stormo, Keith; Costa, Fabrizio; Tao, Quanzhou; Si-Ammour, Azeddine; Harkins, Tim; Lackey, Angie; Perbost, Clotilde; Taillon, Bruce; Stella, Alessandra; Solovyev, Victor; Fawcett, Jeffrey A.; Sterck, Lieven; Vandepoele, Klaas; Grando, Stella M.; Toppo, Stefano; Moser, Claudio; Lanchbury, Jerry; Bogden, Robert; Skolnick, Mark; Sgaramella, Vittorio; Bhatnagar, Satish K.; Fontana, Paolo; Gutin, Alexander; Van de Peer, Yves; Salamini, Francesco; Viola, Roberto

2007-01-01

Background: Worldwide, grapes and their derived products have a large market. The cultivated grape species Vitis vinifera has potential to become a model for fruit tree genetics. Like many plant species, it is highly heterozygous, which is an additional challenge to modern whole genome shotgun sequencing. In this paper a high quality draft genome sequence of a cultivated clone of V. vinifera Pinot Noir is presented. Principal Findings: We estimate the genome size of V. vinifera to be 504.6 Mb. Genomic sequences corresponding to 477.1 Mb were assembled in 2,093 metacontigs and 435.1 Mb were anchored to the 19 linkage groups (LGs). The number of predicted genes is 29,585, of which 96.1% were assigned to LGs. This assembly of the grape genome provides candidate genes implicated in traits relevant to grapevine cultivation, such as those influencing wine quality, via secondary metabolites, and those connected with the extreme susceptibility of grape to pathogens. Single nucleotide polymorphism (SNP) distribution was consistent with a diffuse haplotype structure across the genome. Of around 2,000,000 SNPs, 1,751,176 were mapped to chromosomes and one or more of them were identified in 86.7% of anchored genes. The relative age of grape duplicated genes was estimated and this made it possible to reveal a relatively recent Vitis-specific large scale duplication event concerning at least 10 chromosomes (duplication not reported before). Conclusions: Sanger shotgun sequencing and highly efficient sequencing by synthesis (SBS), together with dedicated assembly programs, resolved a complex heterozygous genome. A consensus sequence of the genome and a set of mapped marker loci were generated. Homologous chromosomes of Pinot Noir differ by 11.2% of their DNA (hemizygous DNA plus chromosomal gaps). SNP markers are offered as a tool with the potential of introducing a new era in the molecular breeding of grape. PMID:18094749
The evolution and phylogeography of the African elephant inferred from mitochondrial DNA sequence and nuclear microsatellite markers.

PubMed

Eggert, Lori S; Rasner, Caylor A; Woodruff, David S

2002-10-07

Recent genetic results support the recognition of two African elephant species: Loxodonta africana, the savannah elephant, and Loxodonta cyclotis, the forest elephant. The study, however, did not include the populations of West Africa, where the taxonomic affinities of elephants have been much debated. We examined mitochondrial cytochrome b control region sequences and four microsatellite loci to investigate the genetic differences between the forest and savannah elephants of West and Central Africa. We then combined our data with published control region sequences from across Africa to examine patterns at the continental level. Our analysis reveals several deeply divergent lineages that do not correspond with the currently recognized taxonomy: (i) the forest elephants of Central Africa; (ii) the forest and savannah elephants of West Africa; and (iii) the savannah elephants of eastern, southern and Central Africa. We propose that the complex phylogeographic patterns we detect in African elephants result from repeated continental-scale climatic changes over their five-to-six million year evolutionary history. Until there is consensus on the taxonomy, we suggest that the genetic and ecological distinctness of these lineages should be an important factor in conservation management planning.
MobiDB-lite: fast and highly specific consensus prediction of intrinsic disorder in proteins.

PubMed

Necci, Marco; Piovesan, Damiano; Dosztányi, Zsuzsanna; Tosatto, Silvio C E

2017-05-01

Intrinsic disorder (ID) is established as an important feature of protein sequences. Its use in proteome annotation is, however, hampered by the availability of many methods with similar performance at the single residue level, which have mostly not been optimized to predict long ID regions of size comparable to domains. Here, we have focused on providing a single consensus-based prediction, MobiDB-lite, optimized for highly specific (i.e. few false positive) predictions of long disorder. The method uses eight different predictors to derive a consensus which is then filtered for spurious short predictions. Consensus prediction is shown to outperform the single methods when annotating long ID regions. MobiDB-lite can be useful in large-scale annotation scenarios and has indeed already been integrated in the MobiDB, DisProt and InterPro databases. Availability: MobiDB-lite is available as part of the MobiDB database from http://mobidb.bio.unipd.it/. An executable can be downloaded from http://protein.bio.unipd.it/mobidblite/. Contact: silvio.tosatto@unipd.it. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved.
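The two steps named in the abstract, deriving a per-residue consensus from several predictors and then filtering out spurious short predictions, can be sketched roughly as follows. This is an illustration only: the abstract does not state the agreement threshold or the minimum region length, so `min_agree` and `min_len` are assumed parameters, and `consensus_disorder` is a hypothetical function name rather than part of the MobiDB-lite software.

```python
def consensus_disorder(predictions, min_agree, min_len):
    """Return disordered regions as half-open (start, end) index pairs.

    predictions: equal-length strings, one per predictor, where '1'
                 marks a residue predicted as disordered and '0' ordered.
    min_agree:   assumed number of predictors that must agree on a residue.
    min_len:     assumed minimum region length; shorter runs are dropped.
    """
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all predictions must have the same length")

    # Step 1: per-residue vote across predictors.
    calls = [
        sum(p[i] == "1" for p in predictions) >= min_agree
        for i in range(n)
    ]

    # Step 2: collect runs of consensus disorder and drop short ones.
    regions = []
    start = None
    for i, is_disordered in enumerate(calls + [False]):  # sentinel closes last run
        if is_disordered and start is None:
            start = i
        elif not is_disordered and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    return regions


if __name__ == "__main__":
    # Three toy predictors over a ten-residue sequence (invented data).
    toy = ["0111110010", "0111100010", "0011110110"]
    print(consensus_disorder(toy, min_agree=2, min_len=3))
```

In the toy input the isolated single-residue call near the end reaches consensus but is removed by the length filter, which is the behaviour the abstract describes as favouring specificity for long regions.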
Chimaeric Virus-Like Particles Derived from Consensus Genome Sequences of Human Rotavirus Strains Co-Circulating in Africa

PubMed Central

Jere, Khuzwayo C.; O'Neill, Hester G.; Potgieter, A. Christiaan; van Dijk, Alberdina A.

2014-01-01

Rotavirus virus-like particles (RV-VLPs) are potential alternative non-live vaccine candidates due to their high immunogenicity. They mimic the natural conformation of native viral proteins but cannot replicate because they do not contain genomic material which makes them safe. To date, most RV-VLPs have been derived from cell culture adapted strains or common G1 and G3 rotaviruses that have been circulating in communities for some time. In this study, chimaeric RV-VLPs were generated from the consensus sequences of African rotaviruses (G2, G8, G9 or G12 strains associated with either P[4], P[6] or P[8] genotypes) characterised directly from human stool samples without prior adaptation of the wild type strains to cell culture. Codon-optimised sequences for insect cell expression of genome segments 2 (VP2), 4 (VP4), 6 (VP6) and 9 (VP7) were cloned into a modified pFASTBAC vector, which allowed simultaneous expression of up to four genes using the Bac-to-Bac Baculovirus Expression System (BEVS; Invitrogen). Several combinations of the genome segments originating from different field strains were cloned to produce double-layered RV-VLPs (dRV-VLP; VP2/6), triple-layered RV-VLPs (tRV-VLP; VP2/6/7 or VP2/6/7/4) and chimaeric tRV-VLPs. The RV-VLPs were produced by infecting Spodoptera frugiperda 9 and Trichoplusia ni cells with recombinant baculoviruses using multi-cistronic, dual co-infection and stepwise-infection expression strategies. The size and morphology of the RV-VLPs, as determined by transmission electron microscopy, revealed successful production of RV-VLPs. The novel approach of producing tRV-VLPs, by using the consensus insect cell codon-optimised nucleotide sequence derived from dsRNA extracted directly from clinical specimens, should speed-up vaccine research and development by by-passing the need to adapt rotaviruses to cell culture. Other problems associated with cell culture adaptation, such as possible changes in epitopes, can also be circumvented. Thus, it is now possible to generate tRV-VLPs for evaluation as non-live vaccine candidates for any human or animal field rotavirus strain. PMID:25268783
Dual Regulation of Bacillus subtilis kinB Gene Encoding a Sporulation Trigger by SinR through Transcription Repression and Positive Stringent Transcription Control.

PubMed

Fujita, Yasutaro; Ogura, Mitsuo; Nii, Satomi; Hirooka, Kazutake

2017-01-01

It is known that transcription of kinB encoding a trigger for Bacillus subtilis sporulation is under repression by SinR, a master repressor of biofilm formation, and under positive stringent transcription control depending on the adenine species at the transcription initiation nucleotide (nt). Deletion and base substitution analyses of the kinB promoter (PkinB) region using lacZ fusions indicated that either a 5-nt deletion (Δ5, nt -61/-57, +1 is the transcription initiation nt) or the substitution of G at nt -45 with A (G-45A) relieved kinB repression. Thus, we found a pair of SinR-binding consensus sequences (GTTCTYT; Y is T or C) in an inverted orientation (SinR-1) between nt -57/-42, which is most likely a SinR-binding site for kinB repression. This relief from SinR repression likely requires SinI, an antagonist of SinR. Surprisingly, we found that SinR is essential for positive stringent transcription control of PkinB. Electrophoretic mobility shift assay (EMSA) analysis indicated that SinR bound not only to SinR-1 but also to SinR-2 (nt -29/-8) consisting of another pair of SinR consensus sequences in a tandem repeat arrangement; the two sequences partially overlap the '-35' and '-10' regions of PkinB. Introduction of base substitutions (T-27C C-26T) in the upstream consensus sequence of SinR-2 affected positive stringent transcription control of PkinB, suggesting that SinR binding to SinR-2 likely causes this positive control. EMSA also implied that RNA polymerase and SinR are possibly bound together to SinR-2 to form a transcription initiation complex for kinB transcription. Thus, it was suggested in this work that derepression of kinB from SinR repression by SinI induced by Spo0A∼P and occurrence of SinR-dependent positive stringent transcription control of kinB might induce effective sporulation cooperatively, implying an intimate interplay by stringent response, sporulation, and biofilm formation.
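The SinR consensus GTTCTYT uses the IUPAC code Y for "T or C", and sites may occur in either orientation. The sketch below shows one way to scan a sequence for exact consensus matches on both strands; `find_sinr_sites` is a hypothetical helper and the test sequence is invented, not the kinB promoter. Real binding sites often deviate from a consensus, so exact matching is an assumption made here for simplicity.

```python
import re

# GTTCTYT with Y = T or C, as read on the given (forward) strand.
FORWARD = re.compile(r"(?=(GTTCT[CT]T))")
# Reverse complement of the consensus, ARAGAAC with R = A or G,
# which marks a site lying on the opposite strand.
REVERSE = re.compile(r"(?=(A[AG]AGAAC))")


def find_sinr_sites(seq):
    """Return sorted (start, strand) pairs for exact consensus matches.

    start is the 0-based position on the given strand; strand is '+'
    for GTTCTYT as written and '-' for its reverse complement.
    """
    seq = seq.upper()
    hits = [(m.start(), "+") for m in FORWARD.finditer(seq)]
    hits += [(m.start(), "-") for m in REVERSE.finditer(seq)]
    return sorted(hits)


if __name__ == "__main__":
    # Invented sequence: one site on each strand, three bases apart.
    example = "CCGTTCTCTAAAAGAGAACTT"
    print(find_sinr_sites(example))
```

A '+' hit paired with a '-' hit corresponds to the inverted orientation described for SinR-1, whereas two hits on the same strand correspond to the tandem arrangement described for SinR-2.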
Chloroplast Phylogenomics Indicates that Ginkgo biloba Is Sister to Cycads

PubMed Central

Wu, Chung-Shien; Chaw, Shu-Miaw; Huang, Ya-Yi

2013-01-01

Molecular phylogenetic studies have not yet reached a consensus on the placement of Ginkgoales, which is represented by the only living species, Ginkgo biloba (common name: ginkgo). At least six discrepant placements of ginkgo have been proposed. This study aimed to use the chloroplast phylogenomic approach to examine possible factors that lead to such disagreeing placements. We found the sequence types used in the analyses as the most critical factor in the conflicting placements of ginkgo. In addition, the placement of ginkgo varied in the trees inferred from nucleotide (NU) sequences, which notably depended on breadth of taxon sampling, tree-building methods, codon positions, positions of Gnetopsida (common name: gnetophytes), and including or excluding gnetophytes in data sets. In contrast, the trees inferred from amino acid (AA) sequences congruently supported the monophyly of a ginkgo and Cycadales (common name: cycads) clade, regardless of which factors were examined. Our site-stripping analysis further revealed that the high substitution saturation of NU sequences mainly derived from the third codon positions and contributed to the variable placements of ginkgo. In summary, the factors we surveyed did not affect results inferred from analyses of AA sequences. Congruent topologies in our AA trees give more confidence in supporting the ginkgo–cycad sister-group hypothesis. PMID:23315384
Molecular phylogeny and SNP variation of polar bears (Ursus maritimus), brown bears (U. arctos), and black bears (U. americanus) derived from genome sequences.

PubMed

Cronin, Matthew A; Rincon, Gonzalo; Meredith, Robert W; MacNeil, Michael D; Islas-Trejo, Alma; Cánovas, Angela; Medrano, Juan F

2014-01-01

We assessed the relationships of polar bears (Ursus maritimus), brown bears (U. arctos), and black bears (U. americanus) with high throughput genomic sequencing data with an average coverage of 25× for each species. A total of 1.4 billion 100-bp paired-end reads were assembled using the polar bear and annotated giant panda (Ailuropoda melanoleuca) genome sequences as references. We identified 13.8 million single nucleotide polymorphisms (SNP) in the 3 species aligned to the polar bear genome. These data indicate that polar bears and brown bears share more SNP with each other than either does with black bears. Concatenation and coalescence-based analysis of consensus sequences of approximately 1 million base pairs of ultraconserved elements in the nuclear genome resulted in a phylogeny with black bears as the sister group to brown and polar bears, and all brown bears are in a separate clade from polar bears. Genotypes for 162 SNP loci of 336 bears from Alaska and Montana showed that the species are genetically differentiated and there is geographic population structure of brown and black bears but not polar bears.
ESTuber db: an online database for Tuber borchii EST sequences.

PubMed

Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo

2007-03-08

The ESTuber database (http://www.itb.cnr.it/estuber) includes 3,271 Tuber borchii expressed sequence tags (EST). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a php-based web interface. Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungi genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

PubMed

Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

2017-07-01

PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
SubCellProt: predicting protein subcellular localization using machine learning approaches.

PubMed

Garg, Prabha; Sharma, Virag; Chaudhari, Pradeep; Roy, Nilanjan

2009-01-01

High-throughput genome sequencing projects continue to churn out enormous amounts of raw sequence data. However, most of this raw sequence data is unannotated and, hence, not very useful. Among the various approaches to decipher the function of a protein, one is to determine its localization. Experimental approaches for proteome annotation including determination of a protein's subcellular localizations are very costly and labor intensive. Besides the available experimental methods, in silico methods present alternative approaches to accomplish this task. Here, we present two machine learning approaches for prediction of the subcellular localization of a protein from the primary sequence information. Two machine learning algorithms, k Nearest Neighbor (k-NN) and Probabilistic Neural Network (PNN) were used to classify an unknown protein into one of the 11 subcellular localizations. The final prediction is made on the basis of a consensus of the predictions made by two algorithms and a probability is assigned to it. The results indicate that the primary sequence derived features like amino acid composition, sequence order and physicochemical properties can be used to assign subcellular localization with a fair degree of accuracy. Moreover, with the enhanced accuracy of our approach and the definition of a prediction domain, this method can be used for proteome annotation in a high throughput manner. SubCellProt is available at www.databases.niper.ac.in/SubCellProt.
Identification and Resolution of Microdiversity through Metagenomic Sequencing of Parallel Consortia

PubMed Central

Maezato, Yukari; Wu, Yu-Wei; Romine, Margaret F.; Lindemann, Stephen R.

2015-01-01

To gain a predictive understanding of the interspecies interactions within microbial communities that govern community function, the genomic complement of every member population must be determined. Although metagenomic sequencing has enabled the de novo reconstruction of some microbial genomes from environmental communities, microdiversity confounds current genome reconstruction techniques. To overcome this issue, we performed short-read metagenomic sequencing on parallel consortia, defined as consortia cultivated under the same conditions from the same natural community with overlapping species composition. The differences in species abundance between the two consortia allowed reconstruction of near-complete (at an estimated >85% of gene complement) genome sequences for 17 of the 20 detected member species. Two Halomonas spp. indistinguishable by amplicon analysis were found to be present within the community. In addition, comparison of metagenomic reads against the consensus scaffolds revealed within-species variation for one of the Halomonas populations, one of the Rhodobacteraceae populations, and the Rhizobiales population. Genomic comparison of these representative instances of inter- and intraspecies microdiversity suggests differences in functional potential that may result in the expression of distinct roles in the community. In addition, isolation and complete genome sequence determination of six member species allowed an investigation into the sensitivity and specificity of genome reconstruction processes, demonstrating robustness across a wide range of sequence coverage (9× to 2,700×) within the metagenomic data set. PMID:26497460
Identification and Resolution of Microdiversity through Metagenomic Sequencing of Parallel Consortia

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nelson, William C.; Maezato, Yukari; Wu, Yu-Wei

2015-10-23

To gain a predictive understanding of the interspecies interactions within microbial communities that govern community function, the genomic complement of every member population must be determined. Although metagenomic sequencing has enabled the de novo reconstruction of some microbial genomes from environmental communities, microdiversity confounds current genome reconstruction techniques. To overcome this issue, we performed short-read metagenomic sequencing on parallel consortia, defined as consortia cultivated under the same conditions from the same natural community with overlapping species composition. The differences in species abundance between the two consortia allowed reconstruction of near-complete (at an estimated >85% of gene complement) genome sequences for 17 of the 20 detected member species. Two Halomonas spp. indistinguishable by amplicon analysis were found to be present within the community. In addition, comparison of metagenomic reads against the consensus scaffolds revealed within-species variation for one of the Halomonas populations, one of the Rhodobacteraceae populations, and the Rhizobiales population. Genomic comparison of these representative instances of inter- and intraspecies microdiversity suggests differences in functional potential that may result in the expression of distinct roles in the community. In addition, isolation and complete genome sequence determination of six member species allowed an investigation into the sensitivity and specificity of genome reconstruction processes, demonstrating robustness across a wide range of sequence coverage (9× to 2,700×) within the metagenomic data set.

Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2's q2-feature-classifier plugin.

PubMed

Bokulich, Nicholas A; Kaehler, Benjamin D; Rideout, Jai Ram; Dillon, Matthew; Bolyen, Evan; Knight, Rob; Huttley, Gavin A; Gregory Caporaso, J

2018-05-17

Taxonomic classification of marker-gene sequences is an important step in microbiome analysis. We present q2-feature-classifier ( https://github.com/qiime2/q2-feature-classifier ), a QIIME 2 plugin containing several novel machine-learning and alignment-based methods for taxonomy classification. We evaluated and optimized several commonly used classification methods implemented in QIIME 1 (RDP, BLAST, UCLUST, and SortMeRNA) and several new methods implemented in QIIME 2 (a scikit-learn naive Bayes machine-learning classifier, and alignment-based taxonomy consensus methods based on VSEARCH, and BLAST+) for classification of bacterial 16S rRNA and fungal ITS marker-gene amplicon sequence data. The naive-Bayes, BLAST+-based, and VSEARCH-based classifiers implemented in QIIME 2 meet or exceed the species-level accuracy of other commonly used methods designed for classification of marker gene sequences that were evaluated in this work. These evaluations, based on 19 mock communities and error-free sequence simulations, including classification of simulated "novel" marker-gene sequences, are available in our extensible benchmarking framework, tax-credit ( https://github.com/caporaso-lab/tax-credit-data ). Our results illustrate the importance of parameter tuning for optimizing classifier performance, and we make recommendations regarding parameter choices for these classifiers under a range of standard operating conditions. q2-feature-classifier and tax-credit are both free, open-source, BSD-licensed packages available on GitHub.
Control of artefactual variation in reported inter-sample relatedness during clinical use of a Mycobacterium tuberculosis sequencing pipeline.

PubMed

Wyllie, David H; Sanderson, Nicholas; Myers, Richard; Peto, Tim; Robinson, Esther; Crook, Derrick W; Smith, E Grace; Walker, A Sarah

2018-06-06

Contact tracing requires reliable identification of closely related bacterial isolates. When we noticed the reporting of artefactual variation between M. tuberculosis isolates during routine next generation sequencing of Mycobacterium spp, we investigated its basis in 2,018 consecutive M. tuberculosis isolates. In the routine process used, clinical samples were decontaminated and inoculated into broth cultures; from positive broth cultures DNA was extracted, sequenced, reads mapped, and consensus sequences determined. We investigated the process of consensus sequence determination, which selects the most common nucleotide at each position. Having determined the high-quality read depth and depth of minor variants across 8,006 M. tuberculosis genomic regions, we quantified the relationship between the minor variant depth and the amount of non-Mycobacterial bacterial DNA, which originates from commensal microbes killed during sample decontamination. In the presence of non-Mycobacterial bacterial DNA, we found significant increases in minor variant frequencies of more than 1.5 fold in 242 regions covering 5.1% of the M. tuberculosis genome. Included within these were four high variation regions strongly influenced by the amount of non-Mycobacterial bacterial DNA. Excluding these four regions from pairwise distance comparisons reduced biologically implausible variation from 5.2% to 0% in an independent validation set derived from 226 individuals. Thus, we have demonstrated an approach identifying critical genomic regions contributing to clinically relevant artefactual variation in bacterial similarity searches. The approach described monitors the outputs of the complex multi-step laboratory and bioinformatics process, allows periodic process adjustments, and will have application to quality control of routine bacterial genomics. Copyright © 2018 Wyllie et al.
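
The consensus step described in this abstract (selecting the most common nucleotide at each position) can be sketched as follows. This is a minimal illustration only, not the pipeline used in the study: the function names, the `min_depth` parameter, and the choice to emit `N` on ties or low depth are assumptions made for the example, and the reads are assumed to be pre-aligned strings of equal length.

```python
from collections import Counter


def majority_consensus(reads, min_depth=1, ambiguous="N"):
    """Majority-rule consensus of equal-length, pre-aligned reads.

    Hypothetical helper: ties and columns with fewer than `min_depth`
    called bases are reported as `ambiguous` (an assumption of this sketch).
    """
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all aligned reads must have the same length")

    consensus = []
    for column in zip(*reads):
        # Count only unambiguous base calls; gaps and other symbols are ignored.
        counts = Counter(base for base in column if base in "ACGT")
        depth = sum(counts.values())
        if depth == 0 or depth < min_depth:
            consensus.append(ambiguous)
            continue
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            consensus.append(ambiguous)  # no single most common base
        else:
            consensus.append(ranked[0][0])
    return "".join(consensus)


def minor_variant_frequency(column):
    """Fraction of called bases in one column that differ from the majority base."""
    counts = Counter(base for base in column if base in "ACGT")
    depth = sum(counts.values())
    if depth == 0:
        return 0.0
    return 1.0 - max(counts.values()) / depth


if __name__ == "__main__":
    aligned = ["ACGT", "ACGA", "ACCT", "A-GT"]
    print(majority_consensus(aligned))      # ACGT
    print(minor_variant_frequency("TATT"))  # 0.25
```

Tracking the minor variant frequency alongside the consensus call is what allows a position-by-position check for contamination-driven artefacts of the kind the abstract reports.
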
Characterization of protein--DNA interactions using surface plasmon resonance spectroscopy with various assay schemes.

PubMed

Teh, Huey Fang; Peh, Wendy Y X; Su, Xiaodi; Thomsen, Jane S

2007-02-27

Specific protein-DNA interactions play a central role in transcription and other biological processes. A comprehensive characterization of protein-DNA interactions should include information about binding affinity, kinetics, sequence specificity, and binding stoichiometry. In this study, we have used surface plasmon resonance spectroscopy (SPR) to study the interactions between human estrogen receptors (ER, alpha and beta subtypes) and estrogen response elements (ERE), with four assay schemes. First, we determined the sequence-dependent receptors' binding capacity by monitoring the binding of ER to various ERE sequences immobilized on a sensor surface (assay format denoted as the direct assay). Second, we screened the relative affinity of ER for various ERE sequences using a competition assay, in which the receptors bind to an ERE-immobilized surface in the presence of competitor ERE sequences. Third, we monitored the assembly of ER-ERE complexes on a SPR surface and thereafter the removal and/or dissociation of the ER (assay scheme denoted as the dissociation assay) to determine the binding stoichiometry. Last, a sandwich assay (ER binding to ERE followed by anti-ER recognition of a specific ER subtype) was performed in an effort to understand how ERalpha and ERbeta may associate and compete when binding to the DNA. With these assay schemes, we reaffirmed that (1) ERalpha is more sensitive than ERbeta to base pair change(s) in the consensus ERE, (2) ERalpha and ERbeta form a heterodimer when they bind to the consensus ERE, and (3) the binding stoichiometry of both ERalpha- and ERbeta-ERE complexes is dependent on salt concentration. With this study, we demonstrate the versatility of the SPR analysis. With the involvement of various assay arrangements, the SPR analysis can be further extended to more than kinetics and affinity study.
Genetic mapping and identification of QTL for earliness in the globe artichoke/cultivated cardoon complex.

PubMed

Portis, Ezio; Scaglione, Davide; Acquadro, Alberto; Mauromicale, Giovanni; Mauro, Rosario; Knapp, Steven J; Lanteri, Sergio

2012-05-23

The Asteraceae species Cynara cardunculus (2n = 2x = 34) includes the two fully cross-compatible domesticated taxa globe artichoke (var. scolymus L.) and cultivated cardoon (var. altilis DC). As both are out-pollinators and suffer from marked inbreeding depression, linkage analysis has focussed on the use of a two way pseudo-test cross approach. A set of 172 microsatellite (SSR) loci derived from expressed sequence tag DNA sequence were integrated into the reference C. cardunculus genetic maps, based on segregation among the F1 progeny of a cross between a globe artichoke and a cultivated cardoon. The resulting maps each detected 17 major linkage groups, corresponding to the species' haploid chromosome number. A consensus map based on 66 co-dominant shared loci (64 SSRs and two SNPs) assembled 694 loci, with a mean inter-marker spacing of 2.5 cM. When the maps were used to elucidate the pattern of inheritance of head production earliness, a key commercial trait, seven regions were shown to harbour relevant quantitative trait loci (QTL). Together, these QTL accounted for up to 74% of the overall phenotypic variance. The newly developed consensus as well as the parental genetic maps can accelerate the process of tagging and eventually isolating the genes underlying earliness in both the domesticated C. cardunculus forms. The largest single effect mapped to the same linkage group in each parental map, and explained about one half of the phenotypic variance, thus representing a good candidate for marker assisted selection.
Cell membrane-bound CD200 signals both via an extracellular domain and following nuclear translocation of a cytoplasmic fragment.

PubMed

Chen, Zhiqi; Kapus, Andras; Khatri, Ismat; Kos, Olha; Zhu, Fang; Gorczynski, Reginald M

2018-06-01

In previous studies we had reported that the immunosuppressive cell membrane bound molecule CD200 is released from the cell following cleavage by matrix metalloproteases, with the released soluble CD200 acting as an immunosuppressant following binding to, and signaling through, its cognate receptor CD200R expressed on target cells. We now show that although the intracellular cytoplasmic tail (CD200 C-tail) of CD200 has no consensus sites for adapter molecules which might signal the CD200+ cell directly, cleavage of the CD200 C-tail from the membrane region of CD200 by a consensus γ-secretase leads to nuclear translocation and DNA binding (identified by chromatin immunoprecipitation followed by sequencing, ChIP-sequencing) of the CD200 C-tail. Subsequently there occurs an altered expression of a limited number of genes, many of which are transcription factors (TFs) known to be associated with regulation of cell proliferation. Altered expression of these TFs was also prominent following transfection of CD200+ B cell lines and fresh patient CLL cells with a vector construct containing the CD200 C-tail. Artificial transfection of non-CD200+ Hek293 cells with this CD200 C-tail construct resulted in altered expression of most of these same genes. Introduction of a siRNA for one of these TFs, POTEA, reversed CD200 C-tail regulation of altered cell proliferation. Copyright © 2018 Elsevier Ltd. All rights reserved.
Cloning the uteroglobin gene promoter from the relic volcano rabbit (Romerolagus diazi) reveals an ancient estrogen-response element.

PubMed

Acosta-MontesdeOca, Adriana; Zariñán, Teresa; Macías, Héctor; Pérez-Solís, Marco A; Ulloa-Aguirre, Alfredo; Gutiérrez-Sagal, Rubén

2012-05-01

To gain further insight into the estrogen-dependent transcriptional regulation of the uteroglobin (UG) gene, we cloned the 5'-flanking region of the UG gene from the phylogenetically ancient volcano rabbit (Romerolagus diazi; Rd). The cloned region spans 812 base pairs (bp; -812/-1) and contains a noncanonical TATA box (TACA). The translation start site is 48 bp downstream from the putative transcription initiation site (AGA), and is preceded by a consensus Kozak box. Comparison of the Rd-UG gene with that previously isolated from rabbits (Oryctolagus cuniculus) showed 93% sequence identity as well as a number of conserved cis-acting elements, including the estrogen-response element (ERE; -265/-251), which differs from the consensus by two nucleotides. In MCF-7 cells, 17β-estradiol (E(2)) induced transcription of a luciferase reporter driven by the Rd-UG promoter in a similar manner as in an equivalent rabbit UG reporter; the Rd-UG promoter was 30% more responsive to E(2) than the rabbit promoter. Mutagenesis studies on the Rd-ERE confirmed this cis-element as a target of E(2) as two luciferase mutant reporters of the Rd-promoter, one with the rabbit and the other with the consensus ERE, were more responsive to the hormone than the wild-type reporter. Gel shift and super-shift assays showed that estrogen receptor-α indeed binds to the imperfect palindromic sequence of the Rd-ERE. Copyright © 2012 Wiley Periodicals, Inc.
Towards a consensus Y-chromosomal phylogeny and Y-SNP set in forensics in the next-generation sequencing era.

PubMed

Larmuseau, Maarten H D; Van Geystelen, Anneleen; Kayser, Manfred; van Oven, Mannis; Decorte, Ronny

2015-03-01

Currently, several different Y-chromosomal phylogenies and haplogroup nomenclatures are presented in scientific literature and at conferences demonstrating the present diversity in Y-chromosomal phylogenetic trees and Y-SNP sets used within forensic and anthropological research. This situation can be ascribed to the exponential growth of the number of Y-SNPs discovered due to mostly next-generation sequencing (NGS) studies. As Y-SNPs and their respective phylogenetic positions are important in forensics, such as for male lineage characterization and paternal bio-geographic ancestry inference, there is a need for forensic geneticists to know how to deal with these newly identified Y-SNPs and phylogenies, especially since these phylogenies are often created with other aims than to carry out forensic genetic research. Therefore, we give here an overview of four categories of currently used Y-chromosomal phylogenies and the associated Y-SNP sets in scientific research in the current NGS era. We compare these categories based on the construction method, their advantages and disadvantages, the disciplines wherein the phylogenetic tree can be used, and their specific relevance for forensic geneticists. Based on this overview, it is clear that an up-to-date reduced tree with a consensus Y-SNP set and a stable nomenclature will be the most appropriate reference resource for forensic research. Initiatives to reach such an international consensus are therefore highly recommended. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Human renin 5'-flanking DNA to nucleotide-2750.

PubMed

Smith, D L; Jeyapalan, S; Lang, J A; Guo, X H; Sigmund, C D; Morris, B J

1995-01-01

Renin is one of the most important factors in blood pressure and electrolyte regulation in mammals, and the renin locus has been implicated in hypertension. To assist studies of promoter control, we therefore determined the 5'-flanking sequence of the human gene (REN) to residue -2750 relative to the transcription start site (+1). Sites of homology to consensus sequences for binding of trans-acting factors involved in transcriptional control of other genes were identified, and functionality for two of these (a CRE and a Pit-1 site) has so far been demonstrated.
Predicting the reactivity of proteins from their sequence alone: Kazal family of protein inhibitors of serine proteinases

PubMed Central

Lu, Stephen M.; Lu, Wuyuan; Qasim, M. A.; Anderson, Stephen; Apostol, Izydor; Ardelt, Wojciech; Bigler, Theresa; Chiang, Yi Wen; Cook, James; James, Michael N. G.; Kato, Ikunoshin; Kelly, Clyde; Kohr, William; Komiyama, Tomoko; Lin, Tiao-Yin; Ogawa, Michio; Otlewski, Jacek; Park, Soon-Jae; Qasim, Sabiha; Ranjbar, Michael; Tashiro, Misao; Warne, Nicholas; Whatley, Harry; Wieczorek, Anna; Wieczorek, Maciej; Wilusz, Tadeusz; Wynn, Richard; Zhang, Wenlei; Laskowski, Michael

2001-01-01

An additivity-based sequence to reactivity algorithm for the interaction of members of the Kazal family of protein inhibitors with six selected serine proteinases is described. Ten consensus variable contact positions in the inhibitor were identified, and the 19 possible variants at each of these positions were expressed. The free energies of interaction of these variants and the wild type were measured. For an additive system, this data set allows for the calculation of all possible sequences, subject to some restrictions. The algorithm was extensively tested. It is exceptionally fast so that all possible sequences can be predicted. The strongest, the most specific possible, and the least specific inhibitors were designed, and an evolutionary problem was solved. PMID:11171964
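
The additivity assumption behind this algorithm can be written out as a short worked formula. The notation below is ours rather than the authors': it assumes that each of the ten variable contact positions contributes independently to the free energy of interaction with a given proteinase, so the contribution of each substitution is the measured difference between the single-site variant and the wild type.

```latex
% X_i: residue at variable contact position i (i = 1, ..., 10)
% \Delta G^{\circ}(X_i): measured free energy for the single-site variant
%                        carrying X_i at position i
% \Delta G^{\circ}_{\mathrm{wt}}: measured free energy for the wild type
\Delta G^{\circ}_{\mathrm{pred}}(X_1,\dots,X_{10})
  = \Delta G^{\circ}_{\mathrm{wt}}
  + \sum_{i=1}^{10}\left[\Delta G^{\circ}(X_i) - \Delta G^{\circ}_{\mathrm{wt}}\right]

% Measurements needed per proteinase: 10 \times 19 = 190 variants plus the wild type.
% Upper bound on predictable sequences (before the restrictions the abstract mentions):
20^{10} \approx 1.0 \times 10^{13}
```

Because each prediction is a sum of ten tabulated terms, evaluating it is very cheap, which is consistent with the abstract's remark that all possible sequences can be predicted.
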
Detection of a new bat gammaherpesvirus in the Philippines.

PubMed

Watanabe, Shumpei; Ueda, Naoya; Iha, Koichiro; Masangkay, Joseph S; Fujii, Hikaru; Alviola, Phillip; Mizutani, Tetsuya; Maeda, Ken; Yamane, Daisuke; Walid, Azab; Kato, Kentaro; Kyuwa, Shigeru; Tohya, Yukinobu; Yoshikawa, Yasuhiro; Akashi, Hiroomi

2009-08-01

A new bat herpesvirus was detected in the spleen of an insectivorous bat (Hipposideros diadema, family Hipposideridae) collected on Panay Island, the Philippines. PCR analyses were performed using COnsensus-DEgenerate Hybrid Oligonucleotide Primers (CODEHOPs) targeting the herpesvirus DNA polymerase (DPOL) gene. Although we obtained PCR products with CODEHOPs, direct sequencing using the primers was not possible because of high degree of degeneracy. Direct sequencing technology developed in our rapid determination system of viral RNA sequences (RDV) was applied in this study, and a partial DPOL nucleotide sequence was determined. In addition, a partial gB gene nucleotide sequence was also determined using the same strategy. We connected the partial gB and DPOL sequences with long-distance PCR, and a 3741-bp nucleotide fragment, including the 3' part of the gB gene and the 5' part of the DPOL gene, was finally determined. Phylogenetic analysis showed that the sequence was novel and most similar to those of the subfamily Gammaherpesvirinae.
Structural basis for lack of ADP-ribosyltransferase activity in poly(ADP-ribose) polymerase-13/zinc finger antiviral protein.

PubMed

Karlberg, Tobias; Klepsch, Mirjam; Thorsell, Ann-Gerd; Andersson, C David; Linusson, Anna; Schüler, Herwig

2015-03-20

The mammalian poly(ADP-ribose) polymerase (PARP) family includes ADP-ribosyltransferases with diphtheria toxin homology (ARTD). Most members have mono-ADP-ribosyltransferase activity. PARP13/ARTD13, also called zinc finger antiviral protein, has roles in viral immunity and microRNA-mediated stress responses. PARP13 features a divergent PARP homology domain missing a PARP consensus sequence motif; the domain has enigmatic functions and apparently lacks catalytic activity. We used x-ray crystallography, molecular dynamics simulations, and biochemical analyses to investigate the structural requirements for ADP-ribosyltransferase activity in human PARP13 and two of its functional partners in stress granules: PARP12/ARTD12, and PARP15/BAL3/ARTD7. The crystal structure of the PARP homology domain of PARP13 shows obstruction of the canonical active site, precluding NAD(+) binding. Molecular dynamics simulations indicate that this closed cleft conformation is maintained in solution. Introducing consensus side chains in PARP13 did not result in 3-aminobenzamide binding, but in further closure of the site. Three-dimensional alignment of the PARP homology domains of PARP13, PARP12, and PARP15 illustrates placement of PARP13 residues that deviate from the PARP family consensus. Introducing either one of two of these side chains into the corresponding positions in PARP15 abolished PARP15 ADP-ribosyltransferase activity. Taken together, our results show that PARP13 lacks the structural requirements for ADP-ribosyltransferase activity. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Analysis of a cis-Acting Element Involved in Regulation by Estrogen of Human Angiotensinogen Gene Expression.

PubMed

Zhao, Yan-Yan; Sun, Kai-Lai; Ashok, Kumar

1998-01-01

This work aimed to identify the estrogen responsive element in the human angiotensinogen gene. The nucleotide sequence between the transcription initiation site and the TATA box in the angiotensinogen gene promoter was found to be strongly homologous with the consensus estrogen responsive element. This sequence was confirmed as the estrogen responsive element (HAG ERE) by electrophoretic mobility shift assay. Recombinant expression vectors were constructed in which the chloramphenicol acetyltransferase (CAT) reporter gene was driven by the angiotensinogen core promoter with HAG ERE or by the TK core promoter with multiplied HAG ERE, and were used in cotransfection with the human estrogen receptor expression vector into HepG(2) cells; CAT assays showed an increase of the CAT activity on 17beta-estradiol treatment in those transfectants. These results suggest that the human angiotensinogen gene is transcriptionally up-regulated by estrogen through the estrogen responsive element near the TATA box of the promoter.
Synthetic signal sequences that enable efficient secretory protein production in the yeast Kluyveromyces marxianus.

PubMed

Yarimizu, Tohru; Nakamura, Mikiko; Hoshida, Hisashi; Akada, Rinji

2015-02-14

Targeting of cellular proteins to the extracellular environment is directed by a secretory signal sequence located at the N-terminus of a secretory protein. These signal sequences usually contain an N-terminal basic amino acid followed by a stretch containing hydrophobic residues, although no consensus signal sequence has been identified. In this study, simple modeling of signal sequences was attempted using Gaussia princeps secretory luciferase (GLuc) in the yeast Kluyveromyces marxianus, which allowed comprehensive recombinant gene construction to substitute synthetic signal sequences. Mutational analysis of the GLuc signal sequence revealed that the GLuc hydrophobic peptide length was the lower limit for effective secretion and that the N-terminal basic residue was indispensable. Deletion of the 16th Glu caused enhanced levels of secreted protein, suggesting that this hydrophilic residue defined the boundary of a hydrophobic peptide stretch. Consequently, we redesigned this domain as a repeat of a single hydrophobic amino acid between the N-terminal Lys and C-terminal Glu. Stretches consisting of Phe, Leu, Ile, or Met were effective for secretion, but the number of residues affected secretory activity. A stretch containing sixteen consecutive methionine residues (M16) showed the highest activity; the M16 sequence was therefore utilized for the secretory production of human leukemia inhibitory factor protein in yeast, resulting in enhanced secreted protein yield. We present a new concept for the provision of secretory signal sequence ability in the yeast K. marxianus, determined by the number of residues of a single hydrophobic residue located between N-terminal basic and C-terminal acidic amino acid boundaries.
Unveiling the Hybrid Genome Structure of Escherichia coli RR1 (HB101 RecA+)

PubMed Central

Jeong, Haeyoung; Sim, Young Mi; Kim, Hyun Ju; Lee, Sang Jun

2017-01-01

There have been extensive genome sequencing studies for Escherichia coli strains, particularly for pathogenic isolates, because fast determination of pathogenic potential and/or drug resistance and their propagation routes is crucial. For laboratory E. coli strains, however, genome sequence information is limited except for several well-known strains. We determined the complete genome sequence of laboratory E. coli strain RR1 (HB101 RecA+), which has long been used as a general cloning host. A hybrid genome sequence of K-12 MG1655 and B BL21(DE3) was constructed based on the initial mapping of Illumina HiSeq reads to each reference, and iterative rounds of read mapping, variant detection, and consensus extraction were carried out. Finally, PCR and Sanger sequencing-based finishing were applied to resolve non-single nucleotide variant regions with aberrant read depths and breakpoints, most of them resulting from prophages and insertion sequence transpositions that are not present in the reference genome sequence. We found that 96.9% of the RR1 genome is derived from K-12, and identified exact crossover junctions between K-12 and B genomic fragments. However, because RR1 has experienced a series of genetic manipulations since branching from the common ancestor, it has a set of mutations different from those found in K-12 MG1655. As well as identifying all known genotypes of RR1 on the basis of genomic context, we found novel mutations. Our results extend current knowledge of the genotype of RR1 and its relatives, and provide insights into the pedigree, genomic background, and physiology of common laboratory strains. PMID:28421066
An ethnically relevant consensus Korean reference genome is a step towards personal reference genomes

PubMed Central

Cho, Yun Sung; Kim, Hyunho; Kim, Hak-Min; Jho, Sungwoong; Jun, JeHoon; Lee, Yong Joo; Chae, Kyun Shik; Kim, Chang Geun; Kim, Sangsoo; Eriksson, Anders; Edwards, Jeremy S.; Lee, Semin; Kim, Byung Chul; Manica, Andrea; Oh, Tae-Kwang; Church, George M.; Bhak, Jong

2016-01-01

Human genomes are routinely compared against a universal reference. However, this strategy could miss population-specific and personal genomic variations, which may be detected more efficiently using an ethnically relevant or personal reference. Here we report a hybrid assembly of a Korean reference genome (KOREF) for constructing personal and ethnic references by combining sequencing and mapping methods. We also build its consensus variome reference, providing information on millions of variants from 40 additional ethnically homogeneous genomes from the Korean Personal Genome Project. We find that the ethnically relevant consensus reference can be beneficial for efficient variant detection. Systematic comparison of human assemblies shows the importance of assembly quality, suggesting the necessity of new technologies to comprehensively map ethnic and personal genomic structure variations. In the era of large-scale population genome projects, the leveraging of ethnicity-specific genome assemblies as well as the human reference genome will accelerate mapping all human genome diversity. PMID:27882922
Archaebacterial rhodopsin sequences: Implications for evolution

NASA Technical Reports Server (NTRS)

Lanyi, J. K.

1991-01-01

It was proposed over 10 years ago that the archaebacteria represent a separate kingdom which diverged very early from the eubacteria and eukaryotes. It follows that investigations of archaebacterial characteristics might reveal features of early evolution. So far, two genes, one for bacteriorhodopsin and another for halorhodopsin, both from Halobacterium halobium, have been sequenced. We cloned and sequenced the gene coding for the polypeptide of another one of these rhodopsins, a halorhodopsin in Natronobacterium pharaonis. Peptide sequencing of cyanogen bromide fragments, and immuno-reactions of the protein and synthetic peptides derived from the C-terminal gene sequence, confirmed that the open reading frame was the structural gene for the pharaonis halorhodopsin polypeptide. The flanking DNA sequences of this gene, as well as those of other bacterial rhodopsins, were compared to previously proposed archaebacterial consensus sequences. In pairwise comparisons of the open reading frame with DNA sequences for bacterio-opsin and halo-opsin from Halobacterium halobium, silent divergences were calculated. These indicate very considerable evolutionary distance between each pair of genes, even in the same organism. In spite of this, the three protein sequences show extensive similarities, indicating strong selective pressures.
BioWord: A sequence manipulation suite for Microsoft Word

PubMed Central

2012-01-01

Background: The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. Results: BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. Conclusions: BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms. PMID:22676326
Identification and characterization of ARS-like sequences as putative origin(s) of replication in human malaria parasite Plasmodium falciparum.

PubMed

Agarwal, Meetu; Bhowmick, Krishanu; Shah, Kushal; Krishnamachari, Annangarachari; Dhar, Suman Kumar

2017-08-01

DNA replication is a fundamental process in genome maintenance, and initiates from several genomic sites (origins) in eukaryotes. In Saccharomyces cerevisiae, conserved sequences known as autonomously replicating sequences (ARSs) provide a landing pad for the origin recognition complex (ORC), leading to replication initiation. Although origins from higher eukaryotes share some common sequence features, the definitive genomic organization of these sites remains elusive. The human malaria parasite Plasmodium falciparum undergoes multiple rounds of DNA replication; therefore, control of initiation events is crucial to ensure proper replication. However, the sites of DNA replication initiation and the mechanism by which replication is initiated are poorly understood. Here, we have identified and characterized putative origins in P. falciparum by bioinformatics analyses and experimental approaches. An autocorrelation measure method was initially used to search for regions with marked fluctuation (dips) in the chromosome, which we hypothesized might contain potential origins. Indeed, S. cerevisiae ARS consensus sequences were found in dip regions. Several of these P. falciparum sequences were validated with chromatin immunoprecipitation-quantitative PCR, nascent strand abundance and a plasmid stability assay. Subsequently, the same sequences were used in yeast to confirm their potential as origins in vivo. Our results identify the presence of functional ARSs in P. falciparum and provide meaningful insights into replication origins in these deadly parasites. These data could be useful in designing transgenic vectors with improved stability for transfection in P. falciparum. © 2017 Federation of European Biochemical Societies.
Initial sequence characterization of the rhabdoviruses of squamate reptiles, including a novel rhabdovirus from a caiman lizard (Dracaena guianensis)

PubMed Central

Wellehan, James F.X.; Pessier, Allan P.; Archer, Linda L.; Childress, April L.; Jacobson, Elliott R.; Tesh, Robert B.

2012-01-01

Rhabdoviruses infect a variety of hosts, including non-avian reptiles. Consensus PCR techniques were used to obtain partial RNA-dependent RNA polymerase gene sequence from five rhabdoviruses of South American lizards; Marco, Chaco, Timbo, Sena Madureira, and a rhabdovirus from a caiman lizard (Dracaena guianensis). The caiman lizard rhabdovirus formed inclusions in erythrocytes, which may be a route for infecting hematophagous insects. This is the first information on behavior of a rhabdovirus in squamates. We also obtained sequence from two rhabdoviruses of Australian lizards, confirming previous Charleville virus sequence and finding that, unlike a previous sequence report but in agreement with serologic reports, Almpiwar virus is clearly distinct from Charleville virus. Bayesian and maximum likelihood phylogenetic analysis revealed that most known rhabdoviruses of squamates cluster in the Almpiwar subgroup. The exception is Marco virus, which is found in the Hart Park group. PMID:22397930
Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

PubMed

Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

2014-01-01

Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed-upon partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence genome-wide, comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3'-terminal of the papain-like cysteine protease domain, were found to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed that these regions performed identically to the full-length genome in genotyping, dividing the HEV isolates involved into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.

Stability switches of arbitrary high-order consensus in multiagent networks with time delays.

PubMed

Yang, Bo

2013-01-01

High-order consensus seeking, in which individual high-order dynamic agents share a consistent view of the objectives and the world in a distributed manner, has potentially broad applications in the field of cooperative control. This paper presents a stability-switch analysis of arbitrary high-order consensus in multiagent networks with time delays. By employing a frequency domain method, we explicitly derive analytical equations that clarify a rigorous connection between the stability of general high-order consensus and system parameters such as the network topology, communication time-delays, and feedback gains. In particular, our results provide a general and fairly precise notion of how increasing the communication time-delay causes stability switches of consensus. Furthermore, under communication constraints, the stability and robustness problems of consensus algorithms up to third order are discussed in detail to illustrate our central results. Numerical examples and simulation results for fourth-order consensus are provided to demonstrate the effectiveness of our theoretical results.
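
The effect of communication delay on consensus stability can be illustrated with the much simpler first-order case. The sketch below is not the paper's high-order analysis: it simulates the standard first-order protocol with a uniform delay on an undirected graph, for which the well-known delay bound is pi / (2 * lambda_max(L)), where L is the graph Laplacian. The function names, Euler step size, and example graph are assumptions made for illustration.

```python
import numpy as np


def laplacian(adjacency):
    """Graph Laplacian L = D - A of an undirected weighted graph."""
    a = np.asarray(adjacency, dtype=float)
    return np.diag(a.sum(axis=1)) - a


def critical_delay(adjacency):
    """Delay bound pi / (2 * lambda_max(L)) for first-order delayed consensus."""
    lam_max = np.linalg.eigvalsh(laplacian(adjacency)).max()
    return np.pi / (2.0 * lam_max)


def simulate_delayed_consensus(adjacency, x0, delay, t_end=20.0, dt=0.01):
    """Forward-Euler simulation of x'(t) = -L x(t - delay), constant history."""
    lap = laplacian(adjacency)
    lag = int(round(delay / dt))
    steps = int(round(t_end / dt))
    x = np.empty((steps + 1, len(x0)))
    x[0] = x0
    for k in range(steps):
        delayed = x[max(k - lag, 0)]  # before t = 0 the state equals x0
        x[k + 1] = x[k] - dt * (lap @ delayed)
    return x


if __name__ == "__main__":
    complete_graph = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # hypothetical network
    start = [1.0, 0.0, -2.0]
    print("critical delay:", critical_delay(complete_graph))
    for tau in (0.1, 1.0):
        traj = simulate_delayed_consensus(complete_graph, start, tau)
        print(tau, np.ptp(traj[-500:], axis=1).max())
```

With a delay below the bound the agents' disagreement decays to zero; above it the disagreement oscillates with growing amplitude, which is the kind of stability switch the abstract describes for higher-order protocols.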
Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing.

PubMed

Cornman, Robert Scott; Boncristiani, Humberto; Dainat, Benjamin; Chen, Yanping; vanEngelsdorp, Dennis; Weaver, Daniel; Evans, Jay D

2013-03-07

Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li's D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li's D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low FST. 
This initial survey of genetic variation within honey bee RNA viruses suggests future directions for studies examining the underlying causes of population-genetic structure in these economically important pathogens.
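
A minimal sketch of how a per-sample consensus base and within-sample nucleotide diversity might be computed from read counts at each site. It assumes a simplified pileup represented as one dictionary of base counts per site and uses the standard unbiased heterozygosity estimator; it is not the authors' pipeline, and all names are hypothetical.

```python
from collections import Counter


def consensus_base(counts):
    """Most frequent base at a site; ties are broken alphabetically."""
    return max(sorted(counts), key=lambda base: counts[base])


def site_diversity(counts):
    """Probability that two reads drawn without replacement differ at this site."""
    n = sum(counts.values())
    if n < 2:
        return 0.0
    same = sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    return 1.0 - same


def mean_diversity(pileup):
    """Average per-site diversity over a list of per-site base counts."""
    return sum(site_diversity(site) for site in pileup) / len(pileup)


if __name__ == "__main__":
    # Hypothetical three-site pileup.
    pileup = [Counter("AAAAAAAAAA"), Counter("AAAAACCCCC"), Counter("GGGGGGGGGT")]
    print("".join(consensus_base(site) for site in pileup))
    print(round(mean_diversity(pileup), 4))
```

Restricting `mean_diversity` to third codon positions, or counting sites whose diversity exceeds a threshold, would give the kind of among-sample comparisons mentioned in the abstract.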
Management of patients with advanced prostate cancer: recommendations of the St Gallen Advanced Prostate Cancer Consensus Conference (APCCC) 2015

PubMed Central

Gillessen, S.; Omlin, A.; Attard, G.; de Bono, J. S.; Efstathiou, E.; Fizazi, K.; Halabi, S.; Nelson, P. S.; Sartor, O.; Smith, M. R.; Soule, H. R.; Akaza, H.; Beer, T. M.; Beltran, H.; Chinnaiyan, A. M.; Daugaard, G.; Davis, I. D.; De Santis, M.; Drake, C. G.; Eeles, R. A.; Fanti, S.; Gleave, M. E.; Heidenreich, A.; Hussain, M.; James, N. D.; Lecouvet, F. E.; Logothetis, C. J.; Mastris, K.; Nilsson, S.; Oh, W. K.; Olmos, D.; Padhani, A. R.; Parker, C.; Rubin, M. A.; Schalken, J. A.; Scher, H. I.; Sella, A.; Shore, N. D.; Small, E. J.; Sternberg, C. N.; Suzuki, H.; Sweeney, C. J.; Tannock, I. F.; Tombal, B.

2015-01-01

The first St Gallen Advanced Prostate Cancer Consensus Conference (APCCC) Expert Panel identified and reviewed the available evidence for the ten most important areas of controversy in advanced prostate cancer (APC) management. The successful registration of several drugs for castration-resistant prostate cancer and the recent studies of chemo-hormonal therapy in men with castration-naïve prostate cancer have led to considerable uncertainty as to the best treatment choices, sequence of treatment options and appropriate patient selection. Management recommendations based on expert opinion, and not based on a critical review of the available evidence, are presented. The various recommendations carried differing degrees of support, as reflected in the wording of the article text and in the detailed voting results recorded in supplementary Material, available at Annals of Oncology online. Detailed decisions on treatment as always will involve consideration of disease extent and location, prior treatments, host factors, patient preferences as well as logistical and economic constraints. Inclusion of men with APC in clinical trials should be encouraged. PMID:26041764
Molecular dynamics study of the phosphorylation effect on the conformational states of the C-terminal domain of RNA polymerase II.

PubMed

Yonezawa, Yasushige

2014-05-01

The carboxyl-terminal domain (CTD) of RNA polymerase II in eukaryotes regulates mRNA processing by recruiting various regulatory factors. A main function of the CTD relies on the heptad consensus sequence (YSPTSPS). The CTD dynamically changes its conformational state to recognize and bind different regulatory factors. These dynamic conformational changes are caused by modifications, mainly phosphorylation and dephosphorylation, of the serine residues. In this study, we investigate the conformational states of the unit consensus CTD peptide with various phosphorylation patterns of the serine residues by extended ensemble simulations. The results show that the CTD without phosphorylation has a flexible disordered structure distributed between twisted and extended states, but phosphorylation tends to reduce the conformational space. It was found that phosphorylation induces a β-turn around the phosphorylated serine residue and that the cis conformation of the proline residue significantly inhibits β-turn formation. The β-turn should contribute to specific CTD binding of the different regulatory factors by changing the conformational propensity combined with induced fit.
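
For orientation, the heptad YSPTSPS has serines at positions 2, 5, and 7, so a single repeat admits eight serine phosphorylation patterns, including the unphosphorylated one. The sketch below simply enumerates them; which patterns the study actually simulated is not stated in the abstract, and the function names are hypothetical.

```python
from itertools import product

HEPTAD = "YSPTSPS"  # consensus repeat quoted in the abstract


def serine_positions(peptide):
    """1-based positions of serine residues in the peptide."""
    return [i + 1 for i, residue in enumerate(peptide) if residue == "S"]


def phosphorylation_patterns(peptide):
    """Yield every subset of serine positions as a tuple of phosphorylated sites."""
    sites = serine_positions(peptide)
    for mask in product([False, True], repeat=len(sites)):
        yield tuple(pos for pos, on in zip(sites, mask) if on)


if __name__ == "__main__":
    for pattern in phosphorylation_patterns(HEPTAD):
        label = ", ".join(f"pSer{p}" for p in pattern) or "unphosphorylated"
        print(label)
```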
Comparison of sequencing the D2 region of the large subunit ribosomal RNA gene (MicroSEQ®) versus the internal transcribed spacer (ITS) regions using two public databases for identification of common and uncommon clinically relevant fungal species.

PubMed

Arbefeville, S; Harris, A; Ferrieri, P

2017-09-01

Fungal infections cause considerable morbidity and mortality in immunocompromised patients. Rapid and accurate identification of fungi is essential to guide accurately targeted antifungal therapy. With the advent of molecular methods, clinical laboratories can use new technologies to supplement traditional phenotypic identification of fungi. The aims of the study were to evaluate the sole commercially available MicroSEQ® D2 LSU rDNA Fungal Identification Kit compared to the in-house developed internal transcribed spacer (ITS) regions assay in identifying moulds, using two well-known online public databases to analyze sequenced data. 85 common and uncommon clinically relevant fungi isolated from clinical specimens were sequenced for the D2 region of the large subunit (LSU) of ribosomal RNA (rRNA) gene with the MicroSEQ® Kit and the ITS regions with the in house developed assay. The generated sequenced data were analyzed with the online GenBank and MycoBank public databases. The D2 region of the LSU rRNA gene identified 89.4% or 92.9% of the 85 isolates to the genus level and the full ITS region (f-ITS) 96.5% or 100%, using GenBank or MycoBank, respectively, when compared to the consensus ID. When comparing species-level designations to the consensus ID, D2 region of the LSU rRNA gene aligned with 44.7% (38/85) or 52.9% (45/85) of these isolates in GenBank or MycoBank, respectively. By comparison, f-ITS possessed greater specificity, followed by ITS1, then ITS2 regions using GenBank or MycoBank. Using GenBank or MycoBank, D2 region of the LSU rRNA gene outperformed phenotypic based ID at the genus level. Comparing rates of ID between D2 region of the LSU rRNA gene and the ITS regions in GenBank or MycoBank at the species level against the consensus ID, f-ITS and ITS2 exceeded performance of the D2 region of the LSU rRNA gene, but ITS1 had similar performance to the D2 region of the LSU rRNA gene using MycoBank. 
Our results indicated that the MicroSEQ® D2 LSU rDNA Fungal Identification Kit was equivalent to the in-house developed ITS regions assay to identify fungi at the genus level. The MycoBank database gave a better curated database and thus allowed a better genus and species identification for both D2 region of the LSU rRNA gene and ITS regions. Copyright © 2017 Elsevier B.V. All rights reserved.
The neurotoxicant PCB-95 by increasing the neuronal transcriptional repressor REST down-regulates caspase-8 and increases Ripk1, Ripk3 and MLKL expression determining necroptotic neuronal death.

PubMed

Guida, Natascia; Laudati, Giusy; Serani, Angelo; Mascolo, Luigi; Molinaro, Pasquale; Montuori, Paolo; Di Renzo, Gianfranco; Canzoniero, Lorella M T; Formisano, Luigi

2017-10-15

Our previous study showed that the environmental neurotoxicant non-dioxin-like polychlorinated biphenyl (PCB)-95 increases RE1-silencing transcription factor (REST) expression, which is related to necrosis, but not apoptosis, of neurons. Meanwhile, necroptosis is a type of a programmed necrosis that is positively regulated by receptor interacting protein kinase 1 (RIPK1), RIPK3 and mixed lineage kinase domain-like (MLKL) and negatively regulated by caspase-8. Here we evaluated whether necroptosis contributes to PCB-95-induced neuronal death through REST up-regulation. Our results demonstrated that in cortical neurons PCB-95 increased RIPK1, RIPK3, and MLKL expression and decreased caspase-8 at the gene and protein level. Furthermore, the RIPK1 inhibitor necrostatin-1 or siRNA-mediated RIPK1, RIPK3 and MLKL expression knockdown significantly reduced PCB-95-induced neuronal death. Intriguingly, PCB-95-induced increases in RIPK1, RIPK3, MLKL expression and decreases in caspase-8 expression were reversed by knockdown of REST expression with a REST-specific siRNA (siREST). Notably, in silico analysis of the rat genome identified a REST consensus sequence in the caspase-8 gene promoter (Casp8-RE1), but not the RIPK1, RIPK3 and MLKL promoters. Interestingly, in PCB-95-treated neurons, REST binding to the Casp8-RE1 sequence increased in parallel with a reduction in its promoter activity, whereas under the same experimental conditions, transfection of siREST or mutation of the Casp8-RE1 sequence blocked PCB-95-induced caspase-8 reduction. Since RIPK1, RIPK3 and MLKL rat genes showed no putative REST binding site, we assessed whether the transcription factor cAMP Responsive Element Binding Protein (CREB), which has a consensus sequence in all three genes, affected neuronal death. In neurons treated with PCB-95, CREB protein expression decreased in parallel with a reduction in binding to the RIPK1, RIPK3 and MLKL gene promoter sequence. 
Furthermore, CREB overexpression was associated with reduced promoter activity of the RIPK1, RIPK3 and MLKL genes. Collectively, these results indicate that PCB-95 was associated with REST-induced necroptotic cell death by increasing RIPK1, RIPK3 and MLKL expression and reducing caspase-8 levels. In addition, since REST is involved in several neurological disorders, therapies that block REST-induced necroptosis could be a new strategy to reverse the neurodetrimental effects associated with its overexpression. Copyright © 2017 Elsevier Inc. All rights reserved.
Direct inhibition of the DNA-binding activity of POU transcription factors Pit-1 and Brn-3 by selective binding of a phenyl-furan-benzimidazole dication.

PubMed

Peixoto, Paul; Liu, Yang; Depauw, Sabine; Hildebrand, Marie-Paule; Boykin, David W; Bailly, Christian; Wilson, W David; David-Cordonnier, Marie-Hélène

2008-06-01

The development of small molecules to control gene expression could be the spearhead of future-targeted therapeutic approaches in multiple pathologies. Among heterocyclic dications developed with this aim, a phenyl-furan-benzimidazole dication DB293 binds AT-rich sites as a monomer and 5'-ATGA sequence as a stacked dimer, both in the minor groove. Here, we used a protein/DNA array approach to evaluate the ability of DB293 to specifically inhibit transcription factors DNA-binding in a single-step, competitive mode. DB293 inhibits two POU-domain transcription factors Pit-1 and Brn-3 but not IRF-1, despite the presence of an ATGA and AT-rich sites within all three consensus sequences. EMSA, DNase I footprinting and surface-plasmon-resonance experiments determined the precise binding site, affinity and stoichiometry of DB293 interaction to the consensus targets. Binding of DB293 occurred as a cooperative dimer on the ATGA part of Brn-3 site but as two monomers on AT-rich sites of IRF-1 sequence. For Pit-1 site, ATGA or AT-rich mutated sequences identified the contribution of both sites for DB293 recognition. In conclusion, DB293 is a strong inhibitor of two POU-domain transcription factors through a cooperative binding to ATGA. These findings are the first to show that heterocyclic dications can inhibit major groove transcription factors and they open the door to the control of transcription factors activity by those compounds.
Trans splicing in Leishmania enriettii and identification of ribonucleoprotein complexes containing the spliced leader and U2 equivalent RNAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, S.I.; Wirth, D.F.

1988-06-01

The 5' ends of Leishmania mRNAs contain an identical 35-nucleotide sequence termed the spliced leader (SL) or 5' mini-exon. The SL sequence is at the 5' end of an 85-nucleotide primary transcript that contains a consensus eucaryotic 5' intron-exon splice junction immediately 3' to the SL. The SL is added to protein-coding genes immediately 3' to a consensus eucaryotic 3' intron-exon splice junction. The authors' previous work demonstrated possible intermediates in discontinuous mRNA processing that contain the 50 nucleotides of the SL primary transcript 3' to the SL, the SL intron sequence (SLIS). These RNAs have a 5' terminus at the splice junction of the SL and the SLIS. The authors examined a Leishmania nuclear extract for these RNAs in ribonucleoprotein (RNP) particles. Density centrifugation analysis showed that the SL RNA is predominantly in RNP complexes at 60S, while the SLIS-containing RNAs are in complexes at 40S. They also demonstrated that the SLIS can be released from polyadenylated RNA by incubation with a HeLa cell extract containing debranching enzymatic activity. These data suggested that Leishmania enriettii mRNAs are assembled by bimolecular or trans splicing as has been recently demonstrated for Trypanosoma brucei. Furthermore, they determined the partial sequence of the Leishmania U2 equivalent RNA and demonstrated that it cosediments with the SL RNA at 60S in a nuclear extract. These RNP particles may be analogous to so-called spliceosomes that have been demonstrated in other systems.
A High-Density Consensus Map of Common Wheat Integrating Four Mapping Populations Scanned by the 90K SNP Array

PubMed Central

Wen, Weie; He, Zhonghu; Gao, Fengmei; Liu, Jindong; Jin, Hui; Zhai, Shengnan; Qu, Yanying; Xia, Xianchun

2017-01-01

A high-density consensus map is a powerful tool for gene mapping, cloning and molecular marker-assisted selection in wheat breeding. The objective of this study was to construct a high-density, single nucleotide polymorphism (SNP)-based consensus map of common wheat (Triticum aestivum L.) by integrating genetic maps from four recombinant inbred line populations. The populations were each genotyped using the wheat 90K Infinium iSelect SNP assay. A total of 29,692 SNP markers were mapped on 21 linkage groups corresponding to 21 hexaploid wheat chromosomes, covering 2,906.86 cM, with an overall marker density of 10.21 markers/cM. Comparison with previous maps based on the wheat 90K SNP chip detected 22,736 (76.6%) SNPs with consistent chromosomal locations, whereas 1,974 (6.7%) showed different chromosomal locations, and 4,982 (16.8%) were newly mapped. Alignment of the present consensus map and the wheat expressed sequence tags (ESTs) Chromosome Bin Map enabled assignment of 1,221 SNP markers to specific chromosome bins, and 819 ESTs were integrated into the consensus map. The marker orders of the consensus map were validated based on physical positions on the wheat genome, with Spearman rank correlation coefficients ranging from 0.69 (4D) to 0.97 (1A, 4B, 5B, and 6A), and were also confirmed by comparison with genetic positions on the previous 40K SNP consensus map, with Spearman rank correlation coefficients ranging from 0.84 (6D) to 0.99 (6A). Chromosomal rearrangements reported previously were confirmed in the present consensus map and new putative rearrangements were identified. In addition, an integrated consensus map was developed through the combination of five published maps with ours, containing 52,607 molecular markers. The consensus map described here provided a high-density SNP marker map and a reliable order of SNPs, representing a step forward in mapping and validation of chromosomal locations of SNPs on the wheat 90K array. 
Moreover, it can be used as a reference for quantitative trait loci (QTL) mapping to facilitate exploitation of genes and QTL in wheat breeding. PMID:28848588
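
The reported marker density, and the kind of rank-correlation check used to validate marker order, can be sketched as follows. The density calculation uses the figures from the abstract; the Spearman implementation (average ranks for ties) and the toy marker positions are assumptions for illustration, not the authors' software.

```python
import math


def marker_density(n_markers, map_length_cm):
    """Markers per centimorgan."""
    return n_markers / map_length_cm


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


if __name__ == "__main__":
    print(round(marker_density(29692, 2906.86), 2))
    genetic_cm = [0.0, 1.2, 3.5, 7.9, 8.4]       # hypothetical genetic positions
    physical_mb = [0.4, 2.1, 1.9, 15.0, 16.3]    # hypothetical physical positions
    print(round(spearman(genetic_cm, physical_mb), 3))
```

A coefficient near 1 for a chromosome indicates that the genetic order of markers largely agrees with their physical order.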
Complete genome sequences of two strains of Treponema pallidum subsp. pertenue from Ghana, Africa: Identical genome sequences in samples isolated more than 7 years apart.

PubMed

Strouhal, Michal; Mikalová, Lenka; Havlíčková, Pavla; Tenti, Paolo; Čejková, Darina; Rychlík, Ivan; Bruisten, Sylvia; Šmajs, David

2017-09-01

Treponema pallidum subsp. pertenue (TPE) is the causative agent of yaws, a multi-stage disease, endemic in tropical regions of Africa, Asia, Oceania, and South America. To date, four TPE strains have been completely sequenced including three TPE strains of human origin (Samoa D, CDC-2, and Gauthier) and one TPE strain (Fribourg-Blanc) isolated from a baboon. All TPE strains are highly similar to T. pallidum subsp. pallidum (TPA) strains. The mutation rate in syphilis and related treponemes has not been experimentally determined yet. Complete genomes of two TPE strains, CDC 2575 and Ghana-051, that infected patients in Ghana and were isolated in 1980 and 1988, respectively, were sequenced and analyzed. Both strains had identical consensus genome nucleotide sequences raising the question whether TPE CDC 2575 and Ghana-051 represent two different strains. Several lines of evidence support the fact that both strains represent independent samples including regions showing intrastrain heterogeneity (13 and 5 intrastrain heterogeneous sites in TPE Ghana-051 and TPE CDC 2575, respectively). Four of these heterogeneous sites were found in both genomes but the frequency of alternative alleles differed. The identical consensus genome sequences were used to estimate the upper limit of the yaws treponeme evolution rate, which was 4.1 × 10⁻¹⁰ nucleotide changes per site per generation. The estimated upper limit for the mutation rate of TPE was slightly lower than the mutation rate of E. coli, which was determined during a long-term experiment. Given the known diversity between TPA and TPE genomes and the assumption that both TPA and TPE have a similar mutation rate, the most recent common ancestor of syphilis and yaws treponemes appears to be more than ten thousand years old and likely even older.
Isolation of nucleotide binding site-leucine rich repeat and kinase resistance gene analogues from sugarcane (Saccharum spp.).

PubMed

Glynn, Neil C; Comstock, Jack C; Sood, Sushma G; Dang, Phat M; Chaparro, Jose X

2008-01-01

Resistance gene analogues (RGAs) have been isolated from many crops and offer potential in breeding for disease resistance through marker-assisted selection, either as closely linked or as perfect markers. Many R-gene sequences contain kinase domains, and indeed kinase genes have been reported as being proximal to R-genes, making kinase analogues an additionally promising target. The first step towards utilizing RGAs as markers for disease resistance is isolation and characterization of the sequences. Sugarcane clone US01-1158 was identified as resistant to yellow leaf caused by the sugarcane yellow leaf virus (SCYLV) and moderately resistant to rust caused by Puccinia melanocephala Sydow & Sydow. Degenerate primers that had previously proved useful for isolating RGAs and kinase analogues in wheat and soybean were used to amplify DNA from sugarcane (Saccharum spp.) clone US-01-1158. Sequences generated from 1512 positive clones were assembled into 134 contigs of between two and 105 sequences. Comparison of the contig consensuses with the NCBI sequence database using BLASTx showed that 20 had sequence homology to nucleotide binding site and leucine rich repeat (NBS-LRR) RGAs, and eight to kinase genes. Alignment of the deduced amino acid sequences with similar sequences from the NCBI database allowed the identification of several conserved domains. The alignment and resulting phenetic tree showed that many of the sequences had greater similarity to sequences from other species than to one another. The use of degenerate primers is a useful method for isolating novel sugarcane RGA and kinase gene analogues. Further studies are needed to evaluate the role of these genes in disease resistance.
DArT Markers Effectively Target Gene Space in the Rye Genome

PubMed Central

Gawroński, Piotr; Pawełkowicz, Magdalena; Tofil, Katarzyna; Uszyński, Grzegorz; Sharifova, Saida; Ahluwalia, Shivaksh; Tyrka, Mirosław; Wędzony, Maria; Kilian, Andrzej; Bolibok-Brągoszewska, Hanna

2016-01-01

Large genome size and complexity hamper considerably the genomics research in relevant species. Rye (Secale cereale L.) has one of the largest genomes among cereal crops and repetitive sequences account for over 90% of its length. Diversity Arrays Technology is a high-throughput genotyping method, in which a preferential sampling of gene-rich regions is achieved through the use of methylation sensitive restriction enzymes. We obtained sequences of 6,177 rye DArT markers and following a redundancy analysis assembled them into 3,737 non-redundant sequences, which were then used in homology searches against five Pooideae sequence sets. In total 515 DArT sequences could be incorporated into publicly available rye genome zippers providing a starting point for the integration of DArT- and transcript-based genomics resources in rye. Using Blast2Go pipeline we attributed putative gene functions to 1101 (29.4%) of the non-redundant DArT marker sequences, including 132 sequences with putative disease resistance-related functions, which were found to be preferentially located in the 4RL and 6RL chromosomes. Comparative analysis based on the DArT sequences revealed obvious inconsistencies between two recently published high density consensus maps of rye. Furthermore we demonstrated that DArT marker sequences can be a source of SSR polymorphisms. Obtained data demonstrate that DArT markers effectively target gene space in the large, complex, and repetitive rye genome. Through the annotation of putative gene functions and the alignment of DArT sequences relative to reference genomes we obtained information, that will complement the results of the studies, where DArT genotyping was deployed, by simplifying the gene ontology and microcolinearity based identification of candidate genes. PMID:27833625
DArT Markers Effectively Target Gene Space in the Rye Genome.

PubMed

Gawroński, Piotr; Pawełkowicz, Magdalena; Tofil, Katarzyna; Uszyński, Grzegorz; Sharifova, Saida; Ahluwalia, Shivaksh; Tyrka, Mirosław; Wędzony, Maria; Kilian, Andrzej; Bolibok-Brągoszewska, Hanna

2016-01-01

Large genome size and complexity hamper considerably the genomics research in relevant species. Rye (Secale cereale L.) has one of the largest genomes among cereal crops and repetitive sequences account for over 90% of its length. Diversity Arrays Technology is a high-throughput genotyping method, in which a preferential sampling of gene-rich regions is achieved through the use of methylation sensitive restriction enzymes. We obtained sequences of 6,177 rye DArT markers and following a redundancy analysis assembled them into 3,737 non-redundant sequences, which were then used in homology searches against five Pooideae sequence sets. In total 515 DArT sequences could be incorporated into publicly available rye genome zippers providing a starting point for the integration of DArT- and transcript-based genomics resources in rye. Using Blast2Go pipeline we attributed putative gene functions to 1101 (29.4%) of the non-redundant DArT marker sequences, including 132 sequences with putative disease resistance-related functions, which were found to be preferentially located in the 4RL and 6RL chromosomes. Comparative analysis based on the DArT sequences revealed obvious inconsistencies between two recently published high density consensus maps of rye. Furthermore we demonstrated that DArT marker sequences can be a source of SSR polymorphisms. Obtained data demonstrate that DArT markers effectively target gene space in the large, complex, and repetitive rye genome. Through the annotation of putative gene functions and the alignment of DArT sequences relative to reference genomes we obtained information, that will complement the results of the studies, where DArT genotyping was deployed, by simplifying the gene ontology and microcolinearity based identification of candidate genes.
de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer

PubMed Central

Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph

2017-01-01

Abstract Background: Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Results: Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Conclusion: Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. PMID:28369459
Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

PubMed Central

Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

2003-01-01

To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
SeqMule: automated pipeline for analysis of human exome/genome sequencing data.

PubMed

Guo, Yunfei; Ding, Xiaolei; Shen, Yufeng; Lyon, Gholson J; Wang, Kai

2015-09-18

Next-generation sequencing (NGS) technology has greatly helped us identify disease-contributory variants for Mendelian diseases. However, users are often faced with issues such as software compatibility, complicated configuration, and lack of access to a high-performance computing facility. Discrepancies exist among aligners and variant callers. We developed a computational pipeline, SeqMule, to perform automated variant calling from NGS data on human genomes and exomes. SeqMule integrates computational-cluster-free parallelization capability built on top of the variant callers, and facilitates normalization/intersection of variant calls to generate a consensus set with high confidence. SeqMule integrates 5 alignment tools and 5 variant calling algorithms, and accepts various combinations of them in a one-line command, therefore allowing highly flexible yet fully automated variant calling. On a modern machine (2 Intel Xeon X5650 CPUs, 48 GB memory), when fast turn-around is needed, SeqMule generates annotated VCF files in a day from a 30X whole-genome sequencing data set; when more accurate calling is needed, SeqMule generates a consensus call set that improves over single callers, as measured by both Mendelian error rate and consistency. SeqMule supports Sun Grid Engine for parallel processing, offers a turn-key solution for deployment on Amazon Web Services, and allows quality checks, Mendelian error checks, consistency evaluation, and HTML-based reports. SeqMule is available at http://seqmule.openbioinformatics.org.
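
The idea of intersecting normalized calls from several variant callers to obtain a consensus set can be sketched as below. This is not SeqMule's implementation or command-line interface: variants are assumed to be already normalized and are represented as (chromosome, position, ref, alt) tuples, and the caller names and function name are hypothetical.

```python
from collections import Counter


def consensus_calls(call_sets, min_callers):
    """Variants reported by at least `min_callers` of the callers.

    `call_sets` maps a caller name to an iterable of normalized variants.
    Setting `min_callers` to the number of callers gives a strict intersection.
    """
    support = Counter(
        variant for calls in call_sets.values() for variant in set(calls)
    )
    return {variant for variant, n in support.items() if n >= min_callers}


if __name__ == "__main__":
    # Hypothetical calls from three callers.
    calls = {
        "caller_a": [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")],
        "caller_b": [("chr1", 100, "A", "G"), ("chr2", 42, "G", "A")],
        "caller_c": [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")],
    }
    print(sorted(consensus_calls(calls, min_callers=2)))
    print(sorted(consensus_calls(calls, min_callers=3)))
```

Raising `min_callers` trades sensitivity for confidence, which is the trade-off the abstract describes between fast turn-around and more accurate calling.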
A variable DNA recognition site organization establishes the LiaR-mediated cell envelope stress response of enterococci to daptomycin

DOE PAGES

Davlieva, Milya; Shi, Yiwen; Leonard, Paul G.; ...

2015-04-19

LiaR is a ‘master regulator’ of the cell envelope stress response in enterococci and many other Gram-positive organisms. Mutations to liaR can lead to antibiotic resistance to a variety of antibiotics including the cyclic lipopeptide daptomycin. LiaR is phosphorylated in response to membrane stress to regulate downstream target operons. Using DNA footprinting of the regions upstream of the liaXYZ and liaFSR operons we show that LiaR binds an extended stretch of DNA that extends beyond the proposed canonical consensus sequence suggesting a more complex level of regulatory control of target operons. We go on to determine the biochemical and structural basis for increased resistance to daptomycin by the adaptive mutation to LiaR (D191N) first identified from the pathogen Enterococcus faecalis S613. LiaR D191N increases oligomerization of LiaR to form a constitutively activated tetramer that has high affinity for DNA even in the absence of phosphorylation leading to increased resistance. The crystal structures of the LiaR DNA binding domain complexed to the putative consensus sequence as well as an adjoining secondary sequence show that upon binding, LiaR induces DNA bending that is consistent with increased recruitment of RNA polymerase to the transcription start site and upregulation of target operons.
Quick, sensitive and specific detection and evaluation of quantification of minor variants by high-throughput sequencing.

PubMed

Leung, Ross Ka-Kit; Dong, Zhi Qiang; Sa, Fei; Chong, Cheong Meng; Lei, Si Wan; Tsui, Stephen Kwok-Wing; Lee, Simon Ming-Yuen

2014-02-01

Minor variants have significant implications in quasispecies evolution, early cancer detection and non-invasive fetal genotyping but their accurate detection by next-generation sequencing (NGS) is hampered by sequencing errors. We generated sequencing data from mixtures at predetermined ratios in order to provide insight into sequencing errors and variations that can arise for which simulation cannot be performed. The information also enables better parameterization in depth of coverage, read quality and heterogeneity, library preparation techniques, technical repeatability for mathematical modeling, theory development and simulation experimental design. We devised minor variant authentication rules that achieved 100% accuracy in both testing and validation experiments. The rules are free from tedious inspection of alignment accuracy, sequencing read quality or errors introduced by homopolymers. The authentication processes only require minor variants to: (1) have minimum depth of coverage larger than 30; (2) be reported by (a) four or more variant callers, or (b) DiBayes or LoFreq, plus SNVer (or BWA when no results are returned by SNVer), and with the interassay coefficient of variation (CV) no larger than 0.1. Quantification accuracy undermined by sequencing errors could neither be overcome by ultra-deep sequencing, nor recruiting more variant callers to reach a consensus, such that consistent underestimation and overestimation (i.e. low CV) were observed. To accommodate stochastic error and adjust the observed ratio within a specified accuracy, we presented a proof of concept for the use of a double calibration curve for quantification, which provides an important reference towards potential industrial-scale fabrication of calibrants for NGS.
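
The authentication rules listed above can be written as a small predicate. This is a sketch of one reading of the abstract, not the authors' code: it assumes the coefficient-of-variation limit applies to both branch (a) and branch (b), that "larger than 30" is a strict inequality, and that the BWA fallback applies only when SNVer returned no results at all. The function and parameter names are hypothetical.

```python
def authenticate_minor_variant(depth, callers, cv, snver_returned_results=True):
    """Return True if a minor variant passes the rules as read from the abstract.

    depth   -- minimum depth of coverage at the site
    callers -- names of the variant callers that reported the variant
    cv      -- interassay coefficient of variation
    snver_returned_results -- False when SNVer produced no output for the run
    """
    callers = set(callers)
    if depth <= 30:          # rule (1): depth must be larger than 30
        return False
    if cv > 0.1:             # assumed to apply to both (a) and (b)
        return False
    if len(callers) >= 4:    # rule (2a): four or more callers agree
        return True
    # rule (2b): DiBayes or LoFreq, plus SNVer (or BWA if SNVer gave no results)
    primary = bool(callers & {"DiBayes", "LoFreq"})
    secondary = "SNVer" in callers or (
        not snver_returned_results and "BWA" in callers
    )
    return primary and secondary
```

For example, `authenticate_minor_variant(50, {"LoFreq", "SNVer"}, 0.05)` passes under this reading, while the same call with a depth of 30 does not.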
Comparison of Silent and Conventional MR Imaging for the Evaluation of Myelination in Children

PubMed Central

Matsuo-Hagiyama, Chisato; Watanabe, Yoshiyuki; Tanaka, Hisashi; Takahashi, Hiroto; Arisawa, Atsuko; Yoshioka, Eri; Nabatame, Shin; Nakano, Sayaka; Tomiyama, Noriyuki

2017-01-01

Purpose: Silent magnetic resonance imaging (MRI) scans produce reduced acoustic noise and are considered more gentle for sedated children. The aim of this study was to compare the validity of T1- (T1W) and T2-weighted (T2W) silent sequences for myelination assessment in children with conventional spin-echo sequences. Materials and Methods: A total of 30 children (21 boys, 9 girls; age range: 1–83 months, mean age: 35.5 months, median age: 28.5 months) were examined using both silent and spin-echo sequences. Acoustic noise levels were analyzed and compared. The degree of myelination was qualitatively assessed via consensus, and T1W and T2W signal intensities were quantitatively measured by percent contrast. Results: Acoustic noise levels were significantly lower during silent sequences than during conventional sequences (P < 0.0001 for both T1W and T2W). Inter-method comparison indicated overall good to excellent agreement (T1W and T2W images, κ = 0.76 and 0.80, respectively); however, agreement was poor for cerebellar myelination on T1W images (κ = 0.14). The percent contrast of silent and conventional MRI sequences had a strong correlation (T1W, correlation coefficient [CC] = 0.76; T1W excluding the middle cerebellar peduncle, CC = 0.82; T2W, CC = 0.91). Conclusions: For brain MRI, silent sequences significantly reduced acoustic noise and provided diagnostic image quality for myelination evaluations; however, the two methods differed with respect to cerebellar delineation on T1W sequences. PMID:27795484
Analysis of the regulatory region of the protease III (ptr) gene of Escherichia coli K-12.

PubMed

Claverie-Martin, F; Diaz-Torres, M R; Kushner, S R

1987-01-01

The ptr gene of Escherichia coli encodes protease III (Mr 110,000) and a 50-kDa polypeptide, both of which are found in the periplasmic space. The gene is physically located between the recC and recB loci on the E. coli chromosome. The nucleotide sequence of a 1167-bp EcoRV-ClaI fragment of chromosomal DNA containing the promoter region and 885 bp of the ptr coding sequence has been determined. S1 nuclease mapping analysis showed that the major 5' end of the ptr mRNA was localized 127 bp upstream from the ATG start codon. The open reading frame (ORF), preceded by a Shine-Dalgarno sequence, extends to the end of the sequenced DNA. Downstream from the -35 and -10 regions is a sequence that strongly fits the consensus sequence of known nitrogen-regulated promoters. A signal peptide of 23 amino acids residues is present at the N terminus of the derived amino acid sequence. The cleavage site as well as the ORF were confirmed by sequencing the N terminus of mature protease III.

Brain cDNA clone for human cholinesterase

DOE Office of Scientific and Technical Information (OSTI.GOV)

McTiernan, C.; Adkins, S.; Chatonnet, A.

1987-10-01

A cDNA library from human basal ganglia was screened with oligonucleotide probes corresponding to portions of the amino acid sequence of human serum cholinesterase. Five overlapping clones, representing 2.4 kilobases, were isolated. The sequenced cDNA contained 207 base pairs of coding sequence 5' to the amino terminus of the mature protein in which there were four ATG translation start sites in the same reading frame as the protein. Only the ATG coding for Met-(-28) lay within a favorable consensus sequence for functional initiators. There were 1722 base pairs of coding sequence corresponding to the protein found circulating in human serum. The amino acid sequence deduced from the cDNA exactly matched the 574 amino acid sequence of human serum cholinesterase, as previously determined by Edman degradation. Therefore, our clones represented cholinesterase rather than acetylcholinesterase. It was concluded that the amino acid sequences of cholinesterase from two different tissues, human brain and human serum, were identical. Hybridization of genomic DNA blots suggested that a single gene, or very few genes coded for cholinesterase.
Exit, cohesion, and consensus: social psychological moderators of consensus among adolescent peer groups

PubMed Central

Fisher, Jacob C.

2017-01-01

Virtually all social diffusion work relies on a common formal basis, which predicts that consensus will develop among a connected population as the result of diffusion. In spite of the popularity of social diffusion models that predict consensus, few empirical studies examine consensus, or a clustering of attitudes, directly. Those that do either focus on the coordinating role of strict hierarchies, or on the results of online experiments, and do not consider how consensus occurs among groups in situ. This study uses longitudinal data on adolescent social networks to show how meso-level social structures, such as informal peer groups, moderate the process of consensus formation. Using a novel method for controlling for selection into a group, I find that centralized peer groups, meaning groups with clear leaders, have very low levels of consensus, while cohesive peer groups, meaning groups where more ties hold the members of the group together, have very high levels of consensus. This finding is robust to two different measures of cohesion and consensus. This suggests that consensus occurs either through central leaders’ enforcement or through diffusion of attitudes, but that central leaders have limited ability to enforce when people can leave the group easily. PMID:29335675
Reaching consensus on communication of critical laboratory results using a collective intelligence method.

PubMed

Llovet, Maria Isabel; Biosca, Carmen; Martínez-Iribarren, Alicia; Blanco, Aurora; Busquets, Glòria; Castro, María José; Llopis, Maria Antonia; Montesinos, Mercè; Minchinela, Joana; Perich, Carme; Prieto, Judith; Ruiz, Rosa; Serrat, Núria; Simón, Margarita; Trejo, Alex; Monguet, Josep Maria; López-Pablo, Carlos; Ibarz, Mercè

2018-02-23

There is no consensus in the literature about what analytes or values should be reported as critical results and how they should be communicated. The main aim of this project is to establish consensual standards of critical results for the laboratories participating in the study. Among the project's secondary objectives, establishing consensual procedures for communication can be highlighted. Consensus was reached among all participating laboratories establishing the basis for the construction of the initial model put forward for consensus in conjunction with the clinicians. A real-time Delphi methodology, "health consensus" (HC), with motivating and participative questions was applied. The physician was expected to choose a numeric value within a scale designed for each analyte. The medians of critical results obtained represent the consensus on critical results for outpatient and inpatient care. Both in primary care and in hospital care a high degree of consensus was observed for critical values proposed in the analysis of creatinine, digoxin, phosphorus, glucose, international normalized ratio (INR), leukocytes, magnesium, neutrophils, chloride, sodium, calcium and lithium. For the rest of critical results the degree of consensus obtained was "medium high". The results obtained showed that in 72% of cases the consensual critical value coincided with the medians initially proposed by the laboratories. The real-time Delphi has allowed obtaining consensual standards for communication of critical results among the laboratories participating in the study, which can serve as a basis for other organizations.
Characterization of kinetoplast DNA from Phytomonas serpens.

PubMed

Sá-Carvalho, D; Perez-Morga, D; Traub-Cseko, Y M

1993-01-01

The restriction enzyme digestion of kinetoplast DNA from four Phytomonas serpens isolates shows an overall similar band pattern. One minicircle from isolate 30T was cloned and sequenced, showing low levels of homology but the same general features and organization as described for minicircles of other trypanosomatids. Extensive regions of the minicircle are composed of G and T on the H strand. These regions are very repetitive and similar to regions in a minicircle of Crithidia oncopelti and to telomeric sequences of Saccharomyces cerevisiae. Conserved Sequence Block 3, present in all trypanosomatids, is one nucleotide different from the consensus in P. serpens and provides a basis to differentiate P. serpens from other trypanosomatids. Electron microscopy of kinetoplast DNA evidenced a network with organization similar to other trypanosomatids and the measurement of minicircles confirmed the size of about 1.45 kb of the sequenced minicircle.
Development of an oligonucleotide probe for Aureobasidium pullulans based on the small-subunit rRNA gene.

PubMed Central

Li, S; Cullen, D; Hjort, M; Spear, R; Andrews, J H

1996-01-01

Aureobasidium pullulans, a cosmopolitan yeast-like fungus, colonizes leaf surfaces and has potential as a biocontrol agent of pathogens. To assess the feasibility of rRNA as a target for A. pullulans-specific oligonucleotide probes, we compared the nucleotide sequences of the small-subunit rRNA (18S) genes of 12 geographically diverse A. pullulans strains. Extreme sequence conservation was observed. The consensus A. pullulans sequence was compared with other fungal sequences to identify potential probes. A 21-mer probe which hybridized to the 12 A. pullulans strains but not to 98 other fungi, including 82 isolates from the phylloplane, was identified. A 17-mer highly specific for Cladosporium herbarum was also identified. These probes have potential in monitoring and quantifying fungi in leaf surface and other microbial communities. PMID:8633850
Phylogenetic Analysis of Prevalent Tuberculosis and Non-Tuberculosis Mycobacteria in Isfahan, Iran, Based on a 360 bp Sequence of the rpoB Gene

PubMed Central

Nasr Esfahani, Bahram; Moghim, Sharareh; Ghasemian Safaei, Hajieh; Moghoofei, Mohsen; Sedighi, Mansour; Hadifar, Shima

2016-01-01

Background Taxonomic and phylogenetic studies of Mycobacterium species have been based around the 16S rRNA gene for many years. However, due to the high strain similarity between species in the Mycobacterium genus (94.3% - 100%), defining a valid phylogenetic tree is difficult; consequently, its use in estimating the boundaries between species is limited. The sequence of the rpoB gene makes it an appropriate gene for phylogenetic analysis, especially in bacteria with limited variation. Objectives In the present study, a 360bp sequence of rpoB was used for precise classification of Mycobacterium strains isolated in Isfahan, Iran. Materials and Methods From February to October 2013, 57 clinical and environmental isolates were collected, subcultured, and identified by phenotypic methods. After DNA extraction, a 360bp fragment was PCR-amplified and sequenced. The phylogenetic tree was constructed based on consensus sequence data, using MEGA5 software. Results Slow and fast-growing groups of the Mycobacterium strains were clearly differentiated based on the constructed tree of 56 common Mycobacterium isolates. Each species with a unique title in the tree was identified; in total, 13 nodes with a bootstrap value of over 50% were supported. Among the slow-growing group was Mycobacterium kansasii, with M. tuberculosis in a cluster with a bootstrap value of 98% and M. gordonae in another cluster with a bootstrap value of 90%. In the fast-growing group, one cluster with a bootstrap value of 89% was defined, including all fast-growing members present in this study. Conclusions The results suggest that only the application of the rpoB gene sequence is sufficient for taxonomic categorization and definition of a new Mycobacterium species, due to its high resolution power and proper variation in its sequence (85% - 100%); the resulting tree has high validity. PMID:27284397
Identification of a penicillin-sensitive carboxypeptidase in the cellular slime mold Dictyostelium discoideum.

PubMed

Yasukawa, Hiro; Kuroita, Toshihiro; Tamura, Kentaro; Yamaguchi, Kazuo

2003-07-01

Penicillin binding proteins (PBPs) are penicillin-sensitive DD-peptidases catalyzing the terminal stages of bacterial cell wall assembly. We identified a Dictyostelium discoideum gene that encodes a protein of 522 amino acids showing similarity to Escherichia coli PBP4. The D. discoideum protein conserves three consensus sequences (SXXK, SXN and KTG) that are responsible for the catalytic activities of PBPs. The gene product prepared in the cell-free translation system showed carboxypeptidase activity but the activity was not detected in the presence of penicillin G. These results demonstrate that the D. discoideum gene encodes a eukaryotic form of penicillin-sensitive carboxypeptidase.
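
The three consensus motifs named above (SXXK, SXN and KTG, where X stands for any residue) can be located in a protein sequence with regular expressions. The sketch below is illustrative only: the function name `find_motifs` and the toy sequence are hypothetical and are not taken from the D. discoideum protein.

```python
import re

# X in the consensus notation means "any amino acid", written as "." here.
MOTIFS = {"SXXK": r"S..K", "SXN": r"S.N", "KTG": r"KTG"}


def find_motifs(protein):
    """Return 1-based start positions of each motif in a one-letter sequence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead reports overlapping occurrences as well.
        hits[name] = [
            m.start() + 1 for m in re.finditer(f"(?={pattern})", protein)
        ]
    return hits


toy = "MASTAKGGSANLLKTGA"  # made-up sequence for illustration
motif_hits = find_motifs(toy)
```

A motif match alone does not establish catalytic function; as in the abstract, activity still has to be confirmed experimentally.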
First Brazilian Consensus of Advanced Prostate Cancer: Recommendations for Clinical Practice.

PubMed

Sasse, Andre Deeke; Wiermann, Evanius Garcia; Herchenhorn, Daniel; Bastos, Diogo Assed; Schutz, Fabio A; Maluf, Fernando Cotait; Coura, George; Morbeck, Igor Alexandre Protzner; Cerci, Juliano J; Smaletz, Oren; Lima, Volney Soares; Adamy, Ari; Campos, Franz Santos de; Carvalhal, Gustavo Franco; Cezar, Leandro Casemiro; Dall'Oglio, Marcos Francisco; Sadi, Marcus Vinicius; Reis, Rodolfo Borges Dos; Nogueira, Lucas

2017-01-01

Prostate cancer still represents a major cause of morbidity, and still about 20% of men with the disease are diagnosed or will progress to the advanced stage without the possibility of curative treatment. Despite the recent advances in scientific and technological knowledge and the availability of new therapies, there is still considerable heterogeneity in the therapeutic approaches for metastatic prostate cancer. This article presents a summary of the I Brazilian Consensus on Advanced Prostate Cancer, conducted by the Brazilian Society of Urology and Brazilian Society of Clinical Oncology. Experts were selected by the medical societies involved. Forty issues regarding controversial issues in advanced disease were previously elaborated. The panel met for consensus, with a threshold established for 2/3 of the participants. The treatment of advanced prostate cancer is complex, due to the existence of a large number of therapies, with different response profiles and toxicities. The panel addressed recommendations on preferred choice of therapies, indicators that would justify their change, and indicated some strategies for better sequencing of treatment in order to maximize the potential for disease control with the available therapeutic arsenal. The lack of consensus on some topics clearly indicates the absence of strong evidence for some decisions. Copyright® by the International Brazilian Journal of Urology.
CABS-fold: Server for the de novo and consensus-based prediction of protein structure.

PubMed

Blaszczyk, Maciej; Jamroz, Michal; Kmiecik, Sebastian; Kolinski, Andrzej

2013-07-01

The CABS-fold web server provides tools for protein structure prediction from sequence only (de novo modeling) and also using alternative templates (consensus modeling). The web server is based on the CABS modeling procedures ranked in previous Critical Assessment of techniques for protein Structure Prediction competitions as one of the leading approaches for de novo and template-based modeling. Except for template data, fragmentary distance restraints can also be incorporated into the modeling process. The web server output is a coarse-grained trajectory of generated conformations, its Jmol representation and predicted models in all-atom resolution (together with accompanying analysis). CABS-fold can be freely accessed at http://biocomp.chem.uw.edu.pl/CABSfold.
CABS-fold: server for the de novo and consensus-based prediction of protein structure

PubMed Central

Blaszczyk, Maciej; Jamroz, Michal; Kmiecik, Sebastian; Kolinski, Andrzej

2013-01-01

The CABS-fold web server provides tools for protein structure prediction from sequence only (de novo modeling) and also using alternative templates (consensus modeling). The web server is based on the CABS modeling procedures ranked in previous Critical Assessment of techniques for protein Structure Prediction competitions as one of the leading approaches for de novo and template-based modeling. Except for template data, fragmentary distance restraints can also be incorporated into the modeling process. The web server output is a coarse-grained trajectory of generated conformations, its Jmol representation and predicted models in all-atom resolution (together with accompanying analysis). CABS-fold can be freely accessed at http://biocomp.chem.uw.edu.pl/CABSfold. PMID:23748950
Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fields, C.A.

1996-06-01

The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human, and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler) has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0 has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.
Reducing false-positive incidental findings with ensemble genotyping and logistic regression based variant filtering methods.

PubMed

Hwang, Kyu-Baek; Lee, In-Hee; Park, Jin-Ho; Hambuch, Tina; Choe, Yongjoon; Kim, MinHyeok; Lee, Kyungjoon; Song, Taemin; Neu, Matthew B; Gupta, Neha; Kohane, Isaac S; Green, Robert C; Kong, Sek Won

2014-08-01

As whole genome sequencing (WGS) uncovers variants associated with rare and common diseases, an immediate challenge is to minimize false-positive findings due to sequencing and variant calling errors. False positives can be reduced by combining results from orthogonal sequencing methods, but this is costly. Here, we present variant filtering approaches using logistic regression (LR) and ensemble genotyping to minimize false positives without sacrificing sensitivity. We evaluated the methods using paired WGS datasets of an extended family prepared using two sequencing platforms and a validated set of variants in NA12878. Using LR or ensemble genotyping based filtering, false-negative rates were significantly reduced by 1.1- to 17.8-fold at the same levels of false discovery rates (5.4% for heterozygous and 4.5% for homozygous single nucleotide variants (SNVs); 30.0% for heterozygous and 18.7% for homozygous insertions; 25.2% for heterozygous and 16.6% for homozygous deletions) compared to the filtering based on genotype quality scores. Moreover, ensemble genotyping excluded > 98% (105,080 of 107,167) of false positives while retaining > 95% (897 of 937) of true positives in de novo mutation (DNM) discovery in NA12878, and performed better than a consensus method using two sequencing platforms. Our proposed methods were effective in prioritizing phenotype-associated variants, and ensemble genotyping would be essential to minimize false-positive DNM candidates. © 2014 WILEY PERIODICALS, INC.
Reducing false positive incidental findings with ensemble genotyping and logistic regression-based variant filtering methods

PubMed Central

Hwang, Kyu-Baek; Lee, In-Hee; Park, Jin-Ho; Hambuch, Tina; Choi, Yongjoon; Kim, MinHyeok; Lee, Kyungjoon; Song, Taemin; Neu, Matthew B.; Gupta, Neha; Kohane, Isaac S.; Green, Robert C.; Kong, Sek Won

2014-01-01

As whole genome sequencing (WGS) uncovers variants associated with rare and common diseases, an immediate challenge is to minimize false positive findings due to sequencing and variant calling errors. False positives can be reduced by combining results from orthogonal sequencing methods, but this is costly. Here we present variant filtering approaches using logistic regression (LR) and ensemble genotyping to minimize false positives without sacrificing sensitivity. We evaluated the methods using paired WGS datasets of an extended family prepared using two sequencing platforms and a validated set of variants in NA12878. Using LR or ensemble genotyping based filtering, false negative rates were significantly reduced by 1.1- to 17.8-fold at the same levels of false discovery rates (5.4% for heterozygous and 4.5% for homozygous SNVs; 30.0% for heterozygous and 18.7% for homozygous insertions; 25.2% for heterozygous and 16.6% for homozygous deletions) compared to the filtering based on genotype quality scores. Moreover, ensemble genotyping excluded > 98% (105,080 of 107,167) of false positives while retaining > 95% (897 of 937) of true positives in de novo mutation (DNM) discovery, and performed better than a consensus method using two sequencing platforms. Our proposed methods were effective in prioritizing phenotype-associated variants, and ensemble genotyping would be essential to minimize false positive DNM candidates. PMID:24829188
A novel paired domain DNA recognition motif can mediate Pax2 repression of gene transcription.

PubMed

Håvik, B; Ragnhildstveit, E; Lorens, J B; Saelemyr, K; Fauske, O; Knudsen, L K; Fjose, A

1999-12-20

The paired domain (PD) is an evolutionarily conserved DNA-binding domain encoded by the Pax gene family of developmental regulators. The Pax proteins are transcription factors and are involved in a variety of processes such as brain development, patterning of the central nervous system (CNS), and B-cell development. In this report we demonstrate that the zebrafish Pax2 PD can interact with a novel type of DNA sequences in vitro, the triple-A motif, consisting of a heptameric nucleotide sequence G/CAAACA/TC with an invariant core of three adjacent adenosines. This recognition sequence was found to be conserved in known natural Pax5 repressor elements involved in controlling the expression of the p53 and J-chain genes. By identifying similar high affinity binding sites in potential target genes of the Pax2 protein, including the pax2 gene itself, we obtained further evidence that the triple-A sites are biologically significant. The putative natural target sites also provide a basis for defining an extended consensus recognition sequence. In addition, we observed in transformation assays a direct correlation between Pax2 repressor activity and the presence of triple-A sites. The results suggest that a transcriptional regulatory function of Pax proteins can be modulated by PD binding to different categories of target sequences. Copyright 1999 Academic Press.
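
The heptameric triple-A motif G/CAAACA/TC described above corresponds to the pattern `[GC]AAAC[AT]C`. The sketch below scans both strands of a DNA string for it. It is a simple pattern match under that reading of the notation, not a model of Pax2 binding affinity, and the function names and test sequences are hypothetical.

```python
import re

# G/C, three invariant adenosines, C, A/T, C; lookahead allows overlaps.
TRIPLE_A = re.compile(r"(?=([GC]AAAC[AT]C))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MOTIF_LEN = 7


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_triple_a(seq):
    """Return (start, strand, site) tuples; start is 1-based on the input strand."""
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start() + 1, "+", m.group(1)) for m in TRIPLE_A.finditer(seq)]
    for m in TRIPLE_A.finditer(reverse_complement(seq)):
        # Map the match on the reverse strand back to forward coordinates.
        start = n - (m.start() + MOTIF_LEN) + 1
        hits.append((start, "-", m.group(1)))
    return hits


forward_hits = find_triple_a("TTGAAACACGG")  # made-up sequence, site on + strand
reverse_hits = find_triple_a("GAGTTTGAA")    # made-up sequence, site on - strand
```

Candidate sites found this way would still need binding assays, as in the study, before being treated as biologically significant.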
A draft physical map of a D-genome cotton species (Gossypium raimondii)

PubMed Central

2010-01-01

Background Genetically anchored physical maps of large eukaryotic genomes have proven useful both for their intrinsic merit and as an adjunct to genome sequencing. Cultivated tetraploid cottons, Gossypium hirsutum and G. barbadense, share a common ancestor formed by a merger of the A and D genomes about 1-2 million years ago. Toward the long-term goal of characterizing the spectrum of diversity among cotton genomes, the worldwide cotton community has prioritized the D genome progenitor Gossypium raimondii for complete sequencing. Results A whole genome physical map of G. raimondii, the putative D genome ancestral species of tetraploid cottons was assembled, integrating genetically-anchored overgo hybridization probes, agarose based fingerprints and 'high information content fingerprinting' (HICF). A total of 13,662 BAC-end sequences and 2,828 DNA probes were used in genetically anchoring 1585 contigs to a cotton consensus genetic map, and 370 and 438 contigs, respectively to Arabidopsis thaliana (AT) and Vitis vinifera (VV) whole genome sequences. Conclusion Several lines of evidence suggest that the G. raimondii genome is comprised of two qualitatively different components. Much of the gene rich component is aligned to the Arabidopsis and Vitis vinifera genomes and shows promise for utilizing translational genomic approaches in understanding this important genome and its resident genes. The integrated genetic-physical map is of value both in assembling and validating a planned reference sequence. PMID:20569427
Novel features of ARS selection in budding yeast Lachancea kluyveri

PubMed Central

2011-01-01

Background The characterization of DNA replication origins in yeast has shed much light on the mechanisms of initiation of DNA replication. However, very little is known about the evolution of origins or the evolution of mechanisms through which origins are recognized by the initiation machinery. This lack of understanding is largely due to the vast evolutionary distances between model organisms in which origins have been examined. Results In this study we have isolated and characterized autonomously replicating sequences (ARSs) in Lachancea kluyveri - a pre-whole genome duplication (WGD) budding yeast. Through a combination of experimental work and rigorous computational analysis, we show that L. kluyveri ARSs require a sequence that is similar but much longer than the ARS Consensus Sequence well defined in Saccharomyces cerevisiae. Moreover, compared with S. cerevisiae and K. lactis, the replication licensing machinery in L. kluyveri seems more tolerant to variations in the ARS sequence composition. It is able to initiate replication from almost all S. cerevisiae ARSs tested and most Kluyveromyces lactis ARSs. In contrast, only about half of the L. kluyveri ARSs function in S. cerevisiae and less than 10% function in K. lactis. Conclusions Our findings demonstrate a replication initiation system with novel features and underscore the functional diversity within the budding yeasts. Furthermore, we have developed new approaches for analyzing biologically functional DNA sequences with ill-defined motifs. PMID:22204614
Understanding the mechanisms of protein-DNA interactions

NASA Astrophysics Data System (ADS)

Lavery, Richard

2004-03-01

Structural, biochemical and thermodynamic data on protein-DNA interactions show that specific recognition cannot be reduced to a simple set of binary interactions between the partners (such as hydrogen bonds, ion pairs or steric contacts). The mechanical properties of the partners also play a role and, in the case of DNA, variations in both conformation and flexibility as a function of base sequence can be a significant factor in guiding a protein to the correct binding site. All-atom molecular modeling offers a means of analyzing the role of different binding mechanisms within protein-DNA complexes of known structure. This however requires estimating the binding strengths for the full range of sequences with which a given protein can interact. Since this number grows exponentially with the length of the binding site it is necessary to find a method to accelerate the calculations. We have achieved this by using a multi-copy approach (ADAPT) which allows us to build a DNA fragment with a variable base sequence. The results obtained with this method correlate well with experimental consensus binding sequences. They enable us to show that indirect recognition mechanisms involving the sequence dependent properties of DNA play a significant role in many complexes. This approach also offers a means of predicting protein binding sites on the basis of binding energies, which is complementary to conventional lexical techniques.
Mexican consensus on lysosomal acid lipase deficiency diagnosis.

PubMed

Vázquez-Frias, R; García-Ortiz, J E; Valencia-Mayoral, P F; Castro-Narro, G E; Medina-Bravo, P G; Santillán-Hernández, Y; Flores-Calderón, J; Mehta, R; Arellano-Valdés, C A; Carbajal-Rodríguez, L; Navarrete-Martínez, J I; Urbán-Reyes, M L; Valadez-Reyes, M T; Zárate-Mondragón, F; Consuelo-Sánchez, A

Lysosomal acid lipase deficiency (LAL-D) causes progressive cholesteryl ester and triglyceride accumulation in the lysosomes of hepatocytes and monocyte-macrophage system cells, resulting in a systemic disease with various manifestations that may go unnoticed. It is indispensable to recognize the deficiency, which can present in patients at any age, so that specific treatment can be given. The aim of the present review was to offer a guide for physicians in understanding the fundamental diagnostic aspects of LAL-D, to successfully aid in its identification. The review was designed by a group of Mexican experts and is presented as an orienting algorithm for the pediatrician, internist, gastroenterologist, endocrinologist, geneticist, pathologist, radiologist, and other specialists that could come across this disease in their patients. An up-to-date review of the literature in relation to the clinical manifestations of LAL-D and its diagnosis was performed. The statements were formulated based on said review and were then voted upon. The structured quantitative method employed for reaching consensus was the nominal group technique. A practical algorithm of the diagnostic process in LAL-D patients was proposed, based on clinical and laboratory data indicative of the disease and in accordance with the consensus established for each recommendation. The algorithm provides a sequence of clinical actions from different studies for optimizing the diagnostic process of patients suspected of having LAL-D. Copyright © 2017 Asociación Mexicana de Gastroenterología. Publicado por Masson Doyma México S.A. All rights reserved.
Identification and cloning of a gamma 3 subunit splice variant of the human GABA(A) receptor.

PubMed

Poulsen, C F; Christjansen, K N; Hastrup, S; Hartvig, L

2000-05-31

cDNA sequences encoding two forms of the GABA(A) gamma 3 receptor subunit were cloned from human hippocampus. The nucleotide sequences differ by the absence (gamma 3S) or presence (gamma 3L) of 18 bp located in the presumed intracellular loop between transmembrane region (TM) III and IV. The extra 18 bp in the gamma 3L subunit generates a consensus site for phosphorylation by protein kinase C (PKC). Analysis of human genomic DNA encoding the gamma 3 subunit reveals that the 18 bp insert is contiguous with the upstream proximal exon.
Automated sequence analysis and editing software for HIV drug resistance testing.

PubMed

Struck, Daniel; Wallis, Carole L; Denisov, Gennady; Lambert, Christine; Servais, Jean-Yves; Viana, Raquel V; Letsoalo, Esrom; Bronze, Michelle; Aitken, Sue C; Schuurman, Rob; Stevens, Wendy; Schmit, Jean Claude; Rinke de Wit, Tobias; Perez Bercoff, Danielle

2012-05-01

Access to antiretroviral treatment in resource-limited settings is inevitably paralleled by the emergence of HIV drug resistance. Monitoring treatment efficacy and HIV drug resistance testing are therefore of increasing importance in resource-limited settings. Yet low-cost technologies and procedures suited to the particular context and constraints of such settings are still lacking. The ART-A (Affordable Resistance Testing for Africa) consortium brought together public and private partners to address this issue. To develop an automated sequence analysis and editing software to support high throughput automated sequencing. The ART-A Software was designed to automatically process and edit ABI chromatograms or FASTA files from HIV-1 isolates. The ART-A Software performs the basecalling, assigns quality values, aligns query sequences against a set reference, infers a consensus sequence, identifies the HIV type and subtype, translates the nucleotide sequence to amino acids and reports insertions/deletions, premature stop codons, ambiguities and mixed calls. The results can be automatically exported to Excel to identify mutations. Automated analysis was compared to manual analysis using a panel of 1624 PR-RT sequences generated in 3 different laboratories. Discrepancies between manual and automated sequence analysis were 0.69% at the nucleotide level and 0.57% at the amino acid level (668,047 AA analyzed), and discordances at major resistance mutations were recorded in 62 cases (4.83% of differences, 0.04% of all AA) for PR and 171 (6.18% of differences, 0.03% of all AA) cases for RT. The ART-A Software is a time-sparing tool for pre-analyzing HIV and viral quasispecies sequences in high throughput laboratories and highlighting positions requiring attention. Copyright © 2012 Elsevier B.V. All rights reserved.
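Two of the steps listed in this abstract, translating a nucleotide sequence and reporting premature stop codons and ambiguities, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ART-A implementation: the function name and the flag labels are hypothetical, the standard genetic code is assumed, and any codon containing a non-ACGT character is simply reported as ambiguous rather than resolved.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_and_flag(nt: str):
    """Translate in frame 1; return (protein, flags).

    Each flag is (codon_number, codon, label). Codons containing anything
    other than A/C/G/T become 'X' and are labelled 'ambiguous'; a stop codon
    before the last codon is labelled 'premature stop'.
    """
    nt = nt.upper()
    n_codons = len(nt) // 3          # trailing partial codon is ignored
    protein, flags = [], []
    for i in range(n_codons):
        codon = nt[3 * i:3 * i + 3]
        aa = CODON_TABLE.get(codon, "X")
        protein.append(aa)
        if aa == "X":
            flags.append((i + 1, codon, "ambiguous"))
        elif aa == "*" and i < n_codons - 1:
            flags.append((i + 1, codon, "premature stop"))
    return "".join(protein), flags


if __name__ == "__main__":
    # Toy sequence, not real HIV data.
    print(translate_and_flag("ATGTAAGGRTGA"))
```

A production tool would also resolve mixed calls (for example, GGR still encodes glycine); the sketch deliberately flags them for manual attention instead.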

Two DNA-binding factors recognize specific sequences at silencers, upstream activating sequences, autonomously replicating sequences, and telomeres in Saccharomyces cerevisiae

DOE Office of Scientific and Technical Information (OSTI.GOV)

Buchman, A.R.; Kimmerly, W.J.; Rine, J.

1988-01-01

Two DNA-binding factors from Saccharomyces cerevisiae have been characterized, GRFI (general regulatory factor I) and ABFI (ARS-binding factor I), that recognize specific sequences within diverse genetic elements. GRFI bound to sequences at the negative regulatory elements (silencers) of the silent mating type loci HML E and HMR E and to the upstream activating sequence (UAS) required for transcription of the MATα genes. A putative conserved UAS located at genes involved in translation (RPG box) was also recognized by GRFI. In addition, GRFI bound with high affinity to sequences within the (C1-3A)-repeat region at yeast telomeres. Binding sites for GRFI with the highest affinity appeared to be of the form 5'-(A/G)(A/C)ACCCANNCA(T/C)(T/C)-3', where N is any nucleotide. ABFI-binding sites were located next to autonomously replicating sequences (ARSs) at controlling elements of the silent mating type loci HMR E, HMR I, and HML I and were associated with ARS1, ARS2, and the 2μm plasmid ARS. Two tandem ABFI binding sites were found between the HIS3 and DED1 genes, several kilobase pairs from any ARS, indicating that ABFI-binding sites are not restricted to ARSs. The sequences recognized by ABFI showed partial dyad-symmetry and appeared to be variations of the consensus 5'-TATCATTNNNNACGA-3'. GRFI and ABFI were both abundant DNA-binding factors and did not appear to be encoded by the SIR genes, whose products are required for repression of the silent mating type loci. Together, these results indicate that both GRFI and ABFI play multiple roles within the cell.
SCOPE: a web server for practical de novo motif discovery.

PubMed

Carlson, Jonathan M; Chakravarty, Arijit; DeZiel, Charles E; Gross, Robert H

2007-07-01

SCOPE is a novel parameter-free method for the de novo identification of potential regulatory motifs in sets of coordinately regulated genes. The SCOPE algorithm combines the output of three component algorithms, each designed to identify a particular class of motifs. Using an ensemble learning approach, SCOPE identifies the best candidate motifs from its component algorithms. In tests on experimentally determined datasets, SCOPE identified motifs with a significantly higher level of accuracy than a number of other web-based motif finders run with their default parameters. Because SCOPE has no adjustable parameters, the web server has an intuitive interface, requiring only a set of gene names or FASTA sequences and a choice of species. The most significant motifs found by SCOPE are displayed graphically on the main results page with a table containing summary statistics for each motif. Detailed motif information, including the sequence logo, PWM, consensus sequence and specific matching sites can be viewed through a single click on a motif. SCOPE's efficient, parameter-free search strategy has enabled the development of a web server that is readily accessible to the practising biologist while providing results that compare favorably with those of other motif finders. The SCOPE web server is at .
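The relationship between a set of matching sites, per-position counts (the raw material of a PWM) and a consensus sequence can be sketched as follows. This is a generic majority-rule illustration, not SCOPE's algorithm; the function names are hypothetical and the example sites are made up.

```python
from collections import Counter


def position_counts(sites):
    """Per-column base counts for equal-length, pre-aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    return [Counter(column) for column in zip(*sites)]


def majority_consensus(sites):
    """Most frequent base per column; ties resolve in A, C, G, T order."""
    return "".join(
        max("ACGT", key=lambda base: counts[base])
        for counts in position_counts(sites)
    )


if __name__ == "__main__":
    toy_sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]  # illustrative only
    print(majority_consensus(toy_sites))
```

A majority consensus discards the per-position frequencies, which is why motif tools typically report the count or weight matrix and sequence logo alongside it.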
A Novel WRKY transcription factor is required for induction of PR-1a gene expression by salicylic acid and bacterial elicitors.

PubMed

van Verk, Marcel C; Pappaioannou, Dimitri; Neeleman, Lyda; Bol, John F; Linthorst, Huub J M

2008-04-01

PR-1a is a salicylic acid-inducible defense gene of tobacco (Nicotiana tabacum). One-hybrid screens identified a novel tobacco WRKY transcription factor (NtWRKY12) with specific binding sites in the PR-1a promoter at positions -564 (box WK(1)) and -859 (box WK(2)). NtWRKY12 belongs to the class of transcription factors in which the WRKY sequence is followed by a GKK rather than a GQK sequence. The binding sequence of NtWRKY12 (WK box TTTTCCAC) deviated significantly from the consensus sequence (W box TTGAC[C/T]) shown to be recognized by WRKY factors with the GQK sequence. Mutation of the GKK sequence in NtWRKY12 into GQK or GEK abolished binding to the WK box. The WK(1) box is in close proximity to binding sites in the PR-1a promoter for transcription factors TGA1a (as-1 box) and Myb1 (MBSII box). Expression studies with PR-1a promoter::beta-glucuronidase (GUS) genes in stably and transiently transformed tobacco indicated that NtWRKY12 and TGA1a act synergistically in PR-1a expression induced by salicylic acid and bacterial elicitors. Cotransfection of Arabidopsis thaliana protoplasts with 35S::NtWRKY12 and PR-1a::GUS promoter fusions showed that overexpression of NtWRKY12 resulted in a strong increase in GUS expression, which required functional WK boxes in the PR-1a promoter.
Mother-to-Child HIV Transmission Bottleneck Selects for Consensus Virus with Lower Gag-Protease-Driven Replication Capacity

PubMed Central

Naidoo, Vanessa L.; Mann, Jaclyn K.; Noble, Christie; Adland, Emily; Carlson, Jonathan M.; Thomas, Jake; Brumme, Chanson J.; Thobakgale-Tshabalala, Christina F.; Brumme, Zabrina L.; Goulder, Philip J. R.

2017-01-01

ABSTRACT In the large majority of cases, HIV infection is established by a single variant, and understanding the characteristics of successfully transmitted variants is relevant to prevention strategies. Few studies have investigated the viral determinants of mother-to-child transmission. To determine the impact of Gag-protease-driven viral replication capacity on mother-to-child transmission, the replication capacities of 148 recombinant viruses encoding plasma-derived Gag-protease from 53 nontransmitter mothers, 48 transmitter mothers, and 47 infected infants were assayed in an HIV-1-inducible green fluorescent protein reporter cell line. All study participants were infected with HIV-1 subtype C. There was no significant difference in replication capacities between the nontransmitter (n = 53) and transmitter (n = 44) mothers (P = 0.48). Infant-derived Gag-protease NL4-3 recombinant viruses (n = 41) were found to have a significantly lower Gag-protease-driven replication capacity than that of viruses derived from the mothers (P < 0.0001 by a paired t test). High percent similarities to consensus subtype C Gag, p17, p24, and protease sequences were also found in the infants (n = 28) in comparison to their mothers (P = 0.07, P = 0.002, P = 0.03, and P = 0.02, respectively, as determined by a paired t test). These data suggest that of the viral quasispecies found in mothers, the HIV mother-to-child transmission bottleneck favors the transmission of consensus-like viruses with lower viral replication capacities. IMPORTANCE Understanding the characteristics of successfully transmitted HIV variants has important implications for preventative interventions. Little is known about the viral determinants of HIV mother-to-child transmission (MTCT). 
We addressed the role of viral replication capacity driven by Gag, a major structural protein that is a significant determinant of overall viral replicative ability and an important target of the host immune response, in the MTCT bottleneck. This study advances our understanding of the genetic bottleneck in MTCT by revealing that viruses transmitted to infants have a lower replicative ability as well as a higher similarity to the population consensus (in this case HIV subtype C) than those of their mothers. Furthermore, the observation that “consensus-like” virus sequences correspond to lower in vitro replication abilities yet appear to be preferentially transmitted suggests that viral characteristics favoring transmission are decoupled from those that enhance replicative capacity. PMID:28637761
Stock culture heterogeneity rather than new mutational variation complicates short-term cell physiology studies of Escherichia coli K-12 MG1655 in continuous culture.

PubMed

Nahku, Ranno; Peebo, Karl; Valgepea, Kaspar; Barrick, Jeffrey E; Adamberg, Kaarel; Vilu, Raivo

2011-09-01

Nutrient-limited continuous cultures in chemostats have been used to study microbial cell physiology for over 60 years. Genome instability and genetic heterogeneity are possible uncontrolled factors in continuous cultivation experiments. We investigated these issues by using high-throughput (HT) DNA sequencing to characterize samples from different phases of a glucose-limited accelerostat (A-stat) experiment with Escherichia coli K-12 MG1655 and a duration regularly used in cell physiology studies (20 generations of continuous cultivation). Seven consensus mutations from the reference sequence and five subpopulations characterized by different mutations were detected in the HT-sequenced samples. This genetic heterogeneity was confirmed to result from the stock culture by Sanger sequencing. All the subpopulations in which allele frequencies increased (betA, cspG/cspH, glyA) during the experiment were also present at the end of replicate A-stats, indicating that no new subpopulations emerged during our experiments. The fact that ~31 % of the cells in our initial cultures obtained directly from a culture stock centre were mutants raises concerns that even if cultivations are started from single colonies, there is a significant chance of picking a mutant clone with an altered phenotype. Our results show that current HT DNA sequencing technology allows accurate subpopulation analysis and demonstrates that a glucose-limited E. coli K-12 MG1655 A-stat experiment with a duration of tens of generations is suitable for studying cell physiology and collecting quantitative data for metabolic modelling without interference from new mutations.
Stock culture heterogeneity rather than new mutational variation complicates short-term cell physiology studies of Escherichia coli K-12 MG1655 in continuous culture

PubMed Central

Nahku, Ranno; Peebo, Karl; Valgepea, Kaspar; Barrick, Jeffrey E.; Adamberg, Kaarel

2011-01-01

Nutrient-limited continuous cultures in chemostats have been used to study microbial cell physiology for over 60 years. Genome instability and genetic heterogeneity are possible uncontrolled factors in continuous cultivation experiments. We investigated these issues by using high-throughput (HT) DNA sequencing to characterize samples from different phases of a glucose-limited accelerostat (A-stat) experiment with Escherichia coli K-12 MG1655 and a duration regularly used in cell physiology studies (20 generations of continuous cultivation). Seven consensus mutations from the reference sequence and five subpopulations characterized by different mutations were detected in the HT-sequenced samples. This genetic heterogeneity was confirmed to result from the stock culture by Sanger sequencing. All the subpopulations in which allele frequencies increased (betA, cspG/cspH, glyA) during the experiment were also present at the end of replicate A-stats, indicating that no new subpopulations emerged during our experiments. The fact that ~31 % of the cells in our initial cultures obtained directly from a culture stock centre were mutants raises concerns that even if cultivations are started from single colonies, there is a significant chance of picking a mutant clone with an altered phenotype. Our results show that current HT DNA sequencing technology allows accurate subpopulation analysis and demonstrates that a glucose-limited E. coli K-12 MG1655 A-stat experiment with a duration of tens of generations is suitable for studying cell physiology and collecting quantitative data for metabolic modelling without interference from new mutations. PMID:21700661
The yeast genome may harbor hypoxia response elements (HRE).

PubMed

Ferreira, Túlio César; Hertzberg, Libi; Gassmann, Max; Campos, Elida Geralda

2007-01-01

The hypoxia-inducible factor-1 (HIF-1) is a heterodimeric transcription factor activated when cells are submitted to hypoxia. The heterodimer is composed of two subunits, HIF-1alpha and the constitutively expressed HIF-1beta. During normoxia, HIF-1alpha is degraded by the 26S proteasome, but hypoxia causes HIF-1alpha to be stabilized, enter the nucleus and bind to HIF-1beta, thus forming the active complex. The complex then binds to the regulatory sequences of various genes involved in physiological and pathological processes. The specific regulatory sequence recognized by HIF-1 is the hypoxia response element (HRE) that has the consensus sequence 5'BRCGTGVBBB3'. Although the basic transcriptional regulation machinery is conserved between yeast and mammals, Saccharomyces cerevisiae does not express HIF-1 subunits. However, we hypothesized that baker's yeast has a protein analogous to HIF-1 which participates in the response to changes in oxygen levels by binding to HRE sequences. In this study we screened the yeast genome for HREs using probabilistic motif search tools. We described 24 yeast genes containing motifs with high probability of being HREs (p-value<0.1) and classified them according to biological function. Our results show that S. cerevisiae may harbor HREs and indicate that a transcription factor analogous to HIF-1 may exist in this organism.
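The consensus 5'BRCGTGVBBB3' is written in IUPAC degenerate nucleotide codes (B = C/G/T, R = A/G, V = A/C/G). A simple exact-match scan for such a pattern can be sketched in Python as below. This is only a deterministic illustration of what the consensus means; the study itself used probabilistic motif search tools. The function names are hypothetical, the test sequence is invented, and only the forward strand is scanned.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> str:
    """Expand an IUPAC motif into a regular expression."""
    return "".join(IUPAC[code] for code in motif.upper())


def find_sites(sequence: str, motif: str):
    """Return (0-based start, matched text) for every match, overlaps included."""
    # A lookahead keeps the regex engine from skipping overlapping matches.
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_sequence = "AAATACGTGCGGTAAA"  # made-up example, not a yeast promoter
    print(find_sites(toy_sequence, "BRCGTGVBBB"))
```

Scanning the reverse complement as well would be needed to cover both strands.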
A guide to enterotypes across the human body: meta-analysis of microbial community structures in human microbiome datasets.

PubMed

Koren, Omry; Knights, Dan; Gonzalez, Antonio; Waldron, Levi; Segata, Nicola; Knight, Rob; Huttenhower, Curtis; Ley, Ruth E

2013-01-01

Recent analyses of human-associated bacterial diversity have categorized individuals into 'enterotypes' or clusters based on the abundances of key bacterial genera in the gut microbiota. There is a lack of consensus, however, on the analytical basis for enterotypes and on the interpretation of these results. We tested how the following factors influenced the detection of enterotypes: clustering methodology, distance metrics, OTU-picking approaches, sequencing depth, data type (whole genome shotgun (WGS) vs.16S rRNA gene sequence data), and 16S rRNA region. We included 16S rRNA gene sequences from the Human Microbiome Project (HMP) and from 16 additional studies and WGS sequences from the HMP and MetaHIT. In most body sites, we observed smooth abundance gradients of key genera without discrete clustering of samples. Some body habitats displayed bimodal (e.g., gut) or multimodal (e.g., vagina) distributions of sample abundances, but not all clustering methods and workflows accurately highlight such clusters. Because identifying enterotypes in datasets depends not only on the structure of the data but is also sensitive to the methods applied to identifying clustering strength, we recommend that multiple approaches be used and compared when testing for enterotypes.
A Guide to Enterotypes across the Human Body: Meta-Analysis of Microbial Community Structures in Human Microbiome Datasets

PubMed Central

Waldron, Levi; Segata, Nicola; Knight, Rob; Huttenhower, Curtis; Ley, Ruth E.

2013-01-01

Recent analyses of human-associated bacterial diversity have categorized individuals into ‘enterotypes’ or clusters based on the abundances of key bacterial genera in the gut microbiota. There is a lack of consensus, however, on the analytical basis for enterotypes and on the interpretation of these results. We tested how the following factors influenced the detection of enterotypes: clustering methodology, distance metrics, OTU-picking approaches, sequencing depth, data type (whole genome shotgun (WGS) vs.16S rRNA gene sequence data), and 16S rRNA region. We included 16S rRNA gene sequences from the Human Microbiome Project (HMP) and from 16 additional studies and WGS sequences from the HMP and MetaHIT. In most body sites, we observed smooth abundance gradients of key genera without discrete clustering of samples. Some body habitats displayed bimodal (e.g., gut) or multimodal (e.g., vagina) distributions of sample abundances, but not all clustering methods and workflows accurately highlight such clusters. Because identifying enterotypes in datasets depends not only on the structure of the data but is also sensitive to the methods applied to identifying clustering strength, we recommend that multiple approaches be used and compared when testing for enterotypes. PMID:23326225
Characterization of Cer-1 cis-regulatory region during early Xenopus development.

PubMed

Silva, Ana Cristina; Filipe, Mário; Steinbeisser, Herbert; Belo, José António

2011-05-01

Cerberus-related molecules are well-known Wnt, Nodal, and BMP inhibitors that have been implicated in different processes including anterior–posterior patterning and left–right asymmetry. In both mouse and frog, two Cerberus-related genes have been isolated, mCer-1 and mCer-2, and Xcer and Xcoco, respectively. Until now, little has been known about the mechanisms involved in their transcriptional regulation. Here, we report a heterologous analysis of the mouse Cerberus-1 gene upstream regulatory regions, responsible for its expression in the visceral endodermal cells. Our analysis showed that the consensus sequences for TATA, CAAT, or GC boxes were absent but a TGTGG sequence was present at position -172 to -168 bp, relative to the ATG. Using a series of deletion constructs and transient expression in Xenopus embryos, we found that a fragment of 1.4 kb of Cer-1 promoter sequence could reproduce the endogenous expression pattern of Xenopus cerberus. A 0.7-kb mcer-1 upstream region was able to drive reporter expression to the involuting mesendodermal cells, while further deletions abolished reporter gene expression. Our results suggest that although no sequence similarity was found between mouse and Xenopus cerberus cis-regulatory regions, the signaling cascades regulating cerberus expression during gastrulation are conserved.
Applying the Concept of Peptide Uniqueness to Anti-Polio Vaccination

PubMed Central

Kanduc, Darja; Fasano, Candida; Capone, Giovanni; Pesce Delfino, Antonella; Calabrò, Michele; Polimeno, Lorenzo

2015-01-01

Background. Although rare, adverse events may associate with anti-poliovirus vaccination thus possibly hampering global polio eradication worldwide. Objective. To design peptide-based anti-polio vaccines exempt from potential cross-reactivity risks and possibly able to reduce rare potential adverse events such as the postvaccine paralytic poliomyelitis due to the tendency of the poliovirus genome to mutate. Methods. Proteins from poliovirus type 1, strain Mahoney, were analyzed for amino acid sequence identity to the human proteome at the pentapeptide level, searching for sequences that (1) have zero percent of identity to human proteins, (2) are potentially endowed with an immunologic potential, and (3) are highly conserved among poliovirus strains. Results. Sequence analyses produced a set of consensus epitopic peptides potentially able to generate specific anti-polio immune responses exempt from cross-reactivity with the human host. Conclusion. Peptide sequences unique to poliovirus proteins and conserved among polio strains might help formulate a specific and universal anti-polio vaccine able to react with multiple viral strains and exempt from the burden of possible cross-reactions with human proteins. As an additional advantage, using a peptide-based vaccine instead of current anti-polio DNA vaccines would eliminate the rare post-polio poliomyelitis cases and other disabling symptoms that may appear following vaccination. PMID:26568962
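The first screening criterion in the Methods, pentapeptides with no identical match in the host proteome, amounts to a set-membership test over sliding 5-residue windows. The Python sketch below shows that step only. The function names are hypothetical, the sequences are toy strings rather than real poliovirus or human proteins, and the other two criteria (immunologic potential, conservation among strains) are not modelled.

```python
def kmers(protein: str, k: int = 5):
    """All overlapping k-residue windows of a protein sequence."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def peptides_absent_from_host(viral_protein: str, host_proteins, k: int = 5):
    """Return (0-based position, peptide) for viral k-mers not found in any host protein."""
    host_kmers = set()
    for protein in host_proteins:
        host_kmers.update(kmers(protein, k))
    return [
        (pos, pep)
        for pos, pep in enumerate(kmers(viral_protein, k))
        if pep not in host_kmers
    ]


if __name__ == "__main__":
    toy_viral = "MGAQVSSQK"               # illustrative string only
    toy_host = ["AAGAQVSAA", "SSQKLL"]    # stand-in for a host proteome
    print(peptides_absent_from_host(toy_viral, toy_host))
```

Holding the host k-mers in a set keeps each lookup constant-time, which matters when the host side is an entire proteome.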
Consensus on consensus: a synthesis of consensus estimates on human-caused global warming

NASA Astrophysics Data System (ADS)

Cook, John; Oreskes, Naomi; Doran, Peter T.; Anderegg, William R. L.; Verheggen, Bart; Maibach, Ed W.; Carlton, J. Stuart; Lewandowsky, Stephan; Skuce, Andrew G.; Green, Sarah A.; Nuccitelli, Dana; Jacobs, Peter; Richardson, Mark; Winkler, Bärbel; Painting, Rob; Rice, Ken

2016-04-01

The consensus that humans are causing recent global warming is shared by 90%-100% of publishing climate scientists according to six independent studies by co-authors of this paper. Those results are consistent with the 97% consensus reported by Cook et al (Environ. Res. Lett. 8 024024) based on 11 944 abstracts of research papers, of which 4014 took a position on the cause of recent global warming. A survey of authors of those papers (N = 2412 papers) also supported a 97% consensus. Tol (2016 Environ. Res. Lett. 11 048001) comes to a different conclusion using results from surveys of non-experts such as economic geologists and a self-selected group of those who reject the consensus. We demonstrate that this outcome is not unexpected because the level of consensus correlates with expertise in climate science. At one point, Tol also reduces the apparent consensus by assuming that abstracts that do not explicitly state the cause of global warming (‘no position’) represent non-endorsement, an approach that if applied elsewhere would reject consensus on well-established theories such as plate tectonics. We examine the available studies and conclude that the finding of 97% consensus in published climate research is robust and consistent with other surveys of climate scientists and peer-reviewed studies.
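The effect of the denominator choice described in this abstract can be checked with the figures it quotes (11 944 abstracts, 4014 taking a position, 97% consensus among those). The Python sketch below is illustrative arithmetic only: applying a rounded 97% to the 4014 abstracts is an approximation, and the variable and function names are hypothetical.

```python
TOTAL_ABSTRACTS = 11944        # all abstracts examined
POSITION_ABSTRACTS = 4014      # abstracts that took a position on the cause
ENDORSEMENT_SHARE = 0.97       # rounded consensus among position-taking abstracts


def share(numerator: float, denominator: float) -> float:
    """Fraction expressed as a percentage."""
    return 100.0 * numerator / denominator


if __name__ == "__main__":
    endorsing = ENDORSEMENT_SHARE * POSITION_ABSTRACTS
    # Denominator = abstracts that took a position.
    print(round(share(endorsing, POSITION_ABSTRACTS), 1))
    # Denominator = all abstracts, i.e. 'no position' counted as non-endorsement.
    print(round(share(endorsing, TOTAL_ABSTRACTS), 1))
```

Counting every "no position" abstract as non-endorsement cuts the apparent consensus to roughly a third, which is the reduction the authors argue is unwarranted.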
Detection of enteroviruses and hepatitis a virus in water by consensus primer multiplex RT-PCR

PubMed Central

Li, Jun-Wen; Wang, Xin-Wei; Yuan, Chang-Qing; Zheng, Jin-Lai; Jin, Min; Song, Nong; Shi, Xiu-Quan; Chao, Fu-Huan

2002-01-01

AIM: To develop a rapid detection method of enteroviruses and Hepatitis A virus (HAV). METHODS: A one-step, single-tube consensus primers multiplex RT-PCR was developed to simultaneously detect Poliovirus, Coxsackie virus, Echovirus and HAV. A general upstream primer and a HAV primer and four different sets of primers (5 primers) specific for Poliovirus, Coxsackie virus, Echovirus and HAV cDNA were mixed in the PCR mixture to reverse transcribe and amplify the target DNA. Four distinct amplified DNA segments representing Poliovirus, Coxsackie virus, Echovirus and HAV were identified by gel electrophoresis as 589-, 671-, 1084-, and 1128 bp sequences, respectively. Semi-nested PCR was used to confirm the amplified products for each enterovirus and HAV. RESULTS: All four kinds of viral genome RNA were detected, producing four bands which could be differentiated by the band size on the gel. To confirm the specificity of the multiplex PCR products, semi-nested PCR was performed. All four strains tested gave positive results. The detection sensitivity of multiplex PCR was similar to that of monoplex RT-PCR which was 24 PFU for Poliovirus, 21 PFU for Coxsackie virus, 60 PFU for Echovirus and 105 TCID50 for HAV. The minimum amount of enteric viral RNA detected by semi-nested PCR was equivalent to 2.4 PFU for Poliovirus, 2.1 PFU for Coxsackie virus, 6.0 PFU for Echovirus and 10.5 TCID50 for HAV. CONCLUSION: The consensus primers multiplex RT-PCR has more advantages over monoplex RT-PCR for enteric viruses detection, namely, the rapid turnaround time and cost effectiveness. PMID:12174381
The influence of ignoring secondary structure on divergence time estimates from ribosomal RNA genes.

PubMed

Dohrmann, Martin

2014-02-01

Genes coding for ribosomal RNA molecules (rDNA) are among the most popular markers in molecular phylogenetics and evolution. However, coevolution of sites that code for pairing regions (stems) in the RNA secondary structure can make it challenging to obtain accurate results from such loci. While the influence of ignoring secondary structure on multiple sequence alignment and tree topology has been investigated in numerous studies, its effect on molecular divergence time estimates is still poorly known. Here, I investigate this issue in Bayesian Markov Chain Monte Carlo (BMCMC) and penalized likelihood (PL) frameworks, using empirical datasets from dragonflies (Odonata: Anisoptera) and glass sponges (Porifera: Hexactinellida). My results indicate that highly biased inferences under substitution models that ignore secondary structure only occur if maximum-likelihood estimates of branch lengths are used as input to PL dating, whereas in a BMCMC framework and in PL dating based on Bayesian consensus branch lengths, the effect is far less severe. I conclude that accounting for coevolution of paired sites in molecular dating studies is not as important as previously suggested, as long as the estimates are based on Bayesian consensus branch lengths instead of ML point estimates. This finding is especially relevant for studies where computational limitations do not allow the use of secondary-structure specific substitution models, or where accurate consensus structures cannot be predicted. I also found that the magnitude and direction (over- vs. underestimating node ages) of bias in age estimates when secondary structure is ignored was not distributed randomly across the nodes of the phylogenies, a phenomenon that requires further investigation. Copyright © 2013 Elsevier Inc. All rights reserved.
Complete genome sequence of the phenanthrene-degrading soil bacterium Delftia acidovorans Cs1-4

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shetty, Ameesha R.; de Gannes, Vidya; Obi, Chioma C.

Polycyclic aromatic hydrocarbons (PAH) are ubiquitous environmental pollutants and microbial biodegradation is an important means of remediation of PAH-contaminated soil. Delftia acidovorans Cs1-4 (formerly Delftia sp. Cs1-4) was isolated by using phenanthrene as the sole carbon source from PAH contaminated soil in Wisconsin. Its full genome sequence was determined to gain insights into the mechanisms underlying biodegradation of PAH. Three genomic libraries were constructed and sequenced: an Illumina GAii shotgun library (916,416,493 reads), a 454 Titanium standard library (770,171 reads) and one paired-end 454 library (average insert size of 8 kb, 508,092 reads). The initial assembly contained 40 contigs in two scaffolds. The 454 Titanium standard data and the 454 paired end data were assembled together and the consensus sequences were computationally shredded into 2 kb overlapping shreds. Illumina sequencing data was assembled, and the consensus sequence was computationally shredded into 1.5 kb overlapping shreds. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks. A total of 182 additional reactions were needed to close gaps and to raise the quality of the finished sequence. The final assembly is based on 253.3 Mb of 454 draft data (averaging 38.4 X coverage) and 590.2 Mb of Illumina draft data (averaging 89.4 X coverage). The genome of strain Cs1-4 consists of a single circular chromosome of 6,685,842 bp (66.7 %G+C) containing 6,028 predicted genes; 5,931 of these genes were protein-encoding and 4,425 gene products were assigned to a putative function. Genes encoding phenanthrene degradation were localized to a 232 kb genomic island (termed the phn island), which contained near its 3’ end a bacteriophage P4-like integrase, an enzyme often associated with chromosomal integration of mobile genetic elements. 
Other biodegradation pathways reconstructed from the genome sequence included: benzoate (by the acetyl-CoA pathway), styrene, nicotinic acid (by the maleamate pathway) and the pesticides Dicamba and Fenitrothion. Lastly, determination of the complete genome sequence of D. acidovorans Cs1-4 has provided new insights into the microbial mechanisms of PAH biodegradation that may shape the process in the environment.
Complete genome sequence of the phenanthrene-degrading soil bacterium Delftia acidovorans Cs1-4

DOE PAGES

Shetty, Ameesha R.; de Gannes, Vidya; Obi, Chioma C.; ...

2015-08-15

Polycyclic aromatic hydrocarbons (PAH) are ubiquitous environmental pollutants and microbial biodegradation is an important means of remediation of PAH-contaminated soil. Delftia acidovorans Cs1-4 (formerly Delftia sp. Cs1-4) was isolated by using phenanthrene as the sole carbon source from PAH-contaminated soil in Wisconsin. Its full genome sequence was determined to gain insights into the mechanisms underlying biodegradation of PAH. Three genomic libraries were constructed and sequenced: an Illumina GAii shotgun library (916,416,493 reads), a 454 Titanium standard library (770,171 reads) and one paired-end 454 library (average insert size of 8 kb, 508,092 reads). The initial assembly contained 40 contigs in two scaffolds. The 454 Titanium standard data and the 454 paired-end data were assembled together and the consensus sequences were computationally shredded into 2 kb overlapping shreds. Illumina sequencing data was assembled, and the consensus sequence was computationally shredded into 1.5 kb overlapping shreds. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks. A total of 182 additional reactions were needed to close gaps and to raise the quality of the finished sequence. The final assembly is based on 253.3 Mb of 454 draft data (averaging 38.4 X coverage) and 590.2 Mb of Illumina draft data (averaging 89.4 X coverage). The genome of strain Cs1-4 consists of a single circular chromosome of 6,685,842 bp (66.7 %G+C) containing 6,028 predicted genes; 5,931 of these genes were protein-encoding and 4,425 gene products were assigned to a putative function. Genes encoding phenanthrene degradation were localized to a 232 kb genomic island (termed the phn island), which contained near its 3’ end a bacteriophage P4-like integrase, an enzyme often associated with chromosomal integration of mobile genetic elements. Other biodegradation pathways reconstructed from the genome sequence included: benzoate (by the acetyl-CoA pathway), styrene, nicotinic acid (by the maleamate pathway) and the pesticides Dicamba and Fenitrothion. Lastly, determination of the complete genome sequence of D. acidovorans Cs1-4 has provided new insights into the microbial mechanisms of PAH biodegradation that may shape the process in the environment.
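The "computational shredding" step mentioned in this abstract can be pictured with a minimal Python sketch. The abstract gives the shred lengths (2 kb and 1.5 kb) but not the overlap or the tool used, so the function name `shred`, the `overlap` parameter and the toy sequence below are assumptions for illustration only, not the authors' pipeline.

```python
def shred(sequence, shred_len, overlap):
    """Cut a sequence into fixed-length pieces whose neighbours share `overlap` bases.

    Hypothetical helper: the overlap used in the study is not stated in the abstract.
    """
    if not 0 <= overlap < shred_len:
        raise ValueError("overlap must be non-negative and smaller than shred_len")
    step = shred_len - overlap
    shreds = []
    for start in range(0, len(sequence), step):
        shreds.append(sequence[start:start + shred_len])
        if start + shred_len >= len(sequence):
            break  # the last shred already reaches the end of the sequence
    return shreds


# Toy example, scaled down from the kb-sized shreds described in the abstract.
toy_consensus = "ACGTACGTACGTACGTACGT"  # 20 bases
pieces = shred(toy_consensus, shred_len=8, overlap=4)
for piece in pieces:
    print(piece)
```

Each shred starts `shred_len - overlap` bases after the previous one, so adjacent shreds share exactly `overlap` bases and can be treated as pseudo-reads by a downstream assembler.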
Return of Genomic Results to Research Participants: The Floor, the Ceiling, and the Choices In Between

PubMed Central

Jarvik, Gail P.; Amendola, Laura M.; Berg, Jonathan S.; Brothers, Kyle; Clayton, Ellen W.; Chung, Wendy; Evans, Barbara J.; Evans, James P.; Fullerton, Stephanie M.; Gallego, Carlos J.; Garrison, Nanibaa’ A.; Gray, Stacy W.; Holm, Ingrid A.; Kullo, Iftikhar J.; Lehmann, Lisa Soleymani; McCarty, Cathy; Prows, Cynthia A.; Rehm, Heidi L.; Sharp, Richard R.; Salama, Joseph; Sanderson, Saskia; Van Driest, Sara L.; Williams, Marc S.; Wolf, Susan M.; Wolf, Wendy A.; Harley, John; Myers, Melanie; Namjou, Bahram; Vinks, Sander; Connolly, John; Keating, Brendan; Gerhard, Glenn; Sundaresan, Agnes; Tromp, Gerard; Crosslin, David; Leppig, Kathy; Wicklund, Cathy; Chute, Christopher; Lynch, John; De Andrade, Mariza; Heit, John; McCormick, Jen; Brilliant, Murray; Kitchner, Terrie; Ritchie, Marylyn; Böttinger, Erwin; Peter, Inga; Persell, Stephen; Rasmussen-Torvik, Laura; McGregor, Tracy; Roden, Dan; Antommaria, Armand; Chiavacci, Rosetta; Faucett, Andy; Ledbetter, David; Williams, Janet; Hartzler, Andrea; Vitek, Carolyn R. Rohrer; Frost, Norm; Ferryman, Kadija; Horowitz, Carol; Rhodes, Rosamond; Zinberg, Randi; Aufox, Sharon; Pan, Vivian; Long, Rochelle; Ramos, Erin; Odgis, Jackie; Wise, Anastasia; Hull, Sara; Gitlin, Jonathan; Green, Robert; Metterville, Danielle; McGuire, Amy; Kong, Sek Won; Trinidad, Sue; Veenstra, David; Roche, Myra; Skinner, Debra; Raspberry, Kelly; O’Daniel, Julianne; Parsons, Will; Eng, Christine; Hilsenbeck, Susan; Karavite, Dean; Conlin, Laura; Spinner, Nancy; Krantz, Ian; Falk, Marni; Santani, Avni; Dechene, Elizabeth; Dulik, Matthew; Bernhardt, Barbara; Schuetze, Scott; Everett, Jessica; Gornick, Michele Caroline; Wilfond, Ben; Tabor, Holly; Lemke, Amy A.; Richards, Sue; Goddard, Katrina; Cooper, Greg; East, Kelly; Barsh, Greg; Koenig, Barbara; Van Allen, Eliezer; Garber, Judy; Garrett, Jeremy; Zawati, Ma’n; Lewis, Michelle; Savage, Sarah; Smith, Maureen; Roychowdhury, Sameek; Bailey, Alice; Berkman, Benjamin; Anan, Charlisse Caga; Hindorff, Lucia; Hutter, Carolyn; King, Rosalind; Li, Rongling; Lockhart, Nicole; McEwen, Jean; Scholes, Derek; Schully, Sheri; Sun, Kathie; Burke, Wylie

2014-01-01

As more research studies incorporate next-generation sequencing (including whole-genome or whole-exome sequencing), investigators and institutional review boards face difficult questions regarding which genomic results to return to research participants and how. An American College of Medical Genetics and Genomics 2013 policy paper suggesting that pathogenic mutations in 56 specified genes should be returned in the clinical setting has raised the question of whether comparable recommendations should be considered in research settings. The Clinical Sequencing Exploratory Research (CSER) Consortium and the Electronic Medical Records and Genomics (eMERGE) Network are multisite research programs that aim to develop practical strategies for addressing questions concerning the return of results in genomic research. CSER and eMERGE committees have identified areas of consensus regarding the return of genomic results to research participants. In most circumstances, if results meet an actionability threshold for return and the research participant has consented to return, genomic results, along with referral for appropriate clinical follow-up, should be offered to participants. However, participants have a right to decline the receipt of genomic results, even when doing so might be viewed as a threat to the participants’ health. Research investigators should be prepared to return research results and incidental findings discovered in the course of their research and meeting an actionability threshold, but they have no ethical obligation to actively search for such results. These positions are consistent with the recognition that clinical research is distinct from medical care in both its aims and its guiding moral principles. PMID:24814192
Cloning of a CACTA transposon-like insertion in intron I of tomato invertase Lin5 gene and identification of transposase-like sequences of Solanaceae species.

PubMed

Proels, Reinhard K; Roitsch, Thomas

2006-03-01

Very few CACTA transposon-like sequences have been described in Solanaceae species. Sequence information has been restricted to partial transposase (TPase)-like fragments, and no target gene of CACTA-like transposon insertion has been described in tomato to date. In this manuscript, we report on a CACTA transposon-like insertion in intron I of tomato (Lycopersicon esculentum) invertase gene Lin5 and TPase-like sequences of several Solanaceae species. Consensus primers deduced from the TPase region of the tomato CACTA transposon-like element allowed the amplification of similar sequences from various Solanaceae species of different subfamilies including Solaneae (Solanum tuberosum), Cestreae (Nicotiana tabacum) and Datureae (Datura stramonium). This demonstrates the ubiquitous presence of CACTA-like elements in Solanaceae genomes. The obtained partial sequences are highly conserved, and allow further detection and detailed analysis of CACTA-like transposons throughout Solanaceae species. CACTA-like transposon sequences make possible the evaluation of their use for genome analysis, functional studies of genes and the evolutionary relationships between plant species.
Delphi based consensus study into planning for chemical incidents

PubMed Central

Crawford, I; Mackway-Jones, K; Russell, D; Carley, S

2004-01-01

Objective: To achieve consensus in all phases of chemical incident planning and response. Design: A three round Delphi study was conducted using a panel of 39 experts from specialties involved in the management of chemical incidents. Areas that did not reach consensus in the Delphi study were presented as synopsis statements for discussion in four syndicate groups at a conference hosted by the Department of Health Emergency Planning Co-ordination Unit. Results: A total of 183 of 322 statements had reached consensus upon completion of the Delphi study. This represented 56.8% of the total number of statements. Of these, 148 reached consensus at >94% and 35 reached consensus at >89%. The results of the process are presented as a series of synopsis consensus statements that cover all phases of chemical incident planning and response. Conclusions: The use of a Delphi study and subsequent syndicate group discussions achieved consensus in aspects of all phases of chemical incident planning and response that can be translated into practical guidance for use at regional prehospital and hospital level. Additionally, areas of non-consensus have been identified where further work is required. PMID:14734369
Genetic relationships among strains of Xanthomonas fragariae based on random amplified polymorphic DNA PCR, repetitive extragenic palindromic PCR, and enterobacterial repetitive intergenic consensus PCR data and generation of multiplexed PCR primers useful for the identification of this phytopathogen.

PubMed Central

Pooler, M R; Ritchie, D F; Hartung, J S

1996-01-01

Genetic relationships among 25 isolates of Xanthomonas fragariae from diverse geographic regions were determined by three PCR methods that rely on different amplification priming strategies: random amplified polymorphic DNA (RAPD) PCR, repetitive extragenic palindromic (REP) PCR, and enterobacterial repetitive intergenic consensus (ERIC) PCR. The results of these assays are mutually consistent and indicate that pathogenic strains are very closely related to each other. RAPD, ERIC, and REP PCR assays identified nine, four, and two genotypes, respectively, within X. fragariae isolates. A single nonpathogenic isolate of X. fragariae was not distinguishable by these methods. The results of the PCR assays were also fully confirmed by physiological tests. There was no correlation between DNA amplification product patterns and geographic sites of isolation, suggesting that this bacterium has spread largely through exchange of infected plant germ plasm. Sequences identified through the RAPD assays were used to develop three primer pairs for standard PCR assays to identify X. fragariae. In addition, we developed a stringent multiplexed PCR assay to identify X. fragariae by simultaneously using the three independently derived sets of primers specific for pathogenic strains of the bacteria. PMID:8795198

An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge.

PubMed

Brownstein, Catherine A; Beggs, Alan H; Homer, Nils; Merriman, Barry; Yu, Timothy W; Flannery, Katherine C; DeChene, Elizabeth T; Towne, Meghan C; Savage, Sarah K; Price, Emily N; Holm, Ingrid A; Luquette, Lovelace J; Lyon, Elaine; Majzoub, Joseph; Neupert, Peter; McCallie, David; Szolovits, Peter; Willard, Huntington F; Mendelsohn, Nancy J; Temme, Renee; Finkel, Richard S; Yum, Sabrina W; Medne, Livija; Sunyaev, Shamil R; Adzhubey, Ivan; Cassa, Christopher A; de Bakker, Paul I W; Duzkale, Hatice; Dworzyński, Piotr; Fairbrother, William; Francioli, Laurent; Funke, Birgit H; Giovanni, Monica A; Handsaker, Robert E; Lage, Kasper; Lebo, Matthew S; Lek, Monkol; Leshchiner, Ignaty; MacArthur, Daniel G; McLaughlin, Heather M; Murray, Michael F; Pers, Tune H; Polak, Paz P; Raychaudhuri, Soumya; Rehm, Heidi L; Soemedi, Rachel; Stitziel, Nathan O; Vestecka, Sara; Supper, Jochen; Gugenmus, Claudia; Klocke, Bernward; Hahn, Alexander; Schubach, Max; Menzel, Mortiz; Biskup, Saskia; Freisinger, Peter; Deng, Mario; Braun, Martin; Perner, Sven; Smith, Richard J H; Andorf, Janeen L; Huang, Jian; Ryckman, Kelli; Sheffield, Val C; Stone, Edwin M; Bair, Thomas; Black-Ziegelbein, E Ann; Braun, Terry A; Darbro, Benjamin; DeLuca, Adam P; Kolbe, Diana L; Scheetz, Todd E; Shearer, Aiden E; Sompallae, Rama; Wang, Kai; Bassuk, Alexander G; Edens, Erik; Mathews, Katherine; Moore, Steven A; Shchelochkov, Oleg A; Trapane, Pamela; Bossler, Aaron; Campbell, Colleen A; Heusel, Jonathan W; Kwitek, Anne; Maga, Tara; Panzer, Karin; Wassink, Thomas; Van Daele, Douglas; Azaiez, Hela; Booth, Kevin; Meyer, Nic; Segal, Michael M; Williams, Marc S; Tromp, Gerard; White, Peter; Corsmeier, Donald; Fitzgerald-Butt, Sara; Herman, Gail; Lamb-Thrush, Devon; McBride, Kim L; Newsom, David; Pierson, Christopher R; Rakowsky, Alexander T; Maver, Aleš; Lovrečić, Luca; Palandačić, Anja; Peterlin, Borut; Torkamani, Ali; Wedell, Anna; Huss, Mikael; Alexeyenko, Andrey; Lindvall, Jessica M; Magnusson, Måns; Nilsson, Daniel; Stranneheim, Henrik; Taylan, Fulya; Gilissen, Christian; Hoischen, Alexander; van Bon, Bregje; Yntema, Helger; Nelen, Marcel; Zhang, Weidong; Sager, Jason; Zhang, Lu; Blair, Kathryn; Kural, Deniz; Cariaso, Michael; Lennon, Greg G; Javed, Asif; Agrawal, Saloni; Ng, Pauline C; Sandhu, Komal S; Krishna, Shuba; Veeramachaneni, Vamsi; Isakov, Ofer; Halperin, Eran; Friedman, Eitan; Shomron, Noam; Glusman, Gustavo; Roach, Jared C; Caballero, Juan; Cox, Hannah C; Mauldin, Denise; Ament, Seth A; Rowen, Lee; Richards, Daniel R; San Lucas, F Anthony; Gonzalez-Garay, Manuel L; Caskey, C Thomas; Bai, Yu; Huang, Ying; Fang, Fang; Zhang, Yan; Wang, Zhengyuan; Barrera, Jorge; Garcia-Lobo, Juan M; González-Lamuño, Domingo; Llorca, Javier; Rodriguez, Maria C; Varela, Ignacio; Reese, Martin G; De La Vega, Francisco M; Kiruluta, Edward; Cargill, Michele; Hart, Reece K; Sorenson, Jon M; Lyon, Gholson J; Stevenson, David A; Bray, Bruce E; Moore, Barry M; Eilbeck, Karen; Yandell, Mark; Zhao, Hongyu; Hou, Lin; Chen, Xiaowei; Yan, Xiting; Chen, Mengjie; Li, Cong; Yang, Can; Gunel, Murat; Li, Peining; Kong, Yong; Alexander, Austin C; Albertyn, Zayed I; Boycott, Kym M; Bulman, Dennis E; Gordon, Paul M K; Innes, A Micheil; Knoppers, Bartha M; Majewski, Jacek; Marshall, Christian R; Parboosingh, Jillian S; Sawyer, Sarah L; Samuels, Mark E; Schwartzentruber, Jeremy; Kohane, Isaac S; Margulies, David M

2014-03-25

There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance. A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity of the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization. The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups.
Dimeric PROP1 binding to diverse palindromic TAAT sequences promotes its transcriptional activity.

PubMed

Nakayama, Michie; Kato, Takako; Susa, Takao; Sano, Akiko; Kitahara, Kousuke; Kato, Yukio

2009-08-13

Mutations in the Prop1 gene are responsible for murine Ames dwarfism and human combined pituitary hormone deficiency with hypogonadism. Recently, we reported that PROP1 is a possible transcription factor for gonadotropin subunit genes through plural cis-acting sites composed of AT-rich sequences containing a TAAT motif which differs from its consensus binding sequence known as PRDQ9 (TAATTGAATTA). This study aimed to verify the binding specificity and sequence of PROP1 by applying the method of SELEX (Systematic Evolution of Ligands by EXponential enrichment), EMSA (electrophoretic mobility shift assay) and transient transfection assay. SELEX, after 5, 7 and 9 generations of selection using a random sequence library, showed that nucleotides containing one or two TAAT motifs were accumulated and accounted for 98.5% at the 9th generation. Aligned sequences and EMSA demonstrated that PROP1 binds preferentially to 11 nucleotides composed of an inverted TAAT motif separated by 3 nucleotides with variation in the half site of palindromic TAAT motifs and with preferential requirement of T at the nucleotide number 5 immediately 3' to a TAAT motif. Transient transfection assay demonstrated first that dimeric binding of PROP1 to an inverted TAAT motif and its cognates resulted in transcriptional activation, whereas monomeric binding of PROP1 to a single TAAT motif and an inverted ATTA motif did not mediate activation. Thus, this study demonstrated that dimeric binding of PROP1 is able to recognize diverse palindromic TAAT sequences separated by 3 nucleotides and to exhibit its transcriptional activity.
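As a rough illustration of the site architecture described in this abstract (two TAAT half sites in inverted orientation separated by 3 nucleotides, 11 nucleotides in total), the following Python sketch scans a DNA string for that pattern. The function name `find_inverted_taat` and the flanking bases in the example are hypothetical, and the sketch does not model the reported preference for T at position 5 or the tolerated half-site variation.

```python
import re

# TAAT, any 3 nucleotides, then ATTA (the inverted half site): 11 nt in total.
# The lookahead lets overlapping sites be reported as well.
INVERTED_TAAT = re.compile(r"(?=(TAAT[ACGT]{3}ATTA))")


def find_inverted_taat(dna):
    """Return (0-based start, site) for every TAAT-N3-ATTA site in `dna`."""
    dna = dna.upper()
    return [(m.start(), m.group(1)) for m in INVERTED_TAAT.finditer(dna)]


PRDQ9 = "TAATTGAATTA"  # consensus binding sequence quoted in the abstract
hits = find_inverted_taat("ggc" + PRDQ9 + "ccg")  # flanks are made up
print(hits)
```

A sequence carrying only a single TAAT motif returns no hits, which mirrors the abstract's distinction between dimeric binding to the inverted site and monomeric binding to a lone motif.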
A Phylogenomic Approach Based on PCR Target Enrichment and High Throughput Sequencing: Resolving the Diversity within the South American Species of Bartsia L. (Orobanchaceae)

PubMed Central

Tank, David C.

2016-01-01

Advances in high-throughput sequencing (HTS) have allowed researchers to obtain large amounts of biological sequence information at speeds and costs unimaginable only a decade ago. Phylogenetics, and the study of evolution in general, is quickly migrating towards using HTS to generate larger and more complex molecular datasets. In this paper, we present a method that utilizes microfluidic PCR and HTS to generate large amounts of sequence data suitable for phylogenetic analyses. The approach uses the Fluidigm Access Array System (Fluidigm, San Francisco, CA, USA) and two sets of PCR primers to simultaneously amplify 48 target regions across 48 samples, incorporating sample-specific barcodes and HTS adapters (2,304 unique amplicons per Access Array). The final product is a pooled set of amplicons ready to be sequenced, and thus, there is no need to construct separate, costly genomic libraries for each sample. Further, we present a bioinformatics pipeline to process the raw HTS reads to either generate consensus sequences (with or without ambiguities) for every locus in every sample or—more importantly—recover the separate alleles from heterozygous target regions in each sample. This is important because it adds allelic information that is well suited for coalescent-based phylogenetic analyses that are becoming very common in conservation and evolutionary biology. To test our approach and bioinformatics pipeline, we sequenced 576 samples across 96 target regions belonging to the South American clade of the genus Bartsia L. in the plant family Orobanchaceae. After sequencing cleanup and alignment, the experiment resulted in ~25,300bp across 486 samples for a set of 48 primer pairs targeting the plastome, and ~13,500bp for 363 samples for a set of primers targeting regions in the nuclear genome. Finally, we constructed a combined concatenated matrix from all 96 primer combinations, resulting in a combined aligned length of ~40,500bp for 349 samples. PMID:26828929
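The idea of calling a per-locus consensus "with or without ambiguities" can be sketched in Python as below. This is only an illustration under assumed rules: the abstract does not describe how the authors' pipeline decides when to emit an ambiguity code, so the function `consensus`, the `min_fraction` threshold and the example reads are hypothetical. The ambiguity letters themselves are the standard IUPAC nucleotide codes.

```python
from collections import Counter

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def consensus(aligned_reads, min_fraction=0.25, ambiguities=True):
    """Column-wise consensus of equal-length aligned reads.

    With ambiguities=True, every base reaching `min_fraction` of a column
    contributes to an IUPAC code; otherwise the most frequent base is called.
    The threshold is an assumed parameter, not taken from the paper.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    called = []
    for column in zip(*aligned_reads):
        counts = Counter(b for b in (c.upper() for c in column) if b in "ACGT")
        total = sum(counts.values())
        if total == 0:
            called.append("N")  # only gaps or unknown characters in this column
            continue
        majority = counts.most_common(1)[0][0]
        if ambiguities:
            kept = frozenset(b for b, n in counts.items() if n / total >= min_fraction)
            called.append(IUPAC[kept] if kept else majority)
        else:
            called.append(majority)
    return "".join(called)


reads = ["ACGTA", "ACGTA", "ACGTA", "ACATA"]  # made-up aligned reads
print(consensus(reads))                      # third column becomes R (A or G)
print(consensus(reads, ambiguities=False))   # third column becomes G
```

An ambiguity code at a position flags a possibly heterozygous site; recovering the separate alleles, which the abstract highlights as the more important output, requires phasing and is not attempted here.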
Distribution of a Nocardia brasiliensis catalase gene fragment in members of the genera Nocardia, Gordona, and Rhodococcus.

PubMed

Vera-Cabrera, L; Johnson, W M; Welsh, O; Resendiz-Uresti, F L; Salinas-Carmona, M C

1999-06-01

An immunodominant protein from Nocardia brasiliensis, P61, was subjected to amino-terminal and internal sequence analysis. Three sequences of 22, 17, and 38 residues, respectively, were obtained and compared with the protein database from GenBank by using the BLAST system. The sequences showed homology to some eukaryotic catalases and to a bromoperoxidase-catalase from Streptomyces violaceus. Its identity as a catalase was confirmed by analysis of its enzymatic activity on H2O2 and by a double-staining method on a nondenaturing polyacrylamide gel with 3,3'-diaminobenzidine and ferricyanide; the result showed only catalase activity, but no peroxidase. By using one of the internal amino acid sequences and a consensus catalase motif (VGNNTP), we were able to design a PCR assay that generated a 500-bp PCR product. The amplicon was analyzed, and the nucleotide sequence was compared to the GenBank database with the observation of high homology to other bacterial and eukaryotic catalases. A PCR assay based on this target sequence was performed with primers NB10 and NB11 to confirm the presence of the NB10-NB11 gene fragment in several N. brasiliensis strains isolated from mycetoma. The same assay was used to determine whether there were homologous sequences in several type strains from the genera Nocardia, Rhodococcus, Gordona, and Streptomyces. All of the N. brasiliensis strains presented a positive result but only some of the actinomycetes species tested were positive in the PCR assay. In order to confirm these findings, genomic DNA was subjected to Southern blot analysis. A 1.7-kbp band was observed in the N. brasiliensis strains, and bands of different molecular weight were observed in cross-reacting actinomycetes. Sequence analysis of the amplicons of selected actinomycetes showed high homology in this catalase fragment, thus demonstrating that this protein is highly conserved in this group of bacteria.
DEApp: an interactive web interface for differential expression analysis of next generation sequence data.

PubMed

Li, Yan; Andrade, Jorge

2017-01-01

A growing trend in the biomedical community is the use of Next Generation Sequencing (NGS) technologies in genomics research. The complexity of downstream differential expression (DE) analysis is, however, still challenging, as it requires sufficient computer programming and command-line knowledge. Furthermore, researchers often need to evaluate and visualize interactively the effect of using differential statistical and error models, assess the impact of selecting different parameters and cutoffs, and finally explore the overlapping consensus of cross-validated results obtained with different methods. This represents a bottleneck that slows down or impedes the adoption of NGS technologies in many labs. We developed DEApp, an interactive and dynamic web application for differential expression analysis of count-based NGS data. This application enables model selection, parameter tuning, cross validation and visualization of results in a user-friendly interface. DEApp enables labs with no access to full-time bioinformaticians to exploit the advantages of NGS applications in biomedical research. This application is freely available at https://yanli.shinyapps.io/DEApp and https://gallery.shinyapps.io/DEApp.
Epitope mapping of PR81 anti-MUC1 monoclonal antibody following PEPSCAN and phage display techniques.

PubMed

Mohammadi, Mohammad; Rasaee, Mohammad Javad; Rajabibazl, Masoumeh; Paknejad, Malihe; Zare, Mehrak; Mohammadzadeh, Sara

2007-08-01

PR81 is an anti-MUC1 monoclonal antibody (MAb) which was generated against human MUC1 mucin and which reacts with breast cancerous tissue, MUC1-positive cell lines (MCF-7, BT-20, and T-47D), and a synthetic peptide including the tandem repeat sequence of MUC1. Here we characterized the binding properties of PR81 against the tandem repeat of MUC1 by two different epitope mapping techniques, namely, PEPSCAN and phage display. Epitope mapping of PR81 MAb by PEPSCAN revealed a minimal consensus binding sequence, PDTRP, which is found on the MUC1 peptide as the most important epitope. Using the phage display peptide library, we identified the motif PD(T/S/G)RP as an epitope and the motif AVGLSPDGSRGV as a mimotope recognized by PR81. Results of these two methods showed that the two residues, arginine and aspartic acid, have important roles in antibody binding and that threonine can be substituted by either glycine or serine. These results may be of importance in tailor-making antigens used in immunoassays.
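The degenerate motif PD(T/S/G)RP reported in this abstract maps directly onto a regular-expression character class. The Python sketch below is purely illustrative: the helper `has_epitope` and the labels in the dictionary are made up, and a regex match says nothing about actual antibody affinity.

```python
import re

# PD(T/S/G)RP written as a character class over one-letter amino acid codes.
EPITOPE = re.compile(r"PD[TSG]RP")


def has_epitope(peptide):
    """True if the peptide contains the linear PD(T/S/G)RP motif."""
    return EPITOPE.search(peptide.upper()) is not None


peptides = {
    "PEPSCAN core": "PDTRP",
    "Ser variant": "PDSRP",
    "Gly variant": "PDGRP",
    "mimotope": "AVGLSPDGSRGV",  # sequence quoted in the abstract
}
results = {name: has_epitope(seq) for name, seq in peptides.items()}
for name, found in results.items():
    print(name, found)
```

The mimotope does not contain the linear motif as written, which is consistent with its being described as a mimotope rather than as another instance of the epitope.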
Insights Into Upland Cotton (Gossypium hirsutum L.) Genetic Recombination Based on 3 High-Density Single-Nucleotide Polymorphism and a Consensus Map Developed Independently With Common Parents. Genomics Insights

USDA-ARS's Scientific Manuscript database

High-density linkage maps are vital to supporting the correct placement of scaffolds and gene sequences on chromosomes and fundamental to contemporary organismal research and scientific approaches to genetic improvement; high-density linkage maps are especially important in paleopolyploids with exce...
Attitude Importance and the False Consensus Effect.

ERIC Educational Resources Information Center

Fabrigar, Leandre R.; Krosnick, Jon A.

1995-01-01

Explores the possibility that importance may regulate the magnitude of the false consensus effect. Analysis revealed a strong false consensus effect but no reliable relation between its magnitude and attitude importance. Results contradict assumptions that the false consensus effect arises from attitudes that directly or indirectly influence…
A synthetic promoter library for constitutive gene expression in Lactobacillus plantarum.

PubMed

Rud, Ida; Jensen, Peter Ruhdal; Naterstad, Kristine; Axelsson, Lars

2006-04-01

A synthetic promoter library (SPL) for Lactobacillus plantarum has been developed, which generalizes the approach for obtaining synthetic promoters. The consensus sequence, derived from rRNA promoters extracted from the L. plantarum WCFS1 genome, was kept constant, and the non-consensus sequences were randomized. Construction of the SPL was performed in a vector (pSIP409) previously developed for high-level, inducible gene expression in L. plantarum and Lactobacillus sakei. A wide range of promoter strengths was obtained with the approach, covering 3-4 logs of expression levels in small increments of activity. The SPL was evaluated for the ability to drive beta-glucuronidase (GusA) and aminopeptidase N (PepN) expression. Protein production from the synthetic promoters was constitutive, and the most potent promoters gave high protein production with levels comparable to those of native rRNA promoters, and production of PepN protein corresponding to approximately 10-15 % of the total cellular protein. High correlation was obtained between the activities of promoters when tested in L. sakei and L. plantarum, which indicates the potential of the SPL for other Lactobacillus species. The SPL enables fine-tuning of stable gene expression for various applications in L. plantarum.
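The library design described in this abstract, with consensus positions held constant and the remaining positions randomized, can be sketched in Python as follows. The template is hypothetical: the TTGACA and TATAAT blocks are the textbook bacterial -35 and -10 boxes used here as placeholders, not the consensus the authors derived from L. plantarum rRNA promoters, and the spacer lengths are likewise assumed.

```python
import random


def randomize_promoter(template, rng):
    """Replace every N in the template with a random base; keep other positions fixed."""
    return "".join(rng.choice("ACGT") if base == "N" else base for base in template)


# Hypothetical template: fixed blocks stand in for consensus positions,
# N marks non-consensus positions that are free to vary.
TEMPLATE = "N" * 5 + "TTGACA" + "N" * 17 + "TATAAT" + "N" * 6

rng = random.Random(0)  # seeded only so the sketch is reproducible
library = [randomize_promoter(TEMPLATE, rng) for _ in range(5)]
for member in library:
    print(member)
```

Every library member shares the fixed blocks and differs only at the randomized positions, which is the property the abstract relies on to obtain a graded range of promoter strengths.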
RADH, a gene of Saccharomyces cerevisiae encoding a putative DNA helicase involved in DNA repair. Characteristics of radH mutants and sequence of the gene.

PubMed

Aboussekhra, A; Chanet, R; Zgaga, Z; Cassier-Chauvat, C; Heude, M; Fabre, F

1989-09-25

A new type of radiation-sensitive mutant of S. cerevisiae is described. The recessive radH mutation sensitizes haploids to the lethal effect of UV radiation in the G1 but not in the G2 mitotic phase. Homozygous diploids are as sensitive as G1 haploids. The UV-induced mutagenesis is depressed, while the induction of gene conversion is increased. The mutation is believed to channel the repair of lesions engaged in the mutagenic pathway into a recombination process, successful if the events involve sister chromatids but lethal if they involve homologous chromosomes. The sequence of the RADH gene reveals that it may code for a DNA helicase, with a Mr of 134 kDa. All the consensus domains of known DNA helicases are present. Besides these consensus regions, strong homologies with the Rep and UvrD helicases of E. coli were found. The RadH putative helicase appears to belong to the set of proteins involved in the error-prone repair mechanism, at least for UV-induced lesions, and could act in coordination with the Rev3 error-prone DNA polymerase.
Spherical: an iterative workflow for assembling metagenomic datasets.

PubMed

Hitch, Thomas C A; Creevey, Christopher J

2018-01-24

The consensus emerging from the study of microbiomes is that they are far more complex than previously thought, requiring better assemblies and increasingly deeper sequencing. However, current metagenomic assembly techniques regularly fail to incorporate all, or even the majority in some cases, of the sequence information generated for many microbiomes, negating this effort. This can especially bias the information gathered and the perceived importance of the minor taxa in a microbiome. We propose a simple but effective approach, implemented in Python, to address this problem. Based on an iterative methodology, our workflow (called Spherical) carries out successive rounds of assemblies with the sequencing reads not yet utilised. This approach also allows the user to reduce the resources required for very large datasets, by assembling random subsets of the whole in a "divide and conquer" manner. We demonstrate the accuracy of Spherical using simulated data based on completely sequenced genomes and the effectiveness of the workflow at retrieving lost information for taxa in three published metagenomics studies of varying sizes. Our results show that Spherical increased the number of reads utilized in the assembly by up to 109% compared to the base assembly. The additional contigs assembled by the Spherical workflow resulted in significant (P < 0.05) changes in the predicted taxonomic profile of all datasets analysed. Spherical is implemented in Python 2.7 and freely available for use under the MIT license. Source code and documentation are hosted publicly at: https://github.com/thh32/Spherical .
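The iterative idea in this abstract (assemble, set aside the reads that were used, then assemble again from what is left, optionally on random subsets) can be outlined in Python. This is not the Spherical code: the function `iterative_assembly`, its parameters, the stopping rule and the toy "assembler" and "mapper" below are all assumptions made for illustration, standing in for external tools.

```python
import random
from collections import Counter


def iterative_assembly(reads, assemble, is_used, subset_size, max_rounds=10, seed=0):
    """Repeatedly assemble random subsets of the reads that are not yet utilised.

    assemble(list_of_reads) -> list of contigs      (placeholder for an assembler)
    is_used(read, contigs)  -> bool                 (placeholder for read mapping)
    """
    rng = random.Random(seed)
    remaining = list(reads)
    all_contigs = []
    for _ in range(max_rounds):
        if not remaining:
            break
        subset = rng.sample(remaining, min(subset_size, len(remaining)))
        contigs = assemble(subset)
        if not contigs:
            break  # assumed stopping rule: nothing new was assembled this round
        all_contigs.extend(contigs)
        remaining = [r for r in remaining if not is_used(r, contigs)]
    return all_contigs, remaining


# Toy stand-ins: a "contig" is any read seen at least twice in the subset,
# and a read counts as used if it occurs inside one of the new contigs.
def toy_assemble(subset):
    return [seq for seq, n in Counter(subset).items() if n >= 2]


def toy_is_used(read, contigs):
    return any(read in contig for contig in contigs)


toy_reads = ["AAAA"] * 6 + ["CCCC"] * 3 + ["GT"]
contigs, leftover = iterative_assembly(toy_reads, toy_assemble, toy_is_used,
                                       subset_size=len(toy_reads))
print(sorted(contigs), leftover)
```

Passing a `subset_size` smaller than the dataset mimics the "divide and conquer" mode mentioned in the abstract, at the cost of possibly needing more rounds.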
Molecular cloning of crustins from the hemocytes of Brazilian penaeid shrimps.

PubMed

Rosa, Rafael Diego; Bandeira, Paula Terra; Barracco, Margherita Anna

2007-09-01

Crustins are antimicrobial peptides initially identified in the hemocytes of the crab Carcinus maenas (11.5-kDa peptide or carcinin) and recently also recognized in penaeid shrimps and other crustacean species. The aim of this study was to identify sequences encoding crustins from the hemocytes of four Brazilian penaeid species: Farfantepenaeus paulensis, Farfantepenaeus subtilis, Farfantepenaeus brasiliensis and Litopenaeus schmitti. Using primers based on a consensus nucleotide alignment of crustins from different crustaceans, cDNA sequences coding for crustins in all indigenous penaeid species were amplified. The four crustin sequences obtained encoded peptides containing a hydrophobic N-terminal region rich in glycine repeats and a C-terminal part with 12 cysteine residues and a conserved whey acidic protein domain. All crustin sequences obtained showed high amino acid similarity to each other and to crustins from litopenaeid shrimps (76-98%). This is the first report of crustins in native Brazilian penaeid shrimps.
Nucleotide sequencing analysis of a LEU gene of Candida maltosa which complements leuB mutation of Escherichia coli and leu2 mutation of Saccharomyces cerevisiae.

PubMed

Takagi, M; Kobayashi, N; Sugimoto, M; Fujii, T; Watari, J; Yano, K

1987-01-01

The expression of a LEU gene from Candida maltosa (designated as C-LEU2) isolated previously (Kawamura et al. 1983) was shown to be regulated, when transferred into Saccharomyces cerevisiae, by leucine and threonine in the medium, as in the case of LEU2 gene of S. cerevisiae. The coding region together with the regulatory region was subcloned and the nucleotide sequence was determined. When the sequence of the coding region was compared with that of LEU2, the homology was 72% for base pairs and 76% for deduced amino acids. Comparison of the regulatory region of C-LEU2 with those of LEU1 and LEU2 suggested a few short consensus sequences which are involved in regulation of gene expression by leucine and threonine in the medium.
Identification of high-specificity H-NS binding site in LEE5 promoter of enteropathogenic Escherichia coli (EPEC).

PubMed

Bhat, Abhay Prasad; Shin, Minsang; Choy, Hyon E

2014-07-01

Histone-like nucleoid structuring protein (H-NS) is a small but abundant protein present in enteric bacteria and is involved in compaction of the DNA and regulation of transcription. Recent reports have suggested that H-NS binds to a specific AT-rich DNA sequence rather than to intrinsically curved DNA in a sequence-independent manner. We detected two high-specificity H-NS binding sites in the LEE5 promoter of EPEC centered at -110 and -138, which were close to the proposed consensus H-NS binding motif. To identify the H-NS binding sequence in the LEE5 promoter, we took a random mutagenesis approach and found that the mutations at around -138 were specifically defective in regulation by H-NS. It was concluded that H-NS exerts maximum repression via the specific sequence at around -138 and subsequently contacts a subunit of RNAP through oligomerization.
The s29x gene of symbiotic bacteria in Amoeba proteus with a novel promoter.

PubMed

Pak, J W; Jeon, K W

1996-05-24

Gram-symbiotic bacteria (called X-bacteria), present in the xD strain of Amoeba proteus as required cell components, synthesize and export a large amount of a 29-kDa protein, S29x. S29x is exported into the host's cytoplasm across the bacterial membranes and the symbiosome membrane. The complete nucleotide (nt) sequence of the s29x gene of X-bacteria has been determined, and the promoter sequence and tsp have also been identified. The gene has a nonconventional promoter with putative nt sequences different from the known consensus sequences. When Escherichia coli cells are transformed with s29x, the gene is expressed and the product is secreted into the culture medium. Functions of S29x are not fully known, but it is suspected that S29x plays an important role in the symbiotic relationship between amoebae and X-bacteria.
Current whole-body MRI applications in the neurofibromatoses: NF1, NF2, and schwannomatosis.

PubMed

Ahlawat, Shivani; Fayad, Laura M; Khan, Muhammad Shayan; Bredella, Miriam A; Harris, Gordon J; Evans, D Gareth; Farschtschi, Said; Jacobs, Michael A; Chhabra, Avneesh; Salamon, Johannes M; Wenzel, Ralph; Mautner, Victor F; Dombi, Eva; Cai, Wenli; Plotkin, Scott R; Blakeley, Jaishri O

2016-08-16

The Response Evaluation in Neurofibromatosis and Schwannomatosis (REiNS) International Collaboration Whole-Body MRI (WB-MRI) Working Group reviewed the existing literature on WB-MRI, an emerging technology for assessing disease in patients with neurofibromatosis type 1 (NF1), neurofibromatosis type 2 (NF2), and schwannomatosis (SWN), to recommend optimal image acquisition and analysis methods to enable WB-MRI as an endpoint in NF clinical trials. A systematic process was used to review all published data about WB-MRI in NF syndromes to assess diagnostic accuracy, feasibility and reproducibility, and data about specific techniques for assessment of tumor burden, characterization of neoplasms, and response to therapy. WB-MRI at 1.5T or 3.0T is feasible for image acquisition. Short tau inversion recovery (STIR) sequence is used in all investigations to date, suggesting consensus about the utility of this sequence for detection of WB tumor burden in people with NF. There are insufficient data to support a consensus statement about the optimal imaging planes (axial vs coronal) or 2D vs 3D approaches. Functional imaging, although used in some NF studies, has not been systematically applied or evaluated. There are no comparative studies between regional vs WB-MRI or evaluations of WB-MRI reproducibility. WB-MRI is feasible for identifying tumors using both 1.5T and 3.0T systems. The STIR sequence is a core sequence. Additional investigation is needed to define the optimal approach for volumetric analysis, the reproducibility of WB-MRI in NF, and the diagnostic performance of WB-MRI vs regional MRI. © 2016 American Academy of Neurology.
enoLOGOS: a versatile web tool for energy normalized sequence logos

PubMed Central

Workman, Christopher T.; Yin, Yutong; Corcoran, David L.; Ideker, Trey; Stormo, Gary D.; Benos, Panayiotis V.

2005-01-01

enoLOGOS is a web-based tool that generates sequence logos from various input sources. Sequence logos have become a popular way to graphically represent DNA and amino acid sequence patterns from a set of aligned sequences. Each position of the alignment is represented by a column of stacked symbols with its total height reflecting the information content in this position. Currently, the available web servers are able to create logo images from a set of aligned sequences, but none of them generates weighted sequence logos directly from energy measurements or other sources. With the advent of high-throughput technologies for estimating the contact energy of different DNA sequences, tools that can create logos directly from binding affinity data are useful to researchers. enoLOGOS generates sequence logos from a variety of input data, including energy measurements, probability matrices, alignment matrices, count matrices and aligned sequences. Furthermore, enoLOGOS can represent the mutual information of different positions of the consensus sequence, a unique feature of this tool. Another web interface for our software, C2H2-enoLOGOS, generates logos for the DNA-binding preferences of the C2H2 zinc-finger transcription factor family members. enoLOGOS and C2H2-enoLOGOS are accessible over the web. PMID:15980495
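The column height of a conventional sequence logo, as described in this abstract, is the information content of that alignment position. A minimal sketch for DNA follows, assuming the textbook definition (log2 of the alphabet size minus the Shannon entropy) with no small-sample correction; it does not reproduce the energy-based weighting that enoLOGOS adds, and the four aligned sites are invented for illustration.

```python
import math
from collections import Counter


def column_information(aligned, alphabet_size=4):
    """Information content in bits per column: log2(alphabet_size) - entropy."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    info = []
    for i in range(length):
        counts = Counter(s[i] for s in aligned)  # gaps would count as symbols
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        info.append(math.log2(alphabet_size) - entropy)
    return info


# Invented example sites, not data from the paper.
site_info = column_information(["TATAAT", "TATAAT", "TACAAT", "TATGAT"])
```

Fully conserved columns reach the 2-bit maximum for DNA, while a column with one mismatch in four sequences scores about 1.19 bits.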
Evolution to pathogenicity of the parvovirus minute virus of mice in immunodeficient mice involves genetic heterogeneity at the capsid domain that determines tropism.

PubMed

López-Bueno, Alberto; Segovia, José C; Bueren, Juan A; O'Sullivan, M Gerard; Wang, Feng; Tattersall, Peter; Almendral, José M

2008-02-01

Very little is known about the role that evolutionary dynamics plays in diseases caused by mammalian DNA viruses. To address this issue in a natural host model, we compared the pathogenesis and genetics of the attenuated fibrotropic and the virulent lymphohematotropic strains of the parvovirus minute virus of mice (MVM), and of two invasive fibrotropic MVM (MVMp) variants carrying the I362S or K368R change in the VP2 major capsid protein, in the infection of severe combined immunodeficient (SCID) mice. By 14 to 18 weeks after oronasal inoculation, the I362S and K368R viruses caused lethal leukopenia characterized by tissue damage and inclusion bodies in hemopoietic organs, a pattern of disease found by 7 weeks postinfection with the lymphohematotropic MVM (MVMi) strain. The MVMp populations emerging in leukopenic mice showed consensus sequence changes in the MVMi genotype at residues G321E and A551V of VP2 in the I362S virus infections or A551V and V575A changes in the K368R virus infections, as well as a high level of genetic heterogeneity within a capsid domain at the twofold depression where these residues lay. Amino acids forming this capsid domain are important MVM tropism determinants, as exemplified by the switch in MVMi host range toward mouse fibroblasts conferred by coordinated changes of some of these residues and by the essential character of glutamate at residue 321 for maintaining MVMi tropism toward primary hemopoietic precursors. The few viruses within the spectrum of mutants from mice that maintained the respective parental 321G and 575V residues were infectious in a plaque assay, whereas the viruses with the main consensus sequences exhibited low levels of fitness in culture. 
Consistent with this finding, a recombinant MVMp virus carrying the consensus sequence mutations arising in the K368R virus background in mice failed to initiate infection in cell lines of different tissue origins, even though it caused rapid-course lethal leukopenia in SCID mice. The parental consensus genotype prevailed during leukopenia development, but plaque-forming viruses with the reversion of the 575A residue to valine emerged in affected organs. The disease caused by the DNA virus in mice, therefore, involves the generation of heterogeneous viral populations that may cooperatively interact for the hemopoietic syndrome. The evolutionary changes delineate a sector of the surface of the capsid that determines tropism and that surrounds the sialic acid receptor binding domain.
Initial sequence characterization of the rhabdoviruses of squamate reptiles, including a novel rhabdovirus from a caiman lizard (Dracaena guianensis).

PubMed

Wellehan, James F X; Pessier, Allan P; Archer, Linda L; Childress, April L; Jacobson, Elliott R; Tesh, Robert B

2012-08-17

Rhabdoviruses infect a variety of hosts, including non-avian reptiles. Consensus PCR techniques were used to obtain partial RNA-dependent RNA polymerase gene sequence from five rhabdoviruses of South American lizards; Marco, Chaco, Timbo, Sena Madureira, and a rhabdovirus from a caiman lizard (Dracaena guianensis). The caiman lizard rhabdovirus formed inclusions in erythrocytes, which may be a route for infecting hematophagous insects. This is the first information on behavior of a rhabdovirus in squamates. We also obtained sequence from two rhabdoviruses of Australian lizards, confirming previous Charleville virus sequence and finding that, unlike a previous sequence report but in agreement with serologic reports, Almpiwar virus is clearly distinct from Charleville virus. Bayesian and maximum likelihood phylogenetic analysis revealed that most known rhabdoviruses of squamates cluster in the Almpiwar subgroup. The exception is Marco virus, which is found in the Hart Park group. Copyright © 2012 Elsevier B.V. All rights reserved.
The GENCODE exome: sequencing the complete human exome

PubMed Central

Coffey, Alison J; Kokocinski, Felix; Calafato, Maria S; Scott, Carol E; Palta, Priit; Drury, Eleanor; Joyce, Christopher J; LeProust, Emily M; Harrow, Jen; Hunt, Sarah; Lehesjoki, Anna-Elina; Turner, Daniel J; Hubbard, Tim J; Palotie, Aarno

2011-01-01

Sequencing the coding regions, the exome, of the human genome is one of the major current strategies to identify low frequency and rare variants associated with human disease traits. So far, the most widely used commercial exome capture reagents have mainly targeted the consensus coding sequence (CCDS) database. We report the design of an extended set of targets for capturing the complete human exome, based on annotation from the GENCODE consortium. The extended set covers an additional 5594 genes and 10.3 Mb compared with the current CCDS-based sets. The additional regions include potential disease genes previously inaccessible to exome resequencing studies, such as 43 genes linked to ion channel activity and 70 genes linked to protein kinase activity. In total, the new GENCODE exome set developed here covers 47.9 Mb and performed well in sequence capture experiments. In the sample set used in this study, we identified over 5000 SNP variants more in the GENCODE exome target (24%) than in the CCDS-based exome sequencing. PMID:21364695

High-Throughput, Data-Rich Cellular RNA Device Engineering

PubMed Central

Townshend, Brent; Kennedy, Andrew B.; Xiang, Joy S.; Smolke, Christina D.

2015-01-01

Methods for rapidly assessing sequence-structure-function landscapes and developing conditional gene-regulatory devices are critical to our ability to manipulate and interface with biology. We describe a framework for engineering RNA devices from preexisting aptamers that exhibit ligand-responsive ribozyme tertiary interactions. Our methodology utilizes cell sorting, high-throughput sequencing, and statistical data analyses to enable parallel measurements of the activities of hundreds of thousands of sequences from RNA device libraries in the absence and presence of ligands. Our tertiary interaction RNA devices exhibit improved performance in terms of gene silencing, activation ratio, and ligand sensitivity as compared to optimized RNA devices that rely on secondary structure changes. We apply our method to building biosensors for diverse ligands and determine consensus sequences that enable ligand-responsive tertiary interactions. These methods advance our ability to develop broadly applicable genetic tools and to elucidate understanding of the underlying sequence-structure-function relationships that empower rational design of complex biomolecules. PMID:26258292
Structural organization of the porcine and human genes coding for a leydig cell-specific insulin-like peptide (LEY I-L) and chromosomal localization of the human gene (INSL3)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burkhardt E.; Adham, I.M.; Brosig, B.

1994-03-01

Leydig insulin-like protein (LEY I-L) is a member of the insulin-like hormone superfamily. The LEY I-L gene (designated INSL3) is expressed exclusively in prenatal and postnatal Leydig cells. The authors report here the cloning and nucleotide sequence of porcine and human LEY I-L genes including the 5' regions. Both genes consist of two exons and one intron. The organization of the LEY I-L gene is similar to that of insulin and relaxin. The transcription start site in the porcine and human LEY I-L gene is localized 13 and 14 bp upstream of the translation start site, respectively. Alignment of the 5' flanking regions of both genes reveals that the first 107 nucleotides upstream of the transcription start site exhibit an overall sequence similarity of 80%. This conserved region contains a consensus TATAA box, a CAAT-like element (GAAT), and a consensus SP1 sequence (GGGCGG) at equivalent positions in both genes and therefore may play a role in regulation of expression of the LEY I-L gene. The porcine and human genome contains a single copy of the LEY I-L gene. By in situ hybridization, the human gene was assigned to bands p13.2-p12 of the short arm of chromosome 19. 25 refs., 6 figs.
Mechanisms of radiation-induced gene responses

DOE Office of Scientific and Technical Information (OSTI.GOV)

Woloschak, G.E.; Paunesku, T.

1996-10-01

In the process of identifying genes differentially expressed in cells exposed to ultraviolet radiation, we have identified a transcript having a 26-bp region that is highly conserved in a variety of species including Bacillus circulans, yeast, pumpkin, Drosophila, mouse, and man. When in the 5' region (flanking region or UTR) of a gene, the sequence is predominantly in +/+ orientation with respect to the coding DNA strand, while in the coding region and the 3' region (UTR), the sequence is most frequently in the +/- orientation with respect to the coding DNA strand. In two genes, the element is split into two parts; however, in most cases, it is found only once but with a minimum of 11 consecutive nucleotides precisely depicting the original sequence. The element is found in a large number of different genes with diverse functions (from human ras p21 to B. circulans chitinase). Gel shift assays demonstrated the presence of a protein in HeLa cell extracts that binds to the sense and antisense single-stranded consensus oligomers, as well as to the double-stranded oligonucleotide. When double-stranded oligomer was used, the size shift demonstrated an additional protein-oligomer complex larger than the one bound to either sense or antisense single-stranded consensus oligomers alone. It is speculated either that this element binds to protein(s) important in maintaining DNA in a single-stranded orientation for transcription or, alternatively, that this element is important in the transcription-coupled DNA repair process.
In-silico analysis of putative HCV epitopes against Pakistani human leukocyte antigen background: An approach towards development of future vaccines for Pakistani population.

PubMed

Ashraf, Naeem Mahmood; Bilal, Muhammad; Mahmood, Malik Siddique; Hussain, Aadil; Mehboob, Muhammad Zubair

2016-09-01

The mounting burden of HCV-infected individuals and the soaring cost of treatment are a serious source of unease for developing countries. A number of approaches have been proposed to develop a vaccine against HCV, but the majority of them proved ineffective. Development of a vaccine by considering the geographical distribution of HCV genotypes and host genetics shows potential. In this research article, we have tried to predict the most putative HCV epitopes which are efficiently restricted by the most common HLA alleles in the Pakistani population through different computational algorithms. Thirteen selected, experimentally identified epitope sequences were used to derive consensus sequences in all genotypes of HCV. The obtained consensus sequences were used to predict their binding affinities with the most prevalent HLA alleles in the Pakistani population. Two Class-I epitopes from the NS4B region, one Class-I epitope from NS5A and one Class-II epitope from the NS3 region showed effective binding and proved to be highly putative to boost immune response. A cocktail of these four was checked for population coverage and gave 75.53% for Pakistani Asian and 70.77% for Pakistani Mixed populations with no allergenic response. Computational algorithms are a robust way to shortlist potential candidate epitopes for vaccine development, but further in vivo and in vitro studies are required to confirm their immunogenic properties. Copyright © 2016 Elsevier B.V. All rights reserved.
Structure and genomic organization of the human B1 receptor gene for kinins (BDKRB1).

PubMed

Bachvarov, D R; Hess, J F; Menke, J G; Larrivée, J F; Marceau, F

1996-05-01

Two subtypes of mammalian bradykinin receptors, B1 and B2 (BDKRB1 and BDKRB2), have been defined based on their pharmacological properties. The B1 type kinin receptors have weak affinity for intact BK or Lys-BK but strong affinity for kinin metabolites without the C-terminal arginine (e.g., des-Arg9-BK and Lys-des-Arg9-BK, also called des-Arg10-kallidin), which are generated by kininase I. The B1 receptor expression is up-regulated following tissue injury and inflammation (hyperemia, exudation, hyperalgesia, etc.). In the present study, we have cloned and sequenced the gene encoding human B1 receptor from a human genomic library. The human B1 receptor gene contains three exons separated by two introns. The first and the second exon are noncoding, while the coding region and the 3'-flanking region are located entirely on the third exon. The exon-intron arrangement of the human B1 receptor gene shows significant similarity with the genes encoding the B2 receptor subtype in human, mouse, and rat. Sequence analysis of the 5'-flanking region revealed the presence of a consensus TATA box and of numerous candidate transcription factor binding sequences. Primer extension experiments have shown the existence of multiple transcription initiation sites situated downstream and upstream from the consensus TATA box. Genomic Southern blot analysis indicated that the human B1 receptor is encoded by a single-copy gene.
AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

PubMed Central

2010-01-01

Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". 
Conclusions AlignMiner can be used to reliably detect divergent regions via several scoring methods that provide different levels of selectivity. Its predictions have been verified by experimental means. Hence, it is expected that its usage will save researchers' time and ensure an objective selection of the best-possible divergent region when closely related sequences are analysed. AlignMiner is freely available at http://www.scbi.uma.es/alignminer. PMID:20525162
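An entropy-based scan for divergent regions, of the general kind this abstract mentions, might look like the sketch below. It is an assumption-laden illustration rather than AlignMiner's scoring: the function names, the threshold and the minimum run length are all hypothetical, and the sequences are assumed to be pre-aligned to equal length.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def divergent_regions(aligned, threshold=0.5, min_length=2):
    """Return half-open (start, end) column ranges whose entropy exceeds threshold."""
    scores = [column_entropy(col) for col in zip(*aligned)]
    regions, start = [], None
    for i, score in enumerate(scores + [0.0]):  # sentinel closes a trailing run
        if score > threshold and start is None:
            start = i
        elif score <= threshold and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    return regions


# Invented alignment: columns 3 and 4 vary, all other columns are conserved.
regions = divergent_regions(["ACGTTTGA", "ACGAATGA", "ACGCGTGA"])
```

Raising `threshold` or `min_length` makes the scan more restrictive, which loosely mirrors the different levels of selectivity the abstract attributes to its scoring methods.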
LPmerge: an R package for merging genetic maps by linear programming.

PubMed

Endelman, Jeffrey B; Plomion, Christophe

2014-06-01

Consensus genetic maps constructed from multiple populations are an important resource for both basic and applied research, including genome-wide association analysis, genome sequence assembly and studies of evolution. The LPmerge software uses linear programming to efficiently minimize the mean absolute error between the consensus map and the linkage maps from each population. This minimization is performed subject to linear inequality constraints that ensure the ordering of the markers in the linkage maps is preserved. When marker order is inconsistent between linkage maps, a minimum set of ordinal constraints is deleted to resolve the conflicts. LPmerge is on CRAN at http://cran.r-project.org/web/packages/LPmerge. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
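The optimization described in this abstract can be written out as a linear program. The formulation below is a sketch based only on the abstract's wording; the symbols are introduced here for illustration, and the package's exact objective weights and constraint set may differ. Let $x_i$ be the consensus position of marker $i$, $y_{ij}$ its position in linkage map $j$, $M_j$ the set of markers in map $j$, and $N = \sum_j |M_j|$. The absolute errors are linearized with auxiliary variables $e_{ij}$.

```latex
\begin{aligned}
\min_{x,\,e}\quad & \frac{1}{N}\sum_{j}\sum_{i \in M_j} e_{ij} \\
\text{subject to}\quad
  & e_{ij} \ge x_i - y_{ij}, \qquad e_{ij} \ge y_{ij} - x_i
    && \text{for all } j,\; i \in M_j, \\
  & x_i \le x_k
    && \text{whenever marker } i \text{ precedes marker } k \text{ in some map } j .
\end{aligned}
```

At the optimum each $e_{ij}$ equals $|x_i - y_{ij}|$, so the objective is the mean absolute error. When two maps order a pair of markers differently, the ordering constraints are infeasible together, which is why some of them have to be dropped.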
Transcriptional activation of the Escherichia coli adaptive response gene aidB is mediated by binding of methylated Ada protein. Evidence for a new consensus sequence for Ada-binding sites.

PubMed

Landini, P; Volkert, M R

1995-04-07

The Escherichia coli aidB gene is part of the adaptive response to DNA methylation damage. Genes belonging to the adaptive response are positively regulated by the ada gene; the Ada protein acts as a transcriptional activator when methylated in one of its cysteine residues at position 69. Through DNaseI protection assays, we show that methylated Ada (meAda) is able to bind a DNA sequence between 40 and 60 base pairs upstream of the aidB transcriptional startpoint. Binding of meAda is necessary to activate transcription of the adaptive response genes; accordingly, in vitro transcription of aidB is dependent on the presence of meAda. Unmethylated Ada protein shows no protection against DNaseI digestion in the aidB promoter region nor does it promote aidB in vitro transcription. The aidB Ada-binding site shows only weak homology to the proposed consensus sequences for Ada-binding sites in E. coli (AAANNAA and AAAGCGCA) but shares a higher degree of similarity with the Ada-binding regions from other bacterial species, such as Salmonella typhimurium and Bacillus subtilis. Based on the comparison of five different Ada-dependent promoter regions, we suggest that a possible recognition sequence for meAda might be AATnnnnnnGCAA. Higher concentrations of Ada are required for the binding of aidB than for the ada promoter, suggesting lower affinity of the protein for the aidB Ada-binding site. Common features in the Ada-binding regions of ada and aidB are a high A/T content, the presence of an inverted repeat structure, and their position relative to the transcriptional start site. We propose that these elements, in addition to the proposed recognition sequence, are important for binding of the Ada protein.
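A spaced consensus of this kind is easy to scan for with a regular expression. The sketch below assumes the proposed recognition sequence reads as AAT, six unspecified nucleotides, then GCAA; the test sequence is invented and is not the aidB promoter.

```python
import re

# Assumed reading of the proposed motif: AAT + six unspecified nucleotides + GCAA.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(AAT[ACGT]{6}GCAA))")


def find_motif(sequence):
    """Return (0-based start, matched site) for every occurrence of the motif."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence.upper())]


# Invented sequence for illustration only.
hits = find_motif("ggcAATccgtagGCAAttt")
```

Only one strand is scanned here; a fuller search would also scan the reverse complement.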
Expression of the Caulobacter heat shock gene dnaK is developmentally controlled during growth at normal temperatures.

PubMed Central

Gomes, S L; Gober, J W; Shapiro, L

1990-01-01

Caulobacter crescentus has a single dnaK gene that is highly homologous to the hsp70 family of heat shock genes. Analysis of the cloned and sequenced dnaK gene has shown that the deduced amino acid sequence could encode a protein of 67.6 kilodaltons that is 68% identical to the DnaK protein of Escherichia coli and 49% identical to the Drosophila and human hsp70 protein family. A partial open reading frame 165 base pairs 3' to the end of dnaK encodes a peptide of 190 amino acids that is 59% identical to DnaJ of E. coli. Northern blot analysis revealed a single 4.0-kilobase mRNA homologous to the cloned fragment. Since the dnaK coding region is 1.89 kilobases, dnaK and dnaJ may be transcribed as a polycistronic message. S1 mapping and primer extension experiments showed that transcription initiated at two sites 5' to the dnaK coding sequence. A single start site of transcription was identified during heat shock at 42 degrees C, and the predicted promoter sequence conformed to the consensus heat shock promoters of E. coli. At normal growth temperature (30 degrees C), a different start site was identified 3' to the heat shock start site that conformed to the E. coli sigma 70 promoter consensus sequence. S1 protection assays and analysis of expression of the dnaK gene fused to the lux transcription reporter gene showed that expression of dnaK is temporally controlled under normal physiological conditions and that transcription occurs just before the initiation of DNA replication. Thus, in both human cells (I. K. L. Milarski and R. I. Morimoto, Proc. Natl. Acad. Sci. USA 83:9517-9521, 1986) and in a simple bacterium, the transcription of a hsp70 gene is temporally controlled as a function of the cell cycle under normal growth conditions. PMID:2345134
SSMART: Sequence-structure motif identification for RNA-binding proteins.

PubMed

Munteanu, Alina; Mukherjee, Neelanjan; Ohler, Uwe

2018-06-11

RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in the eukaryotic genomes, and each recognizes its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3'UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability: SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/. Supplementary data are available at Bioinformatics online.
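A consensus string over a degenerate alphabet can be matched against a sequence by expanding each code into a character class. The sketch below covers only the standard IUPAC nucleotide codes in their RNA form; the structure-aware extension that SSMART introduces is not modelled, and the function name and example sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes, written for RNA (U instead of T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}


def consensus_matches(consensus, rna):
    """Return 0-based start positions where the degenerate consensus matches."""
    body = "".join(f"[{IUPAC[code]}]" for code in consensus.upper())
    pattern = re.compile(f"(?=({body}))")  # lookahead allows overlapping hits
    return [m.start() for m in pattern.finditer(rna.upper())]


# Invented sequence; "Y" accepts C or U, so UGUA and UGCA match but UGAA does not.
positions = consensus_matches("UGYA", "ccUGUAggUGCAUGAA")
```

An unknown code raises a `KeyError`, which is a deliberate choice here so that typos in the consensus string are not silently ignored.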
Predictive models of safety based on audit findings: Part 2: Measurement of model validity.

PubMed

Hsiao, Yu-Lin; Drury, Colin; Wu, Changxu; Paquet, Victor

2013-07-01

Part 1 of this study sequence developed a human factors/ergonomics (HF/E) based classification system (termed HFACS-MA) for safety audit findings and proved its measurement reliability. In Part 2, we used the human error categories of HFACS-MA as predictors of future safety performance. Audit records and monthly safety incident reports from two airlines submitted to their regulatory authority were available for analysis, covering over 6.5 years. Two participants derived consensus results of HF/E errors from the audit reports using HFACS-MA. We adopted Neural Network and Poisson regression methods to establish nonlinear and linear prediction models respectively. These models were tested for the validity of prediction of the safety data, and only the Neural Network method resulted in substantially significant predictive ability for each airline. Alternative predictions from counting of audit findings and from time sequence of safety data produced some significant results, but of much smaller magnitude than HFACS-MA. The use of HF/E analysis of audit findings provided proactive predictors of future safety performance in the aviation maintenance field. Copyright © 2013 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Enterocin T, a novel class IIa bacteriocin produced by Enterococcus sp. 812.

PubMed

Chen, Yi-Sheng; Yu, Chi-Rong; Ji, Si-Hua; Liou, Min-Shiuan; Leong, Kun-Hon; Pan, Shwu-Fen; Wu, Hui-Chung; Lin, Yu-Hsuan; Yu, Bi; Yanagida, Fujitoshi

2013-09-01

Enterococcus sp. 812, isolated from fresh broccoli, was previously found to produce a bacteriocin active against a number of Gram-positive bacteria, including Listeria monocytogenes. Bacteriocin activity decreased slightly after autoclaving (121 °C for 15 min), but was inactivated by protease K. Mass spectrometry analysis revealed the bacteriocin mass to be approximately 4,521.34 Da. N-terminal amino acid sequencing yielded a partial sequence, NH2-ATYYGNGVYXDKKKXWVEWGQA, by Edman degradation, which contained the consensus class IIa bacteriocin motif YGNGV in the N-terminal region. The obtained partial sequence showed high homology with some enterococcal bacteriocins; however, no identical peptide or protein was found. This peptide was therefore considered to be a novel bacteriocin produced by Enterococcus sp. 812 and was termed enterocin T.
PigGIS: Pig Genomic Informatics System

PubMed Central

Ruan, Jue; Guo, Yiran; Li, Heng; Hu, Yafeng; Song, Fei; Huang, Xin; Kristiensen, Karsten; Bolund, Lars; Wang, Jun

2007-01-01

Pig Genomic Information System (PigGIS) is a web-based depository of pig (Sus scrofa) genomic learning mainly engineered for biomedical research to locate pig genes from their human homologs and position single nucleotide polymorphisms (SNPs) in different pig populations. It utilizes a variety of sequence data, including whole genome shotgun (WGS) reads and expressed sequence tags (ESTs), and achieves a successful mapping solution to the low-coverage genome problem. With the data presently available, we have identified a total of 15 700 pig consensus sequences covering 18.5 Mb of the homologous human exons. We have also recovered 18 700 SNPs and 20 800 unique 60mer oligonucleotide probes for future pig genome analyses. PigGIS can be freely accessed via the web. PMID:17090590
The pig CYP2E1 promoter is activated by COUP-TF1 and HNF-1 and is inhibited by androstenone.

PubMed

Tambyrajah, Winston S; Doran, Elena; Wood, Jeffrey D; McGivan, John D

2004-11-15

Functional analysis of the pig cytochrome P4502E1 (CYP2E1) promoter identified two major activating elements. One corresponded to the hepatic nuclear factor 1 (HNF-1) consensus binding sequence at nucleotides -128/-98 and the other was located in the region -292/-266. The binding of proteins in pig liver nuclear extracts to a synthetic double-stranded oligonucleotide corresponding to this more distal activating sequence was studied by electrophoretic mobility shift assay. The minimum protein binding sequence was identified as TGTTCTGACCTCTGGG. Gel super-shift assays identified the protein binding to this site as chick ovalbumin upstream promoter transcription factor 1 (COUP-TF1). Androstenone inhibited promoter activity in transfection experiments only with constructs which included the COUP-TF1 binding site. Androstenone inhibited COUP-TF1 binding to synthetic oligonucleotides but did not affect HNF-1 binding. The results offer an explanation for the inhibition of CYP2E1 protein expression by androstenone in isolated pig hepatocytes and may be relevant to the low expression of hepatic CYP2E1 in those pigs which accumulate high levels of androstenone in vivo.
Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome.

PubMed

Bickhart, Derek M; Rosen, Benjamin D; Koren, Sergey; Sayre, Brian L; Hastie, Alex R; Chan, Saki; Lee, Joyce; Lam, Ernest T; Liachko, Ivan; Sullivan, Shawn T; Burton, Joshua N; Huson, Heather J; Nystrom, John C; Kelley, Christy M; Hutchison, Jana L; Zhou, Yang; Sun, Jiajie; Crisà, Alessandra; Ponce de León, F Abel; Schwartz, John C; Hammond, John A; Waldbieser, Geoffrey C; Schroeder, Steven G; Liu, George E; Dunham, Maitreya J; Shendure, Jay; Sonstegard, Tad S; Phillippy, Adam M; Van Tassell, Curtis P; Smith, Timothy P L

2017-04-01

The decrease in sequencing cost and increased sophistication of assembly algorithms for short-read platforms has resulted in a sharp increase in the number of species with genome assemblies. However, these assemblies are highly fragmented, with many gaps, ambiguities, and errors, impeding downstream applications. We demonstrate current state of the art for de novo assembly using the domestic goat (Capra hircus) based on long reads for contig formation, short reads for consensus validation, and scaffolding by optical and chromatin interaction mapping. These combined technologies produced what is, to our knowledge, the most continuous de novo mammalian assembly to date, with chromosome-length scaffolds and only 649 gaps. Our assembly represents a ∼400-fold improvement in continuity due to properly assembled gaps, compared to the previously published C. hircus assembly, and better resolves repetitive structures longer than 1 kb, representing the largest repeat family and immune gene complex yet produced for an individual of a ruminant species.
Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome

PubMed Central

Bickhart, Derek M.; Rosen, Benjamin D.; Koren, Sergey; Sayre, Brian L.; Hastie, Alex R.; Chan, Saki; Lee, Joyce; Lam, Ernest T.; Liachko, Ivan; Sullivan, Shawn T.; Burton, Joshua N.; Huson, Heather J.; Nystrom, John C.; Kelley, Christy M.; Hutchison, Jana L.; Zhou, Yang; Sun, Jiajie; Crisà, Alessandra; de León, F. Abel Ponce; Schwartz, John C.; Hammond, John A.; Waldbieser, Geoffrey C.; Schroeder, Steven G.; Liu, George E.; Dunham, Maitreya J.; Shendure, Jay; Sonstegard, Tad S.; Phillippy, Adam M.; Van Tassell, Curtis P.; Smith, Timothy P.L.

2018-01-01

The decrease in sequencing cost and increased sophistication of assembly algorithms for short-read platforms has resulted in a sharp increase in the number of species with genome assemblies. However, these assemblies are highly fragmented, with many gaps, ambiguities, and errors, impeding downstream applications. We demonstrate current state of the art for de novo assembly using the domestic goat (Capra hircus), based on long reads for contig formation, short reads for consensus validation, and scaffolding by optical and chromatin interaction mapping. These combined technologies produced the most continuous de novo mammalian assembly to date, with chromosome-length scaffolds and only 649 gaps. Our assembly represents a ~400-fold improvement in continuity due to properly assembled gaps compared to the previously published C. hircus assembly, and better resolves repetitive structures longer than 1 kb, representing the largest repeat family and immune gene complex ever produced for an individual of a ruminant species. PMID:28263316
Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

PubMed

Pessôa, Rodrigo; Watanabe, Jaqueline Tomoko; Nukui, Youko; Pereira, Juliana; Casseb, Jorge; Kasseb, Jorge; de Oliveira, Augusto César Penalva; Segurado, Aluisio Cotrim; Sanabani, Sabri Saeed

2014-01-01

Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data will add to our current understanding of the evolutionary history of this medically important virus.
Finite-Horizon H∞ Consensus Control of Time-Varying Multiagent Systems With Stochastic Communication Protocol.

PubMed

Zou, Lei; Wang, Zidong; Gao, Huijun; Alsaadi, Fuad E

2017-03-31

This paper is concerned with the distributed H∞ consensus control problem for a discrete time-varying multiagent system with the stochastic communication protocol (SCP). A directed graph is used to characterize the communication topology of the multiagent network. The data transmission between each agent and the neighboring ones is implemented via a constrained communication channel where only one neighboring agent is allowed to transmit data at each time instant. The SCP is applied to schedule the signal transmission of the multiagent system. A sequence of random variables is utilized to capture the scheduling behavior of the SCP. By using the mapping technology combined with the Hadamard product, the closed-loop multiagent system is modeled as a time-varying system with a stochastic parameter matrix. The purpose of the addressed problem is to design a cooperative controller for each agent such that, for all probabilistic scheduling behaviors, the H∞ consensus performance is achieved over a given finite horizon for the closed-loop multiagent system. A necessary and sufficient condition is derived to ensure the H∞ consensus performance based on the completing squares approach and the stochastic analysis technique. Then, the controller parameters are obtained by solving two coupled backward recursive Riccati difference equations. Finally, a numerical example is given to illustrate the effectiveness of the proposed controller design scheme.
[Identification of Tibetan medicine "Dida" of Gentianaceae using DNA barcoding].

PubMed

Liu, Chuan; Zhang, Yu-Xin; Liu, Yue; Chen, Yi-Long; Fan, Gang; Xiang, Li; Xu, Jiang; Zhang, Yi

2016-02-01

The ITS2 barcode was used to identify the Tibetan medicine "Dida" and to secure its quality and safety in medication. A total of 151 experimental samples of 13 species from the Tibetan Plateau were studied, covering the Gentianaceae genera Swertia, Halenia, Gentianopsis, Comastoma, and Lomatogonium. ITS2 sequences were amplified, and purified PCR products were sequenced. Sequence assembly and consensus sequence generation were performed using CodonCode Aligner V3.7.1. The Kimura 2-Parameter (K2P) distances were calculated using MEGA 6.0, and neighbor-joining (NJ) phylogenetic trees were constructed. After alignment, 31 haplotypes were found across the 231 bp of ITS2 sequence, with an average G+C content of 61.40%. The NJ tree strongly supported the clustering of each species into its own clade, with a high identification success rate, except that Swertia bifolia and Swertia wolfangiana could not be distinguished from each other based on sequence divergence. DNA barcoding could be used as a fast and accurate identification method to distinguish the Tibetan medicine "Dida" and ensure its safe use. Copyright© by the Chinese Pharmaceutical Association.
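For readers unfamiliar with the K2P distance mentioned in this record, a minimal sketch follows. It uses the standard Kimura two-parameter formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions between two aligned sequences. The function name `k2p_distance` and the toy sequences are assumptions for illustration; this is not the MEGA implementation, and sites with gaps or ambiguous bases are simply skipped.

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition (A->G) in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Pairwise distances computed this way are the kind of input a neighbor-joining tree is built from.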
A comparative analysis of exome capture.

PubMed

Parla, Jennifer S; Iossifov, Ivan; Grabill, Ian; Spector, Mona S; Kramer, Melissa; McCombie, W Richard

2011-09-29

Human exome resequencing using commercial target capture kits has been and is being used for sequencing large numbers of individuals to search for variants associated with various human diseases. We rigorously evaluated the capabilities of two solution exome capture kits. These analyses help clarify the strengths and limitations of those data as well as systematically identify variables that should be considered in the use of those data. Each exome kit performed well at capturing the targets they were designed to capture, which mainly corresponds to the consensus coding sequences (CCDS) annotations of the human genome. In addition, based on their respective targets, each capture kit coupled with high coverage Illumina sequencing produced highly accurate nucleotide calls. However, other databases, such as the Reference Sequence collection (RefSeq), define the exome more broadly, and so not surprisingly, the exome kits did not capture these additional regions. Commercial exome capture kits provide a very efficient way to sequence select areas of the genome at very high accuracy. Here we provide the data to help guide critical analyses of sequencing data derived from these products.

Novel isoprenylated proteins identified by an expression library screen.

PubMed

Biermann, B J; Morehead, T A; Tate, S E; Price, J R; Randall, S K; Crowell, D N

1994-10-14

Isoprenylated proteins are involved in eukaryotic cell growth and signal transduction. The protein determinant for prenylation is a short carboxyl-terminal motif containing a cysteine, to which the isoprenoid is covalently attached via thioether linkage. To date, isoprenylated proteins have almost all been identified by demonstrating the attachment of an isoprenoid to previously known proteins. Thus, many isoprenylated proteins probably remain undiscovered. To identify novel isoprenylated proteins for subsequent biochemical study, colony blots of a Glycine max cDNA expression library were [3H]farnesyl-labeled in vitro. Proteins identified by this screen contained several different carboxyl termini that conform to consensus farnesylation motifs. These proteins included known farnesylated proteins (DnaJ homologs) and several novel proteins, two of which contained six or more tandem repeats of a hexapeptide having the consensus sequence (E/G)(G/P)EK(P/K)K. Thus, plants contain a diverse array of genes encoding farnesylated proteins, and our results indicate that fundamental differences in the identities of farnesylated proteins may exist between plants and other eukaryotes. Expression library screening by direct labeling can be adapted to identify isoprenylated proteins from other organisms, as well as proteins with other post-translational modifications.
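The hexapeptide consensus (E/G)(G/P)EK(P/K)K reported in this record translates directly into a regular expression. The sketch below counts the longest run of tandem repeats in a protein sequence; the function name `longest_tandem_run` and the test sequence are made up for illustration and do not come from the study.

```python
import re

# (E/G)(G/P)EK(P/K)K written as a character-class pattern; one or more tandem copies.
HEXAPEPTIDE_RUN = re.compile(r"(?:[EG][GP]EK[PK]K)+")


def longest_tandem_run(protein):
    """Return the largest number of consecutive hexapeptide repeats found."""
    runs = [len(m.group()) // 6 for m in HEXAPEPTIDE_RUN.finditer(protein.upper())]
    return max(runs, default=0)


# Hypothetical sequence with six tandem repeats flanked by unrelated residues.
example = "MA" + "EGEKPK" * 4 + "GPEKKK" * 2 + "LL"
print(longest_tandem_run(example))  # six consecutive repeats
```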
OARSI Clinical Trials Recommendations for Hip Imaging in Osteoarthritis

PubMed Central

Gold, Garry E.; Cicuttini, Flavia; Crema, Michel D.; Eckstein, Felix; Guermazi, Ali; Kijowski, Richard; Link, Thomas M.; Maheu, Emmanuel; Martel-Pelletier, Johanne; Miller, Colin G.; Pelletier, Jean-Pierre; Peterfy, Charles G.; Potter, Hollis G.; Roemer, Frank W.; Hunter, David J.

2015-01-01

Imaging of the hip in osteoarthritis (OA) has seen considerable progress in the past decade, with the introduction of new techniques that may be more sensitive to structural disease changes. The purpose of this expert-opinion, consensus-driven recommendation is to provide detail on how to apply hip imaging in disease-modifying clinical trials. It includes information on acquisition methods/techniques (including guidance on positioning for radiography, sequence/protocol recommendations/hardware for MRI); commonly encountered problems (including positioning, hardware and coil failures, artifacts associated with various MRI sequences); quality assurance/control procedures; measurement methods; measurement performance (reliability, responsiveness, and validity); recommendations for trials; and research recommendations. PMID:25952344
MARTA: a suite of Java-based tools for assigning taxonomic status to DNA sequences.

PubMed

Horton, Matthew; Bodenhausen, Natacha; Bergelson, Joy

2010-02-15

We have created a suite of Java-based software to better provide taxonomic assignments to DNA sequences. We anticipate that the program will be useful for protistologists, virologists, mycologists and other microbial ecologists. The program relies on NCBI utilities including the BLAST software and Taxonomy database and is easily manipulated at the command-line to specify a BLAST candidate's query-coverage or percent identity requirements; other options include the ability to set minimal consensus requirements (%) for each of the eight major taxonomic ranks (Domain, Kingdom, Phylum, ...) and whether to consider lower scoring candidates when the top-hit lacks taxonomic classification.
c-Myb Binds to a Sequence in the Proximal Region of the RAG-2 Promoter and Is Essential for Promoter Activity in T-Lineage Cells

PubMed Central

Wang, Qian-Fei; Lauring, Josh; Schlissel, Mark S.

2000-01-01

The RAG-2 gene encodes a component of the V(D)J recombinase which is essential for the assembly of antigen receptor genes in B and T lymphocytes. Previously, we reported that the transcription factor BSAP (PAX-5) regulates the murine RAG-2 promoter in B-cell lines. A partially overlapping but distinct region of the proximal RAG-2 promoter was also identified as an important element for promoter activity in T cells; however, the responsible factor was unknown. In this report, we present data demonstrating that c-Myb binds to a Myb consensus site within the proximal promoter and is critical for its activity in T-lineage cells. We show that c-Myb can transactivate a RAG-2 promoter-reporter construct in cotransfection assays and that this transactivation depends on the proximal promoter Myb consensus site. By using a chromatin immunoprecipitation (ChIP) strategy, fractionation of chromatin with anti-c-Myb antibody specifically enriched endogenous RAG-2 promoter DNA sequences. DNase I genomic footprinting revealed that the c-Myb site is occupied in a tissue-specific fashion in vivo. Furthermore, an integrated RAG-2 promoter construct with mutations at the c-Myb site was not enriched in the ChIP assay, while a wild-type integrated promoter construct was enriched. Finally, this lack of binding of c-Myb to a chromosomally integrated mutant RAG-2 promoter construct in vivo was associated with a striking decrease in promoter activity. We conclude that c-Myb regulates the RAG-2 promoter in T cells by binding to this consensus c-Myb binding site. PMID:11094072
Methodological Quality of Consensus Guidelines in Implant Dentistry

PubMed Central

Faggion, Clovis Mariano; Apaza, Karol; Ariza-Fritas, Tania; Málaga, Lilian; Giannakopoulos, Nikolaos Nikitas; Alarcón, Marco Antonio

2017-01-01

Background Consensus guidelines are useful to improve clinical decision making. Therefore, the methodological evaluation of these guidelines is of paramount importance. Low quality information may guide to inadequate or harmful clinical decisions. Objective To evaluate the methodological quality of consensus guidelines published in implant dentistry using a validated methodological instrument. Methods The six implant dentistry journals with impact factors were scrutinised for consensus guidelines related to implant dentistry. Two assessors independently selected consensus guidelines, and four assessors independently evaluated their methodological quality using the Appraisal of Guidelines for Research & Evaluation (AGREE) II instrument. Disagreements in the selection and evaluation of guidelines were resolved by consensus. First, the consensus guidelines were analysed alone. Then, systematic reviews conducted to support the guidelines were included in the analysis. Non-parametric statistics for dependent variables (Wilcoxon signed rank test) was used to compare both groups. Results Of 258 initially retrieved articles, 27 consensus guidelines were selected. Median scores in four domains (applicability, rigour of development, stakeholder involvement, and editorial independence), expressed as percentages of maximum possible domain scores, were below 50% (median, 26%, 30.70%, 41.70%, and 41.70%, respectively). The consensus guidelines and consensus guidelines + systematic reviews data sets could be compared for 19 guidelines, and the results showed significant improvements in all domain scores (p < 0.05). Conclusions Methodological improvement of consensus guidelines published in major implant dentistry journals is needed. The findings of the present study may help researchers to better develop consensus guidelines in implant dentistry, which will improve the quality and trust of information needed to make proper clinical decisions. PMID:28107405
Recombination in feline immunodeficiency virus from feral and companion domestic cats.

PubMed

Hayward, Jessica J; Rodrigo, Allen G

2008-06-17

Recombination is a relatively common phenomenon in retroviruses. We investigated recombination in Feline Immunodeficiency Virus from naturally-infected New Zealand domestic cats (Felis catus) by sequencing regions of the gag, pol and env genes. The occurrence of intragenic recombination was highest in env, with evidence of recombination in 6.4% (n = 156) of all cats. A further recombinant was identified in each of the gag (n = 48) and pol (n = 91) genes. Comparisons of phylogenetic trees across genes identified cases of incongruence, indicating intergenic recombination. Three (7.7%, n = 39) of these incongruencies were found to be significantly different using the Shimodaira-Hasegawa test. Surprisingly, our phylogenies from the gag and pol genes showed that no New Zealand sequences group with reference subtype C sequences within intrasubtype pairwise distances. Indeed, we find one and two distinct unknown subtype groups in gag and pol, respectively. These observations cause us to speculate that these New Zealand FIV strains have undergone several recombination events between subtype A parent strains and undefined unknown subtype strains, similar to the evolutionary history hypothesised for HIV-1 "subtype E". Endpoint dilution sequencing was used to confirm the consensus sequences of the putative recombinants and unknown subtype groups, providing evidence for the authenticity of these sequences. Endpoint dilution sequencing also resulted in the identification of a dual infection event in the env gene. In addition, an intrahost recombination event between variants of the same subtype in the pol gene was established. This is the first known example of naturally-occurring recombination in a cat with infection of the parent strains. Evidence of intragenic recombination in the gag, pol and env regions, and complex intergenic recombination, of FIV from naturally-infected domestic cats in New Zealand was found. Strains of unknown subtype were identified in all three gene regions. These results have implications for the use of the current FIV vaccine in New Zealand.
Assistance Technology Research Center

DTIC Science & Technology

2003-02-01

and perhaps modifying - some social behaviors of people with autism. "* Leveraging ancillary funding from both NIDRR and NSF, Anthrotronix is three...to independence for many individuals with traumatic brain injury, stroke, and autism. This project will develop virtual environments, using the...pentapeptide consensus sequence of the human NMDA receptor NR2a and NR2b subunits. Magnetic resonance imaging (MRI) will be performed for evaluation of
Inducible Transgenic Models of BRCA1 Function

DTIC Science & Technology

1999-10-01

inducible expression vectors were created to conditionally express four different hammerhead ribozymes designed to specifically cleave the Brca...transcript. Hammerhead ribozymes are catalytic RNAs that efficiently cleave RNA and thereby down-regulate gene expression. Hammerhead ribozymes can cleave...any RNA containing its 5’-UH-3’ consensus sequence where U can be replaced by a C, and H=C, U or A. Hammerhead ribozymes effectively and selectively
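The consensus rule quoted in this record (5'-UH-3', where U may be replaced by C and H is C, U, or A) can be turned into a simple scan for candidate target sites. The sketch below is an assumption-laden illustration: the function name `candidate_sites` and the example RNA are hypothetical, and the scan reports every dinucleotide that satisfies the rule as stated, without modeling ribozyme arm design or target accessibility.

```python
import re

# First base U or C, second base H (A, C, or U); the lookahead reports overlapping sites.
SITE = re.compile(r"(?=([UC][ACU]))")


def candidate_sites(rna):
    """Return (1-based position, dinucleotide) for each site matching the rule."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start() + 1, m.group(1)) for m in SITE.finditer(rna)]


# Hypothetical six-nucleotide target used only to show the output shape.
print(candidate_sites("GUCAUG"))  # sites at positions 2 (UC) and 3 (CA)
```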
Intelligent Distributed Systems

DTIC Science & Technology

2015-10-23

periodic gossiping algorithms by using convex combination rules rather than standard averaging rules. On a ring graph, we have discovered how to sequence...the gossips within a period to achieve the best possible convergence rate and we have related this optimal value to the classic edge coloring problem...consensus. There are three different approaches to distributed averaging: linear iterations, gossiping, and double linear iterations which are also known as
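To make the idea of periodic gossiping on a ring concrete, here is a minimal sketch. Everything in it is an assumption for illustration rather than the report's algorithm: the function names, the weight parameter `w` (w = 0.5 gives standard averaging, other values in (0, 1) give a convex combination), and the schedule, which gossips on even-numbered ring edges first and odd-numbered edges second, in the spirit of an edge coloring.

```python
def gossip(values, i, j, w=0.5):
    """Agents i and j exchange values using a convex combination with weight w."""
    xi, xj = values[i], values[j]
    values[i] = (1 - w) * xi + w * xj
    values[j] = w * xi + (1 - w) * xj  # the pair's sum is preserved


def periodic_gossip_on_ring(initial, periods, w=0.5):
    """Repeat a fixed sequence of pairwise gossips over a ring graph."""
    x = list(initial)
    n = len(x)
    edges = [(i, (i + 1) % n) for i in range(n)]
    schedule = edges[0::2] + edges[1::2]  # one "color" of edges, then the other
    for _ in range(periods):
        for i, j in schedule:
            gossip(x, i, j, w)
    return x


# Four agents; every value should approach the average of the initial values.
print(periodic_gossip_on_ring([1.0, 2.0, 3.0, 6.0], periods=50))
```

Because each gossip preserves the sum of the two values involved, the network average is invariant, which is why the agents can only agree on the mean of the initial values.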
A consensus least squares support vector regression (LS-SVR) for analysis of near-infrared spectra of plant samples.

PubMed

Li, Yankun; Shao, Xueguang; Cai, Wensheng

2007-04-15

Consensus modeling, which combines the results of multiple independent models to produce a single prediction, avoids the instability of a single model. Based on this principle, a consensus least squares support vector regression (LS-SVR) method for calibrating near-infrared (NIR) spectra was proposed. In the proposed approach, NIR spectra of plant samples were first preprocessed using the discrete wavelet transform (DWT) to filter out spectral background and noise; the consensus LS-SVR technique was then used to build the calibration model. After optimization of the parameters involved in the modeling, a satisfactory model was obtained for predicting the content of reducing sugar in plant samples. The predicted results show that the consensus LS-SVR model is more robust and reliable than the conventional partial least squares (PLS) and LS-SVR methods.
Evaluation of the genetic diversity of Plum pox virus in a single plum tree.

PubMed

Predajňa, Lukáš; Šubr, Zdeno; Candresse, Thierry; Glasa, Miroslav

2012-07-01

Genetic diversity of Plum pox virus (PPV) and its distribution within a single perennial woody host (plum, Prunus domestica) has been evaluated. A plum tree was triply infected by chip-budding with PPV-M, PPV-D and PPV-Rec isolates in 2003 and left to develop untreated under open field conditions. In September 2010 leaf and fruit samples were collected from different parts of the tree canopy. A 745-bp NIb-CP fragment of PPV genome, containing the hypervariable region encoding the CP N-terminal end was amplified by RT-PCR from each sample and directly sequenced to determine the dominant sequence. In parallel, the PCR products were cloned and a total of 105 individual clones were sequenced. Sequence analysis revealed that after 7 years of infection, only PPV-M was still detectable in the tree and that the two other isolates (PPV-Rec and PPV-D) had been displaced. Despite the fact that the analysis targeted a relatively short portion of the genome, a substantial amount of intra-isolate variability was observed for PPV-M. A total of 51 different haplotypes could be identified from the 105 individual sequences, two of which were largely dominant. However, no clear-cut structuration of the viral population by the tree architecture could be highlighted although the results obtained suggest the possibility of intra-leaf/fruit differentiation of the viral population. Comparison of the consensus sequence with the original source isolate showed no difference, suggesting within-plant stability of this original isolate under open field conditions. Copyright © 2012 Elsevier B.V. All rights reserved.
A charge-dependent mechanism is responsible for the dynamic accumulation of proteins inside nucleoli.

PubMed

Musinova, Yana R; Kananykhina, Eugenia Y; Potashnikova, Daria M; Lisitsyna, Olga M; Sheval, Eugene V

2015-01-01

The majority of known nucleolar proteins are freely exchanged between the nucleolus and the surrounding nucleoplasm. One way proteins are retained in the nucleoli is by the presence of specific amino acid sequences, namely nucleolar localization signals (NoLSs). The mechanism by which NoLSs retain proteins inside the nucleoli is still unclear. Here, we present data showing that the charge-dependent (electrostatic) interactions of NoLSs with nucleolar components lead to nucleolar accumulation as follows: (i) known NoLSs are enriched in positively charged amino acids, but the NoLS structure is highly heterogeneous, and it is not possible to identify a consensus sequence for this type of signal; (ii) in two analyzed proteins (NF-κB-inducing kinase and HIV-1 Tat), the NoLS corresponds to a region that is enriched for positively charged amino acid residues; substituting charged amino acids with non-charged ones reduced the nucleolar accumulation in proportion to the charge reduction, and nucleolar accumulation efficiency was strongly correlated with the predicted charge of the tested sequences; and (iii) sequences containing only lysine or arginine residues (which were referred to as imitative NoLSs, or iNoLSs) are accumulated in the nucleoli in a charge-dependent manner. The results of experiments with iNoLSs suggested that charge-dependent accumulation inside the nucleoli was dependent on interactions with nucleolar RNAs. The results of this work are consistent with the hypothesis that nucleolar protein accumulation by NoLSs can be determined by the electrostatic interaction of positively charged regions with nucleolar RNAs rather than by any sequence-specific mechanism. Copyright © 2014 Elsevier B.V. All rights reserved.
Effect of the linkers between the zinc fingers in zinc finger protein 809 on gene silencing and nuclear localization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ichida, Yu, E-mail: ichida-y@ncchd.go.jp; Utsunomiya, Yuko; Onodera, Masafumi

2016-03-18

Zinc finger protein 809 (ZFP809) belongs to the Kruppel-associated box-containing zinc finger protein (KRAB-ZFP) family and functions in repressing the expression of Moloney murine leukemia virus (MoMLV). ZFP809 binds to the primer-binding site (PBS) located downstream of the MoMLV-long terminal repeat (LTR) and induces epigenetic modifications at integration sites, such as repressive histone modifications and de novo DNA methylation. KRAB-ZFPs contain consensus TGEKP linkers between C2H2 zinc fingers. The phosphorylation of threonine residues within linkers leads to the inactivation of zinc finger binding to target sequences. ZFP809 also contains consensus linkers between zinc fingers. However, the function of ZFP809 linkers remains unknown. In the present study, we constructed ZFP809 proteins containing mutated linkers and examined their ability to silence transgene expression driven by MLV, binding ability to MLV PBS, and cellular localization. The results of the present study revealed that the linkers affected the ability of ZFP809 to silence transgene expression. Furthermore, this effect could be partly attributed to changes in the localization of ZFP809 proteins containing mutated linkers. Further characterization of ZFP809 linkers is required for understanding the functions and features of KRAB-ZFP-containing linkers. - Highlights: • ZFP809 has three consensus linkers between the zinc fingers. • Linkers are required for ZFP809 to silence transgene expression driven by MLV-LTR. • Linkers affect the precise nuclear localization of ZFP809.
Validation of consensus quantitative trait loci associated with resistance to multiple foliar pathogens of maize.

PubMed

Asea, Godfrey; Vivek, Bindiganavile S; Bigirwa, George; Lipps, Patrick E; Pratt, Richard C

2009-05-01

Maize production in sub-Saharan Africa incurs serious losses to epiphytotics of foliar diseases. Quantitative trait loci conditioning partial resistance (rQTL) to infection by causal agents of gray leaf spot (GLS), northern corn leaf blight (NCLB), and maize streak have been reported. Our objectives were to identify simple-sequence repeat (SSR) molecular markers linked to consensus rQTL and one recently identified rQTL associated with GLS, and to determine their suitability as tools for selection of improved host resistance. We conducted evaluations of disease severity phenotypes in separate field nurseries, each containing 410 F2:3 families derived from a cross between maize inbred CML202 (NCLB and maize streak resistant) and VP31 (a GLS-resistant breeding line) that possess complementary rQTL. F2:3 families were selected for resistance based on genotypic (SSR marker), phenotypic, or combined data and the selected F3:4 families were reevaluated. Phenotypic values associated with SSR markers for consensus rQTL in bins 4.08 for GLS, 5.04 for NCLB, and 1.04 for maize streak significantly reduced disease severity in both generations based on single-factor analysis of variance and marker-interval analysis. These results were consistent with the presence of homozygous resistant parent alleles, except in bin 8.06, where markers were contributed by the NCLB-susceptible parent. Only one marker associated with resistance could be confirmed in bins 2.09 (GLS) and 3.06 (NCLB), illustrating the need for more robust rQTL discovery, fine-mapping, and validation prior to undertaking marker-based selection.
Return of genomic results to research participants: the floor, the ceiling, and the choices in between.

PubMed

Jarvik, Gail P; Amendola, Laura M; Berg, Jonathan S; Brothers, Kyle; Clayton, Ellen W; Chung, Wendy; Evans, Barbara J; Evans, James P; Fullerton, Stephanie M; Gallego, Carlos J; Garrison, Nanibaa' A; Gray, Stacy W; Holm, Ingrid A; Kullo, Iftikhar J; Lehmann, Lisa Soleymani; McCarty, Cathy; Prows, Cynthia A; Rehm, Heidi L; Sharp, Richard R; Salama, Joseph; Sanderson, Saskia; Van Driest, Sara L; Williams, Marc S; Wolf, Susan M; Wolf, Wendy A; Burke, Wylie

2014-06-05

As more research studies incorporate next-generation sequencing (including whole-genome or whole-exome sequencing), investigators and institutional review boards face difficult questions regarding which genomic results to return to research participants and how. An American College of Medical Genetics and Genomics 2013 policy paper suggesting that pathogenic mutations in 56 specified genes should be returned in the clinical setting has raised the question of whether comparable recommendations should be considered in research settings. The Clinical Sequencing Exploratory Research (CSER) Consortium and the Electronic Medical Records and Genomics (eMERGE) Network are multisite research programs that aim to develop practical strategies for addressing questions concerning the return of results in genomic research. CSER and eMERGE committees have identified areas of consensus regarding the return of genomic results to research participants. In most circumstances, if results meet an actionability threshold for return and the research participant has consented to return, genomic results, along with referral for appropriate clinical follow-up, should be offered to participants. However, participants have a right to decline the receipt of genomic results, even when doing so might be viewed as a threat to the participants' health. Research investigators should be prepared to return research results and incidental findings discovered in the course of their research and meeting an actionability threshold, but they have no ethical obligation to actively search for such results. These positions are consistent with the recognition that clinical research is distinct from medical care in both its aims and its guiding moral principles. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Is There a Consensus on Consensus Methodology? Descriptions and Recommendations for Future Consensus Research.

PubMed

Waggoner, Jane; Carline, Jan D; Durning, Steven J

2016-05-01

The authors of this article reviewed the methodology of three common consensus methods: nominal group process, consensus development panels, and the Delphi technique. The authors set out to determine how a majority of researchers are conducting these studies, how they are analyzing results, and subsequently the manner in which they are reporting their findings. The authors conclude with a set of guidelines and suggestions designed to aid researchers who choose to use the consensus methodology in their work. Overall, researchers need to describe their inclusion criteria. In addition, on the basis of the current literature, the authors found that a panel size of 5 to 11 members was most beneficial across all consensus methods described. Lastly, the authors agreed that the statistical analyses done in consensus method studies should be as rigorous as possible and that the predetermined definition of consensus must be included in the ultimate manuscript. More specific recommendations are given for each of the three consensus methods described in the article.
Base-Calling Algorithm with Vocabulary (BCV) Method for Analyzing Population Sequencing Chromatograms

PubMed Central

Fantin, Yuri S.; Neverov, Alexey D.; Favorov, Alexander V.; Alvarez-Figueroa, Maria V.; Braslavskaya, Svetlana I.; Gordukova, Maria A.; Karandashova, Inga V.; Kuleshov, Konstantin V.; Myznikova, Anna I.; Polishchuk, Maya S.; Reshetov, Denis A.; Voiciehovskaya, Yana A.; Mironov, Andrei A.; Chulanov, Vladimir P.

2013-01-01

Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing. PMID:23382983
Expressed sequence tags from the oomycete fish pathogen Saprolegnia parasitica reveal putative virulence factors

PubMed Central

Torto-Alalibo, Trudy; Tian, Miaoying; Gajendran, Kamal; Waugh, Mark E; van West, Pieter; Kamoun, Sophien

2005-01-01

Background: The oomycete Saprolegnia parasitica is one of the most economically important fish pathogens. There is a dramatic recrudescence of Saprolegnia infections in aquaculture since the use of the toxic organic dye malachite green was banned in 2002. Little is known about the molecular mechanisms underlying pathogenicity in S. parasitica and other animal pathogenic oomycetes. In this study we used a genomics approach to gain a first insight into the transcriptome of S. parasitica. Results: We generated 1510 expressed sequence tags (ESTs) from a mycelial cDNA library of S. parasitica. A total of 1279 consensus sequences corresponding to 525944 base pairs were assembled. About half of the unigenes showed similarities to known protein sequences or motifs. The S. parasitica sequences tended to be relatively divergent from Phytophthora sequences. Based on the sequence alignments of 18 conserved proteins, the average amino acid identity between S. parasitica and three Phytophthora species was 77% compared to 93% within Phytophthora. Several S. parasitica cDNAs, such as those with similarity to fungal type I cellulose binding domain proteins, PAN/Apple module proteins, glycosyl hydrolases, proteases, as well as serine and cysteine protease inhibitors, were predicted to encode secreted proteins that could function in virulence. Some of these cDNAs were more similar to fungal proteins than to other eukaryotic proteins, confirming that oomycetes and fungi share some virulence components despite their evolutionary distance. Conclusion: We provide a first glimpse into the gene content of S. parasitica, a reemerging oomycete fish pathogen. These resources will greatly accelerate research on this important pathogen. The data is available online through the Oomycete Genomics Database [1]. PMID:16076392
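The average amino acid identity quoted in the record above can be illustrated with a minimal Python sketch. It assumes identity is counted over aligned, non-gap positions of pre-aligned sequence pairs and then averaged across pairs; the abstract does not state the exact convention the authors used, and the function names and toy sequences here are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def mean_identity(aligned_pairs):
    """Unweighted mean of percent identity across several aligned pairs."""
    values = [percent_identity(a, b) for a, b in aligned_pairs]
    return sum(values) / len(values)


# Toy aligned pairs (not real S. parasitica or Phytophthora data).
toy_pairs = [("AAAA", "AAAA"), ("AAAA", "AAAT")]
print(mean_identity(toy_pairs))
```

Excluding gapped columns is only one common choice; counting gaps as mismatches, or weighting each pair by alignment length, would give different averages.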
A single alteration 20 nt 5′ to an editing target inhibits chloroplast RNA editing in vivo

PubMed Central

Reed, Martha L.; Peeters, Nemo M.; Hanson, Maureen R.

2001-01-01

Transcripts of typical dicot plant plastid genes undergo C→U RNA editing at approximately 30 locations, but there is no consensus sequence surrounding the C targets of editing. The cis-acting elements required for editing of the C located at tobacco rpoB editing site II were investigated by introducing translatable chimeric minigenes containing sequence –20 to +6 surrounding the C target of editing. When the –20 to +6 sequence specified by the homologous region present in the black pine chloroplast genome was incorporated, virtually no editing of the transcripts occurred in transgenic tobacco plastids. Nucleotides that differ between the black pine and tobacco sequence were tested for their role in C→U editing by designing chimeric genes containing one or more of these divergent nucleotides. Surprisingly, the divergent nucleotide that had the strongest negative effect on editing of the minigene transcript was located –20 nt 5′ to the C target of editing. Expression of transgene transcripts carrying the 27 nt sequence did not affect the editing extent of the endogenous rpoB transcripts, even though the chimeric transcripts were much more abundant than those of the endogenous gene. In plants carrying a 93 nt rpoB editing site sequence, transgene transcripts accumulated to a level three times greater than transgene transcripts in the plants carrying the 27 nt rpoB editing sites and resulted in a reduction in editing of the endogenous transcripts from 100 to 50%. Both a lower affinity of the 27 nt site for a trans-acting factor and lower abundance of the transcript could explain why expression of minigene transcripts containing the 27 nt sequence did not affect endogenous editing. PMID:11266552
Analysis of Ribosome Inactivating Protein (RIP): A Bioinformatics Approach

NASA Astrophysics Data System (ADS)

Jothi, G. Edward Gnana; Majilla, G. Sahaya Jose; Subhashini, D.; Deivasigamani, B.

2012-10-01

In spite of the medical advances in recent years, the world is in need of different sources to counter certain health issues. Ribosome Inactivating Proteins (RIPs) were found to be one among them. To provide easy access to information about RIPs, there is a need to analyse them with a view to constructing a database on RIPs. Multiple sequence alignment was also done to screen for homologues of significant RIPs from rare sources against RIPs from easily available sources in terms of similarity. Protein sequences were retrieved from SWISS-PROT and further analysed using pairwise and multiple sequence alignment. Analysis shows that 151 RIPs have been characterized to date. Amongst them, there are 87 type I, 37 type II, 1 type III and 25 unknown RIPs. Information on sequence length, indicating whether the full or a partial sequence is available for the various RIPs, was also obtained. The multiple sequence alignment of 37 type I RIPs using the online server Multalin indicates the presence of 20 conserved residues. Pairwise alignment and multiple sequence alignment of certain selected RIPs in two groups, namely Group I and Group II, were carried out and the consensus level was found to be 98%, 98% and 90% respectively.
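How conserved residues might be counted in a multiple sequence alignment can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Multalin procedure: it treats a column as conserved only when every sequence carries the same non-gap residue, and it reports a per-column majority residue with its frequency. The function names and the toy alignment are hypothetical.

```python
from collections import Counter


def column_consensus(alignment):
    """Return (most common symbol, its fraction) for each alignment column."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*alignment):
        symbol, count = Counter(column).most_common(1)[0]
        result.append((symbol, count / len(column)))
    return result


def conserved_positions(alignment, gap="-"):
    """Zero-based indices of columns where all sequences share one non-gap residue."""
    return [
        index
        for index, (symbol, fraction) in enumerate(column_consensus(alignment))
        if fraction == 1.0 and symbol != gap
    ]


# Toy alignment of three short peptides (not real RIP sequences).
toy_alignment = ["MKVLA", "MKILA", "MRVLA"]
print(conserved_positions(toy_alignment))
print("".join(symbol for symbol, _ in column_consensus(toy_alignment)))
```

A looser threshold (for example, a fraction of at least 0.9) could stand in for a percentage consensus level, though the record above does not say how its consensus levels were defined.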
