BANNAI, Hiroshi; NEMOTO, Manabu; TSUJIMURA, Koji; YAMANAKA, Takashi; MAEDA, Ken; KONDO, Takashi
2015-01-01
To increase the sensitivity of an enzyme-linked immunosorbent assay (ELISA) for equine herpesvirus type 4 (EHV-4) that uses a 12-mer peptide of glycoprotein G (gG4-12-mer: MKNNPIYSEGSL) [4], we used a longer peptide consisting of a 24-mer repeat sequence (gG4-24-mer: MKNNPIYSEGSLMLNVQHDDSIHT) as an antigen. Sera of horses experimentally infected with EHV-4 reacted much more strongly to the gG4-24-mer peptide than to the gG4-12-mer peptide. We used peptide ELISAs to test paired sera from horses naturally infected with EHV-4 (n=40). gG4-24-mer ELISA detected 37 positive samples (92.5%), whereas gG4-12-mer ELISA detected only 28 (70.0%). gG4-24-mer ELISA was much more sensitive than gG4-12-mer ELISA. PMID:26424485
Bannai, Hiroshi; Nemoto, Manabu; Tsujimura, Koji; Yamanaka, Takashi; Maeda, Ken; Kondo, Takashi
2016-02-01
To increase the sensitivity of an enzyme-linked immunosorbent assay (ELISA) for equine herpesvirus type 4 (EHV-4) that uses a 12-mer peptide of glycoprotein G (gG4-12-mer: MKNNPIYSEGSL) [4], we used a longer peptide consisting of a 24-mer repeat sequence (gG4-24-mer: MKNNPIYSEGSLMLNVQHDDSIHT) as an antigen. Sera of horses experimentally infected with EHV-4 reacted much more strongly to the gG4-24-mer peptide than to the gG4-12-mer peptide. We used peptide ELISAs to test paired sera from horses naturally infected with EHV-4 (n=40). gG4-24-mer ELISA detected 37 positive samples (92.5%), whereas gG4-12-mer ELISA detected only 28 (70.0%). gG4-24-mer ELISA was much more sensitive than gG4-12-mer ELISA.
Pervasive sequence patents cover the entire human genome.
Rosenfeld, Jeffrey A; Mason, Christopher E
2013-01-01
The scope and eligibility of patents for genetic sequences have been debated for decades, but a critical case regarding gene patents (Association of Molecular Pathologists v. Myriad Genetics) is now reaching the US Supreme Court. Recent court rulings have supported the assertion that such patents can provide intellectual property rights on sequences as small as 15 nucleotides (15mers), but an analysis of all current US patent claims and the human genome presented here shows that 15mer sequences from all human genes match at least one other gene. The average gene matches 364 other genes as 15mers; the breast-cancer-associated gene BRCA1 has 15mers matching at least 689 other genes. Longer sequences (1,000 bp) still showed extensive cross-gene matches. Furthermore, 15mer-length claims from bovine and other animal patents could also claim as much as 84% of the genes in the human genome. In addition, when we expanded our analysis to full-length patent claims on DNA from all US patents to date, we found that 41% of the genes in the human genome have been claimed. Thus, current patents for both short and long nucleotide sequences are extraordinarily non-specific and create an uncertain, problematic liability for genomic medicine, especially in regard to targeted re-sequencing and other sequence diagnostic assays.
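The cross-gene 15mer matching described above can be illustrated with a minimal sketch. It assumes exact string matching on one strand only; the function names and toy sequences are hypothetical and are not taken from the study, which analysed real patent claims and human gene sequences.

```python
def kmers(seq, k=15):
    """Return the set of all k-length substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cross_matches(genes, k=15):
    """Map each gene name to the set of other genes sharing at least one k-mer."""
    kmer_sets = {name: kmers(seq, k) for name, seq in genes.items()}
    matches = {name: set() for name in genes}
    names = list(genes)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if kmer_sets[a] & kmer_sets[b]:  # non-empty intersection
                matches[a].add(b)
                matches[b].add(a)
    return matches


# Toy sequences (hypothetical, not real genes)
shared = "ACGTACGTTAGCTAG"  # a single 15-nt stretch present in two "genes"
toy_genes = {
    "geneA": "TTTTT" + shared + "GGGGG",
    "geneB": "CCCCC" + shared + "AAAAA",
    "geneC": "GATTACA" * 4,
}
result = cross_matches(toy_genes)
print(result)
```

A claim on the shared 15mer would touch both geneA and geneB in this toy set, which is the kind of non-specificity the abstract reports at genome scale.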
Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge
2016-04-01
The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
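The screening step described above (flagging exact 17-mers that occur more than three times) could be sketched as follows. This is an illustrative assumption of how such a screen might look, not the authors' pipeline; the function name, the single-strand scan and the toy sequence are hypothetical.

```python
from collections import defaultdict


def repeated_kmers(seq, k=17, min_count=4):
    """Return {kmer: [start positions]} for exact k-mers seen at least min_count times.

    min_count=4 corresponds to "more than three times".
    """
    positions = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_count}


# Toy sequence (hypothetical): one 17-nt unit repeated four times with distinct spacers
unit = "ACGTTGCAAGCTTAGCA"
genome = unit + "TTT" + unit + "GGG" + unit + "CCC" + unit
hits = repeated_kmers(genome)
print(hits[unit])
```

The recorded start positions make it straightforward to compare flagged repeats against previously annotated regions.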
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bade-Döding, Christina; Theodossis, Alex; Gras, Stephanie
2011-09-28
Polymorphic differences between human leukocyte antigen (HLA) molecules affect the specificity and conformation of their bound peptides and lead to differential selection of the T-cell repertoire. Mismatching during allogeneic transplantation can, therefore, lead to immunological reactions. We investigated the structure-function relationships of six members of the HLA-B*41 allelic group that differ by six polymorphic amino acids, including positions 80, 95, 97 and 114 within the antigen-binding cleft. Peptide-binding motifs for B*41:01, *41:02, *41:03, *41:04, *41:05 and *41:06 were determined by sequencing self-peptides from recombinant B*41 molecules by electrospray ionization tandem mass spectrometry. The crystal structures of HLA-B*41:03 bound to a natural 16-mer self-ligand (AEMYGSVTEHPSPSPL) and HLA-B*41:04 bound to a natural 11-mer self-ligand (HEEAVSVDRVL) were solved. Peptide analysis revealed that all B*41 alleles have an identical anchor motif at peptide position 2 (glutamic acid), but differ in their choice of C-terminal pΩ anchor (proline, valine, leucine). Additionally, B*41:04 displayed a greater preference for long peptides (>10 residues) when compared to the other B*41 allomorphs, while the longest peptide to be eluted from the allelic group (a 16-mer) was obtained from B*41:03. The crystal structures of HLA-B*41:03 and HLA-B*41:04 revealed that both alleles interact in a highly conserved manner with the terminal regions of their respective ligands, while micropolymorphism-induced changes in the steric and electrostatic properties of the antigen-binding cleft account for differences in peptide repertoire and auxiliary anchoring. Differences in peptide repertoire and peptide length specificity reflect the significant functional evolution of these closely related allotypes and signal their importance in allogeneic transplantation, especially B*41:03 and B*41:04, which accommodate longer peptides, creating structurally distinct peptide-HLA complexes.
Ali, Mohamed; El-Shesheny, Rabeh; Kandeil, Ahmed; Shehata, Mahmoud; Elsokary, Basma; Gomaa, Mokhtar; Hassan, Naglaa; El Sayed, Ahmed; El-Taweel, Ahmed; Sobhy, Heba; Oludayo, Fasina Folorunso; Dauphin, Gwenaelle; El Masry, Ihab; Wolde, Abebe Wossene; Daszak, Peter; Miller, Maureen; VonDobschuetz, Sophie; Gardner, Emma; Morzaria, Subhash; Lubroth, Juan; Makonnen, Yilma Jobre
2017-01-01
A cross-sectional study was conducted in Egypt to determine the prevalence of Middle East respiratory syndrome coronavirus (MERS-CoV) in imported and resident camels and bats, as well as to assess possible transmission of the virus to domestic ruminants and equines. A total of 1,031 sera, 1,078 nasal swabs, 13 rectal swabs, and 38 milk samples were collected from 1,078 camels in different types of sites. In addition, 145 domestic animals and 109 bats were sampled. Overall, of 1,031 serologically-tested camels, 871 (84.5%) had MERS-CoV neutralising antibodies. Seroprevalence was significantly higher in imported (614/692; 88.7%) than resident camels (257/339; 75.8%) (p < 0.05). Camels from Sudan (543/594; 91.4%) had a higher seroprevalence than those from East Africa (71/98; 72.4%) (p < 0.05). Sampling site and age were also associated with MERS-CoV seroprevalence (p < 0.05). All tested samples from domestic animals and bats were negative for MERS-CoV antibodies except one sheep sample which showed a 1:640 titre. Of 1,078 camels, 41 (3.8%) were positive for MERS-CoV genetic material. Sequences obtained were not found to cluster with clade A or B MERS-CoV sequences and were genetically diverse. The presence of neutralising antibodies in one sheep apparently in contact with seropositive camels calls for further studies on domestic animals in contact with camels. PMID:28333616
Robust k-mer frequency estimation using gapped k-mers
Ghandi, Mahmoud; Mohammad-Noori, Morteza
2013-01-01
Oligomers of fixed length, k, commonly known as k-mers, are often used as fundamental elements in the description of DNA sequence features of diverse biological function, or as intermediate elements in the construction of more complex descriptors of sequence features such as position weight matrices. k-mers are very useful as general sequence features because they constitute a complete and unbiased feature set, and do not require parameterization based on incomplete knowledge of biological mechanisms. However, a fundamental limitation in the use of k-mers as sequence features is that as k is increased, larger spatial correlations in DNA sequence elements can be described, but the frequency of observing any specific k-mer becomes very small, and rapidly approaches a sparse matrix of binary counts. Thus any statistical learning approach using k-mers will be susceptible to noisy estimation of k-mer frequencies once k becomes large. Because all molecular DNA interactions have limited spatial extent, gapped k-mers often carry the relevant biological signal. Here we use gapped k-mer counts to more robustly estimate the ungapped k-mer frequencies, by deriving an equation for the minimum norm estimate of k-mer frequencies given an observed set of gapped k-mer frequencies. We demonstrate that this approach provides a more accurate estimate of the k-mer frequencies in real biological sequences using a sample of CTCF binding sites in the human genome. PMID:23861010
Robust k-mer frequency estimation using gapped k-mers.
Ghandi, Mahmoud; Mohammad-Noori, Morteza; Beer, Michael A
2014-08-01
Oligomers of fixed length, k, commonly known as k-mers, are often used as fundamental elements in the description of DNA sequence features of diverse biological function, or as intermediate elements in the construction of more complex descriptors of sequence features such as position weight matrices. k-mers are very useful as general sequence features because they constitute a complete and unbiased feature set, and do not require parameterization based on incomplete knowledge of biological mechanisms. However, a fundamental limitation in the use of k-mers as sequence features is that as k is increased, larger spatial correlations in DNA sequence elements can be described, but the frequency of observing any specific k-mer becomes very small, and rapidly approaches a sparse matrix of binary counts. Thus any statistical learning approach using k-mers will be susceptible to noisy estimation of k-mer frequencies once k becomes large. Because all molecular DNA interactions have limited spatial extent, gapped k-mers often carry the relevant biological signal. Here we use gapped k-mer counts to more robustly estimate the ungapped k-mer frequencies, by deriving an equation for the minimum norm estimate of k-mer frequencies given an observed set of gapped k-mer frequencies. We demonstrate that this approach provides a more accurate estimate of the k-mer frequencies in real biological sequences using a sample of CTCF binding sites in the human genome.
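To make the notion of a gapped k-mer concrete, the sketch below counts gapped patterns in a short sequence. It covers only the counting step; the minimum norm estimator derived in the paper is not reproduced here. The parameter names (`length` for the window size, `informative` for the number of non-gap positions) and the `-` gap symbol are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(seq, length=4, informative=3, gap="-"):
    """Count gapped patterns: windows of `length` with only `informative` positions kept."""
    counts = Counter()
    # Every way of choosing which positions in the window stay informative
    masks = [set(m) for m in combinations(range(length), informative)]
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        for keep in masks:
            pattern = "".join(c if j in keep else gap for j, c in enumerate(window))
            counts[pattern] += 1
    return counts


example_counts = gapped_kmer_counts("ACGTA", length=4, informative=3)
print(sorted(example_counts.items()))
```

Because each window contributes to several gapped patterns, a gapped pattern is observed more often than any single full-length word, which is the intuition behind using gapped counts to stabilise frequency estimates.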
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, Richard A.; Panyala, Ajay R.; Glass, Kevin A.
MerCat is a parallel, highly scalable and modular property software package for robust analysis of features in next-generation sequencing data. MerCat inputs include assembled contigs and raw sequence reads from any platform resulting in feature abundance counts tables. MerCat allows for direct analysis of data properties without reference sequence database dependency commonly used by search tools such as BLAST and/or DIAMOND for compositional analysis of whole community shotgun sequencing (e.g. metagenomes and metatranscriptomes).
KMC 2: fast and resource-frugal k-mer counting.
Deorowicz, Sebastian; Kokot, Marek; Grabowski, Szymon; Debudaj-Grabysz, Agnieszka
2015-05-15
Building the histogram of occurrences of every k-symbol long substring of nucleotide data is a standard step in many bioinformatics applications, known under the name of k-mer counting. Its applications include developing de Bruijn graph genome assemblers, fast multiple sequence alignment and repeat detection. The tremendous amounts of NGS data require fast algorithms for k-mer counting, preferably using moderate amounts of memory. We present a novel method for k-mer counting that, on large datasets, is about twice as fast as the strongest competitors (Jellyfish 2, KMC 1) while using about 12 GB (or less) of RAM. Our disk-based method bears some resemblance to MSPKmerCounter, yet replacing the original minimizers with signatures (a carefully selected subset of all minimizers) and using (k, x)-mers significantly reduces the I/O, and a highly parallel overall architecture achieves unprecedented processing speeds. For example, KMC 2 counts the 28-mers of a human reads collection with 44-fold coverage (106 GB of compressed size) in about 20 min, on a 6-core Intel i7 PC with a solid-state disk. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
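The general idea of binning k-mers by a minimizer before counting can be sketched as below. This is a simplified in-memory illustration using plain lexicographic minimizers; it is not KMC 2's signature scheme, its (k, x)-mers, or its disk-based layout, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def minimizer(kmer, m):
    """Lexicographically smallest m-length substring of kmer."""
    return min(kmer[i:i + m] for i in range(len(kmer) - m + 1))


def count_kmers_binned(reads, k, m):
    """Distribute k-mers into bins keyed by minimizer, then count each bin separately."""
    bins = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            bins[minimizer(kmer, m)].append(kmer)
    counts = Counter()
    # Identical k-mers always land in the same bin, so bins could be
    # processed independently (for example, one bin per file or worker).
    for bucket in bins.values():
        counts.update(bucket)
    return counts


toy_reads = ["ACGTACGT", "CGTACG"]  # hypothetical reads
binned_counts = count_kmers_binned(toy_reads, k=4, m=2)
print(binned_counts)
```

Since a k-mer's minimizer depends only on the k-mer itself, the per-bin counts sum to the same histogram a single global counter would produce.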
Berenger, Byron M; Berry, Chrystal; Peterson, Trevor; Fach, Patrick; Delannoy, Sabine; Li, Vincent; Tschetter, Lorelee; Nadon, Celine; Honish, Lance; Louie, Marie; Chui, Linda
2015-01-01
A standardised method for determining Escherichia coli O157:H7 strain relatedness using whole genome sequencing or virulence gene profiling is not yet established. We sought to assess the capacity of either high-throughput polymerase chain reaction (PCR) of 49 virulence genes, core-genome single nt variants (SNVs) or k-mer clustering to discriminate between outbreak-associated and sporadic E. coli O157:H7 isolates. Three outbreaks and multiple sporadic isolates from the province of Alberta, Canada were included in the study. Two of the outbreaks occurred concurrently in 2014 and one occurred in 2012. Pulsed-field gel electrophoresis (PFGE) and multilocus variable-number tandem repeat analysis (MLVA) were employed as comparator typing methods. The virulence gene profiles of isolates from the 2012 and 2014 Alberta outbreak events and contemporary sporadic isolates were mostly identical; therefore the set of virulence genes chosen in this study were not discriminatory enough to distinguish between outbreak clusters. Concordant with PFGE and MLVA results, core genome SNV and k-mer phylogenies clustered isolates from the 2012 and 2014 outbreaks as distinct events. k-mer phylogenies demonstrated increased discriminatory power compared with core SNV phylogenies. Prior to the widespread implementation of whole genome sequencing for routine public health use, issues surrounding cost, technical expertise, software standardisation, and data sharing/comparisons must be addressed.
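The abstract does not say which k-mer clustering method was used. As one common way to compare isolates by k-mer content, the sketch below computes a Jaccard distance between k-mer sets; the choice of Jaccard distance, the value of k and the toy sequences are all assumptions for illustration.

```python
def kmer_set(seq, k):
    """Set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=4):
    """1 - |A & B| / |A | B| over the k-mer sets of the two sequences."""
    a, b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    union = a | b
    if not union:
        return 0.0  # both sequences shorter than k: treat as identical
    return 1.0 - len(a & b) / len(union)


# Toy "isolates" (hypothetical sequences)
isolate_1 = "ACGTACGTACGT"
isolate_2 = "ACGTACGTACGA"  # differs from isolate_1 at the last base
isolate_3 = "TTTTTTTTTTTT"

d12 = jaccard_distance(isolate_1, isolate_2)
d13 = jaccard_distance(isolate_1, isolate_3)
print(d12, d13)
```

A matrix of such pairwise distances could then feed any standard tree-building or clustering step.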
Identification of a Novel Inhibitor against Middle East Respiratory Syndrome Coronavirus
Sun, Yaping; Zhang, Huaidong; Shi, Jian; Zhang, Zhe; Gong, Rui
2017-01-01
The Middle East respiratory syndrome coronavirus (MERS-CoV) was first isolated in 2012, and circulated worldwide with high mortality. The continual outbreaks of MERS-CoV highlight the importance of developing antiviral therapeutics. Here, we rationally designed a novel fusion inhibitor named MERS-five-helix bundle (MERS-5HB) derived from the six-helix bundle (MERS-6HB) which was formed by the process of membrane fusion. MERS-5HB consists of three copies of heptad repeat 1 (HR1) and two copies of heptad repeat 2 (HR2) while MERS-6HB includes three copies each of HR1 and HR2. As it lacks one HR2, MERS-5HB was expected to interact with viral HR2 to interrupt the fusion step. What we found was that MERS-5HB could bind to HR2P, a peptide derived from HR2, with a strong affinity value (KD) of up to 0.24 nM. Subsequent assays indicated that MERS-5HB could inhibit pseudotyped MERS-CoV entry effectively with 50% inhibitory concentration (IC50) of about 1 μM. In addition, MERS-5HB significantly inhibited spike (S) glycoprotein-mediated syncytial formation in a dose-dependent manner. Further biophysical characterization showed that MERS-5HB was a thermo-stable α-helical secondary structure. The inhibitory potency of MERS-5HB may provide an attractive basis for identification of a novel inhibitor against MERS-CoV, as a potential antiviral agent. PMID:28906430
Schnare, Murray N.; Collings, James C.; Spencer, David F.; Gray, Michael W.
2000-01-01
In Crithidia fasciculata, the ribosomal RNA (rRNA) gene repeats range in size from ∼11 to 12 kb. This length heterogeneity is localized to a region of the intergenic spacer (IGS) that contains tandemly repeated copies of a 19mer sequence. The IGS also contains four copies of an ∼55 nt repeat that has an internal inverted repeat and is also present in the IGS of Leishmania species. We have mapped the C.fasciculata transcription initiation site as well as two other reverse transcriptase stop sites that may be analogous to the A0 and A′ pre-rRNA processing sites within the 5′ external transcribed spacer (ETS) of other eukaryotes. Features that could influence processing at these sites include two stretches of conserved primary sequence and three secondary structure elements present in the 5′ ETS. We also characterized the C.fasciculata U3 snoRNA, which has the potential for base-pairing with pre-rRNA sequences. Finally, we demonstrate that biosynthesis of large subunit rRNA in both C.fasciculata and Trypanosoma brucei involves 3′-terminal addition of three A residues that are not present in the corresponding DNA sequences. PMID:10982863
Enhanced Regulatory Sequence Prediction Using Gapped k-mer Features
Mohammad-Noori, Morteza; Beer, Michael A.
2014-01-01
Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem. PMID:25033408
Enhanced regulatory sequence prediction using gapped k-mer features.
Ghandi, Mahmoud; Lee, Dongwon; Mohammad-Noori, Morteza; Beer, Michael A
2014-07-01
Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem.
Primer design for a prokaryotic differential display RT-PCR.
Fislage, R; Berceanu, M; Humboldt, Y; Wendt, M; Oberender, H
1997-01-01
We have developed a primer set for a prokaryotic differential display of mRNA in the Enterobacteriaceae group. Each combination of ten 10mer and ten 11mer primers generates up to 85 bands from total Escherichia coli RNA, thus covering expressed sequences of a complete bacterial genome. Due to the lack of polyadenylation in prokaryotic RNA the type T11VN anchored oligonucleotides for the reverse transcriptase reaction had to be replaced with respect to the original method described by Liang and Pardee [ Science , 257, 967-971 (1992)]. Therefore, the sequences of both the 10mer and the new 11mer oligonucleotides were determined by a statistical evaluation of species-specific coding regions extracted from the EMBL database. The 11mer primers used for reverse transcription were selected for localization in the 3'-region of the bacterial RNA. The 10mer primers preferentially bind to the 5'-end of the RNA. None of the primers show homology to rRNA or other abundant small RNA species. Randomly sampled cDNA bands were checked for their bacterial origin either by re-amplification, cloning and sequencing or by re-amplification and direct sequencing with 10mer and 11mer primers after asymmetric PCR. PMID:9108168
Primer design for a prokaryotic differential display RT-PCR.
Fislage, R; Berceanu, M; Humboldt, Y; Wendt, M; Oberender, H
1997-05-01
We have developed a primer set for a prokaryotic differential display of mRNA in the Enterobacteriaceae group. Each combination of ten 10mer and ten 11mer primers generates up to 85 bands from total Escherichia coli RNA, thus covering expressed sequences of a complete bacterial genome. Due to the lack of polyadenylation in prokaryotic RNA the type T11VN anchored oligonucleotides for the reverse transcriptase reaction had to be replaced with respect to the original method described by Liang and Pardee [ Science , 257, 967-971 (1992)]. Therefore, the sequences of both the 10mer and the new 11mer oligonucleotides were determined by a statistical evaluation of species-specific coding regions extracted from the EMBL database. The 11mer primers used for reverse transcription were selected for localization in the 3'-region of the bacterial RNA. The 10mer primers preferentially bind to the 5'-end of the RNA. None of the primers show homology to rRNA or other abundant small RNA species. Randomly sampled cDNA bands were checked for their bacterial origin either by re-amplification, cloning and sequencing or by re-amplification and direct sequencing with 10mer and 11mer primers after asymmetric PCR.
Al Ghamdi, Mohammed; Alghamdi, Khalid M; Ghandoora, Yasmeen; Alzahrani, Ameera; Salah, Fatmah; Alsulami, Abdulmoatani; Bawayan, Mayada F; Vaidya, Dhananjay; Perl, Trish M; Sood, Geeta
2016-04-21
Middle Eastern Respiratory Syndrome coronavirus (MERS-CoV) is a poorly understood disease with no known treatments. We describe the clinical features and treatment outcomes of patients with laboratory confirmed MERS-CoV at a regional referral center in the Kingdom of Saudi Arabia. In 2014, a retrospective chart review was performed on patients with a laboratory confirmed diagnosis of MERS-CoV to determine clinical and treatment characteristics associated with death. Confounding was evaluated and a multivariate logistic regression was performed to assess the independent effect of treatments administered. Fifty-one patients had an overall mortality of 37 %. Most patients were male (78 %) with a mean age of 54 years. Almost a quarter of the patients were healthcare workers (23.5 %) and 41 % had a known exposure to another person with MERS-CoV. Survival was associated with male gender, working as a healthcare worker, history of hypertension, vomiting on admission, elevated respiratory rate, abnormal lung exam, elevated alanine transaminase (ALT), clearance of MERS-CoV on repeat polymerase chain reaction (PCR) testing, and mycophenolate mofetil treatment. Survival was reduced in the presence of coronary artery disease, hypotension, hypoxemia, chest X-ray (CXR) abnormalities, leukocytosis, creatinine >1.5 mg/dL, thrombocytopenia, anemia, and renal failure. In a multivariate analysis of treatments administered, severity of illness was the greatest predictor of reduced survival. Care for patients with MERS-CoV remains a challenge. In this retrospective cohort, interferon beta and mycophenolate mofetil treatment were predictors of increased survival in the univariate analysis. Severity of illness was the greatest predictor of reduced survival in the multivariate analysis. Larger randomized trials are needed to better evaluate the efficacy of these treatment regimens for MERS-CoV.
Nonin-Lecomte, Sylvie; Dardel, Frédéric; Lestienne, Patrick
2005-08-01
Stretches of cytosines and guanosines have been shown in vitro to adopt non-canonical structures known as i-motifs and G-quartets, respectively. When combined, such sequences are expected to either retain their structure or form duplexes or triple helices. All these structures may occur in vivo whenever the sequence criteria are met. Such stretches are present in the circular genome of human mitochondria, as two 10 nucleotide-long perfect tandem direct repeats (DR1 and DR2). The DR1 and DR2 repeats are G-rich on the heavy strand and C-rich on the light strand. Previous results suggested that during replication, transient formation of a parallel GGC triple helix between the neo-synthesised G-rich DR1 and the double-stranded homologous DR2 could be involved in a rearrangement process leading to genome instability. In order to get structural insights into the interaction between the two repeats, we have studied by nuclear magnetic resonance (NMR) the assembly properties of a 24-mer oligodeoxyribonucleotide in which the C- and G-rich segments of the DRs are covalently tethered by a TTTT linker. We show here that this 24-mer self-associates into a triplex-containing symmetrical tetramer. The core of the structure is composed of anti-parallel Watson-Crick (WC) base pairs. Two additional strands are hydrogen-bonded to the Hoogsteen side of the Gs, thus forming CGC(+) triple helices, with G-rich ends folding into G-quartets. These results suggest that such structures could occur when the two DRs are put to close proximity in a biological context.
Shih, Shou-Chuan; Ho, Tsung-Chuan; Chen, Show-Li; Tsao, Yeou-Ping
2016-01-01
Fibrogenesis is induced by repeated injury to the liver and reactive regeneration and leads eventually to liver cirrhosis. Pigment epithelium derived factor (PEDF) has been shown to prevent liver fibrosis induced by carbon tetrachloride (CCl4). A 44 amino acid domain of PEDF (44-mer) was found to have a protective effect against various insults to several cell types. In this study, we investigated the capability of synthetic 44-mer to protect against liver injury in mice and in primary cultured hepatocytes. Acute liver injury, induced by CCl4, was evident from histological changes, such as cell necrosis, inflammation and apoptosis, and a concomitant reduction of glutathione (GSH) and GSH redox enzyme activities in the liver. Intraperitoneal injection of the 44-mer into CCl4-treated mice abolished the induction of AST and ALT and markedly reduced histological signs of liver injury. The 44-mer treatment can reduce hepatic oxidative stress as evident from lower levels of lipid hydroperoxide, and higher levels of GSH. CCl4 caused a reduction of Bcl-xL, PEDF and PPARγ, which was markedly restored by the 44-mer treatment. Consequently, the 44-mer suppressed liver fibrosis induced by repeated CCl4 injury. Furthermore, our observations in primary culture of rat hepatocytes showed that PEDF and the 44-mer protected primary rat hepatocytes against apoptosis induced by serum deprivation and TGF-β1. PEDF/44-mer induced cell protective STAT3 phosphorylation. Pharmacological STAT3 inhibition prevented the antiapoptotic action of PEDF/44-mer. Among several PEDF receptor candidates that may be responsible for hepatocyte protection, we demonstrated that PNPLA2 was essential for PEDF/44-mer-mediated STAT3 phosphorylation and antiapoptotic activity by using siRNA to selectively knockdown PNPLA2. In conclusion, the PEDF 44-mer protects hepatocytes from single and repeated CCl4 injury. This protective effect may stem from strengthening the counter oxidative stress capacity and induction of hepatoprotective factors. PMID:27384427
Pan, Tony; Flick, Patrick; Jain, Chirag; Liu, Yongchao; Aluru, Srinivas
2017-10-09
Counting and indexing fixed length substrings, or k-mers, in biological sequences is a key step in many bioinformatics tasks including genome alignment and mapping, genome assembly, and error correction. While advances in next generation sequencing technologies have dramatically reduced the cost and improved latency and throughput, few bioinformatics tools can efficiently process the datasets at the current generation rate of 1.8 terabases every 3 days. We present Kmerind, a high performance parallel k-mer indexing library for distributed memory environments. The Kmerind library provides a set of simple and consistent APIs with sequential semantics and parallel implementations that are designed to be flexible and extensible. Kmerind's k-mer counter performs similarly or better than the best existing k-mer counting tools even on shared memory systems. In a distributed memory environment, Kmerind counts k-mers in a 120 GB sequence read dataset in less than 13 seconds on 1024 Xeon CPU cores, and fully indexes their positions in approximately 17 seconds. Querying for 1% of the k-mers in these indices can be completed in 0.23 seconds and 28 seconds, respectively. Kmerind is the first k-mer indexing library for distributed memory environments, and the first extensible library for general k-mer indexing and counting. Kmerind is available at https://github.com/ParBLiSS/kmerind.
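The idea of spreading a k-mer count table across distributed-memory ranks can be imitated in a single process, as in the sketch below. This is not Kmerind's API (Kmerind is a C++ library and its interfaces are not shown in the abstract); the hash-based owner assignment, the function names and the list of per-rank counters standing in for separate processes are all assumptions.

```python
import zlib
from collections import Counter


def owner(kmer, n_ranks):
    """Deterministically assign a k-mer to one rank via a CRC32 hash."""
    return zlib.crc32(kmer.encode("ascii")) % n_ranks


def partitioned_count(reads, k, n_ranks):
    """Count k-mers into n_ranks local tables; each k-mer lives on exactly one rank."""
    local_tables = [Counter() for _ in range(n_ranks)]
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            local_tables[owner(kmer, n_ranks)][kmer] += 1
    return local_tables


def query(local_tables, kmer):
    """Route the query to the owning rank and read its local count."""
    return local_tables[owner(kmer, len(local_tables))][kmer]


toy_reads = ["ACGTAC", "GTACGT"]  # hypothetical reads
tables = partitioned_count(toy_reads, k=3, n_ranks=4)
print(query(tables, "GTA"), query(tables, "TTT"))
```

In a real distributed setting each table would sit in a different process, and the owner function would decide where k-mers and queries are sent.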
NASA Astrophysics Data System (ADS)
Bustamam, A.; Ulul, E. D.; Hura, H. F. A.; Siswantining, T.
2017-07-01
Hierarchical clustering is an effective method for creating a phylogenetic tree from the distance matrix between DNA (deoxyribonucleic acid) sequences. One well-known method for calculating the distance matrix is the k-mer method. Generally, the k-mer method is more efficient than some other distance matrix calculation techniques. The k-mer method starts by creating a k-mer sparse matrix, followed by creating k-mer singular value vectors. The last step is computing the distances among the vectors. In this paper, we analyze the sequences of MERS-CoV (Middle East Respiratory Syndrome Coronavirus) DNA by implementing hierarchical clustering using the k-mer sparse matrix in order to perform the phylogenetic analysis. Our results show that the ancestor of our MERS-CoV sequences comes from Egypt. Moreover, we found that the MERS-CoV infection that occurs in one country may not necessarily come from the same country of origin. This suggests that the process of MERS-CoV mutation might not be influenced only by geographical factors.
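A stripped-down version of the pipeline described above might look like the following sketch. It builds k-mer count vectors, computes Euclidean distances between them and merges clusters by average linkage. The singular value step mentioned in the abstract is omitted, and the choice of Euclidean distance, average linkage, k=2 and the toy sequences are assumptions rather than details taken from the paper.

```python
from itertools import product
from math import sqrt


def kmer_vector(seq, k=2, alphabet="ACGT"):
    """Count vector over all possible k-mers of the alphabet."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    vec = [0] * len(index)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows containing other symbols
            vec[j] += 1
    return vec


def euclidean(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(vectors):
    """Naive agglomerative clustering; returns the tree as nested tuples of names."""
    clusters = [((name,), name) for name in vectors]  # (member names, subtree)

    def dist(c1, c2):
        pairs = [(a, b) for a in c1[0] for b in c2[0]]
        return sum(euclidean(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        candidates = [(i, j) for i in range(len(clusters))
                      for j in range(i + 1, len(clusters))]
        i, j = min(candidates, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        merged = (clusters[i][0] + clusters[j][0], (clusters[i][1], clusters[j][1]))
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return clusters[0][1]


# Toy sequences (hypothetical, not MERS-CoV data)
toy = {"s1": "ACGTACGTAC", "s2": "ACGTACGTAA", "s3": "TTTTGGGGCC"}
tree = average_linkage({name: kmer_vector(seq) for name, seq in toy.items()})
print(tree)
```

The nested tuples record the merge order, so the two most similar sequences appear together in the innermost pair.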
Martin, Caroline; Kulpa, Richard; Delamarche, Paul; Bideau, Benoit
2013-03-01
The purpose of the study was to identify the relationships between segmental angular momentum and ball velocity between the following events: ball toss, maximal elbow flexion (MEF), racket lowest point (RLP), maximal shoulder external rotation (MER), and ball impact (BI). Ten tennis players performed serves recorded with a real-time motion capture. Mean angular momentums of the trunk, upper arm, forearm, and the hand-racket were calculated. The anteroposterior axis angular momentum of the trunk was significantly related with ball velocity during the MEF-RLP, RLP-MER, and MER-BI phases. The strongest relationships between the transverse-axis angular momentums and ball velocity followed a proximal-to-distal timing sequence that allows the transfer of angular momentum from the trunk (MEF-RLP and RLP-MER phases) to the upper arm (RLP-MER phase), forearm (RLP-MER and MER-BI phases), and the hand-racket (MER-BI phase). Since sequence is crucial for ball velocity, players should increase angular momentums of the trunk during MEF-MER, upper arm during RLP-MER, forearm during RLP-BI, and the hand-racket during MER-BI.
Ní Chadhain, Sinéad M; Schaefer, Jeffra K; Crane, Sharron; Zylstra, Gerben J; Barkay, Tamar
2006-10-01
The reduction of ionic mercury to elemental mercury by the mercuric reductase (MerA) enzyme plays an important role in the biogeochemical cycling of mercury in contaminated environments by partitioning mercury to the atmosphere. This activity, common in aerobic environments, has rarely been examined in anoxic sediments where production of highly toxic methylmercury occurs. Novel degenerate PCR primers were developed which span the known diversity of merA genes in Gram-negative bacteria and amplify a 285 bp fragment at the 3' end of merA. These primers were used to create a clone library and to analyse merA diversity in an anaerobic sediment enrichment collected from a mercury-contaminated site in the Meadowlands, New Jersey. A total of 174 sequences were analysed, representing 71 merA phylotypes and four novel MerA clades. This first examination of merA diversity in anoxic environments suggests an untapped resource for novel merA sequences.
Rare k-mer DNA: Identification of sequence motifs and prediction of CpG island and promoter.
Mohamed Hashim, Ezzeddin Kamil; Abdullah, Rosni
2015-12-21
Empirical analysis of k-mer DNA has proven to be an effective tool for finding unique patterns in DNA sequences, which can lead to the discovery of potential sequence motifs. In an extensive study of empirical k-mer DNA in hundreds of organisms, the researchers found that unique multi-modal k-mer spectra occur only in the genomes of organisms from the tetrapod clade, which includes all mammals. The multi-modality is caused by the formation of the two lowest modes, and the k-mers under them are referred to as the rare k-mers. The suppression of the two lowest modes (or the rare k-mers) can be attributed to the CG dinucleotide inclusions in them. Apart from that, the rare k-mers are selectively distributed in certain genomic features: CpG island (CGI), promoter, 5' UTR, and exon. We correlated the rare k-mers with hundreds of annotated features using several bioinformatic tools, performed further intrinsic rare k-mer analyses within the correlated features, and modeled the elucidated rare k-mer clustering feature into a classifier to predict the correlated CGI and promoter features. Our correlation results show that rare k-mers are highly associated with several annotated features of CGI, promoter, 5' UTR, and open chromatin regions. Our intrinsic results show that rare k-mers have several unique topological, compositional, and clustering properties in CGI and promoter features. Finally, the performance of our RWC (rare-word clustering) method in predicting the CGI and promoter features ranks among the top three in eight of the CGI and promoter evaluations, among eight of the benchmarked datasets. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
Rugh, C L; Wilde, H D; Stack, N M; Thompson, D M; Summers, A O; Meagher, R B
1996-01-01
With global heavy metal contamination increasing, plants that can process heavy metals might provide efficient and ecologically sound approaches to sequestration and removal. Mercuric ion reductase, MerA, converts toxic Hg2+ to the less toxic, relatively inert metallic mercury (Hg0). The bacterial merA sequence is rich in CpG dinucleotides and has a highly skewed codon usage, both of which are particularly unfavorable to efficient expression in plants. We constructed a mutagenized merA sequence, merApe9, modifying the flanking region and 9% of the coding region and placing this sequence under control of plant regulatory elements. Transgenic Arabidopsis thaliana seeds expressing merApe9 germinated, and these seedlings grew, flowered, and set seed on medium containing HgCl2 concentrations of 25-100 microM (5-20 ppm), levels toxic to several controls. Transgenic merApe9 seedlings evolved considerable amounts of Hg0 relative to control plants. The rate of mercury evolution and the level of resistance were proportional to the steady-state mRNA level, confirming that resistance was due to expression of the MerApe9 enzyme. Plants and bacteria expressing merApe9 were also resistant to toxic levels of Au3+. These and other data suggest that there are potentially viable molecular genetic approaches to the phytoremediation of metal ion pollution. PMID:8622910
Schnitzler, P; Delius, H; Scholz, J; Touray, M; Orth, E; Darai, G
1987-12-01
The genome of the fish lymphocystis disease virus (FLDV) was screened for the existence of repetitive DNA sequences using a defined and complete gene library of the viral genome (98 kbp) by DNA-DNA hybridization, heteroduplex analysis, and restriction fine mapping. A repetitive DNA sequence was detected at the coordinates 0.034 to 0.057 and 0.718 to 0.736 map units (m.u.) of the FLDV genome. The first region (0.034 to 0.057 m.u.) corresponds to the 5' terminus of the EcoRI FLDV DNA fragment B (0.034 to 0.165 m.u.) and the second region (0.718 to 0.736 m.u.) is identical to the EcoRI DNA fragment M of the viral genome. The DNA nucleotide sequence of the EcoRI FLDV DNA fragment M was determined. This analysis revealed the presence of many short direct and inverted repetitions, e.g., an 18-mer direct repetition (TTTAAAATTTAATTAA) that started at nucleotide positions 812 and 942 and a 14-mer inverted repeat (TTAAATTTAAATTT) at nucleotide positions 820 and 959. Only short open reading frames were detected within this region. The DNA repetitions are discussed as sequences that play a possible regulatory role in virus replication. Furthermore, hybridization experiments revealed that the repetitive DNA sequences are conserved in the genomes of different strains of fish lymphocystis disease virus isolated from two species of Pleuronectidae (flounder and dab).
Yusof, Mohammed F; Eltahir, Yassir M; Serhan, Wissam S; Hashem, Farouk M; Elsayed, Elsaeid A; Marzoug, Bahaaeldin A; Abdelazim, Assem Si; Bensalah, Oum Keltoum A; Al Muhairi, Salama S
2015-06-01
High seroprevalence of Middle East respiratory syndrome coronavirus (MERS-CoV) in dromedary camels has been previously reported in the United Arab Emirates (UAE). However, molecular detection of the virus has not previously been reported in the UAE. Of the 7,803 nasal swabs tested in the epidemiological survey, MERS-CoV nucleic acid was detected by real-time PCR in a total of 126 (1.6%) camels. Positive camels were detected at the borders with Saudi Arabia and Oman and in camel slaughterhouses. MERS-CoV partial sequences obtained from UAE camels clustered with human- and camel-derived MERS-CoV sequences from the same geographic area. The results provide further evidence of MERS-CoV zoonosis.
Kraken: ultrafast metagenomic sequence classification using exact alignments
2014-01-01
Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster abundance estimation programs, which only classify small subsets of metagenomic data. Using exact alignment of k-mers, Kraken achieves classification accuracy comparable to the fastest BLAST program. In its fastest mode, Kraken classifies 100 base pair reads at a rate of over 4.1 million reads per minute, 909 times faster than Megablast and 11 times faster than the abundance estimation program MetaPhlAn. Kraken is available at http://ccb.jhu.edu/software/kraken/. PMID:24580807
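The idea of classifying reads by exact k-mer matches can be shown with a toy sketch. This is not Kraken's algorithm or database format: as a simplifying assumption, each k-mer found in exactly one reference votes for that taxon, k-mers shared between taxa are treated as uninformative, and the read gets the majority label. The names `build_kmer_db` and `classify` and the example sequences are hypothetical.

```python
from collections import Counter


def build_kmer_db(references, k):
    """Map each k-mer to the single taxon it occurs in (None if shared)."""
    db = {}
    for taxon, genome in references.items():
        for i in range(len(genome) - k + 1):
            kmer = genome[i:i + k]
            if kmer not in db:
                db[kmer] = taxon
            elif db[kmer] != taxon:
                db[kmer] = None  # seen in several taxa: no vote
    return db


def classify(read, db, k):
    """Label a read by majority vote over its exactly matching k-mers."""
    votes = Counter()
    for i in range(len(read) - k + 1):
        taxon = db.get(read[i:i + k])
        if taxon is not None:
            votes[taxon] += 1
    return votes.most_common(1)[0][0] if votes else "unclassified"


references = {"taxA": "ACGTACGGTCA", "taxB": "TTGACCTTAGG"}
db = build_kmer_db(references, 4)
label = classify("GTACGG", db, 4)
```

Because each k-mer lookup is a single dictionary access with no alignment step, the cost per read grows only with read length, which gives some intuition for the throughput reported in the abstract.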
Muhairi, Salama Al; Hosani, Farida Al; Eltahir, Yassir M; Mulla, Mariam Al; Yusof, Mohammed F; Serhan, Wissam S; Hashem, Farouq M; Elsayed, Elsaeid A; Marzoug, Bahaaeldin A; Abdelazim, Assem S
2016-12-01
The objective of this research was to investigate the prevalence of Middle East respiratory syndrome coronavirus (MERS-CoV) infection primarily in dromedary camel farms and the relationship of those infections with infections in humans in the Emirate of Abu Dhabi. Nasal swabs from 1113 dromedary camels (39 farms) and 34 sheep (1 farm) and sputum samples from 2 MERS-CoV-infected camel farm owners and 1 MERS-CoV-infected sheep farm owner were collected. Samples from camels and humans underwent real-time reverse-transcription quantitative PCR screening to detect MERS-CoV. In addition, sequencing and phylogenetic analysis of partially characterized MERS-CoV genome fragments obtained from camels were performed. Among the 40 farms, 6 camel farms were positive for MERS-CoV; the virus was not detected in the single sheep farm. The maximum duration of viral shedding from infected camels was 2 weeks after the first positive test result as detected in nasal swabs and in rectal swabs obtained from infected calves. Three partial camel sequences characterized in this study (open reading frames 1a and 1ab, Spike1, Spike2, and ORF4b) together with the corresponding regions of previously reported MERS-CoV sequence obtained from one farm owner were clustering together within the larger MERS-CoV sequences cluster containing human and camel isolates reported for the Arabian Peninsula. Data provided further evidence of the zoonotic potential of MERS-CoV infection and strongly suggested that camels may have a role in the transmission of the virus to humans.
Determining geographical spread pattern of MERS-CoV by distance method using Kimura model
NASA Astrophysics Data System (ADS)
Amiroch, Siti; Rohmatullah, Arif
2017-03-01
MERS-CoV (Middle East Respiratory Syndrome Coronavirus) causes a respiratory disease syndrome that attacks the respiratory tract, with presentations ranging from mild to severe acute illness with fever, cough and shortness of breath. The cases are linked to countries in the Arabian Peninsula (Middle East), and 356 deaths have been reported due to the spread of the MERS epidemic. The data used for the MERS cases are DNA sequences taken from GenBank, the online database in the United States that stores the results of molecular biology experiments from all over the world (http://www.ncbi.nlm.nih.gov). Here, bioinformatics plays an important role in reading DNA sequences and genetic information, using software supported by the availability of the Internet as its main tool, while the analysis is carried out and verified with mathematical methods. In similar research conducted by molecular biologists and physicians, the processing of DNA sequences is done with readily available software such as BLAST, and the geographical distribution patterns of MERS in the Arabian Peninsula are determined with programs such as Clustal W, Bayesian tools, Phylip, etc. In this study, the authors use Matlab simulation for all processes, from sequence alignment, through counting the number of transition and transversion substitutions for each sequence and its location, to forming a phylogenetic tree that reveals the pattern of spread of the MERS epidemic. The mathematical analysis consists of deriving the formula for the Kimura evolutionary model and forming the phylogenetic tree (the distribution pattern of the MERS epidemic) with the neighbor-joining algorithm. Finally, the resulting pattern of geographical spread comprises 6 groups of the MERS epidemic, and it turns out that almost all the MERS viruses that spread in the Arabian Peninsula are nearly the same as the virus sequence found in al-Hasa.
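The step of counting transitions and transversions feeds directly into the Kimura two-parameter distance, d = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of sites showing transitions and transversions. The abstract's work was done in Matlab; the Python sketch below is only an illustration of that formula for two pre-aligned sequences, and the function name and the handling of gaps (skipping them) are assumptions.

```python
from math import log, sqrt

PURINES = {"A", "G"}


def kimura_2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: gaps and ambiguous bases are skipped
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * log((1 - 2 * p - q) * sqrt(1 - 2 * q))


d = kimura_2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

A matrix of such pairwise distances is the usual input to a neighbor-joining tree builder. The logarithm is undefined for highly diverged pairs (for example when 2P + Q reaches 1), so a fuller implementation would need to handle that case.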
KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation.
Wang, Dapeng; Xu, Jiayue; Yu, Jun
2015-09-16
The K-mer approach, treating genomic sequences as simple characters and counting the relative abundance of each string upon a fixed K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we develop a versatile database, KGCAK ( http://kgcak.big.ac.cn/KGCAK/ ), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogeny based on genomic elements in an alignment-free fashion and provides in-depth data processing enabling users to compare the complexity of genome sequences based on K-mer distribution. We hope that KGCAK becomes a powerful tool for exploring relationship within and among groups of species in a tree of life based on genomic data.
Li, Liqi; Luo, Qifa; Xiao, Weidong; Li, Jinhui; Zhou, Shiwen; Li, Yongsheng; Zheng, Xiaoqi; Yang, Hua
2017-02-01
Palmitoylation is the covalent attachment of lipids to amino acid residues in proteins. As an important form of protein posttranslational modification, it increases the hydrophobicity of proteins, which contributes to protein transport, organelle localization, and function, and it therefore plays an important role in a variety of cell biological processes. Identification of palmitoylation sites is necessary for understanding protein-protein interaction, protein stability, and activity. Since conventional experimental techniques to determine palmitoylation sites in proteins are both labor intensive and costly, a fast and accurate computational approach to predict palmitoylation sites from protein sequences is urgently needed. In this study, a support vector machine (SVM)-based method was proposed through integrating PSI-BLAST profile, physicochemical properties, [Formula: see text]-mer amino acid compositions (AACs), and [Formula: see text]-mer pseudo AACs into the principal feature vector. A recursive feature selection scheme was subsequently implemented to single out the most discriminative features. Finally, an SVM method was implemented to predict palmitoylation sites in proteins based on the optimal features. The proposed method achieved an accuracy of 99.41% and Matthews Correlation Coefficient of 0.9773 for a benchmark dataset. The result indicates the efficiency and accuracy of our method in prediction of palmitoylation sites based on protein sequences.
Sharmin, Refat; Islam, Abul B M M K
2016-01-01
MERS-CoV is a newly emerged human coronavirus reported to be closely related to the HKU4 and HKU5 bat coronaviruses. Bat and MERS coronaviruses are structurally related. Therefore, it is of interest to estimate the degree of conserved antigenic sites among them. It is of importance to elucidate the shared antigenic sites and the extent of conservation between them to understand the evolutionary dynamics of MERS-CoV. Multiple sequence alignment of the spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins was employed to identify the sequence conservation among MERS and bat (HKU4, HKU5) coronaviruses. We used various in silico tools to predict the conserved antigenic sites. We found that MERS-CoV shared 30% of its S protein antigenic sites with HKU4 and 70% with HKU5 bat-CoV. In contrast, 100% of the antigenic sites of its E, M and N proteins were found to be conserved with those in HKU4 and HKU5. This sharing suggests that, in terms of pathogenicity, MERS-CoV is more closely related to HKU5 bat-CoV than to HKU4 bat-CoV. The conserved epitopes indicate their evolutionary relationship and the ancestry of pathogenicity.
Stochastic precision analysis of 2D cardiac strain estimation in vivo
NASA Astrophysics Data System (ADS)
Bunting, E. A.; Provost, J.; Konofagou, E. E.
2014-11-01
Ultrasonic strain imaging has been applied to echocardiography and carries great potential to be used as a tool in the clinical setting. Two-dimensional (2D) strain estimation may be useful when studying the heart due to the complex, 3D deformation of the cardiac tissue. Increasing the framerate used for motion estimation, i.e. motion estimation rate (MER), has been shown to improve the precision of the strain estimation, although maintaining the spatial resolution necessary to view the entire heart structure in a single heartbeat remains challenging at high MERs. Two previously developed methods, the temporally unequispaced acquisition sequence (TUAS) and the diverging beam sequence (DBS), have been used in the past to successfully estimate in vivo axial strain at high MERs without compromising spatial resolution. In this study, a stochastic assessment of 2D strain estimation precision is performed in vivo for both sequences at varying MERs (65, 272, 544, 815 Hz for TUAS; 250, 500, 1000, 2000 Hz for DBS). 2D incremental strains were estimated during left ventricular contraction in five healthy volunteers using a normalized cross-correlation function and a least-squares strain estimator. Both sequences were shown capable of estimating 2D incremental strains in vivo. The conditional expected value of the elastographic signal-to-noise ratio (E(SNRe|ɛ)) was used to compare strain estimation precision of both sequences at multiple MERs over a wide range of clinical strain values. The results here indicate that axial strain estimation precision is much more dependent on MER than lateral strain estimation, while lateral estimation is more affected by strain magnitude. MER should be increased at least above 544 Hz to avoid suboptimal axial strain estimation. Radial and circumferential strain estimations were influenced by the axial and lateral strain in different ways. Furthermore, the TUAS and DBS were found to be of comparable precision at similar MERs.
Loebel, Madlen; Eckey, Maren; Sotzny, Franziska; Hahn, Elisabeth; Bauer, Sandra; Grabowski, Patricia; Zerweck, Johannes; Holenya, Pavlo; Hanitsch, Leif G; Wittke, Kirsten; Borchmann, Peter; Rüffer, Jens-Ulrich; Hiepe, Falk; Ruprecht, Klemens; Behrends, Uta; Meindl, Carola; Volk, Hans-Dieter; Reimer, Ulf; Scheibenbogen, Carmen
2017-01-01
Epstein-Barr-Virus (EBV) plays an important role as trigger or cofactor for various autoimmune diseases. In a subset of patients with Chronic Fatigue Syndrome (CFS) disease starts with infectious mononucleosis as late primary EBV-infection, whereby altered levels of EBV-specific antibodies can be observed in another subset of patients. We performed a comprehensive mapping of the IgG response against EBV comparing 50 healthy controls with 92 CFS patients using a microarray platform. Patients with multiple sclerosis (MS), systemic lupus erythematosus (SLE) and cancer-related fatigue served as controls. 3054 overlapping peptides were synthesised as 15-mers from 14 different EBV proteins. Array data was validated by ELISA for selected peptides. Prevalence of EBV serotypes was determined by qPCR from throat washing samples. EBV type 1 infections were found in patients and controls. EBV seroarray profiles between healthy controls and CFS were less divergent than that observed for MS or SLE. We found significantly enhanced IgG responses to several EBNA-6 peptides containing a repeat sequence in CFS patients compared to controls. EBNA-6 peptide IgG responses correlated well with EBNA-6 protein responses. The EBNA-6 repeat region showed sequence homologies to various human proteins. Patients with CFS had a quite similar EBV IgG antibody response pattern as healthy controls. Enhanced IgG reactivity against an EBNA-6 repeat sequence and against EBNA-6 protein is found in CFS patients. Homologous sequences of various human proteins with this EBNA-6 repeat sequence might be potential targets for antigenic mimicry.
Identifying Group-Specific Sequences for Microbial Communities Using Long k-mer Sequence Signatures
Wang, Ying; Fu, Lei; Ren, Jie; Yu, Zhaoxia; Chen, Ting; Sun, Fengzhu
2018-01-01
Comparing metagenomic samples is crucial for understanding microbial communities. For different groups of microbial communities, such as human gut metagenomic samples from patients with a certain disease and healthy controls, identifying group-specific sequences offers essential information for potential biomarker discovery. A sequence that is present, or rich, in one group, but absent, or scarce, in another group is considered “group-specific” in our study. Our main purpose is to discover group-specific sequence regions between control and case groups as disease-associated markers. We developed a long k-mer (k ≥ 30 bps)-based computational pipeline to detect group-specific sequences at strain resolution free from reference sequences, sequence alignments, and metagenome-wide de novo assembly. We called our method MetaGO: Group-specific oligonucleotide analysis for metagenomic samples. An open-source pipeline on Apache Spark was developed with parallel computing. We applied MetaGO to one simulated and three real metagenomic datasets to evaluate the discriminative capability of identified group-specific markers. In the simulated dataset, 99.11% of group-specific logical 40-mers covered 98.89% disease-specific regions from the disease-associated strain. In addition, 97.90% of group-specific numerical 40-mers covered 99.61 and 96.39% of differentially abundant genome and regions between two groups, respectively. For a large-scale metagenomic liver cirrhosis (LC)-associated dataset, we identified 37,647 group-specific 40-mer features. Any one of the features can predict disease status of the training samples with the average of sensitivity and specificity higher than 0.8. The random forests classification using the top 10 group-specific features yielded a higher AUC (from ∼0.8 to ∼0.9) than that of previous studies. All group-specific 40-mers were present in LC patients, but not healthy controls. 
All the assembled 11 LC-specific sequences can be mapped to two strains of Veillonella parvula: UTDB1-3 and DSM2008. The experiments on the other two real datasets related to Inflammatory Bowel Disease and Type 2 Diabetes in Women consistently demonstrated that MetaGO achieved better prediction accuracy with fewer features compared to previous studies. The experiments showed that MetaGO is a powerful tool for identifying group-specific k-mers, which would be clinically applicable for disease prediction. MetaGO is available at https://github.com/VVsmileyx/MetaGO. PMID:29774017
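The notion of a "logical" group-specific k-mer (present in one group, absent in the other) reduces to set operations. The sketch below is a strict toy version under stated assumptions: a k-mer qualifies only if it occurs in every case sample and in no control sample, whereas the abstract also allows "rich" versus "scarce". It is not the MetaGO pipeline, which runs on Apache Spark, and the function names are hypothetical.

```python
def kmer_set(reads, k):
    """All distinct k-mers observed in one sample's reads."""
    return {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}


def group_specific_kmers(case_samples, control_samples, k):
    """k-mers found in every case sample and in no control sample."""
    if not case_samples:
        return set()
    case_sets = [kmer_set(sample, k) for sample in case_samples]
    shared_by_cases = set.intersection(*case_sets)
    seen_in_controls = set()
    for sample in control_samples:
        seen_in_controls |= kmer_set(sample, k)
    return shared_by_cases - seen_in_controls


cases = [["ACGTT"], ["CGTTA"]]      # each sample is a list of reads
controls = [["ACGAA"]]
markers = group_specific_kmers(cases, controls, 3)
```

Long k-mers such as the 40-mers used in the abstract make chance matches between unrelated genomes unlikely, which is what lets such features discriminate at strain resolution.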
Disk-based k-mer counting on a PC
2013-01-01
Background: The k-mer counting problem, which is to build the histogram of occurrences of every k-symbol long substring in a given text, is important for many bioinformatics applications. They include developing de Bruijn graph genome assemblers, fast multiple sequence alignment and repeat detection. Results: We propose a simple, yet efficient, parallel disk-based algorithm for counting k-mers. Experiments show that it usually offers the fastest solution to the considered problem, while demanding a relatively small amount of memory. In particular, it is capable of counting the statistics for short-read human genome data, supplied as a gzipped FASTQ file, in less than 40 minutes on a PC with 16 GB of RAM and 6 CPU cores, and for long-read human genome data in less than 70 minutes. On a more powerful machine, using 32 GB of RAM and 32 CPU cores, the tasks are accomplished in less than half the time. For most tested settings of this problem and mammalian-size data, no other algorithm can accomplish this task in comparable time. Our solution is also among the memory-frugal ones; most competing algorithms cannot work efficiently on a PC with 16 GB of memory for such massive data. Conclusions: By making use of cheap disk space and exploiting CPU and I/O parallelism we propose a very competitive k-mer counting procedure, called KMC. Our results suggest that judicious resource management may make it possible to solve at least some bioinformatics problems with massive data on a commodity personal computer. PMID:23679007
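The general idea of trading memory for disk can be sketched as a two-pass procedure: scatter k-mers into bin files, then count one bin at a time. The abstract does not describe KMC's actual binning scheme, so the CRC32-based bin assignment, the file layout, and the function name below are assumptions made purely for illustration; the sketch is also sequential, not parallel.

```python
import os
import tempfile
import zlib
from collections import Counter


def disk_binned_kmer_count(reads, k, n_bins=4):
    """Count k-mers by first scattering them to bin files on disk."""
    counts = Counter()
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"bin_{b}.txt") for b in range(n_bins)]
        handles = [open(p, "w") for p in paths]
        try:
            # Pass 1: every occurrence of a k-mer lands in the same bin.
            for read in reads:
                for i in range(len(read) - k + 1):
                    kmer = read[i:i + k]
                    b = zlib.crc32(kmer.encode()) % n_bins
                    handles[b].write(kmer + "\n")
        finally:
            for h in handles:
                h.close()
        # Pass 2: only one bin's k-mers are held in memory while counting.
        for p in paths:
            with open(p) as fh:
                bin_counts = Counter(line.strip() for line in fh)
            # Merged in memory here for convenience; a real tool would
            # write each bin's counts back to disk instead.
            counts.update(bin_counts)
    return counts


counts = disk_binned_kmer_count(["ACGTAC", "GTACGT"], 3)
```

Because identical k-mers always hash to the same bin, the per-bin counts are final and never need to be reconciled across bins, which is what keeps peak memory bounded by the largest bin.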
Optimization of De Novo Short Read Assembly of Seabuckthorn (Hippophae rhamnoides L.) Transcriptome
Ghangal, Rajesh; Chaudhary, Saurabh; Jain, Mukesh; Purty, Ram Singh; Chand Sharma, Prakash
2013-01-01
Seabuckthorn (Hippophae rhamnoides L.) has been known for its medicinal, nutritional and environmental importance since ancient times. However, very limited efforts have been made to characterize the genome and transcriptome of this wonder plant. Here, we report the use of next generation massively parallel sequencing technology (Illumina platform) and de novo assembly to gain a comprehensive view of the seabuckthorn transcriptome. We assembled 86,253,874 high quality short reads using six assembly tools. In our hands, assembly of non-redundant short reads following a two-step procedure was found to be the best considering various assembly quality parameters. Initially, the ABySS tool was used following an additive k-mer approach. The assembled transcripts were subsequently subjected to the TGICL suite. Finally, de novo short read assembly yielded 88,297 transcripts (> 100 bp), representing about 53 Mb of seabuckthorn transcriptome. The average length of transcripts was 610 bp, the N50 length was 1198 bp, and 91% of the short reads uniquely mapped back to the seabuckthorn transcriptome. A total of 41,340 (46.8%) transcripts showed significant similarity with sequences present in nr protein databases of NCBI (E-value < 1E-06). We also screened the assembled transcripts for the presence of transcription factors and simple sequence repeats. Our strategy involving the use of a short read assembler (ABySS) followed by TGICL will be useful for researchers working with a non-model organism’s transcriptome in terms of saving time and reducing complexity in data management. The seabuckthorn transcriptome data generated here provide a valuable resource for gene discovery and development of functional molecular markers. PMID:23991119
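The N50 statistic quoted in this abstract has a simple definition: sort the transcript lengths from longest to shortest and report the length at which the running total first reaches half of the assembly's total length. A minimal sketch follows; the function name `n50` and the example lengths are illustrative and not taken from the study.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no lengths given")


example = n50([2, 3, 4, 5, 6])
```

An N50 well above the mean length, as with the 1198 bp versus 610 bp reported here, indicates that a minority of long transcripts accounts for much of the assembled sequence.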
Recent advances in sequence assembly: principles and applications.
Chen, Qingfeng; Lan, Chaowang; Zhao, Liang; Wang, Jianxin; Chen, Baoshan; Chen, Yi-Ping Phoebe
2017-11-01
The application of advanced sequencing technologies and the rapid growth of various sequence data have led to increasing interest in DNA sequence assembly. However, repeats and polymorphism occur frequently in genomes, and each of these has different impacts on assembly. Further, many new applications for sequencing, such as metagenomics regarding multiple species, have emerged in recent years. These not only give rise to higher complexity but also prevent short-read assembly in an efficient way. This article reviews the theoretical foundations that underlie current mapping-based assembly and de novo-based assembly, and highlights the key issues and feasible solutions that need to be considered. It focuses on how individual processes, such as optimal k-mer determination and error correction in assembly, rely on intelligent strategies or high-performance computation. We also survey primary algorithms/software and offer a discussion on the emerging challenges in assembly. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Malecka, Kamila; Michalczuk, Lech; Radecka, Hanna; Radecki, Jerzy
2014-10-09
A DNA biosensor for detection of specific oligonucleotide sequences of Plum Pox Virus (PPV) in plant extracts and buffer is proposed. The working principle of the genosensor is based on the ion-channel mechanism. The NH2-ssDNA probe was deposited onto a glassy carbon electrode surface to form an amide bond between the carboxyl group of the oxidized electrode surface and the amino group of the ssDNA probe. The analytical signals generated as a result of hybridization were registered by Osteryoung square wave voltammetry in the presence of [Fe(CN)6]3-/4- as a redox marker. The 22-mer and 42-mer complementary ssDNA sequences derived from PPV and DNA samples from plants infected with PPV were used as targets. Similar detection limits of 2.4 pM (31.0 pg/mL) and 2.3 pM (29.5 pg/mL) in the concentration range 1-8 pM were observed in the presence of the 22-mer ssDNA and 42-mer complementary ssDNA sequences of PPV, respectively. The genosensor was capable of discriminating between samples consisting of extracts from healthy plants and leaf extracts from infected plants in the concentration range 10-50 pg/mL. The detection limit was 12.8 pg/mL. The genosensor displayed good selectivity and sensitivity. The 20-mer partially complementary DNA sequences with four complementary bases and DNA samples from healthy plants used as negative controls generated low signal.
Tedder, Philip; Zubko, Elena; Westhead, David R.; Meyer, Peter
2009-01-01
Two pools of small RNAs were cloned from inflorescences of Petunia hybrida using a 5′-ligation dependent and a 5′-ligation independent approach. The two libraries were integrated into a public website that allows the screening of individual sequences against 359,769 unique clones. The library contains 15 clones with 100% identity and 53 clones with one mismatch to miRNAs described for other plant species. For two conserved miRNAs, miR159 and miR390, we find clear differences in tissue-specific distribution, compared with other species. This shows that evolutionary conservation of miRNA sequences does not necessarily include a conservation of the miRNA expression profile. Almost 60% of all clones in the database are 24-nucleotide clones. In accordance with the role of 24mers in marking repetitive regions, we find them distributed across retroviral and transposable element sequences but other 24mers map to promoter regions and to different transcript regions. For one target region we observe tissue-specific variation of matching 24mers, which demonstrates that, as for 21mers, 24mer concentrations are not necessarily identical in different tissues. Asymmetric distribution of a putative novel miRNA in the two libraries suggests that the cloning method can be selective for the representation of certain small RNAs in a collection. PMID:19369427
Sievers, Aaron; Bosiek, Katharina; Bisch, Marc; Dreessen, Chris; Riedel, Jascha; Froß, Patrick; Hausmann, Michael; Hildenbrand, Georg
2017-01-01
In genome analysis, k-mer-based comparison methods have become standard tools. However, even though they are able to deliver reliable results, other algorithms seem to work better in some cases. To improve k-mer-based DNA sequence analysis and comparison, we successfully checked whether adding positional resolution is beneficial for finding and/or comparing interesting organizational structures. A simple but efficient algorithm for extracting and saving local k-mer spectra (frequency distribution of k-mers) was developed and used. The results were analyzed by including positional information based on visualizations as genomic maps and by applying basic vector correlation methods. This analysis was concentrated on small word lengths (1 ≤ k ≤ 4) on relatively small viral genomes of Papillomaviridae and Herpesviridae, while also checking its usability for larger sequences, namely human chromosome 2 and the homologous chromosomes (2A, 2B) of a chimpanzee. Using this alignment-free analysis, several regions with specific characteristics in Papillomaviridae and Herpesviridae formerly identified by independent, mostly alignment-based methods, were confirmed. Correlations between the k-mer content and several genes in these genomes have been found, showing similarities between classified and unclassified viruses, which may be potentially useful for further taxonomic research. Furthermore, unknown k-mer correlations in the genomes of Human Herpesviruses (HHVs), which are probably of major biological function, are found and described. Using the chromosomes of a chimpanzee and human that are currently known, identities between the species on every analyzed chromosome were reproduced. This demonstrates the feasibility of our approach for large data sets of complex genomes. 
Based on these results, we suggest k-mer analysis with positional resolution as a method for closing a gap between the effectiveness of alignment-based methods (like NCBI BLAST) and the high pace of standard k-mer analysis. PMID:28422050
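As a rough illustration of k-mer spectra with positional resolution, the sketch below computes a frequency vector per sliding window and compares windows with a Pearson correlation. The window size, step, toy sequence and function names are assumptions made here; the authors' own algorithm and parameters are not given in the abstract.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_spectrum(seq, k):
    """Frequency vector over all 4**k DNA k-mers, in lexicographic order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] for p in product("ACGT", repeat=k)]


def local_spectra(seq, k, window, step):
    """List of (start, spectrum) pairs for sliding windows along the sequence."""
    return [
        (start, kmer_spectrum(seq[start:start + window], k))
        for start in range(0, len(seq) - window + 1, step)
    ]


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


# Toy sequence: an AC repeat followed by a G run.
seq = "AC" * 6 + "G" * 12
spectra = local_spectra(seq, k=2, window=12, step=12)
similarity = pearson(spectra[0][1], spectra[1][1])
```

Plotting the per-window vectors against their start positions gives the kind of genomic map the abstract refers to; the two toy windows here have non-overlapping k-mer content, so their correlation is negative.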
Malecka, Kamila; Stachyra, Anna; Góra-Sochacka, Anna; Sirko, Agnieszka; Zagórski-Ostoja, Włodzimierz; Dehaen, Wim; Radecka, Hanna; Radecki, Jerzy
2015-03-15
This paper concerns the development of a redox-active monolayer and its application for the construction of an electrochemical genosensor designed for the detection of specific DNA and RNA oligonucleotide sequences related to the avian influenza virus (AIV) type H5N1. This new redox layer was created on a gold electrode surface step by step. Cyclic Voltammetry, Osteryoung Square-Wave Voltammetry and Differential Pulse Voltammetry were used for its characterization. This new redox-active layer was applied for the construction of the DNA biosensor. The NH2-NC3 probe (20-mer) was covalently attached to the gold electrode surface via a "click" reaction between the amine and an epoxide group. The hybridization process was monitored using Osteryoung Square-Wave Voltammetry. The 20-mer DNA and ca. 280-mer RNA oligonucleotides were used as the targets. The constructed genosensor was capable of determining complementary oligonucleotide sequences with a detection limit in the pM range. It was able to distinguish different positions of the RNA region complementary to the DNA probe. The genosensor was very selective: the 20-mer DNA as well as the 280-mer RNA oligonucleotides without a complementary sequence generated only a weak signal. Copyright © 2014 Elsevier B.V. All rights reserved.
Scobey, Trevor; Yount, Boyd L; Sims, Amy C; Donaldson, Eric F; Agnihothram, Sudhakar S; Menachery, Vineet D; Graham, Rachel L; Swanstrom, Jesica; Bove, Peter F; Kim, Jeeho D; Grego, Sonia; Randell, Scott H; Baric, Ralph S
2013-10-01
Severe acute respiratory syndrome with high mortality rates (~50%) is associated with a novel group 2c betacoronavirus designated Middle East respiratory syndrome coronavirus (MERS-CoV). We synthesized a panel of contiguous cDNAs that spanned the entire genome. Following contig assembly into genome-length cDNA, transfected full-length transcripts recovered several recombinant viruses (rMERS-CoV) that contained the expected marker mutations inserted into the component clones. Because the wild-type MERS-CoV contains a tissue culture-adapted T1015N mutation in the S glycoprotein, rMERS-CoV replicated ~0.5 log less efficiently than wild-type virus. In addition, we ablated expression of the accessory protein ORF5 (rMERS•ORF5) and replaced it with tomato red fluorescent protein (rMERS-RFP) or deleted the entire ORF3, 4, and 5 accessory cluster (rMERS-ΔORF3-5). Recombinant rMERS-CoV, rMERS-CoV•ORF5, and MERS-CoV-RFP replicated to high titers, whereas MERS-ΔORF3-5 showed 1-1.5 logs reduced titer compared with rMERS-CoV. Northern blot analyses confirmed the associated molecular changes in the recombinant viruses, and sequence analysis demonstrated that RFP was expressed from the appropriate consensus sequence AACGAA. We further show dipeptidyl peptidase 4 expression, MERS-CoV replication, and RNA and protein synthesis in human airway epithelial cell cultures, primary lung fibroblasts, primary lung microvascular endothelial cells, and primary alveolar type II pneumocytes, demonstrating a much broader tissue tropism than severe acute respiratory syndrome coronavirus. The availability of a MERS-CoV molecular clone, as well as recombinant viruses expressing indicator proteins, will allow for high-throughput testing of therapeutic compounds and provide a genetic platform for studying gene function and the rational design of live virus vaccines.
LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads.
El-Metwally, Sara; Zakaria, Magdi; Hamza, Taher
2016-11-01
The deluge of current sequenced data has exceeded Moore's Law, more than doubling every 2 years since the next-generation sequencing (NGS) technologies were invented. Accordingly, we will be able to generate more and more data at high speed and fixed cost, but lack the computational resources to store, process and analyze it. With error-prone high-throughput NGS reads and genomic repeats, the assembly graph contains a massive number of redundant nodes and branching edges. Most assembly pipelines require this large graph to reside in memory to start their workflows, which is intractable for mammalian genomes. Resource-efficient genome assemblers combine the power of advanced computing techniques with innovative data structures to encode the assembly graph efficiently in computer memory. LightAssembler is a lightweight assembly algorithm designed to be executed on a desktop machine. It uses a pair of cache-oblivious Bloom filters, one holding a uniform sample of [Formula: see text]-spaced sequenced [Formula: see text]-mers and the other holding [Formula: see text]-mers classified as likely correct, using a simple statistical test. LightAssembler contains a light implementation of the graph traversal and simplification modules that achieves assembly accuracy and contiguity comparable to those of competing tools. Our method reduces the memory usage by [Formula: see text] compared to the resource-efficient assemblers using benchmark datasets from the GAGE and Assemblathon projects. While LightAssembler can be considered a gap-based sequence assembler, different gap sizes result in an almost constant assembly size and genome coverage. https://github.com/SaraEl-Metwally/LightAssembler Contact: sarah_almetwally4@mans.edu.eg Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
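To make the Bloom-filter idea concrete, the following is a minimal sketch of a plain Bloom filter storing k-mers from reads. It is not LightAssembler's implementation: it is not cache oblivious, it does no sampling or error classification, and the class name, filter size and hash scheme (salted SHA-256) are assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: no false negatives, some false positives."""

    def __init__(self, n_bits, n_hashes):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting a SHA-256 digest.
        for salt in range(self.n_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(read, k):
    """Yield every overlapping k-mer of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


# Toy reads; a real run would stream millions of reads from a FASTQ file.
reads = ["ACGTACGTGA", "CGTACGTGAC"]
bf = BloomFilter(n_bits=4096, n_hashes=3)
for read in reads:
    for kmer in kmers(read, 5):
        bf.add(kmer)
```

The memory cost is fixed by `n_bits` regardless of k, which is why such filters suit assembly on a desktop machine; the price is that a membership query can return a false positive.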
Rasmussen, L. D.; Zawadsky, C.; Binnerup, S. J.; Øregaard, G.; Sørensen, S. J.; Kroer, N.
2008-01-01
Mercury-resistant bacteria may be important players in mercury biogeochemistry. To assess the potential for mercury reduction by two subsurface microbial communities, resistant subpopulations and their merA genes were characterized by a combined molecular and cultivation-dependent approach. The cultivation method simulated natural conditions by using polycarbonate membranes as a growth support and a nonsterile soil slurry as a culture medium. Resistant bacteria were pregrown to microcolony-forming units (mCFU) before being plated on standard medium. Compared to direct plating, culturability was increased up to 2,800 times and numbers of mCFU were similar to the total number of mercury-resistant bacteria in the soils. Denaturing gradient gel electrophoresis analysis of DNA extracted from membranes suggested stimulation of growth of hard-to-culture bacteria during the preincubation. A total of 25 different 16S rRNA gene sequences were observed, including Alpha-, Beta-, and Gammaproteobacteria; Actinobacteria; Firmicutes; and Bacteroidetes. The diversity of isolates obtained by direct plating included eight different 16S rRNA gene sequences (Alpha- and Betaproteobacteria and Actinobacteria). Partial sequencing of merA of selected isolates led to the discovery of new merA sequences. With phylum-specific merA primers, PCR products were obtained for Alpha- and Betaproteobacteria and Actinobacteria but not for Bacteroidetes and Firmicutes. The similarity to known sequences ranged between 89 and 95%. One of the sequences did not result in a match in the BLAST search. The results illustrate the power of integrating advanced cultivation methodology with molecular techniques for the characterization of the diversity of mercury-resistant populations and assessing the potential for mercury reduction in contaminated environments. PMID:18441111
Moreno, Ana; Lelli, Davide; de Sabato, Luca; Zaccaria, Guendalina; Boni, Arianna; Sozzi, Enrica; Prosperi, Alice; Lavazza, Antonio; Cella, Eleonora; Castrucci, Maria Rita; Ciccozzi, Massimo; Vaccari, Gabriele
2017-12-19
Middle East respiratory syndrome coronavirus (MERS-CoV), which belongs to the beta group of coronaviruses, can infect multiple host species and causes severe disease in humans. Multiple surveillance and phylogenetic studies suggest a bat origin. In this study, we describe the detection and full genome characterization of two CoVs closely related to MERS-CoV from two Italian bats, Pipistrellus kuhlii and Hypsugo savii. Pools of viscera were tested by a pan-coronavirus RT-PCR. Virus isolation was attempted by inoculation in different cell lines. Full genome sequencing was performed using the Ion Torrent platform and phylogenetic trees were constructed using IQtree software. Similarity plots of CoV clade c genomes were generated by using SSE v1.2. The three dimensional macromolecular structure (3DMMS) of the receptor binding domain (RBD) in the S protein was predicted by a sequence-homology method using the protein data bank (PDB). Both samples tested positive in the pan-coronavirus RT-PCR (IT-batCoVs) and their genome organization was identical to that of MERS-CoV. Phylogenetic analysis showed a monophyletic group placed in the Beta2c clade formed by MERS-CoV sequences originating from humans and camels and bat-related sequences from Africa, Italy and China. The comparison of the secondary structure and 3DMMS of the RBD of IT-batCoVs with MERS, HKU4 and HKU5 bat sequences showed two aa deletions located in a region corresponding to the external subdomain of MERS-RBD in IT-batCoV and HKU5 RBDs. This study reported two beta CoVs closely related to MERS that were obtained from two bats belonging to two commonly recorded species in Italy (P. kuhlii and H. savii). The analysis of the RBD showed a similar structure in IT-batCoVs and HKU5 compared with HKU4 sequences. Since the RBD of HKU4 but not HKU5 can bind to the human DPP4 receptor for MERS-CoV, it is possible to suggest the absence of DPP4-binding potential for IT-batCoVs as well.
More surveillance studies are needed to better investigate the potential intermediate hosts that may play a role in the interspecies transmission of known and currently unknown coronaviruses with particular attention to the S protein and the receptor specificity and binding affinity.
1990-02-28
include energy costs, time required for cooling, large volume changes, and degradation. For many high-temperature LCPs, the latter may be the most...LCPs)- high local (microscopic) orientational order, which is retained in the solid state-has significant implications in a range of DOD applications...that yield and maintain specific mer sequences. * Continue efforts to measure mer sequence distribution, e.g., by multinuclei NMR. * Develop high
Phylogenomics of nonavian reptiles and the structure of the ancestral amniote genome
Shedlock, Andrew M.; Botka, Christopher W.; Zhao, Shaying; Shetty, Jyoti; Zhang, Tingting; Liu, Jun S.; Deschavanne, Patrick J.; Edwards, Scott V.
2007-01-01
We report results of a megabase-scale phylogenomic analysis of the Reptilia, the sister group of mammals. Large-scale end-sequence scanning of genomic clones of a turtle, alligator, and lizard reveals diverse, mammal-like landscapes of retroelements and simple sequence repeats (SSRs) not found in the chicken. Several global genomic traits, including distinctive phylogenetic lineages of CR1-like long interspersed elements (LINEs) and a paucity of A-T rich SSRs, characterize turtles and archosaur genomes, whereas higher frequencies of tandem repeats and a lower global GC content reveal mammal-like features in Anolis. Nonavian reptile genomes also possess a high frequency of diverse and novel 50-bp unit tandem duplications not found in chicken or mammals. The frequency distributions of ≈65,000 8-mer oligonucleotides suggest that rates of DNA-word frequency change are an order of magnitude slower in reptiles than in mammals. These results suggest a diverse array of interspersed and SSRs in the common ancestor of amniotes and a genomic conservatism and gradual loss of retroelements in reptiles that culminated in the minimalist chicken genome. PMID:17307883
Xie, Qian; Cao, Yujuan; Su, Juan; Wu, Jie; Wu, Xianbo; Wan, Chengsong; He, Mingliang; Ke, Changwen; Zhang, Bao; Zhao, Wei
2017-08-01
Significant sequence variation of Middle East respiratory syndrome coronavirus (MERS CoV) had not been detected since the virus was first reported in 2012. A MERS patient came from Korea to China in late May 2015. The patient was 44 years old and had symptoms including high fever, dry cough with a little phlegm, and shortness of breath, which are roughly consistent with those associated with MERS, and had had close contact with individuals with confirmed cases of MERS. After one month of therapy with antiviral, anti-infection, and immune-enhancing agents, the patient recovered in the hospital and was discharged. A nasopharyngeal swab sample was collected for direct sequencing, which revealed two deletion variants of MERS CoV. Deletions of 414 and 419 nt occurred between ORF5 and the E protein, resulting in a partial protein fusion or truncation of ORF5 and the E protein. Functional analysis by bioinformatics and comparison to previous studies implied that the two variants might be defective in their ability to package MERS CoV. However, how these deletions arose and what effects they have need further investigation.
Min, Xu; Zeng, Wanwen; Chen, Ning; Chen, Ting; Jiang, Rui
2017-07-15
Experimental techniques for measuring chromatin accessibility are expensive and time consuming, which calls for the development of computational approaches to predict open chromatin regions from DNA sequences. Along this direction, existing methods fall into two classes: one based on handcrafted k-mer features and the other based on convolutional neural networks. Although both categories have shown good performance in specific applications thus far, a comprehensive framework to integrate useful k-mer co-occurrence information with recent advances in deep learning is still lacking. We fill this gap by addressing the problem of chromatin accessibility prediction with a convolutional Long Short-Term Memory (LSTM) network with k-mer embedding. We first split DNA sequences into k-mers and pre-train k-mer embedding vectors based on the co-occurrence matrix of k-mers by using an unsupervised representation learning approach. We then construct a supervised deep learning architecture comprised of an embedding layer, three convolutional layers and a Bidirectional LSTM (BLSTM) layer for feature learning and classification. We demonstrate that our method gains high-quality fixed-length features from variable-length sequences and consistently outperforms baseline methods. We show that k-mer embedding can effectively enhance model performance by exploring different embedding strategies. We also prove the efficacy of both the convolution and the BLSTM layers by comparing two variations of the network architecture. We confirm the robustness of our model to hyper-parameters by performing sensitivity analysis. We hope our method can eventually reinforce our understanding of employing deep learning in genomic studies and shed light on research regarding mechanisms of chromatin accessibility. The source code can be downloaded from https://github.com/minxueric/ismb2017_lstm. Contact: tingchen@tsinghua.edu.cn or ruijiang@tsinghua.edu.cn.
Supplementary materials are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Min, Xu; Zeng, Wanwen; Chen, Ning; Chen, Ting; Jiang, Rui
2017-01-01
Abstract Motivation: Experimental techniques for measuring chromatin accessibility are expensive and time consuming, which calls for the development of computational approaches to predict open chromatin regions from DNA sequences. Along this direction, existing methods fall into two classes: one based on handcrafted k-mer features and the other based on convolutional neural networks. Although both categories have shown good performance in specific applications thus far, a comprehensive framework to integrate useful k-mer co-occurrence information with recent advances in deep learning is still lacking. Results: We fill this gap by addressing the problem of chromatin accessibility prediction with a convolutional Long Short-Term Memory (LSTM) network with k-mer embedding. We first split DNA sequences into k-mers and pre-train k-mer embedding vectors based on the co-occurrence matrix of k-mers by using an unsupervised representation learning approach. We then construct a supervised deep learning architecture comprised of an embedding layer, three convolutional layers and a Bidirectional LSTM (BLSTM) layer for feature learning and classification. We demonstrate that our method gains high-quality fixed-length features from variable-length sequences and consistently outperforms baseline methods. We show that k-mer embedding can effectively enhance model performance by exploring different embedding strategies. We also prove the efficacy of both the convolution and the BLSTM layers by comparing two variations of the network architecture. We confirm the robustness of our model to hyper-parameters by performing sensitivity analysis. We hope our method can eventually reinforce our understanding of employing deep learning in genomic studies and shed light on research regarding mechanisms of chromatin accessibility. Availability and implementation: The source code can be downloaded from https://github.com/minxueric/ismb2017_lstm.
Contact: tingchen@tsinghua.edu.cn or ruijiang@tsinghua.edu.cn Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:28881969
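The first preprocessing step described above, splitting sequences into k-mers and counting their co-occurrences, can be sketched as follows. The context window, stride and function names are assumptions made here for illustration; the authors' repository should be consulted for the actual settings, and the embedding training itself is not shown.

```python
from collections import defaultdict


def split_kmers(seq, k, stride=1):
    """Split a DNA sequence into overlapping k-mer tokens."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def cooccurrence(sequences, k, context=2):
    """Count how often two k-mers occur within `context` tokens of each other."""
    counts = defaultdict(int)
    for seq in sequences:
        tokens = split_kmers(seq, k)
        for i, center in enumerate(tokens):
            lo = max(0, i - context)
            hi = min(len(tokens), i + context + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(center, tokens[j])] += 1
    return dict(counts)


# Toy input: one 6-nt sequence gives the tokens ACG, CGT, GTA, TAC.
counts = cooccurrence(["ACGTAC"], k=3, context=1)
```

A matrix of such counts is the usual input to unsupervised embedding methods that factorize co-occurrence statistics; the resulting vectors would then feed the embedding layer of the network.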
Maruthamuthu, Murali Kannan; Hong, Jiyeon; Arulsamy, Kulandaisamy; Somasundaram, Sivachandiran; Hong, SoonHo; Choe, Woo-Seok; Yoo, Ik-Keun
2018-04-01
Peptide-displaying Escherichia coli cells were investigated for use in adsorptive removal of bisphenol A (BPA) in both Luria-Bertani medium containing BPA and wastewater eluted from ATM thermal paper. Two recombinant strains were constructed with monomeric and dimeric repeats of the 7-mer BPA-binding peptide (KSLENSY), respectively. Greater than threefold increased adsorption of BPA [230.4 µmol BPA per g dry cell weight (DCW)] was found in dimeric peptide-displaying cells compared to monomeric strains (63.4 µmol per g DCW) in 15 ppm BPA solution. The selective removal of BPA from a mixture of BPA analogs (bisphenol F and bisphenol S) was verified in both monomeric and dimeric peptide-displaying cells. The binding chemistry of BPA with the peptide was assumed, based on molecular docking analysis, to be the interaction of BPA with serine and asparagine residues within the 7-mer peptide sequence. The peptide-displaying cells also functioned efficiently in thermal paper eluted wastewater containing 14.5 ppm BPA.
Dromedary camels and the transmission of Middle East Respiratory Syndrome Coronavirus (MERS-CoV)
Hemida, Maged G; Elmoslemany, Ahmed; Al-Hizab, Fahad; Alnaeem, Abdulmohsen; Almathen, Faisal; Faye, Bernard; Chu, Daniel KW; Perera, Ranawaka A; Peiris, Malik
2015-01-01
Middle East respiratory syndrome coronavirus (MERS-CoV) is an existential threat to global public health. The virus has been repeatedly detected in dromedary camels (Camelus dromedarius). Adult animals in many countries in the Middle East as well as in North and East Africa showed high (>90%) sero-prevalence to the virus. MERS-CoV isolated from dromedaries is genetically and phenotypically similar to viruses from humans. We summarise current understanding of the ecology of MERS-CoV in animals and transmission at the animal-human interface. We review aspects of husbandry, animal movements and trade and the use and consumption of camel dairy and meat products in the Middle East that may be relevant to the epidemiology of MERS. We also highlight the gaps in understanding the transmission of this virus in animals and from animals to humans. PMID:26256102
Linearization improves the repeatability of quantitative dynamic contrast-enhanced MRI.
Jones, Kyle M; Pagel, Mark D; Cárdenas-Rodríguez, Julio
2018-04-01
The purpose of this study was to compare the repeatabilities of the linear and nonlinear Tofts and reference region models (RRM) for dynamic contrast-enhanced MRI (DCE-MRI). Simulated and experimental DCE-MRI data from 12 rats with a flank tumor of C6 glioma acquired over three consecutive days were analyzed using four quantitative and semi-quantitative DCE-MRI metrics. The quantitative methods used were: 1) linear Tofts model (LTM), 2) non-linear Tofts model (NTM), 3) linear RRM (LRRM), and 4) non-linear RRM (NRRM). The following semi-quantitative metrics were used: 1) maximum enhancement ratio (MER), 2) time to peak (TTP), 3) initial area under the curve (iauc64), and 4) slope. LTM and NTM were used to estimate K^trans, while LRRM and NRRM were used to estimate K^trans relative to muscle (R^Ktrans). Repeatability was assessed by calculating the within-subject coefficient of variation (wSCV) and the percent intra-subject variation (iSV) determined with the Gage R&R analysis. The iSV for R^Ktrans using LRRM was two-fold lower compared to NRRM at all simulated and experimental conditions. A similar trend was observed for the Tofts model, where LTM was at least 50% more repeatable than the NTM under all experimental and simulated conditions. The semi-quantitative metrics iauc64 and MER were as repeatable as K^trans and R^Ktrans estimated by LTM and LRRM, respectively. The iSV for iauc64 and MER was significantly lower than the iSV for slope and TTP. In simulations and experimental results, linearization improves the repeatability of quantitative DCE-MRI by at least 30%, making it as repeatable as semi-quantitative metrics. Copyright © 2017 Elsevier Inc. All rights reserved.
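For readers unfamiliar with the repeatability metric, the sketch below computes a within-subject coefficient of variation from repeated measurements. The abstract does not state which formula the authors used; the root-mean-square form shown here is one common definition and is an assumption, as are the function name and the made-up values.

```python
from math import sqrt
from statistics import mean, variance


def within_subject_cv(measurements):
    """Root-mean-square within-subject CV.

    `measurements` maps each subject to its repeated measurements.
    For every subject the sample variance is divided by the squared mean;
    the square root of the average of these ratios is returned.
    """
    ratios = [
        variance(values) / mean(values) ** 2
        for values in measurements.values()
    ]
    return sqrt(mean(ratios))


# Made-up repeated measurements over three days; not data from the study.
toy = {"rat_1": [9.0, 10.0, 11.0], "rat_2": [18.0, 20.0, 22.0]}
wcv = within_subject_cv(toy)
```

Both toy subjects vary by 10% around their own mean, so the function returns 0.1 regardless of the difference in scale between subjects, which is the property that makes the metric suitable for comparing models.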
Simrank: Rapid and sensitive general-purpose k-mer search tool
2011-01-01
Background Terabyte-scale collections of string-encoded data are expected from consortia efforts such as the Human Microbiome Project http://nihroadmap.nih.gov/hmp. Intra- and inter-project data similarity searches are enabled by rapid k-mer matching strategies. Software applications for sequence database partitioning, guide tree estimation, molecular classification and alignment acceleration have benefited from embedded k-mer searches as sub-routines. However, a rapid, general-purpose, open-source, flexible, stand-alone k-mer tool has not been available. Results Here we present a stand-alone utility, Simrank, which allows users to rapidly identify the database strings most similar to query strings. Performance testing of Simrank and related tools against DNA, RNA, protein and human-language datasets found Simrank to be 10X to 928X faster, depending on the dataset. Conclusions Simrank provides molecular ecologists with a high-throughput, open-source choice for comparing large sequence sets to find similarity. PMID:21524302
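A k-mer similarity search of the kind described can be sketched in a few lines. The score used here, shared unique k-mers divided by the smaller of the two unique k-mer counts, is an assumption made for illustration; the abstract does not define Simrank's score, and this is not Simrank's code or interface.

```python
def unique_kmers(s, k):
    """Set of distinct k-mers in a string."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def rank_by_shared_kmers(query, database, k=7):
    """Rank database strings by the fraction of unique k-mers shared with the query."""
    q = unique_kmers(query, k)
    scored = []
    for name, seq in database.items():
        d = unique_kmers(seq, k)
        denom = min(len(q), len(d))
        score = len(q & d) / denom if denom else 0.0
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy database; works equally on DNA, protein or plain-text strings.
database = {
    "seq1": "ACGTACGTTAGC",
    "seq2": "TTTTGGGGCCCC",
    "seq3": "ACGTACGTTAGG",
}
ranking = rank_by_shared_kmers("ACGTACGTTAGC", database, k=4)
```

Because the comparison is a set intersection, no alignment is needed, which is where the speed of k-mer methods comes from; a production tool would precompute the database k-mer sets once.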
Sequence Polishing Library (SPL) v10.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oberortner, Ernst
The Sequence Polishing Library (SPL) is a suite of software tools that automate "Design for Synthesis and Assembly" workflows. Specifically: The SPL "Converter" tool converts files among the following sequence data exchange formats: CSV, FASTA, GenBank, and Synthetic Biology Open Language (SBOL); The SPL "Juggler" tool optimizes the codon usage of DNA coding sequences according to an optimization strategy, a user-specific codon usage table and genetic code. In addition, the SPL "Juggler" can translate amino acid sequences into DNA sequences; The SPL "Polisher" verifies DNA sequences against DNA synthesis constraints, such as GC content, repeating k-mers, and restriction sites. In case of violations, the "Polisher" reports the violations in a comprehensive manner. The "Polisher" tool can also modify the violating regions according to an optimization strategy, a user-specific codon usage table and genetic code; The SPL "Partitioner" decomposes large DNA sequences into smaller building blocks with partial overlaps that enable an efficient assembly. The "Partitioner" enables the user to configure the characteristics of the overlaps, which are mostly determined by the utilized assembly protocol, such as length, GC content, or melting temperature.
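The three kinds of synthesis constraints named for the "Polisher" (GC content, repeating k-mers, restriction sites) can be checked with a short sketch like the one below. This is not SPL code or its interface: the function names, the GC range and the k value are hypothetical, and only the forward strand is scanned. GAATTC is the EcoRI recognition sequence.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def repeated_kmers(seq, k):
    """Return {k-mer: count} for every k-mer occurring more than once."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


def find_sites(seq, sites):
    """Return {name: [0-based positions]} for each recognition site present."""
    found = {}
    for name, site in sites.items():
        positions = [
            i for i in range(len(seq) - len(site) + 1)
            if seq.startswith(site, i)
        ]
        if positions:
            found[name] = positions
    return found


def check(seq, gc_range=(0.3, 0.7), k=8, sites=None):
    """Collect human-readable violation messages (thresholds are hypothetical)."""
    violations = []
    gc = gc_content(seq)
    if not gc_range[0] <= gc <= gc_range[1]:
        violations.append(f"GC content {gc:.2f} outside {gc_range}")
    for kmer, n in repeated_kmers(seq, k).items():
        violations.append(f"{k}-mer {kmer} occurs {n} times")
    for name, positions in find_sites(seq, sites or {}).items():
        violations.append(f"{name} site at {positions}")
    return violations


seq = "GAATTCAAGAATTCAA"  # toy sequence that violates all three constraints
report = check(seq, k=8, sites={"EcoRI": "GAATTC"})
```

Fixing the reported regions, as the "Polisher" is said to do, would additionally require synonymous codon substitution guided by a codon usage table, which is outside this sketch.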
Chandrashekar, Darshan Shimoga; Dey, Poulami; Acharya, Kshitish K.
2015-01-01
Background Genome-wide repeat sequences, such as LINEs, SINEs and LTRs share a considerable part of the mammalian nuclear genomes. These repeat elements seem to be important for multiple functions including the regulation of transcription initiation, alternative splicing and DNA methylation. But it is not possible to study all repeats and, hence, it would help to short-list before exploring their potential functional significance via experimental studies and/or detailed in silico analyses. Result We developed the ‘Genomic Repeat Element Analyzer for Mammals’ (GREAM) for analysis, screening and selection of potentially important mammalian genomic repeats. This web-server offers many novel utilities. For example, this is the only tool that can reveal a categorized list of specific types of transposons, retro-transposons and other genome-wide repetitive elements that are statistically over-/under-represented in regions around a set of genes, such as those expressed differentially in a disease condition. The output displays the position and frequency of identified elements within the specified regions. In addition, GREAM offers two other types of analyses of genomic repeat sequences: a) enrichment within chromosomal region(s) of interest, and b) comparative distribution across the neighborhood of orthologous genes. GREAM successfully short-listed a repeat element (MER20) known to contain functional motifs. In other case studies, we could use GREAM to short-list repetitive elements in the azoospermia factor a (AZFa) region of the human Y chromosome and those around the genes associated with rat liver injury. GREAM could also identify five over-represented repeats around some of the human and mouse transcription factor coding genes that had conserved expression patterns across the two species. 
Conclusion GREAM has been developed to provide an impetus to research on the role of repetitive sequences in mammalian genomes by offering easy selection of more interesting repeats in various contexts/regions. GREAM is freely available at http://resource.ibab.ac.in/GREAM/. PMID:26208093
Verma, Alok Kumar; Misra, Amita; Subash, Swarna; Das, Mukul; Dwivedi, Premendra D
2011-09-01
Development of genetically modified (GM) crops is on the increase to improve food quality, increase harvest yields, and reduce the dependency on chemical pesticides. Before their release in the marketplace, they should be scrutinized for their safety. Several guidelines of different regulatory agencies like ILSI, WHO Codex, OECD, and so on for allergenicity evaluation of transgenics are available, and sequence homology analysis is the first test to determine the allergenic potential of inserted proteins. Therefore, to test and validate this approach, 312 allergenic, 100 non-allergenic, and 48 inserted proteins were assessed for sequence similarity using 8-mer, 80-mer, and full FASTA search. On performing sequence homology studies, ~94% of the allergenic proteins gave exact matches for 8-mer and 80-mer homology. However, 20 allergenic proteins showed non-allergenic behavior. Out of 100 non-allergenic proteins, seven qualified as allergens. None of the inserted proteins demonstrated allergenic behavior. In order to improve the predictability, proteins showing anomalous behavior were tested by Algpred and ADFS separately. Use of the Algpred and ADFS software reduced the tendency of false prediction to a great extent (74-78%). In conclusion, routine sequence homology needs to be coupled with some other bioinformatic method like ADFS/Algpred to reduce false allergenicity prediction of novel proteins.
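The 8-mer part of the sequence homology test amounts to looking for any stretch of eight contiguous amino acids shared exactly with a known allergen. A minimal sketch follows; the sequences and names are made up, no real allergen database is used, and the 80-mer and full FASTA searches, which need an alignment tool, are not covered.

```python
def peptide_kmers(seq, k=8):
    """Set of distinct k-residue windows in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(allergens, k=8):
    """Map each k-mer to the set of allergen names that contain it."""
    index = {}
    for name, seq in allergens.items():
        for kmer in peptide_kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def exact_matches(query, index, k=8):
    """Return {allergen name: sorted shared k-mers} for a query protein."""
    hits = {}
    for kmer in peptide_kmers(query, k):
        for name in index.get(kmer, ()):
            hits.setdefault(name, []).append(kmer)
    return {name: sorted(kmers) for name, kmers in hits.items()}


# Made-up sequences for illustration; not real allergens or inserted proteins.
allergens = {"toy_allergen": "MKTLLVAGAVSLQEWRNDP"}
index = build_index(allergens)
hits = exact_matches("GGGAVSLQEWRGGG", index)
```

A query with no entry in `hits` passes this particular screen, which, as the abstract notes, does not rule out false negatives or false positives.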
Assiri, Abdullah M.; Midgley, Claire M.; Abedi, Glen R.; Saeed, Abdulaziz Bin; Almasri, Malak M.; Lu, Xiaoyan; Al-Abdely, Hail M.; Abdalla, Osman; Mohammed, Mutaz; Algarni, Homoud S.; Alhakeem, Raafat F.; Sakthivel, Senthilkumar K.; Nooh, Randa; Alshayab, Zainab; Alessa, Mohammad; Srinivasamoorthy, Ganesh; AlQahtani, Saeed Yahya; Kheyami, Ali; HajOmar, Waleed Husein; Banaser, Talib M.; Esmaeel, Ahmad; Hall, Aron J.; Curns, Aaron T.; Tamin, Azaibi; Alsharef, Ali Abraheem; Erdman, Dean; Watson, John T.; Gerber, Susan I.
2017-01-01
Background Middle East respiratory syndrome coronavirus (MERS-CoV) causes severe respiratory illness in humans. Fundamental questions about circulating viruses and transmission routes remain. Methods We assessed routinely collected epidemiologic data for MERS-CoV cases reported in Saudi Arabia during 1 January–30 June 2015 and conducted a more detailed investigation of cases reported during February 2015. Available respiratory specimens were obtained for sequencing. Results During the study period, 216 MERS-CoV cases were reported. Full genome (n = 17) or spike gene sequences (n = 82) were obtained from 99 individuals. Most sequences (72 of 99 [73%]) formed a discrete, novel recombinant subclade (NRC-2015), which was detected in 6 regions and became predominant by June 2015. No clinical differences were noted between clades. Among 87 cases reported during February 2015, 13 had no recognized risks for secondary acquisition; 12 of these 13 also denied camel contact. Most viruses (8 of 9) from these 13 individuals belonged to NRC-2015. Discussions Our findings document the spread and eventual predominance of NRC-2015 in humans in Saudi Arabia during the first half of 2015. Our identification of cases without recognized risk factors but with similar virus sequences indicates the need for better understanding of risk factors for MERS-CoV transmission. PMID:27302191
Sekizuka, Tsuyoshi; Yamashita, Akifumi; Murase, Yoshiro; Iwamoto, Tomotada; Mitarai, Satoshi; Kato, Seiya; Kuroda, Makoto
2015-01-01
Whole-genome sequencing (WGS) with next-generation DNA sequencing (NGS) is an increasingly accessible and affordable method for genotyping hundreds of Mycobacterium tuberculosis (Mtb) isolates, leading to more effective epidemiological studies involving single nucleotide variations (SNVs) in core genomic sequences based on molecular evolution. We developed an all-in-one web-based tool for genotyping Mtb, referred to as the Total Genotyping Solution for TB (TGS-TB), to facilitate multiple genotyping platforms using NGS for spoligotyping and the detection of phylogenies with core genomic SNVs, IS6110 insertion sites, and 43 customized loci for variable number tandem repeat (VNTR) through a user-friendly, simple click interface. This methodology is implemented with a KvarQ script to predict MTBC lineages/sublineages and potential antimicrobial resistance. Seven Mtb isolates (JP01 to JP07) in this study showing the same VNTR profile were accurately discriminated through median-joining network analysis using SNVs unique to those isolates. An additional IS6110 insertion was detected in one of those isolates as supportive genetic information in addition to core genomic SNVs. The results of in silico analyses using TGS-TB are consistent with those obtained using conventional molecular genotyping methods, suggesting that NGS short reads could provide multiple genotypes to discriminate multiple strains of Mtb, although longer NGS reads (≥300-mer) will be required for full genotyping on the TGS-TB web site. Most available short reads (~100-mer) can be utilized to discriminate the isolates based on the core genome phylogeny. TGS-TB provides a more accurate and discriminative strain typing for clinical and epidemiological investigations; NGS strain typing offers a total genotyping solution for Mtb outbreak and surveillance. TGS-TB web site: https://gph.niid.go.jp/tgs-tb/. PMID:26565975
Kaga, Chiaki; Okochi, Mina; Tomita, Yasuyuki; Kato, Ryuji; Honda, Hiroyuki
2008-03-01
We developed a method of effective peptide screening that combines experiments and computational analysis. The method is based on the concept that screening efficiency can be enhanced from even limited data by using a model derived from computational analysis as a guide to screening and combining that model with subsequent repeated experiments. Here we focus on cell-adhesion peptides as a model application of this peptide-screening strategy. Cell-adhesion peptides were screened by use of a cell-based assay of a peptide array. Starting with the screening data obtained from a limited, random 5-mer library (643 sequences), a rule regarding structural characteristics of cell-adhesion peptides was extracted by fuzzy neural network (FNN) analysis. According to this rule, peptides with unfavored residues in certain positions that led to inefficient binding were eliminated from the random sequences. In the restricted, second random library (273 sequences), the yield of cell-adhesion peptides having an adhesion rate more than 1.5-fold that of the basal array support was significantly higher (31%) than in the unrestricted random library (20%). In the restricted third library (50 sequences), the yield of cell-adhesion peptides increased to 84%. We conclude that a repeated cycle of experiments screening limited numbers of peptides can be assisted by the rule-extracting feature of FNN.
Lau, Billy T; Ji, Hanlee P
2017-09-21
RNA-Seq measures gene expression by counting sequence reads belonging to unique cDNA fragments. Molecular barcodes commonly in the form of random nucleotides were recently introduced to improve gene expression measures by detecting amplification duplicates, but are susceptible to errors generated during PCR and sequencing. This results in false positive counts, leading to inaccurate transcriptome quantification especially at low input and single-cell RNA amounts where the total number of molecules present is minuscule. To address this issue, we demonstrated the systematic identification of molecular species using transposable error-correcting barcodes that are exponentially expanded to tens of billions of unique labels. We experimentally showed random-mer molecular barcodes suffer from substantial and persistent errors that are difficult to resolve. To assess our method's performance, we applied it to the analysis of known reference RNA standards. By including an inline random-mer molecular barcode, we systematically characterized the presence of sequence errors in random-mer molecular barcodes. We observed that such errors are extensive and become more dominant at low input amounts. We described the first study to use transposable molecular barcodes and its use for studying random-mer molecular barcode errors. Extensive errors found in random-mer molecular barcodes may warrant the use of error correcting barcodes for transcriptome analysis as input amounts decrease.
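The general idea of an error-correcting barcode can be illustrated with a minimal Python sketch. It assumes a known whitelist of barcodes that differ from one another at several positions, so that an observed barcode containing a single sequencing error can still be assigned to a unique whitelist entry. This is a generic nearest-match scheme for illustration only, not the transposable barcode design of the study; the names `hamming`, `correct_barcode` and the example whitelist are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed, whitelist, max_dist=1):
    """Return the unique whitelist barcode within max_dist mismatches, else None.

    Ambiguous matches (more than one candidate) and barcodes with too many
    errors are rejected rather than guessed.
    """
    hits = [b for b in whitelist if hamming(observed, b) <= max_dist]
    return hits[0] if len(hits) == 1 else None


# Hypothetical whitelist; every pair differs in at least 3 positions.
WHITELIST = ["AAAA", "CCGG", "GTGT"]

print(correct_barcode("AAAT", WHITELIST))  # one mismatch from AAAA
print(correct_barcode("ACGA", WHITELIST))  # too far from every entry
```

A purely random-mer barcode has no whitelist to fall back on, which is one way to see why errors in such barcodes are hard to resolve after the fact.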
CAFE: aCcelerated Alignment-FrEe sequence analysis.
Lu, Yang Young; Tang, Kujin; Ren, Jie; Fuhrman, Jed A; Waterman, Michael S; Sun, Fengzhu
2017-07-03
Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
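As a rough illustration of what a dissimilarity "based solely on the k-mer frequencies" looks like, the Python sketch below builds normalized k-mer frequency profiles for two sequences and compares them with a Manhattan distance. It is not CAFE's implementation and does not reproduce the CVTree, $d_2^*$ or $d_2^S$ measures, which additionally model background sequence composition; the function names are assumptions made for this example.

```python
from collections import Counter


def kmer_freqs(seq, k):
    """Relative frequency of every overlapping k-mer in seq."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def manhattan(p, q):
    """Sum of absolute frequency differences over the union of k-mers."""
    return sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))


a = kmer_freqs("ACGTACGTAC", 3)
b = kmer_freqs("ACGTTCGTAC", 3)
print(manhattan(a, a))  # identical profiles give 0.0
print(manhattan(a, b))  # a value between 0 and 2
```

Because each profile sums to 1, this distance ranges from 0 (identical k-mer composition) to 2 (no shared k-mers).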
Shirato, Kazuya; Semba, Shohei; El-Kafrawy, Sherif A; Hassan, Ahmed M; Tolah, Ahmed M; Takayama, Ikuyo; Kageyama, Tsutomu; Notomi, Tsugunori; Kamitani, Wataru; Matsuyama, Shutoku; Azhar, Esam Ibraheem
2018-05-12
Clinical detection of Middle East respiratory syndrome (MERS) coronavirus (MERS-CoV) in patients is achieved using genetic diagnostic methods, such as real-time RT-PCR assay. Previously, we developed a reverse transcription-loop-mediated isothermal amplification (RT-LAMP) assay for the detection of MERS-CoV [Virol J. 2014. 11:139]. Generally, amplification of RT-LAMP is monitored by the turbidity induced by precipitation of magnesium pyrophosphate with newly synthesized DNA. However, this mechanism cannot completely exclude the possibility of unexpected reactions. Therefore, in this study, fluorescent RT-LAMP assays using quenching probes (QProbes) were developed specifically to monitor only primer-derived signals. Two primer sets (targeting nucleocapsid and ORF1a sequences) were constructed to confirm MERS cases by RT-LAMP assay only. Our data indicate that both primer sets were capable of detecting MERS-CoV RNA to the same level as existing genetic diagnostic methods, and that both were highly specific with no cross-reactivity observed with other respiratory viruses. These primer sets were highly efficient in amplifying target sequences derived from different MERS-CoV strains, including camel MERS-CoV. In addition, the detection efficacy of QProbe RT-LAMP was comparable to that of real-time RT-PCR assay using clinical specimens from patients in Saudi Arabia. Altogether, these results indicate that QProbe RT-LAMP assays described here can be used as powerful diagnostic tools for rapid detection and surveillance of MERS-CoV infections. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Primer3_masker: integrating masking of template sequence with primer design software.
Kõressaar, Triinu; Lepamets, Maarja; Kaplinski, Lauris; Raime, Kairi; Andreson, Reidar; Remm, Maido
2018-06-01
Designing PCR primers for amplifying regions of eukaryotic genomes is a complicated task because the genomes contain a large number of repeat sequences and other regions unsuitable for amplification by PCR. We have developed a novel k-mer based masking method that uses a statistical model to detect and mask failure-prone regions on the DNA template prior to primer design. We implemented the method as a standalone program, primer3_masker, and integrated it into the primer design program Primer3. The standalone version of primer3_masker is implemented in C. The source code is freely available at https://github.com/bioinfo-ut/primer3_masker/ (standalone version for Linux and macOS) and at https://github.com/primer3-org/primer3/ (integrated version). The Primer3 web application that allows masking sequences of 196 animal and plant genomes is available at http://primer3.ut.ee/. Contact: maido.remm@ut.ee. Supplementary data are available at Bioinformatics online.
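The basic mechanics of k-mer based masking can be sketched in Python as follows. Note that this example swaps the statistical model described in the abstract for a plain count threshold: any template position covered by a k-mer that occurs more than `max_count` times in the reference is lowercased. The function names and the threshold parameter are assumptions for illustration and do not correspond to primer3_masker's actual options.

```python
from collections import Counter


def count_kmers(genome, k):
    """Count every overlapping k-mer in the reference sequence."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def mask_frequent(template, counts, k, max_count):
    """Lowercase each template position covered by an over-represented k-mer."""
    masked = [False] * len(template)
    for i in range(len(template) - k + 1):
        if counts[template[i:i + k]] > max_count:
            for j in range(i, i + k):
                masked[j] = True
    return "".join(
        base.lower() if flag else base.upper()
        for base, flag in zip(template, masked)
    )


genome = "ACGTACGTACGTTTGCA"
counts = count_kmers(genome, 4)
print(mask_frequent(genome, counts, k=4, max_count=1))
# prints acgtacgtacgtTTGCA: the repeated ACGT region is masked
```

Lowercase (soft) masking keeps the sequence intact while letting a downstream primer picker avoid the flagged bases.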
Exploration of RNA Sequence Space in the Absence of a Replicase.
Tirumalai, Madhan R; Tran, Quyen; Paci, Maxim; Chavan, Dimple; Marathe, Anuradha; Fox, George E
2018-05-11
It is generally considered that, if an RNA World ever existed, it would have been driven by an RNA capable of RNA replication. Whether such a catalytic RNA could emerge in an RNA World or not, there would need to be prior routes to increasing complexity in order to produce it. It is hypothesized here that increasing sequence variety, if not complexity, can in fact readily emerge in response to a dynamic equilibrium between synthesis and degradation. A model system in which T4 RNA ligase catalyzes synthesis and Benzonase catalyzes degradation was constructed. An initial 20-mer served as a seed and was subjected to 180 min of simultaneous ligation and degradation. The seed RNA rapidly disappeared and was replaced by an increasing number and variety of both larger and smaller variants. Variants of 40-80 residues were consistently seen, typically representing 2-4% of the unique sequences. In a second experiment with four individual 9-mers, numerous variants were again produced. These included variants of the individual 9-mers as well as sequences that contained sequence segments from two or more 9-mers. In both cases, the RNA products lack large numbers of point mutations but instead incorporate additions and subtractions of fragments of the original RNAs. The system demonstrates that if such an equilibrium were established in a prebiotic world it would result in significant exploration of RNA sequence space and likely increased complexity. It remains to be seen if the variety of products produced is affected by the presence of small peptide oligomers.
Xiong, Shengwen; Borrego, Pedro; Ding, Xiaohui; Zhu, Yuanmei; Martins, Andreia; Chong, Huihui
2016-01-01
ABSTRACT Human immunodeficiency virus type 2 (HIV-2) has already spread to different regions worldwide, and currently about 1 to 2 million people have been infected, calling for new antiviral agents that are effective on both HIV-1 and HIV-2 isolates. T20 (enfuvirtide), a 36-mer peptide derived from the C-terminal heptad repeat region (CHR) of gp41, is the only clinically approved HIV-1 fusion inhibitor, but it easily induces drug resistance and is not active on HIV-2. In this study, we first demonstrated that the M-T hook structure was also vital to enhancing the binding stability and inhibitory activity of diverse CHR-based peptide inhibitors. We then designed a novel short peptide (23-mer), termed 2P23, by introducing the M-T hook structure, HIV-2 sequences, and salt bridge-forming residues. Promisingly, 2P23 was a highly stable helical peptide with high binding to the surrogate targets derived from HIV-1, HIV-2, and simian immunodeficiency virus (SIV). Consistent with this, 2P23 exhibited potent activity in inhibiting diverse subtypes of HIV-1 isolates, T20-resistant HIV-1 mutants, and a panel of primary HIV-2 isolates, HIV-2 mutants, and SIV isolates. Therefore, we conclude that 2P23 has high potential to be further developed for clinical use, and it is also an ideal tool for exploring the mechanisms of HIV-1/2- and SIV-mediated membrane fusion. IMPORTANCE The peptide drug T20 is the only approved HIV-1 fusion inhibitor, but it is not active on HIV-2 isolates, which have currently infected 1 to 2 million people and continue to spread worldwide. Recent studies have demonstrated that the M-T hook structure can greatly enhance the binding and antiviral activities of gp41 CHR-derived inhibitors, especially for short peptides that are otherwise inactive. By combining the hook structure, HIV-2 sequence, and salt bridge-based strategies, the short peptide 2P23 has been successfully designed. 
2P23 exhibits prominent advantages over many other peptide fusion inhibitors, including its potent and broad activity on HIV-1, HIV-2, and even SIV isolates, its stability as a helical, oligomeric peptide, and its high binding to diverse targets. The small size of 2P23 would benefit its synthesis and significantly reduce production cost. Therefore, 2P23 is an ideal candidate for further development, and it also provides a novel tool for studying HIV-1/2- and SIV-mediated cell fusion. PMID:27795437
Designing small universal k-mer hitting sets for improved analysis of high-throughput sequencing
Kingsford, Carl
2017-01-01
With the rapidly increasing volume of deep sequencing data, more efficient algorithms and data structures are needed. Minimizers are a central recent paradigm that has improved various sequence analysis tasks, including hashing for faster read overlap detection, sparse suffix arrays for creating smaller indexes, and Bloom filters for speeding up sequence search. Here, we propose an alternative paradigm that can lead to substantial further improvement in these and other tasks. For integers k and L > k, we say that a set of k-mers is a universal hitting set (UHS) if every possible L-long sequence must contain a k-mer from the set. We develop a heuristic called DOCKS to find a compact UHS, which works in two phases: The first phase is solved optimally, and for the second we propose several efficient heuristics, trading set size for speed and memory. The use of heuristics is motivated by showing the NP-hardness of a closely related problem. We show that DOCKS works well in practice and produces UHSs that are very close to a theoretical lower bound. We present results for various values of k and L and by applying them to real genomes show that UHSs indeed improve over minimizers. In particular, DOCKS uses less than 30% of the 10-mers needed to span the human genome compared to minimizers. The software and computed UHSs are freely available at github.com/Shamir-Lab/DOCKS/ and acgt.cs.tau.ac.il/docks/, respectively. PMID:28968408
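The definition of a universal hitting set given above can be checked directly for very small parameters. The Python sketch below enumerates every possible L-long sequence and verifies that each one contains at least one k-mer from the candidate set. It is a brute-force check of the definition only, exponential in L, and is not the DOCKS heuristic; the function name is an assumption for this example.

```python
from itertools import product


def is_universal_hitting_set(kmers, k, L, alphabet="ACGT"):
    """True if every sequence of length L contains a k-mer from kmers."""
    if not 0 < k < L:
        raise ValueError("requires 0 < k < L")
    kmers = set(kmers)
    for letters in product(alphabet, repeat=L):
        seq = "".join(letters)
        if not any(seq[i:i + k] in kmers for i in range(L - k + 1)):
            return False  # seq avoids the set entirely
    return True


all_2mers = {a + b for a in "ACGT" for b in "ACGT"}
# Dropping these four 2-mers still hits every 3-long sequence, because no
# 3-mer can consist of two of them back to back.
compact = all_2mers - {"AG", "AT", "CG", "CT"}
print(is_universal_hitting_set(compact, k=2, L=3))           # True
print(is_universal_hitting_set(compact - {"GA"}, k=2, L=3))  # False: AGA is missed
```

For k = 2 and L = 3, the 12-element set above shows that a UHS can be strictly smaller than the full set of k-mers, which is the property the paper exploits at realistic scales.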
Improving the performance of minimizers and winnowing schemes
Marçais, Guillaume; Pellow, David; Bork, Daniel; Orenstein, Yaron; Shamir, Ron; Kingsford, Carl
2017-01-01
Abstract Motivation: The minimizers scheme is a method for selecting k-mers from sequences. It is used in many bioinformatics software tools to bin comparable sequences or to sample a sequence in a deterministic fashion at approximately regular intervals, in order to reduce memory consumption and processing time. Although very useful, the minimizers selection procedure has undesirable behaviors (e.g. too many k-mers are selected when processing certain sequences). Some of these problems were already known to the authors of the minimizers technique, and the natural lexicographic ordering of k-mers used by minimizers was recognized as their origin. Many software tools using minimizers employ ad hoc variations of the lexicographic order to alleviate those issues. Results: We provide an in-depth analysis of the effect of k-mer ordering on the performance of the minimizers technique. By using small universal hitting sets (a recently defined concept), we show how to significantly improve the performance of minimizers and avoid some of its worse behaviors. Based on these results, we encourage bioinformatics software developers to use an ordering based on a universal hitting set or, if not possible, a randomized ordering, rather than the lexicographic order. This analysis also settles negatively a conjecture (by Schleimer et al.) on the expected density of minimizers in a random sequence. Availability and Implementation: The software used for this analysis is available on GitHub: https://github.com/gmarcais/minimizers.git. Contact: gmarcais@cs.cmu.edu or carlk@cs.cmu.edu PMID:28881970
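To make the role of the k-mer ordering concrete, here is a minimal Python sketch of the minimizers scheme with a pluggable ordering. In each window of `w` consecutive k-mers it selects the smallest k-mer under the given key (leftmost on ties) and records its position. The lexicographic order is the default; `random_order` is one assumed way to obtain a pseudo-random ordering by hashing, not the construction used in the paper, and the universal-hitting-set ordering is not implemented here.

```python
import hashlib


def minimizer_positions(seq, k, w, key=None):
    """Start positions of the k-mers selected as minimizers of seq."""
    key = key or (lambda kmer: kmer)  # default: lexicographic order
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        # Smallest k-mer under the ordering; ties go to the leftmost position.
        selected.add(min(window, key=lambda i: (key(kmers[i]), i)))
    return sorted(selected)


def random_order(kmer):
    """Pseudo-random but deterministic ordering derived from a hash."""
    return hashlib.sha256(kmer.encode()).hexdigest()


seq = "GATTACA"
print(minimizer_positions(seq, k=3, w=2))                    # [1, 3, 4]
print(minimizer_positions(seq, k=3, w=2, key=random_order))  # depends on the hash
```

The density of a scheme is the number of selected positions divided by the number of k-mers in the sequence; comparing that ratio across orderings on the same input is the kind of measurement the analysis is concerned with.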
Lee, Ji Yeon; Kim, You-Jin; Chung, Eun Hee; Kim, Dae-Won; Jeong, Ina; Kim, Yeonjae; Yun, Mi-Ran; Kim, Sung Soon; Kim, Gayeon; Joh, Joon-Sung
2017-07-14
In 2015, the largest outbreak of Middle East respiratory syndrome coronavirus (MERS-CoV) infection outside the Middle East occurred in South Korea. We summarized the epidemiological, clinical, and laboratory findings of the first Korean case of MERS-CoV and analyzed whole-genome sequences of MERS-CoV derived from the patient. A 68-year-old man developed fever and myalgia 7 days after returning to Korea, following a 10-day trip to the Middle East. Before diagnosis, he visited 4 hospitals, potentially resulting in secondary transmission to 28 patients. On admission to the National Medical Center (day 9, post-onset of clinical illness), he presented with drowsiness, hypoxia, and multiple patchy infiltrations on the chest radiograph. He was intubated (day 12) because of progressive acute respiratory distress syndrome (ARDS), and IFN-α2a and ribavirin treatment was commenced. The treatment course was prolonged by superimposed ventilator-associated pneumonia. MERS-CoV PCR results converted to negative from day 47 and the patient was discharged (day 137), following rehabilitation therapy. The complete genome sequence obtained from a sputum sample (taken on day 11) showed the highest sequence similarity (99.59%) with the virus from an outbreak in Riyadh, Saudi Arabia, in February 2015. The first case of MERS-CoV infection had high transmissibility and was associated with a severe clinical course. The patient made a successful recovery after early treatment with antiviral agents and adequate supportive care. This first case in South Korea became a super-spreader because of improper infection control measures, rather than variations of the virus.
Turtle: identifying frequent k-mers with cache-efficient algorithms.
Roy, Rajat Shuvro; Bhattacharya, Debashish; Schliep, Alexander
2014-07-15
Counting the frequencies of k-mers in read libraries is often a first step in the analysis of high-throughput sequencing data. Infrequent k-mers are assumed to be a result of sequencing errors. The frequent k-mers constitute a reduced but error-free representation of the experiment, which can inform read error correction or serve as the input to de novo assembly methods. Ideally, the memory requirement for counting should be linear in the number of frequent k-mers and not in the, typically much larger, total number of k-mers in the read library. We present a novel method that balances time, space and accuracy requirements to efficiently extract frequent k-mers even for high-coverage libraries and large genomes such as human. Our method is designed to minimize cache misses by using a pattern-blocked Bloom filter to remove infrequent k-mers from consideration in combination with a novel sort-and-compact scheme, instead of a hash, for the actual counting. Although this increases theoretical complexity, the savings in cache misses reduce the empirical running times. A variant of the method can resort to a counting Bloom filter for even larger savings in memory at the expense of false-negative rates in addition to the false-positive rates common to all Bloom filter-based approaches. A comparison with the state-of-the-art shows reduced memory requirements and running times. The tools are freely available for download at http://bioinformatics.rutgers.edu/Software/Turtle and http://figshare.com/articles/Turtle/791582. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
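The idea of using a Bloom filter to keep infrequent k-mers out of the counting structure can be sketched in a few lines of Python. In this simplified version a k-mer only enters the count table on its second sighting, so singletons (the presumed sequencing errors) cost Bloom filter bits rather than table entries. It deliberately substitutes a plain Bloom filter and a dictionary for Turtle's pattern-blocked Bloom filter and sort-and-compact scheme, and all class and function names are assumptions for illustration.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, n_bits=1 << 16, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


def frequent_kmers(reads, k, bloom):
    """Count k-mers seen at least twice; first sightings only touch the filter."""
    counts = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in counts:
                counts[kmer] += 1
            elif kmer in bloom:
                counts[kmer] = 2  # second sighting, or a rare false positive
            else:
                bloom.add(kmer)
    return counts


counts = frequent_kmers(["ACGTAC", "ACGTTT"], k=4, bloom=BloomFilter())
print(counts.get("ACGT"))  # 2: the only 4-mer shared by both reads
```

A false positive in the filter can promote a true singleton into the table with a count of 2, which is the false-positive behaviour the abstract attributes to all Bloom filter-based approaches.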
Bhadra, Sanchita; Jiang, Yu Sherry; Kumar, Mia R.; Johnson, Reed F.; Hensley, Lisa E.; Ellington, Andrew D.
2015-01-01
The Middle East respiratory syndrome coronavirus (MERS-CoV), an emerging human coronavirus, causes severe acute respiratory illness with a 35% mortality rate. In light of the recent surge in reported infections we have developed asymmetric five-primer reverse transcription loop-mediated isothermal amplification (RT-LAMP) assays for detection of MERS-CoV. Isothermal amplification assays will facilitate the development of portable point-of-care diagnostics that are crucial for management of emerging infections. The RT-LAMP assays are designed to amplify MERS-CoV genomic loci located within the open reading frame (ORF)1a and ORF1b genes and upstream of the E gene. Additionally we applied one-step strand displacement probes (OSD) for real-time sequence-specific verification of LAMP amplicons. Asymmetric amplification effected by incorporating a single loop primer in each assay accelerated the time-to-result of the OSD-RT-LAMP assays. The resulting assays could detect 0.02 to 0.2 plaque forming units (PFU) (5 to 50 PFU/ml) of MERS-CoV in infected cell culture supernatants within 30 to 50 min and did not cross-react with common human respiratory pathogens. PMID:25856093
Development of Animal Models Against Emerging Coronaviruses: From SARS to MERS coronavirus
Sutton, Troy C; Subbarao, Kanta
2016-01-01
Two novel coronaviruses have emerged to cause severe disease in humans. While bats may be the primary reservoir for both viruses, SARS coronavirus (SARS-CoV) likely crossed into humans from civets in China, and MERS coronavirus (MERS-CoV) has been transmitted from camels in the Middle East. Unlike SARS-CoV that resolved within a year, continued introductions of MERS-CoV present an on-going public health threat. Animal models are needed to evaluate countermeasures against emerging viruses. With SARS-CoV, several animal species were permissive to infection. In contrast, most laboratory animals are refractory or only semi-permissive to infection with MERS-CoV. This host-range restriction is largely determined by sequence heterogeneity in the MERS-CoV receptor. We describe animal models developed to study coronaviruses, with a focus on host-range restriction at the level of the viral receptor and discuss approaches to consider in developing a model to evaluate countermeasures against MERS-CoV. PMID:25791336
Development of animal models against emerging coronaviruses: From SARS to MERS coronavirus.
Sutton, Troy C; Subbarao, Kanta
2015-05-01
Two novel coronaviruses have emerged to cause severe disease in humans. While bats may be the primary reservoir for both viruses, SARS coronavirus (SARS-CoV) likely crossed into humans from civets in China, and MERS coronavirus (MERS-CoV) has been transmitted from camels in the Middle East. Unlike SARS-CoV that resolved within a year, continued introductions of MERS-CoV present an on-going public health threat. Animal models are needed to evaluate countermeasures against emerging viruses. With SARS-CoV, several animal species were permissive to infection. In contrast, most laboratory animals are refractory or only semi-permissive to infection with MERS-CoV. This host-range restriction is largely determined by sequence heterogeneity in the MERS-CoV receptor. We describe animal models developed to study coronaviruses, with a focus on host-range restriction at the level of the viral receptor and discuss approaches to consider in developing a model to evaluate countermeasures against MERS-CoV. Copyright © 2015. Published by Elsevier Inc.
Nguyen, Duc; Aden, Bashir; Al Bandar, Zyad; Al Dhaheri, Wafa; Abu Elkheir, Kheir; Khudair, Ahmed; Al Mulla, Mariam; El Saleh, Feda; Imambaccus, Hala; Al Kaabi, Nawal; Sheikh, Farrukh Amin; Sasse, Jurgen; Turner, Andrew; Abdel Wareth, Laila; Weber, Stefan; Al Ameri, Asma; Abu Amer, Wesal; Alami, Negar N.; Bunga, Sudhir; Haynes, Lia M.; Hall, Aron J.; Kallen, Alexander J.; Kuhar, David; Pham, Huong; Pringle, Kimberly; Tong, Suxiang; Whitaker, Brett L.; Gerber, Susan I.; Al Hosani, Farida Ismail
2016-01-01
Middle East respiratory syndrome coronavirus (MERS-CoV) infections sharply increased in the Arabian Peninsula during spring 2014. In Abu Dhabi, United Arab Emirates, these infections occurred primarily among healthcare workers and patients. To identify and describe epidemiologic and clinical characteristics of persons with healthcare-associated infection, we reviewed laboratory-confirmed MERS-CoV cases reported to the Health Authority of Abu Dhabi during January 1, 2013–May 9, 2014. Of 65 case-patients identified with MERS-CoV infection, 27 (42%) had healthcare-associated cases. Epidemiologic and genetic sequencing findings suggest that 3 healthcare clusters of MERS-CoV infection occurred, including 1 that resulted in 20 infected persons in 1 hospital. MERS-CoV in healthcare settings spread predominantly before MERS-CoV infection was diagnosed, underscoring the importance of increasing awareness and infection control measures at first points of entry to healthcare facilities. PMID:26981708
KAnalyze: a fast versatile pipelined K-mer toolkit
Audano, Peter; Vannberg, Fredrik
2014-01-01
Motivation: Converting nucleotide sequences into short overlapping fragments of uniform length, k-mers, is a common step in many bioinformatics applications. While existing software packages count k-mers, few are optimized for speed, offer an application programming interface (API), a graphical interface or contain features that make it extensible and maintainable. We designed KAnalyze to compete with the fastest k-mer counters, to produce reliable output and to support future development efforts through well-architected, documented and testable code. Currently, KAnalyze can output k-mer counts in a sorted tab-delimited file or stream k-mers as they are read. KAnalyze can process large datasets with 2 GB of memory. This project is implemented in Java 7, and the command line interface (CLI) is designed to integrate into pipelines written in any language. Results: As a k-mer counter, KAnalyze outperforms Jellyfish, DSK and a pipeline built on Perl and Linux utilities. Through extensive unit and system testing, we have verified that KAnalyze produces the correct k-mer counts over multiple datasets and k-mer sizes. Availability and implementation: KAnalyze is available on SourceForge: https://sourceforge.net/projects/kanalyze/ Contact: fredrik.vannberg@biology.gatech.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24642064
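The two output modes mentioned above (streaming k-mers as they are read, and writing counts to a sorted tab-delimited file) can be illustrated with a short Python sketch. This is not KAnalyze's Java API or its command line interface; the function names are hypothetical, and the in-memory counter used here would not scale to the dataset sizes the tool targets.

```python
import io
from collections import Counter


def stream_kmers(seq, k):
    """Yield overlapping k-mers one at a time, without storing them all."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def write_sorted_counts(seqs, k, out):
    """Write 'kmer<TAB>count' lines to out, sorted by k-mer."""
    counts = Counter()
    for seq in seqs:
        counts.update(stream_kmers(seq, k))
    for kmer in sorted(counts):
        out.write(f"{kmer}\t{counts[kmer]}\n")


buffer = io.StringIO()
write_sorted_counts(["ACGTACG"], k=3, out=buffer)
print(buffer.getvalue(), end="")
# ACG	2
# CGT	1
# GTA	1
# TAC	1
```

Writing plain tab-delimited text is what lets a counter of this kind slot into pipelines written in other languages.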
KAnalyze: a fast versatile pipelined k-mer toolkit.
Audano, Peter; Vannberg, Fredrik
2014-07-15
Converting nucleotide sequences into short overlapping fragments of uniform length, k-mers, is a common step in many bioinformatics applications. While existing software packages count k-mers, few are optimized for speed, offer an application programming interface (API), a graphical interface or contain features that make it extensible and maintainable. We designed KAnalyze to compete with the fastest k-mer counters, to produce reliable output and to support future development efforts through well-architected, documented and testable code. Currently, KAnalyze can output k-mer counts in a sorted tab-delimited file or stream k-mers as they are read. KAnalyze can process large datasets with 2 GB of memory. This project is implemented in Java 7, and the command line interface (CLI) is designed to integrate into pipelines written in any language. As a k-mer counter, KAnalyze outperforms Jellyfish, DSK and a pipeline built on Perl and Linux utilities. Through extensive unit and system testing, we have verified that KAnalyze produces the correct k-mer counts over multiple datasets and k-mer sizes. KAnalyze is available on SourceForge: https://sourceforge.net/projects/kanalyze/. © The Author 2014. Published by Oxford University Press.
Mariani, Luca; Weinand, Kathryn; Vedenko, Anastasia; Barrera, Luis A; Bulyk, Martha L
2017-09-27
Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggests that the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE in the analysis of the indirect binding motifs for the co-occurrence of tethering factors, suggesting novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs. Copyright © 2017 Elsevier Inc. All rights reserved.
Ceccarelli, A; Zhukovskaya, N; Kawata, T; Bozzaro, S; Williams, J
2000-12-01
The ecmB gene of Dictyostelium is expressed at culmination both in the prestalk cells that enter the stalk tube and in ancillary stalk cell structures such as the basal disc. Stalk tube-specific expression is regulated by sequence elements within the cap-site proximal part of the promoter, the stalk tube (ST) promoter region. Dd-STATa, a member of the STAT transcription factor family, binds to elements present in the ST promoter region and represses transcription prior to entry into the stalk tube. We have characterised an activatory DNA sequence element that lies distal to the repressor elements and that is both necessary and sufficient for expression within the stalk tube. We have mapped this activator to a 28 nucleotide region (the 28-mer) within which we have identified a GA-containing sequence element that is required for efficient gene transcription. The Dd-STATa protein binds to the 28-mer in an in vitro binding assay, and binding is dependent upon the GA-containing sequence. However, the ecmB gene is expressed in a Dd-STATa null mutant; therefore, Dd-STATa cannot be responsible for activating the 28-mer in vivo. Instead, we identified a distinct 28-mer binding activity in nuclear extracts from the Dd-STATa null mutant, this GA binding activity being largely masked in wild type extracts by the high affinity binding of the Dd-STATa protein. We suggest that, in addition to the long range repression exerted by binding to the two known repressor sites, Dd-STATa inhibits transcription by direct competition with this putative activator for binding to the GA sequence.
Improving the performance of minimizers and winnowing schemes.
Marçais, Guillaume; Pellow, David; Bork, Daniel; Orenstein, Yaron; Shamir, Ron; Kingsford, Carl
2017-07-15
The minimizers scheme is a method for selecting k-mers from sequences. It is used in many bioinformatics software tools to bin comparable sequences or to sample a sequence in a deterministic fashion at approximately regular intervals, in order to reduce memory consumption and processing time. Although very useful, the minimizers selection procedure has undesirable behaviors (e.g. too many k-mers are selected when processing certain sequences). Some of these problems were already known to the authors of the minimizers technique, and the natural lexicographic ordering of k-mers used by minimizers was recognized as their origin. Many software tools using minimizers employ ad hoc variations of the lexicographic order to alleviate those issues. We provide an in-depth analysis of the effect of k-mer ordering on the performance of the minimizers technique. By using small universal hitting sets (a recently defined concept), we show how to significantly improve the performance of minimizers and avoid some of its worse behaviors. Based on these results, we encourage bioinformatics software developers to use an ordering based on a universal hitting set or, if not possible, a randomized ordering, rather than the lexicographic order. This analysis also settles negatively a conjecture (by Schleimer et al.) on the expected density of minimizers in a random sequence. The software used for this analysis is available on GitHub: https://github.com/gmarcais/minimizers.git. Contact: gmarcais@cs.cmu.edu or carlk@cs.cmu.edu. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Trinucleotide cassettes increase diversity of T7 phage-displayed peptide library.
Krumpe, Lauren R H; Schumacher, Kathryn M; McMahon, James B; Makowski, Lee; Mori, Toshiyuki
2007-10-05
Amino acid sequence diversity is introduced into a phage-displayed peptide library by randomizing library oligonucleotide DNA. We recently evaluated the diversity of peptide libraries displayed on T7 lytic phage and M13 filamentous phage and showed that T7 phage can display a more diverse amino acid sequence repertoire due to differing processes of viral morphogenesis. In this study, we evaluated and compared the diversity of a 12-mer T7 phage-displayed peptide library randomized using codon-corrected trinucleotide cassettes with a T7 and an M13 12-mer phage-displayed peptide library constructed using the degenerate codon randomization method. We herein demonstrate that the combination of trinucleotide cassette amino acid codon randomization and T7 phage display construction methods resulted in a significant enhancement to the functional diversity of a 12-mer peptide library. This novel library exhibited superior amino acid uniformity and order-of-magnitude increases in amino acid sequence diversity as compared to degenerate codon randomized peptide libraries. Comparative analyses of the biophysical characteristics of the 12-mer peptide libraries revealed the trinucleotide cassette-randomized library to be a unique resource. The combination of T7 phage display and trinucleotide cassette randomization resulted in a novel resource for the potential isolation of binding peptides for new and previously studied molecular targets.
Comparing K-mer based methods for improved classification of 16S sequences.
Vinje, Hilde; Liland, Kristian Hovde; Almøy, Trygve; Snipen, Lars
2015-07-01
The need for precise and stable taxonomic classification is highly relevant in modern microbiology. Parallel to the explosion in the amount of sequence data accessible, there has also been a shift in focus for classification methods. Previously, alignment-based methods were the most applicable tools. Now, methods based on counting K-mers by sliding windows are the most interesting classification approach with respect to both speed and accuracy. Here, we present a systematic comparison of five different K-mer based classification methods for the 16S rRNA gene. The methods differ from each other both in data usage and modelling strategies. We have based our study on the commonly known and well-used naïve Bayes classifier from the RDP project, and four other methods were implemented and tested on two different data sets, on full-length sequences as well as fragments of typical read-length. The differences in classification error among the methods were small, but they were stable across both data sets tested. The Preprocessed nearest-neighbour (PLSNN) method performed best for full-length 16S rRNA sequences, significantly better than the naïve Bayes RDP method. On fragmented sequences the naïve Bayes Multinomial method performed best, significantly better than all other methods. For both data sets explored, and on both full-length and fragmented sequences, all five methods reached an error plateau. We conclude that no K-mer based method is universally best for classifying both full-length sequences and fragments (reads). All methods approach an error plateau, indicating that improved training data is needed to improve classification from here. Classification errors occur most frequently for genera with few sequences present. For improving the taxonomy and testing new classification methods, a better, more universal and more robust training data set is crucial.
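To make the idea of "counting K-mers by sliding windows" concrete, here is a toy sketch in Python of a multinomial naive Bayes classifier over K-mer counts. It is not any of the five implementations compared in the study: the class name, the add-one smoothing, the uniform prior over genera, and the tiny K are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping K-mers with a sliding window of width k."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


class MultinomialKmerNB:
    """Toy multinomial naive Bayes over K-mer counts (add-one smoothing)."""

    def __init__(self, k, alphabet="ACGT"):
        self.k = k
        self.vocab = ["".join(p) for p in product(alphabet, repeat=k)]
        self.log_prob = {}  # label -> {kmer: log P(kmer | label)}

    def fit(self, sequences, labels):
        # Pool the K-mer counts of all training sequences sharing a label.
        pooled = {}
        for seq, label in zip(sequences, labels):
            pooled.setdefault(label, Counter()).update(kmer_counts(seq, self.k))
        for label, counts in pooled.items():
            total = sum(counts[m] for m in self.vocab) + len(self.vocab)
            self.log_prob[label] = {
                m: math.log((counts[m] + 1) / total) for m in self.vocab
            }
        return self

    def predict(self, seq):
        counts = kmer_counts(seq, self.k)

        def score(label):
            table = self.log_prob[label]
            # K-mers outside the vocabulary (e.g. containing N) are ignored.
            return sum(n * table[m] for m, n in counts.items() if m in table)

        # Uniform prior over labels (an assumption), so only likelihoods matter.
        return max(self.log_prob, key=score)
```

A genus represented by few training sequences contributes few pooled counts, so its K-mer probabilities are dominated by the smoothing term; this is one way to picture why errors concentrate in sparsely represented genera.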
The effects of downwelling radiance on MER surface spectra: the evil that atmospheres do
NASA Astrophysics Data System (ADS)
Wolff, M.; Ghosh, A.; Arvidson, R.; Christensen, P.; Guinness, E.; Ruff, S.; Seelos, F.; Smith, M.; Athena Science
2004-11-01
While it may not be surprising to some that downwelling radiation in the martian atmosphere may contribute a non-negligible fraction of the radiance for a given surface scene, others remain shocked and surprised (and often dismayed) to discover this fact, particularly with regard to mini-TES observations. Naturally, the relative amplitude of this sky "contamination" is often a complicated function of meteorological conditions, viewing geometry, surface properties, and (for the IR) surface temperature. Ideally, one would use specialized observations to mimic the actual hemispherical-directional nature of the problem. Despite repeated attempts to obtain Pancam complete sky observations and mini-TES sky octants, such observations are not available in the MER observational database. As a result, one is left with the less-enviable, though certainly more computationally intensive, task of connecting point observations (radiance and derived meteorological parameters) to a hemispherical integral of downwelling radiance. Naturally, one must turn to a radiative transfer analysis, despite oft-repeated attempts to assert otherwise. In our presentation, we offer insight into the conditions under which one must worry about atmospheric removal, as well as semi-empirical approaches (based upon said radiative transfer efforts) for producing the correction factors from the available MER atmospheric observations. This work is proudly supported by the MER program through NASA/JPL Contract No. 1242889 (MJW), as well as the contracts for the co-authors.
A mouse model for MERS coronavirus-induced acute respiratory distress syndrome.
Cockrell, Adam S; Yount, Boyd L; Scobey, Trevor; Jensen, Kara; Douglas, Madeline; Beall, Anne; Tang, Xian-Chun; Marasco, Wayne A; Heise, Mark T; Baric, Ralph S
2016-11-28
Middle East respiratory syndrome coronavirus (MERS-CoV) is a novel virus that emerged in 2012, causing acute respiratory distress syndrome (ARDS), severe pneumonia-like symptoms and multi-organ failure, with a case fatality rate of ∼36%. Limited clinical studies indicate that humans infected with MERS-CoV exhibit pathology consistent with the late stages of ARDS, which is reminiscent of the disease observed in patients infected with severe acute respiratory syndrome coronavirus. Models of MERS-CoV-induced severe respiratory disease have been difficult to achieve, and small-animal models traditionally used to investigate viral pathogenesis (mouse, hamster, guinea-pig and ferret) are naturally resistant to MERS-CoV. Therefore, we used CRISPR-Cas9 gene editing to modify the mouse genome to encode two amino acids (positions 288 and 330) that match the human sequence in the dipeptidyl peptidase 4 receptor, making mice susceptible to MERS-CoV infection and replication. Serial MERS-CoV passage in these engineered mice was then used to generate a mouse-adapted virus that replicated efficiently within the lungs and evoked symptoms indicative of severe ARDS, including decreased survival, extreme weight loss, decreased pulmonary function, pulmonary haemorrhage and pathological signs indicative of end-stage lung disease. Importantly, therapeutic countermeasures comprising MERS-CoV neutralizing antibody treatment or a MERS-CoV spike protein vaccine protected the engineered mice against MERS-CoV-induced ARDS.
Gomes, Sílvia; Numata, Keiji; Leonor, Isabel B.; Mano, João F.; Reis, Rui L.; Kaplan, David L.
2011-01-01
Atomic force microscopy (AFM) was used to assess a new chimeric protein consisting of a fusion protein of the consensus repeat for Nephila clavipes spider dragline protein and bone sialoprotein (6mer+BSP). The elastic modulus of this protein in film form was assessed through force curves, and film surface roughness was also determined. The results showed a significant difference between the elastic modulus of the chimeric silk protein, 6mer+BSP, and control films consisting of only the silk component (6mer). The behaviour of the 6mer+BSP and 6mer proteins in aqueous solution in the presence of calcium (Ca) ions was also assessed to determine interactions between the inorganic and organic components related to bone interactions, anchoring and biomaterial network formation. The results demonstrated the formation of protein networks in the presence of Ca2+ ions, characteristics that may be important in the context of controlling materials assembly and properties related to bone-formation with this new chimeric silk-BSP protein. PMID:21370930
Gomes, Sílvia; Numata, Keiji; Leonor, Isabel B; Mano, João F; Reis, Rui L; Kaplan, David L
2011-05-09
Atomic force microscopy (AFM) was used to assess a new chimeric protein consisting of a fusion protein of the consensus repeat for Nephila clavipes spider dragline protein and bone sialoprotein (6mer+BSP). The elastic modulus of this protein in film form was assessed through force curves, and film surface roughness was also determined. The results showed a significant difference between the elastic modulus of the chimeric silk protein, 6mer+BSP, and control films consisting of only the silk component (6mer). The behavior of the 6mer+BSP and 6mer proteins in aqueous solution in the presence of calcium (Ca) ions was also assessed to determine interactions between the inorganic and organic components related to bone interactions, anchoring, and biomaterial network formation. The results demonstrated the formation of protein networks in the presence of Ca(2+) ions, characteristics that may be important in the context of controlling materials assembly and properties related to bone formation with this new chimeric silk-BSP protein.
Length and sequence variability in mitochondrial control region of the milkfish, Chanos chanos.
Ravago, Rachel G; Monje, Virginia D; Juinio-Meñez, Marie Antonette
2002-01-01
Extensive length variability was observed in the mitochondrial control region of the milkfish, Chanos chanos. The nucleotide sequence of the control region and flanking regions was determined. Length variability and heteroplasmy were due to the presence of varying numbers of a 41-bp tandemly repeated sequence and a 48-bp insertion/deletion (indel). The structure and organization of the milkfish control region is similar to that of other teleost fish and vertebrates. However, extensive variation in the copy number of tandem repeats (4-20 copies) and the presence of a relatively large (48-bp) indel are apparently uncommon in teleost fish control region sequences reported to date. High sequence variability of control region peripheral domains indicates the potential utility of selected regions as markers for population-level studies.
The genome sequence of sweet cherry (Prunus avium) for use in genomics-assisted breeding.
Shirasawa, Kenta; Isuzugawa, Kanji; Ikenaga, Mitsunobu; Saito, Yutaro; Yamamoto, Toshiya; Hirakawa, Hideki; Isobe, Sachiko
2017-10-01
We determined the genome sequence of sweet cherry (Prunus avium) using next-generation sequencing technology. The total length of the assembled sequences was 272.4 Mb, consisting of 10,148 scaffold sequences with an N50 length of 219.6 kb. The sequences covered 77.8% of the 352.9 Mb sweet cherry genome, as estimated by k-mer analysis, and included >96.0% of the core eukaryotic genes. We predicted 43,349 complete and partial protein-encoding genes. A high-density consensus map with 2,382 loci was constructed using double-digest restriction site-associated DNA sequencing. Comparing the genetic maps of sweet cherry and peach revealed high synteny between the two genomes; thus the scaffolds were integrated into pseudomolecules using map- and synteny-based strategies. Whole-genome resequencing of six modern cultivars found 1,016,866 SNPs and 162,402 insertions/deletions, out of which 0.7% were deleterious. The sequence variants, as well as simple sequence repeats, can be used as DNA markers. The genomic information helps us to identify agronomically important genes and will accelerate genetic studies and breeding programs for sweet cherries. Further information on the genomic sequences and DNA markers is available in DBcherry (http://cherry.kazusa.or.jp (8 May 2017, date last accessed)). © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
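The N50 length quoted for the scaffolds is a standard assembly summary statistic. As a small illustration (not the pipeline used by the authors), the Python below computes it from a list of sequence lengths under the usual definition: the largest length L such that sequences of length at least L together hold at least half of the assembled bases. The function name is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of the overall total.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")
```

For instance, scaffold lengths of 40, 30, 20 and 10 give an N50 of 30, because the two longest scaffolds already cover 70 of the 100 total bases.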
Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.
2007-01-01
We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising fewer than 55 amino acid residues that occurs more than once in the protein sequence and is sometimes present in tandem. A “domain” corresponds to a conserved region with more than 55 amino acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.
Song, Li; Florea, Liliana
2015-01-01
Next-generation sequencing of cellular RNA (RNA-seq) is rapidly becoming the cornerstone of transcriptomic analysis. However, sequencing errors in the already short RNA-seq reads complicate bioinformatics analyses, in particular alignment and assembly. Error correction methods have been highly effective for whole-genome sequencing (WGS) reads, but are unsuitable for RNA-seq reads, owing to the variation in gene expression levels and alternative splicing. We developed a k-mer based method, Rcorrector, to correct random sequencing errors in Illumina RNA-seq reads. Rcorrector uses a De Bruijn graph to compactly represent all trusted k-mers in the input reads. Unlike WGS read correctors, which use a global threshold to determine trusted k-mers, Rcorrector computes a local threshold at every position in a read. Rcorrector has an accuracy higher than or comparable to existing methods, including the only other method (SEECER) designed for RNA-seq reads, and is more time and memory efficient. With a 5 GB memory footprint for 100 million reads, it can be run on virtually any desktop or server. The software is available free of charge under the GNU General Public License from https://github.com/mourisl/Rcorrector/.
CAFE: aCcelerated Alignment-FrEe sequence analysis
Lu, Yang Young; Tang, Kujin; Ren, Jie; Fuhrman, Jed A.; Waterman, Michael S.
2017-01-01
Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE. PMID:28472388
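As a rough illustration of a measure "based solely on the k-mer frequencies", the Python sketch below computes a cosine-style dissimilarity between the k-mer frequency vectors of two sequences. This is not CAFE's code, and the exact normalization (one half of one minus the cosine similarity) is an assumption about how such a measure is commonly defined; the background-corrected measures $d_2^*$ and $d_2^S$ named in the abstract are not implemented here.

```python
import math
from collections import Counter


def kmer_freqs(seq, k):
    """Relative frequencies of overlapping k-mers in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k")
    return {kmer: n / total for kmer, n in counts.items()}


def d2_dissimilarity(seq_a, seq_b, k):
    """0.5 * (1 - cosine similarity) of the two k-mer frequency vectors.

    Ranges from 0 (identical k-mer profiles) to 0.5 (no shared k-mers).
    """
    fa, fb = kmer_freqs(seq_a, k), kmer_freqs(seq_b, k)
    dot = sum(value * fb.get(kmer, 0.0) for kmer, value in fa.items())
    norm_a = math.sqrt(sum(v * v for v in fa.values()))
    norm_b = math.sqrt(sum(v * v for v in fb.values()))
    return 0.5 * (1.0 - dot / (norm_a * norm_b))
```

Because only one pass over each sequence is needed to build the frequency table, measures of this kind are cheap, which is the contrast the abstract draws with the more expensive background-adjusted measures.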
Reverse transcription polymerase chain reaction protocols for cloning small circular RNAs.
Navarro, B; Daròs, J A; Flores, R
1998-07-01
A protocol is described for general application for cloning small circular RNAs which requires only minimal amounts of template (approximately 50 ng) of unknown sequence. Both cDNA strands are synthesized with a 26-mer primer whose six 3'-terminal positions are totally degenerate in two consecutive reactions catalyzed by reverse transcriptase and DNA polymerase, respectively. The cDNAs are then PCR-amplified, using a 20-mer primer with the non-degenerate sequence of the previous primer, cloned and sequenced. This information permits the synthesis of one or more pairs of specific and adjacent primers for obtaining full-length cDNA clones by a protocol which is also described.
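A primer with a fixed 5' portion and six fully degenerate 3'-terminal positions is really a pool of 4^6 = 4096 distinct sequences. The short Python sketch below enumerates such a pool; the function name is hypothetical, and the fixed sequence must be supplied by the caller because the abstract does not give it.

```python
from itertools import product


def expand_degenerate_tail(fixed_5prime, n_degenerate, alphabet="ACGT"):
    """Yield every primer made of the fixed 5' part plus a degenerate 3' tail."""
    for tail in product(alphabet, repeat=n_degenerate):
        yield fixed_5prime + "".join(tail)
```

With a 20-nucleotide fixed part and `n_degenerate=6`, the generator yields 4096 26-mers, each sharing the 20-mer that serves as the non-degenerate primer in the subsequent PCR step.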
Multiplexing Short Primers for Viral Family PCR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, S N; Hiddessen, A L; Hara, C A
We describe a Multiplex Primer Prediction (MPP) algorithm to build multiplex compatible primer sets for large, diverse, and unalignable sets of target sequences. The MPP algorithm is scalable to larger target sets than other available software, and it does not require a multiple sequence alignment. We applied it to questions in viral detection, and demonstrated that there are no universally conserved priming sequences among viruses and that it could require an unfeasibly large number of primers (~3700 18-mers or ~2000 10-mers) to generate amplicons from all sequenced viruses. We then designed primer sets separately for each viral family, and for several diverse species such as foot-and-mouth disease virus, hemagglutinin and neuraminidase segments of influenza A virus, Norwalk virus, and HIV-1.
Middle East respiratory syndrome coronavirus (MERS-CoV): animal to human interaction
Omrani, Ali S.; Al-Tawfiq, Jaffar A.
2015-01-01
The Middle East respiratory syndrome coronavirus (MERS-CoV) is a novel enzootic betacoronavirus that was first described in September 2012. The clinical spectrum of MERS-CoV infection in humans ranges from an asymptomatic or mild respiratory illness to severe pneumonia and multi-organ failure; overall mortality is around 35.7%. Bats harbour several betacoronaviruses that are closely related to MERS-CoV but more research is needed to establish the relationship between bats and MERS-CoV. The seroprevalence of MERS-CoV antibodies is very high in dromedary camels in Eastern Africa and the Arabian Peninsula. MERS-CoV RNA and viable virus have been isolated from dromedary camels, including some with respiratory symptoms. Furthermore, near-identical strains of MERS-CoV have been isolated from epidemiologically linked humans and camels, confirming inter-transmission, most probably from camels to humans. Though inter-human spread within health care settings is responsible for the majority of reported MERS-CoV cases, the virus is incapable at present of causing sustained human-to-human transmission. Clusters can be readily controlled with implementation of appropriate infection control procedures. Phylogenetic and sequencing data strongly suggest that MERS-CoV originated from bat ancestors after undergoing a recombination event in the spike protein, possibly in dromedary camels in Africa, before its exportation to the Arabian Peninsula along the camel trading routes. MERS-CoV serosurveys are needed to investigate possible unrecognized human infections in Africa. Amongst the important measures to control MERS-CoV spread are strict regulation of camel movement, regular herd screening and isolation of infected camels, use of personal protective equipment by camel handlers and enforcing rules banning all consumption of unpasteurized camel milk and urine. PMID:26924345
Engineering tobacco to remove mercury from polluted soil.
Chang, S; Wei, F; Yang, Y; Wang, A; Jin, Z; Li, J; He, Y; Shu, H
2015-04-01
Tobacco is an ideal plant for modification to remove mercury from soil. Although several transgenic tobacco strains have been developed, they either release elemental mercury directly into the air or are only capable of accumulating small quantities of mercury. In this study, we constructed two transgenic tobacco lines: Ntk-7 (a tobacco plant transformed with merT-merP-merB1-merB2-ppk) and Ntp-36 (tobacco transformed with merT-merP-merB1-merB2-pcs1). The genes merT, merP, merB1, and merB2 were obtained from the well-known mercury-resistant bacterium Pseudomonas K-62. Ppk is a gene that encodes polyphosphate kinase, a key enzyme for synthesizing polyphosphate in Enterobacter aerogenes. Pcs1 is a tobacco gene that encodes phytochelatin synthase, which is the key enzyme for phytochelatin synthesis. The genes were linked with LP4/2A, a sequence that encodes a well-known linker peptide. The results demonstrate that all foreign genes can be abundantly expressed. The mercury resistance of Ntk-7 and Ntp-36 was much higher than that of the wild type whether tested with organic mercury or with mercuric ions. The transformed plants can accumulate significantly more mercury than the wild type, and Ntp-36 can accumulate more mercury from soil than Ntk-7. In mercury-polluted soil, the mercury content in Ntp-36's root can reach up to 251 μg/g. This is the first report to indicate that engineered tobacco can not only accumulate mercury from soil but also retain this mercury within the plant. Ntp-36 has good prospects for application in bioremediation for mercury pollution.
Stolc, Viktor; Samanta, Manoj Pratim; Tongprasit, Waraporn; Sethi, Himanshu; Liang, Shoudan; Nelson, David C.; Hegeman, Adrian; Nelson, Clark; Rancour, David; Bednarek, Sebastian; Ulrich, Eldon L.; Zhao, Qin; Wrobel, Russell L.; Newman, Craig S.; Fox, Brian G.; Phillips, George N.; Markley, John L.; Sussman, Michael R.
2005-01-01
Using a maskless photolithography method, we produced DNA oligonucleotide microarrays with probe sequences tiled throughout the genome of the plant Arabidopsis thaliana. RNA expression was determined for the complete nuclear, mitochondrial, and chloroplast genomes by tiling 5 million 36-mer probes. These probes were hybridized to labeled mRNA isolated from liquid grown T87 cells, an undifferentiated Arabidopsis cell culture line. Transcripts were detected from at least 60% of the nearly 26,330 annotated genes, which included 151 predicted genes that were not identified previously by a similar genome-wide hybridization study on four different cell lines. In comparison with previously published results with 25-mer tiling arrays produced by chromium masking-based photolithography technique, 36-mer oligonucleotide probes were found to be more useful in identifying intron–exon boundaries. Using two-dimensional HPLC tandem mass spectrometry, a small-scale proteomic analysis was performed with the same cells. A large amount of strongly hybridizing RNA was found in regions “antisense” to known genes. Similarity of antisense activities between the 25-mer and 36-mer data sets suggests that it is a reproducible and inherent property of the experiments. Transcription activities were also detected for many of the intergenic regions and the small RNAs, including tRNA, small nuclear RNA, small nucleolar RNA, and microRNA. Expression of tRNAs correlates with genome-wide amino acid usage. PMID:15755812
An, Na; Fleming, Aaron M.; Middleton, Eric G.; Burrows, Cynthia J.
2014-01-01
Human telomeric DNA consists of tandem repeats of the sequence 5′-TTAGGG-3′ that can fold into various G-quadruplexes, including the hybrid, basket, and propeller folds. In this report, we demonstrate use of the α-hemolysin ion channel to analyze these subtle topological changes at a nanometer scale by providing structure-dependent electrical signatures through DNA–protein interactions. Whereas the dimensions of hybrid and basket folds allowed them to enter the protein vestibule, the propeller fold exceeds the size of the latch region, producing only brief collisions. After attaching a 25-mer poly-2′-deoxyadenosine extension to these structures, unraveling kinetics also were evaluated. Both the locations where the unfolding processes occur and the molecular shapes of the G-quadruplexes play important roles in determining their unfolding profiles. These results provide insights into the application of α-hemolysin as a molecular sieve to differentiate nanostructures as well as the potential technical hurdles DNA secondary structures may present to nanopore technology. PMID:25225404
Evaluation of candidate vaccine approaches for MERS-CoV
Wang, Lingshu; Shi, Wei; Joyce, M. Gordon; ...
2015-07-28
The emergence of Middle East respiratory syndrome coronavirus (MERS-CoV) as a cause of severe respiratory disease highlights the need for effective approaches to CoV vaccine development. Efforts focused solely on the receptor-binding domain (RBD) of the viral Spike (S) glycoprotein may not optimize neutralizing antibody (NAb) responses. Here we show that immunogens based on full-length S DNA and S1 subunit protein elicit robust serum-neutralizing activity against several MERS-CoV strains in mice and non-human primates. Serological analysis and isolation of murine monoclonal antibodies revealed that immunization elicits NAbs to RBD and to non-RBD portions of the S1 and S2 subunits. Multiple neutralization mechanisms were demonstrated by solving the atomic structure of a NAb-RBD complex, through sequencing of neutralization escape viruses and by constructing MERS-CoV S variants for serological assays. Immunization of rhesus macaques confers protection against MERS-CoV-induced radiographic pneumonia, as assessed using computerized tomography, supporting this strategy as a promising approach for MERS-CoV vaccine development.
Artz, Jacob H.; White, Spencer N.; Zadvornyy, Oleg A.; Fugate, Corey J.; Hicks, Danny; Gauss, George H.; Posewitz, Matthew C.; Boyd, Eric S.; Peters, John W.
2015-01-01
Mercuric ion reductase (MerA), a mercury detoxification enzyme, has been tuned by evolution to have high specificity for mercuric ions (Hg2+) and to catalyze their reduction to a more volatile, less toxic elemental form. Here, we present a biochemical and structural characterization of MerA from the thermophilic crenarchaeon Metallosphaera sedula. MerA from M. sedula is a thermostable enzyme, and remains active after extended incubation at 97°C. At 37°C, the NADPH oxidation-linked Hg2+ reduction specific activity was found to be 1.9 μmol/min⋅mg, increasing to 3.1 μmol/min⋅mg at 70°C. M. sedula MerA crystals were obtained and the structure was solved to 1.6 Å, representing the first solved crystal structure of a thermophilic MerA. Comparison of both the crystal structure and amino acid sequence of MerA from M. sedula to mesophilic counterparts provides new insights into the structural determinants that underpin the thermal stability of the enzyme. PMID:26217660
Amziane, Meriam; Darenfed-Bouanane, Amel; Abderrahmani, Ahmed; Selama, Okba; Jouadi, Lydia; Cayol, Jean-Luc; Nateche, Farida; Fardeau, Marie-Laure
2017-02-01
A Gram-positive, moderately halophilic, endospore-forming bacterium, designated MerVT, was isolated from a sediment sample of a saline lake located in Ain Salah, south of Algeria. The cells were rod shaped and motile. Isolate MerVT grew at salinities of 0.5-25% NaCl (optimum, 5-10%), pH 6.0-12.0 (optimum, 8.0), and temperatures between 10 and 40 °C (optimum, 30 °C). The polar lipids comprised diphosphatidylglycerol, phosphatidylglycerol, a glycolipid, a phospholipid, and two lipids, and MK-7 was the predominant menaquinone. The predominant cellular fatty acids were anteiso-C15:0 and anteiso-C17:0. The DNA G+C content was 45.3 mol%. Phylogenetic analysis based on 16S rRNA gene sequence comparisons revealed that strain MerVT was most closely related to Virgibacillus halodenitrificans (gene sequence similarity of 97.0%). On the basis of phenotypic and chemotaxonomic properties and phylogenetic analyses, strain MerVT (=DSM 28944T) should be placed in the genus Virgibacillus as a novel species, for which the name Virgibacillus ainsalahensis is proposed.
Li, S; Cullen, D; Hjort, M; Spear, R; Andrews, J H
1996-01-01
Aureobasidium pullulans, a cosmopolitan yeast-like fungus, colonizes leaf surfaces and has potential as a biocontrol agent of pathogens. To assess the feasibility of rRNA as a target for A. pullulans-specific oligonucleotide probes, we compared the nucleotide sequences of the small-subunit rRNA (18S) genes of 12 geographically diverse A. pullulans strains. Extreme sequence conservation was observed. The consensus A. pullulans sequence was compared with other fungal sequences to identify potential probes. A 21-mer probe which hybridized to the 12 A. pullulans strains but not to 98 other fungi, including 82 isolates from the phylloplane, was identified. A 17-mer highly specific for Cladosporium herbarum was also identified. These probes have potential in monitoring and quantifying fungi in leaf surface and other microbial communities. PMID:8633850
Wang, Y.; Boyd, E.; Crane, S.; Lu-Irving, P.; Krabbenhoft, D.; King, S.; Dighton, J.; Geesey, G.; Barkay, T.
2011-01-01
The distribution and phylogeny of extant protein-encoding genes recovered from geochemically diverse environments can provide insight into the physical and chemical parameters that led to the origin and which constrained the evolution of a functional process. Mercuric reductase (MerA) plays an integral role in mercury (Hg) biogeochemistry by catalyzing the transformation of Hg(II) to Hg(0). Putative merA sequences were amplified from DNA extracts of microbial communities associated with mats and sulfur precipitates from physicochemically diverse Hg-containing springs in Yellowstone National Park, Wyoming, using four PCR primer sets that were designed to capture the known diversity of merA. The recovery of novel and deeply rooted MerA lineages from these habitats supports previous evidence that indicates merA originated in a thermophilic environment. Generalized linear models indicate that the distribution of putative archaeal merA lineages was constrained by a combination of pH, dissolved organic carbon, dissolved total mercury and sulfide. The models failed to identify statistically well supported trends for the distribution of putative bacterial merA lineages as a function of these or other measured environmental variables, suggesting that these lineages were either influenced by environmental parameters not considered in the present study, or the bacterial primer sets were designed to target too broad a class of genes which may have responded differently to environmental stimuli. The widespread occurrence of merA in the geothermal environments implies a prominent role for Hg detoxification in these environments. Moreover, the differences in the distribution of the merA genes amplified with the four merA primer sets suggest that the organisms putatively engaged in this activity have evolved to occupy different ecological niches within the geothermal gradient. © 2011 Springer Science+Business Media, LLC.
Recapitulating phylogenies using k-mers: from trees to networks.
Bernard, Guillaume; Ragan, Mark A; Chan, Cheong Xin
2016-01-01
Ernst Haeckel based his landmark Tree of Life on the supposed ontogenic recapitulation of phylogeny, i.e. that successive embryonic stages during the development of an organism re-trace the morphological forms of its ancestors over the course of evolution. Much of this idea has since been discredited. Today, phylogenies are often based on families of molecular sequences. The standard approach starts with a multiple sequence alignment, in which the sequences are arranged relative to each other in a way that maximises a measure of similarity position-by-position along their entire length. A tree (or sometimes a network) is then inferred. Rigorous multiple sequence alignment is computationally demanding, and evolutionary processes that shape the genomes of many microbes (bacteria, archaea and some morphologically simple eukaryotes) can add further complications. In particular, recombination, genome rearrangement and lateral genetic transfer undermine the assumptions that underlie multiple sequence alignment, and imply that a tree-like structure may be too simplistic. Here, using genome sequences of 143 bacterial and archaeal genomes, we construct a network of phylogenetic relatedness based on the number of shared k-mers (subsequences at fixed length k). Our findings suggest that the network captures not only key aspects of microbial genome evolution as inferred from a tree, but also features that are not treelike. The method is highly scalable, allowing for investigation of genome evolution across a large number of genomes. Instead of using specific regions or sequences from genome sequences, or indeed Haeckel's idea of ontogeny, we argue that genome phylogenies can be inferred using k-mers from whole-genome sequences. Representing these networks dynamically allows biological questions of interest to be formulated and addressed quickly and in a visually intuitive manner.
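The idea of linking genomes by their shared k-mers can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequences, and the `min_shared` threshold are assumptions made here for illustration, and the abstract does not say how shared k-mer counts are turned into edge weights or distances.

```python
from itertools import combinations


def kmer_set(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_edges(genomes, k, min_shared=1):
    """Return edges (name_a, name_b, n_shared) for every pair of genomes
    that share at least min_shared distinct k-mers (hypothetical threshold)."""
    sets = {name: kmer_set(seq, k) for name, seq in genomes.items()}
    edges = []
    for a, b in combinations(sorted(sets), 2):
        n_shared = len(sets[a] & sets[b])
        if n_shared >= min_shared:
            edges.append((a, b, n_shared))
    return edges


# Toy sequences, invented for this example only.
toy = {"g1": "ACGTACGTGA", "g2": "TTACGTACGA", "g3": "GGGGGGGGGG"}
edges = shared_kmer_edges(toy, k=4)
print(edges)
```

In this toy case only g1 and g2 are connected, because g3 shares no 4-mer with either. Unlike a tree, such an edge list can place a genome next to several unrelated neighbours at once.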
SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform.
Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue
2018-05-02
Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. A new alignment-free sequence similarity analysis method, called SSAW, is proposed. SSAW stands for Sequence Similarity Analysis using the Stationary Discrete Wavelet Transform (SDWT). It extracts k-mers from a sequence, then maps each k-mer to a complex number field. Then, the series of complex numbers formed is transformed into feature vectors using the stationary discrete wavelet transform. After these steps, the original sequence is turned into a feature vector with numeric values, which can then be used for clustering and/or classification. Using two different types of applications, namely, clustering and classification, we compared SSAW against state-of-the-art alignment-free sequence analysis methods. SSAW demonstrates competitive or superior performance in terms of standard indicators, such as accuracy, F-score, precision, and recall. The running time was significantly better in most cases. These results make SSAW a suitable method for sequence analysis, especially given the rapidly increasing volumes of sequence data required by most modern applications.
Melicher, Dacotah; Torson, Alex S; Dworkin, Ian; Bowsher, Julia H
2014-03-12
The Sepsidae family of flies is a model for investigating how sexual selection shapes courtship and sexual dimorphism in a comparative framework. However, like many non-model systems, there are few molecular resources available. Large-scale sequencing and assembly have not been performed in any sepsid, and the lack of a closely related genome makes investigation of gene expression challenging. Our goal was to develop an automated pipeline for de novo transcriptome assembly, and to use that pipeline to assemble and analyze the transcriptome of the sepsid Themira biloba. Our bioinformatics pipeline uses cloud computing services to assemble and analyze the transcriptome with off-site data management, processing, and backup. It uses a multiple k-mer length approach combined with a second meta-assembly to extend transcripts and recover more bases of transcript sequences than standard single k-mer assembly. We used 454 sequencing to generate 1.48 million reads from cDNA generated from embryo, larva, and pupae of T. biloba and assembled a transcriptome consisting of 24,495 contigs. Annotation identified 16,705 transcripts, including those involved in embryogenesis and limb patterning. We assembled transcriptomes from an additional three non-model organisms to demonstrate that our pipeline assembled a higher-quality transcriptome than single k-mer approaches across multiple species. The pipeline we have developed for assembly and analysis increases contig length, recovers unique transcripts, and assembles more base pairs than other methods through the use of a meta-assembly. The T. biloba transcriptome is a critical resource for performing large-scale RNA-Seq investigations of gene expression patterns, and is the first transcriptome sequenced in this Dipteran family.
In vitro and in vivo characterization of a reversible synthetic heparin analog.
Whelihan, Matthew F; Cooley, Brian; Xu, Yongmei; Pawlinski, Rafal; Liu, Jian; Key, Nigel S
2016-02-01
The global supply of unfractionated heparin (UFH) and all commercially available low molecular weight heparins (LMWH) remain dependent on animal sources, such as porcine intestine or bovine lung. Recent experience has shown that contamination of the supply chain (with over-sulfated chondroitin sulfates) can result in lethal toxicity. Fondaparinux is currently the only commercially available synthetic analog of heparin. We recently described a new class of chemoenzymatically synthesized heparin analogs. One of these compounds (S12-mer) is a dodecasaccharide consisting of an antithrombin-binding moiety with repeating units of IdoA2S-GlcNS6S and two 3-O-sulfate groups that confer the ability to bind protamine. We sought to further characterize this new compound in vitro using biochemical and global coagulation assays and in vivo using thrombosis and hemostasis assays. The anticoagulant activities of the Super 12-mer (S12-mer) and Enoxaparin in anti-factor Xa and plasma-based thrombin generation assays were roughly equivalent with a 50% reduction in peak thrombin generation occurring at approximately 325nM. When protamine was titrated against a fixed concentration of S12-mer in plasma or blood, the S12-mer displayed a significant restitution of thrombin generation and clot formation. In vivo, S12-mer inhibited venous thrombosis to a similar extent as Enoxaparin, with similar bleeding profiles. These data show that the S12-mer has almost identical efficacy to Enoxaparin in terms of FXa inhibition, while displaying significant reversibility with protamine. Taken together with the ability to ensure purity and homogeneity from batch to batch, the S12-mer is a promising new synthetic heparin analog with a potentially enhanced safety profile. Copyright © 2015 Elsevier Ltd. All rights reserved.
Ziya Motalebipour, Elmira; Kafkas, Salih; Khodaeiaminjan, Mortaza; Çoban, Nergiz; Gözel, Hatice
2016-12-07
Pistachio (Pistacia vera L.) is one of the most important nut crops in the world. There are about 11 wild species in the genus Pistacia, and they have importance as rootstock seed sources for cultivated P. vera and forest trees. Published information on the pistachio genome is limited. Therefore, a genome survey is necessary to obtain knowledge on the genome structure of pistachio by next generation sequencing. Simple sequence repeat (SSR) markers are useful tools for germplasm characterization, genetic diversity analysis, and genetic linkage mapping, and may help to elucidate genetic relationships among pistachio cultivars and species. To explore the genome structure of pistachio, a genome survey was performed using the Illumina platform at approximately 40× coverage depth in the P. vera cv. Siirt. The K-mer analysis indicated that pistachio has a genome that is about 600 Mb in size and is highly heterozygous. The assembly of 26.77 Gb Illumina data produced 27,069 scaffolds at N50 = 3.4 kb with a total of 513.5 Mb. A total of 59,280 SSR motifs were detected, at a frequency of one per 8.67 kb. A total of 206 SSRs were used to characterize 24 P. vera cultivars and 20 wild Pistacia genotypes (four genotypes from each of five wild Pistacia species) belonging to P. atlantica, P. integerrima, P. chinensis, P. terebinthus, and P. lentiscus. Overall, 135 SSR loci amplified in all 44 cultivars and genotypes, and 41 were polymorphic in six Pistacia species. The novel SSR loci developed from cultivated pistachio were highly transferable to wild Pistacia species. The results from a genome survey of pistachio suggest that the genome size of pistachio is about 600 Mb with a high heterozygosity rate. This information will help to design whole genome sequencing strategies for pistachio. The novel polymorphic SSRs developed in this study may help germplasm characterization, genetic diversity, and genetic linkage mapping studies in the genus Pistacia.
Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav
2013-07-18
Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
NASA Astrophysics Data System (ADS)
Pylaev, T. E.; Khanadeev, V. A.; Khlebtsov, B. N.; Dykman, L. A.; Bogatyrev, V. A.; Khlebtsov, N. G.
2011-07-01
We introduce a new genosensing approach employing CTAB (cetyltrimethylammonium bromide)-coated positively charged colloidal gold nanoparticles (GNPs) to detect target DNA sequences by using absorption spectroscopy and dynamic light scattering. The approach is compared with a previously reported method employing unmodified CTAB-coated gold nanorods (GNRs). Both approaches are based on the observation that whereas the addition of probe and target ssDNA to CTAB-coated particles results in particle aggregation, no aggregation is observed after addition of probe and nontarget DNA sequences. Our goal was to compare the feasibility and sensitivity of both methods. A 21-mer ssDNA from the human immunodeficiency virus type 1 HIV-1 U5 long terminal repeat (LTR) sequence and a 23-mer ssDNA from the Bacillus anthracis cryptic protein and protective antigen precursor (pagA) genes were used as ssDNA models. In the case of GNRs, unexpectedly, the colorimetric test failed with perfect cigar-like particles but could be performed with dumbbell and dog-bone rods. By contrast, our approach with cationic CTAB-coated GNPs is easy to implement and possesses excellent feasibility with retention of comparable sensitivity—a 0.1 nM concentration of target cDNA can be detected with the naked eye and 10 pM by dynamic light scattering (DLS) measurements. The specificity of our method is illustrated by successful DLS detection of one-three base mismatches in cDNA sequences for both DNA models. These results suggest that the cationic GNPs and DLS can be used for genosensing under optimal DNA hybridization conditions without any chemical modifications of the particle surface with ssDNA molecules and signal amplification. 
Finally, we discuss a difference of more than two to three orders of magnitude in the reported estimates of the detection sensitivity of colorimetric methods (0.1 to 10-100 pM) to show that the existing aggregation models are inconsistent with the detection limits of about 0.1-1 pM DNA and that other explanations should be developed.
Luo, Chu-Ming; Wang, Ning; Yang, Xing-Lou; Liu, Hai-Zhou; Zhang, Wei; Li, Bei; Hu, Ben; Peng, Cheng; Geng, Qi-Bin; Zhu, Guang-Jian; Li, Fang; Shi, Zheng-Li
2018-07-01
Middle East respiratory syndrome coronavirus (MERS-CoV) has represented a human health threat since 2012. Although several MERS-related CoVs that belong to the same species as MERS-CoV have been identified from bats, they do not use the MERS-CoV receptor, dipeptidyl peptidase 4 (DPP4). Here, we screened 1,059 bat samples from at least 30 bat species collected in different regions in south China and identified 89 strains of lineage C betacoronaviruses, including Tylonycteris pachypus coronavirus HKU4, Pipistrellus pipistrellus coronavirus HKU5, and MERS-related CoVs. We sequenced the full-length genomes of two positive samples collected from the great evening bat, Ia io, from Guangdong Province. The two genomes were highly similar and exhibited genomic structures identical to those of other lineage C betacoronaviruses. While they exhibited genome-wide nucleotide identities of only 75.3 to 81.2% with other MERS-related CoVs, their gene-coding regions were highly similar to their counterparts, except in the case of the spike proteins. Further protein-protein interaction assays demonstrated that the spike proteins of these MERS-related CoVs bind to the receptor DPP4. Recombination analysis suggested that the newly discovered MERS-related CoVs have acquired their spike genes from a DPP4-recognizing bat coronavirus HKU4. Our study provides further evidence that bats represent the evolutionary origins of MERS-CoV. IMPORTANCE Previous studies suggested that MERS-CoV originated in bats. However, its evolutionary path from bats to humans remains unclear. In this study, we discovered 89 novel lineage C betacoronaviruses in eight bat species. We provide evidence of a MERS-related CoV derived from the great evening bat that uses the same host receptor as human MERS-CoV. This virus also provides evidence for a natural recombination event between the bat MERS-related CoV and another bat coronavirus, HKU4.
Our study expands the host ranges of MERS-related CoV and represents an important step toward establishing bats as the natural reservoir of MERS-CoV. These findings may lead to improved epidemiological surveillance of MERS-CoV and the prevention and control of the spread of MERS-CoV to humans. Copyright © 2018 American Society for Microbiology.
Fiannaca, Antonino; La Rosa, Massimo; Rizzo, Riccardo; Urso, Alfonso
2015-07-01
In this paper, an alignment-free method for DNA barcode classification that is based on both a spectral representation and a neural gas network for unsupervised clustering is proposed. In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, such as support vector machine, random forest, ripper, naïve Bayes, ridor, and classification tree, which also consider short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the on-line resource "Barcode of Life Database". The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification task is performed with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method can reach an accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best other classifier (random forest) reaches an accuracy of 20.9%. Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments. Copyright © 2015 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Du, Jiansen; Xue, Hailing; Ma, Jing
HIV CRF07 B′/C is a strain circulating mainly in the northwest region of China. The gp41 region of CRF07 is derived from a clade C virus. In order to compare CRF07 gp41 with that of a typical clade B virus, we solved the crystal structure of the core region of CRF07 gp41. Compared with clade B gp41, CRF07 gp41 evolved more basic and hydrophilic residues on its helix bundle surface. Based on sequence alignment, a hyper-mutant cluster located in the middle of the HR2 heptad repeat was identified. The mutational study of these residues revealed that this site is important in HIV-mediated cell–cell fusion and plays critical roles in conformational changes during viral invasion. Highlights: • We solved the crystal structure of HIV CRF07 gp41 core region. • A hyper-mutant cluster in the middle of the HR2 heptad repeat was identified. • The hyper-mutant site is important in HIV-cell fusion. • The model will help to understand the HIV fusion process.
Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie
2014-02-17
As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. 
It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots.
PMID:24533858
NIPTmer: rapid k-mer-based software package for detection of fetal aneuploidies.
Sauk, Martin; Žilina, Olga; Kurg, Ants; Ustav, Eva-Liina; Peters, Maire; Paluoja, Priit; Roost, Anne Mari; Teder, Hindrek; Palta, Priit; Brison, Nathalie; Vermeesch, Joris R; Krjutškov, Kaarel; Salumets, Andres; Kaplinski, Lauris
2018-04-04
Non-invasive prenatal testing (NIPT) is a recent and rapidly evolving method for detecting genetic lesions, such as aneuploidies, of a fetus. However, there is a need for faster and cheaper laboratory and analysis methods to make NIPT more widely accessible. We have developed a novel software package for detection of fetal aneuploidies from next-generation low-coverage whole genome sequencing data. Our tool, NIPTmer, is based on counting pre-defined per-chromosome sets of unique k-mers from raw sequencing data, and applying a linear regression model to the counts. Additionally, the filtering process used for k-mer list creation allows one to take into account the genetic variance in a specific sample, thus reducing a source of uncertainty. The processing time of one sample is less than 10 CPU-minutes on a high-end workstation. NIPTmer was validated on a cohort of 583 NIPT samples and it correctly predicted 37 non-mosaic fetal aneuploidies. NIPTmer has the potential to reduce significantly the time and complexity of NIPT post-sequencing analysis compared to mapping-based methods. For non-commercial users the software package is freely available at http://bioinfo.ut.ee/NIPTMer/ .
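The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NIPTmer's code or interface: the function name `count_chromosome_kmers`, the lookup table `kmer_to_chrom`, and the toy reads are all hypothetical, and the regression model applied to the counts is not specified in the abstract, so it is left out.

```python
from collections import Counter


def count_chromosome_kmers(reads, kmer_to_chrom, k):
    """Count, per chromosome, how many k-mers in the raw reads belong to a
    pre-defined list of chromosome-unique k-mers. No read mapping is done."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            chrom = kmer_to_chrom.get(read[i:i + k])
            if chrom is not None:
                counts[chrom] += 1
    return counts


# Hypothetical lookup table and reads, invented for this example.
kmer_to_chrom = {"ACGT": "chr21", "GGCA": "chr18"}
reads = ["TACGTA", "GGCAGGCA", "TTTTTT"]
counts = count_chromosome_kmers(reads, kmer_to_chrom, k=4)
print(dict(counts))
```

Because each read is only scanned against a hash table, the cost grows linearly with the amount of sequence, which is one plausible reason a k-mer approach can be faster than mapping reads to a reference.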
Almutairy, Meznah; Torng, Eric
2018-01-01
Bioinformatics applications and pipelines increasingly use k-mer indexes to search for similar sequences. The major problem with k-mer indexes is that they require lots of memory. Sampling is often used to reduce index size and query time. Most applications use one of two major types of sampling: fixed sampling and minimizer sampling. It is well known that fixed sampling will produce a smaller index, typically by roughly a factor of two, whereas it is generally assumed that minimizer sampling will produce faster query times since query k-mers can also be sampled. However, no direct comparison of fixed and minimizer sampling has been performed to verify these assumptions. We systematically compare fixed and minimizer sampling using the human genome as our database. We use the resulting k-mer indexes for fixed sampling and minimizer sampling to find all maximal exact matches between our database, the human genome, and three separate query sets, the mouse genome, the chimp genome, and an NGS data set. We reach the following conclusions. First, using larger k-mers reduces query time for both fixed sampling and minimizer sampling at a cost of requiring more space. If we use the same k-mer size for both methods, fixed sampling requires typically half as much space whereas minimizer sampling processes queries only slightly faster. If we are allowed to use any k-mer size for each method, then we can choose a k-mer size such that fixed sampling both uses less space and processes queries faster than minimizer sampling. The reason is that although minimizer sampling is able to sample query k-mers, the number of shared k-mer occurrences that must be processed is much larger for minimizer sampling than fixed sampling. In conclusion, we argue that for any application where each shared k-mer occurrence must be processed, fixed sampling is the right sampling method.
PMID:29389989
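The two sampling schemes compared in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of lexicographic order to choose minimizers, and the window parameter `w` are assumptions made here.

```python
def fixed_sample(seq, k, w):
    """Fixed sampling: keep the k-mer starting at every w-th position."""
    return [(i, seq[i:i + k]) for i in range(0, len(seq) - k + 1, w)]


def minimizer_sample(seq, k, w):
    """Minimizer sampling: in every window of w consecutive k-mers, keep the
    lexicographically smallest one (ties go to the leftmost position)."""
    n_kmers = len(seq) - k + 1
    picked = set()
    for start in range(max(n_kmers - w + 1, 0)):
        window = [(seq[i:i + k], i) for i in range(start, start + w)]
        kmer, pos = min(window)
        picked.add((pos, kmer))
    return sorted(picked)


# Toy sequence, invented for this example.
seq = "ACGTTGCATGCA"
fixed = fixed_sample(seq, k=3, w=4)
minimizers = minimizer_sample(seq, k=3, w=4)
print(len(fixed), len(minimizers))
```

On this toy sequence fixed sampling keeps 3 k-mers and minimizer sampling keeps 5, which matches the direction of the space difference reported above. The trade-off is that minimizers can be computed for a query without knowing its offset in the database, whereas a fixed-sampled index requires looking up every query k-mer.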
Multiplexed microsatellite markers for seven Metarhizium species
USDA-ARS's Scientific Manuscript database
Cross-species transferability of 41 previously published simple sequence repeat (SSR) markers was assessed for 11 species of the entomopathogenic fungus Metarhizium. A collection of 65 Metarhizium isolates including all 54 used in a recent phylogenetic revision of the genus were characterized. Betwe...
Cognitive Dissonance as an Instructional Tool for Understanding Chemical Representations
NASA Astrophysics Data System (ADS)
Corradi, David; Clarebout, Geraldine; Elen, Jan
2015-10-01
Previous research on multiple external representations (MER) indicates that sequencing representations (compared with presenting them as a whole) can, in some cases, increase conceptual understanding if there is interference between internal and external representations. We tested this mechanism by sequencing different combinations of scientific and abstract chemical representations and presenting them to 133 learners with low prior knowledge of the represented domain. The results provide insight into three separate mechanisms of learning with MER. (1) A memory effect (number of ideas reproduced) and (2) an accuracy effect (correctness of these ideas) occur when two representations are presented in a sequence. An accuracy effect and (3) a redundancy effect (number of redundant ideas remembered) occur when three representations are presented in a sequence. A necessary precondition for these effects is that descriptive formats are placed before depictive formats. The identified effects are analyzed in terms of the concept of cognitive dissonance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Borucki, Monica K.; Lao, Victoria; Hwang, Mona
Middle East respiratory syndrome coronavirus (MERS-CoV) is an emerging human pathogen related to SARS virus. In vitro studies indicate this virus may have a broad host range suggesting an increased pandemic potential. Genetic and epidemiological evidence indicate camels serve as a reservoir for MERS virus but the mechanism of cross species transmission is unclear and many questions remain regarding the susceptibility of humans to infection. Deep sequencing data was obtained from the nasal samples of three camels that had been experimentally infected with a human MERS-CoV isolate. A majority of the genome was covered and average coverage was greater than 12,000x depth. Although only 5 mutations were detected in the consensus sequences, 473 intrahost single nucleotide variants were identified. Lastly, many of these variants were present at high frequencies and could potentially influence viral phenotype and the sensitivity of detection assays that target these regions for primer or probe binding.
Converting CSV Files to RKSML Files
NASA Technical Reports Server (NTRS)
Trebi-Ollennu, Ashitey; Liebersbach, Robert
2009-01-01
A computer program converts, into a format suitable for processing on Earth, files of downlinked telemetric data pertaining to the operation of the Instrument Deployment Device (IDD), which is a robot arm on either of the Mars Exploration Rovers (MERs). The raw downlinked data files are in comma-separated-value (CSV) format. The present program converts the files into Rover Kinematics State Markup Language (RKSML), which is an Extensible Markup Language (XML) format that facilitates representation of operations of the IDD and enables analysis of the operations by means of the Rover Sequencing Validation Program (RSVP), which is used to build sequences of commanded operations for the MERs. After conversion by means of the present program, the downlinked data can be processed by RSVP, enabling the MER downlink operations team to play back the actual IDD activity represented by the telemetric data against the planned IDD activity. Thus, the present program enhances the diagnosis of anomalies that manifest themselves as differences between actual and planned IDD activities.
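The general shape of a CSV-to-XML conversion like the one described above can be sketched with the Python standard library. This is only an illustration of the technique: the abstract does not give the RKSML schema, so the element and attribute names used here (`telemetry`, `record`, `field`, `name`) and the sample column names are hypothetical placeholders and are not RKSML.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text):
    """Convert CSV text with a header row into a simple XML string.

    Hypothetical layout (NOT the RKSML schema): one <record> per CSV row,
    one <field name="..."> per column.
    """
    root = ET.Element("telemetry")
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, "record")
        for name, value in row.items():
            field = ET.SubElement(record, "field", name=name)
            field.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Column names below are made up for the example.
    sample = "time,joint_angle\n0,1.5\n1,1.7\n"
    print(csv_to_xml(sample))
```

Once the data is in XML, a downstream tool can walk the tree by element name rather than by column position, which is the practical benefit of converting before playback.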
Borucki, Monica K.; Lao, Victoria; Hwang, Mona; ...
2016-01-20
Middle East respiratory syndrome coronavirus (MERS-CoV) is an emerging human pathogen related to SARS virus. In vitro studies indicate this virus may have a broad host range suggesting an increased pandemic potential. Genetic and epidemiological evidence indicate camels serve as a reservoir for MERS virus but the mechanism of cross species transmission is unclear and many questions remain regarding the susceptibility of humans to infection. Deep sequencing data was obtained from the nasal samples of three camels that had been experimentally infected with a human MERS-CoV isolate. A majority of the genome was covered and average coverage was greater than 12,000x depth. Although only 5 mutations were detected in the consensus sequences, 473 intrahost single nucleotide variants were identified. Lastly, many of these variants were present at high frequencies and could potentially influence viral phenotype and the sensitivity of detection assays that target these regions for primer or probe binding.
Hsin, Wei-Chen; Chang, Chan-Hua; Chang, Chi-You; Peng, Wei-Hao; Chien, Chung-Liang; Chang, Ming-Fu; Chang, Shin C
2018-05-24
Middle East respiratory syndrome coronavirus (MERS-CoV) consists of a positive-sense, single-stranded RNA genome and four structural proteins: the spike, envelope, membrane, and nucleocapsid protein. The assembly of the viral genome into virus particles involves viral structural proteins and is believed to be mediated through recognition of specific sequences and RNA structures of the viral genome. A culture system for the production of MERS coronavirus-like particles (MERS VLPs) was determined and established by electron microscopy and the detection of coexpressed viral structural proteins. Using the VLP system, a 258-nucleotide RNA fragment, which spans nucleotides 19,712 to 19,969 of the MERS-CoV genome (designated PS258(19712-19969) ME ), was identified to function as a packaging signal. Assembly of the RNA packaging signal into MERS VLPs is dependent on the viral nucleocapsid protein. In addition, a 45-nucleotide stable stem-loop substructure of the PS258(19712-19969) ME interacted with both the N-terminal domain and the C-terminal domain of the viral nucleocapsid protein. Furthermore, a functional SARS-CoV RNA packaging signal failed to assemble into the MERS VLPs, which indicated virus-specific assembly of the RNA genome. A MERS-CoV RNA packaging signal was identified by the detection of GFP expression following an incubation of MERS VLPs carrying the heterologous mRNA GFP-PS258(19712-19969) ME with virus permissive Huh7 cells. The MERS VLP system could help us in understanding virus infection and morphogenesis.
Pasquato, Antonella; Pullikotil, Philomena; Asselin, Marie-Claude; Vacatello, Manuela; Paolillo, Livio; Ghezzo, Francesca; Basso, Federica; Di Bello, Carlo; Dettin, Monica; Seidah, Nabil G
2006-08-18
Herein we designed, synthesized, tested, and validated fluorogenic methylcoumarinamide (MCA) and chloromethylketone-peptides spanning the Lassa virus GPC cleavage site as substrates and inhibitors for the proprotein convertase SKI-1/S1P. The 7-mer MCA (YISRRLL-MCA) and 8-mer MCA (IYISRRLL-MCA) are very efficiently cleaved with respect to both the 6-mer MCA (ISRRLL-MCA) and point mutated fluorogenic analogues, except for the 7-mer mutant Y253F. The importance of the P7 phenylic residue was confirmed by digestions of two 16-mer non-fluorogenic peptidyl substrates that differ by a single point mutation (Y253A). Because NMR analysis of these 16-mer peptides did not reveal significant structural differences at recognition motif RRLL, the P7 Tyr residue is likely important in establishing key interactions within the catalytic pocket of SKI-1. Based on these data, we established through analysis of pro-ATF6 and pro-SREBP-2 cellular processing that decanoylated chloromethylketone 7-mer, 6-mer, and 4-mer peptides containing the core RRLL sequence are irreversible and potent ex vivo SKI-1 inhibitors. Although caution must be exercised in using these inhibitors in in vitro reactions, as they can also inhibit the basic amino acid-specific convertase furin, within cells and when used at concentrations ≤ 100 µM these inhibitors are relatively specific for inhibition of SKI-1 processing events, as opposed to those performed by furin-like convertases.
Jan, Arif Tasleem; Azam, Mudsser; Choi, Inho; Ali, Arif; Haq, Qazi Mohd. Rizwanul
2016-01-01
Mercury, which is ubiquitous and recalcitrant to biodegradation processes, threatens human health by escaping to the environment via various natural and anthropogenic activities. Non-biodegradability of mercury pollutants has necessitated the development and implementation of economic alternatives with promising potential to remove metals from the environment. Enhancement of microbial based remediation strategies through genetic engineering approaches provides one such alternative with a promising future. In this study, bacterial isolates inhabiting polluted sites were screened for tolerance to varying concentrations of mercuric chloride. Following identification, several Pseudomonas and Klebsiella species were found to exhibit the highest tolerance to both organic and inorganic mercury. Screened bacterial isolates were examined for their genetic make-up in terms of the presence of genes (merP and merT) involved in the transport of mercury across the membrane either alone or in combination to deal with the toxic mercury. Gene sequence analysis revealed that the merP gene showed 86–99% homology, while the merT gene showed >98% homology with previously reported sequences. By exploring the genes involved in imparting metal resistance to bacteria, this study will serve to highlight the credentials that are particularly advantageous for their practical application to remediation of mercury from the environment. PMID:26887227
Inhibition of trypanosomal cysteine proteinases by their propeptides.
Lalmanach, G; Lecaille, F; Chagas, J R; Authié, E; Scharfstein, J; Juliano, M A; Gauthier, F
1998-09-25
The ability of the prodomains of trypanosomal cysteine proteinases to inhibit their active form was studied using a set of 23 overlapping 15-mer peptides covering the whole prosequence of congopain, the major cysteine proteinase of Trypanosoma congolense. Three consecutive peptides with a common 5-mer sequence YHNGA were competitive inhibitors of congopain. A shorter synthetic peptide consisting of this 5-mer sequence flanked by two Ala residues (AYHNGAA) also inhibited purified congopain. No residue critical for inhibition was identified in this sequence, but a significant improvement in Ki value was obtained upon N-terminal elongation. Procongopain-derived peptides did not inhibit lysosomal cathepsins B and L but did inhibit native cruzipain (from Dm28c clone epimastigotes), the major cysteine proteinase of Trypanosoma cruzi, the proregion of which also contains the sequence YHNGA. The positioning of the YHNGA inhibitory sequence within the prosegment of trypanosomal proteinases is similar to that covering the active site in the prosegment of cysteine proteinases, the three-dimensional structure of which has been resolved. This strongly suggests that trypanosomal proteinases, despite their long C-terminal extension, have a prosegment that folds similarly to that in related mammal and plant cysteine proteinases, resulting in reverse binding within the active site. Such reverse binding could also occur for short procongopain-derived inhibitory peptides, based on their resistance to proteolysis and their ability to retain inhibitory activity after prolonged incubation. In contrast, homologous peptides in related cysteine proteinases did not inhibit trypanosomal proteinases and were rapidly cleaved by these enzymes.
Koparde, Vishal N.; Jameson-Lee, Maximilian; Elnasseh, Abdelrhman G.; Scalora, Allison F.; Kobulnicky, David J.; Serrano, Myrna G.; Roberts, Catherine H.; Buck, Gregory A.; Neale, Michael C.; Nixon, Daniel E.; Toor, Amir A.
2017-01-01
Human cytomegalovirus (hCMV) reactivation may often coincide with the development of graft-versus-host-disease (GVHD) in stem cell transplantation (SCT). Seventy-seven SCT donor-recipient pairs (DRP) (HLA matched unrelated donor (MUD), n = 50; matched related donor (MRD), n = 27) underwent whole exome sequencing to identify single nucleotide polymorphisms (SNPs) generating alloreactive peptide libraries for each DRP (9-mer peptide-HLA complexes); Human CMV CROSS (Cross-Reactive Open Source Sequence) database was compiled from NCBI; HLA class I binding affinity for each DRP's HLA was calculated by NetMHCpan 2.8 and hCMV-derived 9-mers algorithmically compared to the alloreactive peptide-HLA complex libraries. Short consecutive (≥6) amino acid (AA) sequence homology matching hCMV to recipient peptides was considered for HLA-bound-peptide (IC50<500 nM) cross reactivity. Of the 70,686 hCMV 9-mers contained within the hCMV CROSS database, an average of 29,658 matched the MRD DRP alloreactive peptides and 52,910 matched MUD DRP peptides (p<0.001). In silico analysis revealed multiple high affinity, immunogenic CMV-Human peptide matches (IC50<500 nM) expressed in GVHD-affected tissue-specific manner. hCMV+GVHD was found in 18 patients, 13 developing hCMV viremia before GVHD onset. Analysis of patients with GVHD identified potential cross reactive peptide expression within affected organs. We propose that hCMV peptide sequence homology with human alloreactive peptides may contribute to the pathophysiology of GVHD. PMID:28800601
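The peptide comparison described above (viral 9-mers matched to recipient peptides through a run of at least 6 consecutive identical amino acids) can be sketched as follows. This is a simplified illustration, not the study's pipeline: the function names are hypothetical, the shared run is allowed to sit at any offset in either peptide (the abstract does not say whether positions must align), and the HLA binding-affinity filter (IC50 < 500 nM, computed in the study with NetMHCpan) is omitted entirely.

```python
def kmers(seq, k=9):
    """All overlapping k-residue peptides of a protein sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def shares_run(a, b, min_run=6):
    """True if peptides a and b share a common substring of >= min_run residues."""
    runs = {a[i:i + min_run] for i in range(len(a) - min_run + 1)}
    return any(b[j:j + min_run] in runs for j in range(len(b) - min_run + 1))


def matching_viral_kmers(viral_protein, recipient_peptides, k=9, min_run=6):
    """Viral k-mers that share a run of >= min_run residues with any recipient peptide."""
    return [
        p for p in kmers(viral_protein, k)
        if any(shares_run(p, r, min_run) for r in recipient_peptides)
    ]


if __name__ == "__main__":
    # Toy sequences, made up for the example.
    print(matching_viral_kmers("MKTAYIAKQRQ", ["XXTAYIAKX"]))
```

Checking only runs of exactly `min_run` residues is sufficient, because any longer shared run contains a shared run of that length.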
Hall, Charles E; Koparde, Vishal N; Jameson-Lee, Maximilian; Elnasseh, Abdelrhman G; Scalora, Allison F; Kobulnicky, David J; Serrano, Myrna G; Roberts, Catherine H; Buck, Gregory A; Neale, Michael C; Nixon, Daniel E; Toor, Amir A
2017-01-01
Human cytomegalovirus (hCMV) reactivation may often coincide with the development of graft-versus-host-disease (GVHD) in stem cell transplantation (SCT). Seventy-seven SCT donor-recipient pairs (DRP) (HLA matched unrelated donor (MUD), n = 50; matched related donor (MRD), n = 27) underwent whole exome sequencing to identify single nucleotide polymorphisms (SNPs) generating alloreactive peptide libraries for each DRP (9-mer peptide-HLA complexes); Human CMV CROSS (Cross-Reactive Open Source Sequence) database was compiled from NCBI; HLA class I binding affinity for each DRP's HLA was calculated by NetMHCpan 2.8 and hCMV-derived 9-mers algorithmically compared to the alloreactive peptide-HLA complex libraries. Short consecutive (≥6) amino acid (AA) sequence homology matching hCMV to recipient peptides was considered for HLA-bound-peptide (IC50<500 nM) cross reactivity. Of the 70,686 hCMV 9-mers contained within the hCMV CROSS database, an average of 29,658 matched the MRD DRP alloreactive peptides and 52,910 matched MUD DRP peptides (p<0.001). In silico analysis revealed multiple high affinity, immunogenic CMV-Human peptide matches (IC50<500 nM) expressed in GVHD-affected tissue-specific manner. hCMV+GVHD was found in 18 patients, 13 developing hCMV viremia before GVHD onset. Analysis of patients with GVHD identified potential cross reactive peptide expression within affected organs. We propose that hCMV peptide sequence homology with human alloreactive peptides may contribute to the pathophysiology of GVHD.
MicroRNA categorization using sequence motifs and k-mers.
Yousef, Malik; Khalifa, Waleed; Acar, İlhan Erkin; Allmer, Jens
2017-03-14
Post-transcriptional gene dysregulation can be a hallmark of diseases like cancer and microRNAs (miRNAs) play a key role in the modulation of translation efficiency. Known pre-miRNAs are listed in miRBase, and they have been discovered in a variety of organisms ranging from viruses and microbes to eukaryotic organisms. The computational detection of pre-miRNAs is of great interest, and such approaches usually employ machine learning to discriminate between miRNAs and other sequences. Many features have been proposed describing pre-miRNAs, and we have previously introduced the use of sequence motifs and k-mers as useful ones. There have been reports of xeno-miRNAs detected via next generation sequencing. However, they may be contaminations and to aid that important decision-making process, we aimed to establish a means to differentiate pre-miRNAs from different species. To achieve distinction into species, we used one species' pre-miRNAs as the positive and another species' pre-miRNAs as the negative training and test data for the establishment of machine learned models based on sequence motifs and k-mers as features. This approach resulted in higher accuracy values between distantly related species while species with closer relation produced lower accuracy values. We were able to differentiate among species with increasing success when the evolutionary distance increases. This conclusion is supported by previous reports of fast evolutionary changes in miRNAs since even in relatively closely related species a fairly good discrimination was possible.
Pettengill, James B; Pightling, Arthur W; Baugher, Joseph D; Rand, Hugh; Strain, Errol
2016-01-01
The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionary diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). When analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
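Three of the k-mer profile distances named above (Jaccard, Manhattan, Euclidean) can be computed directly from k-mer counts, as in the minimal sketch below. The function names are illustrative, the profiles are built from raw sequences rather than from sequencing reads, and the sketch-based Mash variants and the site-based measures (NUCmer, extended MLST) are not covered.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def jaccard_distance(p, q):
    """1 - |shared k-mers| / |all k-mers|, using presence/absence only."""
    a, b = set(p), set(q)
    return 1.0 - len(a & b) / len(a | b)


def manhattan_distance(p, q):
    """Sum of absolute count differences; a k-mer absent from one profile counts as 0."""
    return sum(abs(p[x] - q[x]) for x in set(p) | set(q))


def euclidean_distance(p, q):
    """Square root of the summed squared count differences."""
    return sqrt(sum((p[x] - q[x]) ** 2 for x in set(p) | set(q)))


if __name__ == "__main__":
    a = kmer_profile("ACGTACGT", 3)
    b = kmer_profile("ACGTTCGT", 3)
    print(jaccard_distance(a, b), manhattan_distance(a, b), euclidean_distance(a, b))
```

All three functions treat a k-mer that is missing from one sample as a real difference, which is the behavior the abstract identifies as a weakness when data are absent because of assembly or contamination issues rather than true divergence.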
Pettengill, James B.; Pightling, Arthur W.; Baugher, Joseph D.; ...
2016-11-10
The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionary diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). Finally, when analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pettengill, James B.; Pightling, Arthur W.; Baugher, Joseph D.
The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionary diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). Finally, when analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.
Zheng, Yang; Cai, Jing; Li, JianWen; Li, Bo; Lin, Runmao; Tian, Feng; Wang, XiaoLing; Wang, Jun
2010-01-01
A 10-fold BAC library for giant panda was constructed and nine BACs were selected to generate finished sequences. These BACs could be used as a validation resource for the de novo assembly accuracy of the whole genome shotgun sequencing reads of giant panda newly generated by the Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a combined length of 878 kb. Homologue search and de novo prediction methods were used to annotate genes and repeats. Twelve protein coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average exon number of 6 per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed using the neighbor-joining algorithm across five species, including giant panda, human, dog, cat and mouse, which reconfirms dog as the species most closely related to giant panda. Our results provide detailed sequence and structure information for new genes and repeats of giant panda, which will be helpful for further studies on the giant panda.
Tackett, Alan J.; Corey, David R.; Raney, Kevin D.
2002-01-01
Peptide nucleic acid (PNA) is a DNA mimic in which the nucleobases are linked by an N-(2-aminoethyl) glycine backbone. Here we report that PNA can interact with single-stranded DNA (ssDNA) in a non-sequence-specific fashion. We observed that a 15mer PNA inhibited the ssDNA-stimulated ATPase activity of a bacteriophage T4 helicase, Dda. Surprisingly, when a fluorescein-labeled 15mer PNA was used in binding studies no interaction was observed between PNA and Dda. However, fluorescence polarization did reveal non-sequence-specific interactions between PNA and ssDNA. Thus, the inhibition of ATPase activity of Dda appears to result from depletion of the available ssDNA due to non-Watson–Crick binding of PNA to ssDNA. Inhibition of the ssDNA-stimulated ATPase activity was observed for several PNAs of varying length and sequence. To study the basis for this phenomenon, we examined self-aggregation by PNAs. The 15mer PNA readily self-aggregates to the point of precipitation. Since PNAs are hydrophobic, they aggregate more than DNA or RNA, making the study of this phenomenon essential for understanding the properties of PNA. Non-sequence-specific interactions between PNA and ssDNA were observed at moderate concentrations of PNA, suggesting that such interactions should be considered for antisense and antigene applications. PMID:11842106
Structural Analysis of the Hg(II)-Regulatory Protein Tn501 MerR from Pseudomonas aeruginosa
NASA Astrophysics Data System (ADS)
Wang, Dan; Huang, Shanqing; Liu, Pingying; Liu, Xichun; He, Yafeng; Chen, Weizhong; Hu, Qingyuan; Wei, Tianbiao; Gan, Jianhua; Ma, Jing; Chen, Hao
2016-09-01
The metalloprotein MerR is a mercury(II)-dependent transcriptional repressor-activator that responds to mercury(II) with extraordinary sensitivity and selectivity. It’s widely distributed in both Gram-negative and Gram-positive bacteria but with barely detectable sequence identities between the two sources. To provide structural basis for the considerable biochemical and biophysical experiments previously performed on Tn501 and Tn21 MerR from Gram-negative bacteria, we analyzed the crystal structure of mercury(II)-bound Tn501 MerR. The structure in the metal-binding domain provides Tn501 MerR with a high affinity for mercury(II) and the ability to distinguish mercury(II) from other metals with its unique planar trigonal coordination geometry, which is adopted by both Gram-negative and Gram-positive bacteria. The mercury(II) coordination state in the C-terminal metal-binding domain is transmitted through the allosteric network across the dimer interface to the N-terminal DNA-binding domain. Together with the previous mutagenesis analyses, the present data indicate that the residues in the allosteric pathway have a central role in maintaining the functions of Tn501 MerR. In addition, the complex structure exhibits significant differences in tertiary and quaternary structural arrangements compared to those of Bacillus MerR from Gram-positive bacteria, which probably enable them to function with specific promoter DNA with different spacers between -35 and -10 elements.
Lau, Susanna K P; Wernery, Renate; Wong, Emily Y M; Joseph, Sunitha; Tsang, Alan K L; Patteril, Nissy Annie Georgy; Elizabeth, Shyna K; Chan, Kwok-Hung; Muhammed, Rubeena; Kinne, Jöerg; Yuen, Kwok-Yung; Wernery, Ulrich; Woo, Patrick C Y
2016-01-01
Little is known regarding the molecular epidemiology of Middle East respiratory syndrome coronavirus (MERS-CoV) circulating in dromedaries outside Saudi Arabia. To address this knowledge gap, we sequenced 10 complete genomes of MERS-CoVs isolated from 2 live and 8 dead dromedaries from different regions in the United Arab Emirates (UAE). Phylogenetic analysis revealed one novel clade A strain, the first detected in the UAE, and nine clade B strains. Strain D998/15 had a distinct phylogenetic position within clade A, being more closely related to the dromedary isolate NRCE-HKU205 from Egypt than to the human isolates EMC/2012 and Jordan-N3/2012. A comparison of predicted protein sequences also demonstrated the existence of two clade A lineages with unique amino acid substitutions, A1 (EMC/2012 and Jordan-N3/2012) and A2 (D998/15 and NRCE-HKU205), circulating in humans and camels, respectively. The nine clade B isolates belong to three distinct lineages: B1, B3 and B5. Two B3 strains, D1271/15 and D1189.1/15, showed evidence of recombination between lineages B4 and B5 in ORF1ab. Molecular clock analysis dated the time of the most recent common ancestor (tMRCA) of clade A to March 2011 and that of clade B to November 2011. Our data support a polyphyletic origin of MERS-CoV in dromedaries and the co-circulation of diverse MERS-CoVs including recombinant strains in the UAE. PMID:27999424
Robasky, Kimberly; Bulyk, Martha L
2011-01-01
The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Menachery, Vineet D.; Gralinski, Lisa E.; Mitchell, Hugh D.
ABSTRACT Coronaviruses (CoVs) encode a mixture of highly conserved and novel genes, as well as genetic elements necessary for infection and pathogenesis, raising the possibility of common targets for attenuation and therapeutic design. In this study, we focused on highly conserved nonstructural protein 16 (NSP16), a viral 2'O-methyltransferase (2'O-MTase) that encodes critical functions in immune modulation and infection. Using reverse genetics, we disrupted a key motif in the conserved KDKE motif of Middle East respiratory syndrome CoV (MERS-CoV) NSP16 (D130A) and evaluated the effect on viral infection and pathogenesis. While the absence of 2'O-MTase activity had only a marginal impact on propagation and replication in Vero cells, dNSP16 mutant MERS-CoV demonstrated significant attenuation relative to the control both in primary human airway cell cultures and in vivo. Further examination indicated that dNSP16 mutant MERS-CoV had a type I interferon (IFN)-based attenuation and was partially restored in the absence of molecules of IFN-induced proteins with tetratricopeptide repeats. Importantly, the robust attenuation permitted the use of dNSP16 mutant MERS-CoV as a live attenuated vaccine platform protecting from a challenge with a mouse-adapted MERS-CoV strain. These studies demonstrate the importance of the conserved 2'O-MTase activity for CoV pathogenesis and highlight NSP16 as a conserved universal target for rapid live attenuated vaccine design in an expanding CoV outbreak setting. IMPORTANCE Coronavirus (CoV) emergence in both humans and livestock represents a significant threat to global public health, as evidenced by the sudden emergence of severe acute respiratory syndrome CoV (SARS-CoV), MERS-CoV, porcine epidemic diarrhea virus, and swine delta CoV in the 21st century. These studies describe an approach that effectively targets the highly conserved 2'O-MTase activity of CoVs for attenuation.
With clear understanding of the IFN/IFIT (IFN-induced proteins with tetratricopeptide repeats)-based mechanism, NSP16 mutants provide a suitable target for a live attenuated vaccine platform, as well as therapeutic development for both current and future emergent CoV strains. Importantly, other approaches targeting other conserved pan-CoV functions have not yet proven effective against MERS-CoV, illustrating the broad applicability of targeting viral 2'O-MTase function across CoVs.
Menachery, Vineet D.; Gralinski, Lisa E.; Mitchell, Hugh D.; Dinnon, Kenneth H.; Leist, Sarah R.; Yount, Boyd L.; Graham, Rachel L.; McAnarney, Eileen T.; Stratton, Kelly G.; Cockrell, Adam S.; Debbink, Kari; Sims, Amy C.; Waters, Katrina M.
2017-01-01
ABSTRACT Coronaviruses (CoVs) encode a mixture of highly conserved and novel genes, as well as genetic elements necessary for infection and pathogenesis, raising the possibility of common targets for attenuation and therapeutic design. In this study, we focused on highly conserved nonstructural protein 16 (NSP16), a viral 2′O-methyltransferase (2′O-MTase) that encodes critical functions in immune modulation and infection. Using reverse genetics, we disrupted a key motif in the conserved KDKE motif of Middle East respiratory syndrome CoV (MERS-CoV) NSP16 (D130A) and evaluated the effect on viral infection and pathogenesis. While the absence of 2′O-MTase activity had only a marginal impact on propagation and replication in Vero cells, dNSP16 mutant MERS-CoV demonstrated significant attenuation relative to the control both in primary human airway cell cultures and in vivo. Further examination indicated that dNSP16 mutant MERS-CoV had a type I interferon (IFN)-based attenuation and was partially restored in the absence of molecules of IFN-induced proteins with tetratricopeptide repeats. Importantly, the robust attenuation permitted the use of dNSP16 mutant MERS-CoV as a live attenuated vaccine platform protecting from a challenge with a mouse-adapted MERS-CoV strain. These studies demonstrate the importance of the conserved 2′O-MTase activity for CoV pathogenesis and highlight NSP16 as a conserved universal target for rapid live attenuated vaccine design in an expanding CoV outbreak setting. IMPORTANCE Coronavirus (CoV) emergence in both humans and livestock represents a significant threat to global public health, as evidenced by the sudden emergence of severe acute respiratory syndrome CoV (SARS-CoV), MERS-CoV, porcine epidemic diarrhea virus, and swine delta CoV in the 21st century. These studies describe an approach that effectively targets the highly conserved 2′O-MTase activity of CoVs for attenuation. 
With clear understanding of the IFN/IFIT (IFN-induced proteins with tetratricopeptide repeats)-based mechanism, NSP16 mutants provide a suitable target for a live attenuated vaccine platform, as well as therapeutic development for both current and future emergent CoV strains. Importantly, other approaches targeting other conserved pan-CoV functions have not yet proven effective against MERS-CoV, illustrating the broad applicability of targeting viral 2′O-MTase function across CoVs. PMID:29152578
Assiri, Abdullah M.; Biggs, Holly M.; Abedi, Glen R.; Lu, Xiaoyan; Bin Saeed, Abdulaziz; Abdalla, Osman; Mohammed, Mutaz; Al-Abdely, Hail M.; Algarni, Homoud S.; Alhakeem, Raafat F.; Almasri, Malak M.; Alsharef, Ali A.; Nooh, Randa; Erdman, Dean D.; Gerber, Susan I.; Watson, John T.
2016-01-01
During July–August 2015, the number of cases of Middle East respiratory syndrome (MERS) reported from Saudi Arabia increased dramatically. We reviewed the 143 confirmed cases from this period and classified each based upon likely transmission source. We found that the surge in cases resulted predominantly (90%) from secondary transmission largely attributable to an outbreak at a single healthcare facility in Riyadh. Genome sequencing of MERS coronavirus from 6 cases demonstrated continued circulation of the recently described recombinant virus. A single unique frameshift deletion in open reading frame 5 was detected in the viral sequence from 1 case. PMID:27704019
Outbreak of Middle East respiratory syndrome coronavirus in Saudi Arabia: a retrospective study.
Aleanizy, Fadilah Sfouq; Mohmed, Nahla; Alqahtani, Fulwah Y; El Hadi Mohamed, Rania Ali
2017-01-05
The Middle East respiratory syndrome (MERS) is proposed to be a zoonotic disease. Dromedary camels have been implicated due to reports that some confirmed cases were exposed to camels. Risk factors for MERS coronavirus (MERS-CoV) infections in humans are incompletely understood. This study aimed to describe the demographic characteristics, mortality rate, clinical manifestations and comorbidities of confirmed cases of MERS-CoV. A retrospective chart review was performed to identify all laboratory-confirmed cases of MERS-CoV in Saudi Arabia that were reported to the Ministry of Health (MOH) of Saudi Arabia and WHO between April 23, 2014 and August 31, 2015. Patients' charts were also reviewed for demographic information, mortality, comorbidities, clinical presentations, and health care facility, and the data were presented with descriptive and comparative statistics using the nonparametric binomial test and the chi-square test. Confirmed cases of male patients (61.1%) exceeded those of female patients (38.9%). Infections among Saudi patients (62.6%) exceeded those among non-Saudi patients (37.4%; P = 0.001). The majority of the patients were aged 21-40 years (37.4%) or 41-60 years (35.8%); 43 (22.6%) were aged >61 years, and 8 (4.2%) were aged 0-20 years. Outcomes differed among confirmed MERS-CoV cases (63.7% alive versus 36.3% dead). Furthermore, fever with cough and shortness of breath (SOB) (n = 39; 20.5%), fever with cough (n = 29; 15.3%), fever (n = 18; 9.5%), and fever with SOB (n = 13; 6.8%) were the most common clinical manifestations associated with confirmed MERS-CoV cases. MERS-CoV is considered an epidemic in Saudi Arabia. The results of the present study showed that the frequency of cases is higher among men than women and among Saudi than non-Saudi patients, and that those aged 21 to 60 years are most affected. 
Further studies are required to improve MERS-CoV surveillance and to gain a clearer understanding of the MERS-CoV outbreak, as well as the source and route of infection transmission in Saudi Arabia.
Kim, Min; Taylor, Janette; Sidney, John; Mikloska, Zorka; Bodsworth, Neil; Lagios, Katerina; Dunckley, Heather; Byth-Wilson, Karen; Denis, Martine; Finlayson, Robert; Khanna, Rajiv; Sette, Alessandro; Cunningham, Anthony L
2008-11-01
In human recurrent cutaneous herpes simplex, there is a sequential infiltrate of CD4 and then CD8 lymphocytes into lesions. CD4 lymphocytes are the major producers of the key cytokine IFN-gamma in lesions. They recognize mainly structural proteins and especially glycoproteins D and B (gD and gB) when restimulated in vitro. Recent human vaccine trials using recombinant gD showed partial protection of HSV seronegative women against genital herpes disease and also, in placebo recipients, showed protection by prior HSV1 infection. In this study, we have defined immunodominant peptide epitopes recognized by 8 HSV1(+) and/or 16 HSV2(+) patients using (51)Cr-release cytotoxicity and IFN-gamma ELISPOT assays. Using a set of 39 overlapping 20-mer peptides, more than six immunodominant epitopes were defined in gD2 (two to six peptide epitopes were recognized for each subject). Further fine mapping of these responses for 4 of the 20-mers, using a panel of 9 internal 12-mers for each 20-mer, combined with MHC II typing and also direct in vitro binding assay of these peptides to individual DR molecules, showed more than one epitope per 20-mer and promiscuous binding of individual 20-mers and 12-mers to multiple DR types. All four 20-mer peptides were cross-recognized by both HSV1(+)/HSV2(-) and HSV1(-)/HSV2(+) subjects, but the sites of recognition differed within the 20-mers where their sequences were divergent. This work provides a basis for CD4 lymphocyte cross-recognition of gD2 and possibly cross-protection observed in previous clinical studies and in vaccine trials.
Kandeel, Mahmoud; Al-Taher, Abdulla; Li, Huifang; Schwingenschlogl, Udo; Al-Nazawi, Mohamed
2018-08-01
Structural studies of the Middle East Respiratory Syndrome Coronavirus (MERS CoV) infection process are limited. In this study, molecular dynamics (MD) simulations were carried out to unravel changes in the MERS CoV heptad repeat domains (HRs) and factors affecting fusion state HR stability. Results indicated that the HR trimer is more rapidly stabilized, having stable system energy and lower root mean square deviations (RMSDs). While trimers were the predominant active form of CoV HRs, monomers were also discovered in both viral and cellular membranes. In order to find the differences between S2 monomer and trimer molecular dynamics, the S2 monomer was modelled and subjected to MD simulation. In contrast to the S2 trimer, the S2 monomer was unstable, having high RMSDs with major drifts above 8 Å. Fluctuation of HR residue positions revealed major changes in the C-terminal of HR2 and the linker coil between HR1 and HR2 in both monomer and trimer. Hydrophobic residues at the a and d positions of HR helices stabilize the whole system, with minimal changes in RMSD. The global distance test and contact area difference scores support instability of the MERS CoV S2 monomer. Analysis of HR1-HR2 inter-residue contacts and interaction energy revealed three energy scales along HR helices. Two strong interaction energies were identified at the start of the HR2 helix and at the C-terminal of HR2. The critical residues identified by MD simulation and the residues at the a and d positions of the HR helix were strong stabilizers of HR recognition. Copyright © 2018 Elsevier Ltd. All rights reserved.
Chandra, Saket; Singh, Dharmendra; Pathak, Jyoti; Kumari, Supriya; Kumar, Manish; Poddar, Raju; Balyan, Harindra Singh; Gupta, Puspendra Kumar; Prabhu, Kumble Vinod; Mukhopadhyay, Kunal
2016-01-01
Pathogens like Puccinia triticina, the causal organism for leaf rust, extensively damage wheat production. The interaction at the molecular level between wheat and the pathogen is complex and less explored. The pathogen-induced response was characterized using mock- or pathogen-inoculated near-isogenic wheat lines (with or without seedling leaf rust resistance gene Lr28). Four Serial Analysis of Gene Expression libraries were prepared from mock- and pathogen-inoculated plants and were subjected to Sequencing by Oligonucleotide Ligation and Detection, which generated a total of 165,767,777 reads, each 35 bases long. The reads were processed and multiple k-mers were attempted for de novo transcript assembly; 22 k-mers showed the best results. Altogether 21,345 contigs were generated and functionally characterized by gene ontology annotation, mining for transcription factors and resistance genes. Expression analysis among the four libraries showed extensive alterations in the transcriptome in response to pathogen infection, reflecting reorganizations in major biological processes and metabolic pathways. The role of auxin in determining pathogenesis in susceptible and resistant lines was imperative. The qPCR expression study of four LRR-RLK (Leucine-rich repeat receptor-like protein kinases) genes showed higher expression at 24 hrs after inoculation with pathogen. In summary, the conceptual model of induced resistance in wheat contributes insights on defense responses and imparts knowledge of Puccinia triticina-induced defense transcripts in wheat plants.
Pathak, Jyoti; Kumari, Supriya; Kumar, Manish; Poddar, Raju; Balyan, Harindra Singh; Gupta, Puspendra Kumar; Prabhu, Kumble Vinod; Mukhopadhyay, Kunal
2016-01-01
Pathogens like Puccinia triticina, the causal organism for leaf rust, extensively damage wheat production. The interaction at the molecular level between wheat and the pathogen is complex and less explored. The pathogen-induced response was characterized using mock- or pathogen-inoculated near-isogenic wheat lines (with or without seedling leaf rust resistance gene Lr28). Four Serial Analysis of Gene Expression libraries were prepared from mock- and pathogen-inoculated plants and were subjected to Sequencing by Oligonucleotide Ligation and Detection, which generated a total of 165,767,777 reads, each 35 bases long. The reads were processed and multiple k-mers were attempted for de novo transcript assembly; 22 k-mers showed the best results. Altogether 21,345 contigs were generated and functionally characterized by gene ontology annotation, mining for transcription factors and resistance genes. Expression analysis among the four libraries showed extensive alterations in the transcriptome in response to pathogen infection, reflecting reorganizations in major biological processes and metabolic pathways. The role of auxin in determining pathogenesis in susceptible and resistant lines was imperative. The qPCR expression study of four LRR-RLK (Leucine-rich repeat receptor-like protein kinases) genes showed higher expression at 24 hrs after inoculation with pathogen. In summary, the conceptual model of induced resistance in wheat contributes insights on defense responses and imparts knowledge of Puccinia triticina-induced defense transcripts in wheat plants. PMID:26840746
[Detection of CRISPR and its relationship to drug resistance in Shigella].
Wang, Linlin; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Guo, Xiangjiao; Wang, Pengfei; Xi, Yuanlin; Yang, Haiyan
2015-04-04
The aims were to detect clustered regularly interspaced short palindromic repeats (CRISPR) in Shigella and to analyze their relationship to drug resistance. Four pairs of primers were used for the detection of convincing CRISPR structures CRISPR-S2 and CRISPR-S4 and questionable CRISPR structures CRISPR-S1 and CRISPR-S3 in 60 Shigella strains. All primers were designed using sequences in the CRISPR database. CRISPR Finder was used to analyze CRISPR, and susceptibilities of Shigella strains were tested by the agar diffusion method. Furthermore, we analyzed the relationship between drug resistance and CRISPR-S4. The positive rate of convincing CRISPR structures was 95%. The four CRISPR loci formed 12 spectral patterns (A-L), all of which contained convincing CRISPR structures except type K. We found one new repeat and 12 new spacers. The multi-drug resistance rate was 53.33%. We found no significant association between CRISPR-S4 and drug resistance. However, the repeat sequence of CRISPR-S4 in multi- or TE-resistance strains was mainly R4.1 with AC deletions in the 3' end, and the spacer sequences of CRISPR-S4 in multi-drug resistance strains were mainly Sp5.1, Sp6.1 and Sp7. CRISPR was common in Shigella. Variations of repeat sequences and diversities of spacer sequences might be related to drug resistance in Shigella.
Shiba, Yoshinobu; Masuda, Hirofumi; Watanabe, Naoki; Ego, Takeshi; Takagaki, Kazuchika; Ishiyama, Kouichi; Ohgi, Tadaaki; Yano, Junichi
2007-01-01
A long RNA oligomer, a 110mer with the sequence of a precursor-microRNA candidate, has been chemically synthesized in a single synthesizer run by means of standard automated phosphoramidite chemistry. The synthetic method involved the use of 2-cyanoethoxymethyl (CEM), a 2′-hydroxyl protecting group recently developed in our laboratory. We improved the methodology, introducing better coupling and capping conditions. The overall isolated yield of highly pure 110mer was 5.5%. Such a yield on a 1-μmol scale corresponds to 1 mg of product and emphasizes the practicality of the CEM method for synthesizing oligomers of more than 100 nt in sufficient quantity for biological research. We confirmed the identity of the 110mer by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry, as well as HPLC, electrophoretic methods, and RNase-digestion experiments. The 110mer also showed sense-selective specific gene-silencing activity. As far as we know, this is the longest chemically synthesized RNA oligomer reported to date. Furthermore, the identity of the 110mer was confirmed by both physicochemical and biological methods. PMID:17459888
Control of silicification by genetically engineered fusion proteins: silk-silica binding peptides.
Zhou, Shun; Huang, Wenwen; Belton, David J; Simmons, Leo O; Perry, Carole C; Wang, Xiaoqin; Kaplan, David L
2015-03-01
In the present study, an artificial spider silk gene, 6mer, derived from the consensus sequence of Nephila clavipes dragline silk gene, was fused with different silica-binding peptides (SiBPs), A1, A3 and R5, to study the impact of the fusion protein sequence chemistry on silica formation and the ability to generate a silk-silica composite in two different bioinspired silicification systems: solution-solution and solution-solid. Condensed silica nanoscale particles (600-800 nm) were formed in the presence of the recombinant silk and chimeras, which were smaller than those formed by 15mer-SiBP chimeras, revealing that the molecular weight of the silk domain correlated to the sizes of the condensed silica particles in the solution system. In addition, the chimeras (6mer-A1/A3/R5) produced smaller condensed silica particles than the control (6mer), revealing that the silica particle size formed in the solution system is controlled by the size of protein assemblies in solution. In the solution-solid interface system, silicification reactions were performed on the surface of films fabricated from the recombinant silk proteins and chimeras and then treated to induce β-sheet formation. A higher density of condensed silica formed on the films containing the lowest β-sheet content while the films with the highest β-sheet content precipitated the lowest density of silica, revealing an inverse correlation between the β-sheet secondary structure and the silica content formed on the films. Intriguingly, the 6mer-A3 showed the highest rate of silica condensation but the lowest density of silica deposition on the films, compared with 6mer-A1 and -R5, revealing antagonistic crosstalk between the silk and the SiBP domains in terms of protein assembly. These findings offer a path forward in the tailoring of biopolymer-silica composites for biomaterial related needs. Copyright © 2014 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Control of silicification by genetically engineered fusion proteins: Silk–silica binding peptides
Zhou, Shun; Huang, Wenwen; Belton, David J.; Simmons, Leo O.; Perry, Carole C.; Wang, Xiaoqin; Kaplan, David L.
2014-01-01
In the present study, an artificial spider silk gene, 6mer, derived from the consensus sequence of Nephila clavipes dragline silk gene, was fused with different silica-binding peptides (SiBPs), A1, A3 and R5, to study the impact of the fusion protein sequence chemistry on silica formation and the ability to generate a silk–silica composite in two different bioinspired silicification systems: solution–solution and solution–solid. Condensed silica nanoscale particles (600–800 nm) were formed in the presence of the recombinant silk and chimeras, which were smaller than those formed by 15mer-SiBP chimeras [1], revealing that the molecular weight of the silk domain correlated to the sizes of the condensed silica particles in the solution system. In addition, the chimeras (6mer-A1/A3/R5) produced smaller condensed silica particles than the control (6mer), revealing that the silica particle size formed in the solution system is controlled by the size of protein assemblies in solution. In the solution–solid interface system, silicification reactions were performed on the surface of films fabricated from the recombinant silk proteins and chimeras and then treated to induce β-sheet formation. A higher density of condensed silica formed on the films containing the lowest β-sheet content while the films with the highest β-sheet content precipitated the lowest density of silica, revealing an inverse correlation between the β-sheet secondary structure and the silica content formed on the films. Intriguingly, the 6mer-A3 showed the highest rate of silica condensation but the lowest density of silica deposition on the films, compared with 6mer-A1 and -R5, revealing antagonistic crosstalk between the silk and the SiBP domains in terms of protein assembly. These findings offer a path forward in the tailoring of biopolymer–silica composites for biomaterial related needs. PMID:25462851
Update on Rover Sequencing and Visualization Program
NASA Technical Reports Server (NTRS)
Cooper, Brian; Hartman, Frank; Maxwell, Scott; Yen, Jeng; Wright, John; Balacuit, Carlos
2005-01-01
The Rover Sequencing and Visualization Program (RSVP) has been updated. RSVP was reported in Rover Sequencing and Visualization Program (NPO-30845), NASA Tech Briefs, Vol. 29, No. 4 (April 2005), page 38. To recapitulate: The Rover Sequencing and Visualization Program (RSVP) is the software tool to be used in the Mars Exploration Rover (MER) mission for planning rover operations and generating command sequences for accomplishing those operations. RSVP combines three-dimensional (3D) visualization for immersive exploration of the operations area, stereoscopic image display for high-resolution examination of the downlinked imagery, and a sophisticated command-sequence editing tool for analysis and completion of the sequences. RSVP is linked with actual flight code modules for operations rehearsal to provide feedback on the expected behavior of the rover prior to committing to a particular sequence. Playback tools allow for review of both rehearsed rover behavior and downlinked results of actual rover operations. These can be displayed simultaneously for comparison of rehearsed and actual activities for verification. The primary inputs to RSVP are downlink data products from the Operations Storage Server (OSS) and activity plans generated by the science team. The activity plans are high-level goals for the next day's activities. The downlink data products include imagery, terrain models, and telemetered engineering data on rover activities and state. The Rover Sequence Editor (RoSE) component of RSVP performs activity expansion to command sequences, command creation and editing with setting of command parameters, and viewing and management of rover resources. The HyperDrive component of RSVP performs 2D and 3D visualization of the rover's environment, graphical and animated review of rover predicted and telemetered state, and creation and editing of command sequences related to mobility and Instrument Deployment Device (robotic arm) operations. 
Additionally, RoSE and HyperDrive together evaluate command sequences for potential violations of flight and safety rules. The products of RSVP include command sequences for uplink that are stored in the Distributed Object Manager (DOM) and predicted rover state histories stored in the OSS for comparison and validation of downlinked telemetry. The majority of components comprising RSVP utilize the MER command and activity dictionaries to automatically customize the system for MER activities.
Implementing Distributed Operations: A Comparison of Two Deep Space Missions
NASA Technical Reports Server (NTRS)
Mishkin, Andrew; Larsen, Barbara
2006-01-01
Two very different deep space exploration missions--Mars Exploration Rover and Cassini--have made use of distributed operations for their science teams. In the case of MER, the distributed operations capability was implemented only after the prime mission was completed, as the rovers continued to operate well in excess of their expected mission lifetimes; Cassini, designed for a mission of more than ten years, had planned for distributed operations from its inception. The rapid command turnaround timeline of MER, as well as many of the operations features implemented to support it, have proven to be conducive to distributed operations. These features include: a single science team leader during the tactical operations timeline, highly integrated science and engineering teams, processes and file structures designed to permit multiple team members to work in parallel to deliver sequencing products, web-based spacecraft status and planning reports for team-wide access, and near-elimination of paper products from the operations process. Additionally, MER has benefited from the initial co-location of its entire operations team, and from having a single Principal Investigator, while Cassini operations have had to reconcile multiple science teams distributed from before launch. Cassini has faced greater challenges in implementing effective distributed operations. Because extensive early planning is required to capture science opportunities on its tour and because sequence development takes significantly longer than sequence execution, multiple teams are contributing to multiple sequences concurrently. The complexity of integrating inputs from multiple teams is exacerbated by spacecraft operability issues and resource contention among the teams, each of which has their own Principal Investigator. 
Finally, much of the technology that MER has exploited to facilitate distributed operations was not available when the Cassini ground system was designed, although later adoption of web-based and telecommunication tools has been critical to the success of Cassini operations.
Multiplex primer prediction software for divergent targets
Gardner, Shea N.; Hiddessen, Amy L.; Williams, Peter L.; Hara, Christine; Wagner, Mark C.; Colston, Bill W.
2009-01-01
We describe a Multiplex Primer Prediction (MPP) algorithm to build multiplex compatible primer sets to amplify all members of large, diverse and unalignable sets of target sequences. The MPP algorithm is scalable to larger target sets than other available software, and it does not require a multiple sequence alignment. We applied it to questions in viral detection, and demonstrated that there are no universally conserved priming sequences among viruses and that it could require an unfeasibly large number of primers (∼3700 18-mers or ∼2000 10-mers) to generate amplicons from all sequenced viruses. We then designed primer sets separately for each viral family, and for several diverse species such as foot-and-mouth disease virus (FMDV), hemagglutinin (HA) and neuraminidase (NA) segments of influenza A virus, Norwalk virus, and HIV-1. We empirically demonstrated the application of the software with a multiplex set of 16 short (10 nt) primers designed to amplify the Poxviridae family to produce a specific amplicon from vaccinia virus. PMID:19759213
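The idea of covering a set of unaligned targets with a small number of short oligos can be illustrated with a greedy set-cover sketch. This is not the MPP algorithm itself, whose details the abstract does not give; the function name `greedy_kmer_cover` and the rule "pick the k-mer present in the most still-uncovered targets" are assumptions made for illustration, and real primer design would also need paired forward and reverse primers, melting-temperature limits and amplicon-length constraints.

```python
def greedy_kmer_cover(targets, k):
    """Greedily pick k-mers until every target contains at least one of them.

    Hypothetical illustration only: returns (chosen_kmers, uncovered_indices).
    """
    # Map each k-mer to the set of target indices that contain it.
    kmer_to_targets = {}
    for idx, seq in enumerate(targets):
        for i in range(len(seq) - k + 1):
            kmer_to_targets.setdefault(seq[i:i + k], set()).add(idx)

    uncovered = set(range(len(targets)))
    chosen = []
    if not kmer_to_targets:
        return chosen, uncovered

    while uncovered:
        # Sorting makes tie-breaking deterministic (alphabetically first k-mer wins).
        best = max(sorted(kmer_to_targets),
                   key=lambda km: len(kmer_to_targets[km] & uncovered))
        gained = kmer_to_targets[best] & uncovered
        if not gained:
            break  # remaining targets are shorter than k and cannot be covered
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


if __name__ == "__main__":
    primers, missed = greedy_kmer_cover(["ACGTAC", "TTACGT", "GGGGCC"], k=4)
    print(primers, missed)
```

No multiple sequence alignment is needed for this kind of cover, which is the property the abstract emphasizes; the number of oligos chosen grows as the targets become more divergent.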
Rover Sequencing and Visualization Program
NASA Technical Reports Server (NTRS)
Cooper, Brian; Hartman, Frank; Maxwell, Scott; Yen, Jeng; Wright, John; Balacuit, Carlos
2005-01-01
The Rover Sequencing and Visualization Program (RSVP) is the software tool for use in the Mars Exploration Rover (MER) mission for planning rover operations and generating command sequences for accomplishing those operations. RSVP combines three-dimensional (3D) visualization for immersive exploration of the operations area, stereoscopic image display for high-resolution examination of the downlinked imagery, and a sophisticated command-sequence editing tool for analysis and completion of the sequences. RSVP is linked with actual flight-code modules for operations rehearsal to provide feedback on the expected behavior of the rover prior to committing to a particular sequence. Playback tools allow for review of both rehearsed rover behavior and downlinked results of actual rover operations. These can be displayed simultaneously for comparison of rehearsed and actual activities for verification. The primary inputs to RSVP are downlink data products from the Operations Storage Server (OSS) and activity plans generated by the science team. The activity plans are high-level goals for the next day's activities. The downlink data products include imagery, terrain models, and telemetered engineering data on rover activities and state. The Rover Sequence Editor (RoSE) component of RSVP performs activity expansion to command sequences, command creation and editing with setting of command parameters, and viewing and management of rover resources. The HyperDrive component of RSVP performs 2D and 3D visualization of the rover's environment, graphical and animated review of rover-predicted and telemetered state, and creation and editing of command sequences related to mobility and Instrument Deployment Device (IDD) operations. Additionally, RoSE and HyperDrive together evaluate command sequences for potential violations of flight and safety rules. 
The products of RSVP include command sequences for uplink that are stored in the Distributed Object Manager (DOM) and predicted rover state histories stored in the OSS for comparison and validation of downlinked telemetry. The majority of components comprising RSVP utilize the MER command and activity dictionaries to automatically customize the system for MER activities. Thus, RSVP, being highly data driven, may be tailored to other missions with minimal effort. In addition, RSVP uses a distributed, message-passing architecture to allow multitasking, and collaborative visualization and sequence development by scattered team members.
Fatty Acid Profile and Unigene-Derived Simple Sequence Repeat Markers in Tung Tree (Vernicia fordii)
Zhang, Lin; Jia, Baoguang; Tan, Xiaofeng; Thammina, Chandra S.; Long, Hongxu; Liu, Min; Wen, Shanna; Song, Xianliang; Cao, Heping
2014-01-01
Tung tree (Vernicia fordii) provides the sole source of tung oil widely used in industry. Lack of fatty acid composition and molecular markers hinders biochemical, genetic and breeding research. The objectives of this study were to determine fatty acid profiles and develop unigene-derived simple sequence repeat (SSR) markers in tung tree. Fatty acid profiles of 41 accessions showed that the ratio of α-eleostearic acid was increasing continuously with a parallel trend to the amount of tung oil accumulation while the ratios of other fatty acids were decreasing in different stages of the seeds and that α-eleostearic acid (18∶3) consisted of 77% of the total fatty acids in tung oil. Transcriptome sequencing identified 81,805 unigenes from tung cDNA library constructed using seed mRNA and discovered 6,366 SSRs in 5,404 unigenes. The di- and tri-nucleotide microsatellites accounted for 92% of the SSRs with AG/CT and AAG/CTT being the most abundant SSR motifs. Fifteen polymorphic genic-SSR markers were developed from 98 unigene loci tested in 41 cultivated tung accessions by agarose gel and capillary electrophoresis. Genbank database search identified 10 of them putatively coding for functional proteins. Quantitative PCR demonstrated that all 15 polymorphic SSR-associated unigenes were expressed in tung seeds and some of them were highly correlated with oil composition in the seeds. Dendrogram revealed that most of the 41 accessions were clustered according to the geographic region. These new polymorphic genic-SSR markers will facilitate future studies on genetic diversity, molecular fingerprinting, comparative genomics and genetic mapping in tung tree. The lipid profiles in the seeds of 41 tung accessions will be valuable for biochemical and breeding studies. PMID:25167054
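Locating di- and tri-nucleotide microsatellites such as the AG/CT and AAG/CTT motifs mentioned above can be sketched with a regular expression. The abstract does not say which SSR-mining tool or repeat thresholds were used, so the function name `find_ssrs`, the example sequence and the minimum repeat counts below are all assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, motif_len, min_repeats):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper: scans one strand only and ignores imperfect repeats.
    """
    # ([ACGT]{n}) captures a motif; \1{m,} requires at least m further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


if __name__ == "__main__":
    unigene = "TTAGAGAGAGAGCCAAGAAGAAGAAGTT"  # made-up sequence
    print(find_ssrs(unigene, motif_len=2, min_repeats=5))
    print(find_ssrs(unigene, motif_len=3, min_repeats=4))
```

Counting hits per motif across all unigenes would give the kind of motif-abundance summary the abstract reports, provided complementary motifs (for example AG and CT) are pooled.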
Structural Analysis of the Hg(II)-Regulatory Protein Tn501 MerR from Pseudomonas aeruginosa
Wang, Dan; Huang, Shanqing; Liu, Pingying; Liu, Xichun; He, Yafeng; Chen, Weizhong; Hu, Qingyuan; Wei, Tianbiao; Gan, Jianhua; Ma, Jing; Chen, Hao
2016-01-01
The metalloprotein MerR is a mercury(II)-dependent transcriptional repressor-activator that responds to mercury(II) with extraordinary sensitivity and selectivity. It’s widely distributed in both Gram-negative and Gram-positive bacteria but with barely detectable sequence identities between the two sources. To provide structural basis for the considerable biochemical and biophysical experiments previously performed on Tn501 and Tn21 MerR from Gram-negative bacteria, we analyzed the crystal structure of mercury(II)-bound Tn501 MerR. The structure in the metal-binding domain provides Tn501 MerR with a high affinity for mercury(II) and the ability to distinguish mercury(II) from other metals with its unique planar trigonal coordination geometry, which is adopted by both Gram-negative and Gram-positive bacteria. The mercury(II) coordination state in the C-terminal metal-binding domain is transmitted through the allosteric network across the dimer interface to the N-terminal DNA-binding domain. Together with the previous mutagenesis analyses, the present data indicate that the residues in the allosteric pathway have a central role in maintaining the functions of Tn501 MerR. In addition, the complex structure exhibits significant differences in tertiary and quaternary structural arrangements compared to those of Bacillus MerR from Gram-positive bacteria, which probably enable them to function with specific promoter DNA with different spacers between −35 and −10 elements. PMID:27641146
2014-01-01
Background Leptotrombidium pallidum and Leptotrombidium scutellare are the major vector mites for Orientia tsutsugamushi, the causative agent of scrub typhus. Before these organisms can be subjected to whole-genome sequencing, it is necessary to estimate their genome sizes to obtain basic information for establishing the strategies that should be used for genome sequencing and assembly. Method The genome sizes of L. pallidum and L. scutellare were estimated by a method based on quantitative real-time PCR. In addition, a k-mer analysis of the whole-genome sequences obtained through Illumina sequencing was conducted to verify the mutual compatibility and reliability of the results. Results The genome sizes estimated using qPCR were 191 ± 7 Mb for L. pallidum and 262 ± 13 Mb for L. scutellare. The k-mer analysis-based genome lengths were estimated to be 175 Mb for L. pallidum and 286 Mb for L. scutellare. The estimates from these two independent methods were mutually complementary and within a similar range to those of other Acariform mites. Conclusions The estimation method based on qPCR appears to be a useful alternative when the standard methods, such as flow cytometry, are impractical. The relatively small estimated genome sizes should facilitate whole-genome analysis, which could contribute to our understanding of Arachnida genome evolution and provide key information for scrub typhus prevention and mite vector competence. PMID:24947244
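A k-mer based genome size estimate of the kind described above is commonly obtained by dividing the total number of k-mers observed in the reads by the k-mer depth at the main peak of the depth histogram. The sketch below shows that arithmetic on toy data; it is not the pipeline used in the study, and the function names, the `min_depth` error cutoff and the toy reads are assumptions. It also ignores reverse complements and heterozygosity.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurrence across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(kmer_counts, min_depth=2):
    """Estimate genome size as (total k-mers) / (peak k-mer depth).

    k-mers seen fewer than min_depth times are treated as sequencing errors
    (an assumed, adjustable cutoff).
    """
    # histogram[d] = number of distinct k-mers observed exactly d times
    histogram = Counter(kmer_counts.values())
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=lambda d: (usable[d], d))
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


if __name__ == "__main__":
    genome = "ACGTTGCAAGGCTTAC"   # toy genome with 12 distinct 5-mers
    reads = [genome] * 10         # idealized 10x coverage
    print(estimate_genome_size(count_kmers(reads, k=5)))
```

On real Illumina data the counting step would normally be done with a dedicated k-mer counter, and the result is sensitive to the choice of k and to repeats, which is one reason an independent qPCR estimate is a useful cross-check.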
Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations
Marinier, Eric; Zaheer, Rahat; Berry, Chrystal; Weedmark, Kelly A.; Domaratzki, Michael; Mabon, Philip; Knox, Natalie C.; Reimer, Aleisha R.; Graham, Morag R.; Chui, Linda; Patterson-Fortin, Laura; Zhang, Jian; Pagotto, Franco; Farber, Jeff; Mahony, Jim; Seyer, Karine; Bekal, Sadjia; Tremblay, Cécile; Isaac-Renton, Judy; Prystajecky, Natalie; Chen, Jessica; Slade, Peter
2017-01-01
Abstract The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune. PMID:29048594
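The idea of sequences that are "sufficiently common" in targets and "sufficiently absent" from non-targets can be sketched with plain k-mer sets. This is a simplified illustration that uses a fixed fraction threshold and exact matching only; it is not Neptune's probabilistic model, and the function names and parameters are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(targets, non_targets, k, min_target_fraction=1.0):
    """K-mers found in at least min_target_fraction of the targets
    and in none of the non-targets (a fixed-threshold assumption)."""
    target_sets = [kmers(s, k) for s in targets]
    excluded = set().union(*(kmers(s, k) for s in non_targets))
    counts = {}
    for ks in target_sets:
        for km in ks:
            counts[km] = counts.get(km, 0) + 1
    needed = min_target_fraction * len(target_sets)
    return {km for km, c in counts.items()
            if c >= needed and km not in excluded}


targets = ["ACGTTGCA", "TTACGTTG"]
non_targets = ["ACGTAAAA"]
print(sorted(signature_kmers(targets, non_targets, k=4)))
```

Here `ACGT` is shared by both targets but also occurs in the non-target, so only `CGTT` and `GTTG` survive.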
Sel1-like repeat proteins in signal transduction.
Mittl, Peer R E; Schneider-Brachert, Wulf
2007-01-01
Solenoid proteins, which are distinguished from general globular proteins by their modular architectures, are frequently involved in signal transduction pathways. Proteins from the tetratricopeptide repeat (TPR) and Sel1-like repeat (SLR) families share similar alpha-helical conformations but different consensus sequence lengths and superhelical topologies. Both families are characterized by low sequence similarity levels, rendering the identification of functional homologues difficult. Therefore current knowledge of the molecular and cellular functions of the SLR proteins Sel1, Hrd3, Chs4, Nif1, PodJ, ExoR, AlgK, HcpA, Hsp12, EnhC, LpnE, MotX, and MerG has been reviewed. Although SLR proteins possess different cellular functions, they all seem to serve as adaptor proteins for the assembly of macromolecular complexes. Sel1, Hrd3, Hsp12 and LpnE are activated under cellular stress. The eukaryotic Sel1 and Hrd3 proteins are involved in ER-associated protein degradation, whereas the bacterial LpnE, EnhC, HcpA, ExoR, and AlgK proteins mediate the interactions between bacterial and eukaryotic host cells. LpnE and EnhC are responsible for the entry of L. pneumophila into epithelial cells and macrophages. ExoR from the symbiotic microorganism S. meliloti and AlgK from the pathogen P. aeruginosa regulate exopolysaccharide synthesis. Nif1 and Chs4 from yeast are responsible for the regulation of mitosis and septum formation during cell division, respectively, and PodJ guides the cellular differentiation during the cell cycle of the bacterium C. crescentus. Taken together, the SLR motif establishes a link between signal transduction pathways from eukaryotes and bacteria. The SLR motif is so far absent from archaea. Therefore the SLR could have developed in the last common ancestor between eukaryotes and bacteria.
Zseq: An Approach for Preprocessing Next-Generation Sequencing Data.
Alkhateeb, Abedalrhman; Rueda, Luis
2017-08-01
Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique k-mers in each sequence as its corresponding score and also takes into account other factors such as ambiguous nucleotides or high GC-content percentage in k-mers. Based on a z-score threshold, Zseq sweeps through the sequences again and filters those with a z-score less than the user-defined threshold. The Zseq algorithm is able to provide a better mapping rate; it reduces the number of ambiguous bases significantly in comparison with other methods. Evaluation of the filtered reads has been conducted by aligning the reads and assembling the transcripts using the reference genome as well as de novo assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Moreover, de novo assembled transcripts from the reads filtered by Zseq have longer genomic sequences than other tested methods. Estimating the threshold of the cutoff point is introduced using labeling rules with optimistic results.
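The scoring and filtering idea described above can be sketched as follows: score each read by its number of unique k-mers, convert the scores to z-scores, and drop reads below a threshold. This is a minimal sketch under stated assumptions, not the Zseq implementation. It skips k-mers containing `N` but does not model the GC-content factor, and the function names, `k=3`, and the default threshold are chosen only for illustration.

```python
from statistics import mean, pstdev


def unique_kmer_score(read, k):
    """Number of distinct k-mers in read, ignoring k-mers that contain N."""
    return len({read[i:i + k]
                for i in range(len(read) - k + 1)
                if "N" not in read[i:i + k]})


def filter_by_zscore(reads, k=3, threshold=-1.0):
    """Keep reads whose complexity z-score is at or above threshold."""
    scores = [unique_kmer_score(r, k) for r in reads]
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation
    if sigma == 0:
        # All reads are equally complex; nothing to filter.
        return list(reads)
    return [r for r, s in zip(reads, scores) if (s - mu) / sigma >= threshold]


reads = ["ACGTACGGTCA", "AAAAAAAAAAA", "ACGTTGCAGTC", "ACGGATCCTGA"]
print(filter_by_zscore(reads))
```

In this toy input the homopolymer read has a single unique 3-mer, so its z-score falls well below the threshold and it is removed.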
Cognitive Dissonance as an Instructional Tool for Understanding Chemical Representations
ERIC Educational Resources Information Center
Corradi, David; Clarebout, Geraldine; Elen, Jan
2015-01-01
Previous research on multiple external representations (MER) indicates that sequencing representations (compared with presenting them as a whole) can, in some cases, increase conceptual understanding if there is interference between internal and external representations. We tested this mechanism by sequencing different combinations of scientific…
Geldenhuys, Marike; Mortlock, Marinda; Weyer, Jacqueline; Bezuidt, Oliver; Seamark, Ernest C. J.; Kearney, Teresa; Gleasner, Cheryl; Erkkila, Tracy H.; Cui, Helen; Markotter, Wanda
2018-01-01
Species within the Neoromicia bat genus are abundant and widely distributed in Africa. It is common for these insectivorous bats to roost in anthropogenic structures in urban regions. Additionally, Neoromicia capensis have previously been identified as potential hosts for Middle East respiratory syndrome (MERS)-related coronaviruses. This study aimed to ascertain the gastrointestinal virome of these bats, as viruses excreted in fecal material or which may be replicating in rectal or intestinal tissues have the greatest opportunities of coming into contact with other hosts. Samples were collected in five regions of South Africa over eight years. Initial virome composition was determined by viral metagenomic sequencing by pooling samples and enriching for viral particles. Libraries were sequenced on the Illumina MiSeq and NextSeq500 platforms, producing a combined 37 million reads. Bioinformatics analysis of the high throughput sequencing data detected the full genome of a novel species of the Circoviridae family, and also identified sequence data from the Adenoviridae, Coronaviridae, Herpesviridae, Parvoviridae, Papillomaviridae, Phenuiviridae, and Picornaviridae families. Metagenomic sequencing data was insufficient to determine the viral diversity of certain families due to the fragmented coverage of genomes and lack of suitable sequencing depth, as some viruses were detected from the analysis of reads-data only. Follow up conventional PCR assays targeting conserved gene regions for the Adenoviridae, Coronaviridae, and Herpesviridae families were used to confirm metagenomic data and generate additional sequences to determine genetic diversity. The complete coding genome of a MERS-related coronavirus was recovered with additional amplicon sequencing on the MiSeq platform. The new genome shared 97.2% overall nucleotide identity to a previous Neoromicia-associated MERS-related virus, also from South Africa. 
Conventional PCR analysis detected diverse adenovirus and herpesvirus sequences that were widespread throughout Neoromicia populations in South Africa. Furthermore, similar adenovirus sequences were detected within these populations throughout several years. With the exception of the coronaviruses, the study represents the first report of sequence data from several viral families within a Southern African insectivorous bat genus; highlighting the need for continued investigations in this regard. PMID:29579103
Inhibitor recognition specificity of MERS-CoV papain-like protease may differ from that of SARS-CoV.
Lee, Hyun; Lei, Hao; Santarsiero, Bernard D; Gatuz, Joseph L; Cao, Shuyi; Rice, Amy J; Patel, Kavankumar; Szypulinski, Michael Z; Ojeda, Isabel; Ghosh, Arun K; Johnson, Michael E
2015-06-19
The Middle East Respiratory Syndrome coronavirus (MERS-CoV) papain-like protease (PLpro) blocking loop 2 (BL2) structure differs significantly from that of SARS-CoV PLpro, where it has been proven to play a crucial role in SARS-CoV PLpro inhibitor binding. Four SARS-CoV PLpro lead inhibitors were tested against MERS-CoV PLpro, none of which were effective against MERS-CoV PLpro. Structure and sequence alignments revealed that two residues, Y269 and Q270, responsible for inhibitor binding to SARS-CoV PLpro, were replaced by T274 and A275 in MERS-CoV PLpro, making critical binding interactions difficult to form for similar types of inhibitors. High-throughput screening (HTS) of 25 000 compounds against both PLpro enzymes identified a small fragment-like noncovalent dual inhibitor. Mode of inhibition studies by enzyme kinetics and competition surface plasmon resonance (SPR) analyses suggested that this compound acts as a competitive inhibitor with an IC50 of 6 μM against MERS-CoV PLpro, indicating that it binds to the active site, whereas it acts as an allosteric inhibitor against SARS-CoV PLpro with an IC50 of 11 μM. These results raised the possibility that inhibitor recognition specificity of MERS-CoV PLpro may differ from that of SARS-CoV PLpro. In addition, inhibitory activity of this compound was selective for SARS-CoV and MERS-CoV PLpro enzymes over two human homologues, the ubiquitin C-terminal hydrolases 1 and 3 (hUCH-L1 and hUCH-L3).
Catalysis in prebiotic chemistry RNA synthesis
NASA Astrophysics Data System (ADS)
Ferris, J.; Joshi, P.; Wang, K.; Huang, W.; Miyakawa, S.
It is proposed that catalysis by minerals and metal ions had a central role in the steps that led to the origins of life. In particular, the formation of biopolymers in the presence of water requires catalysis to compete with hydrolytic processes. Catalysis is required to limit the number of isomers generated so that the longer polymers necessary for the origins of life formed. Montmorillonite clay catalyzes the formation of 6-14 mers of RNA from activated monomers of A, G, U and C in aqueous solution. Daily addition of activated monomers to a 10-mer primer results in the formation of 40-50 mers of adenylic acid and 30 mers of uridylic acid. The sequence selectivity and regioselectivity in phosphodiester bond formation result from the montmorillonite catalysis. Reaction of D,L-activated monomers of A and U leads to the preferential formation of homochiral dimers (e.g., D,D- and L,L-pApA). These data and any more recent developments will be discussed.
Lau, Susanna K P; Li, Kenneth S M; Tsang, Alan K L; Lam, Carol S F; Ahmed, Shakeel; Chen, Honglin; Chan, Kwok-Hung; Woo, Patrick C Y; Yuen, Kwok-Yung
2013-08-01
While the novel Middle East respiratory syndrome coronavirus (MERS-CoV) is closely related to Tylonycteris bat CoV HKU4 (Ty-BatCoV HKU4) and Pipistrellus bat CoV HKU5 (Pi-BatCoV HKU5) in bats from Hong Kong, and other potential lineage C betacoronaviruses in bats from Africa, Europe, and America, its animal origin remains obscure. To better understand the role of bats in its origin, we examined the molecular epidemiology and evolution of lineage C betacoronaviruses among bats. Ty-BatCoV HKU4 and Pi-BatCoV HKU5 were detected in 29% and 25% of alimentary samples from lesser bamboo bat (Tylonycteris pachypus) and Japanese pipistrelle (Pipistrellus abramus), respectively. Sequencing of their RNA polymerase (RdRp), spike (S), and nucleocapsid (N) genes revealed that MERS-CoV is more closely related to Pi-BatCoV HKU5 in RdRp (92.1% to 92.3% amino acid [aa] identity) but is more closely related to Ty-BatCoV HKU4 in S (66.8% to 67.4% aa identity) and N (71.9% to 72.3% aa identity). Although both viruses were under purifying selection, the S of Pi-BatCoV HKU5 displayed marked sequence polymorphisms and more positively selected sites than that of Ty-BatCoV HKU4, suggesting that Pi-BatCoV HKU5 may generate variants to occupy new ecological niches along with its host in diverse habitats. Molecular clock analysis showed that they diverged from a common ancestor with MERS-CoV at least several centuries ago. Although MERS-CoV may have diverged from potential lineage C betacoronaviruses in European bats more recently, these bat viruses were unlikely to be the direct ancestor of MERS-CoV. Intensive surveillance for lineage C betaCoVs in Pipistrellus and related bats with diverse habitats and other animals in the Middle East may fill the evolutionary gap.
Lau, Susanna K. P.; Li, Kenneth S. M.; Tsang, Alan K. L.; Lam, Carol S. F.; Ahmed, Shakeel; Chen, Honglin; Chan, Kwok-Hung
2013-01-01
While the novel Middle East respiratory syndrome coronavirus (MERS-CoV) is closely related to Tylonycteris bat CoV HKU4 (Ty-BatCoV HKU4) and Pipistrellus bat CoV HKU5 (Pi-BatCoV HKU5) in bats from Hong Kong, and other potential lineage C betacoronaviruses in bats from Africa, Europe, and America, its animal origin remains obscure. To better understand the role of bats in its origin, we examined the molecular epidemiology and evolution of lineage C betacoronaviruses among bats. Ty-BatCoV HKU4 and Pi-BatCoV HKU5 were detected in 29% and 25% of alimentary samples from lesser bamboo bat (Tylonycteris pachypus) and Japanese pipistrelle (Pipistrellus abramus), respectively. Sequencing of their RNA polymerase (RdRp), spike (S), and nucleocapsid (N) genes revealed that MERS-CoV is more closely related to Pi-BatCoV HKU5 in RdRp (92.1% to 92.3% amino acid [aa] identity) but is more closely related to Ty-BatCoV HKU4 in S (66.8% to 67.4% aa identity) and N (71.9% to 72.3% aa identity). Although both viruses were under purifying selection, the S of Pi-BatCoV HKU5 displayed marked sequence polymorphisms and more positively selected sites than that of Ty-BatCoV HKU4, suggesting that Pi-BatCoV HKU5 may generate variants to occupy new ecological niches along with its host in diverse habitats. Molecular clock analysis showed that they diverged from a common ancestor with MERS-CoV at least several centuries ago. Although MERS-CoV may have diverged from potential lineage C betacoronaviruses in European bats more recently, these bat viruses were unlikely to be the direct ancestor of MERS-CoV. Intensive surveillance for lineage C betaCoVs in Pipistrellus and related bats with diverse habitats and other animals in the Middle East may fill the evolutionary gap. PMID:23720729
High-Throughput Block Optical DNA Sequence Identification.
Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant
2018-01-01
Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm⁻¹. Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
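To make the "block content" representation concrete, the sketch below reduces a sequence to per-block counts of A, T, G, and C. Using non-overlapping blocks and dropping a trailing partial block are assumptions of this illustration, as is the function name; the abstract does not specify how blocks are laid out.

```python
from collections import Counter


def block_content(seq, k):
    """Split seq into consecutive non-overlapping k-mers and return, for each,
    a tuple of (A, T, G, C) counts. A trailing partial block is dropped."""
    blocks = [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]
    result = []
    for block in blocks:
        counts = Counter(block)
        result.append(tuple(counts[base] for base in "ATGC"))
    return result


print(block_content("AATGCCGT", 4))
# Different orderings of the same bases give the same signature,
# which is why the representation is lossy.
print(block_content("AATG", 4) == block_content("GATA", 4))
```

The second call shows why the abstract calls the compression lossy: base order within a block is not retained.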
G-triplex structure and formation propensity
Cerofolini, Linda; Amato, Jussara; Giachetti, Andrea; Limongelli, Vittorio; Novellino, Ettore; Parrinello, Michele; Fragai, Marco; Randazzo, Antonio; Luchinat, Claudio
2014-01-01
The occurrence of a G-triplex folding intermediate of thrombin binding aptamer (TBA) has been recently predicted by metadynamics calculations, and experimentally supported by Nuclear Magnetic Resonance (NMR), Circular Dichroism (CD) and Differential Scanning Calorimetry (DSC) data collected on a 3′ end TBA-truncated 11-mer oligonucleotide (11-mer-3′-t-TBA). Here we present the solution structure of 11-mer-3′-t-TBA in the presence of potassium ions. This structure is the first experimental example of a G-triplex folding, where a network of Hoogsteen-like hydrogen bonds stabilizes six guanines to form two G:G:G triad planes. The G-triplex folding of 11-mer-3′-t-TBA is stabilized by the potassium ion and destabilized by increasing the temperature. The superimposition of the experimental structure with that predicted by metadynamics shows a great similarity, with the only significant differences involving two loops. These new structural data show that 11-mer-3′-t-TBA assumes a G-triplex DNA conformation as its stable form, reinforcing the idea that G-triplex folding intermediates may occur in vivo in human guanine-rich sequences. NMR and CD screening of eight different constructs obtained by removing from one to four bases at either the 3′ or the 5′ end shows that only the 11-mer-3′-t-TBA yields a relatively stable G-triplex. PMID:25378342
RAD tag sequencing as a source of SNP markers in Cynara cardunculus L
2012-01-01
Background The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole-genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach generated a large and robust SNP dataset through the adoption of optimized filtering criteria. PMID:22214349
Abuhammad, Areej; Al-Aqtash, Rua'a A; Anson, Brandon J; Mesecar, Andrew D; Taha, Mutasem O
2017-11-01
The Middle East respiratory syndrome coronavirus (MERS-CoV) is an emerging virus that poses a major challenge to clinical management. The 3C-like protease (3CLpro) is essential for viral replication and thus represents a potential target for antiviral drug development. Presently, very few data are available on MERS-CoV 3CLpro inhibition by small molecules. We conducted extensive exploration of the pharmacophoric space of a recently identified set of peptidomimetic inhibitors of the bat HKU4-CoV 3CLpro. HKU4-CoV 3CLpro shares high sequence identity (81%) with the MERS-CoV enzyme and thus represents a potential surrogate model for anti-MERS drug discovery. We used 2 well-established methods: Quantitative structure-activity relationship (QSAR)-guided modeling and docking-based comparative intermolecular contacts analysis. The established pharmacophore models highlight structural features needed for ligand recognition and revealed important binding-pocket regions involved in 3CLpro-ligand interactions. The best models were used as 3D queries to screen the National Cancer Institute database for novel nonpeptidomimetic 3CLpro inhibitors. The identified hits were tested for HKU4-CoV and MERS-CoV 3CLpro inhibition. Two hits, which share the phenylsulfonamide fragment, showed moderate inhibitory activity against the MERS-CoV 3CLpro and represent a potential starting point for the development of novel anti-MERS agents. To the best of our knowledge, this is the first pharmacophore modeling study supported by in vitro validation on the MERS-CoV 3CLpro. MERS-CoV is an emerging virus that is closely related to the bat HKU4-CoV. 3CLpro is a potential drug target for coronavirus infection. HKU4-CoV 3CLpro is a useful surrogate model for the identification of MERS-CoV 3CLpro enzyme inhibitors. dbCICA is a very robust modeling method for hit identification. 
The phenylsulfonamide scaffold represents a potential starting point for MERS coronavirus 3CLpro inhibitor development. Copyright © 2017 John Wiley & Sons, Ltd.
2013-01-01
Background Hybridization-based assays and capture systems depend on the specificity of hybridization between a probe and its intended target. A common guideline in the construction of DNA microarrays, for instance, is that avoiding complementary stretches of more than 15 nucleotides in a 50- or 60-mer probe will eliminate sequence-specific cross-hybridization reactions. Here we present a study of the behavior of partially matched oligonucleotide pairs with complementary stretches starting well below this threshold complementarity length – in silico, in solution, and at the microarray surface. The modeled behavior of pairs of oligonucleotide probes and their targets suggests that even a complementary stretch of sequence 12 nt in length would give rise to specific cross-hybridization. We designed a set of binding partners to a 50-mer oligonucleotide containing complementary stretches from 6 nt to 21 nt in length. Results Solution melting experiments demonstrate that stable partial duplexes can form when only 12 bp of complementary sequence are present; surface hybridization experiments confirm that a signal close in magnitude to full-strength signal can be obtained from hybridization of a 12 bp duplex within a 50-mer oligonucleotide. Conclusions Microarray and other molecular capture strategies that rely on a 15 nt lower complementarity bound for eliminating specific cross-hybridization may not be sufficiently conservative. PMID:23445545
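A probe-design check in the spirit of the guideline discussed above is to measure the longest perfectly complementary stretch between a probe and another oligonucleotide. The sketch below does this by finding the longest common substring between the probe and the reverse complement of the other sequence. The function names are hypothetical, and the check considers only uninterrupted Watson-Crick matches, with no thermodynamic modeling of the kind the study describes.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_complementary_stretch(probe, other):
    """Length of the longest uninterrupted stretch of probe that can
    base-pair with other (both given 5' to 3')."""
    target = reverse_complement(other)
    best = 0
    prev = [0] * (len(target) + 1)
    # Classic longest-common-substring dynamic programme, row by row.
    for a in probe:
        cur = [0] * (len(target) + 1)
        for j, b in enumerate(target, 1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


probe = "GGGACGTTCAGGG"
partner = reverse_complement("TTACGTTCATT")
print(longest_complementary_stretch(probe, partner))
```

A design pipeline could compare this value against a chosen bound; the abstract's conclusion is that a bound of 15 nt may be too permissive, since 12 nt already gave strong signal.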
Mango: multiple alignment with N gapped oligos.
Zhang, Zefeng; Lin, Hao; Li, Ming
2008-06-01
Multiple sequence alignment is a classical and challenging task. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art works suffer from the "once a gap, always a gap" phenomenon. Is there a radically new way to do multiple sequence alignment? In this paper, we introduce a novel and orthogonal multiple sequence alignment method, using both multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole and tries to build the alignment vertically, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds have proved significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks, showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0, and Kalign 2.0. We have further demonstrated the scalability of MANGO on very large datasets of repeat elements. MANGO can be downloaded at http://www.bioinfo.org.cn/mango/ and is free for academic usage.
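The difference between a consecutive k-mer and a spaced seed can be shown with a small sketch. A seed pattern such as `1101` requires matches at the positions marked `1` and ignores the position marked `0`. The pattern, function names, and hash-index approach are illustrative assumptions and are not taken from MANGO, whose optimized seeds and algorithms the abstract does not detail.

```python
def spaced_key(seq, pos, pattern):
    """Characters of seq at the '1' positions of pattern, starting at pos."""
    return "".join(seq[pos + i] for i, c in enumerate(pattern) if c == "1")


def seed_hits(a, b, pattern):
    """All (i, j) pairs where a[i:] and b[j:] agree at every '1' position."""
    span = len(pattern)
    index = {}
    for i in range(len(a) - span + 1):
        index.setdefault(spaced_key(a, i, pattern), []).append(i)
    hits = []
    for j in range(len(b) - span + 1):
        for i in index.get(spaced_key(b, j, pattern), []):
            hits.append((i, j))
    return hits


a = "ACGTAC"
b = "ACTTAC"  # differs from a at one position
print(seed_hits(a, b, "111"))   # consecutive 3-mer
print(seed_hits(a, b, "1101"))  # spaced seed of the same weight
```

Both seeds have weight three, but the spaced seed also finds the hit at the start of the sequences because the mismatch falls on its don't-care position.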
Palmer, Lance E; Dejori, Mathaeus; Bolanos, Randall; Fasulo, Daniel
2010-01-15
With the rapid expansion of DNA sequencing databases, it is now feasible to identify relevant information from prior sequencing projects and completed genomes and apply it to de novo sequencing of new organisms. As an example, this paper demonstrates how such extra information can be used to improve de novo assemblies by augmenting the overlapping step. Finding all pairs of overlapping reads is a key task in many genome assemblers, and to this end, highly efficient algorithms have been developed to find alignments in large collections of sequences. It is well known that due to repeated sequences, many aligned pairs of reads nevertheless do not overlap. But no overlapping algorithm to date takes a rigorous approach to separating aligned but non-overlapping read pairs from true overlaps. We present an approach that extends the Minimus assembler by a data driven step to classify overlaps as true or false prior to contig construction. We trained several different classification models within the Weka framework using various statistics derived from overlaps of reads available from prior sequencing projects. These statistics included percent mismatch and k-mer frequencies within the overlaps as well as a comparative genomics score derived from mapping reads to multiple reference genomes. We show that in real whole-genome sequencing data from the E. coli and S. aureus genomes, by providing a curated set of overlaps to the contigging phase of the assembler, we nearly doubled the median contig length (N50) without sacrificing coverage of the genome or increasing the number of mis-assemblies. Machine learning methods that use comparative and non-comparative features to classify overlaps as true or false can be used to improve the quality of a sequence assembly.
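The abstract glosses N50 as a median contig length. As conventionally computed, N50 is a length-weighted statistic: the contig length at which the running total of contigs, sorted from longest to shortest, first reaches half of the total assembled length. A minimal sketch of that standard calculation follows; the function name is an assumption, and nothing here is specific to Minimus.

```python
def n50(lengths):
    """Length L such that contigs of length >= L cover at least half
    of the total assembled bases."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

For these nine contigs the plain median is 6, while the N50 is 8, which shows how the weighting favors longer contigs.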
Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)
Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn
2009-01-01
Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs and polyadenylation signal variation, and identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). 
This suggests that the remaining cDNA libraries generated by SGP represent a valuable cCDS FLIc source. The conservation of 7-mers in 3'UTRs indicates that these motifs are functionally important. Identity between some of these 7-mers and miRNA target sequences suggests that they are miRNA targets in Salmo salar transcripts as well. PMID:19878547
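The search for conserved 7-mers in 3'UTRs mentioned above can be illustrated with a minimal sketch. It simply reports the 7-mers that occur in at least a chosen fraction of a set of sequences. The function name `kmers_shared_across`, the `min_fraction` threshold and the toy sequences are hypothetical and are not taken from the paper, whose actual conservation criterion is not given in the abstract.

```python
from collections import Counter


def kmers_shared_across(sequences, k=7, min_fraction=0.5):
    """Return {k-mer: number of sequences containing it} for k-mers
    found in at least min_fraction of the input sequences."""
    presence = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each k-mer is counted once per sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        presence.update(seen)
    cutoff = min_fraction * len(sequences)
    return {kmer: n for kmer, n in presence.items() if n >= cutoff}


# Toy stand-ins for 3'UTR sequences (illustrative only).
utrs = ["AAGCACTTTAGG", "CCGCACTTTACC", "TTTTTTTTTTTT"]
shared = kmers_shared_across(utrs, k=7, min_fraction=0.6)
```

With these toy inputs, only the two 7-mers common to the first two sequences pass the threshold. A real analysis would also need a background model to judge whether a shared 7-mer is more frequent than expected by chance.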
NASA Technical Reports Server (NTRS)
Maimone, Mark W.
2009-01-01
Scripts Providing a Cool Kit of Telemetry Enhancing Tools (SPACKLE) is a set of software tools that fill gaps in capabilities of other software used in processing downlinked data in the Mars Exploration Rovers (MER) flight and test-bed operations. SPACKLE tools have helped to accelerate the automatic processing and interpretation of MER mission data, enabling non-experts to understand and/or use MER query and data product command simulation software tools more effectively. SPACKLE has greatly accelerated some operations and provides new capabilities. The tools of SPACKLE are written, variously, in Perl or the C or C++ language. They perform a variety of search and shortcut functions that include the following: Generating text-only, Event Report-annotated, and Web-enhanced views of command sequences; Labeling integer enumerations with their symbolic meanings in text messages and engineering channels; Systematic detecting of corruption within data products; Generating text-only displays of data-product catalogs including downlink status; Validating and labeling of commands related to data products; Performing of convenient searches of detailed engineering data spanning multiple Martian solar days; Generating tables of initial conditions pertaining to engineering, health, and accountability data; Simplified construction and simulation of command sequences; and Fast time format conversions and sorting.
SD-MSAEs: Promoter recognition in human genome based on deep feature extraction.
Xu, Wenxuan; Zhang, Li; Lu, Yaping
2016-06-01
The prediction and recognition of promoters in the human genome play an important role in DNA sequence analysis. Entropy, in the Shannon sense of information theory, has many uses in bioinformatic analysis. Relative entropy estimator methods based on statistical divergence (SD) are used to extract meaningful features to distinguish different regions of DNA sequences. In this paper, we choose context features and use a set of SD methods to select the most effective n-mers for distinguishing promoter regions from other DNA regions in the human genome. From the total possible combinations of n-mers, we obtain four sparse distributions based on promoter and non-promoter training samples. The informative n-mers are selected by optimizing the extent to which these distributions differ. Specifically, we combine the advantages of statistical divergence and multiple sparse auto-encoders (MSAEs) in deep learning to extract deep features for promoter recognition. We then apply multiple SVMs and a decision model to construct a human promoter recognition method called SD-MSAEs. The framework is flexible in that it can freely integrate new feature extraction or new classification models. Experimental results show that our method has high sensitivity and specificity. Copyright © 2016 Elsevier Inc. All rights reserved.
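One way to picture divergence-based n-mer selection is to score each n-mer by its contribution to a symmetrised Kullback-Leibler divergence between the promoter and non-promoter n-mer distributions. The sketch below does only that. It is an assumption-laden illustration, not the SD-MSAEs method: the function names, the pseudocount and the toy sequences are hypothetical, and the auto-encoder and SVM stages are not shown.

```python
import math
from collections import Counter
from itertools import product


def nmer_distribution(sequences, n, pseudocount=1.0):
    """Smoothed relative frequency of every n-mer over {A,C,G,T}."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    alphabet = ["".join(p) for p in product("ACGT", repeat=n)]
    total = sum(counts[a] for a in alphabet) + pseudocount * len(alphabet)
    return {a: (counts[a] + pseudocount) / total for a in alphabet}


def rank_by_divergence(positives, negatives, n=2, top=3):
    """Rank n-mers by their term in the symmetrised KL divergence."""
    p = nmer_distribution(positives, n)
    q = nmer_distribution(negatives, n)
    # (p - q) * log(p / q) is non-negative and zero when p == q.
    score = {a: (p[a] - q[a]) * math.log(p[a] / q[a]) for a in p}
    return sorted(score, key=score.get, reverse=True)[:top]


# Toy training sets (illustrative only).
promoter_like = ["CGCGCGCG", "GCGCGCGC"]
background = ["ATATATAT", "TATATATA"]
informative = rank_by_divergence(promoter_like, background, n=2, top=4)
```

The pseudocount keeps every probability positive so that the logarithm is always defined. Other divergence measures could be substituted in `rank_by_divergence` without changing the rest of the sketch.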
Predicting the binding preference of transcription factors to individual DNA k-mers.
Alleyne, Trevis M; Peña-Castillo, Lourdes; Badis, Gwenael; Talukder, Shaheynoor; Berger, Michael F; Gehrke, Andrew R; Philippakis, Anthony A; Bulyk, Martha L; Morris, Quaid D; Hughes, Timothy R
2009-04-15
Recognition of specific DNA sequences is a central mechanism by which transcription factors (TFs) control gene expression. Many TF-binding preferences, however, are unknown or poorly characterized, in part due to the difficulty associated with determining their specificity experimentally, and an incomplete understanding of the mechanisms governing sequence specificity. New techniques that estimate the affinity of TFs to all possible k-mers provide a new opportunity to study DNA-protein interaction mechanisms, and may facilitate inference of binding preferences for members of a given TF family when such information is available for other family members. We employed a new dataset consisting of the relative preferences of mouse homeodomains for all eight-base DNA sequences in order to ask how well we can predict the binding profiles of homeodomains when only their protein sequences are given. We evaluated a panel of standard statistical inference techniques, as well as variations of the protein features considered. Nearest neighbour among functionally important residues emerged among the most effective methods. Our results underscore the complexity of TF-DNA recognition, and suggest a rational approach for future analyses of TF families.
Chen, Dana; Orenstein, Yaron; Golodnitsky, Rada; Pellach, Michal; Avrahami, Dorit; Wachtel, Chaim; Ovadia-Shochat, Avital; Shir-Shapira, Hila; Kedmi, Adi; Juven-Gershon, Tamar; Shamir, Ron; Gerber, Doron
2016-01-01
Transcription factors (TFs) alter gene expression in response to changes in the environment through sequence-specific interactions with the DNA. These interactions are best portrayed as a landscape of TF binding affinities. Current methods to study sequence-specific binding preferences suffer from limited dynamic range, sequence bias, lack of specificity and limited throughput. We have developed a microfluidic-based device for SELEX Affinity Landscape MAPping (SELMAP) of TF binding, which allows high-throughput measurement of 16 proteins in parallel. We used it to measure the relative affinities of Pho4, AtERF2 and Btd full-length proteins to millions of different DNA binding sites, and detected both high and low-affinity interactions in equilibrium conditions, generating a comprehensive landscape of the relative TF affinities to all possible DNA 6-mers, and even DNA 10-mers with increased sequencing depth. Low quantities of both the TFs and DNA oligomers were sufficient for obtaining high-quality results, significantly reducing experimental costs. SELMAP allows in-depth screening of hundreds of TFs, and provides a means for better understanding of the regulatory processes that govern gene expression. PMID:27628341
Gille, H; Messer, W
1991-01-01
The leftmost region of the Escherichia coli origin of DNA replication (oriC) contains three tandemly repeated AT-rich 13mers which have been shown to become single-stranded during the early stages of initiation in vitro. Melting is induced by the ATP form of DnaA, the initiator protein of DNA replication. KMnO4 was used to probe for single-stranded regions and altered DNA conformation during the initiation of DNA replication at oriC in vitro and in vivo. Unpairing in the AT-rich 13mer region is thermodynamically stable even in the absence of DnaA protein, but only when divalent cations are omitted from the reaction. In the presence of Mg2+, oriC melting is strictly DnaA dependent. The sensitive region is distinct from that detected in the absence of DnaA as it is located further to the left within the minimal origin. In addition, the DNA is severely distorted between the three 13mers and the IHF binding site in oriC. A change of conformation can also be observed during the initiation of DNA replication in vivo. This is the first in vivo evidence for a structural change at the 13mers during initiation complex formation. PMID:2026151
Nishiyama, Kazusa; Takakusagi, Yoichi; Kusayanagi, Tomoe; Matsumoto, Yuki; Habu, Shiori; Kuramochi, Kouji; Sugawara, Fumio; Sakaguchi, Kengo; Takahashi, Hideyo; Natsugari, Hideaki; Kobayashi, Susumu
2009-01-01
Here, we report on the identification of trimannoside-recognizing peptide sequences from a T7 phage display screen using a quartz-crystal microbalance (QCM) device. A trimannoside derivative that can form a self-assembled monolayer (SAM) was synthesized and used for immobilization on the gold electrode surface of a QCM sensor chip. After six sets of one-cycle affinity selection, T7 phage particles displaying PSVGLFTH (8-mer) and SVGLGLGFSTVNCF (14-mer) were found to be enriched at rates of 17/44 and 9/44, respectively, suggesting that these peptides specifically recognize trimannoside. Binding checks using the respective single T7 phage and synthetic peptide also confirmed the specific binding of these sequences to the trimannoside-SAM. Subsequent analysis revealed that these sequences correspond to part of the primary amino acid sequence found in many mannose- or hexose-related proteins. Taken together, these results demonstrate the effectiveness of our T7 phage display environment for affinity selection of binding peptides. We anticipate this screening result will also be extremely useful in the development of inhibitors or drug delivery systems targeting polysaccharides as well as further investigations into the function of carbohydrates in vivo.
The diversity and evolution of chelicerate hemocyanins
2012-01-01
Background Oxygen transport in the hemolymph of many arthropod species is facilitated by large copper-proteins referred to as hemocyanins. Arthropod hemocyanins are hexamers or oligomers of hexamers, which are characterized by a high O2 transport capacity and a high cooperativity, thereby enhancing O2 supply. Hemocyanin subunit sequences had been available from horseshoe crabs (Xiphosura) and various spiders (Araneae), but not from any other chelicerate taxon. To trace the evolution of hemocyanins and the emergence of the large hemocyanin oligomers, hemocyanin cDNA sequences were obtained from representatives of selected chelicerate classes. Results Hemocyanin subunits from a sea spider, a scorpion, a whip scorpion and a whip spider were sequenced. Hemocyanin has been lost in Opiliones, Pseudoscorpiones, Solifugae and Acari, which may be explained by the evolution of trachea (i.e., taxon Apulmonata). Bayesian phylogenetic analysis was used to reconstruct the evolution of hemocyanin subunits and a relaxed molecular clock approach was applied to date the major events. While the sea spider has a simple hexameric hemocyanin, four distinct subunit types evolved before Xiphosura and Arachnida diverged around 470 Ma ago, suggesting the existence of a 4 × 6mer at that time. Subsequently, independent gene duplication events gave rise to the other distinct subunits in each of the 8 × 6mer hemocyanin of Xiphosura and the 4 × 6mer of Arachnida. The hemocyanin sequences were used to infer the evolutionary history of chelicerates. The phylogenetic trees support a basal position of Pycnogonida, a sister group relationship of Xiphosura and Arachnida, and a sister group relationship of the whip scorpions and the whip spiders. Conclusion Formation of a complex hemocyanin oligomer commenced early in the evolution of euchelicerates. 
A 4 × 6mer hemocyanin consisting of seven subunit types has been conserved in most arachnids for more than 400 Ma, although some entelegyne spiders display selective subunit loss and independent oligomerization. Hemocyanins also turned out to be a good marker to trace chelicerate evolution, which is, however, limited by the loss of hemocyanin in some taxa. The molecular clock calculations were in excellent agreement with the fossil record, also demonstrating the applicability of hemocyanins for such an approach. PMID:22333134
Divergence, differential methylation and interspersion of melon satellite DNA sequences.
Shmookler Reis, R; Timmis, J N; Ingle, J
1981-01-01
Melon (Cucumis melo) satellite DNA consists of two components, Q and S, each with a buoyant density in CsCl of 1.707 g/ml, but differing by 9 degrees C in "melting" temperature. These physical properties appear to be in contradiction, since both depend on G + C content. In order to resolve this anomaly, base compositions were directly determined for isolated fractions. The low-"melting" component S contains 41.8% G + C, with 6% of C present as 5-methylcytosine, whereas Q DNA contains 54% G + C, with 41% of C methylated. Analyses of restriction site loss agreed well with the direct determinations of methylation and divergence, and indicated some clustering of methylated sites in Q DNA. Analysis of restricted main-band DNA by hybridization with RNA complementary to Q satellite DNA ("Southern transfer") showed satellite Q tandem arrays interspersed in DNA of main-band density. Sequence divergence and extent of methylation did not appear to depend on whether a repeat array was present as satellite or interspersed in main-band DNA. Hybridization in situ indicated considerable heterogeneity in the genomic proportion of the Q-DNA sequences in melon fruit nuclei, implying over- and under-representation consistent with extensive unequal recombination in satellite Q tandem arrays. The cucumber, Cucumis sativus, contains less than 8% as much Q-homologous DNA per genome as the melon, suggesting rapid evolutionary gain or loss of these tandem repeat sequences. PMID:6172117
NASA Astrophysics Data System (ADS)
Rayaprolu, Vamseedhar; Moore, Alan; Che-Yen Wang, Joseph; Goh, Boon Chong; Perilla, Juan R.; Zlotnick, Adam; Mukhopadhyay, Suchetana
2017-12-01
In vitro assembly of alphavirus nucleocapsid cores, called core-like particles (CLPs), requires a polyanionic cargo. There are no sequence or structure requirements to encapsidate single-stranded nucleic acid cargo. In this work, we wanted to determine how the length of the cargo impacts the stability and structure of the assembled CLPs. We hypothesized that cargo neutralizes the basic region of the alphavirus capsid protein and if the cargo is long enough, it will also act to scaffold the CP monomers together. Experimentally we found that CLPs encapsidating short 27mer oligonucleotides were less stable than CLPs encapsidating 48mer or 90mer oligonucleotides under different chemical and thermal conditions. Furthermore, cryo-EM studies showed there were structural differences between CLPs assembled with 27mer and 48mer cargo. To mimic the role of the cargo in CLP assembly we made a mutant (4D) where we substituted a cluster of four Lys residues in the CP with four Asp residues. We found that these few amino acid substitutions were enough to initiate CLP assembly in the absence of cargo. The cargo-free 4D CLPs show higher resistance to ionic strength and increased temperature compared to wild-type cargo containing CLPs suggesting their CLP assembly mechanism might also be different.
Expanded breadth of the T-cell response to mosaic HIV-1 envelope DNA vaccination
DOE Office of Scientific and Technical Information (OSTI.GOV)
Korber, Bette; Fischer, William; Wallstrom, Timothy
2009-01-01
An effective AIDS vaccine must control highly diverse circulating strains of HIV-1. Among HIV-1 gene products, the envelope (Env) protein contains variable as well as conserved regions. In this report, an informatic approach to the design of T-cell vaccines directed to HIV-1 Env M group global sequences was tested. Synthetic Env antigens were designed to express mosaics that maximize the inclusion of common potential T-cell epitope (PTE) 9-mers and minimize the inclusion of rare epitopes likely to elicit strain-specific responses. DNA vaccines were evaluated using intracellular cytokine staining (ICS) in inbred mice with a standardized panel of highly conserved 15-mer PTE peptides. 1, 2 and 3 mosaic sets were developed that increased theoretical epitope coverage. The breadth and magnitude of T-cell immunity stimulated by these vaccines were compared to natural strain Envs; additional comparisons were performed on mutant Envs, including gp160 or gp145 with or without V regions and gp41 deletions. Among them, the 2 or 3 mosaic Env sets elicited the optimal CD4 and CD8 responses. These responses were most evident in CD8 T cells; the 3 mosaic set elicited responses to an average of 8 peptide pools compared to 2 pools for a set of 3 natural Envs. Synthetic mosaic HIV-1 antigens can therefore induce T-cell responses with expanded breadth and may facilitate the development of effective T-cell-based HIV-1 vaccines.
Positional bias in variant calls against draft reference assemblies.
Briskine, Roman V; Shimizu, Kentaro K
2017-03-28
Whole genome resequencing projects may implement variant calling using draft reference genomes assembled de novo from short-read libraries. Despite the lower quality of such assemblies, they have allowed researchers to extend a wide range of population genetic and genome-wide association analyses to non-model species. As the variant calling pipelines are complex and involve many software packages, it is important to understand inherent biases and limitations at each step of the analysis. In this article, we report a positional bias present in variant calling performed against draft reference assemblies constructed from de Bruijn or string overlap graphs. We assessed how frequently variants appeared at each position counted from the ends of a contig or scaffold sequence, and discovered an unexpectedly high number of variants at the positions related to the length of either k-mers or reads used for the assembly. We detected the bias both in publicly available draft assemblies from the Assemblathon 2 competition and in the assemblies we generated from our simulated short-read data. Simulations confirmed that the bias-causing variants are predominantly false positives induced by reads from spatially distant repeated sequences. The bias is particularly strong in contig assemblies. Scaffolding does not eliminate the bias but tends to mitigate it because of the changes in variants' relative positions and alterations in read alignments. The bias can be effectively reduced by filtering out the variants that reside in repetitive elements. Draft genome sequences generated by several popular assemblers appear to be susceptible to the positional bias, potentially affecting many resequencing projects in non-model species. The bias is inherent to the assembly algorithms and arises from their particular handling of repeated sequences. It is recommended to reduce the bias by filtering, especially if a higher-quality genome assembly cannot be achieved. 
Our findings can help other researchers to improve the quality of their variant data sets and reduce artefactual findings in downstream analyses.
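The positional tally described above can be sketched as a histogram of variant distances from contig ends. A spike at a distance close to the assembly k-mer or read length would then be visible directly. The function names and the toy data are hypothetical, and taking the distance to the nearer end is a simplifying assumption rather than the authors' stated procedure.

```python
from collections import Counter


def distance_to_nearest_end(pos, contig_length):
    """1-based distance of a variant from the closer contig end."""
    return min(pos, contig_length - pos + 1)


def end_distance_histogram(variants, contig_lengths):
    """Count variants by distance from the nearest contig end.

    variants: iterable of (contig_name, 1-based position) pairs.
    contig_lengths: dict mapping contig_name to its length.
    """
    hist = Counter()
    for contig, pos in variants:
        hist[distance_to_nearest_end(pos, contig_lengths[contig])] += 1
    return hist


# Toy data (illustrative only): several variants sit 31 bases from an end.
lengths = {"c1": 100, "c2": 80}
calls = [("c1", 31), ("c1", 70), ("c2", 31), ("c2", 50), ("c1", 50)]
hist = end_distance_histogram(calls, lengths)
```

In practice the `(contig, position)` pairs would be read from a VCF file and the lengths from the assembly index. Those parsing steps are left out to keep the sketch self-contained.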
Ferraro Petrillo, Umberto; Roscigno, Gianluca; Cattaneo, Giuseppe; Giancarlo, Raffaele
2018-06-01
Information theoretic and compositional/linguistic analysis of genomes have a central role in bioinformatics, even more so since the associated methodologies are becoming very valuable also for epigenomic and meta-genomic studies. The kernel of those methods is based on the collection of k-mer statistics, i.e. how many times each k-mer in {A,C,G,T}^k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data now available in applications demands a resort to parallel and distributed computing. Indeed, algorithms of that type have been developed to collect k-mer statistics in the realm of genome assembly. However, they are so specialized to this domain that they do not extend easily to the computation of informational and linguistic indices, concurrently on sets of genomes. Following the well-established approach in many disciplines, and with a growing success also in bioinformatics, to resort to MapReduce and Hadoop to deal with 'Big Data' problems, we present KCH, the first set of MapReduce algorithms able to perform concurrently informational and linguistic analysis of large collections of genomic sequences on a Hadoop cluster. The benchmarking of KCH that we provide indicates that it is quite effective and versatile. It is also competitive with respect to the parallel and distributed algorithms highly specialized to k-mer statistics collection for genome assembly problems. In conclusion, KCH is a much needed addition to the growing number of algorithms and tools that use MapReduce for bioinformatics core applications. The software, including instructions for running it over Amazon AWS, as well as the datasets are available at http://www.di-srv.unisa.it/KCH. Contact: umberto.ferraro@uniroma1.it. Supplementary data are available at Bioinformatics online.
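The k-mer statistics problem defined above, counting how often each k-mer over {A,C,G,T} occurs, can be sketched in a map-then-reduce shape on a single machine. This is not the KCH implementation and uses no Hadoop API. The function names are hypothetical, and skipping windows that contain symbols outside {A,C,G,T} is an assumption.

```python
from collections import Counter
from functools import reduce


def map_kmers(sequence, k):
    """Map step: count the k-mers over {A,C,G,T} in one sequence."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows containing N etc.
            counts[kmer] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial count tables."""
    return left + right


def kmer_statistics(sequences, k):
    """Combine the per-sequence counts into one table."""
    partials = (map_kmers(s, k) for s in sequences)
    return reduce(reduce_counts, partials, Counter())


stats = kmer_statistics(["ACGTAC", "acgn"], k=2)
```

Because `reduce_counts` is associative and commutative, the partial tables can be merged in any order. That property is what lets the same computation be distributed across workers in a real MapReduce setting.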
Repeated-Sprint Sequences During Female Soccer Matches Using Fixed and Individual Speed Thresholds.
Nakamura, Fábio Y; Pereira, Lucas A; Loturco, Irineu; Rosseti, Marcelo; Moura, Felipe A; Bradley, Paul S
2017-07-01
Nakamura, FY, Pereira, LA, Loturco, I, Rosseti, M, Moura, FA, and Bradley, PS. Repeated-sprint sequences during female soccer matches using fixed and individual speed thresholds. J Strength Cond Res 31(7): 1802-1810, 2017. The main objective of this study was to characterize the occurrence of single sprint and repeated-sprint sequences (RSS) during elite female soccer matches, using fixed (20 km·h⁻¹) and individually based speed thresholds (>90% of the mean speed from a 20-m sprint test). Eleven elite female soccer players from the same team participated in the study. All players performed a 20-m linear sprint test, and were assessed in up to 10 official matches using Global Positioning System technology. Magnitude-based inferences were used to test for meaningful differences. Results revealed that irrespective of adopting fixed or individual speed thresholds, female players produced only a few RSS during matches (2.3 ± 2.4 sequences using the fixed threshold and 3.3 ± 3.0 sequences using the individually based threshold), with most sequences composed of just 2 sprints. Additionally, central defenders performed fewer sprints (10.2 ± 4.1) than other positions (fullbacks: 28.1 ± 5.5; midfielders: 21.9 ± 10.5; forwards: 31.9 ± 11.1; with the differences being likely to almost certainly associated with effect sizes ranging from 1.65 to 2.72), and sprinting ability declined in the second half. The data do not support the notion that RSS occurs frequently during soccer matches in female players, irrespective of using fixed or individual speed thresholds to define sprint occurrence. However, repeated-sprint ability development cannot be ruled out from soccer training programs because of its association with match-related performance.
Conserved Amphipathic Helices Mediate Lipid Droplet Targeting of Perilipins 1–3
Rowe, Emily R.; Mimmack, Michael L.; Barbosa, Antonio D.; Haider, Afreen; Isaac, Iona; Ouberai, Myriam M.; Thiam, Abdou Rachid; Patel, Satish; Saudek, Vladimir; Siniossoglou, Symeon; Savage, David B.
2016-01-01
Perilipins (PLINs) play a key role in energy storage by orchestrating the activity of lipases on the surface of lipid droplets. Failure of this activity results in severe metabolic disease in humans. Unlike all other lipid droplet-associated proteins, PLINs localize almost exclusively to the phospholipid monolayer surrounding the droplet. To understand how they sense and associate with the unique topology of the droplet surface, we studied the localization of human PLINs in Saccharomyces cerevisiae, demonstrating that the targeting mechanism is highly conserved and that 11-mer repeat regions are sufficient for droplet targeting. Mutations designed to disrupt folding of this region into amphipathic helices (AHs) significantly decreased lipid droplet targeting in vivo and in vitro. Finally, we demonstrated a substantial increase in the helicity of this region in the presence of detergent micelles, which was prevented by an AH-disrupting missense mutation. We conclude that highly conserved 11-mer repeat regions of PLINs target lipid droplets by folding into AHs on the droplet surface, thus enabling PLINs to regulate the interface between the hydrophobic lipid core and its surrounding hydrophilic environment. PMID:26742848
Malczyk, Anna H.; Kupke, Alexandra; Prüfer, Steffen; Scheuplein, Vivian A.; Hutzler, Stefan; Kreuz, Dorothea; Beissert, Tim; Bauer, Stefanie; Hubich-Rau, Stefanie; Tondera, Christiane; Eldin, Hosam Shams; Schmidt, Jörg; Vergara-Alert, Júlia; Süzer, Yasemin; Seifried, Janna; Hanschmann, Kay-Martin; Kalinke, Ulrich; Herold, Susanne; Sahin, Ugur; Cichutek, Klaus; Waibler, Zoe; Eickmann, Markus; Becker, Stephan
2015-01-01
ABSTRACT In 2012, the first cases of infection with the Middle East respiratory syndrome coronavirus (MERS-CoV) were identified. Since then, more than 1,000 cases of MERS-CoV infection have been confirmed; infection is typically associated with considerable morbidity and, in approximately 30% of cases, mortality. Currently, there is no protective vaccine available. Replication-competent recombinant measles virus (MV) expressing foreign antigens constitutes a promising tool to induce protective immunity against corresponding pathogens. Therefore, we generated MVs expressing the spike glycoprotein of MERS-CoV in its full-length (MERS-S) or a truncated, soluble variant of MERS-S (MERS-solS). The genes encoding MERS-S and MERS-solS were cloned into the vaccine strain MVvac2 genome, and the respective viruses were rescued (MVvac2-CoV-S and MVvac2-CoV-solS). These recombinant MVs were amplified and characterized at passages 3 and 10. The replication of MVvac2-CoV-S in Vero cells turned out to be comparable to that of the control virus MVvac2-GFP (encoding green fluorescent protein), while titers of MVvac2-CoV-solS were impaired approximately 3-fold. The genomic stability and expression of the inserted antigens were confirmed via sequencing of viral cDNA and immunoblot analysis. In vivo, immunization of type I interferon receptor-deficient (IFNAR−/−)-CD46Ge mice with 2 × 10⁵ 50% tissue culture infective doses of MVvac2-CoV-S(H) or MVvac2-CoV-solS(H) in a prime-boost regimen induced robust levels of both MV- and MERS-CoV-neutralizing antibodies. Additionally, induction of specific T cells was demonstrated by T cell proliferation, antigen-specific T cell cytotoxicity, and gamma interferon secretion after stimulation of splenocytes with MERS-CoV-S presented by murine dendritic cells. MERS-CoV challenge experiments indicated the protective capacity of these immune responses in vaccinated mice. 
IMPORTANCE Although MERS-CoV has not yet acquired extensive distribution, being mainly confined to the Arabic and Korean peninsulas, it could adapt to spread more readily among humans and thereby become pandemic. Therefore, the development of a vaccine is mandatory. The integration of antigen-coding genes into recombinant MV resulting in coexpression of MV and foreign antigens can efficiently be achieved. Thus, in combination with the excellent safety profile of the MV vaccine, recombinant MV seems to constitute an ideal vaccine platform. The present study shows that a recombinant MV expressing MERS-S is genetically stable and induces strong humoral and cellular immunity against MERS-CoV in vaccinated mice. Subsequent challenge experiments indicated protection of vaccinated animals, illustrating the potential of MV as a vaccine platform with the potential to target emerging infections, such as MERS-CoV. PMID:26355094
Ozuna, Carmen V; Iehisa, Julio C M; Giménez, María J; Alvarez, Juan B; Sousa, Carolina; Barro, Francisco
2015-06-01
The gluten proteins from wheat, barley and rye are responsible both for celiac disease (CD) and for non-celiac gluten sensitivity, two pathologies affecting up to 6-8% of the human population worldwide. The wheat α-gliadin proteins contain three major CD immunogenic peptides: p31-43, which induces the innate immune response; the 33-mer, formed by six overlapping copies of three highly stimulatory epitopes; and an additional DQ2.5-glia-α3 epitope which partially overlaps with the 33-mer. Next-generation sequencing (NGS) and Sanger sequencing of α-gliadin genes from diploid and polyploid wheat provided six types of α-gliadins (named 1-6) with strong differences in their frequencies in diploid and polyploid wheat, and in the presence and abundance of these CD immunogenic peptides. Immunogenic variants of the p31-43 peptide were found in most of the α-gliadins. Variants of the DQ2.5-glia-α3 epitope were associated with specific types of α-gliadins. Remarkably, only type 1 α-gliadins contained 33-mer epitopes. Moreover, the full immunodominant 33-mer fragment was only present in hexaploid wheat at low abundance, probably as the result of allohexaploidization events from subtype 1.2 α-gliadins found only in Aegilops tauschii, the D-genome donor of hexaploid wheat. Type 3 α-gliadins seem to be the ancestral type as they are found in most of the α-gliadin-expressing Triticeae species. These findings are important for reducing the incidence of CD by the breeding/selection of wheat varieties with low stimulatory capacity of T cells. Moreover, advanced genome-editing techniques (TALENs, CRISPR) will be easier to implement on the small group of α-gliadins containing only immunogenic peptides. © 2015 Society for Experimental Biology and John Wiley & Sons Ltd.
Hospital Outbreak of Middle East Respiratory Syndrome Coronavirus
Assiri, Abdullah; McGeer, Allison; Perl, Trish M.; Price, Connie S.; Al Rabeeah, Abdullah A.; Cummings, Derek A.T.; Alabdullatif, Zaki N.; Assad, Maher; Almulhim, Abdulmohsen; Makhdoom, Hatem; Madani, Hossam; Alhakeem, Rafat; Al-Tawfiq, Jaffar A.; Cotten, Matthew; Watson, Simon J.; Kellam, Paul; Zumla, Alimuddin I.; Memish, Ziad A.
2013-01-01
BACKGROUND In September 2012, the World Health Organization reported the first cases of pneumonia caused by the novel Middle East respiratory syndrome coronavirus (MERS-CoV). We describe a cluster of health care–acquired MERS-CoV infections. METHODS Medical records were reviewed for clinical and demographic information and determination of potential contacts and exposures. Case patients and contacts were interviewed. The incubation period and serial interval (the time between the successive onset of symptoms in a chain of transmission) were estimated. Viral RNA was sequenced. RESULTS Between April 1 and May 23, 2013, a total of 23 cases of MERS-CoV infection were reported in the eastern province of Saudi Arabia. Symptoms included fever in 20 patients (87%), cough in 20 (87%), shortness of breath in 11 (48%), and gastrointestinal symptoms in 8 (35%); 20 patients (87%) presented with abnormal chest radiographs. As of June 12, a total of 15 patients (65%) had died, 6 (26%) had recovered, and 2 (9%) remained hospitalized. The median incubation period was 5.2 days (95% confidence interval [CI], 1.9 to 14.7), and the serial interval was 7.6 days (95% CI, 2.5 to 23.1). A total of 21 of the 23 cases were acquired by person-to-person transmission in hemodialysis units, intensive care units, or in-patient units in three different health care facilities. Sequencing data from four isolates revealed a single monophyletic clade. Among 217 household contacts and more than 200 health care worker contacts whom we identified, MERS-CoV infection developed in 5 family members (3 with laboratory-confirmed cases) and in 2 health care workers (both with laboratory-confirmed cases). CONCLUSIONS Person-to-person transmission of MERS-CoV can occur in health care settings and may be associated with considerable morbidity. Surveillance and infection-control measures are critical to a global public health response. PMID:23782161
Microbes in mercury-enriched geothermal springs in western North America.
Geesey, Gill G; Barkay, Tamar; King, Sue
2016-11-01
Because geothermal environments contain mercury (Hg) from natural sources, microorganisms that evolved in these systems have likely adapted to this element. Knowledge of the interactions between microorganisms and Hg in geothermal systems may assist in understanding the long-term evolution of microbial adaptation to Hg with relevance to other environments where Hg is introduced from anthropogenic sources. A number of microbiological studies with supporting geochemistry have been conducted in geothermal systems across western North America. Approximately 1 in 5 study sites include measurements of Hg. Of all prokaryotic taxa reported across sites with microbiological and accompanying physicochemical data, 42% have been detected at sites in which Hg was measured. Genes specifying Hg reduction and detoxification by microorganisms were detected in a number of hot springs across the region. Archaeal-like sequences, representing two crenarchaeal orders and one order each of the Euryarchaeota and Thaumarchaeota, dominated the MerA (mercuric reductase protein) inventories of metagenomes, while bacterial homologs were mostly found in one deeply sequenced metagenome. MerA homologs were more frequently found in metagenomes of microbial communities in acidic springs than in circumneutral or high-pH geothermal systems, possibly reflecting higher bioavailability of Hg under acidic conditions. MerA homologs were also found in hot-spring prokaryotic isolates affiliated with both Bacteria and Archaea. Acidic sites with high Hg concentrations contain more Archaea than Bacteria taxa, while the reverse appears to be the case in circumneutral and high-pH sites with high Hg concentrations. However, MerA was detected in only a small fraction of the Archaea and Bacteria taxa inhabiting sites containing Hg. Nevertheless, the presence of MerA homologs and their distribution patterns in systems in which Hg has yet to be measured demonstrate the potential for detoxification by Hg reduction in these geothermal systems, particularly the low-pH springs that are dominated by Archaea. Copyright © 2016 Elsevier B.V. All rights reserved.
HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing
Karimi, Ramin; Hajdu, Andras
2016-01-01
Comprehensive effort for low-cost sequencing in the past few years has led to the growth of complete genome databases. In parallel with this effort, and driven by a strong need, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Due to the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is highly required. As an alignment-free approach, DNA signatures have provided new opportunities for the rapid identification of species. In this paper, we present an effective pipeline, HTSFinder (high-throughput signature finder), with a corresponding k-mer generator, GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in arbitrarily selected target and nontarget databases. Hadoop and MapReduce are used in this pipeline as parallel and distributed computing tools running on commodity hardware. This approach brings the power of high-performance computing to ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of unique and common DNA signatures detected in the target database offers opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis. PMID:26884678
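The idea of k-mer signatures in target and nontarget databases can be illustrated with a minimal in-memory sketch. It is not the HTSFinder/GkmerG pipeline and uses no Hadoop or MapReduce; the function names are hypothetical, and the definitions used here ("unique" = found in at least one target and in no nontarget, "common" = found in every target and in no nontarget) are assumptions made for illustration, since the abstract does not define them.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq that contains only A/C/G/T."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            yield word


def kmer_counts(genomes, k):
    """Count k-mer frequencies over a whole collection of sequences."""
    counts = Counter()
    for genome in genomes:
        counts.update(kmers(genome, k))
    return counts


def signatures(targets, nontargets, k):
    """Return (unique, common) k-mer sets under the assumed definitions.

    unique: present in at least one target, absent from all nontargets
    common: present in every target, absent from all nontargets
    Assumes at least one target sequence is given.
    """
    nontarget_words = set(kmer_counts(nontargets, k))
    per_target = [set(kmers(genome, k)) for genome in targets]
    unique = set().union(*per_target) - nontarget_words
    common = set.intersection(*per_target) - nontarget_words
    return unique, common


if __name__ == "__main__":
    unique, common = signatures(["ACGTACGT", "TTACGTAA"], ["GGTACGGG"], k=4)
    print(sorted(unique), sorted(common))
```

A distributed version would split the counting step across workers and merge the per-worker counts, which is the part that a MapReduce framework takes over.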
Nature of frequent deletions in CEBPA.
Fuchs, Ota; Kostecka, Arnost; Provaznikova, Dana; Krasna, Blazena; Brezinova, Jana; Filkukova, Jitka; Kotlin, Roman; Kouba, Michal; Kobylka, Petr; Neuwirtova, Radana; Jonasova, Anna; Caniga, Miroslav; Schwarz, Jiri; Markova, Jana; Maaloufova, Jacqueline; Sponerova, Dana; Novakova, Ludmila; Cermak, Jaroslav
2009-01-01
C/EBPalpha (CCAAT/enhancer binding protein alpha) belongs to the family of leucine zipper transcription factors and is necessary for transcriptional control of granulocyte, adipocyte and hepatocyte differentiation, glucose metabolism and lung development. C/EBPalpha is encoded by an intronless gene. CEBPA mutations cause a myeloid differentiation block and were detected in acute myeloid leukemia (AML), myelodysplastic syndrome (MDS), multiple myeloma and non-Hodgkin's lymphoma (NHL) patients. In this study we identified, in 41 of 824 screened individuals (290 AML patients, 382 MDS patients, 56 NHL patients and 96 healthy individuals), a single class of 23 deletions in the CEBPA gene which involved a direct repeat of at least 2 bp. These mutations are characterised by the loss of one of two identical repeats at the ends of the deleted sequence. The three most frequent repeats included in these deletions in the CEBPA gene are CGCGAG (493-498_865-870), GCCAAGCAGC (508-517_907-916) and GG (486-487_885-886), all according to GenBank accession no. NM_004364.2. A mechanism for deletion formation between two repetitive sequences can be recombination events in the repair process. A double-stranded cut in DNA can initiate these recombination events between adjacent DNA sequences.
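The outcome of such a deletion (loss of one copy of a direct repeat together with the sequence between the two copies) can be sketched on a toy string. The function name and the example sequence are hypothetical; the sequence is not the CEBPA reference, and the sketch models only the result of the deletion, not the recombination or repair process itself.

```python
def delete_between_direct_repeats(seq, repeat):
    """Remove one copy of a direct repeat plus everything between the two
    copies, so that a single copy of the repeat remains."""
    first = seq.find(repeat)
    if first == -1:
        raise ValueError("repeat not found")
    # Look for a second, non-overlapping copy downstream of the first one.
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        raise ValueError("repeat occurs only once")
    # Dropping seq[first:second] removes the first copy and the spacer.
    return seq[:first] + seq[second:]


if __name__ == "__main__":
    toy = "AAACGCGAGTTTTTCGCGAGCCC"  # toy sequence with two CGCGAG copies
    print(delete_between_direct_repeats(toy, "CGCGAG"))
```

Because the two repeat copies are identical, removing the first copy plus the spacer gives the same product as removing the spacer plus the second copy, which is why the breakpoints of such deletions cannot be placed more precisely than the repeat itself.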
Li, Yan; Khalafalla, Abdelmalik Ibrahim; Paden, Clinton R; Yusof, Mohammed F; Eltahir, Yassir M; Al Hammadi, Zulaikha M; Tao, Ying; Queen, Krista; Hosani, Farida Al; Gerber, Susan I; Hall, Aron J; Al Muhairi, Salama; Tong, Suxiang
2017-01-01
Camels are known carriers for many viral pathogens, including Middle East respiratory syndrome coronavirus (MERS-CoV). It is likely that there are additional, as yet unidentified viruses in camels with the potential to cause disease in humans. In this study, we performed metagenomic sequencing analysis on nasopharyngeal swab samples from 108 MERS-CoV-positive dromedary camels from a live animal market in Abu Dhabi, United Arab Emirates. We obtained a total of 846.72 million high-quality reads from these nasopharyngeal swab samples, of which 2.88 million (0.34%) were related to viral sequences while 512.63 million (60.5%) and 50.87 million (6%) matched bacterial and eukaryotic sequences, respectively. Among the viral reads, sequences related to mammalian viruses from 13 genera in 10 viral families were identified, including Coronaviridae, Nairoviridae, Paramyxoviridae, Parvoviridae, Polyomaviridae, Papillomaviridae, Astroviridae, Picornaviridae, Poxviridae, and Genomoviridae. Some viral sequences belong to known camel or human viruses and others are from potentially novel camel viruses with only limited sequence similarity to virus sequences in GenBank. A total of five potentially novel virus species or strains were identified. Co-infection of at least two recently identified camel coronaviruses was detected in 92.6% of the camels in the study. This study provides a comprehensive survey of viruses in the virome of upper respiratory samples in camels that have extensive contact with the human population.
Edwards, W. Barry
2013-01-01
The aim of this study was to identify potential ligands of PSMA suitable for further development as novel PSMA-targeted peptides using phage display technology. The human PSMA protein was immobilized as a target followed by incubation with a 15-mer phage display random peptide library. After one round of prescreening and two rounds of screening, high-stringency screening at the third round of panning was performed to identify the highest affinity binders. Phages which had a specific binding activity to PSMA in human prostate cancer cells were isolated and the DNA corresponding to the 15-mers was sequenced to provide three consensus sequences: GDHSPFT, SHFSVGS and EVPRLSLLAVFL as well as other sequences that did not display consensus. Two of the peptide sequences deduced from DNA sequencing of binding phages, SHSFSVGSGDHSPFT and GRFLTGGTGRLLRIS, were labeled with 5-carboxyfluorescein and shown to bind and co-internalize with PSMA on human prostate cancer cells by fluorescence microscopy. The high stringency requirements yielded peptides with affinities KD∼1 µM or greater which are suitable starting points for affinity maturation. While these values were less than anticipated, the high stringency did yield peptide sequences that apparently bound to different surfaces on PSMA. These peptide sequences could be the basis for further development of peptides for prostate cancer tumor imaging and therapy. PMID:23935860
Szatmari, I; Tókés, S; Dunn, C B; Bardos, T J; Aradi, J
2000-06-15
A polymerase chain reaction (PCR)-based radioactive telomerase assay was developed in our laboratory which is quantitative and does not require electrophoretic evaluation (designated as TP-TRAP; it utilizes two reverse primers). The main steps of the assay include (1) extension of a 20-mer oligonucleotide substrate (MTS) by telomerase, (2) amplification of the telomerase products in the presence of [(3)H]dTTP using the substrate oligonucleotide and two reverse primers (RPC3, 38 mer; RP, 20 mer), (3) isolation of the amplified radioactive dsDNA by precipitation and filtration, (4) determination of the radioactivity of the acid-insoluble DNA. The length of the telomerase products does not increase on amplification. This valuable feature of the assay is achieved by utilization of the two reverse primers and a highly specific PCR protocol. The assay is linear, accurate, and suitable for cell-biological studies where slight quantitative differences in telomerase activity must be detected. The assay is also suitable for screening and characterization of telomerase inhibitors, as shown with a chemically modified oligonucleotide reverse transcriptase inhibitor [(s(4)dU)(35)]. Copyright 2000 Academic Press.
Yuan, Yuan; Cao, Duanfang; Zhang, Yanfang; Ma, Jun; Qi, Jianxun; Wang, Qihui; Lu, Guangwen; Wu, Ying; Yan, Jinghua; Shi, Yi; Zhang, Xinzheng; Gao, George F
2017-04-10
The envelope spike (S) proteins of MERS-CoV and SARS-CoV determine the virus host tropism and entry into host cells, and constitute a promising target for the development of prophylactics and therapeutics. Here, we present high-resolution structures of the trimeric MERS-CoV and SARS-CoV S proteins in their pre-fusion conformation by single particle cryo-electron microscopy. The overall structures resemble those of other coronaviruses, including HKU1, MHV and NL63, reported recently, with the exception of the receptor binding domain (RBD). We captured two states of the RBD with the receptor binding region either buried (lying state) or exposed (standing state), demonstrating an inherently flexible RBD readily recognized by the receptor. Further sequence conservation analysis of six human-infecting coronaviruses revealed that the fusion peptide, HR1 region and the central helix are potential targets for eliciting broadly neutralizing antibodies.
Alshrari, Ahmed S.; Badroon, Nassrin A.; Hassan, Ahmed M.; Alsubhi, Tagreed L.; Ejeeli, Saleh
2017-01-01
We undertook enhanced surveillance of those presenting with respiratory symptoms at five healthcare centers by testing all symptomatic outpatients between November 2013 and January 2014 (winter time). Nasal swabs were collected from 182 patients and screened for MERS-CoV as well as other respiratory viruses using RT-PCR and multiplex microarray. A total of 75 (41.2%) of these patients had positive viral infection. MERS-CoV was not detected in any of the samples. Human rhinovirus (hRV) was the most detected pathogen (40.9%) followed by non-MERS-CoV human coronaviruses (19.3%), influenza (Flu) viruses (15.9%), and human respiratory syncytial virus (hRSV) (13.6%). Viruses differed markedly depending on age in which hRV, Flu A, and hCoV-OC43 were more prevalent in adults and RSV, hCoV-HKU1, and hCoV-NL63 were mostly restricted to children under the age of 15. Moreover, coinfection was not uncommon in this study, in which 17.3% of the infected patients had dual infections due to several combinations of viruses. Dual infections decreased with age and completely disappeared in people older than 45 years. Our study confirms that MERS-CoV is not common in the southwestern region of Saudi Arabia and shows high diversity and prevalence of other common respiratory viruses. This study also highlights the importance and contribution of enhanced surveillance systems for better infection control. PMID:28348590
Phenomenological Partial Specific Volumes for G-Quadruplex DNAs
Hellman, Lance M.; Rodgers, David W.; Fried, Michael G.
2009-01-01
Accurate partial specific volume (ν̄) values are required for sedimentation velocity and sedimentation equilibrium analyses. For nucleic acids, the estimation of these values is complicated by the fact that ν̄ depends on base composition, secondary structure, solvation and the concentrations and identities of ions in the surrounding buffer. Here we describe sedimentation equilibrium measurements of the apparent isopotential partial specific volume φ′ for two G-quadruplex DNAs and a single-stranded DNA of similar molecular weight and base composition. The G-quadruplex DNAs are a 22 nucleotide fragment of the human telomere consensus sequence and a 27 nucleotide fragment from the human c-myc promoter. The single-stranded DNA is 26 nucleotides long and is designed to have low propensity to form secondary structures. Parallel measurements were made in buffers containing NaCl and in buffers containing KCl, spanning the range 0.09M ≤ [salt] ≤ 2.3M. Limiting values of φ′, extrapolated to [salt] = 0M, were: 22-mer (NaCl-form), 0.525 ± 0.004 mL/g; 22-mer (KCl-form), 0.531 ± 0.006 mL/g; 27-mer (NaCl-form), 0.548 ± 0.005 mL/g; 27-mer (KCl-form), 0.557 ± 0.006 mL/g; 26-mer (NaCl-form), 0.555 ± 0.004 mL/g; 26-mer (KCl-form), 0.564 ± 0.006 mL/g. Small changes in φ′ with [salt] suggest that large changes in counterion association or hydration are unlikely to take place over these concentration ranges. PMID:19238377
Chimeric Antisense Oligonucleotide Conjugated to α-Tocopherol
Nishina, Tomoko; Numata, Junna; Nishina, Kazutaka; Yoshida-Tanaka, Kie; Nitta, Keiko; Piao, Wenying; Iwata, Rintaro; Ito, Shingo; Kuwahara, Hiroya; Wada, Takeshi; Mizusawa, Hidehiro; Yokota, Takanori
2015-01-01
We developed an efficient system for delivering short interfering RNA (siRNA) to the liver by using α-tocopherol conjugation. The α-tocopherol–conjugated siRNA was effective and safe for RNA interference–mediated gene silencing in vivo. In contrast, when the 13-mer LNA (locked nucleic acid)-DNA gapmer antisense oligonucleotide (ASO) was directly conjugated with α-tocopherol it showed markedly reduced silencing activity in mouse liver. Here, therefore, we tried to extend the 5′-end of the ASO sequence by using 5′-α-tocopherol–conjugated 4- to 7-mers of unlocked nucleic acid (UNA) as a “second wing.” Intravenous injection of mice with this α-tocopherol–conjugated chimeric ASO achieved more potent silencing than ASO alone in the liver, suggesting increased delivery of the ASO to the liver. Within the cells, the UNA wing was cleaved or degraded and α-tocopherol was released from the 13-mer gapmer ASO, resulting in activation of the gapmer. The α-tocopherol–conjugated chimeric ASO showed high efficacy, with hepatic tropism, and was effective and safe for gene silencing in vivo. We have thus identified a new, effective LNA-DNA gapmer structure in which drug delivery system (DDS) molecules are bound to ASO with UNA sequences. PMID:25584900
kWIP: The k-mer weighted inner product, a de novo estimator of genetic similarity.
Murray, Kevin D; Webers, Christfried; Ong, Cheng Soon; Borevitz, Justin; Warthmann, Norman
2017-09-01
Modern genomics techniques generate overwhelming quantities of data. Extracting population genetic variation demands computationally efficient methods to determine genetic relatedness between individuals (or "samples") in an unbiased manner, preferably de novo. Rapid estimation of genetic relatedness directly from sequencing data has the potential to overcome reference genome bias, and to verify that individuals belong to the correct genetic lineage before conclusions are drawn using mislabelled, or misidentified samples. We present the k-mer Weighted Inner Product (kWIP), an assembly-, and alignment-free estimator of genetic similarity. kWIP combines a probabilistic data structure with a novel metric, the weighted inner product (WIP), to efficiently calculate pairwise similarity between sequencing runs from their k-mer counts. It produces a distance matrix, which can then be further analysed and visualised. Our method does not require prior knowledge of the underlying genomes and applications include establishing sample identity and detecting mix-up, non-obvious genomic variation, and population structure. We show that kWIP can reconstruct the true relatedness between samples from simulated populations. By re-analysing several published datasets we show that our results are consistent with marker-based analyses. kWIP is written in C++, licensed under the GNU GPL, and is available from https://github.com/kdmurray91/kwip.
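A weighted inner product over k-mer counts can be sketched as follows. This is an illustration, not the kWIP implementation: it uses exact counts instead of a probabilistic data structure, and the weighting (Shannon entropy of each k-mer's presence frequency across samples) and the kernel-to-distance conversion are assumptions chosen for the example, since the abstract does not spell them out. All function names are hypothetical.

```python
import math
from collections import Counter


def count_kmers(reads, k):
    """Exact k-mer counts for one sample (a list of read strings)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def entropy_weights(samples):
    """Assumed weighting: binary entropy of the fraction of samples that
    contain each k-mer, so ubiquitous k-mers get weight zero."""
    n = len(samples)
    presence = Counter()
    for counts in samples:
        presence.update(counts.keys())
    weights = {}
    for word, seen in presence.items():
        p = seen / n
        if 0 < p < 1:
            weights[word] = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
        else:
            weights[word] = 0.0
    return weights


def wip(a, b, weights):
    """Weighted inner product of two k-mer count vectors."""
    return sum(weights[w] * a[w] * b[w] for w in a.keys() & b.keys())


def distance_matrix(samples):
    """Normalise the kernel to unit self-similarity, then convert it to a
    Euclidean-style distance d = sqrt(2 - 2 * similarity)."""
    weights = entropy_weights(samples)
    n = len(samples)
    self_k = [wip(s, s, weights) for s in samples]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            denom = math.sqrt(self_k[i] * self_k[j])
            sim = wip(samples[i], samples[j], weights) / denom if denom else 0.0
            dist[i][j] = dist[j][i] = math.sqrt(max(0.0, 2 - 2 * sim))
    return dist


if __name__ == "__main__":
    samples = [count_kmers(r, 3) for r in (["ACGTAC"], ["ACGTAC"], ["TTTTGG"])]
    for row in distance_matrix(samples):
        print([round(d, 3) for d in row])
```

The resulting matrix can be passed to any clustering or ordination routine, which corresponds to the "further analysed and visualised" step mentioned in the abstract.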
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yin, Tongming; Difazio, Stephen P.; Gunter, Lee E
In an attempt to elucidate the molecular mechanisms of Melampsora rust resistance in Populus trichocarpa, we have mapped two resistance loci, MXC3 and MER, and intensively characterized the flanking genomic sequence for the MXC3 locus and the level of linkage disequilibrium (LD) in natural populations. We used an interspecific backcross pedigree and a genetic map that was highly saturated with AFLP and SSR markers, and assembled shotgun-sequence data in the region containing markers linked to MXC3. The two loci were mapped to different linkage groups. Linkage disequilibrium for MXC3 was confined to two closely linked regions spanning 34 and 16 kb, respectively. The MXC3 region also contained six disease-resistance candidate genes. The MER and MXC3 loci are clearly distinct, and may have different mechanisms of resistance, as different classes of putative resistance genes were present near each locus. The suppressed recombination previously observed in the MXC3 region was possibly caused by extensive hemizygous rearrangements confined to the original parent tree. The relatively low observed LD may facilitate association studies using candidate genes for rust resistance, but will probably inhibit marker-aided selection.
A dictionary based informational genome analysis
2012-01-01
Background In the post-genomic era several methods of computational genomics are emerging to understand how the whole information is structured within genomes. The literature of the last five years describes several alignment-free methods that have arisen as alternative metrics for the dissimilarity of biological sequences. Among others, recent approaches are based on empirical frequencies of DNA k-mers in whole genomes. Results Any set of words (factors) occurring in a genome provides a genomic dictionary. About sixty genomes were analyzed by means of informational indexes based on genomic dictionaries, where a systemic view replaces a local sequence analysis. A software prototype applying the methodology outlined here carried out some computations on genomic data. We computed informational indexes and built genomic dictionaries of different sizes, along with frequency distributions. The software performed three main tasks: computation of informational indexes, storage of these in a database, and index analysis and visualization. The validation was done by investigating genomes of various organisms. A systematic analysis of genomic repeats of several lengths, which is of keen interest in biology (for example to compute excessively represented functional sequences, such as promoters), was discussed, and suggested a method to define synthetic genetic networks. Conclusions We introduced a methodology based on dictionaries, and an efficient motif-finding software application for comparative genomics. This approach could be extended along many investigation lines, namely exported in other contexts of computational genomics, as a basis for discrimination of genomic pathologies. PMID:22985068
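A genomic dictionary and a few simple indexes derived from it can be sketched as below. The abstract does not list the indexes that were actually computed, so the ones shown here (dictionary size, words occurring once, repeated words, empirical k-mer entropy) and all names are illustrative assumptions rather than the authors' definitions.

```python
import math
from collections import Counter


def dictionary(genome, k):
    """All k-long words (factors) of the genome with their multiplicities."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def indexes(genome, k):
    """Illustrative informational indexes computed from the k-dictionary."""
    words = dictionary(genome, k)
    total = sum(words.values())
    hapaxes = sum(1 for c in words.values() if c == 1)  # words seen once
    entropy = -sum((c / total) * math.log2(c / total) for c in words.values())
    return {
        "size": len(words),                # number of distinct words
        "hapaxes": hapaxes,
        "repeats": len(words) - hapaxes,   # words seen more than once
        "entropy": entropy,                # bits per word position
    }


if __name__ == "__main__":
    print(indexes("ACGTACGT", 2))
```

Running the same function for a range of k values gives a profile of how quickly the dictionary stops containing repeats, which is one way to look at repeat structure without aligning anything.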
The role of gut microbiota in fetal methylmercury exposure: Insights from a pilot study
Rothenberg, Sarah E.; Keiser, Sharon; Ajami, Nadim J.; ...
2016-02-01
The mechanisms by which gut microbiota contribute to methylmercury metabolism remain unclear. Among a cohort of pregnant mothers, the main objectives of our pilot study were to determine 1) associations between gut microbiota and mercury concentrations in biomarkers (stool, hair and cord blood) and 2) the contributions of gut microbial mercury methylation/demethylation to stool methylmercury. Pregnant women (36-39 weeks gestation, n=17) donated hair and stool specimens, and cord blood was collected for a subset (n=7). The diversity of gut microbiota was determined using 16S rRNA gene profiling (n=17). For 6 stool samples with the highest/lowest methylmercury concentrations, metagenomic whole genome shotgun sequencing was employed to search for one mercury methylation gene (hgcA) and two mer operon genes involved in methylmercury detoxification (merA and merB). Seventeen bacterial genera were significantly correlated (increasing or decreasing) with stool methylmercury, stool inorganic mercury, or hair total mercury; however, aside from one genus, there was no overlap between biomarkers. There were no definitive matches for hgcA or merB, while merA was detected at low concentrations in all six samples. Proportional differences in stool methylmercury were not likely attributed to gut microbiota through methylation/demethylation. Gut microbiota potentially altered methylmercury metabolism using indirect pathways.
Yin, H; Medstrand, P; Kristofferson, A; Dietrich, U; Aman, P; Blomberg, J
1999-03-30
Previously, we found a retroviral sequence, HML-6.2BC1, to be expressed at high levels in a multifocal ductal breast cancer from a 41-year-old woman who also developed ovarian carcinoma. The sequence of a human genomic clone (HML-6.28) selected by high-stringency hybridization with HML-6.2BC1 is reported here. It was 99% identical to HML-6.2BC1 and gave the same restriction fragments as total DNA. HML-6.28 is a 4.7-kb provirus with a 5'LTR, truncated in RT. Data from two similar genomic clones and sequences found in GenBank are also reported. Overlaps between them gave a rather complete picture of the HML-6.2BC1-like human endogenous retroviral elements. Work with somatic cell hybrids and FISH localized HML-6.28 to chromosome 6, band p21, close to the MHC region. The causal role of HML-6.28 in breast cancer remains unclear. Nevertheless, the ca. 20 Myr old HML-6 sequences enabled the definition of common and unique features of type A, B, and D (ABD) retroviruses. In Gag, HML-6 has no intervening sequences between matrix and capsid proteins, unlike extant exogenous ABD viruses, possibly an ancestral feature. Alignment of the dUTPase showed it to be present in all ABD viruses, but gave a phylogenetic tree different from trees made from other ABD genes, indicating a distinct phylogeny of dUTPase. A conserved 24-mer sequence in the amino terminus of some ABD envelope genes suggested a conserved function. Copyright 1999 Academic Press.
Liu, Di; Zeng, Shao-Hua; Chen, Jian-Jun; Zhang, Yan-Jun; Xiao, Gong; Zhu, Lin-Yao; Wang, Ying
2013-01-01
Epimedium sagittatum (Sieb. et Zucc) Maxim is a member of the Berberidaceae family of basal eudicot plants, widely distributed and used as a traditional medicinal plant in China for therapeutic effects on many diseases with a long history. Recent data shows that E. sagittatum has a relatively large genome, with a haploid genome size of ~4496 Mbp, divided into a small number of only 12 diploid chromosomes (2n = 2x = 12). However, little is known about Epimedium genome structure and composition. Here we present the analysis of 691 kb of high-quality genomic sequence derived from 672 randomly selected plasmid clones of E. sagittatum genomic DNA, representing ~0.0154% of the genome. The sampled sequences comprised at least 78.41% repetitive DNA elements and 2.51% confirmed annotated gene sequences, with a total GC% content of 39%. Retrotransposons represented the major class of transposable element (TE) repeats identified (65.37% of all TE repeats), particularly LTR (Long Terminal Repeat) retrotransposons (52.27% of all TE repeats). Chromosome analysis and Fluorescence in situ Hybridization of Gypsy-Ty3 retrotransposons were performed to survey the E. sagittatum genome at the cytological level. Our data provide the first insights into the composition and structure of the E. sagittatum genome, and will facilitate the functional genomic analysis of this valuable medicinal plant. PMID:23807511
H-Bond Self-Assembly: Folding versus Duplex Formation.
Núñez-Villanueva, Diego; Iadevaia, Giulia; Stross, Alexander E; Jinks, Michael A; Swain, Jonathan A; Hunter, Christopher A
2017-05-17
Linear oligomers equipped with complementary H-bond donor (D) and acceptor (A) sites can interact via intermolecular H-bonds to form duplexes or fold via intramolecular H-bonds. These competing equilibria have been quantified using NMR titration and dilution experiments for seven systems featuring different recognition sites and backbones. For all seven architectures, duplex formation is observed for homo-sequence 2-mers (AA·DD) where there are no competing folding equilibria. The corresponding hetero-sequence AD 2-mers also form duplexes, but the observed self-association constants are strongly affected by folding equilibria in the monomeric states. When the backbone is flexible (five or more rotatable bonds separating the recognition sites), intramolecular H-bonding is favored, and the folded state is highly populated. For these systems, the stability of the AD·AD duplex is 1-2 orders of magnitude lower than that of the corresponding AA·DD duplex. However, for three architectures which have more rigid backbones (fewer than five rotatable bonds), intramolecular interactions are not observed, and folding does not compete with duplex formation. These systems are promising candidates for the development of longer, mixed-sequence synthetic information molecules that show sequence-selective duplex formation.
Probabilistic topic modeling for the analysis and classification of genomic sequences
2015-01-01
Background Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies focus on the so-called barcode genes, representing a well-defined region of the whole genome. Recently, alignment-free techniques have been gaining importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequence clustering and classification is proposed. The method is based on k-mer representation and text mining techniques. Methods The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find in a document corpus the topics (recurrent themes) characterizing classes of documents. This technique, applied to DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences. Results and conclusions We performed classification of over 7000 16S DNA barcode sequences taken from the Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and the Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to the RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM with ultra-short sequences and it exhibits a smooth decrease of performance, at every taxonomic level, when the sequence length is decreased. PMID:25916734
Quantum-dot-based quantitative identification of pathogens in complex mixture
NASA Astrophysics Data System (ADS)
Lim, Sun Hee; Bestwater, Felix; Buchy, Philippe; Mardy, Sek; Yu, Alexey Dan Chin
2010-02-01
In the present study we describe sandwich design hybridization probes consisting of magnetic particles (MP) and quantum dots (QD) with target DNA, and their application in the detection of avian influenza virus (H5N1) sequences. Hybridization of 25-, 40-, and 100-mer target DNA with both probes was analyzed and quantified by flow cytometry and fluorescence microscopy on the scale of single particles. The following steps were used in the assay: (i) target selection by MP probes and (ii) target detection by QD probes. Hybridization efficiency between MP conjugated probes and target DNA hybrids was controlled by a fluorescent dye specific for nucleic acids. Fluorescence was detected by flow cytometry to distinguish differences in oligo sequences as short as 25-mer capturing in target DNA and by gel-electrophoresis in the case of QD probes. This report shows that effective manipulation and control of micro- and nanoparticles in hybridization assays is possible.
Tiazhelova, T V; Ivanov, D V; Makeeva, N V; Kapanadze, B I; Nikitin, E A; Semov, A B; Sangfeldt, O; Grander, D; Vorob'ev, A I; Einhorn, S; Iankovskiĭ, N K; Baranova, A V
2001-11-01
Deletions in the region located between the STS markers D13S1168 and D13S25 on chromosome 13 are the most frequent genomic changes in patients with B-cell chronic lymphocytic leukemia (B-CLL). After sequencing of this region, two novel candidate genes were identified: C13orf1 (chromosome 13 open reading frame 1) and PLCC (putative large CLL candidate). Analysis of the repeat distribution revealed two subregions differing in composition of repetitious DNA and gene organization. The interval D13S1168-D13S319 contains 131 Alu repeats accounting for 24.8% of its length, whereas the interval GCT16C05-D13S25, which is no more than 180 kb away from the former one is extremely poor in Alu repeats (4.1% of the total length). Both intervals contain almost the same amount of the LINE-type repeats L1 and L2 (20.3 and 21.24%, respectively). In the chromosomal region studied, 29 Alu repeats were found to belong to the evolutionary young subfamily Y, which is still capable of amplifying. A considerable proportion of repeats of this type with similar nucleotide sequences may contribute to the recombinational activity of the chromosomal region 13q14.3, which is responsible for its rearrangements in some tumors in humans.
Dutta, Sutapa; Kumawat, Giriraj; Singh, Bikram P; Gupta, Deepak K; Singh, Sangeeta; Dogra, Vivek; Gaikwad, Kishor; Sharma, Tilak R; Raje, Ranjeet S; Bandhopadhya, Tapas K; Datta, Subhojit; Singh, Mahendra N; Bashasab, Fakrudin; Kulwal, Pawan; Wanjari, K B; K Varshney, Rajeev; Cook, Douglas R; Singh, Nagendra K
2011-01-20
Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥ 18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea.
2011-01-01
Background Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
Conclusion We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea. PMID:21251263
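The SSR screening criteria described in this record (perfect repeats of di- to hexanucleotide motifs, homopolymers excluded, repeat tracts of at least 18 bp) can be illustrated with a small sketch. This is not the pipeline used by the authors, whose software and exact parameters are not given in the abstract; the function names `is_primitive` and `find_ssrs` and the toy sequence are hypothetical, and compound repeats are not handled.

```python
import math
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_len=18, motif_sizes=range(2, 7)):
    """Return (start, motif, repeat_count) for perfect repeats spanning >= min_len bp.

    Assumption: a locus qualifies when a 2-6 bp motif is repeated in tandem
    often enough to cover at least min_len bases.
    """
    seq = seq.upper()
    hits = []
    for m in motif_sizes:
        min_repeats = math.ceil(min_len / m)
        # One captured motif followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (m, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # drops homopolymers and disguised shorter motifs
                hits.append((match.start(), motif, len(match.group(0)) // m))
    return sorted(hits)


# Toy contig: a 20 bp (AG)n tract and an 18 bp (GAT)n tract.
contig = "TTT" + "AG" * 10 + "CCC" + "GAT" * 6 + "TT"
for start, motif, repeats in find_ssrs(contig):
    print(start, motif, repeats)
```

The primitive-motif check keeps a dinucleotide tract from being reported again as a tetra- or hexanucleotide repeat.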
KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies.
Mapleson, Daniel; Garcia Accinelli, Gonzalo; Kettleborough, George; Wright, Jonathan; Clavijo, Bernardo J
2017-02-15
De novo assembly of whole genome shotgun (WGS) next-generation sequencing (NGS) data benefits from high-quality input with high coverage. However, in practice, determining the quality and quantity of useful reads quickly and in a reference-free manner is not trivial. Gaining a better understanding of the WGS data, and how that data is utilized by assemblers, provides useful insights that can inform the assembly process and result in better assemblies. We present the K-mer Analysis Toolkit (KAT): a multi-purpose software toolkit for reference-free quality control (QC) of WGS reads and de novo genome assemblies, primarily via their k-mer frequencies and GC composition. KAT enables users to assess levels of errors, bias and contamination at various stages of the assembly process. In this paper we highlight KAT's ability to provide valuable insights into assembly composition and quality of genome assemblies through pairwise comparison of k-mers present in both input reads and the assemblies. KAT is available under the GPLv3 license at: https://github.com/TGAC/KAT. Contact: bernardo.clavijo@earlham.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
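The pairwise comparison of k-mers in reads and assemblies mentioned in this record can be sketched in a few lines. This is an illustrative toy under stated assumptions, not KAT's implementation or interface: the function names `canonical`, `count_kmers` and `compare_spectra` are hypothetical, and everything is held in memory, which would not scale to real WGS data.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(sequences, k):
    """Count canonical k-mers, skipping windows that contain non-ACGT characters."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def compare_spectra(read_counts, assembly_counts):
    """Map (multiplicity in reads, copies in assembly) -> number of distinct k-mers."""
    matrix = Counter()
    for kmer, n_reads in read_counts.items():
        matrix[(n_reads, assembly_counts.get(kmer, 0))] += 1
    return matrix


reads = ["ACGTACGTAC", "CGTACGTACG", "ACGTTCGTAC"]
assembly = ["ACGTACGTACG"]
matrix = compare_spectra(count_kmers(reads, 5), count_kmers(assembly, 5))
# Distinct read k-mers that are absent from the assembly (candidate errors or missing content).
absent = sum(n for (_, copies), n in matrix.items() if copies == 0)
print(absent)
```

Read k-mers with low multiplicity and zero assembly copies are typically sequencing errors, while high-multiplicity k-mers missing from the assembly suggest content the assembler dropped.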
Autoantibody recognition mechanisms of p53 epitopes
NASA Astrophysics Data System (ADS)
Phillips, J. C.
2016-06-01
There is an urgent need for economical blood based, noninvasive molecular biomarkers to assist in the detection and diagnosis of cancers in a cost-effective manner at an early stage, when curative interventions are still possible. Serum autoantibodies are attractive biomarkers for early cancer detection, but their development has been hindered by the punctuated genetic nature of the ten million known cancer mutations. A landmark study of 50,000 patients (Pedersen et al., 2013) showed that a few p53 15-mer epitopes are much more sensitive colon cancer biomarkers than p53, which in turn is a more sensitive cancer biomarker than any other protein. The function of p53 as a nearly universal "tumor suppressor" is well established, because of its strong immunogenicity in terms of not only antibody recruitment, but also stimulation of autoantibodies. Here we examine dimensionally compressed bioinformatic fractal scaling analysis for identifying the few sensitive epitopes from the p53 amino acid sequence, and show how it could be used for early cancer detection (ECD). We trim 15-mers to 7-mers, and identify specific 7-mers from other species that could be more sensitive to aggressive human cancers, such as liver cancer. Our results could provide a roadmap for ECD.
NASA Technical Reports Server (NTRS)
Aghazarian, Hrand
2009-01-01
The R4SA GUI mentioned in the immediately preceding article is a user-friendly interface for controlling one or more robots. This GUI makes it possible to perform meaningful real-time field experiments and research in robotics at an unmatched level of fidelity, within minutes of setup. It provides such powerful graphing modes as that of a digitizing oscilloscope that displays up to 250 variables at rates between 1 and 200 Hz. This GUI can be configured as multiple intuitive interfaces for acquisition of data, command, and control to enable rapid testing of subsystems or an entire robot system while simultaneously performing analysis of data. The R4SA software establishes an intuitive component-based design environment that can be easily reconfigured for any robotic platform by creating or editing setup configuration files. The R4SA GUI enables event-driven and conditional sequencing similar to those of Mars Exploration Rover (MER) operations. It has been certified as part of the MER ground support equipment and, therefore, is allowed to be utilized in conjunction with MER flight hardware. The R4SA GUI could also be adapted for use in embedded computing systems, other than that of the MER, for commanding and real-time analysis of data.
Silk-based biomaterials functionalized with fibronectin type II promotes cell adhesion.
Pereira, Ana Margarida; Machado, Raul; da Costa, André; Ribeiro, Artur; Collins, Tony; Gomes, Andreia C; Leonor, Isabel B; Kaplan, David L; Reis, Rui L; Casal, Margarida
2017-01-01
The objective of this work was to exploit the fibronectin type II (FNII) module from human matrix metalloproteinase-2 as a functional domain for the development of silk-based biopolymer blends that display enhanced cell adhesion properties. The DNA sequence of spider dragline silk protein (6mer) was genetically fused with the FNII coding sequence and expressed in Escherichia coli. The chimeric protein 6mer+FNII was purified by non-chromatographic methods. Films prepared from 6mer+FNII by solvent casting promoted only limited cell adhesion of human skin fibroblasts. However, the performance of the material in terms of cell adhesion was significantly improved, in a concentration-dependent manner, when 6mer+FNII was combined with a silk-elastin-like protein. With this work we describe a novel class of biopolymers that promote cell adhesion and are potentially useful as biomaterials for tissue engineering and regenerative medicine. This work reports the development of biocompatible silk-based composites with enhanced cell adhesion properties suitable for biomedical applications in regenerative medicine. The biocomposites were produced by combining a genetically engineered silk-elastin-like protein with a genetically engineered spider-silk-based polypeptide carrying the three domains of the fibronectin type II module from human metalloproteinase-2. These composites were processed into free-standing films by solvent casting and characterized for their biological behavior. To our knowledge this is the first report of the exploitation of all three FNII domains as a functional domain for the development of bioinspired materials with improved biological performance. The present study highlights the potential of using genetically engineered protein-based composites as a platform for the development of new bioinspired biomaterials. Copyright © 2016 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Yingst, R Aileen; Cohen, B A; Crumpler, L; Schmidt, M E; Schrader, C M
2011-01-01
We tested the science operational strategy used for the Mars Exploration Rover (MER) mission on Mars to determine its suitability for conducting remote geology on the Moon by conducting a field test at Cerro de Santa Clara, New Mexico. This region contains volcanic and sedimentary products from a variety of provenances, mimicking the variety that might be found at a lunar site such as South Pole-Aitken Basin. At each site a Science Team broke down observational "days" into a sequence of observations of features and targets of interest. The number, timing, and sequence of observations was chosen to mimic those used by the MERs when traversing. Images simulating high-resolution stereo and hand lens-scale images were taken using a professional SLR digital camera; multispectral and XRD data were acquired from samples to mimic the availability of geochemical data. A separate Tiger Team followed the Science Team and examined each site using traditional terrestrial field methods, facilitating comparison between what was revealed by human versus rover-inspired methods. We conclude from this field test that MER-inspired methodology is not conducive to utilizing all acquired data in a timely manner for the case of any lunar architecture that involves the acquisition of rover data in near real-time. We additionally conclude that a methodology similar to that used for MER can be adapted for use on the Moon if mission goals are focused on reconnaissance. If the goal is to locate and identify a specific feature or material, such as water ice, a different methodology will likely be needed.
Yingst, R. Aileen; Cohen, B. A.; Crumpler, L.; Schmidt, M. E.; Schrader, C. M.
2017-01-01
Background We tested the science operational strategy used for the Mars Exploration Rover (MER) mission on Mars to determine its suitability for conducting remote geology on the Moon by conducting a field test at Cerro de Santa Clara, New Mexico. This region contains volcanic and sedimentary products from a variety of provenances, mimicking the variety that might be found at a lunar site such as South Pole-Aitken Basin. Method At each site a Science Team broke down observational “days” into a sequence of observations of features and targets of interest. The number, timing, and sequence of observations was chosen to mimic those used by the MERs when traversing. Images simulating high-resolution stereo and hand lens-scale images were taken using a professional SLR digital camera; multispectral and XRD data were acquired from samples to mimic the availability of geochemical data. A separate Tiger Team followed the Science Team and examined each site using traditional terrestrial field methods, facilitating comparison between what was revealed by human versus rover-inspired methods. Lessons Learned We conclude from this field test that MER-inspired methodology is not conducive to utilizing all acquired data in a timely manner for the case of any lunar architecture that involves the acquisition of rover data in near real-time. We additionally conclude that a methodology similar to that used for MER can be adapted for use on the Moon if mission goals are focused on reconnaissance. If the goal is to locate and identify a specific feature or material, such as water ice, a different methodology will likely be needed. PMID:29309066
A FASTQ compressor based on integer-mapped k-mer indexing for biologist.
Zhang, Yeting; Patel, Khyati; Endrawis, Tony; Bowers, Autumn; Sun, Yazhou
2016-03-15
Next generation sequencing (NGS) technologies have gained considerable popularity among biologists. For example, RNA-seq, which provides both genomic and functional information, has been widely used by recent functional and evolutionary studies, especially in non-model organisms. However, storing and transmitting these large data sets (primarily in FASTQ format) have become genuine challenges, especially for biologists with little informatics experience. Data compression is thus a necessity. KIC, a FASTQ compressor based on a new integer-mapped k-mer indexing method, was developed (available at http://www.ysunlab.org/kic.jsp). It offers high compression ratio on sequence data, outstanding user-friendliness with graphic user interfaces, and proven reliability. Evaluated on multiple large RNA-seq data sets from both human and plants, it was found that the compression ratio of KIC had exceeded all major generic compressors, and was comparable to those of the latest dedicated compressors. KIC enables researchers with minimal informatics training to take advantage of the latest sequence compression technologies, easily manage large FASTQ data sets, and reduce storage and transmission cost. Copyright © 2015 Elsevier B.V. All rights reserved.
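The abstract does not describe how KIC maps k-mers to integers. As a general illustration of the idea only, one common scheme packs each base into 2 bits so that a k-mer becomes a single integer that can serve as an index key. The names `kmer_to_int`, `int_to_kmer` and `rolling_kmer_ints` are hypothetical, and the sketch assumes sequences containing only A, C, G and T.

```python
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASES = "ACGT"


def kmer_to_int(kmer):
    """Pack a k-mer into an integer, 2 bits per base, first base most significant."""
    value = 0
    for base in kmer:
        value = (value << 2) | _CODE[base]
    return value


def int_to_kmer(value, k):
    """Invert kmer_to_int for a k-mer of known length k."""
    bases = []
    for _ in range(k):
        bases.append(_BASES[value & 3])
        value >>= 2
    return "".join(reversed(bases))


def rolling_kmer_ints(seq, k):
    """Encode every k-mer of seq in one pass by shifting in one base at a time."""
    mask = (1 << (2 * k)) - 1  # keeps only the lowest 2k bits
    value = 0
    encoded = []
    for i, base in enumerate(seq):
        value = ((value << 2) | _CODE[base]) & mask
        if i >= k - 1:
            encoded.append(value)
    return encoded


print(kmer_to_int("ACGT"))             # 27
print(int_to_kmer(27, 4))              # ACGT
print(rolling_kmer_ints("ACGTA", 4))   # [27, 108]
```

With this packing a 31-mer fits in a 64-bit integer, which is one reason integer-coded k-mers are attractive for indexing.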
Liu, Le; Zhang, Shijie; Lian, Chunlan
2015-01-01
Japanese red pine (Pinus densiflora) is extensively cultivated in Japan, Korea, China, and Russia and is harvested for timber, pulpwood, garden, and paper markets. However, genetic information and molecular markers have been very scarce for this species. In this study, over 51 million clean sequencing reads from P. densiflora mRNA were produced using Illumina paired-end sequencing technology. The assembly yielded 83,913 unigenes with a mean length of 751 bp, of which 54,530 (64.98%) unigenes showed similarity to sequences in the NCBI database. Among these, the best matches in the NCBI Nr database were Picea sitchensis (41.60%), Amborella trichopoda (9.83%), and Pinus taeda (4.15%). A total of 1953 putative microsatellites were identified in 1784 unigenes using MISA (MicroSAtellite) software, of which the tri-nucleotide repeats were most abundant (50.18%), and 629 EST-SSR (expressed sequence tag-simple sequence repeat) primer pairs were successfully designed. Among 20 EST-SSR primer pairs randomly chosen, 17 markers yielded amplification products of the expected size in P. densiflora. Our results will provide a valuable resource for gene-function analysis, germplasm identification, molecular marker-assisted breeding and resistance-related gene(s) mapping for P. densiflora. PMID:26690126
Characterization of IS1515, a Functional Insertion Sequence in Streptococcus pneumoniae
Muñoz, Rosario; López, Rubens; García, Ernesto
1998-01-01
We describe the characterization of a new insertion sequence, IS1515, identified in the genome of Streptococcus pneumoniae I41R, an unencapsulated mutant isolated many years ago (R. Austrian, H. P. Bernheimer, E. E. B. Smith, and G. T. Mills, J. Exp. Med. 110:585–602, 1959). A copy of this element located in the cap1EI41R gene was sequenced. The 871-bp-long IS1515 element possesses 12-bp perfect inverted repeats and generates a 3-bp target duplication upon insertion. The IS encodes a protein of 271 amino acid residues similar to the putative transposases of other insertion sequences, namely IS1381 from S. pneumoniae, ISL2 from Lactobacillus helveticus, IS702 from the cyanobacterium Calothrix sp. strain PCC 7601, and IS112 from Streptomyces albus G. IS1515 appears to be present in the genome of most type 1 pneumococci in a maximum of 13 copies, although it has also been found in the chromosome of pneumococcal isolates belonging to other serotypes. We have found that the unencapsulated phenotype of strain I41R is the result of both the presence of an IS1515 copy and a frameshift mutation in the cap1EI41R gene. Precise excision of the IS was observed in the type 1 encapsulated transformants isolated in experiments designed to repair the frameshift. These results reveal that IS1515 behaves quite differently from other previously described pneumococcal insertion sequences. Several copies of IS1515 were also able to excise and move to other locations in the chromosome of S. pneumoniae. To our knowledge, this is the first report of a functional IS in pneumococcus. PMID:9580131
Eltahir, Yassir M.; Al Hammadi, Zulaikha M.; Tao, Ying; Queen, Krista; Hosani, Farida Al; Gerber, Susan I.; Hall, Aron J.; Al Muhairi, Salama
2017-01-01
Camels are known carriers for many viral pathogens, including Middle East respiratory syndrome coronavirus (MERS-CoV). It is likely that there are additional, as yet unidentified viruses in camels with the potential to cause disease in humans. In this study, we performed metagenomic sequencing analysis on nasopharyngeal swab samples from 108 MERS-CoV-positive dromedary camels from a live animal market in Abu Dhabi, United Arab Emirates. We obtained a total of 846.72 million high-quality reads from these nasopharyngeal swab samples, of which 2.88 million (0.34%) were related to viral sequences while 512.63 million (60.5%) and 50.87 million (6%) matched bacterial and eukaryotic sequences, respectively. Among the viral reads, sequences related to mammalian viruses from 13 genera in 10 viral families were identified, including Coronaviridae, Nairoviridae, Paramyxoviridae, Parvoviridae, Polyomaviridae, Papillomaviridae, Astroviridae, Picornaviridae, Poxviridae, and Genomoviridae. Some viral sequences belong to known camel or human viruses and others are from potentially novel camel viruses with only limited sequence similarity to virus sequences in GenBank. A total of five potentially novel virus species or strains were identified. Co-infection of at least two recently identified camel coronaviruses was detected in 92.6% of the camels in the study. This study provides a comprehensive survey of viruses in the virome of upper respiratory samples in camels that have extensive contact with the human population. PMID:28902913
Chan, Jasper Fuk-Woo; Choi, Garnet Kwan-Yue; Tsang, Alan Ka-Lun; Tee, Kah-Meng; Lam, Ho-Yin; Yip, Cyril Chik-Yan; To, Kelvin Kai-Wang; Cheng, Vincent Chi-Chung; Yeung, Man-Lung; Lau, Susanna Kar-Pui; Woo, Patrick Chiu-Yat; Chan, Kwok-Hung; Tang, Bone Siu-Fai
2015-01-01
Based on findings in small RNA-sequencing (Seq) data analysis, we developed highly sensitive and specific real-time reverse transcription (RT)-PCR assays with locked nucleic acid probes targeting the abundantly expressed leader sequences of Middle East respiratory syndrome coronavirus (MERS-CoV) and other human coronaviruses. Analytical and clinical evaluations showed their noninferiority to a commercial multiplex PCR test for the detection of these coronaviruses. PMID:26019210
NASA Technical Reports Server (NTRS)
Zhang, Zhengdong; Willson, Richard C.; Fox, George E.
2002-01-01
MOTIVATION: The phylogenetic structure of the bacterial world has been intensively studied by comparing sequences of 16S ribosomal RNA (16S rRNA). This database of sequences is now widely used to design probes for the detection of specific bacteria or groups of bacteria one at a time. The success of such methods reflects the fact that there are local sequence segments that are highly characteristic of particular organisms or groups of organisms. It is not clear, however, the extent to which such signature sequences exist in the 16S rRNA dataset. A better understanding of the numbers and distribution of highly informative oligonucleotide sequences may facilitate the design of hybridization arrays that can characterize the phylogenetic position of an unknown organism or serve as the basis for the development of novel approaches for use in bacterial identification. RESULTS: A computer-based algorithm that characterizes the extent to which any individual oligonucleotide sequence in 16S rRNA is characteristic of any particular bacterial grouping was developed. A measure of signature quality, Q(s), was formulated and subsequently calculated for every individual oligonucleotide sequence in the size range of 5-11 nucleotides and for 15mers with reference to each cluster and subcluster in a 929 organism representative phylogenetic tree. Subsequently, the perfect signature sequences were compared to the full set of 7322 sequences to see how common false positives were. The work completed here establishes beyond any doubt that highly characteristic oligonucleotides exist in the bacterial 16S rRNA sequence dataset in large numbers. Over 16,000 15mers were identified that might be useful as signatures. Signature oligonucleotides are available for over 80% of the nodes in the representative tree.
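The notion of a perfect signature oligonucleotide in this record, one present in every member of a group and absent from every sequence outside it, can be sketched as a set operation. The abstract does not give the definition of the quality measure Q(s), so it is not reproduced here; the function names and the toy sequences below are hypothetical and only illustrate the perfect-signature case.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def perfect_signatures(sequences, group, k):
    """Return k-mers found in every sequence of `group` and in no other sequence.

    sequences: dict mapping organism name -> sequence string
    group:     set of organism names forming the cluster of interest
    """
    inside = [kmers(s, k) for name, s in sequences.items() if name in group]
    outside = [kmers(s, k) for name, s in sequences.items() if name not in group]
    if not inside:
        return set()
    shared = set.intersection(*inside)       # present in all group members
    seen_outside = set().union(*outside)     # present anywhere outside the group
    return shared - seen_outside


# Toy stand-ins for aligned-free 16S rRNA fragments (not real data).
toy = {"a1": "GGACGTCC", "a2": "TTACGTAA", "b1": "GGTTTTCC"}
print(perfect_signatures(toy, {"a1", "a2"}, 4))  # {'ACGT'}
```

Checking candidate signatures against a larger sequence set, as the study did with its full collection, amounts to rerunning the outside-group test with more sequences.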
A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.
Luczak, Brian B; James, Benjamin T; Girgis, Hani Z
2017-12-06
Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. Published by Oxford University Press.
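As a small illustration of statistics computed on two k-mer histograms, the sketch below builds fixed-order histograms and evaluates two simple measures, Manhattan distance and cosine similarity. These two are chosen here only as familiar examples; the abstract does not list which 33 statistics were evaluated, and the function names are hypothetical. Sequences are assumed to contain only A, C, G and T.

```python
import math
from collections import Counter
from itertools import product


def kmer_histogram(seq, k):
    """Return counts of all 4**k possible k-mers, in a fixed lexicographic order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get("".join(p), 0) for p in product("ACGT", repeat=k)]


def manhattan(h1, h2):
    """Sum of absolute differences between two histograms (0 means identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def cosine_similarity(h1, h2):
    """Cosine of the angle between two histograms (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(h1, h2))
    norm = math.sqrt(sum(a * a for a in h1)) * math.sqrt(sum(b * b for b in h2))
    return dot / norm if norm else 0.0


query = "ACGTACGTACGTTTGA"
close = "ACGTACGTACGATTGA"   # one substitution relative to query
far = "GGGGCCCCGGGGCCCC"
hq, hc, hf = (kmer_histogram(s, 3) for s in (query, close, far))
print(manhattan(hq, hc), manhattan(hq, hf))
print(cosine_similarity(hq, hc), cosine_similarity(hq, hf))
```

Because each histogram has a fixed length of 4**k regardless of sequence length, a shorter word size reduces memory and time, which is consistent with the observation in the abstract about shorter words.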
Chou, Cheng-Chung; Huang, Yi-Han
2012-01-01
This paper reports a nucleic acid sandwich hybridization assay with a quantum dot (QD)-induced fluorescence resonance energy transfer (FRET) reporter system. Two label-free hemagglutinin H5 sequences (60-mer DNA and 630-nt cDNA fragment) of avian influenza viruses were used as the targets in this work. Two oligonucleotides (a 16-mer and an 18-mer) that specifically recognize two separate but neighboring regions of the H5 sequences served as the capturing and reporter probes, respectively. The capturing probe was conjugated to QD655 (donor) in a molar ratio of 10:1 (probe-to-QD), and the reporter probe was labeled with Alexa Fluor 660 dye (acceptor) during synthesis. The sandwich hybridization assay was done in a 20 μL transparent, adhesive frame-confined microchamber on a disposable, temperature-adjustable indium tin oxide (ITO) glass slide. The FRET signal in response to the sandwich hybridization was monitored by a homemade optical sensor comprising a single 400 nm UV light-emitting diode (LED), optical fibers, and a miniature 16-bit spectrophotometer. The target with a concentration ranging from 0.5 nM to 1 μM was successfully correlated with both QD emission decrease at 653 nm and dye emission increase at 690 nm. To sum up, this work is beneficial for developing a portable QD-based nucleic acid sensor for on-site pathogen detection. PMID:23211753
2011-01-01
Background One of the key goals of oak genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of forests to increase their health and productivity. Deep-coverage large-insert genomic libraries are a crucial tool for attaining this objective. We report herein the construction of a BAC library for Quercus robur, its characterization and an analysis of BAC end sequences. Results The EcoRI library generated consisted of 92,160 clones, 7% of which had no insert. Levels of chloroplast and mitochondrial contamination were below 3% and 1%, respectively. Mean clone insert size was estimated at 135 kb. The library represents 12 haploid genome equivalents, and the likelihood of finding a particular oak sequence of interest is greater than 99%. Genome coverage was confirmed by PCR screening of the library with 60 unique genetic loci sampled from the genetic linkage map. In total, about 20,000 high-quality BAC end sequences (BESs) were generated by sequencing 15,000 clones. Roughly 5.88% of the combined BAC end sequence length corresponded to known retroelements while ab initio repeat detection methods identified 41 additional repeats. Collectively, characterized and novel repeats account for roughly 8.94% of the genome. Further analysis of the BESs revealed 1,823 putative genes suggesting at least 29,340 genes in the oak genome. BESs were aligned with the genome sequences of Arabidopsis thaliana, Vitis vinifera and Populus trichocarpa. One putative collinear microsyntenic region encoding an alcohol acyl transferase protein was observed between oak and chromosome 2 of V. vinifera. Conclusions This BAC library provides a new resource for genomic studies, including SSR marker development, physical mapping, comparative genomics and genome sequencing. BES analysis provided insight into the structure of the oak genome. 
These sequences will be used in the assembly of a future genome sequence for oak. PMID:21645357
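The relationship between library depth and the chance of recovering a given locus, as reported in the oak BAC library record above (12 haploid genome equivalents, likelihood above 99%), can be checked with the commonly used Clarke-Carbon estimate. The abstract does not say which formula the authors applied, so this is an assumption; the function names are hypothetical, and the clone count below is derived from the stated 92,160 clones with 7% lacking an insert.

```python
import math


def probability_covered(genome_equivalents):
    """Poisson approximation: P = 1 - exp(-c) for a library of depth c."""
    return 1.0 - math.exp(-genome_equivalents)


def probability_covered_exact(n_clones, insert_fraction):
    """Clarke-Carbon form: P = 1 - (1 - f)**N, with f = insert size / genome size."""
    return 1.0 - (1.0 - insert_fraction) ** n_clones


depth = 12                                   # genome equivalents, from the abstract
clones_with_insert = round(92160 * 0.93)     # 7% of clones had no insert
fraction = depth / clones_with_insert        # per-clone genome fraction implied by the depth

print(probability_covered(depth))
print(probability_covered_exact(clones_with_insert, fraction))
```

Both forms give a probability well above 0.99 at 12-fold depth, in line with the figure quoted in the abstract.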
Long-read sequence assembly of the firefly Pyrocoelia pectoralis genome
Fu, Xinhua; Li, Jingjing; Tian, Yu; Quan, Weipeng; Zhang, Shu; Liu, Qian; Liang, Fan; Zhu, Xinlei; Zhang, Liangsheng
2017-01-01
Background Fireflies are a family of insects within the beetle order Coleoptera, or winged beetles, and they are among the most well-known and loved insects because of their bioluminescence. However, the firefly is in danger of extinction because of the massive destruction of its living environment. In order to improve the understanding of fireflies and protect them effectively, we sequenced the whole genome of the terrestrial firefly Pyrocoelia pectoralis. Findings Here, we developed a highly reliable genome resource for the terrestrial firefly Pyrocoelia pectoralis (E. Oliv., 1883; Coleoptera: Lampyridae) using single molecule real time (SMRT) sequencing on the PacBio Sequel platform. In total, 57.8 Gb of long reads were generated and assembled into a 760.4-Mb genome, which is close to the estimated genome size and covered 98.7% complete and 0.7% partial insect Benchmarking Universal Single-Copy Orthologs. The k-mer analysis showed that this genome is highly heterozygous. However, our long-read assembly demonstrates high contiguity, with a contig N50 length of 3.04 Mb and a longest contig length of 13.69 Mb. Furthermore, 135 589 SSRs and 341 Mb of repeat sequences were detected. A total of 23 092 genes were predicted; 88.44% of genes were annotated with one or more related functions. Conclusions We assembled a high-quality firefly genome, which will not only provide insights into the conservation and biodiversity of fireflies, but also provide a wealth of information to study the mechanisms of their sexual communication, bio-luminescence, and evolution. PMID:29186486
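The contig N50 quoted in the firefly assembly record above is the length of the shortest contig in the smallest set of largest contigs that together cover at least half of the assembly. A minimal sketch of the calculation follows; the function name `n50` and the toy contig lengths are illustrative only and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:   # reached at least half of the assembly
            return length
    raise ValueError("n50 requires at least one contig")


# Toy assembly of 30 units: 10 + 8 = 18 covers more than half, so N50 is 8.
print(n50([2, 4, 6, 8, 10]))
```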
Application of State Analysis and Goal-based Operations to a MER Mission Scenario
NASA Technical Reports Server (NTRS)
Morris, John Richard; Ingham, Michel D.; Mishkin, Andrew H.; Rasmussen, Robert D.; Starbird, Thomas W.
2006-01-01
State Analysis is a model-based systems engineering methodology employing a rigorous discovery process which articulates operations concepts and operability needs as an integrated part of system design. The process produces requirements on system and software design in the form of explicit models which describe the system behavior in terms of state variables and the relationships among them. By applying State Analysis to an actual MER flight mission scenario, this study addresses the specific real world challenges of complex space operations and explores technologies that can be brought to bear on future missions. The paper first describes the tools currently used on a daily basis for MER operations planning and provides an in-depth description of the planning process, in the context of a Martian day's worth of rover engineering activities, resource modeling, flight rules, science observations, and more. It then describes how State Analysis allows for the specification of a corresponding goal-based sequence that accomplishes the same objectives, with several important additional benefits.
Application of State Analysis and Goal-Based Operations to a MER Mission Scenario
NASA Technical Reports Server (NTRS)
Morris, J. Richard; Ingham, Michel D.; Mishkin, Andrew H.; Rasmussen, Robert D.; Starbird, Thomas W.
2006-01-01
State Analysis is a model-based systems engineering methodology employing a rigorous discovery process which articulates operations concepts and operability needs as an integrated part of system design. The process produces requirements on system and software design in the form of explicit models which describe the behavior of states and the relationships among them. By applying State Analysis to an actual MER flight mission scenario, this study addresses the specific real world challenges of complex space operations and explores technologies that can be brought to bear on future missions. The paper describes the tools currently used on a daily basis for MER operations planning and provides an in-depth description of the planning process, in the context of a Martian day's worth of rover engineering activities, resource modeling, flight rules, science observations, and more. It then describes how State Analysis allows for the specification of a corresponding goal-based sequence that accomplishes the same objectives, with several important additional benefits.
An mRNA-Derived Noncoding RNA Targets and Regulates the Ribosome
Pircher, Andreas; Bakowska-Zywicka, Kamilla; Schneider, Lukas; Zywicki, Marek; Polacek, Norbert
2014-01-01
Summary The structural and functional repertoire of small non-protein-coding RNAs (ncRNAs) is central for establishing gene regulation networks in cells and organisms. Here, we show that an mRNA-derived 18-nucleotide-long ncRNA is capable of downregulating translation in Saccharomyces cerevisiae by targeting the ribosome. This 18-mer ncRNA binds to polysomes upon salt stress and is crucial for efficient growth under hyperosmotic conditions. Although the 18-mer RNA originates from the TRM10 locus, which encodes a tRNA methyltransferase, genetic analyses revealed the 18-mer RNA nucleotide sequence, rather than the mRNA-encoded enzyme, as the translation regulator. Our data reveal the ribosome as a target for a small regulatory ncRNA and demonstrate the existence of a yet unknown mechanism of translation regulation. Ribosome-targeted small ncRNAs are found in all domains of life and represent a prevalent but so far largely unexplored class of regulatory molecules. PMID:24685157
DETECTION OF DNA DAMAGE USING A FIBEROPTIC BIOSENSOR
A rapid and sensitive fiber optic biosensor assay for radiation-induced DNA damage is reported. For this assay, a biotin-labeled capture oligonucleotide (38 mer) was immobilized to an avidin-coated quartz fiber. Hybridization of a dye-labeled complementary sequence was observed...
A novel alignment-free method for detection of lateral genetic transfer based on TF-IDF.
Cong, Yingnan; Chan, Yao-Ban; Ragan, Mark A
2016-07-25
Lateral genetic transfer (LGT) plays an important role in the evolution of microbes. Existing computational methods for detecting genomic regions of putative lateral origin scale poorly to large data. Here, we propose a novel method based on TF-IDF (Term Frequency-Inverse Document Frequency) statistics to detect not only regions of lateral origin, but also their origin and direction of transfer, in sets of hierarchically structured nucleotide or protein sequences. This approach is based on the frequency distributions of k-mers in the sequences. If a set of contiguous k-mers appears sufficiently more frequently in another phyletic group than in its own, we infer that they have been transferred from the first group to the second. We performed rigorous tests of TF-IDF using simulated and empirical datasets. With the simulated data, we tested our method under different parameter settings for sequence length, substitution rate between and within groups and post-LGT, deletion rate, length of transferred region and k size, and found that we can detect LGT events with high precision and recall. Our method performs better than an established method, ALFY, which has high recall but low precision. Our method is efficient, with runtime increasing approximately linearly with sequence length.
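To make the k-mer TF-IDF idea concrete, the sketch below scores each k-mer in each phyletic group using the textbook TF-IDF definition, treating every group as one "document". This is an assumption for illustration only: the abstract does not give the exact statistic or thresholds the authors use, and the names `kmers`, `tfidf` and the toy groups are hypothetical.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf(groups, k):
    """Score k-mers per group with textbook TF-IDF.

    groups maps a group name to a list of sequences; each group is treated
    as a single document. TF is the k-mer's share of the group's k-mers,
    IDF is log(number of groups / number of groups containing the k-mer).
    """
    counts = {
        name: Counter(km for seq in seqs for km in kmers(seq, k))
        for name, seqs in groups.items()
    }
    doc_freq = Counter()
    for group_counts in counts.values():
        doc_freq.update(group_counts.keys())
    n_groups = len(groups)
    scores = {}
    for name, group_counts in counts.items():
        total = sum(group_counts.values())
        scores[name] = {
            km: (n / total) * math.log(n_groups / doc_freq[km])
            for km, n in group_counts.items()
        }
    return scores


# Toy data (hypothetical): ACGT occurs in both groups, TTTT only in group B.
toy_groups = {"A": ["ACGTACGT"], "B": ["ACGTTTTT"]}
toy_scores = tfidf(toy_groups, k=4)
print(toy_scores["B"]["TTTT"], toy_scores["A"]["ACGT"])
```

A k-mer shared by every group gets a score of zero, while a k-mer concentrated in one group scores highly there. Turning such scores into a call about transferred regions and their direction would need the contiguity and comparison rules described in the abstract, which this sketch does not attempt.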
Zhang, Qian; Jun, Se -Ran; Leuze, Michael; ...
2017-01-19
The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral tree of life. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. Lastly, the resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Qian; Jun, Se -Ran; Leuze, Michael
The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral tree of life. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. Lastly, the resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses.
Zhang, Qian; Jun, Se-Ran; Leuze, Michael; Ussery, David; Nookaew, Intawat
2017-01-01
The development of rapid, economical genome sequencing has shed new light on the classification of viruses. As of October 2016, the National Center for Biotechnology Information (NCBI) database contained >2 million viral genome sequences and a reference set of ~4000 viral genome sequences that cover a wide range of known viral families. Whole-genome sequences can be used to improve viral classification and provide insight into the viral “tree of life”. However, due to the lack of evolutionary conservation amongst diverse viruses, it is not feasible to build a viral tree of life using traditional phylogenetic methods based on conserved proteins. In this study, we used an alignment-free method that uses k-mers as genomic features for a large-scale comparison of complete viral genomes available in RefSeq. To determine the optimal feature length, k (an essential step in constructing a meaningful dendrogram), we designed a comprehensive strategy that combines three approaches: (1) cumulative relative entropy, (2) average number of common features among genomes, and (3) the Shannon diversity index. This strategy was used to determine k for all 3,905 complete viral genomes in RefSeq. The resulting dendrogram shows consistency with the viral taxonomy of the ICTV and the Baltimore classification of viruses. PMID:28102365
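Of the three criteria listed for choosing k, the Shannon diversity index is the simplest to illustrate. The sketch below computes it over the k-mer frequency distribution of a single sequence using the standard definition H = -Σ p_i ln p_i. How the authors combine this index with the other two criteria to pick k is not stated in the abstract, so the helper names and toy sequences here are assumptions.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over observed k-mer frequencies."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Toy sequences (hypothetical): a repetitive one and a more varied one.
for toy_seq in ("AAAAAAAA", "ACGTACGA"):
    for k in (1, 2, 3):
        print(toy_seq, k, round(shannon_index(kmer_counts(toy_seq, k)), 3))
```

A homopolymer yields an index of zero at every k, whereas a varied sequence approaches ln(number of distinct k-mers) as its k-mers become evenly represented.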
Gazarian, Karlen G; Palacios-Rodríguez, Yadira; Gazarian, Tatiana G; Huerta, Leonor
2013-06-01
The crown region of the V3 loop in HIV-1 that contains the conserved amino acid sequence GPGR/G is known as the principal neutralizing determinant due to the extraordinary ability of antibodies to this region to neutralize the virus. To complement the existing peptide models of this epitope, we describe a family of 18 phage-displayed peptides, including linear 12mer and constrained 7mer peptides, that were selected by screening random libraries with serum from HIV-1 subtype B-infected patients. The 7mer constrained peptides presented two conserved amino acid sequences: PR-L at the N-terminus and GPG at the C-terminus. On the basis of these peptides we propose a mimotope model of the V3 crown epitope in which the PR-L and GPG sequences represent the two known epitope binding sites. The GPG sequence has the same function as the V3 crown GPGR sequence but without the involvement of the "R", despite its being considered the signature of the epitope in B-subtype viruses. The PR-L sequence contains a proline not present in the epitope that is postulated to induce kinks in the backbones of all peptides and create a spatial element mimicking the N-terminal conformationally variable binding site. Rabbit serum to these mimotopes recognized the V3 peptides and moderately decreased the fusion between HIV-1 Env- and CD4-expressing Jurkat cells. This study proposes the efficient generation, by means of patient sera, of V3 epitope mimics validated by interaction with the antibodies to contemporary viruses induced in patients. The serum antibody-selectable mimotopes are sources of novel information on the fine structure-function properties of the HIV-1 principal neutralizing domain and are candidate anti-HIV-1 immunogens. Copyright © 2012 Elsevier Ltd. All rights reserved.
FSH: fast spaced seed hashing exploiting adjacent hashes.
Girotto, Samuele; Comin, Matteo; Pizzi, Cinzia
2018-01-01
Patterns with wildcards in specified positions, namely spaced seeds, are increasingly used instead of k-mers in many bioinformatics applications that require indexing, querying and rapid similarity search, as they can provide better sensitivity. Many of these applications need to compute the hash of each position in the input sequences with respect to a given spaced seed, or to multiple spaced seeds. While the hashing of k-mers can be rapidly computed by exploiting the large overlap between consecutive k-mers, spaced seed hashing is usually computed from scratch for each position in the input sequence, resulting in slower processing. The method proposed in this paper, fast spaced-seed hashing (FSH), exploits the similarity of the hash values of spaced seeds computed at adjacent positions in the input sequence. In our experiments we compute the hash for each position of metagenomics reads from several datasets, with respect to different spaced seeds. We also propose a generalized version of the algorithm for the simultaneous computation of multiple spaced seed hashes. In the experiments, our algorithm computes the hash values of spaced seeds with a speedup, with respect to the traditional approach, of between 1.6× and 5.3×, depending on the structure of the spaced seed. Spaced seed hashing is a routine task for several bioinformatics applications. FSH performs this task efficiently and raises the question of whether other hashes can be exploited to further improve the speedup. This has the potential for major impact in the field, making spaced seed applications not only accurate, but also faster and more efficient. The software FSH is freely available for academic use at: https://bitbucket.org/samu661/fsh/overview.
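The contrast drawn above between k-mer hashing and spaced seed hashing can be sketched as follows. This is not the FSH algorithm: it shows only the incremental k-mer update and the "from scratch" spaced seed baseline that FSH aims to speed up. The 2-bit encoding, the '1'/'0' seed notation and the function names are assumptions made for illustration.

```python
# 2-bit encoding of nucleotides (an illustrative choice).
ENC = {"A": 0, "C": 1, "G": 2, "T": 3}


def rolling_kmer_hashes(seq, k):
    """Hash every k-mer by reusing the previous hash: shift in one base,
    mask off the base that fell out of the window."""
    mask = (1 << (2 * k)) - 1
    h = 0
    out = []
    for i, base in enumerate(seq):
        h = ((h << 2) | ENC[base]) & mask
        if i >= k - 1:
            out.append(h)
    return out


def spaced_seed_hashes(seq, seed):
    """Hash every position from scratch, encoding only the bases aligned
    with '1' (care) positions of the seed and skipping '0' (wildcards)."""
    care = [i for i, c in enumerate(seed) if c == "1"]
    out = []
    for start in range(len(seq) - len(seed) + 1):
        h = 0
        for offset in care:
            h = (h << 2) | ENC[seq[start + offset]]
        out.append(h)
    return out


print(rolling_kmer_hashes("ACGTA", 3))
print(spaced_seed_hashes("ACGTA", "111"))
print(spaced_seed_hashes("ACGT", "101"))
```

With an all-ones seed the two functions agree, which is a useful sanity check. The per-position inner loop over care positions is the cost that the naive approach pays and that the k-mer update avoids.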
Willett-Brozick, J E; Savul, S A; Richey, L E; Baysal, B E
2001-08-01
Constitutional chromosomal translocations are relatively common causes of human morbidity, yet the DNA double-strand break (DSB) repair mechanisms that generate them are incompletely understood. We cloned, sequenced and analyzed the breakpoint junctions of a familial constitutional reciprocal translocation t(9;11)(p24;q23). Within the 10-kb region flanking the breakpoints, chromosome 11 had 25% repeat elements, whereas chromosome 9 had 98% repeats, 95% of which were L1-type LINE elements. The breakpoints occurred within an L1-type repeat element at 9p24 and at the 3'-end of an Alu sequence at 11q23. At the breakpoint junction of derivative chromosome 9, we discovered an unusually large 41-bp insertion, which showed 100% identity to 12S mitochondrial DNA (mtDNA) between nucleotides 896 and 936 of the mtDNA sequence. Analysis of the human genome failed to show the preexistence of the inserted sequence at normal chromosomes 9 and 11 breakpoint junctions or elsewhere in the genome, strongly suggesting that the insertion was derived from human mtDNA and captured into the junction during the DSB repair process. To our knowledge, these findings represent the first observation of spontaneous germ line insertion of modern human mtDNA sequences and suggest that DSB repair may play a role in inter-organellar gene transfer in vivo. Our findings also provide evidence for a previously unrecognized insertional mechanism in human, by which non-mobile extra-chromosomal fragments can be inserted into the genome at DSB repair junctions.
Nasica-Labouze, Jessica; Meli, Massimiliano; Derreumaux, Philippe; Colombo, Giorgio; Mousseau, Normand
2011-01-01
The self-organization of peptides into amyloidogenic oligomers is one of the key events for a wide range of molecular and degenerative diseases. Atomic-resolution characterization of the mechanisms responsible for the aggregation process and the resulting structures is thus a necessary step to improve our understanding of the determinants of these pathologies. To address this issue, we combine the accelerated sampling properties of replica exchange molecular dynamics simulations based on the OPEP coarse-grained potential with the atomic resolution description of interactions provided by all-atom MD simulations, and investigate the oligomerization process of the GNNQQNY peptide for three system sizes: 3-mers, 12-mers and 20-mers. Results from our integrated simulations show a rich variety of structural arrangements for aggregates of all sizes. Elongated fibril-like structures can form transiently in the 20-mer case, but they are not stable and easily interconvert into more globular and disordered forms. Our extensive characterization of the intermediate structures and their physico-chemical determinants points to a high degree of polymorphism for the GNNQQNY sequence that can be reflected at the macroscopic scale. Detailed mechanisms and structures that underlie amyloid aggregation are also provided. PMID:21625573
LeProust, Emily M.; Peck, Bill J.; Spirin, Konstantin; McCuen, Heather Brummel; Moore, Bridget; Namsaraev, Eugeni; Caruthers, Marvin H.
2010-01-01
We have achieved the ability to synthesize thousands of unique, long oligonucleotides (150mers) in fmol amounts using parallel synthesis of DNA on microarrays. The sequence accuracy of the oligonucleotides in such large-scale syntheses has been limited by the yields and side reactions of the DNA synthesis process used. While there has been significant demand for libraries of long oligos (150mer and more), the yields in conventional DNA synthesis and the associated side reactions have previously limited the availability of oligonucleotide pools to lengths <100 nt. Using novel array based depurination assays, we show that the depurination side reaction is the limiting factor for the synthesis of libraries of long oligonucleotides on Agilent Technologies’ SurePrint® DNA microarray platform. We also demonstrate how depurination can be controlled and reduced by a novel detritylation process to enable the synthesis of high quality, long (150mer) oligonucleotide libraries and we report the characterization of synthesis efficiency for such libraries. Oligonucleotide libraries prepared with this method have changed the economics and availability of several existing applications (e.g. targeted resequencing, preparation of shRNA libraries, site-directed mutagenesis), and have the potential to enable even more novel applications (e.g. high-complexity synthetic biology). PMID:20308161
Identification of peptide sequences that target to the brain using in vivo phage display.
Li, Jingwei; Zhang, Qizhi; Pang, Zhiqing; Wang, Yuchen; Liu, Qingfeng; Guo, Liangran; Jiang, Xinguo
2012-06-01
Phage display technology could provide a rapid means for the discovery of novel peptides. To find peptide ligands specific for the brain vascular receptors, we performed a modified phage display method. Phages were recovered from mouse brain parenchyma after intravenous administration of a random 7-mer peptide library. A longer circulation time was chosen according to the brain/blood biodistribution ratios of phage particles. Following sequential rounds of isolation, a number of phages were sequenced and a peptide sequence (CTSTSAPYC, denoted as PepC7) was identified. Clone 7-1, which encodes PepC7, exhibited translocation efficiency about 41-fold higher than the random library phage. Immunofluorescence analysis revealed that Clone 7-1 was transported into the brain significantly more efficiently than native M13 phage. Clone 7-1 was inhibited from homing to the brain in a dose-dependent fashion when cyclic peptides of the same sequence were present in a competition assay. Interestingly, the linear peptide (ATSTSAPYA, Pep7) and a scrambled control peptide PepSC7 (CSPATSYTC) did not compete with the phage at the same tested concentration (0.2-200 pg). Labeled with Cy5.5, PepC7 exhibited significant brain-targeting capability in in vivo optical imaging analysis. The cyclic conformation of PepC7 formed by a disulfide bond, and the correct structure itself, play a critical role in maintaining the selectivity and affinity for the brain. In conclusion, PepC7 is a promising brain-targeting motif that has not been reported before, and it could be applied to targeted drug delivery into the brain.
Meher, Prabina Kumar; Sahu, Tanmaya Kumar; Rao, A R
2016-11-05
DNA barcoding is a molecular diagnostic method that allows automated and accurate identification of species based on a short and standardized fragment of DNA. To this end, an attempt has been made in this study to develop a computational approach for identifying a species by comparing its barcode with the barcode sequences of known species present in the reference library. Each barcode sequence was first mapped onto a numeric feature vector based on k-mer frequencies, and the random forest methodology was then applied to the transformed dataset for species identification. The proposed approach outperformed similarity-based, tree-based and diagnostic-based approaches and was found to be comparable with existing supervised learning based approaches in terms of species identification success rate, when compared using real and simulated datasets. Based on the proposed approach, an online web interface SPIDBAR has also been developed and made freely available at http://cabgrid.res.in:8080/spidbar/ for species identification by the taxonomists. Copyright © 2016 Elsevier B.V. All rights reserved.
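The first step described above, mapping a barcode onto a numeric vector of k-mer frequencies, might look like the sketch below. The lexicographic ordering of k-mers, the normalisation to relative frequencies and the skipping of windows with ambiguous bases are all assumptions; the abstract does not specify them, and the random forest step is left out.

```python
from itertools import product


def kmer_frequency_vector(seq, k, alphabet="ACGT"):
    """Map a sequence to a vector of relative k-mer frequencies.

    The vector has len(alphabet)**k entries in lexicographic k-mer order.
    Windows containing characters outside the alphabet (e.g. N) are skipped.
    """
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    vec = [0.0] * len(index)
    counted = 0
    for i in range(len(seq) - k + 1):
        km = seq[i:i + k]
        if km in index:
            vec[index[km]] += 1
            counted += 1
    if counted:
        vec = [v / counted for v in vec]
    return vec


# Toy barcode fragment (hypothetical).
print(kmer_frequency_vector("ACGTN", 2))
```

Every barcode becomes a fixed-length vector regardless of its sequence length, which is what allows a standard supervised classifier to be trained on the reference library.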
Conserved expression of transposon-derived non-coding transcripts in primate stem cells.
Ramsay, LeeAnn; Marchetto, Maria C; Caron, Maxime; Chen, Shu-Huang; Busche, Stephan; Kwan, Tony; Pastinen, Tomi; Gage, Fred H; Bourque, Guillaume
2017-02-28
A significant portion of expressed non-coding RNAs in human cells is derived from transposable elements (TEs). Moreover, it has been shown that various long non-coding RNAs (lncRNAs), which come from the human endogenous retrovirus subfamily H (HERVH), are not only expressed but required for pluripotency in human embryonic stem cells (hESCs). To identify additional TE-derived functional non-coding transcripts, we generated RNA-seq data from induced pluripotent stem cells (iPSCs) of four primate species (human, chimpanzee, gorilla, and rhesus) and searched for transcripts whose expression was conserved. We observed that about 30% of TE instances expressed in human iPSCs had orthologous TE instances that were also expressed in chimpanzee and gorilla. Notably, our analysis revealed a number of repeat families with highly conserved expression profiles including HERVH but also MER53, which is known to be the source of a placental-specific family of microRNAs (miRNAs). We also identified a number of repeat families from all classes of TEs, including MLT1-type and Tigger families, that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved. Together, these results describe TE families and TE-derived lncRNAs whose conserved expression patterns can be used to identify what are likely functional TE-derived non-coding transcripts in primate iPSCs.
Liu, Zhandong; Venkatesh, Santosh S; Maley, Carlo C
2008-01-01
Background Genomes store information for building and maintaining organisms. Complete sequencing of many genomes provides the opportunity to study and compare global information properties of those genomes. Results We have analyzed aspects of the information content of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli (K-12) genomes. Virtually all possible (> 98%) 12 bp oligomers appear in vertebrate genomes while < 2% of 19 bp oligomers are present. Other species showed different ranges of > 98% to < 2% of possible oligomers in D. melanogaster (12–17 bp), C. elegans (11–17 bp), A. thaliana (11–17 bp), S. cerevisiae (10–16 bp) and E. coli (9–15 bp). Frequencies of unique oligomers in the genomes follow similar patterns. We identified a set of 2.6 M 15-mers that are more than 1 nucleotide different from all 15-mers in the human genome and so could be used as probes to detect microbes in human samples. In a human sample, these probes would detect 100% of the 433 currently fully sequenced prokaryotes and 75% of the 3065 fully sequenced viruses. The human genome is significantly more compact in sequence space than a random genome. We identified the most frequent 5- to 20-mers in the human genome, which may prove useful as PCR primers. We also identified a bacterium, Anaeromyxobacter dehalogenans, which has an exceptionally low diversity of oligomers given the size of its genome and its GC content. The entropy of coding regions in the human genome is significantly higher than non-coding regions and chromosomes. However chromosomes 1, 2, 9, 12 and 14 have a relatively high proportion of coding DNA without high entropy, and chromosome 20 is the opposite with a low frequency of coding regions but relatively high entropy. 
Conclusion Measures of the frequency of oligomers are useful for designing PCR assays and for identifying chromosomes and organisms with hidden structure that had not been previously recognized. This information may be used to detect novel microbes in human tissues. PMID:18973670
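Two of the measurements described in this abstract can be sketched on toy data: the fraction of all possible k-mers that occur in a sequence, and a check that a candidate probe is more than one nucleotide different from every k-mer of a reference. The function names and the brute-force neighbour enumeration are illustrative assumptions, not the authors' implementation, which would need far more memory-efficient structures at genome scale.

```python
def kmer_set(seq, k):
    """Set of k-mers in seq that contain only A, C, G, T."""
    valid = set("ACGT")
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    }


def kmer_coverage(seq, k):
    """Fraction of the 4**k possible k-mers that occur in seq."""
    return len(kmer_set(seq, k)) / 4 ** k


def differs_by_more_than_one(probe, reference_kmers):
    """True if probe is neither in the reference set nor one substitution
    away from any member (Hamming distance > 1 to all of them)."""
    if probe in reference_kmers:
        return False
    for i, original in enumerate(probe):
        for base in "ACGT":
            if base != original:
                neighbour = probe[:i] + base + probe[i + 1:]
                if neighbour in reference_kmers:
                    return False
    return True


# Toy reference (hypothetical).
toy_ref = kmer_set("AAAAC", 3)
print(kmer_coverage("AAAAC", 3))
print(differs_by_more_than_one("AAT", toy_ref))
print(differs_by_more_than_one("GTT", toy_ref))
```

Enumerating the 3k single-substitution neighbours of a probe and looking each up in a set avoids comparing the probe against every reference k-mer.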
Liu, Zhandong; Venkatesh, Santosh S; Maley, Carlo C
2008-10-30
Genomes store information for building and maintaining organisms. Complete sequencing of many genomes provides the opportunity to study and compare global information properties of those genomes. We have analyzed aspects of the information content of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli (K-12) genomes. Virtually all possible (> 98%) 12 bp oligomers appear in vertebrate genomes while < 2% of 19 bp oligomers are present. Other species showed different ranges of > 98% to < 2% of possible oligomers in D. melanogaster (12-17 bp), C. elegans (11-17 bp), A. thaliana (11-17 bp), S. cerevisiae (10-16 bp) and E. coli (9-15 bp). Frequencies of unique oligomers in the genomes follow similar patterns. We identified a set of 2.6 M 15-mers that are more than 1 nucleotide different from all 15-mers in the human genome and so could be used as probes to detect microbes in human samples. In a human sample, these probes would detect 100% of the 433 currently fully sequenced prokaryotes and 75% of the 3065 fully sequenced viruses. The human genome is significantly more compact in sequence space than a random genome. We identified the most frequent 5- to 20-mers in the human genome, which may prove useful as PCR primers. We also identified a bacterium, Anaeromyxobacter dehalogenans, which has an exceptionally low diversity of oligomers given the size of its genome and its GC content. The entropy of coding regions in the human genome is significantly higher than non-coding regions and chromosomes. However chromosomes 1, 2, 9, 12 and 14 have a relatively high proportion of coding DNA without high entropy, and chromosome 20 is the opposite with a low frequency of coding regions but relatively high entropy. 
Measures of the frequency of oligomers are useful for designing PCR assays and for identifying chromosomes and organisms with hidden structure that had not been previously recognized. This information may be used to detect novel microbes in human tissues.
SlideSort: all pairs similarity search for short reads
Shimizu, Kana; Tsuda, Koji
2011-01-01
Motivation: Recent progress in DNA sequencing technologies calls for fast and accurate algorithms that can evaluate sequence similarity for a huge amount of short reads. Searching similar pairs from a string pool is a fundamental process of de novo genome assembly, genome-wide alignment and other important analyses. Results: In this study, we designed and implemented an exact algorithm SlideSort that finds all similar pairs from a string pool in terms of edit distance. Using an efficient pattern growth algorithm, SlideSort discovers chains of common k-mers to narrow down the search. Compared to existing methods based on single k-mers, our method is more effective in reducing the number of edit distance calculations. In comparison to backtracking methods such as BWA, our method is much faster in finding remote matches, scaling easily to tens of millions of sequences. Our software has an additional function of single link clustering, which is useful in summarizing short reads for further processing. Availability: Executable binary files and C++ libraries are available at http://www.cbrc.jp/~shimizu/slidesort/ for Linux and Windows. Contact: slidesort@m.aist.go.jp; shimizu-kana@aist.go.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21148542
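For context, the single k-mer filtering strategy that the abstract compares against can be sketched as below: pairs of reads sharing at least one exact k-mer become candidates, and only candidates are verified with a dynamic-programming edit distance. This is the baseline idea, not SlideSort's chain-based pattern growth, and all names and toy reads are assumptions. The filter loses no true pair only when every read is at least (max_dist + 1) * k long, since max_dist edits can then disturb at most max_dist of max_dist + 1 disjoint k-length blocks.

```python
from collections import defaultdict
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def similar_pairs(reads, k, max_dist):
    """Return index pairs (i, j), i < j, with edit distance <= max_dist,
    verifying only pairs that share at least one exact k-mer."""
    index = defaultdict(set)
    for idx, read in enumerate(reads):
        for i in range(len(read) - k + 1):
            index[read[i:i + k]].add(idx)
    candidates = set()
    for ids in index.values():
        candidates.update(combinations(sorted(ids), 2))
    return sorted(
        (i, j) for i, j in candidates
        if edit_distance(reads[i], reads[j]) <= max_dist
    )


# Toy reads (hypothetical).
toy_reads = ["ACGTACGT", "ACGTACGA", "TTTTTTTT"]
print(similar_pairs(toy_reads, k=4, max_dist=1))
```

The number of edit distance calls is governed by how many candidate pairs the shared k-mer index produces, which is the quantity the abstract says chains of common k-mers reduce further.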
Hume, Maxwell A; Barrera, Luis A; Gisselbrecht, Stephen S; Bulyk, Martha L
2015-01-01
The Universal PBM Resource for Oligonucleotide Binding Evaluation (UniPROBE) serves as a convenient source of information on published data generated using universal protein-binding microarray (PBM) technology, which provides in vitro data about the relative DNA-binding preferences of transcription factors for all possible sequence variants of a length k ('k-mers'). The database displays important information about the proteins and displays their DNA-binding specificity data in terms of k-mers, position weight matrices and graphical sequence logos. This update to the database documents the growth of UniPROBE since the last update 4 years ago, and introduces a variety of new features and tools, including a new streamlined pipeline that facilitates data deposition by universal PBM data generators in the research community, a tool that generates putative nonbinding (i.e. negative control) DNA sequences for one or more proteins and novel motifs obtained by analyzing the PBM data using the BEEML-PBM algorithm for motif inference. The UniPROBE database is available at http://uniprobe.org. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Barcodes for genomes and applications
Zhou, Fengfeng; Olman, Victor; Xu, Ying
2008-01-01
Background Each genome has a stable distribution of the combined frequency for each k-mer and its reverse complement measured in sequence fragments as short as 1000 bps across the whole genome, for 1
El-Sagheer, Afaf H.; Sanzone, A. Pia; Gao, Rachel; Tavassoli, Ali; Brown, Tom
2011-01-01
A triazole mimic of a DNA phosphodiester linkage has been produced by templated chemical ligation of oligonucleotides functionalized with 5′-azide and 3′-alkyne. The individual azide and alkyne oligonucleotides were synthesized by standard phosphoramidite methods and assembled using a straightforward ligation procedure. This highly efficient chemical equivalent of enzymatic DNA ligation has been used to assemble a 300-mer from three 100-mer oligonucleotides, demonstrating the total chemical synthesis of very long oligonucleotides. The base sequences of the DNA strands containing this artificial linkage were copied during PCR with high fidelity and a gene containing the triazole linker was functional in Escherichia coli. PMID:21709264
Jung, Hyungtaek; Yoon, Byung-Ha; Kim, Woo-Jin; Kim, Dong-Wook; Hurwood, David A; Lyons, Russell E; Salin, Krishna R; Kim, Heui-Soo; Baek, Ilseon; Chand, Vincent; Mather, Peter B
2016-05-07
The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean is currently the world's most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium.
Jung, Hyungtaek; Yoon, Byung-Ha; Kim, Woo-Jin; Kim, Dong-Wook; Hurwood, David A.; Lyons, Russell E.; Salin, Krishna R.; Kim, Heui-Soo; Baek, Ilseon; Chand, Vincent; Mather, Peter B.
2016-01-01
The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean is currently the world’s most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium. PMID:27164098
Is sequencing better than phenotypic tests for the detection of pyrazinamide resistance?
Bouzouita, I; Cabibbe, A M; Trovato, A; Draoui, H; Ghariani, A; Midouni, B; Essalah, L; Mehiri, E; Cirillo, D M; Slim-Saidi, L
2018-06-01
Phenotypic tests used to detect pyrazinamide (PZA) resistance are slow and have a high rate of false resistance. The objective was to evaluate the accuracy of pncA sequencing for the detection of PZA resistance in Mycobacterium tuberculosis strains isolated in Tunisia. A total of 82 isolates, 41 resistant and 41 susceptible to PZA on BACTEC™ MGIT™ 960, were sequenced for pncA. Whole genome sequencing was performed for strains that were phenotypically resistant and had wild-type pncA, in addition to MGIT retesting with a modified protocol. Twenty-three strains resistant to PZA with negative pyrazinamidase (PZase) activity harboured a mutation in the promoter or coding region of pncA. However, 18 strains resistant to PZA did not present any mutation. Repeat MGIT 960 showed that 16 of 18 M. tuberculosis isolates were falsely resistant to PZA. Compared with MGIT, the PZase activity assay and pncA sequencing both presented a sensitivity of 92.0% (95%CI 73.9-99.0), with specificities of 96.5% (positive predictive value [PPV] 92.0%, negative predictive value [NPV] 96.5%) and 100.0% (PPV 100.0%, NPV 96.6%), respectively. The standard MGIT assay showed a high rate of false resistance to PZA, and the PZase activity assay is slow. pncA sequencing could therefore represent a rapid, accurate, alternative test to detect PZA resistance.
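The reported accuracy figures can be reproduced from a 2x2 table. The sketch below is a worked check, not the authors' analysis: the counts are inferred from the abstract under the assumption that repeat MGIT is the reference standard (41 - 16 = 25 truly resistant, 57 susceptible) and that all 23 pncA-mutant, PZase-negative strains are among the 25 resistant isolates. The function name `diagnostic_metrics` and the PZase counts are illustrative assumptions.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
    }

# Assumed counts (inferred, not stated in the abstract):
# 25 resistant and 57 susceptible isolates after repeat MGIT.
# pncA sequencing: 23 resistant isolates with a mutation, 2 without,
# and no mutation in any susceptible isolate.
pnca = diagnostic_metrics(tp=23, fp=0, fn=2, tn=57)

# PZase activity assay: counts chosen to be consistent with the
# reported 96.5% specificity (55/57); also an assumption.
pzase = diagnostic_metrics(tp=23, fp=2, fn=2, tn=55)

for name, metrics in (("pncA sequencing", pnca), ("PZase assay", pzase)):
    print(name, {key: round(value, 1) for key, value in metrics.items()})
```

Under these assumed counts the rounded values match the percentages quoted in the abstract, which suggests the figures were computed against the repeat MGIT result rather than the initial MGIT call.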
Wang, Yanqun; Liu, Di; Shi, Weifeng; Lu, Roujian; Wang, Wenling; Zhao, Yanjie; Deng, Yao; Zhou, Weimin; Ren, Hongguang; Wu, Jun; Wang, Yu; Wu, Guizhen; Gao, George F; Tan, Wenjie
2015-09-08
The Middle East respiratory syndrome coronavirus (MERS-CoV) causes a severe acute respiratory tract infection with a high fatality rate in humans. Coronaviruses are capable of infecting multiple species and can evolve rapidly through recombination events. Here, we report the complete genomic sequence analysis of a MERS-CoV strain imported to China from South Korea. The imported virus, provisionally named ChinaGD01, belongs to group 3 in clade B in the whole-genome phylogenetic tree and also has a similar tree topology structure in the open reading frame 1a and -b (ORF1ab) gene segment but clusters with group 5 of clade B in the tree constructed using the S gene. Genetic recombination analysis and lineage-specific single-nucleotide polymorphism (SNP) comparison suggest that the imported virus is a recombinant comprising group 3 and group 5 elements. The time-resolved phylogenetic estimation indicates that the recombination event likely occurred in the second half of 2014. Genetic recombination events between group 3 and group 5 of clade B may have implications for the transmissibility of the virus. The recent outbreak of MERS-CoV in South Korea has attracted global media attention due to the speed of spread and onward transmission. Here, we present the complete genome of the first imported MERS-CoV case in China and demonstrate genetic recombination events between group 3 and group 5 of clade B that may have implications for the transmissibility of MERS-CoV. Copyright © 2015 Wang et al.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Lingshu; Shi, Wei; Chappell, James D.
Middle East respiratory syndrome coronavirus (MERS-CoV) causes a highly lethal pulmonary infection with ~35% mortality. The potential for a future pandemic originating from animal reservoirs or health care-associated events is a major public health concern. There are no vaccines or therapeutic agents currently available for MERS-CoV. Using a probe-based single B cell cloning strategy, we have identified and characterized multiple neutralizing monoclonal antibodies (MAbs) specifically binding to the receptor-binding domain (RBD) or S1 (non-RBD) regions from a convalescent MERS-CoV-infected patient and from immunized rhesus macaques. RBD-specific MAbs tended to have greater neutralizing potency than non-RBD S1-specific MAbs. Six RBD-specific and five S1-specific MAbs could be sorted into four RBD and three non-RBD distinct binding patterns, based on competition assays, mapping neutralization escape variants, and structural analysis. We determined cocrystal structures for two MAbs targeting the RBD from different angles and show they can bind the RBD only in the “out” position. We then showed that selected RBD-specific, non-RBD S1-specific, and S2-specific MAbs given prophylactically prevented MERS-CoV replication in lungs and protected mice from lethal challenge. Importantly, combining RBD- and non-RBD MAbs delayed the emergence of escape mutations in a cell-based virus escape assay. These studies identify MAbs targeting different antigenic sites on S that will be useful for defining mechanisms of MERS-CoV neutralization and for developing more effective interventions to prevent or treat MERS-CoV infections. IMPORTANCE: MERS-CoV causes a highly lethal respiratory infection for which no vaccines or antiviral therapeutic options are currently available. Based on continuing exposure from established reservoirs in dromedary camels and bats, transmission of MERS-CoV into humans and future outbreaks are expected. Using structurally defined probes for the MERS-CoV spike glycoprotein (S), the target for neutralizing antibodies, single B cells were sorted from a convalescent human and immunized nonhuman primates (NHPs). MAbs produced from paired immunoglobulin gene sequences were mapped to multiple epitopes within and outside the receptor-binding domain (RBD) and protected against lethal MERS infection in a murine model following passive immunization. Importantly, combining MAbs targeting distinct epitopes prevented viral neutralization escape from RBD-directed MAbs. These data suggest that antibody responses to multiple domains on CoV spike protein may improve immunity and will guide future vaccine and therapeutic development efforts.
Knockdown of the bovine prion gene PRNP by RNA interference (RNAi) technology.
Sutou, Shizuyo; Kunishi, Miho; Kudo, Toshiyuki; Wongsrikeao, Pimprapar; Miyagishi, Makoto; Otoi, Takeshige
2007-07-26
Since prion gene-knockout mice do not contract prion diseases and animals in which production of prion protein (PrP) is reduced by half are resistant to the disease, we hypothesized that bovine animals with reduced PrP would be tolerant to BSE. Hence, attempts were made to produce bovine PRNP (bPRNP) that could be knocked down by RNA interference (RNAi) technology. Before an in vivo study, optimal conditions for knocking down bPRNP were determined in cultured mammalian cell systems. Factors examined included siRNA (short interfering RNA) expression plasmid vectors, target sites of PRNP, and lengths of siRNAs. Four siRNA expression plasmid vectors were used: three harboring different cloning sites were driven by the human U6 promoter (hU6), and one by the human tRNAVal promoter. Six target sites of bovine PRNP were designed using an algorithm. From 1 (22 mer) to 9 (19, 20, 21, 22, 23, 24, 25, 27, and 29 mer) siRNA expression vectors were constructed for each target site. As targets of siRNA, the entire bPRNP coding sequence was connected to the reporter gene of the fluorescent EGFP, or of firefly luciferase or Renilla luciferase. Target plasmid DNA was co-transfected with siRNA expression vector DNA into HeLaS3 cells, and fluorescence or luminescence was measured. The activities of siRNAs varied widely depending on the target sites, length of the siRNAs, and vectors used. Longer siRNAs were less effective, and 19 mer or 21 mer was generally optimal. Although 21 mer GGGGAGAACTTCACCGAAACT expressed by a hU6-driven plasmid with a Bsp MI cloning site was best under the present experimental conditions, the corresponding tRNA promoter-driven plasmid was almost equally useful. The effectiveness of this siRNA was confirmed by immunostaining and Western blotting. Four siRNA expression plasmid vectors, six target sites of bPRNP, and various lengths of siRNAs from 19 mer to 29 mer were examined to establish optimal conditions for knocking down of bPRNP in vitro. 
The most effective siRNA so far tested was 21 mer GGGGAGAACTTCACCGAAACT driven either by a hU6 or tRNA promoter, a finding that provides a basis for further studies in vivo.
Flibotte, Stephane; Moerman, Donald G
2008-10-21
Microarray comparative genomic hybridization (CGH) is currently one of the most powerful techniques to measure DNA copy number in large genomes. In humans, microarray CGH is widely used to assess copy number variants in healthy individuals and copy number aberrations associated with various diseases, syndromes and disease susceptibility. In model organisms such as Caenorhabditis elegans (C. elegans) the technique has been applied to detect mutations, primarily deletions, in strains of interest. Although various constraints on oligonucleotide properties have been suggested to minimize non-specific hybridization and improve the data quality, there have been few experimental validations for CGH experiments. For genomic regions where strict design filters would limit the coverage it would also be useful to quantify the expected loss in data quality associated with relaxed design criteria. We have quantified the effects of filtering various oligonucleotide properties by measuring the resolving power for detecting deletions in the human and C. elegans genomes using NimbleGen microarrays. Approximately twice as many oligonucleotides are typically required to be affected by a deletion in human DNA samples in order to achieve the same statistical confidence as one would observe for a deletion in C. elegans. Surprisingly, the ability to detect deletions strongly depends on the oligonucleotide 15-mer count, which is defined as the sum of the genomic frequency of all the constituent 15-mers within the oligonucleotide. A similarity level above 80% to non-target sequences over the length of the probe produces significant cross-hybridization. We recommend the use of a fairly large melting temperature window of up to 10 degrees C, the elimination of repeat sequences, the elimination of homopolymers longer than 5 nucleotides, and a threshold of -1 kcal/mol on the oligonucleotide self-folding energy. 
We observed very little difference in data quality when varying the oligonucleotide length between 50 and 70, and even when using an isothermal design strategy. We have determined experimentally the effects of varying several key oligonucleotide microarray design criteria for detection of deletions in C. elegans and humans with NimbleGen's CGH technology. Our oligonucleotide design recommendations should be applicable for CGH analysis in most species.
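The "15-mer count" defined above (the sum of the genomic frequencies of all constituent 15-mers of an oligonucleotide) and the homopolymer filter can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, only the forward strand is counted, and the toy example uses k = 3 instead of 15 so the arithmetic is easy to follow.

```python
import re
from collections import Counter

def kmer_frequencies(genome, k):
    """Count every overlapping k-mer in the genome (forward strand only)."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

def kmer_count_score(oligo, freqs, k):
    """Sum the genomic frequency of each constituent k-mer of the oligo."""
    return sum(freqs[oligo[i:i + k]] for i in range(len(oligo) - k + 1))

def has_long_homopolymer(seq, max_run=5):
    """True if any single base repeats more than max_run times in a row."""
    return re.search(r"(.)\1{%d,}" % max_run, seq) is not None

# Toy data with k = 3; a real design would use k = 15 on a whole genome.
genome = "ACGTACGTGGA"
freqs = kmer_frequencies(genome, 3)
score = kmer_count_score("ACGTG", freqs, 3)  # ACG (2) + CGT (2) + GTG (1)
print(score)
```

A probe with a high score is built from k-mers that recur across the genome, which is why the abstract ties this quantity to non-specific hybridization. The threshold at which a probe should be rejected is not given in the abstract and is left open here.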
Analysis of sequence repeats of proteins in the PDB.
Mary Rajathei, David; Selvaraj, Samuel
2013-12-01
Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40%, and that the superimposed structures of most of the sequence repeats maintain similar overall folding. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
Klon, Anthony E; Segrest, Jere P; Harvey, Stephen C
2002-12-06
Apolipoprotein A-I (apo A-I) is the major protein component of high-density lipoprotein (HDL) particles. Elevated levels of HDL in the bloodstream have been shown to correlate strongly with a reduced risk of atherosclerosis. Molecular dynamics simulations have been carried out on three separate model discoidal HDL particles containing two monomers of apo A-I and 160 molecules of palmitoyloleoylphosphatidylcholine (POPC), to a time-scale of 1 ns. The starting structures were based on previously published molecular belt models of HDL consisting of the lipid-binding C-terminal domain (residues 44-243) wrapped around the circumference of a discoidal HDL particle. Subtle changes between two of the starting structures resulted in significantly different behavior during the course of the simulation. The results provide support for the hypothesis of Segrest et al. that helical registration in the molecular belt model of apo A-I is modulated by intermolecular salt bridges. In addition, we propose an explanation for the presence of proline punctuation in the molecular belt model, and for the presence of two 11-mer helical repeats interrupting the otherwise regular pattern of 22-mer helical repeats in the lipid-binding domain of apo A-I.
DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding.
Ma, Wenxiu; Yang, Lin; Rohs, Remo; Noble, William Stafford
2017-10-01
Transcription factors (TFs) bind to specific DNA sequence motifs. Several lines of evidence suggest that TF-DNA binding is mediated in part by properties of the local DNA shape: the width of the minor groove, the relative orientations of adjacent base pairs, etc. Several methods have been developed to jointly account for DNA sequence and shape properties in predicting TF binding affinity. However, a limitation of these methods is that they typically require a training set of aligned TF binding sites. We describe a sequence + shape kernel that leverages DNA sequence and shape information to better understand protein-DNA binding preference and affinity. This kernel extends an existing class of k-mer based sequence kernels, based on the recently described di-mismatch kernel. Using three in vitro benchmark datasets, derived from universal protein binding microarrays (uPBMs), genomic context PBMs (gcPBMs) and SELEX-seq data, we demonstrate that incorporating DNA shape information improves our ability to predict protein-DNA binding affinity. In particular, we observe that (i) the k-spectrum + shape model performs better than the classical k-spectrum kernel, particularly for small k values; (ii) the di-mismatch kernel performs better than the k-mer kernel, for larger k; and (iii) the di-mismatch + shape kernel performs better than the di-mismatch kernel for intermediate k values. The software is available at https://bitbucket.org/wenxiu/sequence-shape.git. Contact: rohs@usc.edu or william-noble@uw.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
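For readers unfamiliar with the baseline that this work extends, the classical k-spectrum kernel can be sketched in a few lines: each sequence is mapped to its vector of k-mer counts, and the kernel value is the inner product of the two vectors. The sketch below covers only that baseline. It does not implement the di-mismatch or shape-augmented kernels, and the function names are illustrative rather than taken from the published software.

```python
from collections import Counter

def spectrum(seq, k):
    """Map a sequence to its k-mer count vector (stored sparsely)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(x, y, k):
    """Classical k-spectrum kernel: inner product of k-mer count vectors."""
    sx, sy = spectrum(x, k), spectrum(y, k)
    return sum(count * sy[kmer] for kmer, count in sx.items())

# Toy example with k = 2.
# x has AC x2, CG, GT, TA; y has TA x2, AC, CG, GT.
value = spectrum_kernel("ACGTAC", "TACGTA", 2)  # 2*1 + 1 + 1 + 1*2
print(value)
```

Because the kernel depends only on k-mer counts, no alignment of binding sites is needed, which is the property the abstract highlights. A resulting Gram matrix could be passed to any kernel method that accepts precomputed kernels.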
Strain/species identification in metagenomes using genome-specific markers
Tu, Qichao; He, Zhili; Zhou, Jizhong
2014-01-01
Shotgun metagenome sequencing has become a fast, cheap and high-throughput technology for characterizing microbial communities in complex environments and human body sites. However, accurate identification of microorganisms at the strain/species level remains extremely challenging. We present a novel k-mer-based approach, termed GSMer, that identifies genome-specific markers (GSMs) from currently sequenced microbial genomes, which were then used for strain/species-level identification in metagenomes. Using 5390 sequenced microbial genomes, 8 770 321 50-mer strain-specific and 11 736 360 species-specific GSMs were identified for 4088 strains and 2005 species (4933 strains), respectively. The GSMs were first evaluated against mock community metagenomes, recently sequenced genomes and real metagenomes from different body sites, suggesting that the identified GSMs were specific to their targeting genomes. Sensitivity evaluation against synthetic metagenomes with different coverage suggested that 50 GSMs per strain were sufficient to identify most microbial strains with ≥0.25× coverage, and 10% of selected GSMs in a database should be detected for confident positive callings. Application of GSMs identified 45 and 74 microbial strains/species significantly associated with type 2 diabetes patients and obese/lean individuals from corresponding gastrointestinal tract metagenomes, respectively. Our result agreed with previous studies but provided strain-level information. The approach can be directly applied to identify microbial strains/species from raw metagenomes, without the effort of complex data pre-processing. PMID:24523352
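The core idea of genome-specific markers can be illustrated with set operations on k-mers: a marker is a k-mer found in one genome and in no other. The sketch below is a simplified illustration under stated assumptions, not the GSMer pipeline. It uses exact matching on one strand, toy sequences with k = 3 rather than 50, hypothetical function names, and a detection rule that borrows the 10% fraction mentioned in the abstract as a default threshold.

```python
def kmers(seq, k):
    """Set of all overlapping k-mers in a sequence (empty if too short)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def specific_markers(genomes, k):
    """For each genome, keep the k-mers that occur in no other genome."""
    per_genome = {name: kmers(seq, k) for name, seq in genomes.items()}
    markers = {}
    for name, own in per_genome.items():
        others = set().union(*(s for n, s in per_genome.items() if n != name))
        markers[name] = own - others
    return markers

def is_detected(markers, reads, k, min_fraction=0.10):
    """Call a genome present if enough of its markers appear in the reads."""
    if not markers:
        return False
    seen = set()
    for read in reads:
        seen |= kmers(read, k) & markers
    return len(seen) / len(markers) >= min_fraction

genomes = {"strain_A": "ACGTAC", "strain_B": "ACGGTT"}
markers = specific_markers(genomes, 3)
reads = ["CGTA"]
calls = {name: is_detected(m, reads, 3) for name, m in markers.items()}
print(calls)
```

In this toy case the shared 3-mer ACG is excluded from both marker sets, so the read CGTA supports only strain_A. At the scale described in the abstract (thousands of genomes, 50-mers) a set-based approach like this would need a disk-backed or hashed k-mer index instead of in-memory sets.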
Jankowsky, E; Strunk, G; Schwenzer, B
1997-01-01
Long RNA substrates are inefficiently cleaved by hammerhead ribozymes in trans. Oligonucleotide facilitators capable of affecting the ribozyme activity by interacting with the substrates at the termini of the ribozyme provide a possibility to improve ribozyme-mediated cleavage of long RNA substrates. We have examined the effect of PNA as facilitator in vitro in order to test whether even artificial compounds have facilitating potential. Effects of 12mer PNA- (peptide nucleic acid), RNA- and DNA-facilitators of identical sequence were measured with three substrates containing either 942, 452 or 39 nucleotides. The PNA facilitator enhances the ribozyme activity with both the 942mer and the 452mer substrates, to a slightly smaller extent than RNA and DNA facilitators. This effect was observed up to PNA facilitator:substrate ratios of 200:1. The enhancement becomes smaller as the PNA facilitator:substrate ratio exceeds 200:1. With the 39mer substrate, the PNA facilitator decreases the ribozyme activity by more than 100-fold, even at PNA facilitator:substrate ratios of 1:1. Although with long substrates the effect of the PNA facilitator is slightly smaller than the effect of identical RNA or DNA facilitators, PNA may be a more practical choice for potential applications in vivo because PNA is much more resistant to degradation by cellular enzymes. PMID:9207013
Usui, Daiki; Inaba, Satomi; Kamatari, Yuji O; Ishiguro, Naotaka; Oda, Masayuki
2017-09-02
The monoclonal antibody G2 specifically binds to the immunogen peptide derived from the chicken prion protein, Pep18mer, and to two peptides derived from chicken proteins, Pep8 and Pep395, which it binds with an affinity equal to that for Pep18mer. The amino acid sequences of the three peptides are completely different, and so the recognition mechanism of G2 is unique and interesting. We generated a single-chain Fv (scFv) antibody of G2, and demonstrated its correct folding with an antigen binding function similar to that of the intact G2 antibody. We also generated a Pro-containing mutant of G2 scFv at residue 95 of the light chain, and analyzed its antigen binding using a surface plasmon biosensor. The mutant lost its ability to bind Pep18mer, but retained its ability to bind Pep8 and Pep395. The results clearly indicate that residue 95 is critical for multispecific antigen binding of G2 at the site generated from the junctional diversity introduced at the joints between the V and J gene segments. Copyright © 2017 Elsevier Inc. All rights reserved.
Extreme-Scale De Novo Genome Assembly
DOE Office of Scientific and Technical Information (OSTI.GOV)
Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob
De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
Marini, Emanuela; Palmieri, Claudio; Magi, Gloria; Facinelli, Bruna
2015-07-09
Integrative conjugative elements (ICEs) are mobile genetic elements that reside in the chromosome but retain the ability to undergo excision and to transfer by conjugation. Genes involved in drug resistance, virulence, or niche adaptation are often found among backbone genes as cargo DNA. We recently characterized in Streptococcus suis an ICE (ICESsu32457) carrying resistance genes [tet(O/W/32/O), tet(40), erm(B), aphA, and aadE] in the 15K unstable genetic element, which is flanked by two ∼1.3-kb direct repeats. Remarkably, ∼1.3-kb sequences are conserved in ICESa2603 of Streptococcus agalactiae 2603V/R, which carries the heavy metal resistance genes cadC/cadA and mer. In matings between S. suis 32457 (donor) and S. agalactiae 2603V/R (recipient), transconjugants were obtained. PCR experiments, PFGE, and sequence analysis of transconjugants demonstrated a tandem array between ICESsu32457 and ICESa2603. Matings between tandem array-containing S. agalactiae 2603V/R (donor) and Streptococcus pyogenes RF12 (recipient) yielded a single transconjugant containing a hybrid ICE, here named ICESa2603/ICESsu32457. The hybrid formed by recombination of the left ∼1.3-kb sequence of ICESsu32457 and the ∼1.3-kb sequence of ICESa2603. Interestingly, the hybrid ICE was transferable between S. pyogenes strains, thus demonstrating that it behaves as a conventional ICE. These findings suggest that both tandem arrays and hybrid ICEs may contribute to the evolution of antibiotic resistance in streptococci, creating novel mobile elements capable of disseminating new combinations of antibiotic resistance genes. Copyright © 2015 Elsevier B.V. All rights reserved.
Loy, Alexander; Lehner, Angelika; Lee, Natuschka; Adamczyk, Justyna; Meier, Harald; Ernst, Jens; Schleifer, Karl-Heinz; Wagner, Michael
2002-01-01
For cultivation-independent detection of sulfate-reducing prokaryotes (SRPs) an oligonucleotide microarray consisting of 132 16S rRNA gene-targeted oligonucleotide probes (18-mers) having hierarchical and parallel (identical) specificity for the detection of all known lineages of sulfate-reducing prokaryotes (SRP-PhyloChip) was designed and subsequently evaluated with 41 suitable pure cultures of SRPs. The applicability of SRP-PhyloChip for diversity screening of SRPs in environmental and clinical samples was tested by using samples from periodontal tooth pockets and from the chemocline of a hypersaline cyanobacterial mat from Solar Lake (Sinai, Egypt). Consistent with previous studies, SRP-PhyloChip indicated the occurrence of Desulfomicrobium spp. in the tooth pockets and the presence of Desulfonema- and Desulfomonile-like SRPs (together with other SRPs) in the chemocline of the mat. The SRP-PhyloChip results were confirmed by several DNA microarray-independent techniques, including specific PCR amplification, cloning, and sequencing of SRP 16S rRNA genes and the genes encoding the dissimilatory (bi)sulfite reductase (dsrAB). PMID:12324358
Detection of benzo[a]pyrene-guanine adducts in single-stranded DNA using the α-hemolysin nanopore
NASA Astrophysics Data System (ADS)
Perera, Rukshan T.; Fleming, Aaron M.; Johnson, Robert P.; Burrows, Cynthia J.; White, Henry S.
2015-02-01
The carcinogenic precursor benzo[a]pyrene (BP), a polycyclic aromatic hydrocarbon, is released into the environment through the incomplete combustion of hydrocarbons. Metabolism of BP in the human body yields a potent alkylating agent (benzo[a]pyrene diol epoxide, BPDE) that reacts with guanine (G) in DNA to form an adduct implicated in cancer initiation. We report that the α-hemolysin (αHL) nanopore platform can be used to detect a BPDE adduct to G in synthetic oligodeoxynucleotides. Translocation of a 41-mer poly-2′-deoxycytidine strand with a centrally located BPDE adduct to G through αHL in 1 M KCl produces a unique multi-level current signature allowing the adduct to be detected. This readily distinguishable current modulation was observed when the BPDE-adducted DNA strand translocated from either the 5′ or 3′ directions. This study suggests that BPDE adducts and other large aromatic biomarkers can be detected with αHL, presenting opportunities for the monitoring, quantification, and sequencing of mutagenic compounds from cellular DNA samples.
De Bellis, Fabien; Malapa, Roger; Kagy, Valérie; Lebegin, Stéphane; Billot, Claire; Labouisse, Jean-Pierre
2016-08-01
Using next-generation sequencing technology, new microsatellite loci were characterized in Artocarpus altilis (Moraceae) and two congeners to increase the number of available markers for genotyping breadfruit cultivars. A total of 47,607 simple sequence repeat loci were obtained by sequencing a library of breadfruit genomic DNA with an Illumina MiSeq system. Among them, 50 single-locus markers were selected and assessed using 41 samples (39 A. altilis, one A. camansi, and one A. heterophyllus). All loci were polymorphic in A. altilis, 44 in A. camansi, and 21 in A. heterophyllus. The number of alleles per locus ranged from two to 19. The new markers will be useful for assessing the identity and genetic diversity of breadfruit cultivars on a small geographical scale, gaining a better understanding of farmer management practices, and will help to optimize breadfruit genebank management.
kpLogo: positional k-mer analysis reveals hidden specificity in biological sequences
2017-01-01
Motifs of only 1–4 letters can play important roles when present at key locations within macromolecules. Because existing motif-discovery tools typically miss these position-specific short motifs, we developed kpLogo, a probability-based logo tool for integrated detection and visualization of position-specific ultra-short motifs from a set of aligned sequences. kpLogo also overcomes the limitations of conventional motif-visualization tools in handling positional interdependencies and utilizing ranked or weighted sequences increasingly available from high-throughput assays. kpLogo can be found at http://kplogo.wi.mit.edu/. PMID:28460012
Titus, James K; Kay, Matthew K; Glaser, CDR Jacob J
2017-01-01
Snakebite envenomation is an important global health concern. The current standard treatment approach for snakebite envenomation relies on antibody-based antisera, which are expensive, not universally available, and can lead to adverse physiological effects. Phage display techniques offer a powerful tool for the selection of phage-expressed peptides, which can bind with high specificity and affinity towards venom components. In this research, the amino acid sequences of Phospholipase A2 (PLA2) from multiple cottonmouth species were analyzed, and a consensus peptide synthesized. Three phage display libraries were panned against this consensus peptide, crosslinked to capillary tubes, followed by a modified surface panning procedure. This high throughput selection method identified four phage clones with anti-PLA2 activity against Western cottonmouth venom, and the amino acid sequences of the displayed peptides were identified. This is the first report identifying short peptide sequences capable of inhibiting PLA2 activity of Western cottonmouth venom in vitro, using a phage display technique. Additionally, this report utilizes synthetic panning targets, designed using venom proteomic data, to mimic epitope regions. M13 phages displaying circular 7-mer or linear 12-mer peptides with antivenom activity may offer a novel alternative to traditional antibody-based therapy. PMID:29285351
Molecular dynamics study of some non-hydrogen-bonding base pair DNA strands
NASA Astrophysics Data System (ADS)
Tiwari, Rakesh K.; Ojha, Rajendra P.; Tiwari, Gargi; Pandey, Vishnudatt; Mall, Vijaysree
2018-05-01
In order to elucidate the structural activity of hydrophobic modified DNA, the DMMO2-D5SICS base pair was introduced as a constituent in different sets of 12-mer and 14-mer DNA sequences for molecular dynamics (MD) simulation in explicit water solvent. The AMBER 14 force field was employed for each set of duplexes during the 200 ns production-dynamics simulation in an orthogonal water box, using the Particle-Mesh-Ewald (PME) method under infinite periodic boundary conditions (PBC), to determine conformational parameters of the complex. The force-field parameters of the modified base pair were calculated with the Gaussian code using the Hartree-Fock ab initio methodology. RMSD results reveal that the conformation of the duplex is sequence dependent and that the binding energy of the complex depends on the position of the modified base pair in the nucleic acid strand. We found that non-bonding energy made a significant contribution to stabilising this type of duplex in comparison to electrostatic energy. The distortion produced within strands by this type of base pair was local and destabilised the duplex integrity near the substitution. Moreover, the binding energy of the duplex depends on the position of substitution of the hydrophobic base pair and on the DNA sequence, a finding that strongly supports the corresponding experimental study.
DNA microdevice for electrochemical detection of Escherichia coli 0157:H7 molecular markers.
Berganza, J; Olabarria, G; García, R; Verdoy, D; Rebollo, A; Arana, S
2007-04-15
An electrochemical DNA sensor based on the hybridization recognition of a single-stranded DNA (ssDNA) probe immobilized onto a gold electrode to its complementary ssDNA is presented. The DNA probe is bound to the gold electrode surface by using self-assembled monolayer (SAM) technology. An optimized mixed SAM with a blocking molecule preventing nonspecific adsorption on the electrode surface has been prepared. In this paper, a DNA biosensor is designed by immobilizing a single-stranded DNA probe on an electrochemical transducer surface to specifically recognize the Escherichia coli (E. coli) 0157:H7 complementary target DNA sequence via cyclic voltammetry experiments. The 21-mer DNA probe, including a C6 alkanethiol group at the 5' phosphate end, has been synthesized to form the SAM on the gold surface through the gold-sulfur bond. The goal of this paper has been to design, characterise and optimise an electrochemical DNA sensor. In order to investigate the oligonucleotide probe immobilization and the hybridization detection, experiments with different concentrations of DNA and mismatch sequences have been performed. This microdevice has demonstrated the suitability of oligonucleotide self-assembled monolayers (SAMs) on gold as an immobilization method. The DNA probes deposited on the gold surface have been functional and able to detect changes in the base sequence of a 21-mer oligonucleotide.
Synthesis and biological activity of artificial mRNA prepared with novel phosphorylating reagents
Nagata, Seigo; Hamasaki, Tomohiro; Uetake, Koichi; Masuda, Hirofumi; Takagaki, Kazuchika; Oka, Natsuhisa; Wada, Takeshi; Ohgi, Tadaaki; Yano, Junichi
2010-01-01
Though medicines that target mRNA are under active investigation, there has been little or no effort to develop mRNA itself as a medicine. Here, we report the synthesis of a 130-nt mRNA sequence encoding a 33-amino-acid peptide that includes the sequence of glucagon-like peptide-1, a peptide that stimulates glucose-dependent insulin secretion from the pancreas. The synthesis method used, which had previously been developed in our laboratory, was based on the use of 2-cyanoethoxymethyl as the 2′-hydroxy protecting group. We also developed novel, highly reactive phosphotriester pyrophosphorylating reagents to pyrophosphorylate the 5′-end of the 130-mer RNA in preparation for capping. We completed the synthesis of the artificial mRNA by the enzymatic addition of a 5′-cap and a 3′-poly(A) tail to the pyrophosphorylated 130-mer and showed that the resulting mRNA supported protein synthesis in a cell-free system and in whole cells. As far as we know, this is the first time that mRNA has been prepared from a chemically synthesized RNA sequence. As well as providing a research tool for the intracellular expression of peptides, the technology described here may be used for the production of mRNA for medical applications. PMID:20660478
Studies of G-quadruplexes formed within self-assembled DNA mini-circles.
Klejevskaja, Beata; Pyne, Alice L B; Reynolds, Matthew; Shivalingam, Arun; Thorogate, Richard; Hoogenboom, Bart W; Ying, Liming; Vilar, Ramon
2016-10-13
We have developed self-assembled DNA mini-circles that contain a G-quadruplex-forming sequence from the c-Myc oncogene promoter and demonstrate by FRET that the G-quadruplex unfolding kinetics are 10-fold slower than for the simpler 24-mer G-quadruplex that is commonly used for FRET experiments.
Investigation of Seminal Plasma Hypersensitivity Reactions (AIBS GWI 0046)
1999-10-01
Kit (Cat. No. 29304). The sequences for the 20-mer PCR primers for the urease gene of U. urealyticum (termed UU1 and UU2) and PCR methods were adapted...Ureaplasma urealyticum urease primer was unsuccessful. We therefore sent DNA samples of GW and control civilian couples to an outside laboratory to
Sensitive detection of unlabeled oligonucleotides using a paired surface plasma waves biosensor.
Li, Ying-Chang; Chiou, Chiuan-Chian; Luo, Ji-Dung; Chen, Wei-Ju; Su, Li-Chen; Chang, Ying-Feng; Chang, Yu-Sun; Lai, Chao-Sung; Lee, Cheng-Chung; Chou, Chien
2012-05-15
Detection of unlabeled oligonucleotides using surface plasmon resonance (SPR) is difficult because of the oligonucleotides' relatively lower molecular weight compared with proteins. In this paper, we describe a method for detecting unlabeled oligonucleotides at low concentration using a paired surface plasma waves biosensor (PSPWB). The biosensor uses a sensor chip with an immobilized probe to detect a target oligonucleotide via sequence-specific hybridization. PSPWB measures the demodulated amplitude of the heterodyne signal in real time. Meanwhile, taking the ratio of the amplitudes of the detected output signal and the reference reduces the excess noise from laser intensity fluctuation. Also, the common-path propagation of p and s waves cancels the common phase noise induced by temperature variation. Thus, a high signal-to-noise ratio (SNR) of the heterodyne signal is detected. The sequence specificity of oligonucleotide hybridization ensures that the platform discriminates precisely between target and non-target oligonucleotides. Under optimized experimental conditions, the detected heterodyne signal increases linearly with the logarithm of the concentration of target oligonucleotide over the range 0.5-500 pM. The detection limit is 0.5 pM in this experiment. In addition, the non-target oligonucleotide at concentrations of 10 pM and 10 nM generated signals only slightly higher than background, indicating the high selectivity and specificity of this method. Perfectly matched oligonucleotide targets of different lengths (10-mer, 15-mer and 20-mer) were identified at a concentration of 150 pM. Copyright © 2012 Elsevier B.V. All rights reserved.
Wang, Yanping; Wiatrowski, Heather A; John, Ria; Lin, Chu-Ching; Young, Lily Y; Kerkhof, Lee J; Yee, Nathan; Barkay, Tamar
2013-02-01
The contamination of groundwater with mercury (Hg) is an increasing problem worldwide. Yet, little is known about the interactions of Hg with microorganisms and their processes in subsurface environments. We tested the impact of Hg on denitrification in nitrate reducing enrichment cultures derived from subsurface sediments from the Oak Ridge Integrated Field Research Challenge site, where nitrate is a major contaminant and where bioremediation efforts are in progress. We observed an inverse relationship between Hg concentrations and onset and rates of denitrification in nitrate enrichment cultures containing between 53 and 1.1 μM of inorganic Hg; higher Hg concentrations increasingly extended the time to onset of denitrification and inhibited denitrification rates. Microbial community complexity, as indicated by terminal restriction fragment length polymorphism (tRFLP) analysis of the 16S rRNA genes, declined with increasing Hg concentrations; at the 312 nM Hg treatment, a single tRFLP peak was detected representing a culture of Bradyrhizobium sp. that possessed the merA gene indicating a potential for Hg reduction. A culture identified as Bradyrhizobium sp. strain FRC01 with an identical 16S rRNA sequence to that of the enriched peak in the tRFLP patterns, reduced Hg(II) to Hg(0) and carried merA whose amino acid sequence has 97 % identity to merA from the Proteobacteria and Firmicutes. This study demonstrates that in subsurface sediment incubations, Hg may inhibit denitrification and that inhibition may be alleviated when Hg resistant denitrifying Bradyrhizobium spp. detoxify Hg by its reduction to the volatile elemental form.
Gao, Xinxin; Yo, Peggy; Keith, Andrew; Ragan, Timothy J.; Harris, Thomas K.
2003-01-01
A novel thermodynamically-balanced inside-out (TBIO) method of primer design was developed and compared with a thermodynamically-balanced conventional (TBC) method of primer design for PCR-based gene synthesis of codon-optimized gene sequences for the human protein kinase B-2 (PKB2; 1494 bp), p70 ribosomal S6 subunit protein kinase-1 (S6K1; 1622 bp) and phosphoinositide-dependent protein kinase-1 (PDK1; 1712 bp). Each of the 60mer TBIO primers coded for identical nucleotide regions that the 60mer TBC primers covered, except that half of the TBIO primers were reverse complement sequences. In addition, the TBIO and TBC primers contained identical regions of temperature- optimized primer overlaps. The TBC method was optimized to generate sequential overlapping fragments (∼0.4–0.5 kb) for each of the gene sequences, and simultaneous and sequential combinations of overlapping fragments were tested for their ability to be assembled under an array of PCR conditions. However, no fully synthesized gene sequences could be obtained by this approach. In contrast, the TBIO method generated an initial central fragment (∼0.4–0.5 kb), which could be gel purified and used for further inside-out bidirectional elongation by additional increments of 0.4–0.5 kb. By using the newly developed TBIO method of PCR-based gene synthesis, error-free synthetic genes for the human protein kinases PKB2, S6K1 and PDK1 were obtained with little or no corrective mutagenesis. PMID:14602936
Taniguchi, Suguru; Watanabe, Noriko; Nose, Takeru; Maeda, Iori
2016-01-01
Tropoelastin is the primary component of elastin, which forms the elastic fibers that make up connective tissues. The hydrophobic domains of tropoelastin are thought to mediate the self-assembly of elastin into fibers, and the temperature-mediated self-assembly (coacervation) of one such repetitive peptide sequence (VPGVG) has been utilized in various bio-applications. To elucidate a mechanism for coacervation activity enhancement and to develop more potent coacervatable elastin-derived peptides, we synthesized two series of peptide analogs containing an aromatic amino acid, Trp or Tyr, in addition to Phe-containing analogs and tested their functional characteristics. Thus, position 1 of the hydrophobic pentapeptide repeat of elastin (X(1)P(2)G(3)V(4)G(5)) was substituted by Trp or Tyr. Eventually, we acquired a novel, short Trp-containing elastin-derived peptide analog (WPGVG)3 with potent coacervation ability. From the results obtained during this process, we determined the importance of aromaticity and hydrophobicity for the coacervation potency of elastin-derived peptide analogs. Generally, however, the production of long-chain synthetic polypeptides in quantities sufficient for commercial use remains cost-prohibitive. Therefore, the identification of (WPGVG)3, which is a short 15-mer peptide consisting simply of five natural amino acids and shows temperature-dependent self-assembly activity, might serve as a foundation for the development of various kinds of biomaterials. Copyright © 2015 European Peptide Society and John Wiley & Sons, Ltd.
Sequence repeats and protein structure
NASA Astrophysics Data System (ADS)
Hoang, Trinh X.; Trovato, Antonio; Seno, Flavio; Banavar, Jayanth R.; Maritan, Amos
2012-11-01
Repeats are frequently found in known protein sequences. The level of sequence conservation in tandem repeats correlates with their propensities to be intrinsically disordered. We employ a coarse-grained model of a protein with a two-letter amino acid alphabet, hydrophobic (H) and polar (P), to examine the sequence-structure relationship in the realm of repeated sequences. A fraction of repeated sequences comprises a distinct class of bad folders, whose folding temperatures are much lower than those of random sequences. Imperfection in sequence repetition improves the folding properties of the bad folders while deteriorating those of the good folders. Our results may explain why nature has utilized repeated sequences for their versatility and especially to design functional proteins that are intrinsically unstructured at physiological temperatures.
Benchmarking of Methods for Genomic Taxonomy
Larsen, Mette V.; Cosentino, Salvatore; Lukjancenko, Oksana; ...
2014-02-26
One of the first issues that emerges when a prokaryotic organism of interest is encountered is the question of what it is (that is, which species it is). The 16S rRNA gene formed the basis of the first method for sequence-based taxonomy and has had a tremendous impact on the field of microbiology. Nevertheless, the method has been found to have a number of shortcomings. In this paper, we trained and benchmarked five methods for whole-genome sequence-based prokaryotic species identification on a common data set of complete genomes: (i) SpeciesFinder, which is based on the complete 16S rRNA gene; (ii) Reads2Type, which searches for species-specific 50-mers in either the 16S rRNA gene or the gyrB gene (for the Enterobacteriaceae family); (iii) the ribosomal multilocus sequence typing (rMLST) method, which samples up to 53 ribosomal genes; (iv) TaxonomyFinder, which is based on species-specific functional protein domain profiles; and finally (v) KmerFinder, which examines the number of cooccurring k-mers (substrings of k nucleotides in DNA sequence data). The performances of the methods were subsequently evaluated on three data sets of short sequence reads or draft genomes from public databases. In total, the evaluation sets constituted sequence data from more than 11,000 isolates covering 159 genera and 243 species. Our results indicate that methods that sample only chromosomal, core genes have difficulties in distinguishing closely related species which only recently diverged. Finally, the KmerFinder method had the overall highest accuracy and correctly identified from 93% to 97% of the isolates in the evaluation sets.
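The k-mer co-occurrence idea mentioned for KmerFinder can be illustrated with a toy sketch. The abstract does not describe KmerFinder's actual scoring or data structures, so the function names (`kmers`, `best_match`), the choice of k, and the "count shared distinct k-mers" score below are assumptions made purely for illustration.

```python
def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(query, references, k=4):
    """Score each reference by the number of distinct k-mers it shares with query.

    `references` maps a (hypothetical) species name to a sequence.
    Returns the best-scoring name and the full score table.
    """
    query_kmers = kmers(query, k)
    scores = {name: len(query_kmers & kmers(ref, k))
              for name, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    refs = {"species_A": "ACGTACGTAC", "species_B": "TTTTGGGGCC"}
    print(best_match("CGTACGT", refs))
```

Real genomes would call for a much larger k and a k-mer index rather than per-query set intersections; the sketch only shows the counting principle.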
Structural and Thermodynamic Signatures of DNA Recognition by Mycobacterium tuberculosis DnaA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsodikov, Oleg V.; Biswas, Tapan
An essential protein, DnaA, binds to 9-bp DNA sites within the origin of replication oriC. These binding events are prerequisite to forming an enigmatic nucleoprotein scaffold that initiates replication. The number, sequences, positions, and orientations of these short DNA sites, or DnaA boxes, within the oriCs of different bacteria vary considerably. To investigate features of DnaA boxes that are important for binding Mycobacterium tuberculosis DnaA (MtDnaA), we have determined the crystal structures of the DNA binding domain (DBD) of MtDnaA bound to a cognate MtDnaA-box (at 2.0 Å resolution) and to a consensus Escherichia coli DnaA-box (at 2.3 Å). These structures, complemented by calorimetric equilibrium binding studies of MtDnaA DBD in a series of DnaA-box variants, reveal the main determinants of DNA recognition and establish the [T/C][T/A][G/A]TCCACA sequence as a high-affinity MtDnaA-box. Bioinformatic and calorimetric analyses indicate that DnaA-box sequences in mycobacterial oriCs generally differ from the optimal binding sequence. This sequence variation occurs commonly at the first 2 bp, making an in vivo mycobacterial DnaA-box effectively a 7-mer and not a 9-mer. We demonstrate that the decrease in the affinity of these MtDnaA-box variants for MtDnaA DBD relative to that of the highest-affinity box TTGTCCACA is less than 10-fold. The understanding of DnaA-box recognition by MtDnaA and E. coli DnaA enables one to map DnaA-box sequences in the genomes of M. tuberculosis and other eubacteria.
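Mapping candidate DnaA boxes with the [T/C][T/A][G/A]TCCACA pattern can be sketched as a regular-expression scan of both strands. This is only an illustration of the pattern stated in the abstract, not the authors' mapping procedure; the helper names (`find_boxes`, `reverse_complement`) and the 0-based forward-strand coordinate convention are assumptions.

```python
import re

# High-affinity MtDnaA-box pattern from the abstract, written as a regex.
# The lookahead lets overlapping sites be reported.
DNAA_BOX = re.compile(r"(?=([TC][TA][GA]TCCACA))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BOX_LEN = 9


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_boxes(seq):
    """Return sorted (start, strand, site) hits; start is the 0-based
    forward-strand index of the leftmost base covered by the site."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in DNAA_BOX.finditer(seq)]
    rc = reverse_complement(seq)
    for m in DNAA_BOX.finditer(rc):
        # Convert a reverse-strand offset back to forward-strand coordinates.
        hits.append((len(seq) - m.start() - BOX_LEN, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    print(find_boxes("GGTTGTCCACAGG"))
```

Scanning only the last seven positions (TCCACA plus the preceding base) would be one way to reflect the abstract's remark that in vivo boxes behave effectively as 7-mers.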
Park, Mi-Ri; Kwon, Sun-Jung; Choi, Hong-Soo; Hemenway, Cynthia L; Kim, Kook-Hyung
2008-08-15
The repeated ACCA or AC-rich sequence and structural (SL1) elements in the 5' non-translated region (NTR) of the Potato virus X (PVX) RNA play vital roles in the PVX life cycle by controlling translation, RNA replication, movement, and assembly. It has already been shown that the repeated ACCA or AC-rich sequences affect both gRNA and sgRNA accumulation, while not affecting minus-strand RNA accumulation, and are also required for host protein binding. The functional significance of the repeated ACCA sequence elements in the 5' NTR was investigated by analyzing the effects of deletion and site-directed mutations on PVX replication in Nicotiana benthamiana plants and NT1 protoplasts. Substitution (ACCA into AAAA or UUUU) mutations introduced in the first (nt 10-13) element in the 5' NTR of the PVX RNA significantly affected viral replication, while mutations introduced in the second (nt 17-20) and third (nt 20-23) elements did not. Mutations in the fourth (nt 29-32) ACCA element weakly affected virus replication, whereas mutations in the fifth (nt 38-41) significantly reduced virus replication because the AAAA and/or UUUU substitutions disrupted the structure of SL1. Further characterization of the first ACCA element indicated that duplication of ACCA at nt 10-13 (nt 10-17, ACCAACCA) caused severe symptom development as compared to that of wild type, while deletion of the single element (nt 10-13, ΔACCA) or tripling of this element caused reduced symptom development. Single- and double-nucleotide substitutions introduced into the first ACCA element revealed the importance of CC located at nt positions 11 and 12. Altogether, these results indicate that the first ACCA element is important for PVX replication.
From Prime to Extended Mission: Evolution of the MER Tactical Uplink Process
NASA Technical Reports Server (NTRS)
Mishkin, Andrew H.; Laubach, Sharon
2006-01-01
To support a 90-day surface mission for two robotic rovers, the Mars Exploration Rover mission designed and implemented an intensive tactical operations process, enabling daily commanding of each rover. Using a combination of new processes, custom software tools, a Mars-time staffing schedule, and seven-day-a-week operations, the MER team was able to compress the traditional weeks-long command-turnaround for a deep space robotic mission to about 18 hours. However, the pace of this process was never intended to be continued indefinitely. Even before the end of the three-month prime mission, MER operations began evolving towards greater sustainability. A combination of continued software tool development, increasing team experience, and availability of reusable sequences first reduced the mean process duration to approximately 11 hours. The number of workshifts required to perform the process dropped, and the team returned to a modified 'Earth-time' schedule. Additional process and tool adaptation eventually provided the option of planning multiple Martian days of activity within a single workshift, making 5-day-a-week operations possible. The vast majority of the science team returned to their home institutions, continuing to participate fully in the tactical operations process remotely. MER has continued to operate for over two Earth-years as many of its key personnel have moved on to other projects, the operations team and budget have shrunk, and the rovers have begun to exhibit symptoms of aging.
Overlapping ETS and CRE Motifs (G/CCGGAAGTGACGTCA) Preferentially Bound by GABPα and CREB Proteins
Chatterjee, Raghunath; Zhao, Jianfei; He, Ximiao; Shlyakhtenko, Andrey; Mann, Ishminder; Waterfall, Joshua J.; Meltzer, Paul; Sathyanarayana, B. K.; FitzGerald, Peter C.; Vinson, Charles
2012-01-01
Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X4-N1-30-X4) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif (C/GCCGGAAGCGGAA) and the ETS⇔CRE motif (C/GCGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif. PMID:23050235
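The split 8-mer notation X4-N1-30-X4 (two 4-mers separated by a gap of 1 to 30 bp) can be made concrete with a small enumeration sketch. The abstract does not give the authors' counting code or statistics, so the function name `split_8mers` and the plain occurrence count below are illustrative assumptions only.

```python
from collections import Counter


def split_8mers(seq, min_gap=1, max_gap=30):
    """Count every (left 4-mer, gap length, right 4-mer) triple in seq.

    A triple is counted once per position at which the left 4-mer starts,
    for each gap length that still leaves room for the right 4-mer.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 3):
        left = seq[i:i + 4]
        for gap in range(min_gap, max_gap + 1):
            j = i + 4 + gap          # start of the right 4-mer
            if j + 4 > len(seq):     # no room left; longer gaps cannot fit either
                break
            counts[(left, gap, seq[j:j + 4])] += 1
    return counts


if __name__ == "__main__":
    for key, n in sorted(split_8mers("GGAAGTGACG").items()):
        print(key, n)
```

Comparing such counts between promoter and background sequence sets would be needed to find pairs that localize at a precise spacing; that comparison is not shown here.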
Yuan, Can; Peng, Fang; Yang, Ze-Mao; Zhong, Wen-Juan; Mou, Fang-Sheng; Gong, Yi-Yun; Ji, Pei-Cheng; Pu, De-Qiang; Huang, Hai-Yan; Yang, Xiao; Zhang, Chao
2017-09-01
Ligusticum chuanxiong is a well-known traditional Chinese medicine plant. The development of molecular markers and the study of germplasm resources for this species are very important. In this study, we obtained 24 422 unigenes by assembling transcriptome sequencing reads of L. chuanxiong root. EST-SSRs were detected and 4 073 SSR loci were identified. EST-SSR distribution and characteristic analysis showed that mono-nucleotide repeats were the main repeat type, accounting for 41.0%. In addition, the sequences containing SSRs were functionally annotated using Gene Ontology (GO) and KEGG pathways and were assigned to 49 GO categories and 242 KEGG pathways; among them, 2 201 sequences were annotated against the Nr database. By validating 235 EST-SSRs, 74 primer pairs were ultimately shown to give high-quality amplification. Subsequently, genetic diversity analysis, UPGMA cluster analysis, PCoA analysis and population structure analysis of 34 L. chuanxiong germplasm resources were carried out with the 74 primer pairs. In both the UPGMA tree and the PCoA results, L. chuanxiong resources were clustered into two groups, which are believed to be partially related to their geographical distribution. In this study, EST-SSRs in L. chuanxiong were identified for the first time, and the newly developed molecular markers should contribute significantly to further genetic diversity studies, purity detection, gene mapping, and molecular breeding. Copyright© by the Chinese Pharmaceutical Association.
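Detecting SSR (microsatellite) loci in assembled unigenes amounts to finding short motifs repeated in tandem. The abstract does not name the tool or the repeat thresholds used, so the minimum repeat counts in `THRESHOLDS` and the function name `find_ssrs` below are assumptions chosen only to show the idea.

```python
import re

# motif length -> minimum number of tandem repeats (illustrative values only)
THRESHOLDS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) for mono-, di- and
    tri-nucleotide tandem repeats that meet THRESHOLDS."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in THRESHOLDS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip homopolymers reported as di- or tri-nucleotide motifs.
            if unit_len > 1 and len(set(unit)) == 1:
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "GC" + "A" * 12 + "GT" + "AG" * 6 + "C"
    print(find_ssrs(example))
```

Tallying the motif lengths over all hits would give the kind of repeat-type breakdown the abstract reports (for example, the share of mono-nucleotide repeats).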
RNA-Skim: a rapid method for RNA-Seq quantification at transcript level
Zhang, Zhaojun; Wang, Wei
2014-01-01
Motivation: RNA-Seq technique has been demonstrated as a revolutionary means for exploring transcriptome because it provides deep coverage and base pair-level resolution. RNA-Seq quantification is proven to be an efficient alternative to Microarray technique in gene expression study, and it is a critical component in RNA-Seq differential expression analysis. Most existing RNA-Seq quantification tools require the alignments of fragments to either a genome or a transcriptome, entailing a time-consuming and intricate alignment step. To improve the performance of RNA-Seq quantification, an alignment-free method, Sailfish, has been recently proposed to quantify transcript abundances using all k-mers in the transcriptome, demonstrating the feasibility of designing an efficient alignment-free method for transcriptome quantification. Even though Sailfish is substantially faster than alternative alignment-dependent methods such as Cufflinks, using all k-mers in the transcriptome quantification impedes the scalability of the method. Results: We propose a novel RNA-Seq quantification method, RNA-Skim, which partitions the transcriptome into disjoint transcript clusters based on sequence similarity, and introduces the notion of sig-mers, which are a special type of k-mers uniquely associated with each cluster. We demonstrate that the sig-mer counts within a cluster are sufficient for estimating transcript abundances with accuracy comparable with any state-of-the-art method. This enables RNA-Skim to perform transcript quantification on each cluster independently, reducing a complex optimization problem into smaller optimization tasks that can be run in parallel. As a result, RNA-Skim uses <4% of the k-mers and <10% of the CPU time required by Sailfish. 
It is able to finish transcriptome quantification in <10 min per sample by using just a single thread on a commodity computer, which represents >100× speedup over the state-of-the-art alignment-based methods, while delivering comparable or higher accuracy. Availability and implementation: The software is available at http://www.csbio.unc.edu/rs. Contact: weiwang@cs.ucla.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24931995
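The notion of sig-mers, k-mers uniquely associated with one transcript cluster, can be sketched as a set computation. RNA-Skim's actual clustering, sig-mer selection, and indexing are not detailed in the abstract, so the input layout (a dict of cluster name to transcript sequences) and the function names below are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sig_mers(clusters, k):
    """Return, per cluster, the k-mers that occur in that cluster only.

    `clusters` maps a cluster name to a list of transcript sequences.
    """
    owners = defaultdict(set)            # k-mer -> clusters containing it
    for name, transcripts in clusters.items():
        for transcript in transcripts:
            for kmer in kmers(transcript.upper(), k):
                owners[kmer].add(name)
    result = {name: set() for name in clusters}
    for kmer, names in owners.items():
        if len(names) == 1:              # unique to exactly one cluster
            result[next(iter(names))].add(kmer)
    return result


if __name__ == "__main__":
    print(sig_mers({"c1": ["ACGTA"], "c2": ["CGTAC"]}, k=3))
```

Because each sig-mer points to a single cluster, read counts for these k-mers can be handled per cluster, which is the property the abstract credits for the independent, parallel quantification.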
Embedded CMOS basecalling for nanopore DNA sequencing.
Chengjie Wang; Junli Zheng; Magierowski, Sebastian; Ghafar-Zadeh, Ebrahim
2016-08-01
DNA sequencing based on nanopore sensors is now entering the marketplace. The ability to interface this technology to established CMOS microelectronics promises significant improvements in functionality and miniaturization. Among the key functions to benefit from this interface will be basecalling, the conversion of raw electronic molecular signatures to nucleotide sequence predictions. This paper presents the design and performance potential of custom CMOS basecallers embedded alongside nanopore sensors. A basecalling architecture implemented in 32-nm technology is discussed with the ability to process the equivalent of 20 human genomes per day in real time at a power density of 5 W/cm2, assuming a 3-mer nanopore sensor.
pH-independent triple-helix formation with 6-oxocytidine as cytidine analogue.
Parsch, U; Engels, J W
2000-07-03
The syntheses of six different phosphoramidite building blocks of 6-oxocytosine and 5-allyl-6-oxocytosine as analogues of N(3)-protonated cytosine are described. These compounds have been incorporated into oligonucleotides by standard solid-phase synthesis. Hybridization of 15-mer Hoogsteen strands with target 21-mer duplexes was investigated. Comparison of the triplex-forming abilities of the different building blocks revealed that: (i) 5-allyl substitution has a negative influence on triplex stability; (ii) a uniform backbone of the Hoogsteen strand stabilizes triplexes relative to mixed backbones; (iii) RNA strands with 6-oxocytidine or 5-allyl-6-oxocytidine do not form a triple helix with the DNA target duplex, probably due to backbone torsional constraints; and (iv) a 15-mer DNA sequence with three isolated 2'-deoxy-6-oxocytidines has the highest T(m) of all cytidine analogues investigated in this study. CD experiments provided further evidence for the presence or absence of triplex structures. In the course of these temperature-dependent CD measurements we were able to detect duplex and triplex melting independently of each other at selected wavelengths. This methodology is especially interesting in cases where UV melting curves show only one transition owing to spectral overlap.
Ma, Menglin; Li, Jihong
2015-01-01
ABSTRACT The accessory growth regulator (Agr)-like quorum sensing (QS) system of Clostridium perfringens controls the production of many toxins, including beta toxin (CPB). We previously showed (J. E. Vidal, M. Ma, J. Saputo, J. Garcia, F. A. Uzal, and B. A. McClane, Mol Microbiol 83:179–194, 2012, http://dx.doi.org/10.1111/j.1365-2958.2011.07925.x) that an 8-amino-acid, AgrD-derived peptide named 8-R upregulates CPB production by this QS system. The current study synthesized a series of small signaling peptides corresponding to sequences within the C. perfringens AgrD polypeptide to investigate the C. perfringens autoinducing peptide (AIP) structure-function relationship. When both linear and cyclic ring forms of these peptides were added to agrB null mutants of type B strain CN1795 or type C strain CN3685, the 5-amino-acid peptides, whether in a linear or ring (thiolactone or lactone) form, induced better signaling (more CPB production) than peptide 8-R for both C. perfringens strains. The 5-mer thiolactone ring peptide induced faster signaling than the 5-mer linear peptide. Strain-related variations in sensing these peptides were detected, with CN3685 sensing the synthetic peptides more strongly than CN1795. Consistent with those synthetic peptide results, Transwell coculture experiments showed that CN3685 exquisitely senses native AIP signals from other isolates (types A, B, C, and D), while CN1795 barely senses even its own AIP. Finally, a C. perfringens AgrD sequence-based peptide with a 6-amino-acid thiolactone ring interfered with CPB production by several C. perfringens strains, suggesting potential therapeutic applications. These results indicate that AIP signaling sensitivity and responsiveness vary among C. perfringens strains and suggest C. perfringens prefers a 5-mer AIP to initiate Agr signaling. IMPORTANCE Clostridium perfringens possesses an Agr-like quorum sensing (QS) system that regulates virulence, sporulation, and toxin production. 
The current study used synthetic peptides to identify the structure-function relationship for the signaling peptide that activates this QS system. We found that a 5-mer peptide induces optimal signaling. Unlike other Agr systems, a linear version of this peptide (in addition to thiolactone and lactone versions) could induce signaling. Two C. perfringens strains were found to vary in sensitivity to these peptides. We also found that a 6-mer peptide can inhibit toxin production by some strains, suggesting therapeutic applications. PMID:25777675
Bardella, Vanessa Bellini; Pita, Sebastián; Vanzela, André Luis Laforga; Galvão, Cleber; Panzera, Francisco
2016-10-01
The subfamily Triatominae (Hemiptera, Reduviidae) includes 150 species of blood-sucking insects, vectors of Chagas disease or American trypanosomiasis. Karyotypic information reveals a striking stability in the number of autosomes. However, this group shows substantial variability in genome size, the amount and distribution of C-heterochromatin, and the chromosome positions of 45S rDNA clusters. Here, we analysed the karyotypes of 41 species from six different genera with C-fluorescence banding in order to evaluate the base-pair richness of heterochromatic regions. Our results show a high heterogeneity in the fluorescent staining of the heterochromatin in both autosomes and sex chromosomes, never reported before within an insect subfamily with holocentric chromosomes. This technique allows a clear discrimination of the heterochromatic regions classified as similar by C-banding, constituting a new chromosome marker with taxonomic and evolutionary significance. The diverse fluorescent patterns are likely due to the amplification of different repeated sequences, reflecting an unusual dynamic rearrangement in the genomes of this subfamily. Further, we discuss the evolution of these repeated sequences in both autosomes and sex chromosomes in species of Triatominae.
Phytoremediation of Ionic and Methyl Mercury P
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meagher, Richard B.
1999-06-01
Our long-term goal is to enable highly productive plant species to extract, resist, detoxify, and/or sequester toxic heavy metal pollutants as an environmentally friendly alternative to physical remediation methods. We have focused this phytoremediation research on soil and water-borne ionic and methylmercury. Mercury pollution is a serious world-wide problem affecting the health of human and wild-life populations. Methylmercury, produced by native bacteria at mercury-contaminated wetland sites, is a particularly serious problem due to its extreme toxicity and efficient biomagnification in the food chain. We engineered several plant species (e.g., Arabidopsis, tobacco, canola, yellow poplar, rice) to express the bacterial genes, merB and/or merA, under the control of plant regulatory sequences. These transgenic plants acquired remarkable properties for mercury remediation. (1) Transgenic plants expressing merB (organomercury lyase) extract methylmercury from their growth substrate and degrade it to less toxic ionic mercury. They grow on concentrations of methylmercury that kill normal plants and accumulate low levels of ionic mercury. (2) Transgenic plants expressing merA (mercuric ion reductase) extract and electrochemically reduce toxic, reactive ionic mercury to much less toxic and volatile metallic mercury. This metal transformation is driven by the powerful photosynthetic reducing capacity of higher plants that generates excess NADPH using solar energy. MerA plants grow vigorously on levels of ionic mercury that kill control plants. Plants expressing both merB and merA degrade high levels of methylmercury and volatilize metallic mercury. These properties were shown to be genetically stable for several generations in the two plant species examined. Our work demonstrates that native trees, shrubs, and grasses can be engineered to remediate the most abundant toxic mercury pollutants.
Building on these data, our working hypothesis for the next grant period is that transgenic plants expressing the bacterial merB and merA genes will (a) remove mercury from polluted soil and water and (b) prevent methylmercury from entering the food chain. Our specific aims center on understanding the mechanisms by which plants process the various forms of mercury and volatilize or transpire mercury vapor. This information will allow us to improve the design of our current phytoremediation strategies. As an alternative to volatilizing mercury, we are using several new genes to construct plants that will hyperaccumulate mercury in above-ground tissues for later harvest. The Department of Energy's Oak Ridge National Laboratory and Brookhaven National Laboratory have sites with significant levels of mercury contamination that could be cleaned by applying the scientific discoveries and new phytoremediation technologies described in this proposal. The knowledge and expertise gained by engineering plants to hyperaccumulate mercury can be applied to the remediation of other heavy metal pollutants (e.g., arsenic, cesium, cadmium, chromium, lead, strontium, technetium, uranium) found at several DOE facilities.
Huang, Yu-Feng; Midha, Mohit; Chen, Tzu-Han; Wang, Yu-Tai; Smith, David Glenn; Pei, Kurtis Jai-Chyi; Chiu, Kuo Ping
2015-01-01
The Taiwanese (Formosan) macaque (Macaca cyclopis) is the only nonhuman primate endemic to Taiwan. This primate species is valuable for evolutionary studies and as a subject in medical research. However, only partial fragments of the mitochondrial genome (mitogenome) of this species have been sequenced, let alone its nuclear genome. We employed next-generation sequencing to generate 2 x 90 bp paired-end reads, followed by reference-assisted de novo assembly with a multiple k-mer strategy, to characterize the M. cyclopis mitogenome. We compared the assembled mitogenome with those of other macaque species for phylogenetic analysis. Our results show that the M. cyclopis mitogenome consists of 16,563 nucleotides encoding 13 protein-coding genes, 2 ribosomal RNAs and 22 transfer RNAs. Phylogenetic analysis indicates that M. cyclopis is most closely related to M. mulatta lasiota (Chinese rhesus macaque), supporting the notion of an Asia-continental origin of M. cyclopis proposed in previous studies based on partial mitochondrial sequences. Our work presents a novel approach for assembling a mitogenome that utilizes the capabilities of de novo genome assembly with the assistance of a reference genome. The availability of the complete Taiwanese macaque mitogenome will facilitate the study of primate evolution and the characterization of genetic variations for the potential use of this species as a non-human primate model for medical research.
Phage display selection of peptides that target calcium-binding proteins.
Vetter, Stefan W
2013-01-01
Phage display allows rapid identification of peptide sequences with binding affinity towards target proteins, for example, calcium-binding proteins (CBPs). Phage technology allows screening of 10^9 or more independent peptide sequences and can identify CBP-binding peptides within 2 weeks. Adjusting the screening conditions allows selection of CBP-binding peptides that are either calcium-dependent or calcium-independent. The peptide sequences obtained can be used to identify CBP target proteins based on sequence homology or to quickly obtain peptide-based CBP inhibitors to modulate CBP-target interactions. The protocol described here uses a commercially available phage display library, in which random 12-mer peptides are displayed on filamentous M13 phages. The library was screened against the calcium-binding protein S100B.
Comparison of simple sequence repeats in 19 Archaea.
Trivedi, S
2006-12-05
All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
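The repeat search described in this abstract can be illustrated with a minimal Python sketch. It is not the SPUTNIK algorithm (which scores imperfect repeats); it only finds perfect tandem repeats of 2- to 5-nucleotide motifs with a regular expression. The function name `find_ssrs`, the `min_repeats=3` threshold, and the toy sequence are assumptions made for illustration.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=5, min_repeats=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Illustrative only: a simple regex scan, not the SPUTNIK scoring scheme.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-nt motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are themselves built from a shorter unit
            # (e.g. "CGCG" is really a CG repeat, "AAA" a homopolymer).
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)

# Hypothetical toy sequence: a (CG)4 dinucleotide and an (ACGTT)4 pentanucleotide repeat.
toy = "TTCGCGCGCGAA" + "ACGTT" * 4 + "G"
ssrs = find_ssrs(toy)
```

Running the scan separately on coding and non-coding segments and dividing the hit counts by segment length would give the kind of per-region SSR density the abstract compares.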
Kralovicova, Jana; Moreno, Pedro M D; Cross, Nicholas C P; Pêgo, Ana Paula; Vorechovsky, Igor
2016-12-01
ATM (ataxia-telangiectasia, mutated) is an important cancer susceptibility gene that encodes a key apical kinase in the DNA damage response pathway. ATM mutations in the germ line result in ataxia-telangiectasia (A-T), a rare genetic syndrome associated with hypersensitivity to double-strand DNA breaks and predisposition to lymphoid malignancies. ATM expression is limited by a tightly regulated nonsense-mediated RNA decay (NMD) switch exon (termed NSE) located in intron 28. In this study, we identify antisense oligonucleotides that modulate NSE inclusion in mature transcripts by systematically targeting the entire 3.1-kb-long intron. Their identification was assisted by a segmental deletion analysis of transposed elements, revealing NSE repression upon removal of a distant antisense Alu and NSE activation upon elimination of a long terminal repeat transposon MER51A. Efficient NSE repression was achieved by delivering optimized splice-switching oligonucleotides to embryonic and lymphoblastoid cells using chitosan-based nanoparticles. Together, these results provide a basis for possible sequence-specific radiosensitization of cancer cells, highlight the power of intronic antisense oligonucleotides to modify gene expression, and demonstrate transposon-mediated regulation of NSEs.
X-ray characterization of mesophases of human telomeric G-quadruplexes and other DNA analogues
Yasar, Selcuk; Schimelman, Jacob B.; Aksoyoglu, M. Alphan; ...
2016-06-02
Observed in the folds of guanine-rich oligonucleotides, non-canonical G-quadruplex structures are based on G-quartets formed by hydrogen bonding and cation coordination of guanosines. In dilute 5'-guanosine monophosphate (GMP) solutions, G-quartets form by the self-assembly of four GMP nucleotides. We use x-ray diffraction to characterize the columnar liquid-crystalline mesophases in concentrated solutions of various model G-quadruplexes. We then probe the transitions between mesophases by varying the PEG solution osmotic pressure, thus mimicking in vivo molecular crowding conditions. Using the GMP-quadruplex, built by the stacking of G-quartets with no covalent linking between them, as the baseline, we report the liquid-crystalline phase behaviors of two other related G-quadruplexes: (i) the intramolecular parallel-stranded G-quadruplex formed by the 22-mer four-repeat human telomeric sequence AG3(TTAG3)3 and (ii) the intermolecular parallel-stranded G-quadruplex formed by the TG4T oligonucleotides. Finally, we compare the mesophases of the G-quadruplexes, under PEG-induced crowding conditions, with the corresponding mesophases of the canonical duplex and triplex DNA analogues.
X-ray characterization of mesophases of human telomeric G-quadruplexes and other DNA analogues
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yasar, Selcuk; Schimelman, Jacob B.; Aksoyoglu, M. Alphan
Observed in the folds of guanine-rich oligonucleotides, non-canonical G-quadruplex structures are based on G-quartets formed by hydrogen bonding and cation coordination of guanosines. In dilute 5'-guanosine monophosphate (GMP) solutions, G-quartets form by the self-assembly of four GMP nucleotides. We use x-ray diffraction to characterize the columnar liquid-crystalline mesophases in concentrated solutions of various model G-quadruplexes. We then probe the transitions between mesophases by varying the PEG solution osmotic pressure, thus mimicking in vivo molecular crowding conditions. Using the GMP-quadruplex, built by the stacking of G-quartets with no covalent linking between them, as the baseline, we report the liquid-crystalline phase behaviors of two other related G-quadruplexes: (i) the intramolecular parallel-stranded G-quadruplex formed by the 22-mer four-repeat human telomeric sequence AG3(TTAG3)3 and (ii) the intermolecular parallel-stranded G-quadruplex formed by the TG4T oligonucleotides. Finally, we compare the mesophases of the G-quadruplexes, under PEG-induced crowding conditions, with the corresponding mesophases of the canonical duplex and triplex DNA analogues.
Woo, Patrick C. Y.; Lau, Susanna K. P.; Fan, Rachel Y. Y.; Lau, Candy C. Y.; Wong, Emily Y. M.; Joseph, Sunitha; Tsang, Alan K. L.; Wernery, Renate; Yip, Cyril C. Y.; Tsang, Chi-Ching; Wernery, Ulrich; Yuen, Kwok-Yung
2016-01-01
Recently, we reported the discovery of a dromedary camel coronavirus UAE-HKU23 (DcCoV UAE-HKU23) from dromedaries in the Middle East. In this study, DcCoV UAE-HKU23 was successfully isolated in two of the 14 dromedary fecal samples using HRT-18G cells, with cytopathic effects observed five days after inoculation. Northern blot analysis revealed at least seven distinct RNA species, corresponding to predicted subgenomic mRNAs and confirming the core sequence of transcription regulatory sequence motifs as 5′-UCUAAAC-3′ as we predicted previously. Antibodies against DcCoV UAE-HKU23 were detected in 58 (98.3%) and 59 (100%) of the 59 dromedary sera by immunofluorescence and neutralization antibody tests, respectively. There was significant correlation between the antibody titers determined by immunofluorescence and neutralization assays (Pearson coefficient = 0.525, p < 0.0001). Immunization of mice using recombinant N proteins of DcCoV UAE-HKU23 and Middle East respiratory syndrome coronavirus (MERS-CoV), respectively, and heat-inactivated DcCoV UAE-HKU23 showed minimal cross-antigenicity between DcCoV UAE-HKU23 and MERS-CoV by Western blot and neutralization antibody assays. Codon usage and genetic distance analysis of RdRp, S and N genes showed that the 14 strains of DcCoV UAE-HKU23 formed a distinct cluster, separated from those of other closely related members of Betacoronavirus 1, including alpaca CoV, confirming that DcCoV UAE-HKU23 is a novel member of Betacoronavirus 1. PMID:27164099
You, Fei; Yin, Guangfu; Pu, Ximing; Li, Yucan; Hu, Yang; Huang, Zhongbin; Liao, Xiaoming; Yao, Yadong; Chen, Xianchun
2016-05-01
Functionalization of inorganic nanoparticles (NPs) plays an important role in biomedical applications. Proper functionalization of NPs can improve biocompatibility, avoid a loss of bioactivity, and further endow NPs with unique performance. Modification with various specific binding biomolecules from random biological libraries has been explored. In this work, two 7-mer peptides with sequences of HYIDFRW and TVNFKLY were selected from a phage display random peptide library by using ferromagnetic NPs as targets, and were verified to display strong binding affinity to Fe3O4 NPs. Fourier transform infrared spectrometry, fluorescence microscopy, thermal analysis and X-ray photoelectron spectroscopy confirmed the presence of peptides on the surface of Fe3O4 NPs. Sequence analyses revealed that the probable binding mechanism between the peptide and Fe3O4 NPs might be driven by Pearson hard acid-hard base specific interaction and hydrogen bonds, accompanied by hydrophilic interactions and non-specific electrostatic attractions. The cell viability assay indicated good cytocompatibility of peptide-bound Fe3O4 NPs. Furthermore, the TVNFKLY peptide and an ovarian tumor cell A2780 specific binding peptide (QQTNWSL) were conjugated to afford a linear 14-mer peptide (QQTNWSLTVNFKLY). The binding and targeting studies showed that the 14-mer peptide was able to retain both the strong binding ability to Fe3O4 NPs and the specific binding ability to A2780 cells. The results suggested that the Fe3O4-binding peptides would be of great potential in the functionalization of Fe3O4 NPs for tumor-targeted drug delivery and magnetic hyperthermia. Copyright © 2016 Elsevier B.V. All rights reserved.
Jain, Priyamvada; Chakma, Babina; Patra, Sanjukta; Goswami, Pranab
2017-03-01
A set of 90-mer ssDNA candidates with different cytosine levels (C-levels; percentage and clustering) was analyzed for their function as suitable Ag-nanocluster (AgNC) nucleation scaffolds. The sequence (P4) with the highest C-level (42.2%) emerged as the only candidate supporting the nucleation process, as evident from its intense fluorescence peak at 660 nm. Of the shorter DNA subsets derived from P4, only those with stable hairpin structures could support AgNC formation. The secondary hairpin structures were confirmed by PAGE and CD studies. The number of base pairs in the stem region also contributes to the stability of the hairpins. A shorter 29-mer sequence (Sub 3) (ΔG = -1.3 kcal/mol) with a 3-bp stem and a 7-mer loop yielded highly stable AgNCs. NAD+ strongly quenched the fluorescence of Sub 3-AgNC in a concentration-dependent manner. Time-resolved photoluminescence studies revealed that the quenching involves a combined static and dynamic interaction, where the binding constant and number of binding sites for NAD+ were 0.201 L/mol and 3.6, respectively. A dynamic NAD+ detection range of 50-500 μM with a limit of detection of 22.3 μM was discerned. The NAD+-mediated quenching of AgNC was not interfered with by NADH, NADP+, monovalent and divalent ions, or serum samples. The method was also used to follow alcohol dehydrogenase and lactate dehydrogenase catalyzed physiological reactions in a turn-on and turn-off assay, respectively. The proposed method with ssDNA-AgNC could therefore be extended to monitor other NAD+/NADH-based enzyme-catalyzed reactions in a turn-on/turn-off approach. Copyright © 2016 Elsevier B.V. All rights reserved.
De Bellis, Fabien; Malapa, Roger; Kagy, Valérie; Lebegin, Stéphane; Billot, Claire; Labouisse, Jean-Pierre
2016-01-01
Premise of the study: Using next-generation sequencing technology, new microsatellite loci were characterized in Artocarpus altilis (Moraceae) and two congeners to increase the number of available markers for genotyping breadfruit cultivars. Methods and Results: A total of 47,607 simple sequence repeat loci were obtained by sequencing a library of breadfruit genomic DNA with an Illumina MiSeq system. Among them, 50 single-locus markers were selected and assessed using 41 samples (39 A. altilis, one A. camansi, and one A. heterophyllus). All loci were polymorphic in A. altilis, 44 in A. camansi, and 21 in A. heterophyllus. The number of alleles per locus ranged from two to 19. Conclusions: The new markers will be useful for assessing the identity and genetic diversity of breadfruit cultivars on a small geographical scale, gaining a better understanding of farmer management practices, and will help to optimize breadfruit genebank management. PMID:27610273
Chen, Pu; Ma, Mingyi; Shang, Huifang; Su, Dan; Zhang, Sizhong; Yang, Yuan
2009-12-01
To standardize the experimental procedure of the gene test for autosomal dominant cerebellar ataxias (ADCA), and provide the basis for quantitative criteria of the dynamic mutation of spinocerebellar ataxia (SCA) genes in the Chinese population. Genotyping of the dynamic mutation loci of the SCA1, SCA2, SCA3, SCA6 and SCA7 genes was performed, using fluorescence PCR-capillary electrophoresis followed by DNA sequencing, to investigate the variation range of the copy number of the CAG tandem repeat of the genes in 263 probands of ADCA pedigrees and 261 unrelated normal controls. Based on the sequencing results, the bias of the CAG copy number estimation using capillary electrophoresis with different DNA controls was compared to analyze the technical details of the electrophoresis method in testing the dynamic mutation sites. PCR products containing dynamic mutation loci of the SCA genes showed significantly higher mobility than that of a molecular weight marker with relatively balanced GC content. This was particularly obvious in the SCA2, SCA6 and SCA7 genes, whereas the deviation of copy number could be corrected to +/-1 when known CAG copy number fragments were used as controls. The mobility of PCR products was primarily related to the copy number of the CAG repeat when the fragments contained a normal CAG repeat. In the 263 ADCA pedigrees, 6 (2.28%) carried an SCA1 gene mutation, 8 (3.04%) had an SCA2 mutation and 81 (30.80%) harbored an SCA3 mutation. No mutation of SCA6 or SCA7 was found. The normal variation range of the CAG repeat was 17-36 copies in the SCA1 gene, 13-30 copies in SCA2, 14-39 copies in SCA3, 6-16 copies in SCA6 and 6-13 copies in SCA7. The heterozygosity was 76.1%, 17.7%, 74.4%, 72.1% and 41.3%, respectively. The mutation range of the CAG repeat was 49-56 copies in the SCA1 gene, 36-41 copies in SCA2, and 59-81 copies in SCA3. Neither homozygous mutation of an SCA gene nor double heterozygous mutation of the SCA genes was observed in the study.
The copy number of the CAG repeat in SCA genes could be calculated accurately based on the result of fluorescence PCR-capillary electrophoresis when a limited number of controls with known repeat copy numbers were used. Our result supported the notion that SCA3 gene mutation is the most common cause of ADCA, and the obtained data would be helpful for establishing quantitative criteria of the dynamic mutation of the SCA genes in the Chinese population.
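The reported mutation frequencies follow directly from the carrier counts among the 263 probands. The short Python check below reproduces that arithmetic; the variable names are illustrative, and the "unexplained" figure is simply the remainder implied by the counts in the abstract, not a number the authors report.

```python
# Carrier counts per gene among 263 ADCA probands, as given in the abstract.
PROBANDS = 263
carriers = {"SCA1": 6, "SCA2": 8, "SCA3": 81, "SCA6": 0, "SCA7": 0}

def percent(count, total):
    """Percentage rounded to two decimals, as in the abstract."""
    return round(100 * count / total, 2)

freqs = {gene: percent(n, PROBANDS) for gene, n in carriers.items()}

# Pedigrees without a mutation at any of the five tested loci (derived, not reported).
unexplained = PROBANDS - sum(carriers.values())
unexplained_pct = percent(unexplained, PROBANDS)
```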
Kawaguchi, Risa; Kiryu, Hisanori
2016-05-06
RNA secondary structure around splice sites is known to assist normal splicing by promoting spliceosome recognition. However, analyzing the structural properties of entire intronic regions or pre-mRNA sequences has so far been difficult, owing to serious experimental and computational limitations, such as low read coverage and numerical problems. Our novel software, "ParasoR", is designed to run on a computer cluster and enables the exact computation of various structural features of long RNA sequences under the constraint of maximal base-pairing distance. ParasoR divides dynamic programming (DP) matrices into smaller pieces, such that each piece can be computed by a separate computer node without losing the connectivity information between the pieces. ParasoR directly computes the ratios of DP variables to avoid the reduction of numerical precision caused by the cancellation of a large number of Boltzmann factors. The structural preferences of mRNAs computed by ParasoR show a high concordance with those determined by high-throughput sequencing analyses. Using ParasoR, we investigated the global structural preferences of transcribed regions in the human genome. A genome-wide folding simulation indicated that transcribed regions are significantly more structured than intergenic regions after removing repeat sequences and k-mer frequency bias. In particular, we observed a highly significant preference for base pairing over entire intronic regions as compared to their antisense sequences, as well as to intergenic regions. A comparison between pre-mRNAs and mRNAs showed that coding regions become more accessible after splicing, indicating constraints for translational efficiency. Such changes are correlated with gene expression levels, as well as GC content, and are enriched among genes associated with cytoskeleton and kinase functions.
We have shown that ParasoR is very useful for analyzing the structural properties of long RNA sequences such as mRNAs, pre-mRNAs, and long non-coding RNAs whose lengths can be more than a million bases in the human genome. In our analyses, transcribed regions including introns are indicated to be subject to various types of structural constraints that cannot be explained by simple sequence composition biases. ParasoR is freely available at https://github.com/carushi/ParasoR.
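The numerical issue mentioned in this abstract, loss of precision when very many Boltzmann factors are combined, can be illustrated with a generic Python sketch. This is not ParasoR's algorithm (which, per the abstract, works with ratios of DP variables); it only shows the related and widely used idea of keeping partition-function terms in log space and forming well-scaled ratios at the end. The function name `log_sum_exp` and the example weights are assumptions.

```python
import math

def log_sum_exp(log_terms):
    """Return log(sum(exp(x))) without overflowing for large x."""
    m = max(log_terms)
    if m == -math.inf:
        return -math.inf
    # Factor out the largest term so every exponent is <= 0.
    return m + math.log(sum(math.exp(x - m) for x in log_terms))

# Hypothetical log Boltzmann weights (-E/RT) of three structures.
# math.exp(800.0) itself would raise OverflowError in double precision.
log_weights = [800.0, 799.0, 798.0]
log_z = log_sum_exp(log_weights)

# The ratio of one weight to the partition function is a well-scaled probability.
p_first = math.exp(log_weights[0] - log_z)
```

The probability `p_first` stays representable even though no individual weight is, which is the general reason ratio-based or log-space formulations are preferred for long sequences.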
[Mutation Analysis of 19 STR Loci in 20 723 Cases of Paternity Testing].
Bi, J; Chang, J J; Li, M X; Yu, C Y
2017-06-01
To observe and analyze the confirmed cases of paternity testing, and to explore the mutation rules of STR loci. The mutant STR loci were screened from 20 723 confirmed cases of paternity testing by the Goldeneye 20A system. The mutation rates, and the sources, fragment length, steps and increased or decreased repeat sequences of mutant alleles were counted for the analysis of the characteristics of mutation-related factors. A total of 548 mutations were found on 19 STR loci, and 557 mutation events were observed. The per-locus mutation rate ranged from 0.07‰ to 2.23‰. The ratio of paternal to maternal mutation events was 3.06:1. One-step mutations predominated, and among them the number of increased repeat sequences was almost the same as that of decreased repeat sequences. Repeat sequences were more likely to decrease in mutations of two or more steps. Mutations mainly occurred in medium-length alleles, where the number of increased repeat sequences was almost the same as that of decreased repeat sequences. In long-allele mutations, decreased repeat sequences were significantly more frequent than increased repeat sequences. The number of increased repeat sequences was almost the same as that of decreased repeat sequences in paternal mutations, while decreased repeat sequences outnumbered increased ones in maternal mutations. There are significant differences in the mutation rate among loci. When one or two loci do not conform to the genetic law, additional detection systems should be added, and the PI value should be calculated in combination with the information on the mutated STR loci in order to further clarify the identification opinions. Copyright© by the Editorial Department of Journal of Forensic Medicine
Effects of Polymer Conjugation on Hybridization Thermodynamics of Oligonucleic Acids.
Ghobadi, Ahmadreza F; Jayaraman, Arthi
2016-09-15
In this work, we perform coarse-grained (CG) and atomistic simulations to study the effects of polymer conjugation on hybridization/melting thermodynamics of oligonucleic acids (ONAs). We present coarse-grained Langevin molecular dynamics simulations (CG-NVT) to assess the effects of the polymer flexibility, length, and architecture on hybridization/melting of ONAs with different ONA duplex sequences, backbone chemistry, and duplex concentration. In these CG-NVT simulations, we use our recently developed CG model of ONAs in implicit solvent, and treat the conjugated polymer as a CG chain with purely repulsive Weeks-Chandler-Andersen interactions with all other species in the system. We find that 8-100-mer linear polymer conjugation destabilizes 8-mer ONA duplexes with weaker Watson-Crick hydrogen bonding (WC H-bonding) interactions at low duplex concentrations, while the same polymer conjugation has an insignificant impact on 8-mer ONA duplexes with stronger WC H-bonding. To ensure the configurational space is sampled properly in the CG-NVT simulations, we also perform CG well-tempered metadynamics simulations (CG-NVT-MetaD) and analyze the free energy landscape of ONA hybridization for a select few systems. We demonstrate that CG-NVT-MetaD simulation results are consistent with the CG-NVT simulations for the studied systems. To examine the limitations of coarse-graining in capturing ONA-polymer interactions, we perform atomistic parallel tempering metadynamics simulations in the well-tempered ensemble (AA-MetaD) for a 4-mer DNA in explicit water with and without conjugation to 8-mer poly(ethylene glycol) (PEG). AA-MetaD simulations also show that, for a short DNA duplex at T = 300 K, a condition where the DNA duplex is unstable, conjugation with PEG further destabilizes the DNA duplex. We conclude with a comparison of results from these three different types of simulations and discuss their limitations and strengths.
Genotypic & Phenotypic Diversity of Microbial Isolates from the Mars Exploration Rovers
NASA Technical Reports Server (NTRS)
Arora-Williams, Keith
2012-01-01
Mars-bound rovers such as the Mars Exploration Rover (MER) endure strict planetary protection implementation campaigns to assess bioburden. The objective of this study is to identify cultivable microorganisms isolated by the NASA Standard Assay from spacecraft during pre-launch and evaluate their potential to survive conditions on the Martian surface. Of approximately 350 isolates collected from the MER spacecraft archive, 171 microorganisms were reconstituted for characterization via 16S rRNA fingerprinting. Alignment of 16S sequences revealed high levels of sequence similarity to spore-forming species, overwhelmingly of the genera Bacillus (73.7%) and Paenibacillus (14.0%). Samples underwent phenotype characterization employing multiple carbon sources and ion concentrations in an automated microarray format using the Omnilog system. Working and stock cultures were prepared to address the immediate needs for day-to-day culture utilization and long-term preservation, respectively. Results from this study produced details about the microbes that contaminate surfaces of spacecraft, as well as a preliminary evaluation of a rapid biochemical ID method that also provides a phenotypic assessment of contaminants. The overall outcome of this study will benefit emerging cleaning and sterilization technologies for preventing forward contamination that could negatively impact future life detection or sample return missions.
Krumpe, Lauren R H; Atkinson, Andrew J; Smythers, Gary W; Kandel, Andrea; Schumacher, Kathryn M; McMahon, James B; Makowski, Lee; Mori, Toshiyuki
2006-08-01
We investigated whether the T7 system of phage display could produce peptide libraries of greater diversity than the M13 system of phage display due to the differing processes of lytic and filamentous phage morphogenesis. Using a bioinformatics-assisted computational approach, collections of random peptide sequences obtained from a T7 12-mer library (X(12)) and a T7 7-mer disulfide-constrained library (CX(7)C) were analyzed and compared with peptide populations obtained from New England BioLabs' M13 Ph.D.-12 and Ph.D.-C7C libraries. Based on this analysis, peptide libraries constructed with the T7 system have fewer amino acid biases, increased peptide diversity, and more normal distributions of peptide net charge and hydropathy than the M13 libraries. The greater diversity of T7-displayed libraries provides a potential resource of novel binding peptides for new as well as previously studied molecular targets. To demonstrate their utility, several of the T7-displayed peptide libraries were screened for streptavidin- and neutravidin-binding phage. Novel binding motifs were identified for each protein.
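Two of the per-peptide properties compared in this abstract, net charge and hydropathy, are simple to compute. The Python sketch below uses the standard Kyte-Doolittle hydropathy scale and a deliberately crude charge model (Lys/Arg = +1, Asp/Glu = -1, ignoring histidine and the termini); the abstract does not say which scale or charge model the authors used, so both choices, the function names, and the example peptide are assumptions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def net_charge(peptide):
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E."""
    peptide = peptide.upper()
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative

def mean_hydropathy(peptide):
    """Average Kyte-Doolittle hydropathy over all residues (GRAVY score)."""
    peptide = peptide.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)

# Hypothetical 12-mer used only to exercise the functions.
example = "ACDEFGHIKLMN"
charge = net_charge(example)
gravy = mean_hydropathy(example)
```

Applying both functions to every peptide in a sequenced library and plotting the resulting histograms is one way to compare the distributions between libraries.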
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats
de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas
2015-01-01
Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363
Development of a transgenic tobacco plant for phytoremediation of methylmercury pollution.
Nagata, Takeshi; Morita, Hirofumi; Akizawa, Toshifumi; Pan-Hou, Hidemitsu
2010-06-01
To develop the potential of plants for phytoremediation of methylmercury pollution, a genetically engineered tobacco plant that coexpresses organomercurial lyase (MerB) with the ppk-specified polyphosphate (polyP) and merT-encoded mercury transporter was constructed by integrating a bacterial merB gene into ppk/merT-transgenic tobacco. A large number of independent transgenic tobacco lines were obtained, in some of which the merB gene was stably integrated in the plant genome and substantially translated to the expected MerB enzyme. The ppk/merT/merB-transgenic tobacco callus showed more resistance to methylmercury (CH3Hg+) and accumulated more mercury from CH3Hg+-containing medium than the ppk/merT-transgenic and wild-type progenitors. These results suggest that the MerB enzyme encoded by merB degraded the incorporated CH3Hg+ to Hg2+, which then accumulated as a less toxic Hg-polyP complex in the tobacco cells. Phytoremediation of CH3Hg+ and Hg2+ in the environment with this engineered ppk/merT/merB-transgenic plant, which prevents the release of mercury vapor (Hg0) into the atmosphere in addition to generating potentially recyclable mercury-rich plant residues, is believed to be more acceptable to the public than other competing technologies, including phytovolatilization.
NASA Astrophysics Data System (ADS)
Li, Qi; Akihiro, Kijima
2007-01-01
A microsatellite-enriched library was constructed using the magnetic bead hybridization selection method, and the microsatellite DNA sequences were analyzed in the Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using a PCR-based technique, and 84 clones were identified as potentially containing a microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of white colonies screened). Besides the CA motif contained in the oligoprobe, we also found 16 other types of microsatellite repeats, including a dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and a hexanucleotide repeat. According to Weber (1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats (13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (<20 repeats) were most abundant, accounting for 75.0%. The longest microsatellite had 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing the repeat motifs for microsatellite isolation in other abalone species.
Ciura, Sorana; Sellier, Chantal; Campanari, Maria-Letizia; Charlet-Berguerand, Nicolas; Kabashi, Edor
2016-01-01
The most common genetic cause for amyotrophic lateral sclerosis and frontotemporal dementia (ALS-FTD) is repeat expansion of a hexanucleotide sequence (GGGGCC) within the C9orf72 genomic sequence. To elucidate the functional role of C9orf72 in disease pathogenesis, we identified certain molecular interactors of this factor. We determined that C9orf72 exists in a complex with SMCR8 and WDR41 and that this complex acts as a GDP/GTP exchange factor for RAB8 and RAB39, 2 RAB GTPases involved in macroautophagy/autophagy. Consequently, C9orf72 depletion in neuronal cultures leads to accumulation of unresolved aggregates of SQSTM1/p62 and phosphorylated TARDBP/TDP-43. However, C9orf72 reduction does not lead to major neuronal toxicity, suggesting that a second stress may be required to induce neuronal cell death. An intermediate size of polyglutamine repeats within ATXN2 is an important genetic modifier of ALS-FTD. We found that coexpression of intermediate polyglutamine repeats (30Q) of ATXN2 combined with C9orf72 depletion increases the aggregation of ATXN2 and neuronal toxicity. These results were confirmed in zebrafish embryos where partial C9orf72 knockdown along with intermediate (but not normal) repeat expansions in ATXN2 causes locomotion deficits and abnormal axonal projections from spinal motor neurons. These results demonstrate that C9orf72 plays an important role in the autophagy pathway while genetically interacting with another major genetic risk factor, ATXN2, to contribute to ALS-FTD pathogenesis. PMID:27245636
Spinocerebellar ataxia 17: full phenotype in a 41 CAG/CAA repeats carrier.
Origone, Paola; Gotta, Fabio; Lamp, Merit; Trevisan, Lucia; Geroldi, Alessandro; Massucco, Davide; Grazzini, Matteo; Massa, Federico; Ticconi, Flavia; Bauckneht, Matteo; Marchese, Roberta; Abbruzzese, Giovanni; Bellone, Emilia; Mandich, Paola
2018-01-01
Spinocerebellar ataxia 17 (SCA17) is one of the most heterogeneous forms of autosomal dominant cerebellar ataxia, with a large clinical spectrum that can mimic other movement disorders such as Huntington disease (HD), dystonia and parkinsonism. SCA17 is caused by an expansion of the CAG/CAA repeat in the TATA-binding protein (TBP) gene. Normal alleles contain 25 to 40 CAG/CAA repeats; alleles with 50 or more CAG/CAA repeats are pathological with full penetrance. Alleles with 43 to 49 CAG/CAA repeats have also been reported, and their penetrance is estimated at between 50 and 80%. Recently, a few symptomatic individuals with 41 or 42 repeats were reported, but it is still unclear whether alleles with 41 or 42 CAG/CAA repeats are low-penetrance disease-causing alleles. Thus, phenotypic variability, such as the disease course in subjects with restricted expansions at the SCA17 locus, remains to be fully understood. The patient was a 63-year-old woman who, at 54 years, showed personality changes and an increased frequency of falls. At 55 years of age, neuropsychological tests showed executive, attention and visuospatial deficits. At the age of 59 the patient developed dysarthria and a progressive cognitive deficit. The neurological examination showed moderate gait ataxia, dysdiadochokinesia and dysmetria, dysphagia, dysarthria and abnormal saccadic pursuit, severe axial asynergy during postural changes, and choreiform dyskinesias. Molecular analysis of the TBP gene demonstrated an allele with 41 repeats, suggesting that 41 CAG/CAA TBP repeats could be an allele associated with the full clinical spectrum of SCA17. The described case, together with the similar cases described in the literature, suggests that 41 CAG/CAA trinucleotides should be considered a critical threshold in SCA17. We suggest that SCA17 diagnosis should be suspected in patients presenting with movement disorders associated with other neurodegenerative signs and symptoms.
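The repeat-count ranges quoted in this abstract can be summarized as a small lookup. The sketch below is illustrative only: the function name and category labels are hypothetical, the boundaries are copied from the abstract, and it is not a clinical tool.

```python
def classify_tbp_allele(repeats: int) -> str:
    """Map a TBP CAG/CAA repeat count to the category given in the abstract."""
    if repeats >= 50:
        return "pathological, full penetrance"
    if 43 <= repeats <= 49:
        return "reduced penetrance (estimated 50-80%)"
    if repeats in (41, 42):
        # The abstract describes these as still unclear, with a few
        # symptomatic carriers reported.
        return "uncertain, possibly low-penetrance"
    if 25 <= repeats <= 40:
        return "normal"
    return "outside the ranges given in the abstract"

for n in (36, 41, 45, 52):
    print(n, "->", classify_tbp_allele(n))
```

The 41-repeat allele of the reported patient falls in the "uncertain" band, which is the band the authors argue should be treated as a critical threshold.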
Wang, Li; Deng, Xiangying; Liu, Haican; Zhao, Lanhua; You, Xiaolong; Dai, Pei; Wan, Kanglin; Zeng, Yanhua
2016-11-01
Mycobacterium tuberculosis is an obligate pathogenic bacterial species in the family Mycobacteriaceae and elicits excessive immune responses that cause lung pathology in active tuberculosis. The lack of sensitive and effective diagnostic reagents calls for further development of rapid diagnostic and immunological measures for tuberculosis. Here, two 12-mer peptides with core sequences of SVSVGMKPSPRP (CS1) and TMGFTAPRFPHY (CS2) were screened from a phage display random peptide library using purified mixed tuberculosis-positive serum as a target. Enzyme-linked immunosorbent assay (ELISA) and dot immunobinding assay verified that positive phages exhibited strong binding affinity to mixed tuberculosis-positive serum. BLAST analysis showed that the two sequences may be mimotopes of Mycobacterium tuberculosis. The diagnostic potential of the two synthetic mimotope peptides CS1 and CS2 was evaluated by ELISA using different panels of serum samples (n = 181), and the diagnostic parameters were calculated. CS1 and CS2 achieved sensitivities of 89.41% and 85.88% and specificities of 90.63% and 87.50%, respectively. We hypothesize that diagnostics based on CS1 and CS2 may become a promising strategy to enhance the detection of Mycobacterium tuberculosis infection because of their high specificity and sensitivity. Therefore, CS1 and CS2 may provide an experimental basis for the diagnosis of tuberculosis.
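For readers who want to see how such diagnostic parameters are computed, here is a minimal sketch. The abstract does not report the underlying counts. The panel sizes (85 positive and 96 negative sera) and the hit counts below are hypothetical values chosen only because they sum to 181 and reproduce the reported CS1 percentages after rounding. They should not be read as the study's data.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Percentage of diseased samples that test positive."""
    return 100.0 * true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Percentage of non-diseased samples that test negative."""
    return 100.0 * true_neg / (true_neg + false_pos)

# Hypothetical counts (assumption, not from the source): 85 tuberculosis-
# positive and 96 negative sera, 181 in total.
cs1_sens = sensitivity(true_pos=76, false_neg=9)    # 76 of 85 positives detected
cs1_spec = specificity(true_neg=87, false_pos=9)    # 87 of 96 negatives cleared
print(cs1_sens, cs1_spec)
```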
Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L
2013-01-30
Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.
2013-01-01
Background: Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results: Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions: While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705
Barrero, Roberto A; Guerrero, Felix D; Black, Michael; McCooke, John; Chapman, Brett; Schilkey, Faye; Pérez de León, Adalberto A; Miller, Robert J; Bruns, Sara; Dobry, Jason; Mikhaylenko, Galina; Stormo, Keith; Bell, Callum; Tao, Quanzhou; Bogden, Robert; Moolhuijzen, Paula M; Hunter, Adam; Bellgard, Matthew I
2017-08-01
The genome of the cattle tick Rhipicephalus microplus, an ectoparasite with global distribution, is estimated to be 7.1 Gbp in length and consists of approximately 70% repetitive DNA. We report the draft assembly of a tick genome that utilized a hybrid sequencing and assembly approach to capture the repetitive fractions of the genome. Our hybrid approach produced an assembly consisting of 2.0 Gbp represented in 195,170 scaffolds with an N50 of 60,284 bp. The Rmi v2.0 assembly is 51.46% repetitive with a large fraction of unclassified repeats, short interspersed elements, long interspersed elements and long terminal repeats. We identified 38,827 putative R. microplus gene loci, of which 24,758 were protein coding genes (≥100 amino acids). OrthoMCL comparative analysis against 11 selected species including insects and vertebrates identified 10,835 and 3,423 protein coding gene loci that are unique to R. microplus or common to both R. microplus and Ixodes scapularis ticks, respectively. We identified 191 microRNA loci, of which 168 have similarity to known miRNAs and 23 represent novel miRNA families. We identified the genomic loci of several highly divergent R. microplus esterases with sequence similarity to acetylcholinesterase. Additionally, we report the finding of a novel cytochrome P450 CYP41 homolog that shows protein folding structures similar to those of known CYP41 proteins involved in acaricide resistance.
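The N50 statistic quoted for the assembly is the scaffold length at which the scaffolds of that length or longer together cover at least half of the assembly. A minimal sketch of that standard definition follows. The toy scaffold lengths are invented for illustration and have nothing to do with the Rmi v2.0 data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one length")

# Toy example (hypothetical lengths in bp): total 290, half is 145,
# and 80 + 70 = 150 is the first cumulative sum to reach it.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```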
Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm
Glunčić, Matko; Paar, Vladimir
2013-01-01
The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of a complete K-string ensemble, which enables a new method of direct mapping of a symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat-finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of the period-3 pattern in exons, implementation of the ‘magnifying glass’ effect, identification of a complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is particularly convenient for large repeat units, for highly mutated and/or complex repeats, and for global repeat maps of large genomic sequences (chromosomes and genomes). PMID:22977183
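The general idea of reading repeat periods off as peaks can be illustrated with a much simpler stand-in. The sketch below is not the GRM algorithm, and the abstract does not give enough detail to reproduce it. It merely histograms the distance between consecutive occurrences of each k-mer, so that a tandem repeat shows up as a peak at its unit length. The function name, the choice of k and the toy sequence are all assumptions.

```python
from collections import Counter

def kmer_spacing_histogram(seq: str, k: int = 4) -> Counter:
    """Count distances between consecutive occurrences of every k-mer."""
    last_seen = {}        # k-mer -> index of its most recent occurrence
    spacings = Counter()  # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            spacings[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return spacings

# Toy tandem repeat: six copies of an 8-nt unit (hypothetical sequence).
toy = "ACGTTGCA" * 6
hist = kmer_spacing_histogram(toy, k=4)
dominant_period = hist.most_common(1)[0][0]
print(dominant_period)
```

Unlike the method described in the abstract, this naive histogram has no tolerance for substitutions or indels beyond what short k-mers happen to survive.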
Larsen, Svend Arild; Mogensen, Line; Dietz, Rune; Baagøe, Hans Jørgen; Andersen, Mogens; Werge, Thomas; Rasmussen, Henrik Berg
2005-12-01
In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from the European badger. Both of these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats. The numbers of potential functional sites varied markedly between species. Our observations provide a platform for future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that differences in the structure of this tandem repeat contribute to specialization and generation of diversity in receptor function.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Ye-Ji; Tissue Injury Defense Research Center, School of Medicine, Ewha Womans University, Seoul; Lee, Seung-Hae
2012-08-15
Mer receptor tyrosine kinase (Mer) regulates macrophage activation and promotes apoptotic cell clearance. Mer activation is regulated through proteolytic cleavage of the extracellular domain. We sought to determine whether membrane-bound Mer is cleaved during bleomycin-induced lung injury and, if so, how preventing the cleavage of Mer enhances apoptotic cell uptake and down-regulates pulmonary immune responses. During bleomycin-induced acute lung injury in mice, membrane-bound Mer expression decreased, but production of soluble Mer and both the activity and expression of a disintegrin and metalloproteinase 17 (ADAM17) were enhanced. Treatment with the ADAM inhibitor TAPI-0 restored Mer expression and diminished soluble Mer production. Furthermore, TAPI-0 increased Mer activation in alveolar macrophages and lung tissue, resulting in enhanced apoptotic cell clearance in vivo and ex vivo by alveolar macrophages. Suppression of bleomycin-induced pro-inflammatory mediators, but enhancement of hepatocyte growth factor induction, was seen after TAPI-0 treatment. Additional bleomycin-induced inflammatory responses reduced by TAPI-0 treatment included inflammatory cell recruitment into the lungs, levels of total protein and lactate dehydrogenase activity in bronchoalveolar lavage fluid, as well as caspase-3 and caspase-9 activity and alveolar epithelial cell apoptosis in lung tissue. Importantly, the effects of TAPI-0 on bleomycin-induced inflammation and apoptosis were reversed by coadministration of specific Mer-neutralizing antibodies. These findings suggest that restored membrane-bound Mer expression by TAPI-0 treatment may help resolve lung inflammation and apoptosis after bleomycin treatment. Highlights: ► Mer expression is restored by TAPI-0 treatment in bleomycin-stimulated lung. ► Mer signaling is enhanced by TAPI-0 treatment in bleomycin-stimulated lung. ► TAPI-0 enhances efferocytosis and promotes resolution of lung injury.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.
de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas
2015-11-16
Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence-specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the base-specifying residue (BSR). However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Banerjee, Surajit; Christov, Plamen P.; Kozekova, Albena
trans-4-Hydroxynonenal (HNE) is the major peroxidation product of ω-6 polyunsaturated fatty acids in vivo. Michael addition of the N2-amino group of dGuo to HNE followed by ring closure of N1 onto the aldehyde results in four diastereomeric 1,N2-dGuo (1,N2-HNE-dGuo) adducts. The (6S,8R,11S)-HNE-1,N2-dGuo adduct was incorporated into the 18-mer templates 5'-d(TCATXGAATCCTTCCCCC)-3' and 5'-d(TCACXGAATCCTTCCCCC)-3', where X = (6S,8R,11S)-HNE-1,N2-dGuo adduct. These differed in the identity of the template 5'-neighbor base, which was either Thy or Cyt, respectively. Each of these templates was annealed with either a 13-mer primer 5'-d(GGGGGAAGGATTC)-3' or a 14-mer primer 5'-d(GGGGGAAGGATTCC)-3'. The addition of dNTPs to the 13-mer primer allowed analysis of dNTP insertion opposite to the (6S,8R,11S)-HNE-1,N2-dGuo adduct, whereas the 14-mer primer allowed analysis of dNTP extension past a primed (6S,8R,11S)-HNE-1,N2-dGuo:dCyd pair. The Sulfolobus solfataricus P2 DNA polymerase IV (Dpo4) belongs to the Y-family of error-prone polymerases. Replication bypass studies in vitro reveal that this polymerase inserted dNTPs opposite the (6S,8R,11S)-HNE-1,N2-dGuo adduct in a sequence-specific manner. If the template 5'-neighbor base was dCyt, the polymerase inserted primarily dGTP, whereas if the template 5'-neighbor base was dThy, the polymerase inserted primarily dATP. The latter event would predict low levels of Gua → Thy mutations during replication bypass when the template 5'-neighbor base is dThy. When presented with a primed (6S,8R,11S)-HNE-1,N2-dGuo:dCyd pair, the polymerase conducted full-length primer extension. Structures for ternary (Dpo4-DNA-dNTP) complexes with all four template-primers were obtained. For the 18-mer:13-mer template-primers in which the polymerase was confronted with the (6S,8R,11S)-HNE-1,N2-dGuo adduct, the (6S,8R,11S)-1,N2-dGuo lesion remained in the ring-closed conformation at the active site. The incoming dNTP, either dGTP or dATP, was positioned with Watson-Crick pairing opposite the template 5'-neighbor base, dCyt or dThy, respectively. In contrast, for the 18-mer:14-mer template-primers with a primed (6S,8R,11S)-HNE-1,N2-dGuo:dCyd pair, ring opening of the adduct to the corresponding N2-dGuo aldehyde species occurred. This allowed Watson-Crick base pairing at the (6S,8R,11S)-HNE-1,N2-dGuo:dCyd pair.
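The template and primer design described above can be checked mechanically. The sketch below uses the sequences quoted in the abstract and, as a simplifying assumption, treats the adduct position X as an ordinary G for base pairing. The helper names are hypothetical. It reports which template base sits opposite the primer's 3' end and which base the polymerase would read next.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def primer_context(template: str, primer: str, adduct: str = "X"):
    """Return (template base paired with the primer 3' end, next template base).

    Assumption: the adduct pairs like G, so it is replaced by G only for
    the annealing check; the returned bases keep the original X label.
    """
    readable = template.replace(adduct, "G")
    if not readable.endswith(reverse_complement(primer)):
        raise ValueError("primer does not anneal to the template 3' end")
    paired_index = len(template) - len(primer)
    next_base = template[paired_index - 1] if paired_index > 0 else None
    return template[paired_index], next_base

TEMPLATES = {"Thy neighbor": "TCATXGAATCCTTCCCCC", "Cyt neighbor": "TCACXGAATCCTTCCCCC"}
PRIMERS = {"13-mer": "GGGGGAAGGATTC", "14-mer": "GGGGGAAGGATTCC"}

for t_name, template in TEMPLATES.items():
    for p_name, primer in PRIMERS.items():
        print(t_name, p_name, primer_context(template, primer))
```

With the 13-mer, the next template base is X, which matches the insertion assay. With the 14-mer, the primer's 3' C already sits opposite X and the next base is the 5'-neighbor (T or C), which matches the extension assay.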
NASA Astrophysics Data System (ADS)
Moon, Chung Hee; Tousi, Marzieh; Cheeney, Joseph; Ngo-Duc, Tam-Triet; Zuo, Zheng; Liu, Jianlin; Haberer, Elaine D.
2015-11-01
An 8-mer ZnO-binding peptide, VPGAAEHT, was identified using an M13 pVIII phage display library and employed as an additive during aqueous-based ZnO synthesis at 65 °C. Unlike most other well-studied ZnO-binding sequences, which are strongly basic (pI > pH 7), the 8-mer peptide was overall acidic (pI < pH 7) in character, including only a single basic residue. The selected peptide strongly influenced ZnO nanostructure formation. Morphology and optical emission properties were found to be dependent on the concentration of peptide additive. Using lower peptide concentrations (<0.1 mM), single crystal hexagonal rods and platelets were produced, and using higher peptide concentrations (≥0.1 mM), polycrystalline layered platelets, yarn-like structures, and microspheres were assembled. Photoluminescence analysis revealed a characteristic ZnO band-edge peak, as well as sub-bandgap emission peaks. Defect-related green emission, typically associated with surface-related oxygen and zinc vacancies, was significantly reduced by the peptide additive, while blue emission, attributable to oxygen and zinc interstitials, emerged with increased peptide concentrations. Peptide-directed synthesis of ZnO materials may be useful for gas sensing and photocatalytic applications in which properly engineered morphology and defect levels have demonstrated enhanced performance.
Inhibition of HIV Replication by Cyclic and Hairpin PNAs Targeting the HIV-1 TAR RNA Loop
Upert, Gregory; Di Giorgio, Audrey; Upadhyay, Alok; Manvar, Dinesh; Pandey, Nootan; Pandey, Virendra N.; Patino, Nadia
2012-01-01
Human immunodeficiency virus-1 (HIV-1) replication and gene expression entail specific interaction of the viral protein Tat with its transactivation responsive element (TAR), which forms a highly stable stem-bulge-loop structure. Previously, we described triphenylphosphonium (TPP) cation-based vectors that efficiently deliver nucleotide analogs (peptide nucleic acids, PNAs) into the cytoplasm of cells. In particular, we showed that the TPP conjugate of a linear 16-mer PNA targeting the apical stem-loop region of TAR impedes Tat-mediated transactivation of the HIV-1 LTR in vitro and also in cell culture systems. In this communication, we conjugated TPP to cyclic and hairpin PNAs targeting the loop region of HIV-1 TAR and evaluated their antiviral efficacy in a cell culture system. We found that TPP-cyclic PNAs containing only 8 residues showed higher antiviral potency than hairpin PNAs of 12 or 16 residues. We further noted that the TPP conjugates of the 8-mer cyclic PNA and the 16-mer linear PNA displayed similar antiviral efficacy. However, cyclic PNAs were shown to be highly specific to their target sequences. This communication emphasizes the importance of small constrained cyclic PNAs over both linear and hairpin structures for targeting biologically relevant RNA hairpins. PMID:23029603
Seasonal changes of mercury reduction and methylation in Gulf of Trieste (north Adriatic Sea)
NASA Astrophysics Data System (ADS)
Horvat, M.; Bratkic, A.; Koron, N.; Faganeli, J.; Ribeiro Guevara, S.; Tinta, T.
2014-12-01
We have successfully improved and applied the 197Hg radiotracer method during a sampling campaign from March until November 2011, collecting and incubating sediments and waters with low 197Hg2+ additions without significantly increasing natural levels. The evolution of Me197Hg and DGM197 was followed. In addition, we performed Hg speciation of the water column and sediment, determined the diversity of the microbial community and investigated microbial resistance to Hg through the presence of merA and merB genes. Our results showed repeatedly that methylation does not occur in the water column of the Gulf of Trieste (GoT), and confirmed that sediments are the principal methylation site, as well as the source of MeHg to the water column. MeHg formation seems to be closely linked to nutrient cycling at the sediment-water interface, where degradation of organic matter with accompanying oxygen consumption significantly stimulates MeHg production (range 0.85 pM - 3.39 pM). The water column showed a pronounced capability for 197Hg2+ reduction (up to 25% d-1), confirming that the GoT is a source of Hg to the atmosphere. Whether reduction was directly linked to genetic resistance, was a consequence of non-specific redox reactions, or resulted from other microbial mechanisms could not be demonstrated. Neither merA nor merB genes were detected, but the microbial community structure in the water column changed seasonally, as did the reduction rates in the experiments. Most importantly, it was shown that the 197Hg methodology is sensitive enough to follow Hg biogeochemical transformations at environmental levels. The advantage is that the minimal additions of 197Hg do not disturb the natural processes occurring in the environment and that very small changes can be detected. Hg stress in the Gulf can directly manifest itself in biota and consequently result in a threat to environmental and public health, and therefore needs to be seen in the light of a changing global climate and marine environment.
Analysis of composition-based metagenomic classification.
Higashi, Susan; Barreto, André da Motta Salles; Cantão, Maurício Egidio; de Vasconcelos, Ana Tereza Ribeiro
2012-01-01
An essential step of a metagenomic study is the taxonomic classification, that is, the identification of the taxonomic lineage of the organisms in a given sample. The taxonomic classification process involves a series of decisions. Currently, in the context of metagenomics, such decisions are usually based on empirical studies that consider one specific type of classifier. In this study we propose a general framework for analyzing the impact that several decisions can have on the classification problem. Instead of focusing on any specific classifier, we define a generic score function that provides a measure of the difficulty of the classification task. Using this framework, we analyze the impact of the following parameters on the taxonomic classification problem: (i) the length of n-mers used to encode the metagenomic sequences, (ii) the similarity measure used to compare sequences, and (iii) the type of taxonomic classification, which can be conventional or hierarchical, depending on whether the classification process occurs in a single shot or in several steps according to the taxonomic tree. We defined a score function that measures the degree of separability of the taxonomic classes under a given configuration induced by the parameters above. We conducted an extensive computational experiment and found out that reasonable values for the parameters of interest could be (i) intermediate values of n, the length of the n-mers; (ii) any similarity measure, because all of them resulted in similar scores; and (iii) the hierarchical strategy, which performed better in all of the cases. As expected, short n-mers generate lower configuration scores because they give rise to frequency vectors that represent distinct sequences in a similar way. On the other hand, large values for n result in sparse frequency vectors that represent differently metagenomic fragments that are in fact similar, also leading to low configuration scores. 
Regarding the similarity measure, in contrast to our expectations, the variation of the measures did not change the configuration scores significantly. Finally, the hierarchical strategy was more effective than the conventional strategy, which suggests that, instead of using a single classifier, one should adopt multiple classifiers organized as a hierarchy.
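To make the encoding discussed above concrete, the following sketch builds an n-mer frequency vector for a DNA fragment and compares two fragments with cosine similarity. It illustrates composition-based encoding in general rather than the authors' score function. The function names, the choice of cosine similarity and the toy fragments are assumptions.

```python
import math
from collections import Counter
from itertools import product

def nmer_vector(seq: str, n: int) -> list:
    """Relative frequencies of all 4**n n-mers over the alphabet ACGT."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than n")
    # n-mers containing other symbols (e.g. N) count toward the total
    # but have no slot in the vector.
    return [counts["".join(p)] / total for p in product("ACGT", repeat=n)]

def cosine_similarity(u, v) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

frag_a = "ACGTACGTACGTTTGACA"   # hypothetical fragments
frag_b = "ACGTACGAACGTTTGACC"
for n in (1, 2, 4):
    sim = cosine_similarity(nmer_vector(frag_a, n), nmer_vector(frag_b, n))
    print(n, len(nmer_vector(frag_a, n)), round(sim, 3))
```

The vector length grows as 4**n. This is the sparsity effect the abstract describes for large n: short fragments fill only a few of the available slots.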
Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R
1991-01-01
Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence; and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNAasp and tRNAf-met genes, and the arrangement of sequences within this segment is: the tRNAasp gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNAf-met gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M. incognita, M. hapla and M. arenaria, respectively. Nucleotide sequences of the M. incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M. javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M. javanica and the different host races of M. incognita, M. hapla and M. arenaria are sufficient to distinguish the different host races of each species. PMID:2027769
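The stated segment size can be sanity-checked by adding up the components listed in the abstract. This back-of-the-envelope sketch takes the approximate copy number of 36 for the 102 ntp repeat at face value and leaves out the two bounding tRNA genes, whose lengths the abstract does not give.

```python
# (copies, unit length in ntp) for each component named in the abstract,
# in the order given there; unique segments are entered as one copy.
components = {
    "unique segment with hairpins": (1, 1528),
    "102 ntp repeat set": (36, 102),   # "approximately 36 copies"
    "8 ntp repeat set": (5, 8),
    "unique segment": (1, 1068),
    "63 ntp repeat set": (11, 63),
}

segment_length = sum(copies * unit for copies, unit in components.values())
print(segment_length, "ntp")
```

The total comes to about 7 kb, consistent with the size quoted for the repeat-containing segment.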
Messoussi, Abdellah; Peyronnet, Lucile; Feneyrolles, Clémence; Chevé, Gwénaël; Bougrin, Khalid; Yasri, Aziz
2014-10-10
Structural elucidation of the active (DFG-Asp in) and inactive (DFG-Asp out) states of the TAM family of receptor tyrosine kinases is required for future development of TAM inhibitors as drugs. Herein we report a computational study on each of the three TAM members Tyro-3, Axl and Mer. DFG-Asp in and DFG-Asp out homology models of each one were built based on the X-ray structure of c-Met kinase, an enzyme with a closely related sequence. Structural validation and in silico screening enabled identification of critical amino acids for ligand binding within the active site of each DFG-Asp in and DFG-Asp out model. The position and nature of amino acids that differ among Tyro-3, Axl and Mer, and the potential role of these residues in the design of selective TAM ligands, are discussed.
Autonomous Exploration for Gathering Increased Science
NASA Technical Reports Server (NTRS)
Bornstein, Benjamin J.; Castano, Rebecca; Estlin, Tara A.; Gaines, Daniel M.; Anderson, Robert C.; Thompson, David R.; DeGranville, Charles K.; Chien, Steve A.; Tang, Benyang; Burl, Michael C.;
2010-01-01
The Autonomous Exploration for Gathering Increased Science System (AEGIS) provides automated targeting for remote sensing instruments on the Mars Exploration Rover (MER) mission, which at the time of this reporting has had two rovers exploring the surface of Mars (see figure). Currently, targets for rover remote-sensing instruments must be selected manually based on imagery already on the ground with the operations team. AEGIS enables the rover flight software to analyze imagery onboard in order to autonomously select and sequence targeted remote-sensing observations in an opportunistic fashion. In particular, this technology will be used to automatically acquire sub-framed, high-resolution, targeted images taken with the MER panoramic cameras. This software provides: 1) Automatic detection of terrain features in rover camera images, 2) Feature extraction for detected terrain targets, 3) Prioritization of terrain targets based on a scientist target feature set, and 4) Automated re-targeting of rover remote-sensing instruments at the highest priority target.
A Novel Role of MerC in Methylmercury Transport and Phytoremediation of Methylmercury Contamination.
Sone, Yuka; Uraguchi, Shimpei; Takanezawa, Yasukazu; Nakamura, Ryosuke; Pan-Hou, Hidemitsu; Kiyono, Masako
2017-01-01
MerC, encoded by merC in the transposon Tn21 mer operon, is a heavy metal transporter with potential applications for phytoremediation of heavy metals such as mercuric ion and cadmium. In this study, we demonstrate that MerC also acts as a transporter for methylmercury. When MerC was expressed in Escherichia coli XL1-Blue, cells became hypersensitive to CH3Hg(I) and the uptake of CH3Hg(I) by these cells was higher than that by cells of the isogenic strain. Moreover, transgenic Arabidopsis plants expressing bacterial MerC or MerC fused to plant soluble N-ethylmaleimide-sensitive factor attachment protein receptors (SNAREs) accumulated CH3Hg(I) effectively and their growth was comparable to the wild-type plants. These results demonstrate that when the bacterium-derived merC gene is ectopically introduced in genetically modified plants, MerC expression in the transgenic plants promotes the transport and sequestration of methylmercury. Thus, our results show that the expression of merC in Arabidopsis results in transgenic plants that could be used for the phytoremediation and elimination of toxic methylmercury from the environment.
Boyd, Eric S.; Barkay, Tamar
2012-01-01
Mercuric mercury (Hg[II]) is a highly toxic and mobile element that is likely to have had a pronounced and adverse effect on biology since Earth’s oxygenation ∼2.4 billion years ago due to its high affinity for protein sulfhydryl groups, which upon binding destabilize protein structure and decrease enzyme activity, resulting in a decreased organismal fitness. The central enzyme in the microbial mercury detoxification system is the mercuric reductase (MerA) protein, which catalyzes the reduction of Hg(II) to volatile Hg(0). In addition to MerA, mer operons encode for proteins involved in regulation, Hg binding, and organomercury degradation. Mer-mediated approaches have had broad applications in the bioremediation of mercury-contaminated environments and industrial waste streams. Here, we examine the composition of 272 individual mer operons and quantitatively map the distribution of mer-encoded functions on both taxonomic SSU rRNA gene and MerA phylogenies. The results indicate an origin and early evolution of MerA among thermophilic bacteria and an overall increase in the complexity of mer operons through evolutionary time, suggesting continual gene recruitment and evolution leading to an improved efficiency and functional potential of the Mer detoxification system. Consistent with a positive relationship between the evolutionary history and topology of MerA and SSU rRNA gene phylogenies (Mantel R = 0.81, p < 0.01), the distribution of the majority of mer functions, when mapped on these phylograms, indicates an overall tendency to inherit mer-encoded functions through vertical descent. However, individual mer functions display evidence of a variable degree of vertical inheritance, with several genes exhibiting strong evidence for acquisition via lateral gene transfer and/or gene loss. 
Collectively, these data suggest that (i) mer has evolved from a simple system in geothermal environments to a widely distributed and more complex and efficient detoxification system, and (ii) merA is a suitable biomarker for examining the functional diversity of Hg detoxification and for predicting the composition of mer operons in natural environments. PMID:23087676
A novel sodium bicarbonate cotransporter-like gene in an ancient duplicated region: SLC4A9 at 5q31
Lipovich, Leonard; Lynch, Eric D; Lee, Ming K; King, Mary-Claire
2001-01-01
Background: Sodium bicarbonate cotransporter (NBC) genes encode proteins that execute coupled Na+ and HCO3- transport across epithelial cell membranes. We report the discovery, characterization, and genomic context of a novel human NBC-like gene, SLC4A9, on chromosome 5q31. Results: SLC4A9 was initially discovered by genomic sequence annotation and further characterized by sequencing of long-insert cDNA library clones. The predicted protein of 990 amino acids has 12 transmembrane domains and high sequence similarity to other NBCs. The 23-exon gene has 14 known mRNA isoforms. In three regions, mRNA sequence variation is generated by the inclusion or exclusion of portions of an exon. Noncoding SLC4A9 cDNAs were recovered multiple times from different libraries. The 3' untranslated region is fragmented into six alternatively spliced exons and contains expressed Alu, LINE and MER repeats. SLC4A9 has two alternative stop codons and six polyadenylation sites. Its expression is largely restricted to the kidney. In silico approaches were used to characterize two additional novel SLC4A genes and to place SLC4A9 within the context of multiple paralogous gene clusters containing members of the epidermal growth factor (EGF), ankyrin (ANK) and fibroblast growth factor (FGF) families. Seven human EGF-SLC4A-ANK-FGF clusters were found. Conclusion: The novel sodium bicarbonate cotransporter-like gene SLC4A9 demonstrates abundant alternative mRNA processing. It belongs to a growing class of functionally diverse genes characterized by inefficient highly variable splicing. The evolutionary history of the EGF-SLC4A-ANK-FGF gene clusters involves multiple rounds of duplication, apparently followed by large insertions and deletions at paralogous loci and genome-wide gene shuffling. PMID:11305939
Effective inhibition of MERS-CoV infection by resveratrol.
Lin, Shih-Chao; Ho, Chi-Tang; Chuo, Wen-Ho; Li, Shiming; Wang, Tony T; Lin, Chi-Chen
2017-02-13
Middle East Respiratory Syndrome coronavirus (MERS-CoV) is an emerging viral pathogen that causes severe morbidity and mortality. To date, no approved or licensed vaccine or antiviral medicine is available to treat MERS-CoV-infected patients. Here, we analyzed the antiviral activities of resveratrol, a natural compound found in grape seeds and skin and in red wine, against MERS-CoV infection. We performed MTT and neutral red uptake assays to assess the survival rates of MERS-infected Vero E6 cells. In addition, quantitative PCR, western blotting, and immunofluorescent assays determined the intracellular viral RNA and protein expression. For viral productivity, we utilized plaque assays to confirm the antiviral properties of resveratrol against MERS-CoV. Resveratrol significantly inhibited MERS-CoV infection and prolonged cellular survival after virus infection. We also found that the expression of nucleocapsid (N) protein, which is essential for MERS-CoV replication, was decreased after resveratrol treatment. Furthermore, resveratrol down-regulated the apoptosis induced by MERS-CoV in vitro. By consecutive administration of resveratrol, we were able to reduce the concentration of resveratrol while achieving inhibitory effectiveness against MERS-CoV. In this study, we demonstrated for the first time that resveratrol is a potent anti-MERS agent in vitro. We suggest that resveratrol could be a potential antiviral agent against MERS-CoV infection in the near future.
Variation, Repetition, And Choice
Abreu-Rodrigues, Josele; Lattal, Kennon A; dos Santos, Cristiano V; Matos, Ricardo A
2005-01-01
Experiment 1 investigated the controlling properties of variability contingencies on choice between repeated and variable responding. Pigeons were exposed to concurrent-chains schedules with two alternatives. In the REPEAT alternative, reinforcers in the terminal link depended on a single sequence of four responses. In the VARY alternative, a response sequence in the terminal link was reinforced only if it differed from the n previous sequences (lag criterion). The REPEAT contingency generated low, constant levels of sequence variation whereas the VARY contingency produced levels of sequence variation that increased with the lag criterion. Preference for the REPEAT alternative tended to increase directly with the degree of variation required for reinforcement. Experiment 2 examined the potential confounding effects in Experiment 1 of immediacy of reinforcement by yoking the interreinforcer intervals in the REPEAT alternative to those in the VARY alternative. Again, preference for REPEAT was a function of the lag criterion. Choice between varying and repeating behavior is discussed with respect to obtained behavioral variability, probability of reinforcement, delay of reinforcement, and switching within a sequence. PMID:15828592
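The lag criterion of the VARY alternative described above can be sketched in a few lines. This is a minimal illustration and not the authors' procedure: the name `make_vary_checker`, the L/R key labels, and the choice to compare each sequence against the n most recent sequences whether or not they were reinforced are all assumptions.

```python
from collections import deque

def make_vary_checker(lag):
    """Return a function that tests response sequences against a lag-n criterion."""
    history = deque(maxlen=lag)  # keeps only the n most recent sequences

    def check(sequence):
        seq = tuple(sequence)
        meets_criterion = seq not in history  # must differ from all n previous
        history.append(seq)                   # assumption: every sequence is recorded
        return meets_criterion

    return check

check = make_vary_checker(lag=2)
results = [check(s) for s in ["LLRR", "LRLR", "LLRR", "RRLL", "LRLR"]]
print(results)  # [True, True, False, True, True]
```

With a lag of 2, the third sequence fails because it repeats one of the two preceding sequences, while the fifth passes because its earlier occurrence has already dropped out of the two-sequence window.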
Wahba, Haytham M; Lecoq, Lauriane; Stevenson, Michael; Mansour, Ahmed; Cappadocia, Laurent; Lafrance-Vanasse, Julien; Wilkinson, Kevin J; Sygusch, Jurgen; Wilcox, Dean E; Omichinski, James G
2016-02-23
In bacterial resistance to mercury, the organomercurial lyase (MerB) plays a key role in the detoxification pathway through its ability to cleave Hg-carbon bonds. Two cysteines (C96 and C159; Escherichia coli MerB numbering) and an aspartic acid (D99) have been identified as the key catalytic residues, and these three residues are conserved in all but four known MerB variants, where the aspartic acid is replaced with a serine. To understand the role of the active site serine, we characterized the structure and metal binding properties of an E. coli MerB mutant with a serine substituted for D99 (MerB D99S) as well as one of the native MerB variants containing a serine residue in the active site (Bacillus megaterium MerB2). Surprisingly, the MerB D99S protein copurified with a bound metal that was determined to be Cu(II) from UV-vis absorption, inductively coupled plasma mass spectrometry, nuclear magnetic resonance, and electron paramagnetic resonance studies. X-ray structural studies revealed that the Cu(II) is bound to the active site cysteine residues of MerB D99S, but that it is displaced following the addition of either an organomercurial substrate or an ionic mercury product. In contrast, the B. megaterium MerB2 protein does not copurify with copper, but the structure of the B. megaterium MerB2-Hg complex is highly similar to the structure of the MerB D99S-Hg complexes. These results demonstrate that the active site aspartic acid is crucial for both the enzymatic activity and metal binding specificity of MerB proteins and suggest a possible functional relationship between MerB and its only known structural homologue, the copper-binding protein NosL.
Wirblich, Christoph; Coleman, Christopher M; Kurup, Drishya; Abraham, Tara S; Bernbaum, John G; Jahrling, Peter B; Hensley, Lisa E; Johnson, Reed F; Frieman, Matthew B; Schnell, Matthias J
2017-01-15
Middle East respiratory syndrome coronavirus (MERS-CoV) emerged in 2012 and is a highly pathogenic respiratory virus. There are no treatment options against MERS-CoV for humans or animals, and there are no large-scale clinical trials for therapies against MERS-CoV. To address this need, we developed an inactivated rabies virus (RABV) that contains the MERS-CoV spike (S) protein expressed on its surface. Our initial recombinant vaccine, BNSP333-S, expresses a full-length wild-type MERS-CoV S protein; however, it showed significantly reduced viral titers compared to those of the parental RABV strain and only low-level incorporation of full-length MERS-CoV S into RABV particles. Therefore, we developed a RABV-MERS vector that contained the MERS-CoV S1 domain of the MERS-CoV S protein fused to the RABV G protein C terminus (BNSP333-S1). BNSP333-S1 grew to titers similar to those of the parental vaccine vector BNSP333, and the RABV G-MERS-CoV S1 fusion protein was efficiently expressed and incorporated into RABV particles. When we vaccinated mice, chemically inactivated BNSP333-S1 induced high-titer neutralizing antibodies. Next, we challenged both vaccinated mice and control mice with MERS-CoV after adenovirus transduction of the human dipeptidyl peptidase 4 (hDPP4) receptor and then analyzed the ability of mice to control MERS-CoV infection. Our results demonstrated that vaccinated mice were fully protected from the MERS-CoV challenge, as indicated by the significantly lower MERS-CoV titers and MERS-CoV and mRNA levels in challenged mice than those in unvaccinated controls. These data establish that an inactivated RABV-MERS S-based vaccine may be effective for use in animals and humans in areas where MERS-CoV is endemic. Rabies virus-based vectors have been proven to be efficient dual vaccines against rabies and emergent infectious diseases such as Ebola virus. 
Here we show that inactivated rabies virus particles containing the MERS-CoV S1 protein induce potent immune responses against MERS-CoV and RABV. This novel vaccine is easy to produce and may be useful to protect target animals, such as camels, as well as humans from deadly MERS-CoV and RABV infections. Our results indicate that this vaccine approach can prevent disease, and the RABV-based vaccine platform may be a valuable tool for timely vaccine development against emerging infectious diseases. Copyright © 2017 American Society for Microbiology.
Zamakhchari, Maram; Wei, Guoxian; Dewhirst, Floyd; Lee, Jaeseop; Schuppan, Detlef; Oppenheim, Frank G.; Helmerhorst, Eva J.
2011-01-01
Background: Gluten proteins, prominent constituents of barley, wheat and rye, cause celiac disease in genetically predisposed subjects. Gluten is notoriously difficult to digest by mammalian proteolytic enzymes and the protease-resistant domains contain multiple immunogenic epitopes. The aim of this study was to identify novel sources of gluten-digesting microbial enzymes from the upper gastro-intestinal tract with the potential to neutralize gluten epitopes. Methodology/Principal Findings: Oral microorganisms with gluten-degrading capacity were obtained by a selective plating strategy using gluten agar. Microbial speciations were carried out by 16S rDNA gene sequencing. Enzyme activities were assessed using gliadin-derived enzymatic substrates, gliadins in solution, gliadin zymography, and 33-mer α-gliadin and 26-mer γ-gliadin immunogenic peptides. Fragments of the gliadin peptides were separated by RP-HPLC and structurally characterized by mass spectrometry. Strains with high activity towards gluten were typed as Rothia mucilaginosa and Rothia aeria. Gliadins (250 µg/ml) added to Rothia cell suspensions (OD620 1.2) were degraded by 50% after ∼30 min of incubation. Importantly, the 33-mer and 26-mer immunogenic peptides were also cleaved, primarily C-terminal to Xaa-Pro-Gln (XPQ) and Xaa-Pro-Tyr (XPY). The major gliadin-degrading enzymes produced by the Rothia strains were ∼70–75 kDa in size, and the enzyme expressed by Rothia aeria was active over a wide pH range (pH 3–10). Conclusion/Significance: While the human digestive enzyme system lacks the capacity to cleave immunogenic gluten, such activities are naturally present in the oral microbial enzyme repertoire. The identified bacteria may be exploited for physiologic degradation of harmful gluten peptides. PMID:21957450
SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes.
Jaron, Kamil S; Moravec, Jiří C; Martínková, Natália
2014-04-15
Genomic islands (GIs) are DNA fragments incorporated into a genome through horizontal gene transfer (also called lateral gene transfer), often with functions novel for a given organism. While methods for their detection are well researched in prokaryotes, the complexity of eukaryotic genomes makes direct utilization of these methods unreliable, and so labour-intensive phylogenetic searches are used instead. We present a surrogate method that investigates nucleotide base composition of the DNA sequence in a eukaryotic genome and identifies putative GIs. We calculate a genomic signature as a vector of tetranucleotide (4-mer) frequencies using a sliding window approach. Extending the neighbourhood of the sliding window, we establish a local kernel density estimate of the 4-mer frequency. We score the number of 4-mer frequencies in the sliding window that deviate from the credibility interval of their local genomic density using a newly developed discrete interval accumulative score (DIAS). To further improve the effectiveness of DIAS, we select informative 4-mers in a range of organisms using the tetranucleotide quality score developed herein. We show that the SigHunt method is computationally efficient and able to detect GIs in eukaryotic genomes that represent non-ameliorated integration. Thus, it is suited to scanning for change in organisms with different DNA composition. Source code and scripts (freely available for download at http://www.iba.muni.cz/index-en.php?pg=research-data-analysis-tools-sighunt) are implemented in C and R and are platform-independent. © The Author 2013. Published by Oxford University Press.
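The first step described above, a genomic signature computed as a vector of 4-mer frequencies in a sliding window, might look like the following sketch. It is not the SigHunt implementation (which is in C and R): the function names, window size and step are illustrative assumptions, and the kernel density estimate and DIAS scoring are not reproduced here.

```python
from itertools import product

K = 4
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]  # all 256 tetranucleotides

def kmer_frequencies(window_seq):
    """Return the 256-element vector of relative 4-mer frequencies for one window."""
    counts = dict.fromkeys(KMERS, 0)
    total = 0
    for i in range(len(window_seq) - K + 1):
        kmer = window_seq[i:i + K]
        if kmer in counts:  # skips 4-mers containing N or other ambiguity codes
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in KMERS]

def sliding_signatures(seq, window, step):
    """Yield (start, frequency vector) for each full window along the sequence."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        yield start, kmer_frequencies(seq[start:start + window])

signatures = list(sliding_signatures("ACGT" * 10, window=20, step=10))
print(len(signatures))  # 3
```

Each vector could then be compared with the composition of its neighbourhood, which is where a deviation score such as DIAS would come in.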
Stone, M J; Nedderman, A N; Williams, D H; Lin, P K; Brown, D M
1991-12-05
In order to reach a more detailed understanding of the mechanism of the mutagenic action of methoxyamine and of N4-methoxycytidine and its 2'-deoxyribo-analogue, the solution structures of the self-complementary octanucleotide, d(CGAATTCG) and its analogues, d(CGAATCCG), d(CGAATMCG) and d(CGAATPCG) (designated 8mer-AT, 8mer-AC, 8mer-AM, and 8mer-AP, respectively), were investigated by 1H nuclear magnetic resonance spectroscopy; M is N4-methoxycytosine (mo4C) and P is an analogue, the bicyclic dihydropyrimido[4,5-c][1,2]oxazin-7-one, in which the N-O bond is held in the anti configuration with respect to N3 of the cytosine ring. Correlated spectroscopy and nuclear Overhauser spectroscopy allowed assignment of the base, anomeric and H2'/H2" protons in 8mers-AT, -AM and -AP, and showed that all three had features consistent with a regular B-DNA duplex structure. Duplex-to-coil transition temperatures were determined to be 52 (±2) °C (8mer-AT), 51 (±2) °C (8mer-AP) and 32 (±2) °C (8mer-AM); on the chemical shift timescale, the melting transition was fast for 8mer-AT and 8mer-AP, but slow for 8mer-AM. Imino proton spectra were indicative of Watson-Crick base-pairing in 8mers-AT, -AP and -AM. The 8mer-AP duplex had a structure and melting characteristics virtually identical with those of the 8mer-AT duplex. The preferred syn configuration of the methoxyl group in M had a destabilising effect on the 8mer-AM duplex. At low temperatures, the A.M base-pair was in fast equilibrium between Watson-Crick and wobble configurations, with the methoxyl function anti-oriented, but the melting transition was accompanied by isomerization of the methoxyl group to the syn conformation. This syn-anti isomerization was the rate-determining step in the duplex-to-coil transition. The 8mer-AC oligomer did not form a stable duplex.
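To illustrate what self-complementary means for the two unmodified octamers above, the sketch below compares each strand with its own reverse complement. It covers only the standard bases, so the M and P analogues are left out, and the helper names are illustrative rather than taken from the paper.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def self_duplex_mismatches(seq):
    """0-based positions that cannot pair when the strand binds an antiparallel copy of itself."""
    partner = reverse_complement(seq)
    return [i for i, (a, b) in enumerate(zip(seq, partner)) if a != b]

print(self_duplex_mismatches("CGAATTCG"))  # []      (8mer-AT)
print(self_duplex_mismatches("CGAATCCG"))  # [2, 5]  (8mer-AC)
```

In the 8mer-AC self-duplex the A at position 2 of one strand sits opposite the C at position 5 of the other, giving two A·C mismatches, which fits the report that this oligomer did not form a stable duplex.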
Alnazawi, Mohamed; Altaher, Abdallah; Kandeel, Mahmoud
2017-01-01
Middle East Respiratory Syndrome Coronavirus (MERS CoV) causes a newly emerging viral disease characterized by a high fatality rate. Understanding MERS CoV genetic aspects and codon usage patterns is important for understanding MERS CoV survival, adaptation, evolution, and resistance to innate immunity, and helps in finding unique aspects of the virus for future drug discovery experiments. In this work, we provide a comprehensive analysis of 238 MERS CoV full genomes comprising human (hMERS) and camel (cMERS) isolates of the virus. MERS CoV genome shaping seems to be under compositional and mutational bias, as revealed by preference of A/T over G/C nucleotides, preferred codons, nucleotides at the third position of codons (NT3s), relative synonymous codon usage, hydropathicity (Gravy), and aromaticity (Aromo) indices. Effective number of codons (ENc) analysis reveals a general slight codon usage bias. Codon adaptation index reveals incomplete adaptation to host environment. MERS CoV showed high ability to resist the innate immune response by showing lower CpG frequencies. Neutrality evolution analysis revealed a more significant role of mutation pressure in cMERS over hMERS. Correspondence analysis revealed that MERS CoV genomes have three genetic clusters, which were distinct in their codon usage, host, and geographic distribution. Additionally, virtual screening and binding experiments were able to identify three new virus-encoded helicase binding compounds. These compounds can be used for further optimization of inhibitors.
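Two of the compositional measures mentioned above, CpG frequency and third-codon-position nucleotide usage, can be sketched as follows. The function names are illustrative, the CpG observed/expected ratio uses one common definition (CpG count × length / (C count × G count)) that may differ from the one used in the study, and the input sequences are toy strings rather than MERS CoV data.

```python
def cpg_observed_expected(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G); 0.0 if undefined."""
    seq = seq.upper()
    n = len(seq)
    c_count, g_count = seq.count("C"), seq.count("G")
    if n < 2 or c_count == 0 or g_count == 0:
        return 0.0
    cpg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    return cpg * n / (c_count * g_count)

def third_position_composition(cds):
    """Fraction of A, C, G and T at the third position of each complete codon."""
    thirds = cds.upper()[2::3]  # indices 2, 5, 8, ... are third codon positions
    if not thirds:
        return {base: 0.0 for base in "ACGT"}
    return {base: thirds.count(base) / len(thirds) for base in "ACGT"}

print(cpg_observed_expected("ACGT"))            # 4.0
print(third_position_composition("ATGGCTAAA"))  # A, G and T each 1/3; C is 0.0
```

A ratio well below 1 indicates CpG depletion, and a high A/T fraction at third positions would be in line with the A/T preference the abstract reports.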
Bizily, Scott P.; Kim, Tehryung; Kandasamy, Muthugapatti K.; Meagher, Richard B.
2003-01-01
Methylmercury is an environmental pollutant that biomagnifies in the aquatic food chain with severe consequences for humans and other animals. In an effort to remove this toxin in situ, we have been engineering plants that express the bacterial mercury resistance enzymes organomercurial lyase MerB and mercuric ion reductase MerA. In vivo kinetics experiments suggest that the diffusion of hydrophobic organic mercury to MerB limits the rate of the coupled reaction with MerA (Bizily et al., 2000). To optimize reaction kinetics for organic mercury compounds, the merB gene was engineered to target MerB for accumulation in the endoplasmic reticulum and for secretion to the cell wall. Plants expressing the targeted MerB proteins and cytoplasmic MerA are highly resistant to organic mercury and degrade organic mercury at 10 to 70 times higher specific activity than plants with the cytoplasmically distributed wild-type MerB enzyme. MerB protein in endoplasmic reticulum-targeted plants appears to accumulate in large vesicular structures that can be visualized in immunolabeled plant cells. These results suggest that the toxic effects of organic mercury are focused in microenvironments of the secretory pathway, that these hydrophobic compartments provide more favorable reaction conditions for MerB activity, and that moderate increases in targeted MerB expression will lead to significant gains in detoxification. In summary, to maximize phytoremediation efficiency of hydrophobic pollutants in plants, it may be beneficial to target enzymes to specific subcellular environments. PMID:12586871
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.
Rehm, Charlotte; Wurmthaler, Lena A.; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S.
2015-01-01
In prokaryotes, simple sequence repeats (SSRs) with unit sizes of 1–5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6–9 nt are rare. In particular, G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4-forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms, repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed, with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria. PMID:26695179
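A search for tandem GGGNATC repeats of the kind described above could be sketched with a regular expression. This is an illustration rather than the authors' pipeline: it scans only the given strand, assumes the repeat register starts at GGG, requires at least two consecutive units, and uses a made-up test sequence.

```python
import re

# N is expanded to any unambiguous base; {2,} requires at least two tandem units.
REPEAT = re.compile(r"(?:GGG[ACGT]ATC){2,}")
UNIT_LENGTH = 7

def find_repeat_tracts(seq):
    """Return (start, number_of_units) for each tandem GGGNATC tract on this strand."""
    seq = seq.upper()
    return [(m.start(), len(m.group()) // UNIT_LENGTH) for m in REPEAT.finditer(seq)]

toy = "TT" + "GGGAATC" * 4 + "ACGT" + "GGGTATCGGGCATC"
print(find_repeat_tracts(toy))  # [(2, 4), (34, 2)]
```

Counting tracts by number of units across a genome would give the kind of distribution in which the reported preference for four units could be seen; the opposite strand would need a second pass on the reverse complement.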
Sequence-selective binding of C8-conjugated pyrrolobenzodiazepines (PBDs) to DNA.
Basher, Mohammad A; Rahman, Khondaker Miraz; Jackson, Paul J M; Thurston, David E; Fox, Keith R
2017-11-01
DNA footprinting and melting experiments have been used to examine the sequence-specific binding of C8-conjugates of pyrrolobenzodiazepines (PBDs) and benzofused rings including benzothiophene and benzofuran, which are attached using pyrrole- or imidazole-containing linkers. The conjugates modulate the covalent attachment points of the PBDs, so that they bind best to guanines flanked by A/T-rich sequences on either the 5'- or 3'-side. The linker affects the binding, and pyrrole produces larger changes than imidazole. Melting studies with 14-mer oligonucleotide duplexes confirm covalent attachment of the conjugates, which show a different selectivity to anthramycin and reveal that more than one ligand molecule can bind to each duplex. Copyright © 2017 Elsevier B.V. All rights reserved.
Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry
2017-01-01
Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Because these endophytes are known for the beneficial characteristics they endow upon their grass hosts, their identification has been of great agronomic and scientific interest. Simple sequence repeat loci and the variation in their repeat elements have been used to rapidly identify endophyte species and strains; however, little is known of how the structure of repeat elements changes between species and strains, and where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.
Al Hammadi, Zulaikha M; Chu, Daniel K W; Eltahir, Yassir M; Al Hosani, Farida; Al Mulla, Mariam; Tarnini, Wasim; Hall, Aron J; Perera, Ranawaka A P M; Abdelkhalek, Mohamed M; Peiris, J S M; Al Muhairi, Salama S; Poon, Leo L M
2015-12-01
In May 2015 in the United Arab Emirates, asymptomatic Middle East respiratory syndrome coronavirus infection was identified through active case finding in 2 men with exposure to infected dromedaries. Epidemiologic and virologic findings suggested zoonotic transmission. Genetic sequences for viruses from the men and camels were similar to those for viruses recently detected in other countries.
Using pseudoalignment and base quality to accurately quantify microbial community composition
Novembre, John
2018-01-01
Pooled DNA from multiple unknown organisms arises in a variety of contexts, for example microbial samples from ecological or human health research. Determining the composition of pooled samples can be difficult, especially at the scale of modern sequencing data and reference databases. Here we propose a novel method for taxonomic profiling in pooled DNA that combines the speed and low-memory requirements of k-mer based pseudoalignment with a likelihood framework that uses base quality information to better resolve multiply mapped reads. We apply the method to the problem of classifying 16S rRNA reads using a reference database of known organisms, a common challenge in microbiome research. Using simulations, we show the method is accurate across a variety of read lengths, with different length reference sequences, at different sample depths, and when samples contain reads originating from organisms absent from the reference. We also assess performance in real 16S data, where we reanalyze previous genetic association data to show our method discovers a larger number of quantitative trait associations than other widely used methods. We implement our method in the software Karp, for k-mer based analysis of read pools, to provide a novel combination of speed and accuracy that is uniquely suited for enhancing discoveries in microbial studies. PMID:29659582
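The idea of resolving multiply mapped reads with base quality can be illustrated with a toy example. The sketch below is not Karp's implementation: it assumes ungapped reads of the same length as each reference segment, a crude "shares at least one k-mer" filter in place of true pseudoalignment, and equal prior abundance for every taxon. All function and taxon names are hypothetical.

```python
import math

def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    # Clamp so that log(1 - e) and log(e / 3) always stay finite.
    return min(max(10 ** (-q / 10), 1e-10), 0.75)

def read_log_likelihood(read, quals, ref):
    """Log-likelihood of a read given an ungapped, same-length reference."""
    ll = 0.0
    for base, q, ref_base in zip(read, quals, ref):
        e = phred_to_error(q)
        # A match has probability 1 - e; a mismatch spreads e over 3 bases.
        ll += math.log(1 - e) if base == ref_base else math.log(e / 3)
    return ll

def shares_kmer(read, ref, k):
    """Crude stand-in for pseudoalignment: any exact k-mer in common."""
    ref_kmers = {ref[i:i + k] for i in range(len(ref) - k + 1)}
    return any(read[i:i + k] in ref_kmers for i in range(len(read) - k + 1))

def reference_posteriors(read, quals, refs, k=4):
    """Posterior over candidate references, assuming equal priors."""
    lls = {name: read_log_likelihood(read, quals, seq)
           for name, seq in refs.items() if shares_kmer(read, seq, k)}
    if not lls:
        return {}
    top = max(lls.values())
    weights = {name: math.exp(ll - top) for name, ll in lls.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical references that differ only at the last base.
refs = {"taxon_A": "ACGTACGT", "taxon_B": "ACGTACGA"}
read = "ACGTACGA"
confident = reference_posteriors(read, [40] * 8, refs)       # all bases Q40
doubtful = reference_posteriors(read, [40] * 7 + [2], refs)  # last base Q2
```

When the distinguishing base is high quality, nearly all of the posterior goes to taxon_B; when it is low quality, the read is shared much more evenly between the two candidates. A full method would also estimate taxon abundances rather than assume equal priors.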
Catania, Francesco; Lynch, Michael
2010-05-04
In protozoa, the identification of preserved motifs by comparative genomics is often impeded by the difficulty of generating reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal gene expression in Paramecium; and 3) reveal a largely unexplored (and presumably not restricted to Paramecium) association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.
Systematic analysis of protein identity between Zika virus and other arthropod-borne viruses.
Chang, Hsiao-Han; Huber, Roland G; Bond, Peter J; Grad, Yonatan H; Camerini, David; Maurer-Stroh, Sebastian; Lipsitch, Marc
2017-07-01
To analyse the proportions of protein identity between Zika virus and dengue, Japanese encephalitis, yellow fever, West Nile and chikungunya viruses as well as polymorphism between different Zika virus strains. We used published protein sequences for the Zika virus and obtained protein sequences for the other viruses from the National Center for Biotechnology Information (NCBI) protein database or the NCBI virus variation resource. We used BLASTP to find regions of identity between viruses. We quantified the identity between the Zika virus and each of the other viruses, as well as within-Zika virus polymorphism for all amino acid k-mers across the proteome, with k ranging from 6 to 100. We assessed accessibility of protein fragments by calculating the solvent accessible surface area for the envelope and nonstructural-1 (NS1) proteins. In total, we identified 294 Zika virus protein fragments with both low proportion of identity with other viruses and low levels of polymorphisms among Zika virus strains. The list includes protein fragments from all Zika virus proteins, except NS3. NS4A has the highest number (190 k-mers) of protein fragments on the list. We provide a candidate list of protein fragments that could be used when developing a sensitive and specific serological test to detect previous Zika virus infections.
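A simplified way to think about k-mer identity between two proteins is the fraction of one protein's k-mers that also occur exactly in the other. The sketch below uses exact matching only, whereas the study above relied on BLASTP; the sequences and function names are toy values chosen for illustration.

```python
def kmers(seq, k):
    """All overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def shared_kmer_fraction(query, other, k):
    """Fraction of the query's k-mers that occur exactly in the other protein."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        return 0.0
    other_kmers = set(kmers(other, k))
    return sum(km in other_kmers for km in query_kmers) / len(query_kmers)

# Toy amino acid sequences (hypothetical, not taken from any virus).
query = "MKNNPIYSEGSL"
other = "AAANPIYSEGAA"
fraction = shared_kmer_fraction(query, other, k=6)
```

Fragments with a low shared fraction against every related virus would be the candidates for a specific serological test, which is the selection logic the abstract describes.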
Descriptive Statistics of the Genome: Phylogenetic Classification of Viruses.
Hernandez, Troy; Yang, Jie
2016-10-01
The typical process for classifying and submitting a newly sequenced virus to the NCBI database involves two steps. First, a BLAST search is performed to determine likely family candidates. That is followed by checking the candidate families with the pairwise sequence alignment tool for similar species. The submitter's judgment is then used to determine the most likely species classification. The aim of this article is to show that this process can be automated into a fast, accurate, one-step process using the proposed alignment-free method and properly implemented machine learning techniques. We present a new family of alignment-free vectorizations of the genome, the generalized vector, that maintains the speed of existing alignment-free methods while outperforming all available methods. This new alignment-free vectorization uses the frequency of genomic words (k-mers), as is done in the composition vector, and incorporates descriptive statistics of those k-mers' positional information, as inspired by the natural vector. We analyze five different characterizations of genome similarity using k-nearest neighbor classification and evaluate these on two collections of viruses totaling over 10,000 viruses. We show that our proposed method performs better than, or as well as, other methods at every level of the phylogenetic hierarchy. The data and R code are available upon request.
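The combination of k-mer frequency with positional statistics can be sketched as follows. This is only an illustration of the general idea: the choice of mean and population variance as the positional statistics, the 1-based positions, and the function name are assumptions, not the authors' exact definition of the generalized vector.

```python
from itertools import product
from statistics import mean, pvariance

def kmer_position_vector(seq, k, alphabet="ACGT"):
    """For each k-mer (in sorted order): count, mean position, position variance."""
    positions = {"".join(p): [] for p in product(alphabet, repeat=k)}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in positions:          # skip k-mers with ambiguous bases
            positions[kmer].append(i + 1)
    vector = []
    for kmer in sorted(positions):
        pos = positions[kmer]
        if pos:
            vector.extend([len(pos), mean(pos), pvariance(pos)])
        else:
            vector.extend([0, 0.0, 0.0])
    return vector

# With k = 1 this reduces to per-nucleotide counts and positional statistics.
vec = kmer_position_vector("ACAC", k=1)
```

Vectors built this way have a fixed length of 3 × 4^k regardless of genome size, so they can be fed directly to a k-nearest neighbor classifier.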
Chetta, M.; Drmanac, A.; Santacroce, R.; Grandone, E.; Surrey, S.; Fortina, P.; Margaglione, M.
2008-01-01
BACKGROUND: Standard methods of mutation detection are time-consuming in hemophilia A (HA), which makes them unavailable for some analyses such as prenatal diagnosis. OBJECTIVES: To evaluate the feasibility of combinatorial sequencing-by-hybridization (cSBH) as an alternative and reliable tool for mutation detection in the FVIII gene. PATIENTS/METHODS: We applied a new method of cSBH that uses two different colors for detection of multiple point mutations in the FVIII gene. The 26 exons encompassing the HA gene were analyzed in 7 newly diagnosed Italian patients and in 19 previously characterized individuals with FVIII deficiency. RESULTS: Data show that, when solution-phase TAMRA and QUASAR labeled 5-mer oligonucleotide sets mixed with unlabeled target PCR templates are co-hybridized in the presence of DNA ligase to universal 6-mer oligonucleotide probe-based arrays, a number of mutations can be successfully detected. The technique was also reliable in identifying a mutant FVIII allele in an obligate heterozygote. A novel missense mutation (Leu1843Thr) in exon 16 and three novel neutral polymorphisms are presented with an updated protocol for 2-color cSBH. CONCLUSIONS: cSBH is a reliable tool for mutation detection in the FVIII gene and may represent a complementary method for the genetic screening of HA patients. PMID:20300295
Suciu, Maria C.; Telenius, Jelena
2017-01-01
In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor binding. Sasquatch performs a comprehensive k-mer-based analysis of DNase footprints to determine any k-mer's potential for protein binding in a specific cell type and how this may be changed by sequence variants. Therefore, Sasquatch uses an unbiased approach, independent of known transcription factor binding sites and motifs. Sasquatch only requires a single DNase-seq data set per cell type, from any genotype, and produces consistent predictions from data generated by different experimental procedures and at different sequence depths. Here we demonstrate the effectiveness of Sasquatch using previously validated functional SNPs and benchmark its performance against existing approaches. Sasquatch is available as a versatile webtool incorporating publicly available data, including the human ENCODE collection. Thus, Sasquatch provides a powerful tool and repository for prioritizing likely regulatory SNPs in the noncoding genome. PMID:28904015
Sander, Adam F.; Lavstsen, Thomas; Rask, Thomas S.; Lisby, Michael; Salanti, Ali; Fordyce, Sarah L.; Jespersen, Jakob S.; Carter, Richard; Deitsch, Kirk W.; Theander, Thor G.; Pedersen, Anders Gorm; Arnot, David E.
2014-01-01
Many bacterial, viral and parasitic pathogens undergo antigenic variation to counter host immune defense mechanisms. In Plasmodium falciparum, the most lethal of human malaria parasites, switching of var gene expression results in alternating expression of the adhesion proteins of the Plasmodium falciparum-erythrocyte membrane protein 1 class on the infected erythrocyte surface. Recombination clearly generates var diversity, but the nature and control of the genetic exchanges involved remain unclear. By experimental and bioinformatic identification of recombination events and genome-wide recombination hotspots in var genes, we show that during the parasite’s sexual stages, ectopic recombination between isogenous var paralogs occurs near low folding free energy DNA 50-mers and that these sequences are heavily concentrated at the boundaries of regions encoding individual Plasmodium falciparum-erythrocyte membrane protein 1 structural domains. The recombinogenic potential of these 50-mers is not parasite-specific because these sequences also induce recombination when transferred to the yeast Saccharomyces cerevisiae. Genetic cross data suggest that DNA secondary structures (DSS) act as inducers of recombination during DNA replication in P. falciparum sexual stages, and that these DSS-regulated genetic exchanges generate functional and diverse P. falciparum adhesion antigens. DSS-induced recombination may represent a common mechanism for optimizing the evolvability of virulence gene families in pathogens. PMID:24253306
Shin, Soo-Yong; Seo, Dong-Woo; An, Jisun; Kwak, Haewoon; Kim, Sung-Han; Gwack, Jin; Jo, Min-Woo
2016-09-06
The Middle East respiratory syndrome coronavirus (MERS-CoV) was exported to Korea in 2015, resulting in a threat to neighboring nations. We evaluated the possibility of using a digital surveillance system based on web searches and social media data to monitor this MERS outbreak. We collected the number of daily laboratory-confirmed MERS cases and quarantined cases from May 11, 2015 to June 26, 2015 using the Korean government MERS portal. The daily trends observed via Google search and Twitter during the same time period were also ascertained using Google Trends and Topsy. Correlations among the data were then examined using Spearman correlation analysis. We found high correlations (>0.7) between Google search and Twitter results and the number of confirmed MERS cases for the previous three days using only four simple keywords: "MERS" and the Korean-language terms for "MERS", "MERS symptoms", and "MERS hospital". Additionally, we found high correlations between the Google search and Twitter results and the number of quarantined cases using the above keywords. This study demonstrates the possibility of using a digital surveillance system to monitor the outbreak of MERS.
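Correlating a search-volume series with case counts from earlier days amounts to a lagged Spearman correlation. The sketch below implements it with the standard library only; the daily numbers are invented for illustration and are not data from the study, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def lagged_spearman(signal, cases, lag):
    """Correlate signal on day t with cases on day t - lag."""
    if lag == 0:
        return spearman(signal, cases)
    return spearman(signal[lag:], cases[:-lag])

# Invented daily series: searches echo the case counts two days later.
cases = [0, 1, 3, 2, 5, 8, 6, 4]
searches = [10, 12, 15, 35, 75, 55, 115, 175]
rho_lag2 = lagged_spearman(searches, cases, lag=2)
```

Scanning `lag` over 0 to 3 days and comparing the coefficients against the 0.7 threshold mirrors the kind of analysis the abstract reports.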
2013-01-01
The bacterial merE gene derived from the Tn21 mer operon encodes a broad-spectrum mercury transporter that governs the transport of methylmercury and mercuric ions across bacterial cytoplasmic membranes, and this gene is a potential molecular tool for improving the efficiency of methylmercury phytoremediation. A transgenic Arabidopsis engineered to express MerE was constructed and the impact of expression of MerE on methylmercury accumulation was evaluated. The subcellular localization of transiently expressed GFP-tagged MerE was examined in Arabidopsis suspension-cultured cells. The GFP-MerE was found to localize to the plasma membrane and cytosol. The transgenic Arabidopsis expressing MerE accumulated significantly more methylmercury and mercuric ions than the wild-type Arabidopsis did. The transgenic plants expressing MerE were significantly more resistant to mercuric ions, but only showed more resistance to methylmercury, compared with the wild-type Arabidopsis. These results demonstrated that expression of the bacterial mercury transporter MerE promoted the transport and accumulation of methylmercury in transgenic Arabidopsis, which may be a useful method for improving plants to facilitate the phytoremediation of methylmercury pollution. PMID:24004544
Yuan, Junliang; Yang, Shuna; Wang, Shuangkun; Qin, Wei; Yang, Lei; Hu, Wenli
2017-05-25
Mild encephalitis/encephalopathy with reversible splenial lesion (MERS) is a rare clinico-radiological entity characterized by the magnetic resonance imaging (MRI) finding of a reversible lesion in the corpus callosum, sometimes involving the symmetrical white matter. Many cases of child-onset MERS with various causes have been reported. However, adult-onset MERS is relatively rare. The clinical characteristics and pathophysiological mechanisms of adult-onset MERS are not well understood. We reviewed the literature on adult-onset MERS in order to describe the characteristics of MERS in adults and to provide practical experience for clinicians. We report a case of adult-onset MERS with acute urinary retention and searched the PubMed and Web of Science databases to identify other adult-onset MERS reports from January 2004 to March 2016. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline was followed in the selection process. We then summarized the clinico-radiological features of adult-onset MERS. Twenty-nine adult-onset MERS cases, including our own, were reviewed from the available literature. Of these, 86.2% (25/29) were reported in Asia, especially in Japan. Ages ranged from 18 to 59 years, with a 12:17 female-to-male ratio. The major cause was viral or bacterial infection. Fever and headache were the most common clinical manifestations, and acute urinary retention was observed in 6 patients. All patients recovered completely within a month. Adult-onset MERS is an entity with a broad clinico-radiological spectrum because of the various underlying diseases and conditions. MERS in adults and in children share similar characteristics but also show some differences.
Kim, Sung-Han; Chang, So Young; Sung, Minki; Park, Ji Hoon; Bin Kim, Hong; Lee, Heeyoung; Choi, Jae-Phil; Choi, Won Suk; Min, Ji-Young
2016-08-01
The largest outbreak of Middle East respiratory syndrome coronavirus (MERS-CoV) outside the Middle East occurred in South Korea in 2015 and resulted in 186 laboratory-confirmed infections, including 36 (19%) deaths. Some hospitals were considered epicenters of infection and voluntarily shut down most of their operations after nearly half of all transmissions occurred in hospital settings. However, the ways that MERS-CoV is transmitted in healthcare settings are not well defined. We explored the possible contribution of contaminated hospital air and surfaces to MERS transmission by collecting air and swabbing environmental surfaces in 2 hospitals treating MERS-CoV patients. The samples were tested by viral culture with reverse transcription polymerase chain reaction (RT-PCR) and immunofluorescence assay (IFA) using MERS-CoV Spike antibody, and electron microscopy (EM). The presence of MERS-CoV was confirmed by RT-PCR of viral cultures of 4 of 7 air samples from 2 patients' rooms, 1 patient's restroom, and 1 common corridor. In addition, MERS-CoV was detected in 15 of 68 surface swabs by viral cultures. IFA on the cultures of the air and swab samples revealed the presence of MERS-CoV. EM images also revealed intact particles of MERS-CoV in viral cultures of the air and swab samples. These data provide experimental evidence for extensive viable MERS-CoV contamination of the air and surrounding materials in MERS outbreak units. Thus, our findings call for epidemiologic investigation of the possible scenarios for contact and airborne transmission, and raise concern regarding the adequacy of current infection control procedures. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved.
Mouse-adapted MERS coronavirus causes lethal lung disease in human DPP4 knockin mice.
Li, Kun; Wohlford-Lenane, Christine L; Channappanavar, Rudragouda; Park, Jung-Eun; Earnest, James T; Bair, Thomas B; Bates, Amber M; Brogden, Kim A; Flaherty, Heather A; Gallagher, Tom; Meyerholz, David K; Perlman, Stanley; McCray, Paul B
2017-04-11
The Middle East respiratory syndrome (MERS) emerged in Saudi Arabia in 2012, caused by a zoonotically transmitted coronavirus (CoV). Over 1,900 cases have been reported to date, with ∼36% fatality rate. Lack of autopsies from MERS cases has hindered understanding of MERS-CoV pathogenesis. A small animal model that develops progressive pulmonary manifestations when infected with MERS-CoV would advance the field. As mice are restricted to infection at the level of DPP4, the MERS-CoV receptor, we generated mice with humanized exons 10-12 of the mouse Dpp4 locus. Upon inoculation with MERS-CoV, human DPP4 knockin (KI) mice supported virus replication in the lungs, but developed no illness. After 30 serial passages through the lungs of KI mice, a mouse-adapted virus emerged (MERS MA) that grew in lungs to over 100 times higher titers than the starting virus. A plaque-purified MERS MA clone caused weight loss and fatal infection. Virus antigen was observed in airway epithelia, pneumocytes, and macrophages. Pathologic findings included diffuse alveolar damage with pulmonary edema and hyaline membrane formation associated with accumulation of activated inflammatory monocyte-macrophages and neutrophils in the lungs. Relative to the parental MERS-CoV, MERS MA viruses contained 13-22 mutations, including several within the spike (S) glycoprotein gene. S-protein mutations sensitized viruses to entry-activating serine proteases and conferred more rapid entry kinetics. Recombinant MERS MA bearing mutant S proteins were more virulent than the parental virus in hDPP4 KI mice. The hDPP4 KI mouse and the MERS MA provide tools to investigate disease causes and develop new therapies.
TRAP: automated classification, quantification and annotation of tandemly repeated sequences.
Sobreira, Tiago José P; Durham, Alan M; Gruber, Arthur
2006-02-01
TRAP, the Tandem Repeats Analysis Program, is a Perl program that provides a unified set of analyses for the selection, classification, quantification and automated annotation of tandemly repeated sequences. TRAP uses the results of the Tandem Repeats Finder program to perform a global analysis of the satellite content of DNA sequences, permitting researchers to easily assess the tandem repeat content for both individual sequences and whole genomes. The results can be generated in convenient formats such as HTML and comma-separated values. TRAP can also be used to automatically generate annotation data in the format of feature table and GFF files.
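Annotation output in GFF boils down to one tab-separated, nine-column line per feature. The sketch below formats a tandem repeat as a GFF3-style record; it is written in Python rather than Perl, it is not TRAP's code, and the `source` label and the `period` and `copies` attribute names are assumptions chosen for illustration.

```python
def repeat_to_gff3(seqid, start, end, period, copies, repeat_id, source="TRF"):
    """Format one tandem repeat as a nine-column, tab-separated GFF3 line."""
    # Custom attribute names are lowercase; capitalized names are reserved in GFF3.
    attributes = f"ID={repeat_id};period={period};copies={copies}"
    fields = [
        seqid,            # column 1: sequence identifier
        source,           # column 2: program that produced the feature
        "tandem_repeat",  # column 3: feature type (Sequence Ontology term)
        str(start),       # column 4: 1-based start
        str(end),         # column 5: 1-based inclusive end
        ".",              # column 6: score (not provided)
        ".",              # column 7: strand (not meaningful for a repeat)
        ".",              # column 8: phase (only used for CDS features)
        attributes,       # column 9: key=value pairs
    ]
    return "\t".join(fields)

# Hypothetical repeat: a 12-bp unit repeated 5 times on contig "chr1".
line = repeat_to_gff3("chr1", 1001, 1060, period=12, copies=5, repeat_id="TR0001")
```

A complete file would begin with the `##gff-version 3` directive followed by one such line per repeat.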
Faragher, S G; Dalgarno, L
1986-07-20
The 3' untranslated (UT) sequences of the genomic RNAs of five geographic variants of the alphavirus Ross River virus (RRV) were determined and compared with the 3' UT sequence of RRV T48, the prototype strain. Part of the 3' UT region of Getah virus, a close serological relative of RRV, was also sequenced. The RRV 3' UT region varies markedly in length between variants. Large deletions or insertions, sequence rearrangements and single nucleotide substitutions are observed. A sequence tract of 49 to 58 nucleotides, which is repeated as four blocks in the RRV T48 3' UT region, occurs only once in the 3' UT region of one RRV strain (NB5092), indicating that the existence of repeat sequence blocks is not essential for RRV replication. However, the precise sequence of the 3' proximal copy of the repeat block and its position relative to the poly(A) tail were identical in all RRV isolates examined, suggesting that it has an important role in RRV replication. Nucleotide substitutions between RRV variants are distributed non-randomly along the length of the 3' UT region. The sequence of 120 to 130 nucleotides adjacent to the poly(A) tail is strongly conserved. Getah virus RNA contains three repeat sequence blocks in the 3' UT region. These are similar in sequence to those in RRV RNA but differ in their arrangement. Homology between the RRV and Getah 3' UT sequences is greatest in the 3' proximal repeat sequence block that shows three differences in 49 nucleotides. The 3' proximal repeat in Getah RNA occurs at the same position, relative to the poly(A) tail, as in all RRV variants. The RRV and Getah virus 3' UT sequences show extensive homology in the region between the 3' proximal repeat and the poly(A) tail but, apart from the repeat blocks themselves, they show no significant homology elsewhere.
RHETT/EPDM Performance Characterization
NASA Technical Reports Server (NTRS)
Haag, T.; Osborn, M.
1998-01-01
The 0.6 kW Electric Propulsion Demonstration Module (EPDM) flight thruster system was tested in a large vacuum facility for performance measurements and functional checkout. The thruster was operated at a xenon flow rate of 3.01 mg/s, which was supplied through a self-contained propellant system. All power was provided through a flight-packaged power processing unit, which was mounted in vacuum on a cold plate. The thruster was cycled through 34 individual startup and shutdown sequences. Operating periods ranged from 3 to 3600 seconds. The system responded promptly to each command sequence and there were no involuntary shutdowns. Direct thrust measurements indicated that steady state thrust was temperature sensitive, and varied from a high of 41.7 mN at 16 C, to a low of 34.8 mN at 110 C. Short duration thruster firings showed rapid response and good repeatability.
Replication and shedding of MERS-CoV in Jamaican fruit bats (Artibeus jamaicensis)
Munster, Vincent J.; Adney, Danielle R.; van Doremalen, Neeltje; Brown, Vienna R.; Miazgowicz, Kerri L.; Milne-Price, Shauna; Bushmaker, Trenton; Rosenke, Rebecca; Scott, Dana; Hawkinson, Ann; de Wit, Emmie; Schountz, Tony; Bowen, Richard A.
2016-01-01
The emergence of Middle East respiratory syndrome coronavirus (MERS-CoV) highlights the zoonotic potential of Betacoronaviruses. Investigations into the origin of MERS-CoV have focused on two potential reservoirs: bats and camels. Here, we investigated the role of bats as a potential reservoir for MERS-CoV. In vitro, the MERS-CoV spike glycoprotein interacted with Jamaican fruit bat (Artibeus jamaicensis) dipeptidyl peptidase 4 (DPP4) receptor and MERS-CoV replicated efficiently in Jamaican fruit bat cells, suggesting there is no restriction at the receptor or cellular level for MERS-CoV. To shed light on the intrinsic host-virus relationship, we inoculated 10 Jamaican fruit bats with MERS-CoV. Although all bats showed evidence of infection, none of the bats showed clinical signs of disease. Virus shedding was detected in the respiratory and intestinal tract for up to 9 days. MERS-CoV replicated transiently in the respiratory and, to a lesser extent, the intestinal tracts and internal organs; with limited histopathological changes observed only in the lungs. Analysis of the innate gene expression in the lungs showed a moderate, transient induction of expression. Our results indicate that MERS-CoV maintains the ability to replicate in bats without clinical signs of disease, supporting the general hypothesis of bats as ancestral reservoirs for MERS-CoV. PMID:26899616
The MER/CIP Portal for Ground Operations
NASA Technical Reports Server (NTRS)
Chan, Louise; Desai, Sanjay; D'Ortenzio, Matthew; Filman, Robert E.; Heher, Dennis M.; Hubbard, Kim; Johan, Sandra; Keely, Leslie; Magapu, Vish; Mak, Ronald
2003-01-01
We developed the Mars Exploration Rover/Collaborative Information Portal (MER/CIP) to facilitate MER operations. MER/CIP provides a centralized, one-stop delivery platform integrating science and engineering data from several distributed heterogeneous data sources. Key issues for MER/CIP include: 1) Scheduling and schedule reminders; 2) Tracking the status of daily predicted outputs; 3) Finding and analyzing data products; 4) Collaboration; 5) Announcements; 6) Personalization.
Chu, Yu-Tseng; Wu, Joseph Tsung-Shu; Geng, Xingyi; Zhao, Na; Cheng, Wei; Chen, Enfu; King, Chwan-Chuen
2016-01-01
The largest nosocomial outbreak of Middle East respiratory syndrome (MERS) occurred in South Korea in 2015. Health care personnel (HCP) are at high risk of acquiring MERS coronavirus (MERS-CoV) infections, similar to the severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) infections first identified in 2003. This study described the similarities and differences in epidemiological and clinical characteristics of 183 confirmed global MERS cases and 98 SARS cases in Taiwan associated with HCP. The epidemiological findings showed that the mean ages of MERS-HCP and total MERS cases were 40 (24~74) and 49 (2~90) years, respectively, much older than those in SARS [SARS-HCP: 35 (21~68) years, p = 0.006; total SARS: 42 (0~94) years, p = 0.0002]. The case fatality rate (CFR) was much lower in MERS-HCP [7.03% (9/128)] and SARS-HCP [12.24% (12/98)] than in MERS-non-HCP [36.96% (34/92), p<0.001] and SARS-non-HCP [24.50% (61/249), p<0.001], respectively; however, no difference was found between MERS-HCP and SARS-HCP [p = 0.181]. In terms of clinical course, the days from onset to death [13 (4~17) vs 14.5 (0~52), p = 0.045] and to discharge [11 (5~24) vs 24 (0~74), p = 0.010] and the days of hospitalization [9.5 (3~22) vs 22 (0~69), p = 0.040] were much shorter in MERS-HCP than in SARS-HCP. Similarly, days from onset to confirmation were shorter in MERS-HCP than in MERS-non-HCP [6 (1~14) vs 10 (1~21), p = 0.044]. In conclusion, the severity of MERS-HCP and SARS-HCP was lower than that of MERS-non-HCP and SARS-non-HCP due to younger age and early confirmation in HCP groups. However, no statistical difference was found between MERS-HCP and SARS-HCP. Thus, prevention of nosocomial infections involving both novel coronaviruses is crucially important to protect HCP. PMID:26930074
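The case fatality rates quoted above follow directly from the death and case counts given in brackets. A minimal sketch that recomputes them, using only the counts stated in the abstract (the group labels and function name are illustrative):

```python
def case_fatality_rate(deaths, cases):
    """Case fatality rate as a percentage of confirmed cases."""
    return 100.0 * deaths / cases

# (deaths, cases) for each group, as reported in the abstract.
groups = {
    "MERS-HCP": (9, 128),
    "SARS-HCP": (12, 98),
    "MERS-non-HCP": (34, 92),
    "SARS-non-HCP": (61, 249),
}
cfr = {name: round(case_fatality_rate(d, n), 2) for name, (d, n) in groups.items()}
```

The recomputed values agree with the reported 7.03%, 12.24%, 36.96% and 24.50%. The p-values in the abstract come from significance tests that this sketch does not attempt to reproduce.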
El Bushra, Hassan E; Al Arbash, Hussain A; Mohammed, Mutaz; Abdalla, Osman; Abdallah, Mohamed N; Al-Mayahi, Zayid K; Assiri, Abdallah M; BinSaeed, Abdulaziz A
2017-05-01
The objective of this retrospective cohort study was to assess the impact of implementation of different levels of infection prevention and control (IPC) measures during an outbreak of Middle East respiratory syndrome (MERS) in a large tertiary hospital in Saudi Arabia. The setting was an emergency room (ER) in a large tertiary hospital and included primary and secondary MERS patients. Rapid response teams conducted repeated assessments of IPC and monitored implementation of corrective measures using a detailed structured checklist. We ascertained the epidemiologic link between patients and calculated the secondary attack rate per 10,000 patients visiting the ER (SAR/10,000) in 3 phases of the outbreak. In phase I, 6 primary cases gave rise to 48 secondary cases over 4 generations, including a case that resulted in 9 cases in the first generation of secondary cases and 21 cases over a chain of 4 generations. During the second and third phases, the number of secondary cases sharply dropped to 18 cases and 1 case, respectively, from a comparable number of primary cases. The SAR/10,000 dropped from 75 (95% confidence interval [CI], 55-99) in phase I to 29 (95% CI, 17-46) and 3 (95% CI, 0-17) in phases II and III, respectively. The study demonstrated salient evidence that proper institution of IPC measures during management of an outbreak of MERS could remarkably change the course of the outbreak. Copyright © 2017 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Elsevier Inc. All rights reserved.
Osawa, Takuo; Inanaga, Hideko; Numata, Tomoyuki
2015-06-01
Clustered regularly interspaced short palindromic repeat (CRISPR)-derived RNA (crRNA) and CRISPR-associated (Cas) proteins constitute a prokaryotic adaptive immune system (CRISPR-Cas system) that targets and degrades invading genetic elements. The type III-B CRISPR-Cas Cmr complex, composed of the six Cas proteins (Cmr1-Cmr6) and a crRNA, captures and cleaves RNA complementary to the crRNA guide sequence. Here, a Cmr1-deficient functional Cmr (CmrΔ1) complex composed of Pyrococcus furiosus Cmr2-Cmr3, Archaeoglobus fulgidus Cmr4-Cmr5-Cmr6 and the 39-mer P. furiosus 7.01-crRNA was prepared. The CmrΔ1 complex was cocrystallized with single-stranded DNA (ssDNA) complementary to the crRNA guide by the vapour-diffusion method. The crystals diffracted to 2.1 Å resolution using synchrotron radiation at the Photon Factory. The crystals belonged to the triclinic space group P1, with unit-cell parameters a = 75.5, b = 76.2, c = 139.2 Å, α = 90.3, β = 104.8, γ = 118.6°. The asymmetric unit of the crystals is expected to contain one CmrΔ1-ssDNA complex, with a Matthews coefficient of 2.03 Å³ Da⁻¹ and a solvent content of 39.5%.
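The unit-cell parameters and Matthews coefficient above are linked by two standard crystallographic relations: the triclinic cell volume and the approximation that solvent content is 1 − 1.23/V_M. The sketch below applies both. The constant 1.23 assumes a typical protein partial specific volume, so for a protein-nucleic acid complex the result is only approximate; the function names are illustrative.

```python
import math

def triclinic_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in cubic angstroms; lengths in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

def solvent_fraction(matthews_coefficient, protein_constant=1.23):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1 - protein_constant / matthews_coefficient

volume = triclinic_volume(75.5, 76.2, 139.2, 90.3, 104.8, 118.6)
# In space group P1 the whole cell is one asymmetric unit, so
# V_M = volume / mass gives the mass implied by the reported coefficient.
implied_mass_da = volume / 2.03
solvent = solvent_fraction(2.03)
```

With the reported parameters the cell volume comes out near 6.7 × 10⁵ Å³, which implies roughly 330 kDa per asymmetric unit, and the solvent estimate of about 39.4% is close to the 39.5% stated in the abstract.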
Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K
2017-04-01
There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
Tessé, Sophie; Bourbon, Henri-Marc; Debuchy, Robert; Budin, Karine; Dubois, Emeline; Liangran, Zhang; Antoine, Romain; Piolot, Tristan; Kleckner, Nancy; Zickler, Denise; Espagne, Eric
2017-01-01
Meiosis is the cellular program by which a diploid cell gives rise to haploid gametes for sexual reproduction. Meiotic progression depends on tight physical and functional coupling of recombination steps at the DNA level with specific organizational features of meiotic-prophase chromosomes. The present study reveals that every step of this coupling is mediated by a single molecule: Asy2/Mer2. We show that Mer2, identified so far only in budding and fission yeasts, is in fact evolutionarily conserved from fungi (Mer2/Rec15/Asy2/Bad42) to plants (PRD3/PAIR1) and mammals (IHO1). In yeasts, Mer2 mediates assembly of recombination–initiation complexes and double-strand breaks (DSBs). This role is conserved in the fungus Sordaria. However, functional analysis of 13 mer2 mutants and successive localization of Mer2 to axis, synaptonemal complex (SC), and chromatin revealed, in addition, three further important functions. First, after DSB formation, Mer2 is required for pairing by mediating homolog spatial juxtaposition, with implications for crossover (CO) patterning/interference. Second, Mer2 participates in the transfer/maintenance and release of recombination complexes to/from the SC central region. Third, after completion of recombination, potentially dependent on SUMOylation, Mer2 mediates global chromosome compaction and post-recombination chiasma development. Thus, beyond its role as a recombinosome–axis/SC linker molecule, Mer2 has important functions in relation to basic chromosome structure. PMID:29021238
Gregor, Ivan; Dröge, Johannes; Schirmer, Melanie; Quince, Christopher; McHardy, Alice C
2016-01-01
Background. Metagenomics is an approach for characterizing environmental microbial communities in situ; it allows their functional and taxonomic characterization and the recovery of sequences from uncultured taxa. This is often achieved by a combination of sequence assembly and binning, where sequences are grouped into 'bins' representing taxa of the underlying microbial community. Assignment to low-ranking taxonomic bins is an important challenge for binning methods, as is scalability to Gb-sized datasets generated with deep sequencing techniques. One of the best available methods for species bins recovery from deep-branching phyla is the expert-trained PhyloPythiaS package, where a human expert decides on the taxa to incorporate in the model and identifies 'training' sequences based on marker genes directly from the sample. Due to the manual effort involved, this approach does not scale to multiple metagenome samples and requires substantial expertise, which researchers who are new to the area do not have. Results. We have developed PhyloPythiaS+, a successor to our PhyloPythia(S) software. The new (+) component performs the work previously done by the human expert. PhyloPythiaS+ also includes a new k-mer counting algorithm, which accelerated the simultaneous counting of 4-6-mers used for taxonomic binning 100-fold and reduced the overall execution time of the software by a factor of three. Our software allows the analysis of Gb-sized metagenomes with inexpensive hardware and the recovery of species- or genus-level bins with low error rates in a fully automated fashion. PhyloPythiaS+ was compared to MEGAN, taxator-tk, Kraken and the generic PhyloPythiaS model. The results showed that PhyloPythiaS+ performs especially well for samples originating from novel environments in comparison to the other methods. Availability. PhyloPythiaS+ in a virtual machine is available for installation under Windows, Unix systems or OS X on: https://github.com/algbioi/ppsp/wiki.
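To make the idea of simultaneously counting 4-6-mers concrete, the following Python sketch counts every k-mer of length 4, 5 and 6 in a single pass over a sequence. It is a naive illustration only and is not the accelerated algorithm described in the abstract; the function name, the default k range and the choice to skip windows containing non-ACGT characters are all assumptions made for this example.

```python
from collections import Counter

def count_kmers(sequence, k_min=4, k_max=6):
    """Count all k-mers for k_min <= k <= k_max in one pass over the sequence.

    Windows containing characters other than A, C, G, T are skipped.
    Returns a dict mapping k to a Counter of k-mer occurrences.
    """
    seq = sequence.upper()
    valid = set("ACGT")
    counts = {k: Counter() for k in range(k_min, k_max + 1)}
    for start in range(len(seq)):
        for k in range(k_min, k_max + 1):
            kmer = seq[start:start + k]
            # Slices near the end of the sequence are shorter than k; ignore them.
            if len(kmer) == k and set(kmer) <= valid:
                counts[k][kmer] += 1
    return counts

profile = count_kmers("ACGTACGT")
for k, counter in profile.items():
    print(k, dict(counter))
```

Per-sequence count vectors of this kind are the sort of compositional feature a taxonomic binning model could be trained on, although the feature construction used by PhyloPythiaS+ itself is not specified here.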
Synthesis and structures of a pincer-type rhodium(iii) complex: reactivity toward biomolecules.
Milutinović, Milan M; Bogojeski, Jovana V; Klisurić, Olivera; Scheurer, Andreas; Elmroth, Sofi K C; Bugarčić, Živadin D
2016-10-04
A novel rhodium(iii) complex [RhIII(H2LtBu)Cl3] (1) (H2LtBu = 2,6-bis(5-tert-butyl-1H-pyrazol-3-yl)pyridine) containing a pincer type, tridentate nitrogen-donor chelate system was synthesized. Single crystal X-ray structure analysis revealed that 1 crystallizes in the orthorhombic space group Pbcn with a = 20.7982(6), b = 10.8952(4), c = 10.9832(4) Å, V = 2488.80(15) Å³, and eight molecules in the unit cell. The rhodium center in the complex [RhIII(H2LtBu)Cl3] (1) is coordinated in a slightly distorted octahedral geometry by the tridentate N,N,N-donor and three chloro ligands, adopting a mer arrangement with an essentially planar ligand skeleton. Due to the tridentate coordination of the N,N,N-donor, the central nitrogen atom N1 is located closer to the RhIII center. The reactivity of the synthesized complex toward small biomolecules (l-methionine (l-Met), guanosine-5'-monophosphate (5'-GMP), l-histidine (l-His) and glutathione (GSH)) and to a series of duplex DNAs and RNA was investigated. The order of reactivity of the studied small biomolecules is: 5'-GMP > GSH > l-Met > l-His. Duplex RNA reacts faster with the [RhIII(H2LtBu)Cl3] complex than duplex DNA, while shorter duplex DNA (15mer GG) reacts faster compared with 22mer GG duplex DNA. In addition, a higher reactivity is achieved with a DNA duplex with a centrally located GG-sequence than with a 22GTG duplex DNA, in which the GG-sequence is separated by a T base. Furthermore, the interaction of this metal complex 1 with calf thymus DNA (CT-DNA) and bovine serum albumin (BSA) was examined by absorption (UV-Vis) and emission spectral studies (EthBr displacement studies). Overall, the studied complex exhibited good DNA and BSA interaction ability.
A novel start codon mutation of the MERTK gene in a patient with retinitis pigmentosa
Jinda, Worapoj; Poungvarin, Naravat; Taylor, Todd D.; Suzuki, Yutaka; Thongnoppakhun, Wanna; Limwongse, Chanin; Lertrit, Patcharee; Suriyaphol, Prapat
2016-01-01
Purpose: Retinitis pigmentosa (RP) is a clinically and genetically heterogeneous group of inherited retinal degenerations characterized by progressive loss of photoreceptor cells and RPE functions. More than 70 causative genes are known to be responsible for RP. This study aimed to identify the causative gene in a patient from a consanguineous family with childhood-onset severe retinal dystrophy. Methods: To identify the defective gene, whole exome sequencing was performed. Candidate causative variants were selected and validated using Sanger sequencing. Segregation analysis of the causative gene was performed in additional family members. To verify that the mutation has an effect on protein synthesis, an expression vector containing the first ten amino acids of the mutant protein fused with the DsRed2 fluorescent protein was constructed and transfected into HEK293T cells. Expression of the fusion protein in the transfected cells was measured using fluorescence microscopy. Results: By filtering against public variant databases, a novel homozygous missense mutation (c.3G>A) localized in the start codon of the MERTK gene was detected as a potentially pathogenic mutation for autosomal recessive RP. The c.3G>A mutation cosegregated with the disease phenotype in the family. No expression of the first ten amino acids of the MerTK mutant fused with the DsRed2 fluorescent protein was detected in HEK293T cells, indicating that the mutation affects the translation initiation site of the gene, which may lead to loss of function of the MerTK signaling pathway. Conclusions: We report a novel missense mutation (c.3G>A, p.0?) in the MERTK gene that causes severe vision impairment in a patient. Taken together with previous reports, our results expand the spectrum of MERTK mutations and extend our understanding of the role of the MerTK protein in the pathogenesis of retinitis pigmentosa. PMID:27122965
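The effect of a c.3G>A change on the start codon can be illustrated with a few lines of Python. The sketch below applies a single-base substitution at coding position 3 and shows that the initiator codon ATG becomes ATA, which encodes isoleucine in the standard genetic code and is not the canonical initiation codon. The coding sequence used is a made-up toy string, not the MERTK sequence, and the helper name is hypothetical.

```python
# Only the two codons needed for this illustration (standard genetic code).
CODON_TO_AA = {"ATG": "Met (canonical start)", "ATA": "Ile"}

def apply_substitution(cds, position, ref, alt):
    """Apply a c.<position><ref>><alt> substitution to a coding sequence.

    Positions are 1-based, as in coding-DNA variant notation.
    """
    index = position - 1
    if cds[index] != ref:
        raise ValueError(f"expected {ref} at c.{position}, found {cds[index]}")
    return cds[:index] + alt + cds[index + 1:]

# Hypothetical toy coding sequence; NOT the real MERTK sequence.
toy_cds = "ATGGGGCCC"
mutant_cds = apply_substitution(toy_cds, 3, "G", "A")

print("wild-type first codon:", toy_cds[:3], "->", CODON_TO_AA[toy_cds[:3]])
print("mutant first codon:   ", mutant_cds[:3], "->", CODON_TO_AA[mutant_cds[:3]])
```

Because the consequence at the protein level depends on whether translation initiates elsewhere, variants of this kind are written p.0? in the abstract, and the reporter experiment described there is what supports loss of initiation.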
Engineered Single-Chain, Antiparallel, Coiled Coil Mimics the MerR Metal Binding Site
Song, Lingyun; Caguiat, Jonathan; Li, Zhongrui; Shokes, Jacob; Scott, Robert A.; Olliff, Lynda; Summers, Anne O.
2004-01-01
The repressor-activator MerR that controls transcription of the mercury resistance (mer) operon is unusual for its high sensitivity and specificity for Hg(II) in in vivo and in vitro transcriptional assays. The metal-recognition domain of MerR resides at the homodimer interface in a novel antiparallel arrangement of α-helix 5 that forms a coiled-coil motif. To facilitate the study of this novel metal binding motif, we assembled this antiparallel coiled coil into a single chain by directly fusing two copies of the 48-residue α-helix 5 of MerR. The resulting 107-residue polypeptide, called the metal binding domain (MBD), and wild-type MerR were overproduced and purified, and their metal-binding properties were determined in vivo and in vitro. In vitro MBD bound ca. 1.0 equivalent of Hg(II) per pair of binding sites, just as MerR does, and it showed only a slightly lower affinity for Hg(II) than did MerR. Extended X-ray absorption fine structure data showed that MBD has essentially the same Hg(II) coordination environment as MerR. In vivo, cells overexpressing MBD accumulated 70 to 100% more 203Hg(II) than cells bearing the vector alone, without deleterious effects on cell growth. Both MerR and MBD variously bound other thiophilic metal ions, including Cd(II), Zn(II), Pb(II), and As(III), in vitro and in vivo. We conclude that (i) it is possible to simulate in a single polypeptide chain the in vitro and in vivo metal-binding ability of dimeric, full-length MerR and (ii) MerR's specificity in transcriptional activation does not reside solely in the metal-binding step. PMID:14996817
Shah, H N; Gharbia, S E; Scully, C; Finegold, S M
1995-03-01
Eight oligonucleotides based upon regions of the small subunit 16S ribosomal RNA gene sequences were analysed against a background of their position within the molecule and their two-dimensional structure to rationalise their use in recognising Prevotella intermedia and Prevotella nigrescens. The 41 clinical isolates from both oral and respiratory sites and two reference strains were subjected to DNA-DNA hybridisation and multilocus enzyme electrophoresis to confirm their identity. Alignment of oligonucleotide probes designated 1Bi-2 to 1Bi-6 (for P. intermedia) and 2Bi-2 (for P. nigrescens) with the 16S rRNA suggested that these probes lacked specificity or were constructed from hypervariable regions. A 52-mer oligonucleotide (designated Bi) reliably detected both species. Because of the high degree of concordance between the 16S rRNAs of both species, it was necessary to vary the stringency of hybridisation conditions for detection of both species. Thus probe 1Bi-1 recognised P. intermedia while 1Bi-1 detected both P. intermedia and P. nigrescens at low stringency. However, under conditions of high stringency only P. nigrescens was recognised by probe 2Bi-1. These probes were highly specific and did not hybridise with DNA from the closely related P. corporis, nor other periodontal pathogens such as Fusobacterium nucleatum, Actinobacillus actinomycetemcomitans, Treponema denticola and several pigmented species such as Prevotella melaninogenica, P. denticola, P. loescheii, Porphyromonas asaccharolytica, Py. endodontalis, Py. gingivalis, Py. levii, and Py. macacae.
NASA Astrophysics Data System (ADS)
Nasution, M. A. F.; Azzuhdi, M. G.; Tambunan, U. S. F.
2017-07-01
Middle East respiratory syndrome coronavirus (MERS-CoV) is responsible for a current outbreak. MERS-CoV infection causes illness of the respiratory and digestive systems and can lead to death, with an average mortality reaching 50%. To date, there is no effective vaccine or drug against MERS-CoV infection. Papain-like protease (PLpro) is responsible for cleavage of a nonstructural protein that is essential for viral maturation. Inhibition of PLpro with a ligand will block the cleavage of the nonstructural protein and thus reduce MERS-CoV infection. Through a bioinformatics study using molecular docking and binding interaction analysis of commercial cyclic peptides, aldosterone secretion inhibiting factor (1-35) (bovine) was identified as an inhibitor of PLpro. Thus, aldosterone secretion inhibiting factor (1-35) (bovine) has potential as a novel drug candidate for treating MERS-CoV infection.
Middle East respiratory syndrome in children. Dental considerations.
Al-Sehaibany, Fares S
2017-04-01
As of January 2016, 1,633 laboratory-confirmed cases of Middle East Respiratory Syndrome Coronavirus (MERS-CoV) infection and 587 MERS-related deaths have been reported by the World Health Organization globally. Middle East Respiratory Syndrome Coronavirus may occur sporadically in communities or may be transmitted within families or hospitals. The number of confirmed MERS-CoV cases among healthcare workers has been increasing. Middle East Respiratory Syndrome Coronavirus may also spread through aerosols generated during various dental treatments, resulting in transmission between patients and dentists. As MERS-CoV cases have also been reported among children, pediatric dentists are at risk of MERS-CoV infection. This review discusses MERS-CoV infection in children and healthcare workers, especially pediatric dentists, and considerations pertaining to pediatric dentistry. Although no cases of MERS-CoV transmission between a patient and a dentist have yet been reported, the risk of MERS-CoV transmission from an infected patient may be high due to the unique work environment of dentists (aerosol generation).
Methods for sequencing GC-rich and CCT repeat DNA templates
Robinson, Donna L.
2007-02-20
The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes
A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt
2000-01-01
Abstract: The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
USDA-ARS's Scientific Manuscript database
Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...
NASA Technical Reports Server (NTRS)
Hepp, Aloysius F.; Clark, Eric B.; Schupp, John D.; Williams, Jennifer N.; Duraj, Stan A.; Fanwick, Philip E.
2013-01-01
We describe the structures of four related indium complexes obtained during synthesis of solid-state materials precursors. Indium adducts of halides and 4-methylpyridine, InX3(pic)3 (X = Cl, Br; pic = 4-methylpyridine) consist of octahedral molecules with meridional (mer) geometry. Crystals of mer-InCl3(pic)3 (1) are triclinic, space group P-1 (No. 2), with a = 9.3240(3), b = 13.9580(6), c = 16.7268(7) Å, α = 84.323(2), β = 80.938(2), γ = 78.274(3)°, Z = 4, R = 0.035 for 8820 unique reflections. Crystals of mer-InBr3(pic)3 (2) are monoclinic, space group P21/n (No. 14), with a = 15.010(2), b = 19.938(2), c = 16.593(3) Å, β = 116.44(1)°, Z = 8, R = 0.053 for 4174 unique reflections. The synthesis and structures of related compounds with phenylsulfide (chloride) (3) and a dimeric complex with bridging hydroxide (bromide) (4) coordination are also described. Crystals of trans-In(SC6H5)Cl2(pic)3 (3) are monoclinic, space group P21/n (No. 14), with a = 9.5265(2), b = 17.8729(6), c = 13.8296(4) Å, β = 99.7640(15)°, Z = 4, R = 0.048 for 5511 unique reflections. Crystals of [In(μ-OH)Br2(pic)2]2 (4) are tetragonal, space group I41cd (No. 110), with a = 19.8560(4), b = 19.8560(4), c = 25.9528(6) Å, Z = 8, R = 0.039 for 5982 unique reflections.
Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin
2015-04-01
This study aimed to explore the features of clustered regularly interspaced short palindromic repeats (CRISPR) structures in Shigella by using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in all four groups of Shigella, and that the flanking sequences upstream of the CRISPRs could be classified into the same groups as those downstream. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences and could be classified into two different types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and that the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M
1996-08-01
DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
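Counting copies of a short element such as TTTAGGG within a cloned repeat can be sketched as follows. The Python example reports both the total number of exact, non-overlapping copies and the longest uninterrupted tandem run. It is an illustration under stated assumptions: the function name and the toy input are invented, and because the abstract notes that the elements are highly mutated, exact matching of this kind would undercount degenerate copies in real sequence.

```python
import re

def count_tandem_copies(sequence, motif="TTTAGGG"):
    """Return (total exact copies, longest run of consecutive exact copies)."""
    seq = sequence.upper()
    pattern = re.escape(motif)
    # Non-overlapping exact occurrences anywhere in the sequence.
    total = len(re.findall(pattern, seq))
    # Maximal stretches made only of back-to-back copies of the motif.
    runs = re.findall(f"(?:{pattern})+", seq)
    longest = max((len(run) // len(motif) for run in runs), default=0)
    return total, longest

# Toy sequence: three tandem copies, a two-base interruption, one more copy.
toy = "AC" + "TTTAGGG" * 3 + "GT" + "TTTAGGG"
total_copies, longest_run = count_tandem_copies(toy)
print(total_copies, longest_run)
```

A tolerant version would allow mismatches per copy, for example by scoring each 7-bp window against the consensus, but the threshold for calling a mutated copy is a judgment the abstract does not specify.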
NASA Technical Reports Server (NTRS)
Sayfi, Elias
2004-01-01
MER SPICE Interface is a software module for use in conjunction with the Mars Exploration Rover (MER) mission and the SPICE software system of the Navigation and Ancillary Information Facility (NAIF) at NASA's Jet Propulsion Laboratory. (SPICE is used to acquire, record, and disseminate engineering, navigational, and other ancillary data describing circumstances under which data were acquired by spaceborne scientific instruments.) Given a Spacecraft Clock value, MER SPICE Interface extracts MER-specific data from SPICE kernels (essentially, raw data files) and calculates values for Planet Day Number, Local Solar Longitude, Local Solar Elevation, Local Solar Azimuth, and Local Solar Time (UTC). MER SPICE Interface was adapted from a subroutine, denoted m98SpiceIF written by Payam Zamani, that was intended to calculate SPICE values for the Mars Polar Lander. The main difference between MER SPICE Interface and m98SpiceIf is that MER SPICE Interface does not explicitly call CHRONOS, a time-conversion program that is part of a library of utility subprograms within SPICE. Instead, MER SPICE Interface mimics some portions of the CHRONOS code, the advantage being that it executes much faster and can efficiently be called from a pipeline of events in a parallel processing environment.
Lyyra, Satu; Meagher, Richard B; Kim, Tehryung; Heaton, Andrew; Montello, Paul; Balish, Rebecca S; Merkle, Scott A
2007-03-01
Eastern cottonwood (Populus deltoides Bartr. ex Marsh.) trees were engineered to express merA (mercuric ion reductase) and merB (organomercury lyase) transgenes in order to be used for the phytoremediation of mercury-contaminated soils. Earlier studies with Arabidopsis thaliana and Nicotiana tabacum showed that this gene combination resulted in more efficient detoxification of organomercurial compounds than did merB alone, but neither species is optimal for long-term field applications. Leaf discs from in vitro-grown merA, nptII (neomycin phosphotransferase) transgenic cottonwood plantlets were inoculated with Agrobacterium tumefaciens strain C58 carrying the merB and hygromycin resistance (hptII) genes. Polymerase chain reaction of shoots regenerated from the leaf discs under selection indicated an overall transformation frequency of 20%. Western blotting of leaves showed that MerA and MerB proteins were produced. In vitro-grown merA/merB plants were highly resistant to phenylmercuric acetate, and detoxified organic mercury compounds two to three times more rapidly than did controls, as shown by mercury volatilization assay. This indicates that these cottonwood trees are reasonable candidates for the remediation of organomercury-contaminated sites.
Johnson, Reed F; Bagci, Ulas; Keith, Lauren; Tang, Xianchun; Mollura, Daniel J; Zeitlin, Larry; Qin, Jing; Huzella, Louis; Bartos, Christopher J; Bohorova, Natasha; Bohorov, Ognian; Goodman, Charles; Kim, Do H; Paulty, Michael H; Velasco, Jesus; Whaley, Kevin J; Johnson, Joshua C; Pettitt, James; Ork, Britini L; Solomon, Jeffrey; Oberlander, Nicholas; Zhu, Quan; Sun, Jiusong; Holbrook, Michael R; Olinger, Gene G; Baric, Ralph S; Hensley, Lisa E; Jahrling, Peter B; Marasco, Wayne A
2016-03-01
Middle East Respiratory Syndrome Coronavirus (MERS-CoV) was identified in 2012 as the causative agent of a severe, lethal respiratory disease occurring across several countries in the Middle East. To date there have been over 1600 laboratory confirmed cases of MERS-CoV in 26 countries with a case fatality rate of 36%. Given the endemic region, it is possible that MERS-CoV could spread during the annual Hajj pilgrimage, necessitating countermeasure development. In this report, we describe the clinical and radiographic changes of rhesus monkeys following infection with 5×10⁶ PFU MERS-CoV Jordan-n3/2012. Two groups of NHPs were treated with either a human anti-MERS monoclonal antibody 3B11-N or E410-N, an anti-HIV antibody. MERS-CoV Jordan-n3/2012 infection resulted in quantifiable changes by computed tomography, but limited other clinical signs of disease. 3B11-N treated subjects developed significantly reduced lung pathology when compared to infected, untreated subjects, indicating that this antibody may be a suitable MERS-CoV treatment. Published by Elsevier Inc.
Lee-Sherick, Alisa B.; Zhang, Weihe; Menachof, Kelly K.; Hill, Amanda A.; Rinella, Sean; Kirkpatrick, Gregory; Page, Lauren S.; Stashko, Michael A.; Jordan, Craig T.; Wei, Qi; Liu, Jing; Zhang, Dehui; DeRyckere, Deborah; Wang, Xiaodong; Frye, Stephen; Earp, H. Shelton; Graham, Douglas K.
2015-01-01
Mer and Flt3 receptor tyrosine kinases have been implicated as therapeutic targets in acute myeloid leukemia (AML). In this manuscript we describe UNC1666, a novel ATP-competitive small molecule tyrosine kinase inhibitor, which potently diminishes Mer and Flt3 phosphorylation in AML. Treatment with UNC1666 mediated biochemical and functional effects in AML cell lines expressing Mer or Flt3 internal tandem duplication (ITD), including decreased phosphorylation of Mer, Flt3 and downstream effectors Stat, Akt and Erk, induction of apoptosis in up to 98% of cells, and reduction of colony formation by greater than 90%, compared to treatment with vehicle. These effects were dose-dependent, with inhibition of downstream signaling and functional effects correlating with the degree of Mer or Flt3 kinase inhibition. Treatment of primary AML patient samples expressing Mer and/or Flt3-ITD with UNC1666 also inhibited Mer and Flt3 intracellular signaling, induced apoptosis, and inhibited colony formation. In summary, UNC1666 is a novel potent small molecule tyrosine kinase inhibitor that decreases oncogenic signaling and myeloblast survival, thereby validating dual Mer/Flt3 inhibition as an attractive treatment strategy for AML. PMID:25762638
Role of molecular mimicry to HIV-1 peptides in HIV-1–related immunologic thrombocytopenia
Li, Zongdong; Nardi, Michael A.; Karpatkin, Simon
2005-01-01
Patients with early HIV-1 infection develop an autoimmune thrombocytopenia in which antibody is directed against an immunodominant epitope of the β3 (glycoprotein IIIa [GPIIIa]) integrin, GPIIIa49-66. This antibody induces thrombocytopenia by a novel complement-independent mechanism in which platelets are fragmented by antibody-induced generation of H2O2 derived from the interaction of platelet nicotinamide adenine dinucleotide phosphate (NADPH) oxidase and 12-lipoxygenase. To examine whether sharing of epitope between host and parasite may be responsible for this immunodominant epitope, we screened for antibody-reactive peptides capable of inhibiting platelet lysis and oxidation in vitro, using a filamentous phage display 7-mer peptide library. Fourteen of these phage-peptide clones were identified. Five shared close sequence similarity with GPIIIa49-66, as expected. Ten were molecular mimics with close sequence similarity to HIV-1 proteins nef, gag, env, and pol. Seven were synthesized as 10-mers from their known HIV-1 sequence and found to inhibit anti–GPIIIa49-66–induced platelet oxidation/fragmentation in vitro. Three rabbit antibodies raised against these peptides induced platelet oxidation/fragmentation in vitro and thrombocytopenia in vivo when passively transferred into mice. One of the peptides shared a known epitope region with HIV-1 protein nef and was derived from a variant region of the protein. These data provide strong support for molecular mimicry in HIV-1-immunologic thrombocytopenia within polymorphic regions of HIV-1 proteins. A known epitope of nef is particularly incriminated. PMID:15774614
An improved filtering algorithm for big read datasets and its application to single-cell assembly.
Wedemeyer, Axel; Kliemann, Lasse; Srivastav, Anand; Schielke, Christian; Reusch, Thorsten B; Rosenstiel, Philip
2017-07-03
For single-cell or metagenomic sequencing projects, it is necessary to sequence with a very high mean coverage in order to make sure that all parts of the sample DNA get covered by the reads produced. This leads to huge datasets with lots of redundant data. A filtering of this data prior to assembly is advisable. Brown et al. (2012) presented the algorithm Diginorm for this purpose, which filters reads based on the abundance of their k-mers. We present Bignorm, a faster and quality-conscious read filtering algorithm. An important new algorithmic feature is the use of phred quality scores together with a detailed analysis of the k-mer counts to decide which reads to keep. We qualify and recommend parameters for our new read filtering algorithm. Guided by these parameters, we remove a median of 97.15% of the reads while keeping the mean phred score of the filtered dataset high. Using the SPAdes assembler, we produce assemblies of high quality from these filtered datasets in a fraction of the time needed for an assembly from the datasets filtered with Diginorm. We conclude that read filtering is a practical and efficient method for reducing read data and for speeding up the assembly process. This applies not only to single-cell assembly, as shown in this paper, but also to other projects with high mean coverage datasets, such as metagenomic sequencing projects. Our Bignorm algorithm allows assemblies of competitive quality in comparison to Diginorm, while being much faster. Bignorm is available for download at https://git.informatik.uni-kiel.de/axw/Bignorm .
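The k-mer abundance filtering idea that this abstract builds on can be sketched in a few lines of Python. The example below follows the general scheme of digital normalization: a read is kept only if the median count of its k-mers, among reads already kept, is below a cutoff. This is a simplified illustration and not the Bignorm or Diginorm implementation: it uses an exact in-memory counter instead of a probabilistic counting structure, it ignores phred quality scores (the feature Bignorm adds), and the function names, k and cutoff values are assumptions chosen for a tiny example.

```python
from collections import Counter
from statistics import median

def kmers(read, k):
    """All overlapping k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def filter_reads(reads, k=4, cutoff=2):
    """Keep a read only while its median k-mer count is below the cutoff."""
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k carries no k-mer information
        if median(counts[kmer] for kmer in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)  # only kept reads contribute to coverage
    return kept

# Five redundant copies of one read plus one read from a different region.
reads = ["ACGTTGCAAG"] * 5 + ["TTTTGGGGCC"]
kept_reads = filter_reads(reads)
print(len(reads), "reads in,", len(kept_reads), "reads kept")
```

In this toy input the redundant read is retained only until its k-mers reach the cutoff, while the read from the otherwise uncovered region is always kept, which is the behaviour that lets such filters discard most reads from high-coverage data without losing low-coverage regions.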
Ishibashi, J; Saido-Sakanaka, H; Yang, J; Sagisaka, A; Yamakawa, M
1999-12-01
A novel member of the insect defensins, a family of antibacterial peptides, was purified from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros, immunized with Escherichia coli. A full-size cDNA was cloned by combining reverse-transcription PCR (RT-PCR), and 5'- and 3'-rapid amplification of cDNA ends (RACE). Analysis of the O. rhinoceros defensin gene expression showed it to be expressed in the fat body and hemocyte, midgut and Malpighian tubules. O. rhinoceros defensin showed strong antibacterial activity against Staphylococcus aureus. A 9-mer peptide amidated at its C-terminus, AHCLAICRK-NH2 (Ala22-Lys30-NH2), was synthesized based on the deduced amino-acid sequence, assumed to be an active site sequence by analogy with the sequence of a defensin isolated from larvae of the beetle Allomyrina dichotoma. This peptide showed antibacterial activity against S. aureus, methicillin-resistant S. aureus, E. coli and Pseudomonas aeruginosa. We further modified this oligopeptide and synthesized five 9-mer peptides, ALRLAIRKR-NH2, ALLLAIRKR-NH2, AWLLAIRKR-NH2, ALYLAIRKR-NH2 and ALWLAIRKR-NH2. These oligopeptides showed strong antibacterial activity against Gram-negative and Gram-positive bacteria. The antibacterial effect of Ala22-Lys30-NH2 analogues was due to its interaction with bacterial membranes, judging from the leakage of liposome-entrapped glucose. These Ala22-Lys30-NH2 analogues did not show haemolytic activity and did not inhibit the growth of murine fibroblast cells or macrophages, except for AWLLAIRKR-NH2.
Oligonucleotide facilitators may inhibit or activate a hammerhead ribozyme.
Jankowsky, E; Schwenzer, B
1996-01-01
Facilitators are oligonucleotides capable of affecting hammerhead ribozyme activity by interacting with the substrate at the termini of the ribozyme. Facilitator effects were determined in vitro using a system consisting of a ribozyme with 7 nucleotides in every stem sequence and two substrates with inverted facilitator binding sequences. The effects of 9mer and 12mer RNA as well as DNA facilitators which bind either adjacent to the 3'- or 5'-end of the ribozyme were investigated. A kinetic model was developed which allows determination of the apparent dissociation constant of the ribozyme-substrate complex from single turnover reactions. We observed a decreased dissociation constant of the ribozyme-substrate complex due to facilitator addition corresponding to an additional stabilization energy of ΔΔG = -1.7 kcal/mol with 3'-end facilitators. The cleavage rate constant was increased by 3'-end facilitators and decreased by 5'-end facilitators. Values for Km were slightly lowered by all facilitators and kcat was increased by 3'-end facilitators and decreased by 5'-end facilitators in our system. Generally the facilitator effects increased with the length of the facilitators and RNA provided greater effects than DNA of the same sequence. Results suggest facilitator influences on several steps of the hammerhead reaction, substrate association, cleavage and dissociation of products. Moreover, these effects are dependent in different manners on ribozyme and substrate concentration. This leads to the conclusion that there is a concentration dependence whether activation or inhibition is caused by facilitators. Conclusions are drawn with regard to the design of hammerhead ribozyme facilitator systems. PMID:8602353
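The stabilization energy quoted in this abstract can be translated into a fold change in the dissociation constant with the standard relation Kd(with)/Kd(without) = exp(ΔΔG/RT). The Python sketch below performs that conversion. The temperature is an assumption, since the abstract does not state the reaction temperature; 37 °C is used here only as a plausible default, and the function name is illustrative.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol K)

def kd_ratio(delta_delta_g_kcal, temperature_k=310.15):
    """Ratio Kd(with facilitator) / Kd(without) for a given stabilization energy.

    A negative value (extra stabilization) gives a ratio below 1,
    meaning tighter ribozyme-substrate binding.
    """
    return math.exp(delta_delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))

# Value from the abstract; the temperature is assumed, not reported there.
ratio = kd_ratio(-1.7)
print(f"Kd ratio: {ratio:.3f} (about {1 / ratio:.0f}-fold tighter binding)")
```

Under this assumption the reported -1.7 kcal/mol corresponds to a decrease in the apparent dissociation constant of somewhat more than one order of magnitude; a different assay temperature would shift the figure modestly.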
Stability of Tandem Repeats in the Drosophila Melanogaster HSR-Omega Nuclear RNA
Hogan, N. C.; Slot, F.; Traverse, K. L.; Garbe, J. C.; Bendena, W. G.; Pardue, M. L.
1995-01-01
The Drosophila melanogaster Hsr-omega locus produces a nuclear RNA containing >5 kb of tandem repeat sequences. These repeats are unique to Hsr-omega and show concerted evolution similar to that seen with classical satellite DNAs. In D. melanogaster the monomer is ~280 bp. Sequences of 191/2 monomers differ by 8 +/- 5% (mean +/- SD), when all pairwise comparisons are considered. Differences are single nucleotide substitutions and 1-3 nucleotide deletions/insertions. Changes appear to be randomly distributed over the repeat unit. Outer repeats do not show the decrease in monomer homogeneity that might be expected if homogeneity is maintained by recombination. However, just outside the last complete repeat at each end, there are a few fragments of sequence similar to the monomer. The sequences in these flanking regions are not those predicted for sequences decaying in the absence of recombination. Instead, the fragmentation of the sequence homology suggests that flanking regions have undergone more severe disruptions, possibly during an insertion or amplification event. Hsr-omega alleles differing in the number of repeats are detected and appear to be stable over a few thousand generations; however, both increases and decreases in repeat numbers have been observed. The new alleles appear to be as stable as their predecessors. No alleles of less than ~5 kb nor more than ~16 kb of repeats were seen in any stocks examined. The evidence that there is a limit on the minimum number of repeats is consistent with the suggestion that these repeats are important in the function of the unusual Hsr-omega nuclear RNA. PMID:7540581
Human Centered Design and Development for NASA's MerBoard
NASA Technical Reports Server (NTRS)
Trimble, Jay
2003-01-01
This viewgraph presentation provides an overview of the design and development process for NASA's MerBoard. These devices are large interactive display screens which can be shown on the user's computer, which will allow scientists in many locations to interpret and evaluate mission data in real-time. These tools are scheduled to be used during the 2003 Mars Exploration Rover (MER) expeditions. Topics covered include: mission overview, Mer Human Centered Computers, FIDO 2001 observations and MerBoard prototypes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Reed F., E-mail: johnsonreed@mail.nih.gov; Via, Laura E.; Kumar, Mia R.
Middle East Respiratory Syndrome Coronavirus (MERS-CoV) continues to be a threat to human health in the Middle East. Development of countermeasures is ongoing; however, an animal model that faithfully recapitulates human disease has yet to be defined. A recent study indicated that inoculation of common marmosets resulted in inconsistent lethality. Based on these data we sought to compare two isolates of MERS-CoV. We followed disease progression in common marmosets after intratracheal exposure with: MERS-CoV-EMC/2012, MERS-CoV-Jordan-n3/2012, media, or inactivated virus. Our data suggest that common marmosets developed a mild to moderate non-lethal respiratory disease, which was quantifiable by computed tomography (CT), with limited other clinical signs. Based on CT data, clinical data, and virological data, MERS-CoV inoculation of common marmosets results in mild to moderate clinical signs of disease that are likely due to manipulations of the marmoset rather than as a result of robust viral replication. - Highlights: • Common marmosets infected with MERS-EMC and MERS-JOR did not develop lethal disease. • Infected subjects developed transient signs of clinical disease. • CT indicated few differences between the infected and control groups. • Marmosets do not faithfully replicate human MERS pathogenesis.
Transmission characteristics of MERS and SARS in the healthcare setting: a comparative study.
Chowell, Gerardo; Abdirizak, Fatima; Lee, Sunmi; Lee, Jonggul; Jung, Eunok; Nishiura, Hiroshi; Viboud, Cécile
2015-09-03
The Middle East respiratory syndrome (MERS) coronavirus has caused recurrent outbreaks in the Arabian Peninsula since 2012. Although MERS has low overall human-to-human transmission potential, there is occasional amplification in the healthcare setting, a pattern reminiscent of the dynamics of the severe acute respiratory syndrome (SARS) outbreaks in 2003. Here we provide a head-to-head comparison of exposure patterns and transmission dynamics of large hospital clusters of MERS and SARS, including the most recent South Korean outbreak of MERS in 2015. To assess the unexpected nature of the recent South Korean nosocomial outbreak of MERS and estimate the probability of future large hospital clusters, we compared exposure and transmission patterns for previously reported hospital clusters of MERS and SARS, based on individual-level data and transmission tree information. We carried out simulations of nosocomial outbreaks of MERS and SARS using branching process models rooted in transmission tree data, and inferred the probability and characteristics of large outbreaks. A significant fraction of MERS cases were linked to the healthcare setting, ranging from 43.5 % for the nosocomial outbreak in Jeddah, Saudi Arabia, in 2014 to 100 % for both the outbreak in Al-Hasa, Saudi Arabia, in 2013 and the outbreak in South Korea in 2015. Both MERS and SARS nosocomial outbreaks are characterized by early nosocomial super-spreading events, with the reproduction number dropping below 1 within three to five disease generations. There was a systematic difference in the exposure patterns of MERS and SARS: a majority of MERS cases occurred among patients who sought care in the same facilities as the index case, whereas there was a greater concentration of SARS cases among healthcare workers throughout the outbreak. Exposure patterns differed slightly by disease generation, however, especially for SARS. 
Moreover, the distributions of secondary cases per single primary case varied highly across individual hospital outbreaks (Kruskal-Wallis test; P < 0.0001), with significantly higher transmission heterogeneity in the distribution of secondary cases for MERS than SARS. Simulations indicate a 2-fold higher probability of occurrence of large outbreaks (>100 cases) for SARS than MERS (2 % versus 1 %); however, owing to higher transmission heterogeneity, the largest outbreaks of MERS are characterized by sharper incidence peaks. The probability of occurrence of MERS outbreaks larger than the South Korean cluster (n = 186) is of the order of 1 %. Our study suggests that the South Korean outbreak followed a similar progression to previously described hospital clusters involving coronaviruses, with early super-spreading events generating a disproportionately large number of secondary infections, and the transmission potential diminishing greatly in subsequent generations. Differences in relative exposure patterns and transmission heterogeneity of MERS and SARS could point to changes in hospital practices since 2003 or differences in transmission mechanisms of these coronaviruses.
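The branching-process simulations described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a negative binomial offspring distribution (built as a gamma-Poisson mixture), and the reproduction numbers per generation and the dispersion parameter `k` are hypothetical values chosen only to mimic early super-spreading followed by a drop below 1.

```python
import math
import random

def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    if lam <= 0:
        return 0
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def outbreak_size(r_by_generation, k, rng, cap=1000):
    """Total cases in one simulated cluster started by a single index case.

    r_by_generation[g] is the assumed reproduction number in generation g;
    the last entry is reused for all later generations.
    """
    cases, current, gen = 1, 1, 0
    while current > 0 and cases < cap:
        r = r_by_generation[min(gen, len(r_by_generation) - 1)]
        new = 0
        for _ in range(current):
            # Gamma-distributed individual infectiousness with mean r and
            # shape k, then Poisson: a negative binomial offspring count.
            lam = rng.gammavariate(k, r / k) if r > 0 else 0.0
            new += poisson(lam, rng)
        cases += new
        current = new
        gen += 1
    return cases

def probability_of_large_outbreak(r_by_generation, k, size=100, runs=5000, seed=7):
    """Fraction of simulated clusters that exceed `size` cases."""
    rng = random.Random(seed)
    large = sum(outbreak_size(r_by_generation, k, rng, cap=size + 1) > size
                for _ in range(runs))
    return large / runs
```

A call such as `probability_of_large_outbreak([3.0, 0.8, 0.3], k=0.2)` estimates how often a cluster exceeds 100 cases under these illustrative parameters; a smaller `k` means stronger transmission heterogeneity.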
PigGIS: Pig Genomic Informatics System
Ruan, Jue; Guo, Yiran; Li, Heng; Hu, Yafeng; Song, Fei; Huang, Xin; Kristiensen, Karsten; Bolund, Lars; Wang, Jun
2007-01-01
Pig Genomic Information System (PigGIS) is a web-based depository of pig (Sus scrofa) genomic learning mainly engineered for biomedical research to locate pig genes from their human homologs and position single nucleotide polymorphisms (SNPs) in different pig populations. It utilizes a variety of sequence data, including whole genome shotgun (WGS) reads and expressed sequence tags (ESTs), and achieves a successful mapping solution to the low-coverage genome problem. With the data presently available, we have identified a total of 15 700 pig consensus sequences covering 18.5 Mb of the homologous human exons. We have also recovered 18 700 SNPs and 20 800 unique 60mer oligonucleotide probes for future pig genome analyses. PigGIS can be freely accessed via the web. PMID:17090590
Nielsen, Troels Tolstrup; Mardosiene, Skirmante; Løkkegaard, Annemette; Stokholm, Jette; Ehrenfels, Susanne; Bech, Sara; Friberg, Lars; Nielsen, Jens Kellberg; Nielsen, Jørgen E
2012-08-13
The autosomal dominant spinocerebellar ataxias (SCAs) comprise a group of rare and heterogeneous disorders, which present with progressive ataxia and numerous other features, e.g. peripheral neuropathy, macular degeneration and cognitive impairment, and a subset of these disorders is caused by CAG-repeat expansions in their respective genes. Diagnosing the SCAs is often difficult due to the phenotypic overlap among several of the subtypes and with other neurodegenerative disorders, e.g. Huntington's disease. We report a family in which the proband had rapidly progressing cognitive decline and only subtle cerebellar symptoms from age 42. Sequencing of the TATA-box binding protein gene revealed a modest elongation of the CAG/CAA-repeat of only two repeats above the non-pathogenic threshold of 41, confirming a diagnosis of SCA17. Normally, repeats within this range show reduced penetrance and result in a milder disease course with slower progression and later age of onset. Thus, this case presented with an unusual phenotype. The current case highlights the diagnostic challenge of neurodegenerative disorders and the need for a thorough clinical and paraclinical examination of patients presenting with rapid cognitive decline to make a precise diagnosis on which further genetic counseling and initiation of treatment modalities can be based.
Tessé, Sophie; Bourbon, Henri-Marc; Debuchy, Robert; Budin, Karine; Dubois, Emeline; Liangran, Zhang; Antoine, Romain; Piolot, Tristan; Kleckner, Nancy; Zickler, Denise; Espagne, Eric
2017-09-15
Meiosis is the cellular program by which a diploid cell gives rise to haploid gametes for sexual reproduction. Meiotic progression depends on tight physical and functional coupling of recombination steps at the DNA level with specific organizational features of meiotic-prophase chromosomes. The present study reveals that every step of this coupling is mediated by a single molecule: Asy2/Mer2. We show that Mer2, identified so far only in budding and fission yeasts, is in fact evolutionarily conserved from fungi (Mer2/Rec15/Asy2/Bad42) to plants (PRD3/PAIR1) and mammals (IHO1). In yeasts, Mer2 mediates assembly of recombination-initiation complexes and double-strand breaks (DSBs). This role is conserved in the fungus Sordaria. However, functional analysis of 13 mer2 mutants and successive localization of Mer2 to axis, synaptonemal complex (SC), and chromatin revealed, in addition, three further important functions. First, after DSB formation, Mer2 is required for pairing by mediating homolog spatial juxtaposition, with implications for crossover (CO) patterning/interference. Second, Mer2 participates in the transfer/maintenance and release of recombination complexes to/from the SC central region. Third, after completion of recombination, potentially dependent on SUMOylation, Mer2 mediates global chromosome compaction and post-recombination chiasma development. Thus, beyond its role as a recombinosome-axis/SC linker molecule, Mer2 has important functions in relation to basic chromosome structure. © 2017 Tessé et al.; Published by Cold Spring Harbor Laboratory Press.
Typing Clostridium difficile strains based on tandem repeat sequences
2009-01-01
Background Genotyping of epidemic Clostridium difficile strains is necessary to track their emergence and spread. Portability of genotyping data is desirable to facilitate inter-laboratory comparisons and epidemiological studies. Results This report presents results from a systematic screen for variation in repetitive DNA in the genome of C. difficile. We describe two tandem repeat loci, designated 'TR6' and 'TR10', which display extensive sequence variation that may be useful for sequence-based strain typing. Based on an investigation of 154 C. difficile isolates comprising 75 ribotypes, tandem repeat sequencing demonstrated excellent concordance with widely used PCR ribotyping and equal discriminatory power. Moreover, tandem repeat sequences enabled the reconstruction of the isolates' largely clonal population structure and evolutionary history. Conclusion We conclude that sequence analysis of the two repetitive loci introduced here may be highly useful for routine typing of C. difficile. Tandem repeat sequence typing resolves phylogenetic diversity to a level equivalent to PCR ribotypes. DNA sequences may be stored in databases accessible over the internet, obviating the need for the exchange of reference strains. PMID:19133124
Rockey, William M; Hernandez, Frank J; Huang, Sheng-You; Cao, Song; Howell, Craig A; Thomas, Gregory S; Liu, Xiu Ying; Lapteva, Natalia; Spencer, David M; McNamara, James O; Zou, Xiaoqin; Chen, Shi-Jie; Giangrande, Paloma H
2011-10-01
RNA aptamers represent an emerging class of pharmaceuticals with great potential for targeted cancer diagnostics and therapy. Several RNA aptamers that bind cancer cell-surface antigens with high affinity and specificity have been described. However, their clinical potential has yet to be realized. A significant obstacle to the clinical adoption of RNA aptamers is the high cost of manufacturing long RNA sequences through chemical synthesis. Therapeutic aptamers are often truncated postselection by using a trial-and-error process, which is time consuming and inefficient. Here, we used a "rational truncation" approach guided by RNA structural prediction and protein/RNA docking algorithms that enabled us to substantially truncate A9, an RNA aptamer to prostate-specific membrane antigen (PSMA), with great potential for targeted therapeutics. This truncated PSMA aptamer (A9L; 41mer) retains binding activity, functionality, and is amenable to large-scale chemical synthesis for future clinical applications. In addition, the modeled RNA tertiary structure and protein/RNA docking predictions revealed key nucleotides within the aptamer critical for binding to PSMA and inhibiting its enzymatic activity. Finally, this work highlights the utility of existing RNA structural prediction and protein docking techniques that may be generally applicable to developing RNA aptamers optimized for therapeutic use.
Centrifuge: rapid and sensitive classification of metagenomic sequences
Song, Li; Breitwieser, Florian P.
2016-01-01
Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical high-throughput DNA sequencing run within a few minutes. Together, these advances enable timely and accurate analysis of large metagenomics data sets on conventional desktop computers. Because of its space-optimized indexing schemes, Centrifuge also makes it possible to index the entire NCBI nonredundant nucleotide sequence database (a total of 109 billion bases) with an index size of 69 GB, in contrast to k-mer-based indexing schemes, which require far more extensive space. PMID:27852649
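To make the indexing idea concrete, here is a toy sketch of a Burrows-Wheeler transform and FM-index backward search. It only illustrates the general technique named in the abstract; it is not Centrifuge's implementation, and the function names are hypothetical. The naive rotation sort is only suitable for short strings.

```python
from collections import Counter

def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (toy, quadratic memory)."""
    text += "$"  # sentinel that sorts before every letter
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)

def fm_index(bw):
    """Build the C table (number of smaller characters) and Occ prefix counts."""
    counts = Counter(bw)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    occ = {ch: [0] * (len(bw) + 1) for ch in counts}
    for i, ch in enumerate(bw):
        for key in occ:
            occ[key][i + 1] = occ[key][i] + (1 if key == ch else 0)
    return c_table, occ

def count_occurrences(pattern, c_table, occ, n):
    """Backward search: how many times pattern occurs in the indexed text."""
    lo, hi = 0, n
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo
```

For example, after `bw = bwt("ACGTACGTAC")` and `c, occ = fm_index(bw)`, the call `count_occurrences("ACG", c, occ, len(bw))` counts matches without scanning the original text; the query cost depends on the pattern length rather than the genome length.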
Indel Group in Genomes (IGG) Molecular Genetic Markers
Burkart-Waco, Diana; Kuppu, Sundaram; Britt, Anne; Chetelat, Roger
2016-01-01
Genetic markers are essential when developing or working with genetically variable populations. Indel Group in Genomes (IGG) markers are primer pairs that amplify single-locus sequences that differ in size for two or more alleles. They are attractive for their ease of use for rapid genotyping and their codominant nature. Here, we describe a heuristic algorithm that uses a k-mer-based approach to search two or more genome sequences to locate polymorphic regions suitable for designing candidate IGG marker primers. As input to the IGG pipeline software, the user provides genome sequences and the desired amplicon sizes and size differences. Primer sequences flanking polymorphic insertions/deletions are produced as output. IGG marker files for three sets of genomes, Solanum lycopersicum/Solanum pennellii, Arabidopsis (Arabidopsis thaliana) Columbia-0/Landsberg erecta-0 accessions, and S. lycopersicum/S. pennellii/Solanum tuberosum (three-way polymorphic) are included. PMID:27436831
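One way to picture a k-mer-based search for size-polymorphic regions is sketched below. This is an assumption-laden illustration, not the IGG pipeline: it anchors two sequences on k-mers that are unique in both, then reports neighbouring anchors whose spacing differs. The function names and the `min_diff` parameter are hypothetical.

```python
from collections import Counter

def unique_kmer_positions(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return {kmer: i for i, kmer in enumerate(kmers) if counts[kmer] == 1}

def candidate_indels(seq_a, seq_b, k, min_diff):
    """Find neighbouring shared anchors whose spacing differs between sequences.

    Returns tuples (start_a, end_a, start_b, end_b, size_diff), where
    size_diff > 0 means the region is longer in seq_b.
    """
    pos_a = unique_kmer_positions(seq_a, k)
    pos_b = unique_kmer_positions(seq_b, k)
    anchors = sorted((pos_a[m], pos_b[m]) for m in pos_a.keys() & pos_b.keys())
    hits = []
    for (a1, b1), (a2, b2) in zip(anchors, anchors[1:]):
        if b2 <= b1:
            continue  # anchors out of order in seq_b: not collinear, skip
        diff = (b2 - b1) - (a2 - a1)
        if abs(diff) >= min_diff:
            hits.append((a1, a2, b1, b2, diff))
    return hits
```

Primers would then be designed inside the conserved anchors flanking each hit, so that the amplicon length differs between alleles by `size_diff`.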
Secondary binding sites for heavily modified triplex forming oligonucleotides
Cardew, Antonia S.; Brown, Tom; Fox, Keith R.
2012-01-01
In order to enhance DNA triple helix stability synthetic oligonucleotides have been developed that bear amino groups on the sugar or base. One of the most effective of these is bis-amino-U (B), which possesses 5-propargylamino and 2′-aminoethoxy modifications. Inclusion of this modified nucleotide not only greatly enhances triplex stability, but also increases the affinity for related sequences. We have used a restriction enzyme protection, selection and amplification assay (REPSA) to isolate sequences that are bound by the heavily modified 9-mer triplex-forming oligonucleotide B6CBT. The isolated sequences contain An tracts (n = 6), suggesting that the 5′-end of this TFO was responsible for successful triplex formation. DNase I footprinting with these sequences confirmed triple helix formation at these secondary targets and demonstrated no interaction with similar oligonucleotides containing T or 5-propargylamino-dU. PMID:22180535
Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian
2009-11-01
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to perform a genome-wide analysis of CRISPR in the extremely halophilic archaea whose whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were highly conserved, and two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.
Kim, H K; Yoon, S-W; Kim, D-J; Koo, B-S; Noh, J Y; Kim, J H; Choi, Y G; Na, W; Chang, K-T; Song, D; Jeong, D G
2016-08-01
Bat species around the world have recently been recognized as major reservoirs of several zoonotic viruses, such as severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV), Nipah virus and Hendra virus. In this study, consensus primer-based reverse transcriptase polymerase chain reactions (RT-PCRs) and high-throughput sequencing were performed to investigate viruses in bat faecal samples collected at 11 natural bat habitat sites from July to December 2015 in Korea. Diverse coronaviruses were detected for the first time in Korean bat faeces, including alphacoronaviruses and SARS-CoV-like and MERS-CoV-like betacoronaviruses. In addition, we identified a novel bat rotavirus belonging to group H rotavirus, which has only been described in humans and pigs until now. Therefore, our results suggest the need for continuing surveillance and additional virological studies in domestic bats. © 2016 Blackwell Verlag GmbH.
Mars Exploration Rover Terminal Descent Mission Modeling and Simulation
NASA Technical Reports Server (NTRS)
Raiszadeh, Behzad; Queen, Eric M.
2004-01-01
Because of NASA's added reliance on simulation for successful interplanetary missions, the MER mission has developed a detailed EDL trajectory model and simulation. This paper summarizes how the MER EDL sequence of events is modeled, how the methods were verified, and the inputs used. This simulation is built upon a multibody parachute trajectory simulation tool that has been developed in POST II that accurately simulates the trajectory of multiple vehicles in flight with interacting forces. In this model the parachute and the suspended bodies are treated as 6 Degree-of-Freedom (6 DOF) bodies. The terminal descent phase of the mission consists of several Entry, Descent, Landing (EDL) events, such as parachute deployment, heatshield separation, deployment of the lander from the backshell, deployment of the airbags, RAD firings, TIRS firings, etc. For an accurate, reliable simulation these events need to be modeled seamlessly and robustly so that the simulations will remain numerically stable during Monte-Carlo simulations. This paper also summarizes how the events have been modeled, the numerical issues, and modeling challenges.
Molecular Dynamics of Peptide Folding at Aqueous Interfaces
NASA Technical Reports Server (NTRS)
Pohorille, Andrew; Chipot, Christophe; Chang, Sherwood (Technical Monitor)
1997-01-01
Even though most monomeric peptides are disordered in water, they can adopt sequence-dependent, ordered structures, such as α-helices, at aqueous interfaces. This property is relevant to cellular signaling, membrane fusion, and the action of toxins and antibiotics. The mechanism of folding nonpolar peptides at the water-hexane interface was studied in the example of an 11-mer of poly-L-leucine. Initially placed as a random coil on the water side of the interface, the peptide folded into an α-helix in 36 ns. Simultaneously, the peptide translocated into the hexane side of the interface. Folding was not sequential and involved a 3(10)-helix as an intermediate. The folded peptide was either parallel to the interface or had its C-terminus exposed to water. An 11-mer, LQQLLQQLLQL, composed of leucine (L) and glutamine (Q), was taken as a model amphiphilic peptide. It rapidly adopted an amphiphilic, disordered structure at the interface. Further folding proceeded through a series of amphiphilic intermediates.
Optimization of single-base-pair mismatch discrimination in oligonucleotide microarrays
NASA Technical Reports Server (NTRS)
Urakawa, Hidetoshi; El Fantroussi, Said; Smidt, Hauke; Smoot, James C.; Tribou, Erik H.; Kelly, John J.; Noble, Peter A.; Stahl, David A.
2003-01-01
The discrimination between perfect-match and single-base-pair-mismatched nucleic acid duplexes was investigated by using oligonucleotide DNA microarrays and nonequilibrium dissociation rates (melting profiles). DNA and RNA versions of two synthetic targets corresponding to the 16S rRNA sequences of Staphylococcus epidermidis (38 nucleotides) and Nitrosomonas eutropha (39 nucleotides) were hybridized to perfect-match probes (18-mer and 19-mer) and to a set of probes having all possible single-base-pair mismatches. The melting profiles of all probe-target duplexes were determined in parallel by using an imposed temperature step gradient. We derived an optimum wash temperature for each probe and target by using a simple formula to calculate a discrimination index for each temperature of the step gradient. This optimum corresponded to the output of an independent analysis using a customized neural network program. These results together provide an experimental and analytical framework for optimizing mismatch discrimination among all probes on a DNA microarray.
Williams, Sunanda Margrett; Chandran, Anu Vijayakumari; Prakash, Sunita; Vijayan, Mamannamana; Chatterji, Dipankar
2017-09-05
Proteins of the ferritin family are ubiquitous in living organisms. With their spherical cage-like structures they are the iron storehouses in cells. Subfamilies of ferritins include 24-meric ferritins and bacterioferritins (maxiferritins), and 12-meric Dps (miniferritins). Dps safeguards DNA by direct binding, affording physical protection and safeguards from free radical-mediated damage by sequestering iron in its core. The maxiferritins can oxidize and store iron but cannot bind DNA. Here we show that a mutation at a critical interface in Dps alters its assembly from the canonical 12-mer to a ferritin-like 24-mer under crystallization. This structural switch was attributed to the conformational alteration of a highly conserved helical loop and rearrangement of the C-terminus. Our results demonstrate a novel concept of mutational switch between related protein subfamilies and corroborate the popular model for evolution by which subtle substitutions in an amino acid sequence lead to diversification among proteins. Copyright © 2017 Elsevier Ltd. All rights reserved.
Genome Wide Characterization of Simple Sequence Repeats in Cucumber
USDA-ARS's Scientific Manuscript database
The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
Lee, Michael; Hills, Mark; Conomos, Dimitri; Stutz, Michael D.; Dagg, Rebecca A.; Lau, Loretta M.S.; Reddel, Roger R.; Pickett, Hilda A.
2014-01-01
Telomeres are terminal repetitive DNA sequences on chromosomes, and are considered to comprise almost exclusively hexameric TTAGGG repeats. We have evaluated telomere sequence content in human cells using whole-genome sequencing followed by telomere read extraction in a panel of mortal cell strains and immortal cell lines. We identified a wide range of telomere variant repeats in human cells, and found evidence that variant repeats are generated by mechanistically distinct processes during telomerase- and ALT-mediated telomere lengthening. Telomerase-mediated telomere extension resulted in biased repeat synthesis of variant repeats that differed from the canonical sequence at positions 1 and 3, but not at positions 2, 4, 5 or 6. This indicates that telomerase is most likely an error-prone reverse transcriptase that misincorporates nucleotides at specific positions on the telomerase RNA template. In contrast, cell lines that use the ALT pathway contained a large range of variant repeats that varied greatly between lines. This is consistent with variant repeats spreading from proximal telomeric regions throughout telomeres in a stochastic manner by recombination-mediated templating of DNA synthesis. The presence of unexpectedly large numbers of variant repeats in cells utilizing either telomere maintenance mechanism suggests a conserved role for variant sequences at human telomeres. PMID:24225324
NASA Astrophysics Data System (ADS)
Yan, Jiaqing; Wang, Yinghua; Ouyang, Gaoxiang; Yu, Tao; Li, Xiaoli
2016-02-01
A maximum entropy ratio (MER) method is adapted here for the first time to investigate high-dimensional electrocorticogram (ECoG) data from epilepsy patients. MER is a symbolic analysis approach for the detection of recurrence domains of complex dynamical systems from time series. Data were chosen from eight patients undergoing pre-surgical evaluation for drug-resistant temporal lobe epilepsy. MERs for interictal and ictal data were calculated and compared. A statistical test was performed to evaluate the ability of MER to separate the interictal state from the ictal state. MER changed significantly from the interictal state to the ictal state: it was low in the ictal state and differed significantly from its value in the interictal state. These results suggest that MER is able to separate the ictal state from the interictal state based on ECoG data. It has the potential to detect the transition between normal brain activity and the ictal state.
Kwon, So Yeun; Lee, Hwan Young; Kim, Eun Hye; Lee, Eun Young; Shin, Kyoung-Jin
2016-11-01
Next-generation sequencing (NGS) can produce massively parallel sequencing (MPS) data for many targeted regions with a high depth of coverage, suggesting its successful application to the amplicons of forensic genetic markers. In the present study, we evaluated the practical utility of MPS in Y-chromosome short tandem repeat (Y-STR) analysis using a multiplex polymerase chain reaction (PCR) system. The multiplex PCR system simultaneously amplified 24 Y-chromosomal markers, including the PowerPlex® Y23 loci (DYS19, DYS385ab, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS481, DYS533, DYS549, DYS570, DYS576, DYS635, DYS643, and YGATAH4) and the M175 marker with the small-sized amplicons ranging from 85 to 253 bp. The barcoded libraries for the amplicons of the 24 Y-chromosomal markers were produced using a simplified PCR-based library preparation method and successfully sequenced using MPS on a MiSeq® System with samples from 250 unrelated Korean males. The genotyping concordance between MPS and the capillary electrophoresis (CE) method, as well as the sequence structure of the 23 Y-STRs, were investigated. Three samples exhibited discordance between the MPS and CE results at DYS385, DYS439, and DYS576. There were 12 Y-STR loci that showed sequence variations in the alleles by a fragment size determination, and the most varied alleles occurred in DYS389II with a different sequence structure in the repeat region. The largest increase in gene diversity between the CE and MPS results was in DYS437 at +34.41%. Single nucleotide polymorphisms (SNPs), insertions, and deletions (indels) were observed in the flanking regions of DYS481, DYS576, and DYS385, respectively. Stutter and noise ratios of the 23 Y-STRs using the developed MPS system were also investigated. 
Based on these results, the MPS analysis system used in this study could facilitate the investigation into the sequences of the 23 Y-STRs in forensic genetics laboratories. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
2003-04-30
KENNEDY SPACE CENTER, FLA. - The overhead crane settles the Mars Exploration Rover 2 (MER-2) entry vehicle onto a spin table for a dry-spin test. The MER Mission consists of two identical rovers designed to cover roughly 110 yards each Martian day over various terrain. Each rover will carry five scientific instruments that will allow it to search for evidence of liquid water that may have been present in the planet's past. Identical to each other, the rovers will land at different regions of Mars. Launch for MER-2 (MER-A) is scheduled for June 5.
Srivastava, Deepika; Shanker, Asheesh
2016-12-01
Basal angiosperms, or Magnoliids, form an important clade of commercially important plants, which mainly include spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to clade Magnoliids were screened for the identification of chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats, or microsatellites, are short stretches of DNA built from repeat units 1-6 base pairs in length. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, showing an average density of 1 SSR/6.91 kb. Depending on the repeat units, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21 %) followed by tetranucleotide repeats (130, 27.13 %). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
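SSR screening of this kind can be sketched with regular expressions. The abstract does not name the tool or thresholds used, so the minimum repeat counts below are an assumption, chosen only so that the shortest tract in each class matches the reported minimum lengths (12 bp for mono- to tetranucleotide, 15 bp for penta-, 18 bp for hexanucleotide repeats). Only perfect repeats are detected.

```python
import re

# Assumed thresholds: repeat-unit length -> minimum number of repeats.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in sorted(MIN_REPEATS.items()):
        # A motif of `unit` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT"),
            # since those tracts belong to the shorter repeat class.
            if any(motif == motif[:d] * (unit // d)
                   for d in range(1, unit) if unit % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)
```

Dividing the genome length in kb by `len(find_ssrs(genome))` would give a density figure comparable in form to the "1 SSR per 6.91 kb" reported above, though the count depends on the thresholds assumed here.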
Allostery and the dynamic oligomerization of porphobilinogen synthase
Jaffe, Eileen K.; Lawrence, Sarah H.
2011-01-01
The structural basis for allosteric regulation of porphobilinogen synthase (PBGS) is modulation of a quaternary structure equilibrium between octamer and hexamer (via dimers), which is represented schematically as 8mer ⇔ 2mer ⇔ 2mer* ⇔ 6mer*. The “*” represents a reorientation between two domains of each subunit that occurs in the dissociated state because it is sterically forbidden in the larger multimers. Allosteric effectors of PBGS are both intrinsic and extrinsic and are phylogenetically variable. In some species this equilibrium is modulated intrinsically by magnesium which binds at a site specific to the 8mer. In other species this equilibrium is modulated intrinsically by pH; the guanidinium group of an arginine being spatially equivalent to the allosteric magnesium ion. In humans, disease associated variants all shift the equilibrium toward the 6mer* relative to wild type. The 6mer* has a surface cavity that is not present in the 8mer and is proposed as a small molecule allosteric binding site. In silico and in vitro approaches have revealed species-specific allosteric PBGS inhibitors that stabilize the 6mer*. Some of these inhibitors are drugs in clinical use leading to the hypothesis that extrinsic allosteric inhibition of human PBGS could be a mechanism for drug side effects. PMID:22037356
Deterministic and stochastic models for middle east respiratory syndrome (MERS)
NASA Astrophysics Data System (ADS)
Suryani, Dessy Rizki; Zevika, Mona; Nuraini, Nuning
2018-03-01
According to World Health Organization (WHO) data, since September 2012 there have been 1,733 cases of Middle East Respiratory Syndrome (MERS), including 628 deaths, in 27 countries. MERS was first identified in Saudi Arabia in 2012, and the largest outbreak outside Saudi Arabia occurred in South Korea in 2015. MERS is a disease of the respiratory system caused by infection with MERS-CoV. MERS-CoV is transmitted directly, through contact between infected and non-infected individuals, or indirectly, through objects contaminated by free virus. MERS is suspected to spread quickly because of free virus in the environment. Mathematical modeling is used to illustrate the transmission of MERS with a deterministic model and a stochastic model. The deterministic model is used to investigate the temporal dynamics of the system and to analyze the steady-state condition. The stochastic model, a Continuous Time Markov Chain (CTMC), is used to predict future states using random variables. From the models built, the threshold values for the deterministic and stochastic models take the same form, and the probability of disease extinction can be computed from the stochastic model. Simulations of both models with several different parameter values are shown, and the probability of disease extinction is compared across several initial conditions.
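To make the idea of a CTMC extinction probability concrete, here is a minimal Python sketch. It does not reproduce the model of this paper, which also tracks free virus in the environment. It assumes a plain SIR-type chain with hypothetical rates `beta` (infection) and `gamma` (recovery), and it simulates only the embedded jump chain, which is enough to decide whether an outbreak dies out. An outbreak that reaches `threshold` infectious individuals is counted as established.

```python
import random

def outbreak_dies_out(beta, gamma, i0, n, rng, threshold=50):
    """Run the jump chain of an SIR-type CTMC until extinction or establishment."""
    s, i = n - i0, i0
    while 0 < i < threshold:
        inf_rate = beta * s * i / n   # rate of S + I -> 2I
        rec_rate = gamma * i          # rate of I -> R
        if rng.random() < inf_rate / (inf_rate + rec_rate):
            s, i = s - 1, i + 1
        else:
            i -= 1
    return i == 0

def extinction_probability(beta, gamma, i0, n=10_000, runs=2000, seed=1):
    """Monte Carlo estimate of the probability that the infection dies out."""
    rng = random.Random(seed)
    died = sum(outbreak_dies_out(beta, gamma, i0, n, rng) for _ in range(runs))
    return died / runs

# Hypothetical parameters: basic reproduction number beta / gamma = 2.
print(extinction_probability(beta=2.0, gamma=1.0, i0=1))
print(extinction_probability(beta=2.0, gamma=1.0, i0=3))
```

For this simple chain the branching-process approximation gives an extinction probability of (gamma/beta)^i0 when beta > gamma, so the estimates should fall near 0.5 and 0.125. This also illustrates why the extinction probability depends on the initial condition.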
2003-05-15
KENNEDY SPACE CENTER, FLA. - In the foreground, three solid rocket boosters (SRBs) suspended in the launch tower flank the Delta II rocket (in the background) that will launch Mars Exploration Rover 2 (MER-2). NASA’s twin Mars Exploration Rovers are designed to study the history of water on Mars. These robotic geologists are equipped with a robotic arm, a drilling tool, three spectrometers, and four pairs of cameras that allow them to have a human-like, 3D view of the terrain. Each rover could travel as far as 100 meters in one day to act as Mars scientists' eyes and hands, exploring an environment where humans can’t yet go. MER-2 is scheduled to launch June 5 as MER-A. MER-1 (MER-B) will launch June 25.
Munding, Elizabeth M.; Igel, A. Haller; Shiue, Lily; Dorighi, Kristel M.; Treviño, Lisa R.; Ares, Manuel
2010-01-01
Splicing regulatory networks are essential components of eukaryotic gene expression programs, yet little is known about how they are integrated with transcriptional regulatory networks into coherent gene expression programs. Here we define the MER1 splicing regulatory network and examine its role in the gene expression program during meiosis in budding yeast. Mer1p splicing factor promotes splicing of just four pre-mRNAs. All four Mer1p-responsive genes also require Nam8p for splicing activation by Mer1p; however, other genes require Nam8p but not Mer1p, exposing an overlapping meiotic splicing network controlled by Nam8p. MER1 mRNA and three of the four Mer1p substrate pre-mRNAs are induced by the transcriptional regulator Ume6p. This unusual arrangement delays expression of Mer1p-responsive genes relative to other genes under Ume6p control. Products of Mer1p-responsive genes are required for initiating and completing recombination and for activation of Ndt80p, the activator of the transcriptional network required for subsequent steps in the program. Thus, the MER1 splicing regulatory network mediates the dependent relationship between the UME6 and NDT80 transcriptional regulatory networks in the meiotic gene expression program. This study reveals how splicing regulatory networks can be interlaced with transcriptional regulatory networks in eukaryotic gene expression programs. PMID:21123654
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Reed F., E-mail: johnsonreed@mail.nih.gov; Bagci, Ulas; Center for Research in Computer Vision
Middle East Respiratory Syndrome Coronavirus (MERS-CoV) was identified in 2012 as the causative agent of a severe, lethal respiratory disease occurring across several countries in the Middle East. To date there have been over 1600 laboratory confirmed cases of MERS-CoV in 26 countries with a case fatality rate of 36%. Given the endemic region, it is possible that MERS-CoV could spread during the annual Hajj pilgrimage, necessitating countermeasure development. In this report, we describe the clinical and radiographic changes of rhesus monkeys following infection with 5×10^6 PFU MERS-CoV Jordan-n3/2012. Two groups of NHPs were treated with either a human anti-MERS monoclonal antibody 3B11-N or E410-N, an anti-HIV antibody. MERS-CoV Jordan-n3/2012 infection resulted in quantifiable changes by computed tomography, but limited other clinical signs of disease. 3B11-N treated subjects developed significantly reduced lung pathology when compared to infected, untreated subjects, indicating that this antibody may be a suitable MERS-CoV treatment. - Highlights: • MERS-CoV Jordan-n3/2012 challenge of rhesus monkeys results in a mild disease. • CT can be used to monitor disease progression to aid models of human disease. • Treatment with the human monoclonal antibody 3B11-N resulted in decreased disease.
Todt, Jill C.; Hu, Bin; Curtis, Jeffrey L.
2008-01-01
Apoptotic leukocytes must be cleared efficiently by macrophages (Mø). Apoptotic cell phagocytosis by Mø requires the receptor tyrosine kinase (RTK) MerTK (also known as c-Mer and Tyro12), the phosphatidylserine receptor (PS-R), and the classical protein kinase C (PKC) isoform βII, which translocates to Mø membrane and cytoskeletal fractions in a PS-R-dependent fashion. How these molecules cooperate to induce phagocytosis is unknown. Because the phosphatidylinositol-specific phospholipase (PI-PLC) PLC γ2 is downstream of RTKs in some cell types and can activate classical PKCs, we hypothesized that MerTK signals via PLC γ2. To test this hypothesis, we examined the interaction of MerTK and PLC γ2 in resident murine PMø and in the murine Mø cell line J774A.1 (J774) following exposure to apoptotic thymocytes. We found that, as with PMø, J774 phagocytosis of apoptotic thymocytes was inhibited by antibody against MerTK. Western blotting and immunoprecipitation showed that exposure to apoptotic cells produced three time-dependent changes in PMø and J774: (1) tyrosine phosphorylation of MerTK; (2) association of PLC γ2 with MerTK; and (3) tyrosine phosphorylation of PLC γ2. Phosphorylation of PLC γ2 and its association with MerTK was also induced by cross-linking MerTK using antibody. A PI-PLC appears to be required for phagocytosis of apoptotic cells because the PI-PLC inhibitor Et-18-OCH3 and the PLC inhibitor U73122, but not the inactive control U73343, blocked phagocytosis without impairing adhesion. On apoptotic cell adhesion to Mø, MerTK signals at least in part via PLC γ2. PMID:14704368
3-O sulfation of heparin leads to hepatotropism and longer circulatory half-life.
Miller, Colton M; Xu, Yongmei; Kudrna, Katrina M; Hass, Blake E; Kellar, Brianna M; Egger, Andrew W; Liu, Jian; Harris, Edward N
2018-05-17
Heparins are common blood anticoagulants that are critical for many surgical and biomedical procedures used in modern medicine. In contrast to natural heparin derived from porcine gut mucosa, synthetic heparins are homogenous by mass, polymer length, and chemistry. Stable cell lines expressing the human and mouse Stabilin receptors were used to evaluate endocytosis of natural and synthetic heparin. We chemoenzymatically produced synthetic heparin consisting of 12 sugars (dodecamers) containing 14 sulfate groups resulting in a non-3-O sulfated structure (n12mer). Half of the n12mer was modified with a 3-O sulfate on a single GlcNS sugar producing the 3-O sulfated heparin (12mer). Wildtype (WT), Stabilin-1 knock-out (KO), and Stabilin-2 KO C57BL/6 mice were developed and used for metabolic studies and provided as a source for primary liver sinusoidal endothelial cells. Human and mouse Stabilin-2 receptors had very similar endocytosis rates of both the 12mer and n12mer, suggesting that they are functionally similar in primary cells. Subcutaneous injections of the n12mer and 12mer revealed that the 12mer had a much longer half-life in circulation and a higher accumulation in liver. The n12mer never accumulated in circulation and was readily excreted by the kidneys before liver accumulation could occur. Liver sinusoidal endothelial cells from the Stabilin-2 KO mice had lower uptake rates for both dodecamers, whereas, the Stabilin-1 KO mice had lower endocytosis rates for the 12mer than the n12mer. 3-O sulfation of heparin is correlated to both a longer circulatory half-life and hepatotropism which is largely performed by the Stabilin receptors. Copyright © 2018 Elsevier Ltd. All rights reserved.
Jhang, Kyoung A; Park, Jin-Sun; Kim, Hee-Sun; Chong, Young Hae
2018-03-12
Mer tyrosine kinase (MerTK) activity necessary for amyloid-stimulated phagocytosis strongly implicates that MerTK dysregulation might contribute to chronic inflammation implicated in Alzheimer's disease (AD) pathology. However, the precise mechanism involved in the regulation of MerTK expression by amyloid-β (Aβ) in proinflammatory environment has not yet been ascertained. The objective of this study was to determine the underlying mechanism involved in Aβ-mediated decrease in MerTK expression through Aβ-mediated regulation of MerTK expression and its modulation by sulforaphane in human THP-1 macrophages challenged with Aβ1-42. We used protein preparation, Ca2+ influx fluorescence imaging, nuclear fractionation, Western blotting techniques, and small interfering RNA (siRNA) knockdown to perform our study. Aβ1-42 elicited a marked decrease in MerTK expression along with increased intracellular Ca2+ level and induction of proinflammatory cytokines such as IL-1β and TNF-α. Ionomycin A and thapsigargin also increased intracellular Ca2+ levels and production of IL-1β and TNF-α, mimicking the effect of Aβ1-42. In contrast, the Aβ1-42-evoked responses were attenuated by depletion of Ca2+ with ethylene glycol tetraacetic acid. Furthermore, recombinant IL-1β or TNF-α elicited a decrease in MerTK expression. However, immunodepletion of IL-1β or TNF-α with neutralizing antibodies significantly inhibited Aβ1-42-mediated downregulation of MerTK expression. Notably, sulforaphane treatment potently inhibited Aβ1-42-induced intracellular Ca2+ level and rescued the decrease in MerTK expression by blocking nuclear factor-κB (NF-κB) nuclear translocation, thereby decreasing IL-1β and TNF-α production upon Aβ1-42 stimulation. Such adverse effects of sulforaphane were replicated by BAY 11-7082, a NF-κB inhibitor. 
Moreover, sulforaphane's anti-inflammatory effects on Aβ1-42-induced production of IL-1β and TNF-α were significantly diminished by siRNA-mediated knockdown of MerTK, confirming a critical role of MerTK in suppressing Aβ1-42-induced innate immune response. These findings implicate that targeting of MerTK with phytochemical sulforaphane as a mechanism for preventing Aβ1-42-induced neuroinflammation has potential to be applied in AD therapeutics.
Tao, Xinrong; Garron, Tania; Agrawal, Anurodh Shankar; Algaissi, Abdullah; Peng, Bi-Hung; Wakamiya, Maki; Chan, Teh-Sheng; Lu, Lu; Du, Lanying; Jiang, Shibo; Couch, Robert B; Tseng, Chien-Te K
2016-01-01
Characterized animal models are needed for studying the pathogenesis of and evaluating medical countermeasures for persisting Middle East respiratory syndrome-coronavirus (MERS-CoV) infections. Here, we further characterized a lethal transgenic mouse model of MERS-CoV infection and disease that globally expresses human CD26 (hCD26)/DPP4. The 50% infectious dose (ID50) and lethal dose (LD50) of virus were estimated to be <1 and 10 TCID50 of MERS-CoV, respectively. Neutralizing antibody developed in the surviving mice from the ID50/LD50 determinations, and all were fully immune to challenge with 100 LD50 of MERS-CoV. The tissue distribution and histopathology in mice challenged with a potential working dose of 10 LD50 of MERS-CoV were subsequently evaluated. In contrast to the overwhelming infection seen in the mice challenged with 10(5) LD50 of MERS-CoV, we were able to recover infectious virus from these mice only infrequently, although quantitative reverse transcription-PCR (qRT-PCR) tests indicated early and persistent lung infection and delayed occurrence of brain infection. Persistent inflammatory infiltrates were seen in the lungs and brain stems at day 2 and day 6 after infection, respectively. While focal infiltrates were also noted in the liver, definite pathology was not seen in other tissues. Finally, using a receptor binding domain protein vaccine and a MERS-CoV fusion inhibitor, we demonstrated the value of this model for evaluating vaccines and antivirals against MERS. As outcomes of MERS-CoV infection in patients differ greatly, ranging from asymptomatic to overwhelming disease and death, having available both an infection model and a lethal model makes this transgenic mouse model relevant for advancing MERS research. Fully characterized animal models are essential for studying pathogenesis and for preclinical screening of vaccines and drugs against MERS-CoV infection and disease. 
When given a high dose of MERS-CoV, our transgenic mice expressing hCD26/DPP4 viral receptor uniformly succumbed to death within 6 days, making it difficult to evaluate host responses to infection and disease. We further characterized this model by determining both the ID50 and the LD50 of MERS-CoV in order to establish both an infection model and a lethal model for MERS and followed this by investigating the antibody responses and immunity of the mice that survived MERS-CoV infection. Using the estimated LD50 and ID50 data, we dissected the kinetics of viral tissue distribution and pathology in mice challenged with 10 LD50 of virus and utilized the model for preclinical evaluation of a vaccine and drug for treatment of MERS-CoV infection. This further-characterized transgenic mouse model will be useful for advancing MERS research. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Turmel, Monique; Otis, Christian; Lemieux, Claude
2007-01-01
Background The Streptophyta comprises all land plants and six groups of charophycean green algae. The scaly biflagellate Mesostigma viride (Mesostigmatales) and the sarcinoid Chlorokybus atmophyticus (Chlorokybales) represent the earliest diverging lineages of this phylum. In trees based on chloroplast genome data, these two charophycean green algae are nested in the same clade. To validate this relationship and gain insight into the ancestral state of the mitochondrial genome in the Charophyceae, we sequenced the mitochondrial DNA (mtDNA) of Chlorokybus and compared this genome sequence with those of three other charophycean green algae and the bryophytes Marchantia polymorpha and Physcomitrella patens. Results The Chlorokybus genome differs radically from its 42,424-bp Mesostigma counterpart in size, gene order, intron content and density of repeated elements. At 201,763-bp, it is the largest mtDNA yet reported for a green alga. The 70 conserved genes represent 41.4% of the genome sequence and include nad10 and trnL(gag), two genes reported for the first time in a streptophyte mtDNA. At the gene order level, the Chlorokybus genome shares with its Chara, Chaetosphaeridium and bryophyte homologues eight to ten gene clusters including about 20 genes. Notably, some of these clusters exhibit gene linkages not previously found outside the Streptophyta, suggesting that they originated early during streptophyte evolution. In addition to six group I and 14 group II introns, short repeated sequences accounting for 7.5% of the genome were identified. Mitochondrial trees were unable to resolve the correct position of Mesostigma, due to analytical problems arising from accelerated sequence evolution in this lineage. Conclusion The Chlorokybus and Mesostigma mtDNAs exemplify the marked fluidity of the mitochondrial genome in charophycean green algae. The notion that the mitochondrial genome was constrained to remain compact during charophycean evolution is no longer tenable. 
Our data raise the possibility that the emergence of land plants was not associated with a substantial gain of intergenic sequences by the mitochondrial genome. PMID:17537252
DOE Office of Scientific and Technical Information (OSTI.GOV)
Qi, Zhi; Pan, Chungen; Lu, Hong
Research highlights: • One recombinant mimetic of the gp41 prehairpin fusion intermediate (PFI), consisting of the gp41 N46 sequence, foldon and IgG Fc and designated N46FdFc, was expressed. • N46FdFc induced antibodies in mice that neutralized HIV-1 infection, inhibited PIE7 binding to PFI, blocked gp41 six-helix bundle formation, and suppressed HIV-1 mediated cell-cell fusion. • These findings provide an important clue for developing recombinant gp41 PFI mimetics-based HIV vaccines. -- Abstract: HIV-1 gp41 prehairpin fusion intermediate (PFI) composed of three N-terminal heptad repeats (NHR) plays a crucial role in viral fusion and entry and represents an attractive target for anti-HIV therapeutics (e.g., enfuvirtide) and vaccines. In the present study, we constructed and expressed two recombinant gp41 PFI mimetics, designated N46Fd and N46FdFc. N46Fd consists of N46 (residues 536-581) in gp41 NHR and foldon (Fd), a trimerization motif. N46FdFc is composed of N46Fd fused with human IgG Fc fragment as an immunoenhancer. We immunized mice with N46 peptide, N46Fd and N46FdFc, respectively, and found that only N46FdFc elicited neutralizing antibody response in mice against infection by HIV-1 strains IIIB (clade B, X4), 92US657 (clade B, R5), and 94UG103 (clade A, X4R5). Anti-N46FdFc antibodies inhibited PIE7 binding to PFI, blocked gp41 six-helix bundle formation, and suppressed HIV-1 mediated cell-cell fusion. These findings provide an important clue for developing recombinant gp41 PFI mimetics-based HIV vaccines.
Molecular characterization and distribution of a 145-bp tandem repeat family in the genus Populus.
Rajagopal, J; Das, S; Khurana, D K; Srivastava, P S; Lakshmikumaran, M
1999-10-01
This report aims to describe the identification and molecular characterization of a 145-bp tandem repeat family that accounts for nearly 1.5% of the Populus genome. Three members of this repeat family were cloned and sequenced from Populus deltoides and P. ciliata. The dimers of the repeat were sequenced in order to confirm the head-to-tail organization of the repeat. Hybridization-based analysis using the 145-bp tandem repeat as a probe on genomic DNA gave rise to ladder patterns which were identified to be a result of methylation and (or) sequence heterogeneity. Analysis of the methylation pattern of the repeat family using methylation-sensitive isoschizomers revealed variable methylation of the C residues and lack of methylation of the A residues. Sequence comparisons between the monomers revealed a high degree of sequence divergence that ranged between 6% and 11% in P. deltoides and between 4.2% and 8.3% in P. ciliata. This indicated the presence of sub-families within the 145-bp tandem family of repeats. Divergence was mainly due to the accumulation of point mutations and was concentrated in the central region of the repeat. The 145-bp tandem repeat family did not show significant homology to known tandem repeats from plants. A short stretch of 36 bp was found to show homology of 66.7% to a centromeric repeat from Chironomus plumosus. Dot-blot analysis and Southern hybridization data revealed the presence of the repeat family in 13 of the 14 Populus species examined. The absence of the 145-bp repeat from P. euphratica suggested that this species is relatively distant from other members of the genus, which correlates with taxonomic classifications. The widespread occurrence of the tandem family in the genus indicated that this family may be of ancient origin.
Wood, Matthew P; Cole, Amy L; Eade, Colleen R; Chen, Li-Mei; Chai, Karl X; Cole, Alexander M
2014-01-01
Several aspects of HIV-1 virulence and pathogenesis are mediated by the envelope protein gp41. Additionally, peptides derived from the gp41 ectodomain have been shown to induce chemotaxis in monocytes and neutrophils. Whereas this chemotactic activity has been reported, it is not known how these peptides could be produced under biological conditions. The heptad repeat 1 (HR1) region of gp41 is exposed to the extracellular environment and could therefore be susceptible to proteolytic processing into smaller peptides. Matriptase is a serine protease expressed at the surface of most epithelia, including the prostate and mucosal surfaces. Here, we present evidence that matriptase efficiently cleaves the HR1 portion of gp41 into a 22-residue chemotactic peptide MAT-1, the sequence of which is highly conserved across HIV-1 clades. We found that MAT-1 induced migration of primary neutrophils and monocytes, the latter of which act as a cellular reservoir of HIV during early stage infection. We then used formyl peptide receptor 1 (FPR1) and FPR2 inhibitors, along with HEK 293 cells, to demonstrate that MAT-1 can induce chemotaxis specifically using FPR2, a receptor found on the surface of monocytes, macrophages and neutrophils. These findings are the first to identify a proteolytic cleavage product of gp41 with chemotactic activity and highlight a potential role for matriptase in HIV-1 transmission and infection at epithelial surfaces and within tissue reservoirs of HIV-1. PMID:24617769
Swiatkowska, Angelika; Kosman, Joanna; Juskowiak, Bernard
2016-01-05
Spectral properties and G-quadruplex folding ability of fluorescent oligonucleotide probes at the cationic dioctadecyldimethylammonium bromide (DODAB) monolayer interface are reported. Two oligonucleotides, a 19-mer bearing thrombin binding aptamer sequence and a 21-mer with human telomeric sequence, were end-labeled with fluorescent groups (FAM and TAMRA) to give FRET probes F19T and F21T, respectively. The probes exhibited abilities to fold into a quadruplex structure and to bind metal cations (Na(+) and K(+)). Fluorescence spectra of G-quadruplex FRET probes at the monolayer interface are reported for the first time. Investigations included film balance measurements (π-A isotherms) and fluorescence spectra recording using a fiber optic accessory interfaced with a spectrofluorimeter. The effect of the presence of DODAB monolayer, metal cations and the surface pressure of monolayer on spectral behavior of FRET probes were examined. Adsorption of probe at the cationic monolayer interface resulted in the FRET signal enhancement even in the absence of metal cations. Variation in the monolayer surface pressure exerted rather modest effect on the spectral properties of probes. The fluorescence energy transfer efficiency of monolayer adsorbed probes increased significantly in the presence of sodium or potassium ion in subphase, which indicated that the probes retained their cation binding properties when adsorbed at the monolayer interface. Copyright © 2015 Elsevier B.V. All rights reserved.
Rayner, Simon; Brignac, Stafford; Bumeister, Ron; Belosludtsev, Yuri; Ward, Travis; Grant, O’dell; O’Brien, Kevin; Evans, Glen A.; Garner, Harold R.
1998-01-01
We have designed and constructed a machine that synthesizes two standard 96-well plates of oligonucleotides in a single run using standard phosphoramidite chemistry. The machine is capable of making a combination of standard, degenerate, or modified oligos in a single plate. The run time is typically 17 hr for two plates of 20-mers and a reaction scale of 40 nm. The reaction vessel is a standard polypropylene 96-well plate with a hole drilled in the bottom of each well. The two plates are placed in separate vacuum chucks and mounted on an xy table. Each well in turn is positioned under the appropriate reagent injection line and the reagent is injected by switching a dedicated valve. All aspects of machine operation are controlled by a Macintosh computer, which also guides the user through the startup and shutdown procedures, provides a continuous update on the status of the run, and facilitates a number of service procedures that need to be carried out periodically. Over 25,000 oligos have been synthesized for use in dye terminator sequencing reactions, polymerase chain reactions (PCRs), hybridization, and RT–PCR. Oligos up to 100 bases in length have been made with a coupling efficiency in excess of 99%. These machines, working in conjunction with our oligo prediction code are particularly well suited to application in automated high throughput genomic sequencing. PMID:9685322
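The quoted coupling efficiency determines how much full-length product a synthesis yields. A back-of-the-envelope Python sketch follows. It assumes the first base is already attached to the support, so an n-mer needs n - 1 couplings, and that every cycle has the same efficiency. Neither assumption is stated in the abstract.

```python
def full_length_fraction(n_bases: int, coupling_efficiency: float) -> float:
    """Fraction of chains that reach full length after n_bases - 1 couplings."""
    if n_bases < 1:
        raise ValueError("n_bases must be at least 1")
    return coupling_efficiency ** (n_bases - 1)

# Compare a typical 20-mer with the longest oligos mentioned (100 bases) at 99%.
for n in (20, 100):
    print(n, round(full_length_fraction(n, 0.99), 3))
```

At exactly 99% per step, about 83% of 20-mer chains but only about 37% of 100-mer chains reach full length, which is why efficiencies in excess of 99% matter for long oligos.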
Rayner, S; Brignac, S; Bumeister, R; Belosludtsev, Y; Ward, T; Grant, O; O'Brien, K; Evans, G A; Garner, H R
1998-07-01
We have designed and constructed a machine that synthesizes two standard 96-well plates of oligonucleotides in a single run using standard phosphoramidite chemistry. The machine is capable of making a combination of standard, degenerate, or modified oligos in a single plate. The run time is typically 17 hr for two plates of 20-mers and a reaction scale of 40 nM. The reaction vessel is a standard polypropylene 96-well plate with a hole drilled in the bottom of each well. The two plates are placed in separate vacuum chucks and mounted on an xy table. Each well in turn is positioned under the appropriate reagent injection line and the reagent is injected by switching a dedicated valve. All aspects of machine operation are controlled by a Macintosh computer, which also guides the user through the startup and shutdown procedures, provides a continuous update on the status of the run, and facilitates a number of service procedures that need to be carried out periodically. Over 25,000 oligos have been synthesized for use in dye terminator sequencing reactions, polymerase chain reactions (PCRs), hybridization, and RT-PCR. Oligos up to 100 bases in length have been made with a coupling efficiency in excess of 99%. These machines, working in conjunction with our oligo prediction code are particularly well suited to application in automated high throughput genomic sequencing.
Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium
2010-01-01
Background In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. Results By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes. PMID:20441586
Roosaare, Märt; Vaher, Mihkel; Kaplinski, Lauris; Möls, Märt; Andreson, Reidar; Lepamets, Maarja; Kõressaar, Triinu; Naaber, Paul; Kõljalg, Siiri; Remm, Maido
2017-01-01
Fast, accurate and high-throughput identification of bacterial isolates is in great demand. The present work was conducted to investigate the possibility of identifying isolates from unassembled next-generation sequencing reads using custom-made guide trees. A tool named StrainSeeker was developed that constructs a list of specific k-mers for each node of any given Newick-format tree and enables the identification of bacterial isolates in 1-2 min. It uses a novel algorithm, which analyses the observed and expected fractions of node-specific k-mers to test the presence of each node in the sample. This allows StrainSeeker to determine where the isolate branches off the guide tree and assign it to a clade whereas other tools assign each read to a reference genome. Using a dataset of 100 Escherichia coli isolates, we demonstrate that StrainSeeker can predict the clades of E. coli with 92% accuracy and correct tree branch assignment with 98% accuracy. Twenty-five thousand Illumina HiSeq reads are sufficient for identification of the strain. StrainSeeker is a software program that identifies bacterial isolates by assigning them to nodes or leaves of a custom-made guide tree. StrainSeeker's web interface and pre-computed guide trees are available at http://bioinfo.ut.ee/strainseeker. Source code is stored at GitHub: https://github.com/bioinfo-ut/StrainSeeker.
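The notion of an "observed fraction of node-specific k-mers" can be illustrated with a few lines of Python. This is a toy sketch and not StrainSeeker's algorithm or API: the function names, the k-mer set and the read are hypothetical, and the statistical comparison against an expected fraction is left out.

```python
def kmers(seq, k):
    """All distinct k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def observed_fraction(node_kmers, reads, k):
    """Share of a node's specific k-mers that occur in at least one read."""
    seen = set()
    for read in reads:
        seen |= kmers(read, k) & node_kmers
    return len(seen) / len(node_kmers)

# Hypothetical node with four specific 4-mers and a single short read.
node = {"ACGT", "CGTA", "TTTT", "GGGG"}
print(observed_fraction(node, ["ACGTA"], k=4))
```

A node whose observed fraction is far below what the sequencing depth would predict is unlikely to be present in the sample, which is the intuition behind walking down the guide tree node by node.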
Designing probe from E6 genome region of human Papillomavirus 16 for sensing applications.
Parmin, Nor Azizah; Hashim, Uda; Gopinath, Subash C B
2018-02-01
More than 100 types of human papillomavirus (HPV) have been reported; among them, genotypes 16, 18, 31 and 45 are high-risk HPVs. Herein, we designed an oligonucleotide probe for the detection of the predominant HPV type 16 in sensing applications. Conserved amino acid sequences within the E6 region of the open reading frame in the HPV genome were used as the basis for designing the oligonucleotide probe to detect cervical cancer. E6 amino acid sequences from the high-risk HPVs were analyzed to check the percentage of similarity and the consensus regions that cause different cancers, including cervical cancer. The Basic Local Alignment Search Tool (BLAST) provided additional statistical parameters, for example expect values (E-values) and bit scores. The probe, 'GGG GTC GGT GGA CCG GTC GAT GTA', was designed with 66.7% GC content. This oligonucleotide probe has a length of 24 mer, a GC content between 40% and 70%, and a melting temperature (Tm) above 50°C. An acceptable probe length was between 22 and 31 mer. The region identified here can be used as a probe and has implications for HPV detection techniques in biosensors, especially for the clinical determination of cervical cancer. Copyright © 2017 Elsevier B.V. All rights reserved.
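The length and GC figures quoted for the probe can be checked directly. The short Python sketch below uses helper names of our own choosing and covers only the length and GC criteria from the abstract. The melting temperature is not computed, because the abstract does not say which Tm formula was used.

```python
def gc_content(seq: str) -> float:
    """GC content of a nucleotide sequence, in percent (spaces ignored)."""
    bases = seq.replace(" ", "").upper()
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)

def meets_design_rules(seq: str) -> bool:
    """Length 22-31 nt and GC content 40-70%, as stated in the abstract."""
    length = len(seq.replace(" ", ""))
    return 22 <= length <= 31 and 40.0 <= gc_content(seq) <= 70.0

probe = "GGG GTC GGT GGA CCG GTC GAT GTA"
print(len(probe.replace(" ", "")), round(gc_content(probe), 1), meets_design_rules(probe))
```

The probe has 16 G or C bases out of 24, which gives the 66.7% reported above.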
Schwessinger, Ron; Suciu, Maria C; McGowan, Simon J; Telenius, Jelena; Taylor, Stephen; Higgs, Doug R; Hughes, Jim R
2017-10-01
In the era of genome-wide association studies (GWAS) and personalized medicine, predicting the impact of single nucleotide polymorphisms (SNPs) in regulatory elements is an important goal. Current approaches to determine the potential of regulatory SNPs depend on inadequate knowledge of cell-specific DNA binding motifs. Here, we present Sasquatch, a new computational approach that uses DNase footprint data to estimate and visualize the effects of noncoding variants on transcription factor binding. Sasquatch performs a comprehensive k-mer-based analysis of DNase footprints to determine any k-mer's potential for protein binding in a specific cell type and how this may be changed by sequence variants. Therefore, Sasquatch uses an unbiased approach, independent of known transcription factor binding sites and motifs. Sasquatch only requires a single DNase-seq data set per cell type, from any genotype, and produces consistent predictions from data generated by different experimental procedures and at different sequence depths. Here we demonstrate the effectiveness of Sasquatch using previously validated functional SNPs and benchmark its performance against existing approaches. Sasquatch is available as a versatile webtool incorporating publicly available data, including the human ENCODE collection. Thus, Sasquatch provides a powerful tool and repository for prioritizing likely regulatory SNPs in the noncoding genome. © 2017 Schwessinger et al.; Published by Cold Spring Harbor Laboratory Press.
Ground Contact Model for Mars Science Laboratory Mission Simulations
NASA Technical Reports Server (NTRS)
Raiszadeh, Behzad; Way, David
2012-01-01
The Program to Optimize Simulated Trajectories II (POST 2) has been successful in simulating the flight of launch vehicles and entry bodies on Earth and other planets. POST 2 has been the primary simulation tool for the Entry, Descent, and Landing (EDL) phase of numerous Mars lander missions such as Mars Pathfinder in 1997, the twin Mars Exploration Rovers (MER-A and MER-B) in 2004, Mars Phoenix lander in 2007, and it is now the main trajectory simulation tool for Mars Science Laboratory (MSL) in 2012. In all previous missions, the POST 2 simulation ended before ground impact, and a tool other than POST 2 simulated landing dynamics. It would be ideal for one tool to simulate the entire EDL sequence, thus avoiding errors that could be introduced by handing off position, velocity, or other flight parameters from one simulation to the other. The desire to have one continuous end-to-end simulation was the motivation for developing the ground interaction model in POST 2. Rover landing, including the detection of the postlanding state, is a very critical part of the MSL mission, as the EDL landing sequence continues for a few seconds after landing. The method explained in this paper illustrates how a simple ground force interaction model has been added to POST 2, which allows simulation of the entire EDL from atmospheric entry through touchdown.
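The abstract does not specify the form of the ground force interaction model. One common way to add ground contact to a trajectory simulation is a one-dimensional spring-damper (penalty) force, sketched below in Python purely as an illustration. The stiffness, damping, mass, and initial conditions are assumed values, and none of the names correspond to POST 2.

```python
MARS_GRAVITY = 3.71  # m/s^2, surface gravity of Mars


def ground_force(height, velocity, stiffness=5.0e4, damping=1.0e4):
    """Upward normal force (N) from an assumed spring-damper ground at height 0.

    The force is zero above the surface and is clamped at zero so that
    the ground can push on the vehicle but never pull it down.
    """
    penetration = -height
    if penetration <= 0.0:
        return 0.0
    return max(stiffness * penetration - damping * velocity, 0.0)


def simulate_touchdown(mass=900.0, height=1.0, velocity=-0.75,
                       dt=1.0e-3, t_end=10.0):
    """Integrate vertical motion through contact with semi-implicit Euler.

    All default values are illustrative assumptions, not mission data.
    """
    for _ in range(int(round(t_end / dt))):
        accel = -MARS_GRAVITY + ground_force(height, velocity) / mass
        velocity += accel * dt
        height += velocity * dt
    return height, velocity


if __name__ == "__main__":
    final_height, final_velocity = simulate_touchdown()
    print(final_height, final_velocity)
```

In this sketch the vehicle comes to rest at a small negative height where the spring force balances its weight, and a near-zero velocity could serve as a simple test for the post-landing state.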
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.
Benslimane, A A; Dron, M; Hartmann, C; Rode, A
1986-01-01
Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammalians. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. PMID:3774553
Two tandemly repeated telomere-associated sequences in Nicotiana plumbaginifolia.
Chen, C M; Wang, C T; Wang, C J; Ho, C H; Kao, Y Y; Chen, C C
1997-12-01
Two tandemly repeated telomere-associated sequences, NP3R and NP4R, have been isolated from Nicotiana plumbaginifolia. The length of a repeating unit for NP3R and NP4R is 165 and 180 nucleotides respectively. The abundance of NP3R, NP4R and telomeric repeats is, respectively, 8.4 x 10(4), 6 x 10(3) and 1.5 x 10(6) copies per haploid genome of N. plumbaginifolia. Fluorescence in situ hybridization revealed that NP3R is located at the ends and/or in interstitial regions of all 10 chromosomes and NP4R on the terminal regions of three chromosomes in the haploid genome of N. plumbaginifolia. Sequence homology search revealed that not only are NP3R and NP4R homologous to HRS60 and GRS, respectively, two tandem repeats isolated from N. tabacum, but that NP3R and NP4R are also related to each other, suggesting that they originated from a common ancestral sequence. The role of these repeated sequences in chromosome healing is discussed based on the observation that two to three copies of a telomere-similar sequence were present in each repeating unit of NP3R and NP4R.
Memish, Ziad A; Alsahly, Ahmad; Masri, Malak al; Heil, Gary L; Anderson, Benjamin D; Peiris, Malik; Khan, Salah Uddin; Gray, Gregory C
2015-01-01
Middle East respiratory syndrome coronavirus (MERS-CoV) is an emerging viral pathogen that primarily causes respiratory illness. We conducted a seroprevalence study of banked human serum samples collected in 2012 from Southern Saudi Arabia. Sera from 300 animal workers (17% with daily camel exposure) and 50 non-animal-exposed controls were examined for serological evidence of MERS-CoV infection by a pseudoparticle MERS-CoV spike protein neutralization assay. None of the sera reproducibly neutralized the MERS-CoV-pseudotyped lentiviral vector. These data suggest that serological evidence of zoonotic transmission of MERS-CoV was not common among animal workers in Southern Saudi Arabia during July 2012. PMID:25470665
Development of Pineapple Microsatellite Markers and Germplasm Genetic Diversity Analysis
Tong, Helin; Chen, You; Wang, Jingyi; Chen, Yeyuan; Sun, Guangming; He, Junhu; Wu, Yaoting
2013-01-01
Two methods were used to develop pineapple microsatellite markers. Genomic library-based SSR development: using selectively amplified microsatellite assay, 86 sequences were generated from pineapple genomic library. 91 (96.8%) of the 94 Simple Sequence Repeat (SSR) loci were dinucleotide repeats (39 AC/GT repeats and 52 GA/TC repeats, accounting for 42.9% and 57.1%, resp.), and the other three were mononucleotide repeats. Thirty-six pairs of SSR primers were designed; 24 of them generated clear bands of expected sizes, and 13 of them showed polymorphism. EST-based SSR development: 5659 pineapple EST sequences obtained from NCBI were analyzed; among 1397 nonredundant EST sequences, 843 were found containing 1110 SSR loci (217 of them contained more than one SSR locus). Frequency of SSRs in pineapple EST sequences is 1SSR/3.73 kb, and 44 types were found. Mononucleotide, dinucleotide, and trinucleotide repeats dominate, accounting for 95.6% in total. AG/CT and AGC/GCT were the dominant type of dinucleotide and trinucleotide repeats, accounting for 83.5% and 24.1%, respectively. Thirty pairs of primers were designed for each of randomly selected 30 sequences; 26 of them generated clear and reproducible bands, and 22 of them showed polymorphism. Eighteen pairs of primers obtained by the one or the other of the two methods above that showed polymorphism were selected to carry out germplasm genetic diversity analysis for 48 breeds of pineapple; similarity coefficients of these breeds were between 0.59 and 1.00, and they can be divided into four groups accordingly. Amplification products of five SSR markers were extracted and sequenced, corresponding repeat loci were found and locus mutations are mainly in copy number of repeats and base mutations in the flanking region. PMID:24024187
A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.
Kogi, M; Fukushige, S; Lefevre, C; Hadano, S; Ikeda, J E
1997-06-01
In an effort to analyze the genomic region of the distal half of human chromosome 4p, to where Huntington disease and other diseases have been mapped, we have isolated the cosmid clone (CRS447) that was likely to contain a region with specific repeat sequences. Clone CRS447 was subjected to detailed analysis, including chromosome mapping, restriction mapping, and DNA sequencing. Chromosome mapping by both a human-CHO hybrid cell panel and FISH revealed that CRS447 was predominantly located in the 4p15.1-15.3 region. CRS447 was shown to consist of tandem repeats of 4.7-kb units present on chromosome 4p. A single EcoRI unit was subcloned (pRS447), and the complete sequence was determined as 4752 nucleotides. When pRS447 was used as a probe, the number of copies of this repeat per haploid genome was estimated to be 50-70. Sequence analysis revealed that it contained two internal CA repeats and one putative ORF. Database search established that this sequence was unreported. However, two homologous STS markers were found in the database. We concluded that CRS447/pRS447 is a novel tandem repeat sequence that is mainly specific to human chromosome 4p.
NASA Technical Reports Server (NTRS)
Kawamura, K.; Ferris, J. P.
1994-01-01
The rate constants for the condensation reaction of the 5'-phosphorimidazolide of adenosine (ImpA) to form dinucleotides and oligonucleotides have been measured in the presence of Na(+)-volclay (a Na(+)-montmorillonite) in pH 8 aqueous solution at 25 degrees C. The rates of the reaction of ImpA with an excess of adenosine 5'-monophosphoramidate (NH2pA), P1,P2-diadenosine 5',5'-pyrophosphate (A5'ppA), or adenosine 5'-monophosphate (5'-AMP or pA) in the presence of the montmorillonite to form NH2pA3'pA, A5'ppA3'pA, and pA3'pA, respectively, were measured. Only 3',5'-linked products were observed. The magnitudes of the rate constants decrease in the order NH2pA3'pA > A5'ppA3'pA > pA3'pA. The binding of ImpA to montmorillonite was measured, and the adsorption isotherm was determined. The binding of ImpA to montmorillonite and the formation of higher oligonucleotides are not observed in the absence of salts. Mg2+ enhances binding and oligonucleotide formation more than Ca2+ and Na+. The rate constants for the oligonucleotide formation were determined from the reaction products formed from 10 to 40 mM ImpA in the presence of Na(+)-montmorillonite using the computer program SIMFIT. The magnitudes of the rate constants for the formation of oligonucleotides increased in the order 2-mer < 3-mer < 4-mer ... 7-mer. The rate constants for dinucleotide and trinucleotide formation are more than 1000 times larger than those measured in the absence of montmorillonite. The rate constants for the formation of dinucleotide, trinucleotide, and tetranucleotide are 41, 2.6, and 3.7 times larger than those for the formation of oligo(G)s with a poly(C) template. The hydrolysis of ImpA was accelerated 35 times in the presence of the montmorillonite. The catalytic ability of montmorillonite to form dinucleotides and oligonucleotides is quantitatively evaluated and possible pathways for oligo(A) formation are proposed.
Zhang, Emma Xuxiao; Oh, Olivia Seen Huey; See, Wanhan; Raj, Pream; James, Lyn; Khan, Kamran
2016-01-01
Objective: To assess the public health risk to Singapore posed by the Middle East respiratory syndrome (MERS) outbreak in the Republic of Korea in 2015. Methods: The likelihood of importation of MERS cases and the magnitude of the public health impact in Singapore were assessed to determine overall risk. Literature on the epidemiology and contextual factors associated with MERS coronavirus infection was collected and reviewed. Connectivity between the Republic of Korea and Singapore was analysed. Public health measures implemented by the two countries were reviewed. Results: The epidemiology of the 2015 MERS outbreak in the Republic of Korea remained similar to the MERS outbreaks in Saudi Arabia. In addition, strong infection control and response measures were effective in controlling the outbreak. In view of the air traffic between Singapore and MERS-affected areas, importation of MERS cases into Singapore is possible. Nonetheless, the risk of a serious public health impact to Singapore in the event of an imported case of MERS would be mitigated by its strong health-care system and established infection control practices. Discussion: The MERS outbreak was sparked by an exported case from the Middle East, which remains a concern as the reservoir of infection (thought to be camels) continues to exist in the Middle East, and sporadic cases in the community and outbreaks in health-care settings continue to occur there. This risk assessment highlights the need for Singapore to stay vigilant and to continue enhancing core public health capacities to detect and respond to MERS coronavirus. PMID:27508087
Zhang, Emma Xuxiao; Oh, Olivia Seen Huey; See, Wanhan; Raj, Pream; James, Lyn; Khan, Kamran; Tey, Jeannie Su Hui
2016-01-01
To assess the public health risk to Singapore posed by the Middle East respiratory syndrome (MERS) outbreak in the Republic of Korea in 2015. The likelihood of importation of MERS cases and the magnitude of the public health impact in Singapore were assessed to determine overall risk. Literature on the epidemiology and contextual factors associated with MERS coronavirus infection was collected and reviewed. Connectivity between the Republic of Korea and Singapore was analysed. Public health measures implemented by the two countries were reviewed. The epidemiology of the 2015 MERS outbreak in the Republic of Korea remained similar to the MERS outbreaks in Saudi Arabia. In addition, strong infection control and response measures were effective in controlling the outbreak. In view of the air traffic between Singapore and MERS-affected areas, importation of MERS cases into Singapore is possible. Nonetheless, the risk of a serious public health impact to Singapore in the event of an imported case of MERS would be mitigated by its strong health-care system and established infection control practices. The MERS outbreak was sparked by an exported case from the Middle East, which remains a concern as the reservoir of infection (thought to be camels) continues to exist in the Middle East, and sporadic cases in the community and outbreaks in health-care settings continue to occur there. This risk assessment highlights the need for Singapore to stay vigilant and to continue enhancing core public health capacities to detect and respond to MERS coronavirus.
Fukushi, Shuetsu; Fukuma, Aiko; Kurosu, Takeshi; Watanabe, Shumpei; Shimojima, Masayuki; Shirato, Kazuya; Iwata-Yoshikawa, Naoko; Nagata, Noriyo; Ohnishi, Kazuo; Ato, Manabu; Melaku, Simenew Keskes; Sentsui, Hiroshi; Saijo, Masayuki
2018-01-01
Since discovering the Middle East respiratory syndrome coronavirus (MERS-CoV) as a causative agent of severe respiratory illness in the Middle East in 2012, serological testing has been conducted to assess antibody responses in patients and to investigate the zoonotic reservoir of the virus. Although the virus neutralization test is the gold standard assay for MERS diagnosis and for investigating the zoonotic reservoir, it uses live virus and so must be performed in high containment laboratories. Competitive ELISA (cELISA), in which a labeled monoclonal antibody (MAb) competes with test serum antibodies for target epitopes, may be a suitable alternative because it detects antibodies in a species-independent manner. In this study, novel MAbs against the spike protein of MERS-CoV were produced and characterized. One of these MAbs was used to develop a cELISA. The cELISA detected MERS-CoV-specific antibodies in sera from MERS-CoV-infected rats and rabbits immunized with the spike protein of MERS-CoV. The MAb-based cELISA was validated using sera from Ethiopian dromedary camels. Relative to the neutralization test, the cELISA detected MERS-CoV-specific antibodies in 66 Ethiopian dromedary camels with a sensitivity and specificity of 98% and 100%, respectively. The cELISA and neutralization test results correlated well (Pearson's correlation coefficients=0.71-0.76, depending on the cELISA serum dilution). This cELISA may be useful for MERS epidemiological investigations on MERS-CoV infection. Copyright © 2017 Elsevier B.V. All rights reserved.
Restructuring the vocal fold lamina propria with endoscopic microdissection.
Bartlett, Rebecca S; Hoffman, Henry T; Dailey, Seth H; Bock, Jonathan M; Klemuk, Sarah A; Askeland, Ryan W; Ahlrichs-Hanson, Jan S; Heaford, Andrew C; Thibeault, Susan L
2013-11-01
The purposes of this preclinical study were to investigate histologic and rheologic outcomes of Microendoscopy of Reinke's space (MERS)-guided minithyrotomy and to assess its instrumentation. Human cadaveric and in vivo animal study. Three human cadaveric larynges were treated with MERS-guided placement of Radiesse VoiceGel and immediately evaluated histologically for biomaterial location. In the second part of this investigation, two scarred porcine larynges were treated with MERS-guided placement of HyStem-VF and rheologically evaluated 6 weeks later. Student t tests determined differences in viscoelastic properties of treated/untreated vocal folds. Sialendoscopes and microendoscopes were subjectively compared for their visualization capacity. MERS imaged the subepithelial area and vocal ligament, guiding both tissue dissection and biomaterial positioning. Sialendoscopes provided adequate visualization and featured incorporated working channels. Enhanced image clarity was created in a gas-filled rather than saline-filled environment, per rater judgment. Histological analysis revealed desirable biomaterial positioning with MERS. Per rheological analysis, viscoelastic properties of the MERS-treated porcine vocal folds compared to uninjured vocal folds 6 weeks following treatment did not statistically differ. MERS-guided laryngoplasty using sialendoscopes yielded satisfactory biomaterial positioning in the short-term and normalized rheologic tissue properties in the long-term, contributing to proof of concept for MERS in the treatment of scarring. Strengths of MERS include direct, real-time visualization of Reinke's space and an ability to manipulate surgical instruments parallel to the vocal fold edge while maintaining an intact epithelium. Future work will explore the clinical utility of MERS for addressing scarring, sulcus vocalis, and other intracordal processes. Copyright © 2013 The American Laryngological, Rhinological and Otological Society, Inc.
Taking aim at Mer and Axl receptor tyrosine kinases as novel therapeutic targets in solid tumors
Linger, Rachel M.A.; Keating, Amy K.; Earp, H. Shelton
2010-01-01
Importance of the field: Axl and/or Mer expression correlates with poor prognosis in several cancers. Until recently, the specific role of these receptor tyrosine kinases (RTKs) in the development and progression of cancer remained unexplained. Studies demonstrating that Axl and Mer contribute to mechanisms of cell survival, migration, invasion, metastasis, and chemosensitivity justify further investigation of Axl and Mer as novel therapeutic targets in cancer. Areas covered in this review: Axl and Mer signaling pathways in cancer cells are summarized and evidence validating these RTKs as therapeutic targets in glioblastoma multiforme, non-small cell lung cancer, and breast cancer is examined. A comprehensive discussion of Axl and/or Mer inhibitors in development is also provided. What the reader will gain: Potential toxicities associated with Axl or Mer inhibition are addressed. We hypothesize that the probable action of Mer and Axl inhibitors on cells within the tumor microenvironment will provide a unique therapeutic opportunity to target both tumor cells and the stromal components which facilitate disease progression. Take home message: Axl and Mer mediate multiple oncogenic phenotypes and activation of these RTKs constitutes a mechanism of chemoresistance in a variety of solid tumors. Targeted inhibition of these RTKs may be effective as anti-tumor and/or anti-metastatic therapy, particularly if combined with standard cytotoxic therapies. PMID:20809868
Pedagogical Affordances of Multiple External Representations in Scientific Processes
NASA Astrophysics Data System (ADS)
Wu, Hsin-Kai; Puntambekar, Sadhana
2012-12-01
Multiple external representations (MERs) have been widely used in science teaching and learning. Theories such as dual coding theory and cognitive flexibility theory have been developed to explain why the use of MERs is beneficial to learning, but they do not provide much information on pedagogical issues such as how and in what conditions MERs could be introduced and used to support students' engagement in scientific processes and develop competent scientific practices (e.g., asking questions, planning investigations, and analyzing data). Additionally, little is understood about complex interactions among scientific processes and affordances of MERs. Therefore, this article focuses on pedagogical affordances of MERs in learning environments that engage students in various scientific processes. By reviewing literature in science education and cognitive psychology and integrating multiple perspectives, this article aims at exploring (1) how MERs can be integrated with science processes due to their different affordances, and (2) how student learning with MERs can be scaffolded, especially in a classroom situation. We argue that pairing representations and scientific processes in a principled way based on the affordances of the representations and the goals of the activities is a powerful way to use MERs in science education. Finally, we outline types of scaffolding that could help effective use of MERs including dynamic linking, model progression, support in instructional materials, teacher support, and active engagement.
Middle East Respiratory Syndrome (MERS)
Middle East Respiratory Syndrome Coronavirus; MERS-CoV; Novel coronavirus; nCoV ... for Disease Control and Prevention website. Middle East Respiratory Syndrome (MERS): Frequently asked questions and answers. www. ...
Deciphering MERS-CoV Evolution in Dromedary Camels.
Du, Lin; Han, Guan-Zhu
2016-02-01
The emergence of the Middle East respiratory syndrome coronavirus (MERS-CoV) poses a potential threat to global public health. Many aspects of the evolution and transmission of MERS-CoV in its animal reservoir remain unclear. A recent study provides new insights into the evolution and transmission of MERS-CoV in dromedary camels. Copyright © 2015 Elsevier Ltd. All rights reserved.
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.
Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario
2011-01-01
Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
2003-04-30
KENNEDY SPACE CENTER, FLA. - An overhead crane lifts the Mars Exploration Rover 2 (MER-2) entry vehicle from its stand to move it to a spin table for a dry-spin test. The MER Mission consists of two identical rovers designed to cover roughly 110 yards each Martian day over various terrain. Each rover will carry five scientific instruments that will allow it to search for evidence of liquid water that may have been present in the planet's past. Identical to each other, the rovers will land at different regions of Mars. Launch for MER-2 (MER-A) is scheduled for June 5.
2003-04-30
KENNEDY SPACE CENTER, FLA. - With help from workers, the overhead crane lowers the Mars Exploration Rover 2 (MER-2) entry vehicle onto a spin table for a dry-spin test. The MER Mission consists of two identical rovers designed to cover roughly 110 yards each Martian day over various terrain. Each rover will carry five scientific instruments that will allow it to search for evidence of liquid water that may have been present in the planet's past. Identical to each other, the rovers will land at different regions of Mars. Launch for MER-2 (MER-A) is scheduled for June 5.
2003-04-30
KENNEDY SPACE CENTER, FLA. - An overhead crane moves the Mars Exploration Rover 2 (MER-2) entry vehicle across the Payload Hazardous Servicing Facility toward a spin table for a dry-spin test. The MER Mission consists of two identical rovers designed to cover roughly 110 yards each Martian day over various terrain. Each rover will carry five scientific instruments that will allow it to search for evidence of liquid water that may have been present in the planet's past. Identical to each other, the rovers will land at different regions of Mars. Launch for MER-2 (MER-A) is scheduled for June 5.
2003-04-30
KENNEDY SPACE CENTER, FLA. - Workers in the Payload Hazardous Servicing Facility help guide the Mars Exploration Rover 2 (MER-2) entry vehicle toward a spin table for a dry-spin test. The MER Mission consists of two identical rovers designed to cover roughly 110 yards each Martian day over various terrain. Each rover will carry five scientific instruments that will allow it to search for evidence of liquid water that may have been present in the planet's past. Identical to each other, the rovers will land at different regions of Mars. Launch for MER-2 (MER-A) is scheduled for June 5.
2003-04-30
KENNEDY SPACE CENTER, FLA. - An overhead crane is in place to lift the Mars Exploration Rover 2 (MER-2) entry vehicle to move it to a spin table for a dry-spin test. The MER Mission consists of two identical rovers designed to cover roughly 110 yards each Martian day over various terrain. Each rover will carry five scientific instruments that will allow it to search for evidence of liquid water that may have been present in the planet's past. Identical to each other, the rovers will land at different regions of Mars. Launch for MER-2 (MER-A) is scheduled for June 5.
Survey and Analysis of Microsatellites in the Silkworm, Bombyx mori
Prasad, M. Dharma; Muthulakshmi, M.; Madhu, M.; Archak, Sunil; Mita, K.; Nagaraju, J.
2005-01-01
We studied microsatellite frequency and distribution in 21.76-Mb random genomic sequences, 0.67-Mb BAC sequences from the Z chromosome, and 6.3-Mb EST sequences of Bombyx mori. We mined microsatellites of ≥15 bases of mononucleotide repeats and ≥5 repeat units of other classes of repeats. We estimated that microsatellites account for 0.31% of the genome of B. mori. Microsatellite tracts of A, AT, and ATT were the most abundant whereas their number drastically decreased as the length of the repeat motif increased. In general, tri- and hexanucleotide repeats were overrepresented in the transcribed sequences except TAA, GTA, and TGA, which were in excess in genomic sequences. The Z chromosome sequences contained shorter repeat types than the rest of the chromosomes in addition to a higher abundance of AT-rich repeats. Our results showed that base composition of the flanking sequence has an influence on the origin and evolution of microsatellites. Transitions/transversions were high in microsatellites of ESTs, whereas the genomic sequence had an equal number of substitutions and indels. The average heterozygosity value for 23 polymorphic microsatellite loci surveyed in 13 diverse silkmoth strains having 2–14 alleles was 0.54. Only 36 (18.2%) of 198 microsatellite loci were polymorphic between the two divergent silkworm populations and 10 (5%) loci revealed null alleles. The microsatellite map generated using these polymorphic markers resulted in 8 linkage groups. B. mori microsatellite loci were the most conserved in its immediate ancestor, B. mandarina, followed by the wild saturniid silkmoth, Antheraea assama. PMID:15371363
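The mining criteria stated above (mononucleotide tracts of at least 15 bases, and at least 5 repeat units for other repeat classes) can be expressed as a small regular-expression scan. The Python sketch below is an assumption-laden illustration, not the authors' pipeline: the maximum motif length of 6 and the function name are hypothetical choices.

```python
import re


def find_microsatellites(seq, max_motif=6, min_mono_bases=15, min_units=5):
    """Return (start, motif, repeat_units) for perfect tandem repeats.

    Thresholds follow the abstract; max_motif=6 is an assumed upper
    limit on motif length.
    """
    seq = seq.upper()
    hits = []
    for n in range(1, max_motif + 1):
        min_rep = min_mono_bases if n == 1 else min_units
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (n, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each tract is reported once.
            if any(motif == motif[:d] * (n // d)
                   for d in range(1, n) if n % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // n))
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "A" * 15 + "C" + "AT" * 5 + "G" + "A" * 10 + "C" + "ATT" * 4
    print(find_microsatellites(example))
```

In the toy sequence, the 15-base poly(A) tract and the five-unit AT tract are reported, while the 10-base poly(A) tract and the four-unit ATT tract fall below the thresholds.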
SWAP-Assembler 2: Optimization of De Novo Genome Assembler at Large Scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meng, Jintao; Seo, Sangmin; Balaji, Pavan
2016-08-16
In this paper, we analyze and optimize the most time-consuming steps of the SWAP-Assembler, a parallel genome assembler, so that it can scale to a large number of cores for huge genomes with the size of sequencing data ranging from terabytes to petabytes. According to the performance analysis results, the most time-consuming steps are input parallelization, k-mer graph construction, and graph simplification (edge merging). For the input parallelization, the input data is divided into virtual fragments with nearly equal size, and the start position and end position of each fragment are automatically separated at the beginning of the reads. In k-mer graph construction, in order to improve the communication efficiency, the message size is kept constant between any two processes by proportionally increasing the number of nucleotides to the number of processes in the input parallelization step for each round. The memory usage is also decreased because only a small part of the input data is processed in each round. With graph simplification, the communication protocol reduces the number of communication loops from four to two loops and decreases the idle communication time. The optimized assembler is denoted as SWAP-Assembler 2 (SWAP2). In our experiments using a 1000 Genomes project dataset of 4 terabytes (the largest dataset ever used for assembling) on the supercomputer Mira, the results show that SWAP2 scales to 131,072 cores with an efficiency of 40%. We also compared our work with both the HipMer assembler and the SWAP-Assembler. On the Yanhuang dataset of 300 gigabytes, SWAP2 shows a 3X speedup and 4X better scalability compared with the HipMer assembler and is 45 times faster than the SWAP-Assembler. The SWAP2 software is available at https://sourceforge.net/projects/swapassembler.
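The input parallelization step, in which the data is cut into nearly equal fragments whose boundaries fall at the beginning of reads, can be sketched as follows. This Python example is an illustration under stated assumptions rather than SWAP2's implementation: it assumes FASTA-formatted input held in memory (each read starting with `>` at the beginning of a line), and `fragment_bounds` is a hypothetical name.

```python
def fragment_bounds(data: bytes, n_fragments: int):
    """Split FASTA bytes into n_fragments (start, end) ranges of nearly
    equal size, moving each cut forward to the next read header."""
    size = len(data)
    starts = [0]
    for i in range(1, n_fragments):
        target = i * size // n_fragments
        # Look for the next header that begins a line at or after target.
        pos = data.find(b"\n>", max(target - 1, 0))
        start = size if pos == -1 else pos + 1
        starts.append(max(start, starts[-1]))
    ends = starts[1:] + [size]
    return list(zip(starts, ends))


if __name__ == "__main__":
    reads = b">r1\nACGT\n>r2\nGGCC\n>r3\nTTAA\n>r4\nCCGG\n"
    for start, end in fragment_bounds(reads, 2):
        print(start, end, reads[start:end])
```

Because each process can compute its own boundaries from the file size alone and then scan forward a short distance, a scheme of this kind needs no coordination between processes to assign reads.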
Antibodies against MERS coronavirus in dromedary camels, United Arab Emirates, 2003 and 2013.
Meyer, Benjamin; Müller, Marcel A; Corman, Victor M; Reusken, Chantal B E M; Ritz, Daniel; Godeke, Gert-Jan; Lattwein, Erik; Kallies, Stephan; Siemens, Artem; van Beek, Janko; Drexler, Jan F; Muth, Doreen; Bosch, Berend-Jan; Wernery, Ulrich; Koopmans, Marion P G; Wernery, Renate; Drosten, Christian
2014-04-01
Middle East respiratory syndrome coronavirus (MERS-CoV) has caused an ongoing outbreak of severe acute respiratory tract infection in humans in the Arabian Peninsula since 2012. Dromedary camels have been implicated as possible viral reservoirs. We used serologic assays to analyze 651 dromedary camel serum samples from the United Arab Emirates; 151 of 651 samples were obtained in 2003, well before onset of the current epidemic, and 500 serum samples were obtained in 2013. Recombinant spike protein-specific immunofluorescence and virus neutralization tests enabled clear discrimination between MERS-CoV and bovine CoV infections. Most (632/651, 97.1%) camels had antibodies against MERS-CoV. This result included all 151 serum samples obtained in 2003. Most (389/651, 59.8%) serum samples had MERS-CoV-neutralizing antibody titers >1,280. Dromedary camels from the United Arab Emirates were infected at high rates with MERS-CoV or a closely related, probably conspecific, virus long before the first human MERS cases.
Thermal Protection System Mass Estimating Relationships For Blunt-Body, Earth Entry Spacecraft
NASA Technical Reports Server (NTRS)
Sepka, Steven A.; Samareh, Jamshid A.
2015-01-01
Mass estimating relationships (MERs) are developed to predict the amount of thermal protection system (TPS) necessary for safe Earth entry for blunt-body spacecraft using simple correlations that are non-ITAR and closely match estimates from NASA's high-fidelity ablation modeling tool, the Fully Implicit Ablation and Thermal Analysis Program (FIAT). These MERs provide a first-order estimate for rapid feasibility studies. There are 840 different trajectories considered in this study, and each TPS MER has a peak heating limit. MERs for the vehicle forebody include the ablators Phenolic Impregnated Carbon Ablator (PICA) and Carbon Phenolic atop Advanced Carbon-Carbon. For the aftbody, the materials are Silicone Impregnated Reusable Ceramic Ablator (SIRCA), Acusil II, SLA-561V, and LI-900. The MERs are accurate to within 14% (at one standard deviation) of FIAT prediction, and the most any MER can underpredict FIAT TPS thickness is 18.7%. This work focuses on the development of these MERs, the resulting equations, model limitations, and model accuracy.
Carbo-biphenyls and Carbo-terphenyls: Oligo(phenylene ethynylene) Ring Carbo-mers.
Zhu, Chongwei; Poater, Albert; Duhayon, Carine; Kauffmann, Brice; Saquet, Alix; Maraval, Valérie; Chauvin, Remi
2018-05-14
Ring carbo-mers of oligo(phenylene ethynylene)s (OPEn, n=0-2), made of C2-catenated C18 carbo-benzene rings, have been synthesized and characterized by NMR and UV-vis spectroscopy, crystallography and voltammetry. Analyses of crystal and DFT-optimized structures show that the C18 rings preserve their individual aromatic character according to structural and magnetic criteria (NICS indices). Carbo-terphenyls (n=2) are reversibly reduced at ca. -0.42 V/SCE, i.e. 0.41 V more readily than the corresponding carbo-benzene (-0.83 V/SCE), thus revealing efficient inter-ring π-conjugation. An accurate linear fit of E1/2(red1) vs. the DFT LUMO energy suggests a notably higher value (-0.30 V/SCE) for a carbo-quaterphenyl congener (n=3). Increase with n of the effective π-conjugation is also evidenced by a red shift of two of the three main visible light absorption bands, all being assigned to TDDFT-calculated excited states, one of them restricting to a HOMO→LUMO main one-electron transition. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Transcriptome Analysis and Development of SSR Molecular Markers in Glycyrrhiza uralensis Fisch.
Liu, Yaling; Zhang, Pengfei; Song, Meiling; Hou, Junling; Qing, Mei; Wang, Wenquan; Liu, Chunsheng
2015-01-01
Licorice is an important traditional Chinese medicine with clinical and industrial applications. Genetic resources of licorice are insufficient for analysis of molecular biology and genetic functions; as such, transcriptome sequencing must be conducted for functional characterization and development of molecular markers. In this study, transcriptome sequencing on the Illumina HiSeq 2500 sequencing platform generated a total of 5.41 Gb clean data. De novo assembly yielded a total of 46,641 unigenes. Comparison analysis using BLAST showed that the annotations of 29,614 unigenes were conserved. Further study revealed 773 genes related to biosynthesis of secondary metabolites of licorice, 40 genes involved in biosynthesis of the terpenoid backbone, and 16 genes associated with biosynthesis of glycyrrhizic acid. Analysis of unigenes larger than 1 Kb with a length of 11,702 nt presented 7,032 simple sequence repeats (SSR). Sixty-four of 69 randomly designed and synthesized SSR pairs were successfully amplified, and 33 pairs of primers were polymorphic in Glycyrrhiza uralensis Fisch., Glycyrrhiza inflata Bat., Glycyrrhiza glabra L. and Glycyrrhiza pallidiflora Maxim. This study not only presents the molecular biology data of licorice but also provides a basis for genetic diversity research and molecular marker-assisted breeding of licorice. PMID:26571372
NASA Astrophysics Data System (ADS)
Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.
2015-12-01
Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms exceed cc≥95% and coh≥95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1 1985 earthquake shows an almost complete absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor to pinpoint areas where large megathrust earthquakes may nucleate and consequently to improve the seismic hazard assessment.
Middle East respiratory syndrome: obstacles and prospects for vaccine development
Papaneri, Amy B.; Johnson, Reed F.; Wada, Jiro; Bollinger, Laura; Jahrling, Peter B.; Kuhn, Jens H.
2016-01-01
The recent emergence of Middle East respiratory syndrome (MERS) highlights the need to engineer new methods for expediting vaccine development against emerging diseases. However, several obstacles prevent pursuit of a licensable MERS vaccine. First, the lack of a suitable animal model for MERS complicates in vivo testing of candidate vaccines. Second, due to the low number of MERS cases, pharmaceutical companies have little incentive to pursue MERS vaccine production as the costs of clinical trials are high. In addition, the timeline from bench research to approved vaccine use is 10 years or longer. Using novel methods and cost-saving strategies, genetically engineered vaccines can be produced quickly and cost-effectively. Along with progress in MERS animal model development, these obstacles can be circumvented or at least mitigated. PMID:25864502
Wang, Lei; Zhou, Mei; McClelland, Ann; Reilly, Aislinn; Chen, Tianbao; Gagliardo, Ron; Walker, Brian; Shaw, Chris
2008-10-01
By integrating systematic peptidome and transcriptome studies of the defensive skin secretion of the Central American red-eyed leaf frog, Agalychnis callidryas, we have identified novel members of three previously described antimicrobial peptide families, a 27-mer dermaseptin-related peptide (designated DRP-AC4), a 33-mer adenoregulin-related peptide (designated ARP-AC1) and most unusually, a 27-mer caerin-related peptide (designated CRP-AC1). While dermaseptin and adenoregulin were originally isolated from phyllomedusine leaf frogs, the caerins, until now, had only been described in Australian frogs of the genus, Litoria. Both the dermaseptin and adenoregulin were C-terminally amidated and lacked the C-terminal tripeptide of the biosynthetic precursor sequence. In contrast, the caerin-related peptide, unlike the majority of Litoria analogs, was not C-terminally amidated. The present data emphasize the need for structural characterization of mature peptides to ensure that unexpected precursor cleavages and/or post-translational modifications do not produce mature peptides that differ in structure to those predicted from cloned biosynthetic precursor cDNA. Additionally, systematic study of the secretory peptidome can produce unexpected results such as the CRP described here that may have phylogenetic implications. It is thus of the utmost importance in the functional evaluation of novel peptides that the primary structure of the mature peptide is unequivocally established -- something that is often facilitated by cloning biosynthetic precursor cDNAs but obviously not reliable using such data alone.
Algorithm to find distant repeats in a single protein sequence
Banerjee, Nirjhar; Sarani, Rangarajan; Ranjani, Chellamuthu Vasuki; Sowmiya, Govindaraj; Michael, Daliah; Balakrishnan, Narayanasamy; Sekar, Kanagaraj
2008-01-01
Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of the distant repeats would make it possible to establish a firm relation between the repeats and their function and three-dimensional structure during the evolutionary process. Further, it sheds light on the diversity of duplication during evolution. To this end, an algorithm has been developed to find all distant repeats in a protein sequence. The scores from the Point Accepted Mutation (PAM) matrix have been used to identify amino acid substitutions while detecting the distant repeats. Due to the biological importance of distant repeats, the proposed algorithm will be of importance to structural biologists, molecular biologists, biochemists and researchers involved in phylogenetic and evolutionary studies. PMID:19052663
Characterization of (CA)n microsatellite repeats from large-insert clones.
Litt, M; Browne, D
2001-05-01
The most laborious part of developing (CA)n microsatellite repeats as genetic markers is constructing DNA clones to permit determination of sequences flanking the microsatellites. When cosmids or large-insert phage clones are used as primary sources of (CA)n repeat markers, they have traditionally been subcloned into plasmid vectors such as pUC18 or M13 mp 18/19 cloning vectors to obtain fragments of suitable size for DNA sequencing. This unit presents an alternative approach whereby a set of degenerate sequencing primers that anneal directly to (CA)n microsatellites can be used to determine sequences that are inaccessible with vector-derived primers. Because the primers anneal to the repeat and not to the vector, they can be used with subclones containing inserts of several kilobases and should, in theory, always give sequence in the regions directly flanking the repeat. Degeneracy at the 3′ end of each of these primers prevents elongation of primers that have annealed out-of-register.
Abolfotouh, Mostafa A; AlQarni, Ali A; Al-Ghamdi, Suliman M; Salam, Mahmoud; Al-Assiri, Mohammed H; Balkhy, Hanan H
2017-01-03
Middle East Respiratory Syndrome (MERS) is caused by MERS coronavirus (MERS-CoV). More than 80% of reported cases have occurred in Saudi Arabia, with a mortality exceeding 50%. Health-care workers (HCWs) are at risk of acquiring and transmitting this virus, so the concerns of HCWs in Saudi Arabia regarding MERS were evaluated. An anonymous, self-administered, previously validated questionnaire was given to 1031 HCWs at three tertiary hospitals in Saudi Arabia from October to December, 2014. Concerns regarding the disease, its severity and governmental efforts to contain it, as well as disease outcomes were assessed using 31 concern statements in five distinct domains. A total concern score was calculated for each HCW. Multiple regression analyses were used to identify predictors of high concern scores. The average age of participants was 37.1 ± 9.0 years, 65.8% were married and 59.1% were nurses. The majority of respondents (70.4%) felt at risk of contracting a MERS-CoV infection at work, 69.1% felt threatened if a colleague contracted MERS-CoV, 60.9% felt obliged to care for patients infected with MERS-CoV and 87.8% did not feel safe at work using standard precautions. In addition, 87.7% believed that the government should isolate patients with MERS in specialized hospitals, 73.7% agreed with travel restriction to and from areas affected by MERS and 65.3% agreed with avoiding inviting expatriates from such areas. After adjustment for covariates, high concern scores were significantly associated with being a Saudi national (p < 0.001), a non-physician (p < 0.001) and working in the central region (p < 0.001). The majority of respondents reported concern regarding MERS-CoV infection from exposure at work. The overall level of concern may be influenced by previous experience of MERS outbreaks and related cultural issues. 
The concerns of HCWs may affect their overall effectiveness in an outbreak and should be addressed by incorporating management strategies in outbreak planning.
The complete mitochondrial genome of Pomacea canaliculata (Gastropoda: Ampullariidae).
Zhou, Xuming; Chen, Yu; Zhu, Shanliang; Xu, Haigen; Liu, Yan; Chen, Lian
2016-01-01
The mitochondrial genome of Pomacea canaliculata (Gastropoda: Ampullariidae) is the first complete mtDNA sequence reported in the genus Pomacea. The total length of the mtDNA is 15,707 bp, which contains 13 protein-coding genes, 2 ribosomal RNAs, 22 transfer RNAs, and a 359 bp non-coding region. The A + T content of the overall base composition of the H-strand is 71.7% (T: 41%, C: 12.7%, A: 30.7%, G: 15.6%). The ATP6, ATP8, CO1, CO2, ND1-3, ND5, ND6, ND4L and Cyt b genes begin with ATG as the start codon, whereas CO3 and ND4 begin with ATA. The ATP8, CO2-3, ND4L, ND2-6 and Cyt b genes are terminated with TAA as the stop codon, whereas ATP6, ND1, and CO1 end with TAG. A long non-coding region is found, and a 23 bp repeat unit repeats 11 times in this region.
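The base-composition figures quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is only an illustration: `base_composition` is a hypothetical helper, the toy sequence is invented, and treating the 23 bp repeats as lying inside the 359 bp non-coding region is an assumption drawn from the wording of the abstract.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Return the percentage of A, C, G and T in a DNA sequence.

    Hypothetical helper; characters other than A/C/G/T are ignored and
    the sequence is assumed to contain at least one valid base.
    """
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


# Percentages reported in the abstract for the H-strand.
reported = {"T": 41.0, "C": 12.7, "A": 30.7, "G": 15.6}
at_content = round(reported["A"] + reported["T"], 1)   # 71.7
total_percent = round(sum(reported.values()), 1)       # 100.0

# Span of the tandem repeat: 23 bp unit x 11 copies = 253 bp,
# which fits inside the 359 bp non-coding region (assumed location).
repeat_span_bp = 23 * 11

# The same helper applied to an invented toy sequence.
toy = base_composition("ATTTACGTAT")
print(at_content, total_percent, repeat_span_bp, toy["A"] + toy["T"])
```

The reported percentages are internally consistent: A and T sum to the stated 71.7%, and the four bases sum to 100%.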
Bhatia, S; Singh Negi, M; Lakshmikumaran, M
1996-11-01
EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families, RF 'A' and RF 'B.' The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species, namely, B. nigra, B. campestris, and B. oleracea, was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other as the repeat families in both showed high sequence homology between each other. Second, the repeat elements in both the species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea. 
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
NASA Astrophysics Data System (ADS)
Resom, Angesom; Asrat, Asfawossen; Gossa, Tegenu; Hovers, Erella
2018-06-01
The Melka Wakena archaeological site-complex is located at the eastern rift margin of the central sector of the Main Ethiopian Rift (MER), in south central Ethiopia. This wide, gently sloping rift shoulder, locally called the "Gadeb plain" is underlain by a succession of primary pyroclastic deposits and intercalated fluvial sediments as well as reworked volcaniclastic rocks, the top part of which is exposed by the Wabe River in the Melka Wakena area. Recent archaeological survey and excavations at this site revealed important paleoanthropological records. An integrated stratigraphic, petrological, and major and trace element geochemical study has been conducted to constrain the petrogenesis of the primary pyroclastic deposits and the depositional history of the sequence. The results revealed that the Melka Wakena pyroclastic deposits are a suite of mildly alkaline, rhyolitic pantellerites (ash falls, pumiceous ash falls and ignimbrites) and slightly dacitic ash flows. These rocks were deposited by episodic volcanic eruptions during early to middle Pleistocene from large calderas along the Wonji Fault Belt (WFB) in the central sector of the MER and from large silicic volcanic centers at the eastern rift shoulder. The rhyolitic ash falls, pumiceous ash falls and ignimbrites have been generated by fractional crystallization of a differentiating basaltic magma while the petrogenesis of the slightly dacitic ash flows involved some crustal contamination and assimilation during fractionation. Contemporaneous fluvial activities in the geomorphologically active Gadeb plain deposited overbank sedimentary sequences (archaeology bearing conglomerates and sands) along meandering river courses while a dense network of channels and streams have subsequently down-cut through the older volcanic and sedimentary sequences, redepositing the reworked volcaniclastic sediments further downstream.
2011-01-01
The addition of a relatively short flap sequence at the 5′-end of one of the polymerase chain reaction (PCR) primers considerably improves performance of real-time assays based on 5′-nuclease activity. This new technology, called Snake, was shown to supersede the conventional methods like TaqMan, Molecular Beacons, and Scorpions in the signal productivity and discrimination of target polymorphic variations as small as single nucleotides. The present article describes a number of reaction conditions and methods that allow further improvement of the assay performance. One of the identified approaches is the use of duplex-destabilizing modifications such as deoxyinosine and deoxyuridine in the design of the Snake primers. This approach was shown to solve the most serious problem associated with the antisense amplicon folding and cleavage. As a result, the method permits the use of relatively long (in this study, 14-mer) flap sequences. Investigation also revealed that only the 5′-segment of the flap requires the deoxyinosine/deoxyuridine destabilization, whereas the 3′-segment is preferably left unmodified or even stabilized using 2-amino deoxyadenosine d(2-amA) and 5-propynyl deoxyuridine d(5-PrU) modifications. The base-modification technique is especially effective when applied in combination with asymmetric three-step PCR. The most valuable discovery of the present study is the effective application of modified deoxynucleoside 5′-triphosphates d(2-amA)TP and d(5-PrU)TP in Snake PCR. This method made possible the use of very short 6-8-mer 5′-flap sequences in Snake primers. PMID:21050073
ARK: Aggregation of Reads by K-Means for Estimation of Bacterial Community Composition.
Koslicki, David; Chatterjee, Saikat; Shahrivar, Damon; Walker, Alan W; Francis, Suzanna C; Fraser, Louise J; Vehkaperä, Mikko; Lan, Yueheng; Corander, Jukka
2015-01-01
Estimation of bacterial community composition from high-throughput sequenced 16S rRNA gene amplicons is a key task in microbial ecology. Since the sequence data from each sample typically consist of a large number of reads and are adversely impacted by different levels of biological and technical noise, accurate analysis of such large datasets is challenging. There has been a recent surge of interest in using compressed sensing inspired and convex-optimization based methods to solve the estimation problem for bacterial community composition. These methods typically rely on summarizing the sequence data by frequencies of low-order k-mers and matching this information statistically with a taxonomically structured database. Here we show that the accuracy of the resulting community composition estimates can be substantially improved by aggregating the reads from a sample with an unsupervised machine learning approach prior to the estimation phase. The aggregation of reads is a pre-processing approach where we use a standard K-means clustering algorithm that partitions a large set of reads into subsets with reasonable computational cost to provide several vectors of first order statistics instead of only single statistical summarization in terms of k-mer frequencies. The output of the clustering is then processed further to obtain the final estimate for each sample. The resulting method is called Aggregation of Reads by K-means (ARK), and it is based on a statistical argument via mixture density formulation. ARK is found to improve the fidelity and robustness of several recently introduced methods, with only a modest increase in computational complexity. An open source, platform-independent implementation of the method in the Julia programming language is freely available at https://github.com/dkoslicki/ARK. A Matlab implementation is available at http://www.ee.kth.se/ctsoftware.
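The pre-processing idea in the ARK abstract above (summarize each read by low-order k-mer frequencies, cluster the reads with K-means, and keep one frequency vector per cluster instead of a single summary for the whole sample) can be sketched in plain Python. This is not the ARK code from the linked repositories: the function names, the toy reads, the choice of k = 2, and the deterministic farthest-point initialization are all assumptions made so that the example is small and reproducible. The later estimation phase against a taxonomic database is not shown.

```python
from itertools import product


def kmer_profile(read: str, k: int = 2) -> list[float]:
    """Relative frequencies of all 4**k DNA k-mers in one read."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    for i in range(len(read) - k + 1):
        w = read[i:i + k]
        if w in counts:          # skip k-mers containing N or other symbols
            counts[w] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in words]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=20):
    """Plain Lloyd iterations with farthest-point initialization (assumed)."""
    centers = [list(points[0])]
    while len(centers) < n_clusters:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(n_clusters), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:          # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Invented toy reads: two AT-rich and two GC-rich.
reads = ["ATATATAT", "TATATATA", "GCGCGCGC", "CGCGCGCG"]
profiles = [kmer_profile(r) for r in reads]
labels, centers = kmeans(profiles, n_clusters=2)
# Each cluster contributes one mean k-mer profile plus a weight (its share of reads).
weights = [labels.count(c) / len(reads) for c in range(2)]
print(labels, weights)  # [0, 0, 1, 1] [0.5, 0.5]
```

Each cluster center is itself a k-mer frequency vector, so a downstream estimator that accepts one such vector per sample could in principle be run once per cluster and the results combined using the weights; how ARK actually combines them is described in the paper, not here.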
Armen, Roger S; Bernard, Brady M; Day, Ryan; Alonso, Darwin O V; Daggett, Valerie
2005-09-20
Several neurodegenerative diseases are linked to expanded repeats of glutamine residues, which lead to the formation of amyloid fibrils and neuronal death. The length of the repeats correlates with the onset of Huntington's disease, such that healthy individuals have <38 residues and individuals with >38 repeats exhibit symptoms. Because it is difficult to obtain atomic-resolution structural information for poly(l-glutamine) (polyQ) in aqueous solution experimentally, we performed molecular dynamics simulations to investigate the conformational behavior of this homopolymer. In simulations of 20-, 40-, and 80-mer polyQ, we observed the formation of the "alpha-extended chain" conformation, which is characterized by alternating residues in the alpha(L) and alpha(R) conformations to yield a sheet. The structural transition from disordered random-coil conformations to the alpha-extended chain conformation exhibits modest length and temperature dependence, in agreement with the experimental observation that aggregation depends on length and temperature. We propose that fibril formation in polyQ may occur through an alpha-sheet structure, which was proposed by Pauling and Corey. Also, we propose an atomic-resolution model of how the inhibitory peptide QBP1 (polyQ-binding peptide 1) may bind to polyQ in an alpha-extended chain conformation to inhibit fibril formation.
Middle East Respiratory Syndrome (MERS)
Middle East Respiratory Syndrome (MERS) is a viral respiratory illness that was ...
MERS transmission and risk factors: a systematic review.
Park, Ji-Eun; Jung, Soyoung; Kim, Aeran; Park, Ji-Eun
2018-05-02
Since Middle East respiratory syndrome (MERS) infection was first reported in 2012, many studies have analysed its transmissibility and severity. However, the methodology and results of these studies have varied, and there has been no systematic review of MERS. This study reviews the characteristics and associated risk factors of MERS. We searched international (PubMed, ScienceDirect, Cochrane) and Korean databases (DBpia, KISS) for English- or Korean-language articles using the terms "MERS" and "Middle East respiratory syndrome". Only human studies with >20 participants were analysed to exclude studies with low representation. Epidemiologic studies with information on transmissibility and severity of MERS as well as studies containing MERS risk factors were included. A total of 59 studies were included. Most studies from Saudi Arabia reported higher mortality (22-69.2%) than those from South Korea (20.4%). While the R0 value in Saudi Arabia was <1 in all but one study, in South Korea, the R0 value was 2.5-8.09 in the early stage and decreased to <1 in the later stage. The incubation period was 4.5-5.2 days in Saudi Arabia and 6-7.8 days in South Korea. Duration from onset was 4-10 days to confirmation, 2.9-5.3 days to hospitalization, 11-17 days to death, and 14-20 days to discharge. Older age and concomitant disease were the most common factors related to MERS infection, severity, and mortality. The transmissibility and severity of MERS differed by outbreak region and patient characteristics. Further studies assessing the risk of MERS should consider these factors.
Nomura, Koji; Vilalta, Anna; Allendorf, David H.; Hornik, Tamara C.
2017-01-01
Activated microglia can phagocytose dying, stressed, or excess neurons and synapses via the phagocytic receptor Mer tyrosine kinase (MerTK). Galectin-3 (Gal-3) can cross-link surface glycoproteins by binding galactose residues that are normally hidden below terminal sialic acid residues. Gal-3 was recently reported to opsonize cells via activating MerTK. We found that LPS-activated BV-2 microglia rapidly released Gal-3, which was blocked by calcineurin inhibitors. Gal-3 bound to MerTK on microglia and to stressed PC12 (neuron-like) cells, and it increased microglial phagocytosis of PC12 cells or primary neurons, which was blocked by inhibition of MerTK. LPS-activated microglia exhibited a sialidase activity that desialylated PC12 cells and could be inhibited by Tamiflu, a neuraminidase (sialidase) inhibitor. Sialidase treatment of PC12 cells enabled Gal-3 to bind and opsonize the live cells for phagocytosis by microglia. LPS-induced microglial phagocytosis of PC12 was prevented by small interfering RNA knockdown of Gal-3 in microglia, lactose inhibition of Gal-3 binding, inhibition of neuraminidase with Tamiflu, or inhibition of MerTK by UNC569. LPS-induced phagocytosis of primary neurons by primary microglia was also blocked by inhibition of MerTK. We conclude that activated microglia release Gal-3 and a neuraminidase that desialylates microglial and PC12 surfaces, enabling Gal-3 binding to PC12 cells and their phagocytosis via MerTK. Thus, Gal-3 acts as an opsonin of desialylated surfaces, and inflammatory loss of neurons or synapses may potentially be blocked by inhibiting neuraminidases, Gal-3, or MerTK. PMID:28500071
2003-05-15
KENNEDY SPACE CENTER, FLA. - At right is the Delta II rocket on Launch Complex 17-A, Cape Canaveral Air Force Station, that will launch Mars Exploration Rover 2 (MER-2) on June 5. In the center are three more solid rocket boosters that will be added to the Delta, which will carry nine in all. NASA’s twin Mars Exploration Rovers are designed to study the history of water on Mars. These robotic geologists are equipped with a robotic arm, a drilling tool, three spectrometers, and four pairs of cameras that allow them to have a human-like, 3D view of the terrain. Each rover could travel as far as 100 meters in one day to act as Mars scientists' eyes and hands, exploring an environment where humans can’t yet go. MER-2 is scheduled to launch as MER-A. MER-1 (MER-B) will launch June 25.
2003-05-15
KENNEDY SPACE CENTER, FLA. - The Delta II rocket on Launch Complex 17-A, Cape Canaveral Air Force Station, is having solid rocket boosters (SRBs) installed that will help launch Mars Exploration Rover 2 (MER-2) on June 5. In the center are three more solid rocket boosters that will be added to the Delta, which will carry nine in all. NASA’s twin Mars Exploration Rovers are designed to study the history of water on Mars. These robotic geologists are equipped with a robotic arm, a drilling tool, three spectrometers, and four pairs of cameras that allow them to have a human-like, 3D view of the terrain. Each rover could travel as far as 100 meters in one day to act as Mars scientists' eyes and hands, exploring an environment where humans can’t yet go. MER-2 is scheduled to launch as MER-A. MER-1 (MER-B) will launch June 25.
2003-05-31
KENNEDY SPACE CENTER, FLA. - At Launch Complex 17-A, Cape Canaveral Air Force Station, the first half of the fairing is installed around the Mars Exploration Rover 2 (MER-2). MER-2 is one of NASA's twin Mars Exploration Rovers designed to study the history of water on Mars. These robotic geologists are equipped with a robotic arm, a drilling tool, three spectrometers, and four pairs of cameras that allow them to have a human-like, 3D view of the terrain. Each rover could travel as far as 100 meters in one day to act as Mars scientists' eyes and hands, exploring an environment where humans can't yet go. MER-2 is scheduled to launch no earlier than June 8 as MER-A, with two launch opportunities each day during the launch period that closes on June 19.
2003-05-15
KENNEDY SPACE CENTER, FLA. - Workers on the launch tower of Complex 17-A, Cape Canaveral Air Force Station, stand by while a solid rocket booster (SRB) is lifted to vertical. It is one of nine that will help launch Mars Exploration Rover 2 (MER-2). NASA’s twin Mars Exploration Rovers are designed to study the history of water on Mars. These robotic geologists are equipped with a robotic arm, a drilling tool, three spectrometers, and four pairs of cameras that allow them to have a human-like, 3D view of the terrain. Each rover could travel as far as 100 meters in one day to act as Mars scientists' eyes and hands, exploring an environment where humans can’t yet go. MER-2 is scheduled to launch June 5 as MER-A. MER-1 (MER-B) will launch June 25.
Diffusion-driven self-assembly of rodlike particles: Monte Carlo simulation on a square lattice
NASA Astrophysics Data System (ADS)
Lebovka, Nikolai I.; Tarasevich, Yuri Yu.; Gigiberiya, Volodymyr A.; Vygornitskii, Nikolai V.
2017-05-01
The diffusion-driven self-assembly of rodlike particles was studied by means of Monte Carlo simulation. The rods were represented as linear k-mers (i.e., particles occupying k adjacent sites). In the initial state, they were deposited onto a two-dimensional square lattice of size L × L up to the jamming concentration using a random sequential adsorption algorithm. The size of the lattice, L, was varied from 128 to 2048, and periodic boundary conditions were applied along both x and y axes, while the length of the k-mers (determining the aspect ratio) was varied from 2 to 12. The k-mers oriented along the x and y directions (kx-mers and ky-mers, respectively) were deposited equiprobably. In the course of the simulation, the numbers of intraspecific and interspecific contacts between the same sort and between different sorts of k-mers, respectively, were calculated. Both the shift ratio of the actual number of shifts along the longitudinal or transverse axes of the k-mers and the electrical conductivity of the system were also examined. For the initial random configuration, quite different self-organization behavior was observed for short and long k-mers. For long k-mers (k ≥ 6), three main stages of diffusion-driven spatial segregation (self-assembly) were identified: the initial stage, reflecting destruction of the jamming state; the intermediate stage, reflecting continuous cluster coarsening and labyrinth pattern formation; and the final stage, reflecting the formation of diagonal stripe domains. Additional examination of two artificially constructed initial configurations showed that this pattern of diagonal stripe domains is an attractor, i.e., any spatial distribution of k-mers tends to transform into diagonal stripes. Nevertheless, the time for relaxation to the steady state increases substantially as the lattice size grows.
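The initial-state preparation described in the abstract above (random sequential adsorption of linear k-mers on an L × L square lattice with periodic boundaries and equiprobable x and y orientations) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `rsa_kmers`, the small lattice, and in particular the stopping rule (give up after `max_failures` consecutive rejected attempts) are assumptions, so the result only approximates the true jamming state. The subsequent diffusion stage is not shown.

```python
import random


def rsa_kmers(L: int, k: int, max_failures: int = 2000, seed: int = 0):
    """Deposit linear k-mers on an L x L lattice by random sequential adsorption.

    Periodic boundaries are applied along both axes. Deposition stops after
    `max_failures` consecutive rejected attempts (an assumed stopping rule
    that approximates jamming). Returns the placed k-mers and the occupancy grid.
    """
    if k > L:
        raise ValueError("k must not exceed L, or a k-mer would overlap itself")
    rng = random.Random(seed)
    occupied = [[False] * L for _ in range(L)]
    placed = []
    failures = 0
    while failures < max_failures:
        x, y = rng.randrange(L), rng.randrange(L)
        horizontal = rng.random() < 0.5      # x and y orientations equiprobable
        if horizontal:
            sites = [((x + i) % L, y) for i in range(k)]
        else:
            sites = [(x, (y + i) % L) for i in range(k)]
        if any(occupied[sy][sx] for sx, sy in sites):
            failures += 1                    # overlap: reject this attempt
            continue
        for sx, sy in sites:
            occupied[sy][sx] = True
        placed.append((x, y, horizontal))
        failures = 0
    return placed, occupied


placed, occupied = rsa_kmers(L=16, k=4)
filled_sites = sum(cell for row in occupied for cell in row)
coverage = filled_sites / (16 * 16)
print(len(placed), round(coverage, 3))
```

The lattice sizes in the abstract (L from 128 to 2048) are far larger than the 16 × 16 toy lattice used here; for such sizes a list of still-available starting sites would be a more efficient way to detect jamming than counting consecutive failures.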
Increased Circulating and Urinary Levels of Soluble TAM Receptors in Diabetic Nephropathy.
Ochodnicky, Peter; Lattenist, Lionel; Ahdi, Mohamed; Kers, Jesper; Uil, Melissa; Claessen, Nike; Leemans, Jaklien C; Florquin, Sandrine; Meijers, Joost C M; Gerdes, Victor E A; Roelofs, Joris J T H
2017-09-01
TAM receptors (Tyro3, Axl, and Mer) have been implicated in innate immunity. Circulating TAM receptor soluble forms (sTyro3, sAxl, sMer) are related to autoimmune disorders. We investigated TAM and their ligand protein S in patients with diabetes. Urinary and plasma levels of protein S, sTyro3, sAxl, and sMer were determined in 126 patients with diabetes assigned to a normoalbuminuric or macroalbuminuric (urinary albumin excretion <30 mg/24 hours and >300 mg/24 hours, respectively) study group and 18 healthy volunteers. TAM and protein S immunostaining was performed on kidney biopsy specimens from patients with diabetic nephropathy (n = 9) and controls (n = 6). TAM expression and shedding by tubular epithelial cells were investigated by PCR and enzyme-linked immunosorbent assay in an in vitro diabetes model. Patients with macroalbuminuric diabetes had higher circulating levels of sMer and more urinary sTyro3 and sMer than normoalbuminuric diabetics. Increased clearance of sTyro3 and sMer was associated with loss of tubular Tyro3 and Mer expression in diabetic nephropathy tissue and glomerular depositions of protein S. During in vitro diabetes, human kidney cells had down-regulation of Tyro3 and Mer mRNA and increased shedding of sTyro3 and sMer. Renal injury in diabetes is associated with elevated systemic and urine levels of sMer and sTyro3. This is the first study reporting excretion of sTAM receptors in urine, identifying the kidney as a source of sTAM. Copyright © 2017 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
Alhamlan, F S; Majumder, M S; Brownstein, J S; Hawkins, J; Al-Abdely, H M; Alzahrani, A; Obaid, D A; Al-Ahdal, M N; BinSaeed, A
2017-01-12
As of 1 November 2015, the Saudi Ministry of Health had reported 1273 cases of Middle East respiratory syndrome (MERS); among these cases, which included 9 outbreaks at several hospitals, 717 (56%) patients recovered, 14 (1%) remain hospitalised and 543 (43%) died. This study aimed to determine the epidemiological, demographic and clinical characteristics that distinguished cases of MERS contracted during outbreaks from those contracted sporadically (ie, non-outbreak) between 2012 and 2015 in Saudi Arabia. Data from the Saudi Ministry of Health of confirmed outbreak and non-outbreak cases of MERS coronavirus (CoV) infections from September 2012 through October 2015 were abstracted and analysed. Univariate and descriptive statistical analyses were conducted, and the time between disease onset and confirmation, onset and notification and onset and death were examined. A total of 1250 patients (aged 0-109 years; mean, 50.825 years) were reported infected with MERS-CoV. Approximately two-thirds of all MERS cases were diagnosed in men for outbreak and non-outbreak cases. Healthcare workers comprised 22% of all MERS cases for outbreak and non-outbreak cases. Nosocomial infections comprised one-third of all Saudi MERS cases; however, nosocomial infections occurred more frequently in outbreak than non-outbreak cases (p<0.001). Patients contracting MERS during an outbreak were significantly more likely to die of MERS (p<0.001). To date, nosocomial infections have fuelled MERS outbreaks. Given that the Kingdom of Saudi Arabia is a worldwide religious travel destination, localised outbreaks may have massive global implications and effective outbreak preventive measures are needed. Published by the BMJ Publishing Group Limited.
Alhamlan, F S; Majumder, M S; Brownstein, J S; Hawkins, J; Al-Abdely, H M; Alzahrani, A; Obaid, D A; Al-Ahdal, M N; BinSaeed, A
2017-01-01
Objectives: As of 1 November 2015, the Saudi Ministry of Health had reported 1273 cases of Middle East respiratory syndrome (MERS); among these cases, which included 9 outbreaks at several hospitals, 717 (56%) patients recovered, 14 (1%) remain hospitalised and 543 (43%) died. This study aimed to determine the epidemiological, demographic and clinical characteristics that distinguished cases of MERS contracted during outbreaks from those contracted sporadically (ie, non-outbreak) between 2012 and 2015 in Saudi Arabia. Design: Data from the Saudi Ministry of Health of confirmed outbreak and non-outbreak cases of MERS coronavirus (CoV) infections from September 2012 through October 2015 were abstracted and analysed. Univariate and descriptive statistical analyses were conducted, and the time between disease onset and confirmation, onset and notification and onset and death were examined. Results: A total of 1250 patients (aged 0–109 years; mean, 50.825 years) were reported infected with MERS-CoV. Approximately two-thirds of all MERS cases were diagnosed in men for outbreak and non-outbreak cases. Healthcare workers comprised 22% of all MERS cases for outbreak and non-outbreak cases. Nosocomial infections comprised one-third of all Saudi MERS cases; however, nosocomial infections occurred more frequently in outbreak than non-outbreak cases (p<0.001). Patients contracting MERS during an outbreak were significantly more likely to die of MERS (p<0.001). Conclusions: To date, nosocomial infections have fuelled MERS outbreaks. Given that the Kingdom of Saudi Arabia is a worldwide religious travel destination, localised outbreaks may have massive global implications and effective outbreak preventive measures are needed. PMID:28082362
Low-maintenance energy requirements of obese dogs after weight loss.
German, Alexander J; Holden, Shelley L; Mather, Nicola J; Morris, Penelope J; Biourge, Vincent
2011-10-01
Weight rebound after successful weight loss is a well-known phenomenon in humans and dogs, possibly due to the fact that energy restriction improves metabolic efficiency, reducing post-weight-loss maintenance energy requirements (MER). The aim of the present study was to estimate post-weight-loss MER in obese pet dogs that had successfully lost weight and did not subsequently rebound. A total of twenty-four obese dogs, successfully completing a weight management programme at the Royal Canin Weight Management Clinic, University of Liverpool (Wirral, UK), were included. In all dogs, a period of >14 d of stable weight (<1 % change) was identified post-weight loss, when food intake was constant and activity levels were stable (assessed via owners' diary records). Post-weight-loss MER was indirectly estimated by determining dietary energy consumption during this stable weight period. Multivariable linear regression was used to identify factors that were associated with post-weight-loss MER. The mean length of stable weight after weight loss was 54 (SD 34.1) d. During this time, MER was 285 (SD 54.8) kJ/kg^0.75 per d. The rate of prior weight loss and food intake during the weight-loss phase was positively associated with post-weight-loss MER, while the amount of lean tissue lost was negatively associated with post-weight-loss MER. MER are low after weight loss in obese pet dogs (typically only 10 % more than required during weight-loss MER), which has implications for what should constitute the optimal diet during this period. Preserving lean tissue during weight loss may maximise post-weight-loss MER and help prevent rebound.
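As a worked illustration of the units used in this abstract, the reported mean MER is expressed per kilogram of metabolic body weight (body weight raised to the power 0.75), so a daily energy figure follows by multiplying by that quantity. The short Python sketch below assumes a hypothetical 20 kg dog; the function name `daily_energy_kj` is illustrative, and only the 285 kJ/kg^0.75 per d mean comes from the abstract.

```python
def daily_energy_kj(body_weight_kg, mer_kj_per_kg075=285.0):
    """Daily maintenance energy (kJ/d) from MER given per kg^0.75.

    The default of 285 kJ/kg^0.75 per d is the mean post-weight-loss MER
    reported in the abstract; individual dogs varied (SD 54.8).
    """
    metabolic_weight = body_weight_kg ** 0.75  # metabolic body weight, kg^0.75
    return mer_kj_per_kg075 * metabolic_weight


# Hypothetical example: a 20 kg dog at the reported mean MER.
energy = daily_energy_kj(20.0)
print(f"{energy:.0f} kJ per day")
```

Because the exponent is below 1, energy need rises more slowly than body weight, which is why the requirement is quoted per kg^0.75 and not per kg.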
NASA Astrophysics Data System (ADS)
Rivard, Brea R.; Cooper, Sarah J.; Stubbs, John M.
2018-02-01
DNA duplexes consisting of a 25mer together with shorter complementary sequences were studied over a range of temperature and surface binding motifs using a coarse-grained two-site nucleotide model. Results were analyzed in terms of hydrogen bonding interactions and structural characteristics and indicate that hybridization is most stable when furthest from the surface binding site. Strand elongation and straightening near the bound end are found to be correlated to duplex destabilization.
Favorable 2'-substitution in the loop region of a thrombin-binding DNA aptamer.
Awachat, Ragini; Wagh, Atish A; Aher, Manisha; Fernandes, Moneesha; Kumar, Vaijayanti A
2018-06-01
Simple 2'-OMe-chemical modification in the loop region of the 15mer G-rich DNA sequence GGTTGGTGTGGTTGG is reported. The G-quadruplex structure of this thrombin-binding aptamer (TBA) is stabilized by single modifications (T → 2'-OMe-U), depending on the position of the modification. The structural stability also renders significantly increased inhibition of thrombin-induced fibrin polymerization, a process closely associated with blood-clotting. Copyright © 2018 Elsevier Ltd. All rights reserved.
Bolzán, Alejandro D
2017-07-01
By definition, telomeric sequences are located at the very ends or terminal regions of chromosomes. However, several vertebrate species show blocks of (TTAGGG)n repeats present in non-terminal regions of chromosomes, the so-called interstitial telomeric sequences (ITSs), interstitial telomeric repeats or interstitial telomeric bands, which include those intrachromosomal telomeric-like repeats located near (pericentromeric ITSs) or within the centromere (centromeric ITSs) and those telomeric repeats located between the centromere and the telomere (i.e., truly interstitial telomeric sequences) of eukaryotic chromosomes. According to their sequence organization, localization and flanking sequences, ITSs can be classified into four types: 1) short ITSs, 2) subtelomeric ITSs, 3) fusion ITSs, and 4) heterochromatic ITSs. The first three types have been described mainly in the human genome, whereas heterochromatic ITSs have been found in several vertebrate species but not in humans. Several lines of evidence suggest that ITSs play a significant role in genome instability and evolution. This review aims to summarize our current knowledge about the origin, function, instability and evolution of these telomeric-like repeats in vertebrate chromosomes. Copyright © 2017 Elsevier B.V. All rights reserved.
Weinger, Jason G.; Omari, Kakuri M.; Marsden, Kurt; Raine, Cedric S.; Shafit-Zagardo, Bridget
2009-01-01
Multiple sclerosis is a disease that is characterized by inflammation, demyelination, and axonal damage; it ultimately forms gliotic scars and lesions that severely compromise the function of the central nervous system. Evidence has shown previously that altered growth factor receptor signaling contributes to lesion formation, impedes recovery, and plays a role in disease progression. Growth arrest-specific protein 6 (Gas6), the ligand for the TAM receptor tyrosine kinase family, consisting of Tyro3, Axl, and Mer, is important for cell growth, survival, and clearance of debris. In this study, we show that levels of membrane-bound Mer (205 kd), soluble Mer (∼150 kd), and soluble Axl (80 kd) were all significantly elevated in homogenates from established multiple sclerosis lesions comprised of both chronic active and chronic silent lesions. Whereas in normal tissue Gas6 positively correlated with soluble Axl and Mer, there was a negative correlation between Gas6 and soluble Axl and Mer in established multiple sclerosis lesions. In addition, increased levels of soluble Axl and Mer were associated with increased levels of mature ADAM17, mature ADAM10, and Furin, proteins that are associated with Axl and Mer solubilization. Soluble Axl and Mer are both known to act as decoy receptors and block Gas6 binding to membrane-bound receptors. These data suggest that in multiple sclerosis lesions, dysregulation of protective Gas6 receptor signaling may prolong lesion activity. PMID:19541935