Sample records for direct sequence analysis

  1. Asymmetry of perceived key movement in chorale sequences: converging evidence from a probe-tone analysis.

    PubMed

    Cuddy, L L; Thompson, W F

    1992-01-01

    In a probe-tone experiment, two groups of listeners--one trained, the other untrained, in traditional music theory--rated the goodness of fit of each of the 12 notes of the chromatic scale to four-voice harmonic sequences. Sequences were 12 simplified excerpts from Bach chorales, 4 nonmodulating, and 8 modulating. Modulations occurred either one or two steps in either the clockwise or the counterclockwise direction on the cycle of fifths. A consistent pattern of probe-tone ratings was obtained for each sequence, with no significant differences between listener groups. Two methods of analysis (Fourier analysis and regression analysis) revealed a directional asymmetry in the perceived key movement conveyed by modulating sequences. For a given modulation distance, modulations in the counterclockwise direction effected a clearer shift in tonal organization toward the final key than did clockwise modulations. The nature of the directional asymmetry was consistent with results reported for identification and rating of key change in the sequences (Thompson & Cuddy, 1989a). Further, according to the multiple-regression analysis, probe-tone ratings did not merely reflect the distribution of tones in the sequence. Rather, ratings were sensitive to the temporal structure of the tonal organization in the sequence.

  2. Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis.

    PubMed

    Jakupciak, John P; Wells, Jeffrey M; Karalus, Richard J; Pawlowski, David R; Lin, Jeffrey S; Feldman, Andrew B

    2013-01-01

    Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations.

  3. Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis

    PubMed Central

    Jakupciak, John P.; Wells, Jeffrey M.; Karalus, Richard J.; Pawlowski, David R.; Lin, Jeffrey S.; Feldman, Andrew B.

    2013-01-01

    Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations. PMID:24455204

  4. Evaluation of microbial community in hydrothermal field by direct DNA sequencing

    NASA Astrophysics Data System (ADS)

    Kawarabayasi, Y.; Maruyama, A.

    2002-12-01

    Many extremophiles have been discovered in terrestrial and marine hydrothermal fields. Some thermophiles can grow beyond 90°C in culture, while direct microscopic analysis occasionally indicates that microbes may survive in much hotter hydrothermal fluids. However, it is very difficult to isolate and cultivate such microbes from these environments; over 99% of all microbes remain undiscovered. Building on experience in whole microbial genome analysis (Y.K.) and microbial community analysis (A.M.), we began searching for unique microbes and genes in hydrothermal fields through direct sequencing of environmental DNA fragments. First, shotgun plasmid libraries were constructed directly from DNA prepared from mixed microbes collected by an in situ filtration system from low-temperature fluids at RM24 in the Southern East Pacific Rise (S-EPR). Gene amplification (PCR) was not used, in order to prevent mutations from being introduced in the process. The nucleotide sequences of 285 clones showed that no sequence had an identical match in public databases. Among 27 clones whose entire sequences were determined, no ORF was identified on 14 clones, resembling introns in eukaryotes. On four clones, multiple tandem repeats of tetranucleotide units were identified. This type of sequence has been identified in some familiar disease in human. The result indicates that living or dead materials with eukaryotic features may exist in this low-temperature field. Second, shotgun plasmid libraries were constructed from environmental DNA prepared from Beppu hot springs. Among 143 randomly selected clones used for sequencing, no known sequence was identified. Unlike the clones in the S-EPR library, clear ORFs were identified on all nine clones whose entire sequences were determined. One clone, H4052, was found to contain the complete aspartyl-tRNA synthetase gene. Phylogenetic analysis using the amino acid sequences of this gene indicated that it separated from other Euryarchaea before the differentiation of species. Thus, some novel archaeal species are expected to be present in this field. The present direct cloning and sequencing technique is now opening a window onto a new world in hydrothermal microbial community analysis.

  5. Impact of cultivation on characterisation of species composition of soil bacterial communities.

    PubMed

    McCaig, A E.; Grayston, S J.; Prosser, J I.; Glover, L A.

    2001-03-01

    The species composition of culturable bacteria in Scottish grassland soils was investigated using a combination of Biolog and 16S rDNA analysis for characterisation of isolates. The inclusion of a molecular approach allowed direct comparison of sequences from culturable bacteria with sequences obtained during analysis of DNA extracted directly from the same soil samples. Bacterial strains were isolated on Pseudomonas isolation agar (PIA), a selective medium, and on tryptone soya agar (TSA), a general laboratory medium. In total, 12 and 21 morphologically different bacterial cultures were isolated on PIA and TSA, respectively. Biolog and sequencing placed PIA isolates in the same taxonomic groups, the majority of cultures belonging to the Pseudomonas (sensu stricto) group. However, analysis of 16S rDNA sequences proved more efficient than Biolog for characterising TSA isolates due to limitations of the Microlog database for identifying environmental bacteria. In general, 16S rDNA sequences from TSA isolates showed high similarities to cultured species represented in sequence databases, although TSA-8 showed only 92.5% similarity to the nearest relative, Bacillus insolitus. In general, there was very little overlap between the culturable and uncultured bacterial communities, although two sequences, PIA-2 and TSA-13, showed >99% similarity to soil clones. A cloning step was included prior to sequence analysis of two isolates, TSA-5 and TSA-14, and analysis of several clones confirmed that these cultures comprised at least four and three sequence types, respectively. All isolate clones were most closely related to uncultured bacteria, with clone TSA-5.1 showing 99.8% similarity to a sequence amplified directly from the same soil sample. Interestingly, one clone, TSA-5.4, clustered within a novel group comprising only uncultured sequences. This group, which is associated with the novel, deep-branching Acidobacterium capsulatum lineage, also included clones isolated during direct analysis of the same soil and from a wide range of other sample types studied elsewhere. The study demonstrates the value of fine-scale molecular analysis for identification of laboratory isolates and indicates the culturability of approximately 1% of the total population but under a restricted range of media and cultivation conditions.

  6. Direct repeat sequences in the Streptomyces chitinase-63 promoter direct both glucose repression and chitin induction

    PubMed Central

    Ni, Xiangyang; Westpheling, Janet

    1997-01-01

    The chi63 promoter directs glucose-sensitive, chitin-dependent transcription of a gene involved in the utilization of chitin as carbon source. Analysis of 5′ and 3′ deletions of the promoter region revealed that a 350-bp segment is sufficient for wild-type levels of expression and regulation. The analysis of single base changes throughout the promoter region, introduced by random and site-directed mutagenesis, identified several sequences to be important for activity and regulation. Single base changes at −10, −12, −32, −33, −35, and −37 upstream of the transcription start site resulted in loss of activity from the promoter, suggesting that bases in these positions are important for RNA polymerase interaction. The sequences centered around −10 (TATTCT) and −35 (TTGACC) in this promoter are, in fact, prototypical of eubacterial promoters. Overlapping the RNA polymerase binding site is a perfect 12-bp direct repeat sequence. Some base changes within this direct repeat resulted in constitutive expression, suggesting that this sequence is an operator for negative regulation. Other base changes resulted in loss of glucose repression while retaining the requirement for chitin induction, suggesting that this sequence is also involved in glucose repression. The fact that cis-acting mutations resulted in glucose resistance but not inducer independence rules out the possibility that glucose repression acts exclusively by inducer exclusion. The fact that mutations that affect glucose repression and chitin induction fall within the same direct repeat sequence module suggests that the direct repeat sequence facilitates both chitin induction and glucose repression. PMID:9371809

  7. Shifted termination assay (STA) fragment analysis to detect BRAF V600 mutations in papillary thyroid carcinomas

    PubMed Central

    2013-01-01

    Background: BRAF mutation is an important diagnostic and prognostic marker in patients with papillary thyroid carcinoma (PTC). To be applicable in clinical laboratories with limited equipment, diverse testing methods are required to detect BRAF mutation. Methods: A shifted termination assay (STA) fragment analysis was used to detect common V600 BRAF mutations in 159 PTCs with DNAs extracted from formalin-fixed paraffin-embedded tumor tissue. The results of STA fragment analysis were compared to those of direct sequencing. Serial dilutions of BRAF mutant cell line (SNU-790) were used to calculate limit of detection (LOD). Results: BRAF mutations were detected in 119 (74.8%) PTCs by STA fragment analysis. In direct sequencing, BRAF mutations were observed in 118 (74.2%) cases. The results of STA fragment analysis had high correlation with those of direct sequencing (p < 0.00001, κ = 0.98). The LOD of STA fragment analysis and direct sequencing was 6% and 12.5%, respectively. In PTCs with pT3/T4 stages, BRAF mutation was observed in 83.8% of cases. In pT1/T2 carcinomas, BRAF mutation was detected in 65.9% and this difference was statistically significant (p = 0.007). Moreover, BRAF mutation was more frequent in PTCs with extrathyroidal invasion than tumors without extrathyroidal invasion (84.7% versus 62.2%, p = 0.001). To prepare and run the reactions, direct sequencing required 450 minutes while STA fragment analysis needed 290 minutes. Conclusions: STA fragment analysis is a simple and sensitive method to detect BRAF V600 mutations in formalin-fixed paraffin-embedded clinical samples. Virtual Slides: The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/5684057089135749 PMID:23883275
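
    The agreement statistic reported above (κ) is Cohen's kappa, which compares observed agreement between two tests with the agreement expected by chance. The sketch below shows the calculation for two binary assays. The abstract gives only the totals (159 cases, 119 positive by STA, 118 positive by direct sequencing), not the full 2x2 table, so the cell counts used here are an assumption: one split that is consistent with those totals, not the authors' data. The function name `cohens_kappa` is illustrative.

```python
def cohens_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two binary tests, from a 2x2 agreement table."""
    n = both_pos + a_only + b_only + both_neg
    if n == 0:
        raise ValueError("agreement table is empty")
    # Observed agreement: cases where both tests give the same call.
    observed = (both_pos + both_neg) / n
    # Chance agreement, from each test's marginal positive/negative rates.
    a_pos = both_pos + a_only
    b_pos = both_pos + b_only
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# ASSUMED cell counts (not stated in the abstract): 118 positive by both
# methods, 1 positive by STA only, 0 by sequencing only, 40 negative by both.
kappa = cohens_kappa(both_pos=118, a_only=1, b_only=0, both_neg=40)
print(round(kappa, 2))
```

    Under that assumed split the value rounds to 0.98, matching the reported figure, but other splits with the same totals would give lower values.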

  8. Mutation Analysis in Classical Phenylketonuria Patients Followed by Detecting Haplotypes Linked to Some PAH Mutations.

    PubMed

    Dehghanian, Fatemeh; Silawi, Mohammad; Tabei, Seyed M B

    2017-02-01

    Deficiency of phenylalanine hydroxylase (PAH) enzyme and elevation of phenylalanine in body fluids cause phenylketonuria (PKU). The gold standard for confirming PKU and PAH deficiency is detecting causal mutations by direct sequencing of the coding exons and splicing involved sequences of the PAH gene. Furthermore, haplotype analysis could be considered as an auxiliary approach for detecting PKU causative mutations before direct sequencing of the PAH gene by making comparisons between prior detected mutation linked-haplotypes and new PKU case haplotypes with undetermined mutations. In this study, 13 unrelated classical PKU patients took part in the study detecting causative mutations. Mutations were identified by polymerase chain reaction (PCR) and direct sequencing in all patients. After that, haplotype analysis was performed by studying VNTR and PAHSTR markers (linked genetic markers of the PAH gene) through application of PCR and capillary electrophoresis (CE). Mutation analysis was performed successfully and the detected mutations were as follows: c.782G>A, c.754C>T, c.842C>G, c.113-115delTCT, c.688G>A, and c.696A>G. Additionally, PAHSTR/VNTR haplotypes were detected to discover haplotypes linked to each mutation. Mutation detection is the best approach for confirming PAH enzyme deficiency in PKU patients. Due to the relatively large size of the PAH gene and high cost of the direct sequencing in developing countries, haplotype analysis could be used before DNA sequencing and mutation detection for a faster and cheaper way via identifying probable mutated exons.

  9. MerCat: a versatile k-mer counter and diversity estimator for database-independent property analysis obtained from metagenomic and/or metatranscriptomic sequencing data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Richard A.; Panyala, Ajay R.; Glass, Kevin A.

    MerCat is a parallel, highly scalable and modular property software package for robust analysis of features in next-generation sequencing data. MerCat inputs include assembled contigs and raw sequence reads from any platform resulting in feature abundance counts tables. MerCat allows for direct analysis of data properties without reference sequence database dependency commonly used by search tools such as BLAST and/or DIAMOND for compositional analysis of whole community shotgun sequencing (e.g. metagenomes and metatranscriptomes).
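
    Reference-free k-mer counting, as described above, can be illustrated in a few lines. This is a minimal single-threaded sketch of the general technique, not MerCat's implementation: the function names, the choice to skip k-mers containing non-ACGT characters, and the use of Shannon entropy as the diversity measure are all assumptions made for the example.

```python
import math
from collections import Counter


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a collection of DNA sequences."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    valid = set("ACGT")
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: skip k-mers with ambiguous bases such as N.
            if set(kmer) <= valid:
                counts[kmer] += 1
    return counts


def shannon_diversity(counts):
    """Shannon entropy (natural log) of a k-mer abundance table."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts.values())


reads = ["ACGTAC", "ACNGT"]  # toy reads, not real data
table = count_kmers(reads, k=2)
print(table.most_common(3), shannon_diversity(table))
```

    The resulting table depends only on the input reads, which is what makes this kind of analysis independent of a reference database.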

  10. Pyrosequencing analysis for detection of a BRAFV600E mutation in an FNAB specimen of thyroid nodules.

    PubMed

    Kim, Suk Kyeong; Kim, Dong-Lim; Han, Hye Seung; Kim, Wan Seop; Kim, Seung Ja; Moon, Won Jin; Oh, Seo Young; Hwang, Tae Sook

    2008-06-01

    Fine-needle aspiration biopsy (FNAB) is the primary means of distinguishing benign from malignant and of guiding therapeutic intervention in thyroid nodules. However, 10% to 30% of cases with indeterminate cytology in FNAB need other diagnostic tools to refine diagnosis. We compared the pyrosequencing method with the conventional direct DNA sequencing analysis and investigated the usefulness of preoperative BRAF mutation analysis as an adjunct diagnostic tool with routine FNAB. A total of 103 surgically confirmed patients' FNA slides were recruited and DNA was extracted after atypical cells were scraped from the slides. BRAF mutation was analyzed by pyrosequencing and direct DNA sequencing. Sixty-three (77.8%) of 81 histopathologically diagnosed malignant nodules revealed positive BRAF mutation on pyrosequencing analysis. In detail, 63 (84.0%) of 75 papillary thyroid carcinoma (PTC) samples showed positive BRAF mutation, whereas 3 follicular thyroid carcinomas, 1 anaplastic carcinoma, 1 medullary thyroid carcinoma, and 1 metastatic lung carcinoma did not show BRAF mutation. None of 22 benign nodules had BRAF mutation in both pyrosequencing and direct DNA sequencing. Out of 27 thyroid nodules classified as 'indeterminate' on cytologic examination preoperatively, 21 (77.8%) cases turned out to be malignant: 18 PTCs (including 2 follicular variant types) and 3 follicular thyroid carcinomas. Among these, 13 (61.9%) classic PTCs had BRAF mutation. None of 6 benign nodules, including 3 follicular adenomas and 3 nodular hyperplasias, had BRAF mutation. Among 63 PTCs with positive BRAF mutation detected by pyrosequencing analysis, 3 cases did not show BRAF mutation by direct DNA sequencing. Although it was not statistically significant, pyrosequencing was superior to direct DNA sequencing in detecting the BRAF mutation of thyroid nodules (P=0.25). Detecting BRAF mutation by pyrosequencing is more sensitive, faster, and less expensive than direct DNA sequencing and is proposed as an adjunct diagnostic tool in evaluating thyroid nodules of indeterminate cytology.

  11. Directionality analysis on functional magnetic resonance imaging during motor task using Granger causality.

    PubMed

    Anwar, A R; Muthalib, M; Perrey, S; Galka, A; Granert, O; Wolff, S; Deuschl, G; Raethjen, J; Heute, U; Muthuraman, M

    2012-01-01

    Directionality analysis of signals originating from different parts of brain during motor tasks has gained a lot of interest. Since brain activity can be recorded over time, methods of time series analysis can be applied to medical time series as well. Granger Causality is a method to find a causal relationship between time series. Such causality can be referred to as a directional connection and is not necessarily bidirectional. The aim of this study is to differentiate between different motor tasks on the basis of activation maps and also to understand the nature of connections present between different parts of the brain. In this paper, three different motor tasks (finger tapping, simple finger sequencing, and complex finger sequencing) are analyzed. Time series for each task were extracted from functional magnetic resonance imaging (fMRI) data, which have a very good spatial resolution and can look into the sub-cortical regions of the brain. Activation maps based on fMRI images show that, in case of complex finger sequencing, most parts of the brain are active, unlike finger tapping during which only limited regions show activity. Directionality analysis on time series extracted from contralateral motor cortex (CMC), supplementary motor area (SMA), and cerebellum (CER) show bidirectional connections between these parts of the brain. In case of simple finger sequencing and complex finger sequencing, the strongest connections originate from SMA and CMC, while connections originating from CER in either direction are the weakest ones in magnitude during all paradigms.

  12. A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.

    PubMed

    Razvi, F; Gargiulo, G; Worcel, A

    1983-08-01

    Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32Ps at the reformed restriction site. Double digestions of the circular DNA with the original enzyme and a second restriction enzyme cleavage near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. Similar double digestions, using an isoschizomer that cleaves differently at the 32P-labeled site, allows direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.

  13. Interuser Interference Analysis for Direct-Sequence Spread-Spectrum Systems Part I: Partial-Period Cross-Correlation

    NASA Technical Reports Server (NTRS)

    Ni, Jianjun (David)

    2012-01-01

    This presentation discusses an analysis approach to evaluate the interuser interference for Direct-Sequence Spread-Spectrum (DSSS) Systems for Space Network (SN) Users. Part I of this analysis shows that the correlation property of pseudo noise (PN) sequences is the critical factor which determines the interuser interference performance of the DSSS system. For non-standard DSSS systems in which the PN sequence's period is much larger than one data symbol duration, it is the partial-period cross-correlation that determines the system performance. This study reveals through an example that a well-designed PN sequence set (e.g. Gold Sequence, in which the cross-correlation for a whole-period is well controlled) may have non-controlled partial-period cross-correlation which could cause severe interuser interference for a DSSS system. Since the analytical derivation of performance metric (bit error rate or signal-to-noise ratio) based on partial-period cross-correlation is prohibitive, the performance degradation due to partial-period cross-correlation will be evaluated using simulation in Part II of this analysis in the future.
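
    The distinction between whole-period and partial-period cross-correlation can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the presentation's method: sequences are taken as lists of +1/-1 chips, the function names are hypothetical, and the toy sequences are hand-picked rather than Gold or other real PN codes.

```python
def partial_cross_correlation(a, b, shift, start, window):
    """Correlate `window` chips of a, beginning at `start`, against b
    cyclically shifted by `shift`. Both sequences share one period."""
    n = len(a)
    if len(b) != n:
        raise ValueError("sequences must have the same period")
    return sum(
        a[(start + i) % n] * b[(start + i + shift) % n]
        for i in range(window)
    )


def periodic_cross_correlation(a, b, shift):
    """Whole-period cross-correlation: the window spans the full period."""
    return partial_cross_correlation(a, b, shift, 0, len(a))


# Toy +1/-1 sequences (assumed for illustration only).
a = [1, 1, -1, -1]
b = [1, 1, 1, 1]

print(periodic_cross_correlation(a, b, 0))       # whole period
print(partial_cross_correlation(a, b, 0, 0, 2))  # first half of the period
print(partial_cross_correlation(a, b, 0, 2, 2))  # second half of the period
```

    In this toy case the whole-period value is 0 while the two half-period windows give +2 and -2, which mirrors the point above: a symbol that spans only part of the code period sees the partial value, not the well-controlled whole-period one.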

  14. Novel primer specific false terminations during DNA sequencing reactions: danger of inaccuracy of mutation analysis in molecular diagnostics

    PubMed Central

    Anwar, R; Booth, A; Churchill, A J; Markham, A F

    1996-01-01

    The determination of nucleotide sequence is fundamental to the identification and molecular analysis of genes. Direct sequencing of PCR products is now becoming a commonplace procedure for haplotype analysis, and for defining mutations and polymorphism within genes, particularly for diagnostic purposes. A previously unrecognised phenomenon, primer related variability, observed in sequence data generated using Taq cycle sequencing and T7 Sequenase sequencing, is reported. This suggests that caution is necessary when interpreting DNA sequence data. This is particularly important in situations where treatment may be dependent on the accuracy of the molecular diagnosis. PMID:16696096

  15. Evaluating the accuracy of SHAPE-directed RNA secondary structure predictions

    PubMed Central

    Sükösd, Zsuzsanna; Swenson, M. Shel; Kjems, Jørgen; Heitsch, Christine E.

    2013-01-01

    Recent advances in RNA structure determination include using data from high-throughput probing experiments to improve thermodynamic prediction accuracy. We evaluate the extent and nature of improvements in data-directed predictions for a diverse set of 16S/18S ribosomal sequences using a stochastic model of experimental SHAPE data. The average accuracy for 1000 data-directed predictions always improves over the original minimum free energy (MFE) structure. However, the amount of improvement varies with the sequence, exhibiting a correlation with MFE accuracy. Further analysis of this correlation shows that accurate MFE base pairs are typically preserved in a data-directed prediction, whereas inaccurate ones are not. Thus, the positive predictive value of common base pairs is consistently higher than the directed prediction accuracy. Finally, we confirm sequence dependencies in the directability of thermodynamic predictions and investigate the potential for greater accuracy improvements in the worst performing test sequence. PMID:23325843
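
    The accuracy measures mentioned above can be sketched as set operations on base pairs. This is a minimal illustration, assuming each structure is represented as a set of (i, j) index pairs; the function name and the toy structures are hypothetical and are not taken from the study.

```python
def base_pair_accuracy(predicted, reference):
    """Return (sensitivity, positive predictive value) for a predicted
    set of base pairs scored against a reference structure."""
    # Normalise so that (i, j) and (j, i) count as the same pair.
    predicted = {tuple(sorted(p)) for p in predicted}
    reference = {tuple(sorted(p)) for p in reference}
    true_pos = len(predicted & reference)
    sensitivity = true_pos / len(reference) if reference else 0.0
    ppv = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, ppv


# Toy structures (assumed values, for illustration only).
reference = {(1, 10), (2, 9), (3, 8), (4, 7)}
mfe = {(1, 10), (2, 9), (5, 12)}
directed = {(1, 10), (2, 9), (3, 8), (6, 11)}

# Base pairs shared by the MFE and data-directed predictions.
common = mfe & directed

print(base_pair_accuracy(directed, reference))
print(base_pair_accuracy(common, reference))
```

    In this toy case the pairs common to both predictions have a PPV of 1.0 against 0.75 for the directed prediction alone, the same direction of effect the abstract describes.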

  16. Deep sequencing reveals double mutations in cis of MPL exon 10 in myeloproliferative neoplasms.

    PubMed

    Pietra, Daniela; Brisci, Angela; Rumi, Elisa; Boggi, Sabrina; Elena, Chiara; Pietrelli, Alessandro; Bordoni, Roberta; Ferrari, Maurizio; Passamonti, Francesco; De Bellis, Gianluca; Cremonesi, Laura; Cazzola, Mario

    2011-04-01

    Somatic mutations of MPL exon 10, mainly involving a W515 substitution, have been described in JAK2 (V617F)-negative patients with essential thrombocythemia and primary myelofibrosis. We used direct sequencing and high-resolution melt analysis to identify mutations of MPL exon 10 in 570 patients with myeloproliferative neoplasms, and allele specific PCR and deep sequencing to further characterize a subset of mutated patients. Somatic mutations were detected in 33 of 221 patients (15%) with JAK2 (V617F)-negative essential thrombocythemia or primary myelofibrosis. Only one patient with essential thrombocythemia carried both JAK2 (V617F) and MPL (W515L). High-resolution melt analysis identified abnormal patterns in all the MPL mutated cases, while direct sequencing did not detect the mutant MPL in one fifth of them. In 3 cases carrying double MPL mutations, deep sequencing analysis showed identical load and location in cis of the paired lesions, indicating their simultaneous occurrence on the same chromosome.

  17. PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.

    PubMed

    Wimmer, Katharina; Wernstedt, Annekatrin

    2014-01-01

    The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method has also the advantage that it allows to effectively identify deletions, splice mutations, and de novo retrotransposon insertions that escape the detection of most DNA-based mutation analysis protocols.

  18. Phylogeny of sipunculan worms: A combined analysis of four gene regions and morphology.

    PubMed

    Schulze, Anja; Cutler, Edward B; Giribet, Gonzalo

    2007-01-01

    The intra-phyletic relationships of sipunculan worms were analyzed based on DNA sequence data from four gene regions and 58 morphological characters. Initially we analyzed the data under direct optimization using parsimony as optimality criterion. An implied alignment resulting from the direct optimization analysis was subsequently utilized to perform a Bayesian analysis with mixed models for the different data partitions. For this we applied a doublet model for the stem regions of the 18S rRNA. Both analyses support monophyly of Sipuncula and most of the same clades within the phylum. The analyses differ with respect to the relationships among the major groups but whereas the deep nodes in the direct optimization analysis generally show low jackknife support, they are supported by 100% posterior probability in the Bayesian analysis. Direct optimization has been useful for handling sequences of unequal length and generating conservative phylogenetic hypotheses whereas the Bayesian analysis under mixed models provided high resolution in the basal nodes of the tree.

  19. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

    PubMed

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes.

  20. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    PubMed Central

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes. PMID:29250096

  1. [High resolution melting analysis for detecting of JAK2V617F mutation in patients with myeloproliferative neoplasms].

    PubMed

    Chen, Hai-Hua; Yang, Ji-Long; Lu, Hui-Fang; Zhou, Wei-Jun; Yao, Fei; Deng, Lan

    2014-02-01

    This study aimed to investigate the feasibility of high resolution melting (HRM) analysis for detecting the JAK2V617F mutation in patients with myeloproliferative neoplasms (MPN). Twenty-nine marrow samples, randomly selected from patients clinically diagnosed with MPN between January 2008 and January 2011, were tested by HRM. The HRM results were compared with those obtained by allele-specific polymerase chain reaction (AS-PCR) and direct DNA sequencing. The JAK2V617F mutation was detected by HRM in 11 of 29 cases (37.9%), in 100% agreement with the direct sequencing results, whereas the consistency of AS-PCR with direct sequencing was reported as moderate (Kappa = 0.179, P = 0.316). It is concluded that HRM analysis may be an optimal method for clinical screening of the JAK2V617F mutation because of its simplicity, speed, and high specificity.
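
    The agreement statistic quoted above (Kappa) can be illustrated with a minimal sketch of Cohen's kappa for two binary tests. The function name and the 2x2 layout are assumptions made for illustration; the abstract does not give the AS-PCR counts, so only the HRM versus direct sequencing case (11 concordant positives, 18 concordant negatives, implied by 11/29 and 100% agreement) is shown.

```python
def cohen_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two binary methods, given a 2x2 agreement table.

    both_pos / both_neg: samples on which the two methods agree.
    a_only / b_only: samples called positive by only one method.
    (Hypothetical helper, not taken from the cited study.)
    """
    n = both_pos + a_only + b_only + both_neg
    if n == 0:
        raise ValueError("agreement table is empty")
    observed = (both_pos + both_neg) / n      # observed agreement
    a_pos = (both_pos + a_only) / n           # positive rate, method A
    b_pos = (both_pos + b_only) / n           # positive rate, method B
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)  # chance agreement
    if expected == 1:
        return 1.0                            # degenerate table: all calls identical
    return (observed - expected) / (1 - expected)


# HRM vs direct sequencing as described in the abstract: full agreement.
print(cohen_kappa(both_pos=11, a_only=0, b_only=0, both_neg=18))
```

    Kappa corrects raw agreement for the agreement expected by chance, which is why it can be low even when two methods agree on most samples.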

  2. Whole-exome sequencing identifies USH2A mutations in a pseudo-dominant Usher syndrome family.

    PubMed

    Zheng, Sui-Lian; Zhang, Hong-Liang; Lin, Zhen-Lang; Kang, Qian-Yan

    2015-10-01

    Usher syndrome (USH) is an autosomal recessive (AR) multi-sensory degenerative disorder leading to deaf-blindness. USH is clinically subdivided into three subclasses, and 10 genes have been identified thus far. Clinical and genetic heterogeneities in USH make a precise diagnosis difficult. A dominant‑like USH family in successive generations was identified, and the present study aimed to determine the genetic predisposition of this family. Whole‑exome sequencing was performed in two affected patients and an unaffected relative. Systematic data were analyzed by bioinformatic analysis to remove the candidate mutations via step‑wise filtering. Direct Sanger sequencing and co‑segregation analysis were performed in the pedigree. One novel and two known mutations in the USH2A gene were identified, and were further confirmed by direct sequencing and co‑segregation analysis. The affected mother carried compound mutations in the USH2A gene, while the unaffected father carried a heterozygous mutation. The present study demonstrates that whole‑exome sequencing is a robust approach for the molecular diagnosis of disorders with high levels of genetic heterogeneity.

  3. To Clone or Not To Clone: Method Analysis for Retrieving Consensus Sequences In Ancient DNA Samples

    PubMed Central

    Winters, Misa; Barta, Jodi Lynn; Monroe, Cara; Kemp, Brian M.

    2011-01-01

    The challenges associated with the retrieval and authentication of ancient DNA (aDNA) evidence are principally due to post-mortem damage which makes ancient samples particularly prone to contamination from “modern” DNA sources. The necessity for authentication of results has led many aDNA researchers to adopt methods considered to be “gold standards” in the field, including cloning aDNA amplicons as opposed to directly sequencing them. However, no standardized protocol has emerged regarding the necessary number of clones to sequence, how a consensus sequence is most appropriately derived, or how results should be reported in the literature. In addition, there has been no systematic demonstration of the degree to which direct sequences are affected by damage or whether direct sequencing would provide disparate results from a consensus of clones. To address this issue, a comparative study was designed to examine both cloned and direct sequences amplified from ∼3,500 year-old ancient northern fur seal DNA extracts. Majority rules and the Consensus Confidence Program were used to generate consensus sequences for each individual from the cloned sequences, which exhibited damage at 31 of 139 base pairs across all clones. In no instance did the consensus of clones differ from the direct sequence. This study demonstrates that, when appropriate, cloning need not be the default method, but instead, should be used as a measure of authentication on a case-by-case basis, especially when this practice adds time and cost to studies where it may be superfluous. PMID:21738625
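
    The comparison described above (a consensus of clones versus the direct sequence) can be sketched as follows. This is a minimal illustration, not the Consensus Confidence Program or the authors' procedure: it assumes the clone sequences are already aligned to equal length, takes the most frequent base per column, and marks ties with a hypothetical ambiguity symbol "N".

```python
from collections import Counter


def majority_consensus(clones, ambiguous="N"):
    """Most-frequent-base consensus of pre-aligned, equal-length clone sequences."""
    if not clones:
        raise ValueError("need at least one clone sequence")
    length = len(clones[0])
    if any(len(c) != length for c in clones):
        raise ValueError("clone sequences must be aligned to the same length")
    consensus = []
    for column in zip(*clones):
        counts = Counter(column).most_common()
        # A tie between the two most frequent bases is left unresolved.
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            consensus.append(ambiguous)
        else:
            consensus.append(counts[0][0])
    return "".join(consensus)


def mismatches(seq_a, seq_b):
    """0-based positions at which two aligned sequences differ."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Toy data (invented): one clone carries a damage-like C->T change.
clones = ["ACGT", "ACGT", "ATGT"]
direct = "ACGT"
consensus = majority_consensus(clones)
print(consensus, mismatches(consensus, direct))
```

    An empty mismatch list corresponds to the situation reported in the abstract, where the consensus of clones never differed from the direct sequence.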

  4. Interactive computer programs for the graphic analysis of nucleotide sequence data.

    PubMed Central

    Luckow, V A; Littlewood, R K; Rownd, R H

    1984-01-01

    A group of interactive computer programs have been developed which aid in the collection and graphical analysis of nucleotide and protein sequence data. The programs perform the following basic functions: a) enter, edit, list, and rearrange sequence data; b) permit automatic entry of nucleotide sequence data directly from an autoradiograph into the computer; c) search for restriction sites or other specified patterns and plot a linear or circular restriction map, or print their locations; d) plot base composition; e) analyze homology between sequences by plotting a two-dimensional graphic matrix; and f) aid in plotting predicted secondary structures of RNA molecules. PMID:6546437
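
    Two of the functions listed above, searching for restriction sites and computing base composition, can be sketched in a few lines. This is an illustration only, not the 1984 programs: the function names are hypothetical, the sequence is treated as linear (circular maps are not handled), and the example pattern GAATTC is the EcoRI recognition site.

```python
from collections import Counter


def find_sites(sequence, pattern):
    """1-based start positions of every (possibly overlapping) exact match."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    sequence, pattern = sequence.upper(), pattern.upper()
    positions = []
    start = sequence.find(pattern)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(pattern, start + 1)
    return positions


def base_composition(sequence):
    """Fraction of A, C, G and T in the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGT"}


toy = "GGAATTCCGAATTC"  # invented example sequence
print(find_sites(toy, "GAATTC"))
print(base_composition(toy))
```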

  5. [Identification of a HPGD mutation in three families affected with primary hypertrophic osteoarthropathy].

    PubMed

    Zhang, Wanying; Wang, Tao; Huang, Shuaiwu; Zhao, Xiuli

    2018-04-10

    The aim was to detect mutations of the HPGD gene in three pedigrees affected with primary hypertrophic osteoarthropathy (PHO) by DNA sequencing and high-resolution melting (HRM) analysis. Genomic DNA was extracted from peripheral blood samples collected from the pedigrees. PCR and direct sequencing were carried out to identify potential mutations of the HPGD gene. Amplicons containing the mutation site were generated by nested PCR. The products were then subjected to HRM analysis using the HR-1 instrument. Direct sequencing was carried out in family members and healthy individuals to confirm the results of the HRM analysis. A homozygous c.310_311delCT mutation was detected in two affected probands, while a heterozygous c.310_311delCT mutation was detected in the third proband. HRM analysis of the fragments encompassing HPGD exon 3 showed three curve patterns representing three different genotypes, i.e., the wild type, the c.310_311delCT homozygote, and the c.310_311delCT heterozygote. The DNA sequencing results were consistent with those of the HRM analysis and with the phenotypes of the subjects. The c.310_311delCT mutation may be the most prevalent mutation in the Chinese population. HRM analysis provides an optimized method for genetic testing of HPGD mutations owing to its simplicity, rapid turnaround, and high sensitivity.

  6. A novel progesterone receptor membrane component (PGRMC) in the human and swine parasite Taenia solium: implications to the host-parasite relationship.

    PubMed

    Aguilar-Díaz, Hugo; Nava-Castro, Karen E; Escobedo, Galileo; Domínguez-Ramírez, Lenin; García-Varela, Martín; Del Río-Araiza, Víctor H; Palacios-Arreola, Margarita I; Morales-Montor, Jorge

    2018-03-09

    We have previously reported that progesterone (P4) has a direct in vitro effect on the scolex evagination and growth of Taenia solium cysticerci. Here, we explored the hypothesis that the direct effect of P4 on T. solium might be mediated by a novel steroid-binding parasite protein. Using immunofluorescent confocal microscopy, flow cytometry analysis, double-dimension electrophoresis analysis, and sequencing of the corresponding protein spot, we detected a novel PGRMC in T. solium. Molecular modeling studies accompanied by computer docking using the sequenced protein, together with phylogenetic analysis and sequence alignment, clearly demonstrated that T. solium PGRMC is of parasite origin. Our results show that P4 in vitro increases parasite evagination and scolex size. Using immunofluorescent confocal microscopy, we detected that parasite cells showed expression of a P4-binding-like protein exclusively located at the cysticercus subtegumental tissue. Presence of the P4-binding protein in cyst cells was also confirmed by flow cytometry. Double-dimension electrophoresis analysis, followed by sequencing of the corresponding protein spot, revealed a protein that was previously reported in the T. solium genome belonging to a membrane-associated progesterone receptor component (PGRMC). Molecular modeling studies accompanied by computer docking using the sequenced protein showed that PGRMC is potentially able to bind steroid hormones such as progesterone, estradiol, testosterone and dihydrotestosterone with different affinities. Phylogenetic analysis and sequence alignment clearly demonstrated that T. solium PGRMC is related to a steroid-binding protein of Echinococcus granulosus, both of them being nested within a cluster including similar proteins present in platyhelminths such as Schistocephalus solidus and Schistosoma haematobium. Progesterone may directly act upon T. solium cysticerci, probably by binding to PGRMC. This research has implications in the field of host-parasite co-evolution as well as the sex-associated susceptibility to this infection. On a more practical level, the present results may contribute to the molecular design of new drugs with anti-parasite actions.

  7. Mass fingerprinting of the venom and transcriptome of venom gland of scorpion Centruroides tecomanus.

    PubMed

    Valdez-Velázquez, Laura L; Quintero-Hernández, Verónica; Romero-Gutiérrez, Maria Teresa; Coronas, Fredy I V; Possani, Lourival D

    2013-01-01

    Centruroides tecomanus is a Mexican scorpion endemic to the State of Colima that causes human fatalities. This communication describes a proteome analysis obtained from milked venom and a transcriptome analysis from a cDNA library constructed from two pairs of venom glands of this scorpion. High-performance liquid chromatography separation of soluble venom produced 80 fractions, from which at least 104 individual components were identified by mass spectrometry analysis, with molecular masses ranging from 259 to 44,392 Da. Most of these components are within the expected molecular masses for Na(+)- and K(+)-channel specific toxic peptides, supporting the clinical findings of intoxication when humans are stung by this scorpion. From the cDNA library 162 clones were randomly chosen, from which 130 sequences of good quality were identified and were clustered in 28 contigs, each containing two or more expressed sequence tags (EST), and 49 singlets with only one EST. Deduced amino acid sequence analysis from 53% of the total ESTs showed that 81% (24 sequences) are similar to known toxic peptides that affect Na(+)-channel activity, and 19% (7 unique sequences) are similar to K(+)-channel specific toxins. Out of the 31 sequences, at least 8 peptides were confirmed by direct Edman degradation, using components isolated directly from the venom. The remaining 19%, 4%, 4%, 15% and 5% of the ESTs correspond respectively to proteins involved in cellular processes, antimicrobial peptides, venom components, proteins without defined function and sequences without similarity in databases. Among the cloned genes are those similar to metalloproteinases.

  8. Forensic strategy to ensure the quality of sequencing data of mitochondrial DNA in highly degraded samples.

    PubMed

    Adachi, Noboru; Umetsu, Kazuo; Shojo, Hideki

    2014-01-01

    Mitochondrial DNA (mtDNA) is widely used for DNA analysis of highly degraded samples because of its polymorphic nature and high number of copies in a cell. However, as endogenous mtDNA in deteriorated samples is scarce and highly fragmented, it is not easy to obtain reliable data. In the current study, we report the risks of direct sequencing mtDNA in highly degraded material, and suggest a strategy to ensure the quality of sequencing data. It was observed that direct sequencing data of the hypervariable segment (HVS) 1 by using primer sets that generate an amplicon of 407 bp (long-primer sets) was different from results obtained by using newly designed primer sets that produce an amplicon of 120-139 bp (mini-primer sets). The data aligned with the results of mini-primer sets analysis in an amplicon length-dependent manner; the shorter the amplicon, the more evident the endogenous sequence became. Coding region analysis using multiplex amplified product-length polymorphisms revealed the incongruence of single nucleotide polymorphisms between the coding region and HVS 1 caused by contamination with exogenous mtDNA. Although the sequencing data obtained using long-primer sets turned out to be erroneous, it was unambiguous and reproducible. These findings suggest that PCR primers that produce amplicons shorter than those currently recognized should be used for mtDNA analysis in highly degraded samples. Haplogroup motif analysis of the coding region and HVS should also be performed to improve the reliability of forensic mtDNA data. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  9. Variations on a theme of Lander and Waterman

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Speed, T.

    1997-12-01

    The original Lander and Waterman mathematical analysis was for fingerprinting random clones. Since that time, a number of variants of their theory have appeared, including ones which apply to mapping by anchoring random clones, and to non-random or directed clone mapping. The same theory is now widely used to devise random sequencing strategies. In this talk I will review these developments, and go on to discuss the theory required for directed sequencing strategies.
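
    For the random-strategy case mentioned above, the standard Lander-Waterman expectations can be sketched as below. The abstract does not state any formulas; the ones used here are the textbook results (coverage c = NL/G, expected covered fraction 1 - e^(-c), expected number of islands N * e^(-c * sigma) with sigma = 1 - T/L), and the function and parameter names are assumptions. Directed strategies, the subject of the talk, are not modeled.

```python
import math


def lander_waterman(n_reads, read_len, genome_len, min_overlap=0):
    """Textbook Lander-Waterman expectations for random clones or reads.

    n_reads (N), read_len (L), genome_len (G), and min_overlap (T, the
    smallest overlap that can be detected) are illustrative parameter names.
    """
    if not 0 <= min_overlap < read_len:
        raise ValueError("min_overlap must satisfy 0 <= T < L")
    c = n_reads * read_len / genome_len        # redundancy of coverage
    sigma = 1 - min_overlap / read_len         # 1 - theta, theta = T / L
    return {
        "coverage": c,
        "fraction_covered": 1 - math.exp(-c),
        "expected_islands": n_reads * math.exp(-c * sigma),
    }


# Invented example: 1,000 reads of 500 bp on a 500 kb target (1x coverage).
print(lander_waterman(1000, 500, 500_000))
```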

  10. Proliferating cell nuclear antigen (Pcna) as a direct downstream target gene of Hoxc8

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Min, Hyehyun; Lee, Ji-Yeon; Bok, Jinwoong

    2010-02-19

    Hoxc8 is a member of Hox family transcription factors that play crucial roles in spatiotemporal body patterning during embryogenesis. Hox proteins contain a conserved 61 amino acid homeodomain, which is responsible for recognition and binding of the proteins onto Hox-specific DNA binding motifs and regulates expression of their target genes. Previously, using proteome analysis, we identified Proliferating cell nuclear antigen (Pcna) as one of the putative target genes of Hoxc8. Here, we asked whether Hoxc8 regulates Pcna expression by directly binding to the regulatory sequence of Pcna. In mouse embryos at embryonic day 11.5, the expression pattern of Pcna was similar to that of Hoxc8 along the anteroposterior body axis. Moreover, Pcna transcript levels as well as cell proliferation rate were increased by overexpression of Hoxc8 in C3H10T1/2 mouse embryonic fibroblast cells. Characterization of 2.3 kb genomic sequence upstream of Pcna coding region revealed that the upstream sequence contains several Hox core binding sequences and one Hox-Pbx binding sequence. Direct binding of Hoxc8 proteins to the Pcna regulatory sequence was verified by chromatin immunoprecipitation assay. Taken together, our data suggest that Pcna is a direct downstream target of Hoxc8.

  11. The Neandertal type site revisited: Interdisciplinary investigations of skeletal remains from the Neander Valley, Germany

    PubMed Central

    Schmitz, Ralf W.; Serre, David; Bonani, Georges; Feine, Susanne; Hillgruber, Felix; Krainitzki, Heike; Pääbo, Svante; Smith, Fred H.

    2002-01-01

    The 1856 discovery of the Neandertal type specimen (Neandertal 1) in western Germany marked the beginning of human paleontology and initiated the longest-standing debate in the discipline: the role of Neandertals in human evolutionary history. We report excavations of cave sediments that were removed from the Feldhofer caves in 1856. These deposits have yielded over 60 human skeletal fragments, along with a large series of Paleolithic artifacts and faunal material. Our analysis of this material represents the first interdisciplinary analysis of Neandertal remains incorporating genetic, direct dating, and morphological dimensions simultaneously. Three of these skeletal fragments fit directly on Neandertal 1, whereas several others have distinctively Neandertal features. At least three individuals are represented in the skeletal sample. Radiocarbon dates for Neandertal 1, from which a mtDNA sequence was determined in 1997, and a second individual indicate an age of ≈40,000 yr for both. mtDNA analysis on the same second individual yields a sequence that clusters with other published Neandertal sequences. PMID:12232049

  12. Detection of a new bat gammaherpesvirus in the Philippines.

    PubMed

    Watanabe, Shumpei; Ueda, Naoya; Iha, Koichiro; Masangkay, Joseph S; Fujii, Hikaru; Alviola, Phillip; Mizutani, Tetsuya; Maeda, Ken; Yamane, Daisuke; Walid, Azab; Kato, Kentaro; Kyuwa, Shigeru; Tohya, Yukinobu; Yoshikawa, Yasuhiro; Akashi, Hiroomi

    2009-08-01

    A new bat herpesvirus was detected in the spleen of an insectivorous bat (Hipposideros diadema, family Hipposideridae) collected on Panay Island, the Philippines. PCR analyses were performed using COnsensus-DEgenerate Hybrid Oligonucleotide Primers (CODEHOPs) targeting the herpesvirus DNA polymerase (DPOL) gene. Although we obtained PCR products with CODEHOPs, direct sequencing using the primers was not possible because of high degree of degeneracy. Direct sequencing technology developed in our rapid determination system of viral RNA sequences (RDV) was applied in this study, and a partial DPOL nucleotide sequence was determined. In addition, a partial gB gene nucleotide sequence was also determined using the same strategy. We connected the partial gB and DPOL sequences with long-distance PCR, and a 3741-bp nucleotide fragment, including the 3' part of the gB gene and the 5' part of the DPOL gene, was finally determined. Phylogenetic analysis showed that the sequence was novel and most similar to those of the subfamily Gammaherpesvirinae.

  13. Massively Parallel DNA Sequencing Facilitates Diagnosis of Patients with Usher Syndrome Type 1

    PubMed Central

    Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-ichi

    2014-01-01

    Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1%) who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%), which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance. PMID:24618850

  14. Massively parallel DNA sequencing facilitates diagnosis of patients with Usher syndrome type 1.

    PubMed

    Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-Ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-Ichi

    2014-01-01

    Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1%) who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%), which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance.

  15. Extensive characterization of Tupaia belangeri neuropeptidome using an integrated mass spectrometric approach.

    PubMed

    Petruzziello, Filomena; Fouillen, Laetitia; Wadensten, Henrik; Kretz, Robert; Andren, Per E; Rainer, Gregor; Zhang, Xiaozhe

    2012-02-03

    Neuropeptidomics is used to characterize endogenous peptides in the brain of tree shrews (Tupaia belangeri). Tree shrews are small animals similar to rodents in size but close relatives of primates, and are excellent models for brain research. Currently, tree shrews have no complete proteome information available on which direct database search can be allowed for neuropeptide identification. To increase the capability in the identification of neuropeptides in tree shrews, we developed an integrated mass spectrometry (MS)-based approach that combines methods including data-dependent, directed, and targeted liquid chromatography (LC)-Fourier transform (FT)-tandem MS (MS/MS) analysis, database construction, de novo sequencing, precursor protein search, and homology analysis. Using this integrated approach, we identified 107 endogenous peptides that have sequences identical or similar to those from other mammalian species. High accuracy MS and tandem MS information, with BLAST analysis and chromatographic characteristics were used to confirm the sequences of all the identified peptides. Interestingly, further sequence homology analysis demonstrated that tree shrew peptides have a significantly higher degree of homology to equivalent sequences in humans than those in mice or rats, consistent with the close phylogenetic relationship between tree shrews and primates. Our results provide the first extensive characterization of the peptidome in tree shrews, which now permits characterization of their function in nervous and endocrine system. As the approach developed fully used the conservative properties of neuropeptides in evolution and the advantage of high accuracy MS, it can be portable for identification of neuropeptides in other species for which the fully sequenced genomes or proteomes are not available.

  16. Analysis of human mitochondrial DNA sequences from fecally polluted environmental waters as a tool to study population diversity

    EPA Science Inventory

    Mitochondrial signature sequences have frequently been used to study the demographics of many different populations around the world. Traditionally, this requires obtaining samples directly from individuals which is cumbersome, time consuming and limited to the number of individu...

  17. Advances in high throughput DNA sequence data compression.

    PubMed

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

    2016-06-01

    Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.

  18. Attomole-level Genomics with Single-molecule Direct DNA, cDNA and RNA Sequencing Technologies.

    PubMed

    Ozsolak, Fatih

    2016-01-01

    With the introduction of next-generation sequencing (NGS) technologies in 2005, the domination of microarrays in genomics quickly came to an end due to NGS's superior technical performance and cost advantages. By enabling genetic analysis capabilities that were not possible previously, NGS technologies have started to play an integral role in all areas of biomedical research. This chapter outlines the low-quantity DNA and cDNA sequencing capabilities and applications developed with the Helicos single molecule DNA sequencing technology.

  19. Harnessing Whole Genome Sequencing in Medical Mycology.

    PubMed

    Cuomo, Christina A

    2017-01-01

    Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.

  20. Oil Analysis.

    DTIC Science & Technology

    1982-08-23

    LUBRICATION, FAILURE PROGRESSION MONITORING OIL-ANALYSIS, FAILURE ANALYSIS, TRIBOLOGY WEAR DEBRIS ANALYSIS, WEAR REGIMES DIAGNOSTICS, BENCH TESTING, FERROGRAPHY ...Spectrometric Oil Analysis . ............... 400 G. Analytical Ferrography ............................. 411 3 NAEC-92-153 TABLE OF CONTENTS (Continued...of ferrography entry deposit micrographs of these sequences, which can be directly related to sample debris concentration levels. These micrographs

  1. Identification and nucleotide sequence analysis of the repetitive DNA element in the genome of fish lymphocystis disease virus.

    PubMed

    Schnitzler, P; Delius, H; Scholz, J; Touray, M; Orth, E; Darai, G

    1987-12-01

    The genome of the fish lymphocystis disease virus (FLDV) was screened for the existence of repetitive DNA sequences using a defined and complete gene library of the viral genome (98 kbp) by DNA-DNA hybridization, heteroduplex analysis, and restriction fine mapping. A repetitive DNA sequence was detected at the coordinates 0.034 to 0.057 and 0.718 to 0.736 map units (m.u.) of the FLDV genome. The first region (0.034 to 0.057 m.u.) corresponds to the 5' terminus of the EcoRI FLDV DNA fragment B (0.034 to 0.165 m.u.) and the second region (0.718 to 0.736 m.u.) is identical to the EcoRI DNA fragment M of the viral genome. The DNA nucleotide sequence of the EcoRI FLDV DNA fragment M was determined. This analysis revealed the presence of many short direct and inverted repetitions, e.g., an 18-mer direct repetition (TTTAAAATTTAATTAA) that started at nucleotide positions 812 and 942 and a 14-mer inverted repeat (TTAAATTTAAATTT) at nucleotide positions 820 and 959. Only short open reading frames were detected within this region. The DNA repetitions are discussed as sequences that may play a regulatory role in virus replication. Furthermore, hybridization experiments revealed that the repetitive DNA sequences are conserved in the genomes of different strains of fish lymphocystis disease virus isolated from two species of Pleuronectidae (flounder and dab).
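
    The kind of repeat search described above can be sketched with a simple k-mer index. This is an illustrative approach, not the method used in the study: the function names are hypothetical, only exact matches of a fixed length k are found, an "inverted repeat" is taken to mean a k-mer whose reverse complement also occurs in the sequence, and the toy sequences are invented.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmer_positions(seq, k):
    """Map every k-mer to the list of its 1-based start positions."""
    index = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i + 1)
    return index


def direct_repeats(seq, k):
    """k-mers that occur at more than one position."""
    return {kmer: pos for kmer, pos in kmer_positions(seq, k).items() if len(pos) > 1}


def inverted_repeats(seq, k):
    """Pairs (k-mer, reverse complement) that both occur in the sequence.

    Each pair is reported once; a k-mer equal to its own reverse
    complement is reported paired with itself.
    """
    index = kmer_positions(seq, k)
    found = {}
    for kmer, pos in index.items():
        rc = reverse_complement(kmer)
        if rc in index and kmer <= rc:
            found[(kmer, rc)] = (pos, index[rc])
    return found


print(direct_repeats("ACGTTTACGT", 4))
print(inverted_repeats("AACGCCCGTT", 4))
```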

  2. Direct Detection and Sequencing of Damaged DNA Bases

    PubMed Central

    2011-01-01

    Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597

  3. Direct detection and sequencing of damaged DNA bases.

    PubMed

    Clark, Tyson A; Spittle, Kristi E; Turner, Stephen W; Korlach, Jonas

    2011-12-20

    Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications.

  4. Microbial Characterization of Qatari Barchan Sand Dunes

    PubMed Central

    Chatziefthimiou, Aspassia D.; Nguyen, Hanh; Richer, Renee; Louge, Michel; Sultan, Ali A.; Schloss, Patrick; Hay, Anthony G.

    2016-01-01

    This study represents the first characterization of sand microbiota in migrating barchan sand dunes. Bacterial communities were studied through direct counts and cultivation, as well as 16S rRNA gene and metagenomic sequence analysis to gain an understanding of microbial abundance, diversity, and potential metabolic capabilities. Direct on-grain cell counts gave an average of 5.3 ± 0.4 × 10^5 cells g^-1 of sand. Cultured isolates (N = 64) selected for 16S rRNA gene sequencing belonged to the phyla Actinobacteria (58%), Firmicutes (27%) and Proteobacteria (15%). Deep-sequencing of 16S rRNA gene amplicons from 18 dunes demonstrated a high relative abundance of Proteobacteria, particularly enteric bacteria, and a dune-specific pattern of bacterial community composition that correlated with dune size. Shotgun metagenome sequences of two representative dunes were analyzed and found to have similar relative bacterial abundance, though the relative abundances of eukaryotic, viral and enterobacterial sequences were greater in sand from the dune closer to a camel-pen. Functional analysis revealed patterns similar to those observed in desert soils; however, the increased relative abundance of genes encoding sporulation and dormancy is consistent with the dune microbiome being well-adapted to the exceptionally hyper-arid Qatari desert. PMID:27655399

  5. Quantiprot - a Python package for quantitative analysis of protein sequences.

    PubMed

    Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold

    2017-07-17

    The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates the distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by the model can be compared to actually observed sequences.
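
    Two of the measures named in this abstract, the n-gram distribution and a Zipf's law coefficient, can be sketched in plain Python. The sketch below deliberately does not use the Quantiprot API, which is not described here. The function names, the toy sequence, and the choice to estimate the coefficient as a least-squares slope of log frequency against log rank are assumptions.

```python
from collections import Counter
import math


def ngram_counts(seq, n):
    """Count overlapping n-grams in a sequence string."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    Assumes at least two distinct n-grams, so that the ranks vary.
    """
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Made-up toy protein sequence, used only to exercise the functions.
toy_protein = "MKVLAAGIVKVLAAGMKVLA"
bigrams = ngram_counts(toy_protein, 2)
slope = zipf_slope(bigrams)
```

    Because frequencies are sorted in decreasing order, the slope is negative whenever the n-grams are not all equally frequent; a steeper slope indicates that a few n-grams dominate the sequence.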

  6. Single-strand conformation polymorphism (SSCP)-based mutation scanning approaches to fingerprint sequence variation in ribosomal DNA of ascaridoid nematodes.

    PubMed

    Zhu, X Q; Gasser, R B

    1998-06-01

    In this study, we assessed single-strand conformation polymorphism (SSCP)-based approaches for their capacity to fingerprint sequence variation in ribosomal DNA (rDNA) of ascaridoid nematodes of veterinary and/or human health significance. The second internal transcribed spacer region (ITS-2) of rDNA was utilised as the target region because it is known to provide species-specific markers for this group of parasites. ITS-2 was amplified by PCR from genomic DNA derived from individual parasites and subjected to analysis. Direct SSCP analysis of amplicons from seven taxa (Toxocara vitulorum, Toxocara cati, Toxocara canis, Toxascaris leonina, Baylisascaris procyonis, Ascaris suum and Parascaris equorum) showed that the single-strand (ss) ITS-2 patterns produced allowed their unequivocal identification to species. While no variation in SSCP patterns was detected in the ITS-2 within four species for which multiple samples were available, the method allowed the direct display of four distinct sequence types of ITS-2 among individual worms of T. cati. Comparison of SSCP/sequencing with the methods of dideoxy fingerprinting (ddF) and restriction endonuclease fingerprinting (REF) revealed that also ddF allowed the definition of the four sequence types, whereas REF displayed three of four. The findings indicate the usefulness of the SSCP-based approaches for the identification of ascaridoid nematodes to species, the direct display of sequence variation in rDNA and the detection of population variation. The ability to fingerprint microheterogeneity in ITS-2 rDNA using such approaches also has implications for studying fundamental aspects relating to mutational change in rDNA.

  7. Parallel gene analysis with allele-specific padlock probes and tag microarrays

    PubMed Central

    Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats

    2003-01-01

    Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977

  8. mPUMA: a computational approach to microbiota analysis by de novo assembly of operational taxonomic units based on protein-coding barcode sequences.

    PubMed

    Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E

    2013-08-15

    Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has been recently demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence as well as in the translated amino acid sequence for protein coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.

  9. Recent research on the high-probability instructional sequence: A brief review.

    PubMed

    Lipschultz, Joshua; Wilder, David A

    2017-04-01

    The high-probability (high-p) instructional sequence consists of the delivery of a series of high-probability instructions immediately before delivery of a low-probability or target instruction. It is commonly used to increase compliance in a variety of populations. Recent research has described variations of the high-p instructional sequence and examined the conditions under which the sequence is most effective. This manuscript reviews the most recent research on the sequence and identifies directions for future research. Recommendations for practitioners regarding the use of the high-p instructional sequence are also provided. © 2017 Society for the Experimental Analysis of Behavior.

  10. Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity

    NASA Astrophysics Data System (ADS)

    Mukherjee, Shashi Bajaj; Sen, Pradip Kumar

    2010-10-01

    The study of periodic patterns would be expected to be a standard line of attack for recognizing DNA sequences in gene identification and similar problems, yet surprisingly little significant work has been done in this direction. This paper studies statistical properties of DNA sequences of complete genomes using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and a standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
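
    The general idea of mapping a DNA sequence to numbers and examining its Fourier spectrum can be sketched as follows. This is a minimal illustration, not the technique of the paper: the binary indicator mapping (one 0/1 sequence per base) is only one of many possible mappings and is an assumption here, as are the function names and toy sequences. The sketch evaluates the discrete Fourier transform at a single frequency, 1/period, and sums the power over the four bases.

```python
import cmath


def indicator(seq, base):
    """Binary indicator sequence: 1.0 where seq has the given base."""
    return [1.0 if b == base else 0.0 for b in seq]


def power_at_period(seq, period):
    """Spectral power at frequency 1/period, summed over A, C, G, T
    and divided by the sequence length."""
    total = 0.0
    for base in "ACGT":
        x = indicator(seq, base)
        # Discrete Fourier transform evaluated at one frequency only.
        coeff = sum(v * cmath.exp(-2j * cmath.pi * t / period)
                    for t, v in enumerate(x))
        total += abs(coeff) ** 2
    return total / len(seq)


# Made-up toy sequences: one with an exact period of 3, one with none.
periodic = "ATG" * 30
uniform = "A" * 90

p_periodic = power_at_period(periodic, 3)
p_uniform = power_at_period(uniform, 3)
```

    For the exactly periodic toy sequence the power at period 3 is large (30 in these units), while for the homopolymer it vanishes up to rounding error. Real coding regions show a much weaker but still detectable period-3 signal.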

  11. Self-Organizing Hidden Markov Model Map (SOHMMM): Biological Sequence Clustering and Cluster Visualization.

    PubMed

    Ferles, Christos; Beaufort, William-Scott; Ferle, Vanessa

    2017-01-01

    The present study devises mapping methodologies and projection techniques that visualize and demonstrate biological sequence data clustering results. The Sequence Data Density Display (SDDD) and Sequence Likelihood Projection (SLP) visualizations represent the input symbolic sequences in a lower-dimensional space in such a way that the clusters and relations of data elements are depicted graphically. Both operate in combination with the Self-Organizing Hidden Markov Model Map (SOHMMM). The resulting unified framework can analyze raw sequence data automatically and directly. This analysis is carried out with little, or even no, prior information or domain knowledge.

  12. Coevolutionary modeling of protein sequences: Predicting structure, function, and mutational landscapes

    NASA Astrophysics Data System (ADS)

    Weigt, Martin

    In recent years, biological research has been revolutionized by experimental high-throughput techniques, in particular by next-generation sequencing technology. Unprecedented amounts of data are accumulating, and there is a growing demand for computational methods unveiling the information hidden in raw data, thereby increasing our understanding of complex biological systems. Statistical-physics models based on the maximum-entropy principle have, in the last few years, played an important role in this context. To give a specific example, proteins and many non-coding RNAs show a remarkable degree of structural and functional conservation in the course of evolution, despite a large variability in amino acid sequences. We have developed a statistical-mechanics inspired inference approach - called Direct-Coupling Analysis - to link this sequence variability (easy to observe in sequence alignments, which are available in public sequence databases) to bio-molecular structure and function. In my presentation I will show how this methodology can be used (i) to infer contacts between residues and thus to guide tertiary and quaternary protein structure prediction and RNA structure prediction, (ii) to discriminate interacting from non-interacting protein families, and thus to infer conserved protein-protein interaction networks, and (iii) to reconstruct mutational landscapes and thus to predict the phenotypic effect of mutations. References [1] M. Figliuzzi, H. Jacquier, A. Schug, O. Tenaillon and M. Weigt, "Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1", Mol. Biol. Evol. (2015), doi: 10.1093/molbev/msv211 [2] E. De Leonardis, B. Lutz, S. Ratz, S. Cocco, R. Monasson, A. Schug, M. Weigt, "Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction", Nucleic Acids Research (2015), doi: 10.1093/nar/gkv932 [3] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. Marks, C. Sander, R. Zecchina, J.N. Onuchic, T. Hwa, M. Weigt, "Direct-coupling analysis of residue co-evolution captures native contacts across many protein families", Proc. Natl. Acad. Sci. 108, E1293-E1301 (2011).

  13. Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison.

    PubMed

    Dai, Qi; Yang, Yanchun; Wang, Tianming

    2008-10-15

    Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all the ideas for sequence comparison try to use the information on the k-word distributions, Markov model or both. Motivated by adding k-word distributions to the Markov model directly, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences and phylogenetic analysis. This offers a systematic and quantitative experimental assessment of our measures. Moreover, we compared our results with those of alignment-based and alignment-free methods. We grouped our experiments into two sets. The first one, performed via ROC (receiver operating characteristic) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences from a database and discriminate functionally related regulatory sequences from unrelated sequences. The second one aims at assessing how well our statistical measures perform in phylogenetic analysis. The experimental assessment demonstrates that our similarity measures, which incorporate k-word distributions into the Markov model, are more efficient.
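
    The abstract does not define the wre.k.r and S2.k.r measures, so they are not reproduced here. As a hedged illustration of the alignment-free setting they belong to, the sketch below compares two sequences through their k-word frequency profiles using a plain Euclidean distance. The distance, the function names, and the toy sequences are assumptions, and no Markov model is involved.

```python
from collections import Counter
from itertools import product
import math


def kword_freqs(seq, k, alphabet="ACGT"):
    """Relative frequency of every possible k-word, in a fixed order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return [counts["".join(word)] / total
            for word in product(alphabet, repeat=k)]


def kword_distance(seq_a, seq_b, k=2):
    """Euclidean distance between two k-word frequency profiles."""
    fa = kword_freqs(seq_a, k)
    fb = kword_freqs(seq_b, k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))


# Made-up toy sequences for illustration only.
reference = "ACGTACGT"
close = "ACGTACGA"    # differs from the reference in one base
distant = "AAAAAAAA"  # shares no 2-word with the reference

d_close = kword_distance(reference, close)
d_distant = kword_distance(reference, distant)
```

    No alignment is computed at any point: sequences of different lengths can be compared directly, which is what makes k-word measures attractive for database searches.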

  14. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    PubMed Central

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  15. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    PubMed

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  16. Content Analysis of Informed Consent for Whole Genome Sequencing Offered by Direct-to-Consumer Genetic Testing Companies.

    PubMed

    Niemiec, Emilia; Borry, Pascal; Pinxten, Wim; Howard, Heidi Carmen

    2016-12-01

    Whole exome sequencing (WES) and whole genome sequencing (WGS) have become increasingly available in the research and clinical settings and are now also being offered by direct-to-consumer (DTC) genetic testing (GT) companies. This offer can be perceived as amplifying the already identified concerns regarding adequacy of informed consent (IC) for both WES/WGS and the DTC GT context. We performed a qualitative content analysis of Websites of four companies offering WES/WGS DTC regarding the following elements of IC: pre-test counseling, benefits and risks, and incidental findings (IFs). The analysis revealed concerns, including the potential lack of pre-test counseling in three of the companies studied, missing relevant information in the risks and benefits sections, and potentially misleading information for consumers. Regarding IFs, only one company, which provides opportunistic screening, provides basic information about their management. In conclusion, some of the information (and related practices) present on the companies' Web pages salient to the consent process are not adequate in reference to recommendations for IC for WGS or WES in the clinical context. Requisite resources should be allocated to ensure that commercial companies are offering high-throughput sequencing under responsible conditions, including an adequate consent process. © 2016 WILEY PERIODICALS, INC.

  17. Sequence analysis of dolphin ferritin H and L subunits and possible iron-dependent translational control of dolphin ferritin gene

    PubMed Central

    Takaesu, Azusa; Watanabe, Kiyotaka; Takai, Shinji; Sasaki, Yukako; Orino, Koichi

    2008-01-01

    Background: The iron-storage protein ferritin plays a central role in iron metabolism. Ferritin has a dual function: it stores iron and sequesters it to protect against iron-catalyzed reactive oxygen species. Tissue ferritin is composed of two kinds of subunits (H: heavy chain or heart-type subunit; L: light chain or liver-type subunit). Ferritin gene expression is controlled at the translational level in an iron-dependent manner or at the transcriptional level in an iron-independent manner. However, sequence analysis of marine mammalian ferritin subunits has not yet been performed fully. The purpose of this study is to reveal cDNA-derived amino acid sequences of cetacean ferritin H and L subunits, and to demonstrate the possibility of iron-dependent expression of these subunits, especially the H subunit. Methods: Sequence analyses of cetacean ferritin H and L subunits were performed by direct sequencing of polymerase chain reaction (PCR) fragments from cDNAs generated via reverse transcription-PCR of leukocyte total RNA prepared from blood samples of six different dolphin species (Pseudorca crassidens, Lagenorhynchus obliquidens, Grampus griseus, Globicephala macrorhynchus, Tursiops truncatus, and Delphinapterus leucas). The putative iron-responsive element sequence in the 5'-untranslated region of the six different dolphin species was revealed by direct sequencing of PCR fragments obtained using leukocyte genomic DNA. Results: Dolphin H and L subunits consist of 182 and 174 amino acids, respectively, and amino acid sequence identities of ferritin subunits among these dolphins are highly conserved (H: 99–100%, (99→98) ; L: 98–100%). The conserved 28 bp IRE sequence was located -144 bp upstream from the initiation codon in the six different dolphin species. Conclusion: These results indicate that six different dolphin species have conserved ferritin sequences, and suggest that these genes are iron-dependently expressed. PMID:18954429

  18. Transcriptome analysis by strand-specific sequencing of complementary DNA

    PubMed Central

    Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey

    2009-01-01

    High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online. PMID:19620212

  19. Transcriptome analysis by strand-specific sequencing of complementary DNA.

    PubMed

    Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey

    2009-10-01

    High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online.

  20. Method for high resolution magnetic resonance analysis using magic angle technique

    DOEpatents

    Wind, Robert A.; Hu, Jian Zhi

    2003-11-25

    A method of performing a magnetic resonance analysis of a biological object that includes placing the biological object in a main magnetic field and in a radio frequency field, the main magnetic field having a static field direction; rotating the biological object at a rotational frequency of less than about 100 Hz around an axis positioned at an angle of about 54°44' relative to the main magnetic static field direction; pulsing the radio frequency to provide a sequence that includes a magic angle turning pulse segment; and collecting data generated by the pulsed radio frequency. According to another embodiment, the radio frequency is pulsed to provide a sequence capable of producing a spectrum that is substantially free of spinning sideband peaks.

  1. Validation of Skeletal Muscle cis-Regulatory Module Predictions Reveals Nucleotide Composition Bias in Functional Enhancers

    PubMed Central

    Kwon, Andrew T.; Chou, Alice Yi; Arenillas, David J.; Wasserman, Wyeth W.

    2011-01-01

    We performed a genome-wide scan for muscle-specific cis-regulatory modules (CRMs) using three computational prediction programs. Based on the predictions, 339 candidate CRMs were tested in cell culture with NIH3T3 fibroblasts and C2C12 myoblasts for capacity to direct selective reporter gene expression to differentiated C2C12 myotubes. A subset of 19 CRMs validated as functional in the assay. The rate of predictive success reveals striking limitations of computational regulatory sequence analysis methods for CRM discovery. Motif-based methods performed no better than predictions based only on sequence conservation. Analysis of the properties of the functional sequences relative to inactive sequences identifies nucleotide sequence composition as an important characteristic to incorporate in future methods for improved predictive specificity. Muscle-related TFBSs predicted within the functional sequences display greater sequence conservation than non-TFBS flanking regions. Comparison with recent MyoD and histone modification ChIP-Seq data supports the validity of the functional regions. PMID:22144875

  2. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering.

    PubMed

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M

    2015-05-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.
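
    The sliding-window layout and the count of variable sites per window mentioned in this abstract can be sketched as below. The abstract gives the window widths (1,000 and 2,000 bp) but not the step size, so the step, the function names, and the tiny toy alignment are assumptions. A site is counted as variable when the aligned sequences do not all share the same character in that column.

```python
def sliding_windows(length, width, step):
    """Yield (start, end) half-open windows lying fully inside [0, length)."""
    for start in range(0, length - width + 1, step):
        yield start, start + width


def count_variable_sites(alignment, start, end):
    """Number of alignment columns in [start, end) with more than one state."""
    return sum(1 for col in range(start, end)
               if len({seq[col] for seq in alignment}) > 1)


# Made-up toy alignment of three equal-length sequences.
alignment = [
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACGAACGTAA",
]

windows = list(sliding_windows(len(alignment[0]), width=4, step=2))
variable_per_window = [count_variable_sites(alignment, s, e)
                       for s, e in windows]
```

    With a width of 4 and a step of 2, the ten-column toy alignment yields four windows; the same functions apply unchanged to windows of 1,000 or 2,000 bp over a genome-length alignment once a step size is chosen.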

  3. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering

    PubMed Central

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor

    2015-01-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice. PMID:25560745

  4. Sequence-structure mapping errors in the PDB: OB-fold domains

    PubMed Central

    Venclovas, Česlovas; Ginalski, Krzysztof; Kang, Chulhee

    2004-01-01

    The Protein Data Bank (PDB) is the single most important repository of structural data for proteins and other biologically relevant molecules. Therefore, it is critically important to keep the PDB data, as much as possible, error-free. In this study, we have analyzed PDB crystal structures possessing oligonucleotide/oligosaccharide binding (OB)-fold, one of the highly populated folds, for the presence of sequence-structure mapping errors. Using energy-based structure quality assessment coupled with sequence analyses, we have found that there are at least five OB-structures in the PDB that have regions where sequences have been incorrectly mapped onto the structure. We have demonstrated that the combination of these computation techniques is effective not only in detecting sequence-structure mapping errors, but also in providing guidance to correct them. Namely, we have used results of computational analysis to direct a revision of X-ray data for one of the PDB entries containing a fairly inconspicuous sequence-structure mapping error. The revised structure has been deposited with the PDB. We suggest use of computational energy assessment and sequence analysis techniques to facilitate structure determination when homologs having known structure are available to use as a reference. Such computational analysis may be useful in either guiding the sequence-structure assignment process or verifying the sequence mapping within poorly defined regions. PMID:15133161

  5. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples

    PubMed Central

    Quick, Josh; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J

    2018-01-01

    Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples without isolation remains challenging for viruses such as Zika, where metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence complete genomes comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimised library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved starting with clinical samples in 1-2 days following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. PMID:28538739

  6. Multiplexed resequencing analysis to identify rare variants in pooled DNA with barcode indexing using next-generation sequencer.

    PubMed

    Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji

    2010-07-01

    We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.

  7. Mycobacterium tuberculosis and whole genome sequencing: a practical guide and online tools available for the clinical microbiologist.

    PubMed

    Satta, G; Atzeni, A; McHugh, T D

    2017-02-01

    Whole genome sequencing (WGS) has the potential to revolutionize the diagnosis of Mycobacterium tuberculosis infection but the lack of bioinformatic expertise among clinical microbiologists is a barrier for adoption. Software products for analysis should be simple, free of charge, able to accept data directly from the sequencer (FASTQ files) and to provide the basic functionalities all-in-one. The main aim of this narrative review is to provide a practical guide for the clinical microbiologist, with little or no practical experience of WGS analysis, with a specific focus on software products tailor-made for M. tuberculosis analysis. With sequencing performed by an external provider, it is now feasible to implement WGS analysis in the routine clinical practice of any microbiology laboratory, with the potential to detect resistance weeks before traditional phenotypic culture methods, but the clinical microbiologist should be aware of the limitations of this approach. Copyright © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.

  8. Direct detection of RNA in vitro and in situ by target-primed RCA: The impact of E. coli RNase III on the detection efficiency of RNA sequences distanced far from the 3'-end.

    PubMed

    Merkiene, Egle; Gaidamaviciute, Edita; Riauba, Laurynas; Janulaitis, Arvydas; Lagunavicius, Arunas

    2010-08-01

    We improved the target RNA-primed RCA technique for direct detection and analysis of RNA in vitro and in situ. Previously we showed that the 3' --> 5' single-stranded RNA exonucleolytic activity of Phi29 DNA polymerase converts the target RNA into a primer and uses it for RCA initiation. However, in some cases, the single-stranded RNA exoribonucleolytic activity of the polymerase is hindered by strong double-stranded structures at the 3'-end of target RNAs. We demonstrate that in such hampered cases, the double-stranded RNA-specific Escherichia coli RNase III efficiently assists Phi29 DNA polymerase in converting the target RNA into a primer. These observations extend the target RNA-primed RCA possibilities to test RNA sequences distanced far from the 3'-end and customize this technique for the inner RNA sequence analysis.

  9. The Role of the Y-Chromosome in the Establishment of Murine Hybrid Dysgenesis and in the Analysis of the Nucleotide Sequence Organization, Genetic Transmission and Evolution of Repeated Sequences.

    NASA Astrophysics Data System (ADS)

    Nallaseth, Ferez Soli

    The Y-chromosome presents a unique cytogenetic framework for the evolution of nucleotide sequences. Alignment of nine Y-chromosomal fragments in their increasing Y-specific/non Y-specific (male/female) sequence divergence ratios was directly and inversely related to their interspersion on these two respective genomic fractions. Sequence analysis confirmed a direct relationship between divergence ratios and the Alu, LINE-1, Satellite and their derivative oligonucleotide contents. Thus their relocation on the Y-chromosome is followed by sequence divergence rather than the well documented concerted evolution of these non-coding progenitor repeated sequences. Five of the nine Y-chromosomal fragments are non-pseudoautosomal and transcribed into heterogeneous PolyA+ RNA and thus can be retrotransposed. Evolutionary and computer analysis identified homologous oligonucleotide tracts in several human loci suggesting common and random mechanistic origins. Dysgenic genomes represent the accelerated evolution driving sequence divergence (McClintock, 1984). Sex reversal and sterility characterizing dysgenesis occurs in C57BL/6JY^Pos but not in 129/SvY^Pos derivative strains. High frequency, random, multi-locus deletion products of the feral Y^Pos-chromosome are generated in the germlines of F1(C57BL/6J X 129/SvY^Pos)(male) and C57BL/6JY^Pos(male) but not in 129/SvY^Pos(male). Equal, 10^-1, 10^-2, and 0 copies (relative to males) of Y^Pos-specific deletion products respectively characterize C57BL/6JY^Pos (HC), (LC), (T) and (F) females. The testes determining loci of inactive Y^Pos-chromosomes in C57BL/6JY^Pos HC females are the preferentially deleted/rearranged Y^Pos-sequences. Disruption of regulation of plasma testosterone and hepatic MUP-A mRNA levels, TRD of a 4.7 Kbp EcoR1 fragment suggest disruption of autosomal/X-chromosomal sequences. These data and the highly repeated progenitor (Alu, GATA, LINE-1) sequence content of deletion products confirmed the previously unidentified loss of genetic control of mammalian chromosome biology and hybrid dysgenesis.

  10. Forensic massively parallel sequencing data analysis tool: Implementation of MyFLq as a standalone web- and Illumina BaseSpace(®)-application.

    PubMed

    Van Neste, Christophe; Gansemans, Yannick; De Coninck, Dieter; Van Hoofstat, David; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

    2015-03-01

    Routine use of massively parallel sequencing (MPS) for forensic genomics is on the horizon. The last few years, several algorithms and workflows have been developed to analyze forensic MPS data. However, none have yet been tailored to the needs of the forensic analyst who does not possess an extensive bioinformatics background. We developed our previously published forensic MPS data analysis framework MyFLq (My-Forensic-Loci-queries) into an open-source, user-friendly, web-based application. It can be installed as a standalone web application, or run directly from the Illumina BaseSpace environment. In the former, laboratories can keep their data on-site, while in the latter, data from forensic samples that are sequenced on an Illumina sequencer can be uploaded to BaseSpace during acquisition, and can subsequently be analyzed using the published MyFLq BaseSpace application. Additional features were implemented such as an interactive graphical report of the results, an interactive threshold selection bar, and an allele length-based analysis in addition to the sequence-based analysis. Practical use of the application is demonstrated through the analysis of four 16-plex short tandem repeat (STR) samples, showing the complementarity between the sequence- and length-based analysis of the same MPS data. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.

  11. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    NASA Astrophysics Data System (ADS)

    Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.

    1997-05-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  12. TDRSS telecommunications system, PN code analysis

    NASA Technical Reports Server (NTRS)

    Dixon, R.; Gold, R.; Kaiser, F.

    1976-01-01

    The pseudo noise (PN) codes required to support the TDRSS telecommunications services are analyzed and the impact of alternate coding techniques on the user transponder equipment, the TDRSS equipment, and all factors that contribute to the acquisition and performance of these telecommunication services is assessed. Possible alternatives to the currently proposed hybrid FH/direct sequence acquisition procedures are considered and compared relative to acquisition time, implementation complexity, operational reliability, and cost. The hybrid FH/direct sequence technique is analyzed and rejected in favor of a recommended approach which minimizes acquisition time and user transponder complexity while maximizing probability of acquisition and overall link reliability.
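
    For readers unfamiliar with PN codes, the sketch below generates a short maximal-length sequence from a linear-feedback recurrence and checks the two-valued autocorrelation that makes such codes useful for direct-sequence acquisition. It is illustrative only: the length-7 code built from the recurrence a[n+3] = a[n+1] XOR a[n] is a textbook example chosen here for brevity, not one of the TDRSS codes, and the function names are hypothetical.

```python
def pn_sequence(taps, state, length):
    """Generate `length` bits from a[n+k] = XOR of a[n+t] for t in `taps`.

    `state` holds the first k bits; `taps` are offsets into the last k bits.
    """
    bits = list(state)
    k = len(state)
    while len(bits) < length:
        n = len(bits) - k
        new_bit = 0
        for t in taps:
            new_bit ^= bits[n + t]
        bits.append(new_bit)
    return bits[:length]


def periodic_autocorrelation(bits, shift):
    """Correlate one period of the code with a cyclic shift of itself.

    Bits are mapped to chips (0 -> +1, 1 -> -1) before multiplying.
    """
    chips = [1 - 2 * b for b in bits]
    n = len(chips)
    return sum(chips[i] * chips[(i + shift) % n] for i in range(n))


# One period (2**3 - 1 = 7 chips) of the example code.
code = pn_sequence(taps=(0, 1), state=(1, 0, 0), length=7)
```

    The correlation peaks at the full code length when the shift is zero and stays at -1 for every other shift, which is the property a receiver exploits when searching for code alignment during acquisition.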

  13. [Vertical variability of Pinus sylvestris var. mongolica tree ring delta13C and its relationship with tree ring width in northern Daxing' an Mountains of Northeast China].

    PubMed

    Shang, Zhi-Yuan; Wang, Jian; Zhang, Wen; Li, Yan-Yan; Cui, Ming-Xing; Chen, Zhen-Ju; Zhao, Xing-Yun

    2013-01-01

    A measurement was made on the vertical direction tree ring stable carbon isotope ratio (delta13C) and tree ring width of Pinus sylvestris var. mongolica in northern Daxing' an Mountains of Northeast China, with the relationship between the vertical direction variations of the tree ring delta13C and tree ring width analyzed. In the whole ring of xylem, earlywood (EW) and bark endodermis, the delta13C all exhibited an increasing trend from the top to the base at first, with the maximum at the bottom of tree crown, and then, decreased rapidly to the minimum downward. The EW and late-wood (LW) had an increasing ratio of average tree ring width from the base to the top. The average annual sequence of the delta13C in vertical direction had an obvious reverse correspondence with the average annual sequence of tree ring width, and had a trend comparatively in line with the average annual sequence of the tree ring width ratio of EW to LW above tree crown. The variance analysis showed that there existed significant differences in the sequences of tree ring delta13C and ring width in vertical direction, and the magnitude of vertical delta13C variability was basically the same as that of the inter-annual delta13C variability. The year-to-year variation trend of the vertical delta13C sequence was approximately identical. For each sample, the delta13C sequence at the same heights was negatively correlated with the ring width sequence, but the statistical significance differed with tree height.

  14. Characterization of expressed sequence tag-derived simple sequence repeat markers for Aspergillus flavus: emphasis on variability of isolates from the southern United States.

    PubMed

    Wang, Xinwang; Wadl, Phillip A; Wood-Jones, Alicia; Windham, Gary; Trigiano, Robert N; Scruggs, Mary; Pilgrim, Candace; Baird, Richard

    2012-12-01

    Simple sequence repeat (SSR) markers were developed from Aspergillus flavus expressed sequence tag (EST) database to conduct an analysis of genetic relationships of Aspergillus isolates from numerous host species and geographical regions, but primarily from the United States. Twenty-nine primers were designed from 362 tri-nucleotide EST-SSR sequences. Eighteen polymorphic loci were used to genotype 96 Aspergillus species isolates. The number of alleles detected per locus ranged from 2 to 24 with a mean of 8.2 alleles. Haploid diversity ranged from 0.28 to 0.91. Genetic distance matrix was used to perform principal coordinates analysis (PCA) and to generate dendrograms using unweighted pair group method with arithmetic mean (UPGMA). Two principal coordinates explained more than 75 % of the total variation among the isolates. One clade was identified for A. flavus isolates (n = 87) with the other Aspergillus species (n = 7) using PCA, but five distinct clusters were present when the other taxa were excluded from the analysis. Six groups were noted when the EST-SSR data were compared using UPGMA. However, the latter PCA or UPGMA comparison resulted in no direct associations with host species, geographical region or aflatoxin production. Furthermore, there was no direct correlation to visible morphological features such as sclerotial types. The isolates from Mississippi Delta region, which contained the largest percentage of isolates, did not show any unusual clustering except for isolates K32, K55, and 199. Further studies of these three isolates are warranted to evaluate their pathogenicity, aflatoxin production potential, additional gene sequences (e.g., RPB2), and morphological comparisons.

  15. Reduced expression of APC-1B but not APC-1A by the deletion of promoter 1B is responsible for familial adenomatous polyposis.

    PubMed

    Yamaguchi, Kiyoshi; Nagayama, Satoshi; Shimizu, Eigo; Komura, Mitsuhiro; Yamaguchi, Rui; Shibuya, Tetsuo; Arai, Masami; Hatakeyama, Seira; Ikenoue, Tsuneo; Ueno, Masashi; Miyano, Satoru; Imoto, Seiya; Furukawa, Yoichi

    2016-05-24

    Germline mutations in the tumor suppressor gene APC are associated with familial adenomatous polyposis (FAP). Here we applied whole-genome sequencing (WGS) to the DNA of a sporadic FAP patient in which we did not find any pathological APC mutations by direct sequencing. WGS identified a promoter deletion of approximately 10 kb encompassing promoter 1B and exon1B of APC. Additional allele-specific expression analysis by deep cDNA sequencing revealed that the deletion reduced the expression of the mutated APC allele to as low as 11.2% in the total APC transcripts, suggesting that the residual mutant transcripts were driven by other promoter(s). Furthermore, cap analysis of gene expression (CAGE) demonstrated that the deleted promoter 1B region is responsible for the great majority of APC transcription in many tissues except the brain. The deletion decreased the transcripts of APC-1B to 39-45% in the patient compared to the healthy controls, but it did not decrease those of APC-1A. Different deletions including promoter 1B have been reported in FAP patients. Taken together, our results strengthen the evidence that analysis of structural variations in promoter 1B should be considered for the FAP patients whose pathological mutations are not identified by conventional direct sequencing.

  16. Meta sequence analysis of human blood peptides and their parent proteins.

    PubMed

    Bowden, Peter; Pendrak, Voitek; Zhu, Peihong; Marshall, John G

    2010-04-18

    Sequence analysis of the blood peptides and their qualities will be key to understanding the mechanisms that contribute to error in LC-ESI-MS/MS. Analysis of peptides and their proteins at the level of sequences is much more direct and informative than the comparison of disparate accession numbers. A portable database of all blood peptide and protein sequences with descriptor fields and gene ontology terms might be useful for designing immunological or MRM assays from human blood. The results of twelve studies of human blood peptides and/or proteins identified by LC-MS/MS and correlated against a disparate array of genetic libraries were parsed and matched to proteins from the human ENSEMBL, SwissProt and RefSeq databases by SQL. The reported peptide and protein sequences were organized into an SQL database with full protein sequences and up to five unique peptides in order of prevalence along with the peptide count for each protein. Structured query language or BLAST was used to acquire descriptive information in current databases. Sampling error at the level of peptides is the largest source of disparity between groups. Chi Square analysis of peptide to protein distributions confirmed the significant agreement between groups on identified proteins. Copyright 2010. Published by Elsevier B.V.

  17. Molecular characterization of long direct repeat (LDR) sequences expressing a stable mRNA encoding for a 35-amino-acid cell-killing peptide and a cis-encoded small antisense RNA in Escherichia coli.

    PubMed

    Kawano, Mitsuoki; Oshima, Taku; Kasai, Hiroaki; Mori, Hirotada

    2002-07-01

    Genome sequence analyses of Escherichia coli K-12 revealed four copies of long repetitive elements. These sequences are designated as long direct repeat (LDR) sequences. Three of the repeats (LDR-A, -B, -C), each approximately 500 bp in length, are located as tandem repeats at 27.4 min on the genetic map. Another copy (LDR-D), 450 bp in length and nearly identical to LDR-A, -B and -C, is located at 79.7 min, a position that is directly opposite the position of LDR-A, -B and -C. In this study, we demonstrate that LDR-D encodes a 35-amino-acid peptide, LdrD, the overexpression of which causes rapid cell killing and nucleoid condensation of the host cell. Northern blot and primer extension analysis showed constitutive transcription of a stable mRNA (approximately 370 nucleotides) encoding LdrD and an unstable cis-encoded antisense RNA (approximately 60 nucleotides), which functions as a trans-acting regulator of ldrD translation. We propose that LDR encodes a toxin-antitoxin module. LDR-homologous sequences are not present on any known plasmids but are conserved in Salmonella and other enterobacterial species.

  18. Proteomics analysis of "Rovabiot Excel", a secreted protein cocktail from the filamentous fungus Penicillium funiculosum grown under industrial process fermentation.

    PubMed

    Guais, Olivier; Borderies, Gisèle; Pichereaux, Carole; Maestracci, Marc; Neugnot, Virginie; Rossignol, Michel; François, Jean Marie

    2008-12-01

    MS/MS techniques are well customized now for proteomic analysis, even for non-sequenced organisms, since peptide sequences obtained by these methods can be matched with those found in databases from closely related sequenced organisms. We used this approach to characterize the protein content of the "Rovabio Excel", an enzymatic cocktail produced by Penicillium funiculosum that is used as feed additive in animal nutrition. Protein separation by bi-dimensional electrophoresis yielded more than 100 spots, from which 37 proteins were unambiguously assigned from peptide sequences. By one-dimensional SDS-gel electrophoresis, 34 proteins were identified among which 8 were not found in the 2-DE analysis. A third method, termed 'peptidic shotgun', which consists of a direct treatment of the cocktail by trypsin followed by separation of the peptides on two-dimensional liquid chromatography, resulted in the identification of two additional proteins not found by the two other methods. Altogether, more than 50 proteins, among which several glycosylhydrolytic, hemicellulolytic and proteolytic enzymes, were identified by combining three separation methods in this enzymatic cocktail. This work confirmed the power of proteome analysis to explore the genome expression of a non-sequenced fungus by taking advantage of sequences from phylogenetically related filamentous fungi and paves the way for further functional analysis of P. funiculosum.

  19. Analysis of short tandem repeat polymorphisms using infrared fluorescence with M13 tailed primers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oetting, W.S.; Wiesner, G.; Laken, S.

    Short tandem repeat polymorphisms (STRPs) are becoming increasingly important as markers for linkage analysis due to their large numbers in the human genome and their high degree of polymorphism. Fluorescence based detection of the STRP pattern using the LI-COR model 4000S automated DNA sequencer eliminates the need for radioactivity and produces a digitized image that can be used for the analysis of the polymorphisms. In an effort to reduce the cost of STRP analysis, we have synthesized primers with a 19 bp extension complementary to the sequence of the M13 primer on the 5' end of one of the two primers used in the amplification of the STRP instead of using primers with direct conjugation of the infrared fluorescent dye. Up to 5 primer pairs can be multiplexed together with the M13 primer-dye conjugate as the sole primer conjugated to the fluorescent dye. Comparisons between primers that have been directly conjugated to the fluor with those having the M13 sequence extension show no difference in the ability to determine the STRP pattern. At present, the entire Weber 4A set of STRP markers is available with the M13 5' extension. We are currently using this technique for linkage analysis of familial breast cancer and asthma. The combination of STRP analysis using fluorescence detection will allow this technique to be fully automated for allele scoring and linkage analysis.

  20. Use of amplicon sequencing to improve sensitivity in PCR-based detection of microbial pathogen in environmental samples.

    PubMed

    Saingam, Prakit; Li, Bo; Yan, Tao

    2018-06-01

    DNA-based molecular detection of microbial pathogens in complex environments is still plagued by sensitivity, specificity and robustness issues. We propose to address these issues by viewing them as inadvertent consequences of requiring specific and adequate amplification (SAA) of target DNA molecules by current PCR methods. Using the invA gene of Salmonella as the model system, we investigated if next generation sequencing (NGS) can be used to directly detect target sequences in false-negative PCR reaction (PCR-NGS) in order to remove the SAA requirement from PCR. False-negative PCR and qPCR reactions were first created using serial dilutions of laboratory-prepared Salmonella genomic DNA and then analyzed directly by NGS. Target invA sequences were detected in all false-negative PCR and qPCR reactions, which lowered the method detection limits near the theoretical minimum of single gene copy detection. The capability of the PCR-NGS approach in correcting false negativity was further tested and confirmed under more environmentally relevant conditions using Salmonella-spiked stream water and sediment samples. Finally, the PCR-NGS approach was applied to ten urban stream water samples and detected invA sequences in eight samples that would be otherwise deemed Salmonella negative. Analysis of the non-target sequences in the false-negative reactions helped to identify primer dimer-like short sequences as the main cause of the false negativity. Together, the results demonstrated that the PCR-NGS approach can significantly improve method sensitivity, correct false-negative detections, and enable sequence-based analysis for failure diagnostics in complex environmental samples. Copyright © 2018 Elsevier B.V. All rights reserved.

  1. Phenotypic and genotypic analysis of Borrelia burgdorferi isolates from various sources.

    PubMed Central

    Adam, T; Gassmann, G S; Rasiah, C; Göbel, U B

    1991-01-01

    A total of 17 B. burgdorferi isolates from various sources were characterized by sodium dodecyl sulfate-polyacrylamide gel electrophoresis of whole-cell proteins, restriction enzyme analysis, Southern hybridization with probes complementary to unique regions of evolutionarily conserved genes (16S rRNA and fla), and direct sequencing of in vitro polymerase chain reaction-amplified fragments of the 16S rRNA gene. Three groups were distinguished on the basis of phenotypic and genotypic traits, the latter traced to the nucleotide sequence level. PMID:1649797

  2. A novel model for DNA sequence similarity analysis based on graph theory.

    PubMed

    Qi, Xingqin; Wu, Qin; Zhang, Yusen; Fuller, Eddie; Zhang, Cun-Quan

    2011-01-01

    Determination of sequence similarity is one of the major steps in computational phylogenetic studies. During evolutionary history, not only mutations of individual nucleotides but also subsequent rearrangements occurred. It has been one of the major tasks of computational biologists to develop novel mathematical descriptors for similarity analysis such that information on various mutation phenomena is captured simultaneously. In this paper, instead of the traditional bases for constructing mathematical descriptors (e.g., nucleotide frequency, geometric representations), we construct novel mathematical descriptors based on graph theory. In particular, for each DNA sequence, we set up a weighted directed graph. The adjacency matrix of the directed graph is used to induce a representative vector for the DNA sequence. This new approach measures similarity based on both ordering and frequency of nucleotides so that much more information is involved. As an application, the method is tested on a set of 0.9-kb mtDNA sequences of twelve different primate species. All output phylogenetic trees with various distance estimations have the same topology, and are generally consistent with the reported results from early studies, which proves the new method's efficiency; we also test the new method on a simulated data set, which shows that it performs better than the traditional global alignment method when subsequent rearrangements happen frequently during evolutionary history.
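
    The general idea of turning a DNA sequence into a weighted directed graph and then into a vector can be sketched as follows. The abstract does not give the authors' edge-weighting scheme, so this sketch substitutes a simple stand-in: nodes are the four nucleotides, and the weight of edge x -> y is the normalized count of positions where x is immediately followed by y. The names `transition_vector` and `euclidean_distance` are hypothetical.

```python
from itertools import product
from math import sqrt

NUCLEOTIDES = "ACGT"
# Fixed edge order: AA, AC, AG, AT, CA, ..., TT (row-major adjacency matrix).
EDGES = list(product(NUCLEOTIDES, repeat=2))


def transition_vector(seq):
    """Flatten the 4x4 weighted adjacency matrix of `seq` into 16 numbers.

    Assumed weighting: relative frequency of each adjacent nucleotide pair.
    Pairs containing characters other than A, C, G, T are skipped.
    """
    counts = {edge: 0 for edge in EDGES}
    seq = seq.upper()
    for a, b in zip(seq, seq[1:]):
        if (a, b) in counts:
            counts[(a, b)] += 1
    total = sum(counts.values())
    return [counts[edge] / total if total else 0.0 for edge in EDGES]


def euclidean_distance(u, v):
    """Distance between two representative vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

    Pairwise distances between such vectors give an alignment-free distance matrix that a tree-building method could consume; because the vector records which nucleotide follows which, it reflects ordering as well as composition.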

  3. Accurate and rapid modeling of iron-bleomycin-induced DNA damage using tethered duplex oligonucleotides and electrospray ionization ion trap mass spectrometric analysis.

    PubMed

    Harsch, A; Marzilli, L A; Bunt, R C; Stubbe, J; Vouros, P

    2000-05-01

    Bleomycin B(2) (BLM) in the presence of iron [Fe(II)] and O(2) catalyzes single-stranded (ss) and double-stranded (ds) cleavage of DNA. Electrospray ionization ion trap mass spectrometry was used to monitor these cleavage processes. Two duplex oligonucleotides containing an ethylene oxide tether between both strands were used in this investigation, allowing facile monitoring of all ss and ds cleavage events. A sequence for site-specific binding and cleavage by Fe-BLM was incorporated into each analyte. One of these core sequences, GTAC, is a known hot-spot for ds cleavage, while the other sequence, GGCC, is a hot-spot for ss cleavage. Incubation of each oligonucleotide under anaerobic conditions with Fe(II)-BLM allowed detection of the non-covalent ternary Fe-BLM/oligonucleotide complex in the gas phase. Cleavage studies were then performed utilizing O(2)-activated Fe(II)-BLM. No work-up or separation steps were required and direct MS and MS/MS analyses of the crude reaction mixtures confirmed sequence-specific Fe-BLM-induced cleavage. Comparison of the cleavage patterns for both oligonucleotides revealed sequence-dependent preferences for ss and ds cleavages in accordance with previously established gel electrophoresis analysis of hairpin oligonucleotides. This novel methodology allowed direct, rapid and accurate determination of cleavage profiles of model duplex oligonucleotides after exposure to activated Fe-BLM.

  4. A sequential analysis of classroom discourse in Italian primary schools: the many faces of the IRF pattern.

    PubMed

    Molinari, Luisa; Mameli, Consuelo; Gnisci, Augusto

    2013-09-01

    A sequential analysis of classroom discourse is needed to investigate the conditions under which the triadic initiation-response-feedback (IRF) pattern may host different teaching orientations. The purpose of the study is twofold: first, to describe the characteristics of classroom discourse and, second, to identify and explore the different interactive sequences that can be captured with a sequential statistical analysis. Twelve whole-class activities were video recorded in three Italian primary schools. We observed classroom interaction as it occurs naturally on an everyday basis. In total, we collected 587 min of video recordings. Subsequently, 828 triadic IRF patterns were extracted from this material and analysed with the programme Generalized Sequential Query (GSEQ). The results indicate that classroom discourse may unfold in different ways. In particular, we identified and described four types of sequences. Dialogic sequences were triggered by authentic questions, and continued through further relaunches. Monologic sequences were directed to fulfil the teachers' pre-determined didactic purposes. Co-constructive sequences fostered deduction, reasoning, and thinking. Scaffolding sequences helped and sustained children with difficulties. The application of sequential analyses allowed us to show that interactive sequences may account for a variety of meanings, thus making a significant contribution to the literature and research practice in classroom discourse. © 2012 The British Psychological Society.

  5. Endophytic bacterial diversity in grapevine (Vitis vinifera L.) leaves described by 16S rRNA gene sequence analysis and length heterogeneity-PCR.

    PubMed

    Bulgari, Daniela; Casati, Paola; Brusetti, Lorenzo; Quaglino, Fabio; Brasca, Milena; Daffonchio, Daniele; Bianco, Piero Attilio

    2009-08-01

    Diversity of bacterial endophytes associated with grapevine leaf tissues was analyzed by cultivation and cultivation-independent methods. In order to identify bacterial endophytes directly from the metagenome, a protocol for bacteria enrichment and DNA extraction was optimized. Sequence analysis of 16S rRNA gene libraries underscored five diverse Operational Taxonomic Units (OTUs), showing best sequence matches with gamma-Proteobacteria, family Enterobacteriaceae, with a dominance of the genus Pantoea. Bacteria isolation through cultivation revealed the presence of six OTUs, showing best sequence matches with Actinobacteria, genus Curtobacterium, and with Firmicutes genera Bacillus and Enterococcus. Length Heterogeneity-PCR (LH-PCR) electrophoretic peaks from single bacterial clones were used to set up a database representing the bacterial endophytes identified in association with grapevine tissues. Analysis of healthy and phytoplasma-infected grapevine plants showed that LH-PCR could be a useful complementary tool for examining the diversity of bacterial endophytes, especially for diversity surveys on a large number of samples.

  6. Dissecting biological “dark matter” with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth

    PubMed Central

    Marcy, Yann; Ouverney, Cleber; Bik, Elisabeth M.; Lösekann, Tina; Ivanova, Natalia; Martin, Hector Garcia; Szeto, Ernest; Platt, Darren; Hugenholtz, Philip; Relman, David A.; Quake, Stephen R.

    2007-01-01

    We have developed a microfluidic device that allows the isolation and genome amplification of individual microbial cells, thereby enabling organism-level genomic analysis of complex microbial ecosystems without the need for culture. This device was used to perform a directed survey of the human subgingival crevice and to isolate bacteria having rod-like morphology. Several isolated microbes had a 16S rRNA sequence that placed them in candidate phylum TM7, which has no cultivated or sequenced members. Genome amplification from individual TM7 cells allowed us to sequence and assemble >1,000 genes, providing insight into the physiology of members of this phylum. This approach enables single-cell genetic analysis of any uncultivated minority member of a microbial community. PMID:17620602

  7. Identification of Human Lineage-Specific Transcriptional Coregulators Enabled by a Glossary of Binding Modules and Tunable Genomic Backgrounds.

    PubMed

    Mariani, Luca; Weinand, Kathryn; Vedenko, Anastasia; Barrera, Luis A; Bulyk, Martha L

    2017-09-27

    Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggests the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE in the analysis of the indirect binding motifs for the co-occurrence of tethering factors, suggesting novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Direct identification of non-polio enteroviruses in residual paralysis cases by analysis of VP1 sequences.

    PubMed

    Rahimi, Pooneh; Tabatabaie, H; Gouya, Mohammad M; Mahmudi, M; Musavi, T; Rad, K Samimi; Azad, T Mokhtari; Nategh, R

    2009-06-01

    The 66 serotypes of human enteroviruses (EVs) are classified into four species A-D, based on phylogenetic relationships in multiple genome regions. Partial VP(1) amplification and sequence analysis are reliable methods for identifying non-polio enterovirus serotypes, especially in negative cell culture specimens from patients with residual paralysis. In Iran during the years 2000-2002, there were 29 residual paralysis cases with negative cell (RD, HEp(2) and L(20)B) culture results. The genomic RNA was extracted from stool specimens from cases of residual paralysis and detected by amplification of the 5'-nontranslated region using RT-PCR with Pan-EV primers. Partial VP(1) amplification by semi-nested RT-PCR (snRT-PCR) and sequence analysis were done. Specimens from the 29 culture-negative cases contained echoviruses of six different serotypes. The global eradication of wild polioviruses is near and study of non-polio enteroviruses, which can cause poliomyelitis, is increasingly important to understand their pathogenesis. The VP(1) sequences, derived from the snRT-PCR products, allowed rapid molecular analysis of these non-polio strains.

  9. The scattering of electromagnetic pulses by a slit in a conducting screen

    NASA Technical Reports Server (NTRS)

    Ackerknecht, W. E., III; Chen, C.-L.

    1975-01-01

    A direct method for calculating the impulse response of a slit in a conducting screen is presented which is derived specifically for the analysis of transient scattering by two-dimensional objects illuminated by a plane incident wave. The impulse response is obtained by assuming that the total response is composed of two sequences of diffracted waves. The solution is determined for the first two waves in one sequence by using Green's functions and the equivalence principle, for additional waves in the sequence by iteration, and for the other sequence by a transformation of coordinates. The cases of E-polarization and H-polarization are considered.

  10. Construction of high-quality recombination maps with low-coverage genomic sequencing for joint linkage analysis in maize

    USDA-ARS's Scientific Manuscript database

    A genome-wide association study (GWAS) is the foremost strategy used for finding genes that control human diseases and agriculturally important traits, but it often reports false positives. In contrast, its complementary method, linkage analysis, provides direct genetic confirmation, but with limite...

  11. Structurally detailed coarse-grained model for Sec-facilitated co-translational protein translocation and membrane integration

    PubMed Central

    Miller, Thomas F.

    2017-01-01

    We present a coarse-grained simulation model that is capable of simulating the minute-timescale dynamics of protein translocation and membrane integration via the Sec translocon, while retaining sufficient chemical and structural detail to capture many of the sequence-specific interactions that drive these processes. The model includes accurate geometric representations of the ribosome and Sec translocon, obtained directly from experimental structures, and interactions parameterized from nearly 200 μs of residue-based coarse-grained molecular dynamics simulations. A protocol for mapping amino-acid sequences to coarse-grained beads enables the direct simulation of trajectories for the co-translational insertion of arbitrary polypeptide sequences into the Sec translocon. The model reproduces experimentally observed features of membrane protein integration, including the efficiency with which polypeptide domains integrate into the membrane, the variation in integration efficiency upon single amino-acid mutations, and the orientation of transmembrane domains. The central advantage of the model is that it connects sequence-level protein features to biological observables and timescales, enabling direct simulation for the mechanistic analysis of co-translational integration and for the engineering of membrane proteins with enhanced membrane integration efficiency. PMID:28328943

  12. Hadoop-BAM: directly manipulating next generation sequencing data in the cloud.

    PubMed

    Niemenmaa, Matti; Kallio, Aleksi; Schumacher, André; Klemelä, Petri; Korpelainen, Eija; Heljanko, Keijo

    2012-03-15

    Hadoop-BAM is a novel library for the scalable manipulation of aligned next-generation sequencing data in the Hadoop distributed computing framework. It acts as an integration layer between analysis applications and BAM files that are processed using Hadoop. Hadoop-BAM solves the issues related to BAM data access by presenting a convenient API for implementing map and reduce functions that can directly operate on BAM records. It builds on top of the Picard SAM JDK, so tools that rely on the Picard API are expected to be easily convertible to support large-scale distributed processing. In this article we demonstrate the use of Hadoop-BAM by building a coverage summarizing tool for the Chipster genome browser. Our results show that Hadoop offers good scalability, and one should avoid moving data in and out of Hadoop between analysis steps.
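
    The map and reduce functions mentioned above can be illustrated with a small coverage summary in plain Python. This is only a sketch of the map/reduce idea and does not use the Hadoop-BAM or Picard APIs: the `(chrom, start, length)` tuples are a hypothetical stand-in for BAM records, and `BIN_SIZE` is an assumed bin width.

```python
from collections import defaultdict

BIN_SIZE = 100  # assumed bin width in bases (hypothetical, not from the paper)


def map_read(read):
    """Map step: emit ((chrom, bin_index), covered_bases) for each bin a read overlaps."""
    chrom, start, length = read  # 0-based start; stand-in for an aligned BAM record
    end = start + length
    pos = start
    while pos < end:
        bin_index = pos // BIN_SIZE
        bin_end = (bin_index + 1) * BIN_SIZE
        covered = min(end, bin_end) - pos
        yield (chrom, bin_index), covered
        pos += covered


def reduce_bins(pairs):
    """Reduce step: sum covered bases per key, then convert to mean depth per bin."""
    totals = defaultdict(int)
    for key, covered in pairs:
        totals[key] += covered
    return {key: total / BIN_SIZE for key, total in totals.items()}


def summarize_coverage(reads):
    """Run the map step over all reads and feed the emitted pairs to the reduce step."""
    pairs = (pair for read in reads for pair in map_read(read))
    return reduce_bins(pairs)
```

    In a distributed setting the pairs emitted by `map_read` would be grouped by key across nodes before the reduce step; here both steps run in a single process for clarity.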

  13. Microsatellite analysis in the genome of Acanthaceae: An in silico approach.

    PubMed

    Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

    2015-01-01

    Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites existing in the complete genome sequences would help to attain a direct role in the genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites and its inbuilt Primer3 module was used for primer design. In total, 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and dinucleotide repeats were found to be abundant in the genome sequences. The essential amino acid isoleucine was found to be abundant in all the sequences. We also designed SSR-based primers/markers for 59 sequences of this family that contain microsatellite repeats in their genome. The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to the Acanthaceae family in the future.
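
    Screening a nucleotide sequence for SSRs, as described above, can be sketched with regular expressions. This is not the SSR Locator algorithm; the motif lengths and the minimum repeat counts in `MIN_REPEATS` are hypothetical thresholds chosen only for illustration.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 5, 3: 4}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for tandem repeats found."""
    sequence = sequence.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):  # skip homopolymer runs reported as longer motifs
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)
```

    For example, `find_ssrs("GG" + "AT" * 6 + "CC" + "GAA" * 4 + "T")` reports one dinucleotide repeat and one trinucleotide repeat with 0-based start positions.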

  14. Fast and Accurate Multivariate Gaussian Modeling of Protein Families: Predicting Residue Contacts and Protein-Interaction Partners

    PubMed Central

    Feinauer, Christoph; Procaccini, Andrea; Zecchina, Riccardo; Weigt, Martin; Pagnani, Andrea

    2014-01-01

    In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to the one achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partner in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code. PMID:24663061

  15. Identification of succinimide sites in proteins by N-terminal sequence analysis after alkaline hydroxylamine cleavage.

    PubMed Central

    Kwong, M. Y.; Harris, R. J.

    1994-01-01

    Under favorable conditions, Asp or Asn residues can undergo rearrangement to a succinimide (cyclic imide), which may also serve as an intermediate for deamidation and/or isoaspartate formation. Direct identification of such succinimides by peptide mapping is hampered by their lability at neutral and alkaline pH. We determined that incubation in 2 M hydroxylamine, 0.2 M Tris buffer, pH 9, for 2 h at 45 degrees C will specifically cleave on the C-terminal side of succinimides without cleavage at Asn-Gly bonds; yields are typically approximately 50%. N-terminal sequence analysis can then be used to identify an internal sequence generated by cleavage of the succinimide, hence identifying the succinimide site. PMID:8142891

  16. Whole-genome CNV analysis: advances in computational approaches.

    PubMed

    Pirooznia, Mehdi; Goes, Fernando S; Zandi, Peter P

    2015-01-01

    Accumulating evidence indicates that DNA copy number variation (CNV) is likely to make a significant contribution to human diversity and also play an important role in disease susceptibility. Recent advances in genome sequencing technologies have enabled the characterization of a variety of genomic features, including CNVs. This has led to the development of several bioinformatics approaches to detect CNVs from next-generation sequencing data. Here, we review recent advances in CNV detection from whole genome sequencing. We discuss the informatics approaches and current computational tools that have been developed as well as their strengths and limitations. This review will assist researchers and analysts in choosing the most suitable tools for CNV analysis as well as provide suggestions for new directions in future development.

  17. Can We Improve Structured Sequence Processing? Exploring the Direct and Indirect Effects of Computerized Training Using a Mediational Model

    PubMed Central

    Smith, Gretchen N. L.; Conway, Christopher M.; Bauernschmidt, Althea; Pisoni, David B.

    2015-01-01

    Recent research suggests that language acquisition may rely on domain-general learning abilities, such as structured sequence processing, which is the ability to extract, encode, and represent structured patterns in a temporal sequence. If structured sequence processing supports language, then it may be possible to improve language function by enhancing this foundational learning ability. The goal of the present study was to use a novel computerized training task as a means to better understand the relationship between structured sequence processing and language function. Participants first were assessed on pre-training tasks to provide baseline behavioral measures of structured sequence processing and language abilities. Participants were then quasi-randomly assigned to either a treatment group involving adaptive structured visuospatial sequence training, a treatment group involving adaptive non-structured visuospatial sequence training, or a control group. Following four days of sequence training, all participants were assessed with the same pre-training measures. Overall comparison of the post-training means revealed no group differences. However, in order to examine the potential relations between sequence training, structured sequence processing, and language ability, we used a mediation analysis that showed two competing effects. In the indirect effect, adaptive sequence training with structural regularities had a positive impact on structured sequence processing performance, which in turn had a positive impact on language processing. This finding not only identifies a potential novel intervention to treat language impairments but also may be the first demonstration that structured sequence processing can be improved and that this, in turn, has an impact on language processing. However, in the direct effect, adaptive sequence training with structural regularities had a direct negative impact on language processing. 
    This unexpected finding suggests that adaptive training with structural regularities might potentially interfere with language processing. Taken together, these findings underscore the importance of pursuing designs that promote a better understanding of the mechanisms underlying training-related changes, so that regimens can be developed that help reduce these types of negative effects while simultaneously maximizing the benefits to outcome measures of interest. PMID:25946222

  18. Can we improve structured sequence processing? Exploring the direct and indirect effects of computerized training using a mediational model.

    PubMed

    Smith, Gretchen N L; Conway, Christopher M; Bauernschmidt, Althea; Pisoni, David B

    2015-01-01

    Recent research suggests that language acquisition may rely on domain-general learning abilities, such as structured sequence processing, which is the ability to extract, encode, and represent structured patterns in a temporal sequence. If structured sequence processing supports language, then it may be possible to improve language function by enhancing this foundational learning ability. The goal of the present study was to use a novel computerized training task as a means to better understand the relationship between structured sequence processing and language function. Participants first were assessed on pre-training tasks to provide baseline behavioral measures of structured sequence processing and language abilities. Participants were then quasi-randomly assigned to either a treatment group involving adaptive structured visuospatial sequence training, a treatment group involving adaptive non-structured visuospatial sequence training, or a control group. Following four days of sequence training, all participants were assessed with the same pre-training measures. Overall comparison of the post-training means revealed no group differences. However, in order to examine the potential relations between sequence training, structured sequence processing, and language ability, we used a mediation analysis that showed two competing effects. In the indirect effect, adaptive sequence training with structural regularities had a positive impact on structured sequence processing performance, which in turn had a positive impact on language processing. This finding not only identifies a potential novel intervention to treat language impairments but also may be the first demonstration that structured sequence processing can be improved and that this, in turn, has an impact on language processing. However, in the direct effect, adaptive sequence training with structural regularities had a direct negative impact on language processing. 
    This unexpected finding suggests that adaptive training with structural regularities might potentially interfere with language processing. Taken together, these findings underscore the importance of pursuing designs that promote a better understanding of the mechanisms underlying training-related changes, so that regimens can be developed that help reduce these types of negative effects while simultaneously maximizing the benefits to outcome measures of interest.

  19. Arkas: Rapid reproducible RNAseq analysis

    PubMed Central

    Colombo, Anthony R.; J. Triche Jr, Timothy; Ramsingh, Giridharan

    2017-01-01

    The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. We offer the cloud-scale RNAseq pipelines Arkas-Quantification and Arkas-Analysis, available within Illumina's BaseSpace cloud application platform, which expedite Kallisto preparatory routines, reliably calculate differential expression, and perform gene-set enrichment of REACTOME pathways. Due to inherent inefficiencies of scale, Illumina's BaseSpace computing platform offers a massively parallel distributive environment improving data management services and data importing. Arkas-Quantification deploys Kallisto for parallel cloud computations and is conveniently integrated downstream from the BaseSpace Sequence Read Archive (SRA) import/conversion application titled SRA Import. Arkas-Analysis annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, and calculates differential expression and gene-set enrichment analysis on both coding genes and transcripts. The Arkas cloud pipeline supports ENSEMBL transcriptomes and can be used downstream from SRA Import, facilitating raw sequencing importing, SRA FASTQ conversion, RNA quantification and analysis steps. PMID:28868134

  20. A DS-UWB Cognitive Radio System Based on Bridge Function Smart Codes

    NASA Astrophysics Data System (ADS)

    Xu, Yafei; Hong, Sheng; Zhao, Guodong; Zhang, Fengyuan; di, Jinshan; Zhang, Qishan

    This paper proposes a direct-sequence UWB cognitive radio system based on a bridge function smart sequence matrix and the Gaussian pulse. Because the system uses the bridge function smart code sequence as its spreading code, the zero correlation zones (ZCZs) of the bridge function sequences' auto-correlation functions can reduce the multipath interference of the pulse. The modulated signal was sent into the IEEE 802.15.3a UWB channel. We analyze how the ZCZs suppress multipath interference (MPI), one of the main sources of interference in the system. The simulation in SIMULINK/MATLAB is described in detail. The results show that the system performs better than one employing a Walsh sequence square matrix, and this was verified in principle by formula.
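
    The role of the spreading code's correlation properties can be illustrated with the Walsh sequence matrix that the abstract uses as its baseline. The sketch below is in plain Python rather than SIMULINK/MATLAB and does not implement bridge function codes; the function names are hypothetical. It builds Walsh (Hadamard) rows, spreads and despreads bits, and computes the periodic correlation, whose out-of-phase values are what a zero correlation zone is meant to suppress.

```python
def hadamard(order):
    """Sylvester construction of a +/-1 Hadamard matrix.

    `order` should be a power of two; other values round up to the next one.
    """
    matrix = [[1]]
    while len(matrix) < order:
        matrix = ([row + row for row in matrix]
                  + [row + [-x for x in row] for row in matrix])
    return matrix


def periodic_correlation(a, b, shift):
    """Periodic correlation of two equal-length +/-1 sequences at a given shift."""
    n = len(a)
    return sum(a[i] * b[(i + shift) % n] for i in range(n))


def spread(bits, code):
    """Map bits {0, 1} to symbols {+1, -1} and multiply each by the chip sequence."""
    return [(1 - 2 * bit) * chip for bit in bits for chip in code]


def despread(chips, code):
    """Correlate each chip block with the code and decide each bit by the sign."""
    n = len(code)
    blocks = [chips[i:i + n] for i in range(0, len(chips), n)]
    return [0 if sum(c * k for c, k in zip(block, code)) > 0 else 1
            for block in blocks]
```

    Rows of the matrix are mutually orthogonal at zero shift, but a row such as the alternating one has a large out-of-phase autocorrelation, so delayed multipath copies are not rejected by the code alone.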

  1. Stresses in Implant-Supported Fixed Complete Dentures with Different Screw-Tightening Sequences and Torque Application Modes.

    PubMed

    Barcellos, Leonardo H; Palmeiro, Marina Lobato; Naconecy, Marcos M; Geremia, Tomás; Cervieri, André; Shinkai, Rosemary S

    2018-05-17

    To compare the effects of different screw-tightening sequences and torque applications on stresses in implant-supported fixed complete dentures supported by five abutments. Strain gauges fixed to the abutments were used to test the sequences 2-4-3-1-5; 1-2-3-4-5; 3-2-4-1-5; and 2-5-4-1-3 with direct 10-Ncm torque or progressive torque (5 + 10 Ncm). Data were analyzed using analysis of variance and standardized effect size. No effects of tightening sequence or torque application were found except for the sequence 3-2-4-1-5 and some small to moderate effect sizes. Screw-tightening sequences and torque application modes have only a marginal effect on residual stresses.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jarocki, John Charles; Zage, David John; Fisher, Andrew N.

    LinkShop is a software tool for applying the method of Linkography to the analysis of time-sequence data. LinkShop provides command line, web, and application programming interfaces (API) for input and processing of time-sequence data, abstraction models, and ontologies. The software creates graph representations of the abstraction model, ontology, and derived linkograph. Finally, the tool allows the user to perform statistical measurements of the linkograph and refine the ontology through direct manipulation of the linkograph.

  3. Genomic deletions of OFD1 account for 23% of oral-facial-digital type 1 syndrome after negative DNA sequencing.

    PubMed

    Thauvin-Robinet, Christel; Franco, Brunella; Saugier-Veber, Pascale; Aral, Bernard; Gigot, Nadège; Donzel, Anne; Van Maldergem, Lionel; Bieth, Eric; Layet, Valérie; Mathieu, Michèle; Teebi, Ahmad; Lespinasse, James; Callier, Patrick; Mugneret, Francine; Masurel-Paulet, Alice; Gautier, Elodie; Huet, Frédéric; Teyssier, Jean-Raymond; Tosi, Mario; Frébourg, Thierry; Faivre, Laurence

    2009-02-01

    Oral-facial-digital type I syndrome (OFDI) is characterised by an X-linked dominant mode of inheritance with lethality in males. Clinical features include facial dysmorphism with oral, dental and distal abnormalities, polycystic kidney disease and central nervous system malformations. Considerable allelic heterogeneity has been reported within the OFD1 gene, but DNA bi-directional sequencing of the exons and intron-exon boundaries of the OFD1 gene remains negative in more than 20% of cases. We hypothesized that genomic rearrangements could account for the majority of the remaining undiagnosed cases. Thus, we took advantage of two independent available series of patients with OFDI syndrome and negative DNA bi-directional sequencing of the exons and intron-exon boundaries of the OFD1 gene from two different European labs: 13/36 cases from the French lab; 13/95 from the Italian lab. All patients were screened by a semiquantitative fluorescent multiplex method (QFMPSF) and relative quantification by real-time PCR (qPCR). Six OFD1 genomic deletions (exon 5, exons 1-8, exons 1-14, exons 10-11, exons 13-23 and exon 17) were identified, accounting for 5% of OFDI patients and for 23% of patients with negative mutation screening by DNA sequencing. The association of DNA direct sequencing, QFMPSF and qPCR detects OFD1 alteration in up to 85% of patients with a phenotype suggestive of OFDI syndrome. Given the average percentage of large genomic rearrangements (5%), we suggest that dosage methods should be performed in addition to DNA direct sequencing analysis to exclude the involvement of the OFD1 transcript when there are genetic counselling issues. (c) 2008 Wiley-Liss, Inc.
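
    The two percentages quoted above can be checked with a short calculation. It assumes that "13/36" and "13/95" mean 13 sequencing-negative cases out of 36 and 95 patients respectively, and that the two series are pooled; the abstract does not state the denominators explicitly, so this is a reading of the figures rather than the authors' own computation.

```python
def deletion_shares(deletions, negative_counts, series_sizes):
    """Return (% of sequencing-negative cases, % of all patients) explained by deletions."""
    negative = sum(negative_counts)   # pooled sequencing-negative cases
    total = sum(series_sizes)         # pooled patients across both series
    return 100 * deletions / negative, 100 * deletions / total


# 6 deletions; 13 negative of 36 (French lab) and 13 negative of 95 (Italian lab).
share_of_negative, share_of_all = deletion_shares(6, [13, 13], [36, 95])
```

    Under these assumptions the shares come to about 23.1% of the 26 sequencing-negative cases and about 4.6% of the 131 patients, in line with the reported 23% and 5%.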

  4. How should Fitts' Law be applied to human-computer interaction?

    NASA Technical Reports Server (NTRS)

    Gillan, D. J.; Holden, K.; Adam, S.; Rudisill, M.; Magee, L.

    1992-01-01

    The paper challenges the notion that any Fitts' Law model can be applied generally to human-computer interaction, and proposes instead that applying Fitts' Law requires knowledge of the users' sequence of movements, direction of movement, and typical movement amplitudes as well as target sizes. Two experiments examined a text selection task with sequences of controlled movements (point-click and point-drag). For the point-click sequence, a Fitts' Law model that used the diagonal across the text object in the direction of pointing (rather than the horizontal extent of the text object) as the target size provided the best fit for the pointing time data, whereas for the point-drag sequence, a Fitts' Law model that used the vertical size of the text object as the target size gave the best fit. Dragging times were fitted well by Fitts' Law models that used either the vertical or horizontal size of the terminal character in the text object. Additional results of note were that pointing in the point-click sequence was consistently faster than in the point-drag sequence, and that pointing in either sequence was consistently faster than dragging. The discussion centres around the need to define task characteristics before applying Fitts' Law to an interface design or analysis, analyses of pointing and of dragging, and implications for interface design.
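
    The effect of the target-size definition on a Fitts' Law prediction can be sketched as follows. The code uses Fitts' original index of difficulty, ID = log2(2A/W), which may differ from the formulation used in the paper; the intercept `a` and slope `b` are hypothetical regression coefficients, and the "diagonal" is taken to be the full diagonal of the text object's bounding box, which is an assumption.

```python
import math


def index_of_difficulty(amplitude, target_size):
    """Fitts' original formulation: ID = log2(2A / W), in bits."""
    return math.log2(2 * amplitude / target_size)


def movement_time(amplitude, target_size, a, b):
    """Predicted movement time MT = a + b * ID for regression coefficients a and b."""
    return a + b * index_of_difficulty(amplitude, target_size)


def candidate_target_sizes(width, height):
    """Three ways to define W for a rectangular text object."""
    return {
        "horizontal": float(width),
        "vertical": float(height),
        "diagonal": math.hypot(width, height),  # assumed: bounding-box diagonal
    }


def predictions(amplitude, width, height, a, b):
    """Movement-time prediction under each candidate definition of target size."""
    sizes = candidate_target_sizes(width, height)
    return {name: movement_time(amplitude, w, a, b) for name, w in sizes.items()}
```

    Because a wide, short text object has very different horizontal, vertical, and diagonal extents, the three definitions give noticeably different predicted times for the same movement, which is why the choice has to be matched to the task.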

  5. Oligonucleotide gap-fill ligation for mutation detection and sequencing in situ

    PubMed Central

    Mignardi, Marco; Mezger, Anja; Qian, Xiaoyan; La Fleur, Linnea; Botling, Johan; Larsson, Chatarina; Nilsson, Mats

    2015-01-01

    In clinical diagnostics a great need exists for targeted in situ multiplex nucleic acid analysis as the mutational status can offer guidance for effective treatment. One well-established method uses padlock probes for mutation detection and multiplex expression analysis directly in cells and tissues. Here, we use oligonucleotide gap-fill ligation to further increase specificity and to capture molecular substrates for in situ sequencing. Short oligonucleotides are joined at both ends of a padlock gap probe by two ligation events and are then locally amplified by target-primed rolling circle amplification (RCA) preserving spatial information. We demonstrate the specific detection of the A3243G mutation of mitochondrial DNA and we successfully characterize a single nucleotide variant in the ACTB mRNA in cells by in situ sequencing of RCA products generated by padlock gap-fill ligation. To demonstrate the clinical applicability of our assay, we show specific detection of a point mutation in the EGFR gene in fresh frozen and formalin-fixed, paraffin-embedded (FFPE) lung cancer samples and confirm the detected mutation by in situ sequencing. This approach presents several advantages over conventional padlock probes allowing simpler assay design for multiplexed mutation detection to screen for the presence of mutations in clinically relevant mutational hotspots directly in situ. PMID:26240388

  6. RNA-Seq analysis to capture the transcriptome landscape of a single cell

    PubMed Central

    Tang, Fuchou; Barbacioru, Catalin; Nordman, Ellen; Xu, Nanlan; Bashkirov, Vladimir I; Lao, Kaiqin; Surani, M. Azim

    2013-01-01

    We describe here a protocol for digital transcriptome analysis in a single mouse blastomere using a deep sequencing approach. An individual blastomere was first isolated and put into lysate buffer by mouth pipette. Reverse transcription was then performed directly on the whole cell lysate. After this, the free primers were removed by Exonuclease I and a poly(A) tail was added to the 3′ end of the first-strand cDNA by Terminal Deoxynucleotidyl Transferase. Then the single cell cDNAs were amplified by 20 plus 9 cycles of PCR. Then 100-200 ng of these amplified cDNAs were used to construct a sequencing library. The sequencing library can be used for deep sequencing using the SOLiD system. Compared with the cDNA microarray technique, our assay can capture up to 75% more genes expressed in early embryos. The protocol can generate deep sequencing libraries within 6 days for 16 single cell samples. PMID:20203668

  7. Direct bisulfite sequencing for examination of DNA methylation with gene and nucleotide resolution from brain tissues.

    PubMed

    Parrish, R Ryley; Day, Jeremy J; Lubin, Farah D

    2012-07-01

    DNA methylation is an epigenetic modification that is essential for the development and mature function of the central nervous system. Due to the relevance of this modification to the transcriptional control of gene expression, it is often necessary to examine changes in DNA methylation patterns with both gene and single-nucleotide resolution. Here, we describe an in-depth basic protocol for direct bisulfite sequencing of DNA isolated from brain tissue, which will permit direct assessment of methylation status at individual genes as well as individual cytosine molecules/nucleotides within a genomic region. This method yields analysis of DNA methylation patterns that is robust, accurate, and reproducible, thereby allowing insights into the role of alterations in DNA methylation in brain tissue.
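
    The principle behind reading methylation status at individual cytosines can be sketched in a few lines: bisulfite treatment converts unmethylated cytosine so that it is read as T, while methylated cytosine is still read as C. The sketch below is a simplified illustration, not the protocol's analysis pipeline; it assumes a gapless read on the same strand as the reference, and the function names are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate conversion: unmethylated C is read as T, methylated C stays C."""
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(sequence.upper())
    )


def call_methylation(reference, converted_read):
    """Map each reference cytosine position to True (methylated) or False."""
    if len(reference) != len(converted_read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper())):
        if ref != "C":
            continue
        if obs == "C":
            calls[i] = True       # protected from conversion, so methylated
        elif obs == "T":
            calls[i] = False      # converted, so unmethylated
        # any other base is left uncalled (possible sequencing error or variant)
    return calls
```

    With direct sequencing of a PCR product the signal at each cytosine is a mixture over many molecules, so in practice a per-site methylation fraction is estimated rather than a single true/false call.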

  8. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples.

    PubMed

    Quick, Joshua; Grubaugh, Nathan D; Pullan, Steven T; Claro, Ingra M; Smith, Andrew D; Gangavarapu, Karthik; Oliveira, Glenn; Robles-Sikisaka, Refugio; Rogers, Thomas F; Beutler, Nathan A; Burton, Dennis R; Lewis-Ximenez, Lia Laura; de Jesus, Jaqueline Goes; Giovanetti, Marta; Hill, Sarah C; Black, Allison; Bedford, Trevor; Carroll, Miles W; Nunes, Marcio; Alcantara, Luiz Carlos; Sabino, Ester C; Baylis, Sally A; Faria, Nuno R; Loose, Matthew; Simpson, Jared T; Pybus, Oliver G; Andersen, Kristian G; Loman, Nicholas J

    2017-06-01

    Genome sequencing has become a powerful tool for studying emerging infectious diseases; however, genome sequencing directly from clinical samples (i.e., without isolation and culture) remains challenging for viruses such as Zika, for which metagenomic sequencing methods may generate insufficient numbers of viral reads. Here we present a protocol for generating coding-sequence-complete genomes, comprising an online primer design tool, a novel multiplex PCR enrichment protocol, optimized library preparation methods for the portable MinION sequencer (Oxford Nanopore Technologies) and the Illumina range of instruments, and a bioinformatics pipeline for generating consensus sequences. The MinION protocol does not require an Internet connection for analysis, making it suitable for field applications with limited connectivity. Our method relies on multiplex PCR for targeted enrichment of viral genomes from samples containing as few as 50 genome copies per reaction. Viral consensus sequences can be achieved in 1-2 d by starting with clinical samples and following a simple laboratory workflow. This method has been successfully used by several groups studying Zika virus evolution and is facilitating an understanding of the spread of the virus in the Americas. The protocol can be used to sequence other viral genomes using the online Primal Scheme primer designer software. It is suitable for sequencing either RNA or DNA viruses in the field during outbreaks or as an inexpensive, convenient method for use in the lab.
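
    The abstract mentions a bioinformatics pipeline for generating consensus sequences but does not describe it. As a rough illustration of the general idea only, the sketch below calls a per-column majority base from reads that are already aligned and padded to equal length; the function name, the `min_depth` threshold and the toy reads are assumptions and do not reproduce the authors' pipeline.

```python
from collections import Counter


def consensus(aligned_reads: list[str], min_depth: int = 2) -> str:
    """Majority-vote consensus over equal-length, pre-aligned reads.

    Columns with fewer than `min_depth` A/C/G/T calls, or without a
    strict majority base, are reported as 'N'.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    out = []
    for column in zip(*aligned_reads):
        bases = [b for b in column if b in "ACGT"]  # ignore gaps and Ns
        if len(bases) < min_depth:
            out.append("N")
            continue
        top_base, count = Counter(bases).most_common(1)[0]
        out.append(top_base if count * 2 > len(bases) else "N")
    return "".join(out)


if __name__ == "__main__":
    # Toy reads (invented); '-' marks positions a read does not cover.
    print(consensus(["ACGT-", "ACGA-", "ACGTA"]))
```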

  9. Multilocus sequence typing of Pseudomonas syringae sensu lato confirms previously described genomospecies and permits rapid identification of P. syringae pv. coriandricola and P. syringae pv. apii causing bacterial leaf spot on parsley.

    PubMed

    Bull, Carolee T; Clarke, Christopher R; Cai, Rongman; Vinatzer, Boris A; Jardini, Teresa M; Koike, Steven T

    2011-07-01

    Since 2002, severe leaf spotting on parsley (Petroselinum crispum) has occurred in Monterey County, CA. Either of two different pathovars of Pseudomonas syringae sensu lato were isolated from diseased leaves from eight distinct outbreaks and once from the same outbreak. Fragment analysis of DNA amplified between repetitive sequence polymerase chain reaction; 16S rDNA sequence analysis; and biochemical, physiological, and host range tests identified the pathogens as Pseudomonas syringae pv. apii and P. syringae pv. coriandricola. Koch's postulates were completed for the isolates from parsley, and host range tests with parsley isolates and pathotype strains demonstrated that P. syringae pv. apii and P. syringae pv. coriandricola cause leaf spot diseases on parsley, celery, and coriander or cilantro. In a multilocus sequence typing (MLST) approach, four housekeeping gene fragments were sequenced from 10 strains isolated from parsley and 56 pathotype strains of P. syringae. Allele sequences were uploaded to the Plant-Associated Microbes Database and a phylogenetic tree was built based on concatenated sequences. Tree topology directly corresponded to P. syringae genomospecies and P. syringae pv. apii was allocated appropriately to genomospecies 3. This is the first demonstration that MLST can accurately allocate new pathogens directly to P. syringae sensu lato genomospecies. According to MLST, P. syringae pv. coriandricola is a member of genomospecies 9, P. cannabina. In a blind test, both P. syringae pv. coriandricola and P. syringae pv. apii isolates from parsley were correctly identified to pathovar. In both cases, MLST described diversity within each pathovar that was previously unknown.

  10. Metagenomics workflow analysis of endophytic bacteria from oil palm fruits

    NASA Astrophysics Data System (ADS)

    Tanjung, Z. A.; Aditama, R.; Sudania, W. M.; Utomo, C.; Liwang, T.

    2017-05-01

    Next-Generation Sequencing (NGS) has become a powerful sequencing tool for microbial study and has helped establish the field of metagenomics. This study describes a workflow to analyze metagenomics data from a Sequence Read Archive (SRA) file under accession ERP004286 deposited by the University of Sao Paulo. The data were generated by direct sequencing on the 454 pyrosequencing platform and originated from oil palm fruit endophytic bacteria cultured on oil-palm-enriched medium. The workflow used SortMeRNA to separate ribosomal reads, Newbler (GS Assembler and GS Mapper) to assemble reads and map them to a reference genome, the BLAST package to identify and annotate contig sequences, and QualiMap for statistical analysis. Eight bacterial species were identified in this study. Enterobacter cloacae was the most abundant species, followed by Citrobacter koseri, Serratia marcescens, Lactococcus lactis subsp. lactis, Klebsiella pneumoniae, Citrobacter amalonaticus, Achromobacter xylosoxidans, and Pseudomonas sp., respectively. All of these species have been reported as endophytic bacteria in various plant species, and each has potential as a plant-growth-promoting bacterium or for other applications in agricultural industries.

  11. CBrowse: a SAM/BAM-based contig browser for transcriptome assembly visualization and analysis.

    PubMed

    Li, Pei; Ji, Guoli; Dong, Min; Schmidt, Emily; Lenox, Douglas; Chen, Liangliang; Liu, Qi; Liu, Lin; Zhang, Jie; Liang, Chun

    2012-09-15

    To address the impending need for exploring rapidly increased transcriptomics data generated for non-model organisms, we developed CBrowse, an AJAX-based web browser for visualizing and analyzing transcriptome assemblies and contigs. Designed in a standard three-tier architecture with a data pre-processing pipeline, CBrowse is essentially a Rich Internet Application that offers many seamlessly integrated web interfaces and allows users to navigate, sort, filter, search and visualize data smoothly. The pre-processing pipeline takes the contig sequence file in FASTA format and its relevant SAM/BAM file as the input; detects putative polymorphisms, simple sequence repeats and sequencing errors in contigs; and generates image, JSON and database-compatible CSV text files that are directly utilized by different web interfaces. CBrowse is a generic visualization and analysis tool that facilitates close examination of assembly quality, genetic polymorphisms, sequence repeats and/or sequencing errors in transcriptome sequencing projects. CBrowse is distributed under the GNU General Public License, available at http://bioinfolab.muohio.edu/CBrowse/. Contact: liangc@muohio.edu or liangc.mu@gmail.com; glji@xmu.edu.cn. Supplementary data are available at Bioinformatics online.

  12. Cancer systems biology in the genome sequencing era: part 1, dissecting and modeling of tumor clones and their networks.

    PubMed

    Wang, Edwin; Zou, Jinfeng; Zaman, Naif; Beitel, Lenore K; Trifiro, Mark; Paliouras, Miltiadis

    2013-08-01

    Recent tumor genome sequencing confirmed that one tumor often consists of multiple cell subpopulations (clones) which bear different, but related, genetic profiles such as mutation and copy number variation profiles. Thus far, one tumor has been viewed as a whole entity in cancer functional studies. With the advances of genome sequencing and computational analysis, we are able to quantify and computationally dissect clones from tumors, and then conduct clone-based analysis. Emerging technologies such as single-cell genome sequencing and RNA-Seq could profile tumor clones. Thus, we should reconsider how to conduct cancer systems biology studies in the genome sequencing era. We will outline new directions for conducting cancer systems biology by considering that genome sequencing technology can be used for dissecting, quantifying and genetically characterizing clones from tumors. Topics discussed in Part 1 of this review include computationally quantifying of tumor subpopulations; clone-based network modeling, cancer hallmark-based networks and their high-order rewiring principles and the principles of cell survival networks of fast-growing clones. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.

  13. Short communication: novel truncating mutations in the CFTR gene causing a severe form of cystic fibrosis in Italian patients.

    PubMed

    Lenarduzzi, S; Morgutti, M; Crovella, S; Coiana, A; Rosatelli, M C

    2014-11-14

    Cystic fibrosis (CF) is a common recessive genetic disease caused by mutations in the gene encoding the cystic fibrosis transmembrane conductance regulator (CFTR) protein. More than 1800 different mutations have been described to date. Here, we report 3 novel mutations in CFTR in 3 Italian CF patients. To detect and identify 36 frequent mutations in Caucasians, we used the INNO-LiPA CFTR19 and INNO-LiPA CFTR17+Tn Update kits (Innogenetics; Ghent, Belgium). Our first analysis did not reveal both of the responsible mutations; thus, direct sequencing of the CFTR gene coding region was performed. The 3 patients were compound heterozygous. In one allele, the F508del (c.1521_1523delCTT, p.Phe508del) mutation in exon 11 was observed in each case. For the second allele, in patient No. 1, direct sequencing revealed an 11-base pair deletion (GAGGCGATACT) in exon 14 (c.2236_2246del; p.Glu746Alafs*29). In patient No. 2, direct sequencing revealed a nonsense mutation at nucleotide 3892 (c.3892G>T) in exon 24. In patient No. 3, direct sequencing revealed a deletion of cytosine in exon 27 (c.4296delC; p.Asn1432Lysfs*16). These 3 novel mutations indicate the production of a truncated protein, which consequently results in a non-functional polypeptide.
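
    The coding-sequence positions and protein-level names quoted above can be cross-checked with simple codon arithmetic: coding position p lies in codon ceil(p / 3). The short sketch below does only that arithmetic; the helper name `codon_of` is hypothetical, and no CFTR sequence data are used or implied.

```python
def codon_of(cds_position: int) -> tuple[int, int]:
    """Map a 1-based coding (c.) position to (codon number, base within codon 1-3)."""
    if cds_position < 1:
        raise ValueError("coding positions are 1-based")
    codon_index, offset = divmod(cds_position - 1, 3)
    return codon_index + 1, offset + 1


if __name__ == "__main__":
    # Positions taken from the variant names in the abstract.
    for label, pos in [
        ("c.1521 (start of F508del)", 1521),
        ("c.1523 (end of F508del)", 1523),
        ("c.2236 (start of 11-bp deletion)", 2236),
        ("c.3892 (G>T)", 3892),
        ("c.4296 (delC)", 4296),
    ]:
        codon, base = codon_of(pos)
        print(f"{label}: codon {codon}, base {base}")
```

    Under this arithmetic, c.2236 is the first base of codon 746 and c.4296 is the third base of codon 1432, which agrees with the p.Glu746 and p.Asn1432 labels; the F508del deletion spans the last base of codon 507 and the first two bases of codon 508.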

  14. Effects of the Laramide Structures on the Regional Distribution of Tight-Gas Sandstone in the Upper Mesaverde Group, Uinta Basin, Utah

    NASA Astrophysics Data System (ADS)

    Sitaula, R. P.; Aschoff, J.

    2013-12-01

    Regional-scale sequence stratigraphic correlation, well log analysis, syntectonic unconformity mapping, isopach maps, and depositional environment maps of the upper Mesaverde Group (UMG) in Uinta basin, Utah suggest higher accommodation in northeastern part (Natural Buttes area) and local development of lacustrine facies due to increased subsidence caused by uplift of San Rafael Swell (SRS) in southern and Uinta Uplift in northern parts. Recently discovered lacustrine facies in Natural Buttes area are completely different than the dominant fluvial facies in outcrops along Book Cliffs and could have implications for significant amount of tight-gas sand production from this area. Data used for sequence stratigraphic correlation, isopach maps and depositional environmental maps include > 100 well logs, 20 stratigraphic profiles, 35 sandstone thin sections and 10 outcrop-based gamma ray profiles. Seven 4th order depositional sequences (~0.5 my duration) are identified and correlated within UMG. Correlation was constructed using a combination of fluvial facies and stacking patterns in outcrops, chert-pebble conglomerates and tidally influenced strata. These surfaces were extrapolated into subsurface by matching GR profiles. GR well logs and core log of Natural Buttes area show intervals of coarsening upward patterns suggesting possible lacustrine intervals that might contain high TOC. Locally, younger sequences are completely truncated across SRS whereas older sequences are truncated and thinned toward SRS. The cycles of truncation and thinning represent phases of SRS uplift. Thinning possibly related with the Uinta Uplift is also observed in northwestern part. Paleocurrents are consistent with interpretation of periodic segmentation and deflection of sedimentation. Regional paleocurrents are generally E-NE-directed in Sequences 1-4, and N-directed in Sequences 5-7. From isopach maps and paleocurrent direction it can be interpreted that uplift of SRS changed route of sediment supply from west to southwest. Locally, paleocurrents are highly variable near SRS further suggesting UMG basin-fill was partitioned by uplift of SRS. Sandstone composition analysis also suggests the uplift of SRS causing the variation of source rocks in upper sequences than the lower sequences. In conclusion, we suggest that Uinta basin was episodically partitioned during the deposition of UMG due to uplift of Laramide structures in the basin and accommodation was localized in northeastern part. Understanding of structural controls on accommodation, sedimentation patterns and depositional environments will aid prediction of the best-producing gas reservoirs.

  15. PANGEA: pipeline for analysis of next generation amplicons

    PubMed Central

    Giongo, Adriana; Crabb, David B; Davis-Richardson, Austin G; Chauliac, Diane; Mobberley, Jennifer M; Gano, Kelsey A; Mukherjee, Nabanita; Casella, George; Roesch, Luiz FW; Walts, Brandon; Riva, Alberto; King, Gary; Triplett, Eric W

    2010-01-01

    High-throughput DNA sequencing can identify organisms and describe population structures in many environmental and clinical samples. Current technologies generate millions of reads in a single run, requiring extensive computational strategies to organize, analyze and interpret those sequences. A series of bioinformatics tools for high-throughput sequencing analysis, including preprocessing, clustering, database matching and classification, have been compiled into a pipeline called PANGEA. The PANGEA pipeline was written in Perl and can be run on Mac OSX, Windows or Linux. With PANGEA, sequences obtained directly from the sequencer can be processed quickly to provide the files needed for sequence identification by BLAST and for comparison of microbial communities. Two different sets of bacterial 16S rRNA sequences were used to show the efficiency of this workflow. The first set of 16S rRNA sequences is derived from various soils from Hawaii Volcanoes National Park. The second set is derived from stool samples collected from diabetes-resistant and diabetes-prone rats. The workflow described here allows the investigator to quickly assess libraries of sequences on personal computers with customized databases. PANGEA is provided for users as individual scripts for each step in the process or as a single script where all processes, except the χ2 step, are joined into one program called the ‘backbone’. PMID:20182525

  16. PANGEA: pipeline for analysis of next generation amplicons.

    PubMed

    Giongo, Adriana; Crabb, David B; Davis-Richardson, Austin G; Chauliac, Diane; Mobberley, Jennifer M; Gano, Kelsey A; Mukherjee, Nabanita; Casella, George; Roesch, Luiz F W; Walts, Brandon; Riva, Alberto; King, Gary; Triplett, Eric W

    2010-07-01

    High-throughput DNA sequencing can identify organisms and describe population structures in many environmental and clinical samples. Current technologies generate millions of reads in a single run, requiring extensive computational strategies to organize, analyze and interpret those sequences. A series of bioinformatics tools for high-throughput sequencing analysis, including pre-processing, clustering, database matching and classification, have been compiled into a pipeline called PANGEA. The PANGEA pipeline was written in Perl and can be run on Mac OSX, Windows or Linux. With PANGEA, sequences obtained directly from the sequencer can be processed quickly to provide the files needed for sequence identification by BLAST and for comparison of microbial communities. Two different sets of bacterial 16S rRNA sequences were used to show the efficiency of this workflow. The first set of 16S rRNA sequences is derived from various soils from Hawaii Volcanoes National Park. The second set is derived from stool samples collected from diabetes-resistant and diabetes-prone rats. The workflow described here allows the investigator to quickly assess libraries of sequences on personal computers with customized databases. PANGEA is provided for users as individual scripts for each step in the process or as a single script where all processes, except the chi(2) step, are joined into one program called the 'backbone'.

  17. Phylogenetic analysis of a transfusion-transmitted hepatitis A outbreak.

    PubMed

    Hettmann, Andrea; Juhász, Gabriella; Dencs, Ágnes; Tresó, Bálint; Rusvai, Erzsébet; Barabás, Éva; Takács, Mária

    2017-02-01

    A transfusion-associated hepatitis A outbreak was identified for the first time in Hungary. The outbreak involved five cases. Parenteral transmission of hepatitis A is rare, but may occur during viraemia. Direct sequencing of nested PCR products was performed, and all the examined samples were identical in the VP1/2A region of the hepatitis A virus genome. HAV sequences found in recent years were compared, and phylogenetic analysis showed that the strain that caused these cases is the same as the one that had recently spread in Hungary, causing several hepatitis A outbreaks throughout the country.

  18. Transcriptome Analysis at the Single-Cell Level Using SMART Technology.

    PubMed

    Fish, Rachel N; Bostick, Magnolia; Lehman, Alisa; Farmer, Andrew

    2016-10-10

    RNA sequencing (RNA-seq) is a powerful method for analyzing cell state, with minimal bias, and has broad applications within the biological sciences. However, transcriptome analysis of seemingly homogenous cell populations may in fact overlook significant heterogeneity that can be uncovered at the single-cell level. The ultra-low amount of RNA contained in a single cell requires extraordinarily sensitive and reproducible transcriptome analysis methods. As next-generation sequencing (NGS) technologies mature, transcriptome profiling by RNA-seq is increasingly being used to decipher the molecular signature of individual cells. This unit describes an ultra-sensitive and reproducible protocol to generate cDNA and sequencing libraries directly from single cells or RNA inputs ranging from 10 pg to 10 ng. Important considerations for working with minute RNA inputs are given. © 2016 by John Wiley & Sons, Inc.

  19. Discrimination of Bacillus anthracis from closely related microorganisms by analysis of 16S and 23S rRNA with oligonucleotide microchips

    DOEpatents

    Bavykin, Sergei G.; Mirzabekov, Andrei D.

    2007-10-30

    The present invention is directed to a novel method of discriminating a highly infectious bacterium, Bacillus anthracis, from a group of closely related microorganisms. Sequence variations in the 16S and 23S rRNA of the B. cereus subgroup including B. anthracis are utilized to construct an array that can detect these sequence variations through selective hybridizations. The identification and analysis of these sequence variations enables positive discrimination of isolates of the B. cereus group that includes B. anthracis. Discrimination of single base differences in rRNA was achieved with a microchip during analysis of B. cereus group isolates from both single and mixed probes, as well as identification of polymorphic sites. Successful use of a microchip to determine the appropriate subgroup classification, using eight reference microorganisms from the B. cereus group as a study set, was demonstrated.

  20. A Generalized Least-Squares Estimate for the Origin of Sporophytic Self-Incompatibility

    PubMed Central

    Uyenoyama, M. K.

    1995-01-01

    Analysis of nucleotide sequences that regulate the expression of self-incompatibility in flowering plants affords a direct means of examining classical hypotheses for the origin and evolution of this major feature of mating systems. Departing from the classical view of monophyly of all forms of self-incompatibility, the current paradigm for the origin of self-incompatibility postulates multiple episodes of recruitment and modification of preexisting genes. In Brassica, the S locus, which regulates sporophytic self-incompatibility, shows homology to a multigene family present both in self-compatible congeners and in groups for which this form of self-incompatibility is atypical. A phylogenetic analysis of S-allele sequences together with homologous sequences that do not cosegregate with self-incompatibility permits dating the change of function that marked the origin of self-incompatibility. A generalized least-squares method is introduced that provides closed-form expressions for estimates and standard errors for function-specific divergence rates and times of divergence among sequences. This analysis suggests that the age of the sporophytic self-incompatibility system expressed in Brassica exceeds species divergence within the genus by four- to fivefold. The extraordinarily high levels of sequence diversity exhibited by S alleles appears to reflect their ancient derivation, with the alternative hypothesis of hypermutability rejected by the analysis. PMID:7713446

  1. Molecular Detection, Isolation, and Physiological Characterization of Functionally Dominant Phenol-Degrading Bacteria in Activated Sludge

    PubMed Central

    Watanabe, Kazuya; Teramoto, Maki; Futamata, Hiroyuki; Harayama, Shigeaki

    1998-01-01

    DNA was isolated from phenol-digesting activated sludge, and partial fragments of the 16S ribosomal DNA (rDNA) and the gene encoding the largest subunit of multicomponent phenol hydroxylase (LmPH) were amplified by PCR. An analysis of the amplified fragments by temperature gradient gel electrophoresis (TGGE) demonstrated that two major 16S rDNA bands (bands R2 and R3) and two major LmPH gene bands (bands P2 and P3) appeared after the activated sludge became acclimated to phenol. The nucleotide sequences of these major bands were determined. In parallel, bacteria were isolated from the activated sludge by direct plating or by plating after enrichment either in batch cultures or in a chemostat culture. The bacteria isolated were classified into 27 distinct groups by a repetitive extragenic palindromic sequence PCR analysis. The partial nucleotide sequences of 16S rDNAs and LmPH genes of members of these 27 groups were then determined. A comparison of these nucleotide sequences with the sequences of the major TGGE bands indicated that the major bacterial populations, R2 and R3, possessed major LmPH genes P2 and P3, respectively. The dominant populations could be isolated either by direct plating or by chemostat culture enrichment but not by batch culture enrichment. One of the dominant strains (R3), which contained a novel type of LmPH (P3), was closely related to Variovorax paradoxus, and the result of a kinetic analysis of its phenol-oxygenating activity suggested that this strain was the principal phenol digester in the activated sludge. PMID:9797297

  2. Genome-directed analysis of prophage excision, host defence systems, and central fermentative metabolism in Clostridium pasteurianum.

    PubMed

    Pyne, Michael E; Liu, Xuejia; Moo-Young, Murray; Chung, Duane A; Chou, C Perry

    2016-09-19

    Clostridium pasteurianum is emerging as a prospective host for the production of biofuels and chemicals, and has recently been shown to directly consume electric current. Despite this growing biotechnological appeal, the organism's genetics and central metabolism remain poorly understood. Here we present a concurrent genome sequence for the C. pasteurianum type strain and provide extensive genomic analysis of the organism's defence mechanisms and central fermentative metabolism. Next generation genome sequencing produced reads corresponding to spontaneous excision of a novel phage, designated φ6013, which could be induced using mitomycin C and detected using PCR and transmission electron microscopy. Methylome analysis of sequencing reads provided a near-complete glimpse into the organism's restriction-modification systems. We also unveiled the chief C. pasteurianum Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) locus, which was found to exemplify a Type I-B system. Finally, we show that C. pasteurianum possesses a highly complex fermentative metabolism whereby the metabolic pathways enlisted by the cell are governed by the degree of reductance of the substrate. Four distinct fermentation profiles, ranging from exclusively acidogenic to predominantly alcohologenic, were observed through redox consideration of the substrate. A detailed discussion of the organism's central metabolism within the context of metabolic engineering is provided.

  3. Genomic analysis suggests that mRNA destabilization by the microprocessor is specialized for the auto-regulation of Dgcr8.

    PubMed

    Shenoy, Archana; Blelloch, Robert

    2009-09-11

    The Microprocessor, containing the RNA binding protein Dgcr8 and RNase III enzyme Drosha, is responsible for processing primary microRNAs to precursor microRNAs. The Microprocessor regulates its own levels by cleaving hairpins in the 5'UTR and coding region of the Dgcr8 mRNA, thereby destabilizing the mature transcript. To determine whether the Microprocessor has a broader role in directly regulating other coding mRNA levels, we integrated results from expression profiling and ultra high-throughput deep sequencing of small RNAs. Expression analysis of mRNAs in wild-type, Dgcr8 knockout, and Dicer knockout mouse embryonic stem (ES) cells uncovered mRNAs that were specifically upregulated in the Dgcr8 null background. A number of these transcripts had evolutionarily conserved predicted hairpin targets for the Microprocessor. However, analysis of deep sequencing data of 18 to 200nt small RNAs in mouse ES, HeLa, and HepG2 indicates that exonic sequence reads that map in a pattern consistent with Microprocessor activity are unique to Dgcr8. We conclude that the Microprocessor's role in directly destabilizing coding mRNAs is likely specifically targeted to Dgcr8 itself, suggesting a specialized cellular mechanism for gene auto-regulation.

  4. Amoeba proteus displays a walking form of locomotion.

    PubMed

    Cameron, Ivan; Rinaldi, Robert A; Kirby, Gerald; Davidson, David

    2007-08-01

    This report deals with observations on the directional locomotion of amoeba before and after fixation and scanning electron microscopy. The study was aimed at visualization of the stepwise events of directional movements. After the analysis of the data it is proposed that the amoeba undergoes a sequence of movement events that can be defined as a walking form of locomotion.

  5. DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.

    PubMed

    Eernisse, D J

    1992-04-01

    DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator is able to convert GenBank or EMBL documented sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. Provided gene maps, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data is compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.

  6. Head direction cells in the postsubiculum do not show replay of prior waking sequences during sleep

    PubMed Central

    Brandon, Mark P.; Bogaard, Andrew; Andrews, Chris M.; Hasselmo, Michael E.

    2011-01-01

    During slow-wave sleep and REM sleep, hippocampal place cells in the rat show replay of sequences previously observed during waking. We tested the hypothesis from computational modelling that the temporal structure of REM sleep replay could arise from an interplay of place cells with head direction cells in the postsubiculum. Physiological single-unit recording was performed simultaneously from five or more head direction or place by head direction cells in the postsubiculum during running on a circular track allowing sampling of a full range of head directions, and during sleep periods before and after running on the circular track. Data analysis compared the spiking activity during individual REM periods with waking as in previous analysis procedures for REM sleep. We also used a new procedure comparing groups of similar runs during waking with REM sleep periods. There was no consistent evidence for a statistically significant correlation of the temporal structure of spiking during REM sleep with spiking during waking running periods. Thus, the spiking activity of head direction cells during REM sleep does not show replay of head direction cell activity occurring during a previous waking period of running on the task. In addition, we compared the spiking of postsubiculum neurons during hippocampal sharp wave ripple events. We show that head direction cells are not activated during sharp wave ripples, while neurons responsive to place in the postsubiculum show reliable spiking at ripple events. PMID:21509854

  7. Palaeoproteomic evidence identifies archaic hominins associated with the Châtelperronian at the Grotte du Renne

    PubMed Central

    Welker, Frido; Hajdinjak, Mateja; Talamo, Sahra; Jaouen, Klervia; Dannemann, Michael; David, Francine; Julien, Michèle; Meyer, Matthias; Barnes, Ian; Brace, Selina; Kamminga, Pepijn; Fischer, Roman; Kessler, Benedikt M.; Stewart, John R.; Pääbo, Svante; Collins, Matthew J.; Hublin, Jean-Jacques

    2016-01-01

    In Western Europe, the Middle to Upper Paleolithic transition is associated with the disappearance of Neandertals and the spread of anatomically modern humans (AMHs). Current chronological, behavioral, and biological models of this transitional period hinge on the Châtelperronian technocomplex. At the site of the Grotte du Renne, Arcy-sur-Cure, morphological Neandertal specimens are not directly dated but are contextually associated with the Châtelperronian, which contains bone points and beads. The association between Neandertals and this “transitional” assemblage has been controversial because of the lack either of a direct hominin radiocarbon date or of molecular confirmation of the Neandertal affiliation. Here we provide further evidence for a Neandertal–Châtelperronian association at the Grotte du Renne through biomolecular and chronological analysis. We identified 28 additional hominin specimens through zooarchaeology by mass spectrometry (ZooMS) screening of morphologically uninformative bone specimens from Châtelperronian layers at the Grotte du Renne. Next, we obtain an ancient hominin bone proteome through liquid chromatography-MS/MS analysis and error-tolerant amino acid sequence analysis. Analysis of this palaeoproteome allows us to provide phylogenetic and physiological information on these ancient hominin specimens. We distinguish Late Pleistocene clades within the genus Homo based on ancient protein evidence through the identification of an archaic-derived amino acid sequence for the collagen type X, alpha-1 (COL10α1) protein. We support this by obtaining ancient mtDNA sequences, which indicate a Neandertal ancestry for these specimens. Direct accelerator mass spectrometry radiocarbon dating and Bayesian modeling confirm that the hominin specimens date to the Châtelperronian at the Grotte du Renne. PMID:27638212

  8. Palaeoproteomic evidence identifies archaic hominins associated with the Châtelperronian at the Grotte du Renne.

    PubMed

    Welker, Frido; Hajdinjak, Mateja; Talamo, Sahra; Jaouen, Klervia; Dannemann, Michael; David, Francine; Julien, Michèle; Meyer, Matthias; Kelso, Janet; Barnes, Ian; Brace, Selina; Kamminga, Pepijn; Fischer, Roman; Kessler, Benedikt M; Stewart, John R; Pääbo, Svante; Collins, Matthew J; Hublin, Jean-Jacques

    2016-10-04

    In Western Europe, the Middle to Upper Paleolithic transition is associated with the disappearance of Neandertals and the spread of anatomically modern humans (AMHs). Current chronological, behavioral, and biological models of this transitional period hinge on the Châtelperronian technocomplex. At the site of the Grotte du Renne, Arcy-sur-Cure, morphological Neandertal specimens are not directly dated but are contextually associated with the Châtelperronian, which contains bone points and beads. The association between Neandertals and this "transitional" assemblage has been controversial because of the lack either of a direct hominin radiocarbon date or of molecular confirmation of the Neandertal affiliation. Here we provide further evidence for a Neandertal-Châtelperronian association at the Grotte du Renne through biomolecular and chronological analysis. We identified 28 additional hominin specimens through zooarchaeology by mass spectrometry (ZooMS) screening of morphologically uninformative bone specimens from Châtelperronian layers at the Grotte du Renne. Next, we obtain an ancient hominin bone proteome through liquid chromatography-MS/MS analysis and error-tolerant amino acid sequence analysis. Analysis of this palaeoproteome allows us to provide phylogenetic and physiological information on these ancient hominin specimens. We distinguish Late Pleistocene clades within the genus Homo based on ancient protein evidence through the identification of an archaic-derived amino acid sequence for the collagen type X, alpha-1 (COL10α1) protein. We support this by obtaining ancient mtDNA sequences, which indicate a Neandertal ancestry for these specimens. Direct accelerator mass spectrometry radiocarbon dating and Bayesian modeling confirm that the hominin specimens date to the Châtelperronian at the Grotte du Renne.

  9. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.
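
    The library sizes quoted in this abstract can be put in context with a little arithmetic. The sketch below is an illustration only and is not part of the patent: it assumes that every position of a primer can be any of the four bases, so that there are 4^n possible primers of length n, and reports what fraction of that space each quoted library would hold. The names LIBRARY_SIZES and library_coverage are hypothetical.

```python
# Library sizes quoted in the abstract, keyed by primer length in bases.
LIBRARY_SIZES = {8: 10_000, 9: 14_200, 10: 44_000}


def library_coverage(sizes):
    """Return {length: (possible_primers, fraction_held_by_library)}."""
    coverage = {}
    for length, n_primers in sizes.items():
        possible = 4 ** length  # assumption: four bases allowed at each position
        coverage[length] = (possible, n_primers / possible)
    return coverage


for length, (possible, fraction) in library_coverage(LIBRARY_SIZES).items():
    print(f"{length}-mers: {possible:,} possible, library holds {fraction:.1%}")
```

    Under that assumption the quoted libraries hold roughly 15% of all octamers, 5% of all nonamers and 4% of all decamers.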

  10. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.

  11. NGSPanPipe: A Pipeline for Pan-genome Identification in Microbial Strains from Experimental Reads.

    PubMed

    Kulsum, Umay; Kapil, Arti; Singh, Harpreet; Kaur, Punit

    2018-01-01

    Recent advancements in sequencing technologies have decreased both time span and cost for sequencing the whole bacterial genome. High-throughput Next-Generation Sequencing (NGS) technology has led to the generation of enormous data concerning microbial populations publicly available across various repositories. As a consequence, it has become possible to study and compare the genomes of different bacterial strains within a species or genus in terms of evolution, ecology and diversity. Studying the pan-genome provides insights into deciphering microevolution, global composition and diversity in virulence and pathogenesis of a species. It can also assist in identifying drug targets and proposing vaccine candidates. The effective analysis of these large genome datasets necessitates the development of robust tools. Current methods to develop pan-genome do not support direct input of raw reads from the sequencer machine but require preprocessing of reads as an assembled protein/gene sequence file or the binary matrix of orthologous genes/proteins. We have designed an easy-to-use integrated pipeline, NGSPanPipe, which can directly identify the pan-genome from short reads. The output from the pipeline is compatible with other pan-genome analysis tools. We evaluated our pipeline with other methods for developing pan-genome, i.e. reference-based assembly and de novo assembly using simulated reads of Mycobacterium tuberculosis. The single script pipeline (pipeline.pl) is applicable for all bacterial strains. It integrates multiple in-house Perl scripts and is freely accessible from https://github.com/Biomedinformatics/NGSPanPipe.

  12. MIPS: analysis and annotation of proteins from whole genomes

    PubMed Central

    Mewes, H. W.; Amid, C.; Arnold, R.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Münsterkötter, M.; Pagel, P.; Strack, N.; Stümpflen, V.; Warfsmann, J.; Ruepp, A.

    2004-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein–protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de). PMID:14681354

  13. MIPS: analysis and annotation of proteins from whole genomes.

    PubMed

    Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

    2004-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).

  14. Wasabi: An Integrated Platform for Evolutionary Sequence Analysis and Data Visualization.

    PubMed

    Veidenberg, Andres; Medlar, Alan; Löytynoja, Ari

    2016-04-01

    Wasabi is an open source, web-based environment for evolutionary sequence analysis. Wasabi visualizes sequence data together with a phylogenetic tree within a modern, user-friendly interface: The interface hides extraneous options, supports context sensitive menus, drag-and-drop editing, and displays additional information, such as ancestral sequences, associated with specific tree nodes. The Wasabi environment supports reproducibility by automatically storing intermediate analysis steps and includes built-in functions to share data between users and publish analysis results. For computational analysis, Wasabi supports PRANK and PAGAN for phylogeny-aware alignment and alignment extension, and it can be easily extended with other tools. Along with drag-and-drop import of local files, Wasabi can access remote data through URL and import sequence data, GeneTrees and EPO alignments directly from Ensembl. To demonstrate a typical workflow using Wasabi, we reproduce key findings from recent comparative genomics studies, including a reanalysis of the EGLN1 gene from the tiger genome study: These case studies can be browsed within Wasabi at http://wasabiapp.org:8000?id=usecases. Wasabi runs inside a web browser and does not require any installation. One can start using it at http://wasabiapp.org. All source code is licensed under the AGPLv3. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  15. Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries.

    PubMed

    Vinogradov, Alexander A; Gates, Zachary P; Zhang, Chi; Quartararo, Anthony J; Halloran, Kathryn H; Pentelute, Bradley L

    2017-11-13

    A methodology to achieve high-throughput de novo sequencing of synthetic peptide mixtures is reported. The approach leverages shotgun nanoliquid chromatography coupled with tandem mass spectrometry-based de novo sequencing of library mixtures (up to 2000 peptides) as well as automated data analysis protocols to filter away incorrect assignments, noise, and synthetic side-products. For increasing the confidence in the sequencing results, mass spectrometry-friendly library designs were developed that enabled unambiguous decoding of up to 600 peptide sequences per hour while maintaining greater than 85% sequence identification rates in most cases. The reliability of the reported decoding strategy was additionally confirmed by matching fragmentation spectra for select authentic peptides identified from library sequencing samples. The methods reported here are directly applicable to screening techniques that yield mixtures of active compounds, including particle sorting of one-bead one-compound libraries and affinity enrichment of synthetic library mixtures performed in solution.

  16. Methods for magnetic resonance analysis using magic angle technique

    DOEpatents

    Hu, Jian Zhi [Richland, WA]; Wind, Robert A [Kennewick, WA]; Minard, Kevin R [Kennewick, WA]; Majors, Paul D [Kennewick, WA]

    2011-11-22

    Methods of performing a magnetic resonance analysis of a biological object are disclosed that include placing the object in a main magnetic field (that has a static field direction) and in a radio frequency field; rotating the object at a frequency of less than about 100 Hz around an axis positioned at an angle of about 54°44′ relative to the main magnetic static field direction; pulsing the radio frequency to provide a sequence that includes a phase-corrected magic angle turning pulse segment; and collecting data generated by the pulsed radio frequency. In particular embodiments the method includes pulsing the radio frequency to provide at least two of a spatially selective read pulse, a spatially selective phase pulse, and a spatially selective storage pulse. Further disclosed methods provide pulse sequences that provide extended imaging capabilities, such as chemical shift imaging or multiple-voxel data acquisition.

  17. Species-specific identification of commercial probiotic strains.

    PubMed

    Yeung, P S M; Sanders, M E; Kitts, C L; Cano, R; Tong, P S

    2002-05-01

    Products containing probiotic bacteria are gaining popularity, increasing the importance of their accurate speciation. Unfortunately, studies have suggested that improper labeling of probiotic species is common in commercial products. Species identification of a bank of commercial probiotic strains was attempted using partial 16S rDNA sequencing, carbohydrate fermentation analysis, and cellular fatty acid methyl ester analysis. Results from partial 16S rDNA sequencing indicated discrepancies between species designations for 26 out of 58 strains tested, including two ATCC Lactobacillus strains. When considering only the commercial strains obtained directly from the manufacturers, 14 of 29 strains carried species designations different from those obtained by partial 16S rDNA sequencing. Strains from six commercial products were species not listed on the label. The discrepancies mainly occurred in Lactobacillus acidophilus and Lactobacillus casei groups. Carbohydrate fermentation analysis was not sensitive enough to identify species within the L. acidophilus group. Fatty acid methyl ester analysis was found to be variable and inaccurate and is not recommended to identify probiotic lactobacilli.

  18. Genome sequence diversity and clues to the evolution of variola (smallpox) virus.

    PubMed

    Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M

    2006-08-11

    Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.

  19. UCbase 2.0: ultraconserved sequences database (2014 update)

    PubMed Central

    Lomonaco, Vincenzo; Martoglia, Riccardo; Mandreoli, Federica; Anderlucci, Laura; Emmett, Warren; Bicciato, Silvio; Taccioli, Cristian

    2014-01-01

    UCbase 2.0 (http://ucbase.unimore.it) is an update, extension and evolution of UCbase, a Web tool dedicated to the analysis of ultraconserved sequences (UCRs). UCRs are 481 sequences >200 bases sharing 100% identity among human, mouse and rat genomes. They are frequently located in genomic regions known to be involved in cancer or differentially expressed in human leukemias and carcinomas. UCbase 2.0 is a platform-independent Web resource that includes the updated version of the human genome annotation (hg19), information linking disorders to chromosomal coordinates based on the Systematized Nomenclature of Medicine classification, a query tool to search for Single Nucleotide Polymorphisms (SNPs) and a new text box to directly interrogate the database using a MySQL interface. To facilitate the interactive visual interpretation of UCR chromosomal positioning, UCbase 2.0 now includes a graph visualization interface directly linked to UCSC genome browser. Database URL: http://ucbase.unimore.it PMID:24951797

  20. Performance Analysis of Direct-Sequence Code-Division Multiple-Access Communications with Asymmetric Quadrature Phase-Shift-Keying Modulation

    NASA Technical Reports Server (NTRS)

    Wang, C.-W.; Stark, W.

    2005-01-01

    This article considers a quaternary direct-sequence code-division multiple-access (DS-CDMA) communication system with asymmetric quadrature phase-shift-keying (AQPSK) modulation for unequal error protection (UEP) capability. Both time synchronous and asynchronous cases are investigated. An expression for the probability distribution of the multiple-access interference is derived. The exact bit-error performance and the approximate performance using a Gaussian approximation and random signature sequences are evaluated by extending the techniques used for uniform quadrature phase-shift-keying (QPSK) and binary phase-shift-keying (BPSK) DS-CDMA systems. Finally, a general system model with unequal user power and the near-far problem is considered and analyzed. The results show that, for a system with UEP capability, the less protected data bits are more sensitive to the near-far effect that occurs in a multiple-access environment than are the more protected bits.

  1. BCM Search Launcher--an integrated interface to molecular biology data base search and analysis services available on the World Wide Web.

    PubMed

    Smith, R F; Wiese, B A; Wojzynski, M K; Davison, D B; Worley, K C

    1996-05-01

    The BCM Search Launcher is an integrated set of World Wide Web (WWW) pages that organize molecular biology-related search and analysis services available on the WWW by function, and provide a single point of entry for related searches. The Protein Sequence Search Page, for example, provides a single sequence entry form for submitting sequences to WWW servers that offer remote access to a variety of different protein sequence search tools, including BLAST, FASTA, Smith-Waterman, BEAUTY, PROSITE, and BLOCKS searches. Other Launch pages provide access to (1) nucleic acid sequence searches, (2) multiple and pair-wise sequence alignments, (3) gene feature searches, (4) protein secondary structure prediction, and (5) miscellaneous sequence utilities (e.g., six-frame translation). The BCM Search Launcher also provides a mechanism to extend the utility of other WWW services by adding supplementary hypertext links to results returned by remote servers. For example, links to the NCBI's Entrez data base and to the Sequence Retrieval System (SRS) are added to search results returned by the NCBI's WWW BLAST server. These links provide easy access to auxiliary information, such as Medline abstracts, that can be extremely helpful when analyzing BLAST data base hits. For new or infrequent users of sequence data base search tools, we have preset the default search parameters to provide the most informative first-pass sequence analysis possible. We have also developed a batch client interface for Unix and Macintosh computers that allows multiple input sequences to be searched automatically as a background task, with the results returned as individual HTML documents directly to the user's system. The BCM Search Launcher and batch client are available on the WWW at URL http://gc.bcm.tmc.edu:8088/search-launcher.html.

  2. Microsatellite analysis in the genome of Acanthaceae: An in silico approach

    PubMed Central

    Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

    2015-01-01

    Background: Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites existing in the complete genome sequences would help to attain a direct role in the genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. Objective: The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. Materials and Methods: The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites and its inbuilt Primer3 module was used for primer designing. Results: In total, 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and the occurrence of dinucleotide repeats was found to be abundant in the genome sequences. The essential amino acid isoleucine was found to be rich in all the sequences. We also designed the SSR-based primers/markers for 59 sequences of this family that contain microsatellite repeats in their genome. Conclusion: The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to the Acanthaceae family in the future. PMID:25709226
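
    The screening step described in this abstract (scanning nucleotide sequences for SSRs) can be illustrated with a minimal sketch. It is not the SSR Locator tool used in the study; it is a plain regular-expression scan for perfect tandem repeats, and the function name find_ssrs and the minimum repeat counts (6 for dinucleotides, 4 for trinucleotides, 3 for tetranucleotides) are assumptions chosen only for illustration.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS = {2: 6, 3: 4, 4: 3}


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, copies) for each perfect tandem repeat found."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_count - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), so each locus is reported once.
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


example = "GG" + "AT" * 7 + "CCT" + "CAG" * 5 + "TT"
print(find_ssrs(example))
```

    Because the scan is leftmost-first, a repeat may be reported under a rotated motif (for example GCA instead of CAG) when the preceding base happens to extend the pattern; a real SSR tool would normalise motifs and also handle imperfect repeats.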

  3. Type III Bartter-like syndrome in an infant boy with Gitelman syndrome and autosomal dominant familial neurohypophyseal diabetes insipidus.

    PubMed

    Brugnara, Milena; Gaudino, Rossella; Tedeschi, Silvana; Syrèn, Marie-Louise; Perrotta, Silverio; Maines, Evelina; Zaffanello, Marco

    2014-09-01

    We report the case of an infant boy with polyuria and a familial history of central diabetes insipidus. Laboratory blood tests disclosed hypokalemia, metabolic alkalosis, hyperreninemia, and hyperaldosteronism. Plasma magnesium concentration was slightly low. Urine analysis showed hypercalciuria, hyposthenuria, and high excretion of potassium. Such findings oriented toward type III Bartter syndrome (BSIII). Direct sequencing of the CLCNKB gene revealed no disease-causing mutations. The water deprivation test was positive. Magnetic resonance imaging showed a lack of posterior pituitary hyperintensity. Finally, direct sequencing of the AVP-NPII gene showed a point mutation (c.1884G>A) in a heterozygous state, confirming an autosomal dominant familial neurohypophyseal diabetes insipidus (adFNDI). This condition did not explain the patient's phenotype; thus, we investigated for Gitelman syndrome (GS). A direct sequencing of the SLC12A3 gene showed c.269A>C and c.1205C>A new mutations. In conclusion, the patient had a genetic combination of GS and adFNDI with a BSIII-like phenotype.

  4. Domain fusion analysis by applying relational algebra to protein sequence and domain databases

    PubMed Central

    Truong, Kevin; Ikura, Mitsuhiko

    2003-01-01

    Background: Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. Results: This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at . Conclusion: As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
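
    The idea of expressing domain fusion analysis in SQL can be sketched as follows. This is not the authors' query: the table domain_hit, its columns, the protein and domain identifiers, and the function find_domain_fusions are all hypothetical, and protein identifiers are assumed to be unique across organisms. The query looks for two separate proteins in one organism whose domains occur together in a single protein of another organism.

```python
import sqlite3

# Self-joins: a and b are separate proteins in the same organism;
# c1 and c2 are two domain hits on one "fused" protein elsewhere.
QUERY = """
SELECT DISTINCT a.protein, b.protein, a.domain, b.domain, c1.protein
FROM domain_hit AS a
JOIN domain_hit AS b
  ON a.organism = b.organism AND a.protein < b.protein
JOIN domain_hit AS c1
  ON c1.domain = a.domain AND c1.organism <> a.organism
JOIN domain_hit AS c2
  ON c2.protein = c1.protein AND c2.domain = b.domain
WHERE a.domain <> b.domain
  AND NOT EXISTS (SELECT 1 FROM domain_hit AS x
                  WHERE x.protein = a.protein AND x.domain = b.domain)
  AND NOT EXISTS (SELECT 1 FROM domain_hit AS y
                  WHERE y.protein = b.protein AND y.domain = a.domain)
ORDER BY a.protein, b.protein
"""


def find_domain_fusions(hits):
    """hits: iterable of (protein, organism, domain) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE domain_hit (protein TEXT, organism TEXT, domain TEXT)"
    )
    conn.executemany("INSERT INTO domain_hit VALUES (?, ?, ?)", hits)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


# Made-up toy data: yeastA and yeastB are linked through humanAB.
toy_hits = [
    ("yeastA", "S. cerevisiae", "DOM_X"),
    ("yeastB", "S. cerevisiae", "DOM_Y"),
    ("yeastC", "S. cerevisiae", "DOM_Z"),
    ("humanAB", "H. sapiens", "DOM_X"),
    ("humanAB", "H. sapiens", "DOM_Y"),
]
print(find_domain_fusions(toy_hits))
```

    Since the whole analysis is one declarative query, it can simply be re-run whenever the underlying domain table is refreshed, which matches the abstract's point about dynamic updating.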

  5. Direct Formalin Fixation Induces Widespread Genomic Effects in Archival Tissues

    EPA Science Inventory

    Recent advances in next generation sequencing have dramatically improved transcriptional analysis of degraded RNA from formalin-fixed paraffin-embedded (FFPE) samples. However, little is known about potential genomic artifacts induced by formalin fixation, which could affect toxi...

  6. Structure-Function Analysis of Chloroplast Proteins via Random Mutagenesis Using Error-Prone PCR.

    PubMed

    Dumas, Louis; Zito, Francesca; Auroy, Pascaline; Johnson, Xenie; Peltier, Gilles; Alric, Jean

    2018-06-01

    Site-directed mutagenesis of chloroplast genes was developed three decades ago and has greatly advanced the field of photosynthesis research. Here, we describe a new approach for generating random chloroplast gene mutants that combines error-prone polymerase chain reaction of a gene of interest with chloroplast complementation of the knockout Chlamydomonas reinhardtii mutant. As a proof of concept, we targeted a 300-bp sequence of the petD gene that encodes subunit IV of the thylakoid membrane-bound cytochrome b6f complex. By sequencing chloroplast transformants, we revealed 149 mutations in the 300-bp target petD sequence that resulted in 92 amino acid substitutions in the 100-residue target subunit IV sequence. Our results show that this method is suited to the study of highly hydrophobic, multisubunit, and chloroplast-encoded proteins containing cofactors such as hemes, iron-sulfur clusters, and chlorophyll pigments. Moreover, we show that mutant screening and sequencing can be used to study photosynthetic mechanisms or to probe the mutational robustness of chloroplast-encoded proteins, and we propose that this method is a valuable tool for the directed evolution of enzymes in the chloroplast. © 2018 American Society of Plant Biologists. All rights reserved.

  7. Protein domain analysis of genomic sequence data reveals regulation of LRR related domains in plant transpiration in Ficus.

    PubMed

    Lang, Tiange; Yin, Kangquan; Liu, Jinyu; Cao, Kunfang; Cannon, Charles H; Du, Fang K

    2014-01-01

    Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.

  8. Evaluation of next generation mtGenome sequencing using the Ion Torrent Personal Genome Machine (PGM)

    PubMed Central

    Parson, Walther; Strobl, Christina; Huber, Gabriela; Zimmermann, Bettina; Gomes, Sibylle M.; Souto, Luis; Fendt, Liane; Delport, Rhena; Langit, Reina; Wootton, Sharon; Lagacé, Robert; Irwin, Jodi

    2013-01-01

    Insights into the human mitochondrial phylogeny have been primarily achieved by sequencing full mitochondrial genomes (mtGenomes). In forensic genetics (partial) mtGenome information can be used to assign haplotypes to their phylogenetic backgrounds, which may, in turn, have characteristic geographic distributions that would offer useful information in a forensic case. In addition and perhaps even more relevant in the forensic context, haplogroup-specific patterns of mutations form the basis for quality control of mtDNA sequences. The current method for establishing (partial) mtDNA haplotypes is Sanger-type sequencing (STS), which is laborious, time-consuming, and expensive. With the emergence of Next Generation Sequencing (NGS) technologies, the body of available mtDNA data can potentially be extended much more quickly and cost-efficiently. Customized chemistries, laboratory workflows and data analysis packages could support the community and increase the utility of mtDNA analysis in forensics. We have evaluated the performance of mtGenome sequencing using the Personal Genome Machine (PGM) and compared the resulting haplotypes directly with conventional Sanger-type sequencing. A total of 64 mtGenomes (>1 million bases) were established that yielded high concordance with the corresponding STS haplotypes (<0.02% differences). About two-thirds of the differences were observed in or around homopolymeric sequence stretches. In addition, the sequence alignment algorithm employed to align NGS reads played a significant role in the analysis of the data and the resulting mtDNA haplotypes. Further development of alignment software would be desirable to facilitate the application of NGS in mtDNA forensic genetics. PMID:23948325

  9. ScaffoldSeq: Software for characterization of directed evolution populations.

    PubMed

    Woldring, Daniel R; Holec, Patrick V; Hackel, Benjamin J

    2016-07-01

    ScaffoldSeq is software designed for the numerous applications (including directed evolution analysis) in which a user generates a population of DNA sequences encoding partially diverse proteins with related functions and would like to characterize the single site and pairwise amino acid frequencies across the population. A common scenario for enzyme maturation, antibody screening, and alternative scaffold engineering involves naïve and evolved populations that contain diversified regions, varying in both sequence and length, within a conserved framework. Analyzing the diversified regions of such populations is facilitated by high-throughput sequencing platforms; however, length variability within these regions (e.g., antibody CDRs) encumbers the alignment process. To overcome this challenge, the ScaffoldSeq algorithm takes advantage of conserved framework sequences to quickly identify diverse regions. Beyond this, unintended biases in sequence frequency are generated throughout the experimental workflow required to evolve and isolate clones of interest prior to DNA sequencing. ScaffoldSeq software uniquely handles this issue by providing tools to quantify and remove background sequences, cluster similar protein families, and dampen the impact of dominant clones. The software produces graphical and tabular summaries for each region of interest, allowing users to evaluate diversity in a site-specific manner as well as identify epistatic pairwise interactions. The code and detailed information are freely available at http://research.cems.umn.edu/hackel. Proteins 2016; 84:869-874. © 2016 Wiley Periodicals, Inc.

  10. A core microbiome associated with the peritoneal tumors of pseudomyxoma peritonei

    PubMed Central

    2013-01-01

    Background Pseudomyxoma peritonei (PMP) is a malignancy characterized by dissemination of mucus-secreting cells throughout the peritoneum. This disease is associated with significant morbidity and mortality, and despite effective treatment options for early-stage disease, patients with PMP often relapse. Thus, there is a need for additional treatment options to reduce relapse rate and increase long-term survival. A previous study identified the presence of both typed and non-culturable bacteria associated with PMP tissue and determined that increased bacterial density was associated with more severe disease. These findings highlighted the possible role for bacteria in PMP disease. Methods To more clearly define the bacterial communities associated with PMP disease, we employed a sequence-based analysis to profile the bacterial populations found in PMP tumor and mucin tissue in 11 patients. Sequencing data were confirmed by in situ hybridization at multiple taxonomic depths and by culturing. A pilot clinical study was initiated to determine whether the addition of antibiotic therapy affected PMP patient outcome. Main results We determined that the types of bacteria present are highly conserved in all PMP patients; the dominant phyla are the Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes. A core set of taxon-specific sequences was found in all 11 patients; many of these sequences were classified into taxonomic groups that also contain known human pathogens. In situ hybridization directly confirmed the presence of bacteria in PMP at multiple taxonomic depths and supported our sequence-based analysis. Furthermore, culturing of PMP tissue samples allowed us to isolate 11 different bacterial strains from eight independent patients, and in vitro analysis of a subset of these isolates suggests that at least some of these strains may interact with the PMP-associated mucin MUC2. Finally, we provide evidence suggesting that targeting these bacteria with antibiotic treatment may increase the survival of PMP patients. Conclusions Using 16S amplicon-based sequencing, direct in situ hybridization analysis and culturing methods, we have identified numerous bacterial taxa that are consistently present in all PMP patients tested. Combined with data from a pilot clinical study, these data support the hypothesis that adding antimicrobials to the standard PMP treatment could improve PMP patient survival. PMID:23844722

  11. Measurement of Electromagnetic Energy Flow Through a Sparse Particulate Medium: A Perspective

    NASA Technical Reports Server (NTRS)

    Mishchenko, Michael I.

    2013-01-01

    First-principle analysis of the functional design of a well-collimated radiometer (WCR) reveals that in general, this instrument does not record the instantaneous directional flow of electromagnetic energy. Only in special cases can a sequence of measurements with a WCR yield the magnitude and direction of the local time-averaged Poynting vector. Our analysis demonstrates that it is imperative to clearly formulate the physical nature of the actual measurement afforded by a directional radiometer rather than presume desirable measurement capabilities. Only then can the directional radiometer be considered a legitimate part of physically based remote sensing and radiation-budget applications. We also emphasize the need for a better understanding of the nature of measurements with panoramic radiometers.

  12. Identification of a precursor genomic segment that provided a sequence unique to glycophorin B and E genes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Onda, M.; Kudo, S.; Fukuda, M.

    Human glycophorin A, B, and E (GPA, GPB, and GPE) genes belong to a gene family located at the long arm of chromosome 4. These three genes are homologous from the 5'-flanking sequence to the Alu sequence, which is 1 kb downstream from the exon encoding the transmembrane domain. Analysis of the Alu sequence and flanking direct repeat sequences suggested that the GPA gene most closely resembles the ancestral gene, whereas the GPB and GPE genes arose by homologous recombination within the Alu sequence, acquiring 3' sequences from an unrelated precursor genomic segment. Here the authors describe the identification of this putative precursor genomic segment. A human genomic library was screened by using the sequence of the 3' region of the GPB gene as a probe. The genomic clones isolated were found to contain an Alu sequence that appeared to be involved in the recombination. Downstream from the Alu sequence, the nucleotide sequence of the precursor genomic segment is almost identical to that of the GPB or GPE gene. In contrast, the upstream sequence of the genomic segment differs entirely from that of the GPA, GPB, and GPE genes. Conservation of the direct repeats flanking the Alu sequence of the genomic segment strongly suggests that the sequence of this genomic segment has been maintained during evolution. This identified genomic segment was found to reside downstream from the GPA gene by both gene mapping and in situ chromosomal localization. The precursor genomic segment was also identified in the orangutan genome, which is known to lack GPB and GPE genes. These results indicate that one of the duplicated ancestral glycophorin genes acquired a unique 3' sequence by unequal crossing-over through its Alu sequence and the further downstream Alu sequence present in the duplicated gene. Further duplication and divergence of this gene yielded the GPB and GPE genes. 37 refs., 5 figs.

  13. The complete chloroplast genome sequence of Gossypium hirsutum: organization and phylogenetic relationships to other angiosperms

    PubMed Central

    Lee, Seung-Bum; Kaittanis, Charalambos; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry

    2006-01-01

    Background Cotton (Gossypium hirsutum) is the most important fiber crop grown in 90 countries. In 2004–2005, US farmers planted 79% of the 5.7-million hectares of nuclear transgenic cotton. Unfortunately, genetically modified cotton has the potential to hybridize with other cultivated and wild relatives, resulting in geographical restrictions to cultivation. However, chloroplast genetic engineering offers the possibility of containment because of maternal inheritance of transgenes. The complete chloroplast genome of cotton provides essential information required for genetic engineering. In addition, the sequence data were used to assess phylogenetic relationships among the major clades of rosids using cotton and 25 other completely sequenced angiosperm chloroplast genomes. Results The complete cotton chloroplast genome is 160,301 bp in length, with 112 unique genes and 19 duplicated genes within the IR, containing a total of 131 genes. There are four ribosomal RNAs, 30 distinct tRNA genes and 17 intron-containing genes. The gene order in cotton is identical to that of tobacco but lacks rpl22 and infA. There are 30 direct and 24 inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Most of the direct repeats are within intergenic spacer regions and introns, and a 72 bp-long direct repeat is within the psaA and psaB genes. Comparison of protein coding sequences with expressed sequence tags (ESTs) revealed nucleotide substitutions resulting in amino acid changes in ndhC, rpl23, rpl20, rps3 and clpP. Phylogenetic analyses of a data set including 61 protein-coding genes, using both maximum likelihood and maximum parsimony, were performed for 28 taxa, including cotton and five other angiosperm chloroplast genomes that were not included in any previous phylogenies. Conclusion The cotton chloroplast genome lacks rpl22 and infA and contains a number of dispersed direct and inverted repeats. RNA editing resulted in amino acid changes with significant impact on their hydropathy. Phylogenetic analysis provides strong support for the position of cotton in the Malvales in the eurosids II clade sister to Arabidopsis in the Brassicales. Furthermore, there is strong support for the placement of the Myrtales sister to the eurosid I clade, although expanded taxon sampling is needed to further test this relationship. PMID:16553962

  14. Triplex-mediated analysis of cytosine methylation at CpA sites in DNA.

    PubMed

    Johannsen, Marie W; Gerrard, Simon R; Melvin, Tracy; Brown, Tom

    2014-01-18

    Modified triplex-forming oligonucleotides distinguish 5-methyl cytosine from unmethylated cytosine in DNA duplexes by differences in triplex melting temperatures. The discrimination is sequence-specific; dramatic differences in stabilisation are seen for CpA methylation, whereas CpG methylation is not detected. This direct detection of DNA methylation constitutes a new approach for epigenetic analysis.

  15. GUTSS: An Alignment-Free Sequence Comparison Method for Use in Human Intestinal Microbiome and Fecal Microbiota Transplantation Analysis.

    PubMed

    Brittnacher, Mitchell J; Heltshe, Sonya L; Hayden, Hillary S; Radey, Matthew C; Weiss, Eli J; Damman, Christopher J; Zisman, Timothy L; Suskind, David L; Miller, Samuel I

    2016-01-01

    Comparative analysis of gut microbiomes in clinical studies of human diseases typically relies on identification and quantification of species or genes. In addition to exploring specific functional characteristics of the microbiome and potential significance of species diversity or expansion, microbiome similarity is also calculated to study change in response to therapies directed at altering the microbiome. Established ecological measures of similarity can be constructed from species abundances; however, methods for calculating these commonly used ecological measures of similarity directly from whole genome shotgun (WGS) metagenomic sequence are lacking. We present an alignment-free method for calculating similarity of WGS metagenomic sequences that is analogous to the Bray-Curtis index for species, implemented by the General Utility for Testing Sequence Similarity (GUTSS) software application. This method was applied to intestinal microbiomes of healthy young children to measure developmental changes toward an adult microbiome during the first 3 years of life. We also calculate similarity of donor and recipient microbiomes to measure establishment, or engraftment, of donor microbiota in fecal microbiota transplantation (FMT) studies focused on mild to moderate Crohn's disease. We show how a relative index of similarity to donor can be calculated as a measure of change in a patient's microbiome toward that of the donor in response to FMT. Because clinical efficacy of the transplant procedure cannot be fully evaluated without analysis methods to quantify actual FMT engraftment, we developed a method for detecting change in the gut microbiome that is independent of species identification and database bias, sensitive to changes in relative abundance of the microbial constituents, and can be formulated as an index for correlating engraftment success with clinical measures of disease. More generally, this method may be applied to clinical evaluation of human microbiomes and provide potential diagnostic determination of individuals who may be candidates for specific therapies directed at alteration of the microbiome.
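
    To make the idea of a Bray-Curtis-like index computed directly from reads concrete, the sketch below builds k-mer frequency profiles for two read sets and compares them with the standard Bray-Curtis similarity, 2 * sum(min(p_i, q_i)) / (sum(p) + sum(q)). The abstract does not describe the GUTSS algorithm itself, so the use of k-mers, the value of `k`, and the function names are assumptions for illustration rather than the published method.

```python
from collections import Counter


def kmer_profile(reads, k=4):
    """Relative k-mer frequencies over a collection of reads (ACGT only)."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip k-mers with N or other symbols
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def bray_curtis_similarity(p, q):
    """Bray-Curtis similarity between two abundance profiles (dicts)."""
    shared = sum(min(p[key], q[key]) for key in p.keys() & q.keys())
    total = sum(p.values()) + sum(q.values())
    return 2 * shared / total if total else 0.0


if __name__ == "__main__":
    donor = kmer_profile(["ACGTACGTAC", "TTGACGTACG"])
    recipient = kmer_profile(["ACGTACGTTT", "GGGACGTACC"])
    print(round(bray_curtis_similarity(donor, recipient), 3))
```

    Normalizing each profile to relative frequencies keeps the comparison from being dominated by differences in sequencing depth between samples.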

  16. Pse-Analysis: a python package for DNA/RNA and protein/peptide sequence analysis based on pseudo components and kernel methods.

    PubMed

    Liu, Bin; Wu, Hao; Zhang, Deyuan; Wang, Xiaolong; Chou, Kuo-Chen

    2017-02-21

    To expedite genome/proteome analysis, we have developed a Python package called Pse-Analysis. The package can automatically complete the following five procedures: (1) sample feature extraction, (2) optimal parameter selection, (3) model training, (4) cross validation, and (5) evaluating prediction quality. All a user needs to do is input a benchmark dataset along with the query biological sequences concerned. Based on the benchmark dataset, Pse-Analysis will automatically construct an ideal predictor and then yield the predicted results for the submitted query samples. All the aforementioned tedious jobs can be done automatically by the computer. Moreover, the multiprocessing technique was adopted to enhance computational speed about 6-fold. The Pse-Analysis Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/Pse-Analysis/, and can be directly run on Windows, Linux, and Unix.

  17. SNaPshot and StripAssay as Valuable Alternatives to Direct Sequencing for KRAS Mutation Detection in Colon Cancer Routine Diagnostics

    PubMed Central

    Fariña Sarasqueta, Arantza; Moerland, Elna; de Bruyne, Hanneke; de Graaf, Henk; Vrancken, Tamara; van Lijnschoten, Gesina; van den Brule, Adriaan J.C.

    2011-01-01

    Although direct sequencing is the gold standard for KRAS mutation detection in routine diagnostics, it remains laborious, time consuming, and not very sensitive. Our objective was to evaluate SNaPshot and the KRAS StripAssay as alternatives to sequencing for KRAS mutation detection in daily practice. KRAS exon 2–specific PCR followed by sequencing or by a SNaPshot reaction was performed. For the StripAssay, a mutant-enriched PCR was followed by hybridization to KRAS-specific probes bound to a nitrocellulose strip. To test sensitivities, dilution series of mutated DNA in wild-type DNA were made. Additionally, direct sequencing and SNaPshot were evaluated in 296 colon cancer samples. Detection limits of direct sequencing, SNaPshot, and StripAssay were 20%, 10%, and 1% tumor cells, respectively. Direct sequencing and SNaPshot can detect all 12 mutations in KRAS codons 12 and 13, whereas the StripAssay detects 10 of the most frequent ones. Workload and time to results are comparable for SNaPshot and direct sequencing. SNaPshot is flexible and easy to multiplex. The StripAssay is less time consuming for daily laboratory practice. SNaPshot is more flexible and slightly more sensitive than direct sequencing. The clinical evaluation showed comparable performances between direct sequencing and SNaPshot. The StripAssay is a rapid and extremely sensitive assay that could be considered when few tumor cells are available. However, any mutants found should be confirmed to avoid the risk of false positives. PMID:21354055

  18. Migration pattern of hepatitis A virus genotype IA in North-Central Tunisia.

    PubMed

    Beji-Hamza, Abir; Taffon, Stefania; Mhalla, Salma; Lo Presti, Alessandra; Equestre, Michele; Chionne, Paola; Madonna, Elisabetta; Cella, Eleonora; Bruni, Roberto; Ciccozzi, Massimo; Aouni, Mahjoub; Ciccaglione, Anna Rita

    2015-02-08

    Hepatitis A virus (HAV) epidemiology in Tunisia has changed from high to intermediate endemicity in the last decades. However, several outbreaks continue to occur. The last reported sequences from Tunisian HAV strains date back to 2006. In order to provide an updated overview of the strains currently circulating in Tunisia, a large-scale molecular analysis of samples from hepatitis A cases was performed, the first in Tunisia. Biological samples were collected from patients with laboratory confirmed hepatitis A: 145 serum samples in Tunis, Monastir, Sousse and Kairouan from 2008 to 2013 and 45 stool samples in Mahdia in 2009. HAV isolates were characterised by nested RT-PCR (VP1/2A region) and sequencing. The sequences finally obtained from 81 samples showed 78 genotype IA and 3 genotype IB isolates. A Tunisian genotype IA sequence dataset, including both the 78 newly obtained IA sequences and 51 sequences retrieved from GenBank, was used for phylogenetic investigation, including analysis of migration pattern among six towns. Virus gene flow from Sfax and Monastir was directed to all other towns; in contrast, the gene flows from Sousse, Tunis, Mahdia and Kairouan were directed to three, two, one and no towns, respectively. Several different HAV strains co-circulate in Tunisia, but the predominant genotype continues to be IA (78/81, 96% of isolates). A complex gene flow (migration) of HAV genotype IA was observed, with Sfax and Monastir showing gene flows to all other investigated towns. This approach, coupled with wider sampling, can prove useful to investigate the factors underlying the spread of HAV in Tunisia and, thus, to implement appropriate preventive measures.

  19. Analysis of Claviceps africana and C. sorghi from India using AFLPs, EF-1alpha gene intron 4, and beta-tubulin gene intron 3.

    PubMed

    Tooley, Paul W; Bandyopadhyay, Ranajit; Carras, Marie M; Pazoutová, Sylvie

    2006-04-01

    Isolates of Claviceps causing ergot on sorghum in India were analysed by AFLP analysis, and by analysis of DNA sequences of the EF-1alpha gene intron 4 and beta-tubulin gene intron 3 region. Of 89 isolates assayed from six states in India, four were determined to be C. sorghi, and the rest C. africana. A relatively low level of genetic diversity was observed within the Indian C. africana population. No evidence of genetic exchange between C. africana and C. sorghi was observed in either AFLP or DNA sequence analysis. Phylogenetic analysis was conducted using DNA sequences from 14 different Claviceps species. A multigene phylogeny based on the EF-1alpha gene intron 4, the beta-tubulin gene intron 3 region, and rDNA showed that C. sorghi grouped most closely with C. gigantea and C. africana. Although the Claviceps species we analysed were closely related, they colonize hosts that are taxonomically very distinct suggesting that there is no direct coevolution of Claviceps with its hosts.

  20. Genetic analysis of PAX3 for diagnosis of Waardenburg syndrome type I.

    PubMed

    Matsunaga, Tatsuo; Mutai, Hideki; Namba, Kazunori; Morita, Noriko; Masuda, Sawako

    2013-04-01

    PAX3 genetic analysis increased the diagnostic accuracy for Waardenburg syndrome type I (WS1). Analysis of the three-dimensional (3D) structure of PAX3 helped verify the pathogenicity of a missense mutation, and multiplex ligation-dependent probe amplification (MLPA) analysis of PAX3 increased the sensitivity of genetic diagnosis in patients with WS1. Clinical diagnosis of WS1 is often difficult in individual patients with isolated, mild, or non-specific symptoms. The objective of the present study was to facilitate the accurate diagnosis of WS1 through genetic analysis of PAX3 and to expand the spectrum of known PAX3 mutations. In two Japanese families with WS1, we conducted a clinical evaluation of symptoms and genetic analysis, which involved direct sequencing, MLPA analysis, quantitative PCR of PAX3, and analysis of the predicted 3D structure of PAX3. The normal-hearing control group comprised 92 subjects who had normal hearing according to pure tone audiometry. In one family, direct sequencing of PAX3 identified a heterozygous mutation, p.I59F. Analysis of PAX3 3D structures indicated that this mutation distorted the DNA-binding site of PAX3. In the other family, MLPA analysis and subsequent quantitative PCR detected a large, heterozygous deletion spanning 1759-2554 kb that eliminated 12-18 genes including the whole PAX3 gene.

  1. Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs.

    PubMed

    Sanders, Ashley D; Falconer, Ester; Hills, Mark; Spierings, Diana C J; Lansdorp, Peter M

    2017-06-01

    The ability to distinguish between genome sequences of homologous chromosomes in single cells is important for studies of copy-neutral genomic rearrangements (such as inversions and translocations), building chromosome-length haplotypes, refining genome assemblies, mapping sister chromatid exchange events and exploring cellular heterogeneity. Strand-seq is a single-cell sequencing technology that resolves the individual homologs within a cell by restricting sequence analysis to the DNA template strands used during DNA replication. This protocol, which takes up to 4 d to complete, relies on the directionality of DNA, in which each single strand of a DNA molecule is distinguished based on its 5'-3' orientation. Culturing cells in a thymidine analog for one round of cell division labels nascent DNA strands, allowing for their selective removal during genomic library construction. To preserve directionality of template strands, genomic preamplification is bypassed and labeled nascent strands are nicked and not amplified during library preparation. Each single-cell library is multiplexed for pooling and sequencing, and the resulting sequence data are aligned, mapping to either the minus or plus strand of the reference genome, to assign template strand states for each chromosome in the cell. The major adaptations to conventional single-cell sequencing protocols include harvesting of daughter cells after a single round of BrdU incorporation, bypassing of whole-genome amplification, and removal of the BrdU + strand during Strand-seq library preparation. By sequencing just template strands, the structure and identity of each homolog are preserved.
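
    The final step described above, assigning a template strand state to each chromosome from the orientation of aligned reads, can be sketched as a simple counting rule. This is an illustrative simplification rather than the published pipeline: the input format, the state labels, and the `min_reads` and `pure` thresholds are all assumptions.

```python
from collections import defaultdict


def strand_states(alignments, min_reads=20, pure=0.8):
    """Classify each chromosome by the strand its reads map to.

    alignments: iterable of (chromosome, strand) pairs, strand is "+" or "-".
    Returns a dict mapping chromosome to one of "plus-only", "minus-only",
    "mixed", or "undetermined" (too few reads to call).
    """
    counts = defaultdict(lambda: [0, 0])  # chromosome -> [plus, minus]
    for chrom, strand in alignments:
        if strand == "+":
            counts[chrom][0] += 1
        elif strand == "-":
            counts[chrom][1] += 1

    states = {}
    for chrom, (plus, minus) in counts.items():
        total = plus + minus
        if total < min_reads:
            states[chrom] = "undetermined"
            continue
        plus_fraction = plus / total
        if plus_fraction >= pure:
            states[chrom] = "plus-only"    # both homologs inherited as plus templates
        elif plus_fraction <= 1 - pure:
            states[chrom] = "minus-only"   # both homologs inherited as minus templates
        else:
            states[chrom] = "mixed"        # one template strand of each orientation
    return states


if __name__ == "__main__":
    reads = [("chr1", "+")] * 95 + [("chr1", "-")] * 5 + [("chr2", "+")] * 50 + [("chr2", "-")] * 50
    print(strand_states(reads))
```

    A whole-chromosome call like this ignores strand switches within a chromosome, such as those caused by sister chromatid exchange; detecting those would require binning reads along each chromosome.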

  2. Direct Detection of Rifampin and Isoniazid Resistance in Sputum Samples from Tuberculosis Patients by High-Resolution Melt Curve Analysis

    PubMed Central

    Anthwal, Divya; Gupta, Rakesh Kumar; Bhalla, Manpreet; Bhatnagar, Shinjini

    2017-01-01

    Drug-resistant tuberculosis (TB) is a major threat to TB control worldwide. Globally, only 40% of the 340,000 notified TB patients estimated to have multidrug-resistant-TB (MDR-TB) were detected in 2015. This study was carried out to evaluate the utility of high-resolution melt curve analysis (HRM) for the rapid and direct detection of MDR-TB in Mycobacterium tuberculosis in sputum samples. A reference plasmid library was first generated of the most frequently observed mutations in the resistance-determining regions of rpoB, katG, and an inhA promoter and used as positive controls in HRM. The assay was first validated in 25 MDR M. tuberculosis clinical isolates. The assay was evaluated on DNA isolated from 99 M. tuberculosis culture-positive sputum samples that included 84 smear-negative sputum samples, using DNA sequencing as gold standard. Mutants were discriminated from the wild type by comparing melting-curve patterns with those of control plasmids using HRM software. Rifampin (RIF) and isoniazid (INH) monoresistance were detected in 11 and 21 specimens, respectively, by HRM. Six samples were classified as MDR-TB by sequencing, one of which was missed by HRM. The HRM-RIF, INH-katG, and INH-inhA assays had 89% (95% confidence interval [CI], 52, 100%), 85% (95% CI, 62, 97%), and 100% (95% CI, 74, 100%) sensitivity, respectively, in smear-negative samples, while all assays had 100% sensitivity in smear-positive samples. All assays had 100% specificity. Concordance of 97% to 100% (κ value, 0.9 to 1) was noted between sequencing and HRM. Heteroresistance was observed in 5 of 99 samples by sequencing. In conclusion, the HRM assay was a cost-effective (Indian rupee [INR]400/US$6), rapid, and closed-tube method for the direct detection of MDR-TB in sputum, especially for direct smear-negative cases. PMID:28330890

  3. RNA-Seq Alignment to Individualized Genomes Improves Transcript Abundance Estimates in Multiparent Populations

    PubMed Central

    Munger, Steven C.; Raghupathy, Narayanan; Choi, Kwangbom; Simons, Allen K.; Gatti, Daniel M.; Hinerfeld, Douglas A.; Svenson, Karen L.; Keller, Mark P.; Attie, Alan D.; Hibbs, Matthew A.; Graber, Joel H.; Chesler, Elissa J.; Churchill, Gary A.

    2014-01-01

    Massively parallel RNA sequencing (RNA-seq) has yielded a wealth of new insights into transcriptional regulation. A first step in the analysis of RNA-seq data is the alignment of short sequence reads to a common reference genome or transcriptome. Genetic variants that distinguish individual genomes from the reference sequence can cause reads to be misaligned, resulting in biased estimates of transcript abundance. Fine-tuning of read alignment algorithms does not correct this problem. We have developed Seqnature software to construct individualized diploid genomes and transcriptomes for multiparent populations and have implemented a complete analysis pipeline that incorporates other existing software tools. We demonstrate in simulated and real data sets that alignment to individualized transcriptomes increases read mapping accuracy, improves estimation of transcript abundance, and enables the direct estimation of allele-specific expression. Moreover, when applied to expression QTL mapping we find that our individualized alignment strategy corrects false-positive linkage signals and unmasks hidden associations. We recommend the use of individualized diploid genomes over reference sequence alignment for all applications of high-throughput sequencing technology in genetically diverse populations. PMID:25236449

  4. SAMSA2: a standalone metatranscriptome analysis pipeline.

    PubMed

    Westreich, Samuel T; Treiber, Michelle L; Mills, David A; Korf, Ian; Lemay, Danielle G

    2018-05-21

    Complex microbial communities are an area of growing interest in biology. Metatranscriptomics allows researchers to quantify microbial gene expression in an environmental sample via high-throughput sequencing. Metatranscriptomic experiments are computationally intensive because the experiments generate a large volume of sequence data and each sequence must be compared with reference sequences from thousands of organisms. SAMSA2 is an upgrade to the original Simple Annotation of Metatranscriptomes by Sequence Analysis (SAMSA) pipeline that has been redesigned for standalone use on a supercomputing cluster. SAMSA2 is faster due to the use of the DIAMOND aligner, and more flexible and reproducible because it uses local databases. SAMSA2 is available with detailed documentation, and example input and output files along with examples of master scripts for full pipeline execution. SAMSA2 is a rapid and efficient metatranscriptome pipeline for analyzing large RNA-seq datasets in a supercomputing cluster environment. SAMSA2 provides simplified output that can be examined directly or used for further analyses, and its reference databases may be upgraded, altered or customized to fit the needs of any experiment.

  5. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    PubMed

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  6. DNA-encoded chemistry: enabling the deeper sampling of chemical space.

    PubMed

    Goodnow, Robert A; Dumelin, Christoph E; Keefe, Anthony D

    2017-02-01

    DNA-encoded chemical library technologies are increasingly being adopted in drug discovery for hit and lead generation. DNA-encoded chemistry enables the exploration of chemical spaces four to five orders of magnitude more deeply than is achievable by traditional high-throughput screening methods. Operation of this technology requires developing a range of capabilities including aqueous synthetic chemistry, building block acquisition, oligonucleotide conjugation, large-scale molecular biological transformations, selection methodologies, PCR, sequencing, sequence data analysis and the analysis of large chemistry spaces. This Review provides an overview of the development and applications of DNA-encoded chemistry, highlighting the challenges and future directions for the use of this technology.

  7. GeneChip™ screening assay for cystic fibrosis mutations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cronn, M.T.; Miyada, C.G.; Fucini, R.V.

    1994-09-01

    GeneChip™ assays are based on high density, carefully designed arrays of short oligonucleotide probes (13-16 bases) built directly on derivatized silica substrates. DNA target sequence analysis is achieved by hybridizing fluorescently labeled amplification products to these arrays. Fluorescent hybridization signals located within the probe array are translated into target sequence information using the known probe sequence at each array feature. The mutation screening assay for cystic fibrosis includes sets of oligonucleotide probes designed to detect numerous different mutations that have been described in 14 exons and one intron of the CFTR gene. Each mutation site is addressed by a sub-array of at least 40 probe sequences, half designed to detect the wild type gene sequence and half designed to detect the reported mutant sequence. Hybridization with homozygous mutant, homozygous wild type or heterozygous targets results in distinctive hybridization patterns within a sub-array, permitting specific discrimination of each mutation. The GeneChip probe arrays are very small (approximately 1 cm²). Their miniature size coupled with their high information content makes GeneChip probe arrays a useful and practical means for providing CF mutation analysis in a clinical setting.

  8. Reading biological processes from nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Murugan, Anand

    Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical mechanisms.

  9. Persistency of rupture directivity in moderate-magnitude earthquakes in Italy: Implications for seismic hazard

    NASA Astrophysics Data System (ADS)

    Rovelli, A.; Calderoni, G.

    2012-12-01

    A simple method based on empirical Green's function (EGF) deconvolution in the frequency domain is applied to detect the occurrence of unilateral ruptures in recent damaging earthquakes in Italy. The spectral ratio between event pairs with different magnitudes at individual stations shows large azimuthal variations above the corner frequency when the target event is affected by source directivity and the EGF is not, or vice versa. The analysis is applied to seismograms and accelerograms recorded during the seismic sequences following the 20 May 2012, Mw 5.6 main shock in Emilia, northern Italy, the 6 April 2009, Mw 6.1 earthquake of L'Aquila, central Italy, and the 26 September 1997, Mw 5.7 and 6.0 shocks in Umbria-Marche, central Italy. Events of each seismic sequence are selected as having consistent focal mechanisms, and the station selection obeys the constraint of a similar source-to-receiver path for the event pairs. The analyzed data set of L'Aquila consists of 962 broad-band seismograms relative to 69 normal-faulting earthquakes (3.3 ≤ Mw ≤ 6.1, according to Herrmann et al., 2011); stations are selected in the distance range 100 to 250 km to minimize differences in propagation paths. The seismogram analysis reveals that a strong along-strike (toward SE) source directivity characterized all three of the Mw > 5.0 shocks. Source directivity was also persistent down to the smallest magnitudes: 65% of the earthquakes under study showed evidence of directivity toward SE, whereas only one (Mw 3.7) event showed directivity in the opposite direction. The Mw 5.6 main shock of 20 May 2012 in Emilia also results in large azimuthal spectral variations indicating unilateral rupture propagation toward SE. According to the reconstructed geometry of the thrust-fault plane, the inferred directivity direction suggests top-down rupture propagation. The analysis over the Emilia aftershock sequence is in progress.
The third seismic sequence, dated 1997-1998, occurred in the northern Apennines and, similarly to the L'Aquila faults, was characterized by normal-faulting earthquakes with strike substantially parallel to the Apennine trend. Although the data are not as abundant as for the most recent earthquakes, the available data were already the object of previous studies indicating unilateral rupture propagation in several of the strongest (5.5 < Mw < 6.0) shocks. We show that the effect of directivity is particularly significant in intermontane basins, where long-period (T > 1 sec) ground motions are amplified by soft sediments and the combination of local amplification with source directivity causes spectral ordinates at those periods to exceed, by more than 2 standard deviations, the expected values of commonly used GMPEs for soft sites. These results raise a concern in terms of seismic hazard because source directivity is found to be a recurrent feature in the Apennines. Moreover, the predominant fault strike and intermontane basins are both aligned along the Apennine chain, offering a condition potentially favorable to extra amplification at periods relevant to seismic risk.

  10. A DNA sequence element that advances replication origin activation time in Saccharomyces cerevisiae.

    PubMed

    Pohl, Thomas J; Kolor, Katherine; Fangman, Walton L; Brewer, Bonita J; Raghuraman, M K

    2013-11-06

    Eukaryotic origins of DNA replication undergo activation at various times in S-phase, allowing the genome to be duplicated in a temporally staggered fashion. In the budding yeast Saccharomyces cerevisiae, the activation times of individual origins are not intrinsic to those origins but are instead governed by surrounding sequences. Currently, there are two examples of DNA sequences that are known to advance origin activation time, centromeres and forkhead transcription factor binding sites. By combining deletion and linker scanning mutational analysis with two-dimensional gel electrophoresis to measure fork direction in the context of a two-origin plasmid, we have identified and characterized a 19- to 23-bp and a larger 584-bp DNA sequence that are capable of advancing origin activation time.

  11. Distinctive archaebacterial species associated with anaerobic rumen protozoan Entodinium caudatum.

    PubMed

    Tóthová, T; Piknová, M; Kisidayová, S; Javorský, P; Pristas, P

    2008-01-01

    The diversity of archaebacteria associated with the anaerobic rumen protozoan Entodinium caudatum in long-term in vitro culture was investigated by denaturing gradient gel electrophoresis (DGGE) analysis of the hypervariable V3 region of the archaebacterial 16S rRNA gene. PCR was performed directly on DNA extracted from a single protozoal cell and on total community genomic DNA, and the resulting fingerprints were compared. The analysis indicated the presence of a solitary intense band in Entodinium caudatum single-cell DNA that had no counterpart in the profile from total DNA. The identity of the archaebacterium represented by this band was determined by sequence analysis, which showed that the sequence fell into the cluster of ciliate symbiotic methanogens recently identified by a 16S gene library approach.

  12. Primary structures of ribosomal proteins from the archaebacterium Halobacterium marismortui and the eubacterium Bacillus stearothermophilus.

    PubMed

    Arndt, E; Scholzen, T; Krömer, W; Hatakeyama, T; Kimura, M

    1991-06-01

    Approximately 40 ribosomal proteins from each of Halobacterium marismortui and Bacillus stearothermophilus have been sequenced either by direct protein sequence analysis or by DNA sequence analysis of the appropriate genes. The comparison of the amino acid sequences from the archaebacterium H. marismortui with the available ribosomal proteins from the eubacterial and eukaryotic kingdoms revealed four different groups of proteins: 24 proteins are related to both eubacterial and eukaryotic proteins. Eleven proteins are exclusively related to eukaryotic counterparts. For three proteins only eubacterial relatives, and for another three proteins no counterpart, could be found. The similarities of the halobacterial ribosomal proteins are in general somewhat higher to their eukaryotic than to their eubacterial counterparts. The comparison of B. stearothermophilus proteins with their E. coli homologues showed that the proteins evolved at different rates. Some proteins are highly conserved with 64-76% identity, others are poorly conserved with only 25-34% identical amino acid residues.

  13. Novel, non-symbiotic isolates of Neorhizobium from a dryland agricultural soil.

    PubMed

    Soenens, Amalia; Imperial, Juan

    2018-01-01

    Semi-selective enrichment, followed by PCR screening, resulted in the successful direct isolation of fast-growing Rhizobia from a dryland agricultural soil. Over 50% of these isolates belong to the genus Neorhizobium, as concluded from partial rpoB and near-complete 16S rDNA sequence analysis. Further genotypic and genomic analysis of five representative isolates confirmed that they form a coherent group within Neorhizobium, closer to N. galegae than to the remaining Neorhizobium species, but clearly differentiated from the former, and constituting at least one new genomospecies within Neorhizobium. All the isolates lacked nod and nif symbiotic genes but contained a repABC replication/maintenance region, characteristic of rhizobial plasmids, within large contigs from their draft genome sequences. These repABC sequences were related, but not identical, to repABC sequences found in symbiotic plasmids from N. galegae, suggesting that the non-symbiotic isolates have the potential to harbor symbiotic plasmids. This is the first report of non-symbiotic members of Neorhizobium from soil.

  14. Comparison of direct boiling method with commercial kits for extracting fecal microbiome DNA by Illumina sequencing of 16S rRNA tags.

    PubMed

    Peng, Xin; Yu, Ke-Qiang; Deng, Guan-Hua; Jiang, Yun-Xia; Wang, Yu; Zhang, Guo-Xia; Zhou, Hong-Wei

    2013-12-01

    Low cost and high throughput capacity are major advantages of using next generation sequencing (NGS) techniques to determine metagenomic 16S rRNA tag sequences. These methods have significantly changed our view of microorganisms in the fields of human health and environmental science. However, DNA extraction using commercial kits has shortcomings of high cost and time constraint. In the present study, we evaluated the determination of fecal microbiomes using a direct boiling method compared with 5 different commercial extraction methods, e.g., Qiagen and MO BIO kits. Principal coordinate analysis (PCoA) using UniFrac distances and clustering showed that direct boiling of a wide range of feces concentrations gave a similar pattern of bacterial communities as those obtained from most of the commercial kits, with the exception of the MO BIO method. Fecal concentration by boiling method affected the estimation of α-diversity indices, otherwise results were generally comparable between boiling and commercial methods. The operational taxonomic units (OTUs) determined through direct boiling showed highly consistent frequencies with those determined through most of the commercial methods. Even those for the MO BIO kit were also obtained by the direct boiling method with high confidence. The present study suggested that direct boiling could be used to determine the fecal microbiome and using this method would significantly reduce the cost and improve the efficiency of the sample preparation for studying gut microbiome diversity. © 2013 Elsevier B.V. All rights reserved.

  15. The LAM-PCR Method to Sequence LV Integration Sites.

    PubMed

    Wang, Wei; Bartholomae, Cynthia C; Gabriel, Richard; Deichmann, Annette; Schmidt, Manfred

    2016-01-01

    Integrating viral gene transfer vectors are commonly used gene delivery tools in clinical gene therapy trials providing stable integration and continuous gene expression of the transgene in the treated host cell. However, integration of the reverse-transcribed vector DNA into the host genome is a potentially mutagenic event that may directly contribute to unwanted side effects. A comprehensive and accurate analysis of the integration site (IS) repertoire is indispensable to study clonality in transduced cells obtained from patients undergoing gene therapy and to identify potential in vivo selection of affected cell clones. To date, next-generation sequencing (NGS) of vector-genome junctions allows sophisticated studies on the integration repertoire in vitro and in vivo. We have explored the use of the Illumina MiSeq Personal Sequencer platform to sequence vector ISs amplified by non-restrictive linear amplification-mediated PCR (nrLAM-PCR) and LAM-PCR. MiSeq-based high-quality IS sequence retrieval is accomplished by the introduction of a double-barcode strategy that substantially minimizes the frequency of IS sequence collisions compared to the conventionally used single-barcode protocol. Here, we present an updated protocol of (nr)LAM-PCR for the analysis of lentiviral IS using a double-barcode system and followed by deep sequencing using the MiSeq device.

  16. Severe chronic osteomyelitis caused by Morganella morganii with high population diversity.

    PubMed

    Zhu, Jialiang; Li, Haifeng; Feng, Li; Yang, Min; Yang, Ronggong; Yang, Lin; Li, Li; Li, Ruoyan; Liu, Minshan; Hou, Shuxun; Ke, Yuehua; Li, Wenfeng; Bai, Fan

    2016-09-01

    A case of chronic osteomyelitis probably caused by Morganella morganii, occurring over a period of 30 years, is reported. The organism was identified through a combination of sample culture, direct sequencing, and 16S rRNA gene amplicon sequencing. Further whole-genome sequencing and population structure analysis of the isolates from the patient showed the bacterial population to be highly diverse. This case provides a valuable example of a long-term infection caused by an opportunistic pathogen, M. morganii, with high diversity, which might evolve during replication within the host. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  17. Human Promoters Are Intrinsically Directional

    PubMed Central

    Duttke, Sascha H.C.; Lacadie, Scott A.; Ibrahim, Mahmoud M.; Glass, Christopher K.; Corcoran, David L.; Benner, Christopher; Heinz, Sven; Kadonaga, James T.; Ohler, Uwe

    2015-01-01

    Divergent transcription, in which reverse-oriented transcripts occur upstream of eukaryotic promoters in regions devoid of annotated genes, has been suggested to be a general property of active promoters. Here we show that the human basal RNA polymerase II transcriptional machinery and core promoter are inherently unidirectional, and that reverse-oriented transcripts originate from their own cognate reverse-directed core promoters. In vitro transcription analysis and mapping of nascent transcripts in cells revealed that sequences at reverse start sites are similar to those of their forward counterparts. The use of DNase I accessibility to define proximal promoter borders revealed that up to half of promoters are unidirectional and that unidirectional promoters are depleted at their upstream edges of reverse core promoter sequences and their associated chromatin features. Divergent transcription is thus not an inherent property of the transcription process, but rather the consequence of the presence of both forward- and reverse-directed core promoters. PMID:25639469

  18. Comparison of Immunohistochemistry and Direct Sanger Sequencing for Detection of the BRAFV600E Mutation in Thyroid Neoplasm

    PubMed Central

    Oh, Hye-Seon; Kwon, Hyemi; Park, Suyeon; Kim, Mijin; Jeon, Min Ji; Kim, Tae Yong; Shong, Young Kee; Kim, Won Bae; Choi, Jene

    2018-01-01

    Background: The BRAFV600E mutation is the most common genetic alteration identified in papillary thyroid carcinoma (PTC). In terms of cost-effectiveness and sensitivity, direct Sanger sequencing has several limitations. The aim of this study was to evaluate the efficiency of immunohistochemistry (IHC) as an alternative method to detect the BRAFV600E mutation in preoperative and postoperative tissue samples. Methods: We evaluated 71 patients who underwent thyroid surgery and had a direct sequencing result for the BRAFV600E mutation. IHC staining for the BRAFV600E mutation was performed in 49 preoperative and 23 postoperative thyroid specimens. Results: Sixty-two patients (87.3%) had PTC, and of these, BRAFV600E was confirmed by direct sequencing in 57 patients (91.9%). In 23 postoperative tissue samples, the BRAFV600E mutation was detected in 16 samples (70%) by direct sequencing and 18 samples (78%) by IHC. In 24 fine needle aspiration (FNA) samples, BRAFV600E was detected in 18 samples (75%) by direct sequencing and 16 samples (67%) by IHC. In 25 core needle biopsy (CNB) samples, the BRAFV600E mutation was detected in 15 samples (60%) by direct sequencing and 16 samples (64%) by IHC. The sensitivity and specificity of IHC for detecting the BRAFV600E mutation were 77.8% and 66.7% in FNA samples and 99.3% and 80.0% in CNB samples. Conclusion: IHC could be an alternative method to direct Sanger sequencing for BRAFV600E mutation detection in both postoperative and preoperative samples. However, application of IHC to detect the BRAFV600E mutation in FNA samples is of limited value compared with direct sequencing. PMID:29388401
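    As a worked check of how sensitivity and specificity are derived when direct sequencing is the reference, the sketch below uses a 2x2 table for the FNA samples. The individual cell counts are a reconstruction, not figures stated in the abstract: they are the only whole-number split consistent with the reported totals (24 samples, 18 positive by sequencing, 16 positive by IHC) and the reported 77.8% and 66.7%. The function name is illustrative.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) of a test against a reference method.

    tp: reference-positive samples the test also calls positive
    fn: reference-positive samples the test misses
    tn: reference-negative samples the test also calls negative
    fp: reference-negative samples the test calls positive
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# FNA samples, reconstructed (assumed) cell counts:
#   18 sequencing-positive = 14 IHC-positive + 4 IHC-negative
#    6 sequencing-negative =  2 IHC-positive + 4 IHC-negative
sens, spec = sensitivity_specificity(tp=14, fn=4, tn=4, fp=2)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
# prints "sensitivity 77.8%, specificity 66.7%"
```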

  19. Polymorphism at codon 36 of the p53 gene.

    PubMed

    Felix, C A; Brown, D L; Mitsudomi, T; Ikagaki, N; Wong, A; Wasserman, R; Womer, R B; Biegel, J A

    1994-01-01

    A polymorphism at codon 36 in exon 4 of the p53 gene was identified by single strand conformation polymorphism (SSCP) analysis and direct sequencing of genomic DNA PCR products. The polymorphic allele, present in the heterozygous state in genomic DNAs of four of 100 individuals (4%), changes the codon 36 CCG to CCA, eliminates a FinI restriction site and creates a BccI site. Including this polymorphism there are four known polymorphisms in the p53 coding sequence.

  20. Multiplex sequence analysis demonstrates the competitive growth advantage of the A-to-G mutants of clarithromycin-resistant Helicobacter pylori.

    PubMed

    Wang, G; Rahman, M S; Humayun, M Z; Taylor, D E

    1999-03-01

    Clarithromycin resistance in Helicobacter pylori is due to point mutation within the 23S rRNA. We examined the growth rates of different types of site-directed mutants and demonstrated quantitatively the competitive growth advantage of A-to-G mutants over other types of mutants by a multiplex sequencing assay. The results provide a rational explanation of why A-to-G mutants are predominantly observed among clarithromycin-resistant clinical isolates.

  1. Multiplex Sequence Analysis Demonstrates the Competitive Growth Advantage of the A-to-G Mutants of Clarithromycin-Resistant Helicobacter pylori

    PubMed Central

    Wang, Ge; Rahman, M. Sayeedur; Humayun, M. Zafri; Taylor, Diane E.

    1999-01-01

    Clarithromycin resistance in Helicobacter pylori is due to point mutation within the 23S rRNA. We examined the growth rates of different types of site-directed mutants and demonstrated quantitatively the competitive growth advantage of A-to-G mutants over other types of mutants by a multiplex sequencing assay. The results provide a rational explanation of why A-to-G mutants are predominantly observed among clarithromycin-resistant clinical isolates. PMID:10049289

  2. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    PubMed Central

    Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C

    2003-01-01

    Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences and direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes, and expression of these genes in testis was confirmed by Northern blotting. The genomic location of each sequence was mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626

  3. Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex.

    PubMed

    Pollen, Alex A; Nowakowski, Tomasz J; Shuga, Joe; Wang, Xiaohui; Leyrat, Anne A; Lui, Jan H; Li, Nianzhen; Szpankowski, Lukasz; Fowler, Brian; Chen, Peilin; Ramalingam, Naveen; Sun, Gang; Thu, Myo; Norris, Michael; Lebofsky, Ronald; Toppani, Dominique; Kemp, Darnell W; Wong, Michael; Clerkson, Barry; Jones, Brittnee N; Wu, Shiquan; Knutsson, Lawrence; Alvarado, Beatriz; Wang, Jing; Weaver, Lesley S; May, Andrew P; Jones, Robert C; Unger, Marc A; Kriegstein, Arnold R; West, Jay A A

    2014-10-01

    Large-scale surveys of single-cell gene expression have the potential to reveal rare cell populations and lineage relationships but require efficient methods for cell capture and mRNA sequencing. Although cellular barcoding strategies allow parallel sequencing of single cells at ultra-low depths, the limitations of shallow sequencing have not been investigated directly. By capturing 301 single cells from 11 populations using microfluidics and analyzing single-cell transcriptomes across downsampled sequencing depths, we demonstrate that shallow single-cell mRNA sequencing (~50,000 reads per cell) is sufficient for unbiased cell-type classification and biomarker identification. In the developing cortex, we identify diverse cell types, including multiple progenitor and neuronal subtypes, and we identify EGR1 and FOS as previously unreported candidate targets of Notch signaling in human but not mouse radial glia. Our strategy establishes an efficient method for unbiased analysis and comparison of cell populations from heterogeneous tissue by microfluidic single-cell capture and low-coverage sequencing of many cells.

  4. Application of a reverse dot blot DNA-DNA hybridization method to quantify host-feeding tendencies of two sibling species in the Anopheles gambiae complex.

    PubMed

    Fritz, M L; Miller, J R; Bayoh, M N; Vulule, J M; Landgraf, J R; Walker, E D

    2013-12-01

    A DNA-DNA hybridization method, reverse dot blot analysis (RDBA), was used to identify Anopheles gambiae s.s. and Anopheles arabiensis (Diptera: Culicidae) hosts. Of 299 blood-fed and semi-gravid An. gambiae s.l. collected from Kisian, Kenya, 244 individuals were identifiable to species; of these, 69.5% were An. arabiensis and 29.5% were An. gambiae s.s. Host identifications with RDBA were comparable with those of conventional polymerase chain reaction (PCR) followed by direct sequencing of amplicons of the vertebrate mitochondrial cytochrome b gene. Of the 174 amplicon-producing samples used to compare these two methods, 147 were identifiable by direct sequencing and 139 of these were identifiable by RDBA. Anopheles arabiensis bloodmeals were mostly (94.6%) bovine in origin, whereas An. gambiae s.s. fed upon humans more than 91.8% of the time. Tests by RDBA detected that two of 112 An. arabiensis contained blood from more than one host species, whereas PCR and direct sequencing did not. Recent use of insecticide-treated bednets in Kisian is likely to have caused the shift in the dominant vector species from An. gambiae s.s. to An. arabiensis. Reverse dot blot analysis provides an opportunity to study changes in host-feeding by members of the An. gambiae complex in response to the broadening distribution of vector control measures targeting host-selection behaviours. © 2013 The Royal Entomological Society.

  5. UGT1A1 (TA)n genotyping in sickle-cell disease: high resolution melting (HRM) curve analysis or direct sequencing, what is the best way?

    PubMed

    Thomas, Vincent; Mazard, Blandine; Garcia, Caroline; Lacan, Philippe; Gagnieu, Marie-Claude; Joly, Philippe

    2013-09-23

    In 2010, Minucci et al. proposed a rapid, simple and cost-effective HRM method on the LightCycler 480® apparatus (Roche) for the determination of the 6/6, 6/7 and 7/7 genotypes of the (TA)n UGT1A1 promoter polymorphism. However, they did not study the n=5 and n=8 alleles, which can be quite frequent in sickle-cell disease (SCD) patients. The aim of our study was to test this HRM protocol on all 10 possible (TA)n UGT1A1 genotypes (i.e. 5/5, 5/6, 5/7, 5/8, 6/6, 6/7, 6/8, 7/7, 7/8 and 8/8) using our cohort of SCD patients. All genotypes could be unambiguously identified except 6/7 and 6/8, which give a similar HRM profile. Differentiating those two genotypes requires either direct Sanger sequencing or a second PCR protocol followed by migration on a 3% agarose gel. For (TA)n UGT1A1 promoter genotyping of African patients, each lab must decide between (i) direct Sanger sequencing of all patients and (ii) the HRM protocol for all patients followed by a complementary analysis to differentiate the 6/7 and 6/8 genotypes. © 2013. Published by Elsevier B.V. All rights reserved.
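    The count of 10 genotypes follows from choosing an unordered pair, with repetition, from the four alleles n = 5, 6, 7 and 8. The short sketch below enumerates them and flags the two that, according to the abstract, share an HRM profile and therefore need a follow-up test; the variable and function names are illustrative only.

```python
from itertools import combinations_with_replacement

# (TA)n repeat numbers considered in the study.
alleles = [5, 6, 7, 8]

# Unordered allele pairs with repetition: C(4 + 2 - 1, 2) = 10 genotypes.
genotypes = [f"{a}/{b}" for a, b in combinations_with_replacement(alleles, 2)]

# Genotypes reported to give a similar HRM profile.
ambiguous_by_hrm = {"6/7", "6/8"}


def needs_follow_up(genotype):
    """True if HRM alone cannot resolve this genotype."""
    return genotype in ambiguous_by_hrm


print(len(genotypes), genotypes)
print([g for g in genotypes if needs_follow_up(g)])
```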

  6. High-resolution melt PCR analysis for genotyping of Ureaplasma parvum isolates directly from clinical samples.

    PubMed

    Payne, Matthew S; Tabone, Tania; Kemp, Matthew W; Keelan, Jeffrey A; Spiller, O Brad; Newnham, John P

    2014-02-01

    Ureaplasma sp. infection in neonates and adults underlies a variety of disease pathologies. Of the two human Ureaplasma spp., Ureaplasma parvum is clinically the most common. We have developed a high-resolution melt (HRM) PCR assay for the differentiation of the four serovars of U. parvum in a single step. Currently U. parvum strains are separated into four serovars by sequencing the promoter and coding region of the multiple-banded antigen (MBA) gene. We designed primers to conserved sequences within this region for PCR amplification and HRM analysis to generate reproducible and distinct melt profiles that distinguish clonal representatives of serovars 1, 3, 6, and 14. Furthermore, our HRM PCR assay could classify DNA extracted from 74 known (MBA-sequenced) test strains with 100% accuracy. Importantly, HRM PCR was also able to identify U. parvum serovars directly from 16 clinical swabs. HRM PCR performed with DNA consisting of mixtures of combined known serovars yielded profiles that were easily distinguished from those for single-serovar controls. These profiles mirrored clinical samples that contained mixed serovars. Unfortunately, melt curve analysis software is not yet robust enough to identify the composition of mixed serovar samples, only that more than one serovar is present. HRM PCR provides a single-step, rapid, cost-effective means to differentiate the four serovars of U. parvum that did not amplify any of the known 10 serovars of Ureaplasma urealyticum tested in parallel. Choice of reaction reagents was found to be crucial to allow sufficient sensitivity to differentiate U. parvum serovars directly from clinical swabs rather than requiring cell enrichment using microbial culture techniques.

  7. Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

    PubMed Central

    Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

    2015-01-01

    Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486

  8. DNA sequence-level analyses reveal potential phenotypic modifiers in a large family with psychiatric disorders.

    PubMed

    Ryan, Niamh M; Lihm, Jayon; Kramer, Melissa; McCarthy, Shane; Morris, Stewart W; Arnau-Soler, Aleix; Davies, Gail; Duff, Barbara; Ghiban, Elena; Hayward, Caroline; Deary, Ian J; Blackwood, Douglas H R; Lawrie, Stephen M; McIntosh, Andrew M; Evans, Kathryn L; Porteous, David J; McCombie, W Richard; Thomson, Pippa A

    2018-06-07

    Psychiatric disorders are a group of genetically related diseases with highly polygenic architectures. Genome-wide association analyses have made substantial progress towards understanding the genetic architecture of these disorders. More recently, exome- and whole-genome sequencing of cases and families have identified rare, high penetrant variants that provide direct functional insight. There remains, however, a gap in the heritability explained by these complementary approaches. To understand how multiple genetic variants combine to modify both severity and penetrance of a highly penetrant variant, we sequenced 48 whole genomes from a family with a high loading of psychiatric disorder linked to a balanced chromosomal translocation. The (1;11)(q42;q14.3) translocation directly disrupts three genes: DISC1, DISC2, DISC1FP and has been linked to multiple brain imaging and neurocognitive outcomes in the family. Using DNA sequence-level linkage analysis, functional annotation and population-based association, we identified common and rare variants in GRM5 (minor allele frequency (MAF) > 0.05), PDE4D (MAF > 0.2) and CNTN5 (MAF < 0.01) that may help explain the individual differences in phenotypic expression in the family. We suggest that whole-genome sequencing in large families will improve the understanding of the combined effects of the rare and common sequence variation underlying psychiatric phenotypes.

  9. Diverse tulasnelloid fungi form mycorrhizas with epiphytic orchids in an Andean cloud forest.

    PubMed

    Suárez, Juan Pablo; Weiss, Michael; Abele, Andrea; Garnica, Sigisfredo; Oberwinkler, Franz; Kottke, Ingrid

    2006-11-01

    The mycorrhizal state of epiphytic orchids has been controversially discussed, and the state and mycobionts of the pleurothallid orchids, occurring abundantly and with a high number of species on stems of trees in the Andean cloud forest, were unknown. Root samples of 77 adult individuals of the epiphytic orchids Stelis hallii, S. superbiens, S. concinna and Pleurothallis lilijae were collected in a tropical mountain rainforest of southern Ecuador. Ultrastructural evidence of symbiotic interaction was combined with molecular sequencing of fungi directly from the mycorrhizas and isolation of mycobionts. Ultrastructural analyses displayed vital orchid mycorrhizas formed by fungi with an imperforate parenthesome and cell wall slime bodies typical for the genus Tulasnella. Three different Tulasnella isolates were obtained in pure culture. Phylogenetic analysis of nuclear rDNA sequences from coding regions of the ribosomal large subunit (nucLSU) and the 5.8S subunit, including parts of the internal transcribed spacers, obtained directly from the roots and from the fungal isolates, yielded seven distinct Tulasnella clades. Tulasnella mycobionts in Stelis concinna were restricted to two Tulasnella sequence types while the other orchids were associated with up to six Tulasnella sequence types. All Tulasnella sequences are new to science and distinct from known sequences of mycobionts of terrestrial orchids. The results indicate that tulasnelloid fungi, adapted to the conditions on tree stems, might be important for orchid growth and maintenance in the Andean cloud forest.

  10. Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

    PubMed

    Truong, Kevin; Ikura, Mitsuhiko

    2003-05-06

    Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
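
    The relational idea in this abstract can be illustrated with a small self-join. The sketch below is not the authors' query: the `hits` table, its columns, and the protein and domain identifiers are hypothetical, and it assumes one row per (protein, species, domain) prediction. It reports pairs of proteins in a target species whose separate domains occur together in a single protein of another species.

```python
import sqlite3


def find_domain_fusions(rows, target_species):
    """Return (protein_a, protein_b, fused_protein) triples.

    rows: iterable of (protein, species, domain) tuples (hypothetical schema).
    A triple is reported when protein_a and protein_b belong to target_species,
    carry different domains, and both domains co-occur in one protein of
    another species.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE hits (protein TEXT, species TEXT, domain TEXT)")
    con.executemany("INSERT INTO hits VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT a.protein, b.protein, c1.protein
        FROM hits AS c1
        JOIN hits AS c2 ON c1.protein = c2.protein AND c1.domain < c2.domain
        JOIN hits AS a  ON a.domain = c1.domain
        JOIN hits AS b  ON b.domain = c2.domain
        WHERE a.species = :sp AND b.species = :sp
          AND a.protein <> b.protein
          AND c1.species <> :sp
          -- require that the two partners do not already share the domains
          AND NOT EXISTS (SELECT 1 FROM hits x
                          WHERE x.protein = a.protein AND x.domain = c2.domain)
          AND NOT EXISTS (SELECT 1 FROM hits y
                          WHERE y.protein = b.protein AND y.domain = c1.domain)
        ORDER BY 1, 2, 3
    """
    result = con.execute(query, {"sp": target_species}).fetchall()
    con.close()
    return result


# Toy data: A1 and B1 are separate in "yeast"; C1 fuses both domains elsewhere.
example_rows = [
    ("A1", "yeast", "DOM_X"),
    ("B1", "yeast", "DOM_Y"),
    ("C1", "ecoli", "DOM_X"),
    ("C1", "ecoli", "DOM_Y"),
    ("D1", "yeast", "DOM_Z"),
]
print(find_domain_fusions(example_rows, "yeast"))
```

    Because the whole analysis is one declarative query, rerunning it after the underlying tables are refreshed is enough to update the predictions, which is the property the abstract emphasizes.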

  11. Classification of viral zoonosis through receptor pattern analysis.

    PubMed

    Bae, Se-Eun; Son, Hyeon Seok

    2011-04-13

    Viral zoonosis, the transmission of a virus from its primary vertebrate reservoir species to humans, requires ubiquitous cellular proteins known as receptor proteins. Zoonosis can occur not only through direct transmission from vertebrates to humans, but also through intermediate reservoirs or other environmental factors. Viruses can be categorized according to genotype (ssDNA, dsDNA, ssRNA and dsRNA viruses). Among them, the RNA viruses exhibit particularly high mutation rates and are especially problematic for this reason. Most zoonotic viruses are RNA viruses that change their envelope proteins to facilitate binding to various receptors of host species. In this study, we sought to predict zoonotic propensity through the analysis of receptor characteristics. We hypothesized that the major barrier to interspecies virus transmission is that receptor sequences vary among species--in other words, that the specific amino acid sequence of the receptor determines the ability of the viral envelope protein to attach to the cell. We analysed host-cell receptor sequences for their hydrophobicity/hydrophilicity characteristics. We then analysed these properties for similarities among receptors of different species and used a statistical discriminant analysis to predict the likelihood of transmission among species. This study is an attempt to predict zoonosis through simple computational analysis of receptor sequence differences. Our method may be useful in predicting the zoonotic potential of newly discovered viral strains.
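
    As a rough illustration of the receptor-profile step, the sketch below computes a sliding-window hydropathy profile and a simple distance between two profiles. It assumes the Kyte-Doolittle scale, a window of five residues, and a mean absolute difference as the similarity measure; the abstract does not state which scale, window, or discriminant procedure the authors used, so all three are illustrative choices.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in the abstract).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=5):
    """Mean hydropathy of each window of `window` residues along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def profile_distance(profile_a, profile_b):
    """Mean absolute difference over the positions the two profiles share."""
    shared = min(len(profile_a), len(profile_b))
    if shared == 0:
        raise ValueError("profiles must each contain at least one window")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b)) / shared
```

    Distances of this kind, computed between receptors of different host species, could serve as input features for a discriminant analysis such as the one the abstract mentions.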

  12. Single-Cell Genomic Analysis in Plants

    PubMed Central

    Hu, Haifei; Scheben, Armin; Edwards, David

    2018-01-01

    Individual cells in an organism are variable, which strongly impacts cellular processes. Advances in sequencing technologies have enabled single-cell genomic analysis to become widespread, addressing shortcomings of analyses conducted on populations of bulk cells. While the field of single-cell plant genomics is in its infancy, there is great potential to gain insights into cell lineage and functional cell types to help understand complex cellular interactions in plants. In this review, we discuss current approaches for single-cell plant genomic analysis, with a focus on single-cell isolation, DNA amplification, next-generation sequencing, and bioinformatics analysis. We outline the technical challenges of analysing material from a single plant cell, and then examine applications of single-cell genomics and the integration of this approach with genome editing. Finally, we indicate future directions we expect in the rapidly developing field of plant single-cell genomic analysis. PMID:29361790

  13. PHYLOGENETIC DIVERSITY IN DRINKING WATER BACTERIA IN A DISTRIBUTION SYSTEM SIMULATOR

    EPA Science Inventory

    This work was carried out to characterize the composition of microbial populations in a distribution system simulator (DSS) by direct sequence analysis of 16S rDNA clone libraries. Bacterial populations were examined in chlorinated distribution water and chloraminated DSS feed an...

  14. Hadoop-BAM: directly manipulating next generation sequencing data in the cloud

    PubMed Central

    Niemenmaa, Matti; Kallio, Aleksi; Schumacher, André; Klemelä, Petri; Korpelainen, Eija; Heljanko, Keijo

    2012-01-01

    Summary: Hadoop-BAM is a novel library for the scalable manipulation of aligned next-generation sequencing data in the Hadoop distributed computing framework. It acts as an integration layer between analysis applications and BAM files that are processed using Hadoop. Hadoop-BAM solves the issues related to BAM data access by presenting a convenient API for implementing map and reduce functions that can directly operate on BAM records. It builds on top of the Picard SAM JDK, so tools that rely on the Picard API are expected to be easily convertible to support large-scale distributed processing. In this article we demonstrate the use of Hadoop-BAM by building a coverage summarizing tool for the Chipster genome browser. Our results show that Hadoop offers good scalability, and one should avoid moving data in and out of Hadoop between analysis steps. Availability: Available under the open-source MIT license at http://sourceforge.net/projects/hadoop-bam/ Contact: matti.niemenmaa@aalto.fi Supplementary information: Supplementary material is available at Bioinformatics online. PMID:22302568
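
    The coverage-summarizing use case can be pictured as a map step and a reduce step over alignment records. The sketch below is plain Python and does not use the Hadoop-BAM or Picard APIs; the record layout `(reference, start, aligned_length)` and the function names are assumptions made for illustration only.

```python
from collections import defaultdict


def map_alignment(record, bin_size=1000):
    """Yield ((reference, bin_index), covered_bases) for each bin a read overlaps.

    record: (reference, start, aligned_length) with a 0-based start
    (hypothetical layout, not the BAM record structure).
    """
    reference, start, length = record
    end = start + length
    pos = start
    while pos < end:
        bin_index = pos // bin_size
        bin_end = (bin_index + 1) * bin_size
        stop = min(end, bin_end)
        yield (reference, bin_index), stop - pos
        pos = stop


def reduce_coverage(pairs, bin_size=1000):
    """Sum covered bases per key and convert them to mean depth per bin."""
    totals = defaultdict(int)
    for key, covered in pairs:
        totals[key] += covered
    return {key: total / bin_size for key, total in totals.items()}


def summarize(records, bin_size=1000):
    """Run the map step over all records, then reduce the emitted pairs."""
    pairs = (kv for record in records for kv in map_alignment(record, bin_size))
    return reduce_coverage(pairs, bin_size)
```

    In a distributed setting the map step would run independently on each block of records and the framework would group the emitted keys before the reduce step, which is why keeping intermediate data inside the cluster between steps matters.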

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rosatelli, M.C.; Faa, V.; Sardu, R.

    This study reports the molecular characterization of β-thalassemia in the Sardinian population. Three thousand β-thalassemia chromosomes from prospective parents presenting at the genetic service were initially analyzed by dot blot analysis with oligonucleotide probes complementary to the most common β-thalassemia mutations in the Mediterranean at-risk populations. The mutations that remained uncharacterized by this approach were defined by denaturing gradient gel electrophoresis (DGGE) followed by direct sequence analysis on amplified DNA. The authors reconfirmed that the predominant mutation in the Sardinian population is the codon 39 nonsense mutation, which accounts for 95.7% of the β-thalassemia chromosomes. The other two relatively common mutations are frameshifts at codon 6 (2.1%) and at codon 76 (0.7%), relatively uncommon in other Mediterranean-origin populations. In this study they have detected a novel β-thalassemia mutation, i.e., a frameshift at codon 1, in three β-thalassemia chromosomes. The DGGE procedure followed by direct sequencing on amplified DNA is a powerful approach for the characterization of unknown mutations in this genetic system.

  16. Combining protein sequence, structure, and dynamics: A novel approach for functional evolution analysis of PAS domain superfamily.

    PubMed

    Dong, Zheng; Zhou, Hongyu; Tao, Peng

    2018-02-01

    PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from RCSB Protein Data Bank of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The result showed that the proteins with same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant difference in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.

  17. UCbase 2.0: ultraconserved sequences database (2014 update).

    PubMed

    Lomonaco, Vincenzo; Martoglia, Riccardo; Mandreoli, Federica; Anderlucci, Laura; Emmett, Warren; Bicciato, Silvio; Taccioli, Cristian

    2014-01-01

    UCbase 2.0 (http://ucbase.unimore.it) is an update, extension and evolution of UCbase, a Web tool dedicated to the analysis of ultraconserved sequences (UCRs). UCRs are 481 sequences >200 bases sharing 100% identity among human, mouse and rat genomes. They are frequently located in genomic regions known to be involved in cancer or differentially expressed in human leukemias and carcinomas. UCbase 2.0 is a platform-independent Web resource that includes the updated version of the human genome annotation (hg19), information linking disorders to chromosomal coordinates based on the Systematized Nomenclature of Medicine classification, a query tool to search for Single Nucleotide Polymorphisms (SNPs) and a new text box to directly interrogate the database using a MySQL interface. To facilitate the interactive visual interpretation of UCR chromosomal positioning, UCbase 2.0 now includes a graph visualization interface directly linked to UCSC genome browser. Database URL: http://ucbase.unimore.it. © The Author(s) 2014. Published by Oxford University Press.
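
    The definition of an ultraconserved region given above (more than 200 bases with 100% identity across three genomes) can be expressed as a scan over a multiple alignment. The sketch below is an illustration only and is unrelated to how UCbase was built: it assumes the input is a list of equal-length, gap-padded aligned strings and treats any gap column as non-identical.

```python
def ultraconserved_runs(aligned, min_len=201):
    """Return half-open (start, end) column ranges identical in every sequence.

    aligned: list of equal-length aligned sequences, with '-' for gaps.
    min_len: minimum run length; 201 encodes "more than 200 bases".
    """
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("expected at least one sequence, all of equal length")
    runs = []
    start = None
    for i, column in enumerate(zip(*aligned)):
        identical = column[0] != "-" and all(c == column[0] for c in column)
        if identical and start is None:
            start = i  # a new identical run begins here
        elif not identical and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # close a run that extends to the end of the alignment
    if start is not None and len(aligned[0]) - start >= min_len:
        runs.append((start, len(aligned[0])))
    return runs
```

    For real genomes the alignment would come from a whole-genome aligner, and the reported columns would still need to be mapped back to chromosomal coordinates such as those of hg19.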

  18. Genome Sequencing and Analysis of the Tasmanian Devil and Its Transmissible Cancer

    PubMed Central

    Murchison, Elizabeth P.; Schulz-Trieglaff, Ole B.; Ning, Zemin; Alexandrov, Ludmil B.; Bauer, Markus J.; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R.; Cheetham, R. Keira; Cheng, William; Connor, Thomas R.; Cox, Anthony J.; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J.; Harris, Simon R.; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J.; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J.; Wedge, David C.; Woods, Gregory M.; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M.J.; Carter, Nigel P.; Papenfuss, Anthony T.; Futreal, P. Andrew; Campbell, Peter J.; Yang, Fengtang; Bentley, David R.; Evers, Dirk J.; Stratton, Michael R.

    2012-01-01

    Summary The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations. PMID:22341448

  19. Staphylococcus nepalensis in the guano of bats (Mammalia).

    PubMed

    Vandžurová, A; Bačkor, P; Javorský, P; Pristaš, P

    2013-05-31

    Thirty randomly selected mesophilic isolates from a six-year-old guano sample from a mixed Myotis myotis and M. blythii summer roost colony were isolated and identified as Staphylococcus nepalensis using MALDI TOF analysis. 16S rRNA gene sequencing of five selected isolates and subsequent phylogenetic analysis confirmed that all sequences showed the highest similarity to S. nepalensis sequences. Several virulence factors were produced by tested isolates, mainly capsule formation and resistance to tetracycline, ampicillin, gentamycin, and chloramphenicol antibiotics. Our experiments show that the majority of cultivable mesophilic bacteria from the guano of bats belong to the S. nepalensis species. This is the first report on the occurrence of this species in the guano of bats and our results indicate that the guano accumulated near or directly in human dwellings and buildings may represent a significant risk for human health. Copyright © 2013 Elsevier B.V. All rights reserved.

  20. A DNA Sequence Element That Advances Replication Origin Activation Time in Saccharomyces cerevisiae

    PubMed Central

    Pohl, Thomas J.; Kolor, Katherine; Fangman, Walton L.; Brewer, Bonita J.; Raghuraman, M. K.

    2013-01-01

    Eukaryotic origins of DNA replication undergo activation at various times in S-phase, allowing the genome to be duplicated in a temporally staggered fashion. In the budding yeast Saccharomyces cerevisiae, the activation times of individual origins are not intrinsic to those origins but are instead governed by surrounding sequences. Currently, there are two examples of DNA sequences that are known to advance origin activation time, centromeres and forkhead transcription factor binding sites. By combining deletion and linker scanning mutational analysis with two-dimensional gel electrophoresis to measure fork direction in the context of a two-origin plasmid, we have identified and characterized a 19- to 23-bp and a larger 584-bp DNA sequence that are capable of advancing origin activation time. PMID:24022751

  1. aes, the gene encoding the esterase B in Escherichia coli, is a powerful phylogenetic marker of the species.

    PubMed

    Lescat, Mathilde; Hoede, Claire; Clermont, Olivier; Garry, Louis; Darlu, Pierre; Tuffery, Pierre; Denamur, Erick; Picard, Bertrand

    2009-12-29

    Previous studies have established a correlation between electrophoretic polymorphism of esterase B, and virulence and phylogeny of Escherichia coli. Strains belonging to the phylogenetic group B2 are more frequently implicated in extraintestinal infections and include esterase B2 variants, whereas phylogenetic groups A, B1 and D contain less virulent strains and include esterase B1 variants. We investigated esterase B as a marker of phylogeny and/or virulence, in a thorough analysis of the esterase B-encoding gene. We identified the gene encoding esterase B as the acetyl-esterase gene (aes) using gene disruption. The analysis of aes nucleotide sequences in a panel of 78 reference strains, including the E. coli reference (ECOR) strains, demonstrated that the gene is under purifying selection. The phylogenetic tree reconstructed from aes sequences showed a strong correlation with the species phylogenetic history, based on multi-locus sequence typing using six housekeeping genes. The unambiguous distinction between variants B1 and B2 by electrophoresis was consistent with Aes amino-acid sequence analysis and protein modelling, which showed that substituted amino acids in the two esterase B variants occurred mostly at different sites on the protein surface. Studies in an experimental mouse model of septicaemia using mutant strains did not reveal a direct link between aes and extraintestinal virulence. Moreover, we did not find any genes in the chromosomal region of aes to be associated with virulence. Our findings suggest that aes does not play a direct role in the virulence of E. coli extraintestinal infection. However, this gene acts as a powerful marker of phylogeny, illustrating the extensive divergence of B2 phylogenetic group strains from the rest of the species.

  2. Rather than by direct acquisition via lateral gene transfer, GHF5 cellulases were passed on from early Pratylenchidae to root-knot and cyst nematodes.

    PubMed

    Rybarczyk-Mydłowska, Katarzyna; Maboreke, Hazel Ruvimbo; van Megen, Hanny; van den Elsen, Sven; Mooyman, Paul; Smant, Geert; Bakker, Jaap; Helder, Johannes

    2012-11-21

    Plant parasitic nematodes are unusual Metazoans as they are equipped with genes that allow for symbiont-independent degradation of plant cell walls. Among the cell wall-degrading enzymes, glycoside hydrolase family 5 (GHF5) cellulases are relatively well characterized, especially for high impact parasites such as root-knot and cyst nematodes. Interestingly, ancestors of extant nematodes most likely acquired these GHF5 cellulases from a prokaryote donor by one or multiple lateral gene transfer events. To obtain insight into the origin of GHF5 cellulases among evolutionary advanced members of the order Tylenchida, cellulase biodiversity data from less distal family members were collected and analyzed. Single nematodes were used to obtain (partial) genomic sequences of cellulases from representatives of the genera Meloidogyne, Pratylenchus, Hirschmanniella and Globodera. Combined Bayesian analysis of ≈ 100 cellulase sequences revealed three types of catalytic domains (A, B, and C). Represented by 84 sequences, type B is numerically dominant, and the overall topology of the catalytic domain type shows remarkable resemblance with trees based on neutral (= pathogenicity-unrelated) small subunit ribosomal DNA sequences. Bayesian analysis further suggested a sister relationship between the lesion nematode Pratylenchus thornei and all type B cellulases from root-knot nematodes. Yet, the relationship between the three catalytic domain types remained unclear. Superposition of intron data onto the cellulase tree suggests that types B and C are related, and together distinct from type A that is characterized by two unique introns. All Tylenchida members investigated here harbored one or multiple GHF5 cellulases. Three types of catalytic domains are distinguished, and the presence of at least two types is relatively common among plant parasitic Tylenchida. 
    Analysis of coding sequences of cellulases suggests that root-knot and cyst nematodes did not acquire this gene directly by lateral gene transfer. More likely, these genes were passed on by ancestors of a family nowadays known as the Pratylenchidae.

  3. ChIP-seq analysis of the σ E regulon of Salmonella enterica serovar typhimurium reveals new genes implicated in heat shock and oxidative stress response

    DOE PAGES

    Li, Jie; Overall, Christopher C.; Johnson, Rudd C.; ...

    2015-09-21

    The alternative sigma factor σ E functions to maintain bacterial homeostasis and membrane integrity in response to extracytoplasmic stress by regulating thousands of genes both directly and indirectly. The transcriptional regulatory network governed by σ E in Salmonella and E. coli has been examined using microarrays; however, a genome-wide analysis of σ E–binding sites in Salmonella has not yet been reported. We infected macrophages with Salmonella Typhimurium over a select time course. Using chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), 31 σ E–binding sites were identified. Seventeen sites were new, which included outer membrane proteins, a quorum-sensing protein, a cell division factor, and a signal transduction modulator. The consensus sequence identified for σ E in vivo binding was similar to the one previously reported, except for a conserved G and A between the -35 and -10 regions. One third of the σ E–binding sites did not contain the consensus sequence, suggesting there may be alternative mechanisms by which σ E modulates transcription. By dissecting direct and indirect modes of σ E-mediated regulation, we found that σ E activates gene expression through recognition of both canonical and reversed consensus sequence. Lastly, new σ E regulated genes (greA, luxS, ompA and ompX) are shown to be involved in heat shock and oxidative stress responses.

  4. ChIP-seq analysis of the σ E regulon of Salmonella enterica serovar typhimurium reveals new genes implicated in heat shock and oxidative stress response

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Jie; Overall, Christopher C.; Johnson, Rudd C.

    The alternative sigma factor σ E functions to maintain bacterial homeostasis and membrane integrity in response to extracytoplasmic stress by regulating thousands of genes both directly and indirectly. The transcriptional regulatory network governed by σ E in Salmonella and E. coli has been examined using microarrays; however, a genome-wide analysis of σ E–binding sites in Salmonella has not yet been reported. We infected macrophages with Salmonella Typhimurium over a select time course. Using chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), 31 σ E–binding sites were identified. Seventeen sites were new, which included outer membrane proteins, a quorum-sensing protein, a cell division factor, and a signal transduction modulator. The consensus sequence identified for σ E in vivo binding was similar to the one previously reported, except for a conserved G and A between the -35 and -10 regions. One third of the σ E–binding sites did not contain the consensus sequence, suggesting there may be alternative mechanisms by which σ E modulates transcription. By dissecting direct and indirect modes of σ E-mediated regulation, we found that σ E activates gene expression through recognition of both canonical and reversed consensus sequence. Lastly, new σ E regulated genes (greA, luxS, ompA and ompX) are shown to be involved in heat shock and oxidative stress responses.

  5. Aircraft stress sequence development: A complex engineering process made simple

    NASA Technical Reports Server (NTRS)

    Schrader, K. H.; Butts, D. G.; Sparks, W. A.

    1994-01-01

    Development of stress sequences for critical aircraft structure requires flight measured usage data, known aircraft loads, and established relationships between aircraft flight loads and structural stresses. Resulting cycle-by-cycle stress sequences can be directly usable for crack growth analysis and coupon spectra tests. Often, an expert in loads and spectra development manipulates the usage data into a typical sequence of representative flight conditions for which loads and stresses are calculated. For a fighter/trainer type aircraft, this effort is repeated many times for each of the fatigue critical locations (FCL) resulting in expenditure of numerous engineering hours. The Aircraft Stress Sequence Computer Program (ACSTRSEQ), developed by Southwest Research Institute under contract to San Antonio Air Logistics Center, presents a unique approach for making complex technical computations in a simple, easy to use method. The program is written in Microsoft Visual Basic for the Microsoft Windows environment.

  6. Sequence analysis of the 5.8S ribosomal DNA and internal transcribed spacers (ITS1 and ITS2) from five species of the Oxalis tuberosa alliance.

    PubMed

    Tosto, D S; Hopp, H E

    1996-01-01

    The internal transcribed spacer region (ITS1 and ITS2) of the 18S-25S nuclear ribosomal DNA sequence and the intervening 5.8S region from five species of the genus Oxalis were amplified by polymerase chain reaction and subjected to direct DNA sequencing. On the basis of cytogenetic studies some species of this genus were postulated to be related by the number of chromosomes. Sequence homologies in the ITS1, 5.8S and ITS2 among species are in good agreement with previous relationships established on the basis of chromosome numbers. We also identified a highly conserved sequence of six bp in the ITS1, reported to be present in a wide range of flowering plants, but not in the Oxalidaceae family to which the genus Oxalis belongs.

  7. The maize stripe virus major noncapsid protein messenger RNA transcripts contain heterogeneous leader sequences at their 5' termini.

    PubMed

    Huiet, L; Feldstein, P A; Tsai, J H; Falk, B W

    1993-12-01

    Primer extension analyses and a PCR-based cloning strategy were used to identify and characterize 5' nucleotide sequences on the maize stripe virus (MStV) RNA4 mRNA transcripts encoding the major noncapsid protein (NCP). Direct RNA sequence analysis by primer extension showed that the NCP mRNA transcripts had 10-15 nucleotides beyond the 5' terminus of the MStV RNA4 nucleotide sequence. MStV genomic RNAs isolated from ribonucleoprotein particles (RNPs) lacked the additional 5' nucleotides. cDNA clones representing the 5' region of the mRNA transcripts were constructed, and the nucleotide sequences of the 5' regions were determined for 16 clones. Each was found to have a distinct 10-15 nucleotide sequence immediately 5' of the MStV RNA4 sequence. Eleven of 16 clones had the correct MStV RNA4 5' nucleotide sequence, while five showed minor variations at or near the 5' most MStV RNA4 nucleotide. These characteristics show strong similarities to other viral mRNA transcripts which are synthesized by cap snatching.

  8. Mutation detection using automated fluorescence-based sequencing.

    PubMed

    Montgomery, Kate T; Iartchouck, Oleg; Li, Li; Perera, Anoja; Yassin, Yosuf; Tamburino, Alex; Loomis, Stephanie; Kucherlapati, Raju

    2008-04-01

    The development of high-throughput DNA sequencing techniques has made direct DNA sequencing of PCR-amplified genomic DNA a rapid and economical approach to the identification of polymorphisms that may play a role in disease. Point mutations as well as small insertions or deletions are readily identified by DNA sequencing. The mutations may be heterozygous (occurring in one allele while the other allele retains the normal sequence) or homozygous (occurring in both alleles). Sequencing alone cannot discriminate between true homozygosity and apparent homozygosity due to the loss of one allele due to a large deletion. In this unit, strategies are presented for using PCR amplification and automated fluorescence-based sequencing to identify sequence variation. The size of the project and laboratory preference and experience will dictate how the data is managed and which software tools are used for analysis. A high-throughput protocol is given that has been used to search for mutations in over 200 different genes at the Harvard Medical School - Partners Center for Genetics and Genomics (HPCGG, http://www.hpcgg.org/). Copyright 2008 by John Wiley & Sons, Inc.

  9. TRDistiller: a rapid filter for enrichment of sequence datasets with proteins containing tandem repeats.

    PubMed

    Richard, François D; Kajava, Andrey V

    2014-06-01

    The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
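
    To make the idea of a fast pre-filter concrete, the sketch below flags a sequence as a tandem-repeat candidate when the same short string recurs at a fixed offset often enough. It is a deliberately simplified stand-in and not the TRDistiller algorithm: the function names, the string length `k`, the maximum period, and the hit threshold are all assumptions chosen for illustration.

```python
def looks_repetitive(sequence, k=3, max_period=30, min_hits=4):
    """Return True if some period p has at least min_hits matching k-mers.

    A "hit" is a position i where sequence[i:i+k] equals the k-mer located
    exactly p residues downstream, which is what adjacent copies of a repeat
    unit of length p would produce.
    """
    n = len(sequence)
    for period in range(1, max_period + 1):
        hits = sum(
            1
            for i in range(n - period - k + 1)
            if sequence[i:i + k] == sequence[i + period:i + period + k]
        )
        if hits >= min_hits:
            return True
    return False


def enrich_for_repeats(proteins, **params):
    """Keep only the entries of a {name: sequence} dict that pass the filter."""
    return {
        name: seq
        for name, seq in proteins.items()
        if looks_repetitive(seq, **params)
    }
```

    A filter of this kind trades sensitivity for speed: it only shrinks the dataset, and the retained sequences would still be passed to a slower, more sensitive repeat detector.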

  10. High-resolution melting analysis for detection of MYH9 mutations.

    PubMed

    Provaznikova, Dana; Kumstyrova, Tereza; Kotlin, Roman; Salaj, Peter; Matoska, Vaclav; Hrachovinova, Ingrid; Rittich, Simon

    2008-09-01

    May-Hegglin anomaly (MHA), Sebastian (SBS), Fechtner (FTNS) and Epstein (EPS) syndromes are rare autosomal dominant disorders with giant platelets and thrombocytopenia. Other manifestations of these disorders are combinations of the presence of granulocyte inclusions and deafness, cataracts and renal failure. Currently, MHA, SBS, FTNS and EPS are considered to be distinct clinical manifestations of a single illness caused by mutations of the MYH9 gene encoding the heavy chain of non-muscle myosin IIA (NMMHC-IIA). As the MYH9 gene has a high number of exons, it takes much time and material to use direct sequencing for the detection of MYH9 mutations. Recently, a new method has been introduced for scanning DNA mutations without the need for direct sequencing: high-resolution melting analysis (HRMA). Mutation detection with HRMA relies on the intercalation of the specific dye (LC Green plus) in double-strand DNA and fluorescence monitoring of PCR product melting profiles. In our study, we optimized the conditions and used HRMA for rapid screening of mutations in all MYH9 exons in seven affected individuals from four unrelated families with suspected MYH9 disorders. Samples identified by HRMA as positive for the mutation were analysed by direct sequencing. HRMA saved us over 85% of redundant sequencing.

  11. Analysis of p53 gene mutations in human gliomas by polymerase chain reaction-based single-strand conformation polymorphism and DNA sequencing.

    PubMed

    Sarkar, F H; Kupsky, W J; Li, Y W; Sreepathi, P

    1994-03-01

    Mutations in the p53 gene have been recognized in brain tumors, and clonal expansion of p53 mutant cells has been shown to be associated with glioma progression. However, studies on the p53 gene have been limited by the need for frozen tissues. We have developed a method utilizing polymerase chain reaction (PCR) for the direct analysis of p53 mutation by single-strand conformation polymorphism (SSCP) and by direct DNA sequencing of the p53 gene using a single 10-microns paraffin-embedded tissue section. We applied this method to screen for p53 gene mutations in exons 5-8 in human gliomas utilizing paraffin-embedded tissues. Twenty paraffin blocks containing tumor were selected from surgical specimens from 17 different adult patients. Tumors included six anaplastic astrocytomas (AAs), nine glioblastomas (GBs), and two mixed malignant gliomas (MMGs). The tissue section on the stained glass slide was used to guide microdissection of an unstained adjacent tissue section to ensure > 90% of the tumor cell population for p53 mutational analysis. Simultaneously, microdissection of the tissue was also carried out to obtain normal tissue from adjacent areas as a control. Mutations in the p53 gene were identified in 3 of 17 (18%) patients by PCR-SSCP analysis and subsequently confirmed by PCR-based DNA sequencing. Mutations in exon 5 resulting in amino acid substitution were found in one thalamic AA (codon 158, CGC > CTT: Arg > Leu) and one cerebral hemispheric GB (codon 151, CCG > CTG: Pro > Leu).(ABSTRACT TRUNCATED AT 250 WORDS)
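
    The two exon 5 substitutions reported above can be checked against the standard genetic code. The sketch below builds the standard codon table in the conventional T, C, A, G ordering and classifies a codon change; the function name and the three category labels are illustrative choices, not part of the study.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(reference_codon, variant_codon):
    """Return (reference_aa, variant_aa, kind) for a DNA codon change.

    kind is 'silent', 'nonsense' (variant encodes a stop), or 'missense'.
    """
    ref_aa = CODON_TABLE[reference_codon.upper()]
    var_aa = CODON_TABLE[variant_codon.upper()]
    if ref_aa == var_aa:
        kind = "silent"
    elif var_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return ref_aa, var_aa, kind


# Codon 158 (CGC > CTT) and codon 151 (CCG > CTG) from the abstract.
print(classify_substitution("CGC", "CTT"))
print(classify_substitution("CCG", "CTG"))
```

    Both changes translate to leucine (R to L and P to L in one-letter code), consistent with the Arg > Leu and Pro > Leu substitutions given in the abstract.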

  12. Bacterial identification and subtyping using DNA microarray and DNA sequencing.

    PubMed

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

    2012-01-01

    The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies will not only identify, isotype, and serotype pathogenic bacteria but will also aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple-to-use technique that can produce accurate and quantitative analysis of DNA sequences with great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.

  13. Molecular Cytogenetics Guides Massively Parallel Sequencing of a Radiation-Induced Chromosome Translocation in Human Cells.

    PubMed

    Cornforth, Michael N; Anur, Pavana; Wang, Nicholas; Robinson, Erin; Ray, F Andrew; Bedford, Joel S; Loucas, Bradford D; Williams, Eli S; Peto, Myron; Spellman, Paul; Kollipara, Rahul; Kittler, Ralf; Gray, Joe W; Bailey, Susan M

    2018-05-11

    Chromosome rearrangements are large-scale structural variants that are recognized drivers of oncogenic events in cancers of all types. Cytogenetics allows for their rapid, genome-wide detection, but does not provide gene-level resolution. Massively parallel sequencing (MPS) promises DNA sequence-level characterization of the specific breakpoints involved, but is strongly influenced by bioinformatics filters that affect detection efficiency. We sought to characterize the breakpoint junctions of chromosomal translocations and inversions in the clonal derivatives of human cells exposed to ionizing radiation. Here, we describe the first successful use of DNA paired-end analysis to locate and sequence across the breakpoint junctions of a radiation-induced reciprocal translocation. The analyses employed, with varying degrees of success, several well-known bioinformatics algorithms, a task made difficult by the involvement of repetitive DNA sequences. As for underlying mechanisms, the results of Sanger sequencing suggested that the translocation in question was likely formed via microhomology-mediated non-homologous end joining (mmNHEJ). To our knowledge, this represents the first use of MPS to characterize the breakpoint junctions of a radiation-induced chromosomal translocation in human cells. Curiously, these same approaches were unsuccessful when applied to the analysis of inversions previously identified by directional genomic hybridization (dGH). We conclude that molecular cytogenetics continues to provide critical guidance for structural variant discovery, validation and in "tuning" analysis filters to enable robust breakpoint identification at the base pair level.

  14. Study of cnidarian-algal symbiosis in the "omics" age.

    PubMed

    Meyer, Eli; Weis, Virginia M

    2012-08-01

    The symbiotic associations between cnidarians and dinoflagellate algae (Symbiodinium) support productive and diverse ecosystems in coral reefs. Many aspects of this association, including the mechanistic basis of host-symbiont recognition and metabolic interaction, remain poorly understood. The first completed genome sequence for a symbiotic anthozoan is now available (the coral Acropora digitifera), and extensive expressed sequence tag resources are available for a variety of other symbiotic corals and anemones. These resources make it possible to profile gene expression, protein abundance, and protein localization associated with the symbiotic state. Here we review the history of "omics" studies of cnidarian-algal symbiosis and the current availability of sequence resources for corals and anemones, identifying genes putatively involved in symbiosis across 10 anthozoan species. The public availability of candidate symbiosis-associated genes leaves the field of cnidarian-algal symbiosis poised for in-depth comparative studies of sequence diversity and gene expression and for targeted functional studies of genes associated with symbiosis. Reviewing the progress to date suggests directions for future investigations of cnidarian-algal symbiosis that include (i) sequencing of Symbiodinium, (ii) proteomic analysis of the symbiosome membrane complex, (iii) glycomic analysis of Symbiodinium cell surfaces, and (iv) expression profiling of the gastrodermal cells hosting Symbiodinium.

  15. The C-Terminal Sequence of RhoB Directs Protein Degradation through an Endo-Lysosomal Pathway

    PubMed Central

    Ramos, Irene; Herrera, Mónica; Stamatakis, Konstantinos

    2009-01-01

    Background: Protein degradation is essential for cell homeostasis. Targeting of proteins for degradation is often achieved by specific protein sequences or posttranslational modifications such as ubiquitination. Methodology/Principal Findings: By using biochemical and genetic tools we have monitored the localization and degradation of endogenous and chimeric proteins in live primary cells by confocal microscopy and ultra-structural analysis. Here we identify an eight amino acid sequence from the C-terminus of the short-lived GTPase RhoB that directs the rapid degradation of both RhoB and chimeric proteins bearing this sequence through a lysosomal pathway. Elucidation of the RhoB degradation pathway unveils a mechanism dependent on protein isoprenylation and palmitoylation that involves sorting of the protein into multivesicular bodies, mediated by the ESCRT machinery. Moreover, RhoB sorting is regulated by late endosome specific lipid dynamics and is altered in human genetic lipid traffic disease. Conclusions/Significance: Our findings characterize a short-lived cytosolic protein that is degraded through a lysosomal pathway. In addition, we define a novel motif for protein sorting and rapid degradation, which allows controlling protein levels by means of clinically used drugs. PMID:19956591

  16. Comparison of Direct Sequence Spread Spectrum Rake Receiver with a Maximum Ratio Combining Multicarrier Spread Spectrum Receiver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daryl Leon Wasden; Hussein Moradi; Behrouz Farhang-Broujeny

    2014-06-01

    This paper presents a theoretical analysis of the performance of a filter bank-based multicarrier spread spectrum (FB-MC-SS) system. We consider an FB-MC-SS setup where each data symbol is spread across multiple subcarriers, but there is no spreading in time. The results are then compared with those of the well-known direct sequence spread spectrum (DS-SS) system with a rake receiver for its best performance. We compare the two systems when the channel noise is white. We prove that as the processing gains of the two systems tend to infinity both approach the same performance. However, numerical simulations show that, in practice, where processing gain is limited, FB-MC-SS outperforms DS-SS.
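    For readers unfamiliar with the DS-SS baseline named in this abstract, the following is a minimal sketch of direct-sequence spreading and correlation despreading. It is not the authors' system: it omits the rake receiver, the multipath channel, and the FB-MC-SS scheme entirely, and the function names, the random ±1 code, and the code length of 64 chips are all illustrative assumptions.

```python
import random


def make_pn_code(length, seed=0):
    """Return a pseudo-random +/-1 spreading code (hypothetical stand-in for a real PN sequence)."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def spread(bits, code):
    """Map each bit to +/-1 and multiply it by every chip of the spreading code."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in code)
    return chips


def despread(chips, code):
    """Correlate each code-length block with the code and decide the bit by the sign."""
    n = len(code)
    bits = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        correlation = sum(x * c for x, c in zip(block, code))
        bits.append(1 if correlation > 0 else 0)
    return bits


def add_noise(chips, sigma, seed=1):
    """Add white Gaussian noise of standard deviation sigma to every chip."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in chips]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
code = make_pn_code(64)          # 64 chips per bit: the processing gain in this toy model
clean = despread(spread(bits, code), code)
noisy = despread(add_noise(spread(bits, code), sigma=1.0), code)
print("sent:     ", bits)
print("noiseless:", clean)
print("noisy:    ", noisy)
```

    In this toy model the number of chips per bit plays the role of the processing gain: the correlation sums the signal coherently over all chips while the noise adds incoherently, so a longer code tolerates more noise.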

  17. Whole genome sequencing identifies influenza A H3N2 transmission and offers superior resolution to classical typing methods.

    PubMed

    Meinel, Dominik M; Heinzinger, Susanne; Eberle, Ute; Ackermann, Nikolaus; Schönberger, Katharina; Sing, Andreas

    2018-02-01

    Influenza with its annual epidemic waves is a major cause of morbidity and mortality worldwide. However, only little whole genome data are available regarding the molecular epidemiology promoting our understanding of viral spread in human populations. We implemented a RT-PCR strategy starting from patient material to generate influenza A whole genome sequences for molecular epidemiological surveillance. Samples were obtained within the Bavarian Influenza Sentinel. The complete influenza virus genome was amplified by a one-tube multiplex RT-PCR and sequenced on an Illumina MiSeq. We report whole genomic sequences for 50 influenza A H3N2 viruses, which was the predominating virus in the season 2014/15, directly from patient specimens. The dataset included random samples from Bavaria (Germany) throughout the influenza season and samples from three suspected transmission clusters. We identified the outbreak samples based on sequence identity. Whole genome sequencing (WGS) was superior in resolution compared to analysis of single segments or partial segment analysis. Additionally, we detected manifestation of substantial amounts of viral quasispecies in several patients, carrying mutations varying from the dominant virus in each patient. Our rapid whole genome sequencing approach for influenza A virus shows that WGS can effectively be used to detect and understand outbreaks in large communities. Additionally, the genomic data provide in-depth details about the circulating virus within one season.

  18. First molecular data on the phylum Loricifera: an investigation into the phylogeny of ecdysozoa with emphasis on the positions of Loricifera and Priapulida.

    PubMed

    Park, Joong-Ki; Rho, Hyun Soo; Kristensen, Reinhardt Møbjerg; Kim, Won; Giribet, Gonzalo

    2006-11-01

    Recent progress in molecular techniques has generated a wealth of information for phylogenetic analysis. Among metazoans all but a single phylum have been incorporated into some sort of molecular analysis. However, the minute and rare species of the phylum Loricifera have remained elusive to molecular systematists. Here we report the first molecular sequence data (nearly complete 18S rRNA) for a member of the phylum Loricifera, Pliciloricus sp. from Korea. The new sequence data were analyzed together with 52 other ecdysozoan sequences, with all other phyla represented by three or more sequences. The data set was analyzed using parsimony as an optimality criterion under direct optimization as well as using a Bayesian approach. The parsimony analysis was also accompanied by a sensitivity analysis. The results of both analyses are largely congruent, finding monophyly of each ecdysozoan phylum, except for Priapulida, in which the coelomate Meiopriapulus is separate from a clade of pseudocoelomate priapulids. The data also suggest a relationship of the pseudocoelomate priapulids to kinorhynchs, and a relationship of nematodes to tardigrades. The Bayesian analysis placed the arthropods as the sister group to a clade that includes tardigrades and nematodes. However, these results were shown to be parameter dependent in the sensitivity analysis. The position of Loricifera was extremely unstable to parameter variation, and support for a relationship of loriciferans to any particular ecdysozoan phylum was not found in the data.

  19. Complete plastid genome sequence of Daucus carota: implications for biotechnology and phylogeny of angiosperms.

    PubMed

    Ruhlman, Tracey; Lee, Seung-Bum; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry

    2006-08-31

    Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats ≥ 30 bp with a sequence identity ≥ 90%. Phylogenetic analyses of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) were performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade. Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of Daucus with Panax in the euasterid II clade. These results provide the best taxon sampling of complete chloroplast genomes and the strongest support yet for the sister relationship of Caryophyllales to the asterids. The availability of the complete plastid genome sequence should facilitate improved transformation efficiency and foreign gene expression in carrot through utilization of endogenous flanking sequences and regulatory elements.

  20. Genomic Sequence Variation Markup Language (GSVML).

    PubMed

    Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi

    2010-02-01

    With the aim of making good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive amount of genomic research at present, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) will focus on genomic sequence variation data and human health applications, such as gene based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By focusing on the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format, while maximizing the data covering range. We also attempted to ensure communication and interface ability with other Markup Languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For designing, we examined 6 cases for three criteria as human health application and 15 data elements for three criteria as data formats for genomic sequence variation data exchange. The data format of five international SNP databases and six Markup Languages and the interface ability to the Health Level Seven Genotype Model in terms of 317 items were investigated. GSVML was developed as a potential data exchanging format for genomic sequence variation data exchange focusing on human health applications. The international standardization of GSVML is necessary, and is currently underway. GSVML can be applied to enhance the utilization of genomic sequence variation data worldwide by providing a communicable platform between clinical and research applications. Copyright 2009 Elsevier Ireland Ltd. All rights reserved.

  1. An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region.

    PubMed Central

    Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S

    1999-01-01

    A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. "Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it." (Milne, 1926) PMID:10471707
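    The densities and percentages quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes the 2.9 Mb region length given in the record title and assumes that the 218 predicted protein-coding genes are the denominator for both percentages; the abstract does not state the denominator explicitly, and the variable and function names are illustrative only.

```python
# Figures quoted in the record; the region length (2.9 Mb) is taken from the title.
REGION_KB = 2900
PROTEIN_CODING_GENES = 218
TRANSPOSABLE_ELEMENTS = 17
GENES_WITH_EST_MATCH = 95
GENES_WITH_PROTEIN_SIMILARITY = 144


def kb_per_feature(region_kb, count):
    """Average spacing in kb between features of one kind across the region."""
    return region_kb / count


def percent(part, whole):
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


gene_spacing = kb_per_feature(REGION_KB, PROTEIN_CODING_GENES)
te_spacing = kb_per_feature(REGION_KB, TRANSPOSABLE_ELEMENTS)
est_share = percent(GENES_WITH_EST_MATCH, PROTEIN_CODING_GENES)
similarity_share = percent(GENES_WITH_PROTEIN_SIMILARITY, PROTEIN_CODING_GENES)

print(f"one gene every {gene_spacing:.1f} kb")
print(f"one transposable element every {te_spacing:.1f} kb")
print(f"EST match: {est_share:.1f}% of genes")
print(f"protein similarity: {similarity_share:.1f}% of genes")
```

    Under these assumptions the rounded values agree with the abstract's figures of 13 kb, 171 kb, 44% and 66%.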

  2. Alkahest NuclearBLAST : a user-friendly BLAST management and analysis system

    PubMed Central

    Diener, Stephen E; Houfek, Thomas D; Kalat, Sam E; Windham, DE; Burke, Mark; Opperman, Charles; Dean, Ralph A

    2005-01-01

    Background - Sequencing of EST and BAC end datasets is no longer limited to large research groups. Drops in per-base pricing have made high throughput sequencing accessible to individual investigators. However, there are few options available which provide a free and user-friendly solution to the BLAST result storage and data mining needs of biologists. Results - Here we describe NuclearBLAST, a batch BLAST analysis, storage and management system designed for the biologist. It is a wrapper for NCBI BLAST which provides a user-friendly web interface which includes a request wizard and the ability to view and mine the results. All BLAST results are stored in a MySQL database which allows for more advanced data-mining through supplied command-line utilities or direct database access. NuclearBLAST can be installed on a single machine or clustered amongst a number of machines to improve analysis throughput. NuclearBLAST provides a platform which eases data-mining of multiple BLAST results. With the supplied scripts, the program can export data into a spreadsheet-friendly format, automatically assign Gene Ontology terms to sequences and provide bi-directional best hits between two datasets. Users with SQL experience can use the database to ask even more complex questions and extract any subset of data they require. Conclusion - This tool provides a user-friendly interface for requesting, viewing and mining of BLAST results which makes the management and data-mining of large sets of BLAST analyses tractable to biologists. PMID:15958161

  3. Evaluation of the arrestin gene in patients with retinitis pigmentosa or an allied disease

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeStefano, D.J.; Berson, E.L.; Dryja, T.P.

    1994-09-01

    Arrestin, also called 48K protein or S-antigen, plays a role in deactivating rhodopsin, the photosensitive, seven-helix, G-protein receptor found in rod photoreceptors. In Drosophila, null mutations in arrestin genes cause a light-dependent photoreceptor degeneration. It is possible that a comparable photoreceptor degeneration in humans is caused by defects in the rod arrestin gene. In order to evaluate this possibility, we are characterizing the human arrestin locus on chromosome 2q. We screened a genomic library (5 million plaques) using an arrestin cDNA clone. Sixty-eight hybridizing clones were identified; portions of 7 clones were sequenced to determine the intron sequence flanking the exons. We are using SSCP analysis and direct genomic sequencing to screen the entire coding region, splice donor and acceptor sites, and the promoter region of the arrestin gene in 188 patients with autosomal dominant and 104 patients with autosomal recessive retinitis pigmentosa. We have already obtained flanking intron sequences necessary for SSCP analysis for 13 of 16 exons. So far, we have identified 4 silent base changes at codons 67 (TGC-to-TGT), 107 (CTG-to-CTC), 163 (GCC-to-GCT), and 288 (CTG-to-TGT), all with allele frequencies at 1% or less. Several other variant bands detected by SSCP analysis are currently being sequenced.

  4. Analysis of the recE locus of Escherichia coli K-12 by use of polyclonal antibodies to exonuclease VIII.

    PubMed Central

    Luisi-DeLuca, C; Clark, A J; Kolodner, R D

    1988-01-01

    Exonuclease VIII (exoVIII) of Escherichia coli has been purified from a strain carrying a plasmid-encoded recE gene by using a new procedure. This procedure yielded 30 times more protein per gram of cells, and the protein had a twofold higher specific activity than the enzyme purified by the previously published procedure (J. W. Joseph and R. Kolodner, J. Biol. Chem. 258:10411-10417, 1983). The sequence of the 12 N-terminal amino acids was also obtained and found to correspond to one of the open reading frames predicted from the nucleic acid sequence of the recE region of Rac (C. Chu, A. Templin, and A. J. Clark, manuscript in preparation). Polyclonal antibodies directed against purified exoVIII were also prepared. Cell-free extracts prepared from strains containing a wide range of chromosomal- or plasmid-encoded point, insertion, and deletion mutations which result in expression of exoVIII were examined by Western blot (immunoblot) analysis. This analysis showed that two point sbcA mutations (sbcA5 and sbcA23) and the sbc insertion mutations led to the synthesis of the 140-kilodalton (kDa) polypeptide of wild-type exoVIII. Plasmid-encoded partial deletion mutations of recE reduced the size of the cross-reacting protein(s) in direct proportion to the size of the deletion, even though exonuclease activity was still present. The analysis suggests that 39 kDa of the 140-kDa exoVIII subunit is all that is essential for exonuclease activity. One of the truncated but functional exonucleases (the pRAC3 exonuclease) has been purified and confirmed to be a 41-kDa polypeptide. The first 18 amino acids from the N terminus of the 41-kDa pRAC3 exonuclease were sequenced and found to correspond to one of the translational start signals predicted from the nucleotide sequence of radC (Chu et al., in preparation). PMID:3056915

  5. The murine Cd48 gene: allelic polymorphism in the IgV-like region.

    PubMed

    Cabrero, J G; Freeman, G J; Reiser, H

    1998-12-01

    The murine CD48 molecule is a member of the immunoglobulin superfamily which regulates the activation of T lymphocytes. Prior cloning experiments using mRNA from two different mouse strains had yielded discrepant sequences within the IgV-like domain of murine CD48. To resolve this issue, we have directly sequenced genomic DNA of 10 laboratory strains and two inbred strains of wild origin. The results of our analysis reveal an allelic polymorphism within the IgV-like domain of murine CD48.

  6. Phylogenomics of Phrynosomatid Lizards: Conflicting Signals from Sequence Capture versus Restriction Site Associated DNA Sequencing

    PubMed Central

    Leaché, Adam D.; Chavez, Andreas S.; Jones, Leonard N.; Grummer, Jared A.; Gottscho, Andrew D.; Linkem, Charles W.

    2015-01-01

    Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both “recent” and “deep” timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus. PMID:25663487

  7. Multiple bidirectional initiations and terminations of transcription in the Marek's disease virus long repeat regions.

    PubMed Central

    Chen, X B; Velicer, L F

    1991-01-01

    Marek's disease is an oncogenic disease of chickens caused by a herpesvirus, Marek's disease virus (MDV). Serial in vitro passage of pathogenic MDV results in amplification of a 132-bp direct repeat in the MDV genome's TRL and IRL repeat regions and loss of tumorigenicity. This led to the hypothesis that upon such expansion, one or more tumor-inducing genes fail to be expressed. In this report a group of cDNAs mapping in the expanded regions were isolated from a pathogenic MDV strain in which the 132-bp direct repeat number was found to range between one and seven. Partial cDNA sequencing and S1 nuclease protection analysis revealed that the corresponding transcripts are either initiated or terminated within or near the expanded regions at multiple sites in both rightward and leftward directions. Furthermore, each 132-bp repeat contains one TATA box and two polyadenylation consensus sequences in each direction. These RNAs contain a partial copy or one or more full copies of the 132-bp direct repeat at either their 5' or 3' end. Northern (RNA) blot analysis showed that the majority of transcripts are 1.8 kb in size, while the minor species range in size from 0.67 to 3.1 kb. Together, these data raise the possibility that the 132-bp direct repeat, and indirectly its copy number, may be involved in the regulation of transcriptional initiation and termination and therefore in the generation of four groups of transcripts from the TRL and IRL, although this remains to be demonstrated. PMID:1850022

  8. Analysis of the promoter of the cudA gene reveals novel mechanisms of Dictyostelium cell type differentiation.

    PubMed

    Fukuzawa, M; Williams, J G

    2000-06-01

    The cudA gene encodes a nuclear protein that is essential for normal multicellular development. At the slug stage cudA is expressed in the prespore cells and in a sub-region of the prestalk zone. We show that cap site distal promoter sequences direct cudA expression in prespore cells, while proximal sequences direct expression in the prestalk sub-region. The promoter domain that directs prespore-specific transcription consists of a positively acting region, that has the potential to direct expression in all cells within the slug, and a negatively acting region that prevents expression in the prestalk cells. Dd-STATa is the STAT protein that regulates commitment to stalk cell gene expression, where it is known to function as a transcriptional repressor. We show that Dd-STATa binds in vitro to the positively acting part of the prespore domain of the cudA promoter. However, Dd-STATa cannot be utilised for this purpose in vivo, because analysis of a Dd-STATa null mutant strain shows that Dd-STATa is not necessary for cudA transcription in prespore cells. In contrast, the part of the cudA promoter that directs prestalk-specific expression contains a binding site for Dd-STATa that is essential for its biological activity. Dd-STATa appears therefore to serve as a direct activator of cudA transcription in prestalk cells, while a protein with a DNA binding specificity highly related to that of Dd-STATa is utilised to activate cudA transcription in prespore cells.

  9. The Shine-Dalgarno sequence of riboswitch-regulated single mRNAs shows ligand-dependent accessibility bursts

    NASA Astrophysics Data System (ADS)

    Rinaldi, Arlie J.; Lund, Paul E.; Blanco, Mario R.; Walter, Nils G.

    2016-01-01

    In response to intracellular signals in Gram-negative bacteria, translational riboswitches--commonly embedded in messenger RNAs (mRNAs)--regulate gene expression through inhibition of translation initiation. It is generally thought that this regulation originates from occlusion of the Shine-Dalgarno (SD) sequence upon ligand binding; however, little direct evidence exists. Here we develop Single Molecule Kinetic Analysis of RNA Transient Structure (SiM-KARTS) to investigate the ligand-dependent accessibility of the SD sequence of an mRNA hosting the 7-aminomethyl-7-deazaguanine (preQ1)-sensing riboswitch. Spike train analysis reveals that individual mRNA molecules alternate between two conformational states, distinguished by 'bursts' of probe binding associated with increased SD sequence accessibility. Addition of preQ1 decreases the lifetime of the SD's high-accessibility (bursting) state and prolongs the time between bursts. In addition, ligand-jump experiments reveal imperfect riboswitching of single mRNA molecules. Such complex ligand sensing by individual mRNA molecules rationalizes the nuanced ligand response observed during bulk mRNA translation.

  10. High compression image and image sequence coding

    NASA Technical Reports Server (NTRS)

    Kunt, Murat

    1989-01-01

    The digital representation of an image requires a very large number of bits. This number is even larger for an image sequence. The goal of image coding is to reduce this number, as much as possible, and reconstruct a faithful duplicate of the original picture or image sequence. Early efforts in image coding, solely guided by information theory, led to a plethora of methods. The compression ratio reached a plateau around 10:1 a couple of years ago. Recent progress in the study of the brain mechanism of vision and scene analysis has opened new vistas in picture coding. Directional sensitivity of the neurones in the visual pathway combined with the separate processing of contours and textures has led to a new class of coding methods capable of achieving compression ratios as high as 100:1 for images and around 300:1 for image sequences. Recent progress on some of the main avenues of object-based methods is presented. These second generation techniques make use of contour-texture modeling, new results in neurophysiology and psychophysics and scene analysis.

  11. Identification of a novel circular DNA virus in pig feces

    USDA-ARS's Scientific Manuscript database

    Metagenomic analysis of fecal samples collected from a swine with diarrhea detected sequences encoding a replicase (Rep) protein typically found in small circular Rep-encoding ssDNA (CRESS-DNA) viruses. The complete 3,062 nucleotide genome was generated and found to encode two bi-directionally trans...

  12. A new polymorphic and multicopy MHC gene family related to nonmammalian class I

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leelayuwat, C.; Degli-Esposti, M.A.; Abraham, L.J.

    1994-12-31

    The authors have used genomic analysis to characterize a region of the central major histocompatibility complex (MHC) spanning approximately 300 kilobases (kb) between TNF and HLA-B. This region has been suggested to carry genetic factors relevant to the development of autoimmune diseases such as myasthenia gravis (MG) and insulin dependent diabetes mellitus (IDDM). Genomic sequence was analyzed for coding potential, using two neural network programs, GRAIL and GeneParser. A genomic probe, JAB, containing putative coding sequences (PERB11) located 60 kb centromeric of HLA-B, was used for northern analysis of human tissues. Multiple transcripts were detected. Southern analysis of genomic DNA and overlapping YAC clones, covering the region from BAT1 to HLA-F, indicated that there are at least five copies of PERB11, four of which are located within this region of the MHC. The partial cDNA sequence of PERB11 was obtained from poly-A RNA derived from skeletal muscle. The putative amino acid sequence of PERB11 shares approximately 30% identity to MHC class I molecules from various species, including reptiles, chickens, and frogs, as well as to other MHC class I-like molecules, such as the IgG FcR of the mouse and rat and the human Zn-α2-glycoprotein. From direct comparison of amino acid sequences, it is concluded that PERB11 is a distinct molecule more closely related to nonmammalian than known mammalian MHC class I molecules. Genomic sequence analysis of PERB11 from five MHC ancestral haplotypes (AH) indicated that the gene is polymorphic at both DNA and protein level. The results suggest that the authors have identified a novel polymorphic gene family with multiple copies within the MHC. 48 refs., 10 figs., 2 tabs.

  13. Whole genome sequencing reveals mycobacterial microevolution among concurrent isolates from sputum and blood in HIV infected TB patients.

    PubMed

    Ssengooba, Willy; de Jong, Bouke C; Joloba, Moses L; Cobelens, Frank G; Meehan, Conor J

    2016-08-05

    In the context of advanced immunosuppression, M. tuberculosis is known to cause detectable mycobacteremia. However, little is known about the intra-patient mycobacterial microevolution and the direction of seeding between the sputum and blood compartments. From a diagnostic study of HIV-infected TB patients, 51 pairs of concurrent blood and sputum M. tuberculosis isolates from the same patient were available. In a previous analysis, we identified a subset with genotypic concordance, based on spoligotyping and 24 locus MIRU-VNTR. These paired isolates with identical genotypes were analyzed by whole genome sequencing and phylogenetic analysis. Of the 25 concordant pairs (49 % of the 51 paired isolates), 15 (60 %) remained viable for extraction of high quality DNA for whole genome sequencing. Two patient pairs were excluded due to poor quality sequence reads. The median CD4 cell count was 32 (IQR; 16-101)/mm(3) and ten (77 %) patients were on ART. No drug resistance mutations were identified in any of the sequences analyzed. Three (23.1 %) of 13 patients had SNPs separating paired isolates from blood and sputum compartments, indicating evidence of microevolution. Using a phylogenetic approach to identify the ancestral compartment, in two (15 %) patients the blood isolate was ancestral to the sputum isolate, in one (8 %) it was the opposite, and ten (77 %) of the pairs were identical. Among HIV-infected patients with poor cellular immunity, infection with multiple strains of M. tuberculosis was found in half of the patients. In those patients with identical strains, whole genome sequencing indicated that M. tuberculosis intra-patient microevolution does occur in a few patients, yet did not reveal a consistent direction of spread between sputum and blood. This suggests that these compartments are highly connected and potentially seed each other repeatedly.

  14. Variants of the D5 dopamine receptor gene found in patients with schizophrenia: Identification of a nonsense mutation and multiple missense changes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sobell, J.L.; Lind, T.J.; Sommer, S.S.

    To determine whether mutations in the D5 dopamine receptor (D5DR) gene are associated with schizophrenia, the gene was examined in 78 unrelated schizophrenic individuals. After amplification by the polymerase chain reaction, products were examined by dideoxy fingerprinting (ddF), a highly sensitive screening method related to single strand conformational polymorphism analysis. All samples with unusual ddF patterns were sequenced to precisely identify the sequence change. In the 156 D5DR alleles examined, nine sequence changes were identified. Four of the nine did not affect protein structure; of these, three were silent changes and one was a transition in the 3′ untranslated region. The remaining five sequence changes result in protein alterations: of these, one is a missense change in a non-conserved amino acid, 3 are missense changes in amino acids that are conserved in some dopamine D5 receptors and the last is a nonsense mutation. To investigate whether the nonsense mutation was associated with schizophrenia, 400 additional schizophrenic cases of western European descent and 1914 ethnically-similar controls were screened for the change. One additional schizophrenic carrier was identified and verified by direct genomic sequencing (allele frequency: .0013), but eight carriers also were found and confirmed among the non-schizophrenics (allele frequency: .0021) (p>.25). The gene was re-examined in all newly identified carriers of the nonsense mutation by direct sequencing and/or ddF in search of additional mutations. None were identified. Family studies also were conducted to investigate possible cosegregation of the mutation with other neuropsychiatric diseases, but this was not demonstrated. Thus, the mutation does not appear to be associated with an increased risk of schizophrenia nor does an initial analysis suggest cosegregation with other neuropsychiatric disorders or symptom complexes.
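
    The two allele frequencies quoted in this abstract can be checked with simple arithmetic. The sketch below is not the authors' calculation; it assumes that every carrier is heterozygous (one copy of the nonsense allele) and that each screened individual contributes two alleles. The function name is hypothetical. Under those assumptions, 1 carrier in 400 cases and 8 carriers in 1914 controls give values that agree with the reported frequencies to rounding.

```python
def carrier_allele_frequency(carriers: int, individuals: int) -> float:
    """Allele frequency, assuming each carrier holds exactly one copy of the
    variant (heterozygous) and each individual is diploid (two alleles)."""
    if individuals <= 0:
        raise ValueError("individuals must be positive")
    if not 0 <= carriers <= individuals:
        raise ValueError("carriers must be between 0 and individuals")
    return carriers / (2 * individuals)


# Counts taken from the abstract: 400 additional cases, 1914 controls.
cases = carrier_allele_frequency(carriers=1, individuals=400)
controls = carrier_allele_frequency(carriers=8, individuals=1914)

print("cases:", cases)
print("controls:", controls)
```

    The sketch does not reproduce the significance test (p>.25); the abstract does not say which test was used.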

  15. A novel ATTR L32V mutation causes familial amyloid polyneuropathy in a Bolivian family.

    PubMed

    Martínez-Ulloa, Pedro L; Vallejo, Manuela; Corral, Iñigo; García-Barragán, Nuria; Alcazar, Alberto; Martínez-Alonso, Emma; Martínez-Poles, Javier; Pian, Hector; Jiménez-Escrig, Adriano

    2017-09-01

    We report a new transthyretin (ATTR) gene c.272C>G mutation and variant protein, p.Leu32Val, in a kindred of Bolivian origin with a rapid progressive peripheral neuropathy and cardiomyopathy. Three individuals from a kindred with peripheral nerve and cardiac amyloidosis were examined. Analysis of the TTR gene was performed by Sanger direct sequencing. Neuropathologic examination was obtained on the index patient with mass spectrometry study of the ATTR deposition. Direct DNA sequence analysis of exons 2, 3, and 4 of the TTR gene demonstrated a c.272 C>G mutation in exon 2 (p.L32V). Sural nerve biopsy revealed massive amyloid deposition in the perineurium, endoneurium and vasa nervorum. Mass spectrometric analyses of ATTR immunoprecipitated from nerve biopsy showed the presence of both wild-type and variant proteins. The observed mass results for the wild-type and variant proteins were consistent with the predicted values calculated from the genetic analysis data. The ATTR L32V is associated with a severe course. This has implications for treatment of affected individuals and counseling of family members. © 2017 Peripheral Nerve Society.

  16. Search-based optimization

    NASA Technical Reports Server (NTRS)

    Wheeler, Ward C.

    2003-01-01

    The problem of determining the minimum cost hypothetical ancestral sequences for a given cladogram is known to be NP-complete (Wang and Jiang, 1994). Traditionally, point estimations of hypothetical ancestral sequences have been used to gain heuristic, upper bounds on cladogram cost. These include procedures with such diverse approaches as non-additive optimization of multiple sequence alignment, direct optimization (Wheeler, 1996), and fixed-state character optimization (Wheeler, 1999). A method is proposed here which, by extending fixed-state character optimization, replaces the estimation process with a search. This form of optimization examines a diversity of potential state solutions for cost-efficient hypothetical ancestral sequences and can result in greatly more parsimonious cladograms. Additionally, such an approach can be applied to other NP-complete phylogenetic optimization problems such as genomic break-point analysis. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.

  17. Converting science to policy through stakeholder involvement: an analysis of the European Marine Strategy Directive.

    PubMed

    Fletcher, Stephen

    2007-12-01

    The Marine Strategy Directive requires European Union Member States to develop science-based marine strategies with the involvement of stakeholders, in order that Europe's marine environment reaches 'good environmental status' by 2021. The scientific requirements of marine strategies are clearly defined within the Directive; however, the requirements related to stakeholder involvement are not. This paper presents a critical analysis of the provisions for stakeholder involvement within the Marine Strategy Directive. In particular, the paper is focused upon the definition of stakeholder, the sequencing of involvement, and the form and purpose of involvement. The critique is set within an evaluative framework that considers policy-making to be a social process, rather than a purely scientific one. It is concluded that the Marine Strategy Directive lacks coherency with respect to stakeholder involvement, which may perpetuate the traditional tension between marine science and policy. This in turn may compromise the ability of the Directive to protect Europe's marine environment.

  18. A generic assay for whole-genome amplification and deep sequencing of enterovirus A71

    PubMed Central

    Tan, Le Van; Tuyen, Nguyen Thi Kim; Thanh, Tran Tan; Ngan, Tran Thuy; Van, Hoang Minh Tu; Sabanathan, Saraswathy; Van, Tran Thi My; Thanh, Le Thi My; Nguyet, Lam Anh; Geoghegan, Jemma L.; Ong, Kien Chai; Perera, David; Hang, Vu Thi Ty; Ny, Nguyen Thi Han; Anh, Nguyen To; Ha, Do Quang; Qui, Phan Tu; Viet, Do Chau; Tuan, Ha Manh; Wong, Kum Thong; Holmes, Edward C.; Chau, Nguyen Van Vinh; Thwaites, Guy; van Doorn, H. Rogier

    2015-01-01

    Enterovirus A71 (EV-A71) has emerged as the most important cause of large outbreaks of severe and sometimes fatal hand, foot and mouth disease (HFMD) across the Asia-Pacific region. EV-A71 outbreaks have been associated with (sub)genogroup switches, sometimes accompanied by recombination events. Understanding EV-A71 population dynamics is therefore essential for understanding this emerging infection, and may provide pivotal information for vaccine development. Despite the public health burden of EV-A71, relatively few EV-A71 complete-genome sequences are available for analysis and from limited geographical localities. The availability of an efficient procedure for whole-genome sequencing would stimulate effort to generate more viral sequence data. Herein, we report for the first time the development of a next-generation sequencing based protocol for whole-genome sequencing of EV-A71 directly from clinical specimens. We were able to sequence viruses of subgenogroup C4 and B5, while RNA from culture materials of diverse EV-A71 subgenogroups belonging to both genogroup B and C was successfully amplified. The nature of intra-host genetic diversity was explored in 22 clinical samples, revealing 107 positions carrying minor variants (ranging from 0 to 15 variants per sample). Our analysis of EV-A71 strains sampled in 2013 showed that they all belonged to subgenogroup B5, representing the first report of this subgenogroup in Vietnam. In conclusion, we have successfully developed a high-throughput next-generation sequencing-based assay for whole-genome sequencing of EV-A71 from clinical samples. PMID:25704598

  19. CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects

    PubMed Central

    Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

    2014-01-01

    CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB PMID:25281234

  20. CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects.

    PubMed

    Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

    2014-01-01

    CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB. © The Author(s) 2014. Published by Oxford University Press.

  1. Metagenomic insight into methanogenic reactors promoting direct interspecies electron transfer via granular activated carbon.

    PubMed

    Park, Jeong-Hoon; Park, Jong-Hun; Je Seong, Hoon; Sul, Woo Jun; Jin, Kang-Hyun; Park, Hee-Deung

    2018-07-01

    To provide insight into direct interspecies electron transfer via granular activated carbon (GAC), the effect of GAC supplementation on anaerobic digestion was evaluated. Compared to control samples, the GAC supplementation increased the total amount of methane production and its production rate by 31% and 72%, respectively. 16S rDNA sequencing analysis revealed a shift in the archaeal community composition; the Methanosarcina proportion decreased 17%, while the Methanosaeta proportion increased 5.6%. Metagenomic analyses based on shotgun sequencing demonstrated that the abundance of pilA and omcS genes belonging to Geobacter species decreased 69.4% and 29.4%, respectively. Furthermore, the analyses suggested a carbon dioxide reduction pathway rather than an acetate decarboxylation pathway for methane formation. Taken together, these results suggest that GAC improved methane production performance by shifting the microbial community and altering functional genes associated with direct interspecies electron transfer via conductive materials. Copyright © 2018 Elsevier Ltd. All rights reserved.

  2. Structural Analysis of Single-Point Mutations Given an RNA Sequence: A Case Study with RNAMute

    NASA Astrophysics Data System (ADS)

    Churkin, Alexander; Barash, Danny

    2006-12-01

    We introduce here for the first time the RNAMute package, a pattern-recognition-based utility to perform mutational analysis and detect vulnerable spots within an RNA sequence that affect structure. Mutations in these spots may lead to a structural change that directly relates to a change in functionality. Previously, the concept was tried on RNA genetic control elements called "riboswitches" and other known RNA switches, without an organized utility that analyzes all single-point mutations and can be further expanded. The RNAMute package allows a comprehensive categorization, given an RNA sequence that has functional relevance, by exploring the patterns of all single-point mutants. For illustration, we apply the RNAMute package on an RNA transcript for which individual point mutations were shown experimentally to inactivate spectinomycin resistance in Escherichia coli. Functional analysis of mutations on this case study was performed experimentally by creating a library of point mutations using PCR and screening to locate those mutations. With the availability of RNAMute, preanalysis can be performed computationally before conducting an experiment.
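
    The first step described here, exploring all single-point mutants of a given RNA sequence, can be illustrated with a short sketch. This is not RNAMute's code or interface; the function name and the mutation labels (wild-type base, 1-based position, mutant base) are assumptions made for illustration. It only enumerates the mutants and performs no structure prediction or pattern recognition.

```python
from typing import Iterator, Tuple

RNA_BASES = "ACGU"


def single_point_mutants(seq: str) -> Iterator[Tuple[str, str]]:
    """Yield (label, mutant_sequence) for every single-base substitution.

    A sequence of length n yields 3 * n mutants, because each position can
    be replaced by any of the three other bases.
    """
    seq = seq.upper()
    for i, wild in enumerate(seq):
        if wild not in RNA_BASES:
            raise ValueError(f"unexpected character {wild!r} at position {i + 1}")
        for base in RNA_BASES:
            if base != wild:
                # Label such as "G1A": wild-type base, 1-based position, mutant base.
                yield f"{wild}{i + 1}{base}", seq[:i] + base + seq[i + 1:]


# Hypothetical six-nucleotide input, used only to show the enumeration.
mutants = list(single_point_mutants("GGAUCC"))
for label, mutant in mutants[:3]:
    print(label, mutant)
```

    In a real preanalysis each mutant would then be passed to a secondary-structure predictor and compared with the wild-type fold; that step is left out because the abstract does not specify it.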

  3. De novo transcriptome sequencing in Frankliniella occidentalis to identify genes involved in plant virus transmission and insecticide resistance.

    PubMed

    Zhang, Zhijun; Zhang, Pengjun; Li, Weidi; Zhang, Jinming; Huang, Fang; Yang, Jian; Bei, Yawei; Lu, Yaobin

    2013-05-01

    The western flower thrips (WFT), Frankliniella occidentalis, a world-wide invasive insect, causes agricultural damage by directly feeding and by indirectly vectoring Tospoviruses, such as Tomato spotted wilt virus (TSWV). We characterized the transcriptome of WFT and analyzed global gene expression of the WFT response to TSWV infection using the Illumina sequencing platform. We compiled 59,932 unigenes, and identified 36,339 unigenes by similarity analysis against public databases, most of which were annotated using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. Within these annotated transcripts, we collected 278 sequences related to insecticide resistance. GO and KEGG analysis of differentially expressed genes between TSWV-infected and non-infected WFT populations revealed that TSWV can regulate cellular processes and immune response, which might lead to low virus titers in thrips cells and no detrimental effects on F. occidentalis. This data set not only enriches the genomic resources for WFT, but also benefits research into its molecular genetics and functional genomics. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. Benzofurazane as a new redox label for electrochemical detection of DNA: towards multipotential redox coding of DNA bases.

    PubMed

    Balintová, Jana; Plucnara, Medard; Vidláková, Pavlína; Pohl, Radek; Havran, Luděk; Fojta, Miroslav; Hocek, Michal

    2013-09-16

    Benzofurazane has been attached to nucleosides and dNTPs, either directly or through an acetylene linker, as a new redox label for electrochemical analysis of nucleotide sequences. Primer extension incorporation of the benzofurazane-modified dNTPs by polymerases has been developed for the construction of labeled oligonucleotide probes. In combination with nitrophenyl and aminophenyl labels, we have successfully developed a three-potential coding of DNA bases and have explored the relevant electrochemical potentials. The combination of benzofurazane and nitrophenyl reducible labels has proved to be excellent for ratiometric analysis of nucleotide sequences and is suitable for bioanalytical applications. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. WHAM!: a web-based visualization suite for user-defined analysis of metagenomic shotgun sequencing data.

    PubMed

    Devlin, Joseph C; Battaglia, Thomas; Blaser, Martin J; Ruggles, Kelly V

    2018-06-25

    Exploration of large data sets, such as shotgun metagenomic sequence or expression data, by biomedical experts and medical professionals remains a major bottleneck in the scientific discovery process. Although tools for this purpose exist for 16S ribosomal RNA sequencing analysis, there is a growing but still insufficient number of user-friendly interactive visualization workflows for easy data exploration and figure generation. The development of such platforms is necessary to accelerate and streamline microbiome laboratory research. We developed the Workflow Hub for Automated Metagenomic Exploration (WHAM!) as a web-based interactive tool capable of user-directed data visualization and statistical analysis of annotated shotgun metagenomic and metatranscriptomic data sets. WHAM! includes exploratory and hypothesis-based gene and taxa search modules for visualizing differences in microbial taxa and gene family expression across experimental groups, and for creating publication quality figures without the need for command line interface or in-house bioinformatics. WHAM! is an interactive and customizable tool for downstream metagenomic and metatranscriptomic analysis providing a user-friendly interface allowing for easy data exploration by microbiome and ecological experts to facilitate discovery in multi-dimensional and large-scale data sets.

  6. High Resolution Melt analysis for mutation screening in PKD1 and PKD2

    PubMed Central

    2011-01-01

    Background: Autosomal dominant polycystic kidney disease (ADPKD) is the most common hereditary kidney disorder. It is characterized by focal development and progressive enlargement of renal cysts leading to end-stage renal disease. PKD1 and PKD2 have been implicated in ADPKD pathogenesis but genetic features and the size of PKD1 make genetic diagnosis tedious. Methods: We aim to prove that high resolution melt analysis (HRM), a recent technique in molecular biology, can facilitate molecular diagnosis of ADPKD. We screened for mutations in PKD1 and PKD2 with HRM in 37 unrelated patients with ADPKD. Results: We identified 440 sequence variants in the 37 patients. One hundred and thirty eight were different. We found 28 pathogenic mutations (25 in PKD1 and 3 in PKD2) within 28 different patients, which is a diagnosis rate of 75%, consistent with the mean diagnosis rate reported in the literature for direct sequencing. We describe 52 new sequence variants in PKD1 and two in PKD2. Conclusion: HRM analysis is a sensitive and specific method for molecular diagnosis of ADPKD. HRM analysis is also low-cost and time-saving. Thus, this method is efficient and might be used for mutation pre-screening in ADPKD genes. PMID:22008521

  7. The Essential Genome of Escherichia coli K-12

    PubMed Central

    2018-01-01

    Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657

  8. Direct Detection and Identification of Prosthetic Joint Infection Pathogens in Synovial Fluid by Metagenomic Shotgun Sequencing.

    PubMed

    Ivy, Morgan I; Thoendel, Matthew J; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Hanssen, Arlen D; Abdel, Matthew P; Chia, Nicholas; Yao, Janet Z; Tande, Aaron J; Mandrekar, Jayawant N; Patel, Robin

    2018-05-30

    Background: Metagenomic shotgun sequencing has the potential to transform how serious infections are diagnosed by offering universal, culture-free pathogen detection. This may be especially advantageous for microbial diagnosis of prosthetic joint infection (PJI) by synovial fluid analysis, since synovial fluid cultures are not universally positive, and synovial fluid is easily obtained pre-operatively. We applied a metagenomics-based approach to synovial fluid in an attempt to detect microorganisms in 168 failed total knee arthroplasties. Results: Genus- and species-level analysis of metagenomic sequencing yielded the known pathogen in 74 (90%) and 68 (83%) of the 82 culture-positive PJIs analyzed, respectively, with testing of two (2%) and three (4%) samples, respectively, yielding additional pathogens not detected by culture. For the 25 culture-negative PJIs tested, genus- and species-level analysis yielded 19 (76%) and 21 (84%) samples with insignificant findings, respectively, and 6 (24%) and 4 (16%) with potential pathogens detected, respectively. Genus- and species-level analysis of the 60 culture-negative aseptic failure cases yielded 53 (88.3%) and 56 (93.3%) cases with insignificant findings, and 7 (11.7%) and 4 (6.7%) with potential clinically-significant organisms detected, respectively. There was one case of aseptic failure with synovial fluid culture growth; metagenomic analysis showed insignificant findings, suggesting possible synovial fluid culture contamination. Conclusion: Metagenomic shotgun sequencing can detect pathogens involved in PJI when applied to synovial fluid and may be particularly useful for culture-negative cases. Copyright © 2018 American Society for Microbiology.

  9. Clustered regularly interspaced short palindromic repeats (CRISPRs) analysis of members of the Mycobacterium tuberculosis complex.

    PubMed

    Botelho, Ana; Canto, Ana; Leão, Célia; Cunha, Mónica V

    2015-01-01

    Typical CRISPR (clustered, regularly interspaced, short palindromic repeat) regions are constituted by short direct repeats (DRs), interspersed with similarly sized non-repetitive spacers, derived from transmissible genetic elements, acquired when the cell is challenged with foreign DNA. The analysis of the structure, in number and nature, of CRISPR spacers is a valuable tool for molecular typing since these loci are polymorphic among strains, giving rise to characteristic signatures. The existence of CRISPR structures in the genome of the members of Mycobacterium tuberculosis complex (MTBC) enabled the development of a genotyping method, based on the analysis of the presence or absence of 43 oligonucleotide spacers separated by conserved DRs. This method, called spoligotyping, consists of PCR amplification of the DR chromosomal region and recognition, after hybridization, of the spacers that are present. The workflow underlying this methodology implies that the PCR products are brought onto a membrane containing synthetic oligonucleotides that have complementary sequences to the spacer sequences. Lack of hybridization of the PCR products to a specific oligonucleotide sequence indicates absence of the corresponding spacer sequence in the examined strain. Spoligotyping gained great notoriety as a robust identification and typing tool for members of MTBC, enabling multiple epidemiological studies on human and animal tuberculosis.
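
    The presence/absence readout described above maps naturally onto a 43-position binary signature. The sketch below is an illustration, not code from the cited work: the function names are hypothetical, and the two example strains are invented spacer sets rather than real spoligotypes. It builds the signature from the set of spacers that hybridized and lists the positions at which two signatures differ.

```python
from typing import Iterable, List

N_SPACERS = 43  # number of spacers interrogated, as stated in the abstract


def spoligotype_pattern(hybridized: Iterable[int]) -> str:
    """Return a 43-character string: '1' if the spacer hybridized, else '0'.

    Spacers are numbered 1..43; `hybridized` lists those giving a signal.
    """
    present = set(hybridized)
    bad = sorted(s for s in present if not 1 <= s <= N_SPACERS)
    if bad:
        raise ValueError(f"spacer numbers out of range: {bad}")
    return "".join("1" if s in present else "0" for s in range(1, N_SPACERS + 1))


def differing_spacers(a: str, b: str) -> List[int]:
    """1-based spacer positions at which two signatures disagree."""
    if len(a) != N_SPACERS or len(b) != N_SPACERS:
        raise ValueError("both signatures must have 43 positions")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Invented examples: each strain lacks a different subset of spacers.
all_spacers = set(range(1, N_SPACERS + 1))
strain_a = spoligotype_pattern(all_spacers - {33, 34, 35, 36})
strain_b = spoligotype_pattern(all_spacers - {3, 9, 16})

print(strain_a)
print(strain_b)
print(differing_spacers(strain_a, strain_b))
```

    Keeping the signature as a plain string makes two isolates directly comparable; identical strings correspond to identical spoligotypes under this simplified model.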

  10. Genetic diversity and molecular evolution of Naga King Chili inferred from internal transcribed spacer sequence of nuclear ribosomal DNA.

    PubMed

    Kehie, Mechuselie; Kumaria, Suman; Devi, Khumuckcham Sangeeta; Tandon, Pramod

    2016-02-01

    Sequences of the Internal Transcribed Spacer (ITS1-5.8S-ITS2) of nuclear ribosomal DNAs were explored to study the genetic diversity and molecular evolution of Naga King Chili. Our study indicated the occurrence of nucleotide polymorphism and haplotypic diversity in the ITS regions. The present study demonstrated that the variability of ITS1 with respect to nucleotide diversity and sequence polymorphism exceeded that of ITS2. Sequence analysis of the 5.8S gene revealed a highly conserved region in all the accessions of Naga King Chili. However, a strongly informative phylogenetic marker for this species is the distinct 13 bp deletion in the 5.8S gene, which discriminated Naga King Chili from the rest of the Capsicum sp. Neutrality test results implied a neutral variation, and the population seems to be evolving at drift-mutation equilibrium and free from directed selection pressure. Furthermore, mismatch analysis showed a multimodal curve indicating a demographic equilibrium. Phylogenetic relationships revealed by Median Joining Network (MJN) analysis denoted a clear discrimination of Naga King Chili from its closest sister species (Capsicum chinense and Capsicum frutescens). The absence of a star-like network of haplotypes suggested an ancient population expansion of this chili.

  11. Site-directed mutagenesis in Petunia × hybrida protoplast system using direct delivery of purified recombinant Cas9 ribonucleoproteins.

    PubMed

    Subburaj, Saminathan; Chung, Sung Jin; Lee, Choongil; Ryu, Seuk-Min; Kim, Duk Hyoung; Kim, Jin-Soo; Bae, Sangsu; Lee, Geung-Joo

    2016-07-01

    Site-directed mutagenesis of nitrate reductase genes using direct delivery of purified Cas9 protein preassembled with guide RNA produces mutations efficiently in Petunia × hybrida protoplast system. The clustered, regularly interspaced, short palindromic repeat (CRISPR)-CRISPR associated endonuclease 9 (CRISPR/Cas9) system has been recently announced as a powerful molecular breeding tool for site-directed mutagenesis in higher plants. Here, we report a site-directed mutagenesis method targeting Petunia nitrate reductase (NR) gene locus. This method could create mutations efficiently using direct delivery of purified Cas9 protein and single guide RNA (sgRNA) into protoplast cells. After transient introduction of RNA-guided endonuclease (RGEN) ribonucleoproteins (RNPs) with different sgRNAs targeting NR genes, mutagenesis at the targeted loci was detected by T7E1 assay and confirmed by targeted deep sequencing. T7E1 assay showed that RGEN RNPs induced site-specific mutations at frequencies ranging from 2.4 to 21 % at four different sites (NR1, 2, 4 and 6) in the PhNR gene locus with average mutation efficiency of 14.9 ± 2.2 %. Targeted deep DNA sequencing revealed mutation rates of 5.3-17.8 % with average mutation rate of 11.5 ± 2 % at the same NR gene target sites in DNA fragments of analyzed protoplast transfectants. Further analysis from targeted deep sequencing showed that the average ratio of deletion to insertion produced collectively by the four NR-RGEN target sites (NR1, 2, 4, and 6) was about 63:37. Our results demonstrated that direct delivery of RGEN RNPs into protoplast cells of Petunia can be exploited as an efficient tool for site-directed mutagenesis of genes or genome editing in plant systems.

  12. Multimodal RNA-seq using single-strand, double-strand, and CircLigase-based capture yields a refined and extended description of the C. elegans transcriptome.

    PubMed

    Lamm, Ayelet T; Stadler, Michael R; Zhang, Huibin; Gent, Jonathan I; Fire, Andrew Z

    2011-02-01

    We have used a combination of three high-throughput RNA capture and sequencing methods to refine and augment the transcriptome map of a well-studied genetic model, Caenorhabditis elegans. The three methods include a standard (non-directional) library preparation protocol relying on cDNA priming and foldback that has been used in several previous studies for transcriptome characterization in this species, and two directional protocols, one involving direct capture of single-stranded RNA fragments and one involving circular-template PCR (CircLigase). We find that each RNA-seq approach shows specific limitations and biases, with the application of multiple methods providing a more complete map than was obtained from any single method. Of particular note in the analysis were substantial advantages of CircLigase-based and ssRNA-based capture for defining sequences and structures of the precise 5' ends (which were lost using the double-strand cDNA capture method). Of the three methods, ssRNA capture was most effective in defining sequences to the poly(A) junction. Using data sets from a spectrum of C. elegans strains and stages and the UCSC Genome Browser, we provide a series of tools, which facilitate rapid visualization and assignment of gene structures.

  13. A cost-effectiveness analysis of first-line induction and maintenance treatment sequences in patients with advanced nonsquamous non-small-cell lung cancer in France

    PubMed Central

    Taipale, Kaisa; Winfree, Katherine B; Boye, Mark; Basson, Mickael; Sleilaty, Ghassan; Eaton, James; Evans, Rachel; Chouaid, Christos

    2017-01-01

    Background: Comparative effectiveness and cost-effectiveness data for induction–maintenance (I–M) sequences for the treatment of patients with nonsquamous non-small-cell lung cancer (nsqNSCLC) are limited because of a lack of direct evidence. This analysis aimed to compare the cost-effectiveness of I–M pemetrexed with those of other I–M regimens used for the treatment of patients with advanced nsqNSCLC in the French health-care setting. Materials and methods: A previously developed global partitioned survival model was adapted to the France-only setting by restricting treatment sequences to include 12 I–M regimens most relevant to France, and incorporating French costs and resource-use data. Following a systematic literature review, network meta-analyses were performed to obtain hazard ratios for progression-free survival (PFS) and overall survival (OS) relative to gemcitabine + cisplatin (induction sequences) or best supportive care (BSC) (maintenance sequences). Modeled health-care benefits were expressed as life-years (LYs) and quality-adjusted LYs (QALYs) (estimated using French EuroQol five-dimension questionnaire tariffs). The study was conducted from the payer perspective (National Health Insurance). Cost- and benefit-model inputs were discounted at an annual rate of 4%. Results: Base-case results showed pemetrexed + cisplatin induction followed by (→) pemetrexed maintenance had the longest mean OS and PFS and highest LYs and QALYs. Costs ranged from €12,762 for paclitaxel + carboplatin → BSC to €35,617 for pemetrexed + cisplatin → pemetrexed (2015 values). Gemcitabine + cisplatin → BSC, pemetrexed + cisplatin → BSC, and pemetrexed + cisplatin → pemetrexed were associated with fully incremental cost-effectiveness ratios (ICERs) of €16,593, €80,656, and €102,179, respectively, per QALY gained versus paclitaxel + carboplatin → BSC. All other treatment sequences were either dominated (ie, another sequence had lower costs and better/equivalent outcomes) or extendedly dominated (ie, the comparator had a higher ICER than a more effective comparator) in the model. Sensitivity analyses showed the model to be relatively insensitive to plausible changes in the main assumptions, with none increasing or decreasing the ICER by more than ~€20,000 per QALY gained. Conclusion: In the absence of direct comparative trial evidence, this cost-effectiveness analysis indicated that of a large number of I–M sequences used for the treatment of patients with nsqNSCLC in France, pemetrexed + cisplatin → pemetrexed achieved the best clinical outcomes (0.28 incremental QALYs gained) versus paclitaxel + carboplatin → BSC. PMID:28860832
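
    The two calculations this abstract relies on, discounting and the incremental cost-effectiveness ratio, can be sketched as follows. This is a simplified illustration, not the authors' partitioned survival model: the function names are hypothetical, and the discounting convention (year 0 undiscounted, annual compounding) is an assumption. The example divides the cost difference between the cheapest and most expensive sequences by the 0.28 incremental QALYs quoted in the abstract; this simple pairwise ratio is not expected to match the reported fully incremental ICERs, which are computed step by step between successive non-dominated options from unrounded model outputs.

```python
def discount(value, year, rate=0.04):
    """Present value of a cost or benefit accrued `year` years from now.

    The 4% annual rate is the one stated in the abstract; treating
    year 0 as undiscounted is an assumption of this sketch.
    """
    return value / (1.0 + rate) ** year


def icer(cost_new, cost_ref, qaly_gain):
    """Incremental cost per QALY gained for one option versus a reference."""
    if qaly_gain <= 0:
        raise ValueError("ICER is only meaningful for a positive QALY gain")
    return (cost_new - cost_ref) / qaly_gain


# Costs (2015 euros) and incremental QALYs as quoted in the abstract.
pairwise = icer(cost_new=35_617, cost_ref=12_762, qaly_gain=0.28)
print(f"Pairwise ICER: about {pairwise:,.0f} euros per QALY gained")
```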

  14. Conservation of tubulin-binding sequences in TRPV1 throughout evolution.

    PubMed

    Sardar, Puspendu; Kumar, Abhishek; Bhandari, Anita; Goswami, Chandan

    2012-01-01

    Transient Receptor Potential Vanilloid sub type 1 (TRPV1), commonly known as the capsaicin receptor, can detect multiple stimuli, including noxious compounds, low pH, temperature, and electromagnetic waves at different ranges. In addition, this receptor is involved in multiple physiological and sensory processes. Therefore, the functions of TRPV1 directly influence adaptation and further evolution. The availability of various eukaryotic genomic sequences in the public domain facilitates the study of the molecular evolution of the TRPV1 protein and the conservation of certain domains, motifs and interacting regions that are functionally important. Using statistical and bioinformatics tools, our analysis reveals that TRPV1 evolved ∼420 million years ago (MYA). Our analysis reveals that specific regions, domains and motifs of TRPV1 have undergone different selection pressures and thus have different levels of conservation. We found that, among all of these, the TRP box is the most conserved and thus has functional significance. Our results also indicate that the tubulin binding sequences (TBS) have evolutionary significance, as these sequence stretches are more conserved than many other essential regions of TRPV1. The overall distribution of positively charged residues within the TBS motifs is conserved throughout evolution. In silico analysis reveals that the TBS-1 and TBS-2 of TRPV1 can form helical structures and may play an important role in TRPV1 function. Our analysis identifies the regions of TRPV1 that are important for the structure-function relationship. This analysis indicates that tubulin binding sequence-1 (TBS-1) near the TRP box forms a potential helix and that the tubulin interactions with TRPV1 via TBS-1 have evolutionary significance. This interaction may be required for proper channel function and regulation and may also have significance in the context of Taxol®-induced neuropathy.

  15. FunGene: the functional gene pipeline and repository.

    PubMed

    Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R

    2013-01-01

    Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.

  16. A comparative molecular analysis of water-filled limestone sinkholes in north-eastern Mexico.

    PubMed

    Sahl, Jason W; Gary, Marcus O; Harris, J Kirk; Spear, John R

    2011-01-01

    Sistema Zacatón in north-eastern Mexico is host to several deep, water-filled, anoxic, karstic sinkholes (cenotes). These cenotes were explored, mapped, and geochemically and microbiologically sampled by the autonomous underwater vehicle deep phreatic thermal explorer (DEPTHX). The community structure of the filterable fraction of the water column and extensive microbial mats that coat the cenote walls was investigated by comparative analysis of small-subunit (SSU) 16S rRNA gene sequences. Full-length Sanger gene sequence analysis revealed novel microbial diversity that included three putative bacterial candidate phyla and three additional groups that showed high intra-clade distance with poorly characterized bacterial candidate phyla. Limited functional gene sequence analysis in these anoxic environments identified genes associated with methanogenesis, sulfate reduction and anaerobic ammonium oxidation. A directed, barcoded amplicon, multiplex pyrosequencing approach was employed to compare ∼100,000 bacterial SSU gene sequences from water column and wall microbial mat samples from five cenotes in Sistema Zacatón. A new, high-resolution sequence distribution profile (SDP) method identified changes in specific phylogenetic types (phylotypes) in microbial mats at varied depths; Mantel tests showed a correlation of the genetic distances between mat communities in two cenotes and the geographic location of each cenote. Community structure profiles from the water column of three neighbouring cenotes showed distinct variation; statistically significant differences in the concentration of geochemical constituents suggest that the variation observed in microbial communities between neighbouring cenotes is due to geochemical variation. © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.

  17. Draft genome sequence and annotation of Lactobacillus acetotolerans BM-LA14527, a beer-spoilage bacteria.

    PubMed

    Liu, Junyan; Li, Lin; Peters, Brian M; Li, Bing; Deng, Yang; Xu, Zhenbo; Shirtliff, Mark E

    2016-09-01

    Lactobacillus acetotolerans is a hard-to-culture beer-spoilage bacterium capable of entering into the viable putative nonculturable (VPNC) state. As part of an initial strategy to investigate the phenotypic behavior of L. acetotolerans, draft genome sequencing was performed. Results demonstrated a total of 1824 predicted annotated genes, with several potential VPNC- and beer-spoilage-associated genes identified. Importantly, this is the first genome sequence of L. acetotolerans as beer-spoilage bacteria and it may aid in further analysis of L. acetotolerans and other beer-spoilage bacteria, with direct implications for food safety control in the beer brewing industry. © FEMS 2016. All rights reserved.

  18. Complete genome sequence and architecture of crucian carp Carassius auratus herpesvirus (CaHV).

    PubMed

    Zeng, Xiao-Tao; Chen, Zhong-Yuan; Deng, Yuan-Sheng; Gui, Jian-Fang; Zhang, Qi-Ya

    2016-12-01

    Crucian carp Carassius auratus herpesvirus (CaHV) was isolated from diseased crucian carp with acute gill hemorrhages and high mortality. The CaHV genome was sequenced and analyzed. The data showed that it consists of 275,348 bp and contains 150 predicted ORFs. The architecture of the CaHV genome differs from those of four cyprinid herpesviruses (CyHV1, CyHV2, SY-C1, CyHV3), with insertions, deletions and the absence of a terminal direct repeat. Phylogenetic analysis of the DNA polymerase sequences of 17 strains of Herpesvirales members, and the concatenated 12 core ORFs from 10 strains of alloherpesviruses showed that CaHV clustered together with members of the genus Cyprinivirus, family Alloherpesviridae.

  19. Direct profiling of environmental microbial populations by thermal dissociation analysis of native rRNAs hybridized to oligonucleotide microarrays

    NASA Technical Reports Server (NTRS)

    El Fantroussi, Said; Urakawa, Hidetoshi; Bernhard, Anne E.; Kelly, John J.; Noble, Peter A.; Smidt, H.; Yershov, G. M.; Stahl, David A.

    2003-01-01

    Oligonucleotide microarrays were used to profile directly extracted rRNA from environmental microbial populations without PCR amplification. In our initial inspection of two distinct estuarine study sites, the hybridization patterns were reproducible and varied between estuarine sediments of differing salinities. The determination of a thermal dissociation curve (i.e., melting profile) for each probe-target duplex provided information on hybridization specificity, which is essential for confirming adequate discrimination between target and nontarget sequences.

  20. Direct protein interaction underlies gene-for-gene specificity and coevolution of the flax resistance genes and flax rust avirulence genes

    PubMed Central

    Dodds, Peter N.; Lawrence, Gregory J.; Catanzariti, Ann-Maree; Teh, Trazel; Wang, Ching-I. A.; Ayliffe, Michael A.; Kobe, Bostjan; Ellis, Jeffrey G.

    2006-01-01

    Plant resistance proteins (R proteins) recognize corresponding pathogen avirulence (Avr) proteins either indirectly through detection of changes in their host protein targets or through direct R–Avr protein interaction. Although indirect recognition imposes selection against Avr effector function, pathogen effector molecules recognized through direct interaction may overcome resistance through sequence diversification rather than loss of function. Here we show that the flax rust fungus AvrL567 genes, whose products are recognized by the L5, L6, and L7 R proteins of flax, are highly diverse, with 12 sequence variants identified from six rust strains. Seven AvrL567 variants derived from Avr alleles induce necrotic responses when expressed in flax plants containing corresponding resistance genes (R genes), whereas five variants from avr alleles do not. Differences in recognition specificity between AvrL567 variants and evidence for diversifying selection acting on these genes suggest they have been involved in a gene-specific arms race with the corresponding flax R genes. Yeast two-hybrid assays indicate that recognition is based on direct R–Avr protein interaction and recapitulate the interaction specificity observed in planta. Biochemical analysis of Escherichia coli-produced AvrL567 proteins shows that variants that escape recognition nevertheless maintain a conserved structure and stability, suggesting that the amino acid sequence differences directly affect the R–Avr protein interaction. We suggest that direct recognition associated with high genetic diversity at corresponding R and Avr gene loci represents an alternative outcome of plant–pathogen coevolution to indirect recognition associated with simple balanced polymorphisms for functional and nonfunctional R and Avr genes. PMID:16731621

  1. Novel USH2A compound heterozygous mutations cause RP/USH2 in a Chinese family.

    PubMed

    Liu, Xiaowen; Tang, Zhaohui; Li, Chang; Yang, Kangjuan; Gan, Guanqi; Zhang, Zibo; Liu, Jingyu; Jiang, Fagang; Wang, Qing; Liu, Mugen

    2010-03-17

    To identify the disease-causing gene in a four-generation Chinese family affected with retinitis pigmentosa (RP). Linkage analysis was performed with a panel of microsatellite markers flanking the candidate genetic loci of RP. These loci included 38 known RP genes. The complete coding region and exon-intron boundaries of Usher syndrome 2A (USH2A) were sequenced with the proband DNA to screen for the disease-causing gene mutation. Restriction fragment length polymorphism (RFLP) analysis and direct DNA sequence analysis were done to demonstrate co-segregation of the USH2A mutations with the family disease. One hundred normal controls were also screened for the mutations. The disease-causing gene in this Chinese family was linked to the USH2A locus on chromosome 1q41. Direct DNA sequence analysis of USH2A identified two novel mutations in the patients: one missense mutation, p.G1734R, in exon 26 and a splice site mutation, IVS32+1G>A, which was found in the donor site of intron 32 of USH2A. Neither the p.G1734R nor the IVS32+1G>A mutation was found in the unaffected family members or the 100 normal controls. One patient with a homozygous mutation has so far displayed only RP symptoms, while three patients with compound heterozygous mutations in the family under study showed both RP and hearing impairment. This study identified two novel mutations, p.G1734R and IVS32+1G>A of USH2A, in a four-generation Chinese RP family. In this study, the compound heterozygous and homozygous mutations in USH2A may cause Usher syndrome Type II or RP, respectively. These two mutations expand the mutation spectrum of USH2A.

  2. Genome-wide analytical approaches for reverse metabolic engineering of industrially relevant phenotypes in yeast

    PubMed Central

    Oud, Bart; van Maris, Antonius J A; Daran, Jean-Marc; Pronk, Jack T

    2012-01-01

    Successful reverse engineering of mutants that have been obtained by nontargeted strain improvement has long presented a major challenge in yeast biotechnology. This paper reviews the use of genome-wide approaches for analysis of Saccharomyces cerevisiae strains originating from evolutionary engineering or random mutagenesis. On the basis of an evaluation of the strengths and weaknesses of different methods, we conclude that for the initial identification of relevant genetic changes, whole genome sequencing is superior to other analytical techniques, such as transcriptome, metabolome, proteome, or array-based genome analysis. Key advantages of this technique over gene expression analysis include the independency of genome sequences on experimental context and the possibility to directly and precisely reproduce the identified changes in naive strains. The predictive value of genome-wide analysis of strains with industrially relevant characteristics can be further improved by classical genetics or simultaneous analysis of strains derived from parallel, independent strain improvement lineages. PMID:22152095

  3. Genome-wide analytical approaches for reverse metabolic engineering of industrially relevant phenotypes in yeast.

    PubMed

    Oud, Bart; van Maris, Antonius J A; Daran, Jean-Marc; Pronk, Jack T

    2012-03-01

    Successful reverse engineering of mutants that have been obtained by nontargeted strain improvement has long presented a major challenge in yeast biotechnology. This paper reviews the use of genome-wide approaches for analysis of Saccharomyces cerevisiae strains originating from evolutionary engineering or random mutagenesis. On the basis of an evaluation of the strengths and weaknesses of different methods, we conclude that for the initial identification of relevant genetic changes, whole genome sequencing is superior to other analytical techniques, such as transcriptome, metabolome, proteome, or array-based genome analysis. Key advantages of this technique over gene expression analysis include the independency of genome sequences on experimental context and the possibility to directly and precisely reproduce the identified changes in naive strains. The predictive value of genome-wide analysis of strains with industrially relevant characteristics can be further improved by classical genetics or simultaneous analysis of strains derived from parallel, independent strain improvement lineages. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  4. Understanding the complex evolution of rapidly mutating viruses with deep sequencing: Beyond the analysis of viral diversity.

    PubMed

    Leung, Preston; Eltahla, Auda A; Lloyd, Andrew R; Bull, Rowena A; Luciani, Fabio

    2017-07-15

    With the advent of affordable deep sequencing technologies, detection of low frequency variants within genetically diverse viral populations can now be achieved with unprecedented depth and efficiency. The high-resolution data provided by next generation sequencing technologies are currently recognised as the gold standard in estimation of viral diversity. In the analysis of rapidly mutating viruses, longitudinal deep sequencing datasets from viral genomes during individual infection episodes, as well as at the epidemiological level during outbreaks, now allow for more sophisticated analyses such as statistical estimates of the impact of complex mutation patterns on the evolution of the viral populations both within and between hosts. These analyses are revealing more accurate descriptions of the evolutionary dynamics that underpin the rapid adaptation of these viruses to the host response, and to drug therapies. This review assesses recent developments in methods and provides informative research examples using deep sequencing data generated from rapidly mutating viruses infecting humans, particularly hepatitis C virus (HCV), human immunodeficiency virus (HIV), Ebola virus and influenza virus, to understand the evolution of viral genomes and to explore the relationship between viral mutations and the host adaptive immune response. Finally, we discuss limitations in current technologies, and future directions that take advantage of publicly available large deep sequencing datasets. Copyright © 2016 Elsevier B.V. All rights reserved.

  5. Molecular cloning and analysis of Schizosaccharomyces pombe Reb1p: sequence-specific recognition of two sites in the far upstream rDNA intergenic spacer.

    PubMed Central

    Zhao, A; Guo, A; Liu, Z; Pape, L

    1997-01-01

    The coding sequences for a Schizosaccharomyces pombe sequence-specific DNA binding protein, Reb1p, have been cloned. The predicted S. pombe Reb1p is 24-29% identical to mouse TTF-1 (transcription termination factor-1) and Saccharomyces cerevisiae REB1 protein, both of which direct termination of RNA polymerase I-catalyzed transcripts. The S. pombe Reb1 cDNA encodes a predicted polypeptide of 504 amino acids with a predicted molecular weight of 58.4 kDa. The S. pombe Reb1p is unusual in that the bipartite DNA binding motif identified originally in S. cerevisiae and Kluyveromyces lactis REB1 proteins is uninterrupted and thus S. pombe Reb1p may contain the smallest natural REB1 homologous DNA binding domain. Its genomic coding sequences were shown to be interrupted by two introns. A recombinant histidine-tagged Reb1 protein bearing the rDNA binding domain has two homologous, sequence-specific binding sites in the S. pombe rDNA intergenic spacer, located between 289 and 480 nt downstream of the end of the approximately 25S rRNA coding sequences. Each binding site is 13-14 bp downstream of two of the three proposed in vivo termination sites. The core of this 17 bp site, AGGTAAGGGTAATGCAC, is specifically protected by Reb1p in footprinting analysis. PMID:9016645
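
    As an illustration of how the 17 bp core site quoted above could be located in a sequence of interest, the sketch below scans both strands for exact matches. It is an assumption-laden toy: the function names are hypothetical, only exact matches are reported (real binding sites may tolerate mismatches), and the test sequence is constructed for the example rather than taken from the S. pombe rDNA spacer.

```python
CORE_SITE = "AGGTAAGGGTAATGCAC"  # 17 bp core quoted in the abstract

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, motif=CORE_SITE):
    """Return sorted (0-based start, strand) pairs for exact motif matches.

    '+' hits match the motif as written; '-' hits match its reverse
    complement, i.e. the motif lies on the opposite strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Constructed toy sequence with one site on each strand.
toy = "TTT" + CORE_SITE + "GGG" + reverse_complement(CORE_SITE) + "A"
print(find_sites(toy))
```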

  6. Integrating protein structural dynamics and evolutionary analysis with Bio3D.

    PubMed

    Skjærven, Lars; Yao, Xin-Qiu; Scarabelli, Guido; Grant, Barry J

    2014-12-10

    Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution. Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case. The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/ .

  7. Metagenomics: Probing pollutant fate in natural and engineered ecosystems.

    PubMed

    Bouhajja, Emna; Agathos, Spiros N; George, Isabelle F

    2016-12-01

    Polluted environments are a reservoir of microbial species able to degrade or to convert pollutants to harmless compounds. The proper management of microbial resources requires a comprehensive characterization of their genetic pool to assess the fate of contaminants and increase the efficiency of bioremediation processes. Metagenomics offers appropriate tools to describe microbial communities in their whole complexity without lab-based cultivation of individual strains. After a decade of use of metagenomics to study microbiomes, the scientific community has made significant progress in this field. In this review, we survey the main steps of metagenomics applied to environments contaminated with organic compounds or heavy metals. We emphasize technical solutions proposed to overcome encountered obstacles. We then compare two metagenomic approaches, i.e. library-based targeted metagenomics and direct sequencing of metagenomes. In the former, environmental DNA is cloned inside a host, and then clones of interest are selected based on (i) their expression of biodegradative functions or (ii) sequence homology with probes and primers designed from relevant, already known sequences. The highest score for the discovery of novel genes and degradation pathways has been achieved so far by functional screening of large clone libraries. On the other hand, direct sequencing of metagenomes without a cloning step has been more often applied to polluted environments for characterization of the taxonomic and functional composition of microbial communities and their dynamics. In this case, the analysis has focused on 16S rRNA genes and marker genes of biodegradation. Advances in next generation sequencing and in bioinformatic analysis of sequencing data have opened up new opportunities for assessing the potential of biodegradation by microbes, but annotation of collected genes is still hampered by a limited number of available reference sequences in databases. 
Although metagenomics is still facing technical and computational challenges, our review of the recent literature highlights its value as an aid to efficiently monitor the clean-up of contaminated environments and develop successful strategies to mitigate the impact of pollutants on ecosystems. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. msgbsR: An R package for analysing methylation-sensitive restriction enzyme sequencing data.

    PubMed

    Mayne, Benjamin T; Leemaqz, Shalem Y; Buckberry, Sam; Rodriguez Lopez, Carlos M; Roberts, Claire T; Bianco-Miotto, Tina; Breen, James

    2018-02-01

    Genotyping-by-sequencing (GBS) or restriction-site associated DNA marker sequencing (RAD-seq) is a practical and cost-effective method for analysing large genomes from high diversity species. This method of sequencing, coupled with methylation-sensitive enzymes (often referred to as methylation-sensitive restriction enzyme sequencing or MRE-seq), is an effective tool to study DNA methylation in parts of the genome that are inaccessible in other sequencing techniques or are not annotated in microarray technologies. Current software tools do not fulfil all methylation-sensitive restriction sequencing assays for determining differences in DNA methylation between samples. To fill this computational need, we present msgbsR, an R package that contains tools for the analysis of methylation-sensitive restriction enzyme sequencing experiments. msgbsR can be used to identify and quantify read counts at methylated sites directly from alignment files (BAM files) and enables verification of restriction enzyme cut sites with the correct recognition sequence of the individual enzyme. In addition, msgbsR assesses DNA methylation based on read coverage, similar to RNA sequencing experiments, rather than methylation proportion and is a useful tool in analysing differential methylation on large populations. The package is fully documented and available freely online as a Bioconductor package ( https://bioconductor.org/packages/release/bioc/html/msgbsR.html ).

  9. Mycobacterium marinum infections in fish and humans in Israel.

    PubMed

    Ucko, M; Colorni, A

    2005-02-01

    Israeli Mycobacterium marinum isolates from humans and fish were compared by direct sequencing of the 16S rRNA and hsp65 genes, restriction mapping, and amplified fragment length polymorphism analysis. Significant molecular differences separated all clinical isolates from the piscine isolates, ruling out the local aquaculture industry as the source of human infections.

  10. Modelling Rate for Change of Speed in Calculus Proposal of Inductive Inquiry

    ERIC Educational Resources Information Center

    Sokolowski, Andrzej

    2014-01-01

    Research has shown that students have difficulties with understanding the process of determining whether an object is speeding up or slowing down, especially when it is applied to the analysis of motion in the negative direction. As inductively organized learning through its scaffolding sequencing supports the process of knowledge acquisition…

  11. Mutations in ABCR (ABCA4) in patients with Stargardt macular degeneration or cone-rod degeneration.

    PubMed

    Briggs, C E; Rucinski, D; Rosenfeld, P J; Hirose, T; Berson, E L; Dryja, T P

    2001-09-01

    To determine the spectrum of ABCR mutations associated with Stargardt macular degeneration and cone-rod degeneration (CRD). One hundred eighteen unrelated patients with recessive Stargardt macular degeneration and eight with recessive CRD were screened for mutations in ABCR (ABCA4) by single-strand conformation polymorphism analysis. Variants were characterized by direct genomic sequencing. Segregation analysis was performed on the families of 20 patients in whom two or more likely pathogenic sequence changes were identified. The authors found 77 sequence changes likely to be pathogenic: 21 null mutations (15 novel), 55 missense changes (26 novel), and one deletion of a consensus glycosylation site (also novel). Fifty-two patients with Stargardt macular degeneration (44% of those screened) and five with CRD each had two of these sequence changes or were homozygous for one of them. Segregation analyses in the families of 19 of these patients were informative and revealed that the index cases and all available affected siblings were compound heterozygotes or homozygotes. The authors found one instance of an apparently de novo mutation, Ile824Thr, in a patient. Thirty-seven (31%) of the 118 patients with Stargardt disease and one with CRD had only one likely pathogenic sequence change. Twenty-nine patients with Stargardt disease (25%) and two with CRD had no identified sequence changes. This report of 42 novel mutations brings the growing number of identified likely pathogenic sequence changes in ABCR to approximately 250.
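
    The counts reported in this abstract are internally consistent, which a short tally makes easy to check. The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
# (total, novel) counts per class of likely pathogenic sequence change, as quoted above.
changes = {
    "null": (21, 15),
    "missense": (55, 26),
    "glycosylation-site deletion": (1, 1),
}
total_changes = sum(total for total, _ in changes.values())
novel_changes = sum(novel for _, novel in changes.values())

# Stargardt patients grouped by number of likely pathogenic changes found.
stargardt = {"two changes or homozygous": 52, "one change": 37, "none": 29}
screened = sum(stargardt.values())
percent = {group: round(100 * n / screened) for group, n in stargardt.items()}

# CRD patients grouped the same way.
crd_total = 5 + 1 + 2

print(total_changes, novel_changes, screened, percent, crd_total)
```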

  12. Copy number variants calling for single cell sequencing data by multi-constrained optimization.

    PubMed

    Xu, Bo; Cai, Hongmin; Zhang, Changsheng; Yang, Xi; Han, Guoqiang

    2016-08-01

    Variations in DNA copy number carry important information on genome evolution and regulation of DNA replication in cancer cells. The rapid development of single-cell sequencing technology allows one to explore gene expression heterogeneity among single cells, thus providing important cancer cell evolution information. Single-cell DNA/RNA sequencing data usually have low genome coverage, which requires an extra step of amplification to accumulate enough samples. However, such amplification introduces large bias and makes bioinformatics analysis challenging. Accurately modeling the distribution of sequencing data and effectively suppressing the influence of the bias are the keys to successful variation analysis. Recent advances demonstrate that the technical noise introduced by amplification is more likely to follow a negative binomial distribution, a special case of Poisson distribution. Thus, we tackle the problem of CNV detection by formulating it as a quadratic optimization problem involving two constraints, in which the underlying signals are corrupted by Poisson distributed noise. By imposing the constraints of sparsity and smoothness, the reconstructed read depth signals from single-cell sequencing data are anticipated to fit the CNV patterns more accurately. An efficient numerical solution based on the classical alternating direction minimization method (ADMM) is tailored to solve the proposed model. We demonstrate the advantages of the proposed method using both synthetic and empirical single-cell sequencing data. Our experimental results demonstrate that the proposed method achieves excellent performance and shows high promise for single-cell sequencing data.
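
    The idea of recovering a read-depth signal under sparsity and smoothness constraints can be sketched with a small ADMM loop. This is not the authors' model: as a simplifying assumption it replaces the Poisson noise term with a Gaussian least-squares term, penalizes the L1 norm of the signal (sparsity) and of adjacent differences (smoothness), and uses hypothetical function names, parameter values, and toy data.

```python
def soft(v, t):
    """Soft-thresholding: the proximal operator of t * |v|."""
    return max(v - t, 0.0) - max(-v - t, 0.0)


def solve_tridiagonal(diag, off, rhs):
    """Thomas algorithm for a symmetric tridiagonal system with constant off-diagonal."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = off / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def denoise(y, lam_sparse, lam_smooth, rho=1.0, iters=300):
    """Minimize 0.5*||y-x||^2 + lam_sparse*||x||_1 + lam_smooth*||Dx||_1 by scaled ADMM."""
    n = len(y)
    z1, u1 = [0.0] * n, [0.0] * n              # split variable and dual for x
    z2, u2 = [0.0] * (n - 1), [0.0] * (n - 1)  # split variable and dual for Dx
    # Diagonal of (1 + rho) * I + rho * D^T D; the off-diagonal entries are -rho.
    diag = [1.0 + rho + rho * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    x = list(y)
    for _ in range(iters):
        w = [z2[i] - u2[i] for i in range(n - 1)]
        rhs = []
        for i in range(n):
            # i-th entry of D^T w, where (Dx)_i = x[i+1] - x[i]
            dt = (w[i - 1] if i > 0 else 0.0) - (w[i] if i < n - 1 else 0.0)
            rhs.append(y[i] + rho * (z1[i] - u1[i]) + rho * dt)
        x = solve_tridiagonal(diag, -rho, rhs)
        dx = [x[i + 1] - x[i] for i in range(n - 1)]
        z1 = [soft(x[i] + u1[i], lam_sparse / rho) for i in range(n)]
        z2 = [soft(dx[i] + u2[i], lam_smooth / rho) for i in range(n - 1)]
        u1 = [u1[i] + x[i] - z1[i] for i in range(n)]
        u2 = [u2[i] + dx[i] - z2[i] for i in range(n - 1)]
    return x


def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Toy signal: a flat baseline with one gained segment, plus deterministic +/-0.3 noise.
truth = [0.0] * 20 + [2.0] * 20 + [0.0] * 20
noisy = [t + (0.3 if i % 2 == 0 else -0.3) for i, t in enumerate(truth)]
estimate = denoise(noisy, lam_sparse=0.05, lam_smooth=0.5)
print(mse(noisy, truth), mse(estimate, truth))
```

    Splitting the problem this way keeps each step cheap: the x-update is a tridiagonal linear solve, and both penalty updates reduce to element-wise soft-thresholding.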

  13. Site directed recombination

    DOEpatents

    Jurka, Jerzy W.

    1997-01-01

    Enhanced homologous recombination is obtained by employing a consensus sequence which has been found to be associated with integration of repeat sequences, such as Alu and ID. The consensus sequence or sequence having a single transition mutation determines one site of a double break which allows for high efficiency of integration at the site. By introducing single or double stranded DNA having the consensus sequence flanking region joined to a sequence of interest, one can reproducibly direct integration of the sequence of interest at one or a limited number of sites. In this way, specific sites can be identified and homologous recombination achieved at the site by employing a second flanking sequence associated with a sequence proximal to the 3'-nick.

  14. In vivo evolution of antimicrobial resistance in a series of Staphylococcus aureus patient isolates: the entire picture or a cautionary tale?

    PubMed Central

    van Hal, Sebastiaan J.; Steen, Jason A.; Espedido, Björn A.; Grimmond, Sean M.; Cooper, Matthew A.; Holden, Matthew T. G.; Bentley, Stephen D.; Gosbell, Iain B.; Jensen, Slade O.

    2014-01-01

    Objectives: To obtain an expanded understanding of antibiotic resistance evolution in vivo, particularly in the context of vancomycin exposure. Methods: The whole genomes of six consecutive methicillin-resistant Staphylococcus aureus blood culture isolates (ST239-MRSA-III) from a single patient exposed to various antimicrobials (over a 77 day period) were sequenced and analysed. Results: Variant analysis revealed the existence of non-susceptible sub-populations derived from a common susceptible ancestor, with the predominant circulating clone(s) selected for by type and duration of antimicrobial exposure. Conclusions: This study highlights the dynamic nature of bacterial evolution and that non-susceptible sub-populations can emerge from clouds of variation upon antimicrobial exposure. Diagnostically, this has direct implications for sample selection when using whole-genome sequencing as a tool to guide clinical therapy. In the context of bacteraemia, deep sequencing of bacterial DNA directly from patient blood samples would avoid culture ‘bias’ and identify mutations associated with circulating non-susceptible sub-populations, some of which may confer cross-resistance to alternate therapies. PMID:24047554

  15. In vivo evolution of antimicrobial resistance in a series of Staphylococcus aureus patient isolates: the entire picture or a cautionary tale?

    PubMed

    van Hal, Sebastiaan J; Steen, Jason A; Espedido, Björn A; Grimmond, Sean M; Cooper, Matthew A; Holden, Matthew T G; Bentley, Stephen D; Gosbell, Iain B; Jensen, Slade O

    2014-02-01

    To obtain an expanded understanding of antibiotic resistance evolution in vivo, particularly in the context of vancomycin exposure. The whole genomes of six consecutive methicillin-resistant Staphylococcus aureus blood culture isolates (ST239-MRSA-III) from a single patient exposed to various antimicrobials (over a 77 day period) were sequenced and analysed. Variant analysis revealed the existence of non-susceptible sub-populations derived from a common susceptible ancestor, with the predominant circulating clone(s) selected for by type and duration of antimicrobial exposure. This study highlights the dynamic nature of bacterial evolution and that non-susceptible sub-populations can emerge from clouds of variation upon antimicrobial exposure. Diagnostically, this has direct implications for sample selection when using whole-genome sequencing as a tool to guide clinical therapy. In the context of bacteraemia, deep sequencing of bacterial DNA directly from patient blood samples would avoid culture 'bias' and identify mutations associated with circulating non-susceptible sub-populations, some of which may confer cross-resistance to alternate therapies.

  16. Genotyping of Leptospira directly in urine samples of cattle demonstrates a diversity of species and strains in Brazil.

    PubMed

    Hamond, C; Pestana, C P; Medeiros, M A; Lilenbaum, W

    2016-01-01

    The aim of this study was to identify Leptospira in urine samples of cattle by direct sequencing of the secY gene. The validity of this approach was assessed using ten Leptospira strains obtained from cattle in Brazil and 77 DNA samples previously extracted from cattle urine, that were positive by PCR for the genus-specific lipL32 gene of Leptospira. Direct sequencing identified 24 (31·1%) interpretable secY sequences and these were identical to those obtained from direct DNA sequencing of the urine samples from which they were recovered. Phylogenetic analyses identified four species: L. interrogans, L. borgpetersenii, L. noguchii, and L. santarosai with the most prevalent genotypes being associated with L. borgpetersenii. While direct sequencing cannot, as yet, replace culturing of leptospires, it is a valid additional tool for epidemiological studies. An unexpected finding from this study was the genetic diversity of Leptospira infecting Brazilian cattle.

  17. Using Wave-Current Observations to Predict Bottom Sediment Processes on Muddy Beaches

    DTIC Science & Technology

    2012-09-30

    Hill and Foda, 1999; Chan and Liu, 2009; Holland et al., 2009; and others). Many theoretical models of wave-mud interaction have been proposed...transformation (see Section Figure 5) emerges from the analysis (Sheremet et al., 2005; Jaramillo et al., 2008; Robillard, 2009). Under energetic waves, the...et al., 2010). The ongoing work has three directions of research: Data analysis: reconstruct the sequence of bed states in storms captured in the

  18. Sequencing of the amylopullulanase (apu) gene of Thermoanaerobacter ethanolicus 39E, and identification of the active site by site-directed mutagenesis.

    PubMed

    Mathupala, S P; Lowe, S E; Podkovyrov, S M; Zeikus, J G

    1993-08-05

    The complete nucleotide sequence of the gene encoding the dual active amylopullulanase of Thermoanaerobacter ethanolicus 39E (formerly Clostridium thermohydrosulfuricum) was determined. The structural gene (apu) contained a single open reading frame 4443 base pairs in length, corresponding to 1481 amino acids, with an estimated molecular weight of 162,780. Analysis of the deduced sequence of apu with sequences of alpha-amylases and alpha-1,6 debranching enzymes enabled the identification of four conserved regions putatively involved in substrate binding and in catalysis. The conserved regions were localized within a 2.9-kilobase pair gene fragment, which encoded a M(r) 100,000 protein that maintained the dual activities and thermostability of the native enzyme. The catalytic residues of amylopullulanase were tentatively identified by using hydrophobic cluster analysis for comparison of amino acid sequences of amylopullulanase and other amylolytic enzymes. Asp597, Glu626, and Asp703 were individually modified to their respective amide form, or the alternate acid form, and in all cases both alpha-amylase and pullulanase activities were lost, suggesting the possible involvement of 3 residues in a catalytic triad, and the presence of a putative single catalytic site within the enzyme. These findings substantiate amylopullulanase as a new type of amylosaccharidase.

  19. Phenotype classification of single cells using SRS microscopy, RNA sequencing, and microfluidics (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Streets, Aaron M.; Cao, Chen; Zhang, Xiannian; Huang, Yanyi

    2016-03-01

    Phenotype classification of single cells reveals biological variation that is masked in ensemble measurement. This heterogeneity is found in gene and protein expression as well as in cell morphology. Many techniques are available to probe phenotypic heterogeneity at the single cell level, for example quantitative imaging and single-cell RNA sequencing, but it is difficult to perform multiple assays on the same single cell. In order to directly track correlation between morphology and gene expression at the single cell level, we developed a microfluidic platform for quantitative coherent Raman imaging and immediate RNA sequencing (RNA-Seq) of single cells. With this device we actively sort and trap cells for analysis with stimulated Raman scattering microscopy (SRS). The cells are then processed in parallel pipelines for lysis, and preparation of cDNA for high-throughput transcriptome sequencing. SRS microscopy offers three-dimensional imaging with chemical specificity for quantitative analysis of protein and lipid distribution in single cells. Meanwhile, the microfluidic platform facilitates single-cell manipulation, minimizes contamination, and furthermore, provides improved RNA-Seq detection sensitivity and measurement precision, which is necessary for differentiating biological variability from technical noise. By combining coherent Raman microscopy with RNA sequencing, we can better understand the relationship between cellular morphology and gene expression at the single-cell level.

  20. Genomics approach to the environmental community of microorganisms

    NASA Astrophysics Data System (ADS)

    Kawarabayasi, Y.; Maruyama, A.

    2004-12-01

    Microscopic observation and comparison of 16S rDNA sequences have indicated that many extremophiles survive in hydrothermal environments. However, it is generally said that over 99% of all microbes are currently uncultivable. We therefore planned to identify uncultivable microbes through direct sequencing of environmental DNA. First, shotgun plasmid libraries were constructed directly from DNA prepared from mixed microbes collected from low-temperature hydrothermal water at RM24 on the Southern East Pacific Rise (S-EPR). The sequences of a number of clones showed features similar to eukaryotic introns or to the tandem repetitive sequences identified in some human familial diseases. These results indicated that many microorganisms with eukaryotic features were dominant in the low-temperature water of the S-EPR. Second, shotgun plasmid libraries were constructed from environmental DNA prepared from Beppu hot springs. ORFs were easily identified in all clones whose entire sequence was determined. Thus, hot springs can be considered a good resource for finding novel genes. Finally, mixed microbes isolated from Suiyo seamount were used to construct a shotgun library. The clones in this library contained ORFs. From some clones in the hot spring and Suiyo samples, aminoacyl-tRNA synthetase, which is generally present in all organisms, was identified by sequence similarity. Phylogenetic analysis of the identified aminoacyl-tRNA synthetases indicated that novel and unidentified microorganisms should be present in the hot spring or Suiyo seamount. The novel genes identified from Suiyo seamount were also expressed in E. coli. Some gene products were successfully obtained from the E. coli cells as soluble proteins. Some proteins were thermostable up to 70 °C, meaning that the original host cell of the gene should be stable up to the same temperature. Our work indicates that environmental genomics, including direct cloning and sequencing of environmental DNA and expression of the identified genes, is a powerful approach to discovering novel uncultivable microbes or novel active genes.

  1. Analysis of protein-coding genetic variation in 60,706 humans.

    PubMed

    Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M; Ardissino, Diego; Boehnke, Michael; Danesh, John; Donnelly, Stacey; Elosua, Roberto; Florez, Jose C; Gabriel, Stacey B; Getz, Gad; Glatt, Stephen J; Hultman, Christina M; Kathiresan, Sekar; Laakso, Markku; McCarroll, Steven; McCarthy, Mark I; McGovern, Dermot; McPherson, Ruth; Neale, Benjamin M; Palotie, Aarno; Purcell, Shaun M; Saleheen, Danish; Scharf, Jeremiah M; Sklar, Pamela; Sullivan, Patrick F; Tuomilehto, Jaakko; Tsuang, Ming T; Watkins, Hugh C; Wilson, James G; Daly, Mark J; MacArthur, Daniel G

    2016-08-18

    Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

  2. The complete genome sequence and proteomics of Yersinia pestis phage Yep-phi.

    PubMed

    Zhao, Xiangna; Wu, Weili; Qi, Zhizhen; Cui, Yujun; Yan, Yanfeng; Guo, Zhaobiao; Wang, Zuyun; Wang, Hu; Deng, Haijun; Xue, Yan; Chen, Weijun; Wang, Xiaoyi; Yang, Ruifu

    2011-01-01

    Yep-phi, a lytic phage of Yersinia pestis, was isolated in China and is routinely used as a diagnostic phage for the identification of the plague pathogen. Yep-phi has an isometric hexagonal head containing dsDNA and a short non-contractile conical tail. In this study, we sequenced the Yep-phi genome (GenBank accession no. HQ333270) and performed proteomics analysis. The genome consists of 38,616 bp of DNA, including direct terminal repeats of 222 bp, and is predicted to contain 45 ORFs. Most structural proteins were identified by proteomics analysis. Compared with the three available genome sequences of lytic phages for Y. pestis, the phages could be divided into two subgroups. Yep-phi displays marked homology to the bacteriophages Berlin (GenBank accession no. AM183667) and Yepe2 (GenBank accession no. EU734170), and these comprise one subgroup. The other subgroup is represented by bacteriophage ΦA1122 (GenBank accession no. AY247822). Potential recombination was detected among the Yep-phi subgroup.

  3. Review of General Algorithmic Features for Genome Assemblers for Next Generation Sequencers

    PubMed Central

    Wajid, Bilal; Serpedin, Erchin

    2012-01-01

    In the realm of bioinformatics and computational biology, the most rudimentary data upon which all the analysis is built is the sequence data of genes, proteins and RNA. The sequence data of the entire genome is the solution to the genome assembly problem. The scope of this contribution is to provide an overview on the art of problem-solving applied within the domain of genome assembly in the next-generation sequencing (NGS) platforms. This article discusses the major genome assemblers that were proposed in the literature during the past decade by outlining their basic working principles. It is intended to act as a qualitative, not a quantitative, tutorial to all working on genome assemblers pertaining to the next generation of sequencers. We discuss the theoretical aspects of various genome assemblers, identifying their working schemes. We also briefly discuss the direction in which the area is headed, along with core issues of software simplicity. PMID:22768980

  4. Compositional segmentation and complexity measurement in stock indices

    NASA Astrophysics Data System (ADS)

    Wang, Haifeng; Shang, Pengjian; Xia, Jianan

    2016-01-01

    In this paper, we introduce a complexity measure based on entropic segmentation, called sequence compositional complexity (SCC), into the analysis of financial time series. SCC was first used to deal directly with the complex heterogeneity in nonstationary DNA sequences. SCC has been found to be higher in sequences with long-range correlation than in those with low long-range correlation, especially in DNA sequences. Applying this method to financial index data, we find that the SCC values of some mature stock indices, such as the S&P 500 (abbreviated S&P in the following) and HSI, are likely to be lower than the SCC values of Chinese index data (such as SSE). Moreover, if we classify the indices by SCC, the financial market of Hong Kong has more similarities with mature foreign markets than with Chinese ones. We therefore believe that there is a good correspondence between the SCC of an index sequence and the complexity of the market involved.
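
    A compositional-complexity measure of this kind can be sketched as the Jensen-Shannon divergence between the segments of a sequence: the entropy of the whole sequence minus the length-weighted entropies of its segments. The sketch below assumes the segment boundaries are already given (the entropic segmentation step that finds them is not shown), and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(seq):
    """Shannon entropy (bits per symbol) of the symbol composition of seq."""
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())


def compositional_complexity(seq, cuts):
    """Jensen-Shannon divergence among the segments of seq delimited by cut positions.

    cuts are assumed to lie strictly between 0 and len(seq).
    """
    bounds = [0] + sorted(cuts) + [len(seq)]
    segments = [seq[a:b] for a, b in zip(bounds, bounds[1:])]
    weighted = sum(len(s) / len(seq) * shannon_entropy(s) for s in segments)
    return shannon_entropy(seq) - weighted


# Two segments with different compositions versus two with the same composition.
heterogeneous = compositional_complexity("AAAAAAAATTTTTTTT", [8])
homogeneous = compositional_complexity("ATATATAT", [4])
print(heterogeneous, homogeneous)
```

    A sequence whose segments differ in composition scores high, while a compositionally uniform sequence scores near zero. For index data, the series would first have to be mapped to a symbol sequence, a step this sketch leaves out.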

  5. APPLaUD: access for patients and participants to individual level uninterpreted genomic data.

    PubMed

    Thorogood, Adrian; Bobe, Jason; Prainsack, Barbara; Middleton, Anna; Scott, Erick; Nelson, Sarah; Corpas, Manuel; Bonhomme, Natasha; Rodriguez, Laura Lyman; Murtagh, Madeleine; Kleiderman, Erika

    2018-02-17

    There is growing support for the stance that patients and research participants should have better and easier access to their raw (uninterpreted) genomic sequence data in both clinical and research contexts. We review legal frameworks and literature on the benefits, risks, and practical barriers of providing individuals access to their data. We also survey genomic sequencing initiatives that provide or plan to provide individual access. Many patients and research participants expect to be able to access their health and genomic data. Individuals have a legal right to access their genomic data in some countries and contexts. Moreover, increasing numbers of participatory research projects, direct-to-consumer genetic testing companies, and now major national sequencing initiatives grant individuals access to their genomic sequence data upon request. Drawing on current practice and regulatory analysis, we outline legal, ethical, and practical guidance for genomic sequencing initiatives seeking to offer interested patients and participants access to their raw genomic data.

  6. Ancestry estimation and control of population stratification for sequence-based association studies.

    PubMed

    Wang, Chaolong; Zhan, Xiaowei; Bragg-Gresham, Jennifer; Kang, Hyun Min; Stambolian, Dwight; Chew, Emily Y; Branham, Kari E; Heckenlively, John; Fulton, Robert; Wilson, Richard K; Mardis, Elaine R; Lin, Xihong; Swaroop, Anand; Zöllner, Sebastian; Abecasis, Gonçalo R

    2014-04-01

    Estimating individual ancestry is important in genetic association studies where population structure leads to false positive signals, although assigning ancestry remains challenging with targeted sequence data. We propose a new method for the accurate estimation of individual genetic ancestry, based on direct analysis of off-target sequence reads, and implement our method in the publicly available LASER software. We validate the method using simulated and empirical data and show that the method can accurately infer worldwide continental ancestry when used with sequencing data sets with whole-genome shotgun coverage as low as 0.001×. For estimates of fine-scale ancestry within Europe, the method performs well with coverage of 0.1×. On an even finer scale, the method improves discrimination between exome-sequenced study participants originating from different provinces within Finland. Finally, we show that our method can be used to improve case-control matching in genetic association studies and to reduce the risk of spurious findings due to population structure.
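
    To give a sense of how little data the quoted coverage levels represent, mean coverage can be converted to an approximate read count as coverage × genome size / read length. The genome size (3.1 Gb) and read length (100 bp) below are illustrative assumptions, not values taken from the abstract, and the function name is hypothetical.

```python
def reads_for_coverage(coverage, genome_size_bp, read_length_bp):
    """Approximate number of reads needed to reach a given mean coverage."""
    return coverage * genome_size_bp / read_length_bp


GENOME_SIZE_BP = 3.1e9   # assumed human genome size
READ_LENGTH_BP = 100     # assumed read length

continental = reads_for_coverage(0.001, GENOME_SIZE_BP, READ_LENGTH_BP)
fine_scale = reads_for_coverage(0.1, GENOME_SIZE_BP, READ_LENGTH_BP)
print(round(continental), round(fine_scale))
```

    Under these assumptions, 0.001× coverage corresponds to roughly 31,000 reads and 0.1× to roughly 3.1 million.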

  7. Single-Cell Semiconductor Sequencing

    PubMed Central

    Kohn, Andrea B.; Moroz, Tatiana P.; Barnes, Jeffrey P.; Netherton, Mandy; Moroz, Leonid L.

    2014-01-01

    RNA-seq or transcriptome analysis of individual cells and small-cell populations is essential for virtually any biomedical field. It is especially critical for developmental, aging, and cancer biology as well as neuroscience where the enormous heterogeneity of cells presents a significant methodological and conceptual challenge. Here we present two methods that allow for fast and cost-efficient transcriptome sequencing from ultra-small amounts of tissue or even from individual cells using semiconductor sequencing technology (Ion Torrent, Life Technologies). The first method is a reduced representation sequencing which maximizes capture of RNAs and preserves transcripts’ directionality. The second, a template-switch protocol, is designed for small mammalian neurons. Both protocols, from cell/tissue isolation to final sequence data, take up to 4 days. The efficiency of these protocols has been validated with single hippocampal neurons and various invertebrate tissues including individually identified neurons within a simpler memory-forming circuit of Aplysia californica and early (1-, 2-, 4-, 8-cells) embryonic and developmental stages from basal metazoans. PMID:23929110

  8. Dynamically hot galaxies. I - Structural properties

    NASA Technical Reports Server (NTRS)

    Bender, Ralf; Burstein, David; Faber, S. M.

    1992-01-01

    Results are reported from an analysis of the structural properties of dynamically hot galaxies which combines central velocity dispersion, effective surface brightness, and effective radius into a new 3-space (k), in which the axes are parameters that are physically meaningful. Hot galaxies are found to divide into groups in k-space that closely parallel conventional morphological classifications, namely, luminous ellipticals, compacts, bulges, bright dwarfs, and dwarf spheroidals. A major sequence is defined by luminous ellipticals, bulges, and most compacts, which together constitute a smooth continuum in k-space. Several properties vary smoothly with mass along this continuum, including bulge-to-disk ratio, radio properties, rotation, degree of velocity anisotropy, and 'unrelaxed'. A second major sequence is comprised of dwarf ellipticals and dwarf spheroidals. It is suggested that mass loss is a major factor in hot dwarf galaxies, but the dwarf sequence cannot be simply a mass-loss sequence, as it has the wrong direction in k-space.

  9. Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives.

    PubMed

    Zhao, Min; Wang, Qingguo; Wang, Quan; Jia, Peilin; Zhao, Zhongming

    2013-01-01

    Copy number variation (CNV) is a prevalent form of critical genetic variation that leads to an abnormal number of copies of large genomic regions in a cell. Microarray-based comparative genome hybridization (arrayCGH) or genotyping arrays have been standard technologies to detect large regions subject to copy number changes in genomes until most recently high-resolution sequence data can be analyzed by next-generation sequencing (NGS). During the last several years, NGS-based analysis has been widely applied to identify CNVs in both healthy and diseased individuals. Correspondingly, the strong demand for NGS-based CNV analyses has fuelled development of numerous computational methods and tools for CNV detection. In this article, we review the recent advances in computational methods pertaining to CNV detection using whole genome and whole exome sequencing data. Additionally, we discuss their strengths and weaknesses and suggest directions for future development.

  10. Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives

    PubMed Central

    2013-01-01

    Copy number variation (CNV) is a prevalent form of critical genetic variation that leads to an abnormal number of copies of large genomic regions in a cell. Microarray-based comparative genome hybridization (arrayCGH) or genotyping arrays have been standard technologies to detect large regions subject to copy number changes in genomes until most recently high-resolution sequence data can be analyzed by next-generation sequencing (NGS). During the last several years, NGS-based analysis has been widely applied to identify CNVs in both healthy and diseased individuals. Correspondingly, the strong demand for NGS-based CNV analyses has fuelled development of numerous computational methods and tools for CNV detection. In this article, we review the recent advances in computational methods pertaining to CNV detection using whole genome and whole exome sequencing data. Additionally, we discuss their strengths and weaknesses and suggest directions for future development. PMID:24564169

  11. The Central Italy Seismic Sequence (2016): Spatial Patterns and Dynamic Fingerprints

    NASA Astrophysics Data System (ADS)

    Suteanu, Cristian; Liucci, Luisa; Melelli, Laura

    2018-01-01

    The paper investigates spatio-temporal aspects of the seismic sequence that started in Central Italy (Amatrice, Lazio region) in August 2016, causing hundreds of fatalities and producing major damage to settlements. On one hand, scaling properties of the landscape topography are identified and related to geomorphological processes, supporting the identification of preferential spatial directions in tectonic activity and confirming the role of the past tectonic periods and ongoing processes with respect to the driving of the geomorphological evolution of the area. On the other hand, relations between the spatio-temporal evolution of the sequence and the seismogenic fault systems are studied. The dynamic fingerprints of seismicity are established with the help of events thread analysis (ETA), which characterizes anisotropy in spatio-temporal earthquake patterns. ETA confirms that the direction of the seismogenic normal faults, oriented (N)NW-(S)SE, is characterized by persistent seismic activity. More importantly, it also highlights the role of the pre-existing compressive structures, Neogenic thrust and transpressive regional fronts, with a trend oriented (N)NE-(S)SW, in the stress transfer. Both the fractal features of the topographic surface and the dynamic fingerprint of the recent seismic sequence point to the hypothesis of an active interaction between the Quaternary fault systems and the pre-existing compressional structures.

  12. Retinitis Pigmentosa with EYS Mutations Is the Most Prevalent Inherited Retinal Dystrophy in Japanese Populations.

    PubMed

    Arai, Yuuki; Maeda, Akiko; Hirami, Yasuhiko; Ishigami, Chie; Kosugi, Shinji; Mandai, Michiko; Kurimoto, Yasuo; Takahashi, Masayo

    2015-01-01

    The aim of this study was to gain information about disease prevalence and to identify the responsible genes for inherited retinal dystrophies (IRD) in Japanese populations. Clinical and molecular evaluations were performed on 349 patients with IRD. For segregation analyses, 63 of their family members were employed. Bioinformatics data from 1,208 Japanese individuals were used as controls. Molecular diagnosis was obtained by direct sequencing in a stepwise fashion utilizing one or two panels of 15 and 27 genes for retinitis pigmentosa patients. If a specific clinical diagnosis was suspected, direct sequencing of disease-specific genes, that is, ABCA4 for Stargardt disease, was conducted. Limited availability of intrafamily information and decreasing family size hampered identifying inherited patterns. Differential disease profiles with lower prevalence of Stargardt disease from European and North American populations were obtained. We found 205 sequence variants in 159 of 349 probands with an identification rate of 45.6%. This study found 43 novel sequence variants. In silico analysis suggests that 20 of 25 novel missense variants are pathogenic. EYS mutations had the highest prevalence at 23.5%. c.4957_4958insA and c.8868C>A were the two major EYS mutations identified in this cohort. EYS mutations are the most prevalent among Japanese patients with IRD.

  13. Analysis of SNP rs16754 of WT1 gene in a series of de novo acute myeloid leukemia patients.

    PubMed

    Luna, Irene; Such, Esperanza; Cervera, Jose; Barragán, Eva; Jiménez-Velasco, Antonio; Dolz, Sandra; Ibáñez, Mariam; Gómez-Seguí, Inés; López-Pavía, María; Llop, Marta; Fuster, Óscar; Oltra, Silvestre; Moscardó, Federico; Martínez-Cuadrón, David; Senent, M Leonor; Gascón, Adriana; Montesinos, Pau; Martín, Guillermo; Bolufer, Pascual; Sanz, Miguel A

    2012-12-01

    The single nucleotide polymorphism (SNP) rs16754 of the WT1 gene has been previously described as a possible prognostic marker in normal karyotype acute myeloid leukemia (AML) patients. Nevertheless, the findings in this field are not always reproducible in different series. One hundred and seventy-five adult de novo AML patients were screened with two different methods for the detection of SNP rs16754: high-resolution melting (HRM) and FRET hybridization probes. Direct sequencing was used to validate both techniques. The SNP was detected in 52 out of 175 patients (30%), both by HRM and hybridization probes. Direct sequencing confirmed that every positive sample in the screening methods had a variation in the DNA sequence. Patients with the wild-type genotype (WT1(AA)) for the SNP rs16754 were significantly younger than those with the heterozygous WT1(AG) genotype. No other difference was observed for baseline characteristics or outcome between patients with or without the SNP. Both techniques are equally reliable and reproducible as screening methods for the detection of the SNP rs16754, allowing for the selection of those samples that will need to be sequenced. We were unable to confirm the suggested favorable outcome of SNP rs16754 in de novo AML.

  14. Transcriptome and Small RNA Deep Sequencing Reveals Deregulation of miRNA Biogenesis in Human Glioma

    PubMed Central

    Moore, Lynette M.; Kivinen, Virpi; Liu, Yuexin; Annala, Matti; Cogdell, David; Liu, Xiuping; Liu, Chang-Gong; Sawaya, Raymond; Yli-Harja, Olli; Shmulevich, Ilya; Fuller, Gregory N.; Zhang, Wei; Nykter, Matti

    2013-01-01

    Altered expression of oncogenic and tumor-suppressing microRNAs (miRNAs) is widely associated with tumorigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumors. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and interrogated expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression. PMID:23007860

  15. Pyrin gene and mutants thereof, which cause familial Mediterranean fever

    DOEpatents

    Kastner, Daniel L [Bethesda, MD; Aksentijevichh, Ivona [Bethesda, MD; Centola, Michael [Tacoma Park, MD; Deng, Zuoming [Gaithersburg, MD; Sood, Ramen [Rockville, MD; Collins, Francis S [Rockville, MD; Blake, Trevor [Laytonsville, MD; Liu, P Paul [Ellicott City, MD; Fischel-Ghodsian, Nathan [Los Angeles, CA; Gumucio, Deborah L [Ann Arbor, MI; Richards, Robert I [North Adelaide, AU; Ricke, Darrell O [San Diego, CA; Doggett, Norman A [Santa Cruz, NM; Pras, Mordechai [Tel-Hashomer, IL

    2003-09-30

    The invention provides the nucleic acid sequence encoding the protein associated with familial Mediterranean fever (FMF). The cDNA sequence is designated as MEFV. The invention is also directed towards fragments of the DNA sequence, as well as the corresponding sequence for the RNA transcript and fragments thereof. Another aspect of the invention provides the amino acid sequence for a protein (pyrin) associated with FMF. The invention is directed towards both the full length amino acid sequence, fusion proteins containing the amino acid sequence and fragments thereof. The invention is also directed towards mutants of the nucleic acid and amino acid sequences associated with FMF. In particular, the invention discloses three missense mutations, clustered within about 40 to 50 amino acids, in the highly conserved rfp (B30.2) domain at the C-terminal of the protein. These mutants include M680I, M694V, K695R, and V726A. Additionally, the invention includes methods for diagnosing a patient at risk for having FMF and kits therefor.

  16. Methods and apparatus for analysis of chromatographic migration patterns

    DOEpatents

    Stockham, Thomas G.; Ives, Jeffrey T.

    1993-01-01

    A method and apparatus for sharpening signal peaks in a signal representing the distribution of biological or chemical components of a mixture separated by a chromatographic technique such as, but not limited to, electrophoresis. A key step in the method is the use of a blind deconvolution technique, presently embodied as homomorphic filtering, to reduce the contribution of a blurring function to the signal encoding the peaks of the distribution. The invention further includes steps and apparatus directed to determination of a nucleotide sequence from a set of four such signals representing DNA sequence data derived by electrophoretic means.

  17. Recovery and characterization of a Citrus clementina Hort. ex Tan. 'Clemenules' haploid plant selected to establish the reference whole Citrus genome sequence.

    PubMed

    Aleza, Pablo; Juárez, José; Hernández, María; Pina, José A; Ollitrault, Patrick; Navarro, Luis

    2009-08-22

    In recent years, the development of structural genomics has generated a growing interest in obtaining haploid plants. The use of homozygous lines presents a significant advantage for the accomplishment of sequencing projects. Commercial citrus species are characterized by high heterozygosity, making it difficult to assemble large genome sequences. Thus, the International Citrus Genomic Consortium (ICGC) decided to establish a reference whole citrus genome sequence from a homozygous plant. Due to the existence of important molecular resources and previous success in obtaining haploid clementine plants, haploid clementine was selected as the target for the implementation of the reference whole genome citrus sequence. To obtain haploid clementine lines we used the technique of in situ gynogenesis induced by irradiated pollen. Flow cytometry, chromosome counts and SSR marker (Simple Sequence Repeats) analysis facilitated the identification of six different haploid lines (2n = x = 9), one aneuploid line (2n = 2x+4 = 22) and one doubled haploid plant (2n = 2x = 18) of 'Clemenules' clementine. One of the haploids, obtained directly from an original haploid embryo, grew vigorously and produced flowers after four years. This is the first haploid plant of clementine that has bloomed and we have, for the first time, characterized the histology of haploid and diploid flowers of clementine. Additionally a double haploid plant was obtained spontaneously from this haploid line. The first haploid plant of 'Clemenules' clementine produced directly by germination of a haploid embryo, which grew vigorously and produced flowers, has been obtained in this work. This haploid line has been selected and it is being used by the ICGC to establish the reference sequence of the nuclear genome of citrus.

  18. Automated one-step DNA sequencing based on nanoliter reaction volumes and capillary electrophoresis.

    PubMed

    Pang, H M; Yeung, E S

    2000-08-01

    An integrated system with a nano-reactor for cycle-sequencing reaction coupled to on-line purification and capillary gel electrophoresis has been demonstrated. Fifty nanoliters of reagent solution, which includes dye-labeled terminators, polymerase, BSA and template, was aspirated and mixed with the template inside the nano-reactor followed by cycle-sequencing reaction. The reaction products were then purified by a size-exclusion chromatographic column operated at 50 degrees C followed by room temperature on-line injection of the DNA fragments into a capillary for gel electrophoresis. Over 450 bases of DNA can be separated and identified. As little as 25 nl reagent solution can be used for the cycle-sequencing reaction with a slightly shorter read length. Significant savings on reagent cost are achieved because the remaining stock solution can be reused without contamination. The steps of cycle sequencing, on-line purification, injection, DNA separation, capillary regeneration, gel-filling and fluidic manipulation were performed with complete automation. This system can be readily multiplexed for high-throughput DNA sequencing or PCR analysis directly from templates or even biological materials.

  19. Scop3D: three-dimensional visualization of sequence conservation.

    PubMed

    Vermeire, Tessa; Vermaere, Stijn; Schepens, Bert; Saelens, Xavier; Van Gucht, Steven; Martens, Lennart; Vandermarliere, Elien

    2015-04-01

    The integration of a protein's structure with its known sequence variation provides insight on how that protein evolves, for instance in terms of (changing) function or immunogenicity. Yet, collating the corresponding sequence variants into a multiple sequence alignment, calculating each position's conservation, and mapping this information back onto a relevant structure is not straightforward. We therefore built the Sequence Conservation on Protein 3D structure (scop3D) tool to perform these tasks automatically. The output consists of two modified PDB files in which the B-values for each position are replaced by the percentage sequence conservation, or the information entropy for each position, respectively. Furthermore, text files with absolute and relative amino acid occurrences for each position are also provided, along with snapshots of the protein from six distinct directions in space. The visualization provided by scop3D can for instance be used as an aid in vaccine development or to identify antigenic hotspots, which we here demonstrate based on an analysis of the fusion proteins of human respiratory syncytial virus and mumps virus. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
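    The two per-position scores named in this abstract can be illustrated with a minimal Python sketch. It assumes that "percentage sequence conservation" means the share of sequences carrying the most common residue in a column and that entropy is Shannon entropy in bits; scop3D's own definitions may differ. The function name `column_stats` and the toy alignment are hypothetical, and gap characters are treated like any other symbol.

```python
import math
from collections import Counter


def column_stats(alignment):
    """Return (percent_conservation, entropy_bits) for each alignment column.

    Assumption: conservation is the frequency of the most common residue;
    this is an illustrative definition, not necessarily the one scop3D uses.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    stats = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        # Share of sequences that carry the most common residue, in percent.
        conservation = 100.0 * counts.most_common(1)[0][1] / total
        # Shannon entropy in bits; 0.0 for a fully conserved column.
        entropy = sum((n / total) * math.log2(total / n) for n in counts.values())
        stats.append((conservation, entropy))
    return stats


# Hypothetical four-sequence alignment, for illustration only.
toy_alignment = ["MKTA", "MKSA", "MRTA", "MKTG"]
stats = column_stats(toy_alignment)
for position, (cons, ent) in enumerate(stats, start=1):
    print(f"position {position}: conservation {cons:.1f}%, entropy {ent:.3f} bits")
```

    Values like these are what a tool of this kind would write into the B-value column of a PDB file so that a structure viewer can colour residues by conservation.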

  20. Combining De Novo Peptide Sequencing Algorithms, A Synergistic Approach to Boost Both Identifications and Confidence in Bottom-up Proteomics.

    PubMed

    Blank-Landeshammer, Bernhard; Kollipara, Laxmikanth; Biß, Karsten; Pfenninger, Markus; Malchow, Sebastian; Shuvaev, Konstantin; Zahedi, René P; Sickmann, Albert

    2017-09-01

    Complex mass spectrometry based proteomics data sets are mostly analyzed by protein database searches. While this approach performs considerably well for sequenced organisms, direct inference of peptide sequences from tandem mass spectra, i.e., de novo peptide sequencing, oftentimes is the only way to obtain information when protein databases are absent. However, available algorithms suffer from drawbacks such as lack of validation and often high rates of false positive hits (FP). Here we present a simple method of combining results from commonly available de novo peptide sequencing algorithms, which in conjunction with minor tweaks in data acquisition yields a lower empirical FDR compared to the analysis using single algorithms. Results were validated using state-of-the-art database search algorithms as well as specifically synthesized reference peptides. Thus, we could increase the number of PSMs meeting a stringent FDR of 5% more than 3-fold compared to the single best de novo sequencing algorithm alone, accounting for an average of 11,120 PSMs (combined) instead of 3,476 PSMs (alone) in triplicate 2 h LC-MS runs of tryptic HeLa digestion.

  1. Molecular analysis of the AGXT gene in Italian patients with primary hyperoxaluria type 1 (PH1).

    PubMed

    Ferrettini, C; Pirulli, D; Cosseddu, D; Marangella, M; Petrarulo, M; Mazzola, G; Vatta, S; Amoroso, A

    1998-01-01

    Specimens were collected from 22 Italian patients with primary hyperoxaluria type 1 (PH1). Ten of them had already been analyzed by molecular biology. To clarify the molecular characteristics of the AGXT gene disease responsible for PH1, DNA samples were examined for known mutations by hybridisation of PCR products with Sequence Specific Oligonucleotides (PCR-SSO). We planned to identify new mutations of the AGXT gene by heteroduplex analysis followed by direct sequencing. We had already standardized a) the conditions for the amplification of the 11 exons of AGXT, b) the PCR-SSO technique and c) the heteroduplex analysis of amplified products. Preliminary results demonstrated that the AGXT mutations described in previous studies were found only in 40% of the examined Italian patients with PH1. The remaining 60% of mutations should be characterised in future studies.

  2. Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing

    PubMed Central

    Manske, Magnus; Miotto, Olivo; Campino, Susana; Auburn, Sarah; Almagro-Garcia, Jacob; Maslen, Gareth; O’Brien, Jack; Djimde, Abdoulaye; Doumbo, Ogobara; Zongo, Issaka; Ouedraogo, Jean-Bosco; Michon, Pascal; Mueller, Ivo; Siba, Peter; Nzila, Alexis; Borrmann, Steffen; Kiara, Steven M.; Marsh, Kevin; Jiang, Hongying; Su, Xin-Zhuan; Amaratunga, Chanaki; Fairhurst, Rick; Socheat, Duong; Nosten, Francois; Imwong, Mallika; White, Nicholas J.; Sanders, Mandy; Anastasi, Elisa; Alcock, Dan; Drury, Eleanor; Oyola, Samuel; Quail, Michael A.; Turner, Daniel J.; Rubio, Valentin Ruano; Jyothi, Dushyanth; Amenga-Etego, Lucas; Hubbart, Christina; Jeffreys, Anna; Rowlands, Kate; Sutherland, Colin; Roper, Cally; Mangano, Valentina; Modiano, David; Tan, John C.; Ferdig, Michael T.; Amambua-Ngwa, Alfred; Conway, David J.; Takala-Harrison, Shannon; Plowe, Christopher V.; Rayner, Julian C.; Rockett, Kirk A.; Clark, Taane G.; Newbold, Chris I.; Berriman, Matthew; MacInnis, Bronwyn; Kwiatkowski, Dominic P.

    2013-01-01

    Malaria elimination strategies require surveillance of the parasite population for genetic changes that demand a public health response, such as new forms of drug resistance. Here we describe methods for large-scale analysis of genetic variation in Plasmodium falciparum by deep sequencing of parasite DNA obtained from the blood of patients with malaria, either directly or after short term culture. Analysis of 86,158 exonic SNPs that passed genotyping quality control in 227 samples from Africa, Asia and Oceania provides genome-wide estimates of allele frequency distribution, population structure and linkage disequilibrium. By comparing the genetic diversity of individual infections with that of the local parasite population, we derive a metric of within-host diversity that is related to the level of inbreeding in the population. An open-access web application has been established for exploration of regional differences in allele frequency and of highly differentiated loci in the P. falciparum genome. PMID:22722859

  3. Estimation of a Killer Whale (Orcinus orca) Population’s Diet Using Sequencing Analysis of DNA from Feces

    PubMed Central

    Ford, Michael J.; Hempelmann, Jennifer; Hanson, M. Bradley; Ayres, Katherine L.; Baird, Robin W.; Emmons, Candice K.; Lundin, Jessica I.; Schorr, Gregory S.; Wasser, Samuel K.; Park, Linda K.

    2016-01-01

    Estimating diet composition is important for understanding interactions between predators and prey and thus illuminating ecosystem function. The diet of many species, however, is difficult to observe directly. Genetic analysis of fecal material collected in the field is therefore a useful tool for gaining insight into wild animal diets. In this study, we used high-throughput DNA sequencing to quantitatively estimate the diet composition of an endangered population of wild killer whales (Orcinus orca) in their summer range in the Salish Sea. We combined 175 fecal samples collected between May and September from five years between 2006 and 2011 into 13 sample groups. Two known DNA composition control groups were also created. Each group was sequenced at a ~330 bp segment of the 16S gene in the mitochondrial genome using an Illumina MiSeq sequencing system. After several quality control steps, 4,987,107 individual sequences were aligned to a custom sequence database containing 19 potential fish prey species and the most likely species of each fecal-derived sequence was determined. Based on these alignments, salmonids made up >98.6% of the total sequences and thus of the inferred diet. Of the six salmonid species, Chinook salmon made up 79.5% of the sequences, followed by coho salmon (15%). Over all years, a clear pattern emerged with Chinook salmon dominating the estimated diet early in the summer, and coho salmon contributing an average of >40% of the diet in late summer. Sockeye salmon appeared to be occasionally important, at >18% in some sample groups. Non-salmonids were rarely observed. Our results are consistent with earlier results based on surface prey remains, and confirm the importance of Chinook salmon in this population’s summer diet. PMID:26735849
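    The final step described here, turning per-sequence species assignments into diet proportions, can be sketched in a few lines of Python. This is an illustrative simplification: the function name `diet_proportions` and the toy counts are hypothetical, and the sketch ignores the alignment, quality-control, and control-group steps that the study applied before counting.

```python
from collections import Counter


def diet_proportions(assignments):
    """Map each species label to its share of all assigned sequences."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no assigned sequences")
    return {species: n / total for species, n in counts.items()}


# Hypothetical assignments for one sample group (not data from the study).
toy_assignments = ["Chinook"] * 8 + ["coho"] * 1 + ["sockeye"] * 1
props = diet_proportions(toy_assignments)
for species, share in sorted(props.items(), key=lambda kv: -kv[1]):
    print(f"{species}: {share:.1%}")
```

    Computing the proportions per sample group, rather than over the pooled reads, is what would allow a seasonal comparison such as early versus late summer.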

  4. Estimation of a Killer Whale (Orcinus orca) Population's Diet Using Sequencing Analysis of DNA from Feces.

    PubMed

    Ford, Michael J; Hempelmann, Jennifer; Hanson, M Bradley; Ayres, Katherine L; Baird, Robin W; Emmons, Candice K; Lundin, Jessica I; Schorr, Gregory S; Wasser, Samuel K; Park, Linda K

    2016-01-01

    Estimating diet composition is important for understanding interactions between predators and prey and thus illuminating ecosystem function. The diet of many species, however, is difficult to observe directly. Genetic analysis of fecal material collected in the field is therefore a useful tool for gaining insight into wild animal diets. In this study, we used high-throughput DNA sequencing to quantitatively estimate the diet composition of an endangered population of wild killer whales (Orcinus orca) in their summer range in the Salish Sea. We combined 175 fecal samples collected between May and September from five years between 2006 and 2011 into 13 sample groups. Two known DNA composition control groups were also created. Each group was sequenced at a ~330 bp segment of the 16S gene in the mitochondrial genome using an Illumina MiSeq sequencing system. After several quality control steps, 4,987,107 individual sequences were aligned to a custom sequence database containing 19 potential fish prey species and the most likely species of each fecal-derived sequence was determined. Based on these alignments, salmonids made up >98.6% of the total sequences and thus of the inferred diet. Of the six salmonid species, Chinook salmon made up 79.5% of the sequences, followed by coho salmon (15%). Over all years, a clear pattern emerged with Chinook salmon dominating the estimated diet early in the summer, and coho salmon contributing an average of >40% of the diet in late summer. Sockeye salmon appeared to be occasionally important, at >18% in some sample groups. Non-salmonids were rarely observed. Our results are consistent with earlier results based on surface prey remains, and confirm the importance of Chinook salmon in this population's summer diet.

  5. SACCHARIS: an automated pipeline to streamline discovery of carbohydrate active enzyme activities within polyspecific families and de novo sequence datasets.

    PubMed

    Jones, Darryl R; Thomas, Dallas; Alger, Nicholas; Ghavidel, Ata; Inglis, G Douglas; Abbott, D Wade

    2018-01-01

    Deposition of new genetic sequences in online databases is expanding at an unprecedented rate. As a result, sequence identification continues to outpace functional characterization of carbohydrate active enzymes (CAZymes). In this paradigm, the discovery of enzymes with novel functions is often hindered by high volumes of uncharacterized sequences, particularly when the enzyme sequence belongs to a family that exhibits diverse functional specificities (i.e., polyspecificity). Therefore, to direct sequence-based discovery and characterization of new enzyme activities we have developed an automated in silico pipeline entitled: Sequence Analysis and Clustering of CarboHydrate Active enzymes for Rapid Informed prediction of Specificity (SACCHARIS). This pipeline streamlines the selection of uncharacterized sequences for discovery of new CAZyme or CBM specificity from families currently maintained on the CAZy website or within user-defined datasets. SACCHARIS was used to generate a phylogenetic tree of GH43, a CAZyme family with defined subfamily designations. This analysis confirmed that large datasets can be organized into sequence clusters of manageable sizes that possess related functions. Seeding this tree with a GH43 sequence from Bacteroides dorei DSM 17855 (BdGH43b) revealed that it partitioned as a single sequence within the tree. This pattern was consistent with it possessing a unique enzyme activity for GH43, as BdGH43b is the first α-glucanase described for this family. The capacity of SACCHARIS to extract and cluster characterized carbohydrate binding module sequences was demonstrated using family 6 CBMs (i.e., CBM6s). This CBM family displays a polyspecific ligand binding profile and contains many structurally determined members. Using SACCHARIS to identify a cluster of divergent sequences, a CBM6 sequence from a unique clade was demonstrated to bind yeast mannan, which represents the first description of an α-mannan binding CBM.
Additionally, we have performed a CAZome analysis of an in-house sequenced bacterial genome and a comparative analysis of B. thetaiotaomicron VPI-5482 and B. thetaiotaomicron 7330, to demonstrate that SACCHARIS can generate "CAZome fingerprints", which differentiate between the saccharolytic potential of two related strains in silico. Establishing sequence-function and sequence-structure relationships in polyspecific CAZyme families are promising approaches for streamlining enzyme discovery. SACCHARIS facilitates this process by embedding CAZyme and CBM family trees generated from biochemically to structurally characterized sequences, with protein sequences that have unknown functions. In addition, these trees can be integrated with user-defined datasets (e.g., genomics, metagenomics, and transcriptomics) to inform experimental characterization of new CAZymes or CBMs not currently curated, and for researchers to compare differential sequence patterns between entire CAZomes. In this light, SACCHARIS provides an in silico tool that can be tailored for enzyme bioprospecting in datasets of increasing complexity and for diverse applications in glycobiotechnology.

  6. Newborn Screening in the Era of Precision Medicine.

    PubMed

    Yang, Lan; Chen, Jiajia; Shen, Bairong

    2017-01-01

    As newborn screening success stories gained general confirmation during the past 50 years, scientists quickly discovered diagnostic tests for a host of genetic disorders that could be treated at birth. Outstanding progress in sequencing technologies over the last two decades has made it possible to comprehensively profile newborn screening (NBS) and identify clinically relevant genomic alterations. With the recent rapid developments in whole-genome sequencing (WGS) and whole-exome sequencing (WES), we can screen newborns at the genomic level and direct the appropriate diagnosis to each individual at the appropriate time, which is also encompassed in the concept of precision medicine. In addition, we can develop novel interventions directed at the molecular characteristics of genetic diseases in newborns. The implementation of genomics in NBS programs would provide an effective premise for the identification of the majority of genetic aberrations and primarily help in accurate guidance in treatment and better prediction. However, there is some debate associated with the widespread application of genome sequencing in NBS due to major concerns such as clinical analysis, result interpretation, storage of sequencing data, and communication of clinically relevant mutations to pediatricians and parents, along with the ethical, legal, and social implications (so-called ELSI). This review is focused on these critical issues and concerns about the expanding role of genomics in NBS for precision medicine. If WGS or WES is to be incorporated into NBS practice, these challenges should be carefully considered and properly tackled to meet the requirements of genome sequencing in the era of precision medicine.

  7. Single-molecule, full-length transcript sequencing provides insight into the extreme metabolism of the ruby-throated hummingbird Archilochus colubris.

    PubMed

    Workman, Rachael E; Myrka, Alexander M; Wong, G William; Tseng, Elizabeth; Welch, Kenneth C; Timp, Winston

    2018-03-01

    Hummingbirds oxidize ingested nectar sugars directly to fuel foraging but cannot sustain this fuel use during fasting periods, such as during the night or during long-distance migratory flights. Instead, fasting hummingbirds switch to oxidizing stored lipids that are derived from ingested sugars. The hummingbird liver plays a key role in moderating energy homeostasis and this remarkable capacity for fuel switching. Additionally, liver is the principal location of de novo lipogenesis, which can occur at exceptionally high rates, such as during premigratory fattening. Yet understanding how this tissue and whole organism moderates energy turnover is hampered by a lack of information regarding how relevant enzymes differ in sequence, expression, and regulation. We generated a de novo transcriptome of the hummingbird liver using PacBio full-length cDNA sequencing (Iso-Seq), yielding 8.6Gb of sequencing data, or 2.6M reads from 4 different size fractions. We analyzed data using the SMRTAnalysis v3.1 Iso-Seq pipeline, then clustered isoforms into gene families to generate de novo gene contigs using Cogent. We performed orthology analysis to identify closely related sequences between our transcriptome and other avian and human gene sets. Finally, we closely examined homology of critical lipid metabolism genes between our transcriptome data and avian and human genomes. We confirmed high levels of sequence divergence within hummingbird lipogenic enzymes, suggesting a high probability of adaptive divergent function in the hepatic lipogenic pathways. Our results leverage cutting-edge technology and a novel bioinformatics pipeline to provide a first direct look at the transcriptome of this incredible organism.

  8. Genome Improvement at JGI-HAGSC

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.

    Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.

  9. Classification of Plant Associated Bacteria Using RIF, a Computationally Derived DNA Marker

    PubMed Central

    Schneider, Kevin L.; Marrero, Glorimar; Alvarez, Anne M.; Presting, Gernot G.

    2011-01-01

    A DNA marker that distinguishes plant associated bacteria at the species level and below was derived by comparing six sequenced genomes of Xanthomonas, a genus that contains many important phytopathogens. This DNA marker comprises a portion of the dnaA replication initiation factor (RIF). Unlike the rRNA genes, dnaA is a single copy gene in the vast majority of sequenced bacterial genomes, and amplification of RIF requires genus-specific primers. In silico analysis revealed that RIF has equal or greater ability to differentiate closely related species of Xanthomonas than the widely used ribosomal intergenic spacer region (ITS). Furthermore, in a set of 263 Xanthomonas, Ralstonia and Clavibacter strains, the RIF marker was directly sequenced in both directions with a success rate approximately 16% higher than that for ITS. RIF frameworks for Xanthomonas, Ralstonia and Clavibacter were constructed using 682 reference strains representing different species, subspecies, pathovars, races, hosts and geographic regions, and contain a total of 109 different RIF sequences. RIF sequences showed subspecific groupings but did not place strains of X. campestris or X. axonopodis into currently named pathovars nor R. solanacearum strains into their respective races, confirming previous conclusions that pathovar and race designations do not necessarily reflect genetic relationships. The RIF marker also was sequenced for 24 reference strains from three genera in the Enterobacteriaceae: Pectobacterium, Pantoea and Dickeya. RIF sequences of 70 previously uncharacterized strains of Ralstonia, Clavibacter, Pectobacterium and Dickeya matched, or were similar to, those of known reference strains, illustrating the utility of the frameworks to classify bacteria below the species level and rapidly match unknown isolates to reference strains. 
The RIF sequence frameworks are available at the online RIF database, RIFdb, and can be queried for diagnostic purposes with RIF sequences obtained from unknown strains in both chromatogram and FASTA format. PMID:21533033

  10. The cyc1-11 mutation in yeast reverts by recombination with a nonallelic gene: composite genes determining the iso-cytochromes c.

    PubMed Central

    Ernst, J F; Stewart, J W; Sherman, F

    1981-01-01

    DNA sequence analysis of a cloned fragment directly established that the cyc1-11 mutation of iso-1-cytochrome c in the yeast Saccharomyces cerevisiae is a two-base-pair substitution that changes the CCA proline codon at amino acid position 76 to a UAA nonsense codon. Analysis of 11 revertant proteins and one cloned revertant gene showed that reversion of the cyc1-11 mutation can occur in three ways: a single base-pair substitution, which produces a serine replacement at position 76; recombination with the nonallelic CYC7 gene of iso-2-cytochrome c, which causes replacement of a segment in the cyc1-11 gene by the corresponding segment of the CYC7 gene; and either a two-base-pair substitution or recombination with the CYC7 gene, which causes the formation of the normal iso-1-cytochrome c sequence. These results demonstrate the occurrence of low frequencies of recombination between nonallelic genes having extensive but not complete homology. The formation of composite genes that share sequences from nonallelic genes may be an evolutionary mechanism for producing protein diversities and for maintaining identical sequences at different loci. PMID:6273865
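
    The codon changes described in the record above can be checked with a short sketch. The CCA and UAA codons come from the abstract; the function name, the three-entry codon table, and the choice of UCA as the serine codon are illustrative assumptions, since the abstract does not say which serine codon the revertants carry.

```python
def codon_differences(a: str, b: str) -> int:
    """Count the positions at which two equal-length codons differ."""
    if len(a) != len(b):
        raise ValueError("codons must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Small excerpt of the standard genetic code (RNA alphabet), not a full table.
CODON_TABLE = {"CCA": "Pro", "UAA": "Stop", "UCA": "Ser"}

WILD_TYPE = "CCA"       # proline codon at position 76 (from the abstract)
MUTANT = "UAA"          # cyc1-11 nonsense codon (from the abstract)
SER_REVERTANT = "UCA"   # assumption: a serine codon one change away from UAA

if __name__ == "__main__":
    for start, end in [(WILD_TYPE, MUTANT), (MUTANT, SER_REVERTANT)]:
        print(
            f"{start} ({CODON_TABLE[start]}) -> {end} ({CODON_TABLE[end]}): "
            f"{codon_differences(start, end)} base change(s)"
        )
```

    Under these assumptions, the mutation is two base changes away from the wild-type codon and the serine revertant is one base change away from the nonsense codon, which matches the counts given in the abstract.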

  11. Unlimited Thirst for Genome Sequencing, Data Interpretation, and Database Usage in Genomic Era: The Road towards Fast-Track Crop Plant Improvement

    PubMed Central

    Govindaraj, Mahalingam

    2015-01-01

    The number of sequenced crop genomes and associated genomic resources is growing rapidly with the advent of inexpensive next generation sequencing methods. Databases have become an integral part of all aspects of scientific research, including basic and applied plant and animal sciences. The importance of databases keeps increasing as the volume of datasets from direct and indirect genomics, as well as other omics approaches, has kept expanding in recent years. The databases and associated web portals provide, at a minimum, a uniform set of tools and automated analysis across a wide range of crop plant genomes. This paper reviews some basic terms and considerations in dealing with the utilization of crop plant databases in the advancing genomic era. The utilization of databases for variation analysis with other comparative genomics tools, and data interpretation platforms are well described. The major focus of this review is to provide knowledge on platforms and databases for genome-based investigations of agriculturally important crop plants. The utilization of these databases in applied crop improvement programs is still being achieved widely; otherwise, the end for sequencing is not far away. PMID:25874133

  12. End Joining-Mediated Gene Expression in Mammalian Cells Using PCR-Amplified DNA Constructs that Contain Terminator in Front of Promoter.

    PubMed

    Nakamura, Mikiko; Suzuki, Ayako; Akada, Junko; Tomiyoshi, Keisuke; Hoshida, Hisashi; Akada, Rinji

    2015-12-01

    Mammalian gene expression constructs are generally prepared in a plasmid vector, in which a promoter and terminator are located upstream and downstream of a protein-coding sequence, respectively. In this study, we found that front terminator constructs (DNA constructs containing a terminator upstream of a promoter rather than downstream of a coding region) could sufficiently express proteins as a result of end joining of the introduced DNA fragment. By taking advantage of front terminator constructs, FLAG substitutions and deletions were generated using mutagenesis primers to identify amino acids specifically recognized by commercial FLAG antibodies. A minimal epitope sequence for polyclonal FLAG antibody recognition was also identified. In addition, we analyzed the sequence of a C-terminal Ser-Lys-Leu peroxisome localization signal, and identified the key residues necessary for peroxisome targeting. Moreover, front terminator constructs of hepatitis B surface antigen were used for deletion analysis, leading to the identification of regions required for particle formation. Collectively, these results indicate that front terminator constructs allow for easy manipulations of C-terminal protein-coding sequences, and suggest that direct gene expression with PCR-amplified DNA is useful for high-throughput protein analysis in mammalian cells.

  13. Mining dynamic noteworthy functions in software execution sequences.

    PubMed

    Zhang, Bing; Huang, Guoyan; Wang, Yuqian; He, Haitao; Ren, Jiadong

    2017-01-01

    As the quality of crucial entities can directly affect that of software, their identification and protection become an important premise for effective software development, management, maintenance and testing, which thus contribute to improving the software quality and its attack-defending ability. Most analysis and evaluation on important entities like code-based static structure analysis are on the destruction of the actual software running. In this paper, from the perspective of the software execution process, we proposed an approach to mine dynamic noteworthy functions (DNFM) in software execution sequences. First, according to software decompiling and tracking stack changes, the execution traces composed of a series of function addresses were acquired. Then these traces were modeled as execution sequences and simplified so as to get simplified sequences (SFS), followed by the extraction of patterns from SFS through a pattern extraction (PE) algorithm. After that, the evaluation indicators inner-importance and inter-importance were designed to measure the noteworthiness of functions in the DNFM algorithm. Finally, these functions were sorted by their noteworthiness. Comparison and contrast were conducted on the experiment results from two traditional complex network-based node mining methods, namely PageRank and DegreeRank. The results show that the DNFM method can mine noteworthy functions in software effectively and precisely.

  14. Evidence for Horizontal Gene Transfer in Evolution of Elongation Factor Tu in Enterococci

    PubMed Central

    Ke, Danbing; Boissinot, Maurice; Huletsky, Ann; Picard, François J.; Frenette, Johanne; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.

    2000-01-01

    The elongation factor Tu, encoded by tuf genes, is a GTP binding protein that plays a central role in protein synthesis. One to three tuf genes per genome are present, depending on the bacterial species. Most low-G+C-content gram-positive bacteria carry only one tuf gene. We have designed degenerate PCR primers derived from consensus sequences of the tuf gene to amplify partial tuf sequences from 17 enterococcal species and other phylogenetically related species. The amplified DNA fragments were sequenced either by direct sequencing or by sequencing cloned inserts containing putative amplicons. Two different tuf genes (tufA and tufB) were found in 11 enterococcal species, including Enterococcus avium, Enterococcus casseliflavus, Enterococcus dispar, Enterococcus durans, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Enterococcus malodoratus, Enterococcus mundtii, Enterococcus pseudoavium, and Enterococcus raffinosus. For the other six enterococcal species (Enterococcus cecorum, Enterococcus columbae, Enterococcus faecalis, Enterococcus sulfureus, Enterococcus saccharolyticus, and Enterococcus solitarius), only the tufA gene was present. Based on 16S rRNA gene sequence analysis, the 11 species having two tuf genes all have a common ancestor, while the six species having only one copy diverged from the enterococcal lineage before that common ancestor. The presence of one or two copies of the tuf gene in enterococci was confirmed by Southern hybridization. Phylogenetic analysis of tuf sequences demonstrated that the enterococcal tufA gene branches with the Bacillus, Listeria, and Staphylococcus genera, while the enterococcal tufB gene clusters with the genera Streptococcus and Lactococcus. Primary structure analysis showed that four amino acid residues encoded within the sequenced regions are conserved and unique to the enterococcal tufB genes and the tuf genes of streptococci and Lactococcus lactis. 
The data suggest that an ancestral streptococcus or a streptococcus-related species may have horizontally transferred a tuf gene to the common ancestor of the 11 enterococcal species which now carry two tuf genes. PMID:11092850

  15. Identification of defective illegitimate recombinational repair of oxidatively-induced DNA double-strand breaks in ataxia-telangiectasia cells

    NASA Technical Reports Server (NTRS)

    Dar, M. E.; Winters, T. A.; Jorgensen, T. J.

    1997-01-01

    Ataxia-telangiectasia (A-T) is an autosomal-recessive lethal human disease. Homozygotes suffer from a number of neurological disorders, as well as very high cancer incidence. Heterozygotes may also have a higher than normal risk of cancer, particularly for the breast. The gene responsible for the disease (ATM) has been cloned, but its role in mechanisms of the disease remains unknown. Cellular A-T phenotypes, such as radiosensitivity and genomic instability, suggest that a deficiency in the repair of DNA double-strand breaks (DSBs) may be the primary defect; however, overall levels of DSB rejoining appear normal. We used the shuttle vector, pZ189, containing an oxidatively-induced DSB, to compare the integrity of DSB rejoining in one normal and two A-T fibroblast cell lines. Mutation frequencies were two-fold higher in A-T cells, and the mutational spectrum was different. The majority of the mutations found in all three cell lines were deletions (44-63%). The DNA sequence analysis indicated that 17 of the 17 plasmids with deletion mutations in normal cells occurred between short direct-repeat sequences (removing one of the repeats plus the intervening sequences), implicating illegitimate recombination in DSB rejoining. The combined data from both A-T cell lines showed that 21 of 24 deletions did not involve direct-repeat sequences, implicating a defect in the illegitimate recombination pathway. These findings suggest that the A-T gene product may either directly participate in illegitimate recombination or modulate the pathway. Regardless, this defect is likely to be important to a mechanistic understanding of this lethal disease.

  16. Complete plastid genome sequence of Daucus carota: Implications for biotechnology and phylogeny of angiosperms

    PubMed Central

    Ruhlman, Tracey; Lee, Seung-Bum; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry

    2006-01-01

    Background: Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. Results: The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats ≥ 30 bp with a sequence identity ≥ 90%. Phylogenetic analyses of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) were performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. Conclusion: The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade.
Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of Daucus with Panax in the euasterid II clade. These results provide the best taxon sampling of complete chloroplast genomes and the strongest support yet for the sister relationship of Caryophyllales to the asterids. The availability of the complete plastid genome sequence should facilitate improved transformation efficiency and foreign gene expression in carrot through utilization of endogenous flanking sequences and regulatory elements. PMID:16945140

  17. Reference-free comparative genomics of 174 chloroplasts.

    PubMed

    Kua, Chai-Shian; Ruan, Jue; Harting, John; Ye, Cheng-Xi; Helmus, Matthew R; Yu, Jun; Cannon, Charles H

    2012-01-01

    Direct analysis of unassembled genomic data could greatly increase the power of short read DNA sequencing technologies and allow comparative genomics of organisms without a completed reference available. Here, we compare 174 chloroplasts by analyzing the taxonomic distribution of short kmers across genomes [1]. We then assemble de novo contigs centered on informative variation. The localized de novo contigs can be separated into two major classes: tip = unique to a single genome and group = shared by a subset of genomes. Prior to assembly, we found that ~18% of the chloroplast was duplicated in the inverted repeat (IR) region across a four-fold difference in genome sizes, from a highly reduced parasitic orchid [2] to a massive algal chloroplast [3], including gnetophytes [4] and cycads [5]. The conservation of this ratio between single copy and duplicated sequence was basal among green plants, independent of photosynthesis and mechanism of genome size change, and different in gymnosperms and lower plants. Major lineages in the angiosperm clade differed in the pattern of shared kmers and de novo contigs. For example, parasitic plants demonstrated an expected accelerated overall rate of evolution, while the hemi-parasitic genomes contained a great deal more novel sequence than holo-parasitic plants, suggesting different mechanisms at different stages of genomic contraction. Additionally, the legumes are diverging more quickly and in different ways than other major families. Small duplicated fragments of the rrn23 genes were deeply conserved among seed plants, including among several species without the IR regions, indicating a crucial functional role of this duplication.
Localized de novo assembly of informative kmers greatly reduces the complexity of large comparative analyses by confining the analysis to a small partition of data and genomes relevant to the specific question, allowing direct analysis of next-gen sequence data from previously unstudied genomes and rapid discovery of informative candidate regions.

  18. Reference-Free Comparative Genomics of 174 Chloroplasts

    PubMed Central

    Kua, Chai-Shian; Ruan, Jue; Harting, John; Ye, Cheng-Xi; Helmus, Matthew R.; Yu, Jun; Cannon, Charles H.

    2012-01-01

    Direct analysis of unassembled genomic data could greatly increase the power of short read DNA sequencing technologies and allow comparative genomics of organisms without a completed reference available. Here, we compare 174 chloroplasts by analyzing the taxonomic distribution of short kmers across genomes [1]. We then assemble de novo contigs centered on informative variation. The localized de novo contigs can be separated into two major classes: tip = unique to a single genome and group = shared by a subset of genomes. Prior to assembly, we found that ∼18% of the chloroplast was duplicated in the inverted repeat (IR) region across a four-fold difference in genome sizes, from a highly reduced parasitic orchid [2] to a massive algal chloroplast [3], including gnetophytes [4] and cycads [5]. The conservation of this ratio between single copy and duplicated sequence was basal among green plants, independent of photosynthesis and mechanism of genome size change, and different in gymnosperms and lower plants. Major lineages in the angiosperm clade differed in the pattern of shared kmers and de novo contigs. For example, parasitic plants demonstrated an expected accelerated overall rate of evolution, while the hemi-parasitic genomes contained a great deal more novel sequence than holo-parasitic plants, suggesting different mechanisms at different stages of genomic contraction. Additionally, the legumes are diverging more quickly and in different ways than other major families. Small duplicated fragments of the rrn23 genes were deeply conserved among seed plants, including among several species without the IR regions, indicating a crucial functional role of this duplication.
Localized de novo assembly of informative kmers greatly reduces the complexity of large comparative analyses by confining the analysis to a small partition of data and genomes relevant to the specific question, allowing direct analysis of next-gen sequence data from previously unstudied genomes and rapid discovery of informative candidate regions. PMID:23185288
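
    The idea of examining how short kmers are distributed across genomes can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract applies the "tip" and "group" labels to assembled contigs, whereas this simplification applies them directly to kmers. The function names, the toy sequences, and the decision to ignore kmers present in every genome are all assumptions made for illustration.

```python
from collections import defaultdict


def kmers(seq: str, k: int) -> set:
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_kmers(genomes: dict, k: int):
    """Split kmers into 'tip' (one genome) and 'group' (a proper subset).

    Kmers found in every genome are dropped as uninformative; that cutoff
    is an assumption of this sketch, not something stated in the abstract.
    """
    owners = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)

    tip, group = {}, {}
    for kmer, names in owners.items():
        if len(names) == 1:
            tip[kmer] = next(iter(names))
        elif len(names) < len(genomes):
            group[kmer] = frozenset(names)
    return tip, group


if __name__ == "__main__":
    # Toy sequences standing in for chloroplast genomes (hypothetical data).
    toy = {"A": "ACGTAC", "B": "ACGTTT", "C": "GGGTTT"}
    tip, group = classify_kmers(toy, k=3)
    print("tip:", tip)
    print("group:", group)
```

    Because only kmer sets are compared, no reference genome or alignment is needed, which is the property the abstract highlights.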

  19. A family-based probabilistic method for capturing de novo mutations from high-throughput short-read sequencing data.

    PubMed

    Cartwright, Reed A; Hussin, Julie; Keebler, Jonathan E M; Stone, Eric A; Awadalla, Philip

    2012-01-06

    Recent advances in high-throughput DNA sequencing technologies and associated statistical analyses have enabled in-depth analysis of whole-genome sequences. As this technology is applied to a growing number of individual human genomes, entire families are now being sequenced. Information contained within the pedigree of a sequenced family can be leveraged when inferring the donors' genotypes. The presence of a de novo mutation within the pedigree is indicated by a violation of Mendelian inheritance laws. Here, we present a method for probabilistically inferring genotypes across a pedigree using high-throughput sequencing data and producing the posterior probability of de novo mutation at each genomic site examined. This framework can be used to disentangle the effects of germline and somatic mutational processes and to simultaneously estimate the effect of sequencing error and the initial genetic variation in the population from which the founders of the pedigree arise. This approach is examined in detail through simulations and areas for method improvement are noted. By applying this method to data from members of a well-defined nuclear family with accurate pedigree information, the stage is set to make the most direct estimates of the human mutation rate to date.
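
    The statement that a de novo mutation shows up as a violation of Mendelian inheritance can be made concrete with a small sketch for one site in a mother-father-child trio. This is only the deterministic check on called genotypes; the method in the abstract is probabilistic and also models sequencing error, which this sketch does not attempt. The function names and the two-letter genotype encoding are assumptions.

```python
from itertools import product


def possible_child_genotypes(mother: str, father: str) -> set:
    """Unordered genotypes a child can inherit: one allele from each parent."""
    return {tuple(sorted(pair)) for pair in product(mother, father)}


def is_mendelian_violation(mother: str, father: str, child: str) -> bool:
    """True if the child's genotype cannot arise from the parental genotypes."""
    return tuple(sorted(child)) not in possible_child_genotypes(mother, father)


if __name__ == "__main__":
    # Genotypes are two-letter strings, one letter per allele (hypothetical calls).
    print(is_mendelian_violation("AA", "AA", "AG"))  # candidate de novo site
    print(is_mendelian_violation("AG", "AA", "AG"))  # consistent with inheritance
```

    A flagged site is only a candidate: a miscalled genotype produces the same pattern, which is why the abstract reports a posterior probability of de novo mutation rather than a yes/no call.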

  20. Bending and stretching finite element analysis of anisotropic viscoelastic composite plates

    NASA Technical Reports Server (NTRS)

    Hilton, Harry H.; Yi, Sung

    1990-01-01

    Finite element algorithms have been developed to analyze linear anisotropic viscoelastic plates, with or without holes, subjected to mechanical (bending, tension), temperature, and hygrothermal loadings. The analysis is based on Laplace transforms rather than direct time integrations in order to improve the accuracy of the results and save on extensive computational time and storage. The time dependent displacement fields in the transverse direction for the cross ply and angle ply laminates are calculated and the stacking sequence effects of the laminates are discussed in detail. Creep responses for the plates with or without a circular hole are also studied. The numerical results compare favorably with analytical solutions, i.e. within 1.8 percent for bending and 10(exp -3) 3 percent for tension. The tension results of the present method are compared with those using the direct time integration scheme.

  1. Isolation and molecular cloning of a fast-growing strain of human hepatitis A virus from its double-stranded replicative form.

    PubMed Central

    Venuti, A; Di Russo, C; del Grosso, N; Patti, A M; Ruggeri, F; De Stasio, P R; Martiniello, M G; Pagnotti, P; Degener, A M; Midulla, M

    1985-01-01

    A fast-growing strain of human hepatitis A virus was selected and characterized. The virus has the unusual property of developing a strong cytopathic effect in tissue culture in 7 to 10 days. Sequences of the viral genome were cloned into recombinant plasmids with the double-stranded replicative form as a template for the reverse transcription of cDNA. Restriction analysis and direct sequencing indicate that this strain is different from that described by Ticehurst et al. (Proc. Natl. Acad. Sci. USA 80:5885-5889, 1983) in the region that presumptively codes for the major capsid protein VP1, but both isolates have conserved large areas of homology in the untranslated 5'-terminal sequences of the genome. PMID:2997478

  2. Next-Generation Sequencing of Coccidioides immitis Isolated during Cluster Investigation

    PubMed Central

    Engelthaler, David M.; Chiller, Tom; Schupp, James A.; Colvin, Joshua; Beckstrom-Sternberg, Stephen M.; Driebe, Elizabeth M.; Moses, Tracy; Tembe, Waibhav; Sinari, Shripad; Beckstrom-Sternberg, James S.; Christoforides, Alexis; Pearson, John V.; Carpten, John; Keim, Paul; Peterson, Ashley; Terashita, Dawn

    2011-01-01

    Next-generation sequencing enables use of whole-genome sequence typing (WGST) as a viable and discriminatory tool for genotyping and molecular epidemiologic analysis. We used WGST to confirm the linkage of a cluster of Coccidioides immitis isolates from 3 patients who received organ transplants from a single donor who later had positive test results for coccidioidomycosis. Isolates from the 3 patients were nearly genetically identical (a total of 3 single-nucleotide polymorphisms identified among them), thereby demonstrating direct descent of the 3 isolates from an original isolate. We used WGST to demonstrate the genotypic relatedness of C. immitis isolates that were also epidemiologically linked. Thus, WGST offers unique benefits to public health for investigation of clusters considered to be linked to a single source. PMID:21291593

  3. Identification and comparative analysis of differential gene expression in soybean leaf tissue under drought and flooding stress revealed by RNA-Seq

    USDA-ARS's Scientific Manuscript database

    Soybean is the second largest crop in the US. Its yield directly impacts US agricultural economics. Drought and flooding are two major causes for soybean yield loss. To better understand their underlying molecular regulatory mechanisms, we sequenced the transcriptomes of soybean grown in drought a...

  4. Effect of regulatory peptides on gene transcription.

    PubMed

    Khavinson, V Kh; Shataeva, L K; Chernova, A A

    2003-09-01

    Experimental studies of geroprotective activity of synthetic oligopeptides and conformational analysis of the tetrapeptide Epithalon allowed us to hypothesize that regulatory oligopeptides directly initiate transcription of genes for vitally important proteins. Sequences of nucleotide pairs that can serve as binding sites for tetrapeptide Epithalon were identified in the promoter regions of retinal genes F379, telomerase, and RNA polymerase II.

  5. Development and evaluation of a culture-independent method for source determination of fecal wastes in surface and storm waters using reverse transcriptase-PCR detection of FRNA coliphage genogroup gene sequences.

    EPA Science Inventory

    A complete method, incorporating recently improved reverse transcriptase-PCR primer/probe assays and including controls for determining interferences to phage recoveries from water sample concentrates and for detecting interferences to their analysis, was developed for the direct...

  6. Mycobacterium marinum Infections in Fish and Humans in Israel

    PubMed Central

    Ucko, M.; Colorni, A.

    2005-01-01

    Israeli Mycobacterium marinum isolates from humans and fish were compared by direct sequencing of the 16S rRNA and hsp65 genes, restriction mapping, and amplified fragment length polymorphism analysis. Significant molecular differences separated all clinical isolates from the piscine isolates, ruling out the local aquaculture industry as the source of human infections. PMID:15695698

  7. Development and evaluation of a culture-independent method for source determination of fecal wastes in surface and storm waters using reverse transcriptase-PCR detection of FRNA coliphage genogroup gene sequences

    EPA Science Inventory

    A complete method, incorporating recently improved reverse transcriptase-PCR primer/probe assays and including controls for determining interferences to phage recoveries from water sample concentrates and for detecting interferences to their analysis, was developed for the direct...

  8. From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction

    PubMed Central

    Cocco, Simona; Monasson, Remi; Weigt, Martin

    2013-01-01

    Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. PMID:23990764

  9. Intact and Top-Down Characterization of Biomolecules and Direct Analysis Using Infrared Matrix-Assisted Laser Desorption Electrospray Ionization Coupled to FT-ICR Mass Spectrometry

    PubMed Central

    Sampson, Jason S.; Murray, Kermit K.; Muddiman, David C.

    2013-01-01

    We report the implementation of an infrared laser onto our previously reported matrix-assisted laser desorption electrospray ionization (MALDESI) source with ESI post-ionization yielding multiply charged peptides and proteins. Infrared (IR)-MALDESI is demonstrated for atmospheric pressure desorption and ionization of biological molecules ranging in molecular weight from 1.2 to 17 kDa. High resolving power, high mass accuracy single-acquisition Fourier transform ion cyclotron resonance (FT-ICR) mass spectra were generated from liquid- and solid-state peptide and protein samples by desorption with an infrared laser (2.94 µm) followed by ESI post-ionization. Intact and top-down analysis of equine myoglobin (17 kDa) desorbed from the solid state with ESI post-ionization demonstrates the sequencing capabilities using IR-MALDESI coupled to FT-ICR mass spectrometry. Carbohydrates and lipids were detected through direct analysis of milk and egg yolk using both UV- and IR-MALDESI with minimal sample preparation. Three of the four classes of biological macromolecules (proteins, carbohydrates, and lipids) have been ionized and detected using MALDESI with minimal sample preparation. Sequencing of O-linked glycans, cleaved from mucin using reductive β-elimination chemistry, is also demonstrated. PMID:19185512

  10. Organization and variation analysis of 5S rDNA in gynogenetic offspring of Carassius auratus red var. (♀) × Megalobrama amblycephala (♂).

    PubMed

    Qin, QinBo; Wang, Juan; Wang, YuDe; Liu, Yun; Liu, ShaoJun

    2015-03-13

    Offspring with 100 chromosomes (abbreviated as GRCC) have been obtained in the first generation of Carassius auratus red var. (abbreviated as RCC, 2n = 100) (♀) × Megalobrama amblycephala (abbreviated as BSB, 2n = 48) (♂), in which both females and unexpected males are found. Chromosomal and karyotypic analyses of GRCC have been reported and suggest a gynogenetic origin, but genetic evidence has been lacking. Fluorescence in situ hybridization with species-specific centromere probes directly proves that GRCC possess two sets of RCC-derived chromosomes. Sequence analysis of the coding region (5S) and adjacent nontranscribed spacer (abbreviated as NTS) reveals that three types of 5S rDNA class (class I, class II and class III) in GRCC are completely inherited from their female parent (RCC), and show obvious base variations and insertions-deletions. Fluorescence in situ hybridization with the entire 5S rDNA probe reveals obvious variation in chromosomal loci (class I and class II) in GRCC. This paper provides direct genetic evidence that GRCC are of gynogenetic origin. In addition, our results also reveal that gynogenesis induced by distant hybridization can lead to obvious variation in the sequence and in some chromosomal loci of the 5S rDNA gene.

  11. Determination of the DNA-binding kinetics of three related but heteroimmune bacteriophage repressors using EMSA and SPR analysis

    PubMed Central

    Henriksson-Peltola, Petri; Sehlén, Wilhelmina; Haggård-Ljungquist, Elisabeth

    2007-01-01

    Bacteriophages P2, P2 Hy dis and WΦ are very similar but heteroimmune Escherichia coli phages. The structural genes show over 96% identity, but the repressors show between 43 and 63% identity. Furthermore, the operators, which contain two directly repeated sequences, vary in sequence, length, location relative to the promoter and spacing between the direct repeats. We have compared the in vivo effects of the wild type and mutated operators on gene expression with the complexes formed between the repressors and their wild type or mutated operators using electrophoretic mobility shift assay (EMSA), and real-time kinetics of the protein–DNA interactions using surface plasmon resonance (SPR) analysis. Using EMSA, the repressors formed different protein–DNA complexes, and only WΦ was significantly affected by point mutations. However, SPR analysis showed a reduced association rate constant and an increased dissociation rate constant for P2 and WΦ operator mutants. The association of P2 Hy dis was too fast for its rate constant to be determined. The P2 Hy dis dissociation response curves were shown to be triphasic, while those of both P2 and WΦ C were biphasic. Thus, the kinetics of complex formation and the nature of the complexes formed differ extensively between these very closely related phages. PMID:17412705

  12. Environmental Barcoding: A Next-Generation Sequencing Approach for Biomonitoring Applications Using River Benthos

    PubMed Central

    Hajibabaei, Mehrdad; Shokralla, Shadi; Zhou, Xin; Singer, Gregory A. C.; Baird, Donald J.

    2011-01-01

    Timely and accurate biodiversity analysis poses an ongoing challenge for the success of biomonitoring programs. Morphology-based identification of bioindicator taxa is time consuming, and rarely supports species-level resolution especially for immature life stages. Much work has been done in the past decade to develop alternative approaches for biodiversity analysis using DNA sequence-based approaches such as molecular phylogenetics and DNA barcoding. On-going assembly of DNA barcode reference libraries will provide the basis for a DNA-based identification system. The use of recently introduced next-generation sequencing (NGS) approaches in biodiversity science has the potential to further extend the application of DNA information for routine biomonitoring applications to an unprecedented scale. Here we demonstrate the feasibility of using 454 massively parallel pyrosequencing for species-level analysis of freshwater benthic macroinvertebrate taxa commonly used for biomonitoring. We designed our experiments in order to directly compare morphology-based, Sanger sequencing DNA barcoding, and next-generation environmental barcoding approaches. Our results show the ability of 454 pyrosequencing of mini-barcodes to accurately identify all species with more than 1% abundance in the pooled mixture. Although the approach failed to identify 6 rare species in the mixture, the presence of sequences from 9 species that were not represented by individuals in the mixture provides evidence that DNA based analysis may yet provide a valuable approach in finding rare species in bulk environmental samples. We further demonstrate the application of the environmental barcoding approach by comparing benthic macroinvertebrates from an urban region to those obtained from a conservation area. Although considerable effort will be required to robustly optimize NGS tools to identify species from bulk environmental samples, our results indicate the potential of an environmental barcoding approach for biomonitoring programs. PMID:21533287
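
    The 1% abundance cut-off mentioned in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: the taxon names, the counts, and the helper functions `relative_abundance` and `taxa_above_threshold` are all hypothetical, and the sketch only shows how a relative-abundance threshold separates common from rare taxa in a pooled mixture.

```python
def relative_abundance(counts):
    """Return each taxon's share of the total count as a fraction (0..1)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must contain at least one individual")
    return {taxon: n / total for taxon, n in counts.items()}


def taxa_above_threshold(counts, threshold=0.01):
    """Sorted names of taxa whose relative abundance exceeds the threshold."""
    shares = relative_abundance(counts)
    return sorted(taxon for taxon, share in shares.items() if share > threshold)


# Hypothetical numbers of individuals per taxon in a pooled sample (assumption).
pooled = {"taxon_A": 500, "taxon_B": 300, "taxon_C": 195, "taxon_D": 5}

common = taxa_above_threshold(pooled)  # taxon_D (0.5%) falls below the 1% cut-off
```

    Under this toy input, the rare taxon is the one that a threshold-limited method would be expected to miss.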

  13. Analysis of Duck Hepatitis B Virus Reverse Transcription Indicates a Common Mechanism for the Two Template Switches during Plus-Strand DNA Synthesis

    PubMed Central

    Havert, Michael B.; Ji, Lin; Loeb, Daniel D.

    2002-01-01

    The synthesis of the hepadnavirus relaxed circular DNA genome requires two template switches, primer translocation and circularization, during plus-strand DNA synthesis. Repeated sequences serve as donor and acceptor templates for these template switches, with direct repeat 1 (DR1) and DR2 for primer translocation and 5′r and 3′r for circularization. These donor and acceptor sequences are at, or near, the ends of the minus-strand DNA. Analysis of plus-strand DNA synthesis of duck hepatitis B virus (DHBV) has indicated that there are at least three other cis-acting sequences that make contributions during the synthesis of relaxed circular DNA. These sequences, 5E, M, and 3E, are located near the 5′ end, the middle, and the 3′ end of minus-strand DNA, respectively. The mechanism by which these sequences contribute to the synthesis of plus-strand DNA was unclear. Our aim was to better understand the mechanism by which 5E and M act. We localized the DHBV 5E element to a short sequence of approximately 30 nucleotides that is 100 nucleotides 3′ of DR2 on minus-strand DNA. We found that the new 5E mutants were partially defective for primer translocation/utilization at DR2. They were also invariably defective for circularization. In addition, examination of several new DHBV M variants indicated that they too were defective for primer translocation/utilization and circularization. Thus, this analysis indicated that 5E and M play roles in both primer translocation/utilization and circularization. In conjunction with earlier findings that 3E functions in both template switches, our findings indicate that the processes of primer translocation and circularization share a common underlying mechanism. PMID:11861843

  14. [Comparative genomics and evolutionary analysis of CRISPR loci in acetic acid bacteria].

    PubMed

    Xia, Kai; Liang, Xin-le; Li, Yu-dong

    2015-12-01

    The clustered regularly interspaced short palindromic repeat (CRISPR) is a widespread adaptive immunity system that exists in most archaea and many bacteria and acts against foreign DNA, such as phages, viruses and plasmids. In general, a CRISPR system consists of direct repeat, leader, spacer and CRISPR-associated sequences. Acetic acid bacteria (AAB) play an important role in industrial fermentation of vinegar and bioelectrochemistry. To investigate the polymorphism and evolution pattern of CRISPR loci in acetic acid bacteria, bioinformatic analyses were performed on 48 species from three main genera (Acetobacter, Gluconacetobacter and Gluconobacter) with whole genome sequences available from the NCBI database. The results showed that the CRISPR system existed in 32 of the 48 strains studied. Most of the CRISPR-Cas systems in AAB belonged to the type I CRISPR-Cas system (subtypes E and C), whereas the type II CRISPR-Cas system, which contains the cas9 gene, was found only in the genera Acetobacter and Gluconacetobacter. The repeat sequences of some CRISPR loci were highly conserved among species from different genera, and the leader sequences of some CRISPR loci possessed a conserved motif, which was associated with regulated promoters. Moreover, phylogenetic analysis of cas1 demonstrated that cas1 genes were suitable for classification of species. The conservation of cas1 genes was associated with that of repeat sequences among different strains, suggesting they were subjected to similar functional constraints. Moreover, the number of spacers was positively correlated with the number of prophages and insertion sequences, indicating that the acetic acid bacteria were continually invaded by new foreign DNA. The comparative analysis of CRISPR loci in acetic acid bacteria provides a basis for investigating the molecular mechanisms of differing acetic acid tolerance and genome stability in acetic acid bacteria.

  15. Implementation of a nonlinear concrete cracking algorithm in NASTRAN

    NASA Technical Reports Server (NTRS)

    Herting, D. N.; Herendeen, D. L.; Hoesly, R. L.; Chang, H.

    1976-01-01

    A computer code for the analysis of reinforced concrete structures was developed using NASTRAN as a basis. Nonlinear iteration procedures were developed for obtaining solutions with a wide variety of loading sequences. A direct access file system was used to save results at each load step to restart within the solution module for further analysis. A multi-nested looping capability was implemented to control the iterations and change the loads. The basis for the analysis is a set of multi-layer plate elements which allow local definition of materials and cracking properties.

  16. Visually driven chaining of elementary swim patterns into a goal-directed motor sequence: a virtual reality study of zebrafish prey capture.

    PubMed

    Trivedi, Chintan A; Bollmann, Johann H

    2013-01-01

    Prey capture behavior critically depends on rapid processing of sensory input in order to track, approach, and catch the target. When using vision, the nervous system faces the problem of extracting relevant information from a continuous stream of input in order to detect and categorize visible objects as potential prey and to select appropriate motor patterns for approach. For prey capture, many vertebrates exhibit intermittent locomotion, in which discrete motor patterns are chained into a sequence, interrupted by short periods of rest. Here, using high-speed recordings of full-length prey capture sequences performed by freely swimming zebrafish larvae in the presence of a single paramecium, we provide a detailed kinematic analysis of first and subsequent swim bouts during prey capture. Using Fourier analysis, we show that individual swim bouts represent an elementary motor pattern. Changes in orientation are directed toward the target on a graded scale and are implemented by an asymmetric tail bend component superimposed on this basic motor pattern. To further investigate the role of visual feedback on the efficiency and speed of this complex behavior, we developed a closed-loop virtual reality setup in which minimally restrained larvae recapitulated interconnected swim patterns closely resembling those observed during prey capture in freely moving fish. Systematic variation of stimulus properties showed that prey capture is initiated within a narrow range of stimulus size and velocity. Furthermore, variations in the delay and location of swim triggered visual feedback showed that the reaction time of secondary and later swims is shorter for stimuli that appear within a narrow spatio-temporal window following a swim. This suggests that the larva may generate an expectation of stimulus position, which enables accelerated motor sequencing if the expectation is met by appropriate visual feedback.

  17. Identification of single amino acid substitutions (SAAS) in neuraminidase from influenza A virus (H1N1) via mass spectrometry analysis coupled with de novo peptide sequencing.

    PubMed

    Peng, Qisheng; Wang, Zijian; Wu, Donglin; Li, Xiaoou; Liu, Xiaofeng; Sun, Wanchun; Liu, Ning

    2016-08-01

    Amino acid substitutions in the neuraminidase of the influenza virus are the main cause of the emergence of resistance to zanamivir or oseltamivir during seasonal influenza treatment; they are the result of non-synonymous mutations in the viral genome that can be successfully detected by polymerase chain reaction (PCR)-based approaches. There is always an urgent need to detect variation in amino acid sequences directly at the protein level. Mass spectrometry coupled with de novo sequencing has also been explored as an alternative and straightforward strategy for detecting amino acid substitutions; this approach is the primary focus of the present study. Influenza virus (A/Puerto Rico/8/1934 H1N1) propagated in embryonated chicken eggs was purified by ultracentrifugation, followed by PNGase F treatment. The deglycosylated virion was lysed and separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The gel band corresponding to neuraminidase was picked up and subjected to liquid chromatography tandem mass spectrometry (LC-MS/MS) analysis. LC-MS/MS analyses, coupled with manual de novo sequencing, allowed the determination of three amino acid substitutions: R346K, S349N, and S370I/L, in the neuraminidase from the influenza virus (A/Puerto Rico/8/1934 H1N1), which were located in three mutated peptides of the neuraminidase: YGNGVWIGK, TKNHSSR, and PNGWTETDI/LK, respectively. We found that the amino acid substitutions in the proteins of RNA viruses (including influenza A virus) resulting from non-synonymous gene mutations can indeed be directly analyzed via mass spectrometry, and that manual interpretation of the MS/MS data may be beneficial. Copyright © 2016 John Wiley & Sons, Ltd.

  18. Development of a Single Locus Sequence Typing (SLST) Scheme for Typing Bacterial Species Directly from Complex Communities.

    PubMed

    Scholz, Christian F P; Jensen, Anders

    2017-01-01

    The protocol describes a computational method to develop a Single Locus Sequence Typing (SLST) scheme for typing bacterial species. The resulting scheme can be used to type bacterial isolates as well as bacterial species directly from complex communities using next-generation sequencing technologies.

  19. Rapid identification of causative species in patients with Old World leishmaniasis.

    PubMed Central

    Minodier, P; Piarroux, R; Gambarelli, F; Joblet, C; Dumon, H

    1997-01-01

    Conventional methods for the identification of species of Leishmania parasite causing infections have limitations. By using a DNA-based alternative, the present study tries to develop a new tool for this purpose. Thirty-three patients living in Marseilles (in the south of France) were suffering from visceral or cutaneous leishmaniasis. DNA of the parasite in clinical samples (bone marrow, peripheral blood, or skin) from these patients was amplified by PCR and directly sequenced. The sequences observed were compared to those of 30 strains of the genus causing Old World leishmaniasis collected in Europe, Africa, or Asia. In the analysis of the sequences of the strains, two different sequence patterns for Leishmania infantum, one sequence for Leishmania donovani, one sequence for Leishmania major, two sequences for Leishmania tropica, and one sequence for Leishmania aethiopica were obtained. Four sequences were observed among the strains from the patients: one was similar to the sequence for the L. major strains, two were identical to the sequences for the L. infantum strains, and the last sequence was not observed within the strains but had a high degree of homology with the sequences of the L. infantum and L. donovani strains. The L. infantum strains from all immunocompetent patients had the same sequence. The L. infantum strains from immunodeficient patients suffering from visceral leishmaniasis had three different sequences. This fact might signify that some variants of L. infantum acquire pathogenicity exclusively in immunocompromised patients. To dispense with the sequencing step, a restriction assay with HaeIII was used. Some restriction patterns might support genetic exchanges in members of the genus Leishmania. PMID:9316906

  20. A model of directional selection applied to the evolution of drug resistance in HIV-1.

    PubMed

    Seoighe, Cathal; Ketwaroo, Farahnaz; Pillay, Visva; Scheffler, Konrad; Wood, Natasha; Duffet, Rodger; Zvelebil, Marketa; Martinson, Neil; McIntyre, James; Morris, Lynn; Hide, Winston

    2007-04-01

    Understanding how pathogens acquire resistance to drugs is important for the design of treatment strategies, particularly for rapidly evolving viruses such as HIV-1. Drug treatment can exert strong selective pressures and sites within targeted genes that confer resistance frequently evolve far more rapidly than the neutral rate. Rapid evolution at sites that confer resistance to drugs can be used to help elucidate the mechanisms of evolution of drug resistance and to discover or corroborate novel resistance mutations. We have implemented standard maximum likelihood methods that are used to detect diversifying selection and adapted them for use with serially sampled reverse transcriptase (RT) coding sequences isolated from a group of 300 HIV-1 subtype C-infected women before and after single-dose nevirapine (sdNVP) to prevent mother-to-child transmission. We have also extended the standard models of codon evolution for application to the detection of directional selection. Through simulation, we show that the directional selection model can provide a substantial improvement in sensitivity over models of diversifying selection. Five of the sites within the RT gene that are known to harbor mutations that confer resistance to nevirapine (NVP) strongly supported the directional selection model. There was no evidence that other mutations that are known to confer NVP resistance were selected in this cohort. The directional selection model, applied to serially sampled sequences, also had more power than the diversifying selection model to detect selection resulting from factors other than drug resistance. Because inference of selection from serial samples is unlikely to be adversely affected by recombination, the methods we describe may have general applicability to the analysis of positive selection affecting recombining coding sequences when serially sampled data are available.

  1. Paleomagnetic directions and thermoluminescence dating from a bread oven-floor sequence in Lübeck (Germany): A record of 450 years of geomagnetic secular variation

    NASA Astrophysics Data System (ADS)

    Schnepp, Elisabeth; Pucher, Rudolf; Goedicke, Christian; Manzano, Ana; Müller, Uwe; Lanos, Philippe

    2003-02-01

    A record of about 450 years of geomagnetic secular variation is presented from a single archaeological site in Lübeck (Germany) where a sequence of 25 bread oven floors has been preserved in a bakery from medieval times until today. The age dating of the oven-floor sequence is based on historical documents, 14C-dating and thermoluminescence dating. It confines the time interval from about 1300 to 1800 A.D. Paleomagnetic directions have been determined from each oven floor by means of 198 oriented hand samples. After alternating field as well as thermal demagnetization experiments, the characteristic remanent magnetization direction was obtained using principal component analysis. The mean directions of 24 oven floors are characterized by high Fisherian precision parameters (>146) and small α95 confidence limits (1.2°-4.6°). For obtaining a smooth curve of geomagnetic secular variation for Lübeck, a spherical spline function was fitted to the data using a Bayesian approach, which considers not only the obtained ages, but also stratigraphic order. Correlation with historical magnetic records suggests that the age estimation for the upper 10 layers was too young and must date from the end of the sixteenth to the middle of the eighteenth century. For the lowermost 14 layers, dating is reliable and provides a secular variation curve for Germany. The inclination shows a minimum in the fourteenth century and then increases by more than 10°. Declination shows a local minimum around 1400 A.D. followed by a maximum in the seventeenth century. This is followed by the movement of declination about 30° to western directions.

  2. LEDGF/p75 interacts with mRNA splicing factors and targets HIV-1 integration to highly spliced genes

    PubMed Central

    Singh, Parmit Kumar; Plumb, Matthew R.; Ferris, Andrea L.; Iben, James R.; Wu, Xiaolin; Fadel, Hind J.; Luke, Brian T.; Esnault, Caroline; Poeschla, Eric M.; Hughes, Stephen H.; Kvaratskhelia, Mamuka; Levin, Henry L.

    2015-01-01

    The host chromatin-binding factor LEDGF/p75 interacts with HIV-1 integrase and directs integration to active transcription units. To understand how LEDGF/p75 recognizes transcription units, we sequenced 1 million HIV-1 integration sites isolated from cultured HEK293T cells. Analysis of integration sites showed that cancer genes were preferentially targeted, raising concerns about using lentivirus vectors for gene therapy. Additional analysis led to the discovery that introns and alternative splicing contributed significantly to integration site selection. These correlations were independent of transcription levels, size of transcription units, and length of the introns. Multivariate analysis with five parameters previously found to predict integration sites showed that intron density is the strongest predictor of integration density in transcription units. Analysis of previously published HIV-1 integration site data showed that integration density in transcription units in mouse embryonic fibroblasts also correlated strongly with intron number, and this correlation was absent in cells lacking LEDGF. Affinity purification showed that LEDGF/p75 is associated with a number of splicing factors, and RNA sequencing (RNA-seq) analysis of HEK293T cells lacking LEDGF/p75 or the LEDGF/p75 integrase-binding domain (IBD) showed that LEDGF/p75 contributes to splicing patterns in half of the transcription units that have alternative isoforms. Thus, LEDGF/p75 interacts with splicing factors, contributes to exon choice, and directs HIV-1 integration to transcription units that are highly spliced. PMID:26545813

  3. Psychiatric symptoms mediate the effects of neurological soft signs on functional outcomes in patients with chronic schizophrenia: A longitudinal path-analytic study.

    PubMed

    Fong, Ted C T; Ho, Rainbow T H; Wan, Adrian H Y; Au-Yeung, Friendly S W

    2017-03-01

    Neurological soft signs (NSS) in motor coordination and sequencing occur in schizophrenia patients and are an intrinsic sign of the underlying neural dysfunctions. The present longitudinal study explored the relationships among NSS, psychiatric symptoms, and functional outcomes in 151 Chinese patients with chronic schizophrenia across a 6-month period. The participants completed neurological assessments at baseline (Time 1), psychiatric interviews at Time 1 and 3-month follow-up (Time 2), and self-report measures on daily functioning at 6-month follow-up (Time 3). Two possible (combined and cascading) path models were examined on predicting the functional outcomes. Direct and indirect effects of Time 1 NSS on Time 3 functional outcomes via Time 2 psychiatric symptoms were evaluated using path analysis under bootstrapping. Motor coordination and sequencing NSS did not have significant direct effects on functional outcomes. Motor coordination NSS exerted significant and negative indirect effects on functional outcomes via psychiatric symptoms. These results contribute to a better understanding of the determinants of functional outcomes by showing significant indirect pathways from motor coordination NSS to functional outcomes via psychiatric symptoms. That motor sequencing NSS did not affect functional outcomes either directly or indirectly may be explained by their trait marking features. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.

  4. Application of a reverse dot blot, DNA-DNA hybridization method to quantify host-feeding tendencies of two sibling species in the Anopheles gambiae complex

    PubMed Central

    Fritz, Megan L; Miller, James R; Bayoh, M Nabie; Vulule, John M; Landgraf, Jeffrey R; Walker, Edward D

    2012-01-01

    A DNA-DNA hybridization method, reverse dot blot analysis (RDBA), was used for identification of Anopheles gambiae s.s. and An. arabiensis hosts. Of 299 blood fed and half gravid An. gambiae s.l. collected from Kisian, Kenya, 244 individuals were identifiable to species; 69.5% were An. arabiensis, and 29.5% were An. gambiae s.s. Host identifications with RDBA were comparable to conventional PCR followed by direct sequencing of amplicons of the vertebrate mitochondrial cytochrome B gene. Of the 174 amplicon-producing samples used for comparison of these two methods, 147 were identifiable by direct sequencing, and 139 of these same samples were identifiable by RDBA. An. arabiensis blood meals were mostly (>90%) bovine in origin, whereas An. gambiae s.s. fed upon humans > 90% of the time. RDBA detected that 2 of 112 An. arabiensis had blood from more than one host species, whereas PCR and direct sequencing did not. Recent insecticide-treated bednet (ITN) use in Kisian has likely caused the shift in the dominant vector species from An. gambiae s.s. to An. arabiensis. RDBA provides an opportunity to study changes in host-feeding by members of the An. gambiae complex as a response to the broadening distribution of vector control measures targeting host-selection behaviors. PMID:24188164
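
    The method comparison in this abstract reduces to two proportions, worked through below as a minimal sketch. The counts (174, 147, 139) are taken from the abstract; the helper `percent` and the variable names are illustrative assumptions, and no statistical test from the paper is reproduced here.

```python
def percent(part, whole):
    """Express part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


amplicon_samples = 174        # samples that produced an amplicon
identified_by_sequencing = 147  # of those, identifiable by direct sequencing
also_identified_by_rdba = 139   # of the sequenced-identifiable, also identifiable by RDBA

sequencing_rate = percent(identified_by_sequencing, amplicon_samples)
rdba_agreement = percent(also_identified_by_rdba, identified_by_sequencing)

print(f"direct sequencing identified {sequencing_rate:.1f}% of amplicon-producing samples")
print(f"RDBA identified {rdba_agreement:.1f}% of the samples identified by sequencing")
```

    With these counts, direct sequencing identified about 84.5% of the amplicon-producing samples, and RDBA identified about 94.6% of the samples that sequencing identified.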

  5. Single molecule sequencing of the M13 virus genome without amplification

    PubMed Central

    Zhao, Luyang; Deng, Liwei; Li, Gailing; Jin, Huan; Cai, Jinsen; Shang, Huan; Li, Yan; Wu, Haomin; Xu, Weibin; Zeng, Lidong; Zhang, Renli; Zhao, Huan; Wu, Ping; Zhou, Zhiliang; Zheng, Jiao; Ezanno, Pierre; Yang, Andrew X.; Yan, Qin; Deem, Michael W.; He, Jiankui

    2017-01-01

    Next generation sequencing (NGS) has revolutionized life sciences research. However, GC bias and costly, time-intensive library preparation make NGS an ill fit for increasing sequencing demands in the clinic. A new class of third-generation sequencing platforms has arrived to meet this need, capable of directly measuring DNA and RNA sequences at the single-molecule level without amplification. Here, we use the new GenoCare single-molecule sequencing platform from Direct Genomics to sequence the genome of the M13 virus. Our platform detects single-molecule fluorescence by total internal reflection microscopy, with sequencing-by-synthesis chemistry. We sequenced the genome of M13 to a depth of 316x, with 100% coverage. We determined a consensus sequence accuracy of 100%. In contrast to GC bias inherent to NGS results, we demonstrated that our single-molecule sequencing method yields minimal GC bias. PMID:29253901
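
    The quantities named in this abstract (GC content, sequencing depth, and coverage) can be made concrete with a toy sketch. It is not the GenoCare analysis pipeline: the 10-base sequence, the alignment intervals, and the function names are invented for illustration, and the genome is treated as linear even though M13 is circular.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def per_base_depth(genome_length, alignments):
    """Count how many reads cover each position.

    alignments is an iterable of (start, length) pairs, 0-based,
    on a linear reference (an assumption of this sketch).
    """
    depth = [0] * genome_length
    for start, length in alignments:
        for pos in range(start, min(start + length, genome_length)):
            depth[pos] += 1
    return depth


def summarize(depth):
    """Return (mean depth, fraction of positions covered at least once)."""
    mean_depth = sum(depth) / len(depth)
    breadth = sum(d > 0 for d in depth) / len(depth)
    return mean_depth, breadth


# Hypothetical reference and read placements (assumptions, not M13 data).
toy_genome = "ATGCGCATTA"
toy_alignments = [(0, 6), (4, 6), (2, 5)]

depth = per_base_depth(len(toy_genome), toy_alignments)
mean_depth, breadth = summarize(depth)
```

    In these terms, the reported "depth of 316x, with 100% coverage" corresponds to a mean depth of 316 and a breadth of 1.0; comparing depth across windows of differing `gc_content` is one common way to look for GC bias.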

  6. Single molecule sequencing of the M13 virus genome without amplification.

    PubMed

    Zhao, Luyang; Deng, Liwei; Li, Gailing; Jin, Huan; Cai, Jinsen; Shang, Huan; Li, Yan; Wu, Haomin; Xu, Weibin; Zeng, Lidong; Zhang, Renli; Zhao, Huan; Wu, Ping; Zhou, Zhiliang; Zheng, Jiao; Ezanno, Pierre; Yang, Andrew X; Yan, Qin; Deem, Michael W; He, Jiankui

    2017-01-01

    Next generation sequencing (NGS) has revolutionized life sciences research. However, GC bias and costly, time-intensive library preparation make NGS an ill fit for increasing sequencing demands in the clinic. A new class of third-generation sequencing platforms has arrived to meet this need, capable of directly measuring DNA and RNA sequences at the single-molecule level without amplification. Here, we use the new GenoCare single-molecule sequencing platform from Direct Genomics to sequence the genome of the M13 virus. Our platform detects single-molecule fluorescence by total internal reflection microscopy, with sequencing-by-synthesis chemistry. We sequenced the genome of M13 to a depth of 316x, with 100% coverage. We determined a consensus sequence accuracy of 100%. In contrast to GC bias inherent to NGS results, we demonstrated that our single-molecule sequencing method yields minimal GC bias.

  7. The Swiss-Army-Knife Approach to the Nearly Automatic Analysis for Microearthquake Sequences.

    NASA Astrophysics Data System (ADS)

    Kraft, T.; Simon, V.; Tormann, T.; Diehl, T.; Herrmann, M.

    2017-12-01

    Many Swiss earthquake sequences have been studied using relative location techniques, which often made it possible to constrain the active fault planes and shed light on the tectonic processes that drove the seismicity. Yet, in the majority of cases the number of located earthquakes was too small to infer the details of the space-time evolution of the sequences, or their statistical properties. Therefore, it has mostly been impossible to resolve clear patterns in the seismicity of individual sequences, which are needed to improve our understanding of the mechanisms behind them. Here we present a nearly automatic workflow that combines well-established seismological analysis techniques and significantly improves the completeness of detected and located earthquakes of a sequence. We start from the manually timed routine catalog of the Swiss Seismological Service (SED), which contains the larger events of a sequence. From these well-analyzed earthquakes we dynamically assemble a template set and perform a matched filter analysis on the station with the best SNR for the sequence and a recording history of at least 10-15 years, our typical analysis period. This usually allows us to detect events several orders of magnitude below the SED catalog detection threshold. The waveform similarity of the events is then further exploited to derive accurate and consistent magnitudes. The enhanced catalog is then analyzed statistically to derive high-resolution time-lines of the a- and b-value and consequently the occurrence probability of larger events. Many of the detected events are strong enough to be located using double-differences. No further manual interaction is needed; we simply time-shift the arrival-time pattern of the detecting template to the associated detection. Waveform similarity assures a good approximation of the expected arrival-times, which we use to calculate event-pair arrival-time differences by cross correlation. After an SNR and cycle-skipping quality check these are directly fed into hypoDD. Using this procedure we usually improve the number of well-relocated events by a factor of 2-5. We demonstrate the successful application of the workflow using the example of natural sequences in Switzerland and present first results of the advanced analysis that was possible with the enhanced catalogs.
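
    The matched filter step described in this abstract can be sketched as a sliding normalized cross-correlation between a template waveform and a continuous trace. The following is a minimal single-channel illustration on synthetic data, not the SED workflow: the template shape, noise level, event positions, the fixed 0.8 threshold, and the function names are all assumptions made for the example.

```python
import numpy as np


def normalized_cross_correlation(trace, template):
    """Slide template along trace; return the correlation coefficient per offset."""
    trace = np.asarray(trace, dtype=float)
    template = np.asarray(template, dtype=float)
    n = template.size
    t = template - template.mean()
    t_norm = np.linalg.norm(t)
    cc = np.zeros(trace.size - n + 1)
    for i in range(cc.size):
        window = trace[i:i + n]
        window = window - window.mean()
        denom = np.linalg.norm(window) * t_norm
        if denom > 0:
            cc[i] = np.dot(window, t) / denom
    return cc


def detect(cc, threshold=0.8):
    """Offsets where cc is a local maximum at or above the threshold."""
    hits = []
    for i in range(cc.size):
        left = cc[i - 1] if i > 0 else -np.inf
        right = cc[i + 1] if i < cc.size - 1 else -np.inf
        if cc[i] >= threshold and cc[i] > left and cc[i] >= right:
            hits.append(i)
    return hits


# Synthetic example: a tapered 5 Hz burst sampled at 100 Hz (assumed values).
rng = np.random.default_rng(0)
samples = np.arange(50) / 100.0
template = np.sin(2 * np.pi * 5 * samples) * np.hanning(50)

trace = 0.02 * rng.standard_normal(1000)   # background noise
trace[200:250] += template                 # a larger, "cataloged" event
trace[650:700] += 0.5 * template           # a smaller repeat of the same waveform

cc = normalized_cross_correlation(trace, template)
detections = detect(cc)
```

    Because the correlation coefficient is insensitive to amplitude, the half-amplitude repeat is picked up alongside the larger event. Production workflows typically stack correlations over several channels and set the threshold from the noise statistics of `cc` rather than using a fixed value.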

  8. BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations

    PubMed Central

    Wang, Junbai; Batmanov, Kirill

    2015-01-01

    Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein–DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein–DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972

  9. Identification of Sinorhizobium (Ensifer) medicae based on a specific genomic sequence unveiled by M13-PCR fingerprinting.

    PubMed

    Dourado, Ana Catarina; Alves, Paula I L; Tenreiro, Tania; Ferreira, Eugénio M; Tenreiro, Rogério; Fareleira, Paula; Crespo, M Teresa Barreto

    2009-12-01

    A collection of nodule isolates from Medicago polymorpha obtained from southern and central Portugal was evaluated by M13-PCR fingerprinting and hierarchical cluster analysis. Several genomic clusters were obtained which, by 16S rRNA gene sequencing of selected representatives, were shown to be associated with particular taxonomic groups of rhizobia and other soil bacteria. The method provided a clear separation between rhizobia and co-isolated non-symbiotic soil contaminants. Ten M13-PCR groups were assigned to Sinorhizobium (Ensifer) medicae and included all isolates responsible for the formation of nitrogen-fixing nodules upon re-inoculation of M. polymorpha test-plants. In addition, enterobacterial repetitive intergenic consensus (ERIC)-PCR fingerprinting indicated a high genomic heterogeneity within the major M13-PCR clusters of S. medicae isolates. Based on nucleotide sequence data of an M13-PCR amplicon of ca. 1500 bp, observed only in S. medicae isolates and spanning locus Smed_3707 to Smed_3709 in the pSMED01 plasmid sequence of the S. medicae WSM419 genome, a pair of PCR primers was designed and used for direct PCR amplification of a 1399-bp sequence within this fragment. Additional in silico and in vitro experiments, as well as phylogenetic analysis, confirmed the specificity of this primer combination and therefore the reliability of this approach in the prompt identification of S. medicae isolates and their distinction from other soil bacteria.

  10. An integrated semiconductor device enabling non-optical genome sequencing.

    PubMed

    Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

    2011-07-20

    The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.

  11. Effective Connectivity of Cortical Sensorimotor Networks During Finger Movement Tasks: A Simultaneous fNIRS, fMRI, EEG Study.

    PubMed

    Anwar, A R; Muthalib, M; Perrey, S; Galka, A; Granert, O; Wolff, S; Heute, U; Deuschl, G; Raethjen, J; Muthuraman, Muthuraman

    2016-09-01

    Recently, interest has been growing to understand the underlying dynamic directional relationship between simultaneously activated regions of the brain during motor task performance. Such directionality analysis (or effective connectivity analysis), based on non-invasive electrophysiological (electroencephalography-EEG) and hemodynamic (functional near infrared spectroscopy-fNIRS; and functional magnetic resonance imaging-fMRI) neuroimaging modalities can provide an estimate of the motor task-related information flow from one brain region to another. Since EEG, fNIRS and fMRI modalities achieve different spatial and temporal resolutions of motor-task related activation in the brain, the aim of this study was to determine the effective connectivity of cortico-cortical sensorimotor networks during finger movement tasks measured by each neuroimaging modality. Nine healthy subjects performed right hand finger movement tasks of different complexity (simple finger tapping-FT, simple finger sequence-SFS, and complex finger sequence-CFS). We focused our observations on three cortical regions of interest (ROIs), namely the contralateral sensorimotor cortex (SMC), the contralateral premotor cortex (PMC) and the contralateral dorsolateral prefrontal cortex (DLPFC). We estimated the effective connectivity between these ROIs using conditional Granger causality (GC) analysis determined from the time series signals measured by fMRI (blood oxygenation level-dependent-BOLD), fNIRS (oxygenated-O2Hb and deoxygenated-HHb hemoglobin), and EEG (scalp and source level analysis) neuroimaging modalities. The effective connectivity analysis showed significant bi-directional information flow between the SMC, PMC, and DLPFC as determined by the EEG (scalp and source), fMRI (BOLD) and fNIRS (O2Hb and HHb) modalities for all three motor tasks. However the source level EEG GC values were significantly greater than the other modalities. 
    In addition, only the source level EEG showed a significantly greater forward than backward information flow between the ROIs. This simultaneous fMRI, fNIRS and EEG study has shown through independent GC analysis of the respective time series that a bi-directional effective connectivity occurs within a cortico-cortical sensorimotor network (SMC, PMC and DLPFC) during finger movement tasks.

  12. Dual Priming Oligonucleotides for Broad-Range Amplification of the Bacterial 16S rRNA Gene Directly from Human Clinical Specimens

    PubMed Central

    Simmon, Keith; Karaca, Dilek; Langeland, Nina; Wiker, Harald G.

    2012-01-01

    Broad-range amplification and sequencing of the bacterial 16S rRNA gene directly from clinical specimens are offered as a diagnostic service in many laboratories. One major pitfall is primer cross-reactivity with human DNA which will result in mixed chromatograms. Mixed chromatograms will complicate subsequent sequence analysis and impede identification. In SYBR green real-time PCR assays, it can also affect crossing threshold values and consequently the status of a specimen as positive or negative. We evaluated two conventional primer pairs in common use and a new primer pair based on the dual priming oligonucleotide (DPO) principle. Cross-reactivity was observed when both conventional primer pairs were used, resulting in interpretation difficulties. No cross-reactivity was observed using the DPOs even in specimens with a high ratio of human to bacterial DNA. In addition to reducing cross-reactivity, the DPO principle also offers a high degree of flexibility in the design of primers and should be considered for any PCR assay intended for detection and identification of pathogens directly from human clinical specimens. PMID:22278843

  13. Sequence analysis of porcine kobuvirus VP1 region detected in pigs in Japan and Thailand.

    PubMed

    Okitsu, Shoko; Khamrin, Pattara; Thongprachum, Aksara; Hidaka, Satoshi; Kongkaew, Sompreeya; Kongkaew, Apisek; Maneekarn, Niwat; Mizuguchi, Masashi; Hayakawa, Satoshi; Ushijima, Hiroshi

    2012-04-01

    Porcine kobuvirus is a new candidate species of the genus Kobuvirus in the family Picornaviridae, and information is still limited. The identification of porcine kobuvirus has been performed by sequence analysis of the 3D region of the viruses. Therefore, the purpose of this study was to characterize the molecular properties of VP1 nucleotide sequences of the porcine kobuviruses isolated from porcine stool samples in Japan during 2009 and Thailand between 2006 and 2008. In addition, a previously identified unique porcine kobuvirus, the Japanese strain H023/2009/JP, which is a bovine kobuvirus-like strain based on sequence analysis of the 3D region, was also included in this study. All of the strains were amplified with the VP1-specific primer pair; the amplicons were subjected to direct sequencing and compared with the VP1 nucleotide sequences of reference strains. The VP1 sequences of strains from the GenBank database revealed high nucleotide sequence identity at 84.3-100%. On the other hand, the nucleotide identities among the 15 porcine kobuvirus strains analyzed in this study ranged from 78.8 to 99.8%. The results revealed that the diversity of the strains in this study was higher than that of the strains in previous studies. Furthermore, it was found that the VP1 region of the bovine kobuvirus-like strain, H023/2009/JP, clustered with nine porcine kobuvirus strains that were isolated in Thailand and Japan. Since this strain was previously found to be closely related to bovine kobuviruses in the 3D gene region, it may be a natural recombinant.
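
    The identity ranges quoted above (84.3-100% and 78.8-99.8%) are pairwise percent identities between aligned nucleotide sequences. As a rough illustration only, the sketch below shows one common way such a range can be computed. It assumes the sequences are already aligned to equal length and that columns containing a gap are skipped; the record does not say which software or gap convention the authors used, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns, skipping columns with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_range(aligned: Dict[str, str]) -> Tuple[float, float]:
    """Lowest and highest pairwise identity among two or more aligned sequences."""
    values = [
        percent_identity(aligned[x], aligned[y])
        for x, y in combinations(aligned, 2)
    ]
    return min(values), max(values)


# Toy alignment (made-up sequences, not data from the study).
toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACGAACGA"}
print(identity_range(toy))  # (75.0, 87.5)
```

    Published identity values also depend on the alignment method and on how gaps and ambiguous bases are counted, so a sketch like this would not necessarily reproduce the figures in the abstract.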

  14. Direct Downregulation of B-Cell Translocation Gene 3 by microRNA-93 Is Required for Desensitizing Esophageal Cancer to Radiotherapy.

    PubMed

    Cui, Hujun; Zhang, Shengqiang; Zhou, Hongbo; Guo, Ling

    2017-08-01

    Esophageal squamous carcinoma (ESC) is one of the most fatal malignancies worldwide with increasing occurrences yet poor outcome. MicroRNAs were reported to play roles in ESC. We aimed to understand how miRNAs affect the radiotherapy resistance of ESC. MicroRNA assays, real-time PCR, and Western blot were performed for expression analysis of miR-93 and BTG3. Luciferase activity assay was conducted with mutated B-cell translocation gene 3 (BTG3) 3'-UTR sequence in the 3' end of luciferase sequence with miR-93 inhibitor. ESC cells were treated with irradiation (IR) and clonogenic assay was utilized to detect the cell viability. Human ESC xenograft mouse model was established and subjected to target IR treatment followed by tumor size analysis. MiR-93 was decreased and BTG3 was increased in ESC cells, with negative correlation of their expression in ESC tissues. MiR-93 directly targeted BTG3 3'-UTR by luciferase activity assay. Either miR-93 inhibition or BTG3 overexpression decreased radiation resistance. Furthermore, miR-93 inhibition suppressed radiation resistance through BTG3. Direct downregulation of BTG3 by miR-93 is able to render ESC resistant to radiotherapy, and both BTG3 and miR-93 may potentially serve as clinical markers for ESC and contribute to the treatment of ESC.

  15. Unveiling fungal zooflagellates as members of freshwater picoeukaryotes: evidence from a molecular diversity study in a deep meromictic lake.

    PubMed

    Lefèvre, Emilie; Bardot, Corinne; Noël, Christophe; Carrias, Jean-François; Viscogliosi, Eric; Amblard, Christian; Sime-Ngando, Télesphore

    2007-01-01

    This study presents an original 18S rRNA PCR survey of the freshwater picoeukaryote community, and was designed to detect unidentified heterotrophic picoflagellates (size range 0.6-5 µm) which are prevalent throughout the year within the heterotrophic flagellate assemblage in Lake Pavin. Four clone libraries were constructed from samples collected in two contrasting zones in the lake. Computerized statistical tools suggested that sequence retrieval was representative of the in situ picoplankton diversity. The two sampling zones exhibited similar diversity patterns but shared only about 5% of the operational taxonomic units (OTUs). Phylogenetic analysis clustered our sequences into three taxonomic groups: Alveolates (30% of OTUs), Fungi (23%) and Cercozoa (19%). Fungi thus substantially contributed to the detected diversity, as was additionally supported by direct microscopic observations of fungal zoospores and sporangia. A large fraction of the sequences belonged to parasites, including Alveolate sequences affiliated to the genus Perkinsus known as zooparasites, and chytrids that include host-specific parasitic fungi of various freshwater phytoplankton species, primarily diatoms. Phylogenetic analysis revealed five novel clades that probably include typical freshwater environmental sequences. Overall, from the unsuspected fungal diversity unveiled, we think that fungal zooflagellates have been misidentified as phagotrophic nanoflagellates in previous studies. This is in agreement with a recent experimental demonstration that zoospore-producing fungi and parasitic activity may play an important role in aquatic food webs.

  16. Modeling participation duration, with application to the North American Breeding Bird Survey

    USGS Publications Warehouse

    Link, William; Sauer, John

    2014-01-01

    We consider “participation histories,” binary sequences consisting of alternating finite sequences of 1s and 0s, ending with an infinite sequence of 0s. Our work is motivated by a study of observer tenure in the North American Breeding Bird Survey (BBS). In our analysis, j indexes an observer’s years of service and Xj is an indicator of participation in the survey; 0s interspersed among 1s correspond to years when observers did not participate, but subsequently returned to service. Of interest is the observer’s duration D = max {j: Xj = 1}. Because observed records X = (X1, X2, ..., Xn)′ are of finite length, all that we can directly infer about duration is that D ⩾ max {j ⩽ n: Xj = 1}; model-based analysis is required for inference about D. We propose models in which lengths of 0s and 1s sequences have distributions determined by the index j at which they begin; 0s sequences are infinite with positive probability, an estimable parameter. We found that BBS observers’ lengths of service vary greatly, with 25.3% participating for only a single year, 49.5% serving for 4 or fewer years, and an average duration of 8.7 years, producing an average of 7.7 counts.
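
    The bound D ⩾ max {j ⩽ n: Xj = 1} can be read directly off an observed record. The minimal sketch below computes that lower bound, together with the number of years actually surveyed, for a hypothetical participation history. The function names and the example record are illustrative assumptions; the model-based inference about D described in the abstract is not attempted here.

```python
from typing import Optional, Sequence


def last_observed_participation(history: Sequence[int]) -> Optional[int]:
    """Return max{j <= n : X_j = 1} using 1-based j, or None if the record has no 1."""
    last = None
    for j, x in enumerate(history, start=1):
        if x not in (0, 1):
            raise ValueError("participation indicators must be 0 or 1")
        if x == 1:
            last = j
    return last


def years_surveyed(history: Sequence[int]) -> int:
    """Number of years with participation (the count of 1s in the record)."""
    return sum(history)


# Hypothetical record: active in years 1, 2 and 4, absent in years 3, 5 and 6.
record = [1, 1, 0, 1, 0, 0]
print(last_observed_participation(record))  # 4, so D >= 4
print(years_surveyed(record))               # 3
```

    The trailing 0s in a finite record do not show whether the observer has stopped for good or is only in a gap, which is why the value computed here is a lower bound on D and not D itself.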

  17. Using Wave-Current Observation to Predict Bottom Sediment Processes on Muddy Beaches

    DTIC Science & Technology

    2011-09-30

    as 80% of wave energy over a distance of just a few wave lengths (Gade, 1957; Jiang and Mehta, 1995; deWitt, 1995; Hill and Foda , 1999; Chan and Liu...bed transformation (see Section Figure 1) emerges from the analysis Sheremet et al., 2005; Jaramillo et al., 2008; Robillard, 2009; Sahin et al...Kaihatu et al., 2007; Sheremet et al., 2010). The ongoing work has three directions of research: Data analysis : reconstruct the sequence of bed

  18. Analysis of Bacterial Community Structure in Sulfurous-Oil-Containing Soils and Detection of Species Carrying Dibenzothiophene Desulfurization (dsz) Genes

    PubMed Central

    Duarte, Gabriela Frois; Rosado, Alexandre Soares; Seldin, Lucy; de Araujo, Welington; van Elsas, Jan Dirk

    2001-01-01

    The selective effects of sulfur-containing hydrocarbons, with respect to changes in bacterial community structure and selection of desulfurizing organisms and genes, were studied in soil. Samples taken from a polluted field soil (A) along a concentration gradient of sulfurous oil and from soil microcosms treated with dibenzothiophene (DBT)-containing petroleum (FSL soil) were analyzed. Analyses included plate counts of total bacteria and of DBT utilizers, molecular community profiling via soil DNA-based PCR-denaturing gradient gel electrophoresis (PCR-DGGE), and detection of genes that encode enzymes involved in the desulfurization of hydrocarbons, i.e., dszA, dszB, and dszC. Data obtained from the A soil showed no discriminating effects of oil levels on the culturable bacterial numbers on either medium used. Generally, counts of DBT degraders were 10- to 100-fold lower than the total culturable counts. However, PCR-DGGE showed that the numbers of bands detected in the molecular community profiles decreased with increasing oil content of the soil. Analysis of the sequences of three prominent bands of the profiles generated with the highly polluted soil samples suggested that the underlying organisms were related to Actinomyces sp., Arthrobacter sp., and a bacterium of uncertain affiliation. dszA, dszB, and dszC genes were present in all A soil samples, whereas a range of unpolluted soils gave negative results in this analysis. Results from the study of FSL soil revealed minor effects of the petroleum-DBT treatment on culturable bacterial numbers and clear effects on the DBT-utilizing communities. The molecular community profiles were largely stable over time in the untreated soil, whereas they showed a progressive change over time following treatment with DBT-containing petroleum.
    Direct PCR assessment revealed the presence of dszB-related signals in the untreated FSL soil and the apparent selection of dszA- and dszC-related sequences by the petroleum-DBT treatment. PCR-DGGE applied to sequential enrichment cultures in DBT-containing sulfur-free basal salts medium prepared from the A and treated FSL soils revealed the selection of up to 10 distinct bands. Sequencing a subset of these bands provided evidence for the presence of organisms related to Pseudomonas putida, a Pseudomonas sp., Stenotrophomonas maltophilia, and Rhodococcus erythropolis. Several of 52 colonies obtained from the A and FSL soils on agar plates with DBT as the sole sulfur source produced bands that matched the migration of bands selected in the enrichment cultures. Evidence for the presence of dszB in 12 strains was obtained, whereas dszA and dszC genes were found in only 7 and 6 strains, respectively. Most of the strains carrying dszA or dszC were classified as R. erythropolis related, and all revealed the capacity to desulfurize DBT. A comparison of 37 dszA sequences, obtained via PCR from the A and FSL soils, from enrichments of these soils, and from isolates, revealed the great similarity of all sequences to the canonical (R. erythropolis strain IGTS8) dszA sequence and a large degree of internal conservation. The 37 sequences recovered were grouped in three clusters. One group, consisting of 30 sequences, was minimally 98% related to the IGTS8 sequence, a second group of 2 sequences was slightly different, and a third group of 5 sequences was 95% similar. The first two groups contained sequences obtained from both soil types and enrichment cultures (including isolates), but the last consisted of sequences obtained directly from the polluted A soil. PMID:11229891

  19. First molecular identification and characterization of classical swine fever virus isolates from Nepal.

    PubMed

    Postel, Alexander; Jha, Vijay C; Schmeiser, Stefanie; Becher, Paul

    2013-01-01

    Classical swine fever (CSF) is a major constraint to pig production worldwide, and in many developing countries, the epidemiological status is unknown. Here, for the first time, molecular identification and characterization of CSFV isolates from two recent outbreaks in Nepal are presented. Analysis of full-length E2-encoding sequences revealed that these isolates belonged to CSFV subgenotype 2.2 and had highest genetic similarity to isolates from India. Hence, for CSFV, Nepal and India should be regarded as one epidemiological unit. Both Nepalese isolates exhibited significant sequence differences, excluding a direct epidemiological connection and suggesting that CSFV is endemic in that country.

  20. Methods and apparatus for analysis of chromatographic migration patterns

    DOEpatents

    Stockham, T.G.; Ives, J.T.

    1993-12-28

    A method and apparatus are presented for sharpening signal peaks in a signal representing the distribution of biological or chemical components of a mixture separated by a chromatographic technique such as, but not limited to, electrophoresis. A key step in the method is the use of a blind deconvolution technique, presently embodied as homomorphic filtering, to reduce the contribution of a blurring function to the signal encoding the peaks of the distribution. The invention further includes steps and apparatus directed to determination of a nucleotide sequence from a set of four such signals representing DNA sequence data derived by electrophoretic means. 16 figures.

  1. Elbow kinematics during sit-to-stand and stand-to-sit movements.

    PubMed

    Packer, T L; Wyss, U P; Costigan, P A

    1993-11-01

    The sit-to-stand and stand-to-sit movements of 10 healthy women (mean age 52.4 years) were subjected to a descriptive analysis that yielded a definition of phases, determination of the peak angles reached, maximum angular velocity during each movement, and the sequencing of key events. While subjects showed little intrasubject variability, intersubject variability was evident. Subjects differed in the joint angles and angular velocity recorded, but the sequence of flexion/extension and rotation events was unchanged. Changes in direction of flexion/extension and rotation tended to occur very close in time, if not at the same time. Copyright © 1993. Published by Elsevier Ltd.

  2. Genomic Footprints of Selective Sweeps from Metabolic Resistance to Pyrethroids in African Malaria Vectors Are Driven by Scale up of Insecticide-Based Vector Control.

    PubMed

    Barnes, Kayla G; Weedall, Gareth D; Ndula, Miranda; Irving, Helen; Mzihalowa, Themba; Hemingway, Janet; Wondji, Charles S

    2017-02-01

    Insecticide resistance in mosquito populations threatens recent successes in malaria prevention. Elucidating patterns of genetic structure in malaria vectors to predict the speed and direction of the spread of resistance is essential to get ahead of the 'resistance curve' and to avert a public health catastrophe. Here, applying a combination of microsatellite analysis, whole genome sequencing and targeted sequencing of a resistance locus, we elucidated the continent-wide population structure of a major African malaria vector, Anopheles funestus. We identified a major selective sweep in a genomic region controlling cytochrome P450-based metabolic resistance conferring high resistance to pyrethroids. This selective sweep occurred since 2002, likely as a direct consequence of scaled up vector control as revealed by whole genome and fine-scale sequencing of pre- and post-intervention populations. Fine-scaled analysis of the pyrethroid resistance locus revealed that a resistance-associated allele of the cytochrome P450 monooxygenase CYP6P9a has swept through southern Africa to near fixation, in contrast to high polymorphism levels before interventions, conferring high levels of pyrethroid resistance linked to control failure. Population structure analysis revealed a barrier to gene flow between southern Africa and other areas, which may prevent or slow the spread of the southern mechanism of pyrethroid resistance to other regions. By identifying a genetic signature of pyrethroid-based interventions, we have demonstrated the intense selective pressure that control interventions exert on mosquito populations. If this level of selection and spread of resistance continues unabated, our ability to control malaria with current interventions will be compromised.

  3. The molecular mechanism for interaction of ceruloplasmin and myeloperoxidase

    NASA Astrophysics Data System (ADS)

    Bakhautdin, Bakytzhan; Bakhautdin, Esen Göksöy

    2016-04-01

    Ceruloplasmin (Cp) is a copper-containing ferroxidase with potent antioxidant activity. Cp is expressed by hepatocytes and activated macrophages and has been known as a physiologic inhibitor of myeloperoxidase (MPO). Enzymatic activity of MPO produces anti-microbial agents and strong prooxidants such as hypochlorous acid and has the potential to damage host tissue at the sites of inflammation and infection. Thus, the Cp-MPO interaction and the resulting inhibition of MPO have previously been suggested as an important control mechanism for excessive MPO activity. Our aim in this study was to identify the minimal Cp domain or peptide that interacts with MPO. We first confirmed the Cp-MPO interaction by ELISA and surface plasmon resonance (SPR). SPR analysis of the interaction yielded an affinity of 30 nM between Cp and MPO. We then designed and synthesized 87 overlapping peptides spanning the entire amino acid sequence of Cp. Each peptide was tested for binding to MPO by direct binding ELISA. Two of the 87 peptides, P18 and P76, strongly interacted with MPO. Amino acid sequence analysis of the identified peptides revealed high sequence and structural homology between them. Further structural analysis of the Cp crystal structure with PyMOL software revealed that both peptides represent surface-exposed sites of Cp and face nearly the same direction. To confirm our finding, we raised anti-P18 antisera in rabbits and demonstrated that these antisera disrupt Cp-MPO binding and rescue MPO activity. Collectively, our results confirm the Cp-MPO interaction and identify two nearly identical sites on Cp that specifically bind MPO. We propose that inhibition of MPO by Cp requires two nearly identical sites on Cp to bind homodimeric MPO simultaneously and at an angle of at least 120 degrees, which, in turn, exerts tension on MPO and results in a conformational change.

  4. General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies

    PubMed Central

    Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong

    2013-01-01

    We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
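
    The core idea in this abstract, combining per-study score statistics and between-variant covariance matrices in place of per-variant regression coefficients, can be illustrated with a deliberately simplified sketch. The code below assumes homogeneous genetic effects across studies and implements only a weighted burden-type statistic, Q = (wᵀS)² / (wᵀΦw), where S and Φ are the sums of the study-level score vectors and covariance matrices and Q is referred to a chi-square distribution with 1 degree of freedom. The function name, inputs and numbers are hypothetical, and this is not the authors' full framework, which also covers variance component tests and heterogeneous effects.

```python
import math
from typing import List, Sequence, Tuple


def meta_burden_test(
    scores: Sequence[Sequence[float]],
    covariances: Sequence[Sequence[Sequence[float]]],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Combine study-level score statistics into one burden statistic and p-value.

    scores[k][i]         score statistic of variant i in study k
    covariances[k][i][j] covariance of the scores of variants i and j in study k
    weights[i]           burden weight of variant i
    """
    m = len(weights)
    total_score: List[float] = [0.0] * m
    total_cov: List[List[float]] = [[0.0] * m for _ in range(m)]

    # Aggregate summary statistics across studies (homogeneous-effects assumption).
    for s, phi in zip(scores, covariances):
        for i in range(m):
            total_score[i] += s[i]
            for j in range(m):
                total_cov[i][j] += phi[i][j]

    numerator = sum(w * s for w, s in zip(weights, total_score))
    variance = sum(
        weights[i] * total_cov[i][j] * weights[j]
        for i in range(m)
        for j in range(m)
    )
    if variance <= 0:
        raise ValueError("aggregated covariance gives a non-positive variance")

    q = numerator ** 2 / variance
    # Upper tail of a chi-square distribution with 1 df: P(X > q) = erfc(sqrt(q / 2)).
    p_value = math.erfc(math.sqrt(q / 2.0))
    return q, p_value


# Two hypothetical studies, two variants, equal weights (made-up numbers).
study_scores = [[1.0, 2.0], [0.5, -0.5]]
study_covs = [
    [[2.0, 0.5], [0.5, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
q_stat, p = meta_burden_test(study_scores, study_covs, [1.0, 1.0])
print(q_stat, p)
```

    Only the null model has to be fitted in each study to obtain these inputs, which is the practical advantage the abstract points to when individual rare-variant regression coefficients are unstable.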

  5. Direct typing of Canine parvovirus (CPV) from infected dog faeces by rapid mini sequencing technique.

    PubMed

    V, Pavana Jyothi; S, Akila; Selvan, Malini K; Naidu, Hariprasad; Raghunathan, Shwethaa; Kota, Sathish; Sundaram, R C Raja; Rana, Samir Kumar; Raj, G Dhinakar; Srinivasan, V A; Mohana Subramanian, B

    2016-12-01

    Canine parvovirus (CPV) is a non-enveloped single stranded DNA virus with an icosahedral capsid. Mini-sequencing based CPV typing was developed earlier to detect and differentiate all the CPV types and FPV in a single reaction. This technique was further evaluated in the present study by performing the mini-sequencing directly from fecal samples, which avoided the tedious virus isolation steps in a cell culture system. Fecal swab samples were collected from 84 dogs with enteritis symptoms suggestive of parvoviral infection from different locations across India. Seventy-six of these samples were positive by PCR; the subsequent mini-sequencing reaction typed 74 of them as type 2a virus, and 2 samples as type 2b. Additionally, 25 of the positive samples were typed by cycle sequencing of PCR products. Direct CPV typing from fecal samples using mini-sequencing showed 100% correlation with CPV typing by cycle sequencing. Moreover, CPV typing was achieved by mini-sequencing even with faintly positive PCR amplicons, which was not possible by cycle sequencing. Therefore, the mini-sequencing technique is recommended for regular epidemiological follow-up of CPV types, since it is a rapid, highly sensitive and high-capacity method for CPV typing. Copyright © 2016. Published by Elsevier B.V.

  6. Assessment of a Pan-Dermatophyte Nested-PCR Compared with Conventional Methods for Direct Detection and Identification of Dermatophytosis Agents in Animals.

    PubMed

    Piri, Fahimeh; Zarei Mahmoudabadi, Ali; Ronagh, Ali; Ahmadi, Bahram; Makimura, Koichi; Rezaei-Matehkolaei, Ali

    2018-06-26

    Conventional direct microscopy with potassium hydroxide (KOH) and culture were found to lack the ability to establish a fast and specific diagnosis of dermatophytosis. A pan-dermatophyte nested-PCR assay was developed using a novel primer pair targeting the translation elongation factor 1-α (Tef-1α) sequences for direct detection and identification of most veterinary relevant dermatophytes in animal samples suspected of dermatophytosis. A total of 140 animal skin and hair samples were subjected to direct microscopy, culture, and ITS-RFLP/ITS-sequencing of culture isolates for the detection and identification of dermatophytosis agents. Nested-PCR sequencing was performed on all the extracted DNAs using a commercial kit after dissolving the specimens by mechanical beating. Nested-PCR was positive in 90% of samples, followed by direct microscopy (85.7%) and culture (75%). The degree of agreement between nested-PCR and direct microscopy (94.4%) was higher than with culture (83.3%). In 105 culture positive cases, the measures of agreement for the identification of dermatophytosis agents were as follows: 100% between nested-PCR sequencing and ITS-RFLP/ITS-sequencing and 63.8% between nested-PCR sequencing and culture. The developed nested-PCR was faster as well as more sensitive and specific than conventional methods for detection and identification of dermatophytes in clinical samples, making it particularly suitable for epidemiological studies. This article is protected by copyright. All rights reserved.

  7. Generation of transgenic goats by pronuclear microinjection: a retrospective analysis of a commercial operation (1995-2012).

    PubMed

    Gavin, W; Blash, S; Buzzell, N; Pollock, D; Chen, L; Hawkins, N; Howe, J; Miner, K; Pollock, J; Porter, C; Schofield, M; Echelard, Y; Meade, H

    2018-02-01

    Production of transgenic founder goats involves introducing and stably integrating an engineered piece of DNA into the genome of the animal. At LFB USA, the ultimate use of these transgenic goats is for the production of recombinant human protein therapeutics in the milk of these dairy animals. The transgene or construct typically links a milk protein specific promoter sequence, the coding sequence for the gene of interest, and the necessary downstream regulatory sequences, thereby directing expression of the recombinant protein in the milk during the lactation period. Over the time period indicated (1995-2012), pronuclear microinjection was used in a number of programs to insert transgenes into 18,120 1- or 2-cell stage fertilized embryos. These embryos were transferred into 4180 synchronized recipient females, with 1934 (47%) recipients becoming pregnant, 2594 offspring generated, and 109 (4.2%) of those offspring determined to be transgenic. Even with new and improving genome editing tools now available, pronuclear microinjection is still the predominant and proven technology used in this commercial setting, supporting regulatory filings and market authorizations when producing founder transgenic animals with large transgenes (> 10 kb) such as those necessary for directing monoclonal antibody production in milk.

  8. Automatic segmentation of low-visibility moving objects through energy analysis of the local 3D spectrum

    NASA Astrophysics Data System (ADS)

    Nestares, Oscar; Miravet, Carlos; Santamaria, Javier; Fonolla Navarro, Rafael

    1999-05-01

    Automatic object segmentation in highly noisy image sequences, composed of a translating object over a background having a different motion, is achieved through joint motion-texture analysis. Local motion and/or texture is characterized by the energy of the local spatio-temporal spectrum, as different textures undergoing different translational motions display distinctive features in their 3D (x,y,t) spectra. Measurements of local spectrum energy are obtained using a bank of directional 3rd order Gaussian derivative filters in a multiresolution pyramid in space-time (10 directions, 3 resolution levels). These 30 energy measurements form a feature vector describing texture-motion for every pixel in the sequence. To improve discrimination capability and reduce computational cost, we automatically select those 4 features (channels) that best discriminate object from background, under the assumptions that the object is smaller than the background and has a different velocity or texture. In this way we reject features that are irrelevant or dominated by noise and that could yield wrong segmentation results. This method has been successfully applied to sequences with extremely low visibility and to objects that are even invisible to the eye in the absence of motion.

  9. Improved systematic tRNA gene annotation allows new insights into the evolution of mitochondrial tRNA structures and into the mechanisms of mitochondrial genome rearrangements

    PubMed Central

    Jühling, Frank; Pütz, Joern; Bernt, Matthias; Donath, Alexander; Middendorf, Martin; Florentz, Catherine; Stadler, Peter F.

    2012-01-01

    Transfer RNAs (tRNAs) are present in all types of cells as well as in organelles. tRNAs of animal mitochondria show a low level of primary sequence conservation and exhibit ‘bizarre’ secondary structures, lacking complete domains of the common cloverleaf. Such sequences are hard to detect and hence frequently missed in computational analyses and mitochondrial genome annotation. Here, we introduce an automatic annotation procedure for mitochondrial tRNA genes in Metazoa based on sequence and structural information in manually curated covariance models. The method, applied to re-annotate 1876 available metazoan mitochondrial RefSeq genomes, makes it possible to distinguish between remaining functional genes and degrading ‘pseudogenes’, even at early stages of divergence. The subsequent analysis of a comprehensive set of mitochondrial tRNA genes gives new insights into the evolution of structures of mitochondrial tRNA sequences as well as into the mechanisms of genome rearrangements. We find frequent losses of tRNA genes concentrated in basal Metazoa, frequent independent losses of individual parts of tRNA genes, particularly in Arthropoda, and widespread conserved overlaps of tRNAs in opposite reading direction. Direct evidence for several recent Tandem Duplication-Random Loss events is gained, demonstrating that this mechanism has an impact on the appearance of new mitochondrial gene orders. PMID:22139921

  10. Targeted gene panel sequencing in children with very early onset inflammatory bowel disease--evaluation and prospective analysis.

    PubMed

    Kammermeier, Jochen; Drury, Suzanne; James, Chela T; Dziubak, Robert; Ocaka, Louise; Elawad, Mamoun; Beales, Philip; Lench, Nicholas; Uhlig, Holm H; Bacchelli, Chiara; Shah, Neil

    2014-11-01

    Multiple monogenetic conditions with partially overlapping phenotypes can present with inflammatory bowel disease (IBD)-like intestinal inflammation. With novel genotype-specific therapies emerging, establishing a molecular diagnosis is becoming increasingly important. We have introduced targeted next-generation sequencing (NGS) technology as a prospective screening tool in children with very early onset IBD (VEOIBD). We evaluated the coverage of 40 VEOIBD genes in two separate cohorts undergoing targeted gene panel sequencing (TGPS) (n=25) and whole exome sequencing (WES) (n=20). TGPS revealed causative mutations in four genes (IL10RA, EPCAM, TTC37 and SKIV2L), discovered unexpected phenotypes and directly influenced clinical decision making by supporting as well as avoiding haematopoietic stem cell transplantation. TGPS resulted in significantly higher median coverage when compared with WES, fewer coverage deficiencies and improved variant detection across established VEOIBD genes. Excluding or confirming known VEOIBD genotypes should be considered early in the disease course in all cases of therapy-refractory VEOIBD, as it can have a direct impact on patient management. Combining both described NGS technologies would compensate for the limitations of WES for disease-specific application while offering the opportunity for novel gene discovery in the research setting. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  11. Characterization of proviruses cloned from mink cell focus-forming virus-infected cellular DNA.

    PubMed Central

    Khan, A S; Repaske, R; Garon, C F; Chan, H W; Rowe, W P; Martin, M A

    1982-01-01

    Two proviruses were cloned from EcoRI-digested DNA extracted from mink cells chronically infected with AKR mink cell focus-forming (MCF) 247 murine leukemia virus (MuLV), using a lambda phage host vector system. One cloned MuLV DNA fragment (designated MCF 1) contained sequences extending 6.8 kilobases from an EcoRI restriction site in the 5' long terminal repeat (LTR) to an EcoRI site located in the envelope (env) region and was indistinguishable by restriction endonuclease mapping for 5.1 kilobases (except for the EcoRI site in the LTR) from the 5' end of AKR ecotropic proviral DNA. The DNA segment extending from 5.1 to 6.8 kilobases contained several restriction sites that were not present in the AKR ecotropic provirus. A 0.5-kilobase DNA segment located at the 3' end of MCF 1 DNA contained sequences which hybridized to a xenotropic env-specific DNA probe but not to labeled ecotropic env-specific DNA. This dual character of MCF 1 proviral DNA was also confirmed by analyzing heteroduplex molecules by electron microscopy. The second cloned proviral DNA (designated MCF 2) was a 6.9-kilobase EcoRI DNA fragment which contained LTR sequences at each end and a 2.0-kilobase deletion encompassing most of the env region. The MCF 2 proviral DNA proved to be a useful reagent for detecting LTRs electron microscopically due to the presence of nonoverlapping, terminally located LTR sequences which effected its circularization with DNAs containing homologous LTR sequences. Nucleotide sequence analysis demonstrated the presence of a 104-base-pair direct repeat in the LTR of MCF 2 DNA. In contrast, only a single copy of the reiterated component of the direct repeat was present in MCF 1 DNA. Images PMID:6281459

  12. Next-generation Sequencing-based genomic profiling: Fostering innovation in cancer care?

    PubMed

    Fernandes, Gustavo S; Marques, Daniel F; Girardi, Daniel M; Braghiroli, Maria Ignez F; Coudry, Renata A; Meireles, Sibele I; Katz, Artur; Hoff, Paulo M

    2017-10-01

    With the development of next-generation sequencing (NGS) technologies, DNA sequencing has been increasingly utilized in clinical practice. Our goal was to investigate the impact of genomic evaluation on treatment decisions for heavily pretreated patients with metastatic cancer. We analyzed metastatic cancer patients from a single institution whose cancers had progressed after all available standard-of-care therapies and whose tumors underwent next-generation sequencing analysis. We determined the percentage of patients who received any therapy directed by the test, and its efficacy. From July 2013 to December 2015, 185 consecutive patients were tested using a commercially available next-generation sequencing-based test, and 157 patients were eligible. Sixty-six patients (42.0%) were female, and 91 (58.0%) were male. The mean age at diagnosis was 52.2 years, and the mean number of pre-test lines of systemic treatment was 2.7. One hundred and seventy-seven patients (95.6%) had at least one identified gene alteration. Twenty-four patients (15.2%) underwent systemic treatment directed by the test result. Of these, one patient had a complete response, four (16.7%) had partial responses, two (8.3%) had stable disease, and 17 (70.8%) had disease progression as the best result. The median progression-free survival time with matched therapy was 1.6 months, and the median overall survival was 10 months. We identified a high prevalence of gene alterations using a next-generation sequencing test. Although some benefit was associated with the matched therapy, most of the patients had disease progression as the best response, indicating the limited biological potential and unclear clinical relevance of this practice.

  13. Cultivation of Hard-To-Culture Subsurface Mercury-Resistant Bacteria and Discovery of New merA Gene Sequences▿

    PubMed Central

    Rasmussen, L. D.; Zawadsky, C.; Binnerup, S. J.; Øregaard, G.; Sørensen, S. J.; Kroer, N.

    2008-01-01

    Mercury-resistant bacteria may be important players in mercury biogeochemistry. To assess the potential for mercury reduction by two subsurface microbial communities, resistant subpopulations and their merA genes were characterized by a combined molecular and cultivation-dependent approach. The cultivation method simulated natural conditions by using polycarbonate membranes as a growth support and a nonsterile soil slurry as a culture medium. Resistant bacteria were pregrown to microcolony-forming units (mCFU) before being plated on standard medium. Compared to direct plating, culturability was increased up to 2,800 times and numbers of mCFU were similar to the total number of mercury-resistant bacteria in the soils. Denaturing gradient gel electrophoresis analysis of DNA extracted from membranes suggested stimulation of growth of hard-to-culture bacteria during the preincubation. A total of 25 different 16S rRNA gene sequences were observed, including Alpha-, Beta-, and Gammaproteobacteria; Actinobacteria; Firmicutes; and Bacteroidetes. The diversity of isolates obtained by direct plating included eight different 16S rRNA gene sequences (Alpha- and Betaproteobacteria and Actinobacteria). Partial sequencing of merA of selected isolates led to the discovery of new merA sequences. With phylum-specific merA primers, PCR products were obtained for Alpha- and Betaproteobacteria and Actinobacteria but not for Bacteroidetes and Firmicutes. The similarity to known sequences ranged between 89 and 95%. One of the sequences did not result in a match in the BLAST search. The results illustrate the power of integrating advanced cultivation methodology with molecular techniques for the characterization of the diversity of mercury-resistant populations and assessing the potential for mercury reduction in contaminated environments. PMID:18441111

  14. Single-molecule, full-length transcript sequencing provides insight into the extreme metabolism of the ruby-throated hummingbird Archilochus colubris

    PubMed Central

    Workman, Rachael E; Myrka, Alexander M; Wong, G William; Tseng, Elizabeth

    2018-01-01

    Background: Hummingbirds oxidize ingested nectar sugars directly to fuel foraging but cannot sustain this fuel use during fasting periods, such as during the night or during long-distance migratory flights. Instead, fasting hummingbirds switch to oxidizing stored lipids that are derived from ingested sugars. The hummingbird liver plays a key role in moderating energy homeostasis and this remarkable capacity for fuel switching. Additionally, liver is the principal location of de novo lipogenesis, which can occur at exceptionally high rates, such as during premigratory fattening. Yet understanding how this tissue and whole organism moderates energy turnover is hampered by a lack of information regarding how relevant enzymes differ in sequence, expression, and regulation. Findings: We generated a de novo transcriptome of the hummingbird liver using PacBio full-length cDNA sequencing (Iso-Seq), yielding 8.6Gb of sequencing data, or 2.6M reads from 4 different size fractions. We analyzed data using the SMRTAnalysis v3.1 Iso-Seq pipeline, then clustered isoforms into gene families to generate de novo gene contigs using Cogent. We performed orthology analysis to identify closely related sequences between our transcriptome and other avian and human gene sets. Finally, we closely examined homology of critical lipid metabolism genes between our transcriptome data and avian and human genomes. Conclusions: We confirmed high levels of sequence divergence within hummingbird lipogenic enzymes, suggesting a high probability of adaptive divergent function in the hepatic lipogenic pathways. Our results leverage cutting-edge technology and a novel bioinformatics pipeline to provide a first direct look at the transcriptome of this incredible organism. PMID:29618047

  15. Changes in the Composition of Drinking Water Bacterial Clone Libraries Introduced by Using Two Different 16S rRNA Gene PCR Primers

    EPA Science Inventory

    Sequence analysis of 16S rRNA gene clone libraries is a popular tool used to describe the composition of natural microbial communities. Commonly, clone libraries are developed by direct cloning of 16S rRNA gene PCR products. Different primers are often employed in the initial amp...

  17. A phylogenetic analysis of Aquifex pyrophilus

    NASA Technical Reports Server (NTRS)

    Burggraf, S.; Olsen, G. J.; Stetter, K. O.; Woese, C. R.

    1992-01-01

    The 16S rRNA of the bacterium Aquifex pyrophilus, a microaerophilic, oxygen-reducing hyperthermophile, has been sequenced directly from the PCR amplified gene. Phylogenetic analyses show the Aq. pyrophilus lineage to be probably the deepest (earliest) in the (eu)bacterial tree. The addition of this deep branching to the bacterial tree further supports the argument that the Bacteria are of thermophilic ancestry.

  18. Application of viromics: a new approach to the understanding of viral infections in humans.

    PubMed

    Ramamurthy, Mageshbabu; Sankar, Sathish; Kannangai, Rajesh; Nandagopal, Balaji; Sridharan, Gopalan

    2017-12-01

    This review is focused on exploring the strengths of modern technology driven data compiled in the areas of virus gene sequencing, virus protein structures and their implication to viral diagnosis and therapy. The information for virome analysis (viromics) is generated by the study of viral genomes (entire nucleotide sequence) and viral genes (coding for protein). Presently, the study of viral infectious diseases in terms of etiopathogenesis and development of newer therapeutics is undergoing rapid changes. Currently, viromics relies on deep sequencing, next generation sequencing (NGS) data and public domain databases like GenBank and unique virus specific databases. Two commonly used NGS platforms, Illumina and Ion Torrent, recommend maximum fragment lengths of about 300 and 400 nucleotides for analysis respectively. Direct detection of viruses in clinical samples is now evolving using these methods. Presently, there are a considerable number of good treatment options for HBV/HIV/HCV. These viruses however show development of drug resistance. The drug susceptibility regions of the genomes are sequenced and the prediction of drug resistance is now possible from 3 public domains available on the web. This has been made possible through advances in the technology with the advent of high throughput sequencing and meta-analysis through sophisticated and easy to use software and the use of high speed computers for bioinformatics. More recently NGS technology has been improved with single-molecule real-time sequencing. Here complete long reads can be obtained with less error, overcoming a limitation of NGS, which is inherently prone to software anomalies that arise in the hands of personnel without adequate training. The development in understanding the viruses in terms of their genome, pathobiology, transcriptomics and molecular epidemiology constitutes viromics. It could be stated that these developments will bring about radical changes and advancement especially in the field of antiviral therapy and diagnostic virology.

  19. Nanopore DNA Sequencing and Genome Assembly on the International Space Station.

    PubMed

    Castro-Wallace, Sarah L; Chiu, Charles Y; John, Kristen K; Stahl, Sarah E; Rubins, Kathleen H; McIntyre, Alexa B R; Dworkin, Jason P; Lupisella, Mark L; Smith, David J; Botkin, Douglas J; Stephenson, Timothy A; Juul, Sissel; Turner, Daniel J; Izquierdo, Fernando; Federman, Scot; Stryke, Doug; Somasekar, Sneha; Alexander, Noah; Yu, Guixia; Mason, Christopher E; Burton, Aaron S

    2017-12-21

    We evaluated the performance of the MinION DNA sequencer in-flight on the International Space Station (ISS), and benchmarked its performance off-Earth against the MinION, Illumina MiSeq, and PacBio RS II sequencing platforms in terrestrial laboratories. Samples contained equimolar mixtures of genomic DNA from lambda bacteriophage, Escherichia coli (strain K12, MG1655) and Mus musculus (female BALB/c mouse). Nine sequencing runs were performed aboard the ISS over a 6-month period, yielding a total of 276,882 reads with no apparent decrease in performance over time. From sequence data collected aboard the ISS, we constructed directed assemblies of the ~4.6 Mb E. coli genome, ~48.5 kb lambda genome, and a representative M. musculus sequence (the ~16.3 kb mitochondrial genome), at 100%, 100%, and 96.7% consensus pairwise identity, respectively; de novo assembly of the E. coli genome from raw reads yielded a single contig comprising 99.9% of the genome at 98.6% consensus pairwise identity. Simulated real-time analyses of in-flight sequence data using an automated bioinformatic pipeline and laptop-based genomic assembly demonstrated the feasibility of sequencing analysis and microbial identification aboard the ISS. These findings illustrate the potential for sequencing applications including disease diagnosis, environmental monitoring, and elucidating the molecular basis for how organisms respond to spaceflight.

  20. Monitoring Error Rates In Illumina Sequencing.

    PubMed

    Manley, Leigh J; Ma, Duanduan; Levine, Stuart S

    2016-12-01

    Guaranteeing high-quality next-generation sequencing data in a rapidly changing environment is an ongoing challenge. The introduction of the Illumina NextSeq 500 and the deprecation of specific metrics from Illumina's Sequencing Analysis Viewer (SAV; Illumina, San Diego, CA, USA) have made it more difficult to determine directly the baseline error rate of sequencing runs. To improve our ability to measure base quality, we have created an open-source tool to construct the Percent Perfect Reads (PPR) plot, previously provided by the Illumina sequencers. The PPR program is compatible with HiSeq 2000/2500, MiSeq, and NextSeq 500 instruments and provides an alternative to Illumina's quality value (Q) scores for determining run quality. Whereas Q scores are representative of run quality, they are often overestimated and are sourced from different look-up tables for each platform. The PPR's unique capabilities as a cross-instrument comparison device, as a troubleshooting tool, and as a tool for monitoring instrument performance can provide an increase in clarity over SAV metrics that is often crucial for maintaining instrument health. These capabilities are highlighted.
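
    The abstract does not define how the PPR plot is computed, so the following is only a minimal sketch of one plausible reading: for each sequencing cycle, the percentage of reads whose bases up to that cycle all match a known reference. The function name and the input format (ungapped read/reference pairs) are assumptions made for illustration and are not taken from the PPR program.

```python
from typing import List, Sequence, Tuple


def percent_perfect_by_cycle(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Percent of reads that are still mismatch-free at each cycle.

    `pairs` holds (read, reference_segment) tuples that are assumed to be
    aligned without gaps, so position i of the read corresponds to
    position i of the reference segment (hypothetical input format).
    """
    if not pairs:
        return []
    # Only score cycles that every read and reference segment covers.
    n_cycles = min(min(len(read), len(ref)) for read, ref in pairs)
    perfect_counts = [0] * n_cycles
    for read, ref in pairs:
        for cycle in range(n_cycles):
            if read[cycle] != ref[cycle]:
                break  # first mismatch: the read is imperfect from here on
            perfect_counts[cycle] += 1
    return [100.0 * count / len(pairs) for count in perfect_counts]
```

    Under this reading the curve can only stay flat or fall as the cycle number grows, which is what makes a sudden drop at a particular cycle easy to spot.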

  1. Data Interoperability of Whole Exome Sequencing (WES) Based Mutational Burden Estimates from Different Laboratories

    PubMed Central

    Qiu, Ping; Pang, Ling; Arreaza, Gladys; Maguire, Maureen; Chang, Ken C. N.; Marton, Matthew J.; Levitan, Diane

    2016-01-01

    Immune checkpoint inhibitors, which unleash a patient’s own T cells to kill tumors, are revolutionizing cancer treatment. Several independent studies suggest that higher non-synonymous mutational burden assessed by whole exome sequencing (WES) in tumors is associated with improved objective response, durable clinical benefit, and progression-free survival in immune checkpoint inhibitors treatment. Next-generation sequencing (NGS) is a promising technology being used in the clinic to direct patient treatment. Cancer genome WES poses a unique challenge due to tumor heterogeneity and sequencing artifacts introduced by formalin-fixed, paraffin-embedded (FFPE) tissue. In order to evaluate the data interoperability of WES data from different sources to survey tumor mutational landscape, we compared WES data of several tumor/normal matched samples from five commercial vendors. A large data discrepancy was observed from vendors’ self-reported data. Independent data analysis from vendors’ raw NGS data shows that whole exome sequencing data from qualified vendors can be combined and analyzed uniformly to derive comparable quantitative estimates of tumor mutational burden. PMID:27136543

  2. Leptospira species molecular epidemiology in the genomic era.

    PubMed

    Caimi, K; Repetto, S A; Varni, V; Ruybal, P

    2017-10-01

    Leptospirosis is a zoonotic disease whose global burden is increasing, often in relation to climatic change. Hundreds of whole genome sequences from worldwide isolates of Leptospira spp. are available nowadays, together with online tools that make it possible to assign MLST sequence types (STs) directly from raw sequence data. In this work we have applied R7L-MLST to nearly 500 genomes and a globally distributed strain collection. All 10 pathogenic species as well as intermediate species were typed using this MLST scheme. The correlation observed between STs and serogroups in our previous work still holds in this larger dataset, supporting the implementation of MLST to assist serological classification as a complementary approach. Bayesian phylogenetic analysis of concatenated sequences from R7-MLST loci allowed us to resolve taxonomic inconsistencies but also showed that events such as recombination, gene conversion or lateral gene transfer played an important role in the evolution of the Leptospira genus. Whole genome sequencing allows us to contribute suitable epidemiologic information that is useful in the design of control strategies and also in diagnostic methods for this illness. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. Ancient dna from pleistocene fossils: Preservation, recovery, and utility of ancient genetic information for quaternary research

    NASA Astrophysics Data System (ADS)

    Yang, Hong

    Until recently, recovery and analysis of genetic information encoded in ancient DNA sequences from Pleistocene fossils were impossible. Recent advances in molecular biology offered technical tools to obtain ancient DNA sequences from well-preserved Quaternary fossils and opened the possibilities to directly study genetic changes in fossil species to address various biological and paleontological questions. Ancient DNA studies involving Pleistocene fossil material and ancient DNA degradation and preservation in Quaternary deposits are reviewed. The molecular technology applied to isolate, amplify, and sequence ancient DNA is also presented. Authentication of ancient DNA sequences and technical problems associated with modern and ancient DNA contamination are discussed. As illustrated in recent studies on ancient DNA from proboscideans, it is apparent that fossil DNA sequence data can shed light on many aspects of Quaternary research such as systematics and phylogeny, conservation biology, evolutionary theory, molecular taphonomy, and forensic sciences. Improvement of molecular techniques and a better understanding of DNA degradation during fossilization are likely to build on current strengths and to overcome existing problems, making fossil DNA data a unique source of information for Quaternary scientists.

  4. Review of general algorithmic features for genome assemblers for next generation sequencers.

    PubMed

    Wajid, Bilal; Serpedin, Erchin

    2012-04-01

    In the realm of bioinformatics and computational biology, the most rudimentary data upon which all the analysis is built is the sequence data of genes, proteins and RNA. The sequence data of the entire genome is the solution to the genome assembly problem. The scope of this contribution is to provide an overview on the art of problem-solving applied within the domain of genome assembly in the next-generation sequencing (NGS) platforms. This article discusses the major genome assemblers that were proposed in the literature during the past decade by outlining their basic working principles. It is intended to act as a qualitative, not a quantitative, tutorial to all working on genome assemblers pertaining to the next generation of sequencers. We discuss the theoretical aspects of various genome assemblers, identifying their working schemes. We also discuss briefly the direction in which the area is headed towards along with discussing core issues on software simplicity. Copyright © 2012 Beijing Institute of Genomics, Chinese Academy of Sciences. Published by Elsevier Ltd. All rights reserved.

  5. Transcription Factor Map Alignment of Promoter Regions

    PubMed Central

    Blanco, Enrique; Messeguer, Xavier; Smith, Temple F; Guigó, Roderic

    2006-01-01

    We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments. PMID:16733547
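
    The idea of aligning sequences of binding-factor labels instead of nucleotides can be illustrated with a plain global alignment. This is only a sketch under simplifying assumptions: it is a textbook Needleman-Wunsch alignment over label lists, not the restriction-map-derived algorithm the authors adapted, and it ignores site positions and scores. The function name, the scoring values and the example labels are hypothetical.

```python
from typing import List, Sequence, Tuple


def align_label_maps(
    a: Sequence[str],
    b: Sequence[str],
    match: int = 2,
    mismatch: int = -1,
    gap: int = -1,
) -> Tuple[int, List[Tuple[int, int]]]:
    """Globally align two lists of TF labels.

    Returns the alignment score and the index pairs (i, j) where
    a[i] and b[j] carry the same label and were aligned to each other.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # leave a[i-1] unaligned
                score[i][j - 1] + gap,      # leave b[j-1] unaligned
            )
    # Trace back from the bottom-right corner to recover matched labels.
    matched: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            if a[i - 1] == b[j - 1]:
                matched.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    matched.reverse()
    return score[n][m], matched
```

    Two promoters with no detectable nucleotide similarity can still share an ordered run of labels, and that shared order is what an alignment of this kind picks up.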

  6. The Mouse Genomes Project: a repository of inbred laboratory mouse strain genomes.

    PubMed

    Adams, David J; Doran, Anthony G; Lilue, Jingtao; Keane, Thomas M

    2015-10-01

    The Mouse Genomes Project was initiated in 2009 with the goal of using next-generation sequencing technologies to catalogue molecular variation in the common laboratory mouse strains, and a selected set of wild-derived inbred strains. The initial sequencing and survey of sequence variation in 17 inbred strains was completed in 2011 and included a comprehensive catalogue of single nucleotide polymorphisms, short insertion/deletions, larger structural variants including their fine scale architecture and landscape of transposable element variation, and genomic sites subject to post-transcriptional alteration of RNA. From this beginning, the resource has expanded significantly to include 36 fully sequenced inbred laboratory mouse strains, a refined and updated data processing pipeline, and new variation querying and data visualisation tools which are available on the project's website ( http://www.sanger.ac.uk/resources/mouse/genomes/ ). The focus of the project is now the completion of de novo assembled chromosome sequences and strain-specific gene structures for the core strains. We discuss how the assembled chromosomes will power comparative analysis, data access tools and future directions of mouse genetics.

  7. Photothermal method of determining calorific properties of coal

    DOEpatents

    Amer, N.M.

    1983-05-16

    Predetermined amounts of heat are generated within a coal sample by directing pump light pulses of predetermined energy content into a small surface region of the sample. A beam of probe light is directed along the sample surface and deflection of the probe beam from thermally induced changes of index of refraction in the fluid medium adjacent the heated region are detected. Deflection amplitude and the phase lag of the deflection, relative to the initiating pump light pulse, are indicative of the calorific value and the porosity of the sample. The method provides rapid, accurate and nondestructive analysis of the heat producing capabilities of coal samples. In the preferred form, sequences of pump light pulses of increasing durations are directed into the sample at each of a series of minute regions situated along a raster scan path enabling detailed analysis of variations of thermal properties at different areas of the sample and at different depths.

  8. Proteome-wide Identification of Novel Ceramide-binding Proteins by Yeast Surface cDNA Display and Deep Sequencing.

    PubMed

    Bidlingmaier, Scott; Ha, Kevin; Lee, Nam-Kyung; Su, Yang; Liu, Bin

    2016-04-01

    Although the bioactive sphingolipid ceramide is an important cell signaling molecule, relatively few direct ceramide-interacting proteins are known. We used an approach combining yeast surface cDNA display and deep sequencing technology to identify novel proteins binding directly to ceramide. We identified 234 candidate ceramide-binding protein fragments and validated binding for 20. Most (17) bound selectively to ceramide, although a few (3) bound to other lipids as well. Several novel ceramide-binding domains were discovered, including the EF-hand calcium-binding motif, the heat shock chaperonin-binding motif STI1, the SCP2 sterol-binding domain, and the tetratricopeptide repeat region motif. Interestingly, four of the verified ceramide-binding proteins (HPCA, HPCAL1, NCS1, and VSNL1) and an additional three candidate ceramide-binding proteins (NCALD, HPCAL4, and KCNIP3) belong to the neuronal calcium sensor family of EF hand-containing proteins. We used mutagenesis to map the ceramide-binding site in HPCA and to create a mutant HPCA that does not bind to ceramide. We demonstrated selective binding to ceramide by mammalian cell-produced wild type but not mutant HPCA. Intriguingly, we also identified a fragment from prostaglandin D2 synthase that binds preferentially to ceramide 1-phosphate. The wide variety of proteins and domains capable of binding to ceramide suggests that many of the signaling functions of ceramide may be regulated by direct binding to these proteins. Based on the deep sequencing data, we estimate that our yeast surface cDNA display library covers ∼60% of the human proteome and our selection/deep sequencing protocol can identify target-interacting protein fragments that are present at extremely low frequency in the starting library. Thus, the yeast surface cDNA display/deep sequencing approach is a rapid, comprehensive, and flexible method for the analysis of protein-ligand interactions, particularly for the study of non-protein ligands. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  9. Cis-acting elements in the promoter region of the human aldolase C gene.

    PubMed

    Buono, P; de Conciliis, L; Olivetta, E; Izzo, P; Salvatore, F

    1993-08-16

    We investigated the cis-acting sequences involved in the expression of the human aldolase C gene by transient transfections into human neuroblastoma cells (SKNBE). We demonstrate that 420 bp of the 5'-flanking DNA direct at high efficiency the transcription of the CAT reporter gene. A deletion between -420 bp and -164 bp causes a 60% decrease of CAT activity. Gel shift and DNase I footprinting analyses revealed four protected elements: A, B, C and D. Competition analyses indicate that Sp1 or factors sharing a similar sequence specificity bind to elements A and B, but not to elements C and D. Sequence analysis shows a half palindromic ERE motif (GGTCA), in elements B and D. Region D binds a transactivating factor which appears also essential to stabilize the initiation complex.

  10. Beyond sequencing: optical mapping of DNA in the age of nanotechnology and nanoscopy.

    PubMed

    Levy-Sakin, Michal; Ebenstein, Yuval

    2013-08-01

    Next generation sequencing (NGS) is revolutionizing all fields of biological research but it fails to extract the full range of information associated with genetic material. Optical mapping of DNA grants access to genetic and epigenetic information on individual DNA molecules up to ∼1 Mbp in length. Fluorescent labeling of specific sequence motifs, epigenetic marks and other genomic information on individual DNA molecules generates a high content optical barcode along the DNA. By stretching the DNA to a linear configuration this barcode may be directly visualized by fluorescence microscopy. We discuss the advances of these methods in light of recent developments in nano-fabrication and super-resolution optical imaging (nanoscopy) and review the latest achievements of optical mapping in the context of genomic analysis. Copyright © 2013 Elsevier Ltd. All rights reserved.

  11. XS: a FASTQ read simulator.

    PubMed

    Pratas, Diogo; Pinho, Armando J; Rodrigues, João M O S

    2014-01-16

    The emerging next-generation sequencing (NGS) is bringing, besides the natural huge amounts of data, an avalanche of new specialized tools (for analysis, compression, alignment, among others) and large public and private network infrastructures. Therefore, a direct necessity of specific simulation tools for testing and benchmarking is rising, such as a flexible and portable FASTQ read simulator, without the need of a reference sequence, yet correctly prepared for producing approximately the same characteristics as real data. We present XS, a skilled FASTQ read simulation tool, flexible, portable (does not need a reference sequence) and tunable in terms of sequence complexity. It has several running modes, depending on the time and memory available, and is aimed at testing computing infrastructures, namely cloud computing of large-scale projects, and testing FASTQ compression algorithms. Moreover, XS offers the possibility of simulating the three main FASTQ components individually (headers, DNA sequences and quality-scores). XS provides an efficient and convenient method for fast simulation of FASTQ files, such as those from Ion Torrent (currently uncovered by other simulators), Roche-454, Illumina and ABI-SOLiD sequencing machines. This tool is publicly available at http://bioinformatics.ua.pt/software/xs/.
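
    The idea of reference-free FASTQ simulation can be illustrated with a minimal sketch. This is not the XS algorithm: the function name `simulate_fastq`, the uniform base model, and the quality range are all assumptions made for illustration. It only shows the three FASTQ components the abstract mentions (header, DNA sequence, quality scores) being generated independently.

```python
import random


def simulate_fastq(n_reads, read_len, seed=0):
    """Return n_reads random FASTQ records as one string.

    Hypothetical helper for illustration only; it is not part of XS.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    records = []
    for i in range(n_reads):
        # Component 1: header line, starting with '@'
        header = f"@sim_read_{i + 1}"
        # Component 2: DNA sequence drawn uniformly from A, C, G, T
        seq = "".join(rng.choice("ACGT") for _ in range(read_len))
        # Component 3: quality string, Phred+33 encoding of scores 2..40
        qual = "".join(chr(33 + rng.randint(2, 40)) for _ in range(read_len))
        records.append(f"{header}\n{seq}\n+\n{qual}")
    return "\n".join(records) + "\n"


fastq_text = simulate_fastq(n_reads=3, read_len=20)
print(fastq_text, end="")
```

    A realistic simulator would replace the uniform draws with models of sequence complexity and platform-specific quality profiles, which is where a tool such as XS adds its value.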

  12. Optimization process planning using hybrid genetic algorithm and intelligent search for job shop machining.

    PubMed

    Salehi, Mojtaba; Bahreininejad, Ardeshir

    2011-08-01

    Optimization of process planning is considered as the key technology for computer-aided process planning which is a rather complex and difficult procedure. A good process plan of a part is built up based on two elements: (1) the optimized sequence of the operations of the part; and (2) the optimized selection of the machine, cutting tool and Tool Access Direction (TAD) for each operation. In the present work, the process planning is divided into preliminary planning, and secondary/detailed planning. In the preliminary stage, based on the analysis of order and clustering constraints as a compulsive constraint aggregation in operation sequencing and using an intelligent searching strategy, the feasible sequences are generated. Then, in the detailed planning stage, using the genetic algorithm which prunes the initial feasible sequences, the optimized operation sequence and the optimized selection of the machine, cutting tool and TAD for each operation based on optimization constraints as an additive constraint aggregation are obtained. The main contribution of this work is the optimization of sequence of the operations of the part, and optimization of machine selection, cutting tool and TAD for each operation using the intelligent search and genetic algorithm simultaneously.
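
    The preliminary stage described above, generating feasible operation sequences under ordering constraints, can be sketched as follows. This is a brute-force illustration, not the intelligent search or the genetic algorithm of the paper; the operation names and the `feasible_sequences` helper are hypothetical, and clustering constraints are left out.

```python
from itertools import permutations


def feasible_sequences(operations, precedence):
    """Return every ordering of operations that respects the precedence pairs.

    precedence is a list of (a, b) pairs meaning "a must come before b".
    Exhaustive enumeration is only practical for a handful of operations.
    """
    feasible = []
    for order in permutations(operations):
        position = {op: i for i, op in enumerate(order)}
        if all(position[a] < position[b] for a, b in precedence):
            feasible.append(order)
    return feasible


# Hypothetical part with four operations and two ordering constraints
ops = ["drill", "ream", "mill", "tap"]
must_precede = [("drill", "ream"), ("drill", "tap")]
sequences = feasible_sequences(ops, must_precede)
print(len(sequences))  # 8 of the 24 permutations are feasible
```

    In the approach the abstract describes, a feasible set like this would then be the starting population that the genetic algorithm prunes and optimizes.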

  13. Optimization process planning using hybrid genetic algorithm and intelligent search for job shop machining

    PubMed Central

    Salehi, Mojtaba

    2010-01-01

    Optimization of process planning is considered as the key technology for computer-aided process planning which is a rather complex and difficult procedure. A good process plan of a part is built up based on two elements: (1) the optimized sequence of the operations of the part; and (2) the optimized selection of the machine, cutting tool and Tool Access Direction (TAD) for each operation. In the present work, the process planning is divided into preliminary planning, and secondary/detailed planning. In the preliminary stage, based on the analysis of order and clustering constraints as a compulsive constraint aggregation in operation sequencing and using an intelligent searching strategy, the feasible sequences are generated. Then, in the detailed planning stage, using the genetic algorithm which prunes the initial feasible sequences, the optimized operation sequence and the optimized selection of the machine, cutting tool and TAD for each operation based on optimization constraints as an additive constraint aggregation are obtained. The main contribution of this work is the optimization of sequence of the operations of the part, and optimization of machine selection, cutting tool and TAD for each operation using the intelligent search and genetic algorithm simultaneously. PMID:21845020

  14. A short review of variants calling for single-cell-sequencing data with applications.

    PubMed

    Wei, Zhuohui; Shu, Chang; Zhang, Changsheng; Huang, Jingying; Cai, Hongmin

    2017-11-01

    The field of single-cell sequencing is rapidly expanding, and many techniques have been developed in the past decade. With this technology, biologists can study not only the heterogeneity between two adjacent cells in the same tissue or organ, but also the evolutionary relationships and degenerative processes in a single cell. Calling variants is the main purpose in analyzing single cell sequencing (SCS) data. Currently, some popular methods used for bulk-cell-sequencing data analysis are adapted directly for use with SCS data. However, SCS requires an extra step of genome amplification to accumulate enough material to satisfy sequencing needs. The amplification yields large biases and thus poses a challenge for using the bulk-cell-sequencing methods. In order to provide guidance for the development of specialized analysis methods as well as for using currently developed tools for SCS, this paper aims to bridge the gap. In this paper, we first introduced two popular genome amplification methods and compared their capabilities. Then we introduced a few popular models for calling single-nucleotide polymorphisms and copy-number variations. Finally, breakthrough applications of SCS were summarized to demonstrate its potential in researching cell evolution. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Mutational analysis of the gag-pol junction of Moloney murine leukemia virus: requirements for expression of the gag-pol fusion protein.

    PubMed Central

    Felsenstein, K M; Goff, S P

    1992-01-01

    The gag-pol polyprotein of the murine and feline leukemia viruses is expressed by translational readthrough of a UAG terminator codon at the 3' end of the gag gene. To explore the cis-acting sequence requirements for the readthrough event in vivo, we generated a library of mutants of the Moloney murine leukemia virus with point mutations near the terminator codon and tested the mutant viral DNAs for the ability to direct synthesis of the gag-pol fusion protein and formation of infectious virus. The analysis showed that sequences 3' to the terminator are necessary and sufficient for the process. The results do not support a role for one proposed stem-loop structure that includes the terminator but are consistent with the involvement of another stem-loop 3' to the terminator. One mutant, containing two compensatory changes in this stem structure, was temperature sensitive for replication and for formation of the gag-pol protein. The results suggest that RNA sequence and structure are critical determinants of translational readthrough in vivo. PMID:1404606

  16. A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications.

    PubMed

    Haque, Ashraful; Engel, Jessica; Teichmann, Sarah A; Lönnberg, Tapio

    2017-08-18

    RNA sequencing (RNA-seq) is a genomic approach for the detection and quantitative analysis of messenger RNA molecules in a biological sample and is useful for studying cellular responses. RNA-seq has fueled much discovery and innovation in medicine over recent years. For practical reasons, the technique is usually conducted on samples comprising thousands to millions of cells. However, this has hindered direct assessment of the fundamental unit of biology, the cell. Since the first single-cell RNA-sequencing (scRNA-seq) study was published in 2009, many more have been conducted, mostly by specialist laboratories with unique skills in wet-lab single-cell genomics, bioinformatics, and computation. However, with the increasing commercial availability of scRNA-seq platforms, and the rapid ongoing maturation of bioinformatics approaches, a point has been reached where any biomedical researcher or clinician can use scRNA-seq to make exciting discoveries. In this review, we present a practical guide to help researchers design their first scRNA-seq studies, including introductory information on experimental hardware, protocol choice, quality control, data analysis and biological interpretation.

  17. Spindle Epithelial Tumor with Thymus-Like Differentiation (SETTLE): A Next-Generation Sequencing Study.

    PubMed

    Stevens, Todd M; Morlote, Diana; Swensen, Jeff; Ellis, Michelle; Harada, Shuko; Spencer, Sharon; Prieto-Granada, Carlos N; Folpe, Andrew L; Gatalica, Zoran

    2018-05-07

    Spindle epithelial tumor with thymus-like differentiation (SETTLE) is a malignant biphasic neoplasm of the thyroid or neck with propensity for late metastasis. Unlike synovial sarcoma, its main morphologic mimic, SETTLE lacks synovial sarcoma-associated translocations. A single case of SETTLE has shown a KRAS mutation but to date no comprehensive next generation sequencing studies of this rare neoplasm have been undertaken. Herein, we subjected 5 well defined cases of SETTLE to direct sequence analysis of 592 genes and fusion gene analysis of 52 genes frequently rearranged in human cancers. We identified one case with two pathogenic variants in the KMT2D gene, one being in an intron splice site (c.674-1A>G) and the other being a frameshift variant (p.M2829fs). This same case also had a pathogenic nonsense variant in the KMT2C gene (p.R1237*). A second case of SETTLE carried a pathogenic NRAS missense variant, Q61R. No other molecular alterations, microsatellite instability, gene fusions or amplifications were identified.

  18. Genetic characterization of a new astrovirus detected in dogs suffering from diarrhoea.

    PubMed

    Toffan, Anna; Jonassen, Christine Monceyron; De Battisti, Cristian; Schiavon, Eliana; Kofstad, Tone; Capua, Ilaria; Cattoli, Giovanni

    2009-10-20

    Astroviruses have been described in several animal species, frequently associated with diarrhoea, especially in young animals. In dogs, astrovirus-like particles have been observed sporadically and very little is known about their epidemiology and characteristics. In this paper, we describe the detection of astrovirus-like particles in symptomatic puppies. Furthermore, for the first time in this species, the presumptive identification made by electron microscopy was confirmed by genetic analysis of the viral RNA conducted directly on the clinical specimens. Genetic sequences of ORF2 (2443 nt), encoding the capsid protein, and a partial sequence of ORF1b (346 nt), encoding the viral polymerase, identified the viruses as members of the family Astroviridae. The phylogenetic analysis clearly clustered canine astroviruses in the genus Mamastrovirus. The closest relative similarities were with a cluster comprising human, porcine and feline astroviruses, based on the ORF2 sequences available. Based on the species definition for astroviruses and on the data obtained in this study, we suggest a new species of astrovirus - canine astrovirus, CaAstV - to be included in the genus Mamastrovirus.

  19. Neutralizing antibodies against West Nile virus identified directly from human B cells by single-cell analysis and next generation sequencing

    PubMed Central

    Tsioris, Konstantinos; Gupta, Namita T.; Ogunniyi, Adebola O.; Zimnisky, Ross M.; Qian, Feng; Yao, Yi; Wang, Xiaomei; Stern, Joel N. H.; Chari, Raj; Briggs, Adrian W.; Clouser, Christopher R.; Vigneault, Francois; Church, George M.; Garcia, Melissa N.; Murray, Kristy O.; Montgomery, Ruth R.; Kleinstein, Steven H.; Love, J. Christopher

    2015-01-01

    West Nile virus infection (WNV) is an emerging mosquito-borne disease that can lead to severe neurological illness and currently has no available treatment or vaccine. Using microengraving, an integrated single-cell analysis method, we analyzed a cohort of subjects infected with WNV - recently infected and post-convalescent subjects - and efficiently identified four novel WNV neutralizing antibodies. We also assessed the humoral response to WNV on a single-cell and repertoire level by integrating next generation sequencing (NGS) into our analysis. The results from single-cell analysis indicate persistence of WNV-specific memory B cells and antibody-secreting cells in post-convalescent subjects. These cells exhibited class-switched antibody isotypes. Furthermore, the results suggest that the antibody response itself does not predict the clinical severity of the disease (asymptomatic or symptomatic). Using the nucleotide coding sequences for WNV-specific antibodies derived from single cells, we revealed the ontogeny of expanded WNV-specific clones in the repertoires of recently infected subjects through NGS and bioinformatic analysis. This analysis also indicated that the humoral response to WNV did not depend on an anamnestic response, due to an unlikely previous exposure to the virus. The innovative and integrative approach presented here to analyze the evolution of neutralizing antibodies from natural infection on a single-cell and repertoire level can also be applied to vaccine studies, and could potentially aid the development of therapeutic antibodies and our basic understanding of other infectious diseases. PMID:26481611

  20. Neutralizing antibodies against West Nile virus identified directly from human B cells by single-cell analysis and next generation sequencing.

    PubMed

    Tsioris, Konstantinos; Gupta, Namita T; Ogunniyi, Adebola O; Zimnisky, Ross M; Qian, Feng; Yao, Yi; Wang, Xiaomei; Stern, Joel N H; Chari, Raj; Briggs, Adrian W; Clouser, Christopher R; Vigneault, Francois; Church, George M; Garcia, Melissa N; Murray, Kristy O; Montgomery, Ruth R; Kleinstein, Steven H; Love, J Christopher

    2015-12-01

    West Nile virus (WNV) infection is an emerging mosquito-borne disease that can lead to severe neurological illness and currently has no available treatment or vaccine. Using microengraving, an integrated single-cell analysis method, we analyzed a cohort of subjects infected with WNV - recently infected and post-convalescent subjects - and efficiently identified four novel WNV neutralizing antibodies. We also assessed the humoral response to WNV on a single-cell and repertoire level by integrating next generation sequencing (NGS) into our analysis. The results from single-cell analysis indicate persistence of WNV-specific memory B cells and antibody-secreting cells in post-convalescent subjects. These cells exhibited class-switched antibody isotypes. Furthermore, the results suggest that the antibody response itself does not predict the clinical severity of the disease (asymptomatic or symptomatic). Using the nucleotide coding sequences for WNV-specific antibodies derived from single cells, we revealed the ontogeny of expanded WNV-specific clones in the repertoires of recently infected subjects through NGS and bioinformatic analysis. This analysis also indicated that the humoral response to WNV did not depend on an anamnestic response, due to an unlikely previous exposure to the virus. The innovative and integrative approach presented here to analyze the evolution of neutralizing antibodies from natural infection on a single-cell and repertoire level can also be applied to vaccine studies, and could potentially aid the development of therapeutic antibodies and our basic understanding of other infectious diseases.

  1. The Influence of Primary and Secondary DNA Structure in Deletion and Duplication between Direct Repeats in Escherichia Coli

    PubMed Central

    Trinh, T. Q.; Sinden, R. R.

    1993-01-01

    We describe a system to measure the frequency of both deletions and duplications between direct repeats. Short 17- and 18-bp palindromic and nonpalindromic DNA sequences were cloned into the EcoRI site within the chloramphenicol acetyltransferase gene of plasmids pBR325 and pJT7. This creates an insert between direct repeated EcoRI sites and results in a chloramphenicol-sensitive phenotype. Selection for chloramphenicol resistance was utilized to select chloramphenicol resistant revertants that included those with precise deletion of the insert from plasmid pBR325 and duplication of the insert in plasmid pJT7. The frequency of deletion or duplication varied more than 500-fold depending on the sequence of the short sequence inserted into the EcoRI site. For the nonpalindromic inserts, multiple internal direct repeats and the length of the direct repeats appear to influence the frequency of deletion. Certain palindromic DNA sequences with the potential to form DNA hairpin structures that might stabilize the misalignment of direct repeats had a high frequency of deletion. Other DNA sequences with the potential to form structures that might destabilize misalignment of direct repeats had a very low frequency of deletion. Duplication mutations occurred at the highest frequency when the DNA between the direct repeats contained no direct or inverted repeats. The presence of inverted repeats dramatically reduced the frequency of duplications. The results support the slippage-misalignment model, suggesting that misalignment occurring during DNA replication leads to deletion and duplication mutations. The results also support the idea that the formation of DNA secondary structures during DNA replication can facilitate and direct specific mutagenic events. PMID:8325478

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.

    Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.

  3. RNA metabolism in the regulation of protein synthesis in plants. Progress report, 1975-1979

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Key, J L

    1979-01-01

    The major objectives of the research for the contract period covered by this report were (1) to gain an insight into the sequence organization of the DNA of soybean, emphasizing the arrangement of single copy or unique sequences and repetitive sequences of DNA throughout the genome, (2) to characterize soybean RNAs relative to nucleotide sequence complexity and kinetics of synthesis and turnover of poly(A)+ mRNA, and (3) to study ribosomal proteins directed to an analysis of possible changes in proteins which relate to the activation of 80S ribosomes and thus mRNA utilization and protein synthesis in response to environmental stimuli. Even with greatly reduced funding compared to that requested, objectives 1 and 2 were substantially accomplished. Because of reduced funding and the 20-month no cost extension, relatively little progress was made on objective 3. Accordingly objectives 1 and 2 will be summarized in some detail; a brief account of progress is presented on objective 3.

  4. Rift Valley Fever, Sudan, 2007 and 2010

    PubMed Central

    Aradaib, Imadeldin E.; Erickson, Bobbie R.; Elageb, Rehab M.; Khristova, Marina L.; Carroll, Serena A.; Elkhidir, Isam M.; Karsany, Mubarak E.; Karrar, AbdelRahim E.; Elbashir, Mustafa I.

    2013-01-01

    To elucidate whether Rift Valley fever virus (RVFV) diversity in Sudan resulted from multiple introductions or from acquired changes over time from 1 introduction event, we generated complete genome sequences from RVFV strains detected during the 2007 and 2010 outbreaks. Phylogenetic analyses of small, medium, and large RNA segment sequences indicated several genetic RVFV variants were circulating in Sudan, which all grouped into Kenya-1 or Kenya-2 sublineages from the 2006–2008 eastern Africa epizootic. Bayesian analysis of sequence differences estimated that diversity among the 2007 and 2010 Sudan RVFV variants shared a most recent common ancestor circa 1996. The data suggest multiple introductions of RVFV into Sudan as part of sweeping epizootics from eastern Africa. The sequences indicate recent movement of RVFV and support the need for surveillance to recognize when and where RVFV circulates between epidemics, which can make data from prediction tools easier to interpret and preventive measures easier to direct toward high-risk areas. PMID:23347790

  5. Detection and identification of cutaneous leishmaniasis isolates by culture, Polymerase chain reaction and sequence analyses in Syrian and Central Anatolia patients.

    PubMed

    Beyhan, Yunus E; Karakus, Mehmet; Karagoz, Alper; Mungan, Mesut; Ozkan, Aysegul T; Hokelek, Murat

    2017-09-01

    To characterize the cutaneous leishmaniasis (CL) isolates of Syrian and Central Anatolia patients at the species level. Methods: Skin scrapings of 3 patients (2 Syrian, 1 Turkish) were taken and examined by direct examination, culture in Novy-MacNeal-Nicolle (NNN) medium, internal transcribed spacer polymerase chain reaction (PCR) and sequence analysis. Results: According to microscopic examination, culture and PCR methods, 3 samples were positive. Sequencing identified all isolates in the study as Leishmania tropica. The same genotype was detected in the 3 isolates, and the nucleotide sequence was submitted to GenBank under accession number KP689599. Conclusion: This finding could give information about the transmission of CL between Turkey and Syria. Because of the Syrian civil war, most Syrian citizens are circulating in Turkey and different parts of Europe, which can increase the risk of spreading the disease. Preventive measures must therefore be taken urgently.

  6. Cysteine-containing peptide tag for site-specific conjugation of proteins

    DOEpatents

    Backer, Marina V.; Backer, Joseph M.

    2008-04-08

    The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety bound to the targeting moiety; the biological conjugate having a covalent bond between the thiol group of SEQ ID NO:2 and a functional group in the binding moiety. The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety that comprises an adapter protein, the adapter protein having a thiol group; the biological conjugate having a disulfide bond between the thiol group of SEQ ID NO:2 and the thiol group of the adapter protein. The present invention is also directed to biological sequences employed in the above biological conjugates, as well as pharmaceutical preparations and methods using the above biological conjugates.

  7. Cysteine-containing peptide tag for site-specific conjugation of proteins

    DOEpatents

    Backer, Marina V.; Backer, Joseph M.

    2010-10-05

    The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety bound to the targeting moiety; the biological conjugate having a covalent bond between the thiol group of SEQ ID NO:2 and a functional group in the binding moiety. The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety that comprises an adapter protein, the adapter protein having a thiol group; the biological conjugate having a disulfide bond between the thiol group of SEQ ID NO:2 and the thiol group of the adapter protein. The present invention is also directed to biological sequences employed in the above biological conjugates, as well as pharmaceutical preparations and methods using the above biological conjugates.

  8. Are special read alignment strategies necessary and cost-effective when handling sequencing reads from patient-derived tumor xenografts?

    PubMed

    Tso, Kai-Yuen; Lee, Sau Dan; Lo, Kwok-Wai; Yip, Kevin Y

    2014-12-23

    Patient-derived tumor xenografts in mice are widely used in cancer research and have become important in developing personalized therapies. When these xenografts are subject to DNA sequencing, the samples could contain various amounts of mouse DNA. It has been unclear how the mouse reads would affect data analyses. We conducted comprehensive simulations to compare three alignment strategies at different mutation rates, read lengths, sequencing error rates, human-mouse mixing ratios and sequenced regions. We also sequenced a nasopharyngeal carcinoma xenograft and a cell line to test how the strategies work on real data. We found the "filtering" and "combined reference" strategies performed better than aligning reads directly to the human reference in terms of alignment and variant calling accuracies. The combined reference strategy was particularly good at reducing false negative variant calls without significantly increasing the false positive rate. In some scenarios the performance gain of these two special handling strategies was too small for special handling to be cost-effective, but it was found crucial when false non-synonymous SNVs should be minimized, especially in exome sequencing. Our study systematically analyzes the effects of mouse contamination in the sequencing data of human-in-mouse xenografts. Our findings provide information for designing data analysis pipelines for these data.

  9. Hypoxia-induced oxidative base modifications in the VEGF hypoxia-response element are associated with transcriptionally active nucleosomes.

    PubMed

    Ruchko, Mykhaylo V; Gorodnya, Olena M; Pastukh, Viktor M; Swiger, Brad M; Middleton, Natavia S; Wilson, Glenn L; Gillespie, Mark N

    2009-02-01

    Reactive oxygen species (ROS) generated in hypoxic pulmonary artery endothelial cells cause transient oxidative base modifications in the hypoxia-response element (HRE) of the VEGF gene that bear a conspicuous relationship to induction of VEGF mRNA expression (K.A. Ziel et al., FASEB J. 19, 387-394, 2005). If such base modifications are indeed linked to transcriptional regulation, then they should be detected in HRE sequences associated with transcriptionally active nucleosomes. Southern blot analysis of the VEGF HRE associated with nucleosome fractions prepared by micrococcal nuclease digestion indicated that hypoxia redistributed some HRE sequences from multinucleosomes to transcriptionally active mono- and dinucleosome fractions. A simple PCR method revealed that VEGF HRE sequences harboring oxidative base modifications were found exclusively in mononucleosomes. Inhibition of hypoxia-induced ROS generation with myxothiazol prevented formation of oxidative base modifications but not the redistribution of HRE sequences into mono- and dinucleosome fractions. The histone deacetylase inhibitor trichostatin A caused retention of HRE sequences in compacted nucleosome fractions and prevented formation of oxidative base modifications. These findings suggest that the hypoxia-induced oxidant stress directed at the VEGF HRE requires the sequence to be repositioned into mononucleosomes and support the prospect that oxidative modifications in this sequence are an important step in transcriptional activation.

  10. Sequence-dependent DNA deformability studied using molecular dynamics simulations.

    PubMed

    Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori

    2007-01-01

    Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained from the simulations is consistent with that obtained from crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can depend strongly on the flanking base pairs. The xATx tetramers are the most rigid and are not affected by the flanking base pairs, whereas the xYRx tetramers show the greatest flexibility and change their conformations depending on the base pairs at both ends, suggesting that tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein-DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein-DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids.
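
    The figure of 136 tetrameric sequences can be checked with a short enumeration. The sketch below assumes, as is usual for double-stranded DNA, that a tetramer and its reverse complement describe the same duplex and are counted once; the helper names are illustrative only.

```python
from itertools import product

# Translation table mapping each base to its Watson-Crick partner
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_kmers(k):
    """List k-mers that are distinct up to reverse complementation."""
    seen = set()
    for bases in product("ACGT", repeat=k):
        kmer = "".join(bases)
        # Keep one canonical representative per strand pair
        seen.add(min(kmer, reverse_complement(kmer)))
    return sorted(seen)


tetramers = unique_kmers(4)
print(len(tetramers))  # 136
```

    Of the 256 possible tetramers, 16 are their own reverse complement, which gives 16 + (256 - 16) / 2 = 136. The same function returns the familiar 10 unique dimeric steps for k = 2.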

  11. Histoimmunogenetics Markup Language 1.0: Reporting next generation sequencing-based HLA and KIR genotyping.

    PubMed

    Milius, Robert P; Heuer, Michael; Valiga, Daniel; Doroschak, Kathryn J; Kennedy, Caleb J; Bolon, Yung-Tsi; Schneider, Joel; Pollack, Jane; Kim, Hwa Ran; Cereb, Nezih; Hollenbach, Jill A; Mack, Steven J; Maiers, Martin

    2015-12-01

    We present an electronic format for exchanging data for HLA and KIR genotyping with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, NGS is supported by new XML structures to capture additional NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. A full genotype, consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are externally referenced. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity in a complete and accurate fashion. HML also continues to enable the transmission of legacy methods (e.g. site-specific oligonucleotide, sequence-specific priming, and Sequence Based Typing (SBT)), adding features such as allowing multiple group-specific sequencing primers, and fully leveraging techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
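
    The "hierarchical set of delimiters" used by Genotype List Strings can be illustrated with a small parser. The delimiter set and precedence used here (`^` between loci, `|` between ambiguous genotypes, `+` between gene copies, `~` between phased alleles, `/` between ambiguous alleles) follow the GL String convention as recalled here and should be treated as an assumption rather than something stated in the abstract; the function name and the example string are hypothetical.

```python
# Delimiters from lowest to highest binding strength (assumed precedence)
DELIMITERS = ["^", "|", "+", "~", "/"]


def parse_gl_string(text, level=0):
    """Split a GL String into nested lists, one nesting level per delimiter."""
    if level == len(DELIMITERS):
        return text  # innermost level: a single allele name
    parts = text.split(DELIMITERS[level])
    return [parse_gl_string(part, level + 1) for part in parts]


# Hypothetical single-locus genotype with one ambiguous allele call
example = "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02"
parsed = parse_gl_string(example)

# parsed[locus][genotype][gene copy][phased allele] is a list of alleles
print(parsed[0][0][0][0])  # ['HLA-A*01:01', 'HLA-A*01:02']
print(parsed[0][0][1][0])  # ['HLA-A*24:02']
```

    Splitting on the weakest delimiter first mirrors the hierarchy: loci contain genotypes, genotypes contain gene copies, and so on down to individual alleles.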

  12. Mining dynamic noteworthy functions in software execution sequences

    PubMed Central

    Huang, Guoyan; Wang, Yuqian; He, Haitao; Ren, Jiadong

    2017-01-01

    As the quality of crucial entities can directly affect that of software, their identification and protection become an important premise for effective software development, management, maintenance and testing, which thus contribute to improving the software quality and its attack-defending ability. Most analysis and evaluation on important entities like codes-based static structure analysis are on the destruction of the actual software running. In this paper, from the perspective of the software execution process, we propose an approach to mine dynamic noteworthy functions (DNFM) in software execution sequences. First, by decompiling the software and tracking stack changes, execution traces composed of a series of function addresses were acquired. These traces were then modeled as execution sequences and simplified to obtain simplified sequences (SFS), from which patterns were extracted with a pattern extraction (PE) algorithm. After that, the evaluation indicators inner-importance and inter-importance were designed to measure the noteworthiness of functions in the DNFM algorithm. Finally, the functions were sorted by their noteworthiness. The experimental results were compared with those of two traditional complex network-based node mining methods, namely PageRank and DegreeRank. The results show that the DNFM method can mine noteworthy functions in software effectively and precisely. PMID:28278276

  13. Complement component 3: characterization and association with mastitis resistance in Egyptian water buffalo and cattle.

    PubMed

    El-Halawany, Nermin; Abd-El-Monsif, Shawky A; Al-Tohamy Ahmed, F M; Hegazy, Lamees; Abdel-Shafy, Hamdy; Abdel-Latif, Magdy A; Ghazi, Yasser A; Neuhoff, Christiane; Salilew-Wondim, Dessie; Schellander, Karl

    2017-03-01

    Mastitis is an infectious disease of the mammary gland that leads to reduced milk production and changes in milk composition. Complement component C3 plays a major role as a central molecule of the complement cascade involved in the killing of microorganisms, either directly or in cooperation with phagocytic cells. C3 cDNAs were isolated from Egyptian buffalo and cattle, sequenced, and characterized. The C3 cDNA sequences of buffalo and cattle consist of 5025 and 5019 bp, respectively. Buffalo and cattle C3 cDNAs share 99% sequence identity with each other. The 4986 bp open reading frame in buffalo encodes a putative protein of 1661 amino acids, as in cattle, and includes all the functional domains. Further analysis of the C3 cDNA sequences detected six novel single-nucleotide polymorphisms (SNPs) in buffalo and three novel SNPs in cattle. The association analysis of the detected SNPs with milk somatic cell score as an indicator of mastitis revealed that the most significant association in buffalo was found for the C>A substitution (ss: 1752816097) in exon 27, whereas in cattle it was found for the C>T substitution (ss: 1752816085) in exon 12. Our findings provide preliminary information about the contribution of C3 polymorphisms to mastitis resistance in buffalo and cattle.

  14. Molecular cloning of a cDNA encoding the glycoprotein of hen oviduct microsomal signal peptidase.

    PubMed Central

    Newsome, A L; McLean, J W; Lively, M O

    1992-01-01

    Detergent-solubilized hen oviduct signal peptidase has been characterized previously as an apparent complex of a 19 kDa protein and a 23 kDa glycoprotein (GP23) [Baker & Lively (1987) Biochemistry 26, 8561-8567]. A cDNA clone encoding GP23 from a chicken oviduct lambda gt11 cDNA library has now been characterized. The cDNA encodes a protein of 180 amino acid residues with a single site for asparagine-linked glycosylation that has been directly identified by amino acid sequence analysis of a tryptic-digest peptide containing the glycosylated site. Immunoblot analysis reveals cross-reactivity with a dog pancreas protein. Comparison of the deduced amino acid sequence of GP23 with the 22/23 kDa glycoprotein of dog microsomal signal peptidase [Shelness, Kanwar & Blobel (1988) J. Biol. Chem. 263, 17063-17070], one of five proteins associated with this enzyme, reveals that the amino acid sequences are 90% identical. Thus the signal peptidase glycoprotein is as highly conserved as the sequences of cytochromes c and b from these same species and is likely to be found in a similar form in many, if not all, vertebrate species. The data also show conclusively that the dog and avian signal peptidases have at least one protein subunit in common. PMID:1546959

  15. [Genetic characterization analysis on the first imported measles virus of genotype D8 in Chinese mainland].

    PubMed

    Sun, Xiao-Dong; Li, Chong-Shan; Tang, Xian; Li, Zhi; Zhang, Yan; Tang, Wei; Wang, Jing; Wang, Hui-Ling; Yang, Yan-Ji; Li, Jia; Yuan, Zheng-An; Xu, Wen-Bo

    2013-11-01

    This study analyzed the genetic characteristics of the first imported measles virus of genotype D8 in the Chinese mainland. Sera were collected from suspected measles patients to detect IgM antibody by ELISA. Throat swabs were cultured in the Vero/SLAM cell line to obtain measles virus isolates. Part of the nucleotide sequence of the 3' terminus of the nucleoprotein (N) gene of these isolates was amplified by RT-PCR, and the amplicons were directly sequenced. The phylogenetic analysis was based on a nucleotide sequence of about 456 base pairs at the 3' terminus of the nucleoprotein (N) gene. A total of 1,105 suspected measles cases were reported in Shanghai in 2012, including 590 confirmed cases and 2 clinical cases. The reported morbidity was 2.52 per one hundred thousand. A total of 247 measles viruses were isolated from 984 throat swab specimens. Most of them belonged to sub-genotype H1a, except Shanghai12-239, which was genotype D8. The homology of the nucleotide and amino acid sequences between Shanghai12-239 and the WHO reference strain (Manchester.UNK30.94(D8)AF280803) was 97.8% and 98.6%, respectively. The corresponding values between Shanghai12-239 and WHO reference strains of other genotypes were 89.6%-94.5% and 88.7%-95.3%.

  16. High frequency of hepatitis E virus infection in swine from South Brazil and close similarity to human HEV isolates.

    PubMed

    Passos-Castilho, Ana Maria; Granato, Celso Francisco Hernandes

    Hepatitis E virus is responsible for acute and chronic liver infections worldwide. Swine hepatitis E virus has been isolated in Brazil, and a probable zoonotic transmission has been described, although data are still scarce. The aim of this study was to investigate the frequency of hepatitis E virus infection in pigs from a small-scale farm in the rural area of Paraná State, South Brazil. Fecal samples were collected from 170 pigs and screened for hepatitis E virus RNA using a duplex real-time RT-PCR targeting a highly conserved 70-nt sequence within overlapping parts of ORF2 and ORF3 as well as a 113-nt sequence of ORF2. Positive samples with high viral loads were subjected to direct sequencing and phylogenetic analysis. Hepatitis E virus RNA was detected in 34 (20.0%) of the 170 pigs following positive results in at least one set of screening real-time RT-PCR primers and probes. The swine hepatitis E virus strains clustered with the genotype hepatitis E virus-3b reference sequences in the phylogenetic analysis and showed close similarity to human hepatitis E virus isolates previously reported in Brazil. Copyright © 2017 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.

  17. A generalized analysis of hydrophobic and loop clusters within globular protein sequences

    PubMed Central

    Eudes, Richard; Le Tuan, Khanh; Delettré, Jean; Mornon, Jean-Paul; Callebaut, Isabelle

    2007-01-01

    Background: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need for user expertise. In order to help the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins. These species are characterized only by their hydrophobic/non-hydrophobic dichotomy. This analysis has been extended to loop-forming clusters, using an appropriate loop alphabet. Results: The structural behavior of hydrophobic cluster species, which are typical of protein globular domains, was investigated within banks of experimental structures, considered at different levels of sequence redundancy. The 294 most frequent hydrophobic cluster species were analyzed with regard to their association with the different secondary structures (frequencies of association with secondary structures and secondary structure propensities). Hydrophobic cluster species are predominantly associated with regular secondary structures, and a large part (60%) reveals preferences for α-helices or β-strands. Moreover, the analysis of the hydrophobic cluster amino acid composition generally allows for finer prediction of the regular secondary structure associated with the considered cluster within a cluster species. We also investigated the behavior of loop-forming clusters, using a "PGDNS" alphabet. These loop clusters do not overlap with hydrophobic clusters and are highly associated with coils. Finally, the structural information contained in the hydrophobic structural words, as deduced from experimental structures, was compared to the PSI-PRED predictions, revealing that β-strands and especially α-helices are generally over-predicted within the limits of typical β and α hydrophobic clusters. Conclusion: The dictionary of hydrophobic clusters described here can help the HCA user to interpret and compare the HCA plots of globular protein sequences, and provides an original fundamental insight into the structural bricks of protein folds. Moreover, the novel loop cluster analysis brings additional information for secondary structure prediction on the whole sequence through a generalized cluster analysis (GCA), and not only on regular secondary structures. Such information lays the foundations for developing a new and original tool for secondary structure prediction. PMID:17210072
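
    The hydrophobic/non-hydrophobic dichotomy and the "PGDNS" loop alphabet mentioned above can be pictured as two binary encodings of a protein sequence. The sketch below is a simplified illustration only: the hydrophobic alphabet VILFMYW and the rule that a run of four or more non-hydrophobic residues separates clusters are assumptions taken from common descriptions of HCA, not from this abstract, the proline-breaker rule is ignored, and the example sequence is made up.

```python
import re

HYDROPHOBIC = set("VILFMYW")  # assumed HCA hydrophobic alphabet
LOOP_FORMING = set("PGDNS")   # loop alphabet named in the abstract


def binary_code(seq: str) -> str:
    """Encode a sequence as 1 (hydrophobic) / 0 (non-hydrophobic)."""
    return "".join("1" if aa in HYDROPHOBIC else "0" for aa in seq.upper())


def loop_mask(seq: str) -> str:
    """Mark residues belonging to the PGDNS loop alphabet with 1."""
    return "".join("1" if aa in LOOP_FORMING else "0" for aa in seq.upper())


def hydrophobic_clusters(seq: str, min_gap: int = 4) -> list[str]:
    """Split the binary code into clusters separated by >= min_gap zeros.

    Simplification: proline is treated like any other non-hydrophobic residue.
    """
    code = binary_code(seq).strip("0")
    pieces = re.split("0{%d,}" % min_gap, code)
    return [piece for piece in pieces if piece]


sequence = "MKVLAAGDDSPNGFIWEL"  # hypothetical example sequence
print(binary_code(sequence))
print(loop_mask(sequence))
print(hydrophobic_clusters(sequence))
```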

  18. Familial cases of Norrie disease detected by copy number analysis.

    PubMed

    Arai, Eisuke; Fujimaki, Takuro; Yanagawa, Ai; Fujiki, Keiko; Yokoyama, Toshiyuki; Okumura, Akihisa; Shimizu, Toshiaki; Murakami, Akira

    2014-09-01

    Norrie disease (ND, MIM#310600) is an X-linked disorder characterized by severe vitreoretinal dysplasia at birth. We report the results of causative NDP gene analysis in three male siblings with Norrie disease and describe the associated phenotypes. Three brothers with suspected Norrie disease and their mother presented for clinical examination. After obtaining informed consent, DNA was extracted from the peripheral blood of the proband, one of his brothers and his unaffected mother. Exons 1-3 of the NDP gene were amplified by polymerase chain reaction (PCR), and direct sequencing was performed. Multiplex ligation-dependent probe amplification (MLPA) was also performed to search for copy number variants in the NDP gene. The clinical findings of the three brothers included no light perception, corneal opacity, shallow anterior chamber, leukocoria, total retinal detachment and mental retardation. Exon 2 of the NDP gene was not amplified in the proband and one brother, even when the PCR primers for exon 2 were changed, whereas the other two exons showed no mutations by direct sequencing. MLPA analysis showed deletion of exon 2 of the NDP gene in the proband and one brother, while there was only one copy of exon 2 in the mother. Norrie disease was diagnosed in three patients from a Japanese family by clinical examination and was confirmed by genetic analysis. To localize the defect, confirmation of copy number variation by the MLPA method was useful in the present study.

  19. Sequence Analysis of APOA5 Among the Kuwaiti Population Identifies Association of rs2072560, rs2266788, and rs662799 With TG and VLDL Levels

    PubMed Central

    Jasim, Anfal A.; Al-Bustan, Suzanne A.; Al-Kandari, Wafa; Al-Serri, Ahmad; AlAskar, Huda

    2018-01-01

    Common variants of Apolipoprotein A5 (APOA5) have been associated with lipid levels, yet very few studies have reported full sequence data from various ethnic groups. The purpose of this study was to analyse the full APOA5 gene sequence to identify variants in 100 healthy Kuwaitis of Arab ethnicities and assess their association with variation in lipid levels in a cohort of 733 samples. The Sanger method was used for direct sequencing of the full 3.7 kb APOA5 gene, and multiple sequence alignment was used to identify variants. The complete APOA5 sequence in Kuwaiti Arabs has been deposited in GenBank (KJ401315). A total of 20 reported single nucleotide polymorphisms (SNPs) were identified. Two novel SNPs were also identified: a synonymous 2197G>A polymorphism at genomic position 116661525 and a 3′ UTR 3222C>T polymorphism at genomic position 116660500, based on human genome assembly GRCh37/hg19. Five SNPs along with the two novel SNPs were selected for validation in the cohort. Association of those SNPs with lipid levels was tested, and minor alleles of three SNPs (rs2072560, rs2266788, and rs662799) were found to be significantly associated with TG and VLDL levels. This is the first study to report the full APOA5 sequence and SNPs in an Arab ethnic group. Analysis of the variants identified and comparison to other populations suggests a distinctive genetic component in Arabs. The positive association observed for rs2072560 and rs2266788 with TG and VLDL levels confirms their role in lipid metabolism. PMID:29686695

  20. Sequence Analysis of APOA5 Among the Kuwaiti Population Identifies Association of rs2072560, rs2266788, and rs662799 With TG and VLDL Levels.

    PubMed

    Jasim, Anfal A; Al-Bustan, Suzanne A; Al-Kandari, Wafa; Al-Serri, Ahmad; AlAskar, Huda

    2018-01-01

    Common variants of Apolipoprotein A5 (APOA5) have been associated with lipid levels, yet very few studies have reported full sequence data from various ethnic groups. The purpose of this study was to analyse the full APOA5 gene sequence to identify variants in 100 healthy Kuwaitis of Arab ethnicities and assess their association with variation in lipid levels in a cohort of 733 samples. The Sanger method was used for direct sequencing of the full 3.7 kb APOA5 gene, and multiple sequence alignment was used to identify variants. The complete APOA5 sequence in Kuwaiti Arabs has been deposited in GenBank (KJ401315). A total of 20 reported single nucleotide polymorphisms (SNPs) were identified. Two novel SNPs were also identified: a synonymous 2197G>A polymorphism at genomic position 116661525 and a 3' UTR 3222C>T polymorphism at genomic position 116660500, based on human genome assembly GRCh37/hg19. Five SNPs along with the two novel SNPs were selected for validation in the cohort. Association of those SNPs with lipid levels was tested, and minor alleles of three SNPs (rs2072560, rs2266788, and rs662799) were found to be significantly associated with TG and VLDL levels. This is the first study to report the full APOA5 sequence and SNPs in an Arab ethnic group. Analysis of the variants identified and comparison to other populations suggests a distinctive genetic component in Arabs. The positive association observed for rs2072560 and rs2266788 with TG and VLDL levels confirms their role in lipid metabolism.

  1. Streptococcal phosphoenolpyruvate-sugar phosphotransferase system: amino acid sequence and site of ATP-dependent phosphorylation of HPr

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Deutscher, J.; Pevec, B.; Beyreuther, K.

    1986-10-21

    The amino acid sequence of histidine-containing protein (HPr) from Streptococcus faecalis has been determined by direct Edman degradation of intact HPr and by amino acid sequence analysis of tryptic peptides, V8 proteolytic peptides, thermolytic peptides, and cyanogen bromide cleavage products. HPr from S. faecalis was found to contain 89 amino acid residues, corresponding to a molecular weight of 9438. The amino acid sequence of HPr from S. faecalis shows extended homology to the primary structure of HPr proteins from other bacteria. Besides the phosphoenolpyruvate-dependent phosphorylation of a histidyl residue in HPr, catalyzed by enzyme I of the bacterial phosphotransferase system, HPr was also found to be phosphorylated at a seryl residue in an ATP-dependent protein kinase-catalyzed reaction. The site of ATP-dependent phosphorylation in HPr of S. faecalis has now been determined. [32P]P-Ser-HPr was digested with three different proteases, and in each case, a single labeled peptide was isolated. Following digestion with subtilisin, they obtained a peptide with the sequence -(P)Ser-Ile-Met-. Using chymotrypsin, they isolated a peptide with the sequence -Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-Gly-Val-Met-. The longest labeled peptide was obtained with V8 staphylococcal protease. According to amino acid analysis, this peptide contained 36 out of the 89 amino acid residues of HPr. The following sequence of 12 amino acid residues of the V8 peptide was determined: -Tyr-Lys-Gly-Lys-Ser-Val-Asn-Leu-Lys-(P)Ser-Ile-Met-. Thus, the site of ATP-dependent phosphorylation was determined to be Ser-46 within the primary structure of HPr.
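
    The way the labeled peptides overlap around the phosphoserine can be made concrete with a small sketch that merges the V8 and chymotryptic fragments quoted above on their longest shared stretch. This is only an illustration of the overlap logic, not the authors' procedure; "pSer" is a label introduced here for (P)Ser, and the residue numbering is back-calculated from the stated Ser-46 under the assumption that the fragments are contiguous.

```python
# Peptide fragments quoted in the abstract; "pSer" stands for (P)Ser.
v8_fragment = "Tyr Lys Gly Lys Ser Val Asn Leu Lys pSer Ile Met".split()
chymotryptic = "Ser Val Asn Leu Lys pSer Ile Met Gly Val Met".split()
subtilisin = "pSer Ile Met".split()


def merge_on_overlap(left: list[str], right: list[str]) -> list[str]:
    """Join two fragments on the longest suffix of `left` that prefixes `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("fragments do not overlap")


merged = merge_on_overlap(v8_fragment, chymotryptic)
phospho_index = merged.index("pSer")

# If the phosphoserine is residue 46, number the merged stretch accordingly.
first_residue = 46 - phospho_index
last_residue = first_residue + len(merged) - 1
print(" ".join(merged))
print(f"merged stretch spans residues {first_residue}-{last_residue}")
```

    The subtilisin peptide falls entirely inside the merged stretch, so all three labeled peptides point to the same seryl residue.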

  2. Directional Selection from Host Plants Is a Major Force Driving Host Specificity in Magnaporthe Species.

    PubMed

    Zhong, Zhenhui; Norvienyeku, Justice; Chen, Meilian; Bao, Jiandong; Lin, Lianyu; Chen, Liqiong; Lin, Yahong; Wu, Xiaoxian; Cai, Zena; Zhang, Qi; Lin, Xiaoye; Hong, Yonghe; Huang, Jun; Xu, Linghong; Zhang, Honghong; Chen, Long; Tang, Wei; Zheng, Huakun; Chen, Xiaofeng; Wang, Yanli; Lian, Bi; Zhang, Liangsheng; Tang, Haibao; Lu, Guodong; Ebbole, Daniel J; Wang, Baohua; Wang, Zonghua

    2016-05-06

    One major threat to global food security that requires immediate attention is the increasing incidence of host shift and host expansion in a growing number of pathogenic fungi and the emergence of new pathogens. The threat is more alarming because efforts to improve yield quality and quantity are encouraging the cultivation of uniform plants with low genetic diversity that are increasingly susceptible to emerging pathogens. However, the influence of host genome differentiation on pathogen genome differentiation and its contribution to emergence and adaptability is still obscure. Here, we compared the genome sequences of 6 isolates of Magnaporthe species obtained from three different host plants. We demonstrated the evolutionary relationship between Magnaporthe species and the influence of host differentiation on pathogens. Phylogenetic analysis showed that the evolution of the pathogen directly corresponds with host divergence, suggesting that host-pathogen interaction has led to co-evolution. Furthermore, we identified an asymmetric selection pressure on Magnaporthe species. Oryza sativa-infecting isolates showed higher directional selection from the host, which subsequently tends to lower the genetic diversity in their genomes. We concluded that frequent gene loss or gain, new transposon acquisition and sequence divergence are host adaptability mechanisms for Magnaporthe species, and that this co-evolution process is greatly driven by directional selection from host plants.

  3. Directional Selection from Host Plants Is a Major Force Driving Host Specificity in Magnaporthe Species

    PubMed Central

    Zhong, Zhenhui; Norvienyeku, Justice; Chen, Meilian; Bao, Jiandong; Lin, Lianyu; Chen, Liqiong; Lin, Yahong; Wu, Xiaoxian; Cai, Zena; Zhang, Qi; Lin, Xiaoye; Hong, Yonghe; Huang, Jun; Xu, Linghong; Zhang, Honghong; Chen, Long; Tang, Wei; Zheng, Huakun; Chen, Xiaofeng; Wang, Yanli; Lian, Bi; Zhang, Liangsheng; Tang, Haibao; Lu, Guodong; Ebbole, Daniel J.; Wang, Baohua; Wang, Zonghua

    2016-01-01

    One major threat to global food security that requires immediate attention is the increasing incidence of host shift and host expansion in a growing number of pathogenic fungi and the emergence of new pathogens. The threat is more alarming because efforts to improve yield quality and quantity are encouraging the cultivation of uniform plants with low genetic diversity that are increasingly susceptible to emerging pathogens. However, the influence of host genome differentiation on pathogen genome differentiation and its contribution to emergence and adaptability is still obscure. Here, we compared the genome sequences of 6 isolates of Magnaporthe species obtained from three different host plants. We demonstrated the evolutionary relationship between Magnaporthe species and the influence of host differentiation on pathogens. Phylogenetic analysis showed that the evolution of the pathogen directly corresponds with host divergence, suggesting that host-pathogen interaction has led to co-evolution. Furthermore, we identified an asymmetric selection pressure on Magnaporthe species. Oryza sativa-infecting isolates showed higher directional selection from the host, which subsequently tends to lower the genetic diversity in their genomes. We concluded that frequent gene loss or gain, new transposon acquisition and sequence divergence are host adaptability mechanisms for Magnaporthe species, and that this co-evolution process is greatly driven by directional selection from host plants. PMID:27151494

  4. Quantitative RNA-seq analysis of the Campylobacter jejuni transcriptome

    PubMed Central

    Chaudhuri, Roy R.; Yu, Lu; Kanji, Alpa; Perkins, Timothy T.; Gardner, Paul P.; Choudhary, Jyoti; Maskell, Duncan J.

    2011-01-01

    Campylobacter jejuni is the most common bacterial cause of foodborne disease in the developed world. Its general physiology and biochemistry, as well as the mechanisms enabling it to colonize and cause disease in various hosts, are not well understood, and new approaches are required to understand its basic biology. High-throughput sequencing technologies provide unprecedented opportunities for functional genomic research. Recent studies have shown that direct Illumina sequencing of cDNA (RNA-seq) is a useful technique for the quantitative and qualitative examination of transcriptomes. In this study we report RNA-seq analyses of the transcriptomes of C. jejuni (NCTC11168) and its rpoN mutant. This has allowed the identification of hitherto unknown transcriptional units, and further defines the regulon that is dependent on rpoN for expression. The analysis of the NCTC11168 transcriptome was supplemented by additional proteomic analysis using liquid chromatography-MS. The transcriptomic and proteomic datasets represent an important resource for the Campylobacter research community. PMID:21816880

  5. A long natural-antisense RNA is accumulated in the conidia of Aspergillus oryzae.

    PubMed

    Tsujii, Masaru; Okuda, Satoshi; Ishi, Kazutomo; Madokoro, Kana; Takeuchi, Michio; Yamagata, Youhei

    2016-01-01

    Analysis of expressed sequence tag libraries from various culture conditions revealed the existence of conidia-specific transcripts assembled to a putative conidiation-specific reductase gene (csrA) in Aspergillus oryzae. However, all of the transcripts were transcribed in the opposite direction to the gene csrA. Sequence analysis of the transcript revealed that the RNA overlapped the 3'-end of the csrA mRNA and did not code for a protein longer than 60 amino acid residues. We designated the transcript Conidia Specific Long Natural-antisense RNA (CSLNR). Real-time PCR analysis demonstrated that CSLNR is a conidia-specific transcript that cannot be transcribed in the absence of brlA, and that the amount of CSLNR was much greater than that of the csrA transcript in conidia. Furthermore, deletion of csrA, which also removes the coding region of CSLNR, reduced the number of conidia in A. oryzae. Overexpression of CsrA inhibited growth and conidiation, while CSLNR did not affect conidiation.

  6. Divergence of Structure and Function in the Haloacid Dehalogenase Enzyme Superfamily: Bacteroides thetaiotaomicron BT2127 is an Inorganic Pyrophosphatase

    PubMed Central

    Huang, Hua; Yury, Patskovsky; Toro, Rafael; Farelli, Jeremiah D.; Pandya, Chetanya; Almo, Steven C.; Allen, Karen N.; Dunaway-Mariano, Debra

    2012-01-01

    The explosion of protein sequence information requires that current strategies for function assignment must evolve to complement experimental approaches with computationally-based function prediction. This necessitates the development of strategies based on the identification of sequence markers in the form of specificity determinants and a more informed definition of orthologues. Herein, we have undertaken the function assignment of the unknown Haloalkanoate Dehalogenase superfamily member BT2127 (Uniprot accession # Q8A5V9) from Bacteroides thetaiotaomicron using an integrated bioinformatics/structure/mechanism approach. The substrate specificity profile and steady-state rate constants of BT2127 (with a kcat/Km value for pyrophosphate of ~1 × 10^5 M^-1 s^-1), together with the gene context, support the assigned in vivo function as an inorganic pyrophosphatase. The X-ray structural analysis of the wild-type BT2127 and several variants generated by site-directed mutagenesis shows that substrate discrimination is based, in part, on active site space restrictions imposed by the cap domain (specifically by residues Tyr76 and Glu47). Structure-guided site-directed mutagenesis coupled with kinetic analysis of the mutant enzymes identified the residues required for catalysis, substrate binding, and domain-domain association. Based on this structure-function analysis, the catalytic residues Asp11, Asp13, Thr113, and Lys147 as well as the metal-binding residues Asp171, Asn172 and Glu47 were used as markers to confirm BT2127 orthologues identified via sequence searches. This bioinformatic analysis demonstrated that the biological range of BT2127 orthologues is restricted to the phylum Bacteroidetes/Chlorobi. The key structural determinants in the divergence of BT2127 and its closest homologue β-phosphoglucomutase control the leaving group size (phosphate vs. glucose-phosphate) and the position of the Asp acid/base in the open vs. closed conformations. HADSF pyrophosphatases represent a third mechanistic and fold type for bacterial pyrophosphatases. PMID:21894910

  7. Genetic analysis of human immunodeficiency virus type 1 envelope V3 region isolates from mothers and infants after perinatal transmission.

    PubMed Central

    Ahmad, N; Baroudy, B M; Baker, R C; Chappey, C

    1995-01-01

    The human immunodeficiency virus type 1 (HIV-1) sequences from variable region 3 (V3) of the envelope gene were analyzed from seven infected mother-infant pairs following perinatal transmission. The V3 region sequences directly derived from the DNA of the uncultured peripheral blood mononuclear cells from infected mothers displayed a heterogeneous population. In contrast, the infants' sequences were less diverse than those of their mothers. In addition, the sequences from the younger infants' peripheral blood mononuclear cell DNA were more homogeneous than the older infants' sequences. All infants' sequences were different but displayed patterns similar to those seen in their mothers. In the mother-infant pair sequences analyzed, a minor genotype or subtype found in the mothers predominated in their infants. The conserved N-linked glycosylation site proximal to the first cysteine of the V3 loop was absent only in one infant's sequence set and in some variants of two other infants' sequences. Furthermore, the HIV-1 sequences of the epidemiologically linked mother-infant pairs were closer than the sequences of epidemiologically unlinked individuals, suggesting that the sequence comparison of mother-infant pairs done in order to identify genetic variants transmitted from mother to infant could be performed even in older infants. There was no evidence for transmission of a major genotype or multiple genotypes from mother to infant. In conclusion, a minor genotype of maternal virus is transmitted to the infants, and this finding could be useful in developing strategies to prevent maternal transmission of HIV-1 by means of perinatal interventions. PMID:7815476

  8. Diversity of Secondary Structure in Catalytic Peptides with β-Turn-Biased Sequences

    PubMed Central

    2016-01-01

    X-ray crystallography has been applied to the structural analysis of a series of tetrapeptides that were previously assessed for catalytic activity in an atroposelective bromination reaction. Common to the series is a central Pro-Xaa sequence, where Pro is either l- or d-proline, which was chosen to favor nucleation of canonical β-turn secondary structures. Crystallographic analysis of 35 different peptide sequences revealed a range of conformational states. The observed differences appear not only in cases where the Pro-Xaa loop-region is altered, but also when seemingly subtle alterations to the flanking residues are introduced. In many instances, distinct conformers of the same sequence were observed, either as symmetry-independent molecules within the same unit cell or as polymorphs. Computational studies using DFT provided additional insight into the analysis of solid-state structural features. Select X-ray crystal structures were compared to the corresponding solution structures derived from measured proton chemical shifts, 3J-values, and 1H–1H-NOESY contacts. These findings imply that the conformational space available to simple peptide-based catalysts is more diverse than precedent might suggest. The direct observation of multiple ground state conformations for peptides of this family, as well as the dynamic processes associated with conformational equilibria, underscore not only the challenge of designing peptide-based catalysts, but also the difficulty in predicting their accessible transition states. These findings implicate the advantages of low-barrier interconversions between conformations of peptide-based catalysts for multistep, enantioselective reactions. PMID:28029251

  9. A Unified Theoretical Framework for Cognitive Sequencing.

    PubMed

    Savalia, Tejas; Shukla, Anuj; Bapi, Raju S

    2016-01-01

    The capacity to sequence information is central to human performance. Sequencing ability forms the foundation stone for higher order cognition related to language and goal-directed planning. Information related to the order of items, their timing, chunking and hierarchical organization are important aspects in sequencing. Past research on sequencing has emphasized two distinct and independent dichotomies: implicit vs. explicit and goal-directed vs. habits. We propose a theoretical framework unifying these two streams. Our proposal relies on the brain's ability to implicitly extract statistical regularities from the stream of stimuli and, with attentional engagement, to organize sequences explicitly and hierarchically. Similarly, sequences that need to be assembled purposively to accomplish a goal require engagement of attentional processes. With repetition, these goal-directed plans become habits with concomitant disengagement of attention. Thus, attention and awareness play a crucial role in the implicit-to-explicit transition as well as in how goal-directed plans become automatic habits. Cortico-subcortical loops (basal ganglia-frontal cortex and hippocampus-frontal cortex) mediate the transition process. We show how the computational principles of model-free and model-based learning paradigms, along with a pivotal role for attention and awareness, offer a unifying framework for these two dichotomies. Based on this framework, we make testable predictions related to the potential influence of response-to-stimulus interval (RSI) on developing awareness in implicit learning tasks.

  10. A Unified Theoretical Framework for Cognitive Sequencing

    PubMed Central

    Savalia, Tejas; Shukla, Anuj; Bapi, Raju S.

    2016-01-01

    The capacity to sequence information is central to human performance. Sequencing ability forms the foundation stone for higher order cognition related to language and goal-directed planning. Information related to the order of items, their timing, chunking and hierarchical organization are important aspects in sequencing. Past research on sequencing has emphasized two distinct and independent dichotomies: implicit vs. explicit and goal-directed vs. habits. We propose a theoretical framework unifying these two streams. Our proposal relies on the brain's ability to implicitly extract statistical regularities from the stream of stimuli and, with attentional engagement, to organize sequences explicitly and hierarchically. Similarly, sequences that need to be assembled purposively to accomplish a goal require engagement of attentional processes. With repetition, these goal-directed plans become habits with concomitant disengagement of attention. Thus, attention and awareness play a crucial role in the implicit-to-explicit transition as well as in how goal-directed plans become automatic habits. Cortico-subcortical loops (basal ganglia-frontal cortex and hippocampus-frontal cortex) mediate the transition process. We show how the computational principles of model-free and model-based learning paradigms, along with a pivotal role for attention and awareness, offer a unifying framework for these two dichotomies. Based on this framework, we make testable predictions related to the potential influence of response-to-stimulus interval (RSI) on developing awareness in implicit learning tasks. PMID:27917146

  11. On the value of Mendelian laws of segregation in families: data quality control, imputation and beyond

    PubMed Central

    Blue, Elizabeth Marchani; Sun, Lei; Tintle, Nathan L.; Wijsman, Ellen M.

    2014-01-01

    When analyzing family data, we dream of perfectly informative data, even whole genome sequences (WGS) for all family members. Reality intervenes, and we find next-generation sequence (NGS) data have error, and are often too expensive or impossible to collect on everyone. Genetic Analysis Workshop 18 groups “Quality Control” and “Dropping WGS through families using GWAS framework” focused on finding, correcting, and using errors within the available sequence and family data, developing methods to infer and analyze missing sequence data among relatives, and testing for linkage and association with simulated blood pressure. We found that single nucleotide polymorphisms, NGS, and imputed data are generally concordant, but that errors are particularly likely at rare variants, homozygous genotypes, within regions with repeated sequences or structural variants, and within sequence data imputed from unrelateds. Admixture complicated identification of cryptic relatedness, but information from Mendelian transmission improved error detection and provided an estimate of the de novo mutation rate. Both genotype and pedigree errors had an adverse effect on subsequent analyses. Computationally fast rules-based imputation was accurate, but could not cover as many loci or subjects as more computationally demanding probability-based methods. Incorporating population-level data into pedigree-based imputation methods improved results. Observed data outperformed imputed data in association testing, but imputed data were also useful. We discuss the strengths and weaknesses of existing methods, and suggest possible future directions. Topics include improving communication between those performing data collection and analysis, establishing thresholds for and improving imputation quality, and incorporating error into imputation and analytical models. PMID:25112184

  12. Assessment of antibody library diversity through next generation sequencing and technical error compensation

    PubMed Central

    Lisi, Simonetta; Chirichella, Michele; Arisi, Ivan; Goracci, Martina; Cremisi, Federico; Cattaneo, Antonino

    2017-01-01

    Antibody libraries are important resources to derive antibodies to be used for a wide range of applications, from structural and functional studies to intracellular protein interference studies to developing new diagnostics and therapeutics. Whatever the goal, the key parameter for an antibody library is its complexity (also known as diversity), i.e. the number of distinct elements in the collection, which directly reflects the probability of finding in the library an antibody against a given antigen, of sufficiently high affinity. Quantitative evaluation of antibody library complexity and quality has long been inadequately addressed, due to the high similarity and length of the sequences of the library. Complexity was usually inferred from the transformation efficiency and tested by fingerprinting and/or sequencing of a few hundred random library elements. Inferring complexity from such a small sampling is, however, very rudimentary and gives limited information about the real diversity, because complexity does not scale linearly with sample size. Next-generation sequencing (NGS) has opened new ways to tackle the quality assessment of antibody library complexity. However, much remains to be done to fully exploit the potential of NGS for the quantitative analysis of antibody repertoires and to overcome current limitations. To obtain a more reliable estimate of antibody library complexity, here we show a new, PCR-free NGS approach to sequence antibody libraries on the Illumina platform, coupled to a new bioinformatic analysis and software (Diversity Estimator of Antibody Library, DEAL) that allows reliable estimation of the complexity, taking into consideration the sequencing error. PMID:28505201
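
    The remark above that complexity does not scale linearly with sample size can be illustrated with a small calculation. The sketch below is not the DEAL method; it assumes an idealized library in which every clone is equally abundant and reads are drawn at random with replacement, and the library size used is a hypothetical value chosen only for illustration.

```python
# Illustrative sketch only: idealized uniform library, NOT the DEAL algorithm.

def expected_distinct(library_size: int, sample_size: int) -> float:
    """Expected number of distinct clones observed when sample_size clones
    are drawn uniformly at random, with replacement, from a library that
    contains library_size equally abundant clones."""
    return library_size * (1.0 - (1.0 - 1.0 / library_size) ** sample_size)

LIBRARY_SIZE = 1_000_000  # hypothetical complexity, not a value from the paper

for n in (300, 100_000, 1_000_000, 10_000_000):
    seen = expected_distinct(LIBRARY_SIZE, n)
    # The fraction of reads that are new clones falls as sampling deepens.
    print(f"{n:>10} reads: about {seen:,.0f} distinct clones "
          f"({seen / n:.1%} of reads)")
```

    Under these assumptions a sample of a few hundred clones is almost entirely distinct whatever the true library size, so it cannot separate a library of one million clones from a much larger one, while deep sampling saturates toward the true complexity.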

  13. Evaluation of the genetic diversity of Plum pox virus in a single plum tree.

    PubMed

    Predajňa, Lukáš; Šubr, Zdeno; Candresse, Thierry; Glasa, Miroslav

    2012-07-01

    Genetic diversity of Plum pox virus (PPV) and its distribution within a single perennial woody host (plum, Prunus domestica) has been evaluated. A plum tree was triply infected by chip-budding with PPV-M, PPV-D and PPV-Rec isolates in 2003 and left to develop untreated under open field conditions. In September 2010 leaf and fruit samples were collected from different parts of the tree canopy. A 745-bp NIb-CP fragment of PPV genome, containing the hypervariable region encoding the CP N-terminal end was amplified by RT-PCR from each sample and directly sequenced to determine the dominant sequence. In parallel, the PCR products were cloned and a total of 105 individual clones were sequenced. Sequence analysis revealed that after 7 years of infection, only PPV-M was still detectable in the tree and that the two other isolates (PPV-Rec and PPV-D) had been displaced. Despite the fact that the analysis targeted a relatively short portion of the genome, a substantial amount of intra-isolate variability was observed for PPV-M. A total of 51 different haplotypes could be identified from the 105 individual sequences, two of which were largely dominant. However, no clear-cut structuration of the viral population by the tree architecture could be highlighted although the results obtained suggest the possibility of intra-leaf/fruit differentiation of the viral population. Comparison of the consensus sequence with the original source isolate showed no difference, suggesting within-plant stability of this original isolate under open field conditions. Copyright © 2012 Elsevier B.V. All rights reserved.

  14. OpenFluDB, a database for human and animal influenza virus

    PubMed Central

    Liechti, Robin; Gleizes, Anne; Kuznetsov, Dmitry; Bougueleret, Lydie; Le Mercier, Philippe; Bairoch, Amos; Xenarios, Ioannis

    2010-01-01

    Although research on influenza lasted for more than 100 years, it is still one of the most prominent diseases causing half a million human deaths every year. With the recent observation of new highly pathogenic H5N1 and H7N7 strains, and the appearance of the influenza pandemic caused by the H1N1 swine-like lineage, a collaborative effort to share observations on the evolution of this virus in both animals and humans has been established. The OpenFlu database (OpenFluDB) is a part of this collaborative effort. It contains genomic and protein sequences, as well as epidemiological data from more than 27 000 isolates. The isolate annotations include virus type, host, geographical location and experimentally tested antiviral resistance. Putative enhanced pathogenicity as well as human adaptation propensity are computed from protein sequences. Each virus isolate can be associated with the laboratories that collected, sequenced and submitted it. Several analysis tools including multiple sequence alignment, phylogenetic analysis and sequence similarity maps enable rapid and efficient mining. The contents of OpenFluDB are supplied by direct user submission, as well as by a daily automatic procedure importing data from public repositories. Additionally, a simple mechanism facilitates the export of OpenFluDB records to GenBank. This resource has been successfully used to rapidly and widely distribute the sequences collected during the recent human swine flu outbreak and also as an exchange platform during the vaccine selection procedure. Database URL: http://openflu.vital-it.ch. PMID:20624713

  15. Assessment of antibody library diversity through next generation sequencing and technical error compensation.

    PubMed

    Fantini, Marco; Pandolfini, Luca; Lisi, Simonetta; Chirichella, Michele; Arisi, Ivan; Terrigno, Marco; Goracci, Martina; Cremisi, Federico; Cattaneo, Antonino

    2017-01-01

    Antibody libraries are important resources to derive antibodies to be used for a wide range of applications, from structural and functional studies to intracellular protein interference studies to developing new diagnostics and therapeutics. Whatever the goal, the key parameter for an antibody library is its complexity (also known as diversity), i.e. the number of distinct elements in the collection, which directly reflects the probability of finding in the library an antibody against a given antigen, of sufficiently high affinity. Quantitative evaluation of antibody library complexity and quality has long been inadequately addressed, due to the high similarity and length of the sequences of the library. Complexity was usually inferred from the transformation efficiency and tested by fingerprinting and/or sequencing of a few hundred random library elements. Inferring complexity from such a small sampling is, however, very rudimentary and gives limited information about the real diversity, because complexity does not scale linearly with sample size. Next-generation sequencing (NGS) has opened new ways to tackle the quality assessment of antibody library complexity. However, much remains to be done to fully exploit the potential of NGS for the quantitative analysis of antibody repertoires and to overcome current limitations. To obtain a more reliable estimate of antibody library complexity, here we show a new, PCR-free NGS approach to sequence antibody libraries on the Illumina platform, coupled to a new bioinformatic analysis and software (Diversity Estimator of Antibody Library, DEAL) that allows reliable estimation of the complexity, taking into consideration the sequencing error.

  16. Identification of a Novel HADHB Gene Mutation in an Iranian Patient with Mitochondrial Trifunctional Protein Deficiency.

    PubMed

    Shahrokhi, Mahdiyeh; Shafiei, Mohammad; Galehdari, Hamid; Shariati, Gholamreza

    2017-01-01

    Mitochondrial trifunctional protein (MTP) is a hetero-octamer composed of eight subunits: four α-subunits containing LCEH (long-chain 2,3-enoyl-CoA hydratase) and LCHAD (long-chain 3-hydroxyacyl-CoA dehydrogenase) activity, and four β-subunits that possess LCKT (long-chain 3-ketoacyl-CoA thiolase) activity; together these activities catalyze three out of four steps in the β-oxidation spiral of long-chain fatty acids. Its deficiency is an autosomal recessive disorder that causes a clinical spectrum of diseases. A blood spot was collected from the patient's original newborn screening card with parental informed consent. A newborn screening test and quantitative plasma acylcarnitine profile analysis by MS/MS were performed. After isolation of DNA and amplification of all exons of HADHA and HADHB, direct sequence analysis of all exons and the flanking introns of both genes was performed. Here, we report a novel mutation in a patient with MTP deficiency diagnosed with a newborn screening test and quantitative plasma acylcarnitine profile analysis by MS/MS and then confirmed by enzyme analysis in cultured fibroblasts and direct sequencing of the HADHA and HADHB genes. Molecular analysis of the causative genes showed a missense mutation, c.1154A>C (p.Q385P), in exon 14 of the HADHB gene. Since this mutation was not found in 50 normal control cases, it was concluded that c.1154A>C was a causative mutation. Phenotype analysis of this mutation predicted that it is pathogenic and reduces the stability of the MTP protein complex.
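
    As a consistency check on the variant notation reported above, the coding-sequence position and the amino-acid change can be compared using only coordinate arithmetic and the standard genetic code. This is a minimal sketch; it uses no HADHB sequence data, and the helper names are hypothetical.

```python
# Illustrative check that c.1154A>C is consistent with p.Q385P.
# Only coordinate arithmetic and the standard genetic code are used.

def codon_of(cds_pos: int) -> tuple:
    """Return (codon number, position within codon), both 1-based,
    for a 1-based coding-sequence position."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1

# Standard genetic code entries relevant to a Gln -> Pro change.
CODONS = {"CAA": "Q", "CAG": "Q", "CCA": "P", "CCG": "P"}

def substitute(codon: str, pos_in_codon: int, alt: str) -> str:
    """Replace the base at the 1-based position pos_in_codon with alt."""
    i = pos_in_codon - 1
    return codon[:i] + alt + codon[i + 1:]

codon_number, pos_in_codon = codon_of(1154)

# Both glutamine codons carry A at the affected position, and A>C there
# yields a proline codon in each case.
consistent = all(
    ref[pos_in_codon - 1] == "A"
    and CODONS[substitute(ref, pos_in_codon, "C")] == "P"
    for ref in ("CAA", "CAG")
)

print(codon_number, pos_in_codon, consistent)
```

    Position 1154 falls on the second base of codon 385, and an A to C change at the second base of either glutamine codon gives a proline codon, which agrees with the reported p.Q385P.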

  17. Comparative structural analysis of Bru1 region homeologs in Saccharum spontaneum and S. officinarum

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jisen; Sharma, Anupma; Yu, Qingyi

    Here, sugarcane is a major sugar and biofuel crop, but genomic research and molecular breeding have lagged behind other major crops due to the complexity of auto-allopolyploid genomes. Sugarcane cultivars are frequently aneuploid with chromosome number ranging from 100 to 130, consisting of 70-80 % S. officinarum, 10-20 % S. spontaneum, and 10 % recombinants between these two species. Analysis of a genomic region in the progenitor autoploid genomes of sugarcane hybrid cultivars will reveal the nature and divergence of homologous chromosomes. As a result, to investigate the origin and evolution of haplotypes in the Bru1 genomic regions in sugarcane cultivars, we identified two BAC clones from S. spontaneum and four from S. officinarum and compared to seven haplotype sequences from sugarcane hybrid R570. The results clarified the origin of seven homologous haplotypes in R570, four haplotypes originated from S. officinarum, two from S. spontaneum and one recombinant. Retrotransposon insertions and sequences variations among the homologous haplotypes sequence divergence ranged from 18.2 % to 60.5 % with an average of 33.7 %. Gene content and gene structure were relatively well conserved among the homologous haplotypes. Exon splitting occurred in haplotypes of the hybrid genome but not in its progenitor genomes. Tajima's D analysis revealed that S. spontaneum haplotypes in the Bru1 genomic regions were under strong directional selection. Numerous inversions, deletions, insertions and translocations were found between haplotypes within each genome. In conclusion, this is the first comparison among haplotypes of a modern sugarcane hybrid and its two progenitors. Tajima's D results emphasized the crucial role of this fungal disease resistance gene for enhancing the fitness of this species and indicating that the brown rust resistance gene in R570 is from S. spontaneum.
    Species-specific InDel, sequences similarity and phylogenetic analysis of homologous genes can be used for identifying the origin of S. spontaneum and S. officinarum haplotype in Saccharum hybrids. Comparison of exon splitting among the homologous haplotypes suggested that the genome rearrangements in Saccharum hybrids S. officinarum would be sufficient for proper genome assembly of this autopolyploid genome. Retrotransposon insertions and sequences variations among the homologous haplotypes sequence divergence may allow sequencing and assembling the autopolyploid Saccharum genomes and the auto-allopolyploid hybrid genomes using whole genome shotgun sequencing.

  18. Comparative structural analysis of Bru1 region homeologs in Saccharum spontaneum and S. officinarum

    DOE PAGES

    Zhang, Jisen; Sharma, Anupma; Yu, Qingyi; ...

    2016-06-10

    Here, sugarcane is a major sugar and biofuel crop, but genomic research and molecular breeding have lagged behind other major crops due to the complexity of auto-allopolyploid genomes. Sugarcane cultivars are frequently aneuploid with chromosome number ranging from 100 to 130, consisting of 70-80 % S. officinarum, 10-20 % S. spontaneum, and 10 % recombinants between these two species. Analysis of a genomic region in the progenitor autoploid genomes of sugarcane hybrid cultivars will reveal the nature and divergence of homologous chromosomes. As a result, to investigate the origin and evolution of haplotypes in the Bru1 genomic regions in sugarcane cultivars, we identified two BAC clones from S. spontaneum and four from S. officinarum and compared to seven haplotype sequences from sugarcane hybrid R570. The results clarified the origin of seven homologous haplotypes in R570, four haplotypes originated from S. officinarum, two from S. spontaneum and one recombinant. Retrotransposon insertions and sequences variations among the homologous haplotypes sequence divergence ranged from 18.2 % to 60.5 % with an average of 33.7 %. Gene content and gene structure were relatively well conserved among the homologous haplotypes. Exon splitting occurred in haplotypes of the hybrid genome but not in its progenitor genomes. Tajima's D analysis revealed that S. spontaneum haplotypes in the Bru1 genomic regions were under strong directional selection. Numerous inversions, deletions, insertions and translocations were found between haplotypes within each genome. In conclusion, this is the first comparison among haplotypes of a modern sugarcane hybrid and its two progenitors. Tajima's D results emphasized the crucial role of this fungal disease resistance gene for enhancing the fitness of this species and indicating that the brown rust resistance gene in R570 is from S. spontaneum.
    Species-specific InDel, sequences similarity and phylogenetic analysis of homologous genes can be used for identifying the origin of S. spontaneum and S. officinarum haplotype in Saccharum hybrids. Comparison of exon splitting among the homologous haplotypes suggested that the genome rearrangements in Saccharum hybrids S. officinarum would be sufficient for proper genome assembly of this autopolyploid genome. Retrotransposon insertions and sequences variations among the homologous haplotypes sequence divergence may allow sequencing and assembling the autopolyploid Saccharum genomes and the auto-allopolyploid hybrid genomes using whole genome shotgun sequencing.

  19. Analysis of expressed sequence tags for Frankliniella occidentalis, the western flower thrips.

    PubMed

    Rotenberg, D; Whitfield, A E

    2010-08-01

    Thrips are members of the insect order Thysanoptera and Frankliniella occidentalis (the western flower thrips) is the most economically important pest within this order. F. occidentalis is both a direct pest of crops and an efficient vector of plant viruses, including Tomato spotted wilt virus (TSWV). Despite the world-wide importance of thrips in agriculture, there is little knowledge of the F. occidentalis genome or gene functions at this time. A normalized cDNA library was constructed from first instar thrips and 13 839 expressed sequence tags (ESTs) were obtained. Our EST data assembled into 894 contigs and 11 806 singletons (12 700 nonredundant sequences). We found that 31% of these sequences had significant similarity (E ≤ 10^-10) to protein sequences in the National Center for Biotechnology Information nonredundant (nr) protein database, and 25% were functionally annotated using Blast 2GO. We identified 74 sequences with putative homology to proteins associated with insect innate immunity. Sixteen sequences had significant similarity to proteins associated with small RNA-mediated gene silencing pathways (RNA interference; RNAi), including the antiviral pathway (short interfering RNA-mediated pathway). Our EST collection provides new sequence resources for characterizing gene functions in F. occidentalis and other thrips species with regards to vital biological processes, studying the mechanism of interactions with the viruses harboured and transmitted by the vector, and identifying new insect gene-centred targets for plant disease and insect control.

  20. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements.

    PubMed

    Mi, Huaiyu; Huang, Xiaosong; Muruganujan, Anushya; Tang, Haiming; Mills, Caitlin; Kang, Diane; Thomas, Paul D

    2017-01-04

    The PANTHER database (Protein ANalysis THrough Evolutionary Relationships, http://pantherdb.org) contains comprehensive information on the evolution and function of protein-coding genes from 104 completely sequenced genomes. PANTHER software tools allow users to classify new protein sequences, and to analyze gene lists obtained from large-scale genomics experiments. In the past year, major improvements include a large expansion of classification information available in PANTHER, as well as significant enhancements to the analysis tools. Protein subfamily functional classifications have more than doubled due to progress of the Gene Ontology Phylogenetic Annotation Project. For human genes (as well as a few other organisms), PANTHER now also supports enrichment analysis using pathway classifications from the Reactome resource. The gene list enrichment tools include a new 'hierarchical view' of results, enabling users to leverage the structure of the classifications/ontologies; the tools also allow users to upload genetic variant data directly, rather than requiring prior conversion to a gene list. The updated coding single-nucleotide polymorphisms (SNP) scoring tool uses an improved algorithm. The hidden Markov model (HMM) search tools now use HMMER3, dramatically reducing search times and improving accuracy of E-value statistics. Finally, the PANTHER Tree-Attribute Viewer has been implemented in JavaScript, with new views for exploring protein sequence evolution. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Detection and Analysis of Circular RNAs by RT-PCR.

    PubMed

    Panda, Amaresh C; Gorospe, Myriam

    2018-03-20

    Gene expression in eukaryotic cells is tightly regulated at the transcriptional and posttranscriptional levels. Posttranscriptional processes, including pre-mRNA splicing, mRNA export, mRNA turnover, and mRNA translation, are controlled by RNA-binding proteins (RBPs) and noncoding (nc)RNAs. The vast family of ncRNAs comprises diverse regulatory RNAs, such as microRNAs and long noncoding (lnc)RNAs, but also the poorly explored class of circular (circ)RNAs. Although first discovered more than three decades ago by electron microscopy, only the advent of high-throughput RNA-sequencing (RNA-seq) and the development of innovative bioinformatic pipelines have begun to allow the systematic identification of circRNAs (Szabo and Salzman, 2016; Panda et al., 2017b; Panda et al., 2017c). However, the validation of true circRNAs identified by RNA sequencing requires other molecular biology techniques including reverse transcription (RT) followed by conventional or quantitative (q) polymerase chain reaction (PCR), and Northern blot analysis (Jeck and Sharpless, 2014). RT-qPCR analysis of circular RNAs using divergent primers has been widely used for the detection, validation, and sometimes quantification of circRNAs (Abdelmohsen et al., 2015 and 2017; Panda et al., 2017b). As detailed here, divergent primers designed to span the circRNA backsplice junction sequence can specifically amplify the circRNAs and not the counterpart linear RNA. In sum, RT-PCR analysis using divergent primers allows direct detection and quantification of circRNAs.
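
    The reason an amplicon spanning the backsplice junction is specific to the circular form can be shown with a toy sequence. The sketch below is not the authors' protocol: the exon sequence is made up, the function names are hypothetical, and real primer design involves further constraints that are not modelled here.

```python
# Illustrative sketch: a backsplice junction exists only in the circular RNA.
# The exon sequence is invented for the example.

def backsplice_junction(exon_seq: str, flank: int) -> str:
    """Sequence spanning the junction formed when the 3' end of the
    circularized exon(s) is joined to the 5' end: the last `flank`
    bases followed by the first `flank` bases."""
    return exon_seq[-flank:] + exon_seq[:flank]

def contains_circular(circle: str, query: str) -> bool:
    """True if query occurs in the circular molecule, allowing matches
    that wrap around the point where the ends are joined."""
    return query in circle + circle[: len(query) - 1]

exon = "ATGGCCTTAGCAGTCCGATTGACCGTAAGCTTGGATCCAAGT"  # made-up sequence
junction = backsplice_junction(exon, 10)

print(junction)
print("in linear form:  ", junction in exon)
print("in circular form:", contains_circular(exon, junction))
```

    Because the junction places the downstream end of the exon directly before its upstream start, an order that never occurs in the linear transcript, a product spanning it can only arise from the circular template.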

  2. Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions.

    PubMed

    Senol Cali, Damla; Kim, Jeremie S; Ghose, Saugata; Alkan, Can; Mutlu, Onur

    2018-04-02

    Nanopore sequencing technology has the potential to render other sequencing technologies obsolete with its ability to generate long reads and provide portability. However, high error rates of the technology pose a challenge while generating accurate genome assemblies. The tools used for nanopore sequence analysis are of critical importance, as they should overcome the high error rates of the technology. Our goal in this work is to comprehensively analyze current publicly available tools for nanopore sequence analysis to understand their advantages, disadvantages and performance bottlenecks. It is important to understand where the current tools do not perform well to develop better tools. To this end, we (1) analyze the multiple steps and the associated tools in the genome assembly pipeline using nanopore sequence data, and (2) provide guidelines for determining the appropriate tools for each step. Based on our analyses, we make four key observations: (1) the choice of the tool for basecalling plays a critical role in overcoming the high error rates of nanopore sequencing technology. (2) Read-to-read overlap finding tools, GraphMap and Minimap, perform similarly in terms of accuracy. However, Minimap has a lower memory usage, and it is faster than GraphMap. (3) There is a trade-off between accuracy and performance when deciding on the appropriate tool for the assembly step. The fast but less accurate assembler Miniasm can be used for quick initial assembly, and further polishing can be applied on top of it to increase the accuracy, which leads to faster overall assembly. (4) The state-of-the-art polishing tool, Racon, generates high-quality consensus sequences while providing a significant speedup over another polishing tool, Nanopolish. We analyze various combinations of different tools and expose the trade-offs between accuracy, performance, memory usage and scalability. 
We conclude that our observations can guide researchers and practitioners in making conscious and effective choices for each step of the genome assembly pipeline using nanopore sequence data. Also, with the help of bottlenecks we have found, developers can improve the current tools or build new ones that are both accurate and fast, to overcome the high error rates of the nanopore sequencing technology.

  3. Development of a two-step high-resolution melting (HRM) analysis for screening sequence variants associated with resistance to the QoIs, benzimidazoles and dicarboximides in airborne inoculum of Botrytis cinerea.

    PubMed

    Chatzidimopoulos, Michael; Ganopoulos, Ioannis; Vellios, Evangelos; Madesis, Panagiotis; Tsaftaris, Athanasios; Pappas, Athanassios C

    2014-11-01

    A rapid, high-resolution melting (HRM) analysis protocol was developed to detect sequence variations associated with resistance to the QoIs, benzimidazoles and dicarboximides in Botrytis cinerea airborne inoculum. HRM analysis was applied directly in fungal DNA collected from air samplers with selective medium. Three and five different genotypes were detected and classified according to their melting profiles in BenA and bos1 genes associated with resistance to benzimidazoles and dicarboximides, respectively. The sensitivity of the methodology was evident in the case of the QoIs, where genotypes varying either by a single nucleotide polymorphism or an additional 1205-bp intron were separated accurately with a single pair of primers. The developed two-step protocol was completed in 82 min and showed reduced variation in the melting curves' formation. HRM analysis rapidly detected the major mutations found in greenhouse strains providing accurate data for successfully controlling grey mould. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  4. Soil DNA metabarcoding and high-throughput sequencing as a forensic tool: considerations, potential limitations and recommendations.

    PubMed

    Young, J M; Austin, J J; Weyrich, L S

    2017-02-01

    Analysis of physical evidence is typically a deciding factor in forensic casework by establishing what transpired at a scene or who was involved. Forensic geoscience is an emerging multi-disciplinary science that can offer significant benefits to forensic investigations. Soil is a powerful, nearly 'ideal' contact trace evidence, as it is highly individualistic, easy to characterise, has a high transfer and retention probability, and is often overlooked in attempts to conceal evidence. However, many real-life cases encounter close proximity soil samples or soils with low inorganic content, which cannot be easily discriminated based on current physical and chemical analysis techniques. The capability to improve forensic soil discrimination, and identify key indicator taxa from soil using the organic fraction is currently lacking. The development of new DNA sequencing technologies offers the ability to generate detailed genetic profiles from soils and enhance current forensic soil analyses. Here, we discuss the use of DNA metabarcoding combined with high-throughput sequencing (HTS) technology to distinguish between soils from different locations in a forensic context. Specifically, we provide recommendations for best practice, outline the potential limitations encountered in a forensic context and describe the future directions required to integrate soil DNA analysis into casework. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  5. Identification of a cis-Regulatory Element Involved in Phytochrome Down-Regulated Expression of the Pea Small GTPase Gene pra21

    PubMed Central

    Inaba, Takehito; Nagano, Yukio; Sakakibara, Toshihiro; Sasaki, Yukiko

    1999-01-01

    The pra2 gene encodes a pea (Pisum sativum) small GTPase belonging to the YPT/rab family, and its expression is down-regulated by light, mediated by phytochrome. We have isolated and characterized a genomic clone of this gene and constructed a fusion DNA of its 5′-upstream region in front of the gene for firefly luciferase. Using this construct in a transient assay, we determined a pra2 cis-regulatory region sufficient to direct the light down-regulation of the luciferase reporter gene. Both 5′- and internal deletion analyses revealed that the 93-bp sequence between −734 and −642 from the transcriptional start site was important for phytochrome down-regulation. Gain-of-function analysis showed that this 93-bp region could confer light down-regulation when fused to the cauliflower mosaic virus 35S promoter. Furthermore, linker-scanning analysis showed that a 12-bp sequence within the 93-bp region mediated phytochrome down-regulation. Gel-retardation analysis showed the presence of a nuclear factor that was specifically bound to the 12-bp sequence in vitro. These results indicate that this element is a cis-regulatory element involved in phytochrome down-regulated expression. PMID:10364400

  6. Isolation of a polyphenol oxidase (PPO) cDNA from artichoke and expression analysis in wounded artichoke heads.

    PubMed

    Quarta, Angela; Mita, Giovanni; Durante, Miriana; Arlorio, Marco; De Paolis, Angelo

    2013-07-01

    The polyphenol oxidase (PPO) enzyme, which can catalyze the oxidation of phenolics to quinones, has been reported to be involved in undesirable browning in many plant foods. This phenomenon is particularly severe in artichoke heads wounded during the manufacturing process. A full-length cDNA encoding for a putative polyphenol oxidase (designated as CsPPO) along with a 1432 bp sequence upstream of the starting ATG codon was characterized for the first time from [Cynara cardunculus var. scolymus (L.) Fiori]. The 1764 bp CsPPO sequence encodes a putative protein of 587 amino acids with a calculated molecular mass of 65,327 Da and an isoelectric point of 5.50. Analysis of the promoter region revealed the presence of cis-acting elements, some of which are putatively involved in the response to light and wounds. Expression analysis of the gene in wounded capitula indicated that CsPPO was significantly induced after 48 h, even though the browning process had started earlier. This suggests that the early browning event observed in artichoke heads was not directly related to de novo mRNA synthesis. Finally, we provide the complete gene sequence encoding for polyphenol oxidase and the upstream regulative region in artichoke. Copyright © 2013 Elsevier Masson SAS. All rights reserved.

  7. A Single Molecular Beacon Probe Is Sufficient for the Analysis of Multiple Nucleic Acid Sequences

    PubMed Central

    Gerasimova, Yulia V.; Hayson, Aaron; Ballantyne, Jack; Kolpashchikov, Dmitry M.

    2010-01-01

    Molecular beacon (MB) probes are dual-labeled hairpin-shaped oligodeoxyribonucleotides that are extensively used for real-time detection of specific RNA/DNA analytes. In the MB probe, the loop fragment is complementary to the analyte: therefore, a unique probe is required for the analysis of each new analyte sequence. The conjugation of an oligonucleotide with two dyes and subsequent purification procedures add to the cost of MB probes, thus reducing their application in multiplex formats. Here we demonstrate how one MB probe can be used for the analysis of an arbitrary nucleic acid. The approach takes advantage of two oligonucleotide adaptor strands, each of which contains a fragment complementary to the analyte and a fragment complementary to an MB probe. The presence of the analyte leads to association of MB probe and the two DNA strands in quadripartite complex. The MB probe fluorescently reports the formation of this complex. In this design, the MB does not bind the analyte directly; therefore, the MB sequence is independent of the analyte. In this study one universal MB probe was used to genotype three human polymorphic sites. This approach promises to reduce the cost of multiplex real-time assays and improve the accuracy of single-nucleotide polymorphism genotyping. PMID:20665615

  8. A paleomagnetic and paleointensity study on Pleistocene and Pliocene basaltic flows from the Djavakheti Highland (Southern Georgia, Caucasus)

    NASA Astrophysics Data System (ADS)

    Calvo-Rathert, Manuel; Goguitchaichvili, Avto; Bógalo, María-Felicidad; Vegas-Tubía, Néstor; Carrancho, Ángel; Sologashvili, Jemal

    2011-08-01

    New paleomagnetic, rock-magnetic and paleointensity results obtained on samples from 23 basaltic lava flows belonging to four different flow sequences (Mashavera, Kvemo Orozmani, Zemo Karabulaki and Diliska) of Pleistocene and Pliocene age from the eastern Djavakheti Highland, in southern Georgia, are presented. Radiometric dating of these sequences yields ages between 1.8 and 2.18 Ma for Mashavera, 2.07 and 2.58 Ma for Zemo-Karabulakhi and 2.12 and 3.27 Ma for Diliska. No radiometric ages are available for the Kvemo Orozmani sequence, which is considered to be coeval to the Mashavera sequence. Rock-magnetic experiments including measurement of thermomagnetic, hysteresis and IRM-acquisition curves suggest low-Ti titanomagnetite as main carrier of remanence, although a lower Curie-temperature component was also observed in several cases. Reversible and non-reversible curves were recorded in thermomagnetic experiments. Paleomagnetic analysis generally indicated the presence of a single component (mainly in the Mashavera sequence), but also two more or less superimposed components in some other cases. In 21 sites a characteristic component could be determined and all except one were characterised by normal-polarity directions. Flows from the Mashavera sequence had a rather steep inclination (73.1°). Nevertheless, a mean paleomagnetic direction of all four sequences is obtained (D = 8.5°, I = 60.8°, N = 4, α95 = 11.7°, k = 62.7) which agrees with the Plio-Quaternary directions obtained in previous studies in Georgia. The paleomagnetic pole obtained (latitude ϕ = 82.1°, longitude λ = 118.2°, A95 = 8.0°, k = 240.7) agrees with the pole values of both the 0 Ma and the 5 Ma windows of the synthetic Eurasian polar wander path from Besse and Courtillot (2002). In order to analyse the behaviour of secular variation, the scatter of paleosecular variation of virtual geomagnetic poles of both the Mashavera flow and all 18 studied flows of Pleistocene age was calculated. It could be observed that both data-sets seem to fit the expected scatter at latitude 41°N well. Paleointensity experiments were carried out with the Coe modification of the Thellier method. Twenty-five out of 84 samples (30%) provided reliable paleointensity results. These successful results were mainly obtained in the Mashavera sequence. Most flows yielded paleointensity results in the 30-45 μT range, in accordance with expected Pliocene to present day intensities. Two flows, however, located near the top of the Mashavera sequence yield high paleointensity values around 60 μT. Anomalous paleointensity results in the upper-lying Mashavera flows, together with the steep inclinations observed in that sequence, could perhaps signal the near onset of the Olduvai-Matuyama reversal.
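
    The quantities reported for the mean direction (D, I, N, α95, k) are the usual outputs of Fisher statistics on unit vectors. The following is a minimal sketch of that calculation, assuming the standard Fisher formulas; the function names and the two input directions in the test are illustrative, and the per-flow directions of the study are not given in this record.

```python
import math

def fisher_mean(directions):
    """Mean direction of (declination, inclination) pairs given in degrees.

    Returns (mean_dec, mean_inc, R, k), where R is the resultant length and
    k = (N - 1) / (N - R) is the usual estimate of the precision parameter.
    """
    x = y = z = 0.0
    for dec, inc in directions:
        d, i = math.radians(dec), math.radians(inc)
        x += math.cos(i) * math.cos(d)  # north component
        y += math.cos(i) * math.sin(d)  # east component
        z += math.sin(i)                # down component
    n = len(directions)
    r = math.sqrt(x * x + y * y + z * z)
    mean_dec = math.degrees(math.atan2(y, x)) % 360.0
    mean_inc = math.degrees(math.atan2(z, math.hypot(x, y)))
    k = (n - 1) / (n - r) if n > r else float("inf")
    return mean_dec, mean_inc, r, k

def alpha95(n, k):
    """95% confidence cone (degrees) from N >= 2 directions and precision k."""
    r = n - (n - 1) / k  # invert k = (N - 1) / (N - R)
    cos_a = 1.0 - (n - r) / r * (20.0 ** (1.0 / (n - 1)) - 1.0)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))
```

    With N = 4 and k = 62.7, `alpha95` gives roughly 11.7°, which is consistent with the α95 quoted for the mean direction above.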

  9. A comparative analysis of soft computing techniques for gene prediction.

    PubMed

    Goel, Neelam; Singh, Shailendra; Aseri, Trilok Chand

    2013-07-01

    The rapid growth of genomic sequence data for both human and nonhuman species has made analyzing these sequences, especially predicting genes in them, very important, and it is currently the focus of many research efforts. Besides its scientific interest in the molecular biology and genomics community, gene prediction is of considerable importance in human health and medicine. A variety of gene prediction techniques have been developed for eukaryotes over the past few years. This article reviews and analyzes the application of certain soft computing techniques in gene prediction. First, the problem of gene prediction and its challenges are described. These are followed by different soft computing techniques along with their application to gene prediction. In addition, a comparative analysis of different soft computing techniques for gene prediction is given. Finally, some limitations of the current research activities and future research directions are provided. Copyright © 2013 Elsevier Inc. All rights reserved.

  10. Randomizing world trade. II. A weighted network analysis

    NASA Astrophysics Data System (ADS)

    Squartini, Tiziano; Fagiolo, Giorgio; Garlaschelli, Diego

    2011-10-01

    Based on the misleading expectation that weighted network properties always offer a more complete description than purely topological ones, current economic models of the International Trade Network (ITN) generally aim at explaining local weighted properties, not local binary ones. Here we complement our analysis of the binary projections of the ITN by considering its weighted representations. We show that, unlike the binary case, all possible weighted representations of the ITN (directed and undirected, aggregated and disaggregated) cannot be traced back to local country-specific properties, which are therefore of limited informativeness. Our two papers show that traditional macroeconomic approaches systematically fail to capture the key properties of the ITN. In the binary case, they do not focus on the degree sequence and hence cannot characterize or replicate higher-order properties. In the weighted case, they generally focus on the strength sequence, but the knowledge of the latter is not enough in order to understand or reproduce indirect effects.
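
    The two local properties contrasted here, the degree sequence (binary) and the strength sequence (weighted), can be illustrated on a toy directed network. This is only a sketch: the weight matrix is hypothetical and is not trade data, and only the outgoing direction is shown.

```python
def out_degree_sequence(weights):
    """Number of outgoing links per node; weights[i][j] > 0 means a link i -> j."""
    return [sum(1 for w in row if w > 0) for row in weights]

def out_strength_sequence(weights):
    """Total outgoing weight per node."""
    return [sum(row) for row in weights]

# Hypothetical 3-node weighted directed network (illustrative values only).
W = [
    [0, 5, 0],
    [2, 0, 1],
    [0, 0, 0],
]

degrees = out_degree_sequence(W)      # binary, topological information
strengths = out_strength_sequence(W)  # weighted information
```

    Two networks can share a strength sequence while differing in their degree sequence, which is why the binary and weighted descriptions are treated separately.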

  11. Genome analysis of the platypus reveals unique signatures of evolution.

    PubMed

    Warren, Wesley C; Hillier, LaDeana W; Marshall Graves, Jennifer A; Birney, Ewan; Ponting, Chris P; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P; Miethke, Pat; Waters, Paul D; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S; López-Otín, Carlos; Ordóñez, Gonzalo R; Eichler, Evan E; Chen, Lin; Cheng, Ze; Deakin, Janine E; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T; Wakefield, Matthew J; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A; Smit, Arian F A; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A; Walker, Jerilyn A; Konkel, Miriam K; Harris, Robert S; Whittington, Camilla M; Wong, Emily S W; Gemmell, Neil J; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M; Sharp, Julie A; Nicholas, Kevin R; Ray, David A; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H; Taylor, James; Jones, Russell C; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N; Pohl, Craig S; Smith, Scott M; Hou, Shunfeng; Nefedov, Mikhail; de Jong, Pieter J; Renfree, Marilyn B; Mardis, Elaine R; Wilson, Richard K

    2008-05-08

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.

  12. Genome analysis of the platypus reveals unique signatures of evolution

    PubMed Central

    Warren, Wesley C.; Hillier, LaDeana W.; Marshall Graves, Jennifer A.; Birney, Ewan; Ponting, Chris P.; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T.; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P.; Miethke, Pat; Waters, Paul D.; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S.; López-Otín, Carlos; Ordóñez, Gonzalo R.; Eichler, Evan E.; Chen, Lin; Cheng, Ze; Deakin, Janine E.; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T.; Wakefield, Matthew J.; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A.; Smit, Arian F. A.; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A.; Walker, Jerilyn A.; Konkel, Miriam K.; Harris, Robert S.; Whittington, Camilla M.; Wong, Emily S. W.; Gemmell, Neil J.; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M.; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P.; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J.; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M.; Sharp, Julie A.; Nicholas, Kevin R.; Ray, David A.; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H.; Taylor, James; Jones, Russell C.; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N.; Pohl, Craig S.; Smith, Scott M.; Hou, Shunfeng; Renfree, Marilyn B.; Mardis, Elaine R.; Wilson, Richard K.

    2009-01-01

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation. PMID:18464734

  13. Progressive Recruitment of Mesenchymal Progenitors Reveals a Time-Dependent Process of Cell Fate Acquisition in Mouse and Human Nephrogenesis.

    PubMed

    Lindström, Nils O; De Sena Brandine, Guilherme; Tran, Tracy; Ransick, Andrew; Suh, Gio; Guo, Jinjin; Kim, Albert D; Parvez, Riana K; Ruffins, Seth W; Rutledge, Elisabeth A; Thornton, Matthew E; Grubbs, Brendan; McMahon, Jill A; Smith, Andrew D; McMahon, Andrew P

    2018-06-04

    Mammalian nephrons arise from a limited nephron progenitor pool through a reiterative inductive process extending over days (mouse) or weeks (human) of kidney development. Here, we present evidence that human nephron patterning reflects a time-dependent process of recruitment of mesenchymal progenitors into an epithelial nephron precursor. Progressive recruitment predicted from high-resolution image analysis and three-dimensional reconstruction of human nephrogenesis was confirmed through direct visualization and cell fate analysis of mouse kidney organ cultures. Single-cell RNA sequencing of the human nephrogenic niche provided molecular insights into these early patterning processes and predicted developmental trajectories adopted by nephron progenitor cells in forming segment-specific domains of the human nephron. The temporal-recruitment model for nephron polarity and patterning suggested by direct analysis of human kidney development provides a framework for integrating signaling pathways driving mammalian nephrogenesis. Copyright © 2018 Elsevier Inc. All rights reserved.

  14. Application of optical correlation techniques to particle imaging velocimetry

    NASA Technical Reports Server (NTRS)

    Wernet, Mark P.; Edwards, Robert V.

    1988-01-01

    Pulsed laser sheet velocimetry yields nonintrusive measurements of velocity vectors across an extended 2-dimensional region of the flow field. The application of optical correlation techniques to the analysis of multiple exposure laser light sheet photographs can reduce and/or simplify the data reduction time and hardware. Here, Matched Spatial Filters (MSF) are used in a pattern recognition system. Usually MSFs are used to identify assembly-line parts. In this application, the MSFs are used to identify the iso-velocity vector contours in the flow. The patterns to be recognized are the recorded particle images in a pulsed laser light sheet photograph. Measurement of the direction of the particle image displacements between exposures yields the velocity vector. The particle image exposure sequence is designed such that the velocity vector direction is determined unambiguously. A global analysis technique is used in comparison to the more common particle tracking algorithms and Young's fringe analysis technique.

  15. Google matrix analysis of directed networks

    NASA Astrophysics Data System (ADS)

    Ermann, Leonardo; Frahm, Klaus M.; Shepelyansky, Dima L.

    2015-10-01

    In the past decade modern societies have developed enormous communication and social networks. Their classification and information retrieval processing has become a formidable task for the society. Because of the rapid growth of the World Wide Web, and social and communication networks, new mathematical methods have been invented to characterize the properties of these networks in a more detailed and precise way. Various search engines extensively use such methods. It is highly important to develop new tools to classify and rank a massive amount of network information in a way that is adapted to internal network structures and characteristics. This review describes the Google matrix analysis of directed complex networks demonstrating its efficiency using various examples including the World Wide Web, Wikipedia, software architectures, world trade, social and citation networks, brain neural networks, DNA sequences, and Ulam networks. The analytical and numerical matrix methods used in this analysis originate from the fields of Markov chains, quantum chaos, and random matrix theory.
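
    As an illustration of the construction the review is about, the sketch below builds a Google matrix from an adjacency matrix and finds its PageRank vector by power iteration. It assumes the commonly used form G = αS + (1 − α)/N, with dangling nodes replaced by uniform columns and a damping factor α = 0.85; the function names and the pure-Python representation are choices made for this example.

```python
def google_matrix(adj, alpha=0.85):
    """Build G from adj, where adj[i][j] = 1 if node j links to node i.

    Columns of adj are normalised to sum to 1; a column with no outgoing
    links (dangling node) is replaced by the uniform column 1/N.
    """
    n = len(adj)
    g = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col_sum = sum(adj[i][j] for i in range(n))
        for i in range(n):
            s_ij = adj[i][j] / col_sum if col_sum > 0 else 1.0 / n
            g[i][j] = alpha * s_ij + (1.0 - alpha) / n
    return g

def pagerank(g, iterations=100):
    """Power iteration p <- G p, starting from the uniform vector."""
    n = len(g)
    p = [1.0 / n] * n
    for _ in range(iterations):
        p = [sum(g[i][j] * p[j] for j in range(n)) for i in range(n)]
    return p
```

    Because every column of G sums to 1, the iteration preserves the normalisation of p, so the result can be read as a probability distribution over nodes.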

  16. Ambient ionisation mass spectrometry for in situ analysis of intact proteins

    PubMed Central

    Kocurek, Klaudia I.; Griffiths, Rian L.

    2018-01-01

    Ambient surface mass spectrometry is an emerging field which shows great promise for the analysis of biomolecules directly from their biological substrate. In this article, we describe ambient ionisation mass spectrometry techniques for the in situ analysis of intact proteins. As a broad approach, the analysis of intact proteins offers unique advantages for the determination of primary sequence variations and posttranslational modifications, as well as interrogation of tertiary and quaternary structure and protein‐protein/ligand interactions. In situ analysis of intact proteins offers the potential to couple these advantages with information relating to their biological environment, for example, their spatial distributions within healthy and diseased tissues. Here, we describe the techniques most commonly applied to in situ protein analysis (liquid extraction surface analysis, continuous flow liquid microjunction surface sampling, nano desorption electrospray ionisation, and desorption electrospray ionisation), their advantages, and limitations and describe their applications to date. We also discuss the incorporation of ion mobility spectrometry techniques (high field asymmetric waveform ion mobility spectrometry and travelling wave ion mobility spectrometry) into ambient workflows. Finally, future directions for the field are discussed. PMID:29607564

  17. Biosynthesis and genetic encoding of phosphothreonine through parallel selection and deep sequencing

    PubMed Central

    Huguenin-Dezot, Nicolas; Liang, Alexandria D.; Schmied, Wolfgang H.; Rogerson, Daniel T.; Chin, Jason W.

    2017-01-01

    The phosphorylation of threonine residues in proteins regulates diverse processes in eukaryotic cells, and thousands of threonine phosphorylations have been identified. An understanding of how threonine phosphorylation regulates biological function will be accelerated by general methods to bio-synthesize defined phospho-proteins. Here we address limitations in current methods for discovering aminoacyl-tRNA synthetase/tRNA pairs for incorporating non-natural amino acids into proteins, by combining parallel positive selections with deep sequencing and statistical analysis, to create a rapid approach for directly discovering aminoacyl-tRNA synthetase/tRNA pairs that selectively incorporate non-natural substrates. Our approach is scalable and enables the direct discovery of aminoacyl-tRNA synthetase/tRNA pairs with mutually orthogonal substrate specificity. We biosynthesize phosphothreonine in cells, and use our new selection approach to discover a phosphothreonyl-tRNA synthetase/tRNACUA pair. By combining these advances we create an entirely biosynthetic route to incorporating phosphothreonine in proteins and biosynthesize several phosphoproteins; enabling phosphoprotein structure determination and synthetic protein kinase activation. PMID:28553966

  18. Geomagnetic paleointensities from excursion sequences in lavas on Oahu, Hawaii

    USGS Publications Warehouse

    Coe, Robert S.; Gromme, Sherman; Mankinen, Edward A.

    1984-01-01

    Paleomagnetic data demonstrating three late Tertiary excursions in the direction of the geomagnetic field recorded in sequences of basaltic lavas on the island of Oahu, Hawaii were published by R. R. Doell and G. B. Dalrymple in 1973. We have determined geomagnetic paleointensities by the Thelliers' method for 14 lavas from the three sites. During these experiments, considerable difficulty was encountered because of the presence of titanomaghemite in many lavas and the contamination of natural remanent magnetization by lightning in many others. Moreover, we often observed the production of spurious high‐temperature chemical remanent magnetization during the Thellier experiments. An analysis of this particularly troublesome problem is presented. Two of the sites showed low paleointensities associated with angular departures of the paleomagnetic field direction from that of a geocentric axial dipole, which suggests that these excursions represent aborted reversals or fragments of reversals. At the third site, however, the paleointensity did not become low as the field diverged. This excursion may reflect the variation of a large nondipole source near Hawaii.

  19. Frequency-dependent seismic attenuation in the eastern United States as observed from the 2011 central Virginia earthquake and aftershock sequence

    USGS Publications Warehouse

    McNamara, Daniel E.; Gee, Lind; Benz, Harley M.; Chapman, Martin

    2014-01-01

    Ground shaking due to earthquakes in the eastern United States (EUS) is felt at significantly greater distances than in the western United States (WUS) and for some earthquakes it has been shown to display a strong preferential direction. Shaking intensity variation can be due to propagation path effects, source directivity, and/or site amplification. In this paper, we use S and Lg waves recorded from the 2011 central Virginia earthquake and aftershock sequence, in the Central Virginia Seismic Zone, to quantify attenuation as frequency‐dependent Q(f). In support of observations based on shaking intensity, we observe high Q values in the EUS relative to previous studies in the WUS with especially efficient propagation along the structural trend of the Appalachian mountains. Our analysis of Q(f) quantifies the path effects of the northeast‐trending felt distribution previously inferred from the U.S. Geological Survey (USGS) “Did You Feel It” data, historic intensity data, and the asymmetrical distribution of rockfalls and landslides.

  20. Ribosomal frameshifting used in influenza A virus expression occurs within the sequence UCC_UUU_CGU and is in the +1 direction.

    PubMed

    Firth, A E; Jagger, B W; Wise, H M; Nelson, C C; Parsawar, K; Wills, N M; Napthine, S; Taubenberger, J K; Digard, P; Atkins, J F

    2012-10-01

    Programmed ribosomal frameshifting is used in the expression of many virus genes and some cellular genes. In eukaryotic systems, the most well-characterized mechanism involves -1 tandem tRNA slippage on an X_XXY_YYZ motif. By contrast, the mechanisms involved in programmed +1 (or -2) slippage are more varied and often poorly characterized. Recently, a novel gene, PA-X, was discovered in influenza A virus and found to be expressed via a shift to the +1 reading frame. Here, we identify, by mass spectrometric analysis, both the site (UCC_UUU_CGU) and direction (+1) of the frameshifting that is involved in PA-X expression. Related sites are identified in other virus genes that have previously been proposed to be expressed via +1 frameshifting. As these viruses infect insects (chronic bee paralysis virus), plants (fijiviruses and amalgamaviruses) and vertebrates (influenza A virus), such motifs may form a new class of +1 frameshift-inducing sequences that are active in diverse eukaryotes.
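
    The two sequence patterns named in this abstract can be searched for with a few lines of code. The sketch below is an illustration only: it treats X_XXY_YYZ as "three identical nucleotides, three identical nucleotides, any nucleotide", without the further restrictions on X, Y and Z or on reading frame that a real analysis would apply, and the function names are hypothetical.

```python
def find_slippery_heptamers(rna):
    """Return (index, heptamer) for every XXXYYYZ-like heptamer in the sequence."""
    rna = rna.upper().replace("T", "U")
    hits = []
    for i in range(len(rna) - 6):
        h = rna[i:i + 7]
        # XXX: positions 0-2 identical; YYY: positions 3-5 identical; Z: any base.
        if h[0] == h[1] == h[2] and h[3] == h[4] == h[5]:
            hits.append((i, h))
    return hits

def find_plus_one_site(rna, site="UCCUUUCGU"):
    """Return the index of the first UCC_UUU_CGU occurrence, or -1 if absent."""
    return rna.upper().replace("T", "U").find(site)
```

    Both functions accept DNA or RNA letters; T is converted to U before matching.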

  1. GAP Final Technical Report 12-14-04

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Andrew J. Bordner, PhD, Senior Research Scientist

    2004-12-14

    The Genomics Annotation Platform (GAP) was designed to develop new tools for high throughput functional annotation and characterization of protein sequences and structures resulting from genomics and structural proteomics, benchmarking and application of those tools. Furthermore, this platform integrated the genomic scale sequence and structural analysis and prediction tools with the advanced structure prediction and bioinformatics environment of ICM. The development of GAP was primarily oriented towards the annotation of new biomolecular structures using both structural and sequence data. Even though the amount of protein X-ray crystal data is growing exponentially, the volume of sequence data is growing even more rapidly. This trend was exploited by leveraging the wealth of sequence data to provide functional annotation for protein structures. The additional information provided by GAP is expected to assist the majority of the commercial users of ICM, who are involved in drug discovery, in identifying promising drug targets as well as in devising strategies for the rational design of therapeutics directed at the protein of interest. The GAP also provided valuable tools for biochemistry education, and structural genomics centers. In addition, GAP incorporates many novel prediction and analysis methods not available in other molecular modeling packages. This development led to signing the first Molsoft agreement in the structural genomics annotation area with the University of Oxford Structural Genomics Center. This commercial agreement validated the Molsoft efforts under the GAP project and provided the basis for further development of the large scale functional annotation platform.

  2. DNA Data Visualization (DDV): Software for Generating Web-Based Interfaces Supporting Navigation and Analysis of DNA Sequence Data of Entire Genomes.

    PubMed

    Neugebauer, Tomasz; Bordeleau, Eric; Burrus, Vincent; Brzezinski, Ryszard

    2015-01-01

    Data visualization methods are necessary during the exploration and analysis activities of an increasingly data-intensive scientific process. There are few existing visualization methods for raw nucleotide sequences of a whole genome or chromosome. Software for data visualization should allow the researchers to create accessible data visualization interfaces that can be exported and shared with others on the web. Herein, novel software developed for generating DNA data visualization interfaces is described. The software converts DNA data sets into images that are further processed as multi-scale images to be accessed through a web-based interface that supports zooming, panning and sequence fragment selection. Nucleotide composition frequencies and GC skew of a selected sequence segment can be obtained through the interface. The software was used to generate DNA data visualization of human and bacterial chromosomes. Examples of visually detectable features such as short and long direct repeats, long terminal repeats, mobile genetic elements, heterochromatic segments in microbial and human chromosomes, are presented. The software and its source code are available for download and further development. The visualization interfaces generated with the software allow for the immediate identification and observation of several types of sequence patterns in genomes of various sizes and origins. The visualization interfaces generated with the software are readily accessible through a web browser. This software is a useful research and teaching tool for genetics and structural genomics.
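
    Nucleotide composition and GC skew for a selected segment are simple to compute. The sketch below assumes the common definition of GC skew, (G − C)/(G + C); the record does not state which definition the DDV software uses, and the function names are illustrative rather than part of that software.

```python
from collections import Counter

def composition(seq):
    """Fraction of A, C, G and T in the sequence (other symbols are ignored in the output)."""
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)
    if n == 0:
        return {base: 0.0 for base in "ACGT"}
    return {base: counts.get(base, 0) / n for base in "ACGT"}

def gc_skew(seq):
    """(G - C) / (G + C); returns 0.0 when the segment has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0
```

    Applied to consecutive windows along a chromosome, the same function would give a skew profile of the kind such interfaces display.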

  3. Characterization of a novel ADAM protease expressed by Pneumocystis carinii.

    PubMed

    Kennedy, Cassie C; Kottom, Theodore J; Limper, Andrew H

    2009-08-01

    Pneumocystis species are opportunistic fungal pathogens that cause severe pneumonia in immunocompromised hosts. Recent evidence has suggested that unidentified proteases are involved in Pneumocystis life cycle regulation. Proteolytically active ADAM (named for "a disintegrin and metalloprotease") family molecules have been identified in some fungal organisms, such as Aspergillus fumigatus and Schizosaccharomyces pombe, and some have been shown to participate in life cycle regulation. Accordingly, we sought to characterize ADAM-like molecules in the fungal opportunistic pathogen, Pneumocystis carinii (PcADAM). After an in silico search of the P. carinii genomic sequencing project identified a 329-bp partial sequence with homology to known ADAM proteins, the full-length PcADAM sequence was obtained by PCR extension cloning, yielding a final coding sequence of 1,650 bp. Sequence analysis detected the presence of a typical ADAM catalytic active site (HEXXHXXGXXHD). Expression of PcADAM over the Pneumocystis life cycle was analyzed by Northern blot. Southern and contour-clamped homogeneous electric field blot analysis demonstrated its presence in the P. carinii genome. Expression of PcADAM was observed to be increased in Pneumocystis cysts compared to trophic forms. The full-length gene was subsequently cloned and heterologously expressed in Saccharomyces cerevisiae. Purified PcADAMp protein was proteolytically active in casein zymography, requiring divalent zinc. Furthermore, native PcADAMp extracted directly from freshly isolated Pneumocystis organisms also exhibited protease activity. This is the first report of protease activity attributable to a specific, characterized protein in the clinically important opportunistic fungal pathogen Pneumocystis.
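
    A consensus such as HEXXHXXGXXHD, where X stands for any residue, maps directly onto a regular expression. The sketch below shows one way to scan a protein sequence for it; the function name and the test peptide are hypothetical and are not taken from the PcADAM sequence.

```python
import re

# X in the consensus means "any residue", written as "." in the regex.
# The lookahead lets overlapping matches be reported as well.
ADAM_SITE = re.compile(r"(?=HE..H..G..HD)")

def find_adam_sites(protein):
    """Return the 0-based start positions of HEXXHXXGXXHD matches."""
    return [m.start() for m in ADAM_SITE.finditer(protein.upper())]
```

    A hit only shows that the motif is present in the sequence; protease activity itself was established experimentally in the study.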

  4. Molecular Characterization of “Candidatus Parilichlamydia carangidicola,” a Novel Chlamydia-Like Epitheliocystis Agent in Yellowtail Kingfish, Seriola lalandi (Valenciennes), and the Proposal of a New Family, “Candidatus Parilichlamydiaceae” fam. nov. (Order Chlamydiales)

    PubMed Central

    Polkinghorne, A.; Miller, T. L.; Groff, J. M.; LaPatra, S. E.; Nowak, B. F.

    2013-01-01

    Three cohorts of farmed yellowtail kingfish (Seriola lalandi) from South Australia were examined for Chlamydia-like organisms associated with epitheliocystis. To characterize the bacteria, 38 gill samples were processed for histopathology, electron microscopy, and 16S rRNA amplification, sequencing, and phylogenetic analysis. Microscopically, the presence of membrane-enclosed cysts was observed within the gill lamellae. Also observed was hyperplasia of the epithelial cells with cytoplasmic vacuolization and fusion of the gill lamellae. Transmission electron microscopy revealed morphological features of the reticulate and intermediate bodies typical of members of the order Chlamydiales. A novel 1,393-bp 16S chlamydial rRNA sequence was amplified from gill DNA extracted from fish in all cohorts over a 3-year period that corresponded to the 16S rRNA sequence amplified directly from laser-dissected cysts. This sequence was only 87% similar to the reported “Candidatus Piscichlamydia salmonis” (AY462244) from Atlantic salmon and Arctic charr. Phylogenetic analysis of this sequence against 35 Chlamydia and Chlamydia-like bacteria revealed that this novel bacterium belongs to an undescribed family lineage in the order Chlamydiales. Based on these observations, we propose this bacterium of yellowtail kingfish be known as “Candidatus Parilichlamydia carangidicola” and that the new family be known as “Candidatus Parilichlamydiaceae.” PMID:23275507

  5. Comparative genomics of the pIPO2/pSB102 family of environmental plasmids: sequence, evolution, and ecology of pTer331 isolated from Collimonas fungivorans Ter331.

    PubMed

    Mela, Francesca; Fritsche, Kathrin; Boersma, Hidde; van Elsas, Jan D; Bartels, Daniela; Meyer, Folker; de Boer, Wietse; van Veen, Johannes A; Leveau, Johan H J

    2008-10-01

    Plasmid pTer331 from the bacterium Collimonas fungivorans Ter331 is a new member of the pIPO2/pSB102 family of environmental plasmids. The 40 457-bp sequence of pTer331 codes for 44 putative ORFs, most of which represent genes involved in replication, partitioning and transfer of the plasmid. We confirmed that pTer331 is stably maintained in its native host. Deletion analysis identified a mini-replicon capable of replicating autonomously in Escherichia coli and Pseudomonas putida. Furthermore, plasmid pTer331 was able to mobilize and retromobilize IncQ plasmid pSM1890 at typical rates of 10^-4 and 10^-8, respectively. Analysis of the 91% DNA sequence identity between pTer331 and pIPO2 revealed functional conservation of coding sequences, the deletion of DNA fragments flanked by short direct repeats (DR), and sequence preservation of long DRs. In addition, we experimentally established that pTer331 makes no obvious contribution to several of the phenotypes that are characteristic of its host C. fungivorans Ter331, including the ability to efficiently colonize plant roots. Based on our findings, we hypothesize that cryptic plasmids such as pTer331 and pIPO2 might not confer an individual advantage to bacteria, but, due to their broad host range and ability to retromobilize, benefit bacterial populations by accelerating the intracommunal dissemination of the mobile gene pool.

  6. Relationships between functional genes in Lactobacillus delbrueckii ssp. bulgaricus isolates and phenotypic characteristics associated with fermentation time and flavor production in yogurt elucidated using multilocus sequence typing.

    PubMed

    Liu, Wenjun; Yu, Jie; Sun, Zhihong; Song, Yuqin; Wang, Xueni; Wang, Hongmei; Wuren, Tuoya; Zha, Musu; Menghe, Bilige; Heping, Zhang

    2016-01-01

    Lactobacillus delbrueckii ssp. bulgaricus (L. bulgaricus) is well known for its worldwide application in yogurt production. Flavor production and acid production are considered the most important characteristics for starter culture screening. To our knowledge, this is the first study applying functional gene sequence multilocus sequence typing technology to predict the fermentation and flavor-producing characteristics of yogurt-producing bacteria. In the present study, phenotypic characteristics of 35 L. bulgaricus strains were quantified during the fermentation of milk to yogurt and during its subsequent storage; these included fermentation time, acidification rate, pH, titratable acidity, and flavor characteristics (acetaldehyde concentration). Furthermore, multilocus sequence typing analysis of 7 functional genes associated with fermentation time, acid production, and flavor formation was done to elucidate the phylogeny and genetic evolution of the same L. bulgaricus isolates. The results showed that strains significantly differed in fermentation time, acidification rate, and acetaldehyde production. Combining functional gene sequence analysis with phenotypic characteristics demonstrated that groups of strains established using genotype data were consistent with groups identified based on their phenotypic traits. This study has established an efficient and rapid molecular genotyping method to identify strains with good fermentation traits; this has the potential to replace time-consuming conventional methods based on direct measurement of phenotypic traits. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  7. Isolation and molecular characterization of dTnp1, a mobile and defective transposable element of Nicotiana plumbaginifolia.

    PubMed

    Meyer, C; Pouteau, S; Rouzé, P; Caboche, M

    1994-01-01

    By Northern blot analysis of nitrate reductase-deficient mutants of Nicotiana plumbaginifolia, we identified a mutant (mutant D65), obtained after gamma-ray irradiation of protoplasts, which contained an insertion sequence in the nitrate reductase (NR) mRNA. This insertion sequence was localized by polymerase chain reaction (PCR) in the first exon of NR and was also shown to be present in the NR gene. The mutant gene contained a 565 bp insertion sequence that exhibits the sequence characteristics of a transposable element, which was thus named dTnp1. The dTnp1 element has 14 bp terminal inverted repeats and is flanked by an 8-bp target site duplication generated upon transposition. These inverted repeats have significant sequence homology with those of other transposable elements. Judging by its size and the absence of a long open reading frame, dTnp1 appears to represent a defective, although mobile, transposable element. The octamer motif TTTAGGCC was found several times in direct orientation near the 5' and 3' ends of dTnp1 together with a perfect palindrome located after the 5' inverted repeat. Southern blot analysis using an internal probe of dTnp1 suggested that this element occurs as a single copy in the genome of N. plumbaginifolia. It is also present in N. tabacum, but absent in tomato or petunia. The dTnp1 element is therefore of potential use for gene tagging in Nicotiana species.
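    The two structural hallmarks named in the abstract above, terminal inverted repeats and a target site duplication, can be checked mechanically once an element and its flanking DNA are available as strings. The following is a minimal sketch only: the function names and the toy sequence are invented for illustration and are not the dTnp1 sequence or the authors' procedure.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def has_terminal_inverted_repeats(element, length):
    """True if the first `length` bases equal the reverse complement of the last `length` bases."""
    return element[:length].upper() == reverse_complement(element[-length:])


def has_target_site_duplication(left_flank, right_flank, length):
    """True if the `length` bases just before the element are repeated just after it."""
    return left_flank[-length:].upper() == right_flank[:length].upper()


# Toy element (hypothetical, not dTnp1): an 8-bp terminal repeat around a short core.
toy_element = "GGCCTTAA" + "ACGTACGTAC" + reverse_complement("GGCCTTAA")
print(has_terminal_inverted_repeats(toy_element, 8))
print(has_target_site_duplication("TTGACCGTA", "GACCGTATT", 7))
```

    A real analysis would normally tolerate mismatches in the repeats; this sketch requires exact matches to keep the logic visible.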

  8. Single-Cell Genomics: Approaches and Utility in Immunology.

    PubMed

    Neu, Karlynn E; Tang, Qingming; Wilson, Patrick C; Khan, Aly A

    2017-02-01

    Single-cell genomics offers powerful tools for studying immune cells, which make it possible to observe rare and intermediate cell states that cannot be resolved at the population level. Advances in computer science and single-cell sequencing technology have created a data-driven revolution in immunology. The challenge for immunologists is to harness computing and turn an avalanche of quantitative data into meaningful discovery of immunological principles, predictive models, and strategies for therapeutics. Here, we review the current literature on computational analysis of single-cell RNA-sequencing data and discuss underlying assumptions, methods, and applications in immunology, and highlight important directions for future research. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Analysis and evaluation in the production process and equipment area of the low-cost solar array project

    NASA Technical Reports Server (NTRS)

    Goldman, H.; Wolf, M.

    1979-01-01

    The energy consumed in manufacturing silicon solar cell modules was calculated for the current process, as well as for 1982 and 1986 projected processes. In addition, energy payback times for the above three sequences are shown. The module manufacturing energy was partitioned two ways. In one way, the silicon reduction, silicon purification, sheet formation, cell fabrication, and encapsulation energies were found. In addition, the facility, equipment, processing material and direct material lost-in-process energies were appropriated in junction formation processes and full module manufacturing sequences. A brief methodology accounting for the energy of silicon wafers lost-in-processing during cell manufacturing is described.

  10. De Novo Transcriptome Sequence Assembly from Coconut Leaves and Seeds with a Focus on Factors Involved in RNA-Directed DNA Methylation

    PubMed Central

    Huang, Ya-Yi; Lee, Chueh-Pai; Fu, Jason L.; Chang, Bill Chia-Han; Matzke, Antonius J. M.; Matzke, Marjori

    2014-01-01

    Coconut palm (Cocos nucifera) is a symbol of the tropics and a source of numerous edible and nonedible products of economic value. Despite its nutritional and industrial significance, coconut remains under-represented in public repositories for genomic and transcriptomic data. We report de novo transcript assembly from RNA-seq data and analysis of gene expression in seed tissues (embryo and endosperm) and leaves of a dwarf coconut variety. Assembly of 10 GB sequencing data for each tissue resulted in 58,211 total unigenes in embryo, 61,152 in endosperm, and 33,446 in leaf. Within each unigene pool, 24,857 could be annotated in embryo, 29,731 could be annotated in endosperm, and 26,064 could be annotated in leaf. A KEGG analysis identified 138, 138, and 139 pathways, respectively, in transcriptomes of embryo, endosperm, and leaf tissues. Given the extraordinarily large size of coconut seeds and the importance of small RNA-mediated epigenetic regulation during seed development in model plants, we used homology searches to identify putative homologs of factors required for RNA-directed DNA methylation in coconut. The findings suggest that RNA-directed DNA methylation is important during coconut seed development, particularly in maturing endosperm. This dataset will expand the genomics resources available for coconut and provide a foundation for more detailed analyses that may assist molecular breeding strategies aimed at improving this major tropical crop. PMID:25193496

  11. De novo transcriptome sequence assembly from coconut leaves and seeds with a focus on factors involved in RNA-directed DNA methylation.

    PubMed

    Huang, Ya-Yi; Lee, Chueh-Pai; Fu, Jason L; Chang, Bill Chia-Han; Matzke, Antonius J M; Matzke, Marjori

    2014-09-04

    Coconut palm (Cocos nucifera) is a symbol of the tropics and a source of numerous edible and nonedible products of economic value. Despite its nutritional and industrial significance, coconut remains under-represented in public repositories for genomic and transcriptomic data. We report de novo transcript assembly from RNA-seq data and analysis of gene expression in seed tissues (embryo and endosperm) and leaves of a dwarf coconut variety. Assembly of 10 GB sequencing data for each tissue resulted in 58,211 total unigenes in embryo, 61,152 in endosperm, and 33,446 in leaf. Within each unigene pool, 24,857 could be annotated in embryo, 29,731 could be annotated in endosperm, and 26,064 could be annotated in leaf. A KEGG analysis identified 138, 138, and 139 pathways, respectively, in transcriptomes of embryo, endosperm, and leaf tissues. Given the extraordinarily large size of coconut seeds and the importance of small RNA-mediated epigenetic regulation during seed development in model plants, we used homology searches to identify putative homologs of factors required for RNA-directed DNA methylation in coconut. The findings suggest that RNA-directed DNA methylation is important during coconut seed development, particularly in maturing endosperm. This dataset will expand the genomics resources available for coconut and provide a foundation for more detailed analyses that may assist molecular breeding strategies aimed at improving this major tropical crop. Copyright © 2014 Huang et al.

  12. Variable-Number Tandem Repeats That Are Useful in Genotyping Isolates of Salmonella enterica subsp. enterica Serovars Typhimurium and Newport

    PubMed Central

    Witonski, D.; Stefanova, R.; Ranganathan, A.; Schutze, G. E.; Eisenach, K. D.; Cave, M. D.

    2006-01-01

    The genome of Salmonella enterica subsp. enterica serovar Typhimurium strain LT2 was analyzed for direct repeats, and 54 sequences containing variable-number tandem repeat loci were identified. Ten primer pairs that anneal upstream and downstream of each selected locus were designed and used to amplify PCR targets in isolates of S. enterica serovars Typhimurium and Newport. Four of the 10 loci did not show polymorphism in the length of products. Six loci were selected for analysis. Isolates of S. enterica serovars Typhimurium and Newport that were related to specific outbreaks and showed identical pulsed-field gel electrophoresis patterns were indistinguishable by the length of the six variable-number tandem repeats. Isolates that differed in their pulsed-field gel electrophoresis patterns showed polymorphism in variable-number tandem repeat profiles. Length of the products was confirmed by DNA sequence analysis. Only 2 of the 10 loci contained exact integers of the direct repeat. Eight loci contained partial copies. The partial copies were maintained at the ends of the variable-number tandem repeat loci in all isolates. In spite of having partial copies that were maintained in all isolates, the number of direct repeats at a locus was polymorphic. Six variable-number tandem repeat loci were useful in distinguishing isolates of S. enterica serovars Typhimurium and Newport that had different pulsed-field gel electrophoresis patterns and in identifying outbreak-associated cases that shared a common pulsed-field gel pattern. PMID:16943354
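    The abstract above distinguishes loci with an exact integer number of repeat copies from loci that also carry a partial copy. As a rough illustration of that arithmetic, the sketch below derives the copy count from a PCR product length. The function name and all lengths are hypothetical; the abstract does not report flank or repeat-unit sizes, and the authors confirmed product lengths by sequencing rather than by this calculation.

```python
def repeat_copies(amplicon_length, flank_length, repeat_unit_length):
    """Split the repeat region of a PCR product into whole copies and leftover bases.

    amplicon_length    -- total PCR product length in bp
    flank_length       -- combined length of both non-repeat flanks (primers included) in bp
    repeat_unit_length -- length of one repeat unit in bp
    Returns (whole_copies, partial_copy_bases).
    """
    repeat_region = amplicon_length - flank_length
    if repeat_region < 0:
        raise ValueError("flanks are longer than the amplicon")
    return divmod(repeat_region, repeat_unit_length)


# Hypothetical locus: 80 bp of flanking sequence and a 9-bp repeat unit.
print(repeat_copies(200, 80, 9))  # whole copies plus a partial copy
print(repeat_copies(170, 80, 9))  # an exact integer number of copies
```

    Two isolates then differ at a locus when their whole-copy counts differ, even if both carry the same partial copy at the end of the array.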

  13. Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood

    PubMed Central

    Fan, H. Christina; Blumenfeld, Yair J.; Chitkara, Usha; Hudgins, Louanne; Quake, Stephen R.

    2008-01-01

    We directly sequenced cell-free DNA with high-throughput shotgun sequencing technology from plasma of pregnant women, obtaining, on average, 5 million sequence tags per patient sample. This enabled us to measure the over- and underrepresentation of chromosomes from an aneuploid fetus. The sequencing approach is polymorphism-independent and therefore universally applicable for the noninvasive detection of fetal aneuploidy. Using this method, we successfully identified all nine cases of trisomy 21 (Down syndrome), two cases of trisomy 18 (Edward syndrome), and one case of trisomy 13 (Patau syndrome) in a cohort of 18 normal and aneuploid pregnancies; trisomy was detected at gestational ages as early as the 14th week. Direct sequencing also allowed us to study the characteristics of cell-free plasma DNA, and we found evidence that this DNA is enriched for sequences from nucleosomes. PMID:18838674
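    The over- or underrepresentation of a chromosome described above can be expressed as the share of sequence tags mapping to that chromosome, compared against the same share in reference pregnancies. The sketch below uses a plain z-score for that comparison. This is a simplifying assumption for illustration: the function names and all counts are invented, and the abstract does not specify the statistic the authors used.

```python
import statistics


def chromosome_fraction(tag_counts, chromosome):
    """Fraction of all mapped tags in a sample that fall on one chromosome."""
    return tag_counts[chromosome] / sum(tag_counts.values())


def representation_z_score(test_fraction, reference_fractions):
    """Standard score of a test sample's fraction against reference samples."""
    mean = statistics.mean(reference_fractions)
    spread = statistics.stdev(reference_fractions)  # sample standard deviation
    return (test_fraction - mean) / spread


# Invented tag counts; "other" lumps together all remaining chromosomes.
sample = {"chr21": 70_000, "other": 4_930_000}
reference = [0.0130, 0.0131, 0.0129]  # invented chr21 fractions from euploid samples

fraction = chromosome_fraction(sample, "chr21")
print(fraction, representation_z_score(fraction, reference))
```

    A large positive score suggests an extra copy of the chromosome in the fetal contribution; any decision threshold would have to be calibrated on real reference data.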

  14. The dynamics of genome replication using deep sequencing

    PubMed Central

    Müller, Carolin A.; Hawkins, Michelle; Retkute, Renata; Malla, Sunir; Wilson, Ray; Blythe, Martin J.; Nakato, Ryuichiro; Komata, Makiko; Shirahige, Katsuhiko; de Moura, Alessandro P.S.; Nieduszynski, Conrad A.

    2014-01-01

    Eukaryotic genomes are replicated from multiple DNA replication origins. We present complementary deep sequencing approaches to measure origin location and activity in Saccharomyces cerevisiae. Measuring the increase in DNA copy number during a synchronous S-phase allowed the precise determination of genome replication. To map origin locations, replication forks were stalled close to their initiation sites; therefore, copy number enrichment was limited to origins. Replication timing profiles were generated from asynchronous cultures using fluorescence-activated cell sorting. Applying this technique we show that the replication profiles of haploid and diploid cells are indistinguishable, indicating that both cell types use the same cohort of origins with the same activities. Finally, increasing sequencing depth allowed the direct measure of replication dynamics from an exponentially growing culture. This is the first time this approach, called marker frequency analysis, has been successfully applied to a eukaryote. These data provide a high-resolution resource and methodological framework for studying genome biology. PMID:24089142
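    The idea behind measuring replication from copy number is that regions replicated early are present in more copies in a replicating culture than regions replicated late. A minimal sketch of that comparison follows: per-bin read counts from a replicating sample are divided by counts from a non-replicating reference after scaling each sample by its total. The function name and the counts are invented, and the published analysis involves further steps not described in the abstract.

```python
def relative_copy_number(replicating_counts, reference_counts):
    """Per-bin ratio of depth-normalized read counts (replicating / reference).

    Bins with no reference reads yield None because the ratio is undefined.
    """
    replicating_total = sum(replicating_counts)
    reference_total = sum(reference_counts)
    ratios = []
    for replicating, reference in zip(replicating_counts, reference_counts):
        if reference == 0:
            ratios.append(None)
        else:
            ratios.append(
                (replicating / replicating_total) / (reference / reference_total)
            )
    return ratios


# Invented counts for four consecutive genomic bins.
ratios = relative_copy_number([200, 150, 100, 150], [100, 100, 100, 100])
print(ratios)
```

    In this toy profile the first bin has the highest ratio, which is the pattern expected near an early-firing origin, and the third bin has the lowest, as expected for a late-replicating region.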

  15. Amino acid sequence of tyrosinase from Neurospora crassa.

    PubMed Central

    Lerch, K

    1978-01-01

    The amino-acid sequence of tyrosinase from Neurospora crassa (monophenol,dihydroxyphenylalanine:oxygen oxidoreductase, EC 1.14.18.1) is reported. This copper-containing oxidase consists of a single polypeptide chain of 407 amino acids. The primary structure was determined by automated and manual sequence analysis on fragments produced by cleavage with cyanogen bromide and on peptides obtained by digestion with trypsin, pepsin, thermolysin, or chymotrypsin. The amino terminus of the protein is acetylated and the single cysteinyl residue 96 is covalently linked via a thioether bridge to histidyl residue 94. The formation and the possible role of this unusual structure in Neurospora tyrosinase is discussed. Dye-sensitized photooxidation of apotyrosinase and active-site-directed inactivation of the native enzyme indicate the possible involvement of histidyl residues 188, 192, 289, and 305 or 306 as ligands to the active-site copper as well as in the catalytic mechanism of this monooxygenase. PMID:151279

  16. Functional specificity of a Hox protein mediated by the recognition of minor groove structure.

    PubMed

    Joshi, Rohit; Passner, Jonathan M; Rohs, Remo; Jain, Rinku; Sosinsky, Alona; Crickmore, Michael A; Jacob, Vinitha; Aggarwal, Aneel K; Honig, Barry; Mann, Richard S

    2007-11-02

    The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly understood step in the control of gene expression. Members of the Hox family of transcription factors bind DNA by making nearly identical major groove contacts via the recognition helices of their homeodomains. In vivo specificity, however, often depends on extended and unstructured regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd). Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, we show that Hox proteins recognize specific Hox-Exd binding sites via residues located in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. Our results suggest that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence.

  17. No exonic mutations at GJB2, GJB3, GJB4, GJB6, ARS (Component B), and LOR genes responsible for a Chinese patient affected by progressive symmetric erythrokeratodermia with pseudoainhum.

    PubMed

    Zhou, Fusheng; Fu, Hongyang; Liu, Linghua; Cui, Yong; Zhang, Zhengzhong; Chang, Ruixue; Yue, Zhen; Yang, Sen; Zhang, Xuejun

    2014-09-01

    Progressive symmetric erythrokeratodermia (PSEK) is characterized by symmetric and growing erythematous hyperkeratotic patches over the body shortly after birth, particularly the trunk and limbs, the buttocks, and the face, sometimes together with palmoplantar keratoderma (PPK). Mutations in the GJB2, GJB3, GJB4, GJB6, ARS (Component B), and LOR genes might contribute to PSEK manifestation. This study aimed to identify sequence alterations of these genes in a Chinese PSEK patient with pseudoainhum. Genomic DNA was purified from the patient's peripheral blood. Mutation analysis of target genes was performed by direct sequencing using an ABI 3730 sequencer. No exonic mutations were identified in the aforementioned genes. The result underlines the genetic heterogeneity of PSEK and other related erythrokeratodermas. © 2014 The International Society of Dermatology.

  18. Next generation sequencing--implications for clinical practice.

    PubMed

    Raffan, Eleanor; Semple, Robert K

    2011-01-01

    Genetic testing in inherited disease has traditionally relied upon recognition of the presenting clinical syndrome and targeted analysis of genes known to be linked to that syndrome. Consequently, many patients with genetic syndromes remain without a specific diagnosis. New 'next-generation' sequencing (NGS) techniques permit simultaneous sequencing of enormous amounts of DNA. A slew of research publications have recently demonstrated the tremendous power of these technologies in increasing understanding of human genetic disease. These approaches are likely to be increasingly employed in routine diagnostic practice, but the scale of the genetic information yielded about individuals means that caution must be exercised to avoid net harm in this setting. Use of NGS in a research setting will increasingly have a major but indirect beneficial impact on clinical practice. However, important technical, ethical and social challenges need to be addressed through informed professional and public dialogue before it finds its mature niche as a direct tool in the clinical diagnostic armoury.

  19. TP53 Mutation Status of Tubo-ovarian and Peritoneal High-grade Serous Carcinoma with a Wild-type p53 Immunostaining Pattern.

    PubMed

    Na, Kiyong; Sung, Ji-Youn; Kim, Hyun-Soo

    2017-12-01

    Diffuse and strong nuclear p53 immunoreactivity and a complete lack of p53 expression are regarded as indicative of missense and nonsense mutations, respectively, of the TP53 gene. Tubo-ovarian and peritoneal high-grade serous carcinoma (HGSC) is characterized by aberrant p53 expression induced by a TP53 mutation. However, our experience with some HGSC cases with a wild-type p53 immunostaining pattern led us to comprehensively review previous cases and investigate the TP53 mutational status of the exceptional cases. We analyzed the immunophenotype of 153 cases of HGSC and performed TP53 gene sequencing analysis in those with a wild-type p53 immunostaining pattern. Immunostaining revealed that 109 (71.3%) cases displayed diffuse and strong p53 expression (missense mutation pattern), while 39 (25.5%) had no p53 expression (nonsense mutation pattern). The remaining five cases of HGSC showed a wild-type p53 immunostaining pattern. Direct sequencing analysis revealed that three of these cases harbored nonsense TP53 mutations and two had novel splice site deletions. TP53 mutation is almost invariably present in HGSC, and p53 immunostaining can be used as a surrogate marker of TP53 mutation. In cases with a wild-type p53 immunostaining pattern, direct sequencing for TP53 mutational status can be helpful to confirm the presence of a TP53 mutation. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.

  20. Nicotiana tabacum EIL2 directly regulates expression of at least one tobacco gene induced by sulphur starvation.

    PubMed

    Wawrzyńska, Anna; Lewandowska, Małgorzata; Sirko, Agnieszka

    2010-03-01

    Sulphur deficiency severely affects plant growth and agricultural productivity, leading to diverse changes in development and metabolism. Molecular mechanisms regulating gene expression under low sulphur conditions remain largely unknown. AtSLIM1, a member of the EIN3-like (EIL) family, was reported to be a central transcriptional regulator of the plant sulphur response; however, no direct interaction of this protein with any sulphur-responsive promoters was demonstrated. The focus of this study was on the analysis of a promoter region of UP9C, a tobacco gene strongly induced by sulphur limitation. Cloning and subsequent examination of this promoter resulted in the identification of a 20-nt sequence (UPE-box), also present in the promoters of several Arabidopsis genes, including three out of four homologues of UP9C. The UPE-box, consisting of two parallel tebs sequences (TEIL binding site) and a 5-nt conserved sequence at the 3'-end, proved to be necessary to bind the transcription factors belonging to the EIL family. The yeast one-hybrid analysis resulted in the identification of one transcription factor (NtEIL2) capable of binding to the UPE-box. The interactions of NtEIL2, and its homologue from Arabidopsis, AtSLIM1, with DNA were affected by mutations within the UPE-box. Transient expression assays in Nicotiana benthamiana have further shown that both factors, NtEIL2 and AtSLIM1, activate the UP9C promoter. Interestingly, activation by NtEIL2, but not by AtSLIM1, was dependent on the sulphur deficiency of the plants.

  1. Spatio-temporal Analysis of the Genetic Diversity of Arctic Rabies Viruses and Their Reservoir Hosts in Greenland

    PubMed Central

    Hanke, Dennis; Freuling, Conrad M.; Fischer, Susanne; Hueffer, Karsten; Hundertmark, Kris; Nadin-Davis, Susan; Marston, Denise; Fooks, Anthony R.; Bøtner, Anette; Mettenleiter, Thomas C.; Beer, Martin; Rasmussen, Thomas B.; Müller, Thomas F.; Höper, Dirk

    2016-01-01

    There has been limited knowledge on spatio-temporal epidemiology of zoonotic arctic fox rabies among countries bordering the Arctic, in particular Greenland. Previous molecular epidemiological studies have suggested the occurrence of one particular arctic rabies virus (RABV) lineage (arctic-3), but have been limited by a low number of available samples preventing in-depth high resolution phylogenetic analysis of RABVs at that time. However, an improved knowledge of the evolution, at a molecular level, of the circulating RABVs and a better understanding of the historical perspective of the disease in Greenland is necessary for better direct control measures on the island. These issues have been addressed by investigating the spatio-temporal genetic diversity of arctic RABVs and their reservoir host, the arctic fox, in Greenland using both full and partial genome sequences. Using a unique set of 79 arctic RABV full genome sequences from Greenland, Canada, USA (Alaska) and Russia obtained between 1977 and 2014, a description of the historic context in relation to the genetic diversity of currently circulating RABV in Greenland and neighboring Canadian Northern territories has been provided. The phylogenetic analysis confirmed delineation into four major arctic RABV lineages (arctic 1–4) with viruses from Greenland exclusively grouping into the circumpolar arctic-3 lineage. High resolution analysis enabled distinction of seven geographically distinct subclades (3.I – 3.VII) with two subclades containing viruses from both Greenland and Canada. By combining analysis of full length RABV genome sequences and host derived sequences encoding mitochondrial proteins obtained simultaneously from brain tissues of 49 arctic foxes, the interaction of viruses and their hosts was explored in detail. Such an approach can serve as a blueprint for analysis of infectious disease dynamics and virus-host interdependencies. The results showed a fine-scale spatial population structure in Greenland arctic foxes based on mitochondrial sequences, but provided no evidence for independent isolated evolutionary development of RABV in different arctic fox lineages. These data are invaluable to support future initiatives for arctic fox rabies control and elimination in Greenland. PMID:27459154

  2. Spatio-temporal Analysis of the Genetic Diversity of Arctic Rabies Viruses and Their Reservoir Hosts in Greenland.

    PubMed

    Hanke, Dennis; Freuling, Conrad M; Fischer, Susanne; Hueffer, Karsten; Hundertmark, Kris; Nadin-Davis, Susan; Marston, Denise; Fooks, Anthony R; Bøtner, Anette; Mettenleiter, Thomas C; Beer, Martin; Rasmussen, Thomas B; Müller, Thomas F; Höper, Dirk

    2016-07-01

    There has been limited knowledge on spatio-temporal epidemiology of zoonotic arctic fox rabies among countries bordering the Arctic, in particular Greenland. Previous molecular epidemiological studies have suggested the occurrence of one particular arctic rabies virus (RABV) lineage (arctic-3), but have been limited by a low number of available samples preventing in-depth high resolution phylogenetic analysis of RABVs at that time. However, an improved knowledge of the evolution, at a molecular level, of the circulating RABVs and a better understanding of the historical perspective of the disease in Greenland is necessary for better direct control measures on the island. These issues have been addressed by investigating the spatio-temporal genetic diversity of arctic RABVs and their reservoir host, the arctic fox, in Greenland using both full and partial genome sequences. Using a unique set of 79 arctic RABV full genome sequences from Greenland, Canada, USA (Alaska) and Russia obtained between 1977 and 2014, a description of the historic context in relation to the genetic diversity of currently circulating RABV in Greenland and neighboring Canadian Northern territories has been provided. The phylogenetic analysis confirmed delineation into four major arctic RABV lineages (arctic 1-4) with viruses from Greenland exclusively grouping into the circumpolar arctic-3 lineage. High resolution analysis enabled distinction of seven geographically distinct subclades (3.I - 3.VII) with two subclades containing viruses from both Greenland and Canada. By combining analysis of full length RABV genome sequences and host derived sequences encoding mitochondrial proteins obtained simultaneously from brain tissues of 49 arctic foxes, the interaction of viruses and their hosts was explored in detail. Such an approach can serve as a blueprint for analysis of infectious disease dynamics and virus-host interdependencies. The results showed a fine-scale spatial population structure in Greenland arctic foxes based on mitochondrial sequences, but provided no evidence for independent isolated evolutionary development of RABV in different arctic fox lineages. These data are invaluable to support future initiatives for arctic fox rabies control and elimination in Greenland.

  3. Comparison of MR imaging sequences for liver and head and neck interventions: is there a single optimal sequence for all purposes?

    PubMed

    Boll, Daniel T; Lewin, Jonathan S; Duerk, Jeffrey L; Aschoff, Andrik J; Merkle, Elmar M

    2004-05-01

    To compare the appropriate pulse sequences for interventional device guidance during magnetic resonance (MR) imaging at 0.2 T and to evaluate the dependence of sequence selection on the anatomic region of the procedure. Using a C-arm 0.2 T system, four interventional MR sequences were applied in 23 liver cases and during MR-guided neck interventions in 13 patients. The imaging protocol consisted of: multislice turbo spin echo (TSE) T2w, sequential-slice fast imaging with steady precession (FISP), a time-reversed version of FISP (PSIF), and FISP with balanced gradients in all spatial directions (True-FISP) sequences. Vessel conspicuity was rated and contrast-to-noise ratio (CNR) was calculated for each sequence and a differential receiver operating characteristic was performed. Liver findings were detected in 96% using the TSE sequence. PSIF, FISP, and True-FISP imaging showed lesions in 91%, 61%, and 65%, respectively. The TSE sequence offered the best CNR, followed by PSIF imaging. Differential receiver operating characteristic analysis also rated TSE and PSIF to be the superior sequences. Lesions in the head and neck were detected in all cases by TSE and FISP, in 92% using True-FISP, and in 84% using PSIF. True-FISP offered the best CNR, followed by TSE imaging. Vessels appeared bright on FISP and True-FISP imaging and dark on the other sequences. In interventional MR imaging, no single sequence fits all purposes. Image guidance for interventional MR during liver procedures is best achieved by PSIF or TSE, whereas biopsies in the head and neck are best performed using FISP or True-FISP sequences.

  4. Taxator-tk: precise taxonomic assignment of metagenomes by fast approximation of evolutionary neighborhoods

    PubMed Central

    Dröge, J.; Gregor, I.; McHardy, A. C.

    2015-01-01

    Motivation: Metagenomics characterizes microbial communities by random shotgun sequencing of DNA isolated directly from an environment of interest. An essential step in computational metagenome analysis is taxonomic sequence assignment, which allows identifying the sequenced community members and reconstructing taxonomic bins with sequence data for the individual taxa. For the massive datasets generated by next-generation sequencing technologies, this cannot be performed with de-novo phylogenetic inference methods. We describe an algorithm and the accompanying software, taxator-tk, which performs taxonomic sequence assignment by fast approximate determination of evolutionary neighbors from sequence similarities. Results: Taxator-tk was precise in its taxonomic assignment across all ranks and taxa for a range of evolutionary distances and for short as well as for long sequences. In addition to the taxonomic binning of metagenomes, it is well suited for profiling microbial communities from metagenome samples because it identifies bacterial, archaeal and eukaryotic community members without being affected by varying primer binding strengths, as in marker gene amplification, or copy number variations of marker genes across different taxa. Taxator-tk has an efficient, parallelized implementation that allows the assignment of 6 Gb of sequence data per day on a standard multiprocessor system with 10 CPU cores and microbial RefSeq as the genomic reference data. Availability and implementation: Taxator-tk source and binary program files are publicly available at http://algbio.cs.uni-duesseldorf.de/software/. Contact: Alice.McHardy@uni-duesseldorf.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25388150

  5. Transcriptional regulation of human eosinophil RNases by an evolutionary- conserved sequence motif in primate genome

    PubMed Central

    Wang, Hsiu-Yu; Chang, Hao-Teng; Pai, Tun-Wen; Wu, Chung-I; Lee, Yuan-Hung; Chang, Yen-Hsin; Tai, Hsiu-Ling; Tang, Chuan-Yi; Chou, Wei-Yao; Chang, Margaret Dah-Tsyr

    2007-01-01

    Background Human eosinophil-derived neurotoxin (edn) and eosinophil cationic protein (ecp) are members of a subfamily of primate ribonuclease (rnase) genes. Although they are generated by a gene duplication event, distinct edn and ecp expression profiles in various tissues have been reported. Results In this study, we obtained the upstream promoter sequences of several representative primate eosinophil rnases. Bioinformatic analysis revealed the presence of a shared 34-nucleotide (nt) sequence stretch located at -81 to -48 in all edn promoters and macaque ecp promoter. Such a unique sequence motif constituted a region essential for transactivation of human edn in hepatocellular carcinoma cells. Gel electrophoretic mobility shift assay, transient transfection and scanning mutagenesis experiments allowed us to identify binding sites for two transcription factors, Myc-associated zinc finger protein (MAZ) and SV-40 protein-1 (Sp1), within the 34-nt segment. Subsequent in vitro and in vivo binding assays demonstrated a direct molecular interaction between this 34-nt region and MAZ and Sp1. Interestingly, overexpression of MAZ and Sp1 respectively repressed and enhanced edn promoter activity. The regulatory transactivation motif was mapped to the evolutionarily conserved -74/-65 region of the edn promoter, which was guanidine-rich and critical for recognition by both transcription factors. Conclusion Our results provide the first direct evidence that MAZ and Sp1 play important roles in the transcriptional activation of the human edn promoter through specific binding to a 34-nt segment present in representative primate eosinophil rnase promoters. PMID:17927842

  6. Transcription initiation from the dihydrofolate reductase promoter is positioned by HIP1 binding at the initiation site.

    PubMed

    Means, A L; Farnham, P J

    1990-02-01

    We have identified a sequence element that specifies the position of transcription initiation for the dihydrofolate reductase gene. Unlike the functionally analogous TATA box that directs RNA polymerase II to initiate transcription 30 nucleotides downstream, the positioning element of the dihydrofolate reductase promoter is located directly at the site of transcription initiation. By using DNase I footprint analysis, we have shown that a protein binds to this initiator element. Transcription initiated at the dihydrofolate reductase initiator element when 28 nucleotides were inserted between it and all other upstream sequences, or when it was placed on either side of the DNA helix, suggesting that there is no strict spatial requirement between the initiator and an upstream element. Although neither a single Sp1-binding site nor a single initiator element was sufficient for transcriptional activity, the combination of one Sp1-binding site and the dihydrofolate reductase initiator element cloned into a plasmid vector resulted in transcription starting at the initiator element. We have also shown that the simian virus 40 late major initiation site has striking sequence homology to the dihydrofolate reductase initiation site and that the same, or a similar, protein binds to both sites. Examination of the sequences at other RNA polymerase II initiation sites suggests that we have identified an element that is important in the transcription of other housekeeping genes. We have thus named the protein that binds to the initiator element HIP1 (Housekeeping Initiator Protein 1).

  7. Divergence, differential methylation and interspersion of melon satellite DNA sequences.

    PubMed Central

    Shmookler Reis, R; Timmis, J N; Ingle, J

    1981-01-01

    Melon (Cucumis melo) satellite DNA consists of two components, Q and S, each with a buoyant density in CsCl of 1.707 g/ml, but differing by 9 degrees C in "melting" temperature. These physical properties appear to be in contradiction, since both depend on G + C content. In order to resolve this anomaly, base compositions were directly determined for isolated fractions. The low-"melting" component S contains 41.8% G + C, with 6% of C present as 5-methylcytosine, whereas Q DNA contains 54% G + C, with 41% of C methylated. Analyses of restriction site loss agreed well with the direct determinations of methylation and divergence, and indicated some clustering of methylated sites in Q DNA. Analysis of restricted main-band DNA by hybridization with RNA complementary to Q satellite DNA ("Southern transfer") showed satellite Q tandem arrays interspersed in DNA of main-band density. Sequence divergence and extent of methylation did not appear to depend on whether a repeat array was present as satellite or interspersed in main-band DNA. Hybridization in situ indicated considerable heterogeneity in the genomic proportion of the Q-DNA sequences in melon fruit nuclei, implying over- and under-representation consistent with extensive unequal recombination in satellite Q tandem arrays. The cucumber, Cucumis sativus, contains less than 8% as much Q-homologous DNA per genome as the melon, suggesting rapid evolutionary gain or loss of these tandem repeat sequences. PMID:6172117

  8. Comparison of causality analysis on simultaneously measured fMRI and NIRS signals during motor tasks.

    PubMed

    Anwar, Abdul Rauf; Muthalib, Makii; Perrey, Stephane; Galka, Andreas; Granert, Oliver; Wolff, Stephan; Deuschl, Guenther; Raethjen, Jan; Heute, Ulrich; Muthuraman, Muthuraman

    2013-01-01

    Brain activity can be measured using different modalities. Since most of the modalities tend to complement each other, it seems promising to measure them simultaneously. In the research presented here, data recorded simultaneously from Functional Magnetic Resonance Imaging (fMRI) and Near Infrared Spectroscopy (NIRS) are subjected to causality analysis using time-resolved partial directed coherence (tPDC). Time-resolved partial directed coherence uses the principle of state space modelling to estimate Multivariate Autoregressive (MVAR) coefficients. This method is useful to visualize both frequency and time dynamics of causality between the time series. Afterwards, causality results from different modalities are compared by estimating the Spearman correlation. In the present study, we used directionality vectors to analyze correlation, rather than actual signal vectors. Results show that causality analysis of the fMRI correlates more closely to causality results of oxy-NIRS as compared to deoxy-NIRS in case of a finger sequencing task. However, in case of simple finger tapping, no clear difference between oxy-fMRI and deoxy-fMRI correlation is identified.

  9. Molecular analysis of the anaerobic rumen fungus Orpinomyces - insights into an AT-rich genome.

    PubMed

    Nicholson, Matthew J; Theodorou, Michael K; Brookman, Jayne L

    2005-01-01

    The anaerobic gut fungi occupy a unique niche in the intestinal tract of large herbivorous animals and are thought to act as primary colonizers of plant material during digestion. They are the only known obligately anaerobic fungi but molecular analysis of this group has been hampered by difficulties in their culture and manipulation, and by their extremely high A+T nucleotide content. This study begins to answer some of the fundamental questions about the structure and organization of the anaerobic gut fungal genome. Directed plasmid libraries using genomic DNA digested with highly or moderately rich AT-specific restriction enzymes (VspI and EcoRI) were prepared from a polycentric Orpinomyces isolate. Clones were sequenced from these libraries and the breadth of genomic inserts, both genic and intergenic, was characterized. Genes encoding numerous functions not previously characterized for these fungi were identified, including cytoskeletal, secretory pathway and transporter genes. A peptidase gene with no introns and having sequence similarity to a gene encoding a bacterial peptidase was also identified, extending the range of metabolic enzymes resulting from apparent trans-kingdom transfer from bacteria to fungi, as previously characterized largely for genes encoding plant-degrading enzymes. This paper presents the first thorough analysis of the genic, intergenic and rDNA regions of a variety of genomic segments from an anaerobic gut fungus and provides observations on rules governing intron boundaries, the codon biases observed with different types of genes, and the sequence of only the second anaerobic gut fungal promoter reported. Large numbers of retrotransposon sequences of different types were found and the authors speculate on the possible consequences of any such transposon activity in the genome. 
    The coding sequences identified included several orphan gene sequences, including one with regions strongly suggestive of structural proteins such as collagens and lampirin. This gene was present as a single copy in Orpinomyces, was expressed during vegetative growth and was also detected in genomes from another gut fungal genus, Neocallimastix.

  10. Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning.

    PubMed

    Teng, Haotian; Cao, Minh Duc; Hall, Michael B; Duarte, Tania; Wang, Sheng; Coin, Lachlan J M

    2018-05-01

    Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.

  11. Whole Genome Sequence Analysis of a Large Isoniazid-Resistant Tuberculosis Outbreak in London: A Retrospective Observational Study

    PubMed Central

    Casali, Nicola; Broda, Agnieszka; Harris, Simon R.; Brown, Timothy; Drobniewski, Francis

    2016-01-01

    Background A large isoniazid-resistant tuberculosis outbreak centred on London, United Kingdom, has been ongoing since 1995. The aim of this study was to investigate the power and value of whole genome sequencing (WGS) to resolve the transmission network compared to current molecular strain typing approaches, including analysis of intra-host diversity within a specimen, across body sites, and over time, with identification of genetic factors underlying the epidemiological success of this cluster. Methods and Findings We sequenced 344 outbreak isolates from individual patients collected over 14 y (2 February 1998–22 June 2012). This demonstrated that 96 (27.9%) were indistinguishable, and only one differed from this major clone by more than five single nucleotide polymorphisms (SNPs). The maximum number of SNPs between any pair of isolates was nine SNPs, and the modal distance between isolates was two SNPs. WGS was able to reveal the direction of transmission of tuberculosis in 16 cases within the outbreak (4.7%), including within a multidrug-resistant cluster that carried a rare rpoB mutation associated with rifampicin resistance. Eleven longitudinal pairs of patient pulmonary isolates collected up to 48 mo apart differed from each other by between zero and four SNPs. Extrapulmonary dissemination resulted in acquisition of a SNP in two of five cases. WGS analysis of 27 individual colonies cultured from a single patient specimen revealed ten loci differed amongst them, with a maximum distance between any pair of six SNPs. A limitation of this study, as in previous studies, is that indels and SNPs in repetitive regions were not assessed due to the difficulty in reliably determining this variation. 
    Conclusions Our study suggests that (1) certain paradigms need to be revised, such as the 12 SNP distance as the gold standard upper threshold to identify plausible transmissions; (2) WGS technology is helpful to rule out the possibility of direct transmission when isolates are separated by a substantial number of SNPs; (3) the concept of a transmission chain or network may not be useful in institutional or household settings; (4) the practice of isolating single colonies prior to sequencing is likely to lead to an overestimation of the number of SNPs between cases resulting from direct transmission; and (5) despite appreciable genomic diversity within a host, transmission of tuberculosis rarely results in minority variants becoming dominant. Thus, whilst WGS provided some increased resolution over variable number tandem repeat (VNTR)-based clustering, it was insufficient for inferring transmission in the majority of cases. PMID:27701423
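
    As a rough illustration of the pairwise SNP distances discussed in this abstract, the Python sketch below counts differing positions between sequences that are assumed to be already aligned to a common reference. The isolate names, the toy sequences, and the function names are hypothetical and do not come from the study, whose analysis pipeline is not described here. Like the study, the sketch ignores indels; it additionally skips ambiguous base calls such as N.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences differ.

    Positions where either sequence has a non-ACGT call (e.g. N) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x in "ACGT" and y in "ACGT"
    )


def pairwise_distances(isolates: dict) -> dict:
    """Return the SNP distance for every unordered pair of isolates."""
    return {
        (i, j): snp_distance(isolates[i], isolates[j])
        for i, j in combinations(sorted(isolates), 2)
    }


# Hypothetical toy alignment (not real outbreak data).
isolates = {
    "iso_A": "ACGTACGTAC",
    "iso_B": "ACGTACGTAC",
    "iso_C": "ACGTTCGTAA",
    "iso_D": "ACGNTCGTAC",
}

distances = pairwise_distances(isolates)
max_distance = max(distances.values())
for pair, d in distances.items():
    print(pair, d)
print("maximum pairwise distance:", max_distance)
```

    A summary such as the maximum or modal pairwise distance can then be read off the resulting dictionary; any threshold applied to it is a separate judgment, as the conclusions above stress.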

  12. Beyond directed evolution - semi-rational protein engineering and design

    PubMed Central

    Lutz, Stefan

    2010-01-01

    Over the last two decades, directed evolution has transformed the field of protein engineering. The advances in understanding protein structure and function, in no insignificant part a result of directed evolution studies, are increasingly empowering scientists and engineers to devise more effective methods for manipulating and tailoring biocatalysts. Abandoning large combinatorial libraries, the focus has shifted to small, functionally-rich libraries and rational design. A critical component to the success of these emerging engineering strategies are computational tools for the evaluation of protein sequence datasets and the analysis of conformational variations of amino acids in proteins. Highlighting the opportunities and limitations of such approaches, this review focuses on recent engineering and design examples that require screening or selection of small libraries. PMID:20869867

  13. Occurrence of a Sequence in Marine Cyanophages Similar to That of T4 g20 and Its Application to PCR-Based Detection and Quantification Techniques†

    PubMed Central

    Fuller, Nicholas J.; Wilson, William H.; Joint, Ian R.; Mann, Nicholas H.

    1998-01-01

    Viruses are ubiquitous components of marine ecosystems and are known to infect unicellular phycoerythrin-containing cyanobacteria belonging to the genus Synechococcus. A conserved region from the cyanophage genome was identified in three genetically distinct cyanomyoviruses, and a sequence analysis revealed that this region exhibited significant similarity to a gene encoding a capsid assembly protein (gp20) from the enteric coliphage T4. The results of a comparison of gene 20 sequences from three cyanomyoviruses and T4 allowed us to design two degenerate PCR primers, CPS1 and CPS2, which specifically amplified a 165-bp region from the majority of cyanomyoviruses tested. A competitive PCR (cPCR) analysis revealed that cyanomyovirus strains could be accurately enumerated, and it was demonstrated that quantification was log-linear over ca. 3 orders of magnitude. Different calibration curves were obtained for each of the three cyanomyovirus strains tested; consequently, cPCR performed with primers CPS1 and CPS2 could lead to substantial inaccuracies in estimates of phage abundance in natural assemblages. Further sequence analysis of cyanomyovirus gene 20 homologs would be necessary in order to design primers which do not exhibit phage-to-phage variability in priming efficiency. It was demonstrated that PCR products of the correct size could be amplified from seawater samples following 100× concentration and even directly without any prior concentration. Hence, the use of degenerate primers in PCR analyses of cyanophage populations should provide valuable data on the diversity of cyanophages in natural assemblages. Further optimization of procedures may ultimately lead to a sensitive assay which can be used to analyze natural cyanophage populations both quantitatively (by cPCR) and qualitatively following phylogenetic analysis of amplified products. PMID:9603813

  14. Geno2pheno[HCV] – A Web-based Interpretation System to Support Hepatitis C Treatment Decisions in the Era of Direct-Acting Antiviral Agents

    PubMed Central

    Kalaghatgi, Prabhav; Sikorski, Anna Maria; Knops, Elena; Rupp, Daniel; Sierra, Saleta; Heger, Eva; Neumann-Fraune, Maria; Beggel, Bastian; Walker, Andreas; Timm, Jörg; Walter, Hauke; Obermeier, Martin; Kaiser, Rolf; Bartenschlager, Ralf; Lengauer, Thomas

    2016-01-01

    The face of hepatitis C virus (HCV) therapy is changing dramatically. Direct-acting antiviral agents (DAAs) specifically targeting HCV proteins have been developed and entered clinical practice in 2011. However, despite high sustained viral response (SVR) rates of more than 90%, a fraction of patients do not eliminate the virus and in these cases treatment failure has been associated with the selection of drug resistance mutations (RAMs). RAMs may be prevalent prior to the start of treatment, or can be selected under therapy, and furthermore they can persist after cessation of treatment. Additionally, certain DAAs have been approved only for distinct HCV genotypes and may even have subtype specificity. Thus, sequence analysis before start of therapy is instrumental for managing DAA-based treatment strategies. We have created the interpretation system geno2pheno[HCV] (g2p[HCV]) to analyse HCV sequence data with respect to viral subtype and to predict drug resistance. Extensive reviewing and weighting of literature related to HCV drug resistance was performed to create a comprehensive list of drug resistance rules for inhibitors of the HCV protease in non-structural protein 3 (NS3-protease: Boceprevir, Paritaprevir, Simeprevir, Asunaprevir, Grazoprevir and Telaprevir), the NS5A replicase factor (Daclatasvir, Ledipasvir, Elbasvir and Ombitasvir), and the NS5B RNA-dependent RNA polymerase (Dasabuvir and Sofosbuvir). Upon submission of up to eight sequences, g2p[HCV] aligns the input sequences, identifies the genomic region(s), predicts the HCV geno- and subtypes, and generates for each DAA a drug resistance prediction report. g2p[HCV] offers easy-to-use and fast subtype and resistance analysis of HCV sequences, is continuously updated and freely accessible under http://hcv.geno2pheno.org/index.php. 
    The system was partially validated with respect to the NS3-protease inhibitors Boceprevir, Telaprevir and Simeprevir by using data generated with recombinant, phenotypic cell culture assays obtained from patients’ virus variants. PMID:27196673

  15. Direct sequencing of mitochondrial DNA detects highly divergent haplotypes in blue marlin (Makaira nigricans).

    PubMed

    Finnerty, J R; Block, B A

    1992-06-01

    We were able to differentiate between species of billfish (Istiophoridae family) and to detect considerable intraspecific variation in the blue marlin (Makaira nigricans) by directly sequencing a polymerase chain reaction (PCR)-amplified, 612-bp fragment of the mitochondrial cytochrome b gene. Thirteen variable nucleotide sites separated blue marlin (n = 26) into 7 genotypes. On average, these genotypes differed by 5.7 base substitutions. A smaller sample of swordfish from an equally broad geographic distribution displayed relatively little intraspecific variation, with an average of 1.3 substitutions separating different genotypes. A cladistic analysis of blue marlin cytochrome b variants indicates two major divergent evolutionary lines within the species. The frequencies of these two major evolutionary lines differ significantly between Atlantic and Pacific ocean basins. This finding is important given that the Atlantic stocks of blue marlin are considered endangered. Migration from the Pacific can help replenish the numbers of blue marlin in the Atlantic, but the loss of certain mitochondrial DNA haplotypes in the Atlantic due to overfishing probably could not be remedied by an influx of Pacific fish because of their absence in the Pacific population. Fishery management strategies should attempt to preserve the genetic diversity within the species. The detection of DNA sequence polymorphism indicates the utility of PCR technology in pelagic fishery genetics.

  16. Molecular Evidence for a Natural Primary Triple Hybrid in Plants Revealed from Direct Sequencing

    PubMed Central

    Kaplan, Zdenek; Fehrer, Judith

    2007-01-01

    Background and Aims Molecular evidence for natural primary hybrids composed of three different plant species is very rarely reported. An investigation was therefore carried out into the origin and a possible scenario for the rise of a sterile plant clone showing a combination of diagnostic morphological features of three separate, well-defined Potamogeton species. Methods The combination of sequences from maternally inherited cytoplasmic (rpl20-rps12) and biparentally inherited nuclear ribosomal DNA (ITS) was used to identify the exact identity of the putative triple hybrid. Key Results Direct sequencing showed ITS variants of three parental taxa, P. gramineus, P. lucens and P. perfoliatus, whereas chloroplast DNA identified P. perfoliatus as the female parent. A scenario for the rise of the triple hybrid through a fertile binary hybrid P. gramineus × P. lucens crossed with P. perfoliatus is described. Conclusions Even though the triple hybrid is sterile, it possesses an efficient strategy for its existence and became locally successful even in the parental environment, perhaps as a result of heterosis. The population investigated is the only one known of this hybrid, P. × torssanderi, worldwide. Isozyme analysis indicated the colony to be genetically uniform. The plants studied represented a single clone that seems to have persisted at this site for a long time. PMID:17478544

  17. Algorithms for accelerated convergence of adaptive PCA.

    PubMed

    Chatterjee, C; Kang, Z; Roychowdhury, V P

    2000-01-01

    We derive and discuss new adaptive algorithms for principal component analysis (PCA) that are shown to converge faster than the traditional PCA algorithms due to Oja, Sanger, and Xu. It is well known that traditional PCA algorithms that are derived by using gradient descent on an objective function are slow to converge. Furthermore, the convergence of these algorithms depends on appropriate choices of the gain sequences. Since online applications demand faster convergence and an automatic selection of gains, we present new adaptive algorithms to solve these problems. We first present an unconstrained objective function, which can be minimized to obtain the principal components. We derive adaptive algorithms from this objective function by using: 1) gradient descent; 2) steepest descent; 3) conjugate direction; and 4) Newton-Raphson methods. Although gradient descent produces Xu's LMSER algorithm, the steepest descent, conjugate direction, and Newton-Raphson methods produce new adaptive algorithms for PCA. We also provide a discussion on the landscape of the objective function, and present a global convergence proof of the adaptive gradient descent PCA algorithm using stochastic approximation theory. Extensive experiments with stationary and nonstationary multidimensional Gaussian sequences show faster convergence of the new algorithms over the traditional gradient descent methods. We also compare the steepest descent adaptive algorithm with state-of-the-art methods on stationary and nonstationary sequences.
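
    For orientation, the sketch below shows the classic single-unit Oja rule, one of the traditional adaptive PCA algorithms the abstract compares against. It is not one of the accelerated algorithms the authors derive. The gain sequence 1/(200 + k), the synthetic two-dimensional Gaussian data, and all names are assumptions chosen for illustration; the example mainly shows why the choice of gain sequence matters for this family of methods.

```python
import math
import random


def oja_first_component(samples, gain=lambda k: 1.0 / (200.0 + k)):
    """Estimate the first principal direction with Oja's online rule.

    samples: iterable of equal-length, zero-mean vectors.
    gain:    assumed gain sequence eta_k (a decaying step size).
    """
    samples = list(samples)
    dim = len(samples[0])
    w = [1.0 / math.sqrt(dim)] * dim  # arbitrary unit-norm starting vector
    for k, x in enumerate(samples):
        y = sum(wi * xi for wi, xi in zip(w, x))  # projection onto w
        eta = gain(k)
        # Oja update: Hebbian term y*x with a decay term y^2*w
        w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]


# Synthetic stationary data: large variance along u, small isotropic noise.
rng = random.Random(0)
u = (0.8, -0.6)  # true principal direction (unit length)
samples = []
for _ in range(5000):
    s = rng.gauss(0.0, 2.0)
    samples.append(tuple(s * ui + rng.gauss(0.0, 0.2) for ui in u))

w = oja_first_component(samples)
alignment = abs(sum(wi * ui for wi, ui in zip(w, u)))  # |cosine| with u
print("estimated direction:", w)
print("alignment with true direction:", alignment)
```

    A gain that decays too quickly stalls before convergence, while one that is too large can make the update unstable, which is the sensitivity the abstract cites as motivation for automatic gain selection.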

  18. Chromodomains direct integration of retrotransposons to heterochromatin

    PubMed Central

    Gao, Xiang; Hou, Yi; Ebina, Hirotaka; Levin, Henry L.; Voytas, Daniel F.

    2008-01-01

    The enrichment of mobile genetic elements in heterochromatin may be due, in part, to targeted integration. The chromoviruses are Ty3/gypsy retrotransposons with chromodomains at their integrase C termini. Chromodomains are logical determinants for targeting to heterochromatin, because the chromodomain of heterochromatin protein 1 (HP1) typically recognizes histone H3 K9 methylation, an epigenetic mark characteristic of heterochromatin. We describe three groups of chromoviruses based on amino acid sequence relationships of their integrase C termini. Genome sequence analysis indicates that representative chromoviruses from each group are enriched in gene-poor regions of the genome relative to other retrotransposons, and when fused to fluorescent marker proteins, the chromodomains target proteins to specific subnuclear foci coincident with heterochromatin. The chromodomain of the fungal element, MAGGY, interacts with histone H3 dimethyl- and trimethyl-K9, and when the MAGGY chromodomain is fused to integrase of the Schizosaccharomyces pombe Tf1 retrotransposon, new Tf1 insertions are directed to sites of H3 K9 methylation. Repetitive sequences such as transposable elements trigger the RNAi pathway resulting in their epigenetic modification. Our results suggest a dynamic interplay between retrotransposons and heterochromatin, wherein mobile elements recognize heterochromatin at the time of integration and then perpetuate the heterochromatic mark by triggering epigenetic modification. PMID:18256242

  19. FREQ-Seq: A Rapid, Cost-Effective, Sequencing-Based Method to Determine Allele Frequencies Directly from Mixed Populations

    PubMed Central

    Delaney, Nigel F.; Marx, Christopher J.

    2012-01-01

    Understanding evolutionary dynamics within microbial populations requires the ability to accurately follow allele frequencies through time. Here we present a rapid, cost-effective method (FREQ-Seq) that leverages Illumina next-generation sequencing for localized, quantitative allele frequency detection. Analogous to RNA-Seq, FREQ-Seq relies upon counts from the >10^5 reads generated per locus per time-point to determine allele frequencies. Loci of interest are directly amplified from a mixed population via two rounds of PCR using inexpensive, user-designed oligonucleotides and a bar-coded bridging primer system that can be regenerated in-house. The resulting bar-coded PCR products contain the adapters needed for Illumina sequencing, eliminating further library preparation. We demonstrate the utility of FREQ-Seq by determining the order and dynamics of beneficial alleles that arose as a microbial population, founded with an engineered strain of Methylobacterium, evolved to grow on methanol. Quantifying allele frequencies with minimal bias down to 1% abundance allowed effective analysis of SNPs, small in-dels and insertions of transposable elements. Our data reveal large-scale clonal interference during the early stages of adaptation and illustrate the utility of FREQ-Seq as a cost-effective tool for tracking allele frequencies in populations. PMID:23118913
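
    The count-based estimate described in this abstract can be sketched in a few lines of Python. The read counts, allele labels, and function names below are hypothetical, and the binomial standard error is a simplifying assumption that ignores PCR and sequencing bias; the published method's own software is not reproduced here.

```python
import math


def allele_frequencies(counts: dict) -> dict:
    """Convert per-allele read counts at one locus into frequencies."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads at this locus")
    return {allele: n / total for allele, n in counts.items()}


def binomial_standard_error(freq: float, depth: int) -> float:
    """Sampling error of a frequency estimate, assuming independent reads."""
    return math.sqrt(freq * (1.0 - freq) / depth)


# Hypothetical counts for one locus at one time-point (10^5 reads in total).
counts = {"ancestral": 99000, "mutant": 1000}
depth = sum(counts.values())

freqs = allele_frequencies(counts)
se = binomial_standard_error(freqs["mutant"], depth)
print("mutant frequency:", freqs["mutant"])
print("approximate standard error:", se)
```

    Under these assumptions, a read depth on the order of 10^5 leaves sampling error well below a 1% allele frequency, so other sources of bias are more likely to limit accuracy at that abundance.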

  20. Mass spectrometric analysis of O-linked oligosaccharides from various recombinant expression systems.

    PubMed

    Kenny, Diarmuid T; Gaunitz, Stefan; Hayes, Catherine A; Gustafsson, Anki; Sjöblom, Magnus; Holgersson, Jan; Karlsson, Niclas G

    2013-01-01

    Analysis of O-linked glycosylation is one of the main challenges during structural validation of recombinant glycoproteins. With methods available for N-linked glycosylation in regard to oligosaccharide analysis as well as glycopeptide mapping, there are still challenges for O-linked glycan analysis. Here, we present mass spectrometric methodology for O-linked oligosaccharides released by reductive β-elimination. Using LC-MS and LC-MS(2) with graphitized carbon columns, oligosaccharides are analyzed without derivatization. This approach provides a high-throughput method for screening during clonal selection, as well as product structure verification, without impairing sequencing ability. The protocols are exemplified by analysis of glycoproteins from mammalian cell cultures (CHO cells) as well as insect cells and yeast. The data show that the method can be successfully applied to both neutral and acidic O-linked oligosaccharides, where sialic acid, hexuronic acid, and sulfate are common substituents. Further characterization of O-glycans can be achieved using permethylation. Permethylation of O-linked oligosaccharides followed by direct infusion into the mass spectrometer provides information about oligosaccharide composition, and subsequent MS(n) experiments can be carried out to elucidate oligosaccharide structure including linkage information and sequence.
