Sequence-specific bias correction for RNA-seq data using recurrent neural networks.
Zhang, Yao-Zhong; Yamaguchi, Rui; Imoto, Seiya; Miyano, Satoru
2017-01-25
The recent success of deep learning techniques in machine learning and artificial intelligence has stimulated a great deal of interest among bioinformaticians, who now wish to bring the power of deep learning to bear on a host of bioinformatical problems. Deep learning is ideally suited for biological problems that require automatic or hierarchical feature representation for biological data when prior knowledge is limited. In this work, we address the sequence-specific bias correction problem for RNA-seq data using Recurrent Neural Networks (RNNs) to model nucleotide sequences without pre-determining sequence structures. The sequence-specific bias of a read is then calculated based on the sequence probabilities estimated by RNNs, and used in the estimation of gene abundance. We explore the application of two popular RNN recurrent units for this task and demonstrate that RNN-based approaches provide a flexible way to model nucleotide sequences without knowledge of predetermined sequence structures. Our experiments show that training an RNN-based nucleotide sequence model is efficient and that RNN-based bias correction methods compare well with the state-of-the-art sequence-specific bias correction method on the commonly used MAQC-III data set. RNNs provide an alternative and flexible way to calculate sequence-specific bias without explicitly pre-determining sequence structures.
Patel, Ronak Y; Shah, Neethu; Jackson, Andrew R; Ghosh, Rajarshi; Pawliczek, Piotr; Paithankar, Sameer; Baker, Aaron; Riehle, Kevin; Chen, Hailin; Milosavljevic, Sofia; Bizon, Chris; Rynearson, Shawn; Nelson, Tristan; Jarvik, Gail P; Rehm, Heidi L; Harrison, Steven M; Azzariti, Danielle; Powell, Bradford; Babb, Larry; Plon, Sharon E; Milosavljevic, Aleksandar
2017-01-12
The success of the clinical use of sequencing based tests (from single gene to genomes) depends on the accuracy and consistency of variant interpretation. Aiming to improve the interpretation process through practice guidelines, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) have published standards and guidelines for the interpretation of sequence variants. However, manual application of the guidelines is tedious and prone to human error. Web-based tools and software systems may not only address this problem but also document reasoning and supporting evidence, thus enabling transparency of evidence-based reasoning and resolution of discordant interpretations. In this report, we describe the design, implementation, and initial testing of the Clinical Genome Resource (ClinGen) Pathogenicity Calculator, a configurable system and web service for the assessment of pathogenicity of Mendelian germline sequence variants. The system allows users to enter the applicable ACMG/AMP-style evidence tags for a specific allele with links to supporting data for each tag and generate guideline-based pathogenicity assessment for the allele. Through automation and comprehensive documentation of evidence codes, the system facilitates more accurate application of the ACMG/AMP guidelines, improves standardization in variant classification, and facilitates collaborative resolution of discordances. The rules of reasoning are configurable with gene-specific or disease-specific guideline variations (e.g. cardiomyopathy-specific frequency thresholds and functional assays). The software is modular, equipped with robust application program interfaces (APIs), and available under a free open source license and as a cloud-hosted web service, thus facilitating both stand-alone use and integration with existing variant curation and interpretation systems. 
The Pathogenicity Calculator is accessible at http://calculator.clinicalgenome.org . By enabling evidence-based reasoning about the pathogenicity of genetic variants and by documenting supporting evidence, the Calculator contributes toward the creation of a knowledge commons and more accurate interpretation of sequence variants in research and clinical care.
Halper, Sean M; Cetnar, Daniel P; Salis, Howard M
2018-01-01
Engineering many-enzyme metabolic pathways suffers from the design curse of dimensionality. There is an astronomical number of synonymous DNA sequence choices, though relatively few will express an evolutionarily robust, maximally productive pathway without metabolic bottlenecks. To solve this challenge, we have developed an integrated, automated computational-experimental pipeline that identifies a pathway's optimal DNA sequence without high-throughput screening or many cycles of design-build-test. The first step applies our Operon Calculator algorithm to design a host-specific, evolutionarily robust bacterial operon sequence with maximally tunable enzyme expression levels. The second step applies our RBS Library Calculator algorithm to systematically vary enzyme expression levels with the smallest-sized library. After characterizing a small number of constructed pathway variants, measurements are supplied to our Pathway Map Calculator algorithm, which then parameterizes a kinetic metabolic model that ultimately predicts the pathway's optimal enzyme expression levels and DNA sequences. Altogether, our algorithms provide the ability to efficiently map the pathway's sequence-expression-activity space and predict DNA sequences with desired metabolic fluxes. Here, we provide a step-by-step guide to applying the Pathway Optimization Pipeline on a desired multi-enzyme pathway in a bacterial host.
2017-01-01
Target search as performed by DNA-binding proteins is a complex process, in which multiple factors contribute to both thermodynamic discrimination of the target sequence from overwhelmingly abundant off-target sites and kinetic acceleration of dynamic sequence interrogation. TRF1, the protein that binds to telomeric tandem repeats, faces an intriguing variant of the search problem where target sites are clustered within short fragments of chromosomal DNA. In this study, we use extensive (>0.5 ms in total) MD simulations to study the dynamical aspects of sequence-specific binding of TRF1 at both telomeric and non-cognate DNA. For the first time, we describe the spontaneous formation of a sequence-specific native protein–DNA complex in atomistic detail, and study the mechanism by which proteins avoid off-target binding while retaining high affinity for target sites. Our calculated free energy landscapes reproduce the thermodynamics of sequence-specific binding, while statistical approaches allow for a comprehensive description of intermediate stages of complex formation. PMID:28633355
Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.
Liu, Xuejun; Shi, Xinxin; Chen, Chunlin; Zhang, Li
2015-10-16
The high-throughput sequencing technology RNA-Seq has been widely used in recent years to quantify gene and isoform expression in transcriptome studies. Accurate expression measurement from the millions or billions of short reads generated is obstructed by two difficulties. One is the ambiguous mapping of reads to the reference transcriptome caused by alternative splicing, which increases the uncertainty in estimating isoform expression. The other is the non-uniformity of read distribution along the reference transcriptome due to positional, sequencing, mappability and other undiscovered sources of bias. This violates the assumption of uniform read distribution made by many expression calculation approaches, such as the direct RPKM calculation and Poisson-based models. Many methods have been proposed to address these difficulties. Some approaches employ latent variable models to discover the underlying pattern of read sequencing. However, most of these methods correct bias based on surrounding sequence content and share the bias models across all genes. They therefore cannot estimate the gene- and isoform-specific biases revealed by recent studies. We propose a latent variable model, NLDMseq, to estimate gene and isoform expression. Our method adopts latent variables to model the unknown isoforms from which reads originate and the underlying percentage of multiple spliced variants. The isoform- and exon-specific read sequencing biases are modeled to account for the non-uniformity of read distribution, and are identified by utilizing the replicate information of multiple lanes of a single library run. We employ simulation and real data to verify the performance of our method in terms of accuracy in the calculation of gene and isoform expression. Results show that NLDMseq obtains competitive gene and isoform expression compared to popular alternatives.
Finally, the proposed method is applied to the detection of differential expression (DE) to show its usefulness in downstream analysis. The proposed NLDMseq method provides an approach to accurately estimate gene and isoform expression from RNA-Seq data by modeling the isoform- and exon-specific read sequencing biases. It makes use of a latent variable model to discover the hidden pattern of read sequencing. We have shown that it works well on both simulated and real datasets, and has competitive performance compared to popular methods. The method has been implemented as freely available software, which can be found at https://github.com/PUGEA/NLDMseq.
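For reference, the direct RPKM calculation mentioned in this abstract normalizes a read count by transcript length and library size. The following is a minimal sketch, assuming the standard definition RPKM = 10^9 · C / (N · L); the function name and the example numbers are illustrative and are not taken from NLDMseq.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    Assumes the standard definition 1e9 * C / (N * L), where C is the number
    of reads mapped to the transcript, N the total number of mapped reads in
    the library, and L the transcript length in base pairs.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return 1e9 * read_count / (transcript_length_bp * total_mapped_reads)


# Illustrative numbers only: 500 reads on a 2 kb transcript in a 10 M read library.
example_rpkm = rpkm(500, 2000, 10_000_000)
```

Because this estimate treats every position of the transcript as equally likely to produce a read, it is exactly the kind of calculation that the non-uniformity described above would distort.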
Kono, H; Saven, J G
2001-02-23
Combinatorial experiments provide new ways to probe the determinants of protein folding and to identify novel folding amino acid sequences. These types of experiments, however, are complicated both by enormous conformational complexity and by large numbers of possible sequences. Therefore, a quantitative computational theory would be helpful in designing and interpreting these types of experiments. Here, we present and apply a statistically based, computational approach for identifying the properties of sequences compatible with a given main-chain structure. Protein side-chain conformations are included in an atom-based fashion. Calculations are performed for a variety of similar backbone structures to identify sequence properties that are robust with respect to minor changes in main-chain structure. Rather than specific sequences, the method yields the likelihood of each of the amino acids at preselected positions in a given protein structure. The theory may be used to quantify the characteristics of sequence space for a chosen structure without explicitly tabulating sequences. To account for hydrophobic effects, we introduce an environmental energy that is consistent with other simple hydrophobicity scales and show that it is effective for side-chain modeling. We apply the method to calculate the identity probabilities of selected positions of the immunoglobulin light chain-binding domain of protein L, for which many variant folding sequences are available. The calculations compare favorably with the experimentally observed identity probabilities.
Liu, Bin; Liu, Fule; Fang, Longyun; Wang, Xiaolong; Chou, Kuo-Chen
2015-04-15
In order to develop powerful computational predictors for identifying the biological features or attributes of DNAs, one of the most challenging problems is to find a suitable approach to effectively represent the DNA sequences. To facilitate the studies of DNAs and nucleotides, we developed a Python package called representations of DNAs (repDNA) for generating the widely used features reflecting the physicochemical properties and sequence-order effects of DNAs and nucleotides. There are three feature groups composed of 15 features. The first group calculates three nucleic acid composition features describing the local sequence information by means of kmers; the second group calculates six autocorrelation features describing the level of correlation between two oligonucleotides along a DNA sequence in terms of their specific physicochemical properties; the third group calculates six pseudo nucleotide composition features, which can be used to represent a DNA sequence with a discrete model or vector yet still keep considerable sequence-order information via the physicochemical properties of its constituent oligonucleotides. In addition, these features can be easily calculated with repDNA from both built-in and user-defined properties. The repDNA Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repDNA/. Contact: bliu@insun.hit.edu.cn or kcchou@gordonlifescience.org. Supplementary data are available at Bioinformatics online.
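To make the first feature group concrete, the sketch below computes a k-mer composition vector in plain Python. It does not use or reproduce the repDNA API, whose function names are not given in the abstract; `kmer_composition` and its defaults are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def kmer_composition(sequence: str, k: int = 2) -> dict[str, float]:
    """Return the relative frequency of every A/C/G/T k-mer in `sequence`.

    Overlapping windows are counted. Windows containing other symbols
    (for example N) still contribute to the denominator but are not reported.
    """
    seq = sequence.upper()
    windows = len(seq) - k + 1
    if k < 1 or windows < 1:
        raise ValueError("sequence must be at least k long and k must be >= 1")
    counts = Counter(seq[i:i + k] for i in range(windows))
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: counts.get(kmer, 0) / windows for kmer in all_kmers}


# "ACGTAC" contains the dinucleotides AC, CG, GT, TA, AC.
features = kmer_composition("ACGTAC", k=2)
```

The result is a fixed-length vector (4^k entries) regardless of sequence length, which is what allows sequences of different lengths to be fed to the same predictor.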
Geith, Tobias; Schmidt, Gerwin; Biffar, Andreas; Dietrich, Olaf; Duerr, Hans Roland; Reiser, Maximilian; Baur-Melnyk, Andrea
2014-09-01
The purpose of our study was to determine the optimum combination of b values for calculating the apparent diffusion coefficient (ADC) using a diffusion-weighted (DW) single-shot turbo spin-echo (TSE) sequence in the differentiation between acute benign and malignant vertebral body fractures. Twenty-six patients with osteoporotic (mean age, 69 years; range, 31.5-86.2 years) and 20 patients with malignant vertebral fractures (mean age, 63.4 years; range, 24.7-86.4 years) were studied. T1-weighted, STIR, and T2-weighted sequences were acquired at 1.5 T. A DW single-shot TSE sequence at different b values (100, 250, 400, and 600 s/mm²) was applied. On the DW images for each evaluated fracture, an ROI was manually adapted to the area of hyperintense signal on STIR and hypointense signal on T1-weighted images. For each ROI, nine different combinations of two, three, and four b values were used to calculate the ADC using a least-squares algorithm. The Student t test and Mann-Whitney U test were used to determine significant differences between benign and malignant fractures. An ROC analysis and the Youden index were used to determine cutoff values for assessment of the highest sensitivity and specificity for the different ADC values. The positive (PPV) and negative predictive values (NPV) were also determined. All calculated ADCs (except the combination of b = 400 s/mm² and b = 600 s/mm²) showed statistically significant differences between benign and malignant vertebral body fractures, with benign fractures having higher ADCs than malignant ones. The use of higher b values resulted in lower ADCs than those calculated with low b values. The ADCs calculated with b = 100 and 400 s/mm² showed the highest AUC (0.85), and the ADCs calculated with b = 100, 250, and 400 s/mm² showed the second highest AUC (0.829).
The Youden index with equal weight given to sensitivity and specificity suggests use of an ADC calculated with b = 100, 250, and 400 s/mm² (cutoff ADC, < 1.7 × 10⁻³ mm²/s) to best diagnose malignancy (sensitivity, 85%; specificity, 84.6%; PPV, 81.0%; NPV, 88.0%). ADCs calculated with a combination of low to intermediate b values (b = 100, 250, and 400 s/mm²) provide the best diagnostic performance of a DW single-shot TSE sequence to differentiate acute benign and malignant vertebral body fractures.
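The least-squares ADC calculation can be sketched as follows. This assumes the usual mono-exponential model S(b) = S0 · exp(−b · ADC) fitted as a straight line to ln S versus b; the abstract states only that a least-squares algorithm was used, so the log-linear form, the function name `fit_adc`, and the synthetic signal values are assumptions for illustration.

```python
import math


def fit_adc(b_values: list[float], signals: list[float]) -> float:
    """Estimate the ADC (mm^2/s) from DW signals at two or more b values (s/mm^2).

    Fits ln(S) = ln(S0) - b * ADC by ordinary least squares and returns
    the negated slope.
    """
    if len(b_values) != len(signals) or len(b_values) < 2:
        raise ValueError("need at least two matching (b, signal) pairs")
    log_signals = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_y = sum(log_signals) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, log_signals))
    return -sxy / sxx


# Synthetic, noise-free signals generated with ADC = 1.5e-3 mm^2/s (not patient data).
b_combo = [100.0, 250.0, 400.0]
synthetic = [1000.0 * math.exp(-b * 1.5e-3) for b in b_combo]
adc_estimate = fit_adc(b_combo, synthetic)
```

With real, noisy signals the estimate depends on which b values enter the fit, which is why the study compares nine different combinations.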
Lu, Stephen M.; Lu, Wuyuan; Qasim, M. A.; Anderson, Stephen; Apostol, Izydor; Ardelt, Wojciech; Bigler, Theresa; Chiang, Yi Wen; Cook, James; James, Michael N. G.; Kato, Ikunoshin; Kelly, Clyde; Kohr, William; Komiyama, Tomoko; Lin, Tiao-Yin; Ogawa, Michio; Otlewski, Jacek; Park, Soon-Jae; Qasim, Sabiha; Ranjbar, Michael; Tashiro, Misao; Warne, Nicholas; Whatley, Harry; Wieczorek, Anna; Wieczorek, Maciej; Wilusz, Tadeusz; Wynn, Richard; Zhang, Wenlei; Laskowski, Michael
2001-01-01
An additivity-based sequence-to-reactivity algorithm for the interaction of members of the Kazal family of protein inhibitors with six selected serine proteinases is described. Ten consensus variable contact positions in the inhibitor were identified, and the 19 possible variants at each of these positions were expressed. The free energies of interaction of these variants and the wild type were measured. For an additive system, this data set allows the reactivity of all possible sequences to be calculated, subject to some restrictions. The algorithm was extensively tested. It is exceptionally fast so that all possible sequences can be predicted. The strongest, the most specific possible, and the least specific inhibitors were designed, and an evolutionary problem was solved. PMID:11171964
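The additivity assumption can be illustrated with a short sketch: the predicted free energy of a multiply substituted inhibitor is the wild-type value plus the sum of the single-substitution changes. The position labels, residue choices and energy values below are hypothetical placeholders, not measurements from the study, and the published algorithm's restrictions are not modeled.

```python
def predict_free_energy(
    wild_type_dg: float,
    substitution_ddg: dict[tuple[str, str], float],
    variant: dict[str, str],
) -> float:
    """Predict the free energy of interaction for a variant under additivity.

    `substitution_ddg` maps (position, residue) to the measured change in
    free energy relative to wild type for that single substitution.
    `variant` maps each substituted position to its new residue.
    """
    return wild_type_dg + sum(
        substitution_ddg[(position, residue)] for position, residue in variant.items()
    )


# Hypothetical single-substitution data (kcal/mol) for two contact positions.
ddg_table = {("P1", "L"): -1.5, ("P2'", "Y"): 0.4}
predicted = predict_free_energy(-10.0, ddg_table, {"P1": "L", "P2'": "Y"})
```

Each prediction is a table lookup and a sum, which is consistent with the abstract's remark that the algorithm is fast enough to cover all possible sequences.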
[Identification of pyrrosiae folium and its adulterants based on psbA-trnH sequence].
Zhang, Ya-Qin; Shi, Yue; Song, Ming; Lin, Yun-Han; Ma, Xiao-Xi; Sun, Wei; Xiang, Li; Liu, Xi
2014-06-01
In this study, the psbA-trnH sequence was used as a DNA barcode to evaluate the accuracy and stability of identifying the pteridophyte medicinal material Pyrrosiae Folium and distinguishing it from its adulterants. Genomic DNA was extracted successfully from 106 samples. The Kimura 2-Parameter (K2P) distances and ML tree were calculated using the software MEGA 6.0. The intra-specific genetic distances of the 3 original plants were lower than the inter-specific genetic distances of the adulterants. The ML tree indicated that Pyrrosiae Folium can be clearly distinguished from its adulterants. Therefore, the psbA-trnH sequence, as a barcode for pteridophytes, can accurately and stably distinguish Pyrrosiae Folium from its adulterants.
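The K2P distance used above has a closed form: d = −½ ln((1 − 2P − Q) · √(1 − 2Q)), where P and Q are the proportions of transitions and transversions between two aligned sequences. The sketch below is a plain-Python illustration of that formula, not a reproduction of how MEGA handles gaps or ambiguous bases; here such columns are simply skipped, which is an assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (A->G) and one transversion (A->C) over ten sites.
distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
```

Comparing such pairwise distances within a species against those between species is the basis of the intra-specific versus inter-specific comparison reported in the abstract.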
Accounting for uncertainty in DNA sequencing data.
O'Rawe, Jason A; Ferson, Scott; Lyon, Gholson J
2015-02-01
Science is defined in part by an honest exposition of the uncertainties that arise in measurements and propagate through calculations and inferences, so that the reliabilities of its conclusions are made apparent. The recent rapid development of high-throughput DNA sequencing technologies has dramatically increased the number of measurements made at the biochemical and molecular level. These data come from many different DNA-sequencing technologies, each with their own platform-specific errors and biases, which vary widely. Several statistical studies have tried to measure error rates for basic determinations, but there are no general schemes to project these uncertainties so as to assess the surety of the conclusions drawn about genetic, epigenetic, and more general biological questions. We review here the state of uncertainty quantification in DNA sequencing applications, describe sources of error, and propose methods that can be used for accounting and propagating these errors and their uncertainties through subsequent calculations.
Review of road user costs and methods.
DOT National Transportation Integrated Search
2013-07-01
The South Dakota Department of Transportation (SDDOT) uses road user costs (RUC) to calculate incentive or disincentive compensation for contractors, quantify project-specific liquidated damages, select the ideal sequencing of a project, and forecast...
Brittnacher, Mitchell J; Heltshe, Sonya L; Hayden, Hillary S; Radey, Matthew C; Weiss, Eli J; Damman, Christopher J; Zisman, Timothy L; Suskind, David L; Miller, Samuel I
2016-01-01
Comparative analysis of gut microbiomes in clinical studies of human diseases typically relies on identification and quantification of species or genes. In addition to exploring specific functional characteristics of the microbiome and potential significance of species diversity or expansion, microbiome similarity is also calculated to study change in response to therapies directed at altering the microbiome. Established ecological measures of similarity can be constructed from species abundances; however, methods for calculating these commonly used ecological measures of similarity directly from whole genome shotgun (WGS) metagenomic sequence are lacking. We present an alignment-free method for calculating similarity of WGS metagenomic sequences that is analogous to the Bray-Curtis index for species, implemented by the General Utility for Testing Sequence Similarity (GUTSS) software application. This method was applied to intestinal microbiomes of healthy young children to measure developmental changes toward an adult microbiome during the first 3 years of life. We also calculate similarity of donor and recipient microbiomes to measure establishment, or engraftment, of donor microbiota in fecal microbiota transplantation (FMT) studies focused on mild to moderate Crohn's disease. We show how a relative index of similarity to donor can be calculated as a measure of change in a patient's microbiome toward that of the donor in response to FMT. Because clinical efficacy of the transplant procedure cannot be fully evaluated without analysis methods to quantify actual FMT engraftment, we developed a method for detecting change in the gut microbiome that is independent of species identification and database bias, is sensitive to changes in relative abundance of the microbial constituents, and can be formulated as an index for correlating engraftment success with clinical measures of disease.
More generally, this method may be applied to clinical evaluation of human microbiomes and provide potential diagnostic determination of individuals who may be candidates for specific therapies directed at alteration of the microbiome.
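As a rough illustration of an alignment-free, Bray-Curtis-style comparison, the sketch below counts k-mers in two sets of reads and computes 2 · Σ min(a, b) / (Σ a + Σ b). The abstract does not describe the internals of GUTSS, so using raw k-mer counts as the "abundance" unit, the choice of k, and both function names are assumptions made purely for illustration.

```python
from collections import Counter


def kmer_counts(reads: list[str], k: int) -> Counter:
    """Count all overlapping k-mers across a collection of reads."""
    counts: Counter = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def bray_curtis_similarity(a: Counter, b: Counter) -> float:
    """Bray-Curtis similarity (1 - dissimilarity) between two count profiles."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both profiles are empty")
    shared = sum(min(a[kmer], b[kmer]) for kmer in a.keys() & b.keys())
    return 2 * shared / total


# Toy reads: the two samples share the 2-mers AC and CG but differ in the third.
sample_a = kmer_counts(["ACGT"], k=2)
sample_b = kmer_counts(["ACGA"], k=2)
similarity = bray_curtis_similarity(sample_a, sample_b)
```

This sketch does not normalize for sequencing depth; with real metagenomes of different sizes the profiles would need to be subsampled or scaled before comparison.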
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Xingyuan; He, Zhili; Zhou, Jizhong
2005-10-30
The oligonucleotide specificity for microarray hybridization can be predicted by its sequence identity to non-targets, continuous stretch to non-targets, and/or binding free energy to non-targets. Most currently available programs only use one or two of these criteria, which may choose 'false' specific oligonucleotides or miss 'true' optimal probes in a considerable proportion. We have developed a software tool, called CommOligo, using new algorithms and all three criteria for selection of optimal oligonucleotide probes. A series of filters, including sequence identity, free energy, continuous stretch, GC content, self-annealing, distance to the 3'-untranslated region (3'-UTR) and melting temperature (Tm), are used to check each possible oligonucleotide. A sequence identity is calculated based on gapped global alignments. A traversal algorithm is used to generate alignments for free energy calculation. The optimal Tm interval is determined based on probe candidates that have passed all other filters. Final probes are picked using a combination of user-configurable piece-wise linear functions and an iterative process. The thresholds for identity, stretch and free energy filters are automatically determined from experimental data by an accessory software tool, CommOligo_PE (CommOligo Parameter Estimator). The program was used to design probes for both whole-genome and highly homologous sequence data. CommOligo and CommOligo_PE are freely available to academic users upon request.
Specificity determinants for the abscisic acid response element.
Sarkar, Aditya Kumar; Lahiri, Ansuman
2013-01-01
Abscisic acid (ABA) response elements (ABREs) are a group of cis-acting DNA elements that have been identified from promoter analysis of many ABA-regulated genes in plants. We are interested in understanding the mechanism of binding specificity between ABREs and a class of bZIP transcription factors known as ABRE binding factors (ABFs). In this work, we have modeled the homodimeric structure of the bZIP domain of ABRE binding factor 1 from Arabidopsis thaliana (AtABF1) and studied its interaction with ACGT core motif-containing ABRE sequences. We have also examined the variation in the stability of the protein-DNA complex upon mutating ABRE sequences using the protein design algorithm FoldX. The high throughput free energy calculations successfully predicted the ability of ABF1 to bind to alternative core motifs like GCGT or AAGT and also rationalized the role of the flanking sequences in determining the specificity of the protein-DNA interaction.
Meghdadi, Hossein; Khosravi, Azar D.; Ghadiri, Ata A.; Sina, Amir H.; Alami, Ameneh
2015-01-01
The present study aimed to examine the diagnostic utility of polymerase chain reaction (PCR) and nested PCR techniques for the detection of Mycobacterium tuberculosis (MTB) DNA in samples from patients with extra pulmonary tuberculosis (EPTB). In total, 80 formalin-fixed, paraffin-embedded (FFPE) samples, comprising 70 samples with a definite diagnosis of EPTB and 10 samples known to be non-EPTB on the basis of histopathology examination, were included in the study. PCR amplification targeting IS6110 and the rpoB gene, and nested PCR targeting the rpoB gene, were performed on the DNA extracted from the 80 FFPE samples. The strong positive samples were directly sequenced. For negative samples and those with a weak band in nested-rpoB PCR, TA cloning was performed by cloning the products into the plasmid vector with subsequent sequencing. The 95% confidence intervals (CI) for the estimates of sensitivity and specificity were calculated for each method. Fourteen (20%), 34 (48.6%), and 60 (85.7%) of the 70 positive samples confirmed by histopathology were positive by rpoB-PCR, IS6110-PCR, and nested-rpoB PCR, respectively. By performing TA cloning on samples that yielded weak (n = 8) or negative results (n = 10) in the PCR methods, we were able to improve their quality for later sequencing. All samples with a weak band and 7 out of 10 negative samples showed strong positive results after cloning. Nested-rpoB PCR with cloning therefore revealed positivity in 67 out of 70 confirmed samples (95.7%). The sensitivity of this combined method was calculated as 95.7% in comparison with histopathology examination. The CIs for sensitivity of the PCR methods were calculated as 11.39–31.27% for rpoB-PCR, 36.44–60.83% for IS6110-PCR, 75.29–92.93% for nested-rpoB PCR, and 87.98–99.11% for nested-rpoB PCR cloning. The 10 true EPTB-negative samples by histopathology were negative by all tested methods, including cloning, and were used to calculate the specificity of the applied methods.
The CI for the 100% specificity of each PCR method was calculated as 69.15–100%. Our results indicated that nested-rpoB PCR combined with TA cloning and sequencing is a preferred method for the detection of MTB DNA in EPTB samples, with high sensitivity and specificity that confirm the histopathology results. PMID:26191059
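The abstract does not name the interval method, but the reported 69.15–100% for 10 of 10 negatives is what an exact (Clopper-Pearson) binomial interval gives, since the lower bound is 0.025^(1/10) ≈ 0.6915. The sketch below computes that interval with the standard library only; treating it as the method the authors used is an assumption, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, iterations: int = 60) -> float:
    """Find the root in [0, 1] of a function that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    if not 0 <= successes <= n or n == 0:
        raise ValueError("need 0 <= successes <= n and n > 0")
    # Lower bound: P(X >= successes | p) = alpha / 2.
    lower = 0.0 if successes == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(successes - 1, n, p) - alpha / 2
    )
    # Upper bound: P(X <= successes | p) = alpha / 2 (negated to make it increasing).
    upper = 1.0 if successes == n else _solve_increasing(
        lambda p: alpha / 2 - binom_cdf(successes, n, p)
    )
    return lower, upper


# Specificity: 10 of 10 histopathology-negative samples were negative.
spec_low, spec_high = clopper_pearson(10, 10)
```

The same function can be applied to the sensitivity counts (for example 14 of 70 or 67 of 70) to check them against the intervals quoted in the abstract.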
Design of nucleic acid sequences for DNA computing based on a thermodynamic approach
Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma
2005-01-01
We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between the sequence and its complement and that do not hybridize non-specifically with each other based on the minimum free energy (ΔGmin). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate ΔGmin. This effectively excludes inappropriate sequences before ΔGmin is calculated, thereby reducing computation time drastically when compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the Hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (ΔGexp) of 126 sequences correlated better with ΔGmin (|R| = 0.90) than with the Hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphic user interface-based program written in Java. PMID:15701762
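The generate-and-test loop described above can be sketched as follows. Note that this sketch uses the traditional Hamming-distance test that the abstract compares against, not the ΔGmin thermodynamic filter, because the nearest-neighbour parameters and the greedy search are not given in the abstract. The function names, the reverse-complement check and all thresholds are assumptions for illustration.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def design_sequences(count: int, length: int, min_distance: int,
                     seed: int = 0, max_trials: int = 100_000) -> list[str]:
    """Random generate-and-test: keep candidates that are far from all accepted ones.

    A candidate is accepted only if its Hamming distance to every accepted
    sequence and to that sequence's reverse complement is >= min_distance.
    """
    rng = random.Random(seed)
    accepted: list[str] = []
    for _ in range(max_trials):
        if len(accepted) == count:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if all(
            hamming(candidate, s) >= min_distance
            and hamming(candidate, reverse_complement(s)) >= min_distance
            for s in accepted
        ):
            accepted.append(candidate)
    return accepted


designed = design_sequences(count=5, length=12, min_distance=4)
```

In the published approach the acceptance test would instead be based on ΔGmin, with a cheap greedy filter rejecting most unsuitable candidates before the full free energy calculation is run.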
The scattering of electromagnetic pulses by a slit in a conducting screen
NASA Technical Reports Server (NTRS)
Ackerknecht, W. E., III; Chen, C.-L.
1975-01-01
A direct method for calculating the impulse response of a slit in a conducting screen is presented which is derived specifically for the analysis of transient scattering by two-dimensional objects illuminated by a plane incident wave. The impulse response is obtained by assuming that the total response is composed of two sequences of diffracted waves. The solution is determined for the first two waves in one sequence by using Green's functions and the equivalence principle, for additional waves in the sequence by iteration, and for the other sequence by a transformation of coordinates. The cases of E-polarization and H-polarization are considered.
Arnold, Roland; Goldenberg, Florian; Mewes, Hans-Werner; Rattei, Thomas
2014-01-01
The Similarity Matrix of Proteins (SIMAP, http://mips.gsf.de/simap/) database has been designed to massively accelerate computationally expensive protein sequence analysis tasks in bioinformatics. It provides pre-calculated sequence similarities interconnecting the entire known protein sequence universe, complemented by pre-calculated protein features and domains, similarity clusters and functional annotations. SIMAP covers all major public protein databases as well as many consistently re-annotated metagenomes from different repositories. As of September 2013, SIMAP contains >163 million proteins corresponding to ∼70 million non-redundant sequences. SIMAP uses the sensitive FASTA search heuristics, the Smith–Waterman alignment algorithm, the InterPro database of protein domain models and the BLAST2GO functional annotation algorithm. SIMAP assists biologists by facilitating the interactive exploration of the protein sequence universe. Web-Service and DAS interfaces allow connecting SIMAP with any other bioinformatic tool and resource. All-against-all protein sequence similarity matrices of project-specific protein collections are generated on request. Recent improvements allow SIMAP to cover the rapidly growing sequenced protein sequence universe. New Web-Service interfaces enhance the connectivity of SIMAP. Novel tools for interactive extraction of protein similarity networks have been added. Open access to SIMAP is provided through the web portal; the portal also contains instructions and links for software access and flat file downloads. PMID:24165881
TFBSshape: a motif database for DNA shape features of transcription factor binding sites.
Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W; Gordân, Raluca; Rohs, Remo
2014-01-01
Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein-DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone.
TFBSshape: a motif database for DNA shape features of transcription factor binding sites
Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo
2014-01-01
Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955
Andreotti, Renato; Pedroso, Marisela S; Caetano, Alexandre R; Martins, Natália F
2008-01-01
This paper reports the sequence analysis of Bm86 Campo Grande strain comparing it with Bm86 and Bm95 antigens from the preparations TickGardPLUS and Gavac, respectively. The PCR product was cloned into pMOSBlue and sequenced. The secondary structure prediction tool PSIPRED was used to calculate alpha helices and beta strand contents of the predicted polypeptide. The hydrophobicity profile was calculated using the algorithms from the Hopp and Woods method, in addition to identification of potential MHC class-I binding regions in the antigens. Pair-wise alignment revealed that the similarity between Bm86 Campo Grande strain and Bm86 is 0.2% higher than that between Bm86 Campo Grande strain and Bm95 antigens. The identities were 96.5% and 96.3% respectively. Major suggestive differences in hydrophobicity were predicted among the sequences in two specific regions.
Lummel, N; Schoepf, V; Burke, M; Brueckmann, H; Linn, J
2011-12-01
FLAIR images are highly sensitive for SAH. However, CSF flow artifacts caused by conventional FLAIR can produce false-positive results. Here, we compare 2D and 3D FLAIR sequences, focusing on their potential for containing these artifacts and their sensitivity and specificity for detection of SAHs. We evaluated the following 4 FLAIR sequences: 1) 2D FLAIR at 1.5T, 2) 2D FLAIR, 3) 2D PROPELLER-FLAIR, and 4) 3D Cube-FLAIR at 3T. All sequences were performed in 5 healthy volunteers; sequences 2 and 4 were also performed under routine conditions in 10 patients with focal epilepsy and in 10 patients with SAH. Two neuroradiologists independently conducted the analysis. The presence of flow artifacts in the ventricles and cisterns of healthy volunteers and patients with epilepsy was evaluated and scored on a 4-point scale. Mean values were calculated and compared by using paired t tests. Sensitivity and specificity for SAH detection in sequences 2 and 4 were determined. Cube-FLAIR showed almost no CSF artifacts in the volunteers and the patients with epilepsy; therefore, it was superior to any other FLAIR (P < .001). Sensitivity and specificity of SAH detection by 3T FLAIR were 58.3% and 89.4%, respectively, whereas Cube-FLAIR had a sensitivity of 95% and a specificity of 100%. Cube-FLAIR allows FLAIR imaging with almost no CSF artifacts and is, thus, particularly useful for SAH detection.
NASA Astrophysics Data System (ADS)
Ayzenshtadt, A. M.; Frolova, M. A.; Makhova, T. A.; Danilov, V. E.; Gupta, Piyush K.; Verma, Rama S.
2018-01-01
Mineral samples of mixed-genesis rocks in a finely dispersed state were obtained and studied, namely sand deposit (Kholmogory district) and basalt (Myandukha deposit, Plesetsk district) in Arkhangelsk region. The paper provides the chemical composition data used to calculate the specific mass atomization energy of rocks. The energy parameters of the micro and nano systems of the rock samples - free surface energy and surface activity - were calculated. For toxicological evaluation of the materials obtained, next-generation sequencing (NGS) was used to perform metagenomic analysis, which allowed determining the species diversity of microorganisms in the samples under study. It was shown that the sequencing method and metagenomic analysis are applicable and provide good reproducibility for the analysis of the toxicological properties of selected rock samples. A correlation between the surface activity of finely dispersed rock systems and the species diversity of cultivated microorganisms on the raw material was observed.
1989-12-29
1.1.2. General Performance Criteria for Gamma Ray Spectrometers 4 1.1.3. Special Criteria for Space-Based Spectrometer Systems 7 1.1.4. Prior Approaches...calculations were performed for selected incident gamma ray energies and were used to generate tabular and graphical listings of gamma scattering results. The... generated . These output presentations were studied to identify behavior patterns of "good" and "bad" event sequences. For the specific gamma energy
Donlan, Chris; Cowan, Richard; Newton, Elizabeth J; Lloyd, Delyth
2007-04-01
A sample (n=48) of eight-year-olds with specific language impairments is compared with age-matched (n=55) and language-matched controls (n=55) on a range of tasks designed to test the interdependence of language and mathematical development. Performance across tasks varies substantially in the SLI group, showing profound deficits in production of the count word sequence and basic calculation and significant deficits in understanding of the place-value principle in Hindu-Arabic notation. Only in understanding of arithmetic principles does SLI performance approximate that of age-matched controls, indicating that principled understanding can develop even where number sequence production and other aspects of number processing are severely compromised.
Telegrafo, Michele; Rella, Leonarda; Stabile Ianora, Amato Antonio; Angelelli, Giuseppe; Moschetta, Marco
2015-10-01
To assess the role of STIR, T2-weighted TSE and DWIBS sequences for detecting and characterizing breast lesions and to compare unenhanced (UE)-MRI results with contrast-enhanced (CE)-MRI and histological findings, having the latter as the reference standard. Two hundred eighty consecutive patients (age range, 27-73 years; mean age±standard deviation (SD), 48.8±9.8 years) underwent MR examination with a diagnostic protocol including STIR, T2-weighted TSE, THRIVE and DWIBS sequences. Two radiologists blinded to both dynamic sequences and histological findings evaluated in consensus STIR, T2-weighted TSE and DWIBS sequences and, after two weeks, CE-MRI images, searching for breast lesions. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic accuracy for UE-MRI and CE-MRI were calculated. UE-MRI results were also compared with CE-MRI. UE-MRI sequences obtained sensitivity, specificity, diagnostic accuracy, PPV and NPV values of 94%, 79%, 86%, 79% and 94%, respectively. CE-MRI sequences obtained sensitivity, specificity, diagnostic accuracy, PPV and NPV values of 98%, 83%, 90%, 84% and 98%, respectively. No statistically significant difference between UE-MRI and CE-MRI was found. Breast UE-MRI could represent an accurate diagnostic tool and a valid alternative to CE-MRI for evaluating breast lesions. STIR and DWIBS sequences allow detection of breast lesions, while T2-weighted TSE sequences and ADC values could be useful for lesion characterization. Copyright © 2015 Elsevier Inc. All rights reserved.
2000 year-old ancient equids: an ancient-DNA lesson from Pompeii remains.
Di Bernardo, Giovanni; Del Gaudio, Stefania; Galderisi, Umberto; Cipollaro, Marilena
2004-11-15
Ancient DNA extracted from 2000 year-old equine bones was examined in order to amplify mitochondrial and nuclear DNA fragments. A specific equine satellite-type sequence representing 3.7%-11% of the entire equine genome, proved to be a suitable target to address the question of the presence of aDNA in ancient bones. The PCR strategy designed to investigate this specific target also allowed us to calculate the molecular weight of amplifiable DNA fragments. Sequencing of a 370 bp DNA fragment of mitochondrial control region allowed the comparison of ancient DNA sequences with those of modern horses to assess their genetic relationship. The 16S rRNA mitochondrial gene was also examined to unravel the post-mortem base modification feature and to test the status of Pompeian equids taxon on the basis of a Mae III restriction site polymorphism. Copyright 2004 Wiley-Liss, Inc.
Vibration-Rotation Bands of HF and DF
1977-09-23
98 12a. Comparison of Observed and Calculated Line Positions of HF, Δv = 1 Sequence ........................... 99 12b. Comparison of Observed and...Calculated Line Positions of HF, Δv = 2 Sequence ........................... 102 12c. Comparison of Observed and Calculated Line Positions of HF, Δv = 3...Sequence ........................... 107 12d. Comparison of Observed and Calculated Line Positions of HF, Δv = 4 Sequence ...........................
Nicosia, Aldo; Maggio, Teresa; Mazzola, Salvatore; Cuttitta, Angela
2013-10-30
Anemonia viridis is a widespread and extensively studied Mediterranean species of sea anemone from which a large number of polypeptide toxins, such as blood depressing substances (BDS) peptides, have been isolated. The first members of this class, BDS-1 and BDS-2, are polypeptides belonging to the β-defensin fold family and were initially described for their antihypertensive and antiviral activities. BDS-1 and BDS-2 are 43 amino acid peptides characterised by three disulfide bonds that act as neurotoxins affecting Kv3.1, Kv3.2 and Kv3.4 channel gating kinetics. In addition, BDS-1 inactivates the Nav1.7 and Nav1.3 channels. The development of a large dataset of A. viridis expressed sequence tags (ESTs) and the identification of 13 putative BDS-like cDNA sequences has attracted interest, especially as scientific and diagnostic tools. A comparison of BDS cDNA sequences showed that the untranslated regions are more conserved than the protein-coding regions. Moreover, the KA/KS ratios calculated for all pairwise comparisons showed values greater than 1, suggesting mechanisms of accelerated evolution. The structures of the BDS homologs were predicted by molecular modelling. All toxins possess similar 3D structures that consist of a triple-stranded antiparallel β-sheet and an additional small antiparallel β-sheet located downstream of the cleavage/maturation site; however, the orientation of the triple-stranded β-sheet appears to differ among the toxins. To characterise the spatial expression profile of the putative BDS cDNA sequences, tissue-specific cDNA libraries, enriched for BDS transcripts, were constructed. In addition, the proper amplification of ectodermal or endodermal markers ensured the tissue specificity of each library. 
Sequencing randomly selected clones from each library revealed ectodermal-specific expression of ten BDS transcripts, while transcripts of BDS-8, BDS-13, BDS-14 and BDS-15 failed to be retrieved, likely due to under-representation in our cDNA libraries. The calculation of the relative abundance of BDS transcripts in the cDNA libraries revealed that BDS-1, BDS-3, BDS-4, BDS-5 and BDS-6 are the most represented transcripts.
Koch, P J; Goldschmidt, M D; Walsh, M J; Zimbelmann, R; Schmelz, M; Franke, W W
1991-05-01
Desmosomes are cell-type-specific intercellular junctions found in epithelium, myocardium and certain other tissues. They consist of assemblies of molecules involved in the adhesion of specific cell types and in the anchorage of cell-type-specific cytoskeletal elements, the intermediate-size filaments, to the plasma membrane. To explore the individual desmosomal components and their functions we have isolated DNA clones encoding the desmosomal glycoprotein, desmocollin, using antibodies and a cDNA expression library from bovine muzzle epithelium. The cDNA-deduced amino-acid sequence of desmocollin (presently we cannot decide to which of the two desmocollins, DC I or DC II, this clone relates) defines a polypeptide with a calculated molecular weight of 85,000, with a single candidate sequence of 24 amino acids sufficiently long for a transmembrane arrangement, and an extracellular aminoterminal portion of 561 amino acid residues, compared to a cytoplasmic part of only 176 amino acids. Amino acid sequence comparisons have revealed that desmocollin is highly homologous to members of the cadherin family of cell adhesion molecules, including the previously sequenced desmoglein, another desmosome-specific cadherin. Using riboprobes derived from cDNAs for Northern-blot analyses, we have identified an mRNA of approximately 6 kb in stratified epithelia such as muzzle epithelium and tongue mucosa but not in two epithelial cell culture lines containing desmosomes and desmoplakins. The difference may indicate drastic differences in mRNA concentration or the existence of cell-type-specific desmocollin subforms. The molecular topology of desmocollin(s) is discussed in relation to possible functions of the individual molecular domains.
High-pressure structural study of MnF2
Stavrou, Elissaios; Yao, Yansun; Goncharov, Alexander F.; ...
2015-02-01
In this study, manganese fluoride (MnF2) with the tetragonal rutile-type structure has been studied using synchrotron angle-dispersive powder x-ray diffraction and Raman spectroscopy in a diamond anvil cell up to 60 GPa at room temperature, combined with first-principles density functional calculations. The experimental data reveal two pressure-induced structural phase transitions with the following sequence: rutile → SrI2 type (3 GPa) → α-PbCl2 type (13 GPa). Complete structural information, including interatomic distances, has been determined in the case of MnF2, including the exact structure of the debated first high-pressure phase. First-principles density functional calculations confirm this phase transition sequence, and the two calculated transition pressures are in excellent agreement with the experiment. Lattice dynamics calculations also reproduce the experimental Raman spectra measured for the ambient and high-pressure phases. The results are discussed in line with the possible practical use of rutile-type fluorides in general and specifically MnF2 as a model compound to reveal the HP structural behavior of rutile-type SiO2 (Stishovite).
Sloma, Michael F.; Mathews, David H.
2016-01-01
RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures. PMID:27852924
Yang, A S; Hitz, B; Honig, B
1996-06-21
The stability of beta-turns is calculated as a function of sequence and turn type with a Monte Carlo sampling technique. The conformational energy of four internal hydrogen-bonded turn types, I, I', II and II', is obtained by evaluating their gas phase energy with the CHARMM force field and accounting for solvation effects with the Finite Difference Poisson-Boltzmann (FDPB) method. All four turn types are found to be less stable than the coil state, independent of the sequence in the turn. The free-energy penalties associated with turn formation vary between 1.6 kcal/mol and 7.7 kcal/mol, depending on the sequence and turn type. Differences in turn stability arise mainly from intraresidue interactions within the two central residues of the turn. For each combination of the two central residues, except for -Gly-Gly-, the most stable beta-turn type is always found to occur most commonly in native proteins. The fact that a model based on local interactions accounts for the observed preference of specific sequences suggests that long-range tertiary interactions tend to play a secondary role in determining turn conformation. In contrast, for beta-hairpins, long-range interactions appear to dominate. Specifically, due to the right-handed twist of beta-strands, type I' turns for -Gly-Gly- are found to occur with high frequency, even when local energetics would dictate otherwise. The fact that any combination of two residues is found able to adopt a relatively low-energy turn structure explains why the amino acid sequence in turns is highly variable. The calculated free-energy cost of turn formation, when combined with related numbers obtained for alpha-helices and beta-sheets, suggests a model for the initiation of protein folding based on metastable fragments of secondary structure.
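To put the reported free-energy penalties in perspective, they can be converted into approximate turn-to-coil population ratios with a Boltzmann factor, exp(-ΔG/RT). The short sketch below is only an illustration, not the authors' procedure: the temperature of 298.15 K and the function name are assumptions, and the two-state picture ignores everything but the quoted ΔG values.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def turn_to_coil_ratio(delta_g_kcal, temperature_k=298.15):
    """Population ratio turn/coil for a free-energy penalty delta_g_kcal.

    Assumes a simple two-state equilibrium; the temperature is an
    assumption, since the abstract does not state one.
    """
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))

# The two extremes of the range quoted in the abstract (kcal/mol).
for dg in (1.6, 7.7):
    print(f"dG = {dg} kcal/mol -> turn/coil ratio {turn_to_coil_ratio(dg):.2e}")
```

Under these assumptions the most favourable turns would be populated at a few percent of the coil state, and the least favourable at a few parts per million.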
Stable isotope, site-specific mass tagging for protein identification
Chen, Xian
2006-10-24
Proteolytic peptide mass mapping as measured by mass spectrometry provides an important method for the identification of proteins, which are usually identified by matching the measured and calculated m/z values of the proteolytic peptides. A unique identification is, however, heavily dependent upon the mass accuracy and sequence coverage of the fragment ions generated by peptide ionization. The present invention describes a method for increasing the specificity, accuracy and efficiency of the assignments of particular proteolytic peptides and consequent protein identification, by the incorporation of selected amino acid residue(s) enriched with stable isotope(s) into the protein sequence without the need for ultrahigh instrumental accuracy. Selected amino acid(s) are labeled with ¹³C/¹⁵N/²H and incorporated into proteins in a sequence-specific manner during cell culturing. Each of these labeled amino acids carries a defined mass change encoded in its monoisotopic distribution pattern. Through their characteristic patterns, the peptides with mass tag(s) can then be readily distinguished from other peptides in mass spectra. The present method of identifying unique proteins can also be extended to protein complexes and will significantly increase data search specificity, efficiency and accuracy for protein identifications.
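The idea of a "defined mass change" can be made concrete with a small calculation of a peptide's m/z with and without an isotope-labeled residue. The sketch below is hypothetical and is not the patented method: the function name, the example peptide, and the choice of fully ¹³C-labeled leucine (six carbons) are assumptions made purely for illustration; the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565    # H2O added for the free N- and C-termini
PROTON = 1.007276    # mass of the charge carrier
C13_SHIFT = 1.003355 # mass difference between 13C and 12C

def peptide_mz(seq, charge=1, labeled_residue=None, labeled_carbons=0):
    """m/z of a peptide; optionally every `labeled_residue` carries
    `labeled_carbons` 13C atoms (a hypothetical labeling scheme)."""
    mass = sum(MONO[aa] for aa in seq) + WATER
    if labeled_residue is not None:
        mass += seq.count(labeled_residue) * labeled_carbons * C13_SHIFT
    return (mass + charge * PROTON) / charge

# Hypothetical peptide with two leucines, each labeled with six 13C atoms.
light = peptide_mz("LGLK")
heavy = peptide_mz("LGLK", labeled_residue="L", labeled_carbons=6)
print(f"light {light:.4f}  heavy {heavy:.4f}  shift {heavy - light:.4f}")
```

Because the shift scales with the number of labeled residues, the observed mass offset also reveals how many copies of the labeled amino acid a peptide contains, which is what narrows the database search.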
Re-Assembly and Analysis of an Ancient Variola Virus Genome.
Smithson, Chad; Imbery, Jacob; Upton, Chris
2017-09-08
We report a major improvement to the assembly of published short read sequencing data from an ancient variola virus (VARV) genome by the removal of contig-capping sequencing tags and manual searches for gap-spanning reads. The new assembly, together with camelpox and taterapox genomes, permitted new dates to be calculated for the last common ancestor of all VARV genomes. The analysis of recently sequenced VARV-like cowpox virus genomes showed that single nucleotide polymorphisms (SNPs) and amino acid changes in the vaccinia virus (VACV)-Cop-O1L ortholog, predicted to be associated with VARV host specificity and virulence, were introduced into the lineage before the divergence of these viruses. A comparison of the ancient and modern VARV genome sequences also revealed a measurable drift towards adenine + thymine (A + T) richness.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mays, S.E.; Poloski, J.P.; Sullivan, W.H.
1982-07-01
A probabilistic risk assessment (PRA) was made of the Browns Ferry, Unit 1, nuclear plant as part of the Nuclear Regulatory Commission's Interim Reliability Evaluation Program (IREP). Specific goals of the study were to identify the dominant contributors to core melt, develop a foundation for more extensive use of PRA methods, expand the cadre of experienced PRA practitioners, and apply procedures for extension of IREP analyses to other domestic light water reactors. Event tree and fault tree analyses were used to estimate the frequency of accident sequences initiated by transients and loss of coolant accidents. External events such as floods, fires, earthquakes, and sabotage were beyond the scope of this study and were, therefore, excluded. From these sequences, the dominant contributors to probable core melt frequency were chosen. Uncertainty and sensitivity analyses were performed on these sequences to better understand the limitations associated with the estimated sequence frequencies. Dominant sequences were grouped according to common containment failure modes and corresponding release categories on the basis of comparison with analyses of similar designs rather than on the basis of detailed plant-specific calculations.
BIPS: BIANA Interolog Prediction Server. A tool for protein-protein interaction inference.
Garcia-Garcia, Javier; Schleker, Sylvia; Klein-Seetharaman, Judith; Oliva, Baldo
2012-07-01
Protein-protein interactions (PPIs) play a crucial role in biology, and high-throughput experiments have greatly increased the coverage of known interactions. Still, identification of complete inter- and intraspecies interactomes is far from being complete. Experimental data can be complemented by the prediction of PPIs within an organism or between two organisms based on the known interactions of the orthologous genes of other organisms (interologs). Here, we present the BIANA (Biologic Interactions and Network Analysis) Interolog Prediction Server (BIPS), which offers a web-based interface to facilitate PPI predictions based on interolog information. BIPS benefits from the capabilities of the framework BIANA to integrate the several PPI-related databases. Additional metadata can be used to improve the reliability of the predicted interactions. Sensitivity and specificity of the server have been calculated using known PPIs from different interactomes using a leave-one-out approach. The specificity is between 72 and 98%, whereas sensitivity varies between 1 and 59%, depending on the sequence identity cut-off used to calculate similarities between sequences. BIPS is freely accessible at http://sbi.imim.es/BIPS.php.
Jakubec, David; Laskowski, Roman A.; Vondrasek, Jiri
2016-01-01
Decades of intensive experimental studies of the recognition of DNA sequences by proteins have provided us with a view of a diverse and complicated world in which few to no features are shared between individual DNA-binding protein families. The originally conceived direct readout of DNA residue sequences by amino acid side chains offers very limited capacity for sequence recognition, while the effects of the dynamic properties of the interacting partners remain difficult to quantify and almost impossible to generalise. In this work we investigated the energetic characteristics of all DNA residue–amino acid side chain combinations in the conformations found at the interaction interface in a very large set of protein–DNA complexes by means of empirical potential-based calculations. General specificity-defining criteria were derived and utilised to look beyond the binding motifs considered in previous studies. Linking energetic favourability to the observed geometrical preferences, our approach reveals several additional amino acid motifs which can distinguish between individual DNA bases. Our results remained valid in environments with various dielectric properties. PMID:27384774
Zandrino, Franco; La Paglia, Ernesto; Musante, Francesco
2010-01-01
To assess the diagnostic accuracy of magnetic resonance imaging in local staging of endometrial carcinoma, and to review the results and pitfalls described in the literature. Thirty women with a histological diagnosis of endometrial carcinoma underwent magnetic resonance imaging. Unenhanced T2-weighted and dynamic contrast-enhanced T1-weighted sequences were obtained. Hysterectomy and salpingo-oophorectomy were performed in all patients. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy were calculated for the detection of deep myometrial and cervical infiltration. For deep myometrial infiltration T2-weighted sequences reached a sensitivity of 85%, specificity of 76%, PPV of 73%, NPV of 87%, and accuracy of 80%, while contrast-enhanced scans reached a sensitivity of 90%, specificity of 80%, PPV of 82%, NPV of 89%, and accuracy of 85%. For cervical infiltration T2-weighted sequences reached a sensitivity of 75%, specificity of 88%, PPV of 50%, NPV of 96%, and accuracy of 87%, while contrast-enhanced scans reached a sensitivity of 100%, specificity of 94%, PPV of 75%, NPV of 100%, and accuracy of 95%. Unenhanced and dynamic gadolinium-enhanced magnetic resonance allows accurate assessment of myometrial and cervical infiltration. Information provided by magnetic resonance imaging can define prognosis and management.
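The five figures quoted for each sequence all derive from a single 2×2 table of imaging result against histology. The sketch below shows that arithmetic. The counts are illustrative and are not reported in the abstract; they were chosen only because, for 30 patients, they happen to round to the T2-weighted values quoted for deep myometrial infiltration. The function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among diseased
        "specificity": tn / (tn + fp),  # true negatives among non-diseased
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts for 30 patients (NOT taken from the study).
metrics = diagnostic_metrics(tp=11, fp=4, fn=2, tn=13)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Note that PPV and NPV depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence, whereas sensitivity and specificity do not depend on prevalence in this way.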
Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data.
Favero, F; Joshi, T; Marquard, A M; Birkbak, N J; Krzystanek, M; Li, Q; Szallasi, Z; Eklund, A C
2015-01-01
Exome or whole-genome deep sequencing of tumor DNA along with paired normal DNA can potentially provide a detailed picture of the somatic mutations that characterize the tumor. However, analysis of such sequence data can be complicated by the presence of normal cells in the tumor specimen, by intratumor heterogeneity, and by the sheer size of the raw data. In particular, determination of copy number variations from exome sequencing data alone has proven difficult; thus, single nucleotide polymorphism (SNP) arrays have often been used for this task. Recently, algorithms to estimate absolute, but not allele-specific, copy number profiles from tumor sequencing data have been described. We developed Sequenza, a software package that uses paired tumor-normal DNA sequencing data to estimate tumor cellularity and ploidy, and to calculate allele-specific copy number profiles and mutation profiles. We applied Sequenza, as well as two previously published algorithms, to exome sequence data from 30 tumors from The Cancer Genome Atlas. We assessed the performance of these algorithms by comparing their results with those generated using matched SNP arrays and processed by the allele-specific copy number analysis of tumors (ASCAT) algorithm. Comparison between Sequenza/exome and SNP/ASCAT revealed strong correlation in cellularity (Pearson's r = 0.90) and ploidy estimates (r = 0.42, or r = 0.94 after manually inspecting alternative solutions). This performance was noticeably superior to previously published algorithms. In addition, in artificial data simulating normal-tumor admixtures, Sequenza detected the correct ploidy in samples with tumor content as low as 30%. The agreement between Sequenza and SNP array-based copy number profiles suggests that exome sequencing alone is sufficient not only for identifying small scale mutations but also for estimating cellularity and inferring DNA copy number aberrations. © The Author 2014. 
Published by Oxford University Press on behalf of the European Society for Medical Oncology.
Sobti, Ranbir Chander; Kumari, Mamtesh; Sharma, Vijay Lakshmi; Sodhi, Monika; Mukesh, Manishi; Shouche, Yogesh
2009-11-01
The present study aimed to obtain the nucleotide sequences of a part of the COII mitochondrial gene amplified from individuals of five species of termites (Isoptera: Termitidae: Macrotermitinae). Four of them belonged to the genus Odontotermes (O. obesus, O. horni, O. bhagwatii and Odontotermes sp.) and one to Microtermes (M. obesi). Partial COII gene fragments were amplified by using specific primers. The sequences so obtained were characterized to calculate the frequency of each nucleotide base, and a high A + T content was observed. The interspecific pairwise sequence divergence in Odontotermes species ranged from 6.5% to 17.1% across the COII fragment. M. obesi sequence diversity ranged from 2.5% with Odontotermes sp. to 19.0% with O. bhagwatii. Phylogenetic trees drawn on the basis of the distance neighbour-joining method revealed three main clades clustering all the individuals according to their genera and families.
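Both quantities reported here, base composition and pairwise divergence, are simple to compute from aligned sequences. The following is a minimal sketch under stated assumptions: the abstract does not say which distance model was used, so the uncorrected p-distance (proportion of differing sites) stands in for it, and the function names and toy sequences are hypothetical.

```python
from collections import Counter

def base_frequencies(seq):
    """Relative frequency of A, C, G and T, ignoring any other symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGT"}

def at_content(seq):
    """Combined A + T fraction of a sequence."""
    freqs = base_frequencies(seq)
    return freqs["A"] + freqs["T"]

def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between two aligned
    sequences; columns containing gaps or ambiguity codes are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    return sum(a != b for a, b in pairs) / len(pairs)

# Toy aligned fragments (not real COII data).
s1 = "ATTATAGCAATTTTA"
s2 = "ATTATGGCTATTTTA"
print(f"A+T content of s1: {at_content(s1):.1%}")
print(f"p-distance s1 vs s2: {p_distance(s1, s2):.1%}")
```

A model-corrected distance (for example Kimura two-parameter) would give somewhat larger values than the p-distance at the higher divergences quoted above, since it accounts for multiple substitutions at the same site.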
Gaur, Sonia; Harmon, Stephanie; Gupta, Rajan T; Margolis, Daniel J; Lay, Nathan; Mehralivand, Sherif; Merino, Maria J; Wood, Bradford J; Pinto, Peter A; Shih, Joanna H; Choyke, Peter L; Turkbey, Baris
2018-04-25
To determine the independent contribution of each prostate multiparametric magnetic resonance imaging (mpMRI) sequence to cancer detection when read in isolation. Prostate mpMRI at 3-Tesla with endorectal coil from 45 patients (n = 30 prostatectomy cases, n = 15 controls with negative magnetic resonance imaging [MRI] or biopsy) were retrospectively interpreted. Sequences (T2-weighted [T2W] MRI, diffusion-weighted imaging [DWI], and dynamic contrast-enhanced [DCE] MRI; N = 135) were separately distributed to three radiologists at different institutions. Readers evaluated each sequence blinded to other mpMRI sequences. Findings were correlated to whole-mount pathology. Cancer detection sensitivity and positive predictive value for whole prostate (WP), transition zone, and peripheral zone were evaluated per sequence by reader, with reader concordance measured by index of specific agreement. Cancer detection rates (CDRs) were calculated for combinations of independently read sequences. 44 patients were evaluable (cases median prostate-specific antigen 6.83 [range 1.95-51.13] ng/mL, age 62 [45-71] years; controls prostate-specific antigen 6.85 [2.4-10.87] ng/mL, age 65.5 [47-71] years). Readers had highest sensitivity on DWI (59%) vs T2W MRI (48%) and DCE (23%) in WP. DWI-only positivity (DWI+/T2W-/DCE-) achieved highest CDR in WP (38%), compared to T2W-only (CDR 24%) and DCE-only (CDR 8%). DWI+/T2W+/DCE- achieved CDR 80%, an added benefit of 56.4% from T2W-only and of 42% from DWI-only (P < .0001). All three sequences interpreted independently positive gave highest CDR of 90%. Reader agreement was moderate (index of specific agreement: T2W = 54%, DWI = 58%, DCE = 33%). When prostate mpMRI sequences are interpreted independently by multiple observers, DWI achieves highest sensitivity and CDR in transition zone and peripheral zone. T2W and DCE MRI both add value to detection; mpMRI achieves highest detection sensitivity when all three mpMRI sequences are positive. 
Published by Elsevier Inc.
Cycle-time determination and process control of sequencing batch membrane bioreactors.
Krampe, J
2013-01-01
In this paper a method to determine the cycle time for sequencing batch membrane bioreactors (SBMBRs) is introduced. One of the advantages of SBMBRs is the simplicity of adapting them to varying wastewater composition. The benefit of this flexibility can only be fully utilised if the cycle times are optimised for the specific inlet load conditions. This requires either proactive and ongoing operator adjustment or active predictive instrument-based control. Determination of the cycle times for conventional sequencing batch reactor (SBR) plants is usually based on experience. Due to the higher mixed liquor suspended solids concentrations in SBMBRs and the limited experience with their application, a new approach to calculate the cycle time had to be developed. Based on results from a semi-technical pilot plant, the paper presents an approach for calculating the cycle time in relation to the influent concentration according to the Activated Sludge Model No. 1 and the German HSG (Hochschulgruppe) Approach. The approach presented in this paper considers the increased solid contents in the reactor and the resultant shortened reaction times. This allows for an exact calculation of the nitrification and denitrification cycles with a tolerance of only a few minutes. Ultimately the same approach can be used for a predictive control strategy and for conventional SBR plants.
Three-dimensional T1rho-weighted MRI at 1.5 Tesla.
Borthakur, Arijitt; Wheaton, Andrew; Charagundla, Sridhar R; Shapiro, Erik M; Regatte, Ravinder R; Akella, Sarma V S; Kneeland, J Bruce; Reddy, Ravinder
2003-06-01
To design and implement a magnetic resonance imaging (MRI) pulse sequence capable of performing three-dimensional T(1rho)-weighted MRI on a 1.5-T clinical scanner, and determine the optimal sequence parameters, both theoretically and experimentally, so that the energy deposition by the radiofrequency pulses in the sequence, measured as the specific absorption rate (SAR), does not exceed safety guidelines for imaging human subjects. A three-pulse cluster was pre-encoded to a three-dimensional gradient-echo imaging sequence to create a three-dimensional, T(1rho)-weighted MRI pulse sequence. Imaging experiments were performed on a GE clinical scanner with a custom-built knee-coil. We validated the performance of this sequence by imaging articular cartilage of a bovine patella and comparing T(1rho) values measured by this sequence to those obtained with a previously tested two-dimensional imaging sequence. Using a previously developed model for SAR calculation, the imaging parameters were adjusted such that the energy deposition by the radiofrequency pulses in the sequence did not exceed safety guidelines for imaging human subjects. The actual temperature increase due to the sequence was measured in a phantom by an MRI-based temperature mapping technique. Following these experiments, the performance of this sequence was demonstrated in vivo by obtaining T(1rho)-weighted images of the knee joint of a healthy individual. Calculated T(1rho) of articular cartilage in the specimen was similar for both three-dimensional and two-dimensional methods (84 +/- 2 msec and 80 +/- 3 msec, respectively). The temperature increase in the phantom resulting from the sequence was 0.015 degrees C, which is well below the established safety guidelines. Images of the human knee joint in vivo demonstrate a clear delineation of cartilage from surrounding tissues. We developed and implemented a three-dimensional T(1rho)-weighted pulse sequence on a 1.5-T clinical scanner.
Copyright 2003 Wiley-Liss, Inc.
A New Method for Setting Calculation Sequence of Directional Relay Protection in Multi-Loop Networks
NASA Astrophysics Data System (ADS)
Haijun, Xiong; Qi, Zhang
2016-08-01
The workload of relay protection setting calculation in multi-loop networks may be reduced effectively by optimizing the setting calculation sequence. A new method for determining the setting calculation sequence of directional distance relay protection in multi-loop networks, based on the minimum broken nodes cost vector (MBNCV), was proposed to solve the problems experienced with current methods. Existing methods based on the minimum breakpoint set (MBPS) lead to more broken edges when untying the loops in the dependency relationships of relays, possibly leading to more iterative calculation workload in setting calculations. A model-driven approach based on behavior trees (BT) was presented to improve adaptability to similar problems. After extending the BT model by adding real-time system characters, timed BT was derived and the dependency relationship in multi-loop networks was then modeled. The model was translated into communication sequence process (CSP) models and an optimized setting calculation sequence in multi-loop networks was finally calculated by tools. A 5-node multi-loop network was used as an example to demonstrate the effectiveness of the modeling and calculation method. Several examples were then calculated, with results indicating that the method effectively reduces the number of forced broken edges for protection setting calculation in multi-loop networks.
Ito, Yuji
2017-01-01
As an alternative to hybridoma technology, the antibody phage library system can also be used for antibody selection. This method enables the isolation of antigen-specific binders through an in vitro selection process known as biopanning. While it has several advantages, such as an avoidance of animal immunization, the phage cloning and screening steps of biopanning are time-consuming and problematic. Here, we introduce a novel biopanning method combined with high-throughput sequencing (HTS) using a next-generation sequencer (NGS) to save time and effort in antibody selection, and to increase the diversity of acquired antibody sequences. Biopannings against a target antigen were performed using a human single chain Fv (scFv) antibody phage library. VH genes in pooled phages at each round of biopanning were analyzed by HTS on a NGS. The obtained data were trimmed, merged, and translated into amino acid sequences. The frequencies (%) of the respective VH sequences at each biopanning step were calculated, and the amplification factor (change of frequency through biopanning) was obtained to estimate the potential for antigen binding. A phylogenetic tree was drawn using the top 50 VH sequences with high amplification factors. Representative VH sequences forming the cluster were then picked up and used to reconstruct scFv genes harboring these VHs. Their derived scFv-Fc fusion proteins showed clear antigen binding activity. These results indicate that a combination of biopanning and HTS enables the rapid and comprehensive identification of specific binders from antibody phage libraries.
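The frequency and amplification-factor step described in this abstract can be sketched as follows. This is a minimal illustration, not the author's pipeline: it assumes the amplification factor is the ratio of a sequence's frequency in a later round to its frequency in an earlier round (the abstract only calls it the "change of frequency through biopanning"), and the sequence labels and read counts are made up.

```python
from collections import Counter


def frequencies(seqs):
    """Percent frequency of each unique sequence within one biopanning round."""
    counts = Counter(seqs)
    total = sum(counts.values())
    return {s: 100.0 * c / total for s, c in counts.items()}


def amplification_factors(early_round, late_round):
    """Assumed definition: frequency in the late round / frequency in the early round.

    Only sequences observed in both rounds are scored.
    """
    f_early = frequencies(early_round)
    f_late = frequencies(late_round)
    return {s: f_late[s] / f_early[s] for s in f_late if s in f_early}


# Hypothetical translated VH reads from two rounds (labels stand in for sequences).
round1 = ["VH_A"] * 2 + ["VH_B"] * 6 + ["VH_C"] * 2
round3 = ["VH_A"] * 8 + ["VH_B"] * 1 + ["VH_C"] * 1

af = amplification_factors(round1, round3)
ranked = sorted(af, key=af.get, reverse=True)  # candidates, most enriched first
```

Ranking by enrichment rather than by final frequency favors clones that grow through selection even if they start rare, which matches the abstract's use of the factor to estimate binding potential.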
Optimal control design of turbo spin‐echo sequences with applications to parallel‐transmit systems
Hoogduin, Hans; Hajnal, Joseph V.; van den Berg, Cornelis A. T.; Luijten, Peter R.; Malik, Shaihan J.
2016-01-01
Purpose: The design of turbo spin‐echo sequences is modeled as a dynamic optimization problem which includes the case of inhomogeneous transmit radiofrequency fields. This problem is efficiently solved by optimal control techniques making it possible to design patient‐specific sequences online. Theory and Methods: The extended phase graph formalism is employed to model the signal evolution. The design problem is cast as an optimal control problem and an efficient numerical procedure for its solution is given. The numerical and experimental tests address standard multiecho sequences and pTx configurations. Results: Standard, analytically derived flip angle trains are recovered by the numerical optimal control approach. New sequences are designed where constraints on radiofrequency total and peak power are included. In the case of parallel transmit application, the method is able to calculate the optimal echo train for two‐dimensional and three‐dimensional turbo spin echo sequences in the order of 10 s with a single central processing unit (CPU) implementation. The image contrast is maintained through the whole field of view despite inhomogeneities of the radiofrequency fields. Conclusion: The optimal control design sheds new light on the sequence design process and makes it possible to design sequences in an online, patient‐specific fashion. Magn Reson Med 77:361–373, 2017. © 2016 The Authors. Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. on behalf of International Society for Magnetic Resonance in Medicine. PMID:26800383
PrimerStation: a highly specific multiplex genomic PCR primer design server for the human genome
Yamada, Tomoyuki; Soma, Haruhiko; Morishita, Shinichi
2006-01-01
PrimerStation is a web service that calculates primer sets guaranteeing high specificity against the entire human genome. To achieve high accuracy, we used the hybridization ratio of primers in liquid solution. Calculating the status of sequence hybridization in terms of the stringent hybridization ratio is computationally costly, and no web service checks the entire human genome and returns a highly specific primer set calculated using a precise physicochemical model. To shorten the response time, we precomputed candidates for specific primers using a massively parallel computer with 100 CPUs (SunFire 15 K) about 3 months in advance. This enables PrimerStation to search and output qualified primers interactively. PrimerStation can select highly specific primers suitable for multiplex PCR by seeking a wider temperature range that minimizes the possibility of cross-reaction. It also allows users to add heuristic rules to the primer design, e.g. the exclusion of single nucleotide polymorphisms (SNPs) in primers, the avoidance of poly(A) and CA-repeats in the PCR products, and the elimination of defective primers using the secondary structure prediction. We performed several tests to verify the PCR amplification of randomly selected primers for ChrX, and we confirmed that the primers amplify specific PCR products perfectly. PMID:16845094
Sloma, Michael F; Mathews, David H
2016-12-01
RNA secondary structure prediction is widely used to analyze RNA sequences. In an RNA partition function calculation, free energy nearest neighbor parameters are used in a dynamic programming algorithm to estimate statistical properties of the secondary structure ensemble. Previously, partition functions have largely been used to estimate the probability that a given pair of nucleotides form a base pair, the conditional stacking probability, the accessibility to binding of a continuous stretch of nucleotides, or a representative sample of RNA structures. Here it is demonstrated that an RNA partition function can also be used to calculate the exact probability of formation of hairpin loops, internal loops, bulge loops, or multibranch loops at a given position. This calculation can also be used to estimate the probability of formation of specific helices. Benchmarking on a set of RNA sequences with known secondary structures indicated that loops that were calculated to be more probable were more likely to be present in the known structure than less probable loops. Furthermore, highly probable loops are more likely to be in the known structure than the set of loops predicted in the lowest free energy structures. © 2016 Sloma and Mathews; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
NASA Astrophysics Data System (ADS)
Shanak, Siba; Helms, Volkhard
2014-12-01
Adenine and cytosine methylation are two important epigenetic modifications of DNA sequences at the levels of the genome and transcriptome. To characterize the differential roles of methylating adenine or cytosine with respect to their hydration properties, we performed conventional MD simulations and free energy perturbation calculations for two particular DNA sequences, namely the brain-derived neurotrophic factor (BDNF) promoter and the R.DpnI-bound DNA that are known to undergo methylation of C5-methyl cytosine and N6-methyl adenine, respectively. We found that a single methylated cytosine has a clearly favorable hydration free energy over cytosine since the attached methyl group has a slightly polar character. In contrast, capping the strongly polar N6 of adenine with a methyl group gives a slightly unfavorable contribution to its free energy of solvation. Performing the same demethylation in the context of a DNA double-strand gave quite similar results for the more solvent-accessible cytosine but much more unfavorable results for the rather buried adenine. Interestingly, the same demethylation reactions are far more unfavorable when performed in the context of the opposite (BDNF or R.DpnI target) sequence. This suggests a natural preference for methylation in a specific sequence context. In addition, free energy calculations for demethylating adenine or cytosine in the context of B-DNA vs. Z-DNA suggest that the conformational B-Z transition of DNA is rather a property of cytosine-methylated sequences but is not preferable for the adenine-methylated sequences investigated here.
Zhitnikova, M Y; Shestopalova, A V
2017-11-01
The structural adjustments of the sugar-phosphate DNA backbone (switching of the γ angle (O5'-C5'-C4'-C3') from canonical to alternative conformations and/or C2'-endo → C3'-endo transition of deoxyribose) lead to the sequence-specific changes in accessible surface area of both polar and non-polar atoms of the grooves and the polar/hydrophobic profile of the latter ones. The distribution of the minor groove electrostatic potential is likely to be changing as a result of such conformational rearrangements in sugar-phosphate DNA backbone. Our analysis of the crystal structures of the short free DNA fragments and calculation of their electrostatic potentials allowed us to determine: (1) the number of classical and alternative γ angle conformations in the free B-DNA; (2) changes in the minor groove electrostatic potential, depending on the conformation of the sugar-phosphate DNA backbone; (3) the effect of the DNA sequence on the minor groove electrostatic potential. We have demonstrated that the structural adjustments of the DNA double helix (the conformations of the sugar-phosphate backbone and the minor groove dimensions) induce changes in the distribution of the minor groove electrostatic potential and are sequence-specific. Therefore, these features of the minor groove sizes and distribution of minor groove electrostatic potential can be used as a signal for recognition of the target DNA sequence by protein in the implementation of the indirect readout mechanism.
Direct Sequence Detection of Structured H5 Influenza Viral RNA
Kerby, Matthew B.; Freeman, Sarah; Prachanronarong, Kristina; Artenstein, Andrew W.; Opal, Steven M.; Tripathi, Anubhav
2008-01-01
We describe the development of sequence-specific molecular beacons (dual-labeled DNA probes) for identification of the H5 influenza subtype, cleavage motif, and receptor specificity when hybridized directly with in vitro transcribed viral RNA (vRNA). The cloned hemagglutinin segment from a highly pathogenic H5N1 strain, A/Hanoi/30408/2005(H5N1), isolated from humans was used as template for in vitro transcription of sense-strand vRNA. The hybridization behavior of vRNA and a conserved subtype probe was characterized experimentally by varying conditions of time, temperature, and Mg2+ to optimize detection. Comparison of the hybridization rates of probe to DNA and RNA targets indicates that conformational switching of influenza RNA structure is a rate-limiting step and that the secondary structure of vRNA dominates the binding kinetics. The sensitivity and specificity of probe recognition of other H5 strains was calculated from sequence matches to the National Center for Biotechnology Information influenza database. The hybridization specificity of the subtype probes was experimentally verified with point mutations within the probe loop at five locations corresponding to the other human H5 strains. The abundance frequencies of the hemagglutinin cleavage motif and sialic acid recognition sequences were experimentally tested for H5 in all host viral species. Although the detection assay must be coupled with isothermal amplification on the chip, the new probes form the basis of a portable point-of-care diagnostic device for influenza subtyping. PMID:18403607
Zepeda-Mendoza, Marie Lisandra; Bohmann, Kristine; Carmona Baez, Aldo; Gilbert, M Thomas P
2016-05-03
DNA metabarcoding is an approach for identifying multiple taxa in an environmental sample using specific genetic loci and taxa-specific primers. When combined with high-throughput sequencing it enables the taxonomic characterization of large numbers of samples in a relatively time- and cost-efficient manner. One recent laboratory development is the addition of 5'-nucleotide tags to both primers producing double-tagged amplicons and the use of multiple PCR replicates to filter erroneous sequences. However, there is currently no available toolkit for the straightforward analysis of datasets produced in this way. We present DAMe, a toolkit for the processing of datasets generated by double-tagged amplicons from multiple PCR replicates derived from an unlimited number of samples. Specifically, DAMe can be used to (i) sort amplicons by tag combination, (ii) evaluate PCR replicates dissimilarity, and (iii) filter sequences derived from sequencing/PCR errors, chimeras, and contamination. This is attained by calculating the following parameters: (i) sequence content similarity between the PCR replicates from each sample, (ii) reproducibility of each unique sequence across the PCR replicates, and (iii) copy number of the unique sequences in each PCR replicate. We showcase the insights that can be obtained using DAMe prior to taxonomic assignment, by applying it to two real datasets that vary in their complexity regarding number of samples, sequencing libraries, PCR replicates, and used tag combinations. Finally, we use a third mock dataset to demonstrate the impact and importance of filtering the sequences with DAMe. DAMe allows the user-friendly manipulation of amplicons derived from multiple samples with PCR replicates built in a single or multiple sequencing libraries. 
It allows the user to: (i) collapse amplicons into unique sequences and sort them by tag combination while retaining the sample identifier and copy number information, (ii) identify sequences carrying unused tag combinations, (iii) evaluate the comparability of PCR replicates of the same sample, and (iv) filter tagged amplicons from a number of PCR replicates using parameters of minimum length, copy number, and reproducibility across the PCR replicates. This enables an efficient analysis of complex datasets, and ultimately increases the ease of handling datasets from large-scale studies.
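The filtering criteria listed above (minimum length, copy number, and reproducibility across PCR replicates) can be illustrated with a short sketch. This is not DAMe's code or command-line interface: the function name, parameter names, and thresholds are hypothetical, and each replicate is assumed to be available as a mapping from unique sequence to copy number.

```python
def filter_sequences(replicates, min_length=5, min_copies=2, min_replicates=2):
    """Keep sequences that pass length, copy-number, and reproducibility thresholds.

    replicates: list of dicts (one per PCR replicate), sequence -> copy number.
    A sequence is kept if it is at least min_length long and reaches min_copies
    in at least min_replicates replicates. The returned value is the summed copy
    number over the replicates in which it passed.
    """
    kept = {}
    all_seqs = set().union(*replicates)  # every sequence seen in any replicate
    for seq in all_seqs:
        if len(seq) < min_length:
            continue
        passing = [rep[seq] for rep in replicates if rep.get(seq, 0) >= min_copies]
        if len(passing) >= min_replicates:
            kept[seq] = sum(passing)
    return kept


# Hypothetical copy-number tables for three PCR replicates of one sample.
rep1 = {"ACGTACGT": 10, "ACGTTTTT": 1, "ACG": 50}
rep2 = {"ACGTACGT": 7, "ACGTTTTT": 3, "ACG": 40}
rep3 = {"ACGTACGT": 4, "GGGGCCCC": 9}

kept = filter_sequences([rep1, rep2, rep3])
```

Here "ACG" fails the length threshold, "ACGTTTTT" reaches the copy threshold in only one replicate, and "GGGGCCCC" appears in only one replicate, so only "ACGTACGT" survives.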
Moschetta, Marco; Telegrafo, Michele; Rella, Leonarda; Capolongo, Arcangela; Stabile Ianora, Amato Antonio; Angelelli, Giuseppe
2014-07-01
Diffusion imaging represents a new imaging tool for the diagnosis of breast cancer. This study aims to investigate the role of diffusion-weighted MRI with background body signal suppression (DWIBS) for evaluating breast lesions. 90 patients were prospectively evaluated by MRI with STIR, TSE-T2, contrast enhanced THRIVE-T1 and DWIBS sequences. DWIBS were analyzed searching for the presence of breast lesions and calculating the ADC value. ADC values of ≤1.44 × 10⁻³ mm²/s were considered suspicious for malignancy. This analysis was then compared with the histological findings. Sensitivity, specificity, diagnostic accuracy (DA), positive predictive value (PPV) and negative predictive value (NPV) were calculated. In 53/90 (59%) patients, DWIBS indicated the presence of breast lesions, 16 (30%) with ADC values of >1.44 and 37 (70%) with ADC ≤1.44. The comparison with histology showed 25 malignant and 28 benign lesions. DWIBS sequences obtained sensitivity, specificity, DA, PPV and NPV values of 100, 82, 87, 68 and 100%, respectively. DWIBS can be proposed in the MRI breast protocol representing an accurate diagnostic complement. Copyright © 2014 Elsevier Inc. All rights reserved.
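The reported diagnostic values can be checked against a 2×2 table. The abstract does not give the table, so the one below is an assumption: all 25 malignant lesions fall in the ADC ≤1.44 group (implied by 100% sensitivity), the remaining 12 of the 37 low-ADC lesions are false positives, and the 37 patients with no lesion on DWIBS are counted as true negatives alongside the 16 high-ADC lesions.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic test metrics from a 2x2 confusion table (as fractions)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed table reconstructed from the counts in the abstract (see lead-in):
# tp = 25 malignant with ADC <= 1.44, fp = 37 - 25 = 12,
# fn = 0, tn = 16 (ADC > 1.44) + 37 (no lesion on DWIBS) = 53.
metrics = diagnostic_metrics(tp=25, fp=12, fn=0, tn=53)
percent = {name: round(100 * value) for name, value in metrics.items()}
```

Under these assumptions the rounded percentages are 100, 82, 87, 68 and 100, which agree with the values quoted in the abstract.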
Schneider, Markus; Rosam, Mathias; Glaser, Manuel; Patronov, Atanas; Shah, Harpreet; Back, Katrin Christiane; Daake, Marina Angelika; Buchner, Johannes; Antes, Iris
2016-10-01
Substrate binding to Hsp70 chaperones is involved in many biological processes, and the identification of potential substrates is important for a comprehensive understanding of these events. We present a multi-scale pipeline for an accurate, yet efficient prediction of peptides binding to the Hsp70 chaperone BiP by combining sequence-based prediction with molecular docking and MMPBSA calculations. First, we measured the binding of 15mer peptides from known substrate proteins of BiP by peptide array (PA) experiments and performed an accuracy assessment of the PA data by fluorescence anisotropy studies. Several sequence-based prediction models were fitted using this and other peptide binding data. A structure-based position-specific scoring matrix (SB-PSSM) derived solely from structural modeling data forms the core of all models. The matrix elements are based on a combination of binding energy estimations, molecular dynamics simulations, and analysis of the BiP binding site, which led to new insights into the peptide binding specificities of the chaperone. Using this SB-PSSM, peptide binders could be predicted with high selectivity even without training of the model on experimental data. Additional training further increased the prediction accuracies. Subsequent molecular docking (DynaDock) and MMGBSA/MMPBSA-based binding affinity estimations for predicted binders allowed the identification of the correct binding mode of the peptides as well as the calculation of nearly quantitative binding affinities. The general concept behind the developed multi-scale pipeline can readily be applied to other protein-peptide complexes with linearly bound peptides, for which sufficient experimental binding data for the training of classical sequence-based prediction models is not available. Proteins 2016; 84:1390-1407. © 2016 Wiley Periodicals, Inc.
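As a generic illustration of how a position-specific scoring matrix is applied to a peptide, the sketch below slides a scoring window along the sequence and reports the best-scoring register. It is not the SB-PSSM from this work: the matrix values, the core width of three positions, and the function name are all hypothetical.

```python
def best_window_score(peptide, pssm):
    """Slide a PSSM along a peptide and return (start index, score) of the best window.

    pssm: list of dicts, one per core position, mapping residue -> score.
    Residues missing from a position's dict contribute 0.0.
    Assumes len(peptide) >= len(pssm).
    """
    width = len(pssm)
    scores = [
        sum(pssm[i].get(peptide[start + i], 0.0) for i in range(width))
        for start in range(len(peptide) - width + 1)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy three-position matrix with made-up scores (higher = more favorable).
toy_pssm = [
    {"L": 1.0, "D": -1.0},
    {"A": 0.5},
    {"L": 1.5, "K": -0.5},
]

start, score = best_window_score("DKLALKD", toy_pssm)
```

In this toy case the window "LAL" starting at index 2 scores highest; a real matrix would cover all 20 residues at every core position.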
Yoshimura, Tomoaki; Kuribara, Hideo; Matsuoka, Takeshi; Kodama, Takashi; Iida, Mayu; Watanabe, Takahiro; Akiyama, Hiroshi; Maitani, Tamio; Furui, Satoshi; Hino, Akihiro
2005-03-23
The applicability of quantifying genetically modified (GM) maize and soy to processed foods was investigated using heat treatment processing models. The detection methods were based on real-time quantitative polymerase chain reaction (PCR) analysis. Ground seeds of insect resistant GM maize (MON810) and glyphosate tolerant Roundup Ready (RR) soy were dissolved in water and were heat treated by autoclaving for various time intervals. The calculated copy numbers of the recombinant and taxon specific deoxyribonucleic acid (DNA) sequences in the extracted DNA solution were found to decrease with time. This decrease was influenced by the PCR-amplified size. The conversion factor (Cf), which is the ratio of the recombinant DNA sequence to the taxon specific DNA sequence and is used as a constant number for calculating GM% at each event, tended to be stable when the sizes of PCR products of two DNA sequences were nearly equal. The results suggested that the size of the PCR product plays a key role in the quantification of GM organisms in processed foods. It is believed that the Cf of the endosperm (3n) is influenced by whether the GM originated from a paternal or maternal source. The embryos and endosperms were separated from the F1 generation seeds of five GM maize events, and their Cf values were measured. Both paternal and maternal GM events were identified. In the paternal GM events, the endosperm Cf was lower than that of the embryo, and in the maternal GM events, the embryo Cf was lower than that of the endosperm. These results demonstrate the difficulties encountered in the determination of GM% in maize grains (F2 generation) and in processed foods from maize and soy.
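The role of Cf and the parent-of-origin effect can be made concrete with a small calculation. This is a sketch under stated assumptions, not the authors' protocol: GM% is taken to be the measured recombinant-to-taxon copy ratio divided by Cf, times 100, and the expected fractions assume a single-copy insert inherited from one parent only, with the usual genome doses of 1 maternal : 1 paternal in the embryo and 2 maternal : 1 paternal in the triploid endosperm. All numbers are invented.

```python
def gm_percent(recombinant_copies, taxon_copies, cf):
    """Assumed formula: GM% = (recombinant copies / taxon-specific copies) / Cf * 100."""
    return (recombinant_copies / taxon_copies) / cf * 100.0


def expected_gm_fraction(tissue, gm_parent):
    """Expected fraction of genomes carrying the insert in F1 seed tissue.

    Genome doses (maternal, paternal): embryo 1:1, endosperm 2:1.
    Assumes the insert comes from exactly one parent.
    """
    maternal, paternal = {"embryo": (1, 1), "endosperm": (2, 1)}[tissue]
    gm_doses = maternal if gm_parent == "maternal" else paternal
    return gm_doses / (maternal + paternal)


# Hypothetical measurement: 50 recombinant copies per 1000 taxon-specific copies.
example = gm_percent(50, 1000, cf=0.5)

paternal_endosperm = expected_gm_fraction("endosperm", "paternal")  # 1/3
paternal_embryo = expected_gm_fraction("embryo", "paternal")        # 1/2
maternal_endosperm = expected_gm_fraction("endosperm", "maternal")  # 2/3
maternal_embryo = expected_gm_fraction("embryo", "maternal")        # 1/2
```

The dose arithmetic gives a lower endosperm ratio than embryo ratio when the insert is paternal and the reverse when it is maternal, which is why a single constant Cf is hard to apply to whole grains.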
Marine Fungi: Their Ecology and Molecular Diversity
NASA Astrophysics Data System (ADS)
Richards, Thomas A.; Jones, Meredith D. M.; Leonard, Guy; Bass, David
2012-01-01
Fungi appear to be rare in marine environments. There are relatively few marine isolates in culture, and fungal small subunit ribosomal DNA (SSU rDNA) sequences are rarely recovered in marine clone library experiments (i.e., culture-independent sequence surveys of eukaryotic microbial diversity from environmental DNA samples). To explore the diversity of marine fungi, we took a broad selection of SSU rDNA data sets and calculated a summary phylogeny. Bringing these data together identified a diverse collection of marine fungi, including sequences branching close to chytrids (flagellated fungi), filamentous hypha-forming fungi, and multicellular fungi. However, the majority of the sequences branched with ascomycete and basidiomycete yeasts. We discuss evidence for 36 novel marine lineages, the majority and most divergent of which branch with the chytrids. We then investigate what these data mean for the evolutionary history of the Fungi and specifically marine-terrestrial transitions. Finally, we discuss the roles of fungi in marine ecosystems.
Effect of Base Sequence "Defects" on the Electrostatic Potential of Dissolved DNA
NASA Astrophysics Data System (ADS)
Adams, Scott V.; Wagner, Katrina; Kephart, Thomas S.; Edwards, Glenn
1997-11-01
An analytical model of the electrostatic potential surrounding dissolved DNA has been developed. The model consists of an all-atom, mathematically helical structure for DNA, in which the atoms are arranged in infinite lines of discrete point charges on concentric cylindrical surfaces. The surrounding solvent and counterions are treated with the Debye-Hückel approximation (Wagner et al., Biophysical Journal 73, 21-30, 1997). Variation in the electrostatic potential due to structural differences between A, B, and Z conformations and homopolymer base sequence is apparent. The most recent modification to the model exploits the principle of superposition to calculate the potential of DNA with a base sequence containing "defects." That is, the base sequence is no longer uniform along the polymer. Differences between the potential of homopolymer DNA and the potential of DNA containing base "defects" are immediately obvious. These results may aid in understanding the role of electrostatics in base-sequence specificity exhibited by DNA-binding proteins.
Atri, Mostafa; Zhang, Zheng; Marques, Helga; Gorelick, Jeremy; Harisinghani, Mukesh; Sohaib, Aslam; Koh, Dow-Mu; Raman, Steven; Gee, Michael; Choi, Haesun; Landrum, Lisa; Mannel, Robert; Chuang, Linus; Yu, Jian Qin (Michael); McCourt, Carolyn Kay; Gold, Michael
2014-01-01
Rationale and objectives: To assess if ferumoxtran-10 (f-10) improves accuracy of MRI to detect lymph node (LN) metastasis in advanced cervical cancer. Materials and methods: F-10 MRI component of an IRB approved HIPAA compliant ACRIN/GOG trial was analyzed. Patients underwent f-10 MRI followed by extra-peritoneal or laparoscopic pelvic and abdominal lymphadenectomy. F-10-sensitive sequences were T2* GRE sequences with TE of 12 and 21. Seven independent blinded readers reviewed f-10-insensitive sequences and all sequences in different sessions. Region correlations were performed between pathology and MRI for eight abdomen and pelvis regions. Sensitivity and specificity were calculated at participant level. Reference standard is based on pathology result of surgically removed LNs. Results: Among 43 women enrolled in the trial between September 2007 and November 2009, 33 women (mean age 49 ± 11 years old) with advanced cervical cancer (12 IB2, 3 IIA, 15 IIB and 3 IIIB, 29 squamous cell carcinomas, 32 grade 2 or 3) were evaluable. Based on histopathology, LN metastasis was 39% in abdomen and 70% in pelvis. Sensitivity of all sequence review in pelvis, abdomen, and combined were 83%, 60%, and 86%, compared with 78%, 54%, and 80% for f-10 insensitive sequences (P: 0.24, 0.44 and 0.14, respectively). Mean diameter of the largest positive focus on histopathology was 13.7 mm in abdomen and 18.8 mm in pelvis (P = 0.018). Specificities of all sequence review in pelvis, abdomen, and combined were 48%, 75%, and 43%, compared with 75%, 83%, and 73% (P: 0.003, 0.14, 0.002 respectively) for f-10 insensitive sequences. Conclusion: Addition of f-10 increased MRI sensitivity to detect LN metastasis in advanced cervical cancer. Increased sensitivity did not reach statistical significance and was at the expense of lower specificity. PMID:25774381
Sequence dependency of canonical base pair opening in the DNA double helix
Villa, Alessandra
2017-01-01
The flipping-out of a DNA base from the double helical structure is a key step of many cellular processes, such as DNA replication, modification and repair. Base pair opening is the first step of base flipping and the exact mechanism is still not well understood. We investigate sequence effects on base pair opening using extensive classical molecular dynamics simulations targeting the opening of 11 different canonical base pairs in two DNA sequences. Two popular biomolecular force fields are applied. To enhance sampling and calculate free energies, we bias the simulation along a simple distance coordinate using a newly developed adaptive sampling algorithm. The simulation is guided back and forth along the coordinate, allowing for multiple opening pathways. We compare the calculated free energies with those from an NMR study and check assumptions of the model used for interpreting the NMR data. Our results further show that the neighboring sequence is an important factor for the opening free energy, but also indicate that other sequence effects may play a role. All base pairs are observed to have a propensity for opening toward the major groove. The preferred opening base is cytosine for GC base pairs, while for AT there is sequence dependent competition between the two bases. For AT opening, we identify two non-canonical base pair interactions contributing to a local minimum in the free energy profile. For both AT and CG we observe long-lived interactions with water and with sodium ions at specific sites on the open base pair. PMID:28369121
Kalendar, Ruslan; Tselykh, Timofey V; Khassenov, Bekbolat; Ramanculov, Erlan M
2017-01-01
This chapter introduces the FastPCR software as an integrated tool environment for PCR primer and probe design, which predicts properties of oligonucleotides based on experimental studies of PCR efficiency. The software provides comprehensive facilities for designing primers for most PCR applications and their combinations. These include standard PCR as well as multiplex, long-distance, inverse, real-time, group-specific, unique, and overlap extension PCR for multi-fragment assembly cloning, and loop-mediated isothermal amplification (LAMP). It also contains a built-in program to design oligonucleotide sets both for long sequence assembly by ligase chain reaction and for design of amplicons that tile across a region(s) of interest. The software calculates the melting temperature for standard and degenerate oligonucleotides, including those with locked nucleic acid (LNA) and other modifications. It also provides analyses for a set of primers with the prediction of oligonucleotide properties, dimer and G/C-quadruplex detection, and linguistic complexity, as well as a primer dilution and resuspension calculator. The program consists of various bioinformatical tools for analysis of sequences with the GC or AT skew, CG% and GA% content, and the purine-pyrimidine skew. It also analyzes the linguistic sequence complexity, generates random DNA sequences, and performs restriction endonuclease analysis. The program can find or create restriction enzyme recognition sites for coding sequences and supports the clustering of sequences. It performs efficient and complete detection of various repeat types with visual display. The FastPCR software supports batch processing of sequence files, which is essential for automation. The program is available for download at http://primerdigital.com/fastpcr.html , and its online version is located at http://primerdigital.com/tools/pcr.html .
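The skew and composition measures mentioned in this abstract can be illustrated with a minimal sketch. It uses the conventional definitions (GC skew = (G − C)/(G + C), AT skew = (A − T)/(A + T)); whether FastPCR computes them exactly this way is an assumption, and the function names `skew` and `composition` are hypothetical, not part of FastPCR.

```python
from collections import Counter


def skew(seq: str, a: str, b: str) -> float:
    """Return (a - b) / (a + b) for the counts of bases a and b in seq."""
    counts = Counter(seq.upper())
    total = counts[a] + counts[b]
    if total == 0:
        return 0.0  # neither base present: skew is undefined, report 0
    return (counts[a] - counts[b]) / total


def composition(seq: str) -> dict:
    """Summarise GC skew, AT skew and GC content (%) of a DNA sequence."""
    s = seq.upper()
    gc_percent = 100.0 * (s.count("G") + s.count("C")) / len(s) if s else 0.0
    return {
        "gc_skew": skew(s, "G", "C"),
        "at_skew": skew(s, "A", "T"),
        "gc_percent": gc_percent,
    }


# Toy sequence with 3 G, 1 C, 2 A and 2 T.
print(composition("GGGCATAT"))
```

In practice these values are usually computed over a sliding window along a longer sequence; the single-window form is kept here for brevity.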
ERIC Educational Resources Information Center
Cohen, Adam S.; German, Tamsin C.
2010-01-01
In a task where participants' overt task was to track the location of an object across a sequence of events, reaction times to unpredictable probes requiring an inference about a social agent's beliefs about the location of that object were obtained. Reaction times to false belief situations were faster than responses about the (false) contents of…
Chaouachi, Maher; El Malki, Redouane; Berard, Aurélie; Romaniuk, Marcel; Laval, Valérie; Brunel, Dominique; Bertheau, Yves
2008-03-26
The labeling of products containing genetically modified organisms (GMO) is linked to their quantification since a threshold for the presence of fortuitous GMOs in food has been established. This threshold is calculated from a combination of two absolute quantification values: one for the specific GMO target and the second for an endogenous reference gene specific to the taxon. Thus, the development of reliable methods to quantify GMOs using endogenous reference genes in complex matrices such as food and feed is needed. Plant identification can be difficult in the case of closely related taxa, which moreover are subject to introgression events. Based on the homology of beta-fructosidase sequences obtained from public databases, two pairs of consensus primers were designed for the detection, quantification, and differentiation of four Solanaceae: potato (Solanum tuberosum), tomato (Solanum lycopersicum), pepper (Capsicum annuum), and eggplant (Solanum melongena). Sequence variability was studied first using lines and cultivars (intraspecies sequence variability), then using taxa involved in gene introgressions, and finally, using taxonomically close taxa (interspecies sequence variability). This study allowed us to design four highly specific TaqMan-MGB probes. A duplex real time PCR assay was developed for simultaneous quantification of tomato and potato. For eggplant and pepper, only simplex real time PCR tests were developed. The results demonstrated the high specificity and sensitivity of the assays. We therefore conclude that beta-fructosidase can be used as an endogenous reference gene for GMO analysis.
A putative peroxidase cDNA from turnip and analysis of the encoded protein sequence.
Romero-Gómez, S; Duarte-Vázquez, M A; García-Almendárez, B E; Mayorga-Martínez, L; Cervantes-Avilés, O; Regalado, C
2008-12-01
A putative peroxidase cDNA was isolated from turnip roots (Brassica napus L. var. purple top white globe) by reverse transcriptase-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE). Total RNA extracted from mature turnip roots was used as a template for RT-PCR, using a degenerate primer designed to amplify the highly conserved distal motif of plant peroxidases. The resulting partial sequence was used to design the rest of the specific primers for 5' and 3' RACE. Two cDNA fragments were purified, sequenced, and aligned with the partial sequence from RT-PCR, and a complete overlapping sequence was obtained and labeled as BnPA (Genbank Accession No. AY423440, named as podC). The full length cDNA is 1167 bp long and contains a 1077 bp open reading frame (ORF) encoding a 358 deduced amino acid peroxidase polypeptide. The putative peroxidase (BnPA) showed a calculated Mr of 34 kDa and an isoelectric point (pI) of 4.5, with no significant identity with other reported turnip peroxidases. Sequence alignment showed that only three peroxidases have a significant identity with BnPA, namely AtP29a (84%) and AtPA2 (81%) from Arabidopsis thaliana, and HRPA2 (82%) from horseradish (Armoracia rusticana). Work is in progress to clone this gene into an adequate host to study the specific role and possible biotechnological applications of this alternative peroxidase source.
BlockLogo: visualization of peptide and sequence motif conservation
Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian; Sun, Jing; Schönbach, Christian; Reinherz, Ellis L.; Zhang, Guang Lan; Brusic, Vladimir
2013-01-01
BlockLogo is a web-server application for visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, selection of motif positions, type of sequence, and output format definition. The output has BlockLogo along with the sequence logo, and a table of motif frequencies. We deployed BlockLogo as an online application and have demonstrated its utility through examples that show visualization of T-cell epitopes and B-cell epitopes (both continuous and discontinuous). Our additional example shows a visualization and analysis of structural motifs that determine specificity of peptide binding to HLA-DR molecules. The BlockLogo server also employs selected experimentally validated prediction algorithms to enable on-the-fly prediction of MHC binding affinity to 15 common HLA class I and class II alleles as well as visual analysis of discontinuous epitopes from multiple sequence alignments. It enables the visualization and analysis of structural and functional motifs that are usually described as regular expressions. It provides a compact view of discontinuous motifs composed of distant positions within biological sequences. BlockLogo is available at: http://research4.dfci.harvard.edu/cvc/blocklogo/ and http://methilab.bu.edu/blocklogo/ PMID:24001880
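As an illustration of the block entropy idea, the sketch below treats the residues at the selected alignment positions as a single "block" per sequence and computes the Shannon entropy of the block frequencies. This is a generic reading of the abstract, not BlockLogo's actual implementation; the function name `block_entropy` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def block_entropy(alignment, positions):
    """Shannon entropy (bits) of the blocks formed by the selected columns.

    alignment: list of equal-length aligned sequences
    positions: 0-based column indices; they need not be contiguous
    Returns (entropy, Counter of block frequencies).
    """
    blocks = ["".join(seq[i] for i in positions) for seq in alignment]
    counts = Counter(blocks)
    n = len(blocks)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy, counts


# Toy alignment; columns 0 and 3 form a discontinuous motif.
toy = ["ACDE", "ACDF", "GCDE", "ACDE"]
entropy, counts = block_entropy(toy, [0, 3])
print(round(entropy, 3), counts.most_common())
```

Because the positions are passed as an arbitrary index list, the same function covers both continuous and discontinuous motifs, and the returned counts correspond to the kind of motif frequency table the abstract describes.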
Ujino-Ihara, Tokuko; Kanamori, Hiroyuki; Yamane, Hiroko; Taguchi, Yuriko; Namiki, Nobukazu; Mukai, Yuzuru; Yoshimura, Kensuke; Tsumura, Yoshihiko
2005-12-01
To identify and characterize lineage-specific genes of conifers, two sets of ESTs (with 12791 and 5902 ESTs, representing 5373 and 3018 gene transcripts, respectively) were generated from the Cupressaceae species Cryptomeria japonica and Chamaecyparis obtusa. These transcripts were compared with non-redundant sets of genes generated from Pinaceae species, other gymnosperms and angiosperms. About 6% of tentative unique genes (Unigenes) of C. japonica and C. obtusa had homologs in other conifers but not angiosperms, and about 70% had apparent homologs in angiosperms. The calculated GC contents of orthologous genes showed that GC contents of coniferous genes are likely to be lower than those of angiosperms. Comparisons of the numbers of homologous genes in each species suggest that copy numbers of genes may be correlated between diverse seed plants. This correlation suggests that the multiplicity of such genes may have arisen before the divergence of gymnosperms and angiosperms.
Thomas, Austen C; Jarman, Simon N; Haman, Katherine H; Trites, Andrew W; Deagle, Bruce E
2014-08-01
Ecologists are increasingly interested in quantifying consumer diets based on food DNA in dietary samples and high-throughput sequencing of marker genes. It is tempting to assume that food DNA sequence proportions recovered from diet samples are representative of consumers' diet proportions, despite the fact that captive feeding studies do not support that assumption. Here, we examine the idea of sequencing control materials of known composition along with dietary samples in order to correct for technical biases introduced during amplicon sequencing and biological biases such as variable gene copy number. Using the Ion Torrent PGM, we sequenced prey DNA amplified from scats of captive harbour seals (Phoca vitulina) fed a constant diet including three fish species in known proportions. Alongside, we sequenced a prey tissue mix matching the seals' diet to generate tissue correction factors (TCFs). TCFs improved the diet estimates (based on sequence proportions) for all species and reduced the average estimate error from 28 ± 15% (uncorrected) to 14 ± 9% (TCF-corrected). The experimental design also allowed us to infer the magnitude of prey-specific digestion biases and calculate digestion correction factors (DCFs). The DCFs were compared with possible proxies for differential digestion (e.g. fish protein%, fish lipid%) revealing a strong relationship between the DCFs and percent lipid of the fish prey, suggesting prey-specific corrections based on lipid content would produce accurate diet estimates in this study system. These findings demonstrate the value of parallel sequencing of food tissue mixtures in diet studies and offer new directions for future research in quantitative DNA diet analysis. © 2013 John Wiley & Sons Ltd.
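One plausible way to apply a correction factor derived from a control mix is sketched below. It assumes the factor for each species is the ratio of its known proportion to its observed sequence proportion in the control, and that corrected diet estimates are renormalised to sum to one; the abstract does not spell out the exact formula, so both choices are assumptions. The species labels and all numbers are invented for illustration.

```python
def correction_factors(known, observed):
    """Per-species factor = known proportion / observed sequence proportion.

    Assumes every species was detected in the control (observed > 0).
    """
    return {sp: known[sp] / observed[sp] for sp in known}


def apply_correction(proportions, factors):
    """Scale sequence proportions by the factors and renormalise to sum to 1."""
    weighted = {sp: p * factors[sp] for sp, p in proportions.items()}
    total = sum(weighted.values())
    return {sp: w / total for sp, w in weighted.items()}


# Hypothetical control mix: true tissue proportions vs. recovered reads.
known_mix = {"fish_a": 0.50, "fish_b": 0.25, "fish_c": 0.25}
observed_mix = {"fish_a": 0.25, "fish_b": 0.50, "fish_c": 0.25}
tcf = correction_factors(known_mix, observed_mix)

# Hypothetical scat sample with the same bias as the control.
scat = {"fish_a": 0.25, "fish_b": 0.50, "fish_c": 0.25}
print(apply_correction(scat, tcf))
```

A digestion correction factor could be layered on in the same way, by comparing the corrected scat proportions with the known diet fed to the animals.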
Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.
2016-01-01
Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene-specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes combined with Unique Sequence-Predictor (USP), a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif in the 5′→3′ direction, the 3′→5′ direction, or both, adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome using the Frequency Counter program. This step is iterated until the frequency of the extended motif in the genome becomes unity. Thus, for each given motif, we get three possible unique sequences. The Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-protein structural information further makes Onco-Regulon a highly informative repository for gene-specific drug development. We believe that Onco-Regulon will help researchers to design drugs which will, in theory, bind to an exclusive site in the genome with no off-target effects. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
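The extension procedure described for USP can be sketched on a toy string as follows. This is not the USP code: the function names are hypothetical, only one strand is searched, and in "both" mode the sketch adds one base on each side per step, which is one possible reading of the abstract.

```python
def count_occurrences(genome: str, motif: str) -> int:
    """Count (possibly overlapping) occurrences of motif in genome."""
    count, start = 0, genome.find(motif)
    while start != -1:
        count += 1
        start = genome.find(motif, start + 1)
    return count


def extend_until_unique(genome: str, start: int, end: int, direction: str = "both"):
    """Grow genome[start:end] step by step until it occurs exactly once.

    direction: "right" (5'->3'), "left" (3'->5') or "both".
    Returns the unique sequence, or None if no further extension is possible.
    """
    while count_occurrences(genome, genome[start:end]) > 1:
        new_start, new_end = start, end
        if direction in ("left", "both") and start > 0:
            new_start -= 1
        if direction in ("right", "both") and end < len(genome):
            new_end += 1
        if (new_start, new_end) == (start, end):
            return None  # reached the sequence edge without becoming unique
        start, end = new_start, new_end
    return genome[start:end]


# Toy "genome" in which the motif ACGT occurs three times (at 0, 5 and 10).
toy_genome = "ACGTTACGTAACGTG"
for d in ("right", "left", "both"):
    print(d, extend_until_unique(toy_genome, 5, 9, d))
```

Running the three directions on the same motif occurrence yields three candidate unique sequences, mirroring the "three possible unique sequences" mentioned in the abstract. A real genome would need an indexed search rather than repeated `str.find` scans.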
Tappaz, M; Bitoun, M; Reymond, I; Sergeant, A
1999-09-01
Cysteine sulfinate decarboxylase (CSD) is considered the rate-limiting enzyme in the biosynthesis of taurine, a possible osmoregulator in brain. Through cloning and sequencing of RT-PCR and RACE-PCR products of rat brain mRNAs, a 2,396-bp cDNA sequence was obtained encoding a protein of 493 amino acids (calculated molecular mass, 55.2 kDa). The corresponding fusion protein showed a substrate specificity similar to that of the endogenous enzyme. The sequence of the encoded protein is identical to that encoded by liver CSD cDNA. Among other characterized amino acid decarboxylases, CSD shows the highest homology (54%) with either isoform of glutamic acid decarboxylase (GAD65 and GAD67). A single mRNA band, approximately 2.5 kb, was detected by northern blot in RNA extracts of brain, liver, and kidney. However, brain and liver CSD cDNA sequences differed in the 5' untranslated region. This indicates two forms of CSD mRNA. Analysis of PCR-amplified products of genomic DNA suggests that the brain form results from the use of a 3' alternative internal splicing site within an exon specifically found in liver CSD mRNA. Through selective RT-PCR the brain form was detected in brain only, whereas the liver form was found in liver and kidney. These results indicate a tissue-specific regulation of CSD genomic expression.
Novel ΔJ = 1 Sequence in 78Ge: Possible Evidence for Triaxiality
NASA Astrophysics Data System (ADS)
Forney, A. M.; Walters, W. B.; Chiara, C. J.; Janssens, R. V. F.; Ayangeakaa, A. D.; Sethi, J.; Harker, J.; Alcorta, M.; Carpenter, M. P.; Gürdal, G.; Hoffman, C. R.; Kay, B. P.; Kondev, F. G.; Lauritsen, T.; Lister, C. J.; McCutchan, E. A.; Rogers, A. M.; Seweryniak, D.; Stefanescu, I.; Zhu, S.
2018-05-01
A sequence of low-energy levels in $^{78}_{32}$Ge$_{46}$ has been identified with spins and parity of 2+, 3+, 4+, 5+, and 6+. Decays within this band proceed strictly through ΔJ = 1 transitions, unlike similar sequences in neighboring Ge and Se nuclei. Above the 2+ level, members of this sequence do not decay into the ground-state band. Moreover, the energy staggering of this sequence has the phase that would be expected for a γ-rigid structure. The energies and branching ratios of many of the levels are described well by shell-model calculations. However, the calculated reduced transition probabilities for the ΔJ = 2 in-band transitions imply that they should have been observed, in contradiction with the experiment. Within the calculations of Davydov, Filippov, and Rostovsky for rigid-triaxial rotors with γ = 30°, there are sequences of higher-spin levels connected by strong ΔJ = 1 transitions which decay in the same manner as those observed experimentally, yet are calculated at too high an excitation energy.
Extended phase graphs with anisotropic diffusion
NASA Astrophysics Data System (ADS)
Weigel, M.; Schwenk, S.; Kiselev, V. G.; Scheffler, K.; Hennig, J.
2010-08-01
The extended phase graph (EPG) calculus gives an elegant pictorial description of magnetization response in multi-pulse MR sequences. The use of the EPG calculus enables a high computational efficiency for the quantitation of echo intensities even for complex sequences with multiple refocusing pulses with arbitrary flip angles. In this work, the EPG concept dealing with RF pulses with arbitrary flip angles and phases is extended to account for anisotropic diffusion in the presence of arbitrary varying gradients. The diffusion effect can be expressed by specific diffusion weightings of individual magnetization pathways. This can be represented as an action of a linear operator on the magnetization state. The algorithm allows easy integration of diffusion anisotropy effects. The formalism is validated on known examples from literature and used to calculate the effective diffusion weighting in multi-echo sequences with arbitrary refocusing flip angles.
Seo, Joo-Hyun; Park, Jihyang; Kim, Eun-Mi; Kim, Juhan; Joo, Keehyoung; Lee, Jooyoung; Kim, Byung-Gee
2014-02-01
Sequence subgrouping for a given sequence set can enable various informative tasks such as the functional discrimination of sequence subsets and the functional inference of unknown sequences. Because an identity threshold for sequence subgrouping may vary according to the given sequence set, it is highly desirable to construct a robust subgrouping algorithm which automatically identifies an optimal identity threshold and generates subgroups for a given sequence set. To this end, an automatic sequence subgrouping method named 'Subgrouping Automata' (SA) was constructed. First, a tree analysis module analyzes the structure of the phylogenetic tree and calculates all possible subgroups at each node. A sequence similarity analysis module then calculates the average sequence similarity for all subgroups at each node. A representative sequence generation module finds a representative sequence for each subgroup using profile analysis and self-scoring. Average sequence similarities are calculated for all nodes, and 'Subgrouping Automata' searches for the node showing the statistically largest increase in sequence similarity using Student's t-value. The node with the maximum t-value, which gives the most significant difference in average sequence similarity between two adjacent nodes, is taken as the optimum subgrouping node in the phylogenetic tree. Further analysis showed that the optimum subgrouping node from SA prevents under-subgrouping and over-subgrouping. Copyright © 2013. Published by Elsevier Ltd.
Hu, Long; Xu, Zhiyu; Hu, Boqin; Lu, Zhi John
2017-01-09
Recent genomic studies suggest that novel long non-coding RNAs (lncRNAs) are specifically expressed and far outnumber annotated lncRNA sequences. To identify and characterize novel lncRNAs in RNA sequencing data from new samples, we have developed COME, a coding potential calculation tool based on multiple features. It integrates multiple sequence-derived and experiment-based features using a decompose-compose method, which makes it more accurate and robust than other well-known tools. We also showed that COME was able to substantially improve the consistency of prediction results from other coding potential calculators. Moreover, COME annotates and characterizes each predicted lncRNA transcript with multiple lines of supporting evidence, which are not provided by other tools. Remarkably, we found that one subgroup of lncRNAs classified by such supporting features (i.e. conserved local RNA secondary structure) was highly enriched in a well-validated database (lncRNAdb). We further found that the conserved structural domains on lncRNAs had a better chance than other RNA regions of interacting with RNA binding proteins, based on recent eCLIP-seq data in human, indicating their potential regulatory roles. Overall, we present COME as an accurate, robust and multiple-feature supported method for the identification and characterization of novel lncRNAs. The software implementation is available at https://github.com/lulab/COME. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Chen, Zhen; Zhao, Pei; Li, Fuyi; Leier, André; Marquez-Lago, Tatiana T; Wang, Yanan; Webb, Geoffrey I; Smith, A Ian; Daly, Roger J; Chou, Kuo-Chen; Song, Jiangning
2018-03-08
Structural and physicochemical descriptors extracted from sequence data have been widely used to represent sequences and predict structural, functional, expression and interaction profiles of proteins and peptides as well as DNAs/RNAs. Here, we present iFeature, a versatile Python-based toolkit for generating various numerical feature representation schemes for both protein and peptide sequences. iFeature is capable of calculating and extracting a comprehensive spectrum of 18 major sequence encoding schemes that encompass 53 different types of feature descriptors. It also allows users to extract specific amino acid properties from the AAindex database. Furthermore, iFeature integrates 12 different types of commonly used feature clustering, selection, and dimensionality reduction algorithms, greatly facilitating training, analysis, and benchmarking of machine-learning models. The functionality of iFeature is made freely available via an online web server and a stand-alone toolkit. http://iFeature.erc.monash.edu/; https://github.com/Superzchen/iFeature/. jiangning.song@monash.edu; kcchou@gordonlifescience.org; roger.daly@monash.edu. Supplementary data are available at Bioinformatics online.
Universality of long-range correlations in expansion randomization systems
NASA Astrophysics Data System (ADS)
Messer, P. W.; Lässig, M.; Arndt, P. F.
2005-10-01
We study the stochastic dynamics of sequences evolving by single-site mutations, segmental duplications, deletions, and random insertions. These processes are relevant for the evolution of genomic DNA. They define a universality class of non-equilibrium 1D expansion-randomization systems with generic stationary long-range correlations in a regime of growing sequence length. We obtain explicitly the two-point correlation function of the sequence composition and the distribution function of the composition bias in sequences of finite length. The characteristic exponent χ of these quantities is determined by the ratio of two effective rates, which are explicitly calculated for several specific sequence evolution dynamics of the universality class. Depending on the value of χ, we find two different scaling regimes, which are distinguished by the detectability of the initial composition bias. All analytic results are accurately verified by numerical simulations. We also discuss the non-stationary build-up and decay of correlations, as well as more complex evolutionary scenarios, where the rates of the processes vary in time. Our findings provide a possible example for the emergence of universality in molecular biology.
Skill-dependent proximal-to-distal sequence in team-handball throwing.
Wagner, Herbert; Pfusterschmied, Jürgen; Von Duvillard, Serge P; Müller, Erich
2012-01-01
The importance of proximal-to-distal sequencing in human performance throwing has been reported previously. However, a comprehensive comparison of the proximal-to-distal sequence in team-handball throwing in athletes with different training experience and competition is lacking. Therefore, the aim of the study was to compare the ball velocity and proximal-to-distal sequence in the team-handball standing throw with run-up of players of different skill (less experienced, experienced, and elite). Twenty-four male team-handball players (n = 8 for each group) performed five standing throws with run-up with maximal ball velocity and accuracy. Kinematics and ball trajectories were recorded with a Vicon motion capture system and joint movements were calculated. A specific proximal-to-distal sequence, where elbow flexion occurred before shoulder internal rotation, was found in all three groups. These results are in line with previous studies in team-handball. Furthermore, the results of the present study suggest that in the team-handball standing throw with run-up, increased playing experience is associated with an increase in ball velocity as well as a delayed start to trunk flexion.
Computing Lives And Reliabilities Of Turboprop Transmissions
NASA Technical Reports Server (NTRS)
Coy, J. J.; Savage, M.; Radil, K. C.; Lewicki, D. G.
1991-01-01
Computer program PSHFT calculates lifetimes of variety of aircraft transmissions. Consists of main program, series of subroutines applying to specific configurations, generic subroutines for analysis of properties of components, subroutines for analysis of system, and common block. Main program selects routines used in analysis and causes them to operate in desired sequence. Series of configuration-specific subroutines put in configuration data, perform force and life analyses for components (with help of generic component-property-analysis subroutines), fill property array, call up system-analysis routines, and finally print out results of analysis for system and components. Written in FORTRAN 77(IV).
Mulé, Sébastien; Soize, Sébastien; Benaissa, Azzedine; Portefaix, Christophe; Pierot, Laurent
2016-08-01
To investigate the ability of T2* and fluid-attenuated inversion recovery (FLAIR) MR sequences to detect hemosiderin deposition 3 months after aneurysmal subarachnoid hemorrhage (SAH) in comparison with early non-enhanced CT (NECT) as a gold standard. From September 2008 through May 2013, patients with aneurysmal SAH were included if a NECT less than 24 h after the onset of symptoms showed a SAH, and MRI, including T2* and FLAIR sequences, was performed 3 months later. All aneurysms were treated endovascularly. NECT and MR sequences were blindly analyzed for the presence of SAH (NECT) or hemosiderin deposition (MRI). When positive, details of the spatial distribution of SAH or hemosiderin deposits were noted. Sensitivities were calculated for each patient. Sensitivities, specificities, and positive predictive values (PPVs) were calculated for each location. Forty-nine patients (mean age 52.9 years) were included. Bleeding-related patterns were identified in 43 patients (87.8%) on T2* and 10 patients (20.4%) on FLAIR. T2* was highly predictive of the location of the initial hemorrhage, especially in the Sylvian cisterns (PPVs 95% and 100%) and the anterior interhemispheric fissure (PPV 90%). The T2* sequence can detect and localize a previous SAH a few months after aneurysmal bleeding. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
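The per-location metrics reported here follow the standard definitions, which the short sketch below computes from confusion-matrix counts. The function name and the counts in the example are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity and PPV from confusion-matrix counts.

    tp/fn: reference-positive cases called positive/negative by the test
    tn/fp: reference-negative cases called negative/positive by the test
    """
    def ratio(num, den):
        return num / den if den else float("nan")  # undefined when den is 0

    return {
        "sensitivity": ratio(tp, tp + fn),  # TP / (TP + FN)
        "specificity": ratio(tn, tn + fp),  # TN / (TN + FP)
        "ppv": ratio(tp, tp + fp),          # TP / (TP + FP)
    }


# Invented counts for one anatomical location.
print(diagnostic_metrics(tp=9, fp=1, tn=8, fn=3))
```

Returning NaN for an empty denominator keeps a location with no reference-positive (or no test-positive) cases from silently reporting 0%.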
Maranhão, Paulo A C; Teixeira, Claudener S; Sousa, Bruno L; Barroso-Neto, Ito L; Monteiro-Júnior, José E; Fernandes, Andreia V; Ramos, Marcio V; Vasconcelos, Ilka M; Gonçalves, José F C; Rocha, Bruno A M; Freire, Valder N; Grangeiro, Thalles B
2017-07-01
The genus Swartzia is a member of the tribe Swartzieae, whose genera constitute the living descendants of one of the early branches of the papilionoid legumes. Legume lectins comprise one of the main families of structurally and evolutionarily related carbohydrate-binding proteins of plant origin. However, these proteins have been poorly investigated in Swartzia and to date, only the lectin from S. laevicarpa seeds (SLL) has been purified. Moreover, no sequence information is known from lectins of any member of the tribe Swartzieae. In the present study, partial cDNA sequences encoding L-type lectins were obtained from developing seeds of S. simplex var. grandiflora. The amino acid sequences of the S. simplex grandiflora lectins (SSGLs) were only moderately related to the known primary structures of legume lectins, with sequence identities not greater than 50-52%. The SSGL sequences were more related to amino acid sequences of papilionoid lectins from members of the tribes Sophoreae and Dalbergieae and from the Cladrastis and Vataireoid clades, which, together with other taxa, constitute the first-branching lineages of the subfamily Papilionoideae. The three-dimensional structures of 2 representative SSGLs (SSGL-A and SSGL-E) were predicted by homology modeling using templates that exhibit the characteristic β-sandwich fold of the L-type lectins. Molecular docking calculations predicted that SSGL-A is able to interact with D-galactose, N-acetyl-D-galactosamine and α-lactose, whereas SSGL-E is probably a non-functional lectin due to 2 mutations in the carbohydrate-binding site. Using molecular dynamics simulations followed by density functional theory calculations, the binding free energies of the interaction of SSGL-A with GalNAc and α-lactose were estimated as -31.7 and -47.5 kcal/mol, respectively. These findings gave insights about the carbohydrate-binding specificity of SLL, which binds to immobilized lactose but is not retained in a matrix containing D-GalNAc as ligand.
Copyright © 2017 Elsevier Ltd. All rights reserved.
Modeling bias and variation in the stochastic processes of small RNA sequencing
Etheridge, Alton; Sakhanenko, Nikita; Galas, David
2017-01-01
The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation in sequence counts. This model implies a linear quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additive models for location, scale and shape (GAMLSS) distributional regression framework to calculate and apply empirical correction factors for ligase bias. Bias correction could remove more than 40% of the bias for miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Using synthetic mixes of known composition, we show that the GAMLSS approach can analyze differential expression with greater accuracy, higher sensitivity and specificity than six existing algorithms (DESeq2, edgeR, EBSeq, limma, DSS, voom) for the analysis of small RNA-seq data. PMID:28369495
Novel ΔJ = 1 Sequence in 78Ge: Possible Evidence for Triaxiality
Forney, A. M.; Walters, W. B.; Chiara, C. J.; ...
2018-05-22
Here, a sequence of low-energy levels in $^{78}_{32}$Ge$_{46}$ has been identified with spins and parity of 2+, 3+, 4+, 5+, and 6+. Decays within this band proceed strictly through ΔJ = 1 transitions, unlike similar sequences in neighboring Ge and Se nuclei. Above the 2+ level, members of this sequence do not decay into the ground-state band. Moreover, the energy staggering of this sequence has the phase that would be expected for a γ-rigid structure. The energies and branching ratios of many of the levels are described well by shell-model calculations. However, the calculated reduced transition probabilities for the ΔJ = 2 in-band transitions imply that they should have been observed, in contradiction with the experiment. Within the calculations of Davydov, Filippov, and Rostovsky for rigid-triaxial rotors with γ = 30°, there are sequences of higher-spin levels connected by strong ΔJ = 1 transitions which decay in the same manner as those observed experimentally, yet are calculated at too high an excitation energy.
A novel ΔJ = 1 sequence in 78Ge: possible evidence for triaxiality
DOE Office of Scientific and Technical Information (OSTI.GOV)
Forney, A. M.; Walters, W. B.; Chiara, C. J.
2018-02-20
A sequence of low-energy levels in $^{78}_{32}$Ge$_{46}$ has been identified with spins and parity of 2+, 3+, 4+, 5+, and 6+. Decays within this band proceed strictly through ΔJ = 1 transitions, unlike similar sequences in neighboring Ge and Se nuclei. Above the 2+ level, members of this sequence do not decay into the ground-state band. Moreover, the energy staggering of this sequence has the phase that would be expected for a γ-rigid structure. The energies and branching ratios of many of the levels are described well by shell-model calculations. However, the calculated reduced transition probabilities for the ΔJ = 2 in-band transitions imply that they should have been observed, in contradiction with the experiment. Lastly, within the calculations of Davydov, Filippov, and Rostovsky for rigid-triaxial rotors with γ = 30°, there are sequences of higher-spin levels connected by strong ΔJ = 1 transitions which decay in the same manner as those observed experimentally, yet are calculated at too high an excitation energy.
A Statistical Guide to the Design of Deep Mutational Scanning Experiments
Matuszewski, Sebastian; Hildebrandt, Marcel E.; Ghenu, Ana-Hermina; Jensen, Jeffrey D.; Bank, Claudia
2016-01-01
The characterization of the distribution of mutational effects is a key goal in evolutionary biology. Recently developed deep-sequencing approaches allow for accurate and simultaneous estimation of the fitness effects of hundreds of engineered mutations by monitoring their relative abundance across time points in a single bulk competition. Naturally, the achievable resolution of the estimated fitness effects depends on the specific experimental setup, the organism and type of mutations studied, and the sequencing technology utilized, among other factors. By means of analytical approximations and simulations, we provide guidelines for optimizing time-sampled deep-sequencing bulk competition experiments, focusing on the number of mutants, the sequencing depth, and the number of sampled time points. Our analytical results show that sampling more time points together with extending the duration of the experiment improves the achievable precision disproportionately compared with increasing the sequencing depth or reducing the number of competing mutants. Even if the duration of the experiment is fixed, sampling more time points and clustering these at the beginning and the end of the experiment increase experimental power and allow for efficient and precise assessment of the entire range of selection coefficients. Finally, we provide a formula for calculating the 95%-confidence interval for the measurement error estimate, which we implement as an interactive web tool. This allows for quantification of the maximum expected a priori precision of the experimental setup, as well as for a statistical threshold for determining deviations from neutrality for specific selection coefficient estimates. PMID:27412710
DNA unzipping phase diagram calculated via replica theory.
Roland, C Brian; Hatch, Kristi Adamson; Prentiss, Mara; Shakhnovich, Eugene I
2009-05-01
We show how single-molecule unzipping experiments can provide strong evidence that the zero-force melting transition of long molecules of natural dsDNA should be classified as a phase transition of the higher-order type (continuous). Toward this end, we study a statistical-mechanics model for the fluctuating structure of a long molecule of dsDNA, and compute the equilibrium phase diagram for the experiment in which the molecule is unzipped under applied force. We consider a perfect-matching dsDNA model, in which the loops are volume-excluding chains with arbitrary loop exponent c. We include stacking interactions, hydrogen bonds, and main-chain entropy. We include sequence heterogeneity at the level of random sequences; in particular, there is no correlation in the base-pairing (bp) energy from one sequence position to the next. We present heuristic arguments to demonstrate that the low-temperature macrostate does not exhibit degenerate ergodicity breaking. We use this claim to understand the results of our replica-theoretic calculation of the equilibrium properties of the system. As a function of temperature, we obtain the minimal force at which the molecule separates completely. This critical-force curve is a line in the temperature-force phase diagram that separates the region where the molecule exists primarily as a double helix from the region where the molecule exists as two separate strands. We compare our random-sequence model to magnetic tweezer experiments performed on the 48,502 bp genome of bacteriophage lambda. We find good agreement with the experimental data, which is restricted to temperatures between 24 and 50 °C. At higher temperatures, the critical-force curve of our random-sequence model is very different from that of the homogeneous-sequence version of our model. For both sequence models, the critical force falls to zero at the melting temperature $T_c$ like $|T-T_c|^{\alpha}$.
For the homogeneous-sequence model, $\alpha = 1/2$ almost exactly, while for the random-sequence model, $\alpha \approx 0.9$. Importantly, the shape of the critical-force curve is connected, via our theory, to the manner in which the helix fraction falls to zero at $T_c$. The helix fraction is the property that is used to classify the melting transition as a type of phase transition. In our calculation, the shape of the critical-force curve holds strong evidence that the zero-force melting transition of long natural dsDNA should be classified as a higher-order (continuous) phase transition. Specifically, the order is 3rd or greater.
Extended phase graphs with anisotropic diffusion.
Weigel, M; Schwenk, S; Kiselev, V G; Scheffler, K; Hennig, J
2010-08-01
The extended phase graph (EPG) calculus gives an elegant pictorial description of magnetization response in multi-pulse MR sequences. The use of the EPG calculus enables a high computational efficiency for the quantitation of echo intensities even for complex sequences with multiple refocusing pulses with arbitrary flip angles. In this work, the EPG concept dealing with RF pulses with arbitrary flip angles and phases is extended to account for anisotropic diffusion in the presence of arbitrary varying gradients. The diffusion effect can be expressed by specific diffusion weightings of individual magnetization pathways. This can be represented as an action of a linear operator on the magnetization state. The algorithm allows easy integration of diffusion anisotropy effects. The formalism is validated on known examples from literature and used to calculate the effective diffusion weighting in multi-echo sequences with arbitrary refocusing flip angles. Copyright 2010 Elsevier Inc. All rights reserved.
Ganda, Erika Korzune; Bisinotto, Rafael Sisconeto; Decter, Dean Harrison; Bicalho, Rodrigo Carvalho
2016-01-01
The present study aimed to evaluate an on-farm culture system for identification of milk pathogens associated with clinical mastitis in dairy cows using two different gold standard approaches: standard laboratory culture in study 1 and 16S rRNA sequencing in study 2. In study 1, milk from mastitic quarters (i.e. presence of flakes, clots, or serous milk; n = 538) was cultured on-farm using a single plate containing three selective chromogenic media (Accumast; FERA Animal Health LCC, Ithaca, NY) and in a reference laboratory using standard culture methods, which was considered the gold standard. In study 2, mastitic milk was cultured on-farm and analyzed through 16S rRNA sequencing (n = 214). In both studies, plates were cultured aerobically at 37°C for 24 h and read by a single technician masked to gold standard results. Accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated based on standard laboratory culture in study 1, and PPV was calculated based on sequencing results in study 2. Overall accuracy of Accumast was 84.9%. Likewise, accuracy for identification of Gram-negative bacteria, Staphylococcus sp., and Streptococcus sp. was 96.4%, 93.8%, and 91.5%, respectively. Sensitivity, specificity, PPV, and NPV were 75.0%, 97.9%, 79.6%, and 97.3% for identification of E. coli, 100.0%, 99.8%, 87.5%, and 100.0% for S. aureus, 70.0%, 95.0%, 45.7%, and 98.1% for other Staphylococcus sp., and 90.0%, 92.9%, 91.8%, and 91.2% for Streptococcus sp. In study 2, Accumast PPV was 96.7% for E. coli, 100.0% for Enterococcus sp., 100.0% for other Gram-negatives, 88.2% for Staphylococcus sp., and 95.0% for Streptococcus sp. In conclusion, Accumast is a unique approach for on-farm identification of pathogens associated with mastitis, presenting overall sensitivity and specificity of 82.3% and 89.9%, respectively. PMID:27176216
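The accuracy measures named in this abstract all derive from a 2x2 table comparing the test under evaluation with the gold standard. The following is a minimal sketch of those standard definitions, not the authors' analysis code; the function name and the counts in the usage line are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table (test vs. gold standard).

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # positives correctly detected
        "specificity": tn / (tn + fp),   # negatives correctly rejected
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = diagnostic_metrics(tp=30, fp=10, fn=10, tn=150)
```

Unlike sensitivity and specificity, PPV and NPV depend on how common the pathogen is in the sampled quarters, which is one reason a study can report all five measures side by side.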
Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers
Pabinger, Stephan; Ernst, Karina; Pulverer, Walter; Kallmeyer, Rainer; Valdes, Ana M.; Metrustry, Sarah; Katic, Denis; Nuzzo, Angelo; Kriegner, Albert; Vierlinger, Klemens; Weinhaeusel, Andreas
2016-01-01
Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM). Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage. 
TABSAT is freely available under a GNU General Public License version 3.0 (GPLv3) at https://github.com/tadkeys/tabsat/ and http://demo.platomics.com/. PMID:27467908
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daling, P.M.; Marler, J.E.; Vo, T.V.
This study evaluates the values (benefits) and impacts (costs) associated with potential resolutions to Generic Issue 143, "Availability of HVAC and Chilled Water Systems." The study identifies vulnerabilities related to failures of HVAC, chilled water, and room cooling systems; develops estimates of room heatup rates and safety-related equipment vulnerabilities following losses of HVAC/room cooler systems; develops estimates of the core damage frequencies and public risks associated with failures of these systems; develops three proposed resolution strategies to this generic issue; and performs a value/impact analysis of the proposed resolutions. Existing probabilistic risk assessments for four representative plants, including one plant from each vendor, form the basis for the core damage frequency and public risk calculations. Both internal and external events were considered. It was concluded that all three proposed resolution strategies exceed the $1,000/person-rem cost-effectiveness ratio. Additional evaluations were performed to develop "generic" insights on potential design-related and configuration-related vulnerabilities and potential high-frequency (approximately 1E-04/RY) accident sequences that involve failures of HVAC/room cooling functions. It was concluded that, although high-frequency accident sequences may exist at some plants, these high-frequency sequences are plant-specific in nature or have been resolved through hardware and/or operational changes. The plant-specific Individual Plant Examinations are an effective vehicle for identification and resolution of these plant-specific anomalies and hardware configurations.
Characterizing Protease Specificity: How Many Substrates Do We Need?
Schauperl, Michael; Fuchs, Julian E.; Waldner, Birgit J.; Huber, Roland G.; Kramer, Christian; Liedl, Klaus R.
2015-01-01
Calculation of cleavage entropies allows one to quantify, map and compare protease substrate specificity by an information entropy based approach. The metric intrinsically depends on the number of experimentally determined substrates (data points). Thus a statistical analysis of its numerical stability is crucial to estimate the systematic error made by estimating specificity based on a limited number of substrates. In this contribution, we show the mathematical basis for estimating the uncertainty in cleavage entropies. Sets of cleavage entropies are calculated using experimental cleavage data and modeled extreme cases. By analyzing the underlying mathematics and applying statistical tools, a linear dependence of the metric with respect to 1/n was found. This allows us to extrapolate the values to an infinite number of samples and to estimate the errors. Analyzing the errors, a minimum number of 30 substrates was found to be necessary to characterize substrate specificity, in terms of amino acid variability, for a protease (S4-S4’) with an uncertainty of 5 percent. Therefore, we encourage experimental researchers in the protease field to record specificity profiles of novel proteases aiming to identify at least 30 peptide substrates of maximum sequence diversity. We expect a full characterization of protease specificity to be helpful to rationalize biological functions of proteases and to assist rational drug design. PMID:26559682
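The two ideas in this abstract, an entropy per substrate position and a linear extrapolation in 1/n, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the entropy is assumed here to be Shannon entropy with logarithm base 20 (so values fall between 0 and 1), and both function names are hypothetical.

```python
import math
from collections import Counter


def subsite_entropy(residues, alphabet_size=20):
    """Shannon entropy of the residues observed at one substrate position.

    Assumption: log base equals the alphabet size, so a fully conserved
    position gives 0 and a uniform distribution over 20 residues gives 1.
    """
    counts = Counter(residues)
    n = len(residues)
    return -sum((c / n) * math.log(c / n, alphabet_size)
                for c in counts.values())


def extrapolate_to_infinite_n(sample_sizes, entropies):
    """Least-squares line of entropy against 1/n; returns the intercept,
    i.e. the value extrapolated to an infinite number of substrates."""
    xs = [1.0 / n for n in sample_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(entropies) / len(entropies)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, entropies))
    slope = sxy / sxx
    return mean_y - slope * mean_x
```

In use, one would compute the entropy on subsamples of increasing size n and pass the (n, entropy) pairs to the extrapolation; the gap between the intercept and the value at the available n gives a rough sense of the finite-sample error.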
Parente, Daniel J; Ray, J Christian J; Swint-Kruse, Liskin
2015-12-01
As proteins evolve, amino acid positions key to protein structure or function are subject to mutational constraints. These positions can be detected by analyzing sequence families for amino acid conservation or for coevolution between pairs of positions. Coevolutionary scores are usually rank-ordered and thresholded to reveal the top pairwise scores, but they also can be treated as weighted networks. Here, we used network analyses to bypass a major complication of coevolution studies: For a given sequence alignment, alternative algorithms usually identify different, top pairwise scores. We reconciled results from five commonly-used, mathematically divergent algorithms (ELSC, McBASC, OMES, SCA, and ZNMI), using the LacI/GalR and 1,6-bisphosphate aldolase protein families as models. Calculations used unthresholded coevolution scores from which column-specific properties such as sequence entropy and random noise were subtracted; "central" positions were identified by calculating various network centrality scores. When compared among algorithms, network centrality methods, particularly eigenvector centrality, showed markedly better agreement than comparisons of the top pairwise scores. Positions with large centrality scores occurred at key structural locations and/or were functionally sensitive to mutations. Further, the top central positions often differed from those with top pairwise coevolution scores: instead of a few strong scores, central positions often had multiple, moderate scores. We conclude that eigenvector centrality calculations reveal a robust evolutionary pattern of constraints, detectable by divergent algorithms, that occur at key protein locations. Finally, we discuss the fact that multiple patterns coexist in evolutionary data that, together, give rise to emergent protein functions. © 2015 Wiley Periodicals, Inc.
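Eigenvector centrality, the measure highlighted in this abstract, scores each node by the leading eigenvector of the weighted adjacency matrix. The sketch below uses plain power iteration on a small symmetric matrix; it is a generic illustration, not the authors' pipeline, and the function name and toy network are hypothetical. Power iteration is assumed to converge here, which holds for connected, non-bipartite networks with non-negative weights.

```python
import math


def eigenvector_centrality(weights, iterations=200, tol=1e-12):
    """Leading eigenvector of a symmetric, non-negative weight matrix,
    found by power iteration and normalised to unit Euclidean length."""
    n = len(weights)
    v = [1.0 / n] * n
    for _ in range(iterations):
        # One matrix-vector product, then renormalise.
        w = [sum(weights[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        if max(abs(a - b) for a, b in zip(w, v)) < tol:
            return w
        v = w
    return v


# Toy network: positions 0, 1, 2 are mutually linked; position 3 links only to 0.
toy = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
centrality = eigenvector_centrality(toy)
```

A position scores highly when it is linked to other well-connected positions, which is consistent with the abstract's observation that central positions tend to carry several moderate scores rather than one strong one.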
Mena-Ulecia, Karel; Gonzalez-Norambuena, Fabian; Vergara-Jaque, Ariela; Poblete, Horacio; Tiznado, William; Caballero, Julio
2018-06-15
Protein kinases (PKs) discriminate between closely related sequences that contain serine, threonine, and/or tyrosine residues. Such specificity is defined by the amino acid sequence surrounding the phosphorylatable residue, so that it is possible to identify an optimal recognition motif (ORM) for each PK. The ORM for the protein kinase A (PKA), a well-known member of the PK family, is the sequence RRX(S/T)X, where arginines at the -3 and -2 positions play a key role with respect to the primed phosphorylation site. In this work, differential affinities of PKA for the peptide substrate Kemptide (LRRASLG) and mutants that substitute the arginine residues by the unnatural peptide homoarginine were evaluated through molecular dynamics (MD) and free energy perturbation (FEP) calculations. The FEP study for the homoarginine mutants required previous elaboration of a CHARMM "arginine to homoarginine" (R2B) hybrid topology file which is available in this manuscript as Supporting Information. Mutants substituting the arginine residues by alanine, lysine, and histidine were also considered in the comparison by using the same protocol. FEP calculations allowed estimating the free energy changes from the free PKA to PKA-substrate complex (ΔΔG E→ES) when Kemptide structure was mutated. Both ΔΔG S→ES values for homoarginine mutants were predicted with a difference below 1 kcal/mol. In addition, FEP correctly predicted that all the studied mutations decrease the catalytic efficiency of Kemptide for PKA. © 2018 Wiley Periodicals, Inc.
Zhu, H.; Braun, W.
1999-01-01
A statistical analysis of a representative data set of 169 known protein structures was used to analyze the specificity of residue interactions between spatially neighboring strands in beta-sheets. Pairwise potentials were derived from the frequency of residue pairs in nearest contact, second nearest and third nearest contacts across neighboring beta-strands compared to the expected frequency of residue pairs in a random model. A pseudo-energy function based on these statistical pairwise potentials recognized native beta-sheets among possible alternative pairings. The native pairing was found within the three lowest energies in 73% of the cases in the training data set and in 63% of beta-sheets in a test data set of 67 proteins, which were not part of the training set. The energy function was also used to detect tripeptides, which occur frequently in beta-sheets of native proteins. The majority of native partners of tripeptides were distributed in a low energy range. Self-correcting distance geometry (SECODG) calculations using distance constraint sets derived from possible low energy pairing of beta-strands uniquely identified the native pairing of the beta-sheet in pancreatic trypsin inhibitor (BPTI). These results will be useful for predicting the structure of proteins from their amino acid sequence as well as for the design of proteins containing beta-sheets. PMID:10048326
Bandgap oscillation in quasiperiodic (BN)xCy nanotubes
NASA Astrophysics Data System (ADS)
Freitas, A.; Bezerra, C. G.; Azevedo, S.; Machado, L. D.; Pedreira, D. O.
2016-12-01
In the present contribution, we apply first-principles calculations to study the effects of quasiperiodic disorder on the physical properties of BN and C nanotubes. We take BN nanotubes (BNNTs) and C nanotubes (CNTs) as building blocks and construct quasiperiodic BNxCy nanotubes according to the Fibonacci sequence. We studied armchair and zigzag nanotubes of varying diameters. Our results demonstrate that the energy gap oscillates as a function of the n-generation index of the Fibonacci sequence. Moreover, we show that the choice of the BNNTs and CNTs may lead to a quasiperiodic BNxCy nanotube presenting an adjustable energy gap. We obtained a variety of quasiperiodic nanotubes with energy gaps ranging from 0.29 eV to 1.06 eV, which may be of interest for specific technological applications. Finally, it is also demonstrated that the specific heat of the quasiperiodic zigzag and armchair nanotubes presents an oscillatory behavior in the low temperature regime, and that this behavior depends on the curvature of the nanotube.
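A Fibonacci arrangement of two building blocks is usually generated by the concatenation rule S(n) = S(n-1) + S(n-2). The sketch below shows that rule for a BN block and a C block; the starting convention (S(0) is the C block, S(1) is the BN block) and the function name are assumptions made for illustration, and the abstract does not state which convention the authors used.

```python
def fibonacci_word(generation, a="BN", b="C"):
    """Return the block ordering of the given Fibonacci generation.

    Assumed convention: S(0) = [b], S(1) = [a], S(n) = S(n-1) + S(n-2).
    """
    prev, curr = [b], [a]
    if generation == 0:
        return prev
    for _ in range(generation - 1):
        # Append the previous generation to the current one.
        prev, curr = curr, curr + prev
    return curr


blocks = fibonacci_word(4)  # ['BN', 'C', 'BN', 'BN', 'C']
```

The number of blocks in generation n is the n-th Fibonacci number, so the unit cell grows quickly with the generation index along which the abstract reports the gap oscillation.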
Zook, Justin M.; Samarov, Daniel; McDaniel, Jennifer; Sen, Shurjo K.; Salit, Marc
2012-01-01
While the importance of random sequencing errors decreases at higher DNA or RNA sequencing depths, systematic sequencing errors (SSEs) dominate at high sequencing depths and can be difficult to distinguish from biological variants. These SSEs can cause base quality scores to underestimate the probability of error at certain genomic positions, resulting in false positive variant calls, particularly in mixtures such as samples with RNA editing, tumors, circulating tumor cells, bacteria, mitochondrial heteroplasmy, or pooled DNA. Most algorithms proposed for correction of SSEs require a data set used to calculate association of SSEs with various features in the reads and sequence context. This data set is typically either from a part of the data set being “recalibrated” (Genome Analysis ToolKit, or GATK) or from a separate data set with special characteristics (SysCall). Here, we combine the advantages of these approaches by adding synthetic RNA spike-in standards to human RNA, and use GATK to recalibrate base quality scores with reads mapped to the spike-in standards. Compared to conventional GATK recalibration that uses reads mapped to the genome, spike-ins improve the accuracy of Illumina base quality scores by a mean of 5 Phred-scaled quality score units, and by as much as 13 units at CpG sites. In addition, since the spike-in data used for recalibration are independent of the genome being sequenced, our method allows run-specific recalibration even for the many species without a comprehensive and accurate SNP database. We also use GATK with the spike-in standards to demonstrate that the Illumina RNA sequencing runs overestimate quality scores for AC, CC, GC, GG, and TC dinucleotides, while SOLiD has fewer dinucleotide SSEs but more SSEs for certain cycles. We conclude that using these DNA and RNA spike-in standards with GATK improves base quality score recalibration. PMID:22859977
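The "Phred-scaled quality score units" in this abstract follow the standard definition Q = -10 log10(p), where p is the probability that a base call is wrong. The short sketch below converts in both directions so that a shift of 5 or 13 units can be read as a change in error probability; the function names are hypothetical.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred quality score Q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


# A 5-unit difference corresponds to a factor of 10**0.5 (about 3.2)
# in the error probability.
ratio = phred_to_error_prob(25) / phred_to_error_prob(30)
```

On this scale, an overestimate of 13 units means the true error probability is roughly 20 times higher than the reported score suggests.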
Logan, Grace; Freimanis, Graham L; King, David J; Valdazo-González, Begoña; Bachanek-Bankowska, Katarzyna; Sanderson, Nicholas D; Knowles, Nick J; King, Donald P; Cottam, Eleanor M
2014-09-30
Next-Generation Sequencing (NGS) is revolutionizing molecular epidemiology by providing new approaches to undertake whole genome sequencing (WGS) in diagnostic settings for a variety of human and veterinary pathogens. Previous sequencing protocols have been subject to biases such as those encountered during PCR amplification and cell culture, or are restricted by the need for large quantities of starting material. We describe here a simple and robust methodology for the generation of whole genome sequences on the Illumina MiSeq. This protocol is specific for foot-and-mouth disease virus (FMDV) or other polyadenylated RNA viruses and circumvents both the use of PCR and the requirement for large amounts of initial template. The protocol was successfully validated using five FMDV positive clinical samples from the 2001 epidemic in the United Kingdom, as well as a panel of representative viruses from all seven serotypes. In addition, this protocol was successfully used to recover 94% of an FMDV genome that had previously been identified as cell culture negative. Genome sequences from three other non-FMDV polyadenylated RNA viruses (EMCV, ERAV, VESV) were also obtained with minor protocol amendments. We calculated that a minimum coverage depth of 22 reads was required to produce an accurate consensus sequence for FMDV O. This was achieved in 5 FMDV/O/UKG isolates and the type O FMDV from the serotype panel with the exception of the 5' genomic termini and area immediately flanking the poly(C) region. We have developed a universal WGS method for FMDV and other polyadenylated RNA viruses. This method works successfully from a limited quantity of starting material and eliminates the requirement for genome-specific PCR amplification. This protocol has the potential to generate consensus-level sequences within a routine high-throughput diagnostic environment.
Abi-Ghanem, Josephine; Rabin, Clémence; Porrini, Massimiliano; Dausse, Eric; Toulmé, Jean-Jacques; Gabelica, Valérie
2017-10-06
In the RNA realm, non-Watson-Crick base pairs are abundant and can affect both the RNA 3D structure and its function. Here, we investigated the formation of RNA kissing complexes in which the loop-loop interaction is modulated by non-Watson-Crick pairs. Mass spectrometry, surface plasmon resonance, and UV-melting experiments show that the G⋅U wobble base pair favors kissing complex formation only when placed at specific positions. We tried to rationalize this effect by molecular modeling, including molecular mechanics Poisson-Boltzmann surface area (MMPBSA) thermodynamics calculations and PBSA calculations of the electrostatic potential surfaces. Modeling reveals that the G⋅U stabilization is due to a specific electrostatic environment defined by the base pairs of the entire loop-loop region. The loop is not symmetric, and therefore the identity and position of each base pair matters. Predicting and visualizing the electrostatic environment created by a given sequence can help to design specific kissing complexes with high affinity, for potential therapeutic, nanotechnology or analytical applications. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Handbook of Industrial Engineering Equations, Formulas, and Calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Badiru, Adedeji B; Omitaomu, Olufemi A
The first handbook to focus exclusively on industrial engineering calculations with a correlation to applications, Handbook of Industrial Engineering Equations, Formulas, and Calculations contains a general collection of the mathematical equations often used in the practice of industrial engineering. Many books cover individual areas of engineering and some cover all areas, but none covers industrial engineering specifically, nor do they highlight topics such as project management, materials, and systems engineering from an integrated viewpoint. Written by acclaimed researchers and authors, this concise reference marries theory and practice, making it a versatile and flexible resource. Succinctly formatted for functionality, the book presents: Basic Math Calculations; Engineering Math Calculations; Production Engineering Calculations; Engineering Economics Calculations; Ergonomics Calculations; Facility Layout Calculations; Production Sequencing and Scheduling Calculations; Systems Engineering Calculations; Data Engineering Calculations; Project Engineering Calculations; and Simulation and Statistical Equations. It has been said that engineers make things while industrial engineers make things better. To make something better requires an understanding of its basic characteristics and the underlying equations and calculations that facilitate that understanding. To do this, however, you do not have to be computational experts; you just have to know where to get the computational resources that are needed. This book elucidates the underlying equations that facilitate the understanding required to improve design processes, continuously improving the answer to the age-old question: What is the best way to do a job?
Statistical tests to compare motif count exceptionalities
Robin, Stéphane; Schbath, Sophie; Vandewalle, Vincent
2007-01-01
Background: Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. Results: We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion: The exact binomial test is particularly adapted for small counts. For large counts, we advise using the likelihood ratio test, which is asymptotic but strongly correlated with the exact binomial test and very simple to use. PMID:17346349
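One way to see why a binomial test arises here: if the motif counts in the two sequences are independent Poisson variables, then conditional on their total the first count is binomial. The sketch below implements that generic conditional test as a one-sided p-value. It is an assumption-laden illustration, not the authors' exact procedure: it ignores the overlap correction the abstract mentions, and the function name and the use of expected counts as inputs are hypothetical.

```python
from math import comb


def binomial_upper_tail(n1, n2, expected1, expected2):
    """One-sided p-value that the motif is more exceptional in sequence 1.

    Under the null of equal exceptionality, and assuming Poisson counts,
    n1 given n1 + n2 is Binomial(n1 + n2, p) with
    p = expected1 / (expected1 + expected2).
    Returns P(X >= n1).
    """
    n = n1 + n2
    p = expected1 / (expected1 + expected2)
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n1, n + 1))


# Hypothetical example: equal expected counts, 3 occurrences vs. 0.
p_value = binomial_upper_tail(3, 0, 1.0, 1.0)  # 0.5 ** 3 = 0.125
```

Because the tail is summed exactly, this form suits the small counts for which the abstract recommends the exact test; for large counts the sum becomes long, which is where an asymptotic test is more convenient.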
[Identification of medicinal plant Dendrobium based on the chloroplast psbK-psbI intergenic spacer].
Yao, Hui; Yang, Pei; Zhou, Hong; Ma, Shuang-jiao; Song, Jing-yuan; Chen, Shi-lin
2015-06-01
In this paper, the chloroplast psbK-psbI intergenic spacers of 18 species of Dendrobium and their adulterants were amplified and sequenced, and then the sequence characteristics were analyzed. The sequence lengths of chloroplast psbK-psbI regions of Dendrobium ranged from 474 to 513 bp and the GC contents were 25.4%-27.6%. The variable sites were 71 while the informative sites were 46. The inter-specific genetic distances calculated by Kimura 2-parameter (K2P) of Dendrobium were 0.0061-0.0581, with an average of 0.0284. The K2P genetic distances between Dendrobium species and Bulbophyllum odoratissimum were 0.0932-0.1204. The NJ tree showed that the Dendrobium species can be easily differentiated from each other and 6 samples of the inspected Dendrobium species were identified successfully through sequencing the psbK-psbI intergenic spacer. Therefore, the chloroplast psbK-psbI intergenic spacer can be used as a candidate marker to identify Dendrobium species and its adulterants.
Using hidden Markov models to align multiple sequences.
Mount, David W
2009-07-01
A hidden Markov model (HMM) is a probabilistic model of a multiple sequence alignment (msa) of proteins. In the model, each column of symbols in the alignment is represented by a frequency distribution of the symbols (called a "state"), and insertions and deletions are represented by other states. One moves through the model along a particular path from state to state in a Markov chain (i.e., random choice of next move), trying to match a given sequence. The next matching symbol is chosen from each state, recording its probability (frequency) and also the probability of going to that state from a previous one (the transition probability). State and transition probabilities are multiplied to obtain a probability of the given sequence. The hidden nature of the HMM is due to the lack of information about the value of a specific state, which is instead represented by a probability distribution over all possible values. This article discusses the advantages and disadvantages of HMMs in msa and presents algorithms for calculating an HMM and the conditions for producing the best HMM.
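The multiplication of state and transition probabilities along one path can be sketched as below. The two-state toy model, its probabilities, and the function name are hypothetical and serve only to illustrate the calculation described above; a real profile HMM also has insert and delete states and sums over all paths.

```python
def path_probability(sequence, path, emissions, transitions, start="begin"):
    """Probability of emitting `sequence` along one given state `path`.

    emissions[state][symbol] is the symbol frequency in that state;
    transitions[prev][state] is the probability of moving prev -> state.
    """
    prob = 1.0
    prev = start
    for symbol, state in zip(sequence, path):
        prob *= transitions[prev][state] * emissions[state][symbol]
        prev = state
    return prob

# Hypothetical two-column alignment model with match states only.
toy_emissions = {"M1": {"A": 0.8, "C": 0.2}, "M2": {"A": 0.1, "C": 0.9}}
toy_transitions = {"begin": {"M1": 1.0}, "M1": {"M2": 1.0}}
toy_probability = path_probability("AC", ["M1", "M2"], toy_emissions, toy_transitions)
```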
GuiTope: an application for mapping random-sequence peptides to protein sequences.
Halperin, Rebecca F; Stafford, Phillip; Emery, Jack S; Navalkar, Krupa Arun; Johnston, Stephen Albert
2012-01-03
Random-sequence peptide libraries are a commonly used tool to identify novel ligands for binding antibodies, other proteins, and small molecules. It is often of interest to compare the selected peptide sequences to the natural protein binding partners to infer the exact binding site or the importance of particular residues. The ability to search a set of sequences for similarity to a set of peptides may sometimes enable the prediction of an antibody epitope or a novel binding partner. We have developed a software application designed specifically for this task. GuiTope provides a graphical user interface for aligning peptide sequences to protein sequences. All alignment parameters are accessible to the user including the ability to specify the amino acid frequency in the peptide library; these frequencies often differ significantly from those assumed by popular alignment programs. It also includes a novel feature to align di-peptide inversions, which we have found improves the accuracy of antibody epitope prediction from peptide microarray data and shows utility in analyzing phage display datasets. Finally, GuiTope can randomly select peptides from a given library to estimate a null distribution of scores and calculate statistical significance. GuiTope provides a convenient method for comparing selected peptide sequences to protein sequences, including flexible alignment parameters, novel alignment features, ability to search a database, and statistical significance of results. The software is available as an executable (for PC) at http://www.immunosignature.com/software and ongoing updates and source code will be available at sourceforge.net.
Rattei, Thomas; Tischler, Patrick; Götz, Stefan; Jehl, Marc-André; Hoser, Jonathan; Arnold, Roland; Conesa, Ana; Mewes, Hans-Werner
2010-01-01
The prediction of protein function as well as the reconstruction of evolutionary genesis employing sequence comparison at large is still the most powerful tool in sequence analysis. Due to the exponential growth of the number of known protein sequences and the subsequent quadratic growth of the similarity matrix, the computation of the Similarity Matrix of Proteins (SIMAP) becomes a computational intensive task. The SIMAP database provides a comprehensive and up-to-date pre-calculation of the protein sequence similarity matrix, sequence-based features and sequence clusters. As of September 2009, SIMAP covers 48 million proteins and more than 23 million non-redundant sequences. Novel features of SIMAP include the expansion of the sequence space by including databases such as ENSEMBL as well as the integration of metagenomes based on their consistent processing and annotation. Furthermore, protein function predictions by Blast2GO are pre-calculated for all sequences in SIMAP and the data access and query functions have been improved. SIMAP assists biologists to query the up-to-date sequence space systematically and facilitates large-scale downstream projects in computational biology. Access to SIMAP is freely provided through the web portal for individuals (http://mips.gsf.de/simap/) and for programmatic access through DAS (http://webclu.bio.wzw.tum.de/das/) and Web-Service (http://mips.gsf.de/webservices/services/SimapService2.0?wsdl).
Characterization and biological properties of a new staphylococcal exotoxin
1994-01-01
Staphylococcus aureus strain D4508 is a toxic shock syndrome toxin 1- negative clinical isolate from a nonmenstrual case of toxic shock syndrome (TSS). In the present study, we have purified and characterized a new exotoxin from the extracellular products of this strain. This toxin was found to have a molecular mass of 25.14 kD by mass spectrometry and an isoelectric point of 5.65 by isoelectric focusing. We have also cloned and sequenced its corresponding genomic determinant. The DNA sequence encoding the mature protein was found to be 654 base pairs and is predicted to encode a polypeptide of 218 amino acids. The deduced protein contains an NH2-terminal sequence identical to that of the native protein. The calculated molecular weight (25.21 kD) of the recombinant mature protein is also consistent with that of the native molecules. When injected intravenously into rabbits, both the native and recombinant toxins induce an acute TSS-like illness characterized by high fever, hypotension, diarrhea, shock, and in some cases death, with classical histological findings of TSS. Furthermore, the activity of the toxin is specifically enhanced by low quantities of endotoxins. The toxicity can be blocked by rabbit immunoglobulin G antibody specific for the toxin. Western blotting and DNA sequencing data confirm that the protein is a unique staphylococcal exotoxin, yet shares significant sequence homology with known staphylococcal enterotoxins, especially the SEA, SED, and SEE toxins. We conclude therefore that this 25-kD protein belongs to the staphylococcal enterotoxin gene family that is capable of inducing a TSS-like illness in rabbits. PMID:7964453
A Statistical Guide to the Design of Deep Mutational Scanning Experiments.
Matuszewski, Sebastian; Hildebrandt, Marcel E; Ghenu, Ana-Hermina; Jensen, Jeffrey D; Bank, Claudia
2016-09-01
The characterization of the distribution of mutational effects is a key goal in evolutionary biology. Recently developed deep-sequencing approaches allow for accurate and simultaneous estimation of the fitness effects of hundreds of engineered mutations by monitoring their relative abundance across time points in a single bulk competition. Naturally, the achievable resolution of the estimated fitness effects depends on the specific experimental setup, the organism and type of mutations studied, and the sequencing technology utilized, among other factors. By means of analytical approximations and simulations, we provide guidelines for optimizing time-sampled deep-sequencing bulk competition experiments, focusing on the number of mutants, the sequencing depth, and the number of sampled time points. Our analytical results show that sampling more time points together with extending the duration of the experiment improves the achievable precision disproportionately compared with increasing the sequencing depth or reducing the number of competing mutants. Even if the duration of the experiment is fixed, sampling more time points and clustering these at the beginning and the end of the experiment increase experimental power and allow for efficient and precise assessment of the entire range of selection coefficients. Finally, we provide a formula for calculating the 95%-confidence interval for the measurement error estimate, which we implement as an interactive web tool. This allows for quantification of the maximum expected a priori precision of the experimental setup, as well as for a statistical threshold for determining deviations from neutrality for specific selection coefficient estimates. Copyright © 2016 by the Genetics Society of America.
Long sequence correlation coprocessor
NASA Astrophysics Data System (ADS)
Gage, Douglas W.
1994-09-01
A long sequence correlation coprocessor (LSCC) accelerates the bitwise correlation of arbitrarily long digital sequences by calculating in parallel the correlation score for 16, for example, adjacent bit alignments between two binary sequences. The LSCC integrated circuit is incorporated into a computer system with memory storage buffers and a separate general purpose computer processor which serves as its controller. Each of the LSCC's set of sequential counters simultaneously tallies a separate correlation coefficient. During each LSCC clock cycle, computer enable logic associated with each counter compares one bit of a first sequence with one bit of a second sequence to increment the counter if the bits are the same. A shift register assures that the same bit of the first sequence is simultaneously compared to different bits of the second sequence to simultaneously calculate the correlation coefficient by the different counters to represent different alignments of the two sequences.
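A software emulation of the counting scheme may help make the description concrete. In this sketch (function and parameter names are hypothetical), each counter tallies bit agreements for one alignment offset; the inner loop stands in for what the hardware counters do in parallel during one clock cycle.

```python
def correlation_scores(first, second, n_alignments=16):
    """Count matching bits for n_alignments adjacent alignments.

    counters[k] is the number of positions i where first[i] == second[i + k].
    """
    window = len(second) - n_alignments + 1
    if window <= 0 or len(first) < window:
        raise ValueError("sequences too short for the requested alignments")
    counters = [0] * n_alignments
    for i in range(window):              # one "clock cycle" per bit of the first sequence
        bit = first[i]
        for k in range(n_alignments):    # done simultaneously by the hardware counters
            if bit == second[i + k]:
                counters[k] += 1
    return counters
```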
Principles of Quantitative MR Imaging with Illustrated Review of Applicable Modular Pulse Diagrams.
Mills, Andrew F; Sakai, Osamu; Anderson, Stephan W; Jara, Hernan
2017-01-01
Continued improvements in diagnostic accuracy using magnetic resonance (MR) imaging will require development of methods for tissue analysis that complement traditional qualitative MR imaging studies. Quantitative MR imaging is based on measurement and interpretation of tissue-specific parameters independent of experimental design, compared with qualitative MR imaging, which relies on interpretation of tissue contrast that results from experimental pulse sequence parameters. Quantitative MR imaging represents a natural next step in the evolution of MR imaging practice, since quantitative MR imaging data can be acquired using currently available qualitative imaging pulse sequences without modifications to imaging equipment. The article presents a review of the basic physical concepts used in MR imaging and how quantitative MR imaging is distinct from qualitative MR imaging. Subsequently, the article reviews the hierarchical organization of major applicable pulse sequences used in this article, with the sequences organized into conventional, hybrid, and multispectral sequences capable of calculating the main tissue parameters of T1, T2, and proton density. While this new concept offers the potential for improved diagnostic accuracy and workflow, awareness of this extension to qualitative imaging is generally low. This article reviews the basic physical concepts in MR imaging, describes commonly measured tissue parameters in quantitative MR imaging, and presents the major available pulse sequences used for quantitative MR imaging, with a focus on the hierarchical organization of these sequences. © RSNA, 2017.
Fang, Chun; Noguchi, Tamotsu; Yamana, Hayato
2014-10-01
Evolutionary conservation information included in position-specific scoring matrix (PSSM) has been widely adopted by sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved to some extent. However, different functional sites have different conservation patterns: some are linear contextual, some are mingled with highly variable residues, and others seem to be conserved independently. Every value in a PSSM is calculated independently of the others, without carrying the contextual information of residues in the sequence. Therefore, adopting the direct output of PSSM for prediction fails to consider the relationship between conservation patterns of residues and the distribution of conservation scores in PSSMs. In order to demonstrate the importance of combining PSSMs with the specific conservation patterns of functional sites for prediction, three different PSSM-based methods for identifying three kinds of functional sites have been analyzed. Results suggest that different PSSM-based methods differ in their capability to identify different patterns of functional sites, and that better combining PSSMs with the specific conservation patterns of residues would largely facilitate the prediction.
QQ-SNV: single nucleotide variant detection at low frequency by comparing the quality quantiles.
Van der Borght, Koen; Thys, Kim; Wetzels, Yves; Clement, Lieven; Verbist, Bie; Reumers, Joke; van Vlijmen, Herman; Aerssens, Jeroen
2015-11-10
Next generation sequencing enables studying heterogeneous populations of viral infections. When the sequencing is done at high coverage depth ("deep sequencing"), low frequency variants can be detected. Here we present QQ-SNV (http://sourceforge.net/projects/qqsnv), a logistic regression classifier model developed for the Illumina sequencing platforms that uses the quantiles of the quality scores, to distinguish true single nucleotide variants from sequencing errors based on the estimated SNV probability. To train the model, we created a dataset of an in silico mixture of five HIV-1 plasmids. Testing of our method in comparison to the existing methods LoFreq, ShoRAH, and V-Phaser 2 was performed on two HIV and four HCV plasmid mixture datasets and one influenza H1N1 clinical dataset. For default application of QQ-SNV, variants were called using a SNV probability cutoff of 0.5 (QQ-SNV(D)). To improve the sensitivity we used a SNV probability cutoff of 0.0001 (QQ-SNV(HS)). To also increase specificity, SNVs called were overruled when their frequency was below the 80(th) percentile calculated on the distribution of error frequencies (QQ-SNV(HS-P80)). When comparing QQ-SNV versus the other methods on the plasmid mixture test sets, QQ-SNV(D) performed similarly to the existing approaches. QQ-SNV(HS) was more sensitive on all test sets but with more false positives. QQ-SNV(HS-P80) was found to be the most accurate method over all test sets by balancing sensitivity and specificity. When applied to a paired-end HCV sequencing study, with lowest spiked-in true frequency of 0.5%, QQ-SNV(HS-P80) revealed a sensitivity of 100% (vs. 40-60% for the existing methods) and a specificity of 100% (vs. 98.0-99.7% for the existing methods). In addition, QQ-SNV required the least overall computation time to process the test sets. 
Finally, when testing on a clinical sample, four putative true variants with frequency below 0.5% were consistently detected by QQ-SNV(HS-P80) from different generations of Illumina sequencers. We developed and successfully evaluated a novel method, called QQ-SNV, for highly efficient single nucleotide variant calling on Illumina deep sequencing virology data.
Zhang, Bochao; Meng, Wenzhao; Prak, Eline T Luning; Hershberg, Uri
2015-12-01
Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), (Diversity (D) in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences. Copyright © 2015 Elsevier B.V. All rights reserved.
Sayah, Anousheh; Jay, Ann K; Toaff, Jacob S; Makariou, Erini V; Berkowitz, Frank
2016-09-01
Reducing lumbar spine MRI scanning time while retaining diagnostic accuracy can benefit patients and reduce health care costs. This study compares the effectiveness of a rapid lumbar MRI protocol using 3D T2-weighted sampling perfection with application-optimized contrast with different flip-angle evolutions (SPACE) sequences with a standard MRI protocol for evaluation of lumbar spondylosis. Two hundred fifty consecutive unenhanced lumbar MRI examinations performed at 1.5 T were retrospectively reviewed. Full, rapid, and complete versions of each examination were interpreted for spondylotic changes at each lumbar level, including herniations and neural compromise. The full examination consisted of sagittal T1-weighted, T2-weighted turbo spin-echo (TSE), and STIR sequences; and axial T1- and T2-weighted TSE sequences (time, 18 minutes 40 seconds). The rapid examination consisted of sagittal T1- and T2-weighted SPACE sequences, with axial SPACE reformations (time, 8 minutes 46 seconds). The complete examination consisted of the full examination plus the T2-weighted SPACE sequence. Sensitivities and specificities of the full and rapid examinations were calculated using the complete study as the reference standard. The rapid and full studies had sensitivities of 76.0% and 69.3%, with specificities of 97.2% and 97.9%, respectively, for all degenerative processes. Rapid and full sensitivities were 68.7% and 66.3% for disk herniation, 85.2% and 81.5% for canal compromise, 82.9% and 69.1% for lateral recess compromise, and 76.9% and 69.7% for foraminal compromise, respectively. Isotropic SPACE T2-weighted imaging provides high-quality imaging of lumbar spondylosis, with multiplanar reformatting capability. Our SPACE-based rapid protocol had sensitivities and specificities for herniations and neural compromise comparable to those of the protocol without SPACE. 
This protocol fits within a 15-minute slot, potentially reducing costs and discomfort for a large subgroup of patients.
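The sensitivity and specificity figures above follow from the usual confusion-matrix definitions, with the complete study as the reference standard. A minimal sketch, with hypothetical function and argument names and one boolean finding per evaluated lumbar level:

```python
def sensitivity_specificity(test_findings, reference_findings):
    """Return (sensitivity, specificity) of a protocol against a reference standard.

    Both arguments are equal-length sequences of booleans
    (True = finding present at that level).
    """
    tp = fp = tn = fn = 0
    for test, ref in zip(test_findings, reference_findings):
        if ref and test:
            tp += 1
        elif ref and not test:
            fn += 1
        elif not ref and test:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    return tp / (tp + fn), tn / (tn + fp)
```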
Jaekel, Ulrike; Musat, Niculina; Adam, Birgit; Kuypers, Marcel; Grundmann, Olav; Musat, Florin
2013-05-01
The short-chain, non-methane hydrocarbons propane and butane can contribute significantly to the carbon and sulfur cycles in marine environments affected by oil or natural gas seepage. In the present study, we enriched and identified novel propane and butane-degrading sulfate reducers from marine oil and gas cold seeps in the Gulf of Mexico and Hydrate Ridge. The enrichment cultures obtained were able to degrade simultaneously propane and butane, but not other gaseous alkanes. They were cold-adapted, showing highest sulfate-reduction rates between 16 and 20 °C. Analysis of 16S rRNA gene libraries, followed by whole-cell hybridizations with sequence-specific oligonucleotide probes showed that each enrichment culture was dominated by a unique phylotype affiliated with the Desulfosarcina-Desulfococcus cluster within the Deltaproteobacteria. These phylotypes formed a distinct phylogenetic cluster of propane and butane degraders, including sequences from environments associated with hydrocarbon seeps. Incubations with (13)C-labeled substrates, hybridizations with sequence-specific probes and nanoSIMS analyses showed that cells of the dominant phylotypes were the first to become enriched in (13)C, demonstrating that they were directly involved in hydrocarbon degradation. Furthermore, using the nanoSIMS data, carbon assimilation rates were calculated for the dominant cells in each enrichment culture.
Evaluation of liver fat in the presence of iron with MRI using T2* correction: a clinical approach.
Henninger, Benjamin; Kremser, Christian; Rauch, Stefan; Eder, Robert; Judmaier, Werner; Zoller, Heinz; Michaely, Henrik; Schocke, Michael
2013-06-01
To assess magnetic resonance imaging (MRI) with conventional chemical shift-based sequences with and without T2* correction for the evaluation of steatosis hepatitis (SH) in the presence of iron. Thirty-one patients who underwent MRI and liver biopsy because of clinically suspected diffuse liver disease were retrospectively analysed. The signal intensity (SI) was calculated in co-localised regions of interest (ROIs) using conventional spoiled gradient-echo T1 FLASH in-phase and opposed-phase (IP/OP). T2* relaxation time was recorded in a fat-saturated multi-echo-gradient-echo sequence. The fat fraction (FF) was calculated with non-corrected and T2*-corrected SIs. Results were correlated with liver biopsy. There was a significant difference (P < 0.001) between uncorrected and T2*-corrected FF in patients with SH and concomitant hepatic iron overload (HIO). Using 5 % as a threshold resulted in eight false negative results with uncorrected FF, whereas T2*-corrected FF led to true positive results in 5/8 patients. ROC analysis calculated three threshold values (8.97 %, 5.3 % and 3.92 %) for T2*-corrected FF with accuracy 84 %, sensitivity 83-91 % and specificity 63-88 %. FF with T2* correction is accurate for the diagnosis of hepatic fat in the presence of HIO. Findings of our study suggest the use of IP/OP imaging in combination with T2* correction. • Magnetic resonance helps quantify both iron and fat content within the liver • T2* correction helps to predict the correct diagnosis of steatosis hepatitis • "Fat fraction" from T2*-corrected chemical shift-based sequences accurately quantifies hepatic fat • "Fat fraction" without T2* correction underestimates hepatic fat with iron overload.
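The abstract does not state the formula used, so the following is only a sketch under common assumptions: the dual-echo fat fraction is taken as FF = (IP - OP) / (2 IP), and T2* correction is applied by scaling each signal intensity by exp(TE / T2*) to undo the decay at its echo time. The function name, argument names, and any echo times passed in are hypothetical.

```python
from math import exp

def fat_fraction(si_ip, si_op, te_ip=None, te_op=None, t2_star=None):
    """Dual-echo fat fraction, optionally T2*-corrected.

    si_ip, si_op: in-phase and opposed-phase signal intensities.
    te_ip, te_op, t2_star: echo times and T2* in the same unit (e.g. ms);
    the correction is applied only when t2_star is given.
    """
    if t2_star is not None:
        si_ip = si_ip * exp(te_ip / t2_star)  # undo T2* decay at the in-phase echo
        si_op = si_op * exp(te_op / t2_star)  # undo T2* decay at the opposed-phase echo
    return (si_ip - si_op) / (2.0 * si_ip)
```

Because the in-phase echo is typically acquired later than the opposed-phase echo, a short T2* (as with iron overload) depresses the in-phase signal more, which lowers the uncorrected estimate; this is consistent with the underestimation reported above.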
Chronic exposure to water pollutant trichloroethylene increased epigenetic drift in CD4(+) T cells.
Gilbert, Kathleen M; Blossom, Sarah J; Erickson, Stephen W; Reisfeld, Brad; Zurlinden, Todd J; Broadfoot, Brannon; West, Kirk; Bai, Shasha; Cooney, Craig A
2016-05-01
Autoimmune disease and CD4(+) T-cell alterations are induced in mice exposed to the water pollutant trichloroethylene (TCE). We examined here whether TCE altered gene-specific DNA methylation in CD4(+) T cells as a possible mechanism of immunotoxicity. Naive and effector/memory CD4(+) T cells from mice exposed to TCE (0.5 mg/ml in drinking water) for 40 weeks were examined by bisulfite next-generation DNA sequencing. A probabilistic model calculated from multiple genes showed that TCE decreased methylation control in CD4(+) T cells. Data from individual genes fitted to a quadratic regression model showed that TCE increased gene-specific methylation variance in both CD4 subsets. TCE increased epigenetic drift of specific CpG sites in CD4(+) T cells.
Markov-modulated Markov chains and the covarion process of molecular evolution.
Galtier, N; Jean-Marie, A
2004-01-01
The covarion (or site specific rate variation, SSRV) process of biological sequence evolution is a process by which the evolutionary rate of a nucleotide/amino acid/codon position can change in time. In this paper, we introduce time-continuous, space-discrete, Markov-modulated Markov chains as a model for representing SSRV processes, generalizing existing theory to any model of rate change. We propose a fast algorithm for diagonalizing the generator matrix of relevant Markov-modulated Markov processes. This algorithm makes phylogeny likelihood calculation tractable even for a large number of rate classes and a large number of states, so that SSRV models become applicable to amino acid or codon sequence datasets. Using this algorithm, we investigate the accuracy of the discrete approximation to the Gamma distribution of evolutionary rates, widely used in molecular phylogeny. We show that a relatively large number of classes is required to achieve accurate approximation of the exact likelihood when the number of analyzed sequences exceeds 20, both under the SSRV and among site rate variation (ASRV) models.
An RNAi in silico approach to find an optimal shRNA cocktail against HIV-1
2010-01-01
Background: HIV-1 can be inhibited by RNA interference in vitro through the expression of short hairpin RNAs (shRNAs) that target conserved genome sequences. In silico shRNA design for HIV has lacked a detailed study of virus variability constituting a possible breaking point in a clinical setting. We designed shRNAs against HIV-1 considering the variability observed in naïve and drug-resistant isolates available at public databases. Methods: A Bioperl-based algorithm was developed to automatically scan multiple sequence alignments of HIV, while evaluating the possibility of identifying dominant and subdominant viral variants that could be used as efficient silencing molecules. Student t-test and Bonferroni Dunn correction test were used to assess statistical significance of our findings. Results: Our in silico approach identified the most common viral variants within highly conserved genome regions, with a calculated free energy of ≥ -6.6 kcal/mol. This is crucial for strand loading to RISC complex and for a predicted silencing efficiency score, which could be used in combination for achieving over 90% silencing. Resistant and naïve isolate variability revealed that the most frequent shRNA per region targets a maximum of 85% of viral sequences. Adding more divergent sequences maintained this percentage. Specific sequence features that have been found to be related with higher silencing efficiency were hardly accomplished in conserved regions, even when lower entropy values correlated with better scores. We identified a conserved region among most HIV-1 genomes, which meets as many sequence features for efficient silencing. Conclusions: HIV-1 variability is an obstacle to achieving absolute silencing using shRNAs designed against a consensus sequence, mainly because there are many functional viral variants. Our shRNA cocktail could be truly effective at silencing dominant and subdominant naïve viral variants. 
Additionally, resistant isolates might be targeted under specific antiretroviral selective pressure, but in both cases these should be tested exhaustively prior to clinical use. PMID:21172023
From printed color to image appearance: tool for advertising assessment
NASA Astrophysics Data System (ADS)
Bonanomi, Cristian; Marini, Daniele; Rizzi, Alessandro
2012-07-01
We present a methodology to calculate the color appearance of advertising billboards set in indoor and outdoor environments, printed on different types of paper support and viewed under different illuminations. The aim is to simulate the visual appearance of an image printed on a specific support, observed in a certain context and illuminated with a specific source of light. Knowing in advance the visual rendering of an image in different conditions can avoid problems related to its visualization. The proposed method applies a sequence of transformations to convert a four channels image (CMYK) into a spectral one, considering the paper support, then it simulates the chosen illumination, and finally computes an estimation of the appearance.
Mars, Mokhtar; Bouaziz, Mouna; Tbini, Zeineb; Ladeb, Fethi; Gharbi, Souha
2018-06-12
This study aims to determine how Magnetic Resonance Imaging (MRI) acquisition techniques and calculation methods affect T2 values of knee cartilage at 1.5 Tesla and to identify sequences that can be used for high-resolution T2 mapping in short scanning times. This study was performed on a phantom and twenty-nine patients who underwent MRI of the knee joint at 1.5 Tesla. The protocol includes T2 mapping sequences based on Single Echo Spin Echo (SESE), Multi-Echo Spin Echo (MESE), Fast Spin Echo (FSE) and Turbo Gradient Spin Echo (TGSE). The T2 relaxation times were quantified and evaluated using three calculation methods (MapIt, Syngo Offline and monoexponential fit). Signal to Noise Ratios (SNR) were measured in all sequences. All statistical analyses were performed using the t-test. The average T2 values in the phantom were 41.7 ± 13.8 ms for SESE, 43.2 ± 14.4 ms for MESE, 42.4 ± 14.1 ms for FSE and 44 ± 14.5 ms for TGSE. In the patient study, the mean differences were 6.5 ± 8.2 ms, 7.8 ± 7.6 ms and 8.4 ± 14.2 ms for MESE, FSE and TGSE compared to SESE, respectively; these differences were not statistically significant (p > 0.05). The comparison between the three calculation methods showed no significant difference (p > 0.05). The t-test showed no significant difference between SNR values for all sequences. T2 values depend not only on the sequence type but also on the calculation method. None of the sequences revealed significant differences compared to the SESE reference sequence. TGSE with its short scanning time can be used for high-resolution T2 mapping. © 2018 The Author(s). Published by S. Karger AG, Basel.
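Of the three calculation methods, the monoexponential fit is the one that can be sketched without vendor software. The version below assumes the model S(TE) = S0 exp(-TE / T2) and fits it by ordinary least squares on the logarithm of the signal; the study may have used a different fitting procedure (for example nonlinear fitting or noise-floor handling), and the function name is hypothetical.

```python
from math import exp, log

def fit_t2(echo_times_ms, signals):
    """Fit S(TE) = S0 * exp(-TE / T2) by log-linear least squares.

    Returns (t2_ms, s0). Requires positive signals and at least two
    distinct echo times.
    """
    xs = list(echo_times_ms)
    ys = [log(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # slope of ln(S) vs TE equals -1 / T2
    intercept = mean_y - slope * mean_x  # intercept equals ln(S0)
    return -1.0 / slope, exp(intercept)
```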
NASA Technical Reports Server (NTRS)
Zhang, Zhengdong; Willson, Richard C.; Fox, George E.
2002-01-01
MOTIVATION: The phylogenetic structure of the bacterial world has been intensively studied by comparing sequences of 16S ribosomal RNA (16S rRNA). This database of sequences is now widely used to design probes for the detection of specific bacteria or groups of bacteria one at a time. The success of such methods reflects the fact that there are local sequence segments that are highly characteristic of particular organisms or groups of organisms. It is not clear, however, the extent to which such signature sequences exist in the 16S rRNA dataset. A better understanding of the numbers and distribution of highly informative oligonucleotide sequences may facilitate the design of hybridization arrays that can characterize the phylogenetic position of an unknown organism or serve as the basis for the development of novel approaches for use in bacterial identification. RESULTS: A computer-based algorithm that characterizes the extent to which any individual oligonucleotide sequence in 16S rRNA is characteristic of any particular bacterial grouping was developed. A measure of signature quality, Q(s), was formulated and subsequently calculated for every individual oligonucleotide sequence in the size range of 5-11 nucleotides and for 15mers with reference to each cluster and subcluster in a 929 organism representative phylogenetic tree. Subsequently, the perfect signature sequences were compared to the full set of 7322 sequences to see how common false positives were. The work completed here establishes beyond any doubt that highly characteristic oligonucleotides exist in the bacterial 16S rRNA sequence dataset in large numbers. Over 16,000 15mers were identified that might be useful as signatures. Signature oligonucleotides are available for over 80% of the nodes in the representative tree.
IVisTMSA: Interactive Visual Tools for Multiple Sequence Alignments.
Pervez, Muhammad Tariq; Babar, Masroor Ellahi; Nadeem, Asif; Aslam, Naeem; Naveed, Nasir; Ahmad, Sarfraz; Muhammad, Shah; Qadri, Salman; Shahid, Muhammad; Hussain, Tanveer; Javed, Maryam
2015-01-01
IVisTMSA is a software package of seven graphical tools for multiple sequence alignments. MSApad is an editing and analysis tool. It can load 409% more data than Jalview, STRAP, CINEMA, and Base-by-Base. MSA comparator allows the user to visualize consistent and inconsistent regions of reference and test alignments of more than 21-MB size in less than 12 seconds. MSA comparator is 5,200% efficient and more than 40% efficient as compared to BALiBASE c program and FastSP, respectively. MSA reconstruction tool provides graphical user interfaces for four popular aligners and allows the user to load several sequence files at a time. FASTA generator converts seven formats of alignments of unlimited size into FASTA format in a few seconds. MSA ID calculator calculates identity matrix of more than 11,000 sequences with a sequence length of 2,696 base pairs in less than 100 seconds. Tree and Distance Matrix calculation tools generate phylogenetic tree and distance matrix, respectively, using neighbor joining% identity and BLOSUM 62 matrix.
Szulik, Marta W; Pallan, Pradeep S; Nocek, Boguslaw; Voehler, Markus; Banerjee, Surajit; Brooks, Sonja; Joachimiak, Andrzej; Egli, Martin; Eichman, Brandt F; Stone, Michael P
2015-02-10
5-Hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) form during active demethylation of 5-methylcytosine (5mC) and are implicated in epigenetic regulation of the genome. They are differentially processed by thymine DNA glycosylase (TDG), an enzyme involved in active demethylation of 5mC. Three modified Dickerson-Drew dodecamer (DDD) sequences, amenable to crystallographic and spectroscopic analyses and containing the 5'-CG-3' sequence associated with genomic cytosine methylation, containing 5hmC, 5fC, or 5caC placed site-specifically into the 5'-T(8)X(9)G(10)-3' sequence of the DDD, were compared. The presence of 5caC at the X(9) base increased the stability of the DDD, whereas 5hmC or 5fC did not. Both 5hmC and 5fC increased imino proton exchange rates and calculated rate constants for base pair opening at the neighboring base pair A(5):T(8), whereas 5caC did not. At the oxidized base pair G(4):X(9), 5fC exhibited an increase in the imino proton exchange rate and the calculated kop. In all cases, minimal effects to imino proton exchange rates occurred at the neighboring base pair C(3):G(10). No evidence was observed for imino tautomerization, accompanied by wobble base pairing, for 5hmC, 5fC, or 5caC when positioned at base pair G(4):X(9); each favored Watson-Crick base pairing. However, both 5fC and 5caC exhibited intranucleobase hydrogen bonding between their formyl or carboxyl oxygens, respectively, and the adjacent cytosine N(4) exocyclic amines. The lesion-specific differences observed in the DDD may be implicated in recognition of 5hmC, 5fC, or 5caC in DNA by TDG. However, they do not correlate with differential excision of 5hmC, 5fC, or 5caC by TDG, which may be mediated by differences in transition states of the enzyme-bound complexes.
Moschetta, Marco; Telegrafo, Michele; Rella, Leonarda; Stabile Ianora, Amato Antonio; Angelelli, Giuseppe
2014-12-01
Diffusion-weighted imaging with background body signal suppression (DWIBS) provides both qualitative and quantitative imaging of breast lesions and is usually performed before contrast material injection (CMI). This study aims to assess whether the administration of gadolinium significantly affects DWIBS imaging. 200 patients were prospectively evaluated by MRI with STIR, TSE-T2, pre-CMI DWIBS, contrast enhanced THRIVE-T1 and post-CMI DWIBS sequences. Pre and post-CMI DWIBS were analyzed searching for the presence of breast lesions and calculating the ADC value. ADC values of ≤1.44×10(-3) mm(2)/s were considered suspicious for malignancy. This analysis was then compared with the histological findings. Sensitivity, specificity, diagnostic accuracy (DA), positive predictive value (PPV) and negative predictive value (NPV) were calculated for both sequences and represented by ROC analysis. Pre and post-CMI ADC values were compared by using the paired t test. In 150/200 (75%) patients, pre and post-CMI DWIBS indicated the presence of breast lesions, 53 (35%) with ADC values of >1.44×10(-3) mm(2)/s and 97 (65%) with ADC≤1.44×10(-3) mm(2)/s. Pre-CMI and post-CMI DWIBS sequences obtained the same sensitivity, specificity, DA, PPV and NPV values of 97%, 83%, 89%, 79% and 98%. The mean ADC value of benign lesions was 1.831±0.18×10(-3) mm(2)/s before and 1.828±0.18×10(-3) mm(2)/s after CMI. The mean ADC value of the malignant lesions was 1.146±0.16×10(-3) mm(2)/s before and 1.144±0.16×10(-3) mm(2)/s after CMI. No significant difference was found between pre and post CMI ADC values (p>0.05). DWIBS imaging is not influenced by CMI. Breast MR protocol could be modified by placing DWIBS after dynamic contrast enhanced sequences in order to maximize patient cooperation. Copyright © 2014 Elsevier Inc. All rights reserved.
Szulik, Marta W.; Pallan, Pradeep S.; Nocek, Boguslaw; ...
2015-01-29
5-Hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) form during active demethylation of 5-methylcytosine (5mC) and are implicated in epigenetic regulation of the genome. They are differentially processed by thymine DNA glycosylase (TDG), an enzyme involved in active demethylation of 5mC. Three modified Dickerson–Drew dodecamer (DDD) sequences, amenable to crystallographic and spectroscopic analyses and containing the 5'-CG-3' sequence associated with genomic cytosine methylation, containing 5hmC, 5fC, or 5caC placed site-specifically into the 5'-T8X9G10-3' sequence of the DDD, were compared. The presence of 5caC at the X9 base increased the stability of the DDD, whereas 5hmC or 5fC did not. Both 5hmC and 5fC increased imino proton exchange rates and calculated rate constants for base pair opening at the neighboring base pair A5:T8, whereas 5caC did not. At the oxidized base pair G4:X9, 5fC exhibited an increase in the imino proton exchange rate and the calculated kop. In all cases, minimal effects to imino proton exchange rates occurred at the neighboring base pair C3:G10. No evidence was observed for imino tautomerization, accompanied by wobble base pairing, for 5hmC, 5fC, or 5caC when positioned at base pair G4:X9; each favored Watson–Crick base pairing. However, both 5fC and 5caC exhibited intranucleobase hydrogen bonding between their formyl or carboxyl oxygens, respectively, and the adjacent cytosine N4 exocyclic amines. The lesion-specific differences observed in the DDD may be implicated in recognition of 5hmC, 5fC, or 5caC in DNA by TDG. However, they do not correlate with differential excision of 5hmC, 5fC, or 5caC by TDG, which may be mediated by differences in transition states of the enzyme-bound complexes.
2016-01-01
5-Hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) form during active demethylation of 5-methylcytosine (5mC) and are implicated in epigenetic regulation of the genome. They are differentially processed by thymine DNA glycosylase (TDG), an enzyme involved in active demethylation of 5mC. Three modified Dickerson–Drew dodecamer (DDD) sequences, amenable to crystallographic and spectroscopic analyses and containing the 5′-CG-3′ sequence associated with genomic cytosine methylation, containing 5hmC, 5fC, or 5caC placed site-specifically into the 5′-T8X9G10-3′ sequence of the DDD, were compared. The presence of 5caC at the X9 base increased the stability of the DDD, whereas 5hmC or 5fC did not. Both 5hmC and 5fC increased imino proton exchange rates and calculated rate constants for base pair opening at the neighboring base pair A5:T8, whereas 5caC did not. At the oxidized base pair G4:X9, 5fC exhibited an increase in the imino proton exchange rate and the calculated kop. In all cases, minimal effects to imino proton exchange rates occurred at the neighboring base pair C3:G10. No evidence was observed for imino tautomerization, accompanied by wobble base pairing, for 5hmC, 5fC, or 5caC when positioned at base pair G4:X9; each favored Watson–Crick base pairing. However, both 5fC and 5caC exhibited intranucleobase hydrogen bonding between their formyl or carboxyl oxygens, respectively, and the adjacent cytosine N4 exocyclic amines. The lesion-specific differences observed in the DDD may be implicated in recognition of 5hmC, 5fC, or 5caC in DNA by TDG. However, they do not correlate with differential excision of 5hmC, 5fC, or 5caC by TDG, which may be mediated by differences in transition states of the enzyme-bound complexes. PMID:25632825
Jalili, Seifollah; Karami, Leila; Schofield, Jeremy
2013-06-01
Proline-rich homeodomain (PRH) is a regulatory protein controlling transcription and gene expression processes by binding to the specific sequence of DNA, especially to the sequence 5'-TAATNN-3'. The impact of base pair mutations on the binding between the PRH protein and DNA is investigated using molecular dynamics and free energy simulations to identify DNA sequences that form stable complexes with PRH. Three 20-ns molecular dynamics simulations (PRH-TAATTG, PRH-TAATTA and PRH-TAATGG complexes) in explicit solvent water were performed to investigate three complexes structurally. Structural analysis shows that the native TAATTG sequence forms a complex that is more stable than complexes with base pair mutations. It is also observed that upon mutation, the number and occupancy of the direct and water-mediated hydrogen bonds decrease. Free energy calculations performed with the thermodynamic integration method predict relative binding free energies of 0.64 and 2 kcal/mol for GC to AT and TA to GC mutations, respectively, suggesting that among the three DNA sequences, the PRH-TAATTG complex is more stable than the two mutated complexes. In addition, it is demonstrated that the stability of the PRH-TAATTA complex is greater than that of the PRH-TAATGG complex.
Beqiri, Arian; Price, Anthony N; Padormo, Francesco; Hajnal, Joseph V; Malik, Shaihan J
2017-06-01
Cardiac magnetic resonance imaging (MRI) at high field presents challenges because of the high specific absorption rate and significant transmit field (B1+) inhomogeneities. Parallel transmission MRI offers the ability to correct for both issues at the level of individual radiofrequency (RF) pulses, but must operate within strict hardware and safety constraints. The constraints are themselves affected by sequence parameters, such as the RF pulse duration and TR, meaning that an overall optimal operating point exists for a given sequence. This work seeks to obtain optimal performance by performing a 'sequence-level' optimization in which pulse sequence parameters are included as part of an RF shimming calculation. The method is applied to balanced steady-state free precession cardiac MRI with the objective of minimizing TR, hence reducing the imaging duration. Results are demonstrated using an eight-channel parallel transmit system operating at 3 T, with an in vivo study carried out on seven male subjects of varying body mass index (BMI). Compared with single-channel operation, a mean-squared-error shimming approach leads to reduced imaging durations of 32 ± 3% with simultaneous improvement in flip angle homogeneity of 32 ± 8% within the myocardium. © 2017 The Authors. NMR in Biomedicine published by John Wiley & Sons Ltd.
McInerney-Leo, Aideen M; Marshall, Mhairi S; Gardiner, Brooke; Coucke, Paul J; Van Laer, Lut; Loeys, Bart L; Summers, Kim M; Symoens, Sofie; West, Jennifer A; West, Malcolm J; Paul Wordsworth, B; Zankl, Andreas; Leo, Paul J; Brown, Matthew A; Duncan, Emma L
2013-01-01
Osteogenesis imperfecta (OI) and Marfan syndrome (MFS) are common Mendelian disorders. Both conditions are usually diagnosed clinically, as genetic testing is expensive due to the size and number of potentially causative genes and mutations. However, genetic testing may benefit patients, at-risk family members and individuals with borderline phenotypes, as well as improving genetic counseling and allowing critical differential diagnoses. We assessed whether whole exome sequencing (WES) is a sensitive method for mutation detection in OI and MFS. WES was performed on genomic DNA from 13 participants with OI and 10 participants with MFS who had known mutations, with exome capture followed by massive parallel sequencing of multiplexed samples. Single nucleotide polymorphisms (SNPs) and small indels were called using Genome Analysis Toolkit (GATK) and annotated with ANNOVAR. CREST, exomeCopy and exomeDepth were used for large deletion detection. Results were compared with the previous data. Specificity was calculated by screening WES data from a control population of 487 individuals for mutations in COL1A1, COL1A2 and FBN1. The target capture of five exome capture platforms was compared. All 13 mutations in the OI cohort and 9/10 in the MFS cohort were detected (sensitivity=95.6%) including non-synonymous SNPs, small indels (<10 bp), and a large UTR5/exon 1 deletion. One mutation was not detected by GATK due to strand bias. Specificity was 99.5%. Capture platforms and analysis programs differed considerably in their ability to detect mutations. Consumable costs for WES were low. WES is an efficient, sensitive, specific and cost-effective method for mutation detection in patients with OI and MFS. Careful selection of platform and analysis programs is necessary to maximize success. PMID:24501682
Smith, M A; Dyson, S J; Murray, R C
2012-11-01
To determine the reliability of 2 magnetic resonance imaging (MRI) systems for detection of cartilage and bone lesions of the equine fetlock. To test the hypotheses that lesions in cartilage, subchondral and trabecular bone of the equine fetlock verified using histopathology can be detected on high- and low-field MR images with a low incidence of false positive or negative results; that low-field images are less reliable than high-field images for detection of cartilage lesions; and that combining results of interpretation from different pulse sequences increases detection of cartilage lesions. High- and low-field MRI was performed on 19 limbs from horses identified with fetlock lameness prior to euthanasia. Grading systems were used to score cartilage, subchondral and trabecular bone on MR images and histopathology. Sensitivity and specificity were calculated for images. High-field T2*-weighted gradient echo (T2*W-GRE) and low-field T2-weighted fast spin echo (T2W-FSE) images had high sensitivity but low specificity for detection of cartilage lesions. All pulse sequences had high sensitivity and low-moderate specificity for detection of subchondral bone lesions and moderate sensitivity and moderate-high specificity for detection of trabecular bone lesions (histopathology as gold standard). For detection of lesions of trabecular bone low-field T2*W-GRE images had higher sensitivity and specificity than T2W-FSE images. There is high likelihood of false positive results using high- or low-field MRI for detection of cartilage lesions and moderate-high likelihood of false positive results for detection of subchondral bone lesions compared with histopathology. Combining results of interpretation from different pulse sequences did not increase detection of cartilage lesions. MRI interpretation of trabecular bone was more reliable than cartilage or subchondral bone in both MR systems. 
Independent interpretation of a variety of pulse sequences may maximise detection of cartilage and bone lesions in the fetlock. Clinicians should be aware of potential false positive and negative results. © 2012 EVJ Ltd.
Structural basis of RNA folding and recognition in an AMP-RNA aptamer complex.
Jiang, F; Kumar, R A; Jones, R A; Patel, D J
1996-07-11
The catalytic properties of RNA and its well-known role in gene expression and regulation are the consequence of its unique solution structures. Identification of the structural determinants of ligand recognition by RNA molecules is of fundamental importance for understanding the biological functions of RNA, as well as for the rational design of RNA sequences with specific catalytic activities. Towards this latter end, Szostak et al. used in vitro selection techniques to isolate RNA sequences ('aptamers') containing a high-affinity binding site for ATP, the universal currency of cellular energy, and then used this motif to engineer ribozymes with polynucleotide kinase activity. Here we present the solution structure, as determined by multidimensional NMR spectroscopy and molecular dynamics calculations, of both uniformly and specifically 13C-, 15N-labelled 40-mer RNA containing the ATP-binding motif complexed with AMP. The aptamer adopts an L-shaped structure with two nearly orthogonal stems, each capped proximally by a G x G mismatch pair, binding the AMP ligand at their junction in a GNRA-like motif.
Huguet-Tapia, Jose C.; Lefebure, Tristan; Badger, Jonathan H.; Guan, Dongli; Stanhope, Michael J.
2016-01-01
Streptomyces spp. are highly differentiated actinomycetes with large, linear chromosomes that encode an arsenal of biologically active molecules and catabolic enzymes. Members of this genus are well equipped for life in nutrient-limited environments and are common soil saprophytes. Out of the hundreds of species in the genus Streptomyces, a small group has evolved the ability to infect plants. The recent availability of Streptomyces genome sequences, including four genomes of pathogenic species, provided an opportunity to characterize the gene content specific to these pathogens and to study phylogenetic relationships among them. Genome sequencing, comparative genomics, and phylogenetic analysis enabled us to discriminate pathogenic from saprophytic Streptomyces strains; moreover, we calculated that the pathogen-specific genome contains 4,662 orthologs. Phylogenetic reconstruction suggested that Streptomyces scabies and S. ipomoeae share an ancestor but that their biosynthetic clusters encoding the required virulence factor thaxtomin have diverged. In contrast, S. turgidiscabies and S. acidiscabies, two relatively unrelated pathogens, possess highly similar thaxtomin biosynthesis clusters, which suggests that the acquisition of these genes was through lateral gene transfer. PMID:26826232
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wagner, John C; Peplow, Douglas E.; Mosher, Scott W
2014-01-01
This paper presents a new hybrid (Monte Carlo/deterministic) method for increasing the efficiency of Monte Carlo calculations of distributions, such as flux or dose rate distributions (e.g., mesh tallies), as well as responses at multiple localized detectors and spectra. This method, referred to as Forward-Weighted CADIS (FW-CADIS), is an extension of the Consistent Adjoint Driven Importance Sampling (CADIS) method, which has been used for more than a decade to very effectively improve the efficiency of Monte Carlo calculations of localized quantities, e.g., flux, dose, or reaction rate at a specific location. The basis of this method is the development of an importance function that represents the importance of particles to the objective of uniform Monte Carlo particle density in the desired tally regions. Implementation of this method utilizes the results from a forward deterministic calculation to develop a forward-weighted source for a deterministic adjoint calculation. The resulting adjoint function is then used to generate consistent space- and energy-dependent source biasing parameters and weight windows that are used in a forward Monte Carlo calculation to obtain more uniform statistical uncertainties in the desired tally regions. The FW-CADIS method has been implemented and demonstrated within the MAVRIC sequence of SCALE and the ADVANTG/MCNP framework. Application of the method to representative, real-world problems, including calculation of dose rate and energy dependent flux throughout the problem space, dose rates in specific areas, and energy spectra at multiple detectors, is presented and discussed. Results of the FW-CADIS method and other recently developed global variance reduction approaches are also compared, and the FW-CADIS method outperformed the other methods in all cases considered.
Korber, B T; Kunstman, K J; Patterson, B K; Furtado, M; McEvilly, M M; Levy, R; Wolinsky, S M
1994-01-01
Human immunodeficiency virus type 1 (HIV-1) sequences were generated from blood and from brain tissue obtained by stereotactic biopsy from six patients undergoing a diagnostic neurosurgical procedure. Proviral DNA was directly amplified by nested PCR, and 8 to 36 clones from each sample were sequenced. Phylogenetic analysis of intrapatient envelope V3-V5 region HIV-1 DNA sequence sets revealed that brain viral sequences were clustered relative to the blood viral sequences, suggestive of tissue-specific compartmentalization of the virus in four of the six cases. In the other two cases, the blood and brain virus sequences were intermingled in the phylogenetic analyses, suggesting trafficking of virus between the two tissues. Slide-based PCR-driven in situ hybridization of two of the patients' brain biopsy samples confirmed our interpretation of the intrapatient phylogenetic analyses. Interpatient V3 region brain-derived sequence distances were significantly less than blood-derived sequence distances. Relative to the tip of the loop, the set of brain-derived viral sequences had a tendency towards negative or neutral charge compared with the set of blood-derived viral sequences. Entropy calculations were used as a measure of the variability at each position in alignments of blood and brain viral sequences. A relatively conserved set of positions was found, with a significantly lower entropy in the brain- than in the blood-derived viral sequences. These sites constitute a brain "signature pattern," or a noncontiguous set of amino acids in the V3 region conserved in viral sequences derived from brain tissue. This brain-derived signature pattern was also well preserved among isolates previously characterized in vitro as macrophage tropic. Macrophage-monocyte tropism may be the biological constraint that results in the conservation of the viral brain signature pattern. PMID:7933130
Cost-effectiveness of sequenced treatment of rheumatoid arthritis with targeted immune modulators.
Jansen, Jeroen P; Incerti, Devin; Mutebi, Alex; Peneva, Desi; MacEwan, Joanna P; Stolshek, Bradley; Kaur, Primal; Gharaibeh, Mahdi; Strand, Vibeke
2017-07-01
To determine the cost-effectiveness of treatment sequences of biologic disease-modifying anti-rheumatic drugs or Janus kinase/STAT pathway inhibitors (collectively referred to as bDMARDs) vs conventional DMARDs (cDMARDs) from the US societal perspective for treatment of patients with moderately to severely active rheumatoid arthritis (RA) with inadequate responses to cDMARDs. An individual patient simulation model was developed that assesses the impact of treatments on disease based on clinical trial data and real-world evidence. Treatment strategies included sequences starting with etanercept, adalimumab, certolizumab, or abatacept. Each of these treatment strategies was compared with cDMARDs. Incremental cost, incremental quality-adjusted life-years (QALYs), and incremental cost-effectiveness ratios (ICERs) were calculated for each treatment sequence relative to cDMARDs. The cost-effectiveness of each strategy was determined using a US willingness-to-pay (WTP) threshold of $150,000/QALY. For the base-case scenario, bDMARD treatment sequences were associated with greater treatment benefit (i.e. more QALYs), lower lost productivity costs, and greater treatment-related costs than cDMARDs. The expected ICERs for bDMARD sequences ranged from ∼$126,000 to $140,000 per QALY gained, which is below the US-specific WTP. Alternative scenarios examining the effects of homogeneous patients, dose increases, increased costs of hospitalization for severely physically impaired patients, and a lower baseline Health Assessment Questionnaire (HAQ) Disability Index score resulted in similar ICERs. bDMARD treatment sequences are cost-effective from a US societal perspective.
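The ICER comparison described above reduces to a ratio of incremental cost to incremental QALYs, checked against the willingness-to-pay threshold. The sketch below uses the standard definition; the function names and the per-patient cost and QALY inputs are hypothetical placeholders, and only the $150,000/QALY threshold and the reported ICER range come from the abstract.

```python
def icer(cost_new, cost_ref, qaly_new, qaly_ref):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    delta_cost = cost_new - cost_ref
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly <= 0:
        # The ratio is not interpretable without a QALY gain.
        raise ValueError("new strategy must gain QALYs over the reference")
    return delta_cost / delta_qaly


def is_cost_effective(icer_value, wtp_threshold=150_000.0):
    """True when the ICER does not exceed the willingness-to-pay threshold."""
    return icer_value <= wtp_threshold


# Hypothetical lifetime totals per patient (NOT values from the study).
example_icer = icer(cost_new=400_000.0, cost_ref=200_000.0,
                    qaly_new=9.5, qaly_ref=8.0)
example_ok = is_cost_effective(example_icer)

# The ICER range reported in the abstract, checked against the threshold.
reported_ok = all(is_cost_effective(v) for v in (126_000.0, 140_000.0))
```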
An online supervised learning method based on gradient descent for spiking neurons.
Xu, Yan; Yang, Jing; Zhong, Shuiming
2017-09-01
The purpose of supervised learning with temporal encoding for spiking neurons is to make the neurons emit a specific spike train encoded by precise firing times of spikes. Gradient-descent-based (GDB) learning methods are widely used and verified in current research. Although the existing GDB multi-spike learning (or spike sequence learning) methods perform well, they work in an offline manner and still have some limitations. This paper proposes an online GDB spike sequence learning method for spiking neurons that is based on the online adjustment mechanism of real biological neuron synapses. The method constructs an error function and calculates the adjustment of synaptic weights as soon as the neuron emits a spike during its running process. We analyze and synthesize desired and actual output spikes to select appropriate input spikes in the calculation of the weight adjustment. The experimental results show that our method clearly improves learning performance compared with offline learning and has some advantage in learning accuracy over other learning methods. This stronger learning ability gives the method a large pattern storage capacity. Copyright © 2017 Elsevier Ltd. All rights reserved.
Manara, Richard M A; Guy, Andrew T; Wallace, E Jayne; Khalid, Syma
2015-02-10
Next generation DNA sequencing methods that utilize protein nanopores have the potential to revolutionize this area of biotechnology. While the technique is underpinned by simple physics, the wild-type protein pores do not have all of the desired properties for efficient and accurate DNA sequencing. Much of the research efforts have focused on protein nanopores, such as α-hemolysin from Staphylococcus aureus. However, the speed of DNA translocation has historically been an issue, hampered in part by incomplete knowledge of the energetics of translocation. Here we have utilized atomistic molecular dynamics simulations of nucleotide fragments in order to calculate the potential of mean force (PMF) through α-hemolysin. Our results reveal specific regions within the pore that play a key role in the interaction with DNA. In particular, charged residues such as D127 and K131 provide stabilizing interactions with the anionic DNA and therefore are likely to reduce the speed of translocation. These regions provide rational targets for pore optimization. Furthermore, we show that the energetic contributions to the protein-DNA interactions are a complex combination of electrostatics and short-range interactions, often mediated by water molecules.
Cleavage Entropy as Quantitative Measure of Protease Specificity
Fuchs, Julian E.; von Grafenstein, Susanne; Huber, Roland G.; Margreiter, Michael A.; Spitzer, Gudrun M.; Wallnoefer, Hannes G.; Liedl, Klaus R.
2013-01-01
A purely information theory-guided approach to quantitatively characterize protease specificity is established. We calculate an entropy value for each protease subpocket based on sequences of cleaved substrates extracted from the MEROPS database. We compare our results with known subpocket specificity profiles for individual proteases and protease groups (e.g. serine proteases, metallo proteases) and reflect them quantitatively. Summation of subpocket-wise cleavage entropy contributions yields a measure for overall protease substrate specificity. This total cleavage entropy allows ranking of different proteases with respect to their specificity, separating unspecific digestive enzymes showing high total cleavage entropy from specific proteases involved in signaling cascades. The development of a quantitative cleavage entropy score allows an unbiased comparison of subpocket-wise and overall protease specificity. Thus, it enables assessment of relative importance of physicochemical and structural descriptors in protease recognition. We present an exemplary application of cleavage entropy in tracing substrate specificity in protease evolution. This highlights the wide range of substrate promiscuity within homologue proteases and hence the heavy impact of a limited number of mutations on individual substrate specificity. PMID:23637583
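The idea of summing subpocket-wise entropies can be sketched as follows. This is a minimal illustration, assuming a Shannon entropy over the amino acid frequencies observed at each substrate position, normalised with a base-20 logarithm so that each subpocket contributes a value between 0 and 1; the normalisation, the function names and the toy substrate strings are assumptions of this sketch and not data from MEROPS.

```python
import math
from collections import Counter


def subpocket_entropy(residues, alphabet_size=20):
    """Shannon entropy of the residues seen at one substrate position.

    Uses log base `alphabet_size`, so 0 means one residue only and 1
    means all 20 amino acids are equally frequent.
    """
    counts = Counter(residues)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total, alphabet_size)
                for c in counts.values())


def total_cleavage_entropy(substrates):
    """Sum the per-position entropies over aligned, equal-length substrates."""
    columns = zip(*substrates)  # one tuple of residues per subpocket
    return sum(subpocket_entropy(column) for column in columns)


# Toy aligned cleavage sites (hypothetical, four positions each).
specific_protease = ["AAKR", "AAKR", "AAKR", "AAKR"]
promiscuous_protease = ["AGKR", "LGDS", "VTEQ", "PWHN"]

specific_score = total_cleavage_entropy(specific_protease)
promiscuous_score = total_cleavage_entropy(promiscuous_protease)
```

In this toy case the protease that always cleaves the same motif scores zero, while the one accepting varied residues scores higher, which matches the ranking logic described in the abstract.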
Schmidt Am Busch, Marcel; Sedano, Audrey; Simonson, Thomas
2010-05-05
Protein fold recognition usually relies on a statistical model of each fold; each model is constructed from an ensemble of natural sequences belonging to that fold. A complementary strategy may be to employ sequence ensembles produced by computational protein design. Designed sequences can be more diverse than natural sequences, possibly avoiding some limitations of experimental databases. We explore this strategy for four SCOP families: Small Kunitz-type inhibitors (SKIs), Interleukin-8 chemokines, PDZ domains, and large Caspase catalytic subunits, represented by 43 structures. An automated procedure is used to redesign the 43 proteins. We use the experimental backbones as fixed templates in the folded state and a molecular mechanics model to compute the interaction energies between sidechain and backbone groups. Calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is used to scan the sequence and conformational space, yielding 200,000-300,000 sequences per backbone template. The results confirm and generalize our earlier study of SH2 and SH3 domains. The designed sequences resemble moderately-distant, natural homologues of the initial templates; e.g., the SUPERFAMILY, profile Hidden-Markov Model library recognizes 85% of the low-energy sequences as native-like. Conversely, Position Specific Scoring Matrices derived from the sequences can be used to detect natural homologues within the SwissProt database: 60% of known PDZ domains are detected and around 90% of known SKIs and chemokines. Energy components and inter-residue correlations are analyzed and ways to improve the method are discussed. For some families, designed sequences can be a useful complement to experimental ones for homologue searching. However, improved tools are needed to extract more information from the designed profiles before the method can be of general use.
Pan, Xiaoyu; Zhang, Chunlei; Li, Xuchao; Chen, Shengpei; Ge, Huijuan; Zhang, Yanyan; Chen, Fang; Jiang, Hui; Jiang, Fuman; Zhang, Hongyun; Wang, Wei; Zhang, Xiuqing
2014-12-01
To develop a fetal sex determination method based on maternal plasma sequencing (MPS), assess its performance and potential use in X-linked disorder counseling. 900 cases of MPS data from a previous study were reviewed, in which 100 and 800 cases were used as training and validation set, respectively. The percentage of uniquely mapped sequencing reads on Y chromosome was calculated and used to classify male and female cases. Eight pregnant women who are carriers of Duchenne muscular dystrophy (DMD) mutations were recruited, whose plasma were subjected to multiplex sequencing and fetal sex determination analysis. In the training set, a sensitivity of 96% and false positive rate of 0% for male cases detection were reached in our method. The blinded validation results showed 421 in 423 male cases and 374 in 377 female cases were successfully identified, revealing sensitivity and specificity of 99.53% and 99.20% for fetal sex determination, at as early as 12 gestational weeks. Fetal sex for all eight DMD genetic counseling cases were correctly identified, which were confirmed by amniocentesis. Based on MPS, high accuracy of non-invasive fetal sex determination can be achieved. This method can potentially be used for prenatal genetic counseling.
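The classification rule and the validation figures above can be reproduced with a short sketch. It assumes a single cutoff on the fraction of uniquely mapped reads that fall on chromosome Y; the cutoff value and read counts used here are hypothetical placeholders, since the abstract does not report them. Only the validation counts (421 of 423 male cases, 374 of 377 female cases) come from the text.

```python
def y_read_fraction(y_reads, total_unique_reads):
    """Fraction of uniquely mapped reads that align to chromosome Y."""
    return y_reads / total_unique_reads


def classify_fetal_sex(fraction, threshold):
    """Call 'male' when the Y fraction reaches the cutoff, else 'female'."""
    return "male" if fraction >= threshold else "female"


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cutoff and read counts, for illustration only.
HYPOTHETICAL_THRESHOLD = 0.0005
call = classify_fetal_sex(y_read_fraction(4_000, 5_000_000),
                          HYPOTHETICAL_THRESHOLD)

# Validation counts from the abstract, with male treated as the positive class.
sens, spec = sensitivity_specificity(tp=421, fn=2, tn=374, fp=3)
sens_pct = round(sens * 100, 2)
spec_pct = round(spec * 100, 2)
```

Rounded to two decimals, these counts give the 99.53% sensitivity and 99.20% specificity quoted in the abstract.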
The Impact of Normalization Methods on RNA-Seq Data Analysis
Zyprych-Walczak, J.; Szabelska, A.; Handschuh, L.; Górczak, K.; Klamecka, K.; Figlerowicz, M.; Siatkowski, I.
2015-01-01
High-throughput sequencing technologies, such as the Illumina Hi-seq, are powerful new tools for investigating a wide range of biological and medical problems. Massive and complex data sets produced by the sequencers create a need for development of statistical and computational methods that can tackle the analysis and management of data. The data normalization is one of the most crucial steps of data processing and this process must be carefully considered as it has a profound effect on the results of the analysis. In this work, we focus on a comprehensive comparison of five normalization methods related to sequencing depth, widely used for transcriptome sequencing (RNA-seq) data, and their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow that can be applied for the selection of the optimal normalization procedure for any particular data set. The described workflow includes calculation of the bias and variance values for the control genes, sensitivity and specificity of the methods, and classification errors as well as generation of the diagnostic plots. Combining the above information facilitates the selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably. PMID:26176014
RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries.
Habegger, Lukas; Sboner, Andrea; Gianoulis, Tara A; Rozowsky, Joel; Agarwal, Ashish; Snyder, Michael; Gerstein, Mark
2011-01-15
The advent of next-generation sequencing for functional genomics has given rise to quantities of sequence information that are often so large that they are difficult to handle. Moreover, sequence reads from a specific individual can contain sufficient information to potentially identify and genetically characterize that person, raising privacy concerns. In order to address these issues, we have developed the Mapped Read Format (MRF), a compact data summary format for both short and long read alignments that enables the anonymization of confidential sequence information, while allowing one to still carry out many functional genomics studies. We have developed a suite of tools (RSEQtools) that use this format for the analysis of RNA-Seq experiments. These tools consist of a set of modules that perform common tasks such as calculating gene expression values, generating signal tracks of mapped reads and segmenting that signal into actively transcribed regions. Moreover, the tools can readily be used to build customizable RNA-Seq workflows. In addition to the anonymization afforded by MRF, this format also facilitates the decoupling of the alignment of reads from downstream analyses. RSEQtools is implemented in C and the source code is available at http://rseqtools.gersteinlab.org/.
E-RNAi: a web application for the multi-species design of RNAi reagents—2010 update
Horn, Thomas; Boutros, Michael
2010-01-01
The design of RNA interference (RNAi) reagents is an essential step for performing loss-of-function studies in many experimental systems. The availability of sequenced and annotated genomes greatly facilitates RNAi experiments in an increasing number of organisms that were previously not genetically tractable. The E-RNAi web-service, accessible at http://www.e-rnai.org/, provides a computational resource for the optimized design and evaluation of RNAi reagents. The 2010 update of E-RNAi now covers 12 genomes, including Drosophila, Caenorhabditis elegans, human, emerging model organisms such as Schmidtea mediterranea and Acyrthosiphon pisum, as well as the medically relevant vectors Anopheles gambiae and Aedes aegypti. The web service calculates RNAi reagents based on the input of target sequences, sequence identifiers or by visual selection of target regions through a genome browser interface. It identifies optimized RNAi target-sites by ranking sequences according to their predicted specificity, efficiency and complexity. E-RNAi also facilitates the design of secondary RNAi reagents for validation experiments, evaluation of pooled siRNA reagents and batch design. Results are presented online, as a downloadable HTML report and as tab-delimited files. PMID:20444868
Analysis and Visualization of ChIP-Seq and RNA-Seq Sequence Alignments Using ngs.plot.
Loh, Yong-Hwee Eddie; Shen, Li
2016-01-01
The continual maturation and increasing applications of next-generation sequencing technology in scientific research have yielded ever-increasing amounts of data that need to be effectively and efficiently analyzed and innovatively mined for new biological insights. We have developed ngs.plot-a quick and easy-to-use bioinformatics tool that performs visualizations of the spatial relationships between sequencing alignment enrichment and specific genomic features or regions. More importantly, ngs.plot is customizable beyond the use of standard genomic feature databases to allow the analysis and visualization of user-specified regions of interest generated by the user's own hypotheses. In this protocol, we demonstrate and explain the use of ngs.plot using command line executions, as well as a web-based workflow on the Galaxy framework. We replicate the underlying commands used in the analysis of a true biological dataset that we had reported and published earlier and demonstrate how ngs.plot can easily generate publication-ready figures. With ngs.plot, users would be able to efficiently and innovatively mine their own datasets without having to be involved in the technical aspects of sequence coverage calculations and genomic databases.
Evolutionary sequences and isochrones for low- and intermediate-mass stars (Secuencias evolutivas e isocronas para estrellas de baja masa e intermedia)
NASA Astrophysics Data System (ADS)
Panei, J.; Baume, G.
2016-08-01
We present theoretical evolutionary sequences for low- and intermediate-mass stars. The masses calculated range from 1.7 to 10 M. The initial chemical composition is . In addition, we have taken into account a nuclear network with 17 isotopes and 34 nuclear reactions. With respect to the mix, we considered overshooting with a parameter . The evolutionary calculations were initialized from the region of instability of Hayashi, in order to calculate isochrones of pre-sequence, too.
Thomas, David; Finan, Chris; Newport, Melanie J; Jones, Susan
2015-10-01
The complexity of DNA can be quantified using estimates of entropy. Variation in DNA complexity is expected between the promoters of genes with different transcriptional mechanisms, namely housekeeping (HK) and tissue-specific (TS) genes. The former are transcribed constitutively to maintain general cellular functions, and the latter are transcribed in restricted tissues and cell types for specific molecular events. It is known that promoter features in the human genome are related to tissue specificity, but this has been difficult to quantify on a genomic scale. If entropy effectively quantifies DNA complexity, calculating the entropies of HK and TS gene promoters as profiles may reveal significant differences. Entropy profiles were calculated for a total dataset of 12,003 human gene promoters and for 501 housekeeping (HK) and 587 tissue-specific (TS) human gene promoters. The mean profiles show that the TS promoters have a significantly lower entropy (p<2.2e-16) than HK gene promoters. The entropy distributions for the 3 datasets show that promoter entropies could be used to identify novel HK genes. Functional features comprise DNA sequence patterns that are non-random and hence have lower entropies. The lower entropy of TS gene promoters can be explained by a higher density of positive and negative regulatory elements, required for genes with complex spatial and temporal expression. Copyright © 2015 Elsevier Ltd. All rights reserved.
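As an illustration of what an entropy profile along a promoter can look like, here is a minimal sketch. It assumes plain Shannon entropy of single-nucleotide frequencies in a sliding window; the abstract does not say which entropy estimator or window size the authors used, so both are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits) of the symbol frequencies in seq."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    # Sum of p * log2(1/p) over observed symbols.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_profile(seq, window=4, step=1):
    """Entropy of each window along seq (window and step are assumed values)."""
    return [
        shannon_entropy(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a low-complexity run followed by a mixed region.
profile = entropy_profile("AAAAACGT", window=4)
print(profile[0], profile[-1])  # 0.0 for "AAAA", 2.0 for "ACGT"
```

A repetitive window scores near 0 bits and a window with all four bases equally represented scores 2 bits, which matches the abstract's point that non-random functional patterns lower the entropy.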
Shmelkov, Evgeny; Krachmarov, Chavdar; Grigoryan, Arsen V.; Pinter, Abraham; Statnikov, Alexander; Cardozo, Timothy
2014-01-01
The extreme diversity of HIV-1 strains presents a formidable challenge for HIV-1 vaccine design. Although antibodies (Abs) can neutralize HIV-1 and potentially protect against infection, antibodies that target the immunogenic viral surface protein gp120 have widely variable and poorly predictable cross-strain reactivity. Here, we developed a novel computational approach, the Method of Dynamic Epitopes, for identification of neutralization epitopes targeted by anti-HIV-1 monoclonal antibodies (mAbs). Our data demonstrate that this approach, based purely on calculated energetics and 3D structural information, accurately predicts the presence of neutralization epitopes targeted by V3-specific mAbs 2219 and 447-52D in any HIV-1 strain. The method was used to calculate the range of conservation of these specific epitopes across all circulating HIV-1 viruses. Accurately identifying an Ab-targeted neutralization epitope in a virus by computational means enables easy prediction of the breadth of reactivity of specific mAbs across the diversity of thousands of different circulating HIV-1 variants and facilitates rational design and selection of immunogens mimicking specific mAb-targeted epitopes in a multivalent HIV-1 vaccine. The defined epitopes can also be used for the purpose of epitope-specific analyses of breakthrough sequences recorded in vaccine clinical trials. Thus, our study is a prototype for a valuable tool for rational HIV-1 vaccine design. PMID:24587168
Nakazato, Takeru; Bono, Hidemasa
2017-01-01
It is important for public data repositories to promote the reuse of archived data. In the growing field of omics science, however, the increasing number of submissions of high-throughput sequencing (HTSeq) data to public repositories prevents users from choosing a suitable data set from among the large number of search results. Repository users need to be able to set a threshold to reduce the number of results to obtain a suitable subset of high-quality data for reanalysis. We calculated the quality of sequencing data archived in a public data repository, the Sequence Read Archive (SRA), by using the quality control software FastQC. We obtained quality values for 1 171 313 experiments, which can be used to evaluate the suitability of data for reuse. We also visualized the data distribution in SRA by integrating the quality information and metadata of experiments and samples. We provide quality information for all of the archived sequencing data, which enables users to obtain sequencing data of sufficient quality for reanalysis. The calculated quality data are available to the public in various formats. Our data also provide an example of enhancing the reuse of public data by adding metadata to published research data by a third party. PMID:28449062
RNA2DMut: a web tool for the design and analysis of RNA structure mutations.
Moss, Walter N
2018-03-01
With the widespread application of high-throughput sequencing, novel RNA sequences are being discovered at an astonishing rate. The analysis of function, however, lags behind. In both the cis- and trans-regulatory functions of RNA, secondary structure (2D base-pairing) plays essential regulatory roles. In order to test RNA function, it is essential to be able to design and analyze mutations that can affect structure. This was the motivation for the creation of the RNA2DMut web tool. With RNA2DMut, users can enter RNA sequences to analyze, constrain mutations to specific residues, or limit changes to purines/pyrimidines. The sequence is analyzed at each base to determine the effect of every possible point mutation on 2D structure. The metrics used in RNA2DMut rely on the calculation of the Boltzmann structure ensemble and do not require a robust 2D model of RNA structure for designing mutations. This tool can facilitate a wide array of uses involving RNA: for example, in designing and evaluating mutants for biological assays, interrogating RNA-protein interactions, identifying key regions to alter in SELEX experiments, and improving RNA folding and crystallization properties for structural biology. Additional tools are available to help users introduce other mutations (e.g., indels and substitutions) and evaluate their effects on RNA structure. Example calculations are shown for five RNAs that require 2D structure for their function: the MALAT1 mascRNA, an influenza virus splicing regulatory motif, the EBER2 viral noncoding RNA, the Xist lncRNA repA region, and human Y RNA 5. RNA2DMut can be accessed at https://rna2dmut.bb.iastate.edu/. © 2018 Moss; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
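The enumeration step described above (every possible point mutation, optionally restricted to certain residues or to purine/pyrimidine-preserving changes) can be sketched as follows. This is not RNA2DMut's code: the function and its parameters are hypothetical, the reading of "limit changes to purines/pyrimidines" as "replace a base only with one of the same class" is an assumption, and the structural scoring via the Boltzmann ensemble is left out because it needs a folding package.

```python
PURINES = "AG"


def point_mutants(seq, positions=None, within_class=False):
    """Yield (index, new_base, mutant_sequence) for single-base substitutions.

    positions: optional iterable of 0-based indices to restrict mutations to.
    within_class: if True, keep only purine->purine or pyrimidine->pyrimidine
    changes (an assumed interpretation of the tool's option).
    """
    seq = seq.upper()
    if positions is None:
        positions = range(len(seq))
    for i in positions:
        for base in "ACGU":
            if base == seq[i]:
                continue  # not a mutation
            if within_class and (base in PURINES) != (seq[i] in PURINES):
                continue  # would switch between purine and pyrimidine
            yield i, base, seq[:i] + base + seq[i + 1:]


all_mutants = list(point_mutants("GAUC"))
same_class = list(point_mutants("GAUC", within_class=True))
print(len(all_mutants), len(same_class))  # 12 4
```

Each mutant sequence would then be passed to a structure-ensemble calculation and compared against the wild type.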
Bayer, Thomas; Adler, Werner; Janka, Rolf; Uder, Michael; Roemer, Frank
2017-12-01
To study the feasibility of magnetic resonance cinematography of the fingers (MRCF) with comparison of image quality of different protocols for depicting the finger anatomy during motion. MRCF was performed during a full flexion and extension movement in 14 healthy volunteers using a finger-gating device. Three real-time sequences (frame rates 17-59 images/min) and one proton density (PD) sequence (3 images/min) were acquired during incremental and continuous motion. Analyses were performed independently by three readers. Qualitative image analysis included Likert-scale grading from 0 (useless) to 5 (excellent) and specific visual analog scale (VAS) grading from 0 (insufficient) to 100 (excellent). Signal-to-noise calculation was performed. Overall percentage agreement and mean absolute disagreement were calculated. Within the real-time sequences a high frame-rate true fast imaging with steady-state free precession (TRUFI) yielded the best image quality with Likert and overall VAS scores of 3.0 ± 0.2 and 60.4 ± 25.3, respectively. The best sequence regarding image quality was an incremental PD with mean values of 4.8 ± 0.2 and 91.2 ± 9.4, respectively. Overall percentage agreement and mean absolute disagreement were 47.9 and 0.7, respectively. No statistically significant SNR differences were found between continuous and incremental motion for the real-time protocols. MRCF is feasible with appropriate image quality during continuous motion using a finger-gating device. Almost perfect image quality is achievable with incremental PD imaging, which represents a compromise for MRCF with the drawback of prolonged scanning time.
Meereis, Florian; Kaufmann, Michael
2004-10-15
The rapidly increasing number of completely sequenced genomes led to the establishment of the COG database, which, based on sequence homologies, assigns similar proteins from different organisms to clusters of orthologous groups (COGs). Several bioinformatic studies have used this database to determine (hyper)thermophile-specific proteins by searching for COGs containing (almost) exclusively proteins from (hyper)thermophilic genomes. However, public software to perform individually definable group-specific searches is not available. The tool described here fills exactly this gap. The software is accessible at http://www.uni-wh.de/pcogr and is linked to the COG database. The user can freely define two groups of organisms by assigning each of the (currently) 66 organisms to groupA, to the reference groupB, or to be ignored by the algorithm. Then, for all COGs, a specificity index is calculated with respect to groupA, i.e., high-scoring COGs contain proteins from most groupA organisms while proteins from most groupB organisms are absent. In addition to ranking all COGs according to the user-defined specificity criteria, a graphical visualization shows the distribution of all COGs by displaying their abundance as a function of their specificity indexes. This software allows detecting COGs specific to a predefined group of organisms. All COGs are ranked in the order of their specificity, and a graphical visualization allows recognizing (i) the presence and abundance of such COGs and (ii) the phylogenetic relationship between groupA and groupB organisms. The software also allows detecting putative protein-protein interactions, novel enzymes involved in only partially known biochemical pathways, and alternate enzymes originated by convergent evolution.
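The abstract does not give the formula for the specificity index, so the following is only a hypothetical sketch of one index with the stated behaviour: it scores high when a COG covers most groupA organisms and few groupB organisms. The index definition, organism labels, and COG names are all illustrative assumptions.

```python
def specificity_index(cog_organisms, group_a, group_b):
    """Hypothetical index: fraction of groupA covered minus fraction of groupB covered.

    Ranges from -1 (only groupB, all of it) to +1 (all of groupA, none of groupB).
    Organisms in neither group are ignored.
    """
    present = set(cog_organisms)
    frac_a = len(present & group_a) / len(group_a)
    frac_b = len(present & group_b) / len(group_b)
    return frac_a - frac_b


# Toy data with made-up organism and COG labels.
group_a = {"orgA1", "orgA2"}
group_b = {"orgB1", "orgB2"}
cogs = {
    "COG_x": {"orgA1", "orgA2"},
    "COG_y": {"orgA1", "orgB1", "orgB2"},
    "COG_z": {"orgA1", "orgA2", "orgB1"},
}

# Rank COGs from most to least groupA-specific.
ranking = sorted(
    cogs,
    key=lambda name: specificity_index(cogs[name], group_a, group_b),
    reverse=True,
)
print(ranking)  # ['COG_x', 'COG_z', 'COG_y']
```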
On the relationship between residue structural environment and sequence conservation in proteins.
Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao
2017-09-01
Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated based only on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, here we compared Cα atoms with other substructures in their contributions to the sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure-conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone could reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
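For readers unfamiliar with the weighted contact number, a minimal sketch follows. It assumes the commonly used definition, in which the WCN of residue i is the sum over all other residues j of 1/r_ij², with r_ij the distance between Cα atoms; the abstract itself does not spell out the formula, and the coordinates below are toy values.

```python
def weighted_contact_numbers(coords):
    """WCN per residue from Cα coordinates, assuming WCN_i = sum_{j != i} 1 / r_ij^2.

    coords: list of (x, y, z) tuples, one per residue, all distinct.
    """
    n = len(coords)
    wcn = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Squared Euclidean distance between residues i and j.
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            wcn[i] += 1.0 / d2
            wcn[j] += 1.0 / d2
    return wcn


# Three collinear toy "residues" spaced 1 unit apart.
toy = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(weighted_contact_numbers(toy))  # [1.25, 2.0, 1.25]
```

The central residue has the highest value because it is the most densely surrounded, which is the sense in which WCN measures packing density.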
Liquid-gas phase transition in asymmetric nuclear matter at finite temperature
NASA Astrophysics Data System (ADS)
Maruyama, Toshiki; Tatsumi, Toshitaka; Chiba, Satoshi
2010-03-01
The liquid-gas phase transition in warm asymmetric nuclear matter is discussed. Some peculiar features are identified from the viewpoint of the basic thermodynamics of phase equilibrium. We treat the mixed phase of the binary system based on the Gibbs conditions. When the Coulomb interaction is included, the mixed phase is no longer uniform and a sequence of pasta structures appears. Comparing the results with those given by the simple bulk calculation without the Coulomb interaction, we extract specific features of the pasta structures at finite temperature.
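For reference, the Gibbs conditions for coexistence of two bulk phases in a binary (neutron-proton) system can be written in their standard textbook form. The superscripts L and G for the liquid and gas phases are notation assumed here, not taken from the paper, and this form applies to the bulk calculation without the Coulomb interaction.

```latex
\begin{aligned}
\mu_n^{L} &= \mu_n^{G}, \\
\mu_p^{L} &= \mu_p^{G}, \\
P^{L} &= P^{G}, \\
T^{L} &= T^{G}.
\end{aligned}
```

Here \mu_n and \mu_p are the neutron and proton chemical potentials, P is the pressure, and T is the temperature.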
Chronic exposure to water pollutant trichloroethylene increased epigenetic drift in CD4+ T cells
Gilbert, Kathleen M; Blossom, Sarah J; Erickson, Stephen W; Reisfeld, Brad; Zurlinden, Todd J; Broadfoot, Brannon; West, Kirk; Bai, Shasha; Cooney, Craig A
2016-01-01
Aim: Autoimmune disease and CD4+ T-cell alterations are induced in mice exposed to the water pollutant trichloroethylene (TCE). We examined here whether TCE altered gene-specific DNA methylation in CD4+ T cells as a possible mechanism of immunotoxicity. Materials & methods: Naive and effector/memory CD4+ T cells from mice exposed to TCE (0.5 mg/ml in drinking water) for 40 weeks were examined by bisulfite next-generation DNA sequencing. Results: A probabilistic model calculated from multiple genes showed that TCE decreased methylation control in CD4+ T cells. Data from individual genes fitted to a quadratic regression model showed that TCE increased gene-specific methylation variance in both CD4 subsets. Conclusion: TCE increased epigenetic drift of specific CpG sites in CD4+ T cells. PMID:27092578
Galson, Jacob D; Trück, Johannes; Fowler, Anna; Clutterbuck, Elizabeth A; Münz, Márton; Cerundolo, Vincenzo; Reinhard, Claudia; van der Most, Robbert; Pollard, Andrew J; Lunter, Gerton; Kelly, Dominic F
2015-12-01
Generating a diverse B cell immunoglobulin repertoire is essential for protection against infection. The repertoire in humans can now be comprehensively measured by high-throughput sequencing. Using hepatitis B vaccination as a model, we determined how the total immunoglobulin sequence repertoire changes following antigen exposure in humans, and compared this to sequences from vaccine-specific sorted cells. Clonal sequence expansions were seen 7 days after vaccination, which correlated with vaccine-specific plasma cell numbers. These expansions caused an increase in mutation, and a decrease in diversity and complementarity-determining region 3 sequence length in the repertoire. We also saw an increase in sequence convergence between participants 14 and 21 days after vaccination, coinciding with an increase of vaccine-specific memory cells. These features allowed development of a model for in silico enrichment of vaccine-specific sequences from the total repertoire. Identifying antigen-specific sequences from total repertoire data could aid our understanding of B cell-driven immunity, and be used for disease diagnostics and vaccine evaluation.
Testing the Predictive Power of Coulomb Stress on Aftershock Sequences
NASA Astrophysics Data System (ADS)
Woessner, J.; Lombardi, A.; Werner, M. J.; Marzocchi, W.
2009-12-01
Empirical and statistical models of clustered seismicity are usually strongly stochastic and perceived to be uninformative in their forecasts, since only marginal distributions are used, such as the Omori-Utsu and Gutenberg-Richter laws. In contrast, so-called physics-based aftershock models, based on seismic rate changes calculated from Coulomb stress changes and rate-and-state friction, make more specific predictions: anisotropic stress shadows and multiplicative rate changes. We test the predictive power of models based on Coulomb stress changes against statistical models, including the popular Short Term Earthquake Probabilities and Epidemic-Type Aftershock Sequences models: We score and compare retrospective forecasts on the aftershock sequences of the 1992 Landers, USA, the 1997 Colfiorito, Italy, and the 2008 Selfoss, Iceland, earthquakes. To quantify predictability, we use likelihood-based metrics that test the consistency of the forecasts with the data, including modified and existing tests used in prospective forecast experiments within the Collaboratory for the Study of Earthquake Predictability (CSEP). Our results indicate that a statistical model performs best. Moreover, two Coulomb model classes seem unable to compete: Models based on deterministic Coulomb stress changes calculated from a given fault-slip model, and those based on fixed receiver faults. One model of Coulomb stress changes does perform well and sometimes outperforms the statistical models, but its predictive information is diluted, because of uncertainties included in the fault-slip model. Our results suggest that models based on Coulomb stress changes need to incorporate stochastic features that represent model and data uncertainty.
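Two ingredients mentioned above can be sketched briefly: the Omori-Utsu aftershock decay law and a likelihood-based score for a forecast. The sketch assumes the standard Omori-Utsu form n(t) = K / (t + c)^p and an independent Poisson likelihood per forecast bin, which is the usual basis of CSEP-style likelihood tests; the parameter values and bin counts are made up for illustration and are not from the study.

```python
import math


def omori_utsu_rate(t, K, c, p):
    """Aftershock rate at time t after the mainshock: K / (t + c)**p."""
    return K / (t + c) ** p


def poisson_log_likelihood(forecast_rates, observed_counts):
    """Joint log-likelihood of observed counts under independent Poisson bins.

    forecast_rates: expected number of events per bin (each > 0).
    observed_counts: observed number of events per bin (non-negative ints).
    """
    total = 0.0
    for lam, n in zip(forecast_rates, observed_counts):
        # log of exp(-lam) * lam**n / n!
        total += -lam + n * math.log(lam) - math.lgamma(n + 1)
    return total


# Illustrative comparison of two hypothetical forecasts on the same data.
observed = [2, 2]
good = poisson_log_likelihood([2.0, 2.0], observed)
poor = poisson_log_likelihood([10.0, 10.0], observed)
print(good > poor)  # True: the forecast closer to the data scores higher
```

Comparing such log-likelihoods across competing models on the same bins is the basic idea behind scoring retrospective forecasts against one another.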
Chang, Chun-Chun; Hsu, Hao-Jen; Yen, Jui-Hung; Lo, Shih-Yen
2017-01-01
Hepatitis C virus (HCV) is a species-specific pathogenic virus that infects only humans and chimpanzees. Previous studies have indicated that interactions between the HCV E2 protein and CD81 on host cells are required for HCV infection. To determine the crucial factors for species-specific interactions at the molecular level, this study employed in silico molecular docking involving molecular dynamic simulations of the binding of HCV E2 onto human and rat CD81s. In vitro experiments including surface plasmon resonance measurements and cellular binding assays were applied for simple validations of the in silico results. The in silico studies identified two binding regions on the HCV E2 loop domain, namely E2-site1 and E2-site2, as being crucial for the interactions with CD81s, with the E2-site2 as the determinant factor for human-specific binding. Free energy calculations indicated that the E2/CD81 binding process might follow a two-step model involving (i) the electrostatic interaction-driven initial binding of human-specific E2-site2, followed by (ii) changes in the E2 orientation to facilitate the hydrophobic and van der Waals interaction-driven binding of E2-site1. The sequence of the human-specific, stronger-binding E2-site2 could serve as a candidate template for the future development of HCV-inhibiting peptide drugs. PMID:28481946
BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing
Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph
2011-01-01
Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
Böker, Sarah M.; Bender, Yvonne Y.; Diederichs, Gerd; Fallenberg, Eva M.; Wagner, Moritz; Hamm, Bernd; Makowski, Marcus R.
2017-01-01
Objectives To determine the diagnostic performance of susceptibility-weighted magnetic resonance imaging (SWMR) for the detection of pineal gland calcifications (PGC) compared to conventional magnetic resonance imaging (MRI) sequences, using computed tomography (CT) as a reference standard. Methods 384 patients who received a 1.5 Tesla MRI scan including SWMR sequences and a CT scan of the brain between January 2014 and October 2016 were retrospectively evaluated. 346 patients were included in the analysis, of which 214 showed PGC on CT scans. To assess correlation between imaging modalities, the maximum calcification diameter was used. Sensitivity and specificity and intra- and interobserver reliability were calculated for SWMR and conventional MRI sequences. Results SWMR reached a sensitivity of 95% (95% CI: 91%-97%) and a specificity of 96% (95% CI: 91%-99%) for the detection of PGC, whereas conventional MRI achieved a sensitivity of 43% (95% CI: 36%-50%) and a specificity of 96% (95% CI: 91%-99%). Detection rates for calcifications in SWMR and conventional MRI differed significantly (95% versus 43%, p<0.001). Diameter measurements between SWMR and CT showed a close correlation (R2 = 0.85, p<0.001) with a slight but not significant overestimation of size (SWMR: 6.5 mm ± 2.5; CT: 5.9 mm ± 2.4, p = 0.02). Interobserver-agreement for diameter measurements was excellent on SWMR (ICC = 0.984, p < 0.0001). Conclusions Combining SWMR magnitude and phase information enables the accurate detection of PGC and offers a better diagnostic performance than conventional MRI with CT as a reference standard. PMID:28278291
JCoDA: a tool for detecting evolutionary selection.
Steinway, Steven N; Dannenfelser, Ruth; Laucius, Christopher D; Hayes, James E; Nayak, Sudhir
2010-05-27
The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences. JCoDA accepts user-inputted unaligned or pre-aligned coding sequences, performs a codon-delimited alignment using ClustalW, and determines the dN/dS calculations using PAML (Phylogenetic Analysis Using Maximum Likelihood, yn00 and codeml) in order to identify regions and sites under evolutionary selection. The JCoDA package includes a graphical interface for Phylip (Phylogeny Inference Package) to generate phylogenetic trees, manages formatting of all required file types, and streamlines passage of information between underlying programs. The raw data are output to user configurable graphs with sliding window options for straightforward visualization of pairwise or gene family comparisons. Additionally, codon-delimited alignments are output in a variety of common formats and all dN/dS calculations can be output in comma-separated value (CSV) format for downstream analysis. To illustrate the types of analyses that are facilitated by JCoDA, we have taken advantage of the well studied sex determination pathway in nematodes as well as the extensive sequence information available to identify genes under positive selection, examples of regional positive selection, and differences in selection based on the role of genes in the sex determination pathway. 
JCoDA is a configurable, open source, user-friendly visualization tool for performing evolutionary analysis on homologous coding sequences. JCoDA can be used to rapidly screen for genes and regions of genes under selection using PAML. It can be freely downloaded at http://www.tcnj.edu/~nayaklab/jcoda. PMID:20507581
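The sliding-window view of dN/dS values mentioned in the JCoDA abstract can be illustrated with a short sketch. This is not JCoDA's implementation: it assumes a simple unweighted mean over a fixed window of per-position values, and the input numbers are invented.

```python
def sliding_window_mean(values, window, step=1):
    """Mean of each consecutive window of `window` values, advancing by `step`."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(0, len(values) - window + 1, step)
    ]


# Hypothetical per-window dN/dS estimates along a coding sequence.
dn_ds = [1.0, 2.0, 3.0, 4.0]
smoothed = sliding_window_mean(dn_ds, window=2)
print(smoothed)  # [1.5, 2.5, 3.5]
```

Smoothed values above 1 over a stretch of the alignment would point to a region worth inspecting for positive selection, while values well below 1 suggest purifying selection.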
Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko
2016-08-15
Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. The transgenic construct- and event-specific sequences identified were those of a GM papaya developed to resist infection by Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled the identification of unknown transgenic construct- and event-specific sequences in GM papaya and the development of a reliable method for detecting them in papaya food commodities.
Pelgrim, E A M; Kramer, A W M; Mokkink, H G A; van der Vleuten, C P M
2013-09-01
Although the literature suggests that reflection has a positive impact on learning, there is a paucity of evidence to support this notion. We investigated feedback and reflection in relation to the likelihood that feedback will be used to inform action plans. We hypothesised that feedback and reflection present a cumulative sequence (i.e. trainers only pay attention to trainees' reflections when they provided specific feedback) and we hypothesised a supplementary effect of reflection. We analysed copies of assessment forms containing trainees' reflections and trainers' feedback on observed clinical performance. We determined whether the response patterns revealed cumulative sequences in line with the Guttman scale. We further examined the relationship between reflection, feedback and the mean number of specific comments related to an action plan (ANOVA) and we calculated two effect sizes. Both hypotheses were confirmed by the results. The response pattern found showed an almost perfect fit with the Guttman scale (0.99) and reflection seems to have a supplementary effect on the variable action plan. Reflection only occurs when a trainer has provided specific feedback; trainees who reflect on their performance are more likely to make use of feedback. These results confirm findings and suggestions reported in the literature.
Loher, Phillipe; Telonis, Aristeidis G.; Rigoutsos, Isidore
2017-01-01
Transfer RNA fragments (tRFs) are an established class of constitutive regulatory molecules that arise from precursor and mature tRNAs. RNA deep sequencing (RNA-seq) has greatly facilitated the study of tRFs. However, the repeat nature of the tRNA templates and the idiosyncrasies of tRNA sequences necessitate the development and use of methodologies that differ markedly from those used to analyze RNA-seq data when studying microRNAs (miRNAs) or messenger RNAs (mRNAs). Here we present MINTmap (for MItochondrial and Nuclear TRF mapping), a method and a software package that was developed specifically for the quick, deterministic and exhaustive identification of tRFs in short RNA-seq datasets. In addition to identifying them, MINTmap is able to unambiguously calculate and report both raw and normalized abundances for the discovered tRFs. Furthermore, to ensure specificity, MINTmap identifies the subset of discovered tRFs that could be originating outside of tRNA space and flags them as candidate false positives. Our comparative analysis shows that MINTmap exhibits superior sensitivity and specificity to other available methods while also being exceptionally fast. The MINTmap codes are available through https://github.com/TJU-CMC-Org/MINTmap/ under an open source GNU GPL v3.0 license. PMID:28220888
Livério, Harisson Oliveira; Ruiz, Luciana da Silva; de Freitas, Roseli Santos; Nishikaku, Angela; de Souza, Ana Clara; Paula, Claudete Rodrigues; Domaneschi, Carina
2017-01-01
The aim of this study was to assess a collection of yeasts to verify the presence of Candida dubliniensis among strains isolated from the oral mucosa of AIDS pediatric patients which were initially characterized as Candida albicans by the traditional phenotypic method, as well as to evaluate the main phenotypic methods used in the discrimination between the two species and confirm the identification through genotypic techniques, i.e., DNA sequencing. Twenty-nine samples of C. albicans isolated from this population and kept in a fungi collection were evaluated and re-characterized. In order to differentiate the two species, phenotypic tests (Thermotolerance tests, Chromogenic medium, Staib agar, Tobacco agar, Hypertonic medium) were performed and genotypic techniques using DNA sequencing were employed for confirmation of isolated species. Sensitivity and specificity were calculated for each test. No phenotypic test alone was sufficient to provide definitive identification of C. dubliniensis or C. albicans, as opposed to results of molecular tests. After amplification and sequencing of specific regions of the 29 studied strains, 93.1% of the isolates were identified as C. albicans and 6.9% as C. dubliniensis. The Staib agar assay showed a higher sensitivity (96.3%) in comparison with other phenotypic techniques. Therefore, genotypic methods are indispensable for the conclusive identification and differentiation between these species. PMID:28423089
Stark width regularities within spectral series of the lithium isoelectronic sequence
NASA Astrophysics Data System (ADS)
Tapalaga, Irinel; Trklja, Nora; Dojčinović, Ivan P.; Purić, Jagoš
2018-03-01
Stark width regularities within spectral series of the lithium isoelectronic sequence have been studied in an approach that includes both neutrals and ions. The influence of environmental conditions and certain atomic parameters on the Stark widths of spectral lines has been investigated. This study gives a simple model for the calculation of Stark broadening data for spectral lines within the lithium isoelectronic sequence. The proposed model requires fewer parameters than any other model. The obtained relations were used for predictions of Stark widths for transitions that have not yet been measured or calculated. In the framework of the present research, three algorithms for fast data processing have been made and they enable quality control and provide verification of the theoretically calculated results.
NASA Technical Reports Server (NTRS)
Wallace, G. R.; Weathers, G. D.; Graf, E. R.
1973-01-01
The statistics of filtered pseudorandom digital sequences called hybrid-sum sequences, formed from the modulo-two sum of several maximum-length sequences, are analyzed. The results indicate that a relation exists between the statistics of the filtered sequence and the characteristic polynomials of the component maximum length sequences. An analysis procedure is developed for identifying a large group of sequences with good statistical properties for applications requiring the generation of analog pseudorandom noise. By use of the analysis approach, the filtering process is approximated by the convolution of the sequence with a sum of unit step functions. A parameter reflecting the overall statistical properties of filtered pseudorandom sequences is derived. This parameter is called the statistical quality factor. A computer algorithm to calculate the statistical quality factor for the filtered sequences is presented, and the results for two examples of sequence combinations are included. The analysis reveals that the statistics of the signals generated with the hybrid-sum generator are potentially superior to the statistics of signals generated with maximum-length generators. Furthermore, fewer calculations are required to evaluate the statistics of a large group of hybrid-sum generators than are required to evaluate the statistics of the same size group of approximately equivalent maximum-length sequences.
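As a rough illustration of the construction described above, a hybrid-sum sequence can be formed by XOR-ing (modulo-two summing) the outputs of several maximum-length shift-register generators. The sketch below is not taken from the report: the register lengths, feedback taps, initial states and function names are assumptions chosen for illustration, and it does not attempt to compute the statistical quality factor.

```python
from itertools import islice


def m_sequence(taps, state):
    """Yield bits from a Fibonacci linear-feedback shift register.

    taps  -- 1-based register positions XOR-ed to form the feedback bit
    state -- initial register contents (list of 0/1, not all zero)
    """
    reg = list(state)
    while True:
        yield reg[-1]                 # output the oldest bit
        fb = 0
        for t in taps:
            fb ^= reg[t - 1]
        reg = [fb] + reg[:-1]         # shift and insert the feedback bit


def hybrid_sum(generators):
    """Yield the modulo-two sum of several bit generators."""
    for bits in zip(*generators):
        total = 0
        for b in bits:
            total ^= b
        yield total


# Assumed example: a 3-stage register (period 7) and a 4-stage register (period 15).
seq_a = m_sequence(taps=(3, 2), state=[1, 0, 0])
seq_b = m_sequence(taps=(4, 3), state=[1, 0, 0, 0])
hybrid = list(islice(hybrid_sum([seq_a, seq_b]), 210))
```

Because the two component periods (7 and 15) are coprime, the combined sequence in this example repeats every 105 bits, which is one reason combining short generators is attractive.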
Chromosome specific repetitive DNA sequences
Moyzis, Robert K.; Meyne, Julianne
1991-01-01
A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members. This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).
Exact method for numerically analyzing a model of local denaturation in superhelically stressed DNA
NASA Astrophysics Data System (ADS)
Fye, Richard M.; Benham, Craig J.
1999-03-01
Local denaturation, the separation at specific sites of the two strands comprising the DNA double helix, is one of the most fundamental processes in biology, required to allow the base sequence to be read both in DNA transcription and in replication. In living organisms this process can be mediated by enzymes which regulate the amount of superhelical stress imposed on the DNA. We present a numerically exact technique for analyzing a model of denaturation in superhelically stressed DNA. This approach is capable of predicting the locations and extents of transition in circular superhelical DNA molecules of kilobase lengths and specified base pair sequences. It can also be used for closed loops of DNA which are typically found in vivo to be kilobases long. The analytic method consists of an integration over the DNA twist degrees of freedom followed by the introduction of auxiliary variables to decouple the remaining degrees of freedom, which allows the use of the transfer matrix method. The algorithm implementing our technique requires O(N^2) operations and O(N) memory to analyze a DNA domain containing N base pairs. However, to analyze kilobase length DNA molecules it must be implemented in high precision floating point arithmetic. An accelerated algorithm is constructed by imposing an upper bound M on the number of base pairs that can simultaneously denature in a state. This accelerated algorithm requires O(MN) operations, and has an analytically bounded error. Sample calculations show that it achieves high accuracy (greater than 15 decimal digits) with relatively small values of M (M<0.05N) for kilobase length molecules under physiologically relevant conditions. Calculations are performed on the superhelical pBR322 DNA sequence to test the accuracy of the method. With no free parameters in the model, the locations and extents of local denaturation predicted by this analysis are in quantitatively precise agreement with in vitro experimental measurements.
Calculations performed on the fructose-1,6-bisphosphatase gene sequence from yeast show that this approach can also accurately treat in vivo denaturation.
DRUMS: Disk Repository with Update Management and Select option for high throughput sequencing data.
Nettling, Martin; Thieme, Nils; Both, Andreas; Grosse, Ivo
2014-02-04
New technologies for analyzing biological samples, like next generation sequencing, are producing a growing amount of data together with quality scores. Moreover, software tools (e.g., for mapping sequence reads, calculating transcription factor binding probabilities, estimating epigenetic modification enriched regions, or determining single nucleotide polymorphisms) increase this amount of position-specific DNA-related data even further. Hence, requesting data becomes challenging and expensive and is often implemented using specialised hardware. In addition, picking specific data as fast as possible becomes increasingly important in many fields of science. The general problem of handling big data sets was addressed by developing specialized databases like HBase, HyperTable or Cassandra. However, these database solutions also require specialized or distributed hardware, leading to expensive investments. To the best of our knowledge, there is no database capable of (i) storing billions of position-specific DNA-related records, (ii) performing fast and resource saving requests, and (iii) running on a single standard computer. Here, we present DRUMS (Disk Repository with Update Management and Select option), satisfying demands (i)-(iii). It tackles the weaknesses of traditional databases while handling position-specific DNA-related data in an efficient manner. DRUMS is capable of storing up to billions of records. Moreover, it focuses on optimizing related single lookups as range requests, which are needed constantly for computations in bioinformatics. To validate the power of DRUMS, we compare it to the widely used MySQL database. The test setting considers two biological data sets. We use standard desktop hardware as test environment. DRUMS outperforms MySQL in writing and reading records by a factor of two up to a factor of 10000. Furthermore, it can work with significantly larger data sets.
Our work focuses on mid-sized data sets up to several billion records without requiring cluster technology. Storing position-specific data is a general problem and the concept we present here is a generalized approach. Hence, it can be easily applied to other fields of bioinformatics.
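To make the idea of treating related single lookups as a range request concrete, the following sketch keeps position-keyed records sorted and answers both kinds of query with binary search. It is only an in-memory illustration under assumed names and data (`records`, `range_request`, `single_lookup`); it says nothing about how DRUMS actually lays out or updates data on disk.

```python
from bisect import bisect_left, bisect_right

# Hypothetical records: ((chromosome, position), value), kept sorted by key.
records = sorted([
    (("chr1", 100), 0.12),
    (("chr1", 250), 0.80),
    (("chr1", 900), 0.33),
    (("chr2", 50), 0.51),
])
keys = [key for key, _ in records]


def range_request(chrom, start, end):
    """Return all records on `chrom` with start <= position <= end."""
    lo = bisect_left(keys, (chrom, start))
    hi = bisect_right(keys, (chrom, end))
    return records[lo:hi]


def single_lookup(chrom, pos):
    """A single lookup expressed as a range request of width one."""
    hits = range_request(chrom, pos, pos)
    return hits[0][1] if hits else None
```

With sorted keys, neighbouring positions sit next to each other, so a range request costs two binary searches plus one contiguous read rather than one search per position.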
Free Energy Perturbation Calculations of the Thermodynamics of Protein Side-Chain Mutations.
Steinbrecher, Thomas; Abel, Robert; Clark, Anthony; Friesner, Richard
2017-04-07
Protein side-chain mutation is fundamental both to natural evolutionary processes and to the engineering of protein therapeutics, which constitute an increasing fraction of important medications. Molecular simulation enables the prediction of the effects of mutation on properties such as binding affinity, secondary and tertiary structure, conformational dynamics, and thermal stability. A number of widely differing approaches have been applied to these predictions, including sequence-based algorithms, knowledge-based potential functions, and all-atom molecular mechanics calculations. Free energy perturbation theory, employing all-atom and explicit-solvent molecular dynamics simulations, is a rigorous physics-based approach for calculating thermodynamic effects of, for example, protein side-chain mutations. Over the past several years, we have initiated an investigation of the ability of our most recent free energy perturbation methodology to model the thermodynamics of protein mutation for two specific problems: protein-protein binding affinities and protein thermal stability. We highlight recent advances in the field and outline current and future challenges.
Fortin, Connor H; Schulze, Katharina V; Babbitt, Gregory A
2015-01-01
It is now widely accepted that DNA sequences defining DNA-protein interactions functionally depend upon local biophysical features of the DNA backbone that are important in defining sites of binding interaction in the genome (e.g. DNA shape, charge and intrinsic dynamics). However, these physical features of the DNA polymer are not directly apparent when analyzing and viewing Shannon information content calculated at single nucleobases in a traditional sequence logo plot. Thus, sequence logo plots are severely limited in that they convey no explicit information regarding the structural dynamics of the DNA backbone, a feature often critical to binding specificity. We present TRX-LOGOS, an R software package and Perl wrapper code that interfaces the JASPAR database for computational regulatory genomics. TRX-LOGOS extends the traditional sequence logo plot to include Shannon information content calculated with regard to the dinucleotide-based BI-BII conformation shifts in phosphate linkages on the DNA backbone, thereby adding a visual measure of intrinsic DNA flexibility that can be critical for many DNA-protein interactions. TRX-LOGOS is available as an R graphics module offered at both SourceForge and as a download supplement at this journal. To demonstrate the general utility of TRX logo plots, we first calculated the information content for 416 Saccharomyces cerevisiae transcription factor binding sites functionally confirmed in the Yeastract database and matched to previously published yeast genomic alignments. We discovered that flanking regions contain significantly more information content at phosphate linkages than can be observed at nucleobases. We also examined broader transcription factor classifications defined by the JASPAR database, and discovered that many general signatures of transcription factor binding are locally more information rich at the level of DNA backbone dynamics than nucleobase sequence.
We used TRX-LOGOS in combination with MEGA 6.0 software for molecular evolutionary genetics analysis to visually compare the human Forkhead box/FOX protein evolution to its binding site evolution. We also compared the DNA binding signatures of human TP53 tumor suppressor determined by two different laboratory methods (SELEX and ChIP-seq). Further analysis of the entire yeast genome, center aligned at the start codon, also revealed a distinct sequence-independent 3 bp periodic pattern in information content, present only in coding regions, and perhaps indicative of the non-random organization of the genetic code. TRX-LOGOS is useful in any situation in which important information content in DNA can be better visualized at the positions of phosphate linkages (i.e. dinucleotides) where the dynamic properties of the DNA backbone function to facilitate DNA-protein interaction.
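For readers unfamiliar with the quantity plotted in a traditional sequence logo, the sketch below computes per-position Shannon information content at nucleobases for a set of aligned sites. The binding sites and the function name are made up for illustration, no small-sample correction is applied, and the dinucleotide BI-BII calculation that TRX-LOGOS adds is not reproduced here.

```python
import math
from collections import Counter


def information_content(aligned_sites, alphabet_size=4):
    """Per-column information content (bits) of equal-length aligned sites."""
    max_bits = math.log2(alphabet_size)      # 2 bits for A, C, G, T
    n = len(aligned_sites)
    content = []
    for column in zip(*aligned_sites):
        entropy = 0.0
        for count in Counter(column).values():
            p = count / n
            entropy -= p * math.log2(p)
        content.append(max_bits - entropy)   # conserved columns score high
    return content


# Made-up binding sites, used only to exercise the function.
sites = ["TGACTC", "TGACTA", "TGAGTC", "TTACTC"]
ic = information_content(sites)
```

A fully conserved column carries the maximum of 2 bits, while a column in which all four bases are equally frequent carries none.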
Ohta, Tazro; Nakazato, Takeru; Bono, Hidemasa
2017-06-01
It is important for public data repositories to promote the reuse of archived data. In the growing field of omics science, however, the increasing number of submissions of high-throughput sequencing (HTSeq) data to public repositories prevents users from choosing a suitable data set from among the large number of search results. Repository users need to be able to set a threshold to reduce the number of results to obtain a suitable subset of high-quality data for reanalysis. We calculated the quality of sequencing data archived in a public data repository, the Sequence Read Archive (SRA), by using the quality control software FastQC. We obtained quality values for 1 171 313 experiments, which can be used to evaluate the suitability of data for reuse. We also visualized the data distribution in SRA by integrating the quality information and metadata of experiments and samples. We provide quality information for all of the archived sequencing data, which enables users to obtain sequencing data of sufficient quality for reanalysis. The calculated quality data are available to the public in various formats. Our data also provide an example of enhancing the reuse of public data by adding metadata to published research data by a third party.
Yamada, Kazuhiko; Kamimura, Eikichi; Kondo, Mariko; Tsuchiya, Kimiyuki; Nishida-Umehara, Chizuko; Matsuda, Yoichi
2006-02-01
We molecularly cloned new families of site-specific repetitive DNA sequences from BglII- and EcoRI-digested genomic DNA of the Syrian hamster (Mesocricetus auratus, Cricetinae, Rodentia) and characterized them by chromosome in situ hybridization and filter hybridization. They were classified into six different types of repetitive DNA sequence families according to chromosomal distribution and genome organization. The hybridization patterns of the sequences were consistent with the distribution of C-positive bands and/or Hoechst-stained heterochromatin. The centromeric major satellite DNA and sex chromosome-specific and telomeric region-specific repetitive sequences were conserved in the same genus (Mesocricetus) but divergent in different genera. The chromosome-2-specific sequence was conserved in two genera, Mesocricetus and Cricetulus, and a low copy number of repetitive sequences on the heterochromatic chromosome arms were conserved in the subfamily Cricetinae but not in the subfamily Calomyscinae. By contrast, the other type of repetitive sequences on the heterochromatic chromosome arms, which had sequence similarities to a LINE sequence of rodents, was conserved through the three subfamilies, Cricetinae, Calomyscinae and Murinae. The nucleotide divergence of the repetitive sequences of heterochromatin was well correlated with the phylogenetic relationships of the Cricetinae species, and each sequence has been independently amplified and diverged in the same genome.
Oxygen isotope trajectories of crystallizing melts: Insights from modeling and the plutonic record
NASA Astrophysics Data System (ADS)
Bucholz, Claire E.; Jagoutz, Oliver; VanTongeren, Jill A.; Setera, Jacob; Wang, Zhengrong
2017-06-01
Elevated oxygen isotope values in igneous rocks are often used to fingerprint supracrustal alteration or assimilation of material that once resided near the surface of the earth. The δ18O value of a melt, however, can also increase through closed-system fractional crystallization. In order to quantify the change in melt δ18O due to crystallization, we develop a detailed closed-system fractional crystallization mass balance model and apply it to six experimentally- and naturally-determined liquid lines of descent (LLDs), which cover nearly complete crystallization intervals (melt fractions of 1 to <0.1). The studied LLDs vary from anhydrous tholeiitic basalts to hydrous high-K and calc-alkaline basalts and are characterized by distinct melt temperature-SiO2 trajectories, as well as crystallizing phase relationships. Our model results demonstrate that melt fraction-temperature-SiO2 relationships of crystallizing melts, which are strongly a function of magmatic water content, will control the specific δ18O path of a crystallizing melt. Hydrous melts, typical of subduction zones, undergo larger increases in δ18O during early stages of crystallization due to their lower magmatic temperatures, greater initial increases in SiO2 content, and high temperature stability of low δ18O phases, such as oxides, amphibole, and anorthitic plagioclase (versus albite). Conversely, relatively dry, tholeiitic melts only experience significant increases in δ18O at degrees of crystallization greater than 80%. Total calculated increases in melt δ18O of 1.0-1.5‰ can be attributed to crystallization from ∼50 to 70 wt.% SiO2 for modeled closed-system crystallizing melt compositions.
As an example application, we compare our closed system model results to oxygen isotope mineral data from two natural plutonic sequences, a relatively dry, tholeiitic sequence from the Upper and Upper Main Zones (UUMZ) of the Bushveld Complex (South Africa) and a high-K, hydrous sequence from the arc-related Dariv Igneous Complex (Mongolia). These two sequences were chosen as their major and trace element compositions appear to have been predominantly controlled by closed-system fractional crystallization and their LLDs have been modeled in detail. We calculated equilibrium melt δ18O values using the measured mineral δ18O values and calculated mineral-melt fractionation factors. Increases of 2-3‰ and 1-1.5‰ in the equilibrium melts are observed for the Dariv Igneous Complex and the UUMZ of the Bushveld Complex, respectively. Closed-system fractional crystallization model results reproduce the 1‰ increase observed in the equilibrium melt δ18O for the Bushveld UUMZ, whereas for the Dariv Igneous Complex assimilation of high δ18O material is necessary to account for the increase in melt δ18O values. Assimilation of evolved supracrustal material is also confirmed with Sr and Nd isotope analyses of clinopyroxene from the sequence. Beginning with a range of mantle-derived basalt δ18O values of 5.7‰ ("pristine" mantle) to ∼7.0‰ (heavily subduction-influenced mantle), our model results demonstrated that high-silica melts (i.e. granites) with δ18O of up to 8.5‰ can be produced through fractional crystallization alone. Lastly, we model the zircon-melt δ18O fractionations of different LLDs, emphasizing their dependence on the specific SiO2-T relationships of a given crystallizing melt. Wet, relatively cool granitic melts will have larger zircon-melt fractionations, potentially by ∼1.5‰, compared to hot, dry granites. Therefore, it is critical to constrain zircon-melt fractionations specific to a system of interest when using zircon δ18O values to calculate melt δ18O.
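For orientation, the direction of the closed-system effect can be seen from the standard Rayleigh approximation for fractional crystallization. This is a textbook simplification with a constant bulk crystal-melt fractionation, not the mass balance model of the study, in which the fractionation changes with temperature and with the crystallizing assemblage; the symbols F, δ18O_0 and Δ are introduced here only for illustration.

```latex
\delta^{18}\mathrm{O}_{\mathrm{melt}}(F) \approx \delta^{18}\mathrm{O}_{0} + \Delta_{\mathrm{xtl\text{-}melt}} \,\ln F,
\qquad
\Delta_{\mathrm{xtl\text{-}melt}} = \delta^{18}\mathrm{O}_{\mathrm{xtl}} - \delta^{18}\mathrm{O}_{\mathrm{melt}}
```

Here F is the remaining melt fraction and δ18O_0 the initial melt value. Because ln F < 0 for F < 1, the melt becomes isotopically heavier whenever the bulk crystallizing assemblage is lighter than the melt (Δ < 0). With a purely illustrative Δ of -0.5‰, crystallizing to F = 0.1 would raise the melt by about 0.5 × 2.30 ≈ 1.2‰.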
Shaping up the protein folding funnel by local interaction: Lesson from a structure prediction study
Chikenji, George; Fujitsuka, Yoshimi; Takada, Shoji
2006-01-01
Predicting protein tertiary structure by folding-like simulations is one of the most stringent tests of how much we understand the principle of protein folding. Currently, the most successful method for folding-based structure prediction is the fragment assembly (FA) method. Here, we address why the FA method is so successful and its lesson for the folding problem. To do so, using the FA method, we designed a structure prediction test of “chimera proteins.” In the chimera proteins, local structural preference is specific to the target sequences, whereas nonlocal interactions are only sequence-independent compaction forces. We find that these chimera proteins can find the native folds of the intact sequences with high probability indicating dominant roles of the local interactions. We further explore roles of local structural preference by exact calculation of the HP lattice model of proteins. From these results, we suggest principles of protein folding: For small proteins, compact structures that are fully compatible with local structural preference are few, one of which is the native fold. These local biases shape up the funnel-like energy landscape. PMID:16488978
Structure-affinity relationships for the binding of actinomycin D to DNA
NASA Astrophysics Data System (ADS)
Gallego, José; Ortiz, Angel R.; de Pascual-Teresa, Beatriz; Gago, Federico
1997-03-01
Molecular models of the complexes between actinomycin D and 14 different DNA hexamers were built based on the X-ray crystal structure of the actinomycin-d(GAAGCTTC)2 complex. The DNA sequences included the canonical GpC binding step flanked by different base pairs, nonclassical binding sites such as GpG and GpT, and sites containing 2,6-diaminopurine. A good correlation was found between the intermolecular interaction energies calculated for the refined complexes and the relative preferences of actinomycin binding to standard and modified DNA. A detailed energy decomposition into van der Waals and electrostatic components for the interactions between the DNA base pairs and either the chromophore or the peptidic part of the antibiotic was performed for each complex. The resulting energy matrix was then subjected to principal component analysis, which showed that actinomycin D discriminates among different DNA sequences by an interplay of hydrogen bonding and stacking interactions. The structure-affinity relationships for this important antitumor drug are thus rationalized and may be used to advantage in the design of novel sequence-specific DNA-binding agents.
OrthoANI: An improved algorithm and software for calculating average nucleotide identity.
Lee, Imchang; Ouk Kim, Yeong; Park, Sang-Cheol; Chun, Jongsik
2016-02-01
Species demarcation in Bacteria and Archaea is mainly based on overall genome relatedness, which serves as a framework for modern microbiology. Current practice for obtaining these measures between two strains is shifting from experimentally determined similarity obtained by DNA-DNA hybridization (DDH) to genome-sequence-based similarity. Average nucleotide identity (ANI) is a simple algorithm that mimics DDH. Like DDH, ANI values between two genome sequences may be different from each other when reciprocal calculations are compared. We compared 63 690 pairs of genome sequences and found that the differences in reciprocal ANI values are significantly high, exceeding 1 % in some cases. To resolve this problem of not being symmetrical, a new algorithm, named OrthoANI, was developed to accommodate the concept of orthology for which both genome sequences were fragmented and only orthologous fragment pairs taken into consideration for calculating nucleotide identities. OrthoANI is highly correlated with ANI (using BLASTn) and the former showed approximately 0.1 % higher values than the latter. In conclusion, OrthoANI provides a more robust and faster means of calculating average nucleotide identity for taxonomic purposes. The standalone software tools are freely available at http://www.ezbiocloud.net/sw/oat.
Charles, Jermilia; Firth, Andrew E; Loroño-Pino, Maria A; Garcia-Rejon, Julian E; Farfan-Ale, Jose A; Lipkin, W Ian; Blitvich, Bradley J; Briese, Thomas
2016-04-01
Sequences corresponding to a putative, novel rhabdovirus [designated Merida virus (MERDV)] were initially detected in a pool of Culex quinquefasciatus collected in the Yucatan Peninsula of Mexico. The entire genome was sequenced, revealing 11 798 nt and five major ORFs, which encode the nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and RNA-dependent RNA polymerase (L). The deduced amino acid sequences of the N, G and L proteins have no more than 24, 38 and 43 % identity, respectively, to the corresponding sequences of all other known rhabdoviruses, whereas those of the P and M proteins have no significant identity with any sequences in GenBank and their identity is only suggested based on their genome position. Using specific reverse transcription-PCR assays established from the genome sequence, 27 571 C. quinquefasciatus which had been sorted in 728 pools were screened to assess the prevalence of MERDV in nature and 25 pools were found positive. The minimal infection rate (calculated as the number of positive mosquito pools per 1000 mosquitoes tested) was 0.9, and similar for both females and males. Screening another 140 pools of 5484 mosquitoes belonging to four other genera identified positive pools of Ochlerotatus spp. mosquitoes, indicating that the host range is not restricted to C. quinquefasciatus. Attempts to isolate MERDV in C6/36 and Vero cells were unsuccessful. In summary, we provide evidence that a previously undescribed rhabdovirus occurs in mosquitoes in Mexico.
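The minimal infection rate quoted above can be reproduced from the pool counts given in the abstract. The short sketch below is only an illustration of that arithmetic; the function name is a hypothetical helper, not something taken from the study.

```python
def minimal_infection_rate(positive_pools: int, mosquitoes_tested: int) -> float:
    """Positive pools per 1000 mosquitoes tested, as defined in the abstract."""
    if mosquitoes_tested <= 0:
        raise ValueError("mosquitoes_tested must be positive")
    return 1000.0 * positive_pools / mosquitoes_tested


# Counts reported for C. quinquefasciatus: 25 positive pools, 27 571 mosquitoes.
mir = minimal_infection_rate(25, 27571)
print(round(mir, 1))  # 0.9
```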
Precise genotyping and recombination detection of Enterovirus
2015-01-01
Enteroviruses (EV) with different genotypes cause diverse infectious diseases in humans and mammals. A correct EV typing result is crucial for effective medical treatment and disease control; however, the emergence of novel viral strains has impaired the performance of available diagnostic tools. Here, we present a web-based tool, named EVIDENCE (EnteroVirus In DEep conception, http://symbiont.iis.sinica.edu.tw/evidence), for EV genotyping and recombination detection. We introduce the idea of using mixed-ranking scores to evaluate the fitness of prototypes based on relatedness and on the genome regions of interest. Using phylogenetic methods, the most probable genotype is determined based on the closest neighbor among the selected references. To detect possible recombination events, EVIDENCE calculates the sequence distance and phylogenetic relationship among sequences of all sliding windows scanning over the whole genome. Detected recombination events are plotted in an interactive figure for viewing of fine details. In addition, all EV sequences available in GenBank were collected and revised using the latest classification and nomenclature of EV in EVIDENCE. These sequences are built into the database and are retrieved in an indexed catalog, or can be searched for by keywords or by sequence similarity. EVIDENCE is the first web-based tool containing pipelines for genotyping and recombination detection, with updated, built-in, and complete reference sequences to improve sensitivity and specificity. The use of EVIDENCE can accelerate genotype identification, aiding clinical diagnosis and enhancing our understanding of EV evolution. PMID:26678286
Patel, D J; Canuel, L L
1977-07-01
The complex formed between the mutagen proflavine and the dC-dC-dG-dG and dG-dG-dC-dC self-complementary tetranucleotide duplexes has been monitored by proton high resolution nuclear magnetic resonance spectroscopy in 0.1 M phosphate solution at high nucleotide/drug ratios. The large upfield shifts (0.5 to 0.85 ppm) observed at all the proflavine ring nonexchangeable protons on complex formation are consistent with intercalation of the mutagen between base pairs of the tetranucleotide duplex. We have proposed an approximate overlap geometry between the proflavine ring and nearest neighbor base pairs at the intercalation site from a comparison between experimental shifts and those calculated for various stacking orientations. We have compared the binding of actinomycin D, propidium diiodide, and proflavine to self-complementary tetranucleotide sequences dC-dC-dG-dG and dG-dG-dC-dC by UV absorbance changes in the drug bands between 400 and 500 nm. Actinomycin D exhibits a pronounced specificity for sequences with dG-dC sites (dG-dG-dC-dC), while propidium diiodide and proflavine exhibit a specificity for sequences with dC-dG sites (dC-dC-dG-dG). Actinomycin D binds more strongly than propidium diiodide and proflavine to dC-dG-dC-dG (contains dC-dG and dG-dC binding sites), indicative of the additional stabilization from hydrogen bonding and hydrophobic interactions between the pentapeptide lactone rings of actinomycin D and the base pair edges and sugar-phosphate backbone of the tetranucleotide duplex.
Patel, Dinshaw J.; Canuel, Lita L.
1977-01-01
The complex formed between the mutagen proflavine and the dC-dC-dG-dG and dG-dG-dC-dC self-complementary tetranucleotide duplexes has been monitored by proton high resolution nuclear magnetic resonance spectroscopy in 0.1 M phosphate solution at high nucleotide/drug ratios. The large upfield shifts (0.5 to 0.85 ppm) observed at all the proflavine ring nonexchangeable protons on complex formation are consistent with intercalation of the mutagen between base pairs of the tetranucleotide duplex. We have proposed an approximate overlap geometry between the proflavine ring and nearest neighbor base pairs at the intercalation site from a comparison between experimental shifts and those calculated for various stacking orientations. We have compared the binding of actinomycin D, propidium diiodide, and proflavine to self-complementary tetranucleotide sequences dC-dC-dG-dG and dG-dG-dC-dC by UV absorbance changes in the drug bands between 400 and 500 nm. Actinomycin D exhibits a pronounced specificity for sequences with dG-dC sites (dG-dG-dC-dC), while propidium diiodide and proflavine exhibit a specificity for sequences with dC-dG sites (dC-dC-dG-dG). Actinomycin D binds more strongly than propidium diiodide and proflavine to dC-dG-dC-dG (contains dC-dG and dG-dC binding sites), indicative of the additional stabilization from hydrogen bonding and hydrophobic interactions between the pentapeptide lactone rings of actinomycin D and the base pair edges and sugar-phosphate backbone of the tetranucleotide duplex. PMID:268613
Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui
2016-01-01
The recent advancement of next generation sequencing technology has enabled the fast and low-cost detection of all genetic variants across the entire human genome, making whole-genome sequencing a growing trend in the study of disease-causing genetic variants. Nevertheless, there is still no repository that collects predictions of functionally damaging effects of human genetic variants, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. © The Author(s) 2016. Published by Oxford University Press.
Understanding the mechanisms of protein-DNA interactions
NASA Astrophysics Data System (ADS)
Lavery, Richard
2004-03-01
Structural, biochemical and thermodynamic data on protein-DNA interactions show that specific recognition cannot be reduced to a simple set of binary interactions between the partners (such as hydrogen bonds, ion pairs or steric contacts). The mechanical properties of the partners also play a role and, in the case of DNA, variations in both conformation and flexibility as a function of base sequence can be a significant factor in guiding a protein to the correct binding site. All-atom molecular modeling offers a means of analyzing the role of different binding mechanisms within protein-DNA complexes of known structure. This however requires estimating the binding strengths for the full range of sequences with which a given protein can interact. Since this number grows exponentially with the length of the binding site it is necessary to find a method to accelerate the calculations. We have achieved this by using a multi-copy approach (ADAPT) which allows us to build a DNA fragment with a variable base sequence. The results obtained with this method correlate well with experimental consensus binding sequences. They enable us to show that indirect recognition mechanisms involving the sequence dependent properties of DNA play a significant role in many complexes. This approach also offers a means of predicting protein binding sites on the basis of binding energies, which is complementary to conventional lexical techniques.
Plasmon-polaritonic bands in sequential doped graphene superlattices
NASA Astrophysics Data System (ADS)
Ramos-Mendieta, Felipe; Palomino-Ovando, Martha; Hernández-López, Alejandro; Fuentecilla-Cárcamo, Iván
Doped graphene has the extraordinary quality of supporting two types of surface excitations that involve electric charges (the transverse magnetic surface plasmons) or electric currents (the transverse electric modes). We have studied numerically the collective modes that result from the coupling of surface plasmons in doped graphene multilayers. By use of structured supercells with fixed dielectric background and inter layer separation, we found a series of plasmon-polaritonic bands of structure dependent on the doping sequence chosen for the graphene sheets. Periodic and quasiperiodic sequences for the graphene chemical potential have been studied. Our results show that transverse magnetic bands exist only in the low frequency regime but transverse electric bands arise within specific ranges of higher frequencies. Our calculations are valid for THz frequencies and graphene sheets with doping levels between 0.1 eV and 1.2 eV have been considered. AHL and IFC acknowledge fellowship support from CONACYT México.
Entropy of finite random binary sequences with weak long-range correlations.
Melnik, S S; Usatenko, O V
2014-11-01
We study the N-step binary stationary ergodic Markov chain and analyze its differential entropy. Supposing that the correlations are weak we express the conditional probability function of the chain through the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses the two-point correlators instead of the block probability, it makes it possible to calculate the entropy of strings at much longer distances than using standard methods. A fluctuation contribution to the entropy due to finiteness of random chains is examined. This contribution can be of the same order as its regular part even at the relatively short lengths of subsequences. A self-similar structure of entropy with respect to the decimation transformations is revealed for some specific forms of the pair correlation function. Application of the theory to the DNA sequence of the R3 chromosome of Drosophila melanogaster is presented.
Entropy of finite random binary sequences with weak long-range correlations
NASA Astrophysics Data System (ADS)
Melnik, S. S.; Usatenko, O. V.
2014-11-01
We study the N -step binary stationary ergodic Markov chain and analyze its differential entropy. Supposing that the correlations are weak we express the conditional probability function of the chain through the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses the two-point correlators instead of the block probability, it makes it possible to calculate the entropy of strings at much longer distances than using standard methods. A fluctuation contribution to the entropy due to finiteness of random chains is examined. This contribution can be of the same order as its regular part even at the relatively short lengths of subsequences. A self-similar structure of entropy with respect to the decimation transformations is revealed for some specific forms of the pair correlation function. Application of the theory to the DNA sequence of the R3 chromosome of Drosophila melanogaster is presented.
Pattern Recognition of Adsorbing HP Lattice Proteins
NASA Astrophysics Data System (ADS)
Wilson, Matthew S.; Shi, Guangjie; Wüst, Thomas; Landau, David P.; Schmid, Friederike
2015-03-01
Protein adsorption is relevant in fields ranging from medicine to industry, and the qualitative behavior exhibited by coarse-grained models could offer insight for further research in such fields. Our study on the selective adsorption of lattice proteins utilizes the Wang-Landau algorithm to simulate the Hydrophobic-Polar (H-P) model with an efficient set of Monte Carlo moves. Each substrate is modeled as a square pattern of 9 lattice sites which attract either H or P monomers, and are located on an otherwise neutral surface. The fully enumerated set of 102 unique surfaces is simulated with each protein sequence. A collection of 27-monomer sequences is used, each of which is non-degenerate and protein-like. Thermodynamic quantities such as the specific heat and free energy are calculated from the density of states, and are used to investigate the adsorption of lattice proteins on patterned substrates. Research supported by NSF.
Rényi continuous entropy of DNA sequences.
Vinga, Susana; Almeida, Jonas S
2004-12-07
Entropy measures of DNA sequences estimate their randomness or, inversely, their repeatability. L-block Shannon discrete entropy accounts for the empirical distribution of all length-L words and has convergence problems for finite sequences. A new entropy measure that extends Shannon's formalism is proposed. Rényi's quadratic entropy, calculated with the Parzen window density estimation method applied to CGR/USM continuous maps of DNA sequences, constitutes a novel technique to evaluate global sequence randomness without some of the drawbacks of the former method. The asymptotic behaviour of this new measure was analytically deduced and the calculation of entropies for several synthetic and experimental biological sequences was performed. The results obtained were compared with the distributions of the null model of randomness obtained by simulation. The biological sequences have shown a different p-value according to the kernel resolution of Parzen's method, which might indicate an unknown level of organization of their patterns. This new technique can be very useful in the study of DNA sequence complexity and provide additional tools for DNA entropy estimation. The main MATLAB applications developed and additional material are available at the webpage . Specialized functions can be obtained from the authors.
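As a rough illustration of the idea in this abstract, the sketch below maps a DNA string to chaos game representation (CGR) points and evaluates Rényi's quadratic entropy from a Gaussian Parzen window estimate. It is not the authors' MATLAB code: the corner assignment, the isotropic Gaussian kernel, the kernel width sigma, and all function names are assumptions made for this example. With a Gaussian kernel of variance sigma^2, the integral of the squared density estimate reduces to a double sum of Gaussians of variance 2*sigma^2, which is what the second function computes.

```python
import math

# Corner assignment for the chaos game representation (CGR). This layout is
# an assumption for the example; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Map a DNA string to CGR coordinates in the unit square."""
    x, y = 0.5, 0.5
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point to the corner of this base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def renyi_quadratic_entropy(points, sigma):
    """H2 = -ln((1/N^2) * sum_ij G(x_i - x_j; 2*sigma^2 * I)) for 2D points."""
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    # Normalisation of a 2D Gaussian with covariance 2*sigma^2 * I.
    norm = 1.0 / (4.0 * math.pi * sigma ** 2)
    total = 0.0
    for xi, yi in points:
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += norm * math.exp(-d2 / (4.0 * sigma ** 2))
    return -math.log(total / (n * n))


# Toy sequences (invented): a homopolymer collapses towards one corner,
# while a mixed repeat spreads over the square and gives a higher entropy.
h_homopolymer = renyi_quadratic_entropy(cgr_points("A" * 50), sigma=0.1)
h_mixed = renyi_quadratic_entropy(cgr_points("ACGT" * 12 + "AC"), sigma=0.1)
```

The double loop is quadratic in sequence length, so this form is only practical for short sequences; it is meant to show the definition rather than an efficient estimator.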
He, Bing; Caudy, Amy; Parsons, Lance; Rosebrock, Adam; Pane, Attilio; Raj, Sandeep; Wieschaus, Eric
2012-01-01
Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy using chromosomal rearrangements and embryonic phenotypes to position unmapped Drosophila melanogaster heterochromatic sequence to specific chromosomal regions. By excluding sequences that can be mapped to the assembled euchromatic arms, we identified sequences that are specific to heterochromatin and used them to design heterochromatin specific probes (“H-probes”) for microarray. By comparative genomic hybridization (CGH) analyses of embryos deficient for each chromosome or chromosome arm, we were able to map most of our H-probes to specific chromosome arms. We also positioned sequences mapped to the second and X chromosomes to finer intervals by analyzing smaller deletions with breakpoints in heterochromatin. Using this approach, we were able to map >40% (13.9 Mb) of the previously unmapped heterochromatin sequences assembled by the whole-genome sequencing effort on arm U and arm Uextra to specific locations. We also identified and mapped 110 kb of novel heterochromatic sequences. Subsequent analyses revealed that sequences located within different heterochromatic regions have distinct properties, such as sequence composition, degree of repetitiveness, and level of underreplication in polytenized tissues. Surprisingly, although heterochromatin is generally considered to be transcriptionally silent, we detected region-specific temporal patterns of transcription in heterochromatin during oogenesis and early embryonic development. Our study provides a useful approach to elucidate the molecular organization and function of heterochromatin and reveals region-specific variation of heterochromatin. PMID:22745230
Ilk, Nicola; Völlenkle, Christine; Egelseer, Eva M.; Breitwieser, Andreas; Sleytr, Uwe B.; Sára, Margit
2002-01-01
The nucleotide sequence encoding the crystalline bacterial cell surface (S-layer) protein SbpA of Bacillus sphaericus CCM 2177 was determined by a PCR-based technique using four overlapping fragments. The entire sbpA sequence indicated one open reading frame of 3,804 bp encoding a protein of 1,268 amino acids with a theoretical molecular mass of 132,062 Da and a calculated isoelectric point of 4.69. The N-terminal part of SbpA, which is involved in anchoring the S-layer subunits via a distinct type of secondary cell wall polymer to the rigid cell wall layer, comprises three S-layer-homologous motifs. For screening of amino acid positions located on the outer surface of the square S-layer lattice, the sequence encoding Strep-tag I, showing affinity to streptavidin, was linked to the 5′ end of the sequence encoding the recombinant S-layer protein (rSbpA) or a C-terminally truncated form (rSbpA31-1068). The deletion of 200 C-terminal amino acids did not interfere with the self-assembly properties of the S-layer protein but significantly increased the accessibility of Strep-tag I. Thus, the sequence encoding the major birch pollen allergen (Bet v1) was fused via a short linker to the sequence encoding the C-terminally truncated form rSbpA31-1068. Labeling of the square S-layer lattice formed by recrystallization of rSbpA31-1068/Bet v1 on peptidoglycan-containing sacculi with a Bet v1-specific monoclonal mouse antibody demonstrated the functionality of the fused protein sequence and its location on the outer surface of the S-layer lattice. The specific interactions between the N-terminal part of SbpA and the secondary cell wall polymer will be exploited for an oriented binding of the S-layer fusion protein on solid supports to generate regularly structured functional protein lattices. PMID:12089001
Advanced Reactor PSA Methodologies for System Reliability Analysis and Source Term Assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grabaskas, D.; Brunett, A.; Passerini, S.
Beginning in 2015, a project was initiated to update and modernize the probabilistic safety assessment (PSA) of the GE-Hitachi PRISM sodium fast reactor. This project is a collaboration between GE-Hitachi and Argonne National Laboratory (Argonne), and funded in part by the U.S. Department of Energy. Specifically, the role of Argonne is to assess the reliability of passive safety systems, complete a mechanistic source term calculation, and provide component reliability estimates. The assessment of passive system reliability focused on the performance of the Reactor Vessel Auxiliary Cooling System (RVACS) and the inherent reactivity feedback mechanisms of the metal fuel core. The mechanistic source term assessment attempted to provide a sequence specific source term evaluation to quantify offsite consequences. Lastly, the reliability assessment focused on components specific to the sodium fast reactor, including electromagnetic pumps, intermediate heat exchangers, the steam generator, and sodium valves and piping.
[Molecular identification of Hibiscus syriacus and its adulterants using ITS2 barcode].
Liu, Yi-Mei; Jin, Li-Na; Xiong, Yong-Xin; Wu, Lan; Chen, Ke-Li
2014-03-01
To identify Hibiscus syriacus and its adulterants using the DNA barcoding technique. Nine samples of five species were PCR amplified and sequenced, and twelve samples were downloaded from GenBank. The intra-specific and inter-specific K2P distances were calculated, and a neighbor-joining (NJ) tree was constructed with MEGA 5.0. The results showed that the intra-specific genetic distances of Hibiscus syriacus ranged from 0.009 to 0.056, far lower than the inter-specific genetic distances between Hibiscus syriacus and its adulterants (0.236 - 0.301). Variable sites within Hibiscus syriacus ranged from 2 to 9, far fewer than in the adulterants (45 - 52). Different samples of Hibiscus syriacus clustered together and could be distinguished from its adulterants in the NJ tree. ITS2 can discriminate Hibiscus syriacus from its adulterants correctly. The ITS2 region is an efficient barcode for authentication of Hibiscus syriacus and its adulterants.
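The K2P (Kimura two-parameter) distance mentioned above has a closed form in terms of the proportions of transitions (P) and transversions (Q) between two aligned sequences: d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). The sketch below is a minimal illustration of that formula, not the MEGA implementation; the function name and the choice to skip gapped or ambiguous sites are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    arg1 = 1.0 - 2.0 * p - q
    arg2 = 1.0 - 2.0 * q
    if arg1 <= 0.0 or arg2 <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg1) - 0.25 * math.log(arg2)
```

For example, two aligned 10-nt sequences differing by a single transition give P = 0.1 and Q = 0, so d = -1/2 ln(0.8), slightly above the uncorrected proportion of 0.1.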
FY11 Report on Metagenome Analysis using Pathogen Marker Libraries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, Shea N.; Allen, Jonathan E.; McLoughlin, Kevin S.
2011-06-02
A method, sequence library, and software suite was invented to rapidly assess whether any member of a pre-specified list of threat organisms or their near neighbors is present in a metagenome. The system was designed to handle mega- to giga-bases of FASTA-formatted raw sequence reads from short or long read next generation sequencing platforms. The approach is to pre-calculate a viral and a bacterial "Pathogen Marker Library" (PML) containing sub-sequences specific to pathogens or their near neighbors. A list of expected matches comparing every bacterial or viral genome against the PML sequences is also pre-calculated. To analyze a metagenome, reads are compared to the PML, and observed PML-metagenome matches are compared to the expected PML-genome matches, and the ratio of observed relative to expected matches is reported. In other words, a 3-way comparison among the PML, metagenome, and existing genome sequences is used to quickly assess which (if any) species included in the PML is likely to be present in the metagenome, based on available sequence data. Our tests showed that the species with the most PML matches correctly indicated the organism sequenced for empirical metagenomes consisting of a cultured, relatively pure isolate. These runs completed in 1 minute to 3 hours on 12 CPU (1 thread/CPU), depending on the metagenome and PML. Using more threads on the same number of CPU resulted in speed improvements roughly proportional to the number of threads. Simulations indicated that detection sensitivity depends on both sequencing coverage levels for a species and the size of the PML: species were correctly detected even at ~0.003x coverage by the large PMLs, and at ~0.03x coverage by the smaller PMLs. Matches to true positive species were 3-4 orders of magnitude higher than to false positives. 
Simulations with short reads (36 nt and ~260 nt) showed that species were usually detected for metagenome coverage above 0.005x and coverage in the PML above 0.05x, and detection probability appears to be a function of both coverages. Multiple species could be detected simultaneously in a simulated low-coverage, complex metagenome, and the largest PML gave no false negative species and no false positive genera. The presence of multiple species was predicted in a complex metagenome from a human gut microbiome with 1.9 GB of short reads (75 nt); the species predicted were reasonable gut flora and no biothreat agents were detected, showing the feasibility of PML analysis of empirical complex metagenomes.
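The observed-to-expected comparison described in this report can be pictured with a small sketch. The code below is not the report's software; the function name, the dictionary layout, and the toy counts are all hypothetical and serve only to show how a per-species ratio of observed PML-metagenome matches to pre-calculated PML-genome matches could be tabulated.

```python
def observed_to_expected(observed, expected):
    """Ratio of observed to expected marker matches for each species.

    observed: species -> number of PML markers matched by metagenome reads
    expected: species -> number of PML markers matched by that species' genome
    Species with no expected matches are skipped (ratio undefined).
    """
    ratios = {}
    for species, expected_count in expected.items():
        if expected_count > 0:
            ratios[species] = observed.get(species, 0) / expected_count
    return ratios


# Invented toy counts, for illustration only.
expected = {"species_A": 100, "species_B": 200, "species_C": 0}
observed = {"species_A": 30, "species_B": 1}

ratios = observed_to_expected(observed, expected)
best = max(ratios, key=ratios.get)  # species with the highest ratio
```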
Regular Pentagons and the Fibonacci Sequence.
ERIC Educational Resources Information Center
French, Doug
1989-01-01
Illustrates how to draw a regular pentagon. Shows the sequence of a succession of regular pentagons formed by extending the sides. Calculates the general formula of the Lucas and Fibonacci sequences. Presents a regular icosahedron as an example of the golden ratio. (YP)
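The general formulas referred to above are usually written in Binet form, F_n = (phi^n - psi^n)/sqrt(5) and L_n = phi^n + psi^n, where phi = (1 + sqrt(5))/2 is the golden ratio and psi = (1 - sqrt(5))/2. The sketch below checks these closed forms against the defining recurrences; it is an illustration of the standard identities, not a reproduction of the article's derivation, and the function names are hypothetical.

```python
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio
PSI = (1.0 - math.sqrt(5.0)) / 2.0  # conjugate root of x^2 = x + 1


def fibonacci_closed(n):
    """Binet formula for F_n (floating point, reliable for small n)."""
    return round((PHI ** n - PSI ** n) / math.sqrt(5.0))


def lucas_closed(n):
    """Closed form for the Lucas number L_n (reliable for small n)."""
    return round(PHI ** n + PSI ** n)


def by_recurrence(first, second, n):
    """n-th term of the sequence x_{k+2} = x_{k+1} + x_k with given seeds."""
    a, b = first, second
    for _ in range(n):
        a, b = b, a + b
    return a


# Fibonacci starts 0, 1; Lucas starts 2, 1.
fib_ok = all(fibonacci_closed(n) == by_recurrence(0, 1, n) for n in range(30))
lucas_ok = all(lucas_closed(n) == by_recurrence(2, 1, n) for n in range(30))

# In a regular pentagon the diagonal-to-side ratio is 2*cos(36 degrees) = phi.
diagonal_over_side = 2.0 * math.cos(math.pi / 5.0)
```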
1998-12-01
Type II restriction enzymes, such as Eco R1 endonuclease, present a unique advantage for the study of sequence-specific recognition because they leave a record of where they have been in the form of the cleaved ends of the DNA sites where they were bound. The differential behavior of a sequence-specific protein at sites of differing base sequence is the essence of the sequence-specificity; the core question is how these proteins discriminate between different DNA sequences, especially when the two sequences are very similar. Principal Investigator: Dan Carter/New Century Pharmaceuticals
Protein Crystal Eco R1 Endonulease-DNA Complex
NASA Technical Reports Server (NTRS)
1998-01-01
Type II restriction enzymes, such as Eco R1 endonuclease, present a unique advantage for the study of sequence-specific recognition because they leave a record of where they have been in the form of the cleaved ends of the DNA sites where they were bound. The differential behavior of a sequence-specific protein at sites of differing base sequence is the essence of the sequence-specificity; the core question is how these proteins discriminate between different DNA sequences, especially when the two sequences are very similar. Principal Investigator: Dan Carter/New Century Pharmaceuticals
Pancoska, Petr; Moravek, Zdenek; Moll, Ute M
2004-01-01
Nucleic acids are molecules of choice for both established and emerging nanoscale technologies. These technologies benefit from large functional densities of 'DNA processing elements' that can be readily manufactured. To achieve the desired functionality, polynucleotide sequences are currently designed by a process that involves tedious and laborious filtering of potential candidates against a series of requirements and parameters. Here, we present a completely novel methodology for the rapid rational design of large sets of DNA sequences. This method allows for the direct implementation of very complex and detailed requirements for the generated sequences, thus avoiding 'brute force' filtering. At the same time, these sequences have narrow distributions of melting temperatures. The molecular part of the design process can be done without computer assistance, using an efficient 'human engineering' approach by drawing a single blueprint graph that represents all generated sequences. Moreover, the method eliminates the necessity for extensive thermodynamic calculations. Melting temperature can be calculated only once (or not at all). In addition, the isostability of the sequences is independent of the selection of a particular set of thermodynamic parameters. Applications are presented for DNA sequence designs for microarrays, universal microarray zip sequences and electron transfer experiments.
Quantiprot - a Python package for quantitative analysis of protein sequences.
Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold
2017-07-17
The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates the distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by the model can be compared to actually observed sequences.
Comparison and evaluation of two exome capture kits and sequencing platforms for variant calling.
Zhang, Guoqiang; Wang, Jianfeng; Yang, Jin; Li, Wenjie; Deng, Yutian; Li, Jing; Huang, Jun; Hu, Songnian; Zhang, Bing
2015-08-05
To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variants of target genomic regions at low cost. Ion Proton, the latest updated semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole exome sequencing that achieves a rapid turnaround time. However, few studies have comprehensively compared and evaluated the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. The Ion Proton sequencer combined with the Ion TargetSeq Exome Enrichment Kit makes up TargetSeq-Proton, whereas SureSelect-HiSeq is based on the Agilent SureSelect Human All Exon v4 Kit and the HiSeq 2000 sequencer. Here, we sequenced exonic DNA from four human blood samples using both TargetSeq-Proton and SureSelect-HiSeq. We then called variants in the exonic regions that overlapped between the two exome capture kits (33.6 Mb). The rates of shared variant loci called by the two sequencing platforms ranged from 68.0 to 75.3% in the four samples, whereas the concordance of co-detected variant loci reached 99%. Sanger sequencing validation revealed that the validated rate of concordant single nucleotide polymorphisms (SNPs) (91.5%) was higher than that of the SNPs specific to TargetSeq-Proton (60.0%) or specific to SureSelect-HiSeq (88.3%). With regard to 1-bp small insertions and deletions (InDels), the Sanger sequencing validated rates of concordant variants (100.0%) and SureSelect-HiSeq-specific variants (89.6%) were higher than that of TargetSeq-Proton-specific variants (15.8%). In the sequencing of exonic regions, combining the two sequencing strategies (SureSelect-HiSeq and TargetSeq-Proton) increased the variant calling specificity for concordant variant loci and the sensitivity for variant loci called by any one platform. 
However, for the sequencing of platform-specific variants, the accuracy of variant calling by HiSeq 2000 was higher than that of Ion Proton, specifically for the InDel detection. Moreover, the variant calling software also influences the detection of SNPs and, specifically, InDels in Ion Proton exome sequencing.
Niv, Masha Y.; Skrabanek, Lucy; Roberts, Richard J.; Scheraga, Harold A.; Weinstein, Harel
2008-01-01
Restriction endonucleases (REases) are DNA-cleaving enzymes that have become indispensable tools in molecular biology. Type II REases are highly divergent in sequence despite their common structural core, function and, in some cases, common specificities towards DNA sequences. This makes it difficult to identify and classify them functionally based on sequence, and has hampered specificity-engineering efforts. Here, we define novel REase sequence motifs, which extend beyond the PD-(D/E)XK hallmark, and incorporate secondary structure information. The automated search using these motifs is carried out with a newly developed fast regular expression matching algorithm that accommodates long patterns with optional secondary structure constraints. Using this new tool, named Scan2S, motifs derived from REases with specificity towards GATC- and CCGG-containing DNA sequences successfully identify REases of the same specificity. Notably, some of these sequences are not identified by standard sequence detection tools. The new motifs highlight potential specificity-determining positions that do not fully overlap for the GATC- and the CCGG-recognizing REases and are candidates for specificity re-engineering. PMID:17972284
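The idea of a regular-expression motif scan with an optional secondary-structure constraint can be illustrated with a minimal sketch. This is not the Scan2S algorithm: it uses Python's standard `re` module, the spacer range of 5 to 30 residues is an assumed value chosen only for illustration, and the function name `scan_motif` and the one-letter secondary-structure string (H, E, C) are hypothetical.

```python
import re

# Illustrative PD-(D/E)XK-like pattern. The 5-30 residue spacer between
# "PD" and "(D/E)XK" is an assumption, not a value taken from the abstract.
MOTIF = re.compile(r"PD.{5,30}?[DE].K")


def scan_motif(sequence, ss=None, required_ss=None):
    """Return (start, end, match) tuples for motif hits in a protein sequence.

    If `ss` (a per-residue secondary-structure string of the same length)
    and `required_ss` (a single state such as "E") are both given, only hits
    whose first residue is in that state are kept.
    """
    if ss is not None and len(ss) != len(sequence):
        raise ValueError("ss must have the same length as sequence")
    hits = []
    for m in MOTIF.finditer(sequence):
        if ss is not None and required_ss is not None:
            if ss[m.start()] != required_ss:
                continue  # hit fails the secondary-structure constraint
        hits.append((m.start(), m.end(), m.group()))
    return hits


# Toy sequence and toy secondary-structure assignments (both hypothetical).
toy_seq = "MKPDAAAAAAEAKLL"
print(scan_motif(toy_seq))
print(scan_motif(toy_seq, ss="CCEEEEEEEEHHHHH", required_ss="E"))
```

Filtering hits after the sequence match keeps the sketch short; a dedicated tool would more likely fold the structural constraint into the matching step itself.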
mRNA stability in mammalian cells.
Ross, J
1995-01-01
This review concerns how cytoplasmic mRNA half-lives are regulated and how mRNA decay rates influence gene expression. mRNA stability influences gene expression in virtually all organisms, from bacteria to mammals, and the abundance of a particular mRNA can fluctuate manyfold following a change in the mRNA half-life, without any change in transcription. The processes that regulate mRNA half-lives can, in turn, affect how cells grow, differentiate, and respond to their environment. Three major questions are addressed. Which sequences in mRNAs determine their half-lives? Which enzymes degrade mRNAs? Which (trans-acting) factors regulate mRNA stability, and how do they function? The following specific topics are discussed: techniques for measuring eukaryotic mRNA stability and for calculating decay constants, mRNA decay pathways, mRNases, proteins that bind to sequences shared among many mRNAs [like poly(A)- and AU-rich-binding proteins] and proteins that bind to specific mRNAs (like the c-myc coding-region determinant-binding protein), how environmental factors like hormones and growth factors affect mRNA stability, and how translation and mRNA stability are linked. Some perspectives and predictions for future research directions are summarized at the end. PMID:7565413
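The relationship between a decay constant and a half-life can be sketched in a few lines, assuming simple first-order decay, N(t) = N0·exp(−kt), so that t½ = ln 2 / k. The function names and the time-course values below are hypothetical and are not taken from the review.

```python
import math


def decay_constant(times, levels):
    """Estimate k by least-squares regression of ln(level) on time.

    Assumes first-order decay; `levels` must all be positive.
    """
    logs = [math.log(x) for x in levels]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxy / sxx  # slope of ln(level) vs time is -k


def half_life(k):
    """Half-life for a first-order decay constant k."""
    return math.log(2) / k


# Hypothetical mRNA levels (arbitrary units) at 0, 1, 2 and 3 hours.
times = [0.0, 1.0, 2.0, 3.0]
levels = [100.0 * math.exp(-0.5 * t) for t in times]
k = decay_constant(times, levels)
print(k, half_life(k))
```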
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2007-12-11
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
1999-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2002-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2010-11-09
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2000-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.
2005-04-05
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
A comprehensive quality control workflow for paired tumor-normal NGS experiments.
Schroeder, Christopher M; Hilke, Franz J; Löffler, Markus W; Bitzer, Michael; Lenz, Florian; Sturm, Marc
2017-06-01
Quality control (QC) is an important part of all NGS data analysis stages. Many available tools calculate QC metrics from different analysis steps of single sample experiments (raw reads, mapped reads and variant lists). Multi-sample experiments, such as sequencing of tumor-normal pairs, require additional QC metrics to ensure validity of results. These multi-sample QC metrics still lack standardization. We therefore suggest a new workflow for QC of DNA sequencing of tumor-normal pairs. With this workflow, well-known single-sample QC metrics and additional metrics specific for tumor-normal pairs can be calculated. The segmentation into different tools offers high flexibility and allows reuse for other purposes. All tools produce qcML, a generic XML format for QC of -omics experiments. qcML uses quality metrics defined in an ontology, which was adapted for NGS. All QC tools are implemented in C++ and run under both Linux and Windows. Plotting requires Python 2.7 and matplotlib. The software is available under the 'GNU General Public License version 2' as part of the ngs-bits project: https://github.com/imgag/ngs-bits. Contact: christopher.schroeder@med.uni-tuebingen.de. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
Determination of a Screening Metric for High Diversity DNA Libraries.
Guido, Nicholas J; Handerson, Steven; Joseph, Elaine M; Leake, Devin; Kung, Li A
2016-01-01
The fields of antibody engineering, enzyme optimization and pathway construction rely increasingly on screening complex variant DNA libraries. These highly diverse libraries allow researchers to sample a maximized sequence space and therefore more rapidly identify proteins with significantly improved activity. The current state of the art in synthetic biology allows for libraries with billions of variants, pushing the limits of researchers' ability to qualify libraries for screening by measuring the traditional quality metrics of fidelity and diversity of variants. Instead, when screening variant libraries, researchers typically use a generic, and often insufficient, oversampling rate based on a common rule-of-thumb. We have developed methods to calculate a library-specific oversampling metric, based on fidelity, diversity, and representation of variants, which informs researchers, prior to screening the library, of the amount of oversampling required to ensure that the desired fraction of variant molecules will be sampled. To derive this oversampling metric, we developed a novel alignment tool to efficiently measure frequency counts of individual nucleotide variant positions using next-generation sequencing data. Next, we apply a method based on the "coupon collector" problem from probability theory to construct a curve of upper bound estimates of the sampling size required for any desired variant coverage. The calculated oversampling metric will guide researchers to maximize their efficiency in using highly variant libraries.
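For intuition, the textbook coupon-collector calculation can be sketched as follows. This is only the idealised case in which every variant is equally represented, which is a simplifying assumption; the abstract's metric additionally accounts for fidelity and uneven representation, and none of that is modelled here. The function names are hypothetical.

```python
import math


def expected_draws(n_variants, coverage):
    """Expected number of random draws needed to observe a given fraction
    of n_variants distinct, equally likely variants.

    Collecting k distinct variants out of N takes on average
    sum_{i=0}^{k-1} N / (N - i) draws.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    k = math.ceil(coverage * n_variants)
    return sum(n_variants / (n_variants - i) for i in range(k))


def oversampling_factor(n_variants, coverage):
    """Expected draws expressed as a multiple of the library size."""
    return expected_draws(n_variants, coverage) / n_variants


# Hypothetical library of 10,000 equally represented variants.
for cov in (0.5, 0.9, 1.0):
    print(cov, round(oversampling_factor(10_000, cov), 2))
```

The explicit sum is linear in the library size, so for libraries with billions of variants a closed-form approximation based on harmonic numbers would be the more practical choice.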
Espinoza, S; Felter, A; Malinvaud, D; Badoual, C; Chatellier, G; Siauve, N; Halimi, P
2016-01-01
Warthin's tumor is the second most frequent benign tumor of the parotid gland, with no risk of malignant evolution. Surgery should therefore be avoided if the preoperative diagnosis is certain. The aim of the study was to assess the added value of a decisional algorithm for the preoperative diagnosis of Warthin's tumor. This retrospective IRB-approved study included 75 patients who underwent standardised MRI with conventional sequences (T1- and T2-weighted images, and T1 post-contrast sequences with fat saturation) and functional sequences: diffusion (b0, b1000) and perfusion MR. Two independent readers reviewed the images using the decisional algorithm. The conclusion of each reader was: the lesion is or is not a Warthin's tumor. The MRI conclusion was compared with histology or with cytology and follow-up. We calculated the Cohen's kappa coefficient between the two observers and the sensitivity and specificity of the algorithm-assisted reading for the diagnosis of Warthin's tumor. Among the 75 patients, histology (n=61) or cytology and follow-up (n=14) revealed 20 Warthin's tumors and 55 other tumors. Using the algorithm, sensitivity and specificity were 80-96% and 85-100%, respectively, for readers 1 and 2. The Cohen's kappa coefficient between the two observers was 0.79 (P<0.05) for the diagnosis of Warthin's tumor. Our decisional algorithm helps the preoperative diagnosis of Warthin's tumor. The specificity of the technique is sufficient to avoid surgery if a parotid gland tumor presents all the MRI characteristics of a Warthin's tumor. Copyright © 2014 Éditions françaises de radiologie. Published by Elsevier Masson SAS. All rights reserved.
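The three statistics named in this abstract follow standard definitions, sketched below for binary readings. The counts in the example are hypothetical and are not the study's data; the function names are likewise illustrative.

```python
def sensitivity(tp, fn):
    """Fraction of true cases that the reader called positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-cases that the reader called negative."""
    return tn / (tn + fp)


def cohen_kappa(both_pos, r1_only, r2_only, both_neg):
    """Cohen's kappa for two readers giving binary calls.

    both_pos: both readers positive; r1_only: only reader 1 positive;
    r2_only: only reader 2 positive; both_neg: both readers negative.
    """
    n = both_pos + r1_only + r2_only + both_neg
    p_observed = (both_pos + both_neg) / n
    r1_pos = both_pos + r1_only
    r2_pos = both_pos + r2_only
    # Agreement expected by chance from each reader's marginal rates.
    p_expected = (r1_pos * r2_pos + (n - r1_pos) * (n - r2_pos)) / n ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical counts, for illustration only.
print(sensitivity(tp=16, fn=4), specificity(tn=50, fp=5))
print(cohen_kappa(both_pos=20, r1_only=5, r2_only=10, both_neg=15))
```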
Gussakovsky, Daniel; Neustaeter, Haley; Spicer, Victor; Krokhin, Oleg V
2017-11-07
The development of a peptide retention prediction model for strong cation exchange (SCX) separation on a Polysulfoethyl A column is reported. Off-line 2D LC-MS/MS analysis (SCX-RPLC) of S. cerevisiae whole cell lysate was used to generate a retention dataset of ∼30 000 peptides, sufficient for identifying the major sequence-specific features of peptide retention mechanisms in SCX. In contrast to RPLC/hydrophilic interaction liquid chromatography (HILIC) separation modes, where retention is driven by hydrophobic/hydrophilic contributions of all individual residues, SCX interactions depend mainly on peptide charge (number of basic residues at acidic pH) and size. An additive model (incorporating the contributions of all 20 residues into the peptide retention) combined with a peptide length correction produces a prediction accuracy of R² = 0.976, significantly higher than the additive models for either HILIC or RPLC. Position-dependent effects on peptide retention for different residues were driven by the spatial orientation of tryptic peptides upon interaction with the negatively charged surface functional groups. The positively charged N-termini serve as a primary point of interaction. For example, basic residues (Arg, His, Lys) increase peptide retention when located closer to the N-terminus. We also found that hydrophobic interactions, which could lead to a mixed-mode separation mechanism, are largely suppressed at 20-30% of acetonitrile in the eluent. The accuracy of the final Sequence-Specific Retention Calculator (SSRCalc) SCX model (R² ≈ 0.99) exceeds all previously reported predictors for peptide LC separations. This also provides a solid platform for method development in 2D LC-MS protocols in proteomics and peptide retention prediction filtering of false positive identifications.
Zubkov, Mikhail; Stait-Gardner, Timothy; Price, William S
2014-06-01
Precise NMR diffusion measurements require detailed knowledge of the cumulative dephasing effect caused by the numerous gradient pulses present in most NMR pulse sequences. This effect, which ultimately manifests itself as the diffusion-related NMR signal attenuation, is usually described by the b-value or the b-matrix in the case of multidirectional diffusion weighting, the latter being common in diffusion-weighted NMR imaging. Neglecting some of the gradient pulses introduces an error in the calculated diffusion coefficient reaching in some cases 100% of the expected value. Therefore, ensuring the b-matrix calculation includes all the known gradient pulses leads to significant error reduction. Calculation of the b-matrix for simple gradient waveforms is rather straightforward, yet it grows cumbersome when complexly shaped and/or numerous gradient pulses are introduced. Making three broad assumptions about the gradient pulse arrangement in a sequence results in an efficient framework for calculation of b-matrices as well as providing some insight into optimal gradient pulse placement. The framework allows accounting for the diffusion-sensitising effect of complexly shaped gradient waveforms with modest computational time and power. This is achieved by using the b-matrix elements of the simple unmodified pulse sequence and minimising the integration of the complexly shaped gradient waveform in the modified sequence. Such re-evaluation of the b-matrix elements retains all the analytical relevance of the straightforward approach, yet at least halves the amount of symbolic integration required. The application of the framework is demonstrated with the evaluation of the expression describing the diffusion-sensitizing effect, caused by different bipolar gradient pulse modules. Copyright © 2014 Elsevier Inc. All rights reserved.
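As background for the simplest case, the b-value of a single pair of rectangular gradient pulses can be computed directly from its definition, b = γ² ∫ F(t)² dt with F(t) = ∫ G(t′) dt′, and compared with the well-known Stejskal-Tanner result b = γ²G²δ²(Δ − δ/3). The sketch below is a plain numerical integration under idealised assumptions (perfect rectangular pulses, effective gradient with the second lobe sign-reversed); it is not the framework proposed in the abstract, and the pulse parameters are hypothetical.

```python
GAMMA_1H = 2.675e8  # proton gyromagnetic ratio, rad s^-1 T^-1 (approximate)


def pgse_gradient(t, g, delta, big_delta):
    """Effective gradient (T/m) of an idealised rectangular pulse pair."""
    if 0.0 <= t < delta:
        return g
    if big_delta <= t < big_delta + delta:
        return -g
    return 0.0


def b_value_numeric(grad, t_end, n_steps=200_000):
    """Integrate gamma^2 * F(t)^2 over [0, t_end], F being the running
    integral of the effective gradient. Returns b in s/m^2."""
    dt = t_end / n_steps
    f_prev = 0.0
    total = 0.0
    for i in range(n_steps):
        # Gradient sampled at the midpoint of each step.
        f_next = f_prev + grad((i + 0.5) * dt) * dt
        # Trapezoidal rule on F^2.
        total += 0.5 * (f_prev ** 2 + f_next ** 2) * dt
        f_prev = f_next
    return GAMMA_1H ** 2 * total


def b_value_stejskal_tanner(g, delta, big_delta):
    """Closed-form b-value for the same rectangular pulse pair."""
    return GAMMA_1H ** 2 * g ** 2 * delta ** 2 * (big_delta - delta / 3.0)


if __name__ == "__main__":
    g, delta, big_delta = 0.1, 2e-3, 20e-3  # hypothetical: T/m, s, s
    numeric = b_value_numeric(
        lambda t: pgse_gradient(t, g, delta, big_delta), big_delta + delta
    )
    print(numeric, b_value_stejskal_tanner(g, delta, big_delta))
```

Any additional gradient pulse that the waveform function omits is silently left out of F(t), which is the kind of neglect the abstract warns can bias the calculated diffusion coefficient.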
GAMSOR: Gamma Source Preparation and DIF3D Flux Solution
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, M. A.; Lee, C. H.; Hill, R. N.
2017-06-28
Nuclear reactors that rely upon the fission reaction have two modes of thermal energy deposition in the reactor system: neutron absorption and gamma absorption. The gamma rays are typically generated by neutron capture reactions or during the fission process, which means the primary driver of energy production is of course the neutron interaction. In conventional reactor physics methods, the gamma heating component is ignored such that the gamma absorption is forced to occur at the gamma emission site. For experimental reactor systems like EBR-II and FFTF, the placement of structural pins and assemblies internal to the core leads to problems with power heating predictions because there is no fission power source internal to the assembly to dictate a spatial distribution of the power. As part of the EBR-II support work in the 1980s, the GAMSOR code was developed to assist analysts in calculating the gamma heating. The GAMSOR code is a modified version of DIF3D and actually functions within a sequence of DIF3D calculations. The gamma flux in a conventional fission reactor system does not perturb the neutron flux and thus the gamma flux calculation can be cast as a fixed source problem given a solution to the steady state neutron flux equation. This leads to a sequence of DIF3D calculations, called the GAMSOR sequence, which involves solving the neutron flux, then the gamma flux, and then combining the results to do a summary edit. In this manuscript, we go over the GAMSOR code and detail how it is put together and functions. We also discuss how to set up the GAMSOR sequence and input for each DIF3D calculation in the GAMSOR sequence.
Measuring the labeling efficiency of pseudocontinuous arterial spin labeling.
Chen, Zhensen; Zhang, Xingxing; Yuan, Chun; Zhao, Xihai; van Osch, Matthias J P
2017-05-01
Optimization and validation of a sequence for measuring the labeling efficiency of pseudocontinuous arterial spin labeling (pCASL) perfusion MRI. The proposed sequence consists of a labeling module and a single slice Look-Locker echo planar imaging readout. A model-based algorithm was used to calculate labeling efficiency from the signal acquired from the main brain-feeding arteries. Stability of the labeling efficiency measurement was evaluated with regard to the use of cardiac triggering, flow compensation and vein signal suppression. Accuracy of the measurement was assessed by comparing the measured labeling efficiency to mean brain pCASL signal intensity over a wide range of flip angles as applied in the pCASL labeling. Simulations show that the proposed algorithm can effectively calculate labeling efficiency when correcting for T1 relaxation of the blood spins. Use of cardiac triggering and vein signal suppression improved stability of the labeling efficiency measurement, while flow compensation resulted in little improvement. The measured labeling efficiency was found to be linearly (R = 0.973; P < 0.001) related to brain pCASL signal intensity over a wide range of pCASL flip angles. The optimized labeling efficiency sequence provides robust artery-specific labeling efficiency measurement within a short acquisition time (∼30 s), thereby enabling improved accuracy of pCASL CBF quantification. Magn Reson Med 77:1841-1852, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Huang, Chien-Hsun; Chang, Mu-Tzu; Huang, Mu-Chiou; Wang, Li-Tin; Huang, Lina; Lee, Fwu-Ling
2012-10-01
To clearly identify specific species and subspecies of the Lactobacillus acidophilus group using phenotypic and genotypic (16S rDNA sequence analysis) techniques alone is difficult. The aim of this study was to use the recA gene for species discrimination in the L. acidophilus group, as well as to develop a species-specific primer and single nucleotide polymorphism primer based on the recA gene sequence for species and subspecies identification. The average sequence similarity for the recA gene among type strains was 80.0%, and most members of the L. acidophilus group could be clearly distinguished. The species-specific primer was designed according to the recA gene sequencing, which was employed for polymerase chain reaction with the template DNA of Lactobacillus strains. A single 231-bp species-specific band was found only in L. delbrueckii. A SNaPshot mini-sequencing assay using recA as a target gene was also developed. The specificity of the mini-sequencing assay was evaluated using 31 strains of L. delbrueckii species and was able to unambiguously discriminate strains belonging to the subspecies L. delbrueckii subsp. bulgaricus. The phylogenetic relationships of most strains in the L. acidophilus group can be resolved using recA gene sequencing, and a novel method to identify the species and subspecies of the L. delbrueckii and L. delbrueckii subsp. bulgaricus was developed by species-specific polymerase chain reaction combined with SNaPshot mini-sequencing. Copyright © 2012 Society of Chemical Industry.
A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.
Álvarez-Martos, Isabel; Ferapontova, Elena E
2017-08-05
The unique specificity of aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases with their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence does, and thus is not an aptamer and cannot be used for either in vivo or in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.
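At the level of the written sequence, the "replacement of RNA bases with their DNA analogues" amounts to substituting uracil with thymine, as in the minimal sketch below. The input is a short hypothetical string, not the 57-nucleotide aptamer, and the function name is illustrative. The string operation says nothing about the ribose-to-deoxyribose change or about folding, which is consistent with the abstract's finding that the converted sequence does not retain specific binding.

```python
def rna_to_dna(rna):
    """Return the DNA-alphabet version of an RNA sequence (U -> T)."""
    return rna.translate(str.maketrans("Uu", "Tt"))


# Hypothetical fragment, for illustration only.
print(rna_to_dna("GUCUCUGUGUGCGCCAGAGA"))
```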
Detection of the first G6P[14] human rotavirus strain in an infant with diarrhoea in Ghana.
Damanka, Susan; Lartey, Belinda; Agbemabiese, Chantal; Dennis, Francis E; Adiku, Theophilus; Nyarko, Kofi; Ofori, Michael; Armah, George E
2016-11-10
Rotaviruses with G6P[14] specificity are mostly isolated in cattle and have been established as a rare cause of gastroenteritis in humans. This study reports the first detection of a G6P[14] rotavirus strain in Ghana, from the stool of an infant during a hospital-based rotavirus surveillance study in 2010. Viral RNA was extracted and rotavirus VP7 and VP4 genes amplified by one-step RT-PCR using gene-specific primers. The DNA was purified and sequenced, and genotypes were determined using BLAST and RotaC v2.0. A phylogenetic tree was constructed using the maximum likelihood method in MEGA v6.06 software and statistically supported by bootstrapping with 1000 replicates. Phylogenetic distances were calculated using the Kimura 2-parameter model. The study strain, GHA-M0084/2010, was characterised as G6P[14]. The VP7 gene of the Ghanaian strain clustered in G6 lineage-III together with artiodactyl and human rotavirus (HRV) strains. It exhibited the highest nucleotide (88.1%) and amino acid (86.9%) sequence identity with Belgian HRV strain B10925. The VP8* fragment of the VP4 gene was closely related to HRV strains detected in France, Italy, Spain and Belgium. It exhibited the strongest nucleotide sequence identity (87.9%) with HRV strains PA169 and PR/1300 (Italy) and the strongest amino acid sequence identity (89.3%) with HRV strain R2775/FRA/07 (France). The study reports the first detection of a G6P[14] HRV strain in an infant in Ghana. The detection of G6P[14], an unusual strain, before vaccine introduction in Ghana suggests a potential compromise of vaccine effectiveness and indicates the necessity for continuous surveillance in the post-vaccine era.
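The Kimura 2-parameter distance mentioned here has a closed form, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions. A minimal sketch for two pre-aligned sequences follows; it skips gapped or ambiguous sites, which is an assumed convention rather than a description of MEGA's settings, and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


# Toy alignment with one transition (A/G) and one transversion (C/A).
print(k2p_distance("ACGTACGTAC", "GCGTACGTAA"))
```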
Baron, S F; Franklund, C V; Hylemon, P B
1991-01-01
Southern blot analysis indicated that the gene encoding the constitutive, NADP-linked bile acid 7 alpha-hydroxysteroid dehydrogenase of Eubacterium sp. strain VPI 12708 was located on a 6.5-kb EcoRI fragment of the chromosomal DNA. This fragment was cloned into bacteriophage lambda gt11, and a 2.9-kb piece of this insert was subcloned into pUC19, yielding the recombinant plasmid pBH51. DNA sequence analysis of the 7 alpha-hydroxysteroid dehydrogenase gene in pBH51 revealed a 798-bp open reading frame, coding for a protein with a calculated molecular weight of 28,500. A putative promoter sequence and ribosome binding site were identified. The 7 alpha-hydroxysteroid dehydrogenase mRNA transcript in Eubacterium sp. strain VPI 12708 was about 0.94 kb in length, suggesting that it is monocistronic. An Escherichia coli DH5 alpha transformant harboring pBH51 had approximately 30-fold greater levels of 7 alpha-hydroxysteroid dehydrogenase mRNA, immunoreactive protein, and specific activity than Eubacterium sp. strain VPI 12708. The 7 alpha-hydroxysteroid dehydrogenase purified from the pBH51 transformant was similar in subunit molecular weight, specific activity, and kinetic properties to that from Eubacterium sp. strain VPI 12708, and it reacted with antiserum raised against the authentic enzyme on Western immunoblots. Alignment of the amino acid sequence of the 7 alpha-hydroxysteroid dehydrogenase with those of 10 other pyridine nucleotide-linked alcohol/polyol dehydrogenases revealed six conserved amino acid residues in the N-terminal regions thought to function in coenzyme binding. PMID:1856160
Serial Reaction Time Learning in Preschool- and School-Age Children.
ERIC Educational Resources Information Center
Thomas, Kathleen M.; Nelson, Charles A.
2001-01-01
Two experiments assessed visuomotor sequence learning in 4- to 10-year-olds using a serial reaction time (SRT) task with random and sequenced trials. Found that children demonstrated sequence-specific decreases in RT. Participants with explicit awareness of the sequence at the session's end showed larger sequence-specific RT decrements than…
Process of labeling specific chromosomes using recombinant repetitive DNA
Moyzis, R.K.; Meyne, J.
1988-02-12
Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.
Shih, Arthur Chun-Chieh; Lee, DT; Peng, Chin-Lin; Wu, Yu-Wei
2007-01-01
Background: When aligning several hundreds or thousands of sequences, such as epidemic virus sequences or homologous/orthologous sequences of some big gene families, to reconstruct the epidemiological history or their phylogenies, how to analyze and visualize the alignment results of many sequences has become a new challenge for computational biologists. Although there are several tools available for visualization of very long sequence alignments, few of them are applicable to the alignments of many sequences. Results: A multiple-logo alignment visualization tool, called Phylo-mLogo, is presented in this paper. Phylo-mLogo calculates the variabilities and homogeneities of alignment sequences by base frequencies or entropies. Different from the traditional representations of sequence logos, Phylo-mLogo not only displays the global logo patterns of the whole alignment of multiple sequences, but also demonstrates their local homologous logos for each clade hierarchically. In addition, Phylo-mLogo also allows the user to focus only on the analysis of some important, structurally or functionally constrained sites in the alignment selected by the user or by built-in automatic calculation. Conclusion: With Phylo-mLogo, the user can symbolically and hierarchically visualize hundreds of aligned sequences simultaneously and easily check the changes of their amino acid sites when analyzing many homologous/orthologous or influenza virus sequences. More information of Phylo-mLogo can be found at URL . PMID:17319966
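Per-column variability "by base frequencies or entropies" can be illustrated with Shannon entropy computed over each alignment column. The sketch below is a generic calculation, not Phylo-mLogo's implementation; ignoring gap characters is an assumed convention, and the toy alignment is hypothetical.

```python
import math
from collections import Counter


def column_entropies(alignment, gap="-"):
    """Shannon entropy (bits) of each column of an alignment.

    `alignment` is a list of equal-length strings. Gap characters are
    ignored; a column containing only gaps gets entropy 0.0.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows must have the same length")
    entropies = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != gap)
        total = sum(counts.values())
        h = 0.0
        for n in counts.values():
            freq = n / total
            h -= freq * math.log2(freq)
        entropies.append(h)
    return entropies


# Toy alignment: first column fully conserved, second maximally variable.
print(column_entropies(["AC", "AG", "AT", "AA"]))
```

Running the same function on the rows belonging to a single clade would give the kind of local, per-clade profile the abstract contrasts with the global logo.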
Wavelengths and energy levels for the Zn I isoelectronic sequence Sn²⁰⁺ through U⁶²⁺
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, C.M.; Seely, J.F.; Kania, D.R.
Calculated and experimentally determined transition energies are presented for the Zn I isoelectronic sequence for the elements with atomic numbers Z = 50-92. The excitation energies were calculated for the 84 levels belonging to the 10 configurations of the type 4l4l′ by using the Hebrew University Lawrence Livermore Atomic Code (HULLAC). The analysis of the energy level structure along the isoelectronic sequence accounted for 20 avoided level crossings. The differences between the calculated and experimental transition energies were determined for 16 transitions, and the excitation energies of the levels belonging to the 4s4p, 4p², 4s4d, and 4s4f configurations were derived from the semiempirically corrected transition energies. 16 refs., 3 figs., 1 tab.
Experimental and analytical study of high velocity impact on Kevlar/Epoxy composite plates
NASA Astrophysics Data System (ADS)
Sikarwar, Rahul S.; Velmurugan, Raman; Madhu, Velmuri
2012-12-01
In the present study, the impact behavior of Kevlar/Epoxy composite plates has been studied experimentally for different thicknesses and lay-up sequences and compared with analytical results. The effects of thickness and lay-up sequence on energy absorbing capacity have been studied for high velocity impact. Four lay-up sequences and four thickness values have been considered. Initial velocities and residual velocities are measured experimentally to calculate the energy absorbing capacity of the laminates. The residual velocity of the projectile and the energy absorbed by the laminates are calculated analytically. The results obtained from the analytical study are found to be in good agreement with the experimental results. It is observed from the study that the 0/90 lay-up sequence is most effective for impact resistance. Delamination area is maximum on the back side of the plate for all thickness values and lay-up sequences. The delamination area on the back is maximum for 0/90/45/-45 laminates compared to other lay-up sequences.
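The energy-absorption calculation from measured velocities can be sketched as the projectile's loss of kinetic energy, E = ½·m·(v_i² − v_r²). This assumes a constant projectile mass and attributes all of the lost kinetic energy to the laminate; the mass and velocities below are hypothetical, not values from the study.

```python
def absorbed_energy(mass_kg, v_initial, v_residual):
    """Kinetic energy (J) lost by the projectile while perforating the plate."""
    if v_residual > v_initial:
        raise ValueError("residual velocity cannot exceed initial velocity")
    return 0.5 * mass_kg * (v_initial ** 2 - v_residual ** 2)


# Hypothetical shot: 10 g projectile, 300 m/s in, 200 m/s out.
print(absorbed_energy(0.010, 300.0, 200.0))
```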
Coletta, Andrea; Desideri, Alessandro
2013-01-01
Camptothecin (CPT) is a topoisomerase IB (TopIB) selective inhibitor whose derivatives are currently used in cancer therapy. TopIB cleaves DNA at any sequence, but in the presence of CPT the only stabilized protein–DNA covalent complex is the one having a thymine in position −1 with respect to the cleavage site. A metadynamics simulation of two TopIB–DNA–CPT ternary complexes differing for the presence of a thymine or a cytosine in position −1 indicates the occurrence of two different drug’s unbinding pathways. The free-energy difference between the bound state and the transition state is large when a thymine is present in position −1 and is strongly reduced in presence of a cytosine, in line with the different drug stabilization properties of the two systems. Such a difference is strictly related to the changes in the hydrogen bond network between the protein, the DNA and the drug in the two systems, indicating a direct role of the protein in determining the specificity of the cleavage site sequence stabilized by the CPT. Calculations carried out in presence of one compound of the indenoisoquinoline family (NSC314622) indicate a comparable energy difference between the bound and the transition state independently of the presence of a thymine or a cytosine in position −1, in line with the experimental results. PMID:24003027
Isvoran, Adriana; Craciun, Dana; Martiny, Virginie; Sperandio, Olivier; Miteva, Maria A
2013-06-14
Protein-Protein Interactions (PPIs) are key for many cellular processes. The characterization of PPI interfaces and the prediction of putative ligand binding sites and hot spot residues are essential to design efficient small-molecule modulators of PPI. Terphenyl and its derivatives are small organic molecules known to mimic one face of protein-binding alpha-helical peptides. In this work we focus on several PPIs mediated by alpha-helical peptides. We performed computational sequence- and structure-based analyses in order to evaluate several key physicochemical and surface properties of proteins known to interact with alpha-helical peptides and/or terphenyl and its derivatives. Sequence-based analysis revealed low sequence identity between some of the analyzed proteins binding alpha-helical peptides. Structure-based analysis was performed to calculate the volume, the fractal dimension roughness and the hydrophobicity of the binding regions. Besides the overall hydrophobic character of the binding pockets, some specificities were detected. We showed that the hydrophobicity is not uniformly distributed in different alpha-helix binding pockets that can help to identify key hydrophobic hot spots. The presence of hydrophobic cavities at the protein surface with a more complex shape than the entire protein surface seems to be an important property related to the ability of proteins to bind alpha-helical peptides and low molecular weight mimetics. Characterization of similarities and specificities of PPI binding sites can be helpful for further development of small molecules targeting alpha-helix binding proteins.
Accuracy of abdominal auscultation for bowel obstruction.
Breum, Birger Michael; Rud, Bo; Kirkegaard, Thomas; Nordentoft, Tyge
2015-09-14
To investigate the accuracy and inter-observer variation of bowel sound assessment in patients with clinically suspected bowel obstruction. Bowel sounds were recorded in patients with suspected bowel obstruction using a Littmann® Electronic Stethoscope. The recordings were processed to yield 25-s sound sequences in random order on PCs. Observers, recruited from doctors within the department, classified the sound sequences as either normal or pathological. The reference tests for bowel obstruction were intraoperative and endoscopic findings and clinical follow up. Sensitivity and specificity were calculated for each observer and compared between junior and senior doctors. Interobserver variation was measured using the Kappa statistic. Bowel sound sequences from 98 patients were assessed by 53 (33 junior and 20 senior) doctors. Laparotomy was performed in 47 patients, 35 of whom had bowel obstruction. Two patients underwent colorectal stenting due to large bowel obstruction. The median sensitivity and specificity were 0.42 (range: 0.19-0.64) and 0.78 (range: 0.35-0.98), respectively. There was no significant difference in accuracy between junior and senior doctors. The median frequency with which doctors classified bowel sounds as abnormal did not differ significantly between patients with and without bowel obstruction (26% vs 23%, P = 0.08). The 53 doctors made up 1378 unique pairs and the median Kappa value was 0.29 (range: -0.15-0.66). Accuracy and inter-observer agreement were generally low. Clinical decisions in patients with possible bowel obstruction should not be based on auscultatory assessment of bowel sounds.
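The pair count and the agreement statistic can be checked with a short sketch. It assumes that the Kappa statistic is Cohen's kappa computed per pair of observers on binary ratings (the abstract does not name the variant); the function name and the example ratings are hypothetical.

```python
from math import comb


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two observers' binary ratings (0 = normal, 1 = pathological)."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("both observers must rate the same non-empty set of recordings")
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    p_a = sum(ratings_a) / n  # fraction rated pathological by observer A
    p_b = sum(ratings_b) / n  # fraction rated pathological by observer B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:
        return 1.0  # both observers gave one identical constant rating
    return (observed - expected) / (1 - expected)


# Number of unique observer pairs among 53 doctors: 53 * 52 / 2.
n_pairs = comb(53, 2)
print(n_pairs)

# Hypothetical ratings for four recordings.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 1, 0]))
```

A kappa of 0 means agreement no better than chance, which puts the reported median of 0.29 in context.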
Gao, Xiang; Lin, Huaiying; Revanna, Kashi; Dong, Qunfeng
2017-05-10
Species-level classification for 16S rRNA gene sequences remains a serious challenge for microbiome researchers, because existing taxonomic classification tools for 16S rRNA gene sequences either do not provide species-level classification, or their classification results are unreliable. The unreliable results are due to the limitations in the existing methods which either lack solid probabilistic-based criteria to evaluate the confidence of their taxonomic assignments, or use nucleotide k-mer frequency as the proxy for sequence similarity measurement. We have developed a method that shows significantly improved species-level classification results over existing methods. Our method calculates true sequence similarity between query sequences and database hits using pairwise sequence alignment. Taxonomic classifications are assigned from the species to the phylum levels based on the lowest common ancestors of multiple database hits for each query sequence, and further classification reliabilities are evaluated by bootstrap confidence scores. The novelty of our method is that the contribution of each database hit to the taxonomic assignment of the query sequence is weighted by a Bayesian posterior probability based upon the degree of sequence similarity of the database hit to the query sequence. Our method does not need any training datasets specific for different taxonomic groups. Instead only a reference database is required for aligning to the query sequences, making our method easily applicable for different regions of the 16S rRNA gene or other phylogenetic marker genes. Reliable species-level classification for 16S rRNA or other phylogenetic marker genes is critical for microbiome research. Our software shows significantly higher classification accuracy than the existing tools and we provide probabilistic-based confidence scores to evaluate the reliability of our taxonomic classification assignments based on multiple database matches to query sequences. 
Despite its higher computational costs, our method is still suitable for analyzing large-scale microbiome datasets for practical purposes. Furthermore, our method can be applied for taxonomic classification of any phylogenetic marker gene sequences. Our software, called BLCA, is freely available at https://github.com/qunfengdong/BLCA .
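The idea of weighting each database hit by its similarity to the query can be sketched as follows. This is a deliberate simplification and not the BLCA implementation: it assumes a uniform prior and a weight directly proportional to a positive similarity score, and it omits the lowest-common-ancestor step and the bootstrap confidence scores. The function name and the example hits are hypothetical.

```python
from collections import defaultdict


def weighted_taxon_support(hits):
    """Map each taxon to its share of the total similarity weight.

    hits: list of (taxon, similarity_score) pairs with positive scores.
    """
    total = sum(score for _, score in hits)
    if total <= 0:
        raise ValueError("similarity scores must sum to a positive value")
    support = defaultdict(float)
    for taxon, score in hits:
        support[taxon] += score / total  # normalised contribution of this hit
    return dict(support)


# Hypothetical hits for one query sequence.
hits = [("species_A", 3.0), ("species_A", 1.0), ("species_B", 4.0)]
print(weighted_taxon_support(hits))
```

In this toy case the two species receive equal support, so an assignment at the species level would be ambiguous and a higher rank would be the safer call.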
Verstappen, Koen M; Huijbregts, Loes; Spaninks, Mirlin; Wagenaar, Jaap A; Fluit, Ad C; Duim, Birgitta
2017-01-01
Staphylococcus pseudintermedius is an opportunistic pathogen in dogs and cats and occasionally causes infections in humans. S. pseudintermedius is often resistant to multiple classes of antimicrobials. It requires reliable detection so that it is not misidentified as S. aureus. Phenotypic and currently used molecular-based diagnostic assays lack specificity or are labour-intensive, relying on multiplex PCR or nucleic acid sequencing. The aim of this study was to identify a specific target for real-time PCR by comparing whole genome sequences of S. pseudintermedius and non-pseudintermedius. Genome sequences were downloaded from public repositories and supplemented by isolates that were sequenced in this study. A Perl script was written that analysed 300-nt fragments from a reference genome sequence of S. pseudintermedius and checked if this sequence was present in other S. pseudintermedius genomes (n = 74) and non-pseudintermedius genomes (n = 138). Six sequences specific for S. pseudintermedius were identified (sequence length between 300-500 nt). One sequence, which was located in the spsJ gene, was used to develop primers and a probe. The real-time PCR showed 100% specificity when testing for S. pseudintermedius isolates (n = 54), and eight other staphylococcal species (n = 43). In conclusion, a novel approach by comparing whole genome sequences identified a sequence that is specific for S. pseudintermedius and provided a real-time PCR target for rapid and reliable detection of S. pseudintermedius.
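The fragment screen described above can be sketched in a few lines. The original script was written in Perl and its matching criteria are not given in the abstract; this Python sketch assumes exact substring matching on one strand only, which is stricter than a similarity search would be. The function name and the toy sequences are hypothetical.

```python
def specific_fragments(reference, targets, non_targets, size=300, step=300):
    """Return (start, fragment) pairs found in every target genome and in no non-target genome."""
    found = []
    for start in range(0, len(reference) - size + 1, step):
        fragment = reference[start:start + size]
        in_all_targets = all(fragment in genome for genome in targets)
        in_any_non_target = any(fragment in genome for genome in non_targets)
        if in_all_targets and not in_any_non_target:
            found.append((start, fragment))
    return found


# Toy example with 4-nt fragments instead of 300-nt fragments.
reference = "AAAACCCCGGGG"
targets = ["TTAAAATTCCCC", "CCCCAAAA"]
non_targets = ["GGCCCCGG"]
print(specific_fragments(reference, targets, non_targets, size=4, step=4))
```

A real screen would also need to handle the reverse complement and tolerate mismatches, for example by using an alignment tool rather than exact matching.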
Periasamy, Vengadesh; Rizan, Nastaran; Al-Ta’ii, Hassan Maktuff Jaber; Tan, Yee Shin; Tajuddin, Hairul Annuar; Iwamoto, Mitsumasa
2016-01-01
The discovery of semiconducting behavior of deoxyribonucleic acid (DNA) has resulted in a large number of literatures in the study of DNA electronics. Sequence-specific electronic response provides a platform towards understanding charge transfer mechanism and therefore the electronic properties of DNA. It is possible to utilize these characteristic properties to identify/detect DNA. In this current work, we demonstrate a novel method of DNA-based identification of basidiomycetes using current-voltage (I-V) profiles obtained from DNA-specific Schottky barrier diodes. Electronic properties such as ideality factor, barrier height, shunt resistance, series resistance, turn-on voltage, knee-voltage, breakdown voltage and breakdown current were calculated and used to quantify the identification process as compared to morphological and molecular characterization techniques. The use of these techniques is necessary in order to study biodiversity, but sometimes it can be misleading and unreliable and is not sufficiently useful for the identification of fungi genera. Many of these methods have failed when it comes to identification of closely related species of certain genus like Pleurotus. Our electronics profiles, both in the negative and positive bias regions were however found to be highly characteristic according to the base-pair sequences. We believe that this simple, low-cost and practical method could be useful towards identifying and detecting DNA in biotechnology and pathology. PMID:27435636
NASA Astrophysics Data System (ADS)
Periasamy, Vengadesh; Rizan, Nastaran; Al-Ta'Ii, Hassan Maktuff Jaber; Tan, Yee Shin; Tajuddin, Hairul Annuar; Iwamoto, Mitsumasa
2016-07-01
The discovery of semiconducting behavior of deoxyribonucleic acid (DNA) has resulted in a large number of literatures in the study of DNA electronics. Sequence-specific electronic response provides a platform towards understanding charge transfer mechanism and therefore the electronic properties of DNA. It is possible to utilize these characteristic properties to identify/detect DNA. In this current work, we demonstrate a novel method of DNA-based identification of basidiomycetes using current-voltage (I-V) profiles obtained from DNA-specific Schottky barrier diodes. Electronic properties such as ideality factor, barrier height, shunt resistance, series resistance, turn-on voltage, knee-voltage, breakdown voltage and breakdown current were calculated and used to quantify the identification process as compared to morphological and molecular characterization techniques. The use of these techniques is necessary in order to study biodiversity, but sometimes it can be misleading and unreliable and is not sufficiently useful for the identification of fungi genera. Many of these methods have failed when it comes to identification of closely related species of certain genus like Pleurotus. Our electronics profiles, both in the negative and positive bias regions were however found to be highly characteristic according to the base-pair sequences. We believe that this simple, low-cost and practical method could be useful towards identifying and detecting DNA in biotechnology and pathology.
Jeong, Seongmun; Kim, Jiwoong; Park, Won; Jeon, Hongmin; Kim, Namshin
2017-01-01
Over the last decade, a large number of nucleotide sequences have been generated by next-generation sequencing technologies and deposited to public databases. However, most of these datasets do not specify the sex of individuals sampled because researchers typically ignore or hide this information. Male and female genomes in many species have distinctive sex chromosomes, XX/XY and ZW/ZZ, and expression levels of many sex-related genes differ between the sexes. Herein, we describe how to develop sex marker sequences from syntenic regions of sex chromosomes and use them to quickly identify the sex of individuals being analyzed. Array-based technologies routinely use either known sex markers or the B-allele frequency of X or Z chromosomes to deduce the sex of an individual. The same strategy has been used with whole-exome/genome sequence data; however, all reads must be aligned onto a reference genome to determine the B-allele frequency of the X or Z chromosomes. SEXCMD is a pipeline that can extract sex marker sequences from reference sex chromosomes and rapidly identify the sex of individuals from whole-exome/genome and RNA sequencing after training with a known dataset through a simple machine learning approach. The pipeline counts total numbers of hits from sex-specific marker sequences and identifies the sex of the individuals sampled based on the fact that XX/ZZ samples do not have Y or W chromosome hits. We have successfully validated our pipeline with mammalian (Homo sapiens; XY) and avian (Gallus gallus; ZW) genomes. Typical calculation time when applying SEXCMD to human whole-exome or RNA sequencing datasets is a few minutes, and analyzing human whole-genome datasets takes about 10 minutes. Another important application of SEXCMD is as a quality control measure to avoid mixing samples before bioinformatics analysis. SEXCMD comprises simple Python and R scripts and is freely available at https://github.com/lovemun/SEXCMD.
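The decision rule based on marker hit counts can be sketched as follows. This is not the SEXCMD code: the function name, the ratio threshold of 0.05, and the example counts are hypothetical, and the actual pipeline derives its cut-off by training on a known dataset.

```python
def infer_sex(shared_hits, specific_hits, system="XY", min_ratio=0.05):
    """Infer sex from marker hit counts.

    shared_hits: hits to X (or Z) chromosome markers.
    specific_hits: hits to Y (or W) chromosome markers.
    min_ratio: hypothetical cut-off that tolerates a few spurious Y/W hits.
    """
    if system not in ("XY", "ZW"):
        raise ValueError("system must be 'XY' or 'ZW'")
    if shared_hits <= 0:
        raise ValueError("at least one hit to X/Z markers is required")
    has_specific = specific_hits / shared_hits >= min_ratio
    if system == "XY":
        return "male" if has_specific else "female"  # XX samples lack Y hits
    return "female" if has_specific else "male"      # ZZ samples lack W hits


print(infer_sex(1000, 400, system="XY"))
print(infer_sex(1000, 0, system="ZW"))
```

Using a ratio rather than a raw count keeps the rule independent of sequencing depth.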
NASA Astrophysics Data System (ADS)
Tornow, Ralf P.; Milczarek, Aleksandra; Odstrcilik, Jan; Kolar, Radim
2017-07-01
A parallel video ophthalmoscope was developed to acquire short video sequences (25 fps, 250 frames) of both eyes simultaneously with exact synchronization. Video sequences were registered off-line to compensate for eye movements. From registered video sequences dynamic parameters like cardiac cycle induced reflection changes and eye movements can be calculated and compared between eyes.
Warris, Sven; Boymans, Sander; Muiser, Iwe; Noback, Michiel; Krijnen, Wim; Nap, Jan-Peter
2014-01-13
Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation. This allows on-the-fly calculation of the normal distribution for any candidate sequence composition. The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this property alone is not sufficiently discriminative to distinguish miRNAs from other sequences, the MFE-based P-value should be added to the parameters used to select potential miRNA candidates for experimental verification.
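Once the mean and standard deviation of the random-sequence MFE distribution are known for a given composition, the P-value follows from the normal cumulative distribution function. The sketch below assumes a one-sided test (probability of an MFE at least as low as the candidate's); the function name and the parameter values are hypothetical, and the interpolation step that supplies the mean and standard deviation is not shown.

```python
from math import erfc, sqrt


def mfe_p_value(mfe, mean, sd):
    """One-sided P(X <= mfe) for X ~ Normal(mean, sd)."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (mfe - mean) / sd
    return 0.5 * erfc(-z / sqrt(2))  # normal CDF evaluated at z


# Hypothetical candidate: MFE of -40 kcal/mol against a random
# distribution with mean -30 kcal/mol and standard deviation 5 kcal/mol.
print(mfe_p_value(-40.0, -30.0, 5.0))
```

A candidate two standard deviations below the mean yields a P-value of roughly 0.023.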
Rocha, Alexandre B; de Moura, Carlos E V
2011-12-14
Potential energy curves for inner-shell states of nitrogen and carbon dioxide molecules are calculated by inner-shell complete active space self-consistent field (CASSCF) method, which is a protocol, recently proposed, to obtain specifically converged inner-shell states at multiconfigurational level. This is possible since the collapse of the wave function to a low-lying state is avoided by a sequence of constrained optimization in the orbital mixing step. The problem of localization of K-shell states is revisited by calculating their energies at CASSCF level based on both localized and delocalized orbitals. The localized basis presents the best results at this level of calculation. Transition energies are also calculated by perturbation theory, by taking the above mentioned MCSCF function as zeroth order wave function. Values for transition energy are in fairly good agreement with experimental ones. Bond dissociation energies for N₂ are considerably high, which means that these states are strongly bound. Potential curves along ground state normal modes of CO₂ indicate the occurrence of Renner-Teller effect in inner-shell states. © 2011 American Institute of Physics
GSP: A web-based platform for designing genome-specific primers in polyploids
USDA-ARS's Scientific Manuscript database
The sequences among subgenomes in a polyploid species have high similarity. This makes it difficult to design genome-specific primers for sequence analysis. We present a web-based platform named GSP for designing genome-specific primers to distinguish subgenome sequences in the polyploid genome backgr...
Exact calculation of distributions on integers, with application to sequence alignment.
Newberg, Lee A; Lawrence, Charles E
2009-01-01
Computational biology is replete with high-dimensional discrete prediction and inference problems. Dynamic programming recursions can be applied to several of the most important of these, including sequence alignment, RNA secondary-structure prediction, phylogenetic inference, and motif finding. In these problems, attention is frequently focused on some scalar quantity of interest, a score, such as an alignment score or the free energy of an RNA secondary structure. In many cases, score is naturally defined on integers, such as a count of the number of pairing differences between two sequence alignments, or else an integer score has been adopted for computational reasons, such as in the test of significance of motif scores. The probability distribution of the score under an appropriate probabilistic model is of interest, such as in tests of significance of motif scores, or in calculation of Bayesian confidence limits around an alignment. Here we present three algorithms for calculating the exact distribution of a score of this type; then, in the context of pairwise local sequence alignments, we apply the approach so as to find the alignment score distribution and Bayesian confidence limits.
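The general idea of computing an exact distribution over integer scores by dynamic programming can be sketched for the simplest case. This is not one of the three algorithms of the paper: it assumes a total score that is a sum of independent per-position integer scores, so the distribution is built by repeated convolution. The function name and the example probabilities are hypothetical.

```python
def score_distribution(per_position):
    """Exact distribution of a sum of independent integer scores.

    per_position: list of dicts mapping an integer score to its probability.
    Returns a dict mapping each attainable total score to its probability.
    """
    dist = {0: 1.0}  # before any position is scored, the total is 0
    for position in per_position:
        updated = {}
        for total, p in dist.items():
            for score, q in position.items():
                updated[total + score] = updated.get(total + score, 0.0) + p * q
        dist = updated
    return dist


# Three positions, each scoring 1 for a match (probability 0.25) and 0 otherwise.
dist = score_distribution([{1: 0.25, 0: 0.75}] * 3)
print(dist)
```

The work grows with the number of positions times the number of distinct totals, which is why integer-valued scores make exact calculation tractable.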
Charles, Jermilia; Firth, Andrew E.; Loroño-Pino, Maria A.; Garcia-Rejon, Julian E.; Farfan-Ale, Jose A.; Lipkin, W. Ian; Briese, Thomas
2016-01-01
Sequences corresponding to a putative, novel rhabdovirus [designated Merida virus (MERDV)] were initially detected in a pool of Culex quinquefasciatus collected in the Yucatan Peninsula of Mexico. The entire genome was sequenced, revealing 11 798 nt and five major ORFs, which encode the nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and RNA-dependent RNA polymerase (L). The deduced amino acid sequences of the N, G and L proteins have no more than 24, 38 and 43 % identity, respectively, to the corresponding sequences of all other known rhabdoviruses, whereas those of the P and M proteins have no significant identity with any sequences in GenBank and their identity is only suggested based on their genome position. Using specific reverse transcription-PCR assays established from the genome sequence, 27 571 C. quinquefasciatus which had been sorted in 728 pools were screened to assess the prevalence of MERDV in nature and 25 pools were found positive. The minimal infection rate (calculated as the number of positive mosquito pools per 1000 mosquitoes tested) was 0.9, and similar for both females and males. Screening another 140 pools of 5484 mosquitoes belonging to four other genera identified positive pools of Ochlerotatus spp. mosquitoes, indicating that the host range is not restricted to C. quinquefasciatus. Attempts to isolate MERDV in C6/36 and Vero cells were unsuccessful. In summary, we provide evidence that a previously undescribed rhabdovirus occurs in mosquitoes in Mexico. PMID:26868915
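The minimal infection rate follows directly from the definition given in the abstract, and a short sketch reproduces the reported value from the stated counts. Only the function name is an assumption.

```python
def minimal_infection_rate(positive_pools, mosquitoes_tested):
    """Number of positive pools per 1000 mosquitoes tested."""
    if mosquitoes_tested <= 0:
        raise ValueError("number of mosquitoes tested must be positive")
    return 1000.0 * positive_pools / mosquitoes_tested


# Counts reported for C. quinquefasciatus: 25 positive pools, 27 571 mosquitoes.
mir = minimal_infection_rate(25, 27571)
print(round(mir, 1))
```

The measure is called minimal because it assumes exactly one infected mosquito in each positive pool.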
Martel Villagrán, J; Bueno Horcajadas, Á; Pérez Fernández, E; Martín Martín, S
2015-01-01
To determine the ability of MRI to distinguish between benign and malignant vertebral lesions. We included 85 patients and studied a total of 213 vertebrae (both pathologic and normal). For each vertebra, we determined whether the lesion was hypointense in T1-weighted sequences and whether it was hyperintense in STIR and in diffusion-weighted sequences. We calculated the in-phase/out-of-phase quotient and the apparent diffusion coefficient for each vertebra. We combined parameters from T1-weighted, diffusion-weighted, and STIR sequences to devise a formula to distinguish benign from malignant lesions. The group comprised 60 (70.6%) women and 25 (29.4%) men with a mean age of 67±13.5 years (range, 33-90 y). Of the 85 patients, 26 (30.6%) had a known primary tumor. When the lesion was hypointense on T1-weighted sequences, hyperintense on STIR and diffusion-weighted sequences, and had a signal intensity quotient greater than 0.8, the sensitivity was 97.2%, the specificity was 90%, and the diagnostic accuracy was 91.2%. If the patient had a known primary tumor, these values increased to 97.2%, 99.4%, and 99%, respectively. Benign lesions can be distinguished from malignant lesions if we combine the information from T1-weighted, STIR, and diffusion-weighted sequences together with the in-phase/out-of-phase quotient of the lesion detected in the vertebral body on MRI. Copyright © 2013 SERAM. Published by Elsevier España, S.L.U. All rights reserved.
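Sensitivity, specificity, and diagnostic accuracy are all derived from the same four counts. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    return {
        "sensitivity": tp / (tp + fn),  # malignant lesions correctly identified
        "specificity": tn / (tn + fp),  # benign lesions correctly identified
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts: 10 malignant and 20 benign vertebrae.
print(diagnostic_metrics(tp=9, fp=2, tn=18, fn=1))
```

Accuracy depends on the proportion of malignant lesions in the sample, whereas sensitivity and specificity do not.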
Nomoto, R; Maruyama, F; Ishida, S; Tohya, M; Sekizaki, T; Osawa, Ro
2015-02-01
In order to clarify the taxonomic position of serotypes 20, 22 and 26 of Streptococcus suis, biochemical and molecular genetic studies were performed on isolates (SUT-7, SUT-286(T), SUT-319, SUT-328 and SUT-380) from the saliva of healthy pigs that reacted with specific antisera of serotypes 20, 22 or 26, as well as on reference strains of serotypes 20, 22 and 26. Comparative recN gene sequencing showed high genetic relatedness among our isolates, but marked differences from the type strain S. suis NCTC 10234(T), i.e. 74.8-75.7 % sequence similarity. The genomic relatedness between the isolates and other strains of species of the genus Streptococcus, including S. suis, was calculated using the average nucleotide identity values of whole genome sequences, which indicated that serotypes 20, 22 and 26 should be removed taxonomically from S. suis and treated as a novel genomic species. Comparative sequence analysis revealed 99.0-100 % sequence similarities for the 16S rRNA genes between the reference strains of serotypes 20, 22 and 26, and our isolates. Isolate SUT-286(T) had relatively high 16S rRNA gene sequence similarity with S. suis NCTC 10234(T) (98.8 %). SUT-286(T) could be distinguished from S. suis and other closely related species of the genus Streptococcus using biochemical tests. Due to its phylogenetic and phenotypic similarities to S. suis we propose naming the novel species Streptococcus parasuis sp. nov., with SUT-286(T) ( = JCM 30273(T) = DSM 29126(T)) as the type strain. © 2015 IUMS.
Dinçer, Alp; Yildiz, Erdem; Kohan, Saeed; Memet Özek, M
2011-01-01
The aim of the study is to evaluate the efficiency of turbo spin-echo (TSE), three-dimensional constructive interference in the steady state (3D CISS) and cine phase contrast (Cine PC) sequences in determining flow through the endoscopic third ventriculostomy (ETV) fenestration, and to determine the effect of various TSE sequence parameters. The study was approved by our institutional review board and informed consent from all patients was obtained. Two groups of patients were included: group I (24 patients with good clinical outcome after ETV) and group II (22 patients with hydrocephalus evaluated preoperatively). The imaging protocol for both groups was identical. TSE T2 with various sequence parameters and imaging planes, and 3D CISS, followed by cine PC were obtained. Flow void was graded on a four-point scale. The sensitivity, specificity, accuracy, positive and negative predictive values of sequences were calculated. Bidirectional flow through the fenestration was detected in all group I patients by cine PC. Stroke volumes through the fenestration in group I ranged from 10 to 160.8 ml/min. There was no correlation between the presence of reversed flow and flow void grading. Also, there was no correlation between the stroke volumes and flow void grading. The sensitivity of 3D CISS was low, and 2 mm sagittal TSE T2, nearly equal to cine PC, provided the best result. Cine PC and TSE T2 both have high confidence in the assessment of the flow through the fenestration. However, sequence parameters significantly affect the efficiency of TSE T2.
Sekar, Yuvaraj; Thoelking, Johannes; Eckl, Miriam; Kalichava, Irakli; Sihono, Dwi Seno Kuncoro; Lohr, Frank; Wenz, Frederik; Wertz, Hansjoerg
2018-04-01
The novel MatriXX FFF (IBA Dosimetry, Germany) detector is a new 2D ionization chamber detector array designed for patient specific IMRT-plan verification including flattening-filter-free (FFF) beams. This study provides a detailed analysis of the characterization and clinical evaluation of the new detector array. The verification of the MatriXX FFF was subdivided into (i) physical dosimetric tests including dose linearity, dose rate dependency and output factor measurements and (ii) patient specific IMRT pre-treatment plan verifications. The MatriXX FFF measurements were compared to the calculated dose distribution of a commissioned treatment planning system by gamma index and dose difference evaluations for 18 IMRT-sequences. All IMRT-sequences were measured with original gantry angles and with collapsing all beams to 0° gantry angle to exclude the influence of the detector's angle dependency. The MatriXX FFF was found to be linear and dose rate independent for all investigated modalities (deviations ≤0.6%). Furthermore, the output measurements of the MatriXX FFF were in very good agreement to reference measurements (deviations ≤1.8%). For the clinical evaluation an average pixel passing rate for γ (3%,3mm) of (98.5±1.5)% was achieved when applying a gantry angle correction. Also, with collapsing all beams to 0° gantry angle an excellent agreement to the calculated dose distribution was observed (γ (3%,3mm) =(99.1±1.1)%). The MatriXX FFF fulfills all physical requirements in terms of dosimetric accuracy. Furthermore, the evaluation of the IMRT-plan measurements showed that the detector particularly together with the gantry angle correction is a reliable device for IMRT-plan verification including FFF. Copyright © 2017. Published by Elsevier GmbH.
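The γ(3%, 3 mm) pixel passing rate can be illustrated with a one-dimensional sketch. It makes several assumptions that the abstract does not confirm: global normalisation of the dose tolerance to the maximum calculated dose, a brute-force search over calculated grid points without interpolation, and no low-dose threshold. The function names and the profile values are hypothetical.

```python
import math


def gamma_index_1d(positions_mm, measured, calculated, dose_tol=0.03, dist_tol_mm=3.0):
    """Gamma value for each measured point against a calculated profile on the same grid."""
    dose_norm = dose_tol * max(calculated)  # global dose tolerance (assumption)
    gammas = []
    for x_m, d_m in zip(positions_mm, measured):
        gammas.append(min(
            math.sqrt(((x_c - x_m) / dist_tol_mm) ** 2 + ((d_c - d_m) / dose_norm) ** 2)
            for x_c, d_c in zip(positions_mm, calculated)
        ))
    return gammas


def passing_rate(gammas):
    """Percentage of points with gamma <= 1."""
    return 100.0 * sum(g <= 1.0 for g in gammas) / len(gammas)


# Hypothetical flat profile with one measured point 10% too high.
positions = [0.0, 1.0, 2.0]
calculated = [1.0, 1.0, 1.0]
measured = [1.0, 1.02, 1.1]
print(passing_rate(gamma_index_1d(positions, measured, calculated)))
```

A point passes when some calculated point lies within the combined dose and distance tolerance ellipse, which is why steep gradients are judged more leniently than a pure dose difference would allow.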
NASA Astrophysics Data System (ADS)
Shaffer, Christopher J.; Andrikopoulos, Prokopis C.; Řezáč, Jan; Rulíšek, Lubomír; Tureček, František
2016-04-01
Noncovalent complexes of hydrophobic peptides GLLLG and GLLLK with photoleucine (L*) tagged peptides G(L*_n L_m)K (n = 1, 3; m = 2, 0) were generated as singly charged ions in the gas phase and probed by photodissociation at 355 nm. Carbene intermediates produced by photodissociative loss of N₂ from the L* diazirine rings underwent insertion into X-H bonds of the target peptide moiety, forming covalent adducts with yields reaching 30%. Gas-phase sequencing of the covalent adducts revealed preferred bond formation at the C-terminal residue of the target peptide. Site-selective carbene insertion was achieved by placing the L* residue in different positions along the photopeptide chain, and the residues in the target peptide undergoing carbene insertion were identified by gas-phase ion sequencing that was aided by specific ¹³C labeling. Density functional theory calculations indicated that noncovalent binding to GL*L*L*K resulted in substantial changes of the (GLLLK + H)⁺ ground state conformation. The peptide moieties in [GL*L*LK + GLLLK + H]⁺ ion complexes were held together by hydrogen bonds, whereas dispersion interactions of the nonpolar groups were only secondary in ground-state 0 K structures. Born-Oppenheimer molecular dynamics for 100 ps trajectories of several different conformers at the 310 K laboratory temperature showed that noncovalent complexes developed multiple, residue-specific contacts between the diazirine carbons and GLLLK residues. The calculations pointed to the substantial fluidity of the nonpolar side chains in the complexes. Diazirine photochemistry in combination with Born-Oppenheimer molecular dynamics is a promising tool for investigations of peptide-peptide ion interactions in the gas phase.
Wojcieszynski, Andrzej P; Berman, Abigail T; Wan, Fei; Plastaras, John P; Metz, James M; Mitra, Nandita; Apisarnthanarax, Smith
2013-06-01
The addition of chemoradiation (CRT) to surgery has been shown to improve survival in patients with esophageal cancer. In the current study, the authors determined whether the sequencing of CRT has an effect on survival and cardiopulmonary mortality in patients with esophageal cancer. Patients with the following inclusion criteria were identified within 17 Surveillance, Epidemiology, and End Results registries from 1988 through 2007: adenocarcinoma or squamous cell carcinoma of the esophagus and having undergone esophagectomy. Patients who died within 90 days of surgery were excluded. Demographic, tumor, and survival data were compared between patients receiving preoperative and postoperative RT. Cox proportional hazards regression models were calculated to identify parameters associated with cause-specific survival and overall survival. A competing risk analysis was performed to account for death due to esophageal cancer in the calculation of cardiopulmonary mortality. Of 5512 patients, 1881 received preoperative RT, 901 received postoperative RT, and 2730 did not receive RT. Patients receiving preoperative RT had improved 5-year cause-specific survival (41% vs 31%; P < .0001) and overall survival (33% vs 23%; P < .0001) compared with those receiving postoperative RT. No differences in adjusted cardiopulmonary mortality were found between patients who received RT versus those who did not (8% vs 10% at 10 years; hazard ratio [HR], 0.84 [95% confidence interval (95% CI), 0.64-1.12] [P = .24]) or between those treated with preoperative RT versus those treated with postoperative RT (HR, 0.70; 95% CI, 0.46-1.08 [P = .11]). These population-based data support the use of preoperative RT in patients with locally advanced esophageal cancer. RT should not be withheld out of concern for cardiopulmonary mortality. Copyright © 2013 American Cancer Society.
The 2016 Mihoub (north-central Algeria) earthquake sequence: Seismological and tectonic aspects
NASA Astrophysics Data System (ADS)
Khelif, M. F.; Yelles-Chaouche, A.; Benaissa, Z.; Semmane, F.; Beldjoudi, H.; Haned, A.; Issaadi, A.; Chami, A.; Chimouni, R.; Harbi, A.; Maouche, S.; Dabbouz, G.; Aidi, C.; Kherroubi, A.
2018-06-01
On 28 May 2016 at 23:54 (UTC), an Mw5.4 earthquake occurred in Mihoub village, Algeria, 60 km southeast of Algiers. This earthquake was the largest event in a sequence recorded from 10 April to 15 July 2016. In addition to the permanent national network, a temporary network was installed in the epicentral region after this shock. Recorded event locations allow us to give a general overview of the sequence and reveal the existence of two main fault segments. The first segment, on which the first event in the sequence was located, is near-vertical and trends E-W. The second fault plane, on which the largest event of the sequence was located, dips to the southeast and strikes NE-SW. A total of 46 well-constrained focal mechanisms were calculated. The events located on the E-W-striking fault segment show mainly right-lateral strike-slip (strike N70°E, dip 77° to the SSE, rake 150°). The events located on the NE-SW-striking segment show mainly reverse faulting (strike N60°E, dip 70° to the SE, rake 130°). We calculated the static stress change caused by the first event (Md4.9) of the sequence; the result shows that the fault plane of the largest event in the sequence (Mw5.4) and most of the aftershocks occurred within an area of increased Coulomb stress. Moreover, using the focal mechanisms calculated in this work, we estimated the orientations of the main axes of the local stress tensor ellipsoid. The results confirm previous findings that the general stress field in this area shows orientations aligned NNW-SSE to NW-SE. The 2016 Mihoub earthquake sequence study thus improves our understanding of seismic hazard in north-central Algeria.
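The static stress change on a receiver fault is commonly summarised by the Coulomb failure stress change. The sketch below assumes the widely used form ΔCFS = Δτ + μ′Δσn, with Δτ the shear stress change in the slip direction, Δσn the normal stress change taken as positive for unclamping, and μ′ an effective friction coefficient. The abstract does not state which formulation or friction value was used, so the function name, the default μ′ = 0.4, and the stress values are all hypothetical.

```python
def coulomb_stress_change(delta_shear_mpa, delta_normal_mpa, effective_friction=0.4):
    """Coulomb failure stress change (MPa) on a receiver fault.

    delta_shear_mpa: shear stress change resolved in the slip direction.
    delta_normal_mpa: normal stress change, positive when the fault is unclamped.
    effective_friction: assumed effective friction coefficient.
    """
    return delta_shear_mpa + effective_friction * delta_normal_mpa


# Hypothetical receiver fault: shear stress up by 0.1 MPa, clamped by 0.05 MPa.
dcfs = coulomb_stress_change(0.1, -0.05)
print(f"{dcfs:.3f} MPa")
```

A positive value means the receiver fault was brought closer to failure, which is the sense in which the abstract describes an area of increased Coulomb stress.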
Identifying Group-Specific Sequences for Microbial Communities Using Long k-mer Sequence Signatures
Wang, Ying; Fu, Lei; Ren, Jie; Yu, Zhaoxia; Chen, Ting; Sun, Fengzhu
2018-01-01
Comparing metagenomic samples is crucial for understanding microbial communities. For different groups of microbial communities, such as human gut metagenomic samples from patients with a certain disease and healthy controls, identifying group-specific sequences offers essential information for potential biomarker discovery. A sequence that is present, or rich, in one group, but absent, or scarce, in another group is considered “group-specific” in our study. Our main purpose is to discover group-specific sequence regions between control and case groups as disease-associated markers. We developed a long k-mer (k ≥ 30 bps)-based computational pipeline to detect group-specific sequences at strain resolution free from reference sequences, sequence alignments, and metagenome-wide de novo assembly. We called our method MetaGO: Group-specific oligonucleotide analysis for metagenomic samples. An open-source pipeline on Apache Spark was developed with parallel computing. We applied MetaGO to one simulated and three real metagenomic datasets to evaluate the discriminative capability of identified group-specific markers. In the simulated dataset, 99.11% of group-specific logical 40-mers covered 98.89% disease-specific regions from the disease-associated strain. In addition, 97.90% of group-specific numerical 40-mers covered 99.61 and 96.39% of differentially abundant genome and regions between two groups, respectively. For a large-scale metagenomic liver cirrhosis (LC)-associated dataset, we identified 37,647 group-specific 40-mer features. Any one of the features can predict disease status of the training samples with the average of sensitivity and specificity higher than 0.8. The random forests classification using the top 10 group-specific features yielded a higher AUC (from ∼0.8 to ∼0.9) than that of previous studies. All group-specific 40-mers were present in LC patients, but not healthy controls. 
All the assembled 11 LC-specific sequences can be mapped to two strains of Veillonella parvula: UTDB1-3 and DSM2008. The experiments on the other two real datasets related to Inflammatory Bowel Disease and Type 2 Diabetes in Women consistently demonstrated that MetaGO achieved better prediction accuracy with fewer features compared to previous studies. The experiments showed that MetaGO is a powerful tool for identifying group-specific k-mers, which would be clinically applicable for disease prediction. MetaGO is available at https://github.com/VVsmileyx/MetaGO. PMID:29774017
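The presence/absence definition of a "group-specific" sequence given in this abstract can be illustrated with a toy sketch. This is not the MetaGO pipeline (which the abstract describes as running on Apache Spark with k ≥ 30); the function names, the in-memory sets, and the strict "in every case sample, in no control sample" rule are assumptions made only for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of one read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sample_kmers(reads, k):
    """Union of the k-mers of every read in one sample."""
    found = set()
    for read in reads:
        found |= kmers(read, k)
    return found


def group_specific_kmers(case_samples, control_samples, k):
    """k-mers present in every case sample and absent from all controls.

    Hypothetical, strict presence/absence rule; a real analysis would
    tolerate noise and use statistical tests.
    """
    case_sets = [sample_kmers(s, k) for s in case_samples]
    control_union = set().union(*(sample_kmers(s, k) for s in control_samples))
    return set.intersection(*case_sets) - control_union


cases = [["ACGTAC"], ["TTACGTA"]]
controls = [["GTACCC"]]
specific = group_specific_kmers(cases, controls, k=4)
```

With these toy reads, only the 4-mers shared by both case samples and missing from the control survive the filter.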
Wavelengths and energy levels for the Zn I isoelectronic sequence Ga[sup 1+] through Xe[sup 24+]
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seely, J.F.; Bar-Shalom, A.
Calculated and experimentally determined transition energies were compared for the Zn I isoelectronic sequence for the elements with atomic numbers Z = 31-54. Using the Hebrew Univ. Lawrence Livermore Atomic Code, the excitation energies were calculated for the 109 levels belonging to the lowest 16 configurations of the types 4l4l[prime] and 4l5l[prime]. The analysis of the energy-level structure along the isoelectronic sequence accounted for a number of avoided level crossings. The differences between the calculated and experimental transition energies were determined for 24 transitions among the 4s[sup 2], 4s4p, 4p[sup 2], 4s4d, and 4s4f configurations. Wavelengths were predicted for previously unobserved transitions in the highly charged ions. 15 refs., 4 figs., 3 tabs.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using
Weier, H.U.G.; Gray, J.W.
1995-06-27
A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using
Weier, Heinz-Ulrich G.; Gray, Joe W.
1995-01-01
A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.
Koparde, Vishal N.; Jameson-Lee, Maximilian; Elnasseh, Abdelrhman G.; Scalora, Allison F.; Kobulnicky, David J.; Serrano, Myrna G.; Roberts, Catherine H.; Buck, Gregory A.; Neale, Michael C.; Nixon, Daniel E.; Toor, Amir A.
2017-01-01
Human cytomegalovirus (hCMV) reactivation may often coincide with the development of graft-versus-host-disease (GVHD) in stem cell transplantation (SCT). Seventy-seven SCT donor-recipient pairs (DRP) (HLA matched unrelated donor (MUD), n = 50; matched related donor (MRD), n = 27) underwent whole exome sequencing to identify single nucleotide polymorphisms (SNPs) generating alloreactive peptide libraries for each DRP (9-mer peptide-HLA complexes). A Human CMV CROSS (Cross-Reactive Open Source Sequence) database was compiled from NCBI. HLA class I binding affinity for each DRP's HLA was calculated by NetMHCpan 2.8, and hCMV-derived 9-mers were algorithmically compared to the alloreactive peptide-HLA complex libraries. Short consecutive (≥6) amino acid (AA) sequence homology matching hCMV to recipient peptides was considered for HLA-bound-peptide (IC50 < 500 nM) cross reactivity. Of the 70,686 hCMV 9-mers contained within the hCMV CROSS database, an average of 29,658 matched the MRD DRP alloreactive peptides and 52,910 matched MUD DRP peptides (p<0.001). In silico analysis revealed multiple high affinity, immunogenic CMV-Human peptide matches (IC50 < 500 nM) expressed in a GVHD-affected tissue-specific manner. hCMV+GVHD was found in 18 patients, 13 developing hCMV viremia before GVHD onset. Analysis of patients with GVHD identified potential cross reactive peptide expression within affected organs. We propose that hCMV peptide sequence homology with human alloreactive peptides may contribute to the pathophysiology of GVHD. PMID:28800601
Hall, Charles E; Koparde, Vishal N; Jameson-Lee, Maximilian; Elnasseh, Abdelrhman G; Scalora, Allison F; Kobulnicky, David J; Serrano, Myrna G; Roberts, Catherine H; Buck, Gregory A; Neale, Michael C; Nixon, Daniel E; Toor, Amir A
2017-01-01
Human cytomegalovirus (hCMV) reactivation may often coincide with the development of graft-versus-host-disease (GVHD) in stem cell transplantation (SCT). Seventy-seven SCT donor-recipient pairs (DRP) (HLA matched unrelated donor (MUD), n = 50; matched related donor (MRD), n = 27) underwent whole exome sequencing to identify single nucleotide polymorphisms (SNPs) generating alloreactive peptide libraries for each DRP (9-mer peptide-HLA complexes). A Human CMV CROSS (Cross-Reactive Open Source Sequence) database was compiled from NCBI. HLA class I binding affinity for each DRP's HLA was calculated by NetMHCpan 2.8, and hCMV-derived 9-mers were algorithmically compared to the alloreactive peptide-HLA complex libraries. Short consecutive (≥6) amino acid (AA) sequence homology matching hCMV to recipient peptides was considered for HLA-bound-peptide (IC50 < 500 nM) cross reactivity. Of the 70,686 hCMV 9-mers contained within the hCMV CROSS database, an average of 29,658 matched the MRD DRP alloreactive peptides and 52,910 matched MUD DRP peptides (p<0.001). In silico analysis revealed multiple high affinity, immunogenic CMV-Human peptide matches (IC50 < 500 nM) expressed in a GVHD-affected tissue-specific manner. hCMV+GVHD was found in 18 patients, 13 developing hCMV viremia before GVHD onset. Analysis of patients with GVHD identified potential cross reactive peptide expression within affected organs. We propose that hCMV peptide sequence homology with human alloreactive peptides may contribute to the pathophysiology of GVHD.
NASA Astrophysics Data System (ADS)
Liang, G. Y.; Badnell, N. R.
2011-04-01
We present results for the electron-impact excitation of all Li-like ions from Be^+ to Kr^33+ which we obtained using the radiation- and Auger-damped intermediate-coupling frame transformation R-matrix approach. We have included both valence- and core-electron excitations up to the 1s^2 5l and 1s2l4l' levels, respectively. A detailed comparison of the target structure and collision data has been made for four specific ions (O^5+, Ar^15+, Fe^23+ and Kr^33+) spanning the sequence so as to assess the accuracy for the entire sequence. Effective collision strengths (Υs) are presented at temperatures ranging from 2 × 10^2 (z + 1)^2 K to 2 × 10^6 (z + 1)^2 K (where z is the residual charge of the ions, i.e. Z - 3). Detailed comparisons for the Υs are made with the results of previous calculations for several ions which span the sequence. The radiation and Auger damping effects were explored for core-excitations along the iso-electronic sequence. Furthermore, we examined the iso-electronic trends of effective collision strengths as a function of temperature. These data are made available in the archives of APAP via http://www.apap-network.org, OPEN-ADAS via http://open.adas.ac.uk, as well as anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via http://cdsweb.u-strasbg.fr/cgi-bin/qcat?J/A+A/528/A69
Churkin, Alexander; Barash, Danny
2008-01-01
Background RNAmute is an interactive Java application which, given an RNA sequence, calculates the secondary structure of all single point mutations and organizes them into categories according to their similarity to the predicted structure of the wild type. The secondary structure predictions are performed using the Vienna RNA package. A more efficient implementation of RNAmute is needed, however, to extend from the case of single point mutations to the general case of multiple point mutations, which may often be desired for computational predictions alongside mutagenesis experiments. But analyzing multiple point mutations, a process that requires traversing all possible mutations, becomes highly expensive since the running time is O(n^m) for a sequence of length n with m-point mutations. Using Vienna's RNAsubopt, we present a method that selects only those mutations, based on stability considerations, which are likely to be conformational rearranging. The approach is best examined using the dot plot representation for RNA secondary structure. Results Using RNAsubopt, the suboptimal solutions for a given wild-type sequence are calculated once. Then, specific mutations are selected that are most likely to cause a conformational rearrangement. For an RNA sequence of about 100 nts and 3-point mutations (n = 100, m = 3), for example, the proposed method reduces the running time from several hours or even days to several minutes, thus enabling the practical application of RNAmute to the analysis of multiple-point mutations. Conclusion A highly efficient addition to RNAmute that is as user friendly as the original application but that facilitates the practical analysis of multiple-point mutations is presented. Such an extension can now be exploited prior to site-directed mutagenesis experiments by virologists, for example, who investigate the change of function in an RNA virus via mutations that disrupt important motifs in its secondary structure. 
A complete explanation of the application, called MultiRNAmute, is available at [1]. PMID:18445289
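To make the cost of exhaustive traversal concrete, the sketch below counts and enumerates all m-point mutants of an RNA sequence. It only illustrates the combinatorial growth mentioned in the abstract and is not the selection method of MultiRNAmute; the function names are hypothetical, and the count assumes each mutated position may take any of the three other bases.

```python
from itertools import combinations, product
from math import comb

BASES = "ACGU"


def count_mutants(n, m):
    """Number of distinct m-point mutants of a length-n sequence."""
    # Choose m positions, then one of 3 alternative bases at each.
    return comb(n, m) * 3 ** m


def mutants(seq, m):
    """Yield every m-point mutant of seq (brute force)."""
    for positions in combinations(range(len(seq)), m):
        choices = [[b for b in BASES if b != seq[p]] for p in positions]
        for substitution in product(*choices):
            mutated = list(seq)
            for p, base in zip(positions, substitution):
                mutated[p] = base
            yield "".join(mutated)


total = count_mutants(100, 3)  # the n = 100, m = 3 case from the abstract
```

For n = 100 and m = 3 this gives several million candidate sequences, each of which would need its own structure prediction in a brute-force approach.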
NASA Astrophysics Data System (ADS)
Jian, Le; Cao, Wang; Jintao, Yang; Yinge, Wang
2018-04-01
This paper describes the design of a dynamic voltage restorer (DVR) that can simultaneously protect several sensitive loads from voltage sags in a region of an MV distribution network. A novel reference voltage calculation method based on zero-sequence voltage optimisation is proposed for this DVR to optimise cost-effectiveness in compensation of voltage sags with different characteristics in an ungrounded neutral system. Based on a detailed analysis of the characteristics of voltage sags caused by different types of faults and the effect of the wiring mode of the transformer on these characteristics, the optimisation target of the reference voltage calculation is presented with several constraints. The reference voltages under all types of voltage sags are calculated by optimising the zero-sequence component, which can reduce the degree of swell in the phase-to-ground voltage after compensation to the maximum extent and can improve the symmetry degree of the output voltages of the DVR, thereby effectively increasing the compensation ability. The validity and effectiveness of the proposed method are verified by simulation and experimental results.
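For readers unfamiliar with the zero-sequence component that this method optimises, the following sketch computes the standard symmetrical (Fortescue) components of three phase-voltage phasors. It shows only the textbook decomposition, not the reference-voltage optimisation proposed in the paper; the function name and the per-unit example values are assumptions.

```python
import cmath
import math

# Rotation operator a = 1 at an angle of 120 degrees.
A = cmath.exp(2j * math.pi / 3)


def symmetrical_components(va, vb, vc):
    """Return (zero, positive, negative) sequence phasors."""
    v0 = (va + vb + vc) / 3
    v1 = (va + A * vb + A * A * vc) / 3
    v2 = (va + A * A * vb + A * vc) / 3
    return v0, v1, v2


# Balanced per-unit set: phase b lags a by 120 degrees, c leads by 120.
balanced = symmetrical_components(1 + 0j, A * A, A)

# Hypothetical single-phase sag: phase a drops to 0.5 per unit.
sagged = symmetrical_components(0.5 + 0j, A * A, A)
```

A balanced set has no zero-sequence component, while the single-phase sag in the second example introduces one.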
Structure and Sequence Search on Aptamer-Protein Docking
NASA Astrophysics Data System (ADS)
Xiao, Jiajie; Bonin, Keith; Guthold, Martin; Salsbury, Freddie
2015-03-01
Interactions between proteins and deoxyribonucleic acid (DNA) play a significant role in living systems, especially through gene regulation. However, short nucleic acid sequences (aptamers) with specific binding affinity to specific proteins exhibit clinical potential as therapeutics. Our capillary and gel electrophoresis selection experiments show that specific sequences of aptamers can be selected that bind specific proteins. Computationally, given the experimentally-determined structure and sequence of a thrombin-binding aptamer, we can successfully dock the aptamer onto thrombin in agreement with experimental structures of the complex. In order to further study the conformational flexibility of this thrombin-binding aptamer and to potentially develop a predictive computational model of aptamer-binding, we use GPU-enabled molecular dynamics simulations to both examine the conformational flexibility of the aptamer in the absence of binding to thrombin, and to determine our ability to fold an aptamer. This study should help further de-novo predictions of aptamer sequences by enabling the study of structural and sequence-dependent effects on aptamer-protein docking specificity.
Boehm; Gibson; Lubzens
2000-01-01
This study was initiated to search for species-specific and strain-specific satellite DNA sequences for which oligonucleotide primers could be designed to differentiate between various commercially important strains of the marine monogonont rotifers Brachionus rotundiformis and Brachionus plicatilis. Two unrelated, highly reiterated satellite sequences were cloned and characterized. The eight sequenced monomers from B. rotundiformis and six from B. plicatilis had low intrarepeat variability and were similar in their overall lengths, A + T compositions, and high degrees of repeated motif substructure. However, hybridizations to 19 representative strains, sequence characterizations, and GenBank searches indicated that these two satellites are morphotype-specific and population-specific, respectively, and share little homology to each other or to other characterized sequences in the database. Primer pairs designed for the B. rotundiformis satellite confirmed hybridization specificities on polymerase chain reaction and could serve as a useful molecular diagnostic tool to identify strains belonging to the SS morphotype, which are gaining widespread usage as first feeds for marine fish in commercial production.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Szulik, Marta W.; Pallan, Pradeep S.; Nocek, Boguslaw
5-Hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) form during active demethylation of 5-methylcytosine (5mC) and are implicated in epigenetic regulation of the genome. They are differentially processed by thymine DNA glycosylase (TDG), an enzyme involved in active demethylation of 5mC. Three modified Dickerson–Drew dodecamer (DDD) sequences, amenable to crystallographic and spectroscopic analyses and containing the 5'-CG-3' sequence associated with genomic cytosine methylation, containing 5hmC, 5fC, or 5caC placed site-specifically into the 5'-T8X9G10-3' sequence of the DDD, were compared. The presence of 5caC at the X9 base increased the stability of the DDD, whereas 5hmC or 5fC did not. Both 5hmC and 5fC increased imino proton exchange rates and calculated rate constants for base pair opening at the neighboring base pair A5:T8, whereas 5caC did not. At the oxidized base pair G4:X9, 5fC exhibited an increase in the imino proton exchange rate and the calculated k_op. In all cases, minimal effects to imino proton exchange rates occurred at the neighboring base pair C3:G10. No evidence was observed for imino tautomerization, accompanied by wobble base pairing, for 5hmC, 5fC, or 5caC when positioned at base pair G4:X9; each favored Watson–Crick base pairing. However, both 5fC and 5caC exhibited intranucleobase hydrogen bonding between their formyl or carboxyl oxygens, respectively, and the adjacent cytosine N4 exocyclic amines. The lesion-specific differences observed in the DDD may be implicated in recognition of 5hmC, 5fC, or 5caC in DNA by TDG. Furthermore, they do not correlate with differential excision of 5hmC, 5fC, or 5caC by TDG, which may be mediated by differences in transition states of the enzyme-bound complexes.
Zenno, S; Saigo, K; Kanoh, H; Inouye, S
1994-01-01
The gene encoding the major NAD(P)H-flavin oxidoreductase (flavin reductase) of the luminous bacterium Vibrio fischeri ATCC 7744 was isolated by using synthetic oligonucleotide probes corresponding to the N-terminal amino acid sequence of the enzyme. Nucleotide sequence analysis suggested that the major flavin reductase of V. fischeri consisted of 218 amino acids and had a calculated molecular weight of 24,562. Cloned flavin reductase expressed in Escherichia coli was purified virtually to homogeneity, and its basic biochemical properties were examined. As in the major flavin reductase in crude extracts of V. fischeri, cloned flavin reductase showed broad substrate specificity and served well as a catalyst to supply reduced flavin mononucleotide (FMNH2) to the bioluminescence reaction. The major flavin reductase of V. fischeri not only showed significant similarity in amino acid sequence to oxygen-insensitive NAD(P)H nitroreductases of Salmonella typhimurium, Enterobacter cloacae, and E. coli but also was associated with a low level of nitroreductase activity. The major flavin reductase of V. fischeri and the nitroreductases of members of the family Enterobacteriaceae would thus appear closely related in evolution and form a novel protein family. PMID:8206830
Classification of DNA nucleotides with transverse tunneling currents
NASA Astrophysics Data System (ADS)
Nyvold Pedersen, Jonas; Boynton, Paul; Di Ventra, Massimiliano; Jauho, Antti-Pekka; Flyvbjerg, Henrik
2017-01-01
It has been theoretically suggested and experimentally demonstrated that fast and low-cost sequencing of DNA, RNA, and peptide molecules might be achieved by passing such molecules between electrodes embedded in a nanochannel. The experimental realization of this scheme faces major challenges, however. In realistic liquid environments, typical currents in tunneling devices are of the order of picoamps. This corresponds to only six electrons per microsecond, and this number affects the integration time required to do current measurements in real experiments. This limits the speed of sequencing, though current fluctuations due to Brownian motion of the molecule average out during the required integration time. Moreover, data acquisition equipment introduces noise, and electronic filters create correlations in time-series data. We discuss how these effects must be included in the analysis of, e.g., the assignment of specific nucleobases to current signals. As the signals from different molecules overlap, unambiguous classification is impossible with a single measurement. We argue that the assignment of molecules to a signal is a standard pattern classification problem and calculation of the error rates is straightforward. The ideas presented here can be extended to other sequencing approaches of current interest.
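The "six electrons per microsecond" figure follows directly from dividing the charge transferred by the elementary charge. The short sketch below reproduces that arithmetic for a current of 1 pA; the function name and the choice of exactly 1 pA and 1 µs are illustrative assumptions, since the abstract only says "of the order of picoamps".

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per electron (exact SI value)


def electrons_per_interval(current_amps, interval_seconds):
    """Average number of electrons passing in the given time window."""
    return current_amps * interval_seconds / ELEMENTARY_CHARGE_C


# 1 pA observed for 1 microsecond: roughly six electrons.
n_per_microsecond = electrons_per_interval(1e-12, 1e-6)

# The same current integrated over 1 millisecond: roughly six thousand.
n_per_millisecond = electrons_per_interval(1e-12, 1e-3)
```

Because so few electrons arrive per microsecond, a longer integration window is needed before a current reading becomes statistically meaningful, which is the speed limit the abstract refers to.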
Li, Chunmei; Yu, Zhilong; Fu, Yusi; Pang, Yuhong; Huang, Yanyi
2017-04-26
We develop a novel single-cell-based platform through digital counting of amplified genomic DNA fragments, named multifraction amplification (mfA), to detect the copy number variations (CNVs) in a single cell. Amplification is required to acquire genomic information from a single cell, while introducing unavoidable bias. Unlike prevalent methods that directly infer CNV profiles from the pattern of sequencing depth, our mfA platform denatures and separates the DNA molecules from a single cell into multiple fractions of a reaction mix before amplification. By examining the sequencing result of each fraction for a specific fragment and applying a segment-merge maximum likelihood algorithm to the calculation of copy number, we digitize the sequencing-depth-based CNV identification and thus provide a method that is less sensitive to the amplification bias. In this paper, we demonstrate an mfA platform through multiple displacement amplification (MDA) chemistry. With the mfA platform, the noise of MDA is reduced; therefore, the resolution of single-cell CNV identification can be improved to 100 kb. We can also determine the genomic region free of allelic drop-out with the mfA platform, which is impossible for conventional single-cell amplification methods.
Upper-body kinematics in team-handball throw, tennis serve, and volleyball spike.
Wagner, H; Pfusterschmied, J; Tilp, M; Landlinger, J; von Duvillard, S P; Müller, E
2014-04-01
Overarm movements are essential skills in many different sport games; however, the adaptations to different sports are not well understood. The aim of the study was to analyze upper-body kinematics in the team-handball throw, tennis serve, and volleyball spike, and to calculate differences in the proximal-to-distal sequencing and joint movements. Three-dimensional kinematic data were analyzed via the Vicon motion capturing system. The subjects (elite players) were instructed to perform a team-handball jump throw, tennis serve, and volleyball spike with a maximal ball velocity and to hit a specific target. Significant differences (P < 0.05) between the three overarm movements were found in 17 of 24 variables. The order of the proximal-to-distal sequencing was equal in the three analyzed overarm movements. Equal order of the proximal-to-distal sequencing and similar angles in the acceleration phase suggest there is a general motor pattern in overarm movements. However, overarm movements appear to be modifiable in situations such as for throwing or hitting a ball with or without a racket, and due to differences at takeoff (with one or two legs). © 2012 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Cohen, Adam S; German, Tamsin C
2010-06-01
In a task where participants' overt task was to track the location of an object across a sequence of events, reaction times to unpredictable probes requiring an inference about a social agent's beliefs about the location of that object were obtained. Reaction times to false belief situations were faster than responses about the (false) contents of a map showing the location of the object (Experiment 1) and about the (false) direction of an arrow signaling the location of the object (Experiment 2). These results are consistent with developmental, neuro-imaging and neuropsychological evidence that there exist domain specific mechanisms within human cognition for encoding and reasoning about mental states. Specialization of these mechanisms may arise from either core cognitive architecture or via the accumulation of expertise in the social domain.
An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.
Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W
2010-07-02
An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with a great volume of textual, unannotated data irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. Extracting from such analysis a measure of gene expression in the form of tag-counts, and furthermore annotating the reads, is therefore of significant value. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result is output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset. 
TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Such analysis is executed in an ultrafast and highly efficient manner, whether for a single-read or a paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease.
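The tag-count measure described above (how many reads fall within the genomic range of each annotation) can be sketched in a few lines. This is not TASE itself, which the abstract describes as a Java tool with a SQL Server backend; the tuple layouts, the gene names, and the inclusive-coordinate convention are assumptions for illustration.

```python
from collections import Counter


def count_tags(read_positions, annotations):
    """Count reads falling inside each annotated genomic range.

    read_positions: iterable of (chromosome, position) tuples.
    annotations: list of (name, chromosome, start, end), inclusive ends.
    """
    counts = Counter({name: 0 for name, *_ in annotations})
    for chrom, pos in read_positions:
        for name, a_chrom, start, end in annotations:
            if chrom == a_chrom and start <= pos <= end:
                counts[name] += 1
    return counts


# Hypothetical annotations and aligned read start positions.
annotations = [("geneA", "chr1", 100, 200), ("geneB", "chr1", 500, 600)]
reads = [("chr1", 150), ("chr1", 199), ("chr1", 550), ("chr2", 150)]
tag_counts = count_tags(reads, annotations)
```

The nested loop is quadratic and only suitable for toy data; a database index or interval tree would be the natural choice at the data volumes the abstract mentions.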
Sequence Factorial of "g"-Gonal Numbers
ERIC Educational Resources Information Center
Asiru, Muniru A.
2013-01-01
The gamma function, which has the property to interpolate the factorial whenever the argument is an integer, is a special case (the case "g" = 2) of the general term of the sequence factorial of "g"-gonal numbers. In relation to this special case, a formula for calculating the general term of the sequence factorial of any…
BIOPEP database and other programs for processing bioactive peptide sequences.
Minkiewicz, Piotr; Dziuba, Jerzy; Iwaniak, Anna; Dziuba, Marta; Darewicz, Małgorzata
2008-01-01
This review presents the potential for application of computational tools in peptide science based on a sample BIOPEP database and program as well as other programs and databases available via the World Wide Web. The BIOPEP application contains a database of biologically active peptide sequences and a program enabling construction of profiles of the potential biological activity of protein fragments, calculation of quantitative descriptors as measures of the value of proteins as potential precursors of bioactive peptides, and prediction of bonds susceptible to hydrolysis by endopeptidases in a protein chain. Other bioactive and allergenic peptide sequence databases are also presented. Programs enabling the construction of binary and multiple alignments between peptide sequences, the construction of sequence motifs attributed to a given type of bioactivity, searching for potential precursors of bioactive peptides, and the prediction of sites susceptible to proteolytic cleavage in protein chains are available via the Internet as are other approaches concerning secondary structure prediction and calculation of physicochemical features based on amino acid sequence. Programs for prediction of allergenic and toxic properties have also been developed. This review explores the possibilities of cooperation between various programs.
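As an illustration of what "prediction of bonds susceptible to hydrolysis by endopeptidases" can look like, the sketch below applies the widely used trypsin rule (cleavage on the C-terminal side of K or R unless the next residue is P). It is not the BIOPEP implementation, whose rules are not given in the abstract; the function names and the example sequence are hypothetical.

```python
def trypsin_sites(protein):
    """Return 1-based positions after which trypsin is predicted to cut."""
    return [
        i + 1
        for i in range(len(protein) - 1)
        if protein[i] in "KR" and protein[i + 1] != "P"
    ]


def digest(protein):
    """Split a protein into the fragments implied by trypsin_sites."""
    cuts = [0] + trypsin_sites(protein) + [len(protein)]
    return [protein[a:b] for a, b in zip(cuts, cuts[1:])]


sequence = "AKRPGKLM"  # hypothetical one-letter amino acid sequence
sites = trypsin_sites(sequence)
fragments = digest(sequence)
```

In this example the R-P bond is skipped because of the proline exception, so the predicted fragments are AK, RPGK, and LM.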
Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G
2018-03-01
Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.
An extended sequence specificity for UV-induced DNA damage.
Chung, Long H; Murray, Vincent
2018-01-01
The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantage of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and a C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
DRUMS: Disk Repository with Update Management and Select option for high throughput sequencing data
2014-01-01
Background New technologies for analyzing biological samples, like next generation sequencing, are producing a growing amount of data together with quality scores. Moreover, software tools (e.g., for mapping sequence reads, calculating transcription factor binding probabilities, estimating epigenetic modification enriched regions or determining single nucleotide polymorphisms) increase this amount of position-specific DNA-related data even further. Hence, requesting data becomes challenging and expensive and is often implemented using specialised hardware. In addition, picking specific data as fast as possible becomes increasingly important in many fields of science. The general problem of handling big data sets was addressed by developing specialized databases like HBase, HyperTable or Cassandra. However, these database solutions also require specialized or distributed hardware, leading to expensive investments. To the best of our knowledge, there is no database capable of (i) storing billions of position-specific DNA-related records, (ii) performing fast and resource saving requests, and (iii) running on a single standard computer. Results Here, we present DRUMS (Disk Repository with Update Management and Select option), satisfying demands (i)-(iii). It tackles the weaknesses of traditional databases while handling position-specific DNA-related data in an efficient manner. DRUMS is capable of storing up to billions of records. Moreover, it focuses on optimizing related single lookups as range requests, which are needed constantly for computations in bioinformatics. To validate the power of DRUMS, we compare it to the widely used MySQL database. The test setting considers two biological data sets. We use standard desktop hardware as test environment. Conclusions DRUMS outperforms MySQL in writing and reading records by a factor of two up to a factor of 10000. Furthermore, it can work with significantly larger data sets.
Our work focuses on mid-sized data sets up to several billion records without requiring cluster technology. Storing position-specific data is a general problem and the concept we present here is a generalized approach. Hence, it can be easily applied to other fields of bioinformatics. PMID:24495746
The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.
Khoe, Clairine V; Chung, Long H; Murray, Vincent
2018-06-01
The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
Campbell's monkeys concatenate vocalizations into context-specific call sequences
Ouattara, Karim; Lemasson, Alban; Zuberbühler, Klaus
2009-01-01
Primate vocal behavior is often considered irrelevant in modeling human language evolution, mainly because of the caller's limited vocal control and apparent lack of intentional signaling. Here, we present the results of a long-term study on Campbell's monkeys, which has revealed an unrivaled degree of vocal complexity. Adult males produced six different loud call types, which they combined into various sequences in highly context-specific ways. We found stereotyped sequences that were strongly associated with cohesion and travel, falling trees, neighboring groups, nonpredatory animals, unspecific predatory threat, and specific predator classes. Within the responses to predators, we found that crowned eagles triggered four and leopards three different sequences, depending on how the caller learned about their presence. Callers followed a number of principles when concatenating sequences, such as nonrandom transition probabilities of call types, addition of specific calls into an existing sequence to form a different one, or recombination of two sequences to form a third one. We conclude that these primates have overcome some of the constraints of limited vocal control by combinatorial organization. As the different sequences were so tightly linked to specific external events, the Campbell's monkey call system may be the most complex example of ‘proto-syntax’ in animal communication known to date. PMID:20007377
D'Angelo, Maria C; Jiménez, Luis; Milliken, Bruce; Lupiáñez, Juan
2013-01-01
Individuals experience less interference from conflicting information following events that contain conflicting information. Recently, Jiménez, Lupiáñez, and Vaquero (2009) demonstrated that such adaptations to conflict occur even when the source of conflict arises from implicit knowledge of sequences. There is accumulating evidence that momentary changes in adaptations made in response to conflicting information are conflict-type specific (e.g., Funes, Lupiáñez, & Humphreys, 2010a), suggesting that there are multiple modes of control. The current study examined whether conflict-specific sequential congruency effects occur when the 2 sources of conflict are implicitly learned. Participants implicitly learned a motor sequence while simultaneously learning a perceptual sequence. In a first experiment, after learning the 2 orthogonal sequences, participants expressed knowledge of the 2 sequences independently of each other in a transfer phase. In Experiments 2 and 3, within each sequence, the presence of a single control trial disrupted the expression of this specific type of learning on the following trial. There was no evidence of cross-conflict modulations in the expression of sequence learning. The results suggest that the mechanisms involved in transient shifts in conflict-specific control, as reflected in sequential congruency effects, are also engaged when the source of conflict is implicit. (c) 2013 APA, all rights reserved.
Mental Fatigue Impairs Soccer-Specific Physical and Technical Performance.
Smith, Mitchell R; Coutts, Aaron J; Merlini, Michele; Deprez, Dieter; Lenoir, Matthieu; Marcora, Samuele M
2016-02-01
To investigate the effects of mental fatigue on soccer-specific physical and technical performance. This investigation consisted of two separate studies. Study 1 assessed the soccer-specific physical performance of 12 moderately trained soccer players using the Yo-Yo Intermittent Recovery Test, Level 1 (Yo-Yo IR1). Study 2 assessed the soccer-specific technical performance of 14 experienced soccer players using the Loughborough Soccer Passing and Shooting Tests (LSPT, LSST). Each test was performed on two occasions and preceded, in a randomized, counterbalanced order, by 30 min of the Stroop task (mentally fatiguing treatment) or 30 min of reading magazines (control treatment). Subjective ratings of mental fatigue were measured before and after treatment, and mental effort and motivation were measured after treatment. Distance run, heart rate, and ratings of perceived exertion were recorded during the Yo-Yo IR1. LSPT performance time was calculated as original time plus penalty time. LSST performance was assessed using shot speed, shot accuracy, and shot sequence time. Subjective ratings of mental fatigue and effort were higher after the Stroop task in both studies (P < 0.001), whereas motivation was similar between conditions. This mental fatigue significantly reduced running distance in the Yo-Yo IR1 (P < 0.001). No difference in heart rate existed between conditions, whereas ratings of perceived exertion were significantly higher at iso-time in the mental fatigue condition (P < 0.01). LSPT original time and performance time were not different between conditions; however, penalty time significantly increased in the mental fatigue condition (P = 0.015). Mental fatigue also impaired shot speed (P = 0.024) and accuracy (P < 0.01), whereas shot sequence time was similar between conditions. Mental fatigue impairs soccer-specific running, passing, and shooting performance.
Novel green tissue-specific synthetic promoters and cis-regulatory elements in rice.
Wang, Rui; Zhu, Menglin; Ye, Rongjian; Liu, Zuoxiong; Zhou, Fei; Chen, Hao; Lin, Yongjun
2015-12-11
As an important part of synthetic biology, synthetic promoters have gradually become a hotspot in current biology. The purposes of the present study were to synthesize green tissue-specific promoters and to discover green tissue-specific cis-elements. We first assembled several regulatory sequences related to tissue-specific expression in different combinations, aiming to obtain novel green tissue-specific synthetic promoters. GUS assays of the transgenic plants indicated that 5 synthetic promoters showed green tissue-specific expression patterns and different expression efficiencies in various tissues. Subsequently, we scanned and counted the cis-elements in different tissue-specific promoters based on the plant cis-elements database PLACE and the rice cDNA microarray database CREP for green tissue-specific cis-element discovery, resulting in 10 potential cis-elements. The flanking sequence of one potential core element (GEAT) was predicted by bioinformatics. Then, the combination of GEAT and its flanking sequence was functionally identified with a synthetic promoter. GUS assays of the transgenic plants proved its green tissue-specificity. Furthermore, the function of the GEAT flanking sequence was analyzed in detail with site-directed mutagenesis. Our study provides an example for the synthesis of rice tissue-specific promoters and develops a feasible method for screening and functional identification of tissue-specific cis-elements with their flanking sequences at the genome-wide level in rice.
Controlled oxide films formation by nanosecond laser pulses for color marking.
Veiko, Vadim; Odintsova, Galina; Ageev, Eduard; Karlagina, Yulia; Loginov, Anatoliy; Skuratova, Alexandra; Gorbunova, Elena
2014-10-06
A technology of laser-induced coloration of metals by surface oxidation is demonstrated. Each color of the oxide film corresponds to a technologic chromacity coefficient, which takes into account the temperature of the sample after exposure by sequence of laser pulses with nanosecond duration and effective time of action. The coefficient can be used for the calculation of laser exposure regimes for the development of a specific color on the metal. A correlation between the composition of the films obtained on the surface of stainless steel AISI 304 and commercial titanium Grade 2 and its color and chromacity coordinates is shown.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curtis, L.J.
1986-02-01
The $5s^2\,{}^1S_0$-$5s5p\,{}^{1,3}P_J$ energy intervals in the Cd isoelectronic sequence have been investigated through a semiempirical systematization of recent measurements and through the performance of ab initio multiconfiguration Dirac-Fock calculations. Screening-parameter reductions of the spin-orbit and exchange energies both for the observed data and for the theoretically computed values establish the existence of empirical linearities similar to those exploited earlier for the Be, Mg, and Zn sequences. This permits extrapolative isoelectronic predictions of the relative energies of the 5s5p levels, which can be connected to $5s^2$ using intersinglet intervals obtained from empirically corrected ab initio calculations. These linearities have also been examined homologously for the Zn, Cd, and Hg sequences, and common relationships have been found that accurately describe all three of these sequences.
DNA/RNA transverse current sequencing: intrinsic structural noise from neighboring bases
Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.
2015-01-01
Nanopore DNA sequencing via transverse current has emerged as a promising candidate for third-generation sequencing technology. It produces long read lengths which could alleviate problems with assembly errors inherent in current technologies. However, the high error rates of nanopore sequencing have to be addressed. A very important source of the error is the intrinsic noise in the current arising from carrier dispersion along the chain of the molecule, i.e., from the influence of neighboring bases. In this work we perform calculations of the transverse current within an effective multi-orbital tight-binding model derived from first-principles calculations of the DNA/RNA molecules, to study the effect of this structural noise on the error rates in DNA/RNA sequencing via transverse current in nanopores. We demonstrate that a statistical technique, utilizing not only the currents through the nucleotides but also the correlations in the currents, can in principle reduce the error rate below any desired precision. PMID:26150827
Molecular design of sequence specific DNA alkylating agents.
Minoshima, Masafumi; Bando, Toshikazu; Shinohara, Ken-ichi; Sugiyama, Hiroshi
2009-01-01
Sequence-specific DNA alkylating agents are of great interest as a novel approach to cancer chemotherapy. We designed conjugates between pyrrole (Py)-imidazole (Im) polyamides and a DNA-alkylating chlorambucil moiety attached at different positions. The sequence-specific DNA alkylation by the conjugates was investigated using high-resolution denaturing polyacrylamide gel electrophoresis (PAGE). The results showed that polyamide-chlorambucil conjugates alkylate DNA at flanking adenines in recognition sequences of Py-Im polyamides; however, the reactivities and alkylation sites were influenced by the positions of conjugation. In addition, we synthesized a conjugate between a Py-Im polyamide and another alkylating agent, 1-(chloromethyl)-5-hydroxy-1,2-dihydro-3H-benz[e]indole (seco-CBI). The DNA alkylation reactivities of both alkylating polyamides were almost comparable. In contrast, cytotoxicities against cell lines differed greatly. These comparative studies should promote the development of appropriate sequence-specific DNA alkylating polyamides against specific cancer cells.
2014-01-01
Background Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Results Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation. This allows on-the-fly calculation of the normal distribution for any candidate sequence composition. Conclusion The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this particular property alone will not be sufficiently discriminative to distinguish miRNAs from other sequences, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification. PMID:24418292
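The MFE-based P-value mentioned in this abstract can be illustrated with a minimal sketch. It assumes that the interpolated normal distribution for the candidate's nucleotide composition is already available as a mean and a standard deviation; the function name `mfe_p_value` and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
import math


def mfe_p_value(mfe, mean, sd):
    """Lower-tail probability of an MFE at least as low as `mfe`.

    `mean` and `sd` describe the (assumed, pre-interpolated) normal
    distribution of MFE values for randomized sequences with the same
    nucleotide composition as the candidate.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (mfe - mean) / sd
    # Standard normal CDF expressed through the complementary error function.
    return 0.5 * math.erfc(-z / math.sqrt(2))
```

With hypothetical values of a candidate MFE of -40, a background mean of -25 and a standard deviation of 5 (all in the same energy unit), the z-score is -3 and the function returns a P-value of roughly 0.00135. A small value indicates a structure that is more stable than expected for that composition.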
A Next-Generation Sequencing Primer—How Does It Work and What Can It Do?
Alekseyev, Yuriy O.; Fazeli, Roghayeh; Yang, Shi; Basran, Raveen; Miller, Nancy S.
2018-01-01
Next-generation sequencing refers to a high-throughput technology that determines the nucleic acid sequences and identifies variants in a sample. The technology has been introduced into clinical laboratory testing and produces test results for precision medicine. Since next-generation sequencing is relatively new, graduate students, medical students, pathology residents, and other physicians may benefit from a primer to provide a foundation about basic next-generation sequencing methods and applications, as well as specific examples where it has had diagnostic and prognostic utility. Next-generation sequencing technology grew out of advances in multiple fields to produce a sophisticated laboratory test with tremendous potential. Next-generation sequencing may be used in the clinical setting to look for specific genetic alterations in patients with cancer, diagnose inherited conditions such as cystic fibrosis, and detect and profile microbial organisms. This primer will review DNA sequencing technology, the commercialization of next-generation sequencing, and clinical uses of next-generation sequencing. Specific applications where next-generation sequencing has demonstrated utility in oncology are provided. PMID:29761157
Sunflower centromeres consist of a centromere-specific LINE and a chromosome-specific tandem repeat.
Nagaki, Kiyotaka; Tanaka, Keisuke; Yamaji, Naoki; Kobayashi, Hisato; Murata, Minoru
2015-01-01
The kinetochore is a protein complex including kinetochore-specific proteins that plays a role in chromatid segregation during mitosis and meiosis. The complex associates with centromeric DNA sequences that are usually species-specific. In plant species, tandem repeats including satellite DNA sequences and retrotransposons have been reported as centromeric DNA sequences. In this study on sunflowers, a cDNA-encoding centromere-specific histone H3 (CENH3) was isolated from a cDNA pool from a seedling, and an antibody was raised against a peptide synthesized from the deduced cDNA. The antibody specifically recognized the sunflower CENH3 (HaCENH3) and showed centromeric signals by immunostaining and immunohistochemical staining analysis. The antibody was also applied in chromatin immunoprecipitation (ChIP)-Seq to isolate centromeric DNA sequences and two different types of repetitive DNA sequences were identified. One was a long interspersed nuclear element (LINE)-like sequence, which showed centromere-specific signals on almost all chromosomes in sunflowers. This is the first report of a centromeric LINE sequence, suggesting possible centromere targeting ability. Another type of identified repetitive DNA was a tandem repeat sequence with a 187-bp unit that was found only on a pair of chromosomes. The HaCENH3 content of the tandem repeats was estimated to be much higher than that of the LINE, which implies centromere evolution from LINE-based centromeres to more stable tandem-repeat-based centromeres. In addition, the epigenetic status of the sunflower centromeres was investigated by immunohistochemical staining and ChIP, and it was found that centromeres were heterochromatic.
Zhu, Zhikai; Su, Xiaomeng; Go, Eden P; Desaire, Heather
2014-09-16
Glycoproteins are biologically significant large molecules that participate in numerous cellular activities. In order to obtain site-specific protein glycosylation information, intact glycopeptides, with the glycan attached to the peptide sequence, are characterized by tandem mass spectrometry (MS/MS) methods such as collision-induced dissociation (CID) and electron transfer dissociation (ETD). While several emerging automated tools are developed, no consensus is present in the field about the best way to determine the reliability of the tools and/or provide the false discovery rate (FDR). A common approach to calculate FDRs for glycopeptide analysis, adopted from the target-decoy strategy in proteomics, employs a decoy database that is created based on the target protein sequence database. Nonetheless, this approach is not optimal in measuring the confidence of N-linked glycopeptide matches, because the glycopeptide data set is considerably smaller compared to that of peptides, and the requirement of a consensus sequence for N-glycosylation further limits the number of possible decoy glycopeptides tested in a database search. To address the need to accurately determine FDRs for automated glycopeptide assignments, we developed GlycoPep Evaluator (GPE), a tool that helps to measure FDRs in identifying glycopeptides without using a decoy database. GPE generates decoy glycopeptides de novo for every target glycopeptide, in a 1:20 target-to-decoy ratio. The decoys, along with target glycopeptides, are scored against the ETD data, from which FDRs can be calculated accurately based on the number of decoy matches and the ratio of the number of targets to decoys, for small data sets. GPE is freely accessible for download and can work with any search engine that interprets ETD data of N-linked glycopeptides. The software is provided at https://desairegroup.ku.edu/research.
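The abstract states that the FDR is derived from the number of decoy matches and the target-to-decoy ratio but does not give the formula. The sketch below shows one common ratio-scaled target-decoy estimate under that description; it is an assumption for illustration, not GPE's documented calculation, and the function and parameter names are hypothetical.

```python
def estimate_fdr(target_matches, decoy_matches, decoys_per_target=20):
    """Ratio-scaled target-decoy FDR estimate (illustrative assumption).

    With `decoys_per_target` decoys generated for every target, each
    decoy match is taken to represent 1/decoys_per_target expected
    false target matches.
    """
    if target_matches <= 0:
        raise ValueError("target_matches must be positive")
    if decoys_per_target <= 0:
        raise ValueError("decoys_per_target must be positive")
    expected_false_targets = decoy_matches / decoys_per_target
    # Cap at 1.0 so the estimate stays a valid proportion.
    return min(1.0, expected_false_targets / target_matches)
```

For example, 50 accepted target matches and 20 accepted decoy matches at a 1:20 ratio give one expected false target match, that is an estimated FDR of 0.02. Generating many decoys per target is what keeps such an estimate from being dominated by counting noise when the data set is small.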
Mukherjee, Sanchita; Kailasam, Senthilkumar; Bansal, Manju; Bhattacharyya, Dhananjay
2014-01-01
Double helical structures of DNA and RNA are mostly determined by base pair stacking interactions, which give them base sequence-directed features, such as small roll values for the purine-pyrimidine steps. Earlier attempts to characterize stacking interactions were mostly restricted to calculations on fiber diffraction geometries or on structures optimized using ab initio calculations, which lacked the variation in geometry needed to comment on the unusually large roll values observed in the AU/AU base pair step in crystal structures of RNA double helices. We have generated a stacking energy hyperspace by modeling geometries with variations along the important degrees of freedom, roll and slide, which were chosen via statistical analysis as maximally sequence dependent. Corresponding energy contours were constructed by several quantum chemical methods including dispersion corrections. This analysis established the most suitable methods for stacked base pair systems despite the limitation that the number of atoms in a base pair step imposes on employing a very high level of theory. All the methods predict a negative roll value and near-zero slide to be most favorable for the purine-pyrimidine steps, in agreement with Calladine's steric clash based rule. Successive base pairs in RNA are always linked by a sugar-phosphate backbone with C3'-endo sugars, and this demands a C1'-C1' distance of about 5.4 Å along the chains. Adding an energy penalty term for deviation of the C1'-C1' distance from the mean value to recent DFT-D functionals, specifically ωB97X-D, appears to predict a reliable energy contour for the AU/AU step. Such a distance-based penalty also improves energy contours for the other purine-pyrimidine sequences. © 2013 Wiley Periodicals, Inc. Biopolymers 101: 107-120, 2014.
BRAF mutation testing in solid tumors: a methodological comparison.
Weyant, Grace W; Wisotzkey, Jeffrey D; Benko, Floyd A; Donaldson, Keri J
2014-09-01
Solid tumor genotyping has become standard of care for the characterization of proto-oncogene mutational status, which has traditionally been accomplished with Sanger sequencing. However, companion diagnostic assays and comparable laboratory-developed tests are becoming increasingly popular, such as the cobas 4800 BRAF V600 Mutation Test and the INFINITI KRAS-BRAF assay, respectively. This study evaluates and validates the analytical performance of the INFINITI KRAS-BRAF assay and compares concordance of BRAF status with two reference assays, the cobas test and Sanger sequencing. DNA extraction from FFPE tissue specimens was performed followed by multiplex PCR amplification and fluorescent label incorporation using allele-specific primer extension. Hybridization to a microarray, signal detection, and analysis were then performed. The limits of detection were determined by testing dilutions of mutant BRAF alleles within wild-type background DNA, and accuracy was calculated based on these results. The INFINITI KRAS-BRAF assay produced 100% concordance with the cobas test and Sanger sequencing and had sensitivity equivalent to the cobas assay. The INFINITI assay is repeatable with at least 95% accuracy in the detection of mutant and wild-type BRAF alleles. These results confirm that the INFINITI KRAS-BRAF assay is comparable to traditional sequencing and the Food and Drug Administration-approved companion diagnostic assay for the detection of BRAF mutations. Copyright © 2014 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Sumbalova, Lenka; Stourac, Jan; Martinek, Tomas; Bednar, David; Damborsky, Jiri
2018-05-23
HotSpot Wizard is a web server used for the automated identification of hotspots in semi-rational protein design to give improved protein stability, catalytic activity, substrate specificity and enantioselectivity. Since there are three orders of magnitude fewer protein structures than sequences in bioinformatic databases, the major limitation to the usability of previous versions was the requirement for the protein structure to be a compulsory input for the calculation. HotSpot Wizard 3.0 now accepts the protein sequence as input data. The protein structure for the query sequence is obtained either from eight repositories of homology models or is modeled using Modeller and I-Tasser. The quality of the models is then evaluated using three quality assessment tools: WHAT_CHECK, PROCHECK and MolProbity. During follow-up analyses, the system automatically warns the users whenever they attempt to redesign poorly predicted parts of their homology models. The second main limitation of HotSpot Wizard's predictions is that it identifies suitable positions for mutagenesis, but does not provide any reliable advice on particular substitutions. A new module for the estimation of thermodynamic stabilities using the Rosetta and FoldX suites has been introduced which prevents destabilizing mutations among pre-selected variants entering experimental testing. HotSpot Wizard is freely available at http://loschmidt.chemi.muni.cz/hotspotwizard.
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.
Murray, Vincent; Chen, Jon K; Tanaka, Mark M
2016-07-01
The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'-TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
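The kind of tally behind statements such as "preferred cleavage at 5'-GT*" can be sketched as follows. The abstract does not describe the four analysis methods actually used, so this is only an illustrative counting step on a toy sequence; the function name `context_counts`, its parameters and the example data are hypothetical.

```python
from collections import Counter


def context_counts(seq, cleavage_positions, upstream=1, downstream=0):
    """Count the sequence context around each cleavage site.

    `cleavage_positions` holds 0-based indices of the cleaved nucleotide
    (the one written with an asterisk, e.g. the T in 5'-GT*).
    `upstream` and `downstream` set how many neighbouring nucleotides to
    include on the 5' and 3' side. Sites too close to either end of
    `seq` to supply the full window are skipped.
    """
    counts = Counter()
    for pos in cleavage_positions:
        start = pos - upstream
        end = pos + downstream + 1
        if start < 0 or end > len(seq):
            continue
        counts[seq[start:end]] += 1
    return counts


# Toy example: three hypothetical cleavage sites in a short sequence.
toy = context_counts("ATGTACGTAGC", [3, 7, 10])
```

In the toy example the default window returns two GT sites and one GC site. Setting `upstream=3, downstream=2` gives hexanucleotide windows laid out like 5'-RTGT*AY, with the cleaved nucleotide in the fourth position.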
40 CFR 1065.650 - Emission calculations.
Code of Federal Regulations, 2011 CFR
2011-07-01
... following sequence of preliminary calculations on recorded concentrations: (i) Correct all THC and CH4.... (iii) Calculate all THC and NMHC concentrations, including dilution air background concentrations, as... NMHC to background corrected mass of THC. If the background corrected mass of NMHC is greater than 0.98...
Kong, Xiaotian; Sun, Huiyong; Pan, Peichen; Tian, Sheng; Li, Dan; Li, Youyong; Hou, Tingjun
2016-01-21
Due to the high sequence identity of the binding pockets of cyclin-dependent kinases (CDKs), designing highly selective inhibitors towards a specific CDK member remains a big challenge. 4-(thiazol-5-yl)-2-(phenylamino) pyrimidine derivatives are effective inhibitors of CDKs, among which the most promising inhibitor 12u demonstrates high binding affinity to CDK9 and attenuated binding affinity to other homologous kinases, such as CDK2. In this study, in order to rationalize the principle of the binding preference towards CDK9 over CDK2 and to explore crucial information that may aid the design of selective CDK9 inhibitors, MM/GBSA calculations based on conventional molecular dynamics (MD) simulations and enhanced sampling simulations (umbrella sampling and steered MD simulations) were carried out on two representative derivatives (12u and 4). The calculation results show that the binding specificity of 12u to CDK9 is primarily controlled by conformational change of the G-loop and variation of the van der Waals interactions. Furthermore, the enhanced sampling simulations revealed the different reaction coordinates and transient interactions of inhibitors 12u and 4 as they dissociate from the binding pockets of CDK9 and CDK2. The physical principles obtained from this study may facilitate the discovery and rational design of novel and specific inhibitors of CDK9.
Tolson, D A; Nicholson, N H
1998-01-01
The determination of DNA sequences by partial exonuclease digestion followed by Matrix-Assisted Laser Desorption Time of Flight Mass Spectrometry (MALDI-TOF) is a well established method. When the same procedure is applied to RNA, difficulties arise due to the small (1 Da) mass difference between the nucleotides U and C, which makes unambiguous assignment difficult using a MALDI-TOF instrument. Here we report our experiences with sequence specific endonucleases and chemical methods followed by MALDI-TOF to resolve these sequence ambiguities. We have found chemical methods superior to endonucleases both in terms of correct specificity and extent of sequence coverage. This methodology can be used in combination with exonuclease digestion to rapidly assign RNA sequences. PMID:9421498
Abe, Takashi; Hamano, Yuta; Ikemura, Toshimichi
2014-01-01
A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method "BLSOM" for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering.
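The input to an alignment-free method of this kind is an oligonucleotide composition vector per sequence fragment. The sketch below computes such a vector; it covers only the feature extraction, not the BLSOM clustering itself, and details such as how ambiguous bases or complementary strands are handled are assumptions rather than the authors' procedure. The function name `kmer_frequencies` is hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(fragment, k=4):
    """Return relative frequencies of all 4**k k-mers over A, C, G, T.

    The vector follows lexicographic k-mer order (AAAA, AAAC, ...).
    Windows containing any other character (e.g. N) are ignored, which
    is an assumption made for this sketch.
    """
    fragment = fragment.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

Calling `kmer_frequencies(fragment, k=4)` yields a 256-dimensional vector and `k=5` a 1024-dimensional one. Vectors of this form, one per fragment, are what a self-organizing map or any other clustering method would then take as input.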
Lehnherr, Dan; Chen, Chen; Pedramrazi, Zahra; DeBlase, Catherine R.; Alzola, Joaquin M.; Keresztes, Ivan; Lobkovsky, Emil B.
2016-01-01
A Cu-catalyzed benzannulation reaction transforms ortho(arylene ethynylene) oligomers into ortho-arylenes. This approach circumvents iterative Suzuki cross-coupling reactions previously used to assemble hindered ortho-arylene backbones. These derivatives form helical folded structures in the solid-state and in solution, as demonstrated by X-ray crystallography and solution-state NMR analysis. DFT calculations of misfolded conformations are correlated with variable-temperature 1H and EXSY NMR to reveal that folding is cooperative and more favorable in halide-substituted naphthalenes. Helical ortho-arylene foldamers with specific aromatic sequences organize functional π-electron systems into arrangements ideal for ambipolar charge transport and show preliminary promise for the surface-mediated synthesis of structurally defined graphene nanoribbons. PMID:28567248
Dynamics of adaptive immunity against phage in bacterial populations
NASA Astrophysics Data System (ADS)
Bradde, Serena; Vucelja, Marija; Tesileanu, Tiberiu; Balasubramanian, Vijay
The CRISPR (clustered regularly interspaced short palindromic repeats) mechanism allows bacteria to adaptively defend against phages by acquiring short genomic sequences (spacers) that target specific sequences in the viral genome. We propose a population dynamical model where immunity can be both acquired and lost. The model predicts regimes where bacterial and phage populations can co-exist, others where the populations oscillate, and still others where one population is driven to extinction. Our model considers two key parameters: (1) ease of acquisition and (2) spacer effectiveness in conferring immunity. Analytical calculations and numerical simulations show that if spacers differ mainly in ease of acquisition, or if the probability of acquiring them is sufficiently high, bacteria develop a diverse population of spacers. On the other hand, if spacers differ mainly in their effectiveness, their final distribution will be highly peaked, akin to a ``winner-take-all'' scenario, leading to a specialized spacer distribution. Bacteria can interpolate between these limiting behaviors by actively tuning their overall acquisition rate.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalyuzhnaya, Marina G.; Nercessian, Olivier; Lapidus, Alla
2004-07-01
The recently generated database of microbial genes from an oligotrophic environment populated by a calculated 1,800 major phylotypes (the Sargasso Sea metagenome) presents a great source for expanding local databases of genes indicative of a specific function. In this paper we analyze the Sargasso Sea metagenome in terms of the presence of methanopterin-linked C1 transfer genes that are a signature of methylotrophy. We conclude that more than 10 phylotypes possessing genes of interest are present in this environment, and a few of these are relatively abundant species. The sequences representative of the major phylotypes do not appear to belong to any known microbial group capable of methanopterin-linked C1 transfer. Instead, they separate from all known sequences on phylogenetic trees, pointing towards their affiliation with a novel microbial phylum. These data imply a broader distribution of methanopterin-linked functions in the microbial world than previously known.
Li, Yang; Ren, Yi
2017-01-01
Pseudomonas sp. QTF5 was isolated from the continuous permafrost near the bitumen layers in the Qiangtang basin of Qinghai-Tibetan Plateau in China (5,111 m above sea level). It is psychrotolerant and highly and widely tolerant to heavy metals and has the ability to metabolize benzoic acid and salicylic acid. To gain insight into the genetic basis for its adaptation, we performed whole genome sequencing and analyzed the resistant genes and metabolic pathways. Based on 120 published and annotated genomes representing 31 species in the genus Pseudomonas, in silico genomic DNA-DNA hybridization (<54%) and average nucleotide identity calculation (<94%) revealed that QTF5 is closest to Pseudomonas lini and should be classified into a novel species. This study provides the genetic basis to identify the genes linked to its specific mechanisms for adaptation to extreme environment and application of this microorganism in environmental conservation. PMID:29270429
DOE Office of Scientific and Technical Information (OSTI.GOV)
Youn, H; Jeon, H; Nam, J
Purpose: To investigate the feasibility of an analytic framework to estimate patients’ absorbed dose distribution owing to daily cone-beam CT scan for image-guided radiation treatment. Methods: To compute total absorbed dose distribution, we separated the framework into primary and scattered dose calculations. Using the source parameters such as voltage, current, and bowtie filtration, for the primary dose calculation, we simulated the forward projection from the source to each voxel of an imaging object including some inhomogeneous inserts. Then we calculated the primary absorbed dose at each voxel based on the absorption probability deduced from the HU values and Beer’s law. In sequence, all voxels constructing the phantom were regarded as secondary sources to radiate scattered photons for scattered dose calculation. Details of forward projection were identical to those of the previous step. The secondary source intensities were given by using scatter-to-primary ratios provided by NIST. In addition, we compared the analytically calculated dose distribution with their Monte Carlo simulation results. Results: The suggested framework for absorbed dose estimation successfully provided the primary and secondary dose distributions of the phantom. Moreover, our analytic dose calculations and Monte Carlo calculations agreed well with each other, even near the inhomogeneous inserts. Conclusion: This work indicated that our framework can be an effective monitor to estimate a patient’s exposure owing to cone-beam CT scan for image-guided radiation treatment. Therefore, we expected that the patient’s over-exposure during IGRT might be prevented by our framework.
Detection of nucleic acid sequences by invader-directed cleavage
Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert
1999-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.
Phylum- and Class-Specific PCR Primers for General Microbial Community Analysis
Blackwood, Christopher B.; Oaks, Adam; Buyer, Jeffrey S.
2005-01-01
Amplification of a particular DNA fragment from a mixture of organisms by PCR is a common first step in methods of examining microbial community structure. The use of group-specific primers in community DNA profiling applications can provide enhanced sensitivity and phylogenetic detail compared to domain-specific primers. Other uses for group-specific primers include quantitative PCR and library screening. The purpose of the present study was to develop several primer sets targeting commonly occurring and important groups. Primers specific for the 16S ribosomal sequences of Alphaproteobacteria, Betaproteobacteria, Bacilli, Actinobacteria, and Planctomycetes and for parts of both the 18S ribosomal sequence and the internal transcribed spacer region of Basidiomycota were examined. Primers were tested by comparison to sequences in the ARB 2003 database, and chosen primers were further tested by cloning and sequencing from soil community DNA. Eighty-five to 100% of the sequences obtained from clone libraries were found to be placed with the groups intended as targets, demonstrating the specificity of the primers under field conditions. It will be important to reevaluate primers over time because of the continual growth of sequence databases and revision of microbial taxonomy. PMID:16204538
d-Omix: a mixer of generic protein domain analysis tools.
Wichadakul, Duangdao; Numnark, Somrak; Ingsriswang, Supawadee
2009-07-01
Domain combination provides important clues to the roles of protein domains in protein function, interaction and evolution. We have developed a web server, d-Omix (a Mixer of Protein Domain Analysis Tools), intended as a unified platform to analyze, compare and visualize protein data sets in various aspects of protein domain combinations. With InterProScan files for protein sets of interest provided by users, the server incorporates four services for domain analyses. First, it constructs a protein phylogenetic tree based on a distance matrix calculated from protein domain architectures (DAs), allowing comparison with a sequence-based tree. Second, it calculates and visualizes the versatility, abundance and co-presence of protein domains via a domain graph. Third, it compares the similarity of proteins based on DA alignment. Fourth, it builds a putative protein network derived from domain-domain interactions from DOMINE. Users may select a variety of input data files and flexibly choose domain search tools (e.g. hmmpfam, superfamily) for a specific analysis. Results from d-Omix can be interactively explored and exported into various formats such as SVG, JPG, BMP and CSV. Users with only protein sequences can also prepare an InterProScan file using a service provided by the server. The d-Omix web server is freely available at http://www.biotec.or.th/isl/Domix.
Cortical columns and the tendency of neighboring neurons to act similarly.
Legéndy, C R
1978-12-08
A tendency by neighboring cortical neurons to act similarly (spatial assimilation) is derived analytically from an assumed facilitatory interaction between the involved neurons at an early age, possibly before the critical period in the cat, an assumed plastic modifiability of the thalamo-cortical contacts at the same earlier time, and exposure of the network at the same time to a largely arbitrary sequence of inputs coming from outside the cortex. The calculational result is that during the assumed period of thalamo-cortical plasticity neuron responses tend toward greater similarity within the approximate range where cortico-cortical excitation dominates over inhibition and toward greater dissimilarity where inhibition dominates over excitation. Through the result, the calculation correctly predicts the horizontal extent of certain cortical columns. In the visual cortex of certain animals the horizontal distance of most dissimilar preferred orientation (90 degrees difference) is about the same as the distance of most dissimilar eye preference (from center of left-eye to center of right-eye region), and both are roughly the same as the range of strongest intracortical inhibition. The sequence of inputs coming from outside the cortex is mathematically allowed to be random, which suggests that signals originating inside the nervous system, as exist in a sensorially deprived animal, without help from genetic specifications, are adequate to give rise to spatial assimilation.
Schwenger, Frédéric; Repasi, Endre
2017-02-20
The knowledge of the spatial energy (or power) distribution of light beams reflected at the dynamic sea surface is of great practical interest in maritime environments. For the estimation of the light energy reflected into a specific spatial direction a lot of parameters need to be taken into account. Both whitecap coverage and its optical properties have a large impact upon the calculated value. In published literature, for applications considering vertical light propagation paths, such as bathymetric lidar, the reflectance of sea surface and whitecaps are approximated by constant values. For near-horizontal light propagation paths the optical properties of the sea surface and the whitecaps must be considered in greater detail. The calculated light energy reflected into a specific direction varies statistically and depends largely on the dynamics of the wavy sea surface and the dynamics of whitecaps. A 3D simulation of the dynamic sea surface populated with whitecaps is presented. The simulation considers the evolution of whitecaps depending on wind speed and fetch. The radiance calculation of the maritime scene (open sea/clear sky) populated with whitecaps is done in the short wavelength infrared spectral band. Wave hiding and shadowing, especially occurring at low viewing angles, are considered. The specular reflection of a light beam at the sea surface in the absence of whitecaps is modeled by an analytical statistical bidirectional reflectance distribution function (BRDF) of the sea surface. For whitecaps, a specific BRDF is used by taking into account their shadowing function. To ensure the credibility of the simulation, the whitecap coverage is determined from simulated image sequences for different wind speeds and compared to whitecap coverage functions from literature. 
The impact of whitecaps on the radiation balance for a bistatic configuration of light source and receiver is calculated for different incidence angles (zenith/azimuth) of the light beam and is presented for two different wind speeds.
Accuracy of abdominal auscultation for bowel obstruction
Breum, Birger Michael; Rud, Bo; Kirkegaard, Thomas; Nordentoft, Tyge
2015-01-01
AIM: To investigate the accuracy and inter-observer variation of bowel sound assessment in patients with clinically suspected bowel obstruction. METHODS: Bowel sounds were recorded in patients with suspected bowel obstruction using a Littmann® Electronic Stethoscope. The recordings were processed to yield 25-s sound sequences in random order on PCs. Observers, recruited from doctors within the department, classified the sound sequences as either normal or pathological. The reference tests for bowel obstruction were intraoperative and endoscopic findings and clinical follow up. Sensitivity and specificity were calculated for each observer and compared between junior and senior doctors. Interobserver variation was measured using the Kappa statistic. RESULTS: Bowel sound sequences from 98 patients were assessed by 53 (33 junior and 20 senior) doctors. Laparotomy was performed in 47 patients, 35 of whom had bowel obstruction. Two patients underwent colorectal stenting due to large bowel obstruction. The median sensitivity and specificity were 0.42 (range: 0.19-0.64) and 0.78 (range: 0.35-0.98), respectively. There was no significant difference in accuracy between junior and senior doctors. The median frequency with which doctors classified bowel sounds as abnormal did not differ significantly between patients with and without bowel obstruction (26% vs 23%, P = 0.08). The 53 doctors made up 1378 unique pairs and the median Kappa value was 0.29 (range: -0.15-0.66). CONCLUSION: Accuracy and inter-observer agreement were generally low. Clinical decisions in patients with possible bowel obstruction should not be based on auscultatory assessment of bowel sounds. PMID:26379407
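The quantities reported in this abstract can be sketched in a few lines. The abstract says only "Kappa statistic", so the use of Cohen's kappa for each observer pair is an assumption here, and the counts and label lists below are made-up illustrations rather than study data. The pair count, by contrast, follows directly from the 53 observers.

```python
from math import comb


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(a, b):
    """Cohen's kappa for two observers' binary labels (1 = pathological).

    Assumes equal-length lists and that chance agreement is below 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    return (observed - expected) / (1 - expected)


# Number of unordered observer pairs among 53 doctors: C(53, 2).
n_pairs = comb(53, 2)

# Hypothetical counts and ratings, for illustration only.
sens, spec = sensitivity_specificity(tp=42, fn=58, tn=78, fp=22)
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

With 53 observers the pair count comes to 53 × 52 / 2 = 1378, matching the figure in the abstract.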
Allergen cross reactions: a problem greater than ever thought?
Pfiffner, P; Truffer, R; Matsson, P; Rasi, C; Mari, A; Stadler, B M
2010-12-01
Cross reactions are an often observed phenomenon in patients with allergy. Sensitization against some allergens may cause reactions against other seemingly unrelated allergens. Today, cross reactions are being investigated on a per-case basis, analyzing blood serum specific IgE (sIgE) levels and clinical features of patients suffering from cross reactions. In this study, we evaluated the level of sIgE compared to patients' total IgE assuming epitope specificity is a consequence of sequence similarity. Our objective was to evaluate our recently published model of molecular sequence similarities underlying cross reactivity using serum-derived data from IgE determinations of standard laboratory tests. We calculated the probabilities of protein cross reactivity based on conserved sequence motifs and compared these in silico predictions to a database consisting of 5362 sera with sIgE determinations. Cumulating sIgE values of a patient resulted in a median of 25-30% total IgE. Comparing motif cross reactivity predictions to sIgE levels showed that on average three times fewer motifs than extracts were recognized in a given serum (correlation coefficient: 0.967). Extracts belonging to the same motif group co-reacted in a high percentage of sera (up to 80% for some motifs). Cumulated sIgE levels are exaggerated because of a high level of observed cross reactions. Thus, not only bioinformatic prediction of allergenic motifs, but also serological routine testing of allergic patients implies that the immune system may recognize only a small number of allergenic structures. © 2010 John Wiley & Sons A/S.
General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies
Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong
2013-01-01
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
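The aggregation step described above can be sketched for the simplest case, a burden-type test under homogeneous effects. This is a minimal sketch under stated assumptions, not the authors' software: each study is assumed to supply a per-variant score vector and a between-variant covariance matrix, the function name `meta_burden_test` and all numbers are hypothetical, and heterogeneity and variance component tests are not covered.

```python
from math import erfc, sqrt


def meta_burden_test(scores, covs, weights):
    """Combine per-study score statistics into one burden test.

    scores  : list of per-study score vectors (length m each)
    covs    : list of per-study m x m covariance matrices
    weights : length-m variant weights

    Returns (Q, p) where Q = (w'S)^2 / (w' Phi w) with S and Phi summed
    over studies; under the null, Q is chi-square with 1 df.
    """
    m = len(weights)
    S = [sum(s[j] for s in scores) for j in range(m)]
    Phi = [[sum(c[i][j] for c in covs) for j in range(m)] for i in range(m)]
    num = sum(w * s for w, s in zip(weights, S)) ** 2
    den = sum(weights[i] * Phi[i][j] * weights[j]
              for i in range(m) for j in range(m))
    q = num / den
    # Upper tail of chi-square(1): P(Z^2 > q) = erfc(sqrt(q / 2)).
    return q, erfc(sqrt(q / 2))


# Two hypothetical studies, two variants, equal weights.
q, p = meta_burden_test(
    scores=[[1.0, 2.0], [0.5, 0.5]],
    covs=[[[2.0, 0.5], [0.5, 1.0]], [[1.0, 0.0], [0.0, 1.0]]],
    weights=[1.0, 1.0],
)
```

Only summary statistics cross study boundaries in this scheme, which is why no individual-level genotype data need to be pooled.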
Grievink, Liat Shavit; Penny, David; Hendy, Mike D; Holland, Barbara R
2009-01-01
Correction to Shavit Grievink L, Penny D, Hendy MD, Holland BR: LineageSpecificSeqgen: generating sequence data with lineage-specific variation in the proportion of variable sites. BMC Evol Biol 2008, 8(1):317.
Acquisition of New DNA Sequences After Infection of Chicken Cells with Avian Myeloblastosis Virus
Shoyab, M.; Baluda, M. A.; Evans, R.
1974-01-01
DNA-RNA hybridization studies between 70S RNA from avian myeloblastosis virus (AMV) and an excess of DNA from (i) AMV-induced leukemic chicken myeloblasts or (ii) a mixture of normal and of congenitally infected K-137 chicken embryos producing avian leukosis viruses revealed the presence of fast- and slow-hybridizing virus-specific DNA sequences. However, the leukemic cells contained twice the level of AMV-specific DNA sequences observed in normal chicken embryonic cells. The fast-reacting sequences were two to three times more numerous in leukemic DNA than in DNA from the mixed embryos. The slow-reacting sequences had a reiteration frequency of approximately 9 and 6, in the two respective systems. Both the fast- and the slow-reacting DNA sequences in leukemic cells exhibited a higher Tm (2 C) than the respective DNA sequences in normal cells. In normal and leukemic cells the slow hybrid sequences appeared to have a Tm which was 2 C higher than that of the fast hybrid sequences. Individual non-virus-producing chicken embryos, either group-specific antigen positive or negative, contained 40 to 100 copies of the fast sequences and 2 to 6 copies of the slowly hybridizing sequences per cell genome. Normal rat cells did not contain DNA that hybridized with AMV RNA, whereas non-virus-producing rat cells transformed by B-77 avian sarcoma virus contained only the slowly reacting sequences. The results demonstrate that leukemic cells transformed by AMV contain new AMV-specific DNA sequences which were not present before infection. PMID:16789139
Human retina-specific amine oxidase (RAO): cDNA cloning, tissue expression, and chromosomal mapping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Imamura, Yutaka; Kubota, Ryo; Wang, Yimin
In search of candidate genes for hereditary retinal disease, we have employed a subtractive and differential cDNA cloning strategy and isolated a novel retina-specific cDNA. Nucleotide sequence analysis revealed an open reading frame of 2187 bp, which encodes a 729-amino-acid protein with a calculated molecular mass of 80,644 Da. The putative protein contained a conserved domain of copper amine oxidase, which is found in various species from bacteria to mammals. It showed the highest homology to bovine serum amine oxidase, which is believed to control the level of serum biogenic amines. Northern blot analysis of human adult and fetal tissues revealed that the protein is expressed abundantly and specifically in retina as a 2.7-kb transcript. Thus, we considered this protein a human retina-specific amine oxidase (RAO). The RAO gene (AOC2) was mapped by fluorescence in situ hybridization to human chromosome 17q21. We propose that AOC2 may be a candidate gene for hereditary ocular diseases. 38 refs., 4 figs.
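The figures quoted for the open reading frame are mutually consistent, as a quick arithmetic check shows. The variable names are illustrative; the three input numbers are taken from the abstract, and the mean residue mass is simply their ratio, not a value the authors report.

```python
orf_bp = 2187          # open reading frame length from the abstract
protein_mass_da = 80644  # calculated molecular mass from the abstract

# Three nucleotides per codon; a zero remainder means the frame is intact.
residues, remainder = divmod(orf_bp, 3)

# Average mass per residue implied by the two reported figures.
mean_residue_mass = protein_mass_da / residues
```

The division gives 729 codons with no remainder, and an implied average of roughly 110.6 Da per residue, which is in the range typical of proteins.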
Protein sequences bound to mineral surfaces persist into deep time
Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa; Freeman, Colin L; Woolley, Jos; Crisp, Molly K; Wilson, Julie; Fotakis, Anna; Fischer, Roman; Kessler, Benedikt M; Rakownikow Jersie-Christensen, Rosa; Olsen, Jesper V; Haile, James; Thomas, Jessica; Marean, Curtis W; Parkington, John; Presslee, Samantha; Lee-Thorp, Julia; Ditchfield, Peter; Hamilton, Jacqueline F; Ward, Martyn W; Wang, Chunting Michelle; Shaw, Marvin D; Harrison, Terry; Domínguez-Rodrigo, Manuel; MacPhee, Ross DE; Kwekason, Amandus; Ecker, Michaela; Kolska Horwitz, Liora; Chazan, Michael; Kröger, Roland; Thomas-Oates, Jane; Harding, John H; Cappellini, Enrico; Penkman, Kirsty; Collins, Matthew J
2016-01-01
Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated sequence (equivalent to ~16 Ma at a constant 10°C). DOI: http://dx.doi.org/10.7554/eLife.17092.001 PMID:27668515
Mapping Base Modifications in DNA by Transverse-Current Sequencing
NASA Astrophysics Data System (ADS)
Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.
2018-02-01
Sequencing DNA modifications and lesions, such as methylation of cytosine and oxidation of guanine, is even more important and challenging than sequencing the genome itself. The traditional methods for detecting DNA modifications are either insensitive to these modifications or require additional processing steps to identify a particular type of modification. Transverse-current sequencing in nanopores can potentially identify the canonical bases and base modifications in the same run. In this work, we demonstrate that the most common DNA epigenetic modifications and lesions can be detected with any predefined accuracy based on their tunneling current signature. Our results are based on simulations of the nanopore tunneling current through DNA molecules, calculated using nonequilibrium electron-transport methodology within an effective multiorbital model derived from first-principles calculations, followed by a base-calling algorithm accounting for neighbor current-current correlations. This methodology can be integrated with existing experimental techniques to improve base-calling fidelity.
QueTAL: a suite of tools to classify and compare TAL effectors functionally and phylogenetically
Pérez-Quintero, Alvaro L.; Lamy, Léo; Gordon, Jonathan L.; Escalon, Aline; Cunnac, Sébastien; Szurek, Boris; Gagnevin, Lionel
2015-01-01
Transcription Activator-Like (TAL) effectors from Xanthomonas plant pathogenic bacteria can bind to the promoter region of plant genes and induce their expression. DNA-binding specificity is governed by a central domain made of nearly identical repeats, each determining the recognition of one base pair via two amino acid residues (a.k.a. Repeat Variable Di-residue, or RVD). Knowing how TAL effectors differ from each other within and between strains would be useful to infer functional and evolutionary relationships, but their repetitive nature precludes reliable use of traditional alignment methods. The suite QueTAL was therefore developed to offer tailored tools for comparison of TAL effector genes. The program DisTAL considers each repeat as a unit, transforms a TAL effector sequence into a sequence of coded repeats and makes pair-wise alignments between these coded sequences to construct trees. The program FuncTAL is aimed at finding TAL effectors with similar DNA-binding capabilities. It calculates correlations between position weight matrices of potential target DNA sequence predicted from the RVD sequence, and builds trees based on these correlations. The programs accurately represented phylogenetic and functional relationships between TAL effectors using either simulated or literature-curated data. When using the programs on a large set of TAL effector sequences, the DisTAL tree largely reflected the expected species phylogeny. In contrast, FuncTAL showed that TAL effectors with similar binding capabilities can be found between phylogenetically distant taxa. This suite will help users to rapidly analyse any TAL effector genes of interest and compare them to other available TAL genes and should improve our understanding of TAL effectors evolution. It is available at http://bioinfo-web.mpl.ird.fr/cgi-bin2/quetal/quetal.cgi. PMID:26284082
Pagan, Rafael F; Massey, Steven E
2014-02-01
Proteins are regarded as being robust to the deleterious effects of mutations. Here, the neutral emergence of mutational robustness in a population of single domain proteins is explored using computer simulations. A pairwise contact model was used to calculate the ΔG of folding (ΔG folding) using the three dimensional protein structure of leech eglin C. A random amino acid sequence with low mutational robustness, defined as the average ΔΔG resulting from a point mutation (ΔΔG average), was threaded onto the structure. A population of 1,000 threaded sequences was evolved under selection for stability, using an upper and lower energy threshold. Under these conditions, mutational robustness increased over time in the most common sequence in the population. In contrast, when the wild type sequence was used it did not show an increase in robustness. This implies that the emergence of mutational robustness is sequence specific and that wild type sequences may be close to maximal robustness. In addition, an inverse relationship between ∆∆G average and protein stability is shown, resulting partly from a larger average effect of point mutations in more stable proteins. The emergence of mutational robustness was also observed in the Escherichia coli colE1 Rop and human CD59 proteins, implying that the property may be common in single domain proteins under certain simulation conditions. The results indicate that at least a portion of mutational robustness in small globular proteins might have arisen by a process of neutral emergence, and could be an example of a beneficial trait that has not been directly selected for, termed a "pseudaptation."
Cost-effectiveness of biological treatment sequences for fistulising Crohn’s disease across Europe
Baji, Petra; Gulácsi, László; Brodszky, Valentin; Végh, Zsuzsanna; Danese, Silvio; Irving, Peter M; Peyrin-Biroulet, Laurent; Schreiber, Stefan; Rencz, Fanni; Lakatos, Péter L; Péntek, Márta
2017-01-01
Background In clinical practice, treatment sequences of biologicals are applied for active fistulising Crohn’s disease; however, underlying health economic analyses are lacking. Objective The purpose of this study was to analyse the cost-effectiveness of different biological sequences including infliximab, biosimilar-infliximab, adalimumab and vedolizumab in nine European countries. Methods A Markov model was developed to compare treatment sequences of one, two and three biologicals from the payer’s perspective on a five-year time horizon. Data on effectiveness and health state utilities were obtained from the literature. Country-specific costs were considered. Calculations were performed with both official list prices and estimated real prices of biologicals. Results Biosimilar-infliximab is the most cost-effective treatment against standard care across the countries (with list prices: €34684–€72551/quality adjusted life year; with estimated real prices: €24364–€56086/quality adjusted life year). The most cost-effective two-agent sequence, except for Germany, is the biosimilar-infliximab–adalimumab therapy compared with single biosimilar-infliximab (with list prices: €58533–€133831/quality adjusted life year; with estimated prices: €45513–€105875/quality adjusted life year). The cost-effectiveness of the biosimilar-infliximab–adalimumab–vedolizumab three-agent sequence compared with biosimilar-infliximab–adalimumab is €87214–€152901/quality adjusted life year. Conclusions The suggested first-choice biological treatment is biosimilar-infliximab. In case of treatment failure, switching to adalimumab then to vedolizumab provides meaningful additional health gains but at increased costs. Inter-country differences in cost-effectiveness are remarkable due to significant differences in costs. PMID:29511561
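The euro-per-QALY figures above are incremental cost-effectiveness ratios, which can be sketched as follows. This is a minimal sketch of the standard ratio only: the function name `icer` and the input values are hypothetical, and the Markov model that would produce the costs and quality adjusted life years is not reproduced.

```python
def icer(cost_new, qaly_new, cost_ref, qaly_ref):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_qaly == 0:
        raise ValueError("no difference in QALYs; ICER is undefined")
    return d_cost / d_qaly


# Hypothetical five-year totals for a new strategy versus a comparator.
ratio = icer(cost_new=50000.0, qaly_new=3.0, cost_ref=20000.0, qaly_ref=2.5)
```

In this made-up example the new strategy costs €30000 more and yields 0.5 additional QALYs, so the ratio is €60000 per QALY gained.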
Nagaki, Kiyotaka; Shibata, Fukashi; Kanatani, Asaka; Kashihara, Kazunari; Murata, Minoru
2012-04-01
The centromere is a multi-functional complex comprising centromeric DNA and a number of proteins. To isolate unidentified centromeric DNA sequences, centromere-specific histone H3 variants (CENH3) and chromatin immunoprecipitation (ChIP) have been utilized in some plant species. However, anti-CENH3 antibody for ChIP must be raised in each species because of its species specificity. Production of the antibodies is time-consuming and costly, and it is not easy to produce ChIP-grade antibodies. In this study, we applied a HaloTag7-based chromatin affinity purification system to isolate centromeric DNA sequences in tobacco. This system required no specific antibody, and made it possible to apply a highly stringent wash to remove contaminated DNA. As a result, we succeeded in isolating five tandem repetitive DNA sequences in addition to the centromeric retrotransposons that were previously identified by ChIP. Three of the tandem repeats were centromere-specific sequences located on different chromosomes. These results confirm the validity of the HaloTag7-based chromatin affinity purification system as an alternative method to ChIP for isolating unknown centromeric DNA sequences. The discovery of more than two chromosome-specific centromeric DNA sequences indicates the mosaic structure of tobacco centromeres. © Springer-Verlag 2011
Taylor, P.W.; Winton, J.R.
2002-01-01
Nested polymerase chain reaction (PCR) assays were developed using first-round primers complementary to highly conserved regions within the bacterial 16S ribosomal RNA (rRNA) gene (universal eubacterial primers) and second-round primers specific for sequences within the 16S rRNA genes of Aeromonas salmonicida, Yersinia ruckeri, and Flavobacterium psychrophilum. Following optimization of the MgCl2 concentration and primer annealing temperature, PCR employing the universal eubacterial primers was used to amplify a 1,500-base-pair (bp) product visible in agarose gels stained with ethidium bromide. The calculated detection limit of this single-round assay was less than 1.4 × 10^4 colony-forming units (CFU) per reaction for all bacterial species tested. Single-round PCR using primer sets specific for A. salmonicida, Y. ruckeri, and F. psychrophilum amplified bands of 271, 575, and 1,100 bp, respectively, with detection limits of less than 1.4 × 10^4, 1.4 × 10^5, and 1.4 × 10^5 CFU per reaction. Using the universal eubacterial primers in the first round and the species-specific primer sets in the second round of nested PCR assays improved the detection ability by approximately four orders of magnitude to fewer than 14 CFU per sample for each of the three bacterial species. Such nested assays could be adapted to a wide variety of bacterial fish pathogens for which 16S sequences are available.
Adenine specific DNA chemical sequencing reaction.
Iverson, B L; Dervan, P B
1987-01-01
Reaction of DNA with K2PdCl4 at pH 2.0 followed by a piperidine workup produces specific cleavage at adenine (A) residues. Product analysis revealed the K2PdCl4 reaction involves selective depurination at adenine, affording an excision reaction analogous to the other chemical DNA sequencing reactions. Adenine residues methylated at the exocyclic amine (N6) react with lower efficiency than unmethylated adenine in an identical sequence. This simple protocol specific for A may be a useful addition to current chemical sequencing reactions. PMID:3671067
Sevy, Alexander M.; Jacobs, Tim M.; Crowe, James E.; Meiler, Jens
2015-01-01
Computational protein design has found great success in engineering proteins for thermodynamic stability, binding specificity, or enzymatic activity in a ‘single state’ design (SSD) paradigm. Multi-specificity design (MSD), on the other hand, involves considering the stability of multiple protein states simultaneously. We have developed a novel MSD algorithm, which we refer to as REstrained CONvergence in multi-specificity design (RECON). The algorithm allows each state to adopt its own sequence throughout the design process rather than enforcing a single sequence on all states. Convergence to a single sequence is encouraged through an incrementally increasing convergence restraint for corresponding positions. Compared to MSD algorithms that enforce (constrain) an identical sequence on all states, the energy landscape is simplified, which accelerates the search drastically. As a result, RECON can readily be used in simulations with a flexible protein backbone. We have benchmarked RECON on two design tasks. First, we designed antibodies derived from a common germline gene against their diverse targets to assess recovery of the germline, polyspecific sequence. Second, we designed “promiscuous”, polyspecific proteins against all binding partners and measured recovery of the native sequence. We show that RECON is able to efficiently recover native-like, biologically relevant sequences in this diverse set of protein complexes. PMID:26147100
NASA Astrophysics Data System (ADS)
Davidson, S.; Cui, J.; Followill, D.; Ibbott, G.; Deasy, J.
2008-02-01
The Dose Planning Method (DPM) is one of several 'fast' Monte Carlo (MC) computer codes designed to produce an accurate dose calculation for advanced clinical applications. We have developed a flexible machine modeling process and validation tests for open-field and IMRT calculations. To complement the DPM code, a practical and versatile source model has been developed, whose parameters are derived from a standard set of planning system commissioning measurements. The primary photon spectrum and the spectrum resulting from the flattening filter are modeled by a Fatigue function, cut-off by a multiplying Fermi function, which effectively regularizes the difficult energy spectrum determination process. Commonly-used functions are applied to represent the off-axis softening, increasing primary fluence with increasing angle ('the horn effect'), and electron contamination. The patient dependent aspect of the MC dose calculation utilizes the multi-leaf collimator (MLC) leaf sequence file exported from the treatment planning system DICOM output, coupled with the source model, to derive the particle transport. This model has been commissioned for Varian 2100C 6 MV and 18 MV photon beams using percent depth dose, dose profiles, and output factors. A 3-D conformal plan and an IMRT plan delivered to an anthropomorphic thorax phantom were used to benchmark the model. The calculated results were compared to Pinnacle v7.6c results and measurements made using radiochromic film and thermoluminescent detectors (TLD).
Nanogrid rolling circle DNA sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Church, George M.; Porreca, Gregory J.; Shendure, Jay
The present invention relates to methods for sequencing a polynucleotide immobilized on an array having a plurality of specific regions each having a defined diameter size, including synthesizing a concatemer of a polynucleotide by rolling circle amplification, wherein the concatemer has a cross-sectional diameter greater than the diameter of a specific region, immobilizing the concatemer to the specific region to make an immobilized concatemer, and sequencing the immobilized concatemer.
Mining for class-specific motifs in protein sequence classification
2013-01-01
Background In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function, initially, harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weightage to the n-grams that appear in fewer classes and vice-versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. 
Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is very generic and can thus be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets, have a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes with a potential application to detecting proteome-specific motifs of different organisms. PMID:23496846
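The harvesting, normalization, and thresholding steps described in this abstract can be sketched in a few lines of Python. This is a simplified illustration, not the authors' scoring function: the dampening factor used here (dividing by the number of classes in which an n-gram occurs) is an assumption, the function names are hypothetical, and the BLOSUM62-based merging of similar n-grams is left out.

```python
from collections import Counter


def harvest_ngrams(sequence, n_min=4, n_max=8):
    """Yield every contiguous n-gram of length n_min..n_max in a sequence."""
    for n in range(n_min, n_max + 1):
        for i in range(len(sequence) - n + 1):
            yield sequence[i:i + n]


def discriminative_ngrams(classes, threshold, n_min=4, n_max=8):
    """Return {class: {ngram: score}} for n-grams scoring above the threshold.

    classes maps a class label to a list of protein sequences.
    """
    # Per-class frequency: fraction of the class's sequences containing the n-gram.
    freq = {}
    for label, seqs in classes.items():
        counts = Counter()
        for seq in seqs:
            counts.update(set(harvest_ngrams(seq, n_min, n_max)))
        freq[label] = {gram: c / len(seqs) for gram, c in counts.items()}

    # Number of classes in which each n-gram occurs at least once.
    class_count = Counter(gram for table in freq.values() for gram in table)

    # Assumed dampening: n-grams seen in fewer classes receive more weight.
    selected = {}
    for label, table in freq.items():
        scored = {gram: f / class_count[gram] for gram, f in table.items()}
        selected[label] = {g: s for g, s in scored.items() if s > threshold}
    return selected


# Toy sequences, purely illustrative.
toy = {
    "A": ["MKRRLAAA", "GGKRRLTT"],
    "B": ["MSTNPKPQ", "PKPQAAAA"],
}
motifs = discriminative_ngrams(toy, threshold=0.6, n_min=4, n_max=4)
```

In this toy input, only the 4-grams present in every sequence of one class and absent from the other class pass the threshold.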
2014-01-01
Background Deciphering of the information content of eukaryotic promoters has remained confined to universal landmarks and conserved sequence elements such as enhancers and transcription factor binding motifs, which are considered sufficient for gene activation and regulation. Gene-specific sequences, interspersed between the canonical transacting factor binding sites or adjoining them within a promoter, are generally taken to be devoid of any regulatory information and have therefore been largely ignored. An unanswered question therefore is, do gene-specific sequences within a eukaryotic promoter have a role in gene activation? Here, we present an exhaustive experimental analysis of a gene-specific sequence adjoining the heat shock element (HSE) in the proximal promoter of the small heat shock protein gene, αB-crystallin (cryab). These sequences are highly conserved between the rodents and the humans. Results Using human retinal pigment epithelial cells in culture as the host, we have identified a 10-bp gene-specific promoter sequence (GPS), which, unlike an enhancer, controls expression from the promoter of this gene, only when in appropriate position and orientation. Notably, the data suggests that GPS in comparison with the HSE works in a context-independent fashion. Additionally, when moved upstream, about a nucleosome length of DNA (−154 bp) from the transcription start site (TSS), the activity of the promoter is markedly inhibited, suggesting its involvement in local promoter access. Importantly, we demonstrate that deletion of the GPS results in complete loss of cryab promoter activity in transgenic mice. Conclusions These data suggest that gene-specific sequences such as the GPS, identified here, may have critical roles in regulating gene-specific activity from eukaryotic promoters. PMID:24589182
Meehan, Sean K.; Randhawa, Bubblepreet; Wessel, Brenda; Boyd, Lara A.
2010-01-01
Implicit motor learning is preserved after stroke, but how the brain compensates for damage to facilitate learning is unclear. We used a random effects analysis to determine how stroke alters patterns of brain activity during implicit sequence-specific motor learning as compared to general improvements in motor control. Nine healthy participants and 9 individuals with chronic, right focal sub-cortical stroke performed a continuous joystick-based tracking task during an initial fMRI session, over 5 days of practice, and a retention test during a separate fMRI session. Sequence-specific implicit motor learning was differentiated from general improvements in motor control by comparing tracking performance on a novel, repeated tracking sequence during early practice and again at the retention test. Both groups demonstrated implicit sequence-specific motor learning at the retention test, yet substantial differences were apparent. At retention, healthy control participants demonstrated increased BOLD response in left dorsal premotor cortex (BA 6) but decreased BOLD response in left dorsolateral prefrontal cortex (DLPFC; BA 9) during repeated sequence tracking. In contrast, at retention individuals with stroke did not show this reduction in DLPFC during repeated tracking. Instead, after stroke, implicit sequence-specific motor learning and general improvements in motor control were associated with increased BOLD response in the left middle frontal gyrus (BA 8), regardless of sequence type. These data emphasize the potential importance of a prefrontal-based attentional network for implicit motor learning after stroke. The present study is the first to highlight the importance of the prefrontal cortex for implicit sequence-specific motor learning after stroke. PMID:20725908
Specific minor groove solvation is a crucial determinant of DNA binding site recognition
Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.
2014-01-01
The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976
Promoter classifier: software package for promoter database analysis.
Gershenzon, Naum I; Ioshikhes, Ilya P
2005-01-01
Promoter Classifier is a package of seven stand-alone Windows-based C++ programs allowing the following basic manipulations with a set of promoter sequences: (i) calculation of positional distributions of nucleotides averaged over all promoters of the dataset; (ii) calculation of the averaged occurrence frequencies of the transcription factor binding sites and their combinations; (iii) division of the dataset into subsets of sequences containing or lacking certain promoter elements or combinations; (iv) extraction of the promoter subsets containing or lacking CpG islands around the transcription start site; and (v) calculation of spatial distributions of the promoter DNA stacking energy and bending stiffness. All programs have a user-friendly interface and provide the results in a convenient graphical form. The Promoter Classifier package is an effective tool for various basic manipulations with eukaryotic promoter sequences that usually are necessary for analysis of large promoter datasets. The program Promoter Divider is described in more detail as a representative component of the package.
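Manipulation (i), the positional distribution of nucleotides averaged over all promoters, can be illustrated with a short Python sketch. This is not code from the Promoter Classifier package (which consists of Windows C++ programs); the function name and the toy sequences are hypothetical, and the promoters are assumed to be pre-aligned and of equal length.

```python
from collections import Counter


def positional_distribution(promoters, alphabet="ACGT"):
    """Return, for each position, the fraction of promoters carrying each nucleotide."""
    length = len(promoters[0])
    if any(len(p) != length for p in promoters):
        raise ValueError("promoters must be aligned to the same length")
    table = []
    for pos in range(length):
        counts = Counter(p[pos] for p in promoters)
        table.append({nt: counts[nt] / len(promoters) for nt in alphabet})
    return table


# Toy aligned sequences, purely illustrative.
profile = positional_distribution(["TATA", "TATA", "CATA", "TGTA"])
```

Each entry of `profile` is one column of a position frequency matrix; in the toy input, T dominates the first position and A the last.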
Volume calculation of CT lung lesions based on Halton low-discrepancy sequences
NASA Astrophysics Data System (ADS)
Li, Shusheng; Wang, Liansheng; Li, Shuo
2017-03-01
Volume calculation from Computed Tomography (CT) lung lesion data provides a significant parameter for clinical diagnosis. The volume is widely used to assess the severity of lung nodules and to track their progression; however, the accuracy and efficiency of previous methods are insufficient for clinical use. The task remains challenging because lesions attach tightly to the lung wall, background noise is inhomogeneous, and lesion sizes and shapes vary widely. In this paper, we employ Halton low-discrepancy sequences to calculate the volume of lung lesions. The proposed method computes the volume directly, without three-dimensional (3D) model reconstruction or surface triangulation, which significantly improves efficiency and reduces complexity. The main steps of the proposed method are: (1) generate a certain number of random points in each slice using Halton low-discrepancy sequences and calculate the lesion area of each slice through the proportion; (2) obtain the volume by integrating the areas in the sagittal direction. To evaluate the proposed method, experiments were conducted on data sets containing lung lesions of different sizes. With the uniform distribution of random points, the proposed method achieves more accurate results than other methods, which demonstrates its robustness and accuracy for the volume calculation of CT lung lesions. In addition, the proposed method is easy to follow and can be extended to other applications, e.g., volume calculation of liver tumors, atrial wall aneurysms, etc.
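The two steps of the method can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: each slice is represented by a hypothetical `inside(x, y)` predicate rather than a segmented CT image, Halton bases 2 and 3 are assumed for the two in-plane coordinates, and the areas are summed along the slice-stacking axis with a uniform spacing.

```python
def halton(index, base):
    """Return the index-th element (1-based) of the Halton sequence for one base."""
    result = 0.0
    fraction = 1.0
    i = index
    while i > 0:
        fraction /= base
        result += fraction * (i % base)
        i //= base
    return result


def slice_area(inside, width, height, n_points=4096):
    """Estimate a lesion's area in one slice from the proportion of points inside it."""
    hits = 0
    for k in range(1, n_points + 1):
        x = halton(k, 2) * width   # base 2 for the first coordinate
        y = halton(k, 3) * height  # base 3 for the second coordinate
        if inside(x, y):
            hits += 1
    return hits / n_points * width * height


def lesion_volume(slices, width, height, spacing, n_points=4096):
    """Sum the per-slice areas and scale by the slice spacing."""
    return sum(slice_area(s, width, height, n_points) for s in slices) * spacing


def unit_disc(x, y):
    """Toy lesion: a disc of radius 1 centred in a 2 x 2 slice."""
    return (x - 1.0) ** 2 + (y - 1.0) ** 2 <= 1.0


area = slice_area(unit_disc, 2.0, 2.0)
volume = lesion_volume([unit_disc] * 10, 2.0, 2.0, spacing=0.5)
```

Because the Halton points cover each slice more evenly than pseudo-random points, the area proportion usually converges with fewer samples; the toy disc should give an area close to pi and the ten-slice stack a volume close to 5 pi.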
Parsons, Tom; Dreger, Douglas S.
2000-01-01
The proximity in time (∼7 years) and space (∼20 km) between the 1992 M=7.3 Landers earthquake and the 1999 M=7.1 Hector Mine event suggests a possible link between the quakes. We thus calculated the static stress changes following the 1992 Joshua Tree/Landers/Big Bear earthquake sequence on the 1999 M=7.1 Hector Mine rupture plane in southern California. Resolving the stress tensor into rake-parallel and fault-normal components and comparing with changes in the post-Landers seismicity rate allows us to estimate a coefficient of friction on the Hector Mine plane. Seismicity following the 1992 sequence increased at Hector Mine where the fault was unclamped. This increase occurred despite a calculated reduction in right-lateral shear stress. The dependence of seismicity change primarily on normal stress change implies a high coefficient of static friction (µ≥0.8). We calculated the Coulomb stress change using µ=0.8 and found that the Hector Mine hypocenter was mildly encouraged (0.5 bars) by the 1992 earthquake sequence. In addition, the region of peak slip during the Hector Mine quake occurred where Coulomb stress is calculated to have increased by 0.5–1.5 bars. In general, slip was more limited where Coulomb stress was reduced, though there was some slip where the strongest stress decrease was calculated. Interestingly, many smaller earthquakes nucleated at or near the 1999 Hector Mine hypocenter after 1992, but only in 1999 did an event spread to become a M=7.1 earthquake.
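The friction argument in this abstract rests on the Coulomb failure stress change, commonly written as ΔCFS = Δτ + μ·Δσn, where Δτ is the shear stress change resolved in the slip (rake) direction and Δσn is the fault-normal stress change, taken here as positive for unclamping. The Python sketch below shows why a high friction coefficient can yield a positive Coulomb change despite a shear stress reduction. The function name and the input stresses are illustrative assumptions chosen to mirror the sign pattern described above; they are not values from the study.

```python
def coulomb_stress_change(d_shear, d_normal, mu):
    """Coulomb failure stress change (same units as the inputs, e.g. bars).

    d_shear:  shear stress change in the rake direction (positive promotes slip)
    d_normal: fault-normal stress change (positive means unclamping)
    mu:       coefficient of friction
    """
    return d_shear + mu * d_normal


# Hypothetical inputs: shear stress drops while the fault is unclamped.
high_friction = coulomb_stress_change(d_shear=-0.3, d_normal=1.0, mu=0.8)
low_friction = coulomb_stress_change(d_shear=-0.3, d_normal=1.0, mu=0.2)
```

With these inputs the high-friction case gives a positive change (failure encouraged) and the low-friction case a negative one, which is the kind of contrast that lets seismicity rate changes constrain μ.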
NASA Astrophysics Data System (ADS)
Mielke, Steven P.; Grønbech-Jensen, Niels; Krishnan, V. V.; Fink, William H.; Benham, Craig J.
2005-09-01
The topological state of DNA in vivo is dynamically regulated by a number of processes that involve interactions with bound proteins. In one such process, the tracking of RNA polymerase along the double helix during transcription, restriction of rotational motion of the polymerase and associated structures, generates waves of overtwist downstream and undertwist upstream from the site of transcription. The resulting superhelical stress is often sufficient to drive double-stranded DNA into a denatured state at locations such as promoters and origins of replication, where sequence-specific duplex opening is a prerequisite for biological function. In this way, transcription and other events that actively supercoil the DNA provide a mechanism for dynamically coupling genetic activity with regulatory and other cellular processes. Although computer modeling has provided insight into the equilibrium dynamics of DNA supercoiling, to date no model has appeared for simulating sequence-dependent DNA strand separation under the nonequilibrium conditions imposed by the dynamic introduction of torsional stress. Here, we introduce such a model and present results from an initial set of computer simulations in which the sequences of dynamically superhelical, 147 base pair DNA circles were systematically altered in order to probe the accuracy with which the model can predict location, extent, and time of stress-induced duplex denaturation. The results agree both with well-tested statistical mechanical calculations and with available experimental information. Additionally, we find that sites susceptible to denaturation show a propensity for localizing to supercoil apices, suggesting that base sequence determines locations of strand separation not only through the energetics of interstrand interactions, but also by influencing the geometry of supercoiling.
Bai, W L; Yin, R H; Dou, Q L; Jiang, W Q; Zhao, S J; Ma, Z J; Luo, G B; Zhao, Z H
2011-04-01
κ-Casein is one of the major proteins in the milk of mammals. It plays an important role in determining the size and specific function of milk micelles. We have previously identified and characterized a genetic variant of yak κ-casein by evaluating genomic DNA. Here, we isolate and characterize a yak κ-casein cDNA harboring the full-length open reading frame (ORF) from lactating mammary gland. Total RNA was extracted from mammary tissue of lactating female yak, and the κ-casein cDNA was synthesized by RT-PCR, then cloned and sequenced. The obtained cDNA of 660 bp contained an ORF sufficient to encode the entire amino acid sequence of κ-casein precursor protein consisting of 190 amino acids with a signal peptide of 21 amino acids. Yak κ-casein has a predicted molecular mass of 19,006.588 Da with a calculated isoelectric point of 7.245. Compared with the corresponding sequences in GenBank of cattle, buffalo, sheep, goat, Arabian camel, horse, and rabbit, yak κ-casein sequence had identity of 64.76-98.78% in cDNA, and identity of 44.79-98.42% and similarity of 53.65-98.42% in deduced amino acids, revealing a high homology with the other livestock species. Based on κ-casein cDNA sequences, the phylogenetic analysis indicated that yak κ-casein had a close relationship with that of cattle. This work may be useful for genetic engineering research on yak κ-casein.
Mielke, Steven P; Grønbech-Jensen, Niels; Krishnan, V V; Fink, William H; Benham, Craig J
2005-09-22
The topological state of DNA in vivo is dynamically regulated by a number of processes that involve interactions with bound proteins. In one such process, the tracking of RNA polymerase along the double helix during transcription, restriction of rotational motion of the polymerase and associated structures, generates waves of overtwist downstream and undertwist upstream from the site of transcription. The resulting superhelical stress is often sufficient to drive double-stranded DNA into a denatured state at locations such as promoters and origins of replication, where sequence-specific duplex opening is a prerequisite for biological function. In this way, transcription and other events that actively supercoil the DNA provide a mechanism for dynamically coupling genetic activity with regulatory and other cellular processes. Although computer modeling has provided insight into the equilibrium dynamics of DNA supercoiling, to date no model has appeared for simulating sequence-dependent DNA strand separation under the nonequilibrium conditions imposed by the dynamic introduction of torsional stress. Here, we introduce such a model and present results from an initial set of computer simulations in which the sequences of dynamically superhelical, 147 base pair DNA circles were systematically altered in order to probe the accuracy with which the model can predict location, extent, and time of stress-induced duplex denaturation. The results agree both with well-tested statistical mechanical calculations and with available experimental information. Additionally, we find that sites susceptible to denaturation show a propensity for localizing to supercoil apices, suggesting that base sequence determines locations of strand separation not only through the energetics of interstrand interactions, but also by influencing the geometry of supercoiling.
Flanking sequence determination and specific PCR identification of transgenic wheat B102-1-2.
Cao, Jijuan; Xu, Junyi; Zhao, Tongtong; Cao, Dongmei; Huang, Xin; Zhang, Piqiao; Luan, Fengxia
2014-01-01
The exogenous fragment sequence and flanking sequence between the exogenous fragment and recombinant chromosome of transgenic wheat B102-1-2 were successfully acquired using genome walking technology. The newly acquired exogenous fragment encoded the full-length sequence of transformed genes with transformed plasmid and corresponding functional genes including ubi, vector pBANF-bar, vector pUbiGUSPlus, vector HSP, reporter vector pUbiGUSPlus, promoter ubiquitin, and coli DH1. A specific polymerase chain reaction (PCR) identification method for transgenic wheat B102-1-2 was established on the basis of designed primers according to flanking sequence. This established specific PCR strategy was validated by using transgenic wheat, transgenic corn, transgenic soybean, transgenic rice, and non-transgenic wheat. A specifically amplified target band was observed only in transgenic wheat B102-1-2. Therefore, this method is characterized by high specificity, high reproducibility, rapid identification, and excellent accuracy for the identification of transgenic wheat B102-1-2.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nosrati, R; Sunnybrook Health Sciences Centre, Toronto, Ontario; Soliman, A
Purpose: This study aims at developing an MRI-only workflow for post-implant dosimetry of the prostate LDR brachytherapy seeds. The specific goal here is to develop a post-processing algorithm to produce positive contrast for the seeds and prostatic calcifications and differentiate between them on MR images. Methods: An agar-based phantom incorporating four dummy seeds (I-125) and five calcifications of different sizes (from sheep cortical bone) was constructed. Seeds were placed arbitrarily in the coronal plane. The phantom was scanned with 3T Philips Achieva MR scanner using an 8-channel head coil array. Multi-echo turbo spin echo (ME-TSE) and multi-echo gradient recalled echo (ME-GRE) sequences were acquired. Due to minimal susceptibility artifacts around seeds, ME-GRE sequence (flip angle=15; TR/TE=20/2.3/2.3; resolution=0.7×0.7×2 mm^3) was further processed. The induced field inhomogeneity due to the presence of titanium-encapsulated seeds was corrected using a B0 field map. B0 map was calculated using the ME-GRE sequence by calculating the phase difference at two different echo times. Initially, the product of the first echo and B0 map was calculated. The features corresponding to the seeds were then extracted in three steps: 1) the edge pixels were isolated using “Prewitt” operator; 2) the Hough transform was employed to detect ellipses approximately matching the dimensions of the seeds and 3) at the position and orientation of the detected ellipses an ellipse was drawn on the B0-corrected image. Results: The proposed B0-correction process produced positive contrast for the seeds and calcifications. The Hough transform based on Prewitt edge operator successfully identified all the seeds according to their ellipsoidal shape and dimensions in the edge image. Conclusion: The proposed post-processing algorithm successfully visualized the seeds and calcifications with positive contrast and differentiates between them according to their shapes. 
Further assessments on more realistic phantoms and patient study are required to validate the outcome.
Sequence-specific DNA binding Pyrrole-imidazole polyamides and their applications.
Kawamoto, Yusuke; Bando, Toshikazu; Sugiyama, Hiroshi
2018-05-01
Pyrrole-imidazole polyamides (Py-Im polyamides) are cell-permeable compounds that bind to the minor groove of double-stranded DNA in a sequence-specific manner without causing denaturation of the DNA. These compounds can be used to control gene expression and to stain specific sequences in cells. Here, we review the history, structural variations, and functional investigations of Py-Im polyamides. Copyright © 2018 Elsevier Ltd. All rights reserved.
Sequence information signal processor for local and global string comparisons
Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.
1997-01-01
A sequence information signal processing integrated circuit chip designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16 bit, two's complement operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
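The dynamic programming recurrence that such a chip accelerates can be illustrated in software. The Python sketch below computes a Smith-Waterman local alignment score and clamps every cell to the 16-bit two's complement range mentioned in the abstract. It is a simplified stand-in rather than a model of the patented hardware: the scoring values, the single linear gap penalty (instead of separate deletion and insertion weight functions), and the function names are all assumptions made for illustration.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(value):
    """Clamp a score to the 16-bit two's complement range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b (deletion)
                curr[j - 1] + gap,   # gap in a (insertion)
            )
            curr[j] = saturate16(score)
            best = max(best, curr[j])
        prev = curr
    return best


identical = smith_waterman_score("ACGT", "ACGT")
embedded = smith_waterman_score("TTACGTTT", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal are independent of each other. That independence is what allows a linear systolic array to assign one processing element per column and compute a whole anti-diagonal per clock step.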
Murray, V
1999-01-01
This article reviews the literature concerning the sequence specificity of DNA-damaging agents. DNA-damaging agents are widely used in cancer chemotherapy. It is important to understand fully the determinants of DNA sequence specificity so that more effective DNA-damaging agents can be developed as antitumor drugs. There are five main methods of DNA sequence specificity analysis: cleavage of end-labeled fragments, linear amplification with Taq DNA polymerase, ligation-mediated polymerase chain reaction (PCR), single-strand ligation PCR, and footprinting. The DNA sequence specificity in purified DNA and in intact mammalian cells is reviewed for several classes of DNA-damaging agent. These include agents that form covalent adducts with DNA, free radical generators, topoisomerase inhibitors, intercalators and minor groove binders, enzymes, and electromagnetic radiation. The main sites of adduct formation are at the N-7 of guanine in the major groove of DNA and the N-3 of adenine in the minor groove, whereas free radical generators abstract hydrogen from the deoxyribose sugar and topoisomerase inhibitors cause enzyme-DNA cross-links to form. Several issues involved in the determination of the DNA sequence specificity are discussed. The future directions of the field, with respect to cancer chemotherapy, are also examined.
Duesberg, Peter H.; Vogt, Peter K.
1979-01-01
The genome of the defective avian tumor virus MH2 was identified as a RNA of 5.7 kilobases by its presence in different MH2-helper virus complexes and its absence from pure helper virus, by its unique fingerprint pattern of RNase T1-resistant (T1) oligonucleotides that differed from those of two helper virus RNAs, and by its structural analogy to the RNA of MC29, another avian acute leukemia virus. Two sets of sequences were distinguished in MH2 RNA: 66% hybridized with DNA complementary to helper-independent avian tumor viruses, termed group-specific, and 34% were specific. The percentage of specific sequences is considered a minimal estimate because the MH2 RNA used was about 30% contaminated by helper virus RNA. No sequences related to the transforming src gene of avian sarcoma viruses were found in MH2. MH2 shared three large T1 oligonucleotides with MC29, two of which could also be isolated from a RNase A- and T1-resistant hybrid formed between MH2 RNA and MC29 specific cDNA. These oligonucleotides belong to a group of six that define the specific segment of MC29 RNA described previously. The group-specific sequences of MH2 and MC29 RNA shared only the two smallest out of about 20 T1 oligonucleotides associated with MH2 RNA. It is concluded that the specific sequences of MH2 and MC29 are related, and it is proposed that they are necessary for, or identical with, the onc genes of these viruses. These sequences would define a related class of transforming genes in avian tumor viruses that differs from the src genes of avian sarcoma viruses. PMID:221900
Kim, Dae-Weung; Kim, Woo Hyoung; Kim, Myoung Hyoun; Kim, Chang Guhn
2015-11-01
Arginine-arginine-leucine (RRL) is considered a tumor endothelial cell-specific binding sequence. RRL-containing peptide targeting tumor vessels is an excellent candidate for tumor imaging. In this study, we developed RRL-containing hexapeptides and evaluated their feasibility as a tumor imaging agent in a HT-1080 fibrosarcoma-bearing murine model. The hexapeptide, glutamic acid-cysteine-glycine (ECG)-RRL was synthesized using Fmoc solid-phase peptide synthesis. Radiolabeling efficiency was evaluated using instant thin-layer chromatography. Uptake of Tc-99m ECG-RRL within HT-1080 cells was evaluated in vitro by confocal microscopy and cellular binding affinity was calculated. Gamma images were acquired in HT-1080 fibrosarcoma tumor-bearing mice, and the tumor-to-muscle uptake ratio was calculated. The inflammatory-to-normal muscle uptake ratio was also calculated in an inflammation mouse model. A biodistribution study was performed to calculate %ID/g. A high yield of Tc-99m ECG-RRL complexes was prepared after Tc-99m radiolabeling. Binding of Tc-99m ECG-RRL to tumor cells was confirmed by in vitro studies. Gamma camera imaging in the murine model showed that Tc-99m ECG-RRL accumulated substantially in the subcutaneously engrafted tumor and that tumoral uptake was blocked by co-injecting excess RRL. Moreover, Tc-99m ECG-RRL accumulated minimally in inflammatory lesions. We successfully developed Tc-99m ECG-RRL as a new tumor imaging candidate. Specific tumoral uptake of Tc-99m ECG-RRL was evaluated both in vitro and in vivo, and it was determined to be a good tumor imaging candidate. Additionally, Tc-99m ECG-RRL effectively distinguished between cancerous tissue and inflammatory lesions.
Babben, Steve; Perovic, Dragan; Koch, Michael; Ordon, Frank
2015-01-01
Recent declines in costs accelerated sequencing of many species with large genomes, including hexaploid wheat (Triticum aestivum L.). Although the draft sequence of bread wheat is known, it is still one of the major challenges to develop locus specific primers suitable to be used in marker assisted selection procedures, due to the high homology of the three genomes. In this study we describe an efficient approach for the development of locus specific primers comprising four steps, i.e. (i) identification of genomic and coding sequences (CDS) of candidate genes, (ii) intron- and exon-structure reconstruction, (iii) identification of wheat A, B and D sub-genome sequences and primer development based on sequence differences between the three sub-genomes, and (iv) testing of primers for functionality, correct size and localisation. This approach was applied to single, low and high copy genes involved in frost tolerance in wheat. In summary for 27 of these genes for which sequences were derived from Triticum aestivum, Triticum monococcum and Hordeum vulgare, a set of 119 primer pairs was developed and after testing on Nulli-tetrasomic (NT) lines, a set of 65 primer pairs (54.6%), corresponding to 19 candidate genes, turned out to be specific. Out of these a set of 35 fragments was selected for validation via Sanger's amplicon re-sequencing. All fragments, with the exception of one, could be assigned to the original reference sequence. The approach presented here showed a much higher specificity in primer development in comparison to techniques used so far in bread wheat and can be applied to other polyploid species with a known draft sequence. PMID:26565976
Method and computer program product for maintenance and modernization backlogging
Mattimore, Bernard G; Reynolds, Paul E; Farrell, Jill M
2013-02-19
According to one embodiment, a computer program product for determining future facility conditions includes a computer readable medium having computer readable program code stored therein. The computer readable program code includes computer readable program code for calculating a time period specific maintenance cost, for calculating a time period specific modernization factor, and for calculating a time period specific backlog factor. Future facility conditions equal the time period specific maintenance cost plus the time period specific modernization factor plus the time period specific backlog factor. In another embodiment, a computer-implemented method for calculating future facility conditions includes calculating a time period specific maintenance cost, calculating a time period specific modernization factor, and calculating a time period specific backlog factor. Future facility conditions equal the time period specific maintenance cost plus the time period specific modernization factor plus the time period specific backlog factor. Other embodiments are also presented.
Villard, Pierre; Malausa, Thibaut
2013-07-01
SP-Designer is an open-source program providing a user-friendly tool for the design of specific PCR primer pairs from a DNA sequence alignment containing sequences from various taxa. SP-Designer selects PCR primer pairs for the amplification of DNA from a target species on the basis of several criteria: (i) primer specificity, as assessed by interspecific sequence polymorphism in the annealing regions, (ii) the biochemical characteristics of the primers and (iii) the intended PCR conditions. SP-Designer generates tables, detailing the primer pair and PCR characteristics, and a FASTA file locating the primer sequences in the original sequence alignment. SP-Designer is Windows-compatible and freely available from http://www2.sophia.inra.fr/urih/sophia_mart/sp_designer/info_sp_designer.php. © 2013 John Wiley & Sons Ltd.
Strategies to Improve Efficiency and Specificity of Degenerate Primers in PCR.
Campos, Maria Jorge; Quesada, Alberto
2017-01-01
PCR with degenerate primers can be used to identify the coding sequence of an unknown protein or to detect a genetic variant within a gene family. These primers, which are complex mixtures of slightly different oligonucleotide sequences, can be optimized to increase the efficiency and/or specificity of PCR in the amplification of a sequence of interest by the introduction of mismatches with the target sequence and balancing their position toward the primers' 5'- or 3'-ends. In this work, we explain in detail examples of rational design of primers in two different applications, including the use of specific determinants at the 3'-end, to: (1) improve PCR efficiency with coding sequences for members of a protein family by full degeneration at a core box of conserved genetic information, with the reduction of degeneration at the 5'-end, and (2) optimize specificity of allelic discrimination of closely related orthologs by 5'-end degenerate primers.
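To make the notion of a degenerate primer as a "complex mixture" concrete, the following minimal sketch counts and enumerates the oligonucleotides encoded by a primer written with the standard IUPAC ambiguity codes, and reports how the degeneracy is split between the 5' and 3' halves. The primer sequence and the function names are hypothetical and are not taken from the article.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct oligonucleotides in the degenerate mixture."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every concrete sequence encoded by the degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

def degeneracy_by_half(primer):
    """Degeneracy of the 5' half and of the 3' half of the primer."""
    mid = len(primer) // 2
    return degeneracy(primer[:mid]), degeneracy(primer[mid:])

example = "ACNGGRTAYTG"  # hypothetical primer, for illustration only
print(degeneracy(example))          # size of the mixture
print(degeneracy_by_half(example))  # (5' half, 3' half)
```

Comparing the two halves is one simple way to check whether ambiguity has been concentrated toward the intended end of the primer.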
Thieme, Frank; Marillonnet, Sylvestre
2014-01-01
Identification of unknown sequences that flank known sequences of interest requires PCR amplification of DNA fragments that contain the junction between the known and unknown flanking sequences. Since amplified products often contain a mixture of specific and nonspecific products, the quick and clean (QC) cloning procedure was developed to clone specific products only. QC cloning is a ligation-independent cloning procedure that relies on the exonuclease activity of T4 DNA polymerase to generate single-stranded extensions at the ends of the vector and insert. A specific feature of QC cloning is the use of vectors that contain a sequence called catching sequence that allows cloning specific products only. QC cloning is performed by a one-pot incubation of insert and vector in the presence of T4 DNA polymerase at room temperature for 10 min followed by direct transformation of the incubation mix in chemo-competent Escherichia coli cells.
Jiang, Faming; Huang, Weiwei; Wang, Ye; Tian, Panwen; Chen, Xuerong; Liang, Zongan
2016-01-01
Smear-negative pulmonary tuberculosis (PTB) is common and difficult to diagnose. In this study, we investigated the diagnostic value of nucleic acid amplification testing and sequencing combined with acid-fast bacteria (AFB) staining of needle biopsy lung tissues for patients with suspected smear-negative PTB. Patients with suspected smear-negative PTB who underwent percutaneous transthoracic needle biopsy between May 1, 2012, and June 30, 2015, were enrolled in this retrospective study. Patients with AFB in sputum smears were excluded. All lung biopsy specimens were fixed in formalin, embedded in paraffin, and subjected to acid-fast staining and tuberculous polymerase chain reaction (TB-PCR). For patients with positive AFB and negative TB-PCR results in lung tissues, probe assays and 16S rRNA sequencing were used for identification of nontuberculous mycobacteria (NTM). The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic accuracy of PCR and AFB staining were calculated separately and in combination. Among the 220 eligible patients, 133 were diagnosed with TB (men/women: 76/57; age range: 17-80 years, confirmed TB: 9, probable TB: 124). Forty-eight patients who were diagnosed with other specific diseases were assigned as negative controls, and 39 patients with indeterminate final diagnosis were excluded from statistical analysis. The sensitivity, specificity, PPV, NPV, and accuracy of histological AFB (HAFB) for the diagnosis of smear-negative PTB were 61.7% (82/133), 100% (48/48), 100% (82/82), 48.5% (48/99), and 71.8% (130/181), respectively. The sensitivity, specificity, PPV, and NPV of histological PCR were 89.5% (119/133), 95.8% (46/48), 98.3% (119/121), and 76.7% (46/60), respectively, demonstrating that histological PCR had significantly higher accuracy (91.2% [165/181]) than histological acid-fast staining (71.8% [130/181]), P < 0.001.
Parallel testing of histological AFB staining and PCR showed the sensitivity, specificity, PPV, NPV, and accuracy to be 94.0% (125/133), 95.8% (46/48), 98.4% (125/127), 85.2% (46/54), and 94.5% (171/181), respectively. Among patients with positive AFB and negative PCR results in lung tissue specimens, two were diagnosed with NTM infections (Mycobacterium avium-intracellulare complex and Mycobacterium kansasii). Nucleic acid amplification testing combined with acid-fast staining in lung biopsy tissues can lead to early and accurate diagnosis in patients with smear-negative pulmonary tuberculosis. For patients with positive histological AFB and negative tuberculous PCR results in lung tissue, NTM infection should be suspected and could be identified by specific probe assays or 16S rRNA sequencing.
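The diagnostic measures reported above all derive from a 2x2 table of true/false positives and negatives. As a minimal sketch, the function below computes them from raw counts; the counts used in the example are read off the fractions quoted in the abstract for histological PCR and for parallel testing, while the function and variable names are illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),   # positives among diseased
        "specificity": tn / (tn + fp),   # negatives among non-diseased
        "ppv": tp / (tp + fp),           # diseased among test positives
        "npv": tn / (tn + fn),           # non-diseased among test negatives
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Histological PCR: 119 of 133 TB cases positive, 46 of 48 controls negative.
pcr = diagnostic_metrics(tp=119, fp=2, fn=14, tn=46)

# Parallel testing (AFB staining or PCR positive): 125 of 133 and 46 of 48.
parallel = diagnostic_metrics(tp=125, fp=2, fn=8, tn=46)

for name, result in (("PCR", pcr), ("parallel", parallel)):
    print(name, {k: round(100 * v, 1) for k, v in result.items()})
```

Note that the NPV denominator is the number of test-negative patients (true negatives plus false negatives), not the total cohort.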
Diagnosing periprosthetic infection: false-positive intraoperative Gram stains.
Oethinger, Margret; Warner, Debra K; Schindler, Susan A; Kobayashi, Hideo; Bauer, Thomas W
2011-04-01
Intraoperative Gram stains have a reported low sensitivity but high specificity when used to help diagnose periprosthetic infections. In early 2008, we recognized an unexpectedly high frequency of apparent false-positive Gram stains from revision arthroplasties. The purpose of this report is to describe the cause of these false-positive test results. We calculated the sensitivity and specificity of all intraoperative Gram stains submitted from revision arthroplasty cases during a 3-month interval using microbiologic cultures of the same samples as the gold standard. Methods of specimen harvesting, handling, transport, distribution, specimen processing including tissue grinding/macerating, Gram staining, and interpretation were studied. After a test modification, results of specimens were prospectively collected for a second 3-month interval, and the sensitivity and specificity of intraoperative Gram stains were calculated. The retrospective review of 269 Gram stains submitted from revision arthroplasties indicated historic sensitivity and specificity values of 23% and 92%, respectively. Systematic analysis of all steps of the procedure identified Gram-stained but nonviable bacteria in commercial broth reagents used as diluents for maceration of periprosthetic membranes before Gram staining and culture. Polymerase chain reaction and sequencing showed mixed bacterial DNA. Evaluation of 390 specimens after initiating standardized Millipore filtering of diluent fluid revealed a reduced number of positive Gram stains, yielding 9% sensitivity and 99% specificity. Clusters of false-positive Gram stains have been reported in other clinical conditions. They are apparently rare related to diagnosing periprosthetic infections but have severe consequences if used to guide treatment. Even occasional false-positive Gram stains should prompt review of laboratory methods. 
Our observations implicate dead bacteria in microbiologic reagents as potential sources of false-positive Gram stains.
Engineering peptide ligase specificity by proteomic identification of ligation sites.
Weeks, Amy M; Wells, James A
2018-01-01
Enzyme-catalyzed peptide ligation is a powerful tool for site-specific protein bioconjugation, but stringent enzyme-substrate specificity limits its utility. We developed an approach for comprehensively characterizing peptide ligase specificity for N termini using proteome-derived peptide libraries. We used this strategy to characterize the ligation efficiency for >25,000 enzyme-substrate pairs in the context of the engineered peptide ligase subtiligase and identified a family of 72 mutant subtiligases with activity toward N-terminal sequences that were previously recalcitrant to modification. We applied these mutants individually for site-specific bioconjugation of purified proteins, including antibodies, and in algorithmically selected combinations for sequencing of the cellular N terminome with reduced sequence bias. We also developed a web application to enable algorithmic selection of the most efficient subtiligase variant(s) for bioconjugation to user-defined sequences. Our methods provide a new toolbox of enzymes for site-specific protein modification and a general approach for rapidly defining and engineering peptide ligase specificity.
Kurka, Hedwig; Ehrenreich, Armin; Ludwig, Wolfgang; Monot, Marc; Rupnik, Maja; Barbut, Frederic; Indra, Alexander; Dupuy, Bruno; Liebl, Wolfgang
2014-01-01
PCR-ribotyping is a broadly used method for the classification of isolates of Clostridium difficile, an emerging intestinal pathogen, causing infections with increased disease severity and incidence in several European and North American countries. We have now carried out clustering analysis with selected genes of numerous C. difficile strains as well as gene content comparisons of their genomes in order to broaden our view of the relatedness of strains assigned to different ribotypes. We analyzed the genomic content of 48 C. difficile strains representing 21 different ribotypes. The calculation of distance matrix-based dendrograms using the neighbor joining method for 14 conserved genes (standard phylogenetic marker genes) from the genomes of the C. difficile strains demonstrated that the genes from strains with the same ribotype generally clustered together. Further, certain ribotypes always clustered together and formed ribotype groups, i.e. ribotypes 078, 033 and 126, as well as ribotypes 002 and 017, indicating their relatedness. Comparisons of the gene contents of the genomes of ribotypes that clustered according to the conserved gene analysis revealed that the number of common genes of the ribotypes belonging to each of these three ribotype groups were very similar for the 078/033/126 group (at most 69 specific genes between the different strains with the same ribotype) but less similar for the 002/017 group (86 genes difference). It appears that the ribotype is indicative not only of a specific pattern of the amplified 16S–23S rRNA intergenic spacer but also reflects specific differences in the nucleotide sequences of the conserved genes studied here. It can be anticipated that the sequence deviations of more genes of C. difficile strains are correlated with their PCR-ribotype. In conclusion, the results of this study corroborate and extend the concept of clonal C. difficile lineages, which correlate with ribotypes affiliation. PMID:24482682
In-depth resistome analysis by targeted metagenomics.
Lanza, Val F; Baquero, Fernando; Martínez, José Luís; Ramos-Ruíz, Ricardo; González-Zorn, Bruno; Andremont, Antoine; Sánchez-Valenzuela, Antonio; Ehrlich, Stanislav Dusko; Kennedy, Sean; Ruppé, Etienne; van Schaik, Willem; Willems, Rob J; de la Cruz, Fernando; Coque, Teresa M
2018-01-15
Antimicrobial resistance is a major global health challenge. Metagenomics allows analyzing the presence and dynamics of "resistomes" (the ensemble of genes encoding antimicrobial resistance in a given microbiome) in disparate microbial ecosystems. However, the low sensitivity and specificity of available metagenomic methods preclude the detection of minority populations (often present below their detection threshold) and/or the identification of allelic variants that differ in the resulting phenotype. Here, we describe a novel strategy that combines targeted metagenomics using last generation in-solution capture platforms, with novel bioinformatics tools to establish a standardized framework that allows both quantitative and qualitative analyses of resistomes. We developed ResCap, a targeted sequence capture platform based on SeqCapEZ (NimbleGene) technology, which includes probes for 8667 canonical resistance genes (7963 antibiotic resistance genes and 704 genes conferring resistance to metals or biocides), and 2517 relaxase genes (plasmid markers) and 78,600 genes homologous to the previous identified targets (47,806 for antibiotics and 30,794 for biocides or metals). Its performance was compared with metagenomic shotgun sequencing (MSS) for 17 fecal samples (9 humans, 8 swine). ResCap significantly improves MSS to detect "gene abundance" (from 2.0 to 83.2%) and "gene diversity" (26 versus 14.9 genes unequivocally detected per sample per million of reads; the number of reads unequivocally mapped increasing up to 300-fold by using ResCap), which were calculated using novel bioinformatic tools. ResCap also facilitated the analysis of novel genes potentially involved in the resistance to antibiotics, metals, biocides, or any combination thereof. 
ResCap, the first targeted sequence capture, specifically developed to analyze resistomes, greatly enhances the sensitivity and specificity of available metagenomic methods and offers the possibility to analyze genes related to the selection and transfer of antimicrobial resistance (biocides, heavy metals, plasmids). The model opens the possibility to study other complex microbial systems in which minority populations play a relevant role.
Swenson, Luke C; Moores, Andrew; Low, Andrew J; Thielen, Alexander; Dong, Winnie; Woods, Conan; Jensen, Mark A; Wynhoven, Brian; Chan, Dennison; Glascock, Christopher; Harrigan, P Richard
2010-08-01
Tropism testing should rule out CXCR4-using HIV before treatment with CCR5 antagonists. Currently, the recombinant phenotypic Trofile assay (Monogram) is most widely utilized; however, genotypic tests may represent alternative methods. Independent triplicate amplifications of the HIV gp120 V3 region were made from either plasma HIV RNA or proviral DNA. These underwent standard, population-based sequencing with an ABI3730 (RNA n = 63; DNA n = 40), or "deep" sequencing with a Roche/454 Genome Sequencer-FLX (RNA n = 12; DNA n = 12). Position-specific scoring matrices (PSSMX4/R5) (-6.96 cutoff) and geno2pheno[coreceptor] (5% false-positive rate) inferred tropism from V3 sequence. These methods were then independently validated with a separate, blinded dataset (n = 278) of screening samples from the maraviroc MOTIVATE trials. Standard sequencing of HIV RNA with PSSM yielded 69% sensitivity and 91% specificity, relative to Trofile. The validation dataset gave 75% sensitivity and 83% specificity. Proviral DNA plus PSSM gave 77% sensitivity and 71% specificity. "Deep" sequencing of HIV RNA detected >2% inferred-CXCR4-using virus in 8/8 samples called non-R5 by Trofile, and <2% in 4/4 samples called R5. Triplicate analyses of V3 standard sequence data detect greater proportions of CXCR4-using samples than previously achieved. Sequencing proviral DNA and "deep" V3 sequencing may also be useful tools for assessing tropism.
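The position-specific scoring matrix approach mentioned above can be sketched generically: each position of an aligned sequence contributes a residue-specific score, and the summed score is compared with a cutoff. The toy matrix, the toy cutoff and the convention that scores at or above the cutoff are called non-R5 are assumptions made for illustration; they are not the published PSSMX4/R5 matrix or its -6.96 cutoff.

```python
def score_sequence(seq, pssm, default=0.0):
    """Sum per-position residue scores; seq must be aligned to the matrix."""
    if len(seq) != len(pssm):
        raise ValueError("sequence length must match the number of PSSM columns")
    return sum(column.get(residue, default) for residue, column in zip(seq, pssm))

def call_tropism(seq, pssm, cutoff):
    """Assumed convention: scores at or above the cutoff are called non-R5."""
    return "non-R5" if score_sequence(seq, pssm) >= cutoff else "R5"

# Hypothetical three-column matrix (a real V3 matrix has one column per
# aligned V3 position and scores for all twenty amino acids).
toy_pssm = [
    {"C": 0.0},
    {"T": -1.0, "R": 1.5},
    {"R": 2.0, "S": -2.5},
]
toy_cutoff = 0.0  # placeholder, not the published cutoff

for seq in ("CTS", "CRR"):
    print(seq, score_sequence(seq, toy_pssm), call_tropism(seq, toy_pssm, toy_cutoff))
```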
Whitaker, Weston R; Lee, Hanson; Arkin, Adam P; Dueber, John E
2015-03-20
Genetic sequences ported into non-native hosts for synthetic biology applications can gain unexpected properties. In this study, we explored sequences functioning as ribosome binding sites (RBSs) within protein coding DNA sequences (CDSs) that cause internal translation, resulting in truncated proteins. Genome-wide prediction of bacterial RBSs, based on biophysical calculations employed by the RBS calculator, suggests a selection against internal RBSs within CDSs in Escherichia coli, but not those in Saccharomyces cerevisiae. Based on these calculations, silent mutations aimed at removing internal RBSs can effectively reduce truncation products from internal translation. However, a solution for complete elimination of internal translation initiation is not always feasible due to constraints of available coding sequences. Fluorescence assays and Western blot analysis showed that in genes with internal RBSs, increasing the strength of the intended upstream RBS had little influence on the internal translation strength. Another strategy to minimize truncated products from an internal RBS is to increase the relative strength of the upstream RBS with a concomitant reduction in promoter strength to achieve the same protein expression level. Unfortunately, lower transcription levels result in increased noise at the single cell level due to stochasticity in gene expression. At the low expression regimes desired for many synthetic biology applications, this problem becomes particularly pronounced. We found that balancing promoter strengths and upstream RBS strengths to intermediate levels can achieve the target protein concentration while avoiding both excessive noise and truncated protein.
Price, Anthony N.; Padormo, Francesco; Hajnal, Joseph V.; Malik, Shaihan J.
2017-01-01
Cardiac magnetic resonance imaging (MRI) at high field presents challenges because of the high specific absorption rate and significant transmit field (B1+) inhomogeneities. Parallel transmission MRI offers the ability to correct for both issues at the level of individual radiofrequency (RF) pulses, but must operate within strict hardware and safety constraints. The constraints are themselves affected by sequence parameters, such as the RF pulse duration and TR, meaning that an overall optimal operating point exists for a given sequence. This work seeks to obtain optimal performance by performing a ‘sequence‐level’ optimization in which pulse sequence parameters are included as part of an RF shimming calculation. The method is applied to balanced steady‐state free precession cardiac MRI with the objective of minimizing TR, hence reducing the imaging duration. Results are demonstrated using an eight‐channel parallel transmit system operating at 3 T, with an in vivo study carried out on seven male subjects of varying body mass index (BMI). Compared with single‐channel operation, a mean‐squared‐error shimming approach leads to reduced imaging durations of 32 ± 3% with simultaneous improvement in flip angle homogeneity of 32 ± 8% within the myocardium. PMID:28195684
Hackenberg, Michael; Rodríguez-Ezpeleta, Naiara; Aransay, Ana M.
2011-01-01
We present a new version of miRanalyzer, a web server and stand-alone tool for the detection of known and prediction of new microRNAs in high-throughput sequencing experiments. The new version has been notably improved regarding speed, scope and available features. Alignments are now based on the ultrafast short-read aligner Bowtie (granting also colour space support, allowing mismatches and improving speed) and 31 genomes, including 6 plant genomes, can now be analysed (previous version contained only 7). Differences between plant and animal microRNAs have been taken into account for the prediction models and differential expression of both, known and predicted microRNAs, between two conditions can be calculated. Additionally, consensus sequences of predicted mature and precursor microRNAs can be obtained from multiple samples, which increases the reliability of the predicted microRNAs. Finally, a stand-alone version of the miRanalyzer that is based on a local and easily customized database is also available; this allows the user to have more control on certain parameters as well as to use specific data such as unpublished assemblies or other libraries that are not available in the web server. miRanalyzer is available at http://bioinfo2.ugr.es/miRanalyzer/miRanalyzer.php. PMID:21515631
Kudo, H; Doi, Y; Ueda, H; Kaeriyama, M
2009-09-01
Despite the importance of olfactory receptor neurons (ORNs) for homing migration, the expression of olfactory marker protein (OMP) is not well understood in ORNs of Pacific salmon (genus Oncorhynchus). In this study, salmon OMP was characterized in the olfactory epithelia of lacustrine sockeye salmon (O. nerka) by molecular biological and histochemical techniques. Two cDNAs encoding salmon OMP were isolated and sequenced. These cDNAs both contained a coding region encoding 173 amino acid residues, and the molecular mass of the two proteins was calculated to be 19,581.17 and 19,387.11 Da, respectively. Both amino acid sequences showed marked homology (90%). The protein and nucleotide sequencing demonstrates the existence of high-level homology between salmon OMPs and those of other teleosts. By in situ hybridization using a digoxigenin-labeled salmon OMP cRNA probe, signals for salmon OMP mRNA were observed preferentially in the perinuclear regions of the ORNs. By immunohistochemistry using a specific antibody to salmon OMP, OMP-immunoreactivities were noted in the cytosol of those neurons. The present study is the first to describe cDNA cloning of OMP in salmon olfactory epithelium, and indicates that OMP is a useful molecular marker for the detection of the ORNs in Pacific salmon.
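A molecular mass "calculated" from a deduced amino acid sequence is usually obtained by summing average residue masses and adding one water for the free termini. The sketch below shows that calculation using the standard average residue masses; it is an assumption that the authors used this convention, and because the OMP sequences are not given in the abstract, the peptide in the example is a placeholder.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one H2O for the free N- and C-termini

def average_mass(sequence):
    """Average molecular mass (Da) of an unmodified linear polypeptide."""
    residues = sequence.upper()
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in residues) + WATER_MASS

placeholder_peptide = "MAEDGK"  # hypothetical sequence, not salmon OMP
print(round(average_mass(placeholder_peptide), 2))
```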
NASA Astrophysics Data System (ADS)
Sajjadi, Seyed; Buelna, Xavier; Eloranta, Jussi
2018-01-01
Application of inexpensive light emitting diodes as backlight sources for time-resolved shadowgraph imaging is demonstrated. The two light sources tested are able to produce light pulse sequences in the nanosecond and microsecond time regimes. After determining their time response characteristics, the diodes were applied to study the gas bubble formation around laser-heated copper nanoparticles in superfluid helium at 1.7 K and to determine the local cavitation bubble dynamics around fast moving metal micro-particles in the liquid. A convolutional neural network algorithm for analyzing the shadowgraph images by a computer is presented and the method is validated against the results from manual image analysis. The second application employed the red-green-blue light emitting diode source that produces light pulse sequences of the individual colors such that three separate shadowgraph frames can be recorded onto the color pixels of a charge-coupled device camera. Such an image sequence can be used to determine the moving object geometry, local velocity, and acceleration/deceleration. These data can be used to calculate, for example, the instantaneous Reynolds number for the liquid flow around the particle. Although specifically demonstrated for superfluid helium, the technique can be used to study the dynamic response of any medium that exhibits spatial variations in the index of refraction.
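The three-frame analysis described above can be sketched with finite differences: two successive displacements give two velocities, their difference gives the acceleration, and the speed feeds the Reynolds number Re = ρ v d / μ. The sketch assumes equally spaced light pulses and one-dimensional motion; all numerical values are placeholders rather than data from the paper, and the appropriate density and viscosity for superfluid helium are left to the reader.

```python
def kinematics_from_three_frames(x0, x1, x2, dt):
    """Velocities and acceleration from three positions spaced dt apart."""
    v01 = (x1 - x0) / dt          # mean velocity between frames 0 and 1
    v12 = (x2 - x1) / dt          # mean velocity between frames 1 and 2
    acceleration = (v12 - v01) / dt
    return v01, v12, acceleration

def reynolds_number(density, speed, diameter, viscosity):
    """Re = rho * |v| * d / mu for flow around a particle of size d."""
    return density * abs(speed) * diameter / viscosity

# Placeholder values: positions in metres, pulse spacing in seconds.
v01, v12, a = kinematics_from_three_frames(0.0, 1.0e-4, 1.9e-4, dt=1.0e-6)
re = reynolds_number(density=145.0, speed=v12, diameter=1.0e-5, viscosity=1.0e-6)
print(v01, v12, a, re)
```

A negative acceleration here corresponds to the particle decelerating between the second and third colour frames.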
SAR and scan-time optimized 3D whole-brain double inversion recovery imaging at 7T.
Pracht, Eberhard D; Feiweier, Thorsten; Ehses, Philipp; Brenner, Daniel; Roebroeck, Alard; Weber, Bernd; Stöcker, Tony
2018-05-01
The aim of this project was to implement an ultra-high field (UHF) optimized double inversion recovery (DIR) sequence for gray matter (GM) imaging, enabling whole brain coverage in short acquisition times (≈5 min, image resolution 1 mm³). A 3D variable flip angle DIR turbo spin echo (TSE) sequence was optimized for UHF application. We implemented an improved, fast, and specific absorption rate (SAR) efficient TSE imaging module, utilizing improved reordering. The DIR preparation was tailored to UHF application. Additionally, fat artifacts were minimized by employing water excitation instead of fat saturation. GM images, covering the whole brain, were acquired in 7 min scan time at 1 mm isotropic resolution. SAR issues were overcome by using a dedicated flip angle calculation considering SAR and SNR efficiency. Furthermore, UHF related artifacts were minimized. The suggested sequence is suitable to generate GM images with whole-brain coverage at UHF. Due to the short total acquisition times and overall robustness, this approach can potentially enable DIR application in a routine setting and enhance lesion detection in neurological diseases. Magn Reson Med 79:2620-2628, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
NMR-based automated protein structure determination.
Würz, Julia M; Kazemi, Sina; Schmidt, Elena; Bagaria, Anurag; Güntert, Peter
2017-08-15
NMR spectra analysis for protein structure determination can now in many cases be performed by automated computational methods. This overview of the computational methods for NMR protein structure analysis presents recent automated methods for signal identification in multidimensional NMR spectra, sequence-specific resonance assignment, collection of conformational restraints, and structure calculation, as implemented in the CYANA software package. These algorithms are sufficiently reliable and integrated into one software package to enable the fully automated structure determination of proteins starting from NMR spectra without manual interventions or corrections at intermediate steps, with an accuracy of 1-2 Å backbone RMSD in comparison with manually solved reference structures. Copyright © 2017 Elsevier Inc. All rights reserved.
MODBASE, a database of annotated comparative protein structure models
Pieper, Ursula; Eswar, Narayanan; Stuart, Ashley C.; Ilyin, Valentin A.; Sali, Andrej
2002-01-01
MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304 517 out of 539 171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10⁻⁴) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server. PMID:11752309
Restoration of distorted depth maps calculated from stereo sequences
NASA Technical Reports Server (NTRS)
Damour, Kevin; Kaufman, Howard
1991-01-01
A model-based Kalman estimator is developed for spatial-temporal filtering of noise and other degradations in velocity and depth maps derived from image sequences or cinema. As an illustration of the proposed procedures, edge information from image sequences of rigid objects is used in the processing of the velocity maps by selecting from a series of models for directional adaptive filtering. Adaptive filtering then allows for noise reduction while preserving sharpness in the velocity maps. Results from several synthetic and real image sequences are given.
Thomeer, Maarten G; Steensma, Anneke B; van Santbrink, Evert J; Willemssen, Francois E; Wielopolski, Piotr A; Hunink, Myriam G; Spronk, Sandra; Laven, Joop S; Krestin, Gabriel P
2014-04-01
The aim of this study was to determine whether an optimized 3.0-Tesla magnetic resonance imaging (MRI) protocol is sensitive and specific enough to detect patients with endometriosis. This was a prospective cohort study with consecutive patients. Forty consecutive patients with clinical suspicion of endometriosis underwent 3.0-Tesla MRI, including a T2-weighted high-resolution fast spin echo sequence (spatial resolution = 0.75 × 1.2 × 1.5 mm³) and a 3D T1-weighted high-resolution gradient echo sequence (spatial resolution = 0.75 × 1.2 × 2.0 mm³). Two radiologists reviewed the dataset with consensus reading. During laparoscopy, which was used as reference standard, all lesions were characterized according to the revised criteria of the American Fertility Society. Patient-level and region-level sensitivities and specificities and lesion-level sensitivities were calculated. Patient-level sensitivity was 42% for stage I (5/12) and 100% for stages II, III and IV (25/25). Patient-level specificity for all stages was 100% (3/3). The region-level sensitivity and specificity were 63% and 97%, respectively. The sensitivity per lesion was 61% (90% for deep lesions, 48% for superficial lesions and 100% for endometriomata). The detection rate of obliteration of the cul-de-sac was 100% (10/10) with no false positive findings. The interreader agreement was substantial to perfect (kappa = 1 per patient, 0.65 per lesion and 0.71 for obliteration of the cul-de-sac). An optimized 3.0-Tesla MRI protocol is accurate in detecting stage II to stage IV endometriosis. © 2014 The Authors. Journal of Obstetrics and Gynaecology Research © 2014 Japan Society of Obstetrics and Gynecology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Darbon, H.; Weber, C.; Braun, W.
1991-02-19
Sequence-specific nuclear magnetic resonance assignments for the polypeptide backbone and for most of the amino acid side-chain protons, as well as the general folding of AaH IT, are described. AaH IT is a neurotoxin purified from the venom of the scorpion Androctonus australis Hector and is specifically active on the insect nervous system. The secondary structure and the hydrogen-bonding patterns in the regular secondary structure elements are deduced from nuclear Overhauser effects and the sequence locations of the slowly exchanging amide protons. The backbone folding is determined by distance geometry calculations with the DISMAN program. The regular secondary structure includes two and a half turns of α-helix running from residues 21 to 30 and a three-stranded antiparallel β-sheet including peptides 3-5, 34-38, and 41-46. Two tight turns are present, one connecting the end of the α-helix to an external strand of the β-sheet, i.e., turn 31-34, and another connecting this same strand to the central one, i.e., turn 38-41. The differences in the specificity of these related proteins, which are able to discriminate between mammalian and insect voltage-dependent sodium channels of excitable tissues, are most probably brought about by the position of the C-terminal peptide with regard to a hydrophobic surface common to all scorpion toxins examined thus far. Thus, the interaction of a given scorpion toxin with its receptor might well be governed by the presence of this solvent-exposed hydrophobic surface, whereas adjacent areas modulate the specificity of the interaction.
Turkbey, Baris; Xu, Sheng; Kruecker, Jochen; Locklin, Julia; Pang, Yuxi; Shah, Vijay; Bernardo, Marcelino; Baccala, Angelo; Rastinehad, Ardeshir; Benjamin, Compton; Merino, Maria J; Wood, Bradford J; Choyke, Peter L; Pinto, Peter A
2011-03-29
During transrectal ultrasound (TRUS)-guided prostate biopsies, the actual location of the biopsy site is rarely documented. Here, we demonstrate the capability of TRUS-magnetic resonance imaging (MRI) image fusion to document the biopsy site and correlate biopsy results with multi-parametric MRI findings. Fifty consecutive patients (median age 61 years) with a median prostate-specific antigen (PSA) level of 5.8 ng/ml underwent 12-core TRUS-guided biopsy of the prostate. Pre-procedural T2-weighted magnetic resonance images were fused to TRUS. A disposable needle guide with miniature tracking sensors was attached to the TRUS probe to enable fusion with MRI. Real-time TRUS images during biopsy and the corresponding tracking information were recorded. Each biopsy site was superimposed onto the MRI. Each biopsy site was classified as positive or negative for cancer based on the results of each MRI sequence. Sensitivity, specificity, and receiver operating characteristic (ROC) area under the curve (AUC) values were calculated for multi-parametric MRI. Gleason scores for each multi-parametric MRI pattern were also evaluated. Six hundred and five systematic biopsy cores were analyzed in 50 patients, of whom 20 patients had 56 positive cores. MRI identified 34 of 56 positive cores. Overall, sensitivity, specificity, and ROC area values for multi-parametric MRI were 0.607, 0.727, and 0.667, respectively. TRUS-MRI fusion after biopsy can be used to document the location of each biopsy site, which can then be correlated with MRI findings. Based on correlation with tracked biopsies, T2-weighted MRI and apparent diffusion coefficient maps derived from diffusion-weighted MRI are the most sensitive sequences, whereas the addition of delayed contrast enhancement MRI and three-dimensional magnetic resonance spectroscopy demonstrated higher specificity consistent with results obtained using radical prostatectomy specimens.
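The reported sensitivity can be reproduced from the core counts given in the abstract (34 of 56 positive cores identified by MRI). The sketch below is only an illustration of the standard definitions; the function names are assumptions made for this example, and the specificity helper is shown for completeness without inventing counts the abstract does not report.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive cores that the test flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly negative cores that the test cleared."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the abstract: 56 positive cores, 34 identified by MRI.
positive_cores = 56
mri_detected = 34
missed = positive_cores - mri_detected  # 22 positive cores not flagged

print(round(sensitivity(mri_detected, missed), 3))  # 0.607
```

The true-negative and false-positive counts are not stated in the abstract, so the 0.727 specificity is not recomputed here.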
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.
Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A
1993-01-01
The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich, which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and it also shows that this protein can further activate transcription in cells in culture. PMID:7909943
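The consensus A, A/T, T, A/T, A, T, A/G stated above can be written as a 7-base pattern and used to scan a sequence. This is a minimal sketch, not the authors' procedure: the function name and the example sequence are made up for illustration, and only the given strand is scanned (no reverse complement).

```python
import re

# Consensus from the abstract: A, A/T, T, A/T, A, T, A/G.
# A lookahead is used so that overlapping sites are also reported.
CDXA_SITE = re.compile(r"(?=(A[AT]T[AT]AT[AG]))")


def find_cdxa_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 7-mer) for every consensus hit."""
    return [(m.start(), m.group(1)) for m in CDXA_SITE.finditer(sequence.upper())]


# Hypothetical test sequence, not taken from the paper.
print(find_cdxa_sites("GGATTTATGCC"))  # [(2, 'ATTTATG')]
```

A consensus match only marks a candidate site; the abstract notes that different target sequences bind CDXA with different affinities.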
Java web tools for PCR, in silico PCR, and oligonucleotide assembly and analysis.
Kalendar, Ruslan; Lee, David; Schulman, Alan H
2011-08-01
The polymerase chain reaction is fundamental to molecular biology and is the most important practical molecular technique for the research laboratory. We have developed and tested efficient tools for PCR primer and probe design, which also predict oligonucleotide properties based on experimental studies of PCR efficiency. The tools provide comprehensive facilities for designing primers for most PCR applications and their combinations, including standard, multiplex, long-distance, inverse, real-time, unique, group-specific, bisulphite modification assays, Overlap-Extension PCR Multi-Fragment Assembly, as well as a programme to design oligonucleotide sets for long sequence assembly by ligase chain reaction. The in silico PCR primer or probe search includes comprehensive analyses of individual primers and primer pairs. It calculates the melting temperature for standard and degenerate oligonucleotides including LNA and other modifications, provides analyses for a set of primers with prediction of oligonucleotide properties, dimer and G-quadruplex detection, linguistic complexity, and provides a dilution and resuspension calculator. Copyright © 2011 Elsevier Inc. All rights reserved.
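As a rough illustration of what a melting-temperature estimate involves, the sketch below uses the classic Wallace rule of thumb, Tm ≈ 2·(A+T) + 4·(G+C) °C, which is commonly quoted for short oligonucleotides. This is an assumption chosen for simplicity and is not the method implemented in the tools described above, which account for degenerate bases, LNA and other modifications.

```python
def wallace_tm(oligo: str) -> int:
    """Wallace-rule melting temperature estimate in degrees Celsius.

    Rule of thumb for short oligos only; ignores salt, concentration,
    and nearest-neighbour effects.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if at + gc != len(oligo):
        raise ValueError("only unambiguous A/C/G/T bases are supported")
    return 2 * at + 4 * gc


# Hypothetical 8-mer with four A/T and four G/C bases.
print(wallace_tm("ATGCATGC"))  # 24
```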
Majumder, P; Choudhury, A; Banerjee, M; Lahiri, A; Bhattacharyya, N P
2007-08-01
To investigate the mechanism of increased expression of caspase-1 caused by exogenous Hippi, observed earlier in HeLa and Neuro2A cells, in this work we identified a specific motif AAAGACATG (-101 to -93) at the caspase-1 gene upstream sequence where HIPPI could bind. Various mutations in this specific sequence compromised the interaction, showing the specificity of the interactions. In the luciferase reporter assay, when the reporter gene was driven by caspase-1 gene upstream sequences (-151 to -92) with the mutation G to T at position -98, luciferase activity was decreased significantly in green fluorescent protein-Hippi-expressing HeLa cells in comparison to that obtained with the wild-type caspase-1 gene 60 bp upstream sequence, indicating the biological significance of such binding. It was observed that the C-terminal 'pseudo' death effector domain of HIPPI interacted with the 60 bp (-151 to -92) upstream sequence of the caspase-1 gene containing the motif. We further observed that expression of caspase-8 and caspase-10 was increased in green fluorescent protein-Hippi-expressing HeLa cells. In addition, HIPPI interacted in vitro with putative promoter sequences of these genes, containing a similar motif. In summary, we identified a novel function of HIPPI; it binds to specific upstream sequences of the caspase-1, caspase-8 and caspase-10 genes and alters the expression of the genes. This result showed the motif-specific interaction of HIPPI with DNA, and indicates that it could act as transcription regulator.
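The coordinates in this abstract are internally consistent, which a short sketch can check: the 9-base motif AAAGACATG spans positions -101 to -93, so position -98 is the G that the reporter construct mutates to T. The helper names below are illustrative assumptions, not part of the study.

```python
MOTIF = "AAAGACATG"   # motif reported in the abstract
MOTIF_START = -101    # upstream coordinate of the first motif base


def base_at(position: int) -> str:
    """Base of the motif at an upstream coordinate between -101 and -93."""
    return MOTIF[position - MOTIF_START]


def mutate(position: int, new_base: str) -> str:
    """Return the motif with a single base substituted at the coordinate."""
    i = position - MOTIF_START
    return MOTIF[:i] + new_base + MOTIF[i + 1:]


print(base_at(-98))       # G
print(mutate(-98, "T"))   # AAATACATG
```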
An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets
2010-01-01
Background An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with a great volume of textual, unannotated data irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. Extracting from such analysis a measure of gene expression in the form of tag-counts, and furthermore annotating such reads, is therefore of significant value. Findings We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result produces output containing the homology-based functional annotation and respective gene expression measure signifying how many times sequenced reads were found within the genomic ranges of functional annotations. Conclusions TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset. 
TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Achieving such analysis is executed in an ultrafast and highly efficient manner, whether the analysis be a single-read or paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease. PMID:20598141
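The tag-count measure described above (how many sequenced reads fall within the genomic range of each functional annotation) can be sketched in a few lines. TASE itself is a Java application backed by SQL Server; the Python function, the data layout, and the example coordinates below are all hypothetical and only illustrate the counting idea.

```python
def count_tags(read_positions, annotations):
    """Count reads whose position falls inside each annotated range.

    read_positions: iterable of integer genomic coordinates, one per read.
    annotations: list of (name, start, end) tuples with inclusive bounds.
    Returns a dict mapping annotation name to its tag count.
    """
    counts = {name: 0 for name, _, _ in annotations}
    for pos in read_positions:
        for name, start, end in annotations:
            if start <= pos <= end:
                counts[name] += 1
    return counts


# Made-up example: four reads, two annotated ranges.
reads = [5, 10, 15, 120]
genes = [("geneA", 1, 10), ("geneB", 12, 20)]
print(count_tags(reads, genes))  # {'geneA': 2, 'geneB': 1}
```

The nested loop is quadratic and only suitable for small inputs; a real tool would index the ranges, for example in a database table or an interval tree.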
NASA Technical Reports Server (NTRS)
Wheeler, Ward C.
2003-01-01
A method to align sequence data based on parsimonious synapomorphy schemes generated by direct optimization (DO; earlier termed optimization alignment) is proposed. DO directly diagnoses sequence data on cladograms without an intervening multiple-alignment step, thereby creating topology-specific, dynamic homology statements. Hence, no multiple-alignment is required to generate cladograms. Unlike general and globally optimal multiple-alignment procedures, the method described here, implied alignment (IA), takes these dynamic homologies and traces them back through a single cladogram, linking the unaligned sequence positions in the terminal taxa via DO transformation series. These "lines of correspondence" link ancestor-descendent states and, when displayed as linearly arrayed columns without hypothetical ancestors, are largely indistinguishable from standard multiple alignment. Since this method is based on synapomorphy, the treatment of certain classes of insertion-deletion (indel) events may be different from that of other alignment procedures. As with all alignment methods, results are dependent on parameter assumptions such as indel cost and transversion:transition ratios. Such an IA could be used as a basis for phylogenetic search, but this would be questionable since the homologies derived from the implied alignment depend on its natal cladogram and any variance, between DO and IA + Search, due to heuristic approach. The utility of this procedure in heuristic cladogram searches using DO and the improvement of heuristic cladogram cost calculations are discussed. © 2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.
The effects of auditory and visual cues on timing synchronicity for robotic rehabilitation.
English, Brittney A; Howard, Ayanna M
2017-07-01
In this paper, we explore how the integration of auditory and visual cues can help teach the timing of motor skills for the purpose of motor function rehabilitation. We conducted a study using Amazon's Mechanical Turk in which 106 participants played a virtual therapy game requiring wrist movements. To validate that our results would translate to trends that could also be observed during robotic rehabilitation sessions, we recreated this experiment with 11 participants using a robotic wrist rehabilitation system as means to control the therapy game. During interaction with the therapy game, users were asked to learn and reconstruct a tapping sequence as defined by musical notes flashing on the screen. Participants were divided into 2 test groups: (1) control: participants only received visual cues to prompt them on the timing sequence, and (2) experimental: participants received both visual and auditory cues to prompt them on the timing sequence. To evaluate performance, the timing and length of the sequence were measured. Performance was determined by calculating the number of trials needed before the participant was able to master the specific aspect of the timing task. In the virtual experiment, the group that received visual and auditory cues was able to master all aspects of the timing task faster than the visual cue only group with p-values < 0.05. This trend was also verified for participants using the robotic arm exoskeleton in the physical experiment.
Mishra, Chinmoy; Kumar, Subodh; Sonwane, Arvind Asaram; Yathish, H M; Chaudhary, Rajni
2017-01-02
The exploration of candidate genes for immune response in cattle may be vital for improving our understanding of the species-specific response to pathogens. Toll-like receptor 4 (TLR4) is mostly involved in protection against the deleterious effects of Gram negative pathogens. An approximately 2.6 kb long cDNA sequence of the TLR4 gene covering the entire coding region was characterized in two Indian milk cattle (Vrindavani and Tharparkar). The phylogenetic analysis confirmed that the bovine TLR4 apparently evolved from an ancestral form that predated the appearance of vertebrates, and it is grouped with buffalo, yak, and mithun TLR4s. Sequence analysis revealed a 2526-nucleotide long open reading frame (ORF) encoding 841 amino acids, similar to other cattle breeds. The calculated molecular weight of the translated ORF was 96144 and 96040.9 Da; the isoelectric point was 6.35 and 6.42 in Vrindavani and Tharparkar cattle, respectively. The Simple Modular Architecture Research Tool (SMART) analysis identified 14 leucine rich repeats (LRR) motifs in bovine TLR4 protein. The deduced TLR4 amino acid sequence of Tharparkar had 4 different substitutions as compared to Bos taurus, Sahiwal, and Vrindavani. The signal peptide cleavage site was predicted to lie between the 16th and 17th amino acids of the mature peptide. The transmembrane helix was identified between amino acids 635 and 657 in the mature peptide.
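The ORF length and protein length quoted above agree with each other if the 2526 nucleotides include the stop codon: 2526 / 3 = 842 codons, of which 841 encode amino acids. The small sketch below makes that arithmetic explicit; the function name is an assumption for illustration, and the stop-codon-inclusive convention is an assumption about how the ORF length was reported.

```python
def orf_protein_length(orf_nucleotides: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length.

    Assumes the ORF length is a whole number of codons; if the stop
    codon is counted in the ORF length, it is subtracted.
    """
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nucleotides // 3
    return codons - 1 if includes_stop else codons


print(orf_protein_length(2526))  # 841
```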
Sequences Associated with Centromere Competency in the Human Genome
Hayden, Karen E.; Strome, Erin D.; Merrett, Stephanie L.; Lee, Hye-Ran; Rudd, M. Katharine
2013-01-01
Centromeres, the sites of spindle attachment during mitosis and meiosis, are located in specific positions in the human genome, normally coincident with diverse subsets of alpha satellite DNA. While there is strong evidence supporting the association of some subfamilies of alpha satellite with centromere function, the basis for establishing whether a given alpha satellite sequence is or is not designated a functional centromere is unknown, and attempts to understand the role of particular sequence features in establishing centromere identity have been limited by the near identity and repetitive nature of satellite sequences. Utilizing a broadly applicable experimental approach to test sequence competency for centromere specification, we have carried out a genomic and epigenetic functional analysis of endogenous human centromere sequences available in the current human genome assembly. The data support a model in which functionally competent sequences confer an opportunity for centromere specification, integrating genomic and epigenetic signals and promoting the concept of context-dependent centromere inheritance. PMID:23230266
GAMSOR: Gamma Source Preparation and DIF3D Flux Solution
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, M. A.; Lee, C. H.; Hill, R. N.
2016-12-15
Nuclear reactors that rely upon the fission reaction have two modes of thermal energy deposition in the reactor system: neutron absorption and gamma absorption. The gamma rays are typically generated by neutron absorption reactions or during the fission process which means the primary driver of energy production is of course the neutron interaction. In conventional reactor physics methods, the gamma heating component is ignored such that the gamma absorption is forced to occur at the gamma emission site. For experimental reactor systems like EBR-II and FFTF, the placement of structural pins and assemblies internal to the core leads to problems with power heating predictions because there is no fission power source internal to the assembly to dictate a spatial distribution of the power. As part of the EBR-II support work in the 1980s, the GAMSOR code was developed to assist analysts in calculating the gamma heating. The GAMSOR code is a modified version of DIF3D and actually functions within a sequence of DIF3D calculations. The gamma flux in a conventional fission reactor system does not perturb the neutron flux and thus the gamma flux calculation can be cast as a fixed source problem given a solution to the steady state neutron flux equation. This leads to a sequence of DIF3D calculations, called the GAMSOR sequence, which involves solving the neutron flux, then the gamma flux, then combining the results to do a summary edit. In this manuscript, we go over the GAMSOR code and detail how it is put together and functions. We also discuss how to set up the GAMSOR sequence and input for each DIF3D calculation in the GAMSOR sequence. With the GAMSOR capability, users can take any valid steady state DIF3D calculation and compute the power distribution due to neutron and gamma heating. The MC2-3 code is the preferable companion code to use for generating neutron and gamma cross section data, but the GAMSOR code can accept cross section data from other sources. To further this aspect, an additional utility code was created which demonstrates how to merge the neutron and gamma cross section data together to carry out a simultaneous solve of the two systems.
Image based automatic water meter reader
NASA Astrophysics Data System (ADS)
Jawas, N.; Indrianto
2018-01-01
A water meter is used as a tool to measure water consumption. It works by utilizing water flow and shows the result on a mechanical digit counter. In everyday use, an operator manually checks the digit counter periodically and logs the number shown by the water meter to determine water consumption. This manual operation is time consuming and prone to human error. Therefore, in this paper we propose an automatic water meter digit reader that works from a digital image. The digit sequence is detected by utilizing contour information of the water meter front panel. Then an OCR method is used to obtain each digit character. The digit sequence detection is an important part of the overall process, as it determines the success of the overall system. The results are promising, especially in sequence detection.
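One small step of such a pipeline can be sketched without any image library: once contours have been found and each digit recognized, the detections must be ordered left to right to form the meter reading. The sketch below assumes, hypothetically, that each detection is a pair of the bounding box's left x-coordinate and the recognized character; neither this data layout nor the function name comes from the paper.

```python
def assemble_reading(detections):
    """Order recognized digits left to right and join them into a reading.

    detections: list of (x_left, digit_char) pairs, one per digit contour.
    Returns the reading as a string so that leading zeros are preserved.
    """
    ordered = sorted(detections, key=lambda d: d[0])
    return "".join(digit for _, digit in ordered)


# Hypothetical detections arriving in arbitrary order.
print(assemble_reading([(40, "3"), (10, "0"), (25, "1")]))  # 013
```

Returning a string rather than an integer keeps leading zeros, which matter on a fixed-width mechanical counter.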
Structural optimization with approximate sensitivities
NASA Technical Reports Server (NTRS)
Patnaik, S. N.; Hopkins, D. A.; Coroneos, R.
1994-01-01
Computational efficiency in structural optimization can be enhanced if the intensive computations associated with the calculation of the sensitivities, that is, gradients of the behavior constraints, are reduced. An approximation to the gradients of the behavior constraints that can be generated with a small amount of numerical calculation is proposed. Structural optimization with these approximate sensitivities produced the correct optimum solution. Approximate gradients performed well for different nonlinear programming methods, such as the sequence of unconstrained minimization technique, method of feasible directions, sequence of quadratic programming, and sequence of linear programming. Structural optimization with approximate gradients can reduce by one third the CPU time that would otherwise be required to solve the problem with explicit closed-form gradients. The proposed gradient approximation shows potential to reduce the intensive computation that has been associated with traditional structural optimization.
Nucleotide sequences specific to Yersinia pestis and methods for the detection of Yersinia pestis
McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA; Motin, Vladinir L [League City, TX
2009-02-24
Nucleotide sequences specific to Yersinia pestis that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Microchip method for the enrichment of specific DNA sequences
Mirzabekov, A.D.; Lysov, Y.P.; Shick, V.V.; Dubiley, S.A.
1998-12-22
A method for enriching specific genetic material sequences is provided, whereby oligonucleotide molecules complementary to the desired genetic material are first used to isolate the genetic material from a first source of genomic material. Then the genetic material is used as a label to isolate similar genetic sequences from other sources. 4 figs.
Microchip method for the enrichment of specific DNA sequences
Mirzabekov, Andrei Darievich; Lysov, Yuri Petrovich; Shick, Valentine Vladimirovich; Dubiley, Svetlana Alekseevna
1998-01-01
A method for enriching specific genetic material sequences is provided, whereby oligonucleotide molecules complementary to the desired genetic material are first used to isolate the genetic material from a first source of genomic material. Then the genetic material is used as a label to isolate similar genetic sequences from other sources.
USDA-ARS's Scientific Manuscript database
Comparative sequence analysis of six independent chicken and turkey parvovirus nonstructural (NS) genes revealed specific genomic regions with 100% nucleotide sequence identity. A PCR assay with primers targeting these conserved genome sequences proved to be highly specific and sensitive to detect p...
Nucleotide sequences specific to Brucella and methods for the detection of Brucella
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCready, Paula M; Radnedge, Lyndsay; Andersen, Gary L
Nucleotide sequences specific to Brucella that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
An improved and validated RNA HLA class I SBT approach for obtaining full length coding sequences.
Gerritsen, K E H; Olieslagers, T I; Groeneweg, M; Voorter, C E M; Tilanus, M G J
2014-11-01
The functional relevance of human leukocyte antigen (HLA) class I allele polymorphism beyond exons 2 and 3 is difficult to address because more than 70% of the HLA class I alleles are defined by exons 2 and 3 sequences only. For routine application on clinical samples we improved and validated the HLA sequence-based typing (SBT) approach based on RNA templates, using either a single locus-specific or two overlapping group-specific polymerase chain reaction (PCR) amplifications, with three forward and three reverse sequencing reactions for full length sequencing. Locus-specific HLA typing with RNA SBT of a reference panel, representing the major antigen groups, showed identical results compared to DNA SBT typing. Alleles encountered with unknown exons in the IMGT/HLA database and three samples, two with Null and one with a Low expressed allele, have been addressed by the group-specific RNA SBT approach to obtain full length coding sequences. This RNA SBT approach has proven its value in our routine full length definition of alleles. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Toxins of Prokaryotic Toxin-Antitoxin Systems with Sequence-Specific Endoribonuclease Activity
Masuda, Hisako; Inouye, Masayori
2017-01-01
Protein translation is the most common target of toxin-antitoxin system (TA) toxins. Sequence-specific endoribonucleases digest RNA in a sequence-specific manner, thereby blocking translation. While past studies mainly focused on the digestion of mRNA, recent analysis revealed that toxins can also digest tRNA, rRNA and tmRNA. Purified toxins can digest single-stranded portions of RNA containing recognition sequences in the absence of ribosomes in vitro. However, increasing evidence suggests that in vivo digestion may occur in association with ribosomes. Despite the prevalence of recognition sequences in many mRNAs, preferential digestion seems to occur at specific positions within mRNA and also in certain reading frames. In this review, a variety of tools utilized to study the nuclease activities of toxins over the past 15 years will be reviewed. A recent adaptation of an RNA-seq-based technique to analyze entire sets of cellular RNA will be introduced with an emphasis on its strength in identifying novel targets and redefining recognition sequences. The differences in biochemical properties and postulated physiological roles will also be discussed. PMID:28420090
Wells, Michael L; Moynagh, Michael R; Carter, Rickey E; Childs, Robert A; Leitch, Cameron E; Fletcher, Joel G; Yeh, Benjamin M; Venkatesh, Sudhakar K
2017-01-01
To compare MR hepatic fractional extracellular space (fECS) to liver stiffness (LS) with magnetic resonance elastography (MRE) for evaluation of liver fibrosis. 71 consecutive patients with suspected chronic liver disease underwent standard liver MRI with MR elastography and additional delayed Gd-DTPA-enhanced sequences at 5 and 10 min in order to calculate hepatic fECS (%) and LS (kilopascals, kPa). Two radiologists blinded to clinical history examined MR images and calculated fECS and LS in identical locations for every patient. Interobserver agreement was calculated using the intraclass correlation coefficient. Pearson's correlation was calculated for LS and fECS measures, as was the area under the receiver operating characteristic curve (AUROC), sensitivity and specificity of fECS to predict liver stiffness ≥2.93 and ≥5 kPa. The sensitivity of fECS for detecting fibrosis was separately analyzed in the subgroup of patients without anatomic findings of cirrhosis. Substantial to excellent interobserver agreement for both LS and fECS measurements was seen with intraclass correlation of 0.88 (95% CI 0.81-0.92) for LS, 0.77 (95% CI 0.66-0.85) for fECS₅ and 0.76 (95% CI 0.64-0.84) for fECS₁₀. A significant correlation was found between MRE and fECS₅ (r = 0.47, p < 0.0001) and fECS₁₀ (r = 0.44, p < 0.0001). The performance of fECS improved for detection of advanced fibrosis (≥5 kPa) with AUROC, sensitivity and specificity of 0.72, 38%, and 94% for fECS₅ and 0.72, 67%, and 66% for fECS₁₀. fECS correlates modestly with MRE-determined LS. fECS at MRI is a simple calculation to perform and may represent a practical way to suggest the presence of fibrosis during routine liver evaluation.
Flanking sequence determination and event-specific detection of genetically modified wheat B73-6-1.
Xu, Junyi; Cao, Jijuan; Cao, Dongmei; Zhao, Tongtong; Huang, Xin; Zhang, Piqiao; Luan, Fengxia
2013-05-01
In order to establish a specific identification method for genetically modified (GM) wheat, exogenous insert DNA and flanking sequence between exogenous fragment and recombinant chromosome of GM wheat B73-6-1 were successfully acquired by means of conventional polymerase chain reaction (PCR) and thermal asymmetric interlaced (TAIL)-PCR strategies. Newly acquired exogenous fragment covered the full-length sequence of transformed genes such as transformed plasmid and corresponding functional genes including marker uidA, herbicide-resistant bar, ubiquitin promoter, and high-molecular-weight gluten subunit. The flanking sequence between insert DNA revealed high similarity with Triticum turgidum A gene (GenBank: AY494981.1). A specific PCR detection method for GM wheat B73-6-1 was established on the basis of primers designed according to the flanking sequence. This specific PCR method was validated by GM wheat, GM corn, GM soybean, GM rice, and non-GM wheat. The specifically amplified target band was observed only in GM wheat B73-6-1. This method is of high specificity, high reproducibility, rapid identification, and excellent accuracy for the identification of GM wheat B73-6-1.
Identification and correction of systematic error in high-throughput sequence data
2011-01-01
Background A feature common to all DNA sequencing technologies is the presence of base-call errors in the sequenced reads. The implications of such errors are application specific, ranging from minor informatics nuisances to major problems affecting biological inferences. Recently developed "next-gen" sequencing technologies have greatly reduced the cost of sequencing, but have been shown to be more error prone than previous technologies. Both position specific (depending on the location in the read) and sequence specific (depending on the sequence in the read) errors have been identified in Illumina and Life Technology sequencing platforms. We describe a new type of systematic error that manifests as statistically unlikely accumulations of errors at specific genome (or transcriptome) locations. Results We characterize and describe systematic errors using overlapping paired reads from high-coverage data. We show that such errors occur in approximately 1 in 1000 base pairs, and that they are highly replicable across experiments. We identify motifs that are frequent at systematic error sites, and describe a classifier that distinguishes heterozygous sites from systematic error. Our classifier is designed to accommodate data from experiments in which the allele frequencies at heterozygous sites are not necessarily 0.5 (such as in the case of RNA-Seq), and can be used with single-end datasets. Conclusions Systematic errors can easily be mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Systematic errors are particularly problematic in low coverage experiments, or in estimates of allele-specific expression from RNA-Seq data. Our characterization of systematic error has allowed us to develop a program, called SysCall, for identifying and correcting such errors. We conclude that correction of systematic errors is important to consider in the design and interpretation of high-throughput sequencing experiments. PMID:22099972
Detection of nucleic acids by multiple sequential invasive cleavages
Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann D.
1999-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.
Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann; Kwiatkowski, Robert W.; Vavra, Stephanie H.
2005-03-29
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of nucleic acid from various viruses in a sample.
Detection of nucleic acids by multiple sequential invasive cleavages 02
Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann D.
2002-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.
Detection of nucleic acids by multiple sequential invasive cleavages
Hall, Jeff G; Lyamichev, Victor I; Mast, Andrea L; Brow, Mary Ann D
2012-10-16
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.
Bhat, Abhay Prasad; Shin, Minsang; Choy, Hyon E
2014-07-01
Histone-like nucleoid structuring protein (H-NS) is a small but abundant protein present in enteric bacteria and is involved in compaction of the DNA and regulation of transcription. Recent reports have suggested that H-NS binds to a specific AT-rich DNA sequence rather than to intrinsically curved DNA in a sequence-independent manner. We detected two high-specificity H-NS binding sites in the LEE5 promoter of EPEC centered at -110 and -138, which were close to the proposed consensus H-NS binding motif. To identify the H-NS binding sequence in the LEE5 promoter, we took a random mutagenesis approach and found that the mutations at around -138 were specifically defective in the regulation by H-NS. It was concluded that H-NS exerts maximum repression via the specific sequence at around -138 and subsequently contacts a subunit of RNAP through oligomerization.
Deep Sequencing to Identify the Causes of Viral Encephalitis
Chan, Benjamin K.; Wilson, Theodore; Fischer, Kael F.; Kriesel, John D.
2014-01-01
Deep sequencing allows for a rapid, accurate characterization of microbial DNA and RNA sequences in many types of samples. Deep sequencing (also called next generation sequencing or NGS) is being developed to assist with the diagnosis of a wide variety of infectious diseases. In this study, seven frozen brain samples from deceased subjects with recent encephalitis were investigated. RNA from each sample was extracted, randomly reverse transcribed and sequenced. The sequence analysis was performed in a blinded fashion and confirmed with pathogen-specific PCR. This analysis successfully identified measles virus sequences in two brain samples and herpes simplex virus type-1 sequences in three brain samples. No pathogen was identified in the other two brain specimens. These results were concordant with pathogen-specific PCR and partially concordant with prior neuropathological examinations, demonstrating that deep sequencing can accurately identify viral infections in frozen brain tissue. PMID:24699691
Bai, Yu; Iwasaki, Yuki; Kanaya, Shigehiko; Zhao, Yue; Ikemura, Toshimichi
2014-01-01
With remarkable increase of genomic sequence data of a wide range of species, novel tools are needed for comprehensive analyses of the big sequence data. Self-Organizing Map (SOM) is an effective tool for clustering and visualizing high-dimensional data such as oligonucleotide composition on one map. By modifying the conventional SOM, we have previously developed Batch-Learning SOM (BLSOM), which allows classification of sequence fragments according to species, solely depending on the oligonucleotide composition. In the present study, we introduce the oligonucleotide BLSOM used for characterization of vertebrate genome sequences. We first analyzed pentanucleotide compositions in 100 kb sequences derived from a wide range of vertebrate genomes and then the compositions in the human and mouse genomes in order to investigate an efficient method for detecting differences between the closely related genomes. BLSOM can recognize the species-specific key combination of oligonucleotide frequencies in each genome, which is called a "genome signature," and the specific regions specifically enriched in transcription-factor-binding sequences. Because the classification and visualization power is very high, BLSOM is an efficient powerful tool for extracting a wide range of information from massive amounts of genomic sequences (i.e., big sequence data).
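The input to an oligonucleotide-composition analysis of this kind can be sketched as follows. This is a minimal illustration only, assuming non-overlapping 100 kb windows and skipping any k-mer that contains an ambiguous base; the function names `kmer_frequencies` and `window_compositions` are hypothetical, and the BLSOM clustering step itself is not shown.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=5):
    """Return normalized frequencies of all 4**k k-mers over A/C/G/T.

    k-mers containing any other character (e.g. N) are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    return {km: (counts[km] / total if total else 0.0) for km in all_kmers}


def window_compositions(genome, window=100_000, k=5):
    """Yield (start, frequency dict) for each non-overlapping full window."""
    for start in range(0, len(genome) - window + 1, window):
        yield start, kmer_frequencies(genome[start:start + window], k)


if __name__ == "__main__":
    # Toy sequence and tiny window, purely for illustration.
    toy = "ACGT" * 5
    for start, freqs in window_compositions(toy, window=10, k=2):
        print(start, round(freqs["AC"], 3))
```

Each window yields a vector of 4^k values (1024 for pentanucleotides), which is the kind of high-dimensional composition data a self-organizing map would then cluster.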
Travel-time source-specific station correction improves location accuracy
NASA Astrophysics Data System (ADS)
Giuntini, Alessandra; Materni, Valerio; Chiappini, Stefano; Carluccio, Roberto; Console, Rodolfo; Chiappini, Massimo
2013-04-01
Accurate earthquake locations are crucial for investigating seismogenic processes, as well as for applications like verifying compliance with the Comprehensive Test Ban Treaty (CTBT). Earthquake location accuracy is related to the degree of knowledge about the 3-D structure of seismic wave velocity in the Earth. It is well known that modeling errors of calculated travel times may have the effect of shifting the computed epicenters far from the real locations by a distance even larger than the size of the statistical error ellipses, regardless of the accuracy in picking seismic phase arrivals. The consequences of large mislocations of seismic events in the context of CTBT verification are particularly critical when triggering a possible On Site Inspection (OSI). In fact, the Treaty establishes that an OSI area cannot be larger than 1000 km², and its largest linear dimension cannot be larger than 50 km. Moreover, depth accuracy is crucial for the application of the depth event screening criterion. In the present study, we develop a method of source-specific travel-time corrections based on a set of well-located events recorded by dense national seismic networks in seismically active regions. The applications concern seismic sequences recorded in Japan, Iran and Italy. We show that mislocations of the order of 10-20 km affecting the epicenters, as well as larger mislocations in hypocentral depths, calculated from a global seismic network and using the standard IASPEI91 travel times can be effectively removed by applying source-specific station corrections.
Picard, François J.; Ke, Danbing; Boudreau, Dominique K.; Boissinot, Maurice; Huletsky, Ann; Richard, Dave; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.
2004-01-01
A 761-bp portion of the tuf gene (encoding the elongation factor Tu) from 28 clinically relevant streptococcal species was obtained by sequencing amplicons generated using broad-range PCR primers. These tuf sequences were used to select Streptococcus-specific PCR primers and to perform phylogenetic analysis. The specificity of the PCR assay was verified using 102 different bacterial species, including the 28 streptococcal species. Genomic DNA purified from all streptococcal species was efficiently detected, whereas there was no amplification with DNA from 72 of the 74 nonstreptococcal bacterial species tested. There was cross-amplification with DNAs from Enterococcus durans and Lactococcus lactis. However, the 15 to 31% nucleotide sequence divergence in the 761-bp tuf portion of these two species compared to any streptococcal tuf sequence provides ample sequence divergence to allow the development of internal probes specific to streptococci. The Streptococcus-specific assay was highly sensitive for all 28 streptococcal species tested (i.e., detection limit of 1 to 10 genome copies per PCR). The tuf sequence data was also used to perform extensive phylogenetic analysis, which was generally in agreement with phylogeny determined on the basis of 16S rRNA gene data. However, the tuf gene provided a better discrimination at the streptococcal species level that should be particularly useful for the identification of very closely related species. In conclusion, tuf appears more suitable than the 16S ribosomal RNA gene for the development of diagnostic assays for the detection and identification of streptococcal species because of its higher level of species-specific genetic divergence. PMID:15297518
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Suhkmann; Zhang, Ziming; Upchurch, Sean
2004-04-16
ARID is a homologous family of DNA-binding domains that occur in DNA binding proteins from a wide variety of species, ranging from yeast to nematodes, insects, mammals and plants. SWI1, a member of the SWI/SNF protein complex that is involved in chromatin remodeling during transcription, contains the ARID motif. The ARID domain of human SWI1 (also known as p270) does not select for a specific DNA sequence from a random sequence pool. The lack of sequence specificity shown by the SWI1 ARID domain stands in contrast to the other characterized ARID domains, which recognize specific AT-rich sequences. We have solved the three-dimensional structure of human SWI1 ARID using solution NMR methods. In addition, we have characterized non-specific DNA-binding by the SWI1 ARID domain. Results from this study indicate that a flexible long internal loop in ARID motif is likely to be important for sequence specific DNA-recognition. The structure of human SWI1 ARID domain also represents a distinct structural subfamily. Studies of ARID indicate that boundary of the DNA binding structural and functional domains can extend beyond the sequence homologous region in a homologous family of proteins. Structural studies of homologous domains such as ARID family of DNA-binding domains should provide information to better predict the boundary of structural and functional domains in structural genomic studies. Key Words: ARID, SWI1, NMR, structural genomics, protein-DNA interaction.
Long-range correlations and charge transport properties of DNA sequences
NASA Astrophysics Data System (ADS)
Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui
2010-04-01
By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5
Nadzirah, Sh; Azizah, N; Hashim, Uda; Gopinath, Subash C B; Kashif, Mohd
2015-01-01
Nanoparticle-mediated bio-sensing has promoted the development of novel sensors at the forefront of medical diagnosis. In the present study, we have generated and examined the potential of titanium dioxide (TiO2) crystalline nanoparticles with an aluminium interdigitated electrode biosensor to specifically detect single-stranded E. coli O157:H7 DNA. The performance of this novel DNA biosensor was measured as the electrical current response using a picoammeter. The sensor surface was chemically functionalized with (3-aminopropyl) triethoxysilane (APTES) to provide contact between the organic and inorganic surfaces of a single-stranded DNA probe and TiO2 nanoparticles while maintaining the sensing system's physical characteristics. The complement of the target DNA of E. coli O157:H7 to the carboxylate-probe DNA could be translated into electrical signals and confirmed by the increased conductivity in the current-to-voltage curves. The specificity experiments indicate that the biosensor can discriminate between the complementary sequences and the base-mismatched and non-complementary sequences. After duplex formation, the complementary target sequence can be quantified over a wide range with a detection limit of 1.0 × 10⁻¹³ M. With target DNA from the lysed E. coli O157:H7, we could attain similar sensitivity. The stability of the DNA-immobilized surface, assessed by the relative standard deviation (4.6%), showed retention of 99% of the original response current for up to 6 months. This high-performance interdigitated DNA biosensor with high sensitivity, stability and non-fouling on a novel sensing platform is suitable for a wide range of biomolecular interactive analyses.
Chiu, Chi-Chien; John, Joseph Abraham Christopher; Hseu, Tzong-Hsiung; Chang, Chi-Yao
2002-03-01
The pituitary-specific transcription factor Pit-1 belongs to the family of POU-domain proteins and is known to play an important role in the differentiation of pituitary cells. Here we report the complete nucleotide sequence of cDNA encoding Pit-1 from the brackish water fish, ayu (Plecoglossus altivelis). Nucleotide sequence analysis of 1910 bp of ayu Pit-1 cDNA revealed an open reading frame of 1074 bp that encodes a protein of 358 amino acids containing a POU-specific domain, POU homeodomain, and an STA (Ser/Thr-rich activation) transactivation domain. We inserted the coding region of Pit-1 cDNA, obtained by PCR, into a pET-20b(+) plasmid to produce recombinant Pit-1 in Escherichia coli BL21 (DE3) pLysS cells. Upon induction with isopropyl beta-D-thiogalactopyranoside, Pit-1 was expressed and accumulated as inclusion bodies in E. coli. The protein was then purified in one step by affinity chromatography on a nickel-nitrilotriacetic acid agarose column under denaturing conditions. This method yielded 0.7 mg of highly pure and stable protein per 200 ml of bacterial culture. A band of 40 kDa, resolved as recombinant ayu Pit-1 by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, agrees well with the molecular mass calculated from the translated cDNA sequence. The purified recombinant Pit-1 was confirmed in vitro through Western blot analysis, using its monoclonal antibody. This monoclonal antibody detected Pit-1 in the nuclei of ayu developing pituitary by immunohistochemical reaction. It serves as a good reagent for the detection of ayu Pit-1 in situ. Copyright 2002 Elsevier Science (USA).
Huang, Yi-Wen; Roa, Juan C.; Goodfellow, Paul J.; Kizer, E. Lynette; Huang, Tim H. M.; Chen, Yidong
2013-01-01
Background DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, it can now efficiently determine genome-wide methylation profiles in cancer cells. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Methodology/Principal Findings Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimen) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. Conclusions/Significance CMS provides visualization and analytic functions for cancer methylome datasets. 
A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/. PMID:23630576
Abbreviated Combined MR Protocol: A New Faster Strategy for Characterizing Breast Lesions.
Moschetta, Marco; Telegrafo, Michele; Rella, Leonarda; Stabile Ianora, Amato Antonio; Angelelli, Giuseppe
2016-06-01
The use of an abbreviated magnetic resonance (MR) protocol has been recently proposed for cancer screening. The aim of our study is to evaluate the diagnostic accuracy of an abbreviated MR protocol combining short TI inversion recovery (STIR), turbo-spin-echo (TSE)-T2 sequences, a pre-contrast T1, and a single intermediate (3 minutes after contrast injection) post-contrast T1 sequence for characterizing breast lesions. A total of 470 patients underwent breast MR examination for screening, problem solving, or preoperative staging. Two experienced radiologists evaluated both standard and abbreviated protocols in consensus. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic accuracy for both protocols were calculated (with the histological findings and 6-month ultrasound follow-up as the reference standard) and compared with the McNemar test. The post-processing and interpretation times for the MR images were compared with the paired t test. In 177 of 470 (38%) patients, the MR sequences detected 185 breast lesions. Standard and abbreviated protocols obtained sensitivity, specificity, diagnostic accuracy, PPV, and NPV values respectively of 92%, 92%, 92%, 68%, and 98% and of 89%, 91%, 91%, 64%, and 98% with no statistically significant difference (P < .0001). The mean post-processing and interpretation time were, respectively, 7 ± 1 minutes and 6 ± 3.2 minutes for the standard protocol and 1 ± 1.2 minutes and 2 ± 1.2 minutes for the abbreviated protocol, with a statistically significant difference (P < .01). An abbreviated combined MR protocol represents a time-saving tool for radiologists and patients with the same diagnostic potential as the standard protocol in patients undergoing breast MRI for screening, problem solving, or preoperative staging. Copyright © 2016 Elsevier Inc. All rights reserved.
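The accuracy measures named in this abstract all follow from a 2×2 table of test results against the reference standard. The sketch below shows how they are computed; the function name `diagnostic_metrics` and the counts in the usage example are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic accuracy measures from a 2x2 table.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against the reference standard.
    Assumes every denominator is non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly detected
        "specificity": tn / (tn + fp),   # negatives correctly excluded
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the study's data).
    for name, value in diagnostic_metrics(tp=45, fp=5, fn=5, tn=45).items():
        print(f"{name}: {value:.2f}")
```

PPV and NPV depend on how common the disease is in the sample, so they do not transfer between populations the way sensitivity and specificity do.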
Nadzirah, Sh.; Azizah, N.; Hashim, Uda; Gopinath, Subash C. B.; Kashif, Mohd
2015-01-01
Nanoparticle-mediated bio-sensing has promoted the development of novel sensors at the forefront of medical diagnosis. In the present study, we have generated and examined the potential of titanium dioxide (TiO2) crystalline nanoparticles with an aluminium interdigitated electrode biosensor to specifically detect single-stranded E. coli O157:H7 DNA. The performance of this novel DNA biosensor was measured as the electrical current response using a picoammeter. The sensor surface was chemically functionalized with (3-aminopropyl) triethoxysilane (APTES) to provide contact between the organic and inorganic surfaces of a single-stranded DNA probe and TiO2 nanoparticles while maintaining the sensing system’s physical characteristics. The complement of the target DNA of E. coli O157:H7 to the carboxylate-probe DNA could be translated into electrical signals and confirmed by the increased conductivity in the current-to-voltage curves. The specificity experiments indicate that the biosensor can discriminate between the complementary sequences and the base-mismatched and non-complementary sequences. After duplex formation, the complementary target sequence can be quantified over a wide range with a detection limit of 1.0 × 10⁻¹³ M. With target DNA from the lysed E. coli O157:H7, we could attain similar sensitivity. The stability of the DNA-immobilized surface, assessed by the relative standard deviation (4.6%), showed retention of 99% of the original response current for up to 6 months. This high-performance interdigitated DNA biosensor with high sensitivity, stability and non-fouling on a novel sensing platform is suitable for a wide range of biomolecular interactive analyses. PMID:26445455
Constructing and detecting a cDNA library for mites.
Hu, Li; Zhao, YaE; Cheng, Juan; Yang, YuanJun; Li, Chen; Lu, ZhaoHui
2015-10-01
RNA extraction and construction of complementary DNA (cDNA) library for mites have been quite challenging due to difficulties in acquiring tiny living mites and breaking their hard chitin. The present study is to explore a better method to construct cDNA library for mites that will lay the foundation on transcriptome and molecular pathogenesis research. We selected Psoroptes cuniculi as an experimental subject and took the following steps to construct and verify cDNA library. First, we combined liquid nitrogen grinding with TRIzol for total RNA extraction. Then, switching mechanism at 5' end of the RNA transcript (SMART) technique was used to construct full-length cDNA library. To evaluate the quality of cDNA library, the library titer and recombination rate were calculated. The reliability of cDNA library was detected by sequencing and analyzing positive clones and genes amplified by specific primers. The results showed that the RNA concentration was 836 ng/μl and the absorbance ratio at 260/280 nm was 1.82. The library titer was 5.31 × 10⁵ plaque-forming units (PFU)/ml and the recombination rate was 98.21%, indicating that the library was of good quality. In the 33 expressed sequence tags (ESTs) of P. cuniculi, two clones of 1656 and 1658 bp were almost identical with only three variable sites detected, which had an identity of 99.63% with that of Psoroptes ovis, indicating that the cDNA library was reliable. Further detection by specific primers demonstrated that the 553-bp Pso c II gene sequences of P. cuniculi had an identity of 98.56% with those of P. ovis, confirming that the cDNA library was not only reliable but also feasible.
Gu, Fei; Doderer, Mark S; Huang, Yi-Wen; Roa, Juan C; Goodfellow, Paul J; Kizer, E Lynette; Huang, Tim H M; Chen, Yidong
2013-01-01
DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, it can now efficiently determine genome-wide methylation profiles in cancer cells. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimen) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. CMS provides visualization and analytic functions for cancer methylome datasets. A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. 
CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/.
A Novel Helicase-Type Protein in the Nucleolus: Protein NOH61
Zirwes, Rudolf F.; Eilbracht, Jens; Kneissel, Sandra; Schmidt-Zachmann, Marion S.
2000-01-01
We report the identification, cDNA cloning, and molecular characterization of a novel, constitutive nucleolar protein. The cDNA-deduced amino acid sequence of the human protein defines a polypeptide of a calculated mass of 61.5 kDa and an isoelectric point of 9.9. Inspection of the primary sequence disclosed that the protein is a member of the family of “DEAD-box” proteins, representing a subgroup of putative ATP-dependent RNA helicases. ATPase activity of the recombinant protein is evident and stimulated by a variety of polynucleotides tested. Immunolocalization studies revealed that protein NOH61 (nucleolar helicase of 61 kDa) is highly conserved during evolution and shows a strong accumulation in nucleoli. Biochemical experiments have shown that protein NOH61 synthesized in vitro sediments with ∼11.5 S, i.e., apparently as homo-oligomeric structures. By contrast, sucrose gradient centrifugation analysis of cellular extracts obtained with buffers of elevated ionic strength (600 mM NaCl) revealed that the solubilized native protein sediments with ∼4 S, suggestive of the monomeric form. Interestingly, protein NOH61 has also been identified as a specific constituent of free nucleoplasmic 65S preribosomal particles but is absent from cytoplasmic ribosomes. Treatment of cultured cells with 1) the transcription inhibitor actinomycin D and 2) RNase A results in a complete dissociation of NOH61 from nucleolar structures. The specific intracellular localization and its striking sequence homology to other known RNA helicases lead to the hypothesis that protein NOH61 might be involved in ribosome synthesis, most likely during the assembly process of the large (60S) ribosomal subunit. PMID:10749921
Domain-specific learning of grammatical structure in musical and phonological sequences.
Bly, Benjamin Martin; Carrión, Ricardo E; Rasch, Björn
2009-01-01
Artificial grammar learning depends on acquisition of abstract structural representations rather than domain-specific representational constraints, or so many studies tell us. Using an artificial grammar task, we compared learning performance in two stimulus domains in which respondents have differing tacit prior knowledge. We found that despite grammatically identical sequence structures, learning was better for harmonically related chord sequences than for letter name sequences or harmonically unrelated chord sequences. We also found transfer effects within the musical and letter name tasks, but not across the domains. We conclude that knowledge acquired in implicit learning depends not only on abstract features of structured stimuli, but that the learning of regularities is in some respects domain-specific and strongly linked to particular features of the stimulus domain.
Gupta, Parth Sarthi Sen; Banerjee, Shyamashree; Islam, Rifat Nawaz Ul; Mondal, Sudipta; Mondal, Buddhadev; Bandyopadhyay, Amal K
2014-01-01
In the genomic and proteomic era, efficient and automated analysis of the sequence properties of proteins has become an important task in bioinformatics. General Public License (GPL) software tools exist that perform part of the job. However, these tools cannot compute the mean properties of a large number of orthologous sequences. Further, no GPL software or server can calculate window-dependent sequence properties for a large number of sequences in a single run. To overcome these limitations, we have developed a standalone procedure, PHYSICO, which performs the various stages of computation in a single run, based on the type of input provided (RAW-FASTA or BLOCK-FASTA format), and produces Excel output for: a) composition, class composition, mean molecular weight, isoelectric point, aliphatic index and GRAVY; b) column-based compositions, variability and difference matrix; c) 25 kinds of window-dependent sequence properties. The program is fast, efficient, error free and user friendly. Calculation of the mean and standard deviation of homologous sequence sets, for comparison purposes when relevant, is another attribute of the program, a property seldom seen in existing GPL software. Availability: PHYSICO is freely available to non-commercial/academic users on formal request to the corresponding author (akbanerjee@biotech.buruniv.ac.in). PMID:24616564
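As an illustration of the kind of quantities the abstract lists, the following minimal Python sketch computes GRAVY (the mean Kyte-Doolittle hydropathy of a sequence), a window-dependent version of it, and the mean and standard deviation across a set of sequences. It is not PHYSICO's code: the function names `gravy`, `window_gravy` and `summarize` are hypothetical, and the window handling is an assumption made for the example.

```python
from statistics import mean, stdev

# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KD[res] for res in seq) / len(seq)


def window_gravy(seq, width):
    """Hydropathy averaged over every full window of `width` residues."""
    return [gravy(seq[i:i + width]) for i in range(len(seq) - width + 1)]


def summarize(seqs):
    """Mean and sample standard deviation of GRAVY over a sequence set."""
    values = [gravy(s) for s in seqs]
    return mean(values), stdev(values)


if __name__ == "__main__":
    homologs = ["MKTAYIAKQR", "MKSAYLAKQR", "MRTAYIGKQR"]  # made-up sequences
    print(summarize(homologs))
    print(window_gravy(homologs[0], 5))
```

Residues outside the 20 standard amino acids raise a `KeyError` here; a production tool would need an explicit policy for ambiguous codes and gaps.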
Tan, Kong Ooi; Agarwal, Vipin; Meier, Beat H; Ernst, Matthias
2016-09-07
We present a generalized theoretical framework that allows the approximate but rapid analysis of residual couplings of arbitrary decoupling sequences in solid-state NMR under magic-angle spinning conditions. It is a generalization of the tri-modal Floquet analysis of TPPM decoupling [Scholz et al., J. Chem. Phys. 130, 114510 (2009)] where three characteristic frequencies are used to describe the pulse sequence. Such an approach can be used to describe arbitrary periodic decoupling sequences that differ only in the magnitude of the Fourier coefficients of the interaction-frame transformation. It allows a ∼100 times faster calculation of second-order residual couplings as a function of pulse sequence parameters than full spin-dynamics simulations. By comparing the theoretical calculations with full numerical simulations, we show the potential of the new approach to examine the performance of decoupling sequences. We exemplify the usefulness of this framework by analyzing the performance of commonly used high-power decoupling sequences and low-power decoupling sequences such as amplitude-modulated XiX (AM-XiX) and its super-cycled variant SC-AM-XiX. In addition, the effect of chemical-shift offset is examined for both high- and low-power decoupling sequences. The results show that the cross-terms between the dipolar couplings are the main contributions to the line broadening when offset is present. We also show that SC-AM-XiX offers better offset compensation.
Identification of tissue-specific targeting peptide
NASA Astrophysics Data System (ADS)
Jung, Eunkyoung; Lee, Nam Kyung; Kang, Sang-Kee; Choi, Seung-Hoon; Kim, Daejin; Park, Kisoo; Choi, Kihang; Choi, Yun-Jaie; Jung, Dong Hyun
2012-11-01
Using the phage display technique, we identified tissue-targeting peptide sets that recognize specific tissues (bone-marrow dendritic cells, kidney, liver, lung, spleen and visceral adipose tissue). In order to rapidly evaluate tissue-specific targeting peptides, we performed machine learning studies for predicting the tissue-specific targeting activity of peptides on the basis of peptide sequence information using four machine learning models and isolated the groups of peptides capable of mediating selective targeting to specific tissues. As a representative liver-specific targeting sequence, the peptide "DKNLQLH" was selected by the sequence similarity analysis. This peptide has a high degree of homology with protein ligands which can interact with corresponding membrane counterparts. We anticipate that our models will be applicable to the prediction of tissue-specific targeting peptides which can recognize the endothelial markers of target tissues.
Sequence specificity of the human mRNA N6-adenosine methylase in vitro.
Harper, J E; Miceli, S M; Roberts, R J; Manley, J L
1990-01-01
N6-adenosine methylation is a frequent modification of mRNAs and their precursors, but little is known about the mechanism of the reaction or the function of the modification. To explore these questions, we developed conditions to examine N6-adenosine methylase activity in HeLa cell nuclear extracts. Transfer of the methyl group from S-[3H-methyl]-adenosylmethionine to unlabeled random copolymer RNA substrates of varying ribonucleotide composition revealed a substrate specificity consistent with a previously deduced consensus sequence, Pu[G>A]AC[A/C/U]. 32P-labeled RNA substrates of defined sequence were used to examine the minimum sequence requirements for methylation. Each RNA was 20 nucleotides long, and contained either the core consensus sequence GGACU, or some variation of this sequence. RNAs containing GGACU, either in single or multiple copies, were good substrates for methylation, whereas RNAs containing single base substitutions within the GGACU sequence gave dramatically reduced methylation. These results demonstrate that the N6-adenosine methylase has a strict sequence specificity, and that there is no requirement for extended sequences or secondary structures for methylation. Recognition of this sequence does not require an RNA component, as micrococcal nuclease pretreatment of nuclear extracts actually increased methylation efficiency. PMID:2216767
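The consensus Pu[G>A]AC[A/C/U] can be read as a five-nucleotide pattern: a purine, then G or A (G preferred), then A, C, and finally A, C or U. A minimal Python sketch of scanning an RNA for such sites is given below. The function name `find_consensus_sites` is hypothetical, and a plain pattern match cannot express the G-over-A preference or any quantitative methylation efficiency; it only flags candidate positions.

```python
import re

# Purine, [G or A], A, C, [A, C or U]; the lookahead lets overlapping
# candidate sites be reported as well.
CONSENSUS = re.compile(r"(?=([AG][AG]AC[ACU]))")


def find_consensus_sites(rna):
    """Return (start_index, motif) for every match of the consensus.

    DNA-style input is accepted by converting T to U first. The candidate
    methylated adenosine sits at start_index + 2 (0-based).
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(1), m.group(1)) for m in CONSENSUS.finditer(rna)]


if __name__ == "__main__":
    print(find_consensus_sites("AAGGACUAA"))  # contains the core GGACU
    print(find_consensus_sites("AAGGAGUAA"))  # single substitution in the core
```

The second call illustrates the abstract's observation in pattern terms: a single base substitution inside GGACU removes the match.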
Methods for chromosome-specific staining
Gray, Joe W.; Pinkel, Daniel
1995-01-01
Methods and compositions for chromosome-specific staining are provided. Compositions comprise heterogenous mixtures of labeled nucleic acid fragments having substantially complementary base sequences to unique sequence regions of the chromosomal DNA for which their associated staining reagent is specific. Methods include methods for making the chromosome-specific staining compositions of the invention, and methods for applying the staining compositions to chromosomes.
Focal point determination in magnetic resonance-guided focused ultrasound using tracking coils.
Svedin, Bryant T; Beck, Michael J; Hadley, J Rock; Merrill, Robb; de Bever, Joshua T; Bolster, Bradley D; Payne, Allison; Parker, Dennis L
2017-06-01
To develop a method for rapid prediction of the geometric focus location in MR coordinates of a focused ultrasound (US) transducer with arbitrary position and orientation without sonicating. Three small tracker coil circuits were designed, constructed, attached to the transducer housing of a breast-specific MR-guided focused US (MRgFUS) system with 5 degrees of freedom, and connected to receiver channel inputs of an MRI scanner. A one-dimensional sequence applied in three orthogonal directions determined the position of each tracker, which was then corrected for gradient nonlinearity. In a calibration step, low-level heating located the US focus in one transducer position and orientation where the tracker positions were also known. Subsequent US focus locations were determined from the isometric transformation of the trackers. The accuracy of this method was verified by comparing the tracking coil predictions to the thermal center of mass calculated using MR thermometry data acquired at 16 different transducer positions for MRgFUS sonications in a homogeneous gelatin phantom. The tracker coil predicted focus was an average distance of 2.1 ± 1.1 mm from the thermal center of mass. The one-dimensional locator sequence and prediction calculations took less than 1 s to perform. This technique accurately predicts the geometric focus for a transducer with arbitrary position and orientation without sonicating. Magn Reson Med 77:2424-2430, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Amicarelli, Giulia; Adlerstein, Daniel; Shehi, Erlet; Wang, Fengfei; Makrigiorgos, G Mike
2006-10-01
Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled "amplifier", and an "anchor". The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. The system detected and genotyped KRAS sequence variants down to approximately 0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.
NASA Astrophysics Data System (ADS)
Kolb, Ulrich; Baraffe, Isabelle
Using improved, up-to-date stellar input physics tested against observations of low-mass stars and brown dwarfs we calculate the secular evolution of low-donor-mass CVs, including those which form with a brown dwarf donor star. Our models confirm the mismatch between the calculated minimum period (P_min ≈ 70 min) and the observed short-period cut-off (≈ 80 min) in the CV period histogram. Theoretical period distributions synthesized from our model sequences always show an accumulation of systems at the minimum period, a feature absent in the observed distribution. We suggest that non-magnetic CVs become unobservable as they are effectively trapped in permanent quiescence before they reach P_min, and that small-number statistics may hide the period spike for magnetic CVs. We calculate the minimum period for high mass transfer rate sequences and discuss the relevance of these for explaining the location of CV secondaries in the orbital-period-spectral-type diagram. We also show that a recently suggested revised mass-radius relation for low-mass main-sequence stars cannot explain the CV period gap.
Dynamic programming algorithms for biological sequence comparison.
Pearson, W R; Miller, W
1992-01-01
Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N^2)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N^2) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q+rk) are readily available. Versions of these programs that run either on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.
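The time and memory behaviour described above can be seen in a minimal Python sketch of a global similarity score with a length-proportional gap penalty g = rk. This is a generic two-row dynamic programming formulation, not the authors' programs; the scoring values (`match`, `mismatch`, `r`) and the function name `similarity_score` are assumptions chosen for illustration. Only the score is computed; recovering the alignment itself in linear space needs a divide-and-conquer step that is not shown.

```python
def similarity_score(a, b, match=1, mismatch=-1, r=2):
    """Optimal global similarity score of sequences a and b.

    A gap of length k costs r * k. Time is proportional to len(a) * len(b);
    memory is two rows of length len(b) + 1.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [-r * j for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [-r * i]  # column 0: prefix of a against the empty prefix of b
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] - r        # gap in b
            left = curr[j - 1] - r  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(similarity_score("ACGT", "ACGT"))
    print(similarity_score("ACGT", "AGT"))
```

Keeping only the previous and current rows is what reduces memory from the full N-by-N table to a linear amount, while every cell is still visited once.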
Sequence periodicity in nucleosomal DNA and intrinsic curvature.
Nair, T Murlidharan
2010-05-17
Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy, the more rigid the DNA helix; thus it is natural to expect that sequences involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146-base nucleosome core DNA sequences from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA.
USDA-ARS's Scientific Manuscript database
The concept of utilizing putative and unique gene sequences for the design of species specific probes was tested. The abundance profile of assigned functions within the Lactobacillus plantarum genome was used for the identification of the putative and unique gene sequence, csh. The targeted gene (cs...
The Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to address needs for rapid, cost effective methods of species extrapolation of chemical susceptibility. Specifically, the SeqAPASS tool compares the primary sequence (Level 1), functiona...
Chavada, Ruchir; Maley, Michael
2015-01-01
Introduction: Community and healthcare associated infections caused by multi-drug resistant Gram-negative organisms (MDR GN) represent a worldwide threat. Nucleic acid detection tests are becoming more common for their detection; however, they can be expensive, requiring specialised equipment and local expertise. This study was done to evaluate the utility of a commercial multiplex tandem (MT) PCR for detection of MDR GN. Methods: The study was done on stored laboratory MDR GN isolates from sterile and non-sterile specimens (n=126, out of 567 stored organisms). Laboratory validation of the MT PCR was done to evaluate sensitivity, specificity and agreement with the current phenotypic methods used in the laboratory. Amplicon sequencing was also done on selected isolates for assessing performance characteristics. Workflow and cost implications of the MT PCR were evaluated. Results: The sensitivity and specificity of the MT PCR were calculated to be 95% and 96.7% respectively. Agreement with the phenotypic methods was 80%. Major lack of agreement was seen in detection of AmpC beta-lactamase in Enterobacteriaceae and carbapenemase in non-fermenters. Agreement of the MT PCR with another multiplex PCR was found to be 87%. Amplicon sequencing confirmed the genotype detected by MT PCR in 94.2% of cases tested. Time to result was faster for the MT PCR but cost per test was higher. Conclusion: This study shows that with carefully chosen targets for detection of resistance genes in MDR GN, rapid and efficient identification is possible. MT PCR was sensitive and specific and likely more accurate than phenotypic methods. PMID:26464612
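For readers unfamiliar with the reported metrics, the short Python sketch below shows how sensitivity, specificity and simple percent agreement are conventionally computed from paired test results. The counts and labels are hypothetical, chosen only so the arithmetic is easy to follow; the abstract does not give the underlying contingency table, and the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of reference-positive isolates the test calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of reference-negative isolates the test calls negative."""
    return true_neg / (true_neg + false_pos)


def percent_agreement(calls_a, calls_b):
    """Share of isolates on which two methods give the same call."""
    same = sum(1 for a, b in zip(calls_a, calls_b) if a == b)
    return same / len(calls_a)


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study.
    print(round(sensitivity(57, 3), 3))
    print(round(specificity(58, 2), 3))
    pcr = ["ESBL", "AmpC", "none", "ESBL", "none"]        # hypothetical calls
    phenotype = ["ESBL", "none", "none", "ESBL", "none"]  # hypothetical calls
    print(percent_agreement(pcr, phenotype))
```

Percent agreement does not say which method is right when the two disagree, which is why the study also used amplicon sequencing as an arbiter.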
Morabito, Marco; Crisci, Alfonso; Grifoni, Daniele; Orlandini, Simone; Cecchi, Lorenzo; Bacci, Laura; Modesti, Pietro Amedeo; Gensini, Gian Franco; Maracchi, Giampiero
2006-09-01
The aim of this study was to evaluate the relationship between the risk of hospital admission for myocardial infarction (MI) and the daily weather conditions during the winters of 1998-2003, according to an air-mass-based synoptic climatological approach. The effects of time lag and 2-day sequences with specific air mass types were also investigated. Studies concerning the relationship between atmospheric conditions and human health need to take into consideration the simultaneous effects of many weather variables. At present, few studies have surveyed these effects on hospitalizations for MI. Analyses were concentrated on winter, when the maximum peak of hospitalization occurred. An objective daily air mass classification by means of statistical analyses based on ground meteorological data was carried out. A comparison between air mass classification and hospital admissions was made by the calculation of an MI admission index, and to detect significant relationships the Mann-Whitney U test, the analysis of variance, and the Bonferroni test were used. Significant increases in hospital admissions for MI were evident 24 h after a day characterized by an anticyclonic continental air mass and 6 days after a day characterized by a cyclonic air mass. Increased risk of hospitalization was found even when specific 2-day air mass sequences occurred. These results represent an important step in identifying reliable linkages between weather and health.
Computational analysis of sequence selection mechanisms.
Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron
2004-04-01
Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is that of protein foldability; the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input, and outputs survival probability based on sequence fitness to structure. We compute the number of sequences that match a particular protein structure with energy lower than the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal. For shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.
Towards a Logical Distinction Between Swarms and Aftershock Sequences
NASA Astrophysics Data System (ADS)
Gardine, M.; Burris, L.; McNutt, S.
2007-12-01
The distinction between swarms and aftershock sequences has, up to this point, been fairly arbitrary and non-uniform. Typically 0.5 to 1 order of magnitude difference between the mainshock and largest aftershock has been a traditional choice, but there are many exceptions. Seismologists have generally assumed that the mainshock carries most of the energy, but this is only true if it is sufficiently large compared to the size and numbers of aftershocks. Here we present a systematic division based on energy of the aftershock sequence compared to the energy of the largest event of the sequence. It is possible to calculate the amount of aftershock energy assumed to be in the sequence using the b-value of the frequency-magnitude relation with a fixed choice of magnitude separation (M-mainshock minus M-largest aftershock). Assuming that the energy of an aftershock sequence is less than the energy of the mainshock, the b-value at which the aftershock energy exceeds that of the mainshock energy determines the boundary between aftershock sequences and swarms. The amount of energy for various choices of b-value is also calculated using different values of magnitude separation. When the minimum b-value at which the sequence energy exceeds that of the largest event/mainshock is plotted against the magnitude separation, a linear trend emerges. Values plotting above this line represent swarms and values plotting below it represent aftershock sequences. This scheme has the advantage that it represents a physical quantity - energy - rather than only statistical features of earthquake distributions. As such it may be useful to help distinguish swarms from mainshock/aftershock sequences and to better determine the underlying causes of earthquake swarms.
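The energy comparison described above can be sketched numerically. The Python example below is one possible reading, not the authors' calculation, and rests on several assumptions that the abstract does not state: event counts per magnitude bin follow 10^(-bM) and are normalized to one event in the bin of the largest aftershock; energy follows the Gutenberg-Richter relation log10 E = 1.5 M + 4.8 (E in joules); and aftershocks span `m_range` magnitude units below the largest aftershock in bins of width `dm`. All function and parameter names are hypothetical.

```python
def energy_joules(magnitude):
    """Radiated energy from the Gutenberg-Richter energy relation (assumed)."""
    return 10 ** (1.5 * magnitude + 4.8)


def aftershock_energy_ratio(b, delta, m_main=6.0, dm=0.1, m_range=4.0):
    """Summed aftershock energy divided by mainshock energy.

    delta is the magnitude separation (mainshock minus largest aftershock).
    """
    m_top = m_main - delta
    n_bins = int(round(m_range / dm)) + 1
    total = 0.0
    for k in range(n_bins):
        magnitude = m_top - k * dm
        count = 10 ** (b * k * dm)  # one event assumed in the top bin
        total += count * energy_joules(magnitude)
    return total / energy_joules(m_main)


def minimum_b_for_swarm(delta, step=0.01, b_max=3.0, **kwargs):
    """Smallest scanned b at which aftershock energy reaches mainshock energy."""
    for i in range(int(round(b_max / step)) + 1):
        b = i * step
        if aftershock_energy_ratio(b, delta, **kwargs) >= 1.0:
            return b
    return None  # threshold not reached within the scanned range


if __name__ == "__main__":
    for delta in (0.5, 0.75, 1.0):
        print(delta, minimum_b_for_swarm(delta))
```

Under these assumptions the threshold b-value rises as the magnitude separation grows, which matches the direction of the trend the abstract reports; the exact values depend strongly on the assumed normalization and magnitude range.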
Williams, Tony D.; Ames, Caroline E.; Kiparissis, Yiannis; Wynne-Edwards, Katherine E.
2005-01-01
We investigated the relationship between plasma and yolk oestrogens in laying female zebra finches (Taeniopygia guttata) by manipulating plasma oestradiol (E2) levels, via injection of oestradiol-17β, in a sequence-specific manner to maintain chronically high plasma levels for later-developing eggs (contrasting with the endogenous pattern of decreasing plasma E2 concentrations during laying). We report systematic variation in yolk oestrogen concentrations, in relation to laying sequence, similar to that widely reported for androgenic steroids. In sham-manipulated females, yolk E2 concentrations decreased with laying sequence. However, in E2-treated females plasma E2 levels were higher during the period of rapid yolk development of later-laid eggs, compared with control females. As a consequence, we reversed the laying-sequence-specific pattern of yolk E2: in E2-treated females, yolk E2 concentrations increased with laying-sequence. In general therefore, yolk E2 levels were a direct reflection of plasma E2 levels. However, in control females there was some inter-individual variability in the endogenous pattern of plasma E2 levels through the laying cycle which could generate variation in sequence-specific patterns of yolk hormone levels even if these primarily reflect circulating steroid levels. PMID:15695208
Sperber, Göran; Lövgren, Anders; Eriksson, Nils-Einar; Benachenhou, Farid; Blomberg, Jonas
2009-01-01
Background: The rapid accumulation of genomic information in databases necessitates rapid and specific algorithms for extracting biologically meaningful information. More or less complete retroviral sequences, also called proviral or endogenous retroviral sequences (ERVs), constitute at least 5% of vertebrate genomes. After infecting the host, these retroviruses have integrated in germ line cells, and have then been carried in genomes for at least several hundred million years. A better understanding of the structure and function of these sequences can have profound biological and medical consequences. Methods: RetroTector© (ReTe) is a platform-independent Java program for identification and characterization of proviral sequences in vertebrate genomes. The full ReTe requires a local installation with a MySQL database. Although not overly complicated, the installation may take some time. A "light" version of ReTe (RetroTector online; ROL), which does not require specific installation procedures, is provided via the World Wide Web. Results: ROL (http://www.fysiologi.neuro.uu.se/jbgs/) was implemented under the Batchelor web interface (A Lövgren et al). It accepts sequences (5 to 10,000 kilobases) by GenBank accession number, by file, or by FASTA cut-and-paste. Up to ten submissions can be made simultaneously, allowing batch analysis of ≤ 100 megabases. Jobs are shown in an IP-number specific list. Results are text files, and can be viewed with the program RetroTectorViewer.jar (at the same site), which has the full graphical capabilities of the basic ReTe program. A detailed analysis of any retroviral sequences found in the submitted sequence is graphically presented, exportable in standard formats. With the current server, a complete analysis of a 1 megabase sequence takes 10 minutes. It is possible to mask nonretroviral repetitive sequences in the submitted sequence, using host genome specific "brooms", which increase specificity.
Discussion: Proviral sequences can be hard to recognize, especially if the integration occurred many million years ago. Precise delineation of LTR, gag, pro, pol and env can be difficult, requiring manual work. ROL is a way of simplifying these tasks. Conclusion: ROL provides 1. annotation and presentation of known retroviral sequences, 2. detection of proviral chains in unknown genomic sequences, with up to 100 Mbase per submission. PMID:19534753
Ergünay, Koray; Brinkmann, Annika; Litzba, Nadine; Günay, Filiz; Kar, Sırrı; Öter, Kerem; Örsten, Serra; Sarıkaya, Yasemen; Alten, Bülent; Nitsche, Andreas; Linton, Yvonne-Marie
2017-07-01
Next-generation sequencing technologies have significantly facilitated the discovery of novel viruses, and metagenomic surveillance of arthropods has enabled exploration of the diversity of novel or known viral agents. We have identified a novel rhabdovirus that is genetically related to the recently described Merida virus via next-generation sequencing in a mosquito pool from Thrace. The complete viral genome contains 11,798 nucleotides with 83% genome-wide nucleotide sequence similarity to Merida virus. Five major putative open reading frames that follow the canonical rhabdovirus genome organization were identified. A total of 1380 mosquitoes comprising 13 species, collected from Thrace and the Mediterranean and Aegean regions of Anatolia were screened for the novel virus using primers based on the N and L genes of the prototype genome. Eight positive pools (6.2%) exclusively comprised Culex pipiens sensu lato specimens originating from all study regions. Infections were observed in pools with female as well as male or mixed-sex individuals. The overall and Cx. pipiens-specific minimal infection rates were calculated to be 5.7 and 14.8, respectively. Sequencing of the PCR products revealed marked diversity within a portion of the N gene, with up to 4% divergence and distinct amino acid substitutions that were unrelated to the collection site. Phylogenetic analysis of the complete and partial viral polymerase (L gene) amino acid sequences placed the novel virus and Merida virus in a distinct group, indicating that these strains are closely related. The strain is tentatively named "Merida-like virus Turkey". Studies are underway to isolate and further explore the host range and distribution of this new strain.
The power and promise of RNA-seq in ecology and evolution.
Todd, Erica V; Black, Michael A; Gemmell, Neil J
2016-03-01
Reference is regularly made to the power of new genomic sequencing approaches. Using powerful technology, however, is not the same as having the necessary power to address a research question with statistical robustness. In the rush to adopt new and improved genomic research methods, limitations of technology and experimental design may be initially neglected. Here, we review these issues with regard to RNA sequencing (RNA-seq). RNA-seq adds large-scale transcriptomics to the toolkit of ecological and evolutionary biologists, enabling differential gene expression (DE) studies in nonmodel species without the need for prior genomic resources. High biological variance is typical of field-based gene expression studies and means that larger sample sizes are often needed to achieve the same degree of statistical power as clinical studies based on data from cell lines or inbred animal models. Sequencing costs have plummeted, yet RNA-seq studies still underutilize biological replication. Finite research budgets force a trade-off between sequencing effort and replication in RNA-seq experimental design. However, clear guidelines for negotiating this trade-off, while taking into account study-specific factors affecting power, are currently lacking. Study designs that prioritize sequencing depth over replication fail to capitalize on the power of RNA-seq technology for DE inference. Significant recent research effort has gone into developing statistical frameworks and software tools for power analysis and sample size calculation in the context of RNA-seq DE analysis. We synthesize progress in this area and derive an accessible rule-of-thumb guide for designing powerful RNA-seq experiments relevant in eco-evolutionary and clinical settings alike. © 2016 John Wiley & Sons Ltd.
Parker, Jennifer K.; Havird, Justin C.
2012-01-01
Isolates of the plant pathogen Xylella fastidiosa are genetically very similar, but studies on their biological traits have indicated differences in virulence and infection symptomatology. Taxonomic analyses have identified several subspecies, and phylogenetic analyses of housekeeping genes have shown broad host-based genetic differences; however, results are still inconclusive for genetic differentiation of isolates within subspecies. This study employs multilocus sequence analysis of environmentally mediated genes (MLSA-E; genes influenced by environmental factors) to investigate X. fastidiosa relationships and differentiate isolates with low genetic variability. Potential environmentally mediated genes, including host colonization and survival genes related to infection establishment, were identified a priori. The ratio of the rate of nonsynonymous substitutions to the rate of synonymous substitutions (dN/dS) was calculated to select genes that may be under increased positive selection compared to previously studied housekeeping genes. Nine genes were sequenced from 54 X. fastidiosa isolates infecting different host plants across the United States. Results of maximum likelihood (ML) and Bayesian phylogenetic (BP) analyses are in agreement with known X. fastidiosa subspecies clades but show novel within-subspecies differentiation, including geographic differentiation, and provide additional information regarding host-based isolate variation and specificity. dN/dS ratios of environmentally mediated genes, though <1 due to high sequence similarity, are significantly greater than housekeeping gene dN/dS ratios and correlate with increased sequence variability. MLSA-E can more precisely resolve relationships between closely related bacterial strains with low genetic variability, such as X. fastidiosa isolates. Discovering the genetic relationships between X. fastidiosa isolates will provide new insights into the epidemiology of populations of X. fastidiosa, allowing improved disease management in economically important crops. PMID:22194287
Parker, Jennifer K; Havird, Justin C; De La Fuente, Leonardo
2012-03-01
Isolates of the plant pathogen Xylella fastidiosa are genetically very similar, but studies on their biological traits have indicated differences in virulence and infection symptomatology. Taxonomic analyses have identified several subspecies, and phylogenetic analyses of housekeeping genes have shown broad host-based genetic differences; however, results are still inconclusive for genetic differentiation of isolates within subspecies. This study employs multilocus sequence analysis of environmentally mediated genes (MLSA-E; genes influenced by environmental factors) to investigate X. fastidiosa relationships and differentiate isolates with low genetic variability. Potential environmentally mediated genes, including host colonization and survival genes related to infection establishment, were identified a priori. The ratio of the rate of nonsynonymous substitutions to the rate of synonymous substitutions (dN/dS) was calculated to select genes that may be under increased positive selection compared to previously studied housekeeping genes. Nine genes were sequenced from 54 X. fastidiosa isolates infecting different host plants across the United States. Results of maximum likelihood (ML) and Bayesian phylogenetic (BP) analyses are in agreement with known X. fastidiosa subspecies clades but show novel within-subspecies differentiation, including geographic differentiation, and provide additional information regarding host-based isolate variation and specificity. dN/dS ratios of environmentally mediated genes, though <1 due to high sequence similarity, are significantly greater than housekeeping gene dN/dS ratios and correlate with increased sequence variability. MLSA-E can more precisely resolve relationships between closely related bacterial strains with low genetic variability, such as X. fastidiosa isolates. Discovering the genetic relationships between X. fastidiosa isolates will provide new insights into the epidemiology of populations of X. fastidiosa, allowing improved disease management in economically important crops.
Mohorianu, Irina; Stocks, Matthew Benedict; Wood, John; Dalmay, Tamas; Moulton, Vincent
2013-07-01
Small RNAs (sRNAs) are 20-25 nt non-coding RNAs that act as guides for the highly sequence-specific regulatory mechanism known as RNA silencing. Due to the recent increase in sequencing depth, a highly complex and diverse population of sRNAs in both plants and animals has been revealed. However, the exponential increase in sequencing data has also made the identification of individual sRNA transcripts corresponding to biological units (sRNA loci) more challenging when based exclusively on the genomic location of the constituent sRNAs, hindering existing approaches to identify sRNA loci. To infer the location of significant biological units, we propose an approach for sRNA loci detection called CoLIde (Co-expression based sRNA Loci Identification) that combines genomic location with the analysis of other information such as variation in expression levels (expression pattern) and size class distribution. For CoLIde, we define a locus as a union of regions sharing the same pattern and located in close proximity on the genome. Biological relevance, detected through the analysis of size class distribution, is also calculated for each locus. CoLIde can be applied on ordered (e.g., time-dependent) or un-ordered (e.g., organ, mutant) series of samples both with or without biological/technical replicates. The method reliably identifies known types of loci and shows improved performance on sequencing data from both plants (e.g., A. thaliana, S. lycopersicum) and animals (e.g., D. melanogaster) when compared with existing locus detection techniques. CoLIde is available for use within the UEA Small RNA Workbench which can be downloaded from: http://srna-workbench.cmp.uea.ac.uk.
Miyai, Manami; Eikawa, Shingo; Hosoi, Akihiro; Iino, Tamaki; Matsushita, Hirokazu; Isobe, Midori; Uenaka, Akiko; Udono, Heiichiro; Nakajima, Jun; Nakayama, Eiichi; Kakimi, Kazuhiro
2015-01-01
Comprehensive immunological evaluation is crucial for monitoring patients undergoing antigen-specific cancer immunotherapy. The identification and quantification of T cell responses is most important for the further development of such therapies. Using well-characterized clinical samples from a high responder patient (TK-f01) in an NY-ESO-1f peptide vaccine study, we performed high-throughput T cell receptor β-chain (TCRB) gene next generation sequencing (NGS) to monitor the frequency of NY-ESO-1-specific CD8+ T cells. We compared these results with those of conventional immunological assays, such as IFN-γ capture, tetramer binding and limiting dilution clonality assays. We sequenced human TCRB complementarity-determining region 3 (CDR3) rearrangements of two NY-ESO-1f-specific CD8+ T cell clones, 6-8L and 2F6, as well as PBMCs over the course of peptide vaccination. Clone 6-8L possessed the TCRB CDR3 gene TCRBV11-03*01 and BJ02-01*01 with amino acid sequence CASSLRGNEQFF, whereas 2F6 possessed TCRBV05-08*01 and BJ02-04*01 (CASSLVGTNIQYF). Using these two sequences as models, we evaluated the frequency of NY-ESO-1-specific CD8+ T cells in PBMCs ex vivo. The 6-8L CDR3 sequence was the second most frequent in PBMC and was present at high frequency (0.7133%) even prior to vaccination, and sustained over the course of vaccination. Despite a marked expansion of NY-ESO-1-specific CD8+ T cells detected from the first through 6th vaccination by tetramer staining and IFN-γ capture assays, as evaluated by CDR3 sequencing the frequency did not increase with increasing rounds of peptide vaccination. By clonal analysis using 12 day in vitro stimulation, the frequency of B*52:01-restricted NY-ESO-1f peptide-specific CD8+ T cells in PBMCs was estimated as only 0.0023%, far below the 0.7133% by NGS sequencing. Thus, assays requiring in vitro stimulation might be underestimating the frequency of clones with lower proliferation potential. 
High-throughput TCRB sequencing using NGS can potentially better estimate the actual frequency of antigen-specific T cells and thus provide more accurate patient monitoring. PMID:26291626
Kizaki, Seiichiro; Chandran, Anandhakumar; Sugiyama, Hiroshi
2016-03-02
Tet (ten-eleven translocation) family proteins have the ability to oxidize 5-methylcytosine (mC) to 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC), and 5-carboxycytosine (caC). However, the oxidation reaction of Tet is not understood completely. Evaluation of genomic-level epigenetic changes by Tet protein requires unbiased identification of the highly selective oxidation sites. In this study, we used high-throughput sequencing to investigate the sequence specificity of mC oxidation by Tet1. A 6.6×10^4-member mC-containing random DNA-sequence library was constructed. The library was subjected to Tet-reactive pulldown followed by high-throughput sequencing. Analysis of the obtained sequence data identified the Tet1-reactive sequences. We identified mCpG as a highly reactive sequence of Tet1 protein. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The cDNA-derived amino acid sequence of hemoglobin II from Lucina pectinata.
Torres-Mercado, Elineth; Renta, Jessicca Y; Rodríguez, Yolanda; López-Garriga, Juan; Cadilla, Carmen L
2003-11-01
Hemoglobin II from the clam Lucina pectinata is an oxygen-reactive protein with a unique structural organization in the heme pocket involving residues Gln65 (E7), Tyr30 (B10), Phe44 (CD1), and Phe69 (E11). We employed the reverse transcriptase-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE) methods to synthesize various cDNA(HbII). An initial 300-bp cDNA clone was amplified from total RNA by RT-PCR using degenerate oligonucleotides. Gene-specific primers derived from the HbII-partial cDNA sequence were used to obtain the 5' and 3' ends of the cDNA by RACE. The length of the HbII cDNA, estimated from overlapping clones, was approximately 2114 bases. Northern blot analysis revealed that the mRNA size of HbII agrees with the estimated size using cDNA data. The coding region of the full-length HbII cDNA codes for 151 amino acids. The calculated molecular weight of HbII, including the heme group and acetylated N-terminal residue, is 17,654.07 Da.
Ahmed, Towfiq; Haraldsen, Jason T; Rehr, John J; Di Ventra, Massimiliano; Schuller, Ivan; Balatsky, Alexander V
2014-03-28
Nanopore-based sequencing has demonstrated a significant potential for the development of fast, accurate, and cost-efficient fingerprinting techniques for next generation molecular detection and sequencing. We propose a specific multilayered graphene-based nanopore device architecture for the recognition of single biomolecules. Molecular detection and analysis can be accomplished through the detection of transverse currents as the molecule or DNA base translocates through the nanopore. To increase the overall signal-to-noise ratio and the accuracy, we implement a new 'multi-point cross-correlation' technique for identification of DNA bases or other molecules on the single molecular level. We demonstrate that the cross-correlations between each nanopore will greatly enhance the transverse current signal for each molecule. We implement first-principles transport calculations for DNA bases surveyed across a multilayered graphene nanopore system to illustrate the advantages of the proposed geometry. A time-series analysis of the cross-correlation functions illustrates the potential of this method for enhancing the signal-to-noise ratio. This work constitutes a significant step forward in facilitating fingerprinting of single biomolecules using solid state technology.
Fluctuations in the DNA double helix
NASA Astrophysics Data System (ADS)
Peyrard, M.; López, S. C.; Angelov, D.
2007-08-01
DNA is not the static entity suggested by the famous double helix structure. It shows large fluctuational openings, in which the bases, which contain the genetic code, are temporarily open. Therefore it is an interesting system to study the effect of nonlinearity on the physical properties of a system. A simple model for DNA, at a mesoscopic scale, can be investigated by computer simulation, in the same spirit as the original work of Fermi, Pasta and Ulam. These calculations raise fundamental questions in statistical physics because they show a temporary breaking of equipartition of energy, regions with large amplitude fluctuations being able to coexist with regions where the fluctuations are very small, even when the model is studied in the canonical ensemble. This phenomenon can be related to nonlinear excitations in the model. The ability of the model to describe the actual properties of DNA is discussed by comparing theoretical and experimental results for the probability that base pairs open at a given temperature in specific DNA sequences. These studies give us indications on the proper description of the effect of the sequence in the mesoscopic model.
Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.
Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N
1984-03-26
The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.
NASA Astrophysics Data System (ADS)
Holden, Todd; Marchese, P.; Tremberger, G., Jr.; Cheung, E.; Subramaniam, R.; Sullivan, R.; Schneider, P.; Flamholz, A.; Lieberman, D.; Cheung, T.
2008-08-01
We have characterized function related DNA sequences of various organisms using informatics techniques, including fractal dimension calculation, nucleotide and multi-nucleotide statistics, and sequence fluctuation analysis. Our analysis shows trends which differentiate extremophile from non-extremophile organisms, which could be reproduced in extraterrestrial life. Among the systems studied are radiation repair genes, genes involved in thermal shocks, and genes involved in drug resistance. We also evaluate sequence level changes that have occurred during short term evolution (several thousand generations) under extreme conditions.
Statistical physics of interacting neural networks
NASA Astrophysics Data System (ADS)
Kinzel, Wolfgang; Metzler, Richard; Kanter, Ido
2001-12-01
Recent results on the statistical physics of time series generation and prediction are presented. A neural network is trained on quasi-periodic and chaotic sequences, and its overlaps with the sequence generator as well as its prediction errors are calculated numerically. For each network there exists a sequence for which it completely fails to make predictions. Two interacting networks show a transition to perfect synchronization. A pool of interacting networks shows good coordination in the minority game, a model of competition in a closed market. Finally, as a demonstration, a perceptron predicts bit sequences produced by human beings.
Bobrova, E V; Liakhovetskiĭ, V A; Borshchevskaia, E R
2011-01-01
The dependence of errors during reproduction of a sequence of hand movements without visual feedback on the previous right- and left-hand performance ("prehistory") and on positions in space of sequence elements (random or ordered by the explicit rule) was analyzed. It was shown that the preceding information about the ordered positions of the sequence elements was used during right-hand movements, whereas left-hand movements were performed with involvement of the information about the random sequence. The data testify to a central mechanism of the analysis of spatial structure of sequence elements. This mechanism activates movement coding specific for the left hemisphere (vector coding) in case of an ordered sequence structure and positional coding specific for the right hemisphere in case of a random sequence structure.
Poliovirus serotype-specific VP1 sequencing primers.
Kilpatrick, David R; Iber, Jane C; Chen, Qi; Ching, Karen; Yang, Su-Ju; De, Lina; Mandelbaum, Mark D; Emery, Brian; Campagnoli, Ray; Burns, Cara C; Kew, Olen
2011-06-01
The Global Polio Laboratory Network routinely uses poliovirus-specific PCR primers and probes to determine the serotype and genotype of poliovirus isolates obtained as part of global poliovirus surveillance. To provide detailed molecular epidemiologic information, poliovirus isolates are further characterized by sequencing the ~900-nucleotide region encoding the major capsid protein, VP1. It is difficult to obtain quality sequence information when clinical or environmental samples contain poliovirus mixtures. As an alternative to conventional methods for resolving poliovirus mixtures, sets of serotype-specific primers were developed for amplifying and sequencing the VP1 regions of individual components of mixed populations of vaccine-vaccine, vaccine-wild, and wild-wild polioviruses. Published by Elsevier B.V.
Methods for chromosome-specific staining
Gray, J.W.; Pinkel, D.
1995-09-05
Methods and compositions for chromosome-specific staining are provided. Compositions comprise heterogeneous mixtures of labeled nucleic acid fragments having substantially complementary base sequences to unique sequence regions of the chromosomal DNA for which their associated staining reagent is specific. Methods include ways for making the chromosome-specific staining compositions of the invention, and methods for applying the staining compositions to chromosomes. 3 figs.
Methods and compositions for chromosome-specific staining
Gray, Joe W.; Pinkel, Daniel
2003-07-22
Methods and compositions for chromosome-specific staining are provided. Compositions comprise heterogenous mixtures of labeled nucleic acid fragments having substantially complementary base sequences to unique sequence regions of the chromosomal DNA for which their associated staining reagent is specific. Methods include methods for making the chromosome-specific staining compositions of the invention, and methods for applying the staining compositions to chromosomes.
Sequence specificity of single-stranded DNA-binding proteins: a novel DNA microarray approach
Morgan, Hugh P.; Estibeiro, Peter; Wear, Martin A.; Max, Klaas E.A.; Heinemann, Udo; Cubeddu, Liza; Gallagher, Maurice P.; Sadler, Peter J.; Walkinshaw, Malcolm D.
2007-01-01
We have developed a novel DNA microarray-based approach for identification of the sequence-specificity of single-stranded nucleic-acid-binding proteins (SNABPs). For verification, we have shown that the major cold shock protein (CspB) from Bacillus subtilis binds with high affinity to pyrimidine-rich sequences, with a binding preference for the consensus sequence, 5′-GTCTTTG/T-3′. The sequence was modelled onto the known structure of CspB and a cytosine-binding pocket was identified, which explains the strong preference for a cytosine base at position 3. This microarray method offers a rapid high-throughput approach for determining the specificity and strength of ss DNA–protein interactions. Further screening of this newly emerging family of transcription factors will help provide an insight into their cellular function. PMID:17488853
Discovering frequently recurring movement sequences in team-sport athlete spatiotemporal data.
Sweeting, Alice J; Aughey, Robert J; Cormack, Stuart J; Morgan, Stuart
2017-12-01
Athlete external load is typically analysed from predetermined movement thresholds. The combination of movement sequences and differences in these movements between playing positions is also currently unknown. This study developed a method to discover the frequently recurring movement sequences across playing position during matches. The external load of 12 international female netball athletes was collected by a local positioning system during four national-level matches. Velocity, acceleration and angular velocity were calculated from positional (X, Y) data, clustered via one-dimensional k-means and assigned a unique alphabetic label. Combinations of velocity, acceleration and angular velocity movement were compared using the Levenshtein distance and similarities computed by the longest common substring problem. The contribution of each movement sequence, according to playing position and relative to the wider data set, was then calculated via the Minkowski distance. A total of 10 frequently recurring combinations of movement were discovered, regardless of playing position. Only the wing attack, goal attack and goal defence playing positions are closely related. We developed a technique to discover the movement sequences, according to playing position, performed by elite netballers. This methodology can be extended to discover the frequently recurring movements within other team sports and across levels of competition.
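The two string comparisons named in the abstract above, Levenshtein distance and longest common substring, can be sketched on label sequences as follows. This is an illustrative sketch only, not the authors' implementation: the function names and the example label strings are hypothetical, and each character is assumed to stand for one clustered movement label.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def longest_common_substring(a: str, b: str) -> str:
    """Longest run of consecutive labels shared by a and b
    (the first such run found in a if there are ties)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1  # extend the shared run
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical label strings: one letter per clustered movement state.
seq_one = "AABCD"
seq_two = "ABCDD"
print(levenshtein(seq_one, seq_two))
print(longest_common_substring(seq_one, seq_two))
```

A small edit distance together with a long shared substring suggests that two sequences contain much the same run of movements, which is the kind of similarity the study uses to group recurring combinations.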
Wyllie, Anne L; Pannekoek, Yvonne; Bovenkerk, Sandra; van Engelsdorp Gastelaars, Jody; Ferwerda, Bart; van de Beek, Diederik; Sanders, Elisabeth A M; Trzciński, Krzysztof; van der Ende, Arie
2017-09-01
The vast majority of streptococci colonizing the human upper respiratory tract are commensals, only sporadically implicated in disease. Of these, the most pathogenic is Mitis group member, Streptococcus pneumoniae. Phenotypic and genetic similarities between streptococci can cause difficulties in species identification. Using ribosomal S2-gene sequences extracted from whole-genome sequences published from 501 streptococci, we developed a method to identify streptococcal species. We validated this method on non-pneumococcal isolates cultured from cases of severe streptococcal disease (n = 101) and from carriage (n = 103), and on non-typeable pneumococci from asymptomatic individuals (n = 17) and on whole-genome sequences of 1157 pneumococcal isolates from meningitis in the Netherlands. Following this, we tested 221 streptococcal isolates in molecular assays originally assumed specific for S. pneumoniae, targeting cpsA, lytA, piaB, ply, Spn9802, zmpC and capsule-type-specific genes. Cluster analysis of S2-sequences showed grouping according to species in line with published phylogenies of streptococcal core genomes. S2-typing convincingly distinguished pneumococci from non-pneumococcal species (99.2% sensitivity, 100% specificity). Molecular assays targeting regions of lytA and piaB were 100% specific for S. pneumoniae, whereas assays targeting cpsA, ply, Spn9802, zmpC and selected serotype-specific assays (but not capsular sequence typing) showed a lack of specificity. False positive results were over-represented in species associated with carriage, although no particular confounding signal was unique for carriage isolates. © 2017 The Authors.
Pannekoek, Yvonne; Bovenkerk, Sandra; van Engelsdorp Gastelaars, Jody; Ferwerda, Bart; van de Beek, Diederik; Sanders, Elisabeth A. M.; Trzciński, Krzysztof; van der Ende, Arie
2017-01-01
The vast majority of streptococci colonizing the human upper respiratory tract are commensals, only sporadically implicated in disease. Of these, the most pathogenic is Mitis group member, Streptococcus pneumoniae. Phenotypic and genetic similarities between streptococci can cause difficulties in species identification. Using ribosomal S2-gene sequences extracted from whole-genome sequences published from 501 streptococci, we developed a method to identify streptococcal species. We validated this method on non-pneumococcal isolates cultured from cases of severe streptococcal disease (n = 101) and from carriage (n = 103), and on non-typeable pneumococci from asymptomatic individuals (n = 17) and on whole-genome sequences of 1157 pneumococcal isolates from meningitis in the Netherlands. Following this, we tested 221 streptococcal isolates in molecular assays originally assumed specific for S. pneumoniae, targeting cpsA, lytA, piaB, ply, Spn9802, zmpC and capsule-type-specific genes. Cluster analysis of S2-sequences showed grouping according to species in line with published phylogenies of streptococcal core genomes. S2-typing convincingly distinguished pneumococci from non-pneumococcal species (99.2% sensitivity, 100% specificity). Molecular assays targeting regions of lytA and piaB were 100% specific for S. pneumoniae, whereas assays targeting cpsA, ply, Spn9802, zmpC and selected serotype-specific assays (but not capsular sequence typing) showed a lack of specificity. False positive results were over-represented in species associated with carriage, although no particular confounding signal was unique for carriage isolates. PMID:28931649
Secondary structure prediction and structure-specific sequence analysis of single-stranded DNA.
Dong, F; Allawi, H T; Anderson, T; Neri, B P; Lyamichev, V I
2001-08-01
DNA sequence analysis by oligonucleotide binding is often affected by interference with the secondary structure of the target DNA. Here we describe an approach that improves DNA secondary structure prediction by combining enzymatic probing of DNA by structure-specific 5'-nucleases with an energy minimization algorithm that utilizes the 5'-nuclease cleavage sites as constraints. The method can identify structural differences between two DNA molecules caused by minor sequence variations such as a single nucleotide mutation. It also demonstrates the existence of long-range interactions between DNA regions separated by >300 nt and the formation of multiple alternative structures by a 244 nt DNA molecule. The differences in the secondary structure of DNA molecules revealed by 5'-nuclease probing were used to design structure-specific probes for mutation discrimination that target the regions of structural, rather than sequence, differences. We also demonstrate the performance of structure-specific 'bridge' probes complementary to non-contiguous regions of the target molecule. The structure-specific probes do not require the high stringency binding conditions necessary for methods based on mismatch formation and permit mutation detection at temperatures from 4 to 37 degrees C. Structure-specific sequence analysis is applied for mutation detection in the Mycobacterium tuberculosis katG gene and for genotyping of the hepatitis C virus.
Allele-specific copy-number discovery from whole-genome and whole-exome sequencing
Wang, WeiBo; Wang, Wei; Sun, Wei; Crowley, James J.; Szatkiewicz, Jin P.
2015-01-01
Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol are freely available at https://sourceforge.net/projects/asgenseng/. PMID:25883151
Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel
2014-01-01
Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′ and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties – the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
Sequence Polishing Library (SPL) v10.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oberortner, Ernst
The Sequence Polishing Library (SPL) is a suite of software tools that automate "Design for Synthesis and Assembly" workflows. Specifically: The SPL "Converter" tool converts files among the following sequence data exchange formats: CSV, FASTA, GenBank, and Synthetic Biology Open Language (SBOL). The SPL "Juggler" tool optimizes the codon usage of DNA coding sequences according to an optimization strategy, a user-specific codon usage table, and a genetic code. In addition, the SPL "Juggler" can translate amino acid sequences into DNA sequences. The SPL "Polisher" verifies NA sequences against DNA synthesis constraints, such as GC content, repeating k-mers, and restriction sites. In case of violations, the "Polisher" reports the violations in a comprehensive manner. The "Polisher" tool can also modify the violating regions according to an optimization strategy, a user-specific codon usage table, and a genetic code. The SPL "Partitioner" decomposes large DNA sequences into smaller building blocks with partial overlaps that enable an efficient assembly. The "Partitioner" enables the user to configure the characteristics of the overlaps, such as length, GC content, or melting temperature, which are mostly determined by the utilized assembly protocol.
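The kinds of checks attributed to the "Polisher" (GC content, repeating k-mers, restriction sites) can be illustrated with a minimal Python sketch. This is not SPL code and does not reflect its API; the function names and the example EcoRI site GAATTC are assumptions chosen only for illustration.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def repeated_kmers(seq, k):
    """Return every k-mer that occurs more than once, with its count."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


def find_sites(seq, site):
    """Return the 0-based start positions of every (possibly overlapping) match of site."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


if __name__ == "__main__":
    example = "AAGAATTCAAGAATTC"  # hypothetical sequence with two EcoRI sites
    print(gc_content(example))
    print(repeated_kmers(example, 6))
    print(find_sites(example, "GAATTC"))
```

A real constraint checker would compare these values against thresholds set by the synthesis vendor; those thresholds are not given in the record above, so none are assumed here.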
McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA; Vitalis, Elizabeth A [Livermore, CA
2007-02-06
Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as a marker or signature for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
McCready, Paula M [Tracy, CA; Radnedge, Lyndsay [San Mateo, CA; Andersen, Gary L [Berkeley, CA; Ott, Linda L [Livermore, CA; Slezak, Thomas R [Livermore, CA; Kuczmarski, Thomas A [Livermore, CA; Vitalis, Elizabeth A [Livermore, CA
2009-02-24
Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as a marker or signature for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.
Inhibition in motor imagery: a novel action mode switching paradigm.
Rieger, Martina; Dahm, Stephan F; Koch, Iring
2017-04-01
Motor imagery requires that actual movements are prevented (i.e., inhibited) from execution. To investigate at what level inhibition takes place in motor imagery, we developed a novel action mode switching paradigm. Participants imagined (indicating only start and end) and executed movements from start buttons to target buttons, and we analyzed trial sequence effects. Trial sequences depended on current action mode (imagination or execution), previous action mode (pure blocks/same mode, mixed blocks/same mode, or mixed blocks/other mode), and movement sequence (action repetition, hand repetition, or hand alternation). Results provided evidence for global inhibition (indicated by switch benefits in execution-imagination (E-I)-sequences in comparison to I-I-sequences), effector-specific inhibition (indicated by hand repetition costs after an imagination trial), and target inhibition (indicated by target repetition benefits in I-I-sequences). No evidence for subthreshold motor activation or action-specific inhibition (inhibition of the movement of an effector to a specific target) was obtained. Two (global inhibition and effector-specific inhibition) of the three observed mechanisms are active inhibition mechanisms. In conclusion, motor imagery is not simply a weaker form of execution, which often is implied in views focusing on similarities between imagination and execution.
Sun, Xiaoqin; Wei, Yanglian; Qin, Minjian; Guo, Qiaosheng; Guo, Jianlin; Zhou, Yifeng; Hang, Yueyu
2012-03-01
The rDNA ITS regions of 18 samples of Changium smyrnioides from 7 areas and of 2 samples of Chuanminshen violaceum were sequenced and analyzed. The amplified ITS region of the samples, including a partial sequence of ITS1 and complete sequences of 5.8S and ITS2, had a total length of 555 bp. After complete alignment, there were 49 variable sites, of which 45 were informative, when gaps were treated as missing data. Samples of C. smyrnioides from different locations could be identified exactly based on the variable sites. The maximum parsimony (MP) and neighbor joining (NJ) trees constructed from the ITS sequences based on Kimura's two-parameter model showed that the genetic distances of the C. smyrnioides samples from different locations were not always related to their geographical distances. A specific primer set for Allele-specific PCR authentication of C. violaceum from Jurong of Jiangsu was designed based on the SNP in the ITS sequence alignment. C. violaceum from the major genuine producing area in Jurong of Jiangsu could be identified exactly and quickly by Allele-specific PCR.
Roberts, C H; Turino, C; Madrigal, J A; Marsh, S G E
2007-06-01
DNA enrichment by allele-specific hybridization (DEASH) was used as a means to isolate individual alleles of the killer cell immunoglobulin-like receptor (KIR2DL4) gene from heterozygous genomic DNA. Using long-template polymerase chain reaction (LT-PCR), the complete KIR2DL4 gene was amplified from a cell line that had previously been characterized for its KIR gene content by PCR using sequence-specific primers (PCR-SSP). The whole gene amplicons were sequenced and we identified two heterozygous positions in accordance with the predictions of the PCR-SSP. The amplicons were then hybridized to allele-specific, biotinylated oligonucleotide probes and through binding to streptavidin-coated beads, the targeted alleles were enriched. A second PCR amplified only the exonic regions of the enriched allele, and these were then sequenced in full. We show DEASH to be capable of enriching single alleles from a heterozygous PCR product, and through sequencing the enriched DNA, we are able to produce complete coding sequences of the KIR2DL4 alleles in accordance with the typing predicted by PCR-SSP.
ERIC Educational Resources Information Center
Laming, Donald
2006-01-01
This article reports some calculations on free-recall data from B. Murdock and J. Metcalfe (1978), with vocal rehearsal during the presentation of a list. Given the sequence of vocalizations, with the stimuli inserted in their proper places, it is possible to predict the subsequent sequence of recalls--the predictions taking the form of a…
Huang, Ying; Li, Cao; Liu, Linhai; Jia, Xianbo; Lai, Song-Jia
2016-01-01
Although various computer tools have been elaborately developed to calculate a series of statistics in molecular population genetics for both small- and large-scale DNA data, there is no efficient and easy-to-use toolkit available yet for exclusively focusing on the steps of mathematical calculation. Here, we present PopSc, a bioinformatic toolkit for calculating 45 basic statistics in molecular population genetics, which could be categorized into three classes, including (i) genetic diversity of DNA sequences, (ii) statistical tests for neutral evolution, and (iii) measures of genetic differentiation among populations. In contrast to the existing computer tools, PopSc was designed to directly accept the intermediate metadata, such as allele frequencies, rather than the raw DNA sequences or genotyping results. PopSc is first implemented as the web-based calculator with user-friendly interface, which greatly facilitates the teaching of population genetics in class and also promotes the convenient and straightforward calculation of statistics in research. Additionally, we also provide the Python library and R package of PopSc, which can be flexibly integrated into other advanced bioinformatic packages of population genetics analysis. PMID:27792763
Chen, Shi-Yi; Deng, Feilong; Huang, Ying; Li, Cao; Liu, Linhai; Jia, Xianbo; Lai, Song-Jia
2016-01-01
Although various computer tools have been elaborately developed to calculate a series of statistics in molecular population genetics for both small- and large-scale DNA data, there is no efficient and easy-to-use toolkit available yet for exclusively focusing on the steps of mathematical calculation. Here, we present PopSc, a bioinformatic toolkit for calculating 45 basic statistics in molecular population genetics, which could be categorized into three classes, including (i) genetic diversity of DNA sequences, (ii) statistical tests for neutral evolution, and (iii) measures of genetic differentiation among populations. In contrast to the existing computer tools, PopSc was designed to directly accept the intermediate metadata, such as allele frequencies, rather than the raw DNA sequences or genotyping results. PopSc is first implemented as the web-based calculator with user-friendly interface, which greatly facilitates the teaching of population genetics in class and also promotes the convenient and straightforward calculation of statistics in research. Additionally, we also provide the Python library and R package of PopSc, which can be flexibly integrated into other advanced bioinformatic packages of population genetics analysis.
Optimal rotation sequences for active perception
NASA Astrophysics Data System (ADS)
Nakath, David; Rachuy, Carsten; Clemens, Joachim; Schill, Kerstin
2016-05-01
One major objective of autonomous systems navigating in dynamic environments is gathering information needed for self localization, decision making, and path planning. To account for this, such systems are usually equipped with multiple types of sensors. As these sensors often have a limited field of view and a fixed orientation, the task of active perception breaks down to the problem of calculating alignment sequences which maximize the information gain regarding expected measurements. Action sequences that rotate the system according to the calculated optimal patterns then have to be generated. In this paper we present an approach for calculating these sequences for an autonomous system equipped with multiple sensors. We use a particle filter for multi-sensor fusion and state estimation. The planning task is modeled as a Markov decision process (MDP), where the system decides in each step what actions to perform next. The optimal control policy, which provides the best action depending on the current estimated state, maximizes the expected cumulative reward. The latter is computed from the expected information gain of all sensors over time using value iteration. The algorithm is applied to a manifold representation of the joint space of rotation and time. We show the performance of the approach in a spacecraft navigation scenario where the information gain is changing over time, caused by the dynamic environment and the continuous movement of the spacecraft.
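Value iteration, as named in the abstract above, can be sketched on a toy problem. The following Python sketch is a generic textbook formulation, not the authors' implementation: the two orientations "A" and "B", the actions "stay" and "rotate", the information-gain values, and the discount factor 0.9 are all hypothetical.

```python
def q_value(state, action, values, transition, reward, gamma):
    """Expected discounted return of taking action in state, given current value estimates."""
    return sum(
        prob * (reward(state, action, nxt) + gamma * values[nxt])
        for nxt, prob in transition(state, action).items()
    )


def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-9):
    """Iterate the Bellman optimality update until the largest change falls below tol."""
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, values, transition, reward, gamma) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(s, a, values, transition, reward, gamma))
        for s in states
    }
    return values, policy


# Hypothetical toy model: information gained by observing from each orientation.
INFO_GAIN = {"A": 1.0, "B": 2.0}


def toy_transition(state, action):
    """Deterministic dynamics: 'rotate' flips the orientation, 'stay' keeps it."""
    if action == "rotate":
        return {("B" if state == "A" else "A"): 1.0}
    return {state: 1.0}


def toy_reward(state, action, next_state):
    """Reward is the (assumed) information gain at the orientation reached."""
    return INFO_GAIN[next_state]


if __name__ == "__main__":
    values, policy = value_iteration(["A", "B"], ["stay", "rotate"], toy_transition, toy_reward)
    print(policy)
```

In this toy model the greedy policy rotates towards the more informative orientation and then stays there. The paper's setting differs in that the information gain changes over time, which is why it plans over a joint space of rotation and time.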
Chavhan, Govind B; Babyn, Paul S; Vasanawala, Shreyas S
2013-05-01
Familiarity with basic sequence properties and their trade-offs is necessary for radiologists performing abdominal magnetic resonance (MR) imaging. Acquiring diagnostic-quality MR images in the pediatric abdomen is challenging due to motion, inability to breath hold, varying patient size, and artifacts. Motion-compensation techniques (eg, respiratory gating, signal averaging, suppression of signal from moving tissue, swapping phase- and frequency-encoding directions, use of faster sequences with breath holding, parallel imaging, and radial k-space filling) can improve image quality. Each of these techniques is more suitable for use with certain sequences and acquisition planes and in specific situations and age groups. Different T1- and T2-weighted sequences work better in different age groups and with differing acquisition planes and have specific advantages and disadvantages. Dynamic imaging should be performed differently in younger children than in older children. In younger children, the sequence and the timing of dynamic phases need to be adjusted. Different sequences work better in smaller children and in older children because of differing breath-holding ability, breathing patterns, field of view, and use of sedation. Hence, specific protocols should be maintained for younger children and older children. Combining longer-higher-resolution sequences and faster-lower-resolution sequences helps acquire diagnostic-quality images in a reasonable time. © RSNA, 2013.
Gene calling and bacterial genome annotation with BG7.
Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo
2015-01-01
New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences that are the elements directly related to biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. This tool is tolerant of sequence errors and retains its capabilities for the annotation of highly fragmented genomes or for annotating mixed sequences coming from several genomes (such as those obtained from metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).
SU-E-T-250: New IMRT Sequencing Strategy: Towards Intra-Fraction Plan Adaptation for the MR-Linac
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kontaxis, C; Bol, G; Lagendijk, J
2014-06-01
Purpose: To develop a new sequencer for IMRT planning that during treatment makes the inclusion of external factors possible and by doing so accounts for intra-fraction anatomy changes. Given a real-time imaging modality that will provide the updated patient anatomy during delivery, this sequencer is able to take these changes into account during the calculation of subsequent segments. Methods: Pencil beams are generated for each beam angle of the treatment and a fluence optimization is performed. The pencil beams, together with the patient anatomy and the above optimal fluence form the input of our algorithm. During each iteration the following steps are performed: A fluence optimization is done and each beam's fluence is then split to discrete intensity levels. Deliverable segments are calculated for each one of these. Each segment's area multiplied by its intensity describes its efficiency. The most efficient segment among all beams is then chosen to deliver a part of the calculated fluence and the dose that will be delivered by this segment is calculated. This delivered dose is then subtracted from the remaining dose. This loop is repeated until 90% of the dose has been delivered and a final segment weight optimization is performed to reach full convergence. Results: This algorithm was tested in several prostate cases yielding results that meet all clinical constraints. Quality assurance was performed on Delta4 and film phantoms for one of these prostate cases and received clinical acceptance after passing both gamma analyses with the 3%/3mm criteria. Conclusion: A new sequencing algorithm was developed to facilitate the needs of intensity modulated treatment. The first results on static anatomy confirm that it can calculate clinical plans equivalent to those of the commercially available planning systems. We are now working towards 100% dose convergence which will allow us to handle anatomy deformations. This work is financially supported by Elekta AB, Stockholm, Sweden.
Selection of a DNA barcode for Nectriaceae from fungal whole-genomes.
Zeng, Zhaoqing; Zhao, Peng; Luo, Jing; Zhuang, Wenying; Yu, Zhihe
2012-01-01
A DNA barcode is a short segment of sequence that is able to distinguish species. A barcode must ideally contain enough variation to distinguish every individual species and be easily obtained. Fungi of Nectriaceae are economically important and show high species diversity. To establish a standard DNA barcode for this group of fungi, the genomes of Neurospora crassa and 30 other filamentous fungi were compared. The expect value was treated as a criterion to recognize homologous sequences. Four candidate markers, Hsp90, AAC, CDC48, and EF3, were tested for their feasibility as barcodes in the identification of 34 well-established species belonging to 13 genera of Nectriaceae. Two hundred and fifteen sequences were analyzed. Intra- and inter-specific variations and the success rate of PCR amplification and sequencing were considered as important criteria for estimation of the candidate markers. Ultimately, the partial EF3 gene met the requirements for a good DNA barcode: No overlap was found between the intra- and inter-specific pairwise distances. The smallest inter-specific distance of EF3 gene was 3.19%, while the largest intra-specific distance was 1.79%. In addition, there was a high success rate in PCR and sequencing for this gene (96.3%). CDC48 showed sufficiently high sequence variation among species, but the PCR and sequencing success rate was 84% using a single pair of primers. Although the Hsp90 and AAC genes had higher PCR and sequencing success rates (96.3% and 97.5%, respectively), overlapping occurred between the intra- and inter-specific variations, which could lead to misidentification. Therefore, we propose the EF3 gene as a possible DNA barcode for the nectriaceous fungi.
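The "no overlap" criterion described above (largest intra-specific distance below smallest inter-specific distance) can be checked mechanically. The Python sketch below uses the uncorrected p-distance as a stand-in, since the record does not state which distance model the authors used; the species labels and sequences are invented.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """Return (largest intra-specific distance, smallest inter-specific distance).

    records is a list of (species_label, aligned_sequence) pairs and must contain
    at least one within-species pair and one between-species pair.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra), min(inter)


if __name__ == "__main__":
    # Hypothetical aligned fragments for two made-up species.
    records = [
        ("sp1", "AAAAAAAAAA"),
        ("sp1", "AAAAAAAAAT"),
        ("sp2", "AAAAAACCCC"),
        ("sp2", "AAAAAACCCG"),
    ]
    max_intra, min_inter = barcode_gap(records)
    print(max_intra, min_inter, "gap" if min_inter > max_intra else "overlap")
```

With the values reported for EF3 (largest intra-specific distance 1.79%, smallest inter-specific distance 3.19%), the same comparison yields a gap.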
Specificity, Privacy, and Degeneracy in the CD4 T Cell Receptor Repertoire Following Immunization
Sun, Yuxin; Best, Katharine; Cinelli, Mattia; Heather, James M.; Reich-Zeliger, Shlomit; Shifrut, Eric; Friedman, Nir; Shawe-Taylor, John; Chain, Benny
2017-01-01
T cells recognize antigen using a large and diverse set of antigen-specific receptors created by a complex process of imprecise somatic cell gene rearrangements. In response to antigen/receptor binding, specific T cells then divide to form memory and effector populations. We apply high-throughput sequencing to investigate the global changes in T cell receptor sequences following immunization with ovalbumin (OVA) and adjuvant, to understand how adaptive immunity achieves specificity. Each immunized mouse contained a predominantly private but related set of expanded CDR3β sequences. We used machine learning to identify common patterns which distinguished repertoires from mice immunized with adjuvant with and without OVA. The CDR3β sequences were deconstructed into sets of overlapping contiguous amino acid triplets. The frequencies of these motifs were used to train the linear programming boosting (LPBoost) algorithm to classify between TCR repertoires. LPBoost could distinguish between the two classes of repertoire with accuracies above 80%, using a small subset of triplet sequences present at defined positions along the CDR3. The results suggest a model in which such motifs confer degenerate antigen specificity in the context of a highly diverse and largely private set of T cell receptors. PMID:28450864
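The deconstruction of CDR3β sequences into overlapping contiguous triplets, and the resulting frequency features, can be sketched in Python as follows. This is only the feature-extraction step under simplifying assumptions: the sequences are invented, position along the CDR3 is ignored, and the LPBoost classifier itself is not reproduced.

```python
from collections import Counter


def overlapping_kmers(seq, k=3):
    """Split a sequence into all overlapping contiguous k-mers (triplets by default)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def triplet_frequencies(repertoire, k=3):
    """Relative frequency of each k-mer across all sequences in a repertoire."""
    counts = Counter()
    for seq in repertoire:
        counts.update(overlapping_kmers(seq, k))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


if __name__ == "__main__":
    repertoire = ["CASSL", "CASSF"]  # hypothetical CDR3 fragments
    print(overlapping_kmers("CASSL"))
    print(triplet_frequencies(repertoire))
```

Each repertoire then becomes one feature vector of triplet frequencies, which is the kind of input a boosting classifier could be trained on.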
Lou, Tzu-Fang; Weidmann, Chase A; Killingsworth, Jordan; Tanaka Hall, Traci M; Goldstrohm, Aaron C; Campbell, Zachary T
2017-04-15
RNA-binding proteins (RBPs) collaborate to control virtually every aspect of RNA function. Tremendous progress has been made in the area of global assessment of RBP specificity using next-generation sequencing approaches both in vivo and in vitro. Understanding how protein-protein interactions enable precise combinatorial regulation of RNA remains a significant problem. Addressing this challenge requires tools that can quantitatively determine the specificities of both individual proteins and multimeric complexes in an unbiased and comprehensive way. One approach utilizes in vitro selection, high-throughput sequencing, and sequence-specificity landscapes (SEQRS). We outline a SEQRS experiment focused on obtaining the specificity of a multi-protein complex between Drosophila RBPs Pumilio (Pum) and Nanos (Nos). We discuss the necessary controls in this type of experiment and examine how the resulting data can be complemented with structural and cell-based reporter assays. Additionally, SEQRS data can be integrated with functional genomics data to uncover biological function. Finally, we propose extensions of the technique that will enhance our understanding of multi-protein regulatory complexes assembled onto RNA. Copyright © 2016 Elsevier Inc. All rights reserved.
Warfield, Linda; Tuttle, Lisa M; Pacheco, Derek; Klevit, Rachel E; Hahn, Steven
2014-08-26
Although many transcription activators contact the same set of coactivator complexes, the mechanism and specificity of these interactions have been unclear. For example, do intrinsically disordered transcription activation domains (ADs) use sequence-specific motifs, or do ADs of seemingly different sequence have common properties that encode activation function? We find that the central activation domain (cAD) of the yeast activator Gcn4 functions through a short, conserved sequence-specific motif. Optimizing the residues surrounding this short motif by inserting additional hydrophobic residues creates very powerful ADs that bind the Mediator subunit Gal11/Med15 with high affinity via a "fuzzy" protein interface. In contrast to Gcn4, the activity of these synthetic ADs is not strongly dependent on any one residue of the AD, and this redundancy is similar to that of some natural ADs in which few if any sequence-specific residues have been identified. The additional hydrophobic residues in the synthetic ADs likely allow multiple faces of the AD helix to interact with the Gal11 activator-binding domain, effectively forming a fuzzier interface than that of the wild-type cAD.
Zou, Lingyun; Wang, Zhengzhi; Huang, Jiaomin
2007-12-01
Subcellular location is one of the key biological characteristics of proteins. Position-specific profiles (PSP) have been introduced as important characteristics of proteins in this article. In this study, to obtain position-specific profiles, the Position Specific Iterative-Basic Local Alignment Search Tool (PSI-BLAST) has been used to search for protein sequences in a database. Position-specific scoring matrices are extracted from the profiles as one class of characteristics. Four-part amino acid compositions and 1st-7th order dipeptide compositions have also been calculated as the other two classes of characteristics. Therefore, twelve characteristic vectors are extracted from each of the protein sequences. Next, the characteristic vectors are weighted by a simple weighting function and input into a BP neural network predictor named PSP-Weighted Neural Network (PSP-WNN). The Levenberg-Marquardt algorithm is employed to adjust the weight matrices and thresholds during the network training instead of the error back propagation algorithm. With a jackknife test on the RH2427 dataset, PSP-WNN achieved an overall prediction accuracy of 88.4%, higher than the results obtained on this dataset by the general BP neural network, Markov model, and fuzzy k-nearest neighbors algorithm. In addition, the prediction performance of PSP-WNN has been evaluated with a five-fold cross validation test on the PK7579 dataset and the prediction results have been consistently better than those of the previous method on the basis of several support vector machines, using compositions of both amino acids and amino acid pairs. These results indicate that PSP-WNN is a powerful tool for subcellular localization prediction. At the end of the article, influences on prediction accuracy using different weighting proportions among three characteristic vector categories have been discussed. An appropriate proportion is considered by increasing the prediction accuracy.
Altimari, Annalisa; de Biase, Dario; De Maglio, Giovanna; Gruppioni, Elisa; Capizzi, Elisa; Degiovanni, Alessio; D’Errico, Antonia; Pession, Annalisa; Pizzolitto, Stefano; Fiorentino, Michelangelo; Tallini, Giovanni
2013-01-01
Detection of KRAS mutations in archival pathology samples is critical for therapeutic appropriateness of anti-EGFR monoclonal antibodies in colorectal cancer. We compared the sensitivity, specificity, and accuracy of Sanger sequencing, ARMS-Scorpion (TheraScreen®) real-time polymerase chain reaction (PCR), pyrosequencing, chip array hybridization, and 454 next-generation sequencing to assess KRAS codon 12 and 13 mutations in 60 nonconsecutive selected cases of colorectal cancer. Twenty of the 60 cases were detected as wild-type KRAS by all methods with 100% specificity. Among the 40 mutated cases, 13 were discrepant with at least one method. The sensitivity was 85%, 90%, 93%, and 92%, and the accuracy was 90%, 93%, 95%, and 95% for Sanger sequencing, TheraScreen real-time PCR, pyrosequencing, and chip array hybridization, respectively. The main limitation of Sanger sequencing was its low analytical sensitivity, whereas TheraScreen real-time PCR, pyrosequencing, and chip array hybridization showed higher sensitivity but suffered from the limitations of predesigned assays. Concordance between the methods was k = 0.79 for Sanger sequencing and k > 0.85 for the other techniques. Tumor cell enrichment correlated significantly with the abundance of KRAS-mutated deoxyribonucleic acid (DNA), evaluated as ΔCt for TheraScreen real-time PCR (P = 0.03), percentage of mutation for pyrosequencing (P = 0.001), ratio for chip array hybridization (P = 0.003), and percentage of mutation for 454 next-generation sequencing (P = 0.004). Also, 454 next-generation sequencing showed the best cross correlation for quantification of mutation abundance compared with all the other methods (P < 0.001). Our comparison showed the superiority of next-generation sequencing over the other techniques in terms of sensitivity and specificity. Next-generation sequencing will replace Sanger sequencing as the reference technique for diagnostic detection of KRAS mutation in archival tumor tissues. 
PMID:23950653
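The relationship between the reported sensitivity, specificity, and accuracy can be made explicit with a short Python sketch. The counts below are inferred for illustration rather than taken from the paper: assuming 40 mutated and 20 wild-type cases and 100% specificity, a sensitivity of 85% corresponds to 34 true positives and 6 false negatives, which reproduces the 90% accuracy quoted for Sanger sequencing.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Inferred counts (an assumption, see the text above): 34 of 40 mutated cases
    # detected, all 20 wild-type cases called correctly.
    sens, spec, acc = diagnostic_metrics(tp=34, fn=6, tn=20, fp=0)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```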
Schmidt, DJ; Pickett, BE; Camacho, D; Comach, G; Xhaja, K; Lennon, NJ; Rizzolo, K; de Bosch, N; Becerra, A; Nogueira, ML; Mondini, A; da Silva, EV; Vasconcelos, PF; Muñoz-Jordán, JL; Santiago, GA; Ocazionez, R; Gehrke, L; Lefkowitz, EJ; Birren, BW; Henn, MR; Bosch, I
2013-01-01
Dengue virus currently causes 50-100 million infections annually. Comprehensive knowledge about the evolution of Dengue in response to selection pressure is currently unavailable, but would greatly enhance vaccine design efforts. In the current study, we sequenced 187 new dengue virus serotype 3 (DENV-3) genotype III whole genomes isolated from Asia and the Americas. We analyzed them together with previously-sequenced isolates to gain a more detailed understanding of the evolutionary adaptations existing in this prevalent American serotype. In order to analyze the phylogenetic dynamics of DENV-3 during outbreak periods, we incorporated datasets of 48 and 11 sequences spanning two major outbreaks in Venezuela during 2001 and 2007-2008 respectively. Our phylogenetic analysis of newly sequenced viruses shows that subsets of genomes cluster primarily by geographic location, and secondarily by time of virus isolation. DENV-3 genotype III sequences from Asia are significantly divergent from those from the Americas due to their geographical separation and subsequent speciation. We measured amino acid variation for the E protein by calculating the Shannon entropy at each position between Asian and American genomes. We found a cluster of 7 amino acid substitutions having high variability within E protein domain III, which has previously been implicated in serotype-specific neutralization escape mutants. No novel mutations were found in the E protein of sequences isolated during either Venezuelan outbreak. Shannon entropy analysis of the NS5 polymerase mature protein revealed that a G374E mutation, in a region that contributes to interferon resistance in other flaviviruses by interfering with JAK-STAT signaling, was present in both the Asian and American sequences from the 2007-2008 Venezuelan outbreak, but was absent in the sequences from the 2001 Venezuelan outbreak. 
In addition to E, several NS5 amino acid changes were unique to the 2007-2008 epidemic in Venezuela and may give additional insight into the adaptive response of DENV-3 at the population level. PMID:21964598
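Per-position Shannon entropy, as used above to measure amino acid variation, can be computed from an alignment column by column. The Python sketch below assumes the sequences are already aligned to equal length and uses base-2 logarithms; the four short peptide strings are invented and the authors' exact procedure (for example, how gaps were handled) is not known from this record.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Entropy at each position of a list of aligned, equal-length sequences."""
    return [column_entropy(column) for column in zip(*sequences)]


if __name__ == "__main__":
    aligned = ["MKE", "MKG", "MRE", "MRG"]  # hypothetical aligned fragments
    print(alignment_entropy(aligned))
```

A fully conserved column gives an entropy of zero, and a column split evenly between two residues gives one bit, so positions with high values mark candidate sites of variation.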
First-order and higher order sequence learning in specific language impairment.
Clark, Gillian M; Lum, Jarrad A G
2017-02-01
A core claim of the procedural deficit hypothesis of specific language impairment (SLI) is that the disorder is associated with poor implicit sequence learning. This study investigated whether implicit sequence learning problems in SLI are present for first-order conditional (FOC) and higher order conditional (HOC) sequences. Twenty-five children with SLI and 27 age-matched, nonlanguage-impaired children completed 2 serial reaction time tasks. On 1 version, the sequence to be implicitly learnt comprised a FOC sequence and on the other a HOC sequence. Results showed that the SLI group learned the HOC sequence (ηp² = .285, p = .005) but not the FOC sequence (ηp² = .099, p = .118). The control group learned both sequences (FOC ηp² = .497, HOC ηp² = .465, ps < .001). The SLI group's difficulty learning the FOC sequence is consistent with the procedural deficit hypothesis. However, the study provides new evidence that multiple mechanisms may underpin the learning of FOC and HOC sequences. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Levin, Mattias; King, Jasmine J.; Glanville, Jacob; Jackson, Katherine J. L.; Looney, Timothy J.; Hoh, Ramona A.; Mari, Adriano; Andersson, Morgan; Greiff, Lennart; Fire, Andrew Z.; Boyd, Scott D.; Ohlin, Mats
2016-01-01
Background Specific immunotherapy (SIT) is the only treatment with proven long-term curative potential in allergic disease. Allergen-specific IgE is the causative agent of allergic disease, and antibodies contribute to SIT, but the effects of SIT on aeroallergen-specific B cell repertoires are not well understood. Objective To characterize the IgE sequences expressed by allergen-specific B cells, and track the fate of these B cell clones during SIT. Methods We have used high-throughput antibody gene sequencing and identification of allergen-specific IgE using combinatorial antibody fragment library technology to analyze immunoglobulin repertoires of blood and nasal mucosa of aeroallergen-sensitized individuals before and during the first year of subcutaneous SIT. Results Of 52 distinct allergen-specific IgE heavy chains from eight allergic donors, 37 were also detected by high-throughput antibody gene sequencing of blood, nasal mucosa, or both sample types. The allergen-specific clones had increased persistence, higher likelihood of belonging to clones expressing other switched isotypes, and possibly larger clone size than the rest of the IgE repertoire. Clone members in nasal tissue showed close mutational relationships. Conclusion Combining functional binding studies, deep antibody repertoire sequencing, and information on clinical outcomes in larger studies may in the future aid assessment of SIT mechanisms and efficacy. PMID:26559321
Zhang, Lu; Xu, Jinhao; Ma, Jinbiao
2016-07-25
RNA-binding proteins exert important biological functions by specifically recognizing RNA motifs. SELEX (systematic evolution of ligands by exponential enrichment), an in vitro selection method, can obtain consensus motifs with high affinity and specificity for many target molecules from DNA or RNA libraries. Here, we combined SELEX with next-generation sequencing to study protein-RNA interactions in vitro. A pool of RNAs with 20 bp random sequences was transcribed from a T7 promoter, and the target protein was inserted into a plasmid containing an SBP-tag, which can be captured by streptavidin beads. Through only one cycle, the specific RNA motif can be obtained, which dramatically improves the selection efficiency. Using this method, we found that the human hnRNP A1 RRMs domain (UP1 domain) bound RNA motifs containing AGG and AG sequences. The EMSA experiment indicated that hnRNP A1 RRMs could bind the obtained RNA motif. Taken together, this approach provides a rapid and effective way to study the RNA binding specificity of proteins.
Accelerating calculations of RNA secondary structure partition functions using GPUs
2013-01-01
Background RNA performs many diverse functions in the cell in addition to its role as a messenger of genetic information. These functions depend on its ability to fold to a unique three-dimensional structure determined by the sequence. The conformation of RNA is in part determined by its secondary structure, or the particular set of contacts between pairs of complementary bases. Prediction of the secondary structure of RNA from its sequence is therefore of great interest, but can be computationally expensive. In this work we accelerate computations of base-pair probabilities using parallel graphics processing units (GPUs). Results Calculation of the probabilities of base pairs in RNA secondary structures using nearest-neighbor standard free energy change parameters has been implemented using CUDA to run on hardware with multiprocessor GPUs. A modified set of recursions was introduced, which reduces memory usage by about 25%. GPUs are fastest in single precision, and for some hardware, restricted to single precision. This may introduce significant roundoff error. However, deviations in base-pair probabilities calculated using single precision were found to be negligible compared to those resulting from shifting the nearest-neighbor parameters by a random amount of magnitude similar to their experimental uncertainties. For large sequences running on our particular hardware, the GPU implementation reduces execution time by a factor of close to 60 compared with an optimized serial implementation, and by a factor of 116 compared with the original code. Conclusions Using GPUs can greatly accelerate computation of RNA secondary structure partition functions, allowing calculation of base-pair probabilities for large sequences in a reasonable amount of time, with a negligible compromise in accuracy due to working in single precision. The source code is integrated into the RNAstructure software package and available for download at http://rna.urmc.rochester.edu. 
PMID:24180434
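To make the idea of a secondary structure partition function concrete, the following is a deliberately simplified serial sketch. It is not the RNAstructure or CUDA implementation and does not use nearest-neighbor parameters: it assumes a toy energy model in which every allowed base pair contributes the same hypothetical energy `pair_energy`, hairpin loops need at least `min_loop` unpaired bases, and pseudoknots are excluded.

```python
import math

# Canonical and wobble pairs allowed in this toy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_energy=-1.0, rt=0.6, min_loop=3):
    """Toy partition function over pseudoknot-free structures.

    q[i][j] is the partition function of the half-open slice seq[i:j].
    pair_energy and rt are hypothetical values in arbitrary energy units.
    """
    n = len(seq)
    weight = math.exp(-pair_energy / rt)  # Boltzmann factor of one pair
    q = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            last = j - 1
            total = q[i][last]  # case: the last base is unpaired
            # case: the last base pairs with k, enclosing >= min_loop bases
            for k in range(i, last - min_loop):
                if (seq[k], seq[last]) in PAIRS:
                    total += q[i][k] * q[k + 1][last] * weight
            q[i][j] = total
    return q[0][n]


toy_z = partition_function("GGGAAAUCCC")
```

The triple loop makes the cost cubic in sequence length, which is why large sequences benefit from the parallel hardware discussed in the abstract.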
Evers, R; Grummt, I
1995-01-01
Both the DNA elements and the nuclear factors that direct termination of ribosomal gene transcription exhibit species-specific differences. Even between mammals--e.g., human and mouse--the termination signals are not identical and the respective transcription termination factors (TTFs) which bind to the terminator sequence are not fully interchangeable. To elucidate the molecular basis for this species-specificity, we have cloned TTF-I from human and mouse cells and compared their structural and functional properties. Recombinant TTF-I exhibits species-specific DNA binding and terminates transcription both in cell-free transcription assays and in transfection experiments. Chimeric constructs of mouse TTF-I and human TTF-I reveal that the major determinant for species-specific DNA binding resides within the C terminus of TTF-I. Replacing 31 C-terminal amino acids of mouse TTF-I with the homologous human sequences relaxes the DNA-binding specificity and, as a consequence, allows the chimeric factor to bind the human terminator sequence and to specifically stop rDNA transcription. PMID:7597036
Pal, Debojyoti; Sharma, Deepak; Kumar, Mukesh; Sandur, Santosh K
2016-09-01
S-glutathionylation of proteins plays an important role in various biological processes and is known to be a protective modification during oxidative stress. Since experimental detection of S-glutathionylation is labor-intensive and time-consuming, a bioinformatics-based approach is a viable alternative. Available methods require relatively longer sequence information, which may prevent prediction if sequence information is incomplete. Here, we present a model to predict glutathionylation sites from pentapeptide sequences. It is based upon differential association of amino acids with glutathionylated and non-glutathionylated cysteines from a database of experimentally verified sequences. These data were used to calculate position-dependent F-scores, which measure how a particular amino acid at a particular position may affect the likelihood of a glutathionylation event. A glutathionylation score (G-score), indicating the propensity of a sequence to undergo glutathionylation, was calculated using position-dependent F-scores for each amino acid. Cut-off values were used for prediction. Our model returned an accuracy of 58% with a Matthews correlation coefficient (MCC) value of 0.165. On an independent dataset, our model outperformed the currently available model, in spite of needing much less sequence information. Pentapeptide motifs having high abundance among glutathionylated proteins were identified. A list of potential glutathionylation hotspot sequences was obtained by assigning G-scores, and subsequent Protein-BLAST analysis revealed a total of 254 putative glutathionable proteins, a number of which were already known to be glutathionylated. Our model predicted glutathionylation sites in 93.93% of experimentally verified glutathionylated proteins. The outcome of this study may assist in discovering novel glutathionylation sites and finding candidate proteins for glutathionylation.
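The two performance measures quoted in the abstract above, accuracy and the Matthews correlation coefficient, can be computed from a confusion matrix as in the sketch below. The confusion-matrix counts are hypothetical and chosen only for illustration; the abstract does not report the underlying counts.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, not taken from the study.
example_acc = accuracy(tp=30, tn=28, fp=22, fn=20)
example_mcc = matthews_corrcoef(tp=30, tn=28, fp=22, fn=20)
```

Unlike accuracy, the MCC stays near zero for predictions that are only slightly better than chance, so a modest accuracy paired with a low MCC is an expected combination.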
Rank-order-selective neurons form a temporal basis set for the generation of motor sequences.
Salinas, Emilio
2009-04-08
Many behaviors are composed of a series of elementary motor actions that must occur in a specific order, but the neuronal mechanisms by which such motor sequences are generated are poorly understood. In particular, if a sequence consists of a few motor actions, a primate can learn to replicate it from memory after practicing it for just a few trials. How do the motor and premotor areas of the brain assemble motor sequences so fast? The network model presented here reveals part of the solution to this problem. The model is based on experiments showing that, during the performance of motor sequences, some cortical neurons are always activated at specific times, regardless of which motor action is being executed. In the model, a population of such rank-order-selective (ROS) cells drives a layer of downstream motor neurons so that these generate specific movements at different times in different sequences. A key ingredient of the model is that the amplitude of the ROS responses must be modulated by sequence identity. Because of this modulation, which is consistent with experimental reports, the network is able not only to produce multiple sequences accurately but also to learn a new sequence with minimal changes in connectivity. The ROS neurons modulated by sequence identity thus serve as a basis set for constructing arbitrary sequences of motor responses downstream. The underlying mechanism is analogous to the mechanism described in parietal areas for generating coordinate transformations in the spatial domain.
PMID:19357265
TOPPE: A framework for rapid prototyping of MR pulse sequences.
Nielsen, Jon-Fredrik; Noll, Douglas C
2018-06-01
To introduce a framework for rapid prototyping of MR pulse sequences. We propose a simple file format, called "TOPPE", for specifying all details of an MR imaging experiment, such as gradient and radiofrequency waveforms and the complete scan loop. In addition, we provide a TOPPE file "interpreter" for GE scanners, which is a binary executable that loads TOPPE files and executes the sequence on the scanner. We also provide MATLAB scripts for reading and writing TOPPE files and previewing the sequence prior to hardware execution. With this setup, the task of the pulse sequence programmer is reduced to creating TOPPE files, eliminating the need for hardware-specific programming. No sequence-specific compilation is necessary; the interpreter only needs to be compiled once (for every scanner software upgrade). We demonstrate TOPPE in three different applications: k-space mapping, non-Cartesian PRESTO whole-brain dynamic imaging, and myelin mapping in the brain using inhomogeneous magnetization transfer. We successfully implemented and executed the three example sequences. By simply changing the various TOPPE sequence files, a single binary executable (interpreter) was used to execute several different sequences. The TOPPE file format is a complete specification of an MR imaging experiment, based on arbitrary sequences of a (typically small) number of unique modules. Along with the GE interpreter, TOPPE comprises a modular and flexible platform for rapid prototyping of new pulse sequences. Magn Reson Med 79:3128-3134, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach.
Nielsen, Morten; Lundegaard, Claus; Worning, Peder; Hvid, Christina Sylvester; Lamberth, Kasper; Buus, Søren; Brunak, Søren; Lund, Ole
2004-06-12
Prediction of which peptides will bind a specific major histocompatibility complex (MHC) constitutes an important step in identifying potential T-cell epitopes suitable as vaccine candidates. MHC class II binding peptides have a broad length distribution complicating such predictions. Thus, identifying the correct alignment is a crucial part of identifying the core of an MHC class II binding motif. In this context, we wish to describe a novel Gibbs motif sampler method ideally suited for recognizing such weak sequence motifs. The method is based on the Gibbs sampling method, and it incorporates novel features optimized for the task of recognizing the binding motif of MHC classes I and II. The method locates the binding motif in a set of sequences and characterizes the motif in terms of a weight-matrix. Subsequently, the weight-matrix can be applied to identifying effectively potential MHC binding peptides and to guiding the process of rational vaccine design. We apply the motif sampler method to the complex problem of MHC class II binding. The input to the method is amino acid peptide sequences extracted from the public databases of SYFPEITHI and MHCPEP and known to bind to the MHC class II complex HLA-DR4(B1*0401). Prior identification of information-rich (anchor) positions in the binding motif is shown to improve the predictive performance of the Gibbs sampler. Similarly, a consensus solution obtained from an ensemble average over suboptimal solutions is shown to outperform the use of a single optimal solution. In a large-scale benchmark calculation, the performance is quantified using relative operating characteristics curve (ROC) plots and we make a detailed comparison of the performance with that of both the TEPITOPE method and a weight-matrix derived using the conventional alignment algorithm of ClustalW. 
The calculation demonstrates that the predictive performance of the Gibbs sampler is higher than that of ClustalW and in most cases also higher than that of the TEPITOPE method.
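The weight-matrix step described in the abstract above can be sketched as follows. This shows only how an already aligned set of binding cores is turned into a log-odds matrix and used to score peptides; it does not implement the Gibbs sampling that finds the alignment. The uniform background, the pseudocount, and the three-residue toy cores are assumptions of the sketch (real MHC binding cores are longer).

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_weight_matrix(cores, pseudocount=1.0):
    """Log-odds (bits) weight matrix from equal-length aligned cores.

    Assumes a uniform background frequency of 1/20 per amino acid.
    """
    background = 1.0 / len(AMINO_ACIDS)
    matrix = []
    for pos in range(len(cores[0])):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for core in cores:
            counts[core[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return matrix


def score(matrix, core):
    """Sum of position-specific log-odds for one core-length window."""
    return sum(column[aa] for column, aa in zip(matrix, core))


def best_core(matrix, peptide):
    """Highest-scoring window of matrix width within a longer peptide."""
    width = len(matrix)
    windows = [peptide[i:i + width] for i in range(len(peptide) - width + 1)]
    return max(windows, key=lambda w: score(matrix, w))


# Hypothetical three-residue cores, for illustration only.
toy_matrix = build_weight_matrix(["FKA", "FRA", "YKA"])
toy_core = best_core(toy_matrix, "GGFKAGG")
```

Scanning all windows of a longer peptide reflects the alignment problem noted in the abstract: class II peptides vary in length, so the binding core must be located before it can be scored.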
2012-01-01
Background Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat. Results The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% as coding regions (estimated 2,850 genes) and another 19% of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. 
With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695 Insertion Site Based Polymorphisms (ISBPs). Of the 96 ISBP primer pairs tested, 28 (29%) were 3A-specific, compared to 17 (18%) for 96 SSRs. Conclusion This work reports on the use of a wheat chromosome arm 3AS-specific BAC library for the targeted generation of sequence data from a particular region of the huge genome of wheat. A large quantity of sequences was generated from the A genome of hexaploid wheat for comparative genome analysis with homoeologous B and D genomes and other model grass genomes. Hundreds of molecular markers were developed from the 3AS arm-specific sequences; these and other sequences will be useful in gene discovery and physical mapping. PMID:22559868
Sequence periodicity in nucleosomal DNA and intrinsic curvature
2010-01-01
Background Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy, the more rigid the DNA helix; thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Results Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. Conclusions The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility of context-dependent curvature of varying degrees being associated with nucleosomal DNA. PMID:20487515
Analysis of codon usage in beta-tubulin sequences of helminths.
von Samson-Himmelstjerna, G; Harder, A; Failing, K; Pape, M; Schnieder, T
2003-07-01
Codon usage bias has been shown to be correlated with gene expression levels in many organisms, including the nematode Caenorhabditis elegans. Here, the codon usage (cu) characteristics for a set of currently available beta-tubulin coding sequences of helminths were assessed by calculating several indices, including the effective codon number (Nc), the intrinsic codon deviation index (ICDI), the P2 value and the mutational response index (MRI). The P2 value gives a measure of translational pressure, which has been shown to be correlated to high gene expression levels in some organisms, but it has not yet been analysed in that respect in helminths. For all but two of the C. elegans beta-tubulin coding sequences investigated, the P2 value was the only index that indicated the presence of codon usage bias. Therefore, we propose that in general the helminth beta-tubulin sequences investigated here are not expressed at high levels. Furthermore, we calculated the correlation coefficients for the cu patterns of the helminth beta-tubulin sequences compared with those of highly expressed genes in organisms such as Escherichia coli and C. elegans. It was found that beta-tubulin cu patterns for all sequences of members of the Strongylida were significantly correlated to those for highly expressed C. elegans genes. This approach provides a new measure for comparing the adaptation of cu of a particular coding sequence with that of highly expressed genes in possible expression systems. Finally, using the cu patterns of the sequences studied, a phylogenetic tree was constructed. The topology of this tree was very much in concordance with that of a phylogeny based on small subunit ribosomal DNA sequence alignments.
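The comparison of codon usage patterns by correlation, as described in the abstract above, can be sketched roughly as follows. The abstract does not specify which correlation coefficient or normalization was used, so this sketch assumes plain codon frequencies over all 64 codons and a Pearson coefficient; the two short coding sequences are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# All 64 DNA codons in a fixed order, so frequency vectors are comparable.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)
    return [counts[c] / total if total else 0.0 for c in CODONS]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if undefined)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


# Hypothetical sequences standing in for a gene and a reference set.
gene_freqs = codon_frequencies("ATGGCCGCCAAG")
reference_freqs = codon_frequencies("ATGGCCAAGAAG")
similarity = pearson(gene_freqs, reference_freqs)
```

In practice the reference vector would be pooled from many highly expressed genes of the candidate expression host rather than from a single sequence.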
A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses
USDA-ARS's Scientific Manuscript database
Background: Next-generation sequencing (NGS) allows ultra-deep sequencing of nucleic acids. The use of sequence-independent amplification of viral nucleic acids without utilization of target-specific primers provides advantages over traditional sequencing methods and allows detection of unsuspected ...
Fenstermacher, Katherine J; Achuthan, Vasudevan; Schneider, Thomas D; DeStefano, Jeffrey J
2018-01-16
DNA polymerases (DNAPs) recognize 3' recessed termini on duplex DNA and carry out nucleotide catalysis. Unlike promoter-specific RNA polymerases (RNAPs), no sequence specificity is required for binding or initiation of catalysis. Despite this, previous results indicate that viral reverse transcriptases bind much more tightly to DNA primers that mimic the polypurine tract. In the current report, primer sequences that bind with high affinity to Taq and Klenow polymerases were identified using a modified Selective Evolution of Ligands by Exponential Enrichment (SELEX) approach. Two Taq-specific primers that bound ∼10 (Taq1) and over 100 (Taq2) times more stably than controls to Taq were identified. Taq1 contained 8 nucleotides (5'-CACTAAAG-3') that matched the phage T3 RNAP "core" promoter. Both primers dramatically outcompeted primers with similar binding thermodynamics in PCR reactions. Similarly, exonuclease-minus Klenow polymerase also selected a high-affinity primer that contained a related core promoter sequence from phage T7 RNAP (5'-ACTATAG-3'). For both Taq and Klenow, even small modifications to the sequence resulted in large losses in binding affinity, suggesting that binding was highly sequence-specific. The results are discussed in the context of possible effects on multi-primer (multiplex) PCR assays, molecular information theory, and the evolution of RNAPs and DNAPs. Importance This work further demonstrates that primer-dependent DNA polymerases can have strong sequence biases leading to dramatically tighter binding to specific sequences. These may be related to biological function, or be a consequence of the structural architecture of the enzyme. New sequence specificities for Taq and Klenow polymerases were uncovered, and among them were sequences that contained the core promoter elements from T3 and T7 phage RNA polymerase promoters. 
This suggests the intriguing possibility that phage RNA polymerases exploited intrinsic binding affinities of ancestral DNA polymerases to develop their promoters. Conversely, DNA polymerases could have evolved from related RNA polymerases and retained the intrinsic binding preference despite there being no clear function for such a preference in DNA biology. Copyright © 2018 American Society for Microbiology.
Manipulation of lignin composition in plants using a tissue-specific promoter
Chapple, Clinton C. S.
2003-08-26
The present invention relates to methods and materials in the field of molecular biology, the manipulation of the phenylpropanoid pathway and the regulation of protein synthesis through plant genetic engineering. More particularly, the invention relates to the introduction of a foreign nucleotide sequence into a plant genome, wherein the introduction of the nucleotide sequence effects an increase in the syringyl content of the plant's lignin. In one specific aspect, the invention relates to methods for modifying the plant lignin composition in a plant cell by the introduction thereinto of a foreign nucleotide sequence comprising a tissue-specific plant promoter sequence and a sequence encoding an active ferulate-5-hydroxylase (F5H) enzyme. Plant transformants harboring an inventive promoter-F5H construct demonstrate increased levels of syringyl monomer residues in their lignin, rendering the polymer more readily delignified and, thereby, rendering the plant more readily pulped or digested.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moore, B; Yin, F; Cai, J
Purpose: To determine the variation in tumor contrast between different MRI sequences and between patients for the purpose of MRI-based treatment planning. Methods: Multiple MRI scans of 11 patients with cancer(s) in the liver were included in this IRB-approved study. Imaging sequences consisted of T1W MRI, Contrast-Enhanced T1W MRI, T2W MRI, and T2*/T1W MRI. MRI images were acquired on a 1.5T GE Signa scanner with a four-channel torso coil. We calculated the tumor-to-tissue contrast-to-noise ratio (CNR) for each MR sequence by contouring the tumor and a region of interest (ROI) in a homogeneous region of the liver using the Eclipse treatment planning software. CNR was calculated as (I-Tum − I-ROI)/SD-ROI, where I-Tum and I-ROI are the mean values of the tumor and the ROI respectively, and SD-ROI is the standard deviation of the ROI. The same tumor and ROI structures were used in all measurements for different MR sequences. Inter-patient coefficient of variation (CV) and inter-sequence CV were determined. In addition, mean and standard deviation of CNR were calculated and compared between different MR sequences. Results: Our preliminary results showed large inter-patient CV (range: 37.7% to 88%) and inter-sequence CV (range: 5.3% to 104.9%) of liver tumor CNR, indicating great variations in tumor CNR between MR sequences and between patients. Tumor CNR was found to be largest in CE-T1W (8.5±7.5), followed by T2W (4.2±2.4), T1W (3.4±2.2), and T2*/T1W (1.7±0.6) MR scans. The inter-patient CV of tumor CNR was also the largest in CE-T1W (88%), followed by T1W (64.3%), T2W (56.2%), and T2*/T1W (37.7%) MR scans. Conclusion: Large inter-sequence and inter-patient variations were observed in liver tumor CNR. CE-T1W MR images on average provided the best tumor CNR. Efforts are needed to optimize tumor contrast and its consistency for MRI-based treatment planning of cancer in the liver. This project is supported by NIH grant: 1R21CA165384.
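The CNR and coefficient-of-variation formulas given in the abstract above can be written out as a short sketch. The pixel values and per-patient CNR values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import statistics


def contrast_to_noise_ratio(tumor_pixels, roi_pixels):
    """CNR = (mean tumor - mean ROI) / standard deviation of the ROI.

    Uses the sample standard deviation (an assumption of this sketch).
    """
    tumor_mean = statistics.mean(tumor_pixels)
    roi_mean = statistics.mean(roi_pixels)
    return (tumor_mean - roi_mean) / statistics.stdev(roi_pixels)


def coefficient_of_variation(values):
    """CV in percent: standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical intensities for one patient and one MR sequence.
example_cnr = contrast_to_noise_ratio([10, 12, 14], [4, 6, 8])

# Hypothetical CNR values of one MR sequence across three patients.
inter_patient_cv = coefficient_of_variation([4, 6, 8])
```

Applying `coefficient_of_variation` across patients for a fixed sequence gives an inter-patient CV, and applying it across sequences for a fixed patient gives an inter-sequence CV.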
Sequence harmony: detecting functional specificity from alignments
Feenstra, K. Anton; Pirovano, Walter; Krab, Klaas; Heringa, Jaap
2007-01-01
Multiple sequence alignments are often used for the identification of key specificity-determining residues within protein families. We present a web server implementation of the Sequence Harmony (SH) method previously introduced. SH accurately detects subfamily specific positions from a multiple alignment by scoring compositional differences between subfamilies, without imposing conservation. The SH web server allows a quick selection of subtype specific sites from a multiple alignment given a subfamily grouping. In addition, it allows the predicted sites to be directly mapped onto a protein structure and displayed. We demonstrate the use of the SH server using the family of plant mitochondrial alternative oxidases (AOX). In addition, we illustrate the usefulness of combining sequence and structural information by showing that the predicted sites are clustered into a few distinct regions in an AOX homology model. The SH web server can be accessed at www.ibi.vu.nl/programs/seqharmwww. PMID:17584793
Mans, Ben J; Pienaar, Ronel; Ratabane, John; Pule, Boitumelo; Latif, Abdalla A
2016-07-01
Molecular classification and systematics of the Theileria is based on the analysis of the 18S rRNA gene. Reverse line blot or conventional sequencing approaches have disadvantages in the study of 18S rRNA diversity and a next-generation 454 sequencing approach was investigated. The 18S rRNA gene was amplified using RLB primers coupled to 96 unique sequence identifiers (MIDs). Theileria positive samples from African buffalo (672) and cattle (480) from southern Africa were combined in batches of 96 and sequenced using the GS Junior 454 sequencer to produce 825711 informative sequences. Sequences were extracted based on MIDs and analysed to identify Theileria genotypes. Genotypes observed in buffalo and cattle were confirmed in the current study, while no new genotypes were discovered. Genotypes showed specific geographic distributions, most probably linked with vector distributions. Host specificity of buffalo and cattle specific genotypes were confirmed and prevalence data as well as relative parasitemia trends indicate preference for different hosts. Mixed infections are common with African buffalo carrying more genotypes compared to cattle. Associative or exclusion co-infection profiles were observed between genotypes that may have implications for speciation and systematics: specifically that more Theileria species may exist in cattle and buffalo than currently recognized. Analysis of primers used for Theileria parva diagnostics indicate that no new genotypes will be amplified by the current primer sets confirming their specificity. T. parva SNP variants that occur in the 18S rRNA hypervariable region were confirmed. A next generation sequencing approach is useful in obtaining comprehensive knowledge regarding 18S rRNA diversity and prevalence for the Theileria, allowing for the assessment of systematics and diagnostic assays based on the 18S gene. Copyright © 2016 Elsevier GmbH. All rights reserved.
Compositions and methods for xylem-specific expression in plant cells
DOE Office of Scientific and Technical Information (OSTI.GOV)
Han, Kyung-Hwan; Ko, Jae-Heung
The invention provides promoter sequences that regulate specific expression of operably linked sequences in developing xylem cells and/or in developing xylem tissue. The developing xylem-specific sequences are exemplified by the DX5, DX8, DX11, and DX15 promoters, portions thereof, and homologs thereof. The invention further provides expression vectors, cells, tissues and plants that contain the invention's sequences. The compositions of the invention and methods of using them are useful in, for example, improving the quantity (biomass) and/or the quality (wood density, lignin content, sugar content etc.) of expressed biomass feedstock products that may be used for bioenergy, biorefinery, and generating wood products such as pulp, paper, and solid wood.
Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie
2003-04-02
Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.
Functional specificity of a Hox protein mediated by the recognition of minor groove structure.
Joshi, Rohit; Passner, Jonathan M; Rohs, Remo; Jain, Rinku; Sosinsky, Alona; Crickmore, Michael A; Jacob, Vinitha; Aggarwal, Aneel K; Honig, Barry; Mann, Richard S
2007-11-02
The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly understood step in the control of gene expression. Members of the Hox family of transcription factors bind DNA by making nearly identical major groove contacts via the recognition helices of their homeodomains. In vivo specificity, however, often depends on extended and unstructured regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd). Using a combination of structure determination, computational analysis, and in vitro and in vivo assays, we show that Hox proteins recognize specific Hox-Exd binding sites via residues located in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. Our results suggest that these residues, which are conserved in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence.
Sequence-specific procedural learning deficits in children with specific language impairment.
Hsu, Hsinjen Julie; Bishop, Dorothy V M
2014-05-01
This study tested the procedural deficit hypothesis of specific language impairment (SLI) by comparing children's performance in two motor procedural learning tasks and an implicit verbal sequence learning task. Participants were 7- to 11-year-old children with SLI (n = 48), typically developing age-matched children (n = 20) and younger typically developing children matched for receptive grammar (n = 28). In a serial reaction time task, the children with SLI performed at the same level as the grammar-matched children, but more poorly than age-matched controls in learning motor sequences. When tested with a motor procedural learning task that did not involve learning sequential relationships between discrete elements (i.e. pursuit rotor), the children with SLI performed comparably with age-matched children and better than younger grammar-matched controls. In addition, poor implicit learning of word sequences in a verbal memory task (the Hebb effect) was found in the children with SLI. Together, these findings suggest that SLI might be characterized by deficits in learning sequence-specific information, rather than generally weak procedural learning. © 2014 The Authors. Developmental Science Published by John Wiley & Sons Ltd.
Townsley, Brad T; Covington, Michael F; Ichihashi, Yasunori; Zumstein, Kristina; Sinha, Neelima R
2015-01-01
Next Generation Sequencing (NGS) is driving rapid advancement in biological understanding, and RNA-sequencing (RNA-seq) has become an indispensable tool for biology and medicine. There is a growing need for access to these technologies, although preparation of NGS libraries remains a bottleneck to wider adoption. Here we report a novel method for the production of strand-specific RNA-seq libraries utilizing the terminal breathing of double-stranded cDNA to capture and incorporate a sequencing adapter. Breath Adapter Directional sequencing (BrAD-seq) reduces sample handling and requires far fewer enzymatic steps than most available methods to produce high quality strand-specific RNA-seq libraries. The method we present is optimized for 3-prime Digital Gene Expression (DGE) libraries, can easily extend to full transcript coverage shotgun (SHO) type strand-specific libraries, and is modularized to accommodate a diversity of RNA and DNA input materials. BrAD-seq offers a highly streamlined and inexpensive option for RNA-seq libraries.
Using PATIMDB to Create Bacterial Transposon Insertion Mutant Libraries
Urbach, Jonathan M.; Wei, Tao; Liberati, Nicole; Grenfell-Lee, Daniel; Villanueva, Jacinto; Wu, Gang; Ausubel, Frederick M.
2015-01-01
PATIMDB is a software package for facilitating the generation of transposon mutant insertion libraries. The software has two main functions: process tracking and automated sequence analysis. The process tracking function specifically includes recording the status and fates of multiwell plates and samples in various stages of library construction. Automated sequence analysis refers specifically to the pipeline of sequence analysis starting with ABI files from a sequencing facility and ending with insertion location identifications. The protocols in this unit describe installation and use of PATIMDB software. PMID:19343706
VizieR Online Data Catalog: Silicon isoelectronic sequence data (Jonsson+, 2016)
NASA Astrophysics Data System (ADS)
Jonsson, P.; Radziute, L.; Gaigalas, G.; Godefroid, M. R.; Marques, J. P.; Brage, T.; Froese Fischer, C.; Grant, I. P.
2015-11-01
Calculations were performed for the five states belonging to the 3s²3p² even configuration and the 22 states belonging to the 3s3p³ and 3s²3p3d odd configurations. The calculations were made by parity, meaning that the even and odd states were determined in separate calculations in the EOL scheme. (3 data files).
htsint: a Python library for sequencing pipelines that combines data through gene set generation.
Richards, Adam J; Herrel, Anthony; Bonneaud, Camille
2015-09-24
Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages. The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint.
Bellerophon: A program to detect chimeric sequences in multiple sequence alignments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip
2003-12-23
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
IDENTIFICATION OF AVIAN-SPECIFIC FECAL METAGENOMIC SEQUENCES USING GENOME FRAGMENT ENRICHMENTS
Sequence analysis of microbial genomes has provided biologists the opportunity to compare genetic differences between closely related microorganisms. While random sequencing has also been used to study natural microbial communities, metagenomic comparisons via sequencing analysis...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mueller, C.; Nabelssi, B.; Roglans-Ribas, J.
1995-04-01
This report contains the Appendices for the Analysis of Accident Sequences and Source Terms at Waste Treatment and Storage Facilities for Waste Generated by the U.S. Department of Energy Waste Management Operations. The main report documents the methodology, computational framework, and results of facility accident analyses performed as a part of the U.S. Department of Energy (DOE) Waste Management Programmatic Environmental Impact Statement (WM PEIS). The accident sequences potentially important to human health risk are specified, their frequencies are assessed, and the resultant radiological and chemical source terms are evaluated. A personal computer-based computational framework and database have been developed that provide these results as input to the WM PEIS for calculation of human health risk impacts. This report summarizes the accident analyses and aggregates the key results for each of the waste streams. Source terms are estimated and results are presented for each of the major DOE sites and facilities by WM PEIS alternative for each waste stream. The appendices identify the potential atmospheric release of each toxic chemical or radionuclide for each accident scenario studied. They also provide discussion of specific accident analysis data and guidance used or consulted in this report.
Rapid comparison of protein binding site surfaces with Property Encoded Shape Distributions (PESD)
Das, Sourav; Kokardekar, Arshad
2009-01-01
Patterns in shape and property distributions on the surface of binding sites are often conserved across functional proteins without significant conservation of the underlying amino-acid residues. To explore similarities of these sites from the viewpoint of a ligand, a sequence- and fold-independent method was created to rapidly and accurately compare binding sites of proteins represented by property-mapped triangulated Gauss-Connolly surfaces. Within this paradigm, signatures for each binding site surface are produced by calculating their property-encoded shape distributions (PESD), a measure of the probability that a particular property will be at a specific distance to another on the molecular surface. Similarity between the signatures can then be treated as a measure of similarity between binding sites. As postulated, the PESD method rapidly detected high levels of similarity in binding site surface characteristics even in cases where there was very low similarity at the sequence level. In a screening experiment involving each member of the PDBBind 2005 dataset as a query against the rest of the set, PESD was able to retrieve a binding site with identical E.C. (Enzyme Commission) numbers as the top match in 79.5% of cases. The ability of the method to detect similarity in binding sites with low sequence conservation was compared with that of state-of-the-art binding site comparison methods. PMID:19919089
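The idea of a property-encoded distance distribution described in the abstract above can be sketched in a few lines of Python. This is a simplified illustration and not the published PESD implementation: it assumes straight-line (Euclidean) distances between labelled surface points, a fixed hypothetical bin width, and histogram intersection as the similarity score. The names `signature` and `similarity` and the property labels are invented for the example.

```python
import math
from collections import Counter
from itertools import combinations


def signature(points, bin_width=2.0):
    """Build a normalised property-pair distance histogram.

    points: list of ((x, y, z), property_label) tuples sampled from a
            binding-site surface (labels are illustrative).
    Returns a dict {(label_a, label_b, distance_bin): probability}.
    """
    counts = Counter()
    for (p, a), (q, b) in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (an assumption)
        key = tuple(sorted((a, b))) + (int(d // bin_width),)
        counts[key] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def similarity(sig_a, sig_b):
    """Histogram intersection: 1.0 for identical signatures, 0.0 for disjoint."""
    keys = set(sig_a) | set(sig_b)
    return sum(min(sig_a.get(k, 0.0), sig_b.get(k, 0.0)) for k in keys)


if __name__ == "__main__":
    site = [((0, 0, 0), "donor"), ((3, 0, 0), "acceptor"), ((0, 4, 0), "hydrophobic")]
    moved = [((x + 10, y + 10, z + 10), label) for (x, y, z), label in site]
    print(similarity(signature(site), signature(moved)))
```

Because the signature depends only on pairwise distances and property labels, it does not change when a site is translated or rotated, which is why two sites can be compared without first superimposing them.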